Rationale
One of the criticisms of Routino version 2 was that it was very much slower than earlier versions (a point raised by a user of Routino in an e-mail). I thought that the reason for this was the introduction of turn relations which means that the direction of travel is as important as the location when calculating the route. Even so I was surprised to be told that calculating a route took five times as long as before. After the release of version 2.1 it seemed like a good time to start making some proper measurements of the time taken for generating the database and calculating routes. With version 2.1.1 the cause of the slowdown has been identified and minimised.One thing that this benchmarking does not do is compare the speed of Routino to other routers using OSM data. Since it only makes sense to compare programs that do the same thing and all of the routers have different features any such simple comparison would be meaningless.
General Method
The general idea for the benchmarking was to take all released versions of Routino and run the same tests on them. These tests would measure the amount of time and memory that it takes for each of them to generate the routing database and calculate some routes. The size of the database and the number of entries would also be recorded. The results are then collated and compared to try and find out the differences.All tests were run on 32-bit Linux running on a ~3GHz Core i3 processor with 4 GB of RAM. Since the programs are all single-threaded only one core would be used and since it is a 32-bit system no more than 3 GB of virtual memory is available to the programs.
The results for time and memory were found by using /usr/bin/time -f "%S %U %e %M" which prints out the system (kernel) time, user time, elapsed time and maximum resident set size (memory usage). The sum of the user and system time has been used for plotting the graphs below. The amount of memory shown is 1/4 of the maximum resident set size given by the above method because tests show that this method overestimates by a factor of 4. The amount of allocated memory at the peak is not as high as the figure given even with this scaling. The memory statistic should therefore be taken with some caution but is probably correct in showing relative usage even if not absolute.
Database Generation
The first step in calculating a route is to generate the routing database. For these tests the OSM extract for Great Britain (from Geofabrik) was used. This file is 6.1GB when uncompressed and contains 92.9 million lines, 30 million nodes, 3.7 million highways and 77,000 relations (statistics come from planetsplitter output).Time and Memory
The results for time and memory are presented below as a table and a pair of graphs with explanations for the major features.
| Version | System Time (s) | User Time (s) | Elapsed Time (s) | Max Resident Set Size (MB) |
|---|---|---|---|---|
| 1.0 | 29.11 | 188.87 | 238.98 | 1953 |
| 1.1 | 28.89 | 254.12 | 293.98 | 2091 |
| 1.2 | 647.44 | 213.32 | 914.70 | 717 |
| 1.3 | 674.69 | 222.25 | 953.20 | 718 |
| 1.4 | 601.51 | 382.39 | 1039.60 | 719 |
| 1.4.1 | 619.24 | 267.96 | 940.59 | 748 |
| 1.5 | n/a | n/a | n/a | n/a |
| 1.5.1 | 616.75 | 269.90 | 946.29 | 839 |
| 2.0 | n/a | n/a | n/a | n/a |
| 2.0.1 | 504.17 | 258.18 | 815.10 | 523 |
| 2.0.2 | 504.72 | 256.75 | 809.73 | 523 |
| 2.0.3 | 503.82 | 257.74 | 811.11 | 523 |
| 2.1 | 509.57 | 269.50 | 830.62 | 530 |
| 2.1.1 | 442.38 | 236.89 | 727.24 | 530 |
| 2.1.2 | 438.21 | 227.16 | 714.15 | 530 |
Notes:
- Versions 1.0 and 1.1 contain a bug that was fixed in version 1.2 that caused a memory leak during OSM parsing. The bug fix from version 1.2 was applied to these versions because without it they ran out of memory.
- Versions 1.5 and 2.0 failed to run without crashing - since bug-fix releases are available (1.5.1 and 2.0.1) there was no point in patching the source code to allow them to run.
The first obvious thing to notice is that version 1.5 and version 2.0 have no results. This is because neither of these two versions could process the complete input file without crashing. Both of these versions were followed up by bug-fix versions which did work correctly. The interesting thing is that at the time of the release of these two versions they could process the Great Britain OSM file because that is the one used as the example on the Routino website.
Version 1.2 introduced new sorting functions that operate with limited memory overhead rather than using the operating system supplied qsort function which seemed to use an extra copy of the data. This explains the large reduction in memory used between version 1.1 and version 1.2. There were some other memory leaks fixed in version 1.2 but these were probably small by comparison. The reduction in memory usage in this version is offset by the increase in time which is mostly system time and related to the use of temporary files for sorting.
Version 1.4 added the use of tagging rules from an XML configuration file which made very little difference to the memory usage although the time increased but then decreased again in version 1.4.1.
Version 1.5 added processing for node properties and route relations and this seems to have increased the memory usage although the time taken is the same.
Version 2.0 changed the internal data structure so that each segment is only stored once and the rule for finding super-nodes. This would be expected to halve the amount of memory required for segments (which probably explains the reduction in memory to about 2/3 of that in version 1.5.1).
Version 2.1.1 reduces the default number of iterations from 10 down to 5 because tests showed that there was no significant change in the routing time by doing this. This is the primary reason for the speed increase and the other optimisations were quite small in comparison.
Database Size and Contents
There is less variation in the size of the database between the different versions than there was for time and memory usage.
| Version | Nodes (k) | Segments (k) | Ways (k) | Relations (k) |
|---|---|---|---|---|
| 1.0 | 12034.2 | 14195.8 | 2022.5 | 0.0 |
| 1.1 | 12034.2 | 14195.8 | 2022.5 | 0.0 |
| 1.2 | 12033.8 | 14195.6 | 2022.4 | 0.0 |
| 1.3 | 12039.5 | 14226.0 | 2023.7 | 0.0 |
| 1.4 | 12058.1 | 14246.5 | 2026.8 | 0.0 |
| 1.4.1 | 12058.1 | 15072.4 | 2026.8 | 0.0 |
| 1.5 | n/a | n/a | n/a | n/a |
| 1.5.1 | 12058.1 | 15042.4 | 2026.8 | 0.0 |
| 2.0 | n/a | n/a | n/a | n/a |
| 2.0.1 | 12058.1 | 14355.2 | 2026.8 | 2.2 |
| 2.0.2 | 12060.6 | 14357.7 | 2026.9 | 2.2 |
| 2.0.3 | 12060.6 | 14357.7 | 2026.9 | 2.2 |
| 2.1 | 12062.4 | 14361.4 | 2027.3 | 2.1 |
| 2.1.1 | 12062.4 | 14361.6 | 2027.3 | 2.1 |
| 2.1.2 | 12062.5 | 14361.7 | 2027.3 | 2.1 |
| Version | Nodes (MB) | Segments (MB) | Ways (MB) | Relations (MB) |
|---|---|---|---|---|
| 1.0 | 91.963 | 270.763 | 28.161 | 0.000 |
| 1.1 | 91.963 | 270.763 | 28.161 | 0.000 |
| 1.2 | 91.959 | 270.759 | 10.195 | 0.000 |
| 1.3 | 92.003 | 271.339 | 12.257 | 0.000 |
| 1.4 | 92.145 | 271.730 | 12.216 | 0.000 |
| 1.4.1 | 92.145 | 287.484 | 12.216 | 0.000 |
| 1.5 | n/a | n/a | n/a | n/a |
| 1.5.1 | 138.143 | 286.911 | 12.439 | 0.000 |
| 2.0 | n/a | n/a | n/a | n/a |
| 2.0.1 | 138.143 | 273.804 | 12.354 | 0.033 |
| 2.0.2 | 138.211 | 273.852 | 12.370 | 0.033 |
| 2.0.3 | 138.211 | 273.852 | 12.370 | 0.033 |
| 2.1 | 138.232 | 273.922 | 12.370 | 0.032 |
| 2.1.1 | 138.232 | 273.926 | 12.370 | 0.032 |
| 2.1.2 | 138.233 | 273.929 | 12.370 | 0.032 |
Version 1.2 combines together ways with identical properties which accounts for the large reduction in the size of the file containing them (although the number remains the same).
Version 1.3 added a lot more highway properties which will reduce the number of identical ways and leads to the 25% increase in the size of the file although the number of ways remains about the same.
Version 1.4 uses the tagging rules in the configuration file and the additional rules compared to the built-in ones earlier explain the small increase in nodes, segments and ways.
Version 1.4.1 changed the algorithm used to find super-nodes which will change the number of super-segments and explains the increase in the number of segments although the number of nodes and ways remains the same.
Version 1.5 added node properties which explains the increase in the file size even though the number of nodes stays the same.
Version 2.0 changed the method of finding super-nodes back to what it was in version 1.4 which explains the reduction in the number of segments.
Version 2.1 adds more tagging rules which explains the increase in the number of nodes, segments and ways.
Routing
The routing was performed by selecting pairs of random nodes from the generated database (using version 2.1) and using these as the start and finish points of routes. A total of 100 pairs of nodes were used and when the test was run not all combinations were routable (typically 90% routed with version 2.x). The statistics presented in the table and graphs are the average for all of the routes that could be found. The routes that cannot be found typically take much longer and introduce a bias.Time and Memory
There are some significant differences in the time taken and memory used for routing between the different versions.
| Version | System Time (s) | User Time (s) | Elapsed Time (s) | Max Resident Set Size (MB) | Percentage Routed |
|---|---|---|---|---|---|
| 1.0 | 0.02 | 0.23 | 0.88 | 133 | 48 |
| 1.1 | 0.02 | 0.28 | 0.66 | 145 | 100 |
| 1.2 | 0.04 | 0.68 | 1.05 | 262 | 100 |
| 1.3 | 0.04 | 0.73 | 1.10 | 259 | 96 |
| 1.4 | 0.05 | 1.01 | 1.46 | 264 | 86 |
| 1.4.1 | 0.05 | 1.49 | 1.94 | 262 | 85 |
| 1.5 | n/a | n/a | n/a | n/a | n/a |
| 1.5.1 | 0.06 | 1.50 | 1.95 | 288 | 89 |
| 2.0 | n/a | n/a | n/a | n/a | n/a |
| 2.0.1 | 0.11 | 11.55 | 11.94 | 257 | 89 |
| 2.0.2 | 0.11 | 11.19 | 11.57 | 256 | 89 |
| 2.0.3 | 0.11 | 11.23 | 11.65 | 256 | 89 |
| 2.1 | 0.11 | 11.20 | 11.57 | 256 | 89 |
| 2.1.1 | 0.06 | 1.86 | 1.95 | 274 | 89 |
| 2.1.2 | 0.03 | 0.75 | 0.79 | 111 | 89 |
There is a gradual increase in routing time between version 1.0 and 1.5.1 as new features are added (like highway properties and node restrictions) that mean that extra factors need to be taken account at each stage.
Version 2.0 added turn relations and because of this the direction of travel is as important as the location when calculating the route. At each intermediate node in the route there will likely be more than one possible route heading in different directions towards the final destination. The large increase in routing time is due to the fact that the hash function used to look up the intermediate nodes had not been modified to expect this. The lookup function was therefore taking a large fraction of the total routing time when it should only be a small part. This change was made in version 1.4.1 and caused no problems until version 2.0 when the routing changed.
Version 2.1.1 corrects the problem described above for the hash function but also adds some other improvements which increased the speed by a further 10%. Version 2.1.2 changes the method used to decide when to stop searching for better routes, this greatly reduces the number of nodes that need to be checked. The method used now is the method that was always intended but wasn't implemented correctly before. This change reduces the time and memory by a factor of 2.5.
Apart from versions 1.0 and 1.1 the amount of memory required to calculate a route remained fairly constant until version 2.1.2.
Conclusions
Versions 1.0 and 1.1 of Routino are quite different from the later versions in their requirements for memory and execution time. They can be considered as quite primitive compared to the later versions.Versions 2.0 through to 2.1 are very much slower at routing because of an inefficient hash function that causes a massive slow-down in looking up intermediate nodes.
Version 2.1.1 corrects the problems of versions 2.0 to 2.1 as well as some further small speed improvements. Version 2.1.2 makes significant changes to the routing speed and memory usage.