An experiment on transfer patterns robustness in the presence of real-time updates

This page provides the theoretical background needed to understand **transfer pattern routing**. The notion of **transit networks** and their representation as **GTFS feed** is introduced.

Road networks can be effectively modeled as **directed graphs**, where the nodes correspond to geographical positions and the arcs are used to depict the road connections between places. The arc costs amount to the time of travel along a road and are computed from the distance between two nodes it connects and the respective road type.

Whereas road network connections are independent of date and time, connectivity on transit networks depends on **time tables**. You cannot take a bus from the public garden to the main station at 3am, if the earliest bus goes at 6pm. Therefore the modeling of public transportation demands a different approach. There are two different representations of transit networks: the time-dependent and the time-expanded graph.

In a **time-dependent** graph there is one node for each stop and there are arcs connecting the nodes with a cost function depending on the departure time. The advantage of such a model is its **compact size**, but routing on it demands more complex search algorithms.

The **time-expanded** approach is what we use for the Transit Planner. In this model each node represents a stop at a specific time and each arc represents an actual connection or transfer. The main advantage of the time-expanded model is its **general domain-independent form**, making general shortest-path search algorithms like Dijkstra applicable. The trade-off is its bigger size. More details on the time-expanded graph is provided in the sections on the data structure for transit networks.

The GTFS is the de facto standard for the description of public transportation networks. A feed specification consists of multiple files describing various aspects of a transit network. In this text, we only describe elements of the GTFS, which are supported by Transit Planner's GTFS parser. For a complete specification of the stanard, please read the GTFS reference.

The file `stops.txt`

describes all public transportation stations, like bus stops, way or metro stations. In `trips.txt`

a set of trips is defined. Each trip has a unique id and is associated with a service id. Service groups like *on sunday* or *on workdays* describe the days on which a trip is active. For each service id, an entry in `calendar.txt`

provides information on weekday activity. For each trip, the file `stop_times.txt`

lists successively served stops with corresponding arrival and departure times. Furthermore Transit Planner's network parser supports the optional file `frequency.txt`

, which provides information about periodical trips, in order to reduce redundancy in `stop_times.txt`

.

State-of-the-art routing algorithms are not suitable for transit networks. The main reason is the difference between road and transit networks: whereas in road networks there is just one arc between two places, there is one arc between two stops for each connection in a transit network. Moreover each arc in a transit network is on at least one optimal path. The need for a routing algorithm exhibiting the speciality of transit networks arises.

Two observations motivate the idea behind **transfer patterns**. The first is that wherever you travel to, the optimal path has only a **limited number of transfers**. Furthermore the optimal path between two stops depending on the departure time is one of a **limited set of paths**. This set of paths from one stop to another is called transfer pattern.

Storing the transfer patterns for every pair of stops has quadratic space complexity. Besides, computing each pattern involves a full Dijkstra-search on the transit network. On a large network as for *New York City* a straight-forward computation of transfer patterns is rather impractical, if not impossible. Therefore the notion of **hub stations** is introduced. These correspond to the stops, where most transfers happen. Given a suitable set of hubs, we only compute and store the transfer patterns for each stop in the local neighborhood, until the nearest hubs. For each hub station a full Dijkstra-search is applied. Thereby, we drastically reduce the transfer patterns precomputation time and memory consumption.

In **real-world scenarios** public transportation vehicles are often **delayed** or even **canceled**. Trip delays change the network in such a way, that previously optimal paths may become non-optimal and new optimal routes can emerge. The efficiency of the search via transfer patterns is based on its **costly precomputation** of all shortest paths in the transit network. To deal with real-time updates, the recomputation of all transfer patterns must be avoided.

One aspect of our experiment is to evaluate, whether the originally precomputed transfer patterns are sufficiently roboust or redundant, such that most of the newly created optimal paths are already contained in them. If so, transfer patterns routing would be applicable for transit networks with dynamically changing trips and connections, without the costly precomputation of the transfer patterns for the network.