Routing in Delay Tolerant Networks Revision History 04.28.04 Initial Revision, mh I. Introduction This document covers the routing requirements for Delay Tolerant Networking. Our objective is to minimize the delay of bundles, and in doing so maximize the probability of delivery, and reduce resource (buffer and network) contention. II. Scenarios A. Scheduled: Interplanetary Internet Nodes in this network are characterized by high latency links that attenuate with distance, leading to variable bandwidth. In addition, they experience frequent scheduled disconnection due to orbital visibility constraints. Regional boundaries may be established along geographical lines, or the entire Interplanetary Internet can be considered to be a region unto itself. If the latter, then all routing within the region is intra-regional, and any identifiers for particular nodes in the internet are in the admin id. If the former, then the routing layer will determine the best next hop necessary to reach the destination region using parameters in its inter- regional routing table. B. Internetworking: Sensor Networks->Gateway->Internet There are several approaches to enabling sensor networks to communicate reliably with the internet. In one scenario, each sensor is a DTN node, and uses custody transfer to ensure delivery to a DTN gateway at the edge of the sensor net. In another scenario, a DTN mule walks among the sensors on a scheduled route, collects the data, and transports it to a DTN gateway, which serves as an interface between the sensor region and the internet region. C. Unreliable: Wireless Radio Wireless radio links are characterized by frequent and unpredictable disconnection, due to interference, power management or other issues. Using DTN, a node can choose to transfer bundles to another node that is "closer" to the destination, or to store the bundle until it sees the destination node. In most cases this will entail intra-regional routing and the routing mechanism will be entirely up to the region administrator. For other regions to communicate with a wireless ad hoc region, however, they will need to know the predicted/probabilistic delay between the gateways in the region, as well as the frequency of contacts. (Delay is the latency once a bundle has been sent through the convergence layer, which includes in-network delay if buffering is required at intermediary nodes. The frequency of contacts determines the additional predicted delay due to initial disconnection. D. Opportunistic: Military Ad Hoc Some connections are opportunistic, meaning that they are not necessarily predictable, but are still useful when available. In some cases, the appearance of an opportunistic connection will allow earlier transmission than expected (which would require the ability to rechedule bundles already designated for a particular contact). In other cases, one could depend entirely on opportunistic connection for intercommunication between regions (or within). Thus a bundle could be scheduled for the "next available" contact for a particular next hop, or it could be placed on an unscheduled list, for review next time a contact is available. Like scenario C, particular nodes may have an expected probability of next contact. A region might correspond to the nodes that are expected to be together at all times (such as a particular unit), or it may correspond to the set of nodes that may be in contact over time. In the latter case, routing between nodes is delegated to the region and not governed by the dtn specification. E. Contact Discovery: Datamule Bus(es) passing a PDA Given a small DTN node that knows very little about the current routing topology, it should be able to make use of opportunistic contacts as they become available. A new contact might advertise the length of time the contact will be available, the bandwidth, the buffer capacity, and what regions it knows. Some of these calculations can be done in the convergence layer, and other Most likely, though, the contact duration will not be advertised, and both sides will employ reactive fragmentation to make maximum use of the connection. Potential issues are link congestion (what happens if everyone attacks the link at once) and reliability. * How does the PDA know that this contact will relay the bundle successfully? * How can custody acknowledgements or application data be sent to the PDA? * Would it make sense to employ gps-based geographic routing and attempt to send the data to the general physical area where the pda was last found? * How would that impact the inter-regional routing? * Where are the region boundaries in this case? F. Multi-protocol: Personal regions Here I describe the scenario in which one region supports multiple protocols. The classic example is the "Internet region" which will support requests from a number of protocols, including gprs, tcp, and udp. Another conceivable application is the concept of the "personal" region. Since region identifiers are used to aid routing between regions, one could conceive of a region mapping to your set of personal DTN nodes, including a laptop, your phone, and a number of bluetooth devices. The laptop and phone can serve as gateways, and intra-regional routing can direct bundles to the interface/convergence layer corresponding to the other devices. Each gateway has convergence layers for the interfaces it supports. G. Internetworking: Village Scenario (from Sushant's paper) A village kiosk is connected to a nearby town via a digital courier (man on motorbike), a wired dialup Internet connection, and a LEO. Actually, I'll take the paragraph straight out of the paper: "We begin with a hypoteical village equipped with a digital courier, a wired dialup Internet connection, and a LEO (e.g. PACSAT) satellite store-and- forward connection. These satellites have low to moderate bandwidth (around 10kbps) and are visible for 4-5 short periods of time ('passes') per day (lasting arund 10 minutes per pass, depending on orbit inclination and location on Earth). We call the opportunity to communicate a contact, which is characterized by a duration of time, a capacity, ad a latency (assumed to remain constant during the contact duration). In addition, depending on the type of connection used, finite buffering constraints related to the contact may need to be considered. The digital courier service thus represents a high-bandwidth, high-latency contact, the dialup represents a low-bandwidth, low-latency contact, and the LEO satellite represents a moderate-bandwidth, low-latency contact. The problem of selecting which contacts to carry traffic (and consequently what time to send traffic) represents an instance of the DTN routing problem..." III. Definitions A. Tuple: A pair of identifiers is used in routing a message to its final destination. The region id is used throughout the DTN to get the bundle to its destination region, and the admin id is used within the region. B. Inter-regional Routing: Bundles destined for other regions are routed using a well-known routing standard. This may entail, but is not limited to, hand configured tables, machine-learned, or dynamic routing updates. C. Intra-regional Routing: This is region-specific routing for bundles that are already within their target region. This allows each node in a region to decide whether to try and deliver the bundle directly to the app, or to another DTN node within the same region. The exact implementation and mechanism may differ from region to region. D. Contact: A period of time when data can be transmitted over a particular interface. E. Custody Overlay: There is a question of whether special routing should be done to send IV. Bundle Layer Interface A. Inter-regional Routing - non-matching region identifier Given a bundle, the routing layer will determine which next hops are best, given the destination region, bandwidth and storage requirements, and whether custodial transfer has been requested. Specific bundle parameters that affect this decision are priority, custody transfer,, bundle size, and destination region id. The destination admin id does NOT affect this decision. Given a particular next hop, the convergence layer is responsible for getting the bundle to that particular node, and may employ intermediary hosts not within the DTN overlay. The routing layer should return a set of next hop identifiers, together with the corresponding fragment offsets and lengths. The routing layer may choose to replicate some or all of the bundle across different contacts. In addition, the routing layer may choose to delegate scheduling until later, if no known path is yet available. B. Intra-regional Routing - matching region identifier Once a bundle has reached its destination region, the bundle daemon can use the admin id to determine the next end point. Design and implementation of this mechanism is left up to the region administrator, but may use a similar mechanism as inter-regional routing. C. Non-DTN Routing - between DTN hops Between DTN hops, it is up to the underlying technology to determine which path is the most appropriate for delivering bundles between DTN nodes. For ad hoc regions, this may entail AODV, DSDV, or even epidemic routing. V. Routing Considerations The following is the data we can take into account in determining the best set of routes for a given bundle. We have the option of choosing one or more routes A. Contacts Schedule This includes pre-arranged, recurring, and probabilistic contacts. The probabilistic contacts may involve some level of machine learning, but if all of your contacts are recurring, then this can be set by hand and ignored thereafter. This schedule should be modifiable by a management interface. Information in this table includes latency and bandwidth for each contact. Information may also include buffer availability. Note: Buffer availability may also be used as a form of hop-by-hop congestion control. B. Contact->Region Map Particular contacts can reach specific regions. If this is an oracle, it would include the number of DTN hops to, the effective throughput of, and the frequency of contact with each region. C. Region->Region Map Some regions may be named hierarchically, in which case regions can be matched by longest match. In other cases, we know that we can reach certain regions through other regions. For example, the sensor network gateway may not recognize the interplanetary region, but has an entry for "*" to some Internet gateway that might know. Eventually, bundles routed to "*" entries will reach a gateway that can identify the region. D. Traffic Demand Schedule In anticipation of higher traffic demand, a routing algorithm may be able to optimize by distributing load, either by proactively fragmenting the bundle or by sending bundles on different routes via some form of round robin routing. Note: This will probably not be a factor in routing. E. Queueing Considerations The contacts schedule should track immediate Queue state. However, if it is known that a downsteam queue will not have sufficient capacity, the routing layer may decide to use a different path, to prevent future loss or delays. F. Custodial Nodes If we are using custody transfer for the bundle, we may choose a route that has more custodians, rather than a route without custodians. VI. Cost Factors (from Sushant's paper) Delay Components: * queuing time: time until a contact becomes available * transmission delay: time to inject a message into an edge * propagation delay: time for a message to arrive at next hop Route Stability: * how often do topology changes occur? Is source routing viable? * proactive vs reactive routing * scheduled routes: for recurring contacts, we can set up known scheduled routes, allowing for pre-planning of buffer space and load * with delay tolerant networking, there is no way to propagate route changes instantaneously * source routing vs per hop routing: - network queuing time may be large compared to forwarding delay - topological changes will probably occur while a message is in transit - potential loops Resource Consumption: * a routing protocol will consume some amount of overhead * the decision to replicate must be weigh the increased reliability against overhead of additional traffic Fragmentation * routing layer may choose to split a bundle across multiple subsequent contacts, if the throughput of a given contact is insufficient to transport the whole thing * routing layer may choose to splite a bundle across differnt paths entirely to improve load balancing and reduce delay (e.g. you have 14 phone lines and the node two hops down as a T1 to the destination, so you use all 14 at once to send the message)