Return-Path:
Received: from munnari.oz.au by ray.lloyd.com (4.1/SMI-4.1/Brent-911016)
    id AA10211; Mon, 29 Jun 92 22:21:48 PDT
Received: by munnari.oz.au (5.83--+1.3.1+0.50)
    id AA03570; Tue, 30 Jun 1992 11:14:43 +1000 (from owner-Big-Internet)
Received: from Angband.Stanford.EDU by munnari.oz.au with SMTP (5.83--+1.3.1+0.50)
    id AA03560; Tue, 30 Jun 1992 11:14:28 +1000 (from bsimpson@angband.stanford.edu)
Received: from via.ws13.merit.edu by Angband.Stanford.EDU (5.65/inc-1.0)
    id AA19068; Mon, 29 Jun 92 18:11:05 -0700
Date: Mon, 29 Jun 92 19:59:20 EDT
From: "William Allen Simpson"
Message-Id: <486.bsimpson@angband.stanford.edu>
To: big-internet@munnari.oz.au
Reply-To: bsimpson@angband.stanford.edu
Subject: PIPE
Status: Read

We're having some very long discussions here, but I haven't seen any proposal that does the few simple things that I would like to see. My last attempt to get the group to focus on practical details wasn't well received.

I was enthused last year by Noel's routing ideas. I haven't been able to get too excited by any of the other proposals, and was leaning toward DNAT because it seemed "practical".

I finished a contract on June 15th, and set aside some time to write a draft which is both practical and implements some of the "ideal" that Noel first brought to my attention. It may not get theoretical acclaim, but it should move us in the right direction.


PRACTICAL INTERNET PROTOCOL EXTENSIONS (PIPE)

Abstract

Current IP addresses are used for both host identification and routing. This memo proposes a simple and practical migration to routes using a variable length "path", which is separate from the final host identifier. It extends the concept of "areas" already present in some routing protocols to reduce the size of current routing tables and provide for future expansion of the IP address space.

To reduce the size of routing tables, a pair of IP4 header options is defined to implement area to area source routing.
These variable length options will be present in both IP4 and future IP(n) headers. It is hoped that the simplicity of the extension will ease implementation.

To deal with several perceived limitations of the current IP4 header, an IP(n) header is defined which is very similar to the IP4 header. The similarity should allow IP(n) routers to continue to route IP4 packets with very little effort. It is intended that this migration occur over a period of many years.

The text of this proposal is preliminary, and it is hoped that comments and extensions will be offered to make this document more complete.

Motivation

This author is convinced that the benefit of extending the IP address space by simply making it larger is outweighed by the explosion of routing table space needed for the larger size. Also, a fixed size address is still subject to future exhaustion, since the address is burdened with the need to carry the routing information. Such an address space is likely to be highly fragmented.

Therefore, a scheme is proposed which separates the IP address from the IP routing at the inter-domain level, and can be extended into the intra-domain level. The proposed routing mechanism is extremely hierarchical, which facilitates route folding, and extensible, to adapt to future growth.

In addition, any scheme which requires significantly more network resources for final implementation is doomed to failure. Current Domain Name Service consumes 10% of the bandwidth; doubling this would be a significant problem. This proposal requires no more DNS accesses than are made currently, and does not require new servers. New services such as inter-domain policy routing and mobile routes are handled entirely within the usual routing table update mechanism.

Finally, this proposal does not require any immediate changes to current IP practices at any level, does not require that deployment be universal, and provides an extremely gradual migration path to a new IP header.
The new header is designed to be routed concurrently with older IP by maintaining most fields in their current relative offsets within the IP header.

Terms and Concepts

An "Internet Protocol System Address" (IP-SA) is the number associated with the end producer/consumer of packets. An actual host may have several numbers, perhaps corresponding to different network attachment points, but it is expected that eventually each system will have a single number. This number is also called the "IP address".

A "Routing Area" is defined as a region in the internet routing space which is reachable with a common defined set of quality of service and security. The area may contain other areas. Several area numbers may be assigned to the same region when non-default qualities of service or security are provided.

A "Routing Path" is a sequence of areas which defines the connectivity of an area to a designated root. The leaf areas in a path MUST NOT partially overlap with another leaf. Each set of quality of service or security creates a separate path.

Consider the following topology:

                         Backbone
Router A ----------------------------------- Router B
   |                                            |
   + Host 1.2.3.4                               + Host 6.7.8.9
   |                                            |
   + Router AC ----+------------+---- Router BC +
   |               |            |
   + Host 4.3.2.1 -+            +- Host 9.8.7.6

The default path to these hosts would be defined as:

   0:RA              0:RB
   A:RA              B:RB
   A:1.2.3.4         B:6.7.8.9
   A:4.3.2.1         B:RBC
   A:RAC             B.C:RBC
   A.C:RAC           B.C:9.8.7.6
   A.C:4.3.2.1       B.C:4.3.2.1
   A.C:9.8.7.6

Defining System Addresses

In the example above, note that the System Address is used only in the final hop. It need not be subnetted, nor have any particular structure. This allows a very dense IP address space, and should extend the life of the current globally unique IP address.

When fully deployed, it is only necessary that the IP-SA be unique within a routing area; together with the complete routing path, it would provide unique identification of a host.
However, to promote ease of migration and support for mobile routing and automatic partition repair, the IP-SA is required to be unique within the last three levels of the completely specified routing path.

Defining Routing Areas

Routing Areas will be defined from the hosts upward. A host which is attached to a single network is in the same area as others attached to the same network with the same policy. A host which is attached at two points is in both areas (has two routing paths), regardless of whether it also acts as a router between the areas.

Routing Areas are bounded by routers which are connected to more than one level of area or are separated by policy.

The Area numbers range from 1 to 254. Area 0 means this level. Area 255 means all areas at this level.

Defining Routing Paths

Routing Paths will be defined from the root backbone downward. At each separation of level, and for each quality of service or security, an additional number is assigned to an area.

In the above example, suppose that Host 4.3.2.1 was not participating in a security agreement with all of the other hosts. Even though Host 4.3.2.1 appears on the same network, it would simply not have an entry in the security Routing Paths, and would not be visible to Hosts 9.8.7.6 and 1.2.3.4 for that type of traffic.

Automatic discovery of Routing Paths is possible, but would require cooperation between the hosts and the DNS for the automatic assignment to be propagated. Any host, including routers, would request the path at each point of attachment. As the routers discover their position in the path, they could assign area numbers and respond to the requests that appear on other points of attachment. The hosts could then inform the DNS of the current path(s).

The path may be padded with area zero. When found at the beginning of the path, it means root. When found at the end of the path, it means this level. Area zero may never be found in the interior of a path.
All zeroes may be removed from the path without a change in meaning.

Final Hop Delivery

This definition extends Routing Paths only to a particular area. Delivery to the final destination continues to be accomplished with current methods.

One advantage of this design is that routing paths may be gradually and automatically extended as routing implementations are upgraded. The Routing Path may be specified at the source or the inter-domain router, and used until it is no longer recognized. From that point on, the routing takes place using the current subnet methods.

The separation from local delivery mechanisms allows the design to adapt without change to all current and future link media, and to benefit from improvements in such things as automatic host configuration and router discovery without change to the overall routing scheme.

Another advantage is that mobile systems may continue to receive traffic without changing the Routing Path, by specifying an incomplete path and relying on local routing to make the final delivery. This reduces propagation of changes and hides such details from the global internet.

This feature also facilitates automatic repair of partitioned areas. The path can be locally shortened to a level which includes the partitioned area, and delivery can take place by internal means.

Migration to Path Routing

Initially, the path will simply be the Autonomous System number. These are centrally administered, and are already distributed to backbone routers. The AS number can easily be reorganized as a two level hierarchy with no effect on current backbone routing tables. Backbone routers will add the Routing Path (AS number) to IP4 datagrams, which will improve the speed of routing over subsequent backbone routers.

Next, a new record must be added to the Domain Name System to maintain the default Routing Path for each SOA, and the Routing Path must be returned in the information part of every DNS access, including inverse address lookups.
Eventually, all participating domains connecting to the backbone will be required to add the Routing Path when it is missing from IP4 datagrams. At that point, the massive backbone routing tables may be eliminated. The conversion tables in the border routers are likely to be considerably smaller than in the backbone routers, since most domains do not continuously send to all other domains.

In the long term, every IP4 host will be converted to PIPE, and include the Routing Path in every datagram. After this is completed, system addresses will no longer need to be globally unique.

New IP Header

Note that each tick mark represents one bit position, and bits are numbered from most significant to least significant.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | IHV |   IHL   |  Path Level   |         Total Length          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Identification        |     Flags     | Time to Live  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Protocol           |        Header Checksum        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Source Address                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Destination Address                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options                    |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

IHV: 3 bits

   Internet Header Version indicates the format of the internet header. This document describes version 011 binary.

   The IP4 Version is 0100 binary, but is encoded in a 4 bit field. Therefore, this choice is equivalent to IP4 Version 0110.

IHL: 5 bits

   Internet Header Length is the length of the internet header in 32 bit words, and thus points to the beginning of the data. Note that the minimum value for a correct header is 5. The maximum header length is 124 octets, and a typical header is 28 octets.

   The IP4 IHL field occupied this location.
   The field was expanded in order to provide more option space. The minimum header size is always known from the Version field.

Path Level: 8 bits

   The Path Level indicates the area level to which the sender is routing. A zero indicates the root backbone, a one indicates the first area in the path, etc. This field may be used as an index into the Path Option.

   The IP4 Type of Service field was removed, since TOS was rarely used, and is now encoded in the Routing Path.

Total Length: 16 bits

   Total Length is the length of the datagram, measured in octets, including internet header and data. This field allows the length of a datagram to be up to 65,535 octets. Such long datagrams are impractical for most hosts and networks. All hosts must be prepared to accept datagrams of up to 1480 octets (whether they arrive whole or in fragments). It is recommended that hosts only send datagrams larger than 1480 octets if they have assurance that the destination is prepared to accept the larger datagrams.

   The number 1480 is selected to allow a reasonable sized data block to be transmitted in addition to the required header information, and to allow typical lower-layer encapsulations room for their respective headers.

   IP4 has maximum 65,535, minimum 576 octet datagrams. Over time, memory limitations have eased considerably, and there have been some indications that a larger minimum datagram size throughout the internet would be beneficial. Raising the maximum would primarily benefit those users whose entire path consists of high bandwidth, high delay transmission. Instead, this design trades that off for an increase in the size of the Protocol field.

Identification: 16 bits

   An identifying value assigned by the sender to aid in assembling the fragments of a datagram.

Flags: 8 bits

   Various Control Flags.

      CX: 0 = Not Congested, 1 = Congestion Experienced.
      DF: 0 = May Fragment,  1 = Don't Fragment.
      MF: 0 = Last Fragment, 1 = More Fragments.
      IT: 0 = default,       1 = Interactive Traffic.
                        2
        6   7   8   9   0   1   2   3
      +---+---+---+---+---+---+---+---+
      | C | D | M | I |   |   |   |   |
      | X | F | F | T | 0 | 0 | 0 | 0 |
      +---+---+---+---+---+---+---+---+

   The IP4 Flags field was only 3 bits. The DF and MF flags are located in the same position.

   The CX bit is set by any router where congestion is experienced.

   The IT bit is set by the host or router to indicate interactive traffic within a particular quality of service or security.

Time to Live: 8 bits

   This field indicates the maximum time the datagram is allowed to remain in the internet system. If this field contains the value zero, then the datagram must be destroyed. This field is modified in internet header processing. The time is measured in units of seconds, but since every module that processes a datagram must decrease the TTL by at least one even if it processes the datagram in less than a second, the TTL must be thought of only as an upper bound on the time a datagram may exist. The intention is to cause undeliverable datagrams to be discarded, and to bound the maximum datagram lifetime.

   The IP4 Fragment Offset field was removed, and replaced with an IP Option. The TTL field was moved to a new location, where it is easier to decrement on most hardware.

Protocol: 16 bits

   This field indicates the next level protocol used in the data portion of the internet datagram. The values for various protocols are specified in "Assigned Numbers" [9].

   The size of the protocol field was increased, since there is concern that 255 numbers may be too few over the extended life of the IP protocol.

Header Checksum: 16 bits

   A checksum on the header only. Since some header fields change (e.g., time to live), this is recomputed and verified at each point that the internet header is processed.

   The checksum algorithm is:

      The checksum field is the 16 bit one's complement of the one's complement sum of all 16 bit words in the header. For purposes of computing the checksum, the value of the checksum field is zero.
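The checksum rule just stated can be sketched in Python. This is only an illustrative sketch, not part of the proposal: the function names are hypothetical, and it assumes the Header Checksum occupies octets 10 and 11 of the header, which is where the header diagram places it (directly after the 16 bit Protocol field).

```python
CHECKSUM_OFFSET = 10  # assumption: octets 10-11, per the header diagram


def ones_complement_sum(data: bytes) -> int:
    """Add all 16 bit big-endian words, folding carries back into the low 16 bits."""
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def compute_header_checksum(header: bytes) -> int:
    """Return the checksum of a header, treating the checksum field as zero."""
    if len(header) < 20 or len(header) % 4:
        raise ValueError("header must be at least 5 whole 32 bit words")
    zeroed = (
        header[:CHECKSUM_OFFSET]
        + b"\x00\x00"
        + header[CHECKSUM_OFFSET + 2:]
    )
    return ~ones_complement_sum(zeroed) & 0xFFFF


def verify_header_checksum(header: bytes) -> bool:
    """Check the stored checksum against a freshly computed one."""
    stored = int.from_bytes(header[CHECKSUM_OFFSET:CHECKSUM_OFFSET + 2], "big")
    return stored == compute_header_checksum(header)
```

A router that changes any header field (for example, the Time to Live) would call compute_header_checksum again and store the result before forwarding.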
   This is a simple to compute checksum and experimental evidence indicates it is adequate, but it is provisional and may be replaced by a CRC procedure, depending on further experience.

Source Address: 32 bits

   The source System Address.

Destination Address: 32 bits

   The destination System Address.

New Option Definitions

Path

   +--------+--------+--------//--------+
   |  type  | length |       path       |
   +--------+--------+--------//--------+

   Type = ??

   The Length is the number of octets in the option counting the type, length, and path. The length will always be a multiple of 4.

   The Path is a list of areas, as described above. The path may be padded with trailing zeroes to align on 32 bit boundaries.

   Must be copied on fragmentation. This option usually appears twice in a datagram. The first (destination) Path must always appear immediately after the fixed portion of the header. The second (source) Path must always appear after the first.

   The Path option is compatible with IP4. However, it MUST NOT be added to datagrams already containing the Loose Source Route or Strict Source Route options.

Fragment

   +--------+--------+--------+--------+
   |  type  | length |000    Offset    |
   +--------+--------+--------+--------+

   Type = ??
   Length = 4
   000 = currently zero

   Offset

      This field indicates where in the datagram this fragment belongs. The fragment offset is measured in units of 8 octets (64 bits). The first fragment has offset zero.

   Must be copied on fragmentation. Appears at most once in a datagram.

   This option provides for fragmentation and reassembly of PIPE datagrams in a similar fashion to IP4. Fragmentation handling requires expanding and contracting the header; adding an additional option at that time should not be difficult.

   Since many believe that fragmentation is evil, this option may not be necessary.

Bill.Simpson@um.cc.umich.edu
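The Path option layout defined above can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than a definitive encoding: the option type is still unassigned ("??"), so it is passed in as a parameter, and each area is assumed to occupy one octet, which the 1 to 254 area range suggests but the text does not state. The function names are hypothetical.

```python
from typing import List, Tuple


def encode_path_option(option_type: int, areas: List[int]) -> bytes:
    """Build type, length, path, padded with trailing zeroes to a multiple of 4."""
    for area in areas:
        if not 1 <= area <= 254:
            raise ValueError("area numbers range from 1 to 254")
    unpadded = 2 + len(areas)          # type octet + length octet + path
    length = -(-unpadded // 4) * 4     # round up to a 32 bit boundary
    if length > 255:
        raise ValueError("path too long for a one octet length field")
    padding = bytes(length - unpadded)
    return bytes([option_type, length]) + bytes(areas) + padding


def decode_path_option(option: bytes) -> Tuple[int, List[int]]:
    """Return (type, areas), dropping leading and trailing zero padding."""
    option_type, length = option[0], option[1]
    if length % 4 or length != len(option):
        raise ValueError("length must be a multiple of 4 and match the option")
    path = list(option[2:length])
    while path and path[0] == 0:       # leading zero means root
        path.pop(0)
    while path and path[-1] == 0:      # trailing zero means this level
        path.pop()
    if 0 in path:
        raise ValueError("area zero may not appear in the interior of a path")
    return option_type, path
```

Rejecting an interior zero on decode follows the rule that area zero may appear only at the beginning or end of a path.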