Network Working Group                                             L. Wei
Internet-Draft                                                  A. Karan
Intended status: Experimental                                    N. Shen
Expires: April 19, 2010                                           Y. Cai
                                                       cisco Systems Inc
                                                            M. Napierala
                                                               AT&T Labs
                                                        October 16, 2009


     Tunnel Based Multicast Fast Reroute (TMFRR) Extensions to PIM
                      draft-lwei-pim-tmfrr-00.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 19, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

L. Wei, et al.           Expires April 19, 2010                 [Page 1]

Internet-Draft          TMFRR Extensions to PIM             October 2009

Abstract

   This specification defines simple extensions to PIM to support Tunnel
   based Multicast Fast Reroute (TMFRR) utilizing different tunnel
   technologies.  This mechanism protects native IP multicast traffic
   against link or node failures with minimal traffic interruption,
   compared with approaches that rely on unicast and multicast route
   convergence.  The repair for link or node failure happens locally
   near the point of failure.

   This version of the specification addresses PIM-SM and SSM traffic
   protection.

Table of Contents

   1.  Requirements Notation
   2.  Introduction
     2.1.  Terminology
   3.  Link Protection
     3.1.  MP Allocated Special Label
   4.  Node Protection
   5.  The Effect of Downstream Routing Convergence
   6.  New PIM Extension
     6.1.  TMFRR Hello Option
     6.2.  TMFRR Join Attribute
   7.  Selective Traffic Protection by MP
     7.1.  Selective Mode Operation on MP
   8.  Security Considerations
   9.  Acknowledgments
   10. References
     10.1. Normative References
     10.2. Informative References
   Authors' Addresses

1.  Requirements Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Introduction

   This document proposes a mechanism to support tunnel based fast
   reroute protection for IP multicast traffic.
   The tunnel is routed around the susceptible link or node, and is set
   up when multicast routing state is established.  When a failure is
   detected, traffic is redirected from the primary interface to the
   backup tunnel without waiting for routing to converge.

   To allow such a tunnel to be set up, an upstream router of the
   protected link or node needs information about the router downstream
   of the link or node under protection.

   For link protection, the tunnel end points are on two PIM neighbor
   routers.  The tunnel setup information is conveyed via the new PIM
   TMFRR hello option.

   For node protection, the tunnel setup information is first conveyed
   via the PIM TMFRR hello option from the downstream router to the node
   under protection, then propagated one more hop, via a new PIM join
   attribute [RFC5384], to the RPF neighbor of the protected node.

   This work has been revised and enhanced from prior work [NFRR]
   [PIM-NNHOP] that extended RSVP-TE and PIM to do fast reroute
   protection of IP multicast traffic.  In comparison, this proposal
   does not require an RSVP-TE extension, and can utilize either an MPLS
   or an IP tunnel.

2.1.  Terminology

   The following defines the terms used in this draft.

   TMFRR:  Tunnel based multicast fast reroute.

   Point of Local Repair (PLR):  The router or node that redirects
      traffic to the preset backup path when a failure in the primary
      path is detected.

   Merge Point (MP):  The router where the traffic from the backup path
      gets "merged" back into the original forwarding path.

   Protected IP address:  The IP address on the MP's interface for the
      link under protection, or on the link leading to the node under
      protection.

   Multicast route:  An (S,G) or (*,G) entry in ASM or SSM mode.

   TMFRR label:  The MPLS label an MP allocates for a protected link or
      node.

3.  Link Protection

   The following picture depicts a protected multi-access link between
   PIM routers MP1, MP2 and PLR.

                          ------
                ----------| R1 |---------
                |         ------        |
                |                       |
              -----  A  |               |
   Rcv1 .. ---|MP1|-----|   Protected   | D
              -----     |     Link    -----
                        |-------------|PLR|-----... Source
              -----  B  |          C  -----
   Rcv2 .. ---|MP2|-----|               |
              -----                     |
                |         -----         |
                ----------| R2|----------
                          -----

                       Figure 1: Link Protection

   MP1 and MP2 have downstream receivers for traffic from the source.
   Both can reach the PLR along alternative reverse paths, via R1 and R2
   respectively.  The backup tunnel can be any MPLS or IP tunnel.  The
   following discussion uses MP1 as an example.

   Each MP transmits its router ID in the PIM TMFRR hello option.  For
   each distinct router ID the PLR learns, it sets up a tunnel with
   itself as the source, and with the MP's router ID as the destination.
   This tunnel is routed around the protected IP address on the MP, so
   that it avoids going through the protected link or node.  The
   algorithm to route the backup path around the protected link is
   outside the scope of this specification.

   For the PLR to learn reliably which multicast routes MP1 has joined
   and which ones MP2 has joined, PIM join suppression must be turned
   off on the protected link, and PIM explicit tracking of downstream
   joiners must be enabled on the PLR.

   After the tunnel is set up, if the PLR in Figure 1 detects a failure
   on link C, the PLR stops forwarding traffic on interface C, and
   starts to forward traffic onto the backup tunnel.

   Since interface C has gone down, the PLR's PIM implementation may
   remove it from the multicast route's outgoing interface list.  This
   may lead to a PIM prune being sent toward the source, disrupting
   traffic flowing into the backup tunnel.  To prevent this prune from
   interrupting traffic that the backup tunnel needs for the duration of
   the repair, the PLR must hold off the transmission of this prune for
   PRUNE_HOLD_OFF seconds.  The default value for PRUNE_HOLD_OFF is TBD.

   When link C comes back up, the PLR adds it back to the outgoing
   interface list, resumes forwarding traffic onto link C, and stops
   forwarding traffic onto the backup tunnel.  An implementation may
   delay switching back onto the primary path to avoid thrashing due to
   quick link flaps.

3.1.  MP Allocated Special Label

   An MP needs to accept and forward traffic received from the primary
   RPF interface and from the backup tunnel.  When an MP receives
   traffic from a backup tunnel and merges it into the primary
   forwarding path, it needs to verify that the traffic came from a
   backup TMFRR tunnel.

   To identify such packets, an MP allocates an MPLS label for each link
   it acts as MP for, and conveys that label to the PLR via the PIM
   TMFRR hello option.  This label is also referred to as the TMFRR
   label.

   When the PLR sends traffic into the backup FRR tunnel, it pushes the
   TMFRR label onto the packet before doing the tunnel encapsulation.

   The MP caches the mapping between an RPF interface and its TMFRR
   label, and accepts packets with the TMFRR label as an alternative to
   a successful RPF check.

4.  Node Protection

   The following figure shows R as the node to be protected.  Similar to
   the link protection scenario in Section 3, there are two alternative
   paths between the PLR and MP1, MP2.

                          ------
                ----------| R1 |----------------------
                |         ------                     |
                |                                    |
              -----  A  |           Protected        |
   Rcv1 .. ---|MP1|-----|             Node           | D
              -----     |    Link     -----        -----
                        |-------------| R |--------|PLR|----... Source
              -----  B  |             -----     C  -----
   Rcv2 .. ---|MP2|-----|                            |
              -----                                  |
                |         -----                      |
                ----------| R2|-----------------------
                          -----

                       Figure 2: Node Protection

   The MPs send PIM hellos with the TMFRR hello option, as described in
   Section 3.  R caches the router ID and TMFRR label for each MP.  When
   R sends a (*,G) or (S,G) join toward the PLR, it includes a TMFRR
   Join Attribute for each MP downstream.

   In order for router R to know which MP has joined which multicast
   route, PIM join suppression must be turned off on the link between
   MP1, MP2 and R, and PIM explicit tracking must be turned on on router
   R.

   The operations of backup tunnel setup and traffic redirection are
   similar to the link protection operations described in Section 3.

   Note that if a multicast route has receivers behind multiple MPs, as
   shown in the figure above, the PLR will set up multiple tunnels, one
   tunnel for each MP it has learned.

5.  The Effect of Downstream Routing Convergence

   When a link or a node protected by TMFRR fails, the multicast tree
   upstream of the PLR is intact for the PRUNE_HOLD_OFF period.
   However, the failure may result in a routing convergence event
   downstream of the MP and may cause that part of the multicast tree to
   route away from the failure point.  This will cause traffic
   interruption.

   To prevent this from happening, an end-to-end, make-before-break
   solution may be implemented at all routers downstream of the MP.  The
   make-before-break solution ensures that the new multicast tree is
   receiving traffic at the receiver before pruning off the old tree.

   The description of the make-before-break enhancement is outside the
   scope of this document.

6.  New PIM Extension

   This section defines the new TMFRR hello option and TMFRR join
   attribute.

6.1.  TMFRR Hello Option

   This option conveys the TMFRR setup of the sending router.
   It has a short and a long format, determined by the value of the L
   bit.

   Short format,

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       OptionType = TBD        |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Reserved                         |N|M|L|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Long format,

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       OptionType = TBD        |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              TMFRR Label              |    Reserved     |N|M|L|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Router ID in Encoded Unicast Address Format          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   OptionType:  To be assigned by IANA.

   Length:  4 + L * x, where L is defined below and x is the size of the
      router ID field.

   L:  0 for the short format; 1 for the long format.

   Reserved:  Set to 0 when sent, ignored when received.

   N:  Node protection bit.  It is set to 0 for link protection on this
      interface, 1 for node protection.  This bit informs the downstream
      PIM neighbor whether the sending router accepts the TMFRR join
      attribute for node protection.

   M:  MoFRR bit.  Set by MPs.  It is set to 1 if traffic must always be
      put on the backup tunnel.  It is set to 0 if traffic is put on the
      backup tunnel only when the primary link or node fails.

   TMFRR Label:  Locally assigned 20-bit MPLS label.  An MP must use the
      long format of this hello option, and offer a non-zero label
      value.

   Router ID in Encoded Unicast Address Format:  Router ID of the MP.
      It is the backup tunnel's destination address, and needs to be
      routable from the PLR to the MP.  The encoded unicast address is
      specified in [RFC4601], in the following format,

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Addr Family  | Encoding Type |     Unicast Address
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...

   With this encoding format, it is possible to have multicast traffic
   protected by tunnel technology of a different address family.  For
   example, an IPv6 multicast traffic stream can be protected by a
   traffic engineered IPv4 tunnel.

   A receiving router needs to cache all values in this option and
   associate them with the sending PIM neighbor.  These values are
   cached until the PIM neighbor expires, or until new values are
   received to replace the old ones.  These values are invalidated by a
   new PIM hello that does not contain a TMFRR option.

   If the receiving router is a PLR, it must clean up its tunnel setup
   state for an old router ID before setting up state for the new backup
   tunnel.

6.2.  TMFRR Join Attribute

   The PIM Join attribute is described in [RFC5384].  The PIM TMFRR join
   attribute is introduced here to support node protection.  The
   protected node caches the MP's Router ID and a locally assigned label
   value for each MP, then transmits this set of information in the
   TMFRR join attribute to the PLR.

   The following is the format of this new attribute.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|E| Attr Type |    Length     |          Reserved           |M|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              TMFRR Label              |       Reserved2       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Router ID in Encoded Unicast Address Format          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Address of Downstream Interface under protection
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...

               Figure 3: PIM TMFRR Join Attribute Format

   The above format is used only for entries in the join-list section of
   the Join/Prune message.

   F bit:  0, Non-Transitive Attribute.

   E bit:  As specified by [RFC5384].

   Attr Type:  To be assigned by IANA.

   Length:  Number of bytes in this join attribute.

   Reserved:  Set to 0 when sent, ignored when received.

   M-bit:  Copied from the M-bit in the TMFRR hello option.

   TMFRR Label:  The label value to be conveyed to the PLR.  It is
      copied from the TMFRR hello option's label value field.

   Reserved2:  Set to 0 when sent, ignored when received.

   Router ID in Encoded Unicast Address Format:  Router ID of the MP.
      It is usually an address associated with a virtual interface, such
      as a loopback interface.

   Address of Downstream Interface Under Protection:  The MP's address
      on the interface connecting to the protected node.  The size of
      this field is determined by the address family of the source and
      group addresses in the same Join/Prune record.

   The PIM TMFRR join attribute may appear multiple times for the same
   multicast route, once for each MP downstream.
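
   As an illustration only, the layout in Figure 3 could be serialized
   along the following lines.  This is a sketch under several
   assumptions that the draft does not settle: the Attr Type value
   (0x3F here) is a placeholder, since the real value is to be assigned
   by IANA; the 20-bit label is assumed to occupy the high-order bits of
   its 32-bit word; Length is taken to count the octets that follow the
   Length field, which is the [RFC5384] reading; and the function names
   are hypothetical.

```python
import ipaddress
import struct

# Hypothetical placeholder: the real Attr Type is to be assigned by IANA.
TMFRR_ATTR_TYPE = 0x3F


def encode_unicast(addr: str) -> bytes:
    """PIM Encoded-Unicast address: family, encoding type 0, raw address."""
    ip = ipaddress.ip_address(addr)
    family = 1 if ip.version == 4 else 2  # IANA address family numbers
    return struct.pack("!BB", family, 0) + ip.packed


def pack_tmfrr_join_attr(label, router_id, protected_addr,
                         m_bit=False, last=True):
    """Serialize a long-format TMFRR join attribute (F bit = 0)."""
    if not 0 < label < (1 << 20):
        raise ValueError("TMFRR label must be a non-zero 20-bit value")
    flags = 1 if m_bit else 0   # Reserved (15 bits) = 0, then the M bit
    label_word = label << 12    # assumed: label in high 20 bits, Reserved2 = 0
    value = (struct.pack("!HI", flags, label_word)
             + encode_unicast(router_id)
             + ipaddress.ip_address(protected_addr).packed)
    first = (int(last) << 6) | TMFRR_ATTR_TYPE  # F = 0, E bit, Attr Type
    # Assumption: Length counts the octets after the Length field.
    return struct.pack("!BB", first, len(value)) + value


attr = pack_tmfrr_join_attr(100, "192.0.2.1", "198.51.100.2", m_bit=True)
print(attr.hex())
```

   With an IPv4 router ID and an IPv4 protected address, the attribute
   built this way is 18 octets long.  Protected node R would emit one
   such attribute per downstream MP in the join it sends to the PLR.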

   The PIM TMFRR join attribute is uniquely identified by the MP's
   address on the interface connected to the protected node.

   The values in the PIM TMFRR join attribute must be cached and
   associated with the corresponding multicast route, until the
   multicast route expires, or until new values are received to replace
   the old values.  When a PIM join is received without a TMFRR join
   attribute for the same MP's interface address, all cached values from
   that TMFRR join attribute must be invalidated.

7.  Selective Traffic Protection by MP

   When there is a need to selectively protect only the more critical
   multicast flows, it is operationally more convenient to make the
   selection on the MP.  The selection needs to be conveyed to the PLR,
   which will redirect traffic to the backup tunnel only for the flows
   the MP has chosen.

   The MP can tag the selected multicast routes with a shortened TMFRR
   join attribute.  A new L bit is introduced to differentiate this
   format from the long format defined in Figure 3.  The receiver of
   this join message treats the presence of this attribute as an
   indication that the multicast route is to be TMFRR protected.

   The following is the short TMFRR join attribute format,

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|E| Attr Type |    Length     |         Reserved          |L|M|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   L:  Long format bit.  0 for the short format; 1 for the long format.

                Figure 4: Extended TMFRR Join Attribute

   For link protection, the PLR associates the backup tunnel with the
   OIF for the multicast routes that had the TMFRR attribute attached in
   the previous join received.
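
   The short attribute and the PLR-side selection described above can be
   sketched as follows.  This is illustrative only: the Attr Type value
   (0x3F) is a placeholder pending IANA assignment, Length is assumed to
   count the octets after the Length field, and the function names and
   the (route, attribute) input shape are hypothetical.

```python
import struct

# Hypothetical placeholder: the real Attr Type is to be assigned by IANA.
TMFRR_ATTR_TYPE = 0x3F

L_BIT = 0x0002  # long-format flag, second-lowest bit of the 16-bit field
M_BIT = 0x0001  # MoFRR flag, lowest bit


def pack_short_tmfrr_attr(m_bit: bool, last: bool = True) -> bytes:
    """Encode the 4-octet short TMFRR join attribute (F = 0, L = 0)."""
    first = (int(last) << 6) | TMFRR_ATTR_TYPE  # F = 0, E bit, Attr Type
    flags = M_BIT if m_bit else 0               # Reserved = 0, L = 0
    # Assumption: Length counts the octets after the Length field.
    return struct.pack("!BBH", first, 2, flags)


def parse_tmfrr_attr_header(data: bytes) -> dict:
    """Decode the first 4 octets shared by the short and long formats."""
    first, length, flags = struct.unpack("!BBH", data[:4])
    return {
        "f": bool(first & 0x80),
        "e": bool(first & 0x40),
        "attr_type": first & 0x3F,
        "length": length,
        "long_format": bool(flags & L_BIT),
        "m": bool(flags & M_BIT),
    }


def protected_routes(joins):
    """Return the routes whose most recent join carried a TMFRR attribute.

    `joins` is an iterable of (route, attr_bytes_or_None) in arrival order.
    """
    protected = set()
    for route, attr in joins:
        tagged = (attr is not None and
                  parse_tmfrr_attr_header(attr)["attr_type"] == TMFRR_ATTR_TYPE)
        if tagged:
            protected.add(route)
        else:
            protected.discard(route)
    return protected


tag = pack_short_tmfrr_attr(m_bit=False)
joins = [(("10.0.0.1", "232.1.1.1"), tag), (("10.0.0.2", "232.1.1.2"), None)]
print(protected_routes(joins))
```

   Only the first route ends up in the protected set, so only that route
   would have the backup tunnel associated with its OIF.  A later join
   for the same route without the attribute removes it from the set,
   matching the "previous join received" wording above.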

   For node protection, in each multicast route, the node under
   protection (router R in Figure 2) keeps track of the set of MPs on
   the OIF that have the TMFRR attribute attached in the last join.  In
   the next PIM join sent to the PLR, R adds the TMFRR join attributes,
   in long format, for all MPs in the set.

7.1.  Selective Mode Operation on MP

   The selective mode of operation needs to co-exist with the general
   mode described in Section 3 and Section 4, where TMFRR is applied to
   all multicast flows for an MP.  The MP needs to tell the RPF neighbor
   via the TMFRR hello option which mode is in use.  Without such an
   indication, a PLR may first apply TMFRR to all multicast flows, then
   receive the PIM join attributes to selectively apply TMFRR to a
   subset of the multicast flows.  The PLR will then have to remove the
   association of other flows with the backup tunnel.

   Figure 5 shows the TMFRR hello option with the S-bit defined, for an
   MP to indicate whether it operates in selective mode.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       OptionType = TBD        |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Reserved                        |S|N|M|L|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   L-bit:  0.

   S-bit:  Selective bit.  0 if all flows the MP joins must be TMFRR
      protected.  1 if TMFRR protection applies only when the join has
      the TMFRR join attribute attached.

         Figure 5: Selective Bit (S-bit) in TMFRR Hello Option

8.  Security Considerations

   There are no security considerations for this design other than what
   is already in the main PIM specification [RFC4601].

9.  Acknowledgments

   Many people contributed to the discussions and sanitized the ideas.
   We'd like to thank Dino Farinacci, Clarence Filsfils, Rayen Mohanty,
   Karthik Subramanian, Eric Rosen, IJsbrand Wijnands, DP Ayyadevara,
   Vatsa Kumar, and Chandra Nagarajan for their comments and feedback.

10.  References

10.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4601]  Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas,
              "Protocol Independent Multicast - Sparse Mode (PIM-SM):
              Protocol Specification (Revised)", RFC 4601, August 2006.

   [RFC5015]  Handley, M., Kouvelas, I., Speakman, T., and L. Vicisano,
              "Bidirectional Protocol Independent Multicast
              (BIDIR-PIM)", RFC 5015, October 2007.

   [RFC5384]  Boers, A., Wijnands, I., and E. Rosen, "The Protocol
              Independent Multicast (PIM) Join Attribute Format",
              RFC 5384, November 2008.

10.2.  Informative References

   [AFI]      IANA, "Address Family Indicators (AFIs)", ADDRESS FAMILY
              NUMBERS http://www.iana.org/numbers.html, February 2007.

   [HELLO]    IANA, "PIM Hello Options", PIM-HELLO-OPTIONS per RFC4601
              http://www.iana.org/assignments/pim-hello-options,
              March 2007.

   [NFRR]     Shen, N., "Nexthop Fast ReRoute for IP and MPLS",
              draft-shen-nhop-fastreroute-01 (work in progress),
              July 2004.

   [PIM-NNHOP]
              Shen, N., "Discovering PIM-SM Next-Nexthop Downstream
              Nodes", draft-shen-pim-nnhop-nodes-01 (work in progress),
              July 2004.

Authors' Addresses

   Liming Wei
   cisco Systems Inc
   170 W Tasman Drive
   San Jose, CA  95134
   USA

   Email: lwei@cisco.com


   Apoorva Karan
   cisco Systems Inc
   170 W Tasman Drive
   San Jose, CA  95134
   USA

   Email: apoorva@cisco.com


   Naiming Shen
   cisco Systems Inc
   170 W Tasman Drive
   San Jose, CA  95134
   USA

   Email: naiming@cisco.com


   Yiqun Cai
   cisco Systems Inc
   170 W Tasman Drive
   San Jose, CA  95134
   USA

   Email: ycai@cisco.com


   Maria Napierala
   AT&T Labs
   200 Laurel Drive
   Middletown, New Jersey  07748
   USA

   Email: mnapierala@att.com