Explicit Null in Segment Routing

MPLS is such a user-friendly technology it needs a special label that does nothing.

Why explicit null

Normally, the penultimate router in the LSP removes (pops) the top transport label, so that the egress LSR only has to deal with either the service label or the bare IPv4/IPv6 packet. This is known as Penultimate Hop Popping (PHP).

Fig. 1

However, sometimes there is a need to have the LSP labeled all the way to the egress LSR. So meet the explicit null – label value 0 for IPv4 and 2 for IPv6.

Fig. 2

The most common reason why explicit null might be needed is to preserve QoS markings on MPLS packets. For example, in pipe/short pipe DiffServ QoS models, it might be undesirable to map MPLS TC markings to IP DSCP. Hence the need to have the packet labelled across the whole LSP.

Another situation is forwarding IPv6 traffic over an IPv4-based MPLS network, for example 6PE or color-only steering in SR-TE. 6PE actually works in a way similar to MPLS VPN, that is, it uses a service label advertised by the PE. But if the PE is configured to use one label for all IPv6 prefixes (similar to per-VRF label allocation), that “service label” will be explicit null.

Vendors also found a use for explicit null as a workaround for chipset limitations on network hardware. For instance, if a chip can swap but not push/pop labels, it can still make a P router! Or in FRR/LFA scenarios, some chips have difficulty switching over from the primary path with a pop action to the backup path with a swap or push action; explicit null comes to the rescue here (see note 1).

Note that the original [RFC3032] defined explicit null as "pop the label stack and forward based on the IPv4/IPv6 header", and allowed it only at the bottom of the stack. [RFC4182] removed that restriction: if the explicit null is not at the bottom of the stack, it is popped and the next label is used for forwarding.
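The difference between the two behaviours can be sketched in a few lines of Python. This is only an illustrative model, not how any real forwarding plane is implemented: the function name `process_explicit_null` and the label value 24001 are made up for the example, and the label stack is represented as a plain list with the top of the stack first.

```python
# Reserved label values for explicit null: 0 for IPv4, 2 for IPv6.
EXPLICIT_NULL = {0: "ipv4", 2: "ipv6"}


def process_explicit_null(label_stack):
    """Handle a packet whose top label is explicit null (hypothetical helper).

    label_stack is a list of label values, top of stack first.
    Returns ("label", next_label) when more labels follow, or
    (address_family, None) when explicit null was the bottom of stack.
    """
    stack = list(label_stack)
    if not stack or stack[0] not in EXPLICIT_NULL:
        raise ValueError("top label is not explicit null")
    family = EXPLICIT_NULL[stack.pop(0)]
    if stack:
        # RFC 4182 behaviour: pop explicit null, forward on the next label.
        return ("label", stack[0])
    # Bottom of stack: forward based on the IP header of that family.
    return (family, None)


print(process_explicit_null([0]))         # ('ipv4', None)
print(process_explicit_null([2]))         # ('ipv6', None)
print(process_explicit_null([0, 24001]))  # ('label', 24001), 24001 is a made-up label
```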

Now to the main topic.

SR-TE policies

In Segment Routing, there is no concept of a Label Switched Path (LSP). Traffic Engineering in SR is achieved by pushing multiple labels to steer the packet over the desired path, but transit routers are completely unaware of the SR-TE policy. This creates some unique problems for SR-TE, such as protecting SR-TE with TI-LFA, or explicit null handling.

Since there is no LSP, there is also no penultimate hop. If explicit null is desired for an SR-TE policy, it must be pushed by the SR-TE headend. Then, on top of that, any relevant SIDs are pushed to actually steer the packet. Therefore, the packet travels all the way along the SR-TE path with explicit null below the transport labels.
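As a rough illustration of the resulting label stack, here is a minimal Python sketch. The helper names and the SID values 16002 and 16004 are assumptions made up for the example; the stack is a list with the top label first.

```python
IPV4_EXPLICIT_NULL = 0
IPV6_EXPLICIT_NULL = 2


def build_label_stack(sid_list, explicit_null=None):
    """Return the label stack imposed by the headend, top of stack first.

    Explicit null, if requested, sits at the bottom; the SIDs that steer
    the packet go on top of it.
    """
    stack = list(sid_list)
    if explicit_null is not None:
        stack.append(explicit_null)
    return stack


def consume_top_segment(stack):
    """Model the top transport label being popped once its segment is done."""
    return stack[1:]


# Hypothetical SID list for an SR-TE path: 16002, then 16004.
stack = build_label_stack([16002, 16004], explicit_null=IPV6_EXPLICIT_NULL)
print(stack)  # [16002, 16004, 2]

# After both segments are consumed the packet is still labeled.
stack = consume_top_segment(consume_top_segment(stack))
print(stack)  # [2]
```

Without the `explicit_null` argument the same walk ends with an empty stack, which is the unlabeled packet that PHP would normally expose.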

 

Fig. 3

What if one of the segments used to construct the SR-TE path is also advertised with explicit null? Well, then you can get 2 explicit null labels stacked on the same packet, and the hardware should be able to handle this.

IPv6 over IPv4 SID, IPv4 over IPv6 SID (see note 2)

It is possible to map IPv6 traffic to SR-TE policies based on IPv4 SIDs, or vice versa. This is achieved by color-only steering (see note 3), where the BGP prefix can be mapped to an SR-TE policy with an endpoint from a different address family, or to a policy with a null endpoint. This can lead to a situation where, after the PHP operation, an IPv6 packet arrives on an interface that is not IPv6-enabled. Therefore, explicit null is required when steering IPv6 traffic over an IPv4-based SR-TE policy, or the other way around.

Fig. 4

Consider the IPv4-based SR network in figure 4. When steering IPv6 traffic over it, the SR-TE headend imposes the IPv6 explicit null (label value 2), so that after the last segment is popped, the bare IPv6 packet is not exposed on an IPv4-only link.

Since there are 2 different explicit null labels for IPv4 and IPv6 (values 0 and 2 respectively), it is not a good idea to just add explicit null to the list of labels in the SR-TE policy. Instead, [draft-ietf-idr-segment-routing-te-policy] introduced Explicit Null Label Policy or ENLP.

ENLP

ENLP can be configured statically, or advertised via the BGP SR-TE address-family. Depending on the ENLP config, policy type and mapped traffic we can get 12 possible permutations.

| SR-TE endpoint / ENLP config | IPv4 prefixes      | IPv6 prefixes      |
|------------------------------|--------------------|--------------------|
| Default / IPv4 endpoint      | no explicit null   | IPv6 explicit null |
| Default / IPv6 endpoint      | IPv4 explicit null | no explicit null   |
| None                         | no explicit null   | no explicit null   |
| IPv4                         | IPv4 explicit null | no explicit null   |
| IPv6                         | no explicit null   | IPv6 explicit null |
| Both                         | IPv4 explicit null | IPv6 explicit null |

For MPLS VPN traffic steered over an SR-TE policy, explicit null is never imposed.
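The table above, plus the MPLS VPN rule, can be expressed as a small decision function. This is a sketch of the logic only: the function name, the string values for the ENLP setting, and the `has_service_label` flag are assumptions chosen for the example and do not correspond to any vendor's configuration syntax.

```python
EXPLICIT_NULL = {"ipv4": 0, "ipv6": 2}


def explicit_null_to_push(enlp, endpoint_family, packet_family,
                          has_service_label=False):
    """Return the explicit null label the headend pushes, or None.

    enlp: "default", "none", "ipv4", "ipv6" or "both" (hypothetical names).
    endpoint_family, packet_family: "ipv4" or "ipv6".
    has_service_label: True for MPLS VPN traffic, which never gets explicit null.
    """
    if has_service_label or enlp == "none":
        return None
    if enlp == "default":
        # Push only when the traffic and the policy endpoint differ in family.
        push = packet_family != endpoint_family
    elif enlp == "both":
        push = True
    elif enlp in ("ipv4", "ipv6"):
        push = packet_family == enlp
    else:
        raise ValueError(f"unknown ENLP setting: {enlp}")
    return EXPLICIT_NULL[packet_family] if push else None


# IPv6 prefix steered over an IPv4-endpoint policy with default ENLP.
print(explicit_null_to_push("default", "ipv4", "ipv6"))  # 2
# Same traffic with ENLP set to none.
print(explicit_null_to_push("none", "ipv4", "ipv6"))     # None
```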

So if explicit null in SR-TE is so useful, why ever disable it?

Egress Peer Engineering

Ordinary routing policies allow us to choose the egress ASBR for a given prefix, but not the nexthop that the ASBR will use to send traffic outside our AS.

Egress Peer Engineering (EPE) uses source routing to let an ingress ASBR send traffic to a specific peer of a specific egress ASBR. EPE consists of 2 parts:

  1. The egress ASBR allocates an MPLS label per egress peer
  2. The ingress ASBR uses those labels to steer traffic to a specific peer

Part (1) can be done by static configuration, BGP-LU [draft-gredler-idr-bgplu-epe], or special BGP-LS segments [RFC9086].

Part (2) usually involves a third-party controller that programs LSPs or SR-TE policies on the ingress ASBR. Alternatively, if BGP-LU routes are propagated all the way to the ingress ASBR, it is possible to design EPE based on pure dynamic routing, without a controller.

Fig. 5

Consider the topology in figure 5. We want ingress ASBR R1 to send traffic for different BGP prefixes to different peers of egress ASBR R3 (AS200 and AS300), without affecting route selection on other routers in AS100 (e.g. R2).

The way to do this is to let R3 allocate MPLS labels per egress peer, and then configure null endpoint SR-TE policies on R1 that use those labels to steer traffic to the respective peer. Color-only steering is then used to map BGP prefixes to those policies. Given the default SR-TE behaviour, if an IPv6 prefix is mapped to an IPv4 policy, or vice versa, R1 will push explicit null, which will then be exposed on the inter-AS link that is not MPLS-enabled. Hence explicit null must be disabled for SR-TE in such EPE designs.
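To make the problem concrete, the following Python sketch models what is left on the packet when it leaves R3 towards the external peer. It is a simplified model under stated assumptions: the function name, the node SID 16003 for R3 and the EPE label 24200 for the AS200 peer are all invented for the example, and every label above the EPE label is assumed to be consumed inside AS100.

```python
IPV6_EXPLICIT_NULL = 2


def labels_on_inter_as_link(headend_stack, epe_label):
    """Return the labels still on the packet after the egress ASBR pops
    the EPE label and forwards the packet to the peer (top of stack first).
    """
    if epe_label not in headend_stack:
        raise ValueError("EPE label is not in the stack")
    # Labels above the EPE label are consumed on the way to the egress ASBR;
    # the egress ASBR pops the EPE label itself and sends out whatever remains.
    return headend_stack[headend_stack.index(epe_label) + 1:]


NODE_SID_R3 = 16003      # hypothetical node SID of R3
EPE_LABEL_AS200 = 24200  # hypothetical EPE label for the AS200 peer

# Default behaviour: IPv6 prefix over an IPv4 policy, so R1 adds explicit null.
stack = [NODE_SID_R3, EPE_LABEL_AS200, IPV6_EXPLICIT_NULL]
print(labels_on_inter_as_link(stack, EPE_LABEL_AS200))  # [2]

# Explicit null disabled: the peer receives a plain IPv6 packet.
stack = [NODE_SID_R3, EPE_LABEL_AS200]
print(labels_on_inter_as_link(stack, EPE_LABEL_AS200))  # []
```

In the first case the peer in AS200 receives an MPLS packet on a link where MPLS is not enabled, which is why the explicit null has to be turned off.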

References

  1. MPLS Label Stack Encoding https://datatracker.ietf.org/doc/html/rfc3032
  2. Removing a Restriction on the use of MPLS Explicit NULL https://datatracker.ietf.org/doc/html/rfc4182
  3. Advertising Segment Routing Policies in BGP https://datatracker.ietf.org/doc/html/draft-ietf-idr-segment-routing-te-policy-13
  4. Segment Routing, Part II: Traffic Engineering – Clarence Filsfils, Kris Michielsen, Francois Clad, ISBN-13: 978-1095963135
  5. Egress Peer Engineering using BGP-LU https://datatracker.ietf.org/doc/html/draft-gredler-idr-bgplu-epe-14
  6. Border Gateway Protocol – Link State (BGP-LS) Extensions for Segment Routing BGP Egress Peer Engineering https://datatracker.ietf.org/doc/html/rfc9086

Notes

  1. ^I will not disclose specific vendors or hardware, but trust me, all hardware has limitations, and I’ve seen these issues across different vendors. Patching those limitations with some design or config changes is not as bad as it might sound.
  2. ^here I mean MPLS-SR based on IPv6, not SRv6!
  3. ^note that colors in SR-TE have no relation whatsoever to affinity bits in classical MPLS-TE, which are also sometimes called color bits
