Simplifying SR-TE with mesh templates

A big problem with deploying traffic engineering is configuration complexity. This has now been solved.

Why use Traffic Engineering

Typical reasons for using traffic engineering are:

  • Send some traffic via a low latency path
  • Send A/B streams via different paths avoiding shared risk links
  • Load-balance traffic across a custom network topology
  • Ensure traffic is rerouted without congestion after link failures

The configuration hell

Configuring policies by hand works fine in a lab or a small network, but scaling MPLS-TE (or SR-TE) policy config to a few hundred routers becomes a big pain. A full mesh needs n*(n-1) policies, which easily adds up to thousands even in a mid-size network. All of these policies have to be provisioned, monitored and updated, and policies must be added or deleted whenever routers are deployed or decommissioned.
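To put numbers on this, here is a quick back-of-the-envelope calculation in Python. It assumes one unidirectional policy per ordered pair of routers and a single color; the router counts are arbitrary examples.

```python
def full_mesh_policy_count(routers: int) -> int:
    """Number of unidirectional policies in a full mesh: n * (n - 1)."""
    return routers * (routers - 1)


for n in (10, 50, 200):
    print(n, "routers ->", full_mesh_policy_count(n), "policies")
```

With 50 routers that is already 2450 policies, and with 200 routers it is 39800, per color.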

This is true regardless of whether you use distributed or centralized TE. Centralized TE (with a controller) makes management a bit easier (all config in one place) but the fundamental problem remains. Network automation can help solve this problem but it’s yet another level of abstraction and a new source of potential problems.

Flex Algo

Segment Routing introduces a different approach to solving the TE problem: Flex Algo [RFC9350]. Instead of configuring a lot of SR-TE policies, the operator defines multiple flex algos on each router, each with its own constraints, for example using the delay metric type instead of the IGP metric, or including/excluding certain admin groups.

Flex algo is a decent solution for a lot of traffic engineering problems and is definitely better than maintaining many thousands of SR-TE policies, but it is not without drawbacks, such as:

  • Required feature support
  • Multi-vendor interoperability issues
  • Lack of bandwidth reservations
  • Path diversity is topology-dependent

Feature support and interoperability

This was the problem with legacy MPLS-TE that Segment Routing successfully solved. Now the routing stack is minimalistic and all the complexity is outsourced to the controller, so that adding a new feature requires just a controller software upgrade.

Flex algo comes full circle and brings us back to massive network upgrades to add a new routing feature, and all sorts of fun troubleshooting interop issues in multivendor networks.

No bandwidth reservations

With flex algo, each router computes its paths independently, so bandwidth reservations are not possible without a centralized controller.

Path diversity

Consider the diagram:

A common traffic engineering requirement is sending 2 streams of traffic via different paths so that any link or node failure will not impact both streams.

With flex algo there are 2 ways to achieve this:

  1. Admin groups – e.g. stream A can use only blue links, stream B can use only yellow links
  2. Different flex algo definitions – e.g. R2 and R4 participate in flex algo 128 while R3 and R5 participate in flex algo 129.

The problem is that neither approach is topology-independent. What if R5 is also a PE router and it also needs to have 2 redundant streams to R1?

The topology allows this but it is not possible to achieve with flex algo.

Better solution: SR-TE mesh templates

Traffic Dictator v1.8 introduced a new feature that combines the configuration simplicity of Flex Algo with the flexibility of SR-TE policies.

It started with a customer requirement to configure a lot of similar SR-TE policies in each IGP domain and then connect those islands with BGP-CT [RFC9832], so I came up with this idea of mesh templates.

The configuration is very minimalistic:

traffic-eng mesh-templates
   !
   template TOPO_101_BLUE
      topology-id 101
      color 11101
      install indirect srte peer-group TOPO_101_RR
      !
      candidate-path preference 100
         affinity-set BLUE_ONLY

This creates a full mesh of SR-TE policies between all routers in topology 101, using the admin group constraint.

It is possible to exclude some routers with an access list – for example, if we want to have SR-TE policies only between PE routers but exclude P routers; or generate IPv4-only/IPv6-only policies in a dual stack topology (by default TD will try to generate both IPv4 and IPv6).
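Conceptually, a mesh template expands into one policy per ordered pair of routers in the topology. The Python sketch below only illustrates that idea; it is not TD's implementation, and the function name `expand_template`, the `excluded` parameter and the router names are made up for the example.

```python
from itertools import permutations


def expand_template(routers, color, excluded=frozenset()):
    """Return (headend, endpoint, color) tuples for a full mesh.

    `excluded` stands in for an access list that filters out routers
    (for example P routers); it is a hypothetical parameter.
    """
    members = sorted(r for r in routers if r not in excluded)
    return [(head, end, color) for head, end in permutations(members, 2)]


policies = expand_template({"PE1", "PE2", "PE3", "P1"}, color=11101, excluded={"P1"})
for policy in policies:
    print(policy)
```

Three PE routers yield six policies, and the excluded P router is neither a headend nor an endpoint.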

In a multi-area topology, a mesh template generates policies only within L1 or L2 separately. In the case of a discontiguous L1 with the same area-id (not recommended, but valid config), the logic is smart enough to recognize that and generate policies only within the contiguous L1 area.

Self-updating config

The mesh is fully dynamic and reacts to IGP topology changes. Unlike regular SR-TE policies, a mesh template checks before each SPF recalculation whether it needs to add or remove policies, based on routers being added to or removed from the topology. So there is no need to update SR-TE config when deploying new routers. Thanks to fast and memory-safe Rust code, the overhead is negligible and the feature scales very well, definitely better than any network automation solution that generates and provisions config across many routers.
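The add/remove check can be pictured as a set difference between the policies the template should produce for the current topology and the policies that already exist. This is a minimal Python sketch of that idea, not the actual Rust code; `reconcile` and the router names are hypothetical.

```python
from itertools import permutations


def reconcile(current_routers, existing_policies):
    """Compare desired full-mesh pairs with the existing ones.

    Returns (to_add, to_remove) as sets of (headend, endpoint) pairs.
    """
    desired = set(permutations(current_routers, 2))
    return desired - existing_policies, existing_policies - desired


existing = set(permutations({"R1", "R2", "R3"}, 2))
# R3 is decommissioned and R4 is deployed before the next recalculation.
to_add, to_remove = reconcile({"R1", "R2", "R4"}, existing)
print("add:", sorted(to_add))
print("remove:", sorted(to_remove))
```

Only the pairs that involve R4 are added and only the pairs that involve R3 are removed; policies between R1 and R2 are left alone.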

Multiple candidate paths

A very flexible feature of SR-TE policies is the ability to configure multiple candidate paths. If the higher priority path fails, the policy will try to use the next path etc. This allows for a lot of interesting designs. Consider the topology:

Blue links are core high-bandwidth links, orange links are aggregation low-bandwidth links. We can configure a template like this:

traffic-eng mesh-templates
   template ALL_PE
      topology-id 101
      color 101
      install indirect srte peer-group RR
      !
      candidate-path preference 200
         affinity-set CORE
      !
      candidate-path preference 100

This ensures that traffic between core routers will never be routed via low-bandwidth aggregation links as long as a core path is available, regardless of metric. But traffic to/from an aggregation router (R5) will be routed via aggregation links, possibly with other constraints (such as bandwidth or different metric type).
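The fallback between candidate paths can be sketched in Python as follows. This is a simplified illustration: it only checks whether a path exists under each affinity constraint (a real controller computes the best path by metric), and the link list and function names are assumptions, not the topology from the diagram.

```python
from collections import deque

# Hypothetical topology: (node_a, node_b, admin_group)
LINKS = [
    ("R1", "R2", "CORE"),
    ("R2", "R3", "CORE"),
    ("R1", "R5", "AGG"),
    ("R5", "R3", "AGG"),
]


def path_exists(src, dst, allowed_groups=None):
    """Breadth-first search over links whose admin group is allowed.

    allowed_groups=None means no affinity constraint.
    """
    adjacency = {}
    for a, b, group in LINKS:
        if allowed_groups is None or group in allowed_groups:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


# Mirrors the ALL_PE template: preference 200 is CORE only, preference 100 is unconstrained.
CANDIDATE_PATHS = [(200, {"CORE"}), (100, None)]


def active_preference(src, dst):
    """Return the highest preference whose constraint can be satisfied, or None."""
    for preference, groups in sorted(CANDIDATE_PATHS, key=lambda c: c[0], reverse=True):
        if path_exists(src, dst, groups):
            return preference
    return None


print(active_preference("R1", "R3"))  # both are core routers
print(active_preference("R1", "R5"))  # R5 is reachable only via aggregation links
```

In this sketch the policy between the two core routers uses the preference 200 path, while the policy towards R5 falls back to preference 100 because no core-only path exists.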

Another use case for this is when we need to drain traffic from a link or set of links. Rather than playing with metric or making SR-TE policy or Flex Algo config changes on every router, with mesh templates you can just add a higher-priority candidate path that excludes the relevant set of links (by admin group or SRLG) and that will update all policies generated from that template.

Path diversity

Mesh templates allow for a universal path diversity config that works in every topology (unlike flex algo).

Config example:

traffic-eng mesh-templates
   !
   template TOPO_101_A
      topology-id 101
      color 11101
      install indirect srte peer-group TOPO_101_RR
      !
      candidate-path preference 100
         disjoint-group 100 link
   !
   template TOPO_101_B
      topology-id 101
      color 11102
      install indirect srte peer-group TOPO_101_RR
      !
      candidate-path preference 100
         disjoint-group 100 link

These 2 templates are configured with the same disjoint group. This means that a policy from template A and a policy from template B with the same headend and endpoint cannot use the same links. But policies with different headends and endpoints can use the same links.
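The rule can be expressed as a small check in Python. This is only a sketch of the constraint, not how TD computes disjoint paths; the dictionary model, the paths and the assumption that headend and endpoint must both match for the rule to apply are mine for the example.

```python
def links_of(path):
    """Turn a node sequence into a set of undirected links."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def violates_link_disjointness(policy_a, policy_b):
    """Policies are dicts with 'headend', 'endpoint' and 'path' keys (hypothetical model).

    Assumption: the constraint applies only when headend and endpoint both match.
    """
    same_pair = (policy_a["headend"], policy_a["endpoint"]) == (
        policy_b["headend"],
        policy_b["endpoint"],
    )
    return same_pair and bool(links_of(policy_a["path"]) & links_of(policy_b["path"]))


a = {"headend": "R1", "endpoint": "R6", "path": ["R1", "R2", "R4", "R6"]}
b = {"headend": "R1", "endpoint": "R6", "path": ["R1", "R3", "R5", "R6"]}
c = {"headend": "R5", "endpoint": "R1", "path": ["R5", "R3", "R1"]}
d = {"headend": "R1", "endpoint": "R6", "path": ["R1", "R2", "R5", "R6"]}

print(violates_link_disjointness(a, b))  # same pair, no shared links
print(violates_link_disjointness(b, c))  # shared R1-R3 link, but a different pair
print(violates_link_disjointness(a, d))  # same pair, shared R1-R2 link
```

Only the last pair breaks the rule: both policies run from R1 to R6 and both use the R1-R2 link.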

Policy distribution and redundancy

Mesh templates use BGP-SRTE [RFC9830] for policy distribution. This is a very neat and scalable solution: just configure one BGP session with a route reflector and advertise all policies there. TD sets the route target in each policy to the router-id of the relevant headend, so that each router knows which policies it needs to install.

Note: the diagram shows a simplified picture, in reality every router receives all policies. It is theoretically possible to implement RT constraint [RFC4684] on the route reflector to optimize this, but even in large scale deployments, the amount of SR-TE policies is negligible compared to the number of routes BGP can handle.
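The headend-side filtering can be illustrated with a short Python sketch. It models only the matching logic described above; the `AdvertisedPolicy` class, its field names and the addresses are hypothetical and do not reflect the actual BGP encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdvertisedPolicy:
    headend_rt: str  # route target set to the headend's router-id
    endpoint: str
    color: int


def policies_to_install(local_router_id, received):
    """Keep only the policies whose route target matches this router."""
    return [p for p in received if p.headend_rt == local_router_id]


# Every router receives the same full set from the route reflector.
received = [
    AdvertisedPolicy("10.0.0.1", "10.0.0.2", 11101),
    AdvertisedPolicy("10.0.0.1", "10.0.0.3", 11101),
    AdvertisedPolicy("10.0.0.2", "10.0.0.1", 11101),
]

for policy in policies_to_install("10.0.0.1", received):
    print(policy)
```

Router 10.0.0.1 installs the two policies addressed to it and ignores the third, even though all three were received.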

As for redundancy, multiple controllers advertise the same set of policies with different SR-TE distinguishers.

If one controller fails, all routers will still retain policies from the other controller. There is no need for any state sync so it is possible to deploy as many controllers as you want.

The future of mesh templates

The current implementation is already superior to traditional SR-TE and Flex Algo, but it can get better.

With advances in streaming telemetry, and as more vendors implement proper hardware counters per SID and per SR-TE policy, mesh templates can become a foundation for a truly automated, intent-based, self-healing traffic steering solution: the kind you configure once (with a very small config) and it just works, reacting in real time to bandwidth usage patterns and delay/packet loss, and avoiding congestion. It will also be easy to troubleshoot, freeze the routing state at a certain point, revert to a specific state, and run various “what if” simulations.

I have several ideas on how to implement this. One way is to add bandwidth extensions to RFC9857, so that either each router advertises real bandwidth usage per policy, or a monitoring system collects statistics using vendor-specific telemetry and advertises them into BGP.

The actual implementation will be driven by customer demand and vendor feature support, so we’ll see how this will work out in the future.

Conclusion

I started Traffic Dictator as a simple and minimalistic SR-TE controller that would lower the entry barrier into SR-TE for small and medium-size operators, and allow them to do traffic engineering with basic SR implementations in open source and whitebox routers. I didn’t try to do SR-TE better than big vendors who have teams of experts and decades of expertise.

But now, with new features and various optimizations, TD offers a superior way of traffic engineering, compared to the rest of the SR controllers on the market; at least from a routing perspective.
