Seamless Suffering

This story is about the importance of remembering networking fundamentals when dealing with advanced routing topics.

Overview of Seamless MPLS

Seamless MPLS is a really neat design for large ISP networks. The idea is to overcome the scalability limitations of link-state IGPs by segregating the network into multiple IGP domains and using a hierarchy with BGP-LU to build the inter-domain LSP. It is largely inspired by Inter-AS MPLS VPN option C [note 1].

There are a lot of flavours of Seamless MPLS.

The simplest possible design involves one IGP split into multiple domains, BGP-LU sessions between PE and ABR nodes (in the aggregation domain) as well as between ABR nodes in the core domain (the same BGP sessions can be used for MPLS services, with the ABR serving as an in-band RR), and some protocol to distribute MPLS labels in each IGP domain (e.g. LDP or SR). There is no redistribution between BGP and IGP; PE routers must push 2 transport labels (BGP+IGP) when forwarding to a PE router in another IGP domain.

 

Fig. 1

In figure 1, each IS-IS domain runs LDP for label distribution and BGP-LU is stretched all the way to the PE routers. Each ABR rewrites the next hop to its own loopback in BGP updates. Those loopbacks are recursively resolved over IS-IS with LDP labels. Therefore, there is an end-to-end BGP-LU LSP PE1-ABR1-ABR2-PE2, recursively resolved over 3 different LDP LSPs (one in each IGP domain).

Another option is to run BGP-LU only between ABRs, and enable redistribution between the IS-IS AGG domains and BGP-LU:

Fig. 2

In this case, PEs don’t have to run BGP-LU, although they might still need to run BGP for MPLS services. PEs also have to push fewer labels.

There are more possible permutations of different protocols and design patterns, for example:

  1. Different IGP domains can run IS-IS, OSPF, another IGP, BGP-LU only [note 2], or static routing.
  2. IS-IS multi-level or OSPF multi-area hierarchy can be used in each IGP domain.
  3. Different MPLS label protocols can be used in different IGP domains: SR, LDP, RSVP, static LSP, BGP-LU. There can be also a combination of those within one IGP domain (e.g. LDP over RSVP, or LDP interworking with SR).
  4. More than 2 levels of hierarchy can be used.
  5. The BGP-LU design can use iBGP or eBGP, mesh or route-reflectors/confederations, or any combination of those.
  6. Route reflectors for MPLS services can be the same as for BGP-LU (ABR), or dedicated routers somewhere in the core (can be also out-of-band VMs).

This is just to put things into perspective. More MPLS applications can be integrated into this design: fast restoration technologies, traffic engineering, multicast VPNs etc. Therefore, Seamless MPLS can become very complicated and it’s easy to get overwhelmed by a lot of protocols interacting with each other, sometimes in not so obvious ways.

Something like this (treat it half-seriously):

Fig. 3

When dealing with such a complex design, it’s important to remember fundamentals.

Redistribution 101

In routing, redistribution is a concept of advertising routes from one routing domain to another one (which can also run a different routing protocol). In order for a route to be redistributed, it must be installed in RIB. When doing mutual redistribution between two protocols at more than one redistribution point, routes can be redistributed more than once, thus getting back into their “native” domain and creating routing loops.

There are several ways to prevent this, for example:

  1. The protocol must distinguish “native” (internal) routes from redistributed (external) routes. Internal routes will always be preferred over external regardless of metrics, therefore a redistributed route will never be installed in RIB as long as a native route is present for the given prefix.
  2. When a route is redistributed into the routing protocol, it is marked with a tag. Routes with this tag are not allowed to be redistributed from the protocol. The same applies to the other protocol participating in redistribution. This double safeguard seems excessive, but it protects from configuration mistakes or software bugs which cause the other protocol to not filter routes properly. In BGP, communities are used instead of tags.
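The tag-based safeguard from item 2 can be modelled in a few lines of Python. This is only a sketch: the `Route` class, the `redistribute` helper and the tag values 100 and 200 are illustrative assumptions, not any vendor's implementation.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Route:
    prefix: str
    protocol: str              # protocol the route currently lives in
    tag: Optional[int] = None  # None means the route is native (never redistributed)


def redistribute(route: Route, target: str, set_tag: int, deny_tag: int) -> Optional[Route]:
    """Redistribute a route into `target`, unless it carries `deny_tag`."""
    if route.tag == deny_tag:
        return None  # already redistributed once in the other direction: block it
    return replace(route, protocol=target, tag=set_tag)


# Assumed convention: routes entering OSPF get tag 100, routes entering IS-IS get tag 200.
native = Route("1.1.1.1/32", "isis")
in_ospf = redistribute(native, "ospf", set_tag=100, deny_tag=200)  # IS-IS -> OSPF
back = redistribute(in_ospf, "isis", set_tag=200, deny_tag=100)    # OSPF -> IS-IS attempt

print(in_ospf)  # the route now lives in OSPF and carries tag 100
print(back)     # None: the tagged route is not allowed back into IS-IS
```

The second redistribution point refuses the route purely because of the tag, without needing to know where the prefix originally came from.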

Fig. 4

Redistribution 102

A more subtle problem which arises with 2 redistribution points is suboptimal routing. Every routing protocol has an administrative distance (also known as route preference): local configuration on the router which controls which protocol takes priority when routes from multiple protocols are available. In figure 5, redistribution is configured between OSPF with AD 110 and IS-IS with AD 115. R1 installs the OSPF route to 1.1.1.1/32 via R2, because it has a lower (better) AD than the IS-IS route.

Fig. 5

This is also a case of non-deterministic routing: the state of the routing tables depends on the sequence of events. Had R1 been the first to redistribute the route from IS-IS into OSPF, R2 would have installed the OSPF route. Non-deterministic routing is a clear sign of bad network design and should be avoided in all cases.
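The AD comparison on R1 can be sketched as a simple "lowest distance wins" rule. The `Candidate` type and the `rib_winner` helper are hypothetical names used only for illustration, and the value 120 in the what-if case is an arbitrary assumption (any distance worse than 115 would do).

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    protocol: str
    distance: int  # administrative distance, lower wins
    via: str


def rib_winner(candidates):
    """Pick the candidate with the lowest administrative distance."""
    return min(candidates, key=lambda c: c.distance)


# R1's view of 1.1.1.1/32 once R2 has redistributed it from IS-IS into OSPF.
r1 = [
    Candidate("isis", 115, "native IS-IS path"),
    Candidate("ospf", 110, "R2"),
]
print(rib_winner(r1).via)  # R2

# What-if: the redistributed OSPF route is given a distance worse than IS-IS.
r1_what_if = [
    Candidate("isis", 115, "native IS-IS path"),
    Candidate("ospf", 120, "R2"),
]
print(rib_winner(r1_what_if).via)  # native IS-IS path
```

Note that the metric of either route plays no role here: AD is compared only between the routes that each protocol has already chosen as its best.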

Possible solutions to this problem:

  1. If the routing protocol distinguishes between external and internal routes, it can be configured in a way that the AD for internal routes is better than for the routes from the other protocol, but AD for external (redistributed) routes is worse.
  2. In PE-CE routing in MPLS VPNs, things like OSPF DN bit or EIGRP cost communities are used to prevent PE from installing routes via CE dual-homed to another PE.
  3. If none of the above applies, it is possible to filter routes from being installed in RIB by a particular protocol. This can be done by a prefix list or tags/communities. Different vendors provide different configuration structures for that.

Sample configuration on IOS:

router ospf 1
 redistribute isis 1 subnets route-map ISIS_TO_OSPF
 distribute-list route-map OSPF_DIST in
!
route-map ISIS_TO_OSPF permit 10
 set tag 100
!
route-map OSPF_DIST deny 10
 match tag 100
!
route-map OSPF_DIST permit 20

In this example, routes redistributed from IS-IS into OSPF are marked with tag 100, and then routes with tag 100 are blocked from being installed in RIB.

Since link-state IGPs always synchronize their LSDBs [note 3], they will propagate the LS updates of routes blocked by the distribute list further into the IGP domain. But they will not install redistributed routes in their own RIB.

A side effect of this configuration is that filtering during redistribution seems to not be needed anymore. Distribute lists guarantee that the ABR will never install redistributed routes in RIB – and if the route is not in RIB, it can’t be redistributed! Even a config mistake or a bug which leads an ABR to somehow install a redistributed route in RIB will not cause a redistribution loop, as long as the other ABR is functioning correctly.

This all looks very neat and reliable. What could possibly go wrong?

Seamless MPLS with redistribution

One particular flavour of Seamless MPLS that Cisco at some point recommended as a best practice for large mobile operators was a 3-level hierarchy, with levels traditionally named Core, Aggregation and Access. While in theory this design allowed any IGP to be used in each domain, all examples featured Core using IS-IS level 2, Aggregation using the same IS-IS domain but level 1 (with L1-L2 leaking blocked, so they act almost like separate IGP domains) and Access using OSPF [note 4]. BGP-LU could be stretched all the way to the Access level or, alternatively, only to the Aggregation level, in which case mutual redistribution between BGP and OSPF is required. The guide warns about the dangers of mutual redistribution and advises using route tags and BGP communities to prevent redistribution loops.

High-level picture:

Fig. 6

Of course there are more BGP sessions as each ABR must have a BGP session with every other ABR in the same IGP domain. Also route reflectors can be used. Up to this point, the design is totally fine.

Sample configuration

Fig. 7

A BGP community is assigned to each access site, local routes are advertised into BGP with this community (in this example 65001:100 for site 1 and 65001:200 for site 2). Whenever a BGP route with any remote site community is received, it is marked with a special tag during redistribution into OSPF and distribute-list on the ABR prevents that route from getting installed in RIB as OSPF route. The distribute-list is needed to prevent the ABR from installing the route redistributed from BGP into OSPF by the other ABR in the same access site.

Sample access ABR config would be something like:

router ospf 1
 redistribute bgp 65001 subnets route-map BGP_TO_OSPF
 distribute-list route-map OSPF_DIST in
!
router bgp 65001
 bgp redistribute-internal
 redistribute ospf 1 route-map OSPF_TO_BGP
!
ip community-list 1 permit 65001:200
!
ip prefix-list LOOPBACKS seq 5 permit 0.0.0.0/0 ge 32
!
route-map OSPF_TO_BGP permit 10
 match ip address prefix-list LOOPBACKS
 set community 65001:100
!
route-map BGP_TO_OSPF permit 10
 match community 1
 set tag 200
!
route-map BGP_TO_OSPF permit 20
!
route-map OSPF_DIST deny 10
 match tag 200
!
route-map OSPF_DIST permit 20

There can be multiple variations, for example BGP “network” command vs redistribution, or more precise filtering in the prefix list etc, but you get the idea.

For my picky taste, this configuration has a big nope, as there is no filtering in redistribution from OSPF to BGP. However, it isn’t really needed here, perhaps only as an extra paranoid safeguard which won’t be of much use in practice. As shown above, distribute-lists guarantee loop-free redistribution.

If due to misconfiguration or a bug, an ABR somehow installs a route redistributed by the other ABR as OSPF, that will of course impact that site, for example it will have suboptimal traffic flow from ABR to ABR.

But the important thing is whether a redistribution problem in one access site would impact other access sites.

Let’s assume because of a missing community (for whatever reason) or distribute-list configuration mistake, a route from site 1 gets redistributed into BGP on site 2 and propagated back into site 1 (but now with a wrong community!). It will not be redistributed back into its native OSPF domain, since ABRs will not install the BGP route (because they have the native OSPF route with better AD).

Fig. 8

In figure 8, PE1 originates 1.1.1.1/32 in OSPF, ABR1 and ABR2 both redistribute it into BGP, and ABR3 and ABR4 redistribute it into OSPF on site 2. If ABR3 somehow installs the route to 1.1.1.1/32 via OSPF instead of BGP, it will redistribute it back into BGP with the community of site 2.

This is almost a redistribution loop. But OSPF always prefers internal routes over external, so that the redistributed route will never be installed in RIB of any routers in site 1, as long as the native intra-area route from PE1 is present.

If PE1 goes down and the network is in the state described above, this is a slight problem as the route won’t get withdrawn and therefore won’t trigger convergence events for MPLS services. At some point convergence will happen anyway (BGP or LDP hold timer), just slower than expected. Not great, not terrible.

And now the network designer decided to replace OSPF on access with IS-IS. It’s the same thing, isn’t it? On a high level – yes, they are both link-state IGPs. But there are a lot of differences once we get into details.

One such difference is that IS-IS with wide metrics does not distinguish external routes from internal routes: the same TLV 135 is used for both.
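To make the difference concrete, here is a deliberately simplified model of the two selection rules. It ignores OSPF inter-area and E1/E2 distinctions, and the `Path` type, the function names and the metric values are assumptions chosen for illustration.

```python
from typing import NamedTuple


class Path(NamedTuple):
    origin: str
    external: bool  # True if the prefix was redistributed into the IGP
    metric: int


def ospf_best(paths):
    # OSPF ranks route type first: internal beats external regardless of metric.
    return min(paths, key=lambda p: (p.external, p.metric))


def isis_wide_best(paths):
    # With wide metrics nothing marks a prefix as external, so only the metric counts.
    return min(paths, key=lambda p: p.metric)


paths = [
    Path("PE1 (native)", external=False, metric=1000),
    Path("ABR (redistributed)", external=True, metric=10),
]
print(ospf_best(paths).origin)       # PE1 (native)
print(isis_wide_best(paths).origin)  # ABR (redistributed)
```

In the OSPF model the native route survives even a very bad metric, while in the IS-IS model a redistributed copy with a lower metric simply wins.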

Consider the topology:

Fig. 9

There are only 2 levels of hierarchy (in reality, there will be 3 levels and hundreds of access IGP domains). IS-IS routes from the green site are redistributed into BGP with community 65001:100 and routes from the blue site – 65001:200. When routes with a “foreign” community are redistributed into IS-IS, they are marked with the tag which prevents ABRs from installing them as IS-IS routes redistributed by the other ABR.

Sample ABR redistribution config:

hostname ABR1
!
router isis AGG
 redistribute bgp 65001 route-map BGP_TO_ISIS
 distribute-list route-map ISIS_DIST in
!
router bgp 65001
 redistribute isis AGG level-2 route-map ISIS_TO_BGP
!
ip community-list 1 permit 65001:200
!
ip prefix-list LOOPBACKS seq 5 permit 0.0.0.0/0 ge 32
!
route-map ISIS_TO_BGP permit 10
 match ip address prefix-list LOOPBACKS
 set community 65001:100
!
route-map BGP_TO_ISIS permit 10
 match community 1
 set tag 200
!
route-map BGP_TO_ISIS permit 20
!
route-map ISIS_DIST deny 10
 match tag 200
!
route-map ISIS_DIST permit 20

This is the same as the OSPF configs shown previously, and seems to prevent possible redistribution loops as well.

Failure

Now, let’s assume that for some reason one ABR on a remote access site installs a route redistributed by the other ABR on the same site. This can happen for a number of reasons, such as:

  • Config mistake on the ABR
  • BGP community is not propagated for some reason
  • Duplicate IP
  • Software bug

Since by design, new sites will be added to an existing network, it’s not difficult to imagine one of the aforementioned problems occurring when a new site is provisioned.

So what will happen in this situation?

Not much, really. Perhaps some suboptimal traffic flows on the affected site.

For instance, ABR3 installs the route to PE1 (1.1.1.1/32) via ABR4.

ABR3#show ip route 1.1.1.1
Routing entry for 1.1.1.1/32
Known via "bgp 65001", distance 200, metric 20, type internal
Redistributing via isis AGG
Advertised by isis AGG level-2 route-map BGP_TO_ISIS
Last update from 5.5.5.5 00:37:31 ago
Routing Descriptor Blocks:
* 5.5.5.5, from 7.7.7.7, 00:37:31 ago
Route metric is 20, traffic share count is 1
AS Hops 0
MPLS label: 18


ABR3(config-router)#no distribute-list route-map ISIS_DIST in


ABR3#show ip route 1.1.1.1 
Routing entry for 1.1.1.1/32
Known via "isis", distance 115, metric 10, type level-2
Redistributing via isis AGG, bgp 65001
Advertised by bgp 65001 level-2 route-map ISIS_TO_BGP
Last update from 10.0.10.10 on Ethernet0/2, 00:00:01 ago
Routing Descriptor Blocks:
* 10.0.10.10, from 10.10.10.10, 00:00:01 ago, via Ethernet0/2
Route metric is 10, traffic share count is 1

Fig. 10

Doesn’t seem to be a big deal. It would have been better if it had been, because now the problem goes unnoticed while everything keeps working fine.

There are new BGP routes, since ABR3 redistributes 1.1.1.1/32 into BGP, with blue site community 65001:200. But this has no impact.

ABR1#sh ip bgp 1.1.1.1
BGP routing table entry for 1.1.1.1/32, version 2
Paths: (2 available, best #2, table default)
  Advertised to update-groups:
     1         
  Refresh Epoch 1
  Local
    9.9.9.9 (metric 20) from 7.7.7.7 (7.7.7.7)
      Origin incomplete, metric 10, localpref 100, valid, internal
      Community: 65001:200
      Originator: 9.9.9.9, Cluster list: 7.7.7.7
      mpls labels in/out 18/35
      rx pathid: 0, tx pathid: 0
  Refresh Epoch 1
  Local
    10.0.3.3 from 0.0.0.0 (5.5.5.5)
      Origin incomplete, metric 20, localpref 100, weight 32768, valid, sourced, best
      Community: 65001:100
      mpls labels in/out 18/nolabel
      rx pathid: 0, tx pathid: 0x0

Now, PE1 goes down. If suboptimal traffic flow wasn’t a problem, now there will be one: 1.1.1.1/32 doesn’t disappear from every router’s RIB, and therefore doesn’t trigger convergence events for MPLS services. This is because both ABR1 and ABR2 install a BGP route, advertised by ABR3 (with the community of the blue site).

ABR1#sh ip bgp 1.1.1.1
BGP routing table entry for 1.1.1.1/32, version 71
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  Refresh Epoch 1
  Local
    9.9.9.9 (metric 20) from 7.7.7.7 (7.7.7.7)
      Origin incomplete, metric 10, localpref 100, valid, internal, best
      Community: 65001:200
      Originator: 9.9.9.9, Cluster list: 7.7.7.7
      mpls labels in/out 18/35
      rx pathid: 0, tx pathid: 0x0

At some point, PE1 comes back up. Shouldn’t the ABRs on the green site now prefer the native route to the redistributed ones? That would be the case in OSPF.

But in IS-IS, there is no distinction between internal and external (redistributed) routes. Whatever has the lower metric is better. Depending on metrics, ABRs would prefer the “native” route again and the network would recover.

If it wasn’t for…

LDP-IGP sync

In LDP-based MPLS networks, there can be a situation when the router has an IP route to a prefix, but no MPLS label received from the downstream router. This typically happens when a new link is brought up and the IGP adjacency comes up before LDP. In some situations the condition with unlabeled routes can persist. This is highly undesirable and in most cases leads to traffic blackholing. LDP-IGP sync attempts to prevent this. It works as follows:

  1. If the IGP adjacency is down and there is a route to the LDP neighbour’s loopback, the IGP adjacency is not brought up until the LDP session comes up. The idea is to bring up the LDP session over the redundant path and propagate labels, so that by the time IGP comes up, there will be labels for all IP routes.
  2. If the IGP adjacency is up, but LDP is down, the IGP advertises maximum metric for the impacted link. This way it makes the link the least preferable, but still a valid path.

Because of the first rule, the IGP adjacency might never come up in some cases, for example if there is a default or aggregate route but the LDP neighbor is not actually reachable over it. To avoid this catch-22 scenario, it is recommended to configure a holddown timer. After a while (usually a few seconds or so), the IGP adjacency is brought up even if LDP is down.

For the second rule, there are no timers. As long as sync is enabled but not achieved, the IGP will keep advertising maximum metric.
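The two rules can be written down as a pair of small functions. This is a sketch of the behaviour described above, not of any particular implementation; the function and parameter names are assumptions, and the maximum metric value is the one that appears in this lab's IS-IS output.

```python
MAX_METRIC = 16777214  # metric advertised for a link that is not in sync


def may_bring_up_adjacency(ldp_up: bool, peer_route_exists: bool, holddown_expired: bool) -> bool:
    """Rule 1: hold the IGP adjacency down while LDP could still come up over another path."""
    if ldp_up or not peer_route_exists:
        return True
    # LDP is down although a route to the peer exists: wait for the holddown timer.
    # With an infinite holddown, holddown_expired never becomes True.
    return holddown_expired


def advertised_metric(configured: int, ldp_up: bool, sync_enabled: bool = True) -> int:
    """Rule 2: the adjacency is up; advertise max metric for as long as LDP is down."""
    if sync_enabled and not ldp_up:
        return MAX_METRIC
    return configured


print(may_bring_up_adjacency(ldp_up=False, peer_route_exists=True, holddown_expired=False))  # False
print(advertised_metric(10, ldp_up=False))                      # 16777214
print(advertised_metric(10, ldp_up=False, sync_enabled=False))  # 10
```

The asymmetry is visible in the signatures: the first function has a timer input that can eventually release it, the second has none.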

Therefore, once PE1 comes up, its neighbours PE2 and P1 will have a route to 1.1.1.1/32 via ABR1 and ABR2 (leading to nowhere). As a result, LDP between PE1 and its neighbours will not come up. Because of LDP-IGP sync, PE2 and P1 will advertise their links to PE1 with max-metric (16777214).

P1#sh mpls ldp igp sync
    Ethernet0/0:
        LDP configured; LDP-IGP Synchronization enabled.
        Sync status: sync achieved; peer reachable.
        Sync delay time: 0 seconds (0 seconds left)
        IGP holddown time: infinite.
        Peer LDP Ident: 5.5.5.5:0
        IGP enabled: ISIS AGG
    Ethernet0/1:
        LDP configured; LDP-IGP Synchronization enabled.
        Sync status: sync not achieved; peer reachable.
        Sync delay time: 0 seconds (0 seconds left)
        IGP holddown time: infinite.
        IGP enabled: ISIS AGG

P1#show isis database P1.00-00 detail

Tag AGG:

IS-IS Level-2 LSP P1.00-00
LSPID                 LSP Seq Num  LSP Checksum  LSP Holdtime      ATT/P/OL
P1.00-00            * 0x0000000B   0x2F12        996               0/0/0
  Area Address: 49
  NLPID:        0xCC
  Hostname: P1
  Metric: 16777214   IS-Extended PE1.00
  Metric: 10         IS-Extended ABR1.00
  IP Address:   3.3.3.3
  Metric: 0          IP 3.3.3.3/32
  Metric: 16777214   IP 10.0.1.0/24
  Metric: 10         IP 10.0.3.0/24

Alright, redistributed IS-IS routes are preferred over native routes because LDP-IGP sync skews the metrics. But even with high metrics, shouldn’t IS-IS routes still be preferred over BGP routes because of the better AD?

Best path selection

Every routing protocol has its own best path selection algorithm, which is executed before general routing mechanisms such as distribute-lists or AD comparison with other protocols are applied [note 5]. If the protocol selected its best route and that route then lost on AD to another protocol, or was blocked by a distribute list, then bad luck: this protocol lost. Even if other routes could potentially go through, they were already discarded by the best path selection algorithm.

ABR1 has 2 IS-IS routes to 1.1.1.1/32 (the native one advertised by PE1 and the one redistributed by ABR2) and a BGP route. IS-IS has AD 115 and iBGP has AD 200.

First, IS-IS runs its best path selection algorithm.

ABR1#show isis rib 1.1.1.1

IPv4 local RIB for IS-IS process AGG

IPV4 unicast topology base (TID 0, TOPOID 0x0) =================
Routes under majornet 1.0.0.0/8:

1.1.1.1/32
  [115/L2/100] via 10.0.5.6(Ethernet0/1), from 6.6.6.6, tag 200, LSP[4/14] -
  [115/L2/16777224] via 10.0.3.3(Ethernet0/0), from 1.1.1.1, tag 0, LSP[3/10]

The blue route wins because of better metric (100 vs 16777224).

Then, ABR1 attempts to install that route in RIB. But tag 200 is blocked by the distribute-list. So the only remaining valid route is from BGP.

ABR1#show ip route 1.1.1.1
Routing entry for 1.1.1.1/32
  Known via "bgp 65001", distance 200, metric 10, type internal
  Redistributing via isis AGG
  Advertised by isis AGG level-2 route-map BGP_TO_ISIS
  Last update from 9.9.9.9 00:03:17 ago
  Routing Descriptor Blocks:
  * 9.9.9.9, from 7.7.7.7, 00:03:17 ago
      Route metric is 10, traffic share count is 1
      AS Hops 0
      MPLS label: 35
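The whole sequence on ABR1 (IS-IS best path first, then the distribute-list, then AD) can be replayed with a short model. The metrics, tags and distances are taken from the outputs above; the types and function names are assumptions made for this sketch.

```python
from typing import NamedTuple, Optional


class IgpPath(NamedTuple):
    metric: int
    tag: int
    source: str


class RibEntry(NamedTuple):
    protocol: str
    distance: int
    source: str


def isis_offer(paths, denied_tags) -> Optional[RibEntry]:
    """IS-IS selects its best path first; only that single path meets the distribute-list."""
    best = min(paths, key=lambda p: p.metric)
    if best.tag in denied_tags:
        return None  # blocked, and the runner-up is NOT reconsidered
    return RibEntry("isis", 115, best.source)


def install(offers) -> Optional[RibEntry]:
    """The RIB keeps the offer with the lowest administrative distance."""
    valid = [o for o in offers if o is not None]
    return min(valid, key=lambda o: o.distance) if valid else None


bgp_offer = RibEntry("bgp", 200, "9.9.9.9")

# Failure state: the native path is penalised by LDP-IGP sync.
broken = [
    IgpPath(100, 200, "6.6.6.6 (redistributed)"),
    IgpPath(16777224, 0, "1.1.1.1 (native)"),
]
print(install([isis_offer(broken, {200}), bgp_offer]).protocol)  # bgp

# Healthy state: the native path has a normal metric and wins inside IS-IS.
healthy = [
    IgpPath(100, 200, "6.6.6.6 (redistributed)"),
    IgpPath(20, 0, "1.1.1.1 (native)"),
]
print(install([isis_offer(healthy, {200}), bgp_offer]).protocol)  # isis
```

The native IS-IS path never reaches the AD comparison in the failure state, which is why AD 115 versus 200 does not help.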

There is a similar condition on ABR2. Clearing the routing table, flapping links, restarting IS-IS or BGP won’t help here.

Removing distribute lists from either ABR1 or ABR2 is a tempting idea, except that it will make the situation even worse, creating a complete redistribution loop (remember how it all started?)

How to recover

The most reasonable thing to do is to disable and re-enable LDP-IGP sync on either P1 or PE2.

P1(config-if)#no mpls ldp igp sync

P1#show isis database P1.00-00 detail

Tag AGG:

IS-IS Level-2 LSP P1.00-00
LSPID                 LSP Seq Num  LSP Checksum  LSP Holdtime      ATT/P/OL
P1.00-00            * 0x0000000C   0xAB7E        1190              0/0/0
  Area Address: 49
  NLPID:        0xCC
  Hostname: P1
  Metric: 10         IS-Extended PE1.00
  Metric: 10         IS-Extended ABR1.00
  IP Address:   3.3.3.3
  Metric: 0          IP 3.3.3.3/32
  Metric: 10         IP 10.0.1.0/24
  Metric: 10         IP 10.0.3.0/24

Lower metric allows P1 to install the route to PE1 via the directly connected link, LDP comes up, the network recovers.

*Sep  3 00:28:35.439: %LDP-5-NBRCHG: LDP Neighbor 1.1.1.1:0 (1) is UP

ABR1 and ABR2 now also have the valid IS-IS route to PE1:

ABR1#show ip route 1.1.1.1
Routing entry for 1.1.1.1/32
  Known via "isis", distance 115, metric 20, type level-2
  Redistributing via isis AGG, bgp 65001
  Advertised by bgp 65001 level-2 route-map ISIS_TO_BGP
  Last update from 10.0.3.3 on Ethernet0/0, 00:00:33 ago
  Routing Descriptor Blocks:
  * 10.0.3.3, from 1.1.1.1, 00:00:33 ago, via Ethernet0/0
      Route metric is 20, traffic share count is 1

Everything seems to work now. That is, until another router, perhaps in another IS-IS AGG domain, goes down and then up, and the same problem repeats.

How to fix

In most cases, filtering by tags/communities is sufficient to prevent redistribution loops. However, in this design, redistribution happens in 2 stages, hence extra safeguards are required. For example, ABRs should not allow any of local IGP domain prefixes to be redistributed from BGP into IS-IS. That should be applied during redistribution; RIB filtering with distribute lists is not good enough!
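As a sketch of that extra safeguard, the redistribution policy on a green-site ABR could refuse its own domain's prefixes before looking at communities at all. The prefix inventory, the remote loopback 12.12.12.12/32 and all names below are hypothetical; on a real router this would be an additional deny clause in the BGP-to-IS-IS route-map.

```python
from typing import FrozenSet, NamedTuple, Optional


class BgpRoute(NamedTuple):
    prefix: str
    communities: FrozenSet[str]


# Hypothetical inventory of loopbacks that belong to the local (green) IGP domain.
LOCAL_PREFIXES = frozenset({"1.1.1.1/32", "2.2.2.2/32", "3.3.3.3/32"})
REMOTE_SITE_COMMUNITIES = frozenset({"65001:200"})


def bgp_to_isis(route: BgpRoute) -> Optional[dict]:
    """Redistribution policy on a green-site ABR; returns the IS-IS route or None."""
    if route.prefix in LOCAL_PREFIXES:
        return None  # never re-import a local prefix, whatever community it carries
    tag = 200 if route.communities & REMOTE_SITE_COMMUNITIES else 0
    return {"prefix": route.prefix, "tag": tag}


# The leaked route from the failure scenario: a local prefix with the blue community.
leaked = BgpRoute("1.1.1.1/32", frozenset({"65001:200"}))
remote = BgpRoute("12.12.12.12/32", frozenset({"65001:200"}))

print(bgp_to_isis(leaked))  # None
print(bgp_to_isis(remote))  # {'prefix': '12.12.12.12/32', 'tag': 200}
```

Because the check happens at redistribution time, the leaked copy of 1.1.1.1/32 never enters IS-IS and so never competes with the native route in best path selection.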

What could have been done to prevent this problem

  1. Consider designing Seamless MPLS without redistribution. You will probably need BGP for MPLS services anyway, so there is no reason not to enable BGP-LU on the same sessions.
  2. If you decided to walk on the thin ice of redistribution, configure filtering in both directions! This won’t completely exclude the possibility of redistribution loops, especially in complex designs, but will make them less likely to occur.
  3. RFC7794 and RFC7775 introduce a distinction between IS-IS internal and external routes. Or rather bring it back, because the original standard for IS-IS IP routing (RFC 1195) had separate TLVs for internal and external routes, 128 and 130 respectively. They are no longer used in practice because they don’t support traffic engineering extensions or metric values higher than 63.
  4. Don’t blame LDP-IGP sync, it does what it should do. In some scenarios, the same problem would occur even without it. But migrating to Segment Routing simplifies the design and makes various problems less likely to occur and easier to spot.

Conclusion

While routing in a large MPLS network can be complicated, in the end it comes down to fundamentals:

  • Link-state IGP concepts
  • Best path selection
  • Route redistribution

The future of MPLS network design

Despite SDN hype, there are no real alternatives to MPLS in ISP networks. However, the design doesn’t necessarily have to be Seamless MPLS.

Scaling IS-IS

In link-state IGPs, there are two main factors limiting scalability: flooding and SPF. If we solve them, it will be possible to scale one IS-IS domain so that it suffices for the needs of a large ISP network.

draft-ietf-lsr-dynamic-flooding optimizes flooding on dense graphs so that fewer updates are generated. As of the time of this writing, this is already supported by Cisco and Arista and is beneficial even with old IS-IS designs.

draft-li-lsr-isis-area-proxy allows routers from one area to present themselves as one node to the outside world, so that the number of nodes on level 2 is reduced and SPF computation is simplified.

draft-ietf-lsr-isis-extended-hierarchy extends IS-IS from levels 1 and 2 to a theoretically possible 8 levels.

Whether this will become the new best practice for ISP core routing, time will tell.

Large Scale Interconnect with SR

RFC8604 proposes another approach: no communication at all between different IGP domains; an external SDN controller computes paths and installs static Label Switched Paths on routers. This also resolves a very hypothetical problem where there are more than 1,048,576 (yes, a million) routers in the network, so that there would not be enough MPLS labels to assign a node SID to every router.

References

  1. Cisco EPN4 design guide (outdated) https://www.cisco.com/c/dam/en/us/td/docs/solutions/Enterprise/Mobility/EPN/4_0/EPN_4_Transport_Infrastructure_DIG.pdf
  2. Unified MPLS Configuration Example https://www.cisco.com/c/en/us/support/docs/multiprotocol-label-switching-mpls/mpls/116127-configure-technology-00.html
  3. Scaling MPLS Networks https://blog.ine.com/2010/08/16/scaling-mpls-networks
  4. Seamless MPLS Architecture – abandoned draft https://tools.ietf.org/html/draft-ietf-mpls-seamless-mpls-07
  5. IS-IS Route Preference for Extended IP and IPv6 Reachability https://tools.ietf.org/html/rfc7775
  6. IS-IS Prefix Attributes for Extended IPv4 and IPv6 Reachability https://tools.ietf.org/html/rfc7794
  7. IS-IS Multi-Instance https://tools.ietf.org/html/rfc8202
  8. Dynamic Flooding on Dense Graphs https://tools.ietf.org/html/draft-ietf-lsr-dynamic-flooding-04
  9. Area Proxy for IS-IS https://tools.ietf.org/html/draft-li-lsr-isis-area-proxy-01
  10. IS-IS Extended Hierarchy https://tools.ietf.org/html/draft-ietf-lsr-isis-extended-hierarchy-01
  11. Interconnecting Millions of Endpoints with Segment Routing https://tools.ietf.org/html/rfc8604

Notes

  1. RFC4364#section-10
  2. Designs with BGP as the only routing protocol are very common in DC networks (RFC7938); it is possible to apply this design to ISP networks (with BGP-LU). Also, draft-ketant-idr-bgp-ls-bgp-only-fabric proposes BGP-LS extensions for this case. It doesn’t make BGP a link-state protocol, but with the help of an external controller it allows some sort of source routing.
  3. For simplicity, assuming single-area OSPF/single-level IS-IS.
  4. I don’t know why this particular combination was chosen, but probably to avoid the hassle with extra redistribution when loopbacks on ABRs must be advertised in 2 different IGP domains of the same protocol. Multi-instance IS-IS (RFC8202) makes things easier, though.
  5. Except Cisco EIGRP. That’s a weird protocol which relies on AD to choose between its own internal/external routes.

3 thoughts on “Seamless Suffering”

  1. The idea behind using OSPF in an access domain is simple. OSPF has better area partitioning than IS-IS. Mobile operators (and not only them) tend to use ring/half-ring topologies for access devices where each ring/half-ring is placed in its own area. So an access device maintains fewer number of LSA’s. It improves scalability (such access devices are often low-end and based on a merchant silicon) and convergence times.

    1. Thanks for the comment Igor. I don’t think those access rings ever grow large enough for IGP scalability limitations to kick in, although I agree that some vendors have rather poor support for IS-IS while their OSPF implementations are somewhat more decent – this might play a role in the IGP choice. But the general consensus is that single-level IS-IS scales better than single-area OSPF.

      1. I agree, the scaling factor isn’t main here. I’ve just mentioned it as a good side effect. But nevertheless, I think a good IGP partitioning can restrict negatives events in one part of a network from others. It can be important for an access domain where devices often have problems with the code quality.
        And of course, I agree that IS-IS is more scalable than OSPF by design.
