Move IP gateways for codfw row C/D vlans to EVPN Anycast GW
Open, Medium, Public

Description

As part of the codfw row C/D switch upgrade/migration we need to move the IP gateway for the vlans in those rows from the core routers (w/VRRP) to the new L3 switches (using EVPN Anycast GW).

This can be completed at any stage after we complete T366941: Move asw-c-codfw and asw-d-codfw CR uplinks to Spine switches.

For IPv6 the process is much easier, as we use router advertisements (RAs) to control which link-layer address (MAC) hosts use as their gateway. The beauty of this is that both the CRs and the switches can send RAs at the same time; hosts will use one or the other, and things still work. Obviously we don't want a long overlap, but it means we can do things gracefully. Once we start sending RAs from the Spines we can disable advertisements on the CRs, and the servers will transition to the Spine GW when the router lifetime from the last RA they received from the CRs expires.
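
The overlap behaviour can be illustrated with a small model of a host's IPv6 default router list. This is only a sketch: the router names, the 300 s RA interval and the 1800 s router lifetime are assumed values for illustration, not our actual configuration.

```python
from dataclasses import dataclass, field


@dataclass
class DefaultRouterList:
    """Per-host IPv6 default router list: router name -> absolute expiry (s)."""
    expiry: dict = field(default_factory=dict)

    def receive_ra(self, router: str, now: float, lifetime: float) -> None:
        # Per RFC 4861, a Router Lifetime of 0 means the sender must not
        # be used as a default router, so the entry is dropped at once.
        if lifetime == 0:
            self.expiry.pop(router, None)
        else:
            self.expiry[router] = now + lifetime

    def usable(self, now: float) -> list:
        # Routers whose advertised lifetime has not yet run out.
        return sorted(r for r, t in self.expiry.items() if t > now)


def simulate(events, checkpoints):
    """events: (time, router, lifetime) tuples; returns usable routers per checkpoint."""
    host = DefaultRouterList()
    pending = sorted(events)
    result = {}
    for t in sorted(checkpoints):
        # Apply every RA received up to and including this checkpoint.
        while pending and pending[0][0] <= t:
            when, router, lifetime = pending.pop(0)
            host.receive_ra(router, when, lifetime)
        result[t] = host.usable(t)
    return result


LIFETIME = 1800  # assumed router lifetime in seconds
# Hypothetical timeline: the CR sends its last RA at t=600,
# the Spine starts at t=300 and keeps advertising.
cr_ras = [(t, "cr", LIFETIME) for t in (0, 300, 600)]
spine_ras = [(t, "spine", LIFETIME) for t in range(300, 3001, 300)]
timeline = simulate(cr_ras + spine_ras, checkpoints=[0, 600, 2399, 2400])
for t, routers in timeline.items():
    print(t, routers)
```

In this model the host has both gateways available during the overlap and is left with only the Spine once the lifetime of the last CR advertisement runs out, with no point at which it has no gateway.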

For IPv4 the situation is trickier, as hosts resolve the MAC for their gateway using ARP and cache it. If there is an overlap where both the new switches and the CRs have the same IP configured, hosts will randomly get one or the other back in an ARP response. This is unlike the v6 situation, where they receive and process the RAs from both the CRs and the Spines.

The solution followed previously was a trick: add routes via cumin (which thankfully uses v6 to reach the hosts) to temporarily move hosts to a new GW IP (configured only on the Spines), then move the actual GW IP from the CRs:

https://wikitech.wikimedia.org/wiki/Migrate_from_VC_switch_stack_to_EVPN#Migrate_IP_Gateways
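
As a rough illustration of the ordering only, the sketch below builds a step list for one vlan. The linked wikitech page remains the authoritative procedure. Everything specific here is an assumption: the helper name `gateway_migration_plan` is hypothetical, the addresses are from the documentation range, and the use of two half-default routes (`0.0.0.0/1` and `128.0.0.0/1`) is just one plausible way to steer hosts to the temporary GW without touching their existing default route.

```python
import ipaddress


def gateway_migration_plan(vlan_subnet: str, real_gw: str, temp_gw: str) -> list:
    """Return ordered (target, action) steps for moving one vlan's IPv4 gateway.

    Hypothetical helper for illustration; it only builds strings and
    does not touch any device.
    """
    net = ipaddress.ip_network(vlan_subnet)
    for label, addr in (("real_gw", real_gw), ("temp_gw", temp_gw)):
        if ipaddress.ip_address(addr) not in net:
            raise ValueError(f"{label} {addr} is not inside {vlan_subnet}")
    if real_gw == temp_gw:
        raise ValueError("temporary gateway must differ from the real gateway")

    # Assumption: two /1 routes are more specific than the default route,
    # so they take precedence without the default route being modified.
    halves = ("0.0.0.0/1", "128.0.0.0/1")
    steps = [("spines", f"configure temporary anycast GW {temp_gw}")]
    steps += [("hosts", f"ip route add {p} via {temp_gw}") for p in halves]
    steps.append(("crs-to-spines", f"move GW {real_gw} from CRs (VRRP) to Spines"))
    steps += [("hosts", f"ip route del {p} via {temp_gw}") for p in halves]
    return steps


# Example addresses from the RFC 5737 documentation range, not a real vlan.
plan = gateway_migration_plan("192.0.2.0/24", "192.0.2.1", "192.0.2.254")
for target, action in plan:
    print(f"{target:14} {action}")
```

The point of the ordering is that hosts never depend on the real GW IP while it exists on both the CRs and the Spines, which avoids the ARP race described above.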

Details

Other Assignee
Papaul