Controlling ARP Traffic on the AMS-IX platform

  1. ARP (Address Resolution Protocol)

    ARP (Address Resolution Protocol) is the Layer-2 protocol that AMS-IX members' routers use to associate an IPv4 address with the MAC address of a peer's interface.
  2. Problems caused by too much ARP traffic

    On Ethernet networks, the Address Resolution Protocol (ARP) is used to find the MAC address for a given IPv4 address. ARP uses Ethertype 0x0806 together with Ethernet broadcasting. A node broadcasts an ARP Request packet to ask for the MAC address of an unknown IPv4 address. The node using the requested IP address replies (using regular unicast) with an ARP Reply packet, which includes its MAC address.

    For this to work, all nodes using IPv4 must listen for ARP packets and reply to them if necessary. The nodes therefore need to process all Ethernet broadcast messages with Ethertype 0x0806 and decide, for each ARP packet, whether or not to reply. Processing ARP packets can take a lot of processing power. Because all ARP packets need to be examined in order for ARP to work, processing ARP packets may take precedence over other activities, depending on the operating system. As a result, when there is a lot of ARP traffic, routers may be unable to perform other processing tasks such as maintaining BGP sessions.

    This problem was noticed on AMS-IX when the ISP peering LAN was renumbered to new IPv4 addresses. Members in the new IPv4 range were trying to reach members in the old IPv4 range and vice versa. More ARP packets than usual crossed the network, consuming all available processing power on some customer routers and leaving too little to process BGP in a timely manner, which resulted in lost BGP sessions. Other routers then started sending ARP packets to re-establish these BGP sessions, resulting in an ARP storm that brought even more routers down.
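
    The per-packet work described in this section can be sketched in Python as follows. This is only an illustration: the function names build_arp_request and should_reply are hypothetical, the addresses are documentation addresses rather than AMS-IX ones, and a real router does this in its own network stack.

```python
import socket
import struct

ETHERTYPE_ARP = 0x0806
ARP_REQUEST = 1


def build_arp_request(sender_mac: str, sender_ip: str, target_ip: str) -> bytes:
    """Build a broadcast ARP Request frame (14-byte Ethernet + 28-byte ARP)."""
    mac = bytes.fromhex(sender_mac.replace(":", ""))
    ethernet = b"\xff" * 6 + mac + struct.pack("!H", ETHERTYPE_ARP)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6, 4,                         # MAC and IPv4 address lengths
        ARP_REQUEST,
        mac,                          # sender MAC address
        socket.inet_aton(sender_ip),  # sender IPv4 address
        b"\x00" * 6,                  # target MAC address: unknown
        socket.inet_aton(target_ip),  # target IPv4 address
    )
    return ethernet + arp


def should_reply(frame: bytes, my_ip: str) -> bool:
    """Decide whether a received frame is an ARP Request for my_ip."""
    if len(frame) < 42:
        return False
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_ARP:
        return False
    htype, ptype, hlen, plen, opcode = struct.unpack("!HHBBH", frame[14:22])
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        return False
    target_ip = socket.inet_ntoa(frame[38:42])
    return opcode == ARP_REQUEST and target_ip == my_ip


# Every node on the LAN receives this frame, but only one should answer it.
frame = build_arp_request("02:00:00:00:00:01", "192.0.2.1", "192.0.2.2")
print(should_reply(frame, "192.0.2.2"), should_reply(frame, "192.0.2.3"))
```

    Every node has to run a check like should_reply for every broadcast ARP frame, even though almost all of them are meant for someone else. That is why a surge of ARP requests costs CPU time on every router on the LAN.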
  3. ARP Sponge – the AMS-IX solution
    To help routers survive heavy ARP traffic, AMS-IX decided to try to keep the amount of ARP traffic down. For this purpose, AMS-IX developed a daemon, written in Perl, called ARP Sponge. The ARP Sponge listens on the ISP peering LAN for ARP traffic.
    When the number of ARP Requests for a certain IP address exceeds a threshold, the ARP Sponge sends out an ARP Reply for that IP address using its own MAC address.
    From that moment, the IP address is sponged: all traffic to that node is sent to the ARP Sponge. This prevents ARP storms because it keeps the amount of ARP traffic down. When the interface of a sponged IP address comes up again, it generally sends out a gratuitous ARP request packet. This is an ARP packet with both the source and destination IP address set to the IP address of the node sending the packet. It is used mostly when the MAC address has changed, so that other nodes can update their ARP caches. When the ARP Sponge receives any traffic from a sponged IP address (including but not limited to gratuitous ARP requests, ARP requests for other nodes, and BGP peering initiations), it ceases sponging the IP address and no longer sends out ARP replies for it.
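
    The sponging decision described above can be modelled roughly as below. This is a simplified Python sketch, not the actual ARP Sponge code (which is written in Perl): the class name SpongeModel, the plain counter and the threshold value are assumptions made for illustration, and the real daemon may use different criteria, such as request rates.

```python
from collections import defaultdict


class SpongeModel:
    """Toy model of the sponging decision; not the actual ARP Sponge code."""

    def __init__(self, threshold: int):
        self.threshold = threshold       # hypothetical: requests tolerated per IP
        self.pending = defaultdict(int)  # ARP Request count per target IP
        self.sponged = set()             # IPs currently answered by the sponge

    def on_traffic_from(self, source_ip: str) -> None:
        """Any traffic from an IP shows it is alive, so stop sponging it."""
        self.sponged.discard(source_ip)
        self.pending.pop(source_ip, None)

    def on_arp_request(self, source_ip: str, target_ip: str) -> bool:
        """Return True when the sponge should answer with its own MAC address."""
        self.on_traffic_from(source_ip)
        if source_ip == target_ip:
            return False                 # gratuitous ARP: nothing to answer
        if target_ip in self.sponged:
            return True
        self.pending[target_ip] += 1
        if self.pending[target_ip] > self.threshold:
            self.sponged.add(target_ip)  # threshold exceeded: start sponging
            return True
        return False


sponge = SpongeModel(threshold=3)
for _ in range(4):
    answered = sponge.on_arp_request("192.0.2.10", "192.0.2.1")
print(answered, sponge.sponged)

# The node comes back and announces itself with a gratuitous ARP request.
sponge.on_arp_request("192.0.2.1", "192.0.2.1")
print(sponge.sponged)
```

    In this model the counter for an IP is only cleared when traffic from that IP is seen, which mirrors the un-sponging rule in the text: any sign of life from the address ends the sponging.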

    The current AMS-IX ARP Sponge MAC address is: 00:25:90:0a:0a:bd
    [Figure: ARP Sponge sponging]

  4. Common issue with IPv4 addresses after being sponged by the ARP Sponge

    * Unable to exchange traffic with AMS-IX peers when an IPv4 address comes back up after being inactive for a period of time

    If an IPv4 address is sponged, the ARP entry for this IP in the members' ARP tables is registered with the ARP Sponge MAC address. After the IPv4 address becomes reachable again and is "un-sponged", the ARP tables of peers might not be updated quickly enough with the IP's real MAC address, so traffic from these peers toward the recovered IP is still forwarded to the sponge MAC address.

    For instance, if the IP 80.249.208.1 of member A, with MAC address AAAA.AAAA.AAAA, is sponged with the sponge MAC address EEEE.EEEE.EEEE, then member B's ARP entry for the address will be 80.249.208.1 - EEEE.EEEE.EEEE. After member A recovers, it sends traffic toward member B, but if member B's ARP entry has not yet been updated with the original address AAAA.AAAA.AAAA, member B's traffic toward member A will end up being sent to EEEE.EEEE.EEEE until member B updates the ARP entry.

    This issue should resolve itself after a certain period of time, once the daemon stops sending ARP replies for this IP and the un-sponged IP and its peers update their ARP entries themselves.
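
    The stale-entry behaviour described above can be illustrated with a small model of member B's ARP cache. This is a hedged sketch: the ArpCache class and the 300-second timeout are hypothetical, and real routers use vendor-specific ARP timeouts and update rules.

```python
from dataclasses import dataclass

MEMBER_A_IP = "80.249.208.1"
MEMBER_A_MAC = "AAAA.AAAA.AAAA"
SPONGE_MAC = "EEEE.EEEE.EEEE"


@dataclass
class ArpEntry:
    mac: str
    learned_at: float


class ArpCache:
    """Minimal ARP cache with a fixed entry lifetime (hypothetical model)."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.entries = {}

    def learn(self, ip: str, mac: str, now: float) -> None:
        """Store or overwrite an entry, as an ARP reply or gratuitous ARP would."""
        self.entries[ip] = ArpEntry(mac, now)

    def lookup(self, ip: str, now: float):
        """Return the cached MAC address, or None if absent or expired."""
        entry = self.entries.get(ip)
        if entry is None or now - entry.learned_at >= self.timeout_s:
            self.entries.pop(ip, None)
            return None
        return entry.mac


member_b = ArpCache(timeout_s=300)                   # assumed timeout
member_b.learn(MEMBER_A_IP, SPONGE_MAC, now=0)       # learned while A was sponged

# Member A is back at t=100, but B still forwards to the sponge.
print(member_b.lookup(MEMBER_A_IP, now=100))

# Once the entry expires, B must resolve the address again and learns A's MAC.
print(member_b.lookup(MEMBER_A_IP, now=300))
member_b.learn(MEMBER_A_IP, MEMBER_A_MAC, now=301)
print(member_b.lookup(MEMBER_A_IP, now=302))
```

    A gratuitous ARP request from member A corresponds to calling learn before the entry has expired, which is why it shortens the time during which traffic is sent to the sponge.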

    The issue is more significant for members that only have peering sessions with the AMS-IX route-servers. If a member does not have peering sessions with the route-servers, its BGP sessions with peers must be brought up one by one, and the ARP entries are sure to be updated during BGP session establishment. Traffic is then properly forwarded to and received from each peer. However, if the newly "un-sponged" member only has peering sessions with the route-servers, it re-establishes those BGP sessions after recovery and receives the prefixes of AMS-IX peers from them. In that case, traffic may be forwarded to the next-hop IP of peers that still hold the sponged ARP entry.

    Therefore, the AMS-IX NOC recommends that members whose IPv4 address has been unreachable for a prolonged period (and is therefore certainly sponged) temporarily shut down peering with the route-servers and send a gratuitous ARP request to update their peers' ARP tables.
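
    For illustration, a gratuitous ARP request as described in section 3 could be built like this in Python. The function name build_gratuitous_arp and the addresses are hypothetical, and setting the target MAC address to all zeros is an assumption; in practice the frame is normally generated by the router itself, for example when the interface comes up.

```python
import socket
import struct


def build_gratuitous_arp(own_mac: str, own_ip: str) -> bytes:
    """Build a broadcast ARP Request whose sender and target IP are both own_ip."""
    mac = bytes.fromhex(own_mac.replace(":", ""))
    ip = socket.inet_aton(own_ip)
    ethernet = b"\xff" * 6 + mac + struct.pack("!H", 0x0806)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6, 4,         # MAC and IPv4 address lengths
        1,            # opcode: request
        mac,          # sender MAC address: the address peers should learn
        ip,           # sender IPv4 address
        b"\x00" * 6,  # target MAC address (assumed all zeros here)
        ip,           # target IPv4 address: same as the sender address
    )
    return ethernet + arp


frame = build_gratuitous_arp("02:00:00:00:00:0a", "192.0.2.1")
print(len(frame), frame[28:32] == frame[38:42])
```

    Because the frame is broadcast, every peer on the LAN sees the sender MAC and IP pair at once, and peers that already hold an entry for that IP can overwrite the sponge MAC address with it.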

Acknowledgement 

The ARP Sponge explanation is extracted from the 2009 report by Marco Wessel and Niels Sijm of the Universiteit van Amsterdam (mwessel@os3.nl, nsijm@os3.nl), who researched the effects of IPv4 and IPv6 address resolution on the AMS-IX platform and the ARP Sponge during their Master's course in System and Network Engineering.