Docs - EVPN: extend the information about arp suppression support in case of L2 #16015
Open
2 tasks done
Labels
triage
Needs further investigation
Description
With the premise that I am still learning about EVPN, I tried to set up an EVPN L2 topology to see how it works. I was able to make everything work except ARP suppression, so I wonder whether mine is a corner case or whether the docs could be enhanced a bit.
ARP suppression is meant to reduce the amount of multicast (or replicated) traffic going through the underlay.
The docs say:
"ARP/ND suppression is enabled per bridge_slave via neigh_suppress." and suggest that setting the parameter enables ARP suppression in a MAC VRF. The problem is that the parameter alone is not enough (or I am missing something).
Given a topology like the one in my containerlab-based example:
When I arping host2 from host1, everything but ARP suppression works, meaning that every ARP request goes through the VXLAN tunnel.
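For reference, the setting the docs talk about is applied per bridge port with iproute2's `bridge link set`. The snippet below is only a sketch that builds that command line; the port name `vxlan100` is a hypothetical example and not taken from the lab.

```python
from typing import List


def neigh_suppress_cmd(port: str, enable: bool = True) -> List[str]:
    """Build the iproute2 command that toggles neigh_suppress on a bridge port."""
    state = "on" if enable else "off"
    return ["bridge", "link", "set", "dev", port, "neigh_suppress", state]


if __name__ == "__main__":
    # "vxlan100" is a made-up VXLAN port name; adjust it to the actual setup.
    print(" ".join(neigh_suppress_cmd("vxlan100")))
```

As described in this issue, setting this flag is not sufficient on its own: the leaf still needs a neighbor entry for the target IP before it can answer.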
My understanding is that:
But the ARP request is broadcast, and the reply comes back to the leaf's bridge and is unicast towards host1. So the leaf's own host stack never sees it, and the ARP cache is never filled.
Note: I don't think this is an FRR bug, but I think the docs should be clearer about what is provided by FRR and what is provided by the kernel in this scenario.
Relevant discussion here: https://frrouting.slack.com/archives/CP5NXU36G/p1715767012284389
Version
How to reproduce
Set up an L2 EVPN and generate some traffic between two hosts connected to the Linux bridges. An example can be found here: https://github.com/fedepaol/evpnlab/tree/main/02_clab_l2
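While reproducing, one way to see whether the leaf has learned host2's IP is to look at its neighbor table with `ip neigh show`. The helper below is a minimal sketch that parses that output into an IP to MAC map; the sample text, the addresses, and the device name `br10` are made up for illustration.

```python
from typing import Dict


def parse_ip_neigh(output: str) -> Dict[str, str]:
    """Map IP -> MAC for `ip neigh show` lines that carry an lladdr."""
    table: Dict[str, str] = {}
    for line in output.splitlines():
        tokens = line.split()
        if "lladdr" not in tokens:
            continue  # e.g. FAILED or INCOMPLETE entries have no MAC
        idx = tokens.index("lladdr")
        if idx > 0 and idx + 1 < len(tokens):
            table[tokens[0]] = tokens[idx + 1]
    return table


# Hypothetical sample output, not captured from the lab.
SAMPLE = """\
192.168.10.1 dev br10 lladdr aa:bb:cc:00:00:01 REACHABLE
192.168.10.3 dev br10 FAILED
"""

if __name__ == "__main__":
    neighbors = parse_ip_neigh(SAMPLE)
    # If host2's IP is missing here, the leaf has nothing to answer with.
    print("192.168.10.2" in neighbors)
```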
Expected behavior
After the first arping from host1 to host2, the local leaf replies to the ARP request itself instead of forwarding it.
Actual behavior
The neigh_suppress parameter relies on the neighbor table to proxy the ARP request.
The neighbor table of the leaf is filled either because a type 2 route is sent, or by the kernel when it receives an ARP reply.
So, if the traffic is pure L2 between the hosts, no ARP cache is filled and ARP suppression never kicks in. I was reading #12574 (comment), where neighmgrd is mentioned, which would probably solve the issue.
Additional context
I am not sure whether there are scenarios with mixed L2 and L3 where the leaf acts as gateway, so that ARP requests would go from the bridge to the host and the resulting ARP traffic would fill the local ARP table, but in a pure L2 scenario I think it just doesn't work.
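The behavior described above can be illustrated with a toy model. This is only a sketch of my understanding, not the kernel's or FRR's implementation: it assumes that the leaf proxies an ARP request only when the target IP is in its neighbor table, that a type 2 route fills the table only when it carries an IP, and that an ARP reply is learned only when it is addressed to the leaf itself. The class name, method names, and all addresses are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class Leaf:
    """Toy leaf with neigh_suppress enabled on its VXLAN port."""
    own_ips: Set[str] = field(default_factory=set)           # IPs owned by the leaf
    neighbors: Dict[str, str] = field(default_factory=dict)  # IP -> MAC

    def on_type2_route(self, mac: str, ip: Optional[str]) -> None:
        # Assumption: a MAC-only type 2 route cannot create a neighbor entry.
        if ip is not None:
            self.neighbors[ip] = mac

    def on_arp_reply(self, sender_ip: str, sender_mac: str, target_ip: str) -> None:
        # Assumption: replies bridged to a host port are forwarded, not learned;
        # only replies addressed to the leaf itself reach its neighbor table.
        if target_ip in self.own_ips:
            self.neighbors[sender_ip] = sender_mac

    def on_arp_request(self, target_ip: str) -> str:
        # Suppression needs a neighbor entry to answer from.
        return "proxied" if target_ip in self.neighbors else "flooded"


def pure_l2_demo() -> Tuple[str, str]:
    """host1 arpings host2 twice through a leaf with no IP of its own."""
    host1_ip = "192.168.10.1"
    host2_ip, host2_mac = "192.168.10.2", "aa:bb:cc:00:00:02"
    leaf = Leaf()
    leaf.on_type2_route(host2_mac, None)              # MAC-only advertisement
    first = leaf.on_arp_request(host2_ip)
    leaf.on_arp_reply(host2_ip, host2_mac, host1_ip)  # reply is for host1
    second = leaf.on_arp_request(host2_ip)
    return first, second


if __name__ == "__main__":
    print(pure_l2_demo())
```

Under these assumptions the second request is flooded just like the first in the pure L2 case, while a type 2 route carrying the IP, or a leaf that owns the target IP of the reply, would let later requests be proxied.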
Checklist