Saturday 17 October 2020

Multicast packets dropped on OpenWRT VLANs

Multicast issue on OpenWRT

Issue

After setting up my hacked switches with VLANs, I wanted to create a pfSense cluster for more resilient internet access. pfSense uses CARP (closely related to VRRP), which relies on multicast to detect when a router or interface goes down. When a router is in CARP master mode, it constantly sends out multicast packets to 224.0.0.18. If another router is in backup mode and cannot see the packets arriving from the master, it will attempt to become the master node.
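As a side note, 224.0.0.18 maps to the Ethernet destination address 01:00:5e:00:00:12 that appears in the captures later in this post: the fixed prefix 01:00:5e is followed by the low 23 bits of the group address. Here is a minimal sketch of that mapping in POSIX shell. The function name `mcast_mac` is made up for illustration and is not part of pfSense or OpenWRT.

```shell
#!/bin/sh
# Hypothetical helper: map an IPv4 multicast group to its Ethernet MAC.
# The MAC is 01:00:5e followed by the low 23 bits of the group address,
# so the top bit of the second octet is masked off.
mcast_mac() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the dotted quad into $1..$4
    IFS=$old_ifs
    printf '01:00:5e:%02x:%02x:%02x\n' $(( $2 & 127 )) "$3" "$4"
}

mcast_mac 224.0.0.18
```

This is handy when filtering by MAC on a device that cannot decode CARP itself.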

The issue I was seeing was that the multicast packets weren’t arriving, so both nodes came up as master on all but one of the CARP IPs. The OpenWRT configuration I had initially only passed multicast on VLAN id 1: all multicast packets bound for VLAN id 1 were echoed to all ports on both switches, while all of the multicast packets for the other VLANs were dropped.

Here is my original /etc/config/network configuration file:

Here’s how the CARP interfaces look. The only working network is LAN, which is on VLAN id 1.

Primary

Master CARP

Secondary

Backup CARP

It’s important to know that the VLANs all work and pass regular unicast and broadcast traffic just fine. Only the multicast CARP packets are not getting through.

Troubleshooting Steps

Packet capture on pfSense

Attempting to capture CARP on any of the affected VLANs shows no packets arriving.

CARP Packet Capture from pfSense

Nothing!

Nothing captured

tcpdump on the switch

This shows CARP packets arriving on the port, destined for the affected VLANs (two IP ranges here are coming in on an untagged port on VLAN 9). Also notice that the CARP packets from VLAN id 1 are being echoed onto this port.

# tcpdump -i eth1 -c 10 -n -e -T carp carp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
20:34:39.400955 00:00:5e:00:01:02 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 1, p 0, ethertype IPv4, 192.168.2.1 > 224.0.0.18: CARPv2-advertise 36: vhid=2 advbase=1 advskew=0 authlen=7 counter=3809067134136217446
20:34:39.401219 00:00:5e:00:01:01 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 9, p 0, ethertype IPv4, 192.168.1.1 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=13620309968023688989
20:34:39.401370 00:00:5e:00:01:0f > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 9, p 0, ethertype IPv4, 10.1.15.2 > 224.0.0.18: CARPv2-advertise 36: vhid=15 advbase=1 advskew=0 authlen=7 counter=13306912573729581052
20:34:40.415665 00:00:5e:00:01:02 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 1, p 0, ethertype IPv4, 192.168.2.1 > 224.0.0.18: CARPv2-advertise 36: vhid=2 advbase=1 advskew=0 authlen=7 counter=3809067134136217446
20:34:40.415871 00:00:5e:00:01:0f > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 9, p 0, ethertype IPv4, 10.1.15.2 > 224.0.0.18: CARPv2-advertise 36: vhid=15 advbase=1 advskew=0 authlen=7 counter=13306912573729581052
20:34:40.416211 00:00:5e:00:01:01 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 9, p 0, ethertype IPv4, 192.168.1.1 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=13620309968023688989
20:34:41.419605 00:00:5e:00:01:02 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 1, p 0, ethertype IPv4, 192.168.2.1 > 224.0.0.18: CARPv2-advertise 36: vhid=2 advbase=1 advskew=0 authlen=7 counter=3809067134136217446
20:34:41.419810 00:00:5e:00:01:01 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 9, p 0, ethertype IPv4, 192.168.1.1 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=13620309968023688989
20:34:41.419956 00:00:5e:00:01:0f > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 9, p 0, ethertype IPv4, 10.1.15.2 > 224.0.0.18: CARPv2-advertise 36: vhid=15 advbase=1 advskew=0 authlen=7 counter=13306912573729581052
20:34:42.429564 00:00:5e:00:01:02 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 1, p 0, ethertype IPv4, 192.168.2.1 > 224.0.0.18: CARPv2-advertise 36: vhid=2 advbase=1 advskew=0 authlen=7 counter=3809067134136217446
10 packets captured
12 packets received by filter
0 packets dropped by kernel
4 packets dropped by interface
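The source MACs in this capture follow the CARP/VRRP virtual MAC pattern: 00:00:5e:00:01: followed by the VHID in hex (vhid=15 gives ...:0f). A small sketch for predicting the MAC to look for from a VHID follows; the function name `carp_vmac` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the CARP virtual MAC for a given VHID.
# The last octet is simply the VHID written as two hex digits.
carp_vmac() {
    printf '00:00:5e:00:01:%02x\n' "$1"
}

carp_vmac 15
```

Knowing the expected MAC makes it easier to spot which virtual IP an advertisement belongs to when scanning a busy capture.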

tcpdump on a Linux endpoint

This shows CARP packets only for VLAN 1. This Linux machine is on an untagged port on VLAN 9.

# tcpdump -i eth0 -c 10 -n -e -T carp carp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
20:37:24.422703 00:00:5e:00:01:02 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 1, p 0, ethertype IPv4, 192.168.2.1 > 224.0.0.18: CARPv2-advertise 36: vhid=2 advbase=1 advskew=0 authlen=7 counter=3809067134136217446
20:37:25.438853 00:00:5e:00:01:02 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 1, p 0, ethertype IPv4, 192.168.2.1 > 224.0.0.18: CARPv2-advertise 36: vhid=2 advbase=1 advskew=0 authlen=7 counter=3809067134136217446
20:37:26.458870 00:00:5e:00:01:02 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 1, p 0, ethertype IPv4, 192.168.2.1 > 224.0.0.18: CARPv2-advertise 36: vhid=2 advbase=1 advskew=0 authlen=7 counter=3809067134136217446
20:37:27.522734 00:00:5e:00:01:02 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 1, p 0, ethertype IPv4, 192.168.2.1 > 224.0.0.18: CARPv2-advertise 36: vhid=2 advbase=1 advskew=0 authlen=7 counter=3809067134136217446
20:37:28.560403 00:00:5e:00:01:02 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 1, p 0, ethertype IPv4, 192.168.2.1 > 224.0.0.18: CARPv2-advertise 36: vhid=2 advbase=1 advskew=0 authlen=7 counter=3809067134136217446
20:37:29.625900 00:00:5e:00:01:02 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 1, p 0, ethertype IPv4, 192.168.2.1 > 224.0.0.18: CARPv2-advertise 36: vhid=2 advbase=1 advskew=0 authlen=7 counter=3809067134136217446
20:37:30.635363 00:00:5e:00:01:02 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 1, p 0, ethertype IPv4, 192.168.2.1 > 224.0.0.18: CARPv2-advertise 36: vhid=2 advbase=1 advskew=0 authlen=7 counter=3809067134136217446
20:37:31.645144 00:00:5e:00:01:02 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 1, p 0, ethertype IPv4, 192.168.2.1 > 224.0.0.18: CARPv2-advertise 36: vhid=2 advbase=1 advskew=0 authlen=7 counter=3809067134136217446
20:37:32.667945 00:00:5e:00:01:02 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 1, p 0, ethertype IPv4, 192.168.2.1 > 224.0.0.18: CARPv2-advertise 36: vhid=2 advbase=1 advskew=0 authlen=7 counter=3809067134136217446
20:37:33.717977 00:00:5e:00:01:02 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 1, p 0, ethertype IPv4, 192.168.2.1 > 224.0.0.18: CARPv2-advertise 36: vhid=2 advbase=1 advskew=0 authlen=7 counter=3809067134136217446
10 packets captured
10 packets received by filter
0 packets dropped by kernel

Resolution

I noticed on the switch that the bridge interface was configured only on VLAN id 1, since its member adapters were all fixed to vid='1'. My fix was to create a bridge interface, plus VLAN interfaces on eth0-3, for each VLAN on which I wanted the multicast packets to work.

The changed parts in /etc/config/network are the new bridge interfaces:

config interface 'lan2'
    option type 'bridge'
    option force_link '0'
    option igmp_snooping '1'
    option ipv6 '0'
    list ifname 'vlan_eth0_2'
    list ifname 'vlan_eth1_2'
    list ifname 'vlan_eth2_2'
    list ifname 'vlan_eth3_2'
    option ip6assign '0'
    list pppoerelay ''

And the new vlan_ethX_xx interfaces:

config device 'vlan_eth0_2'
    option type '8021q'
    option ifname 'eth0'
    option name 'vlan_eth0_2'
    option mtu '1500'
    option vid '2'

config device 'vlan_eth1_2'
    option type '8021q'
    option ifname 'eth1'
    option name 'vlan_eth1_2'
    option mtu '1500'
    option vid '2'

config device 'vlan_eth2_2'
    option type '8021q'
    option ifname 'eth2'
    option name 'vlan_eth2_2'
    option mtu '1500'
    option vid '2'

config device 'vlan_eth3_2'
    option type '8021q'
    option ifname 'eth3'
    option name 'vlan_eth3_2'
    option mtu '1500'
    option vid '2'
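Since the four device stanzas differ only in the port name, they can be generated for any VLAN id rather than typed by hand. The following is a rough sketch, assuming the same four ports (eth0-3) and the naming scheme shown above. The function name `gen_vlan_devices` is invented for this example. It only prints to stdout and does not touch /etc/config/network.

```shell
#!/bin/sh
# Hypothetical generator: print one 'config device' stanza per port
# for the VLAN id given as $1, in the same layout as the stanzas above.
gen_vlan_devices() {
    vid=$1
    for port in eth0 eth1 eth2 eth3; do
        printf "config device 'vlan_%s_%s'\n" "$port" "$vid"
        printf "    option type '8021q'\n"
        printf "    option ifname '%s'\n" "$port"
        printf "    option name 'vlan_%s_%s'\n" "$port" "$vid"
        printf "    option mtu '1500'\n"
        printf "    option vid '%s'\n\n" "$vid"
    done
}

# Example: print the stanzas for VLAN 9 for review before pasting.
gen_vlan_devices 9
```

Reviewing the output before adding it to the config keeps the risk of locking yourself out of the switch low.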

I think the issue is due to the bridge interface being only on vlan id 1 and having multicast snooping enabled. I tried many other changes, like bridging the bare eth0-3 interfaces, but most changes either locked me out of the switch configuration interface or made no difference.
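One way to test the snooping theory is to read the bridge's `multicast_snooping` attribute from sysfs, which Linux exposes for bridge devices. The sketch below assumes OpenWRT names the bridge for interface 'lan2' as `br-lan2`; that name and the function `snooping_state` are assumptions for illustration. The optional second argument overrides the sysfs root, which is only there to make the function easy to try out.

```shell
#!/bin/sh
# Hypothetical helper: report whether multicast snooping is enabled
# on a Linux bridge by reading its sysfs attribute (1 = on, 0 = off).
snooping_state() {
    bridge=$1
    sysfs=${2:-/sys/class/net}
    attr="$sysfs/$bridge/bridge/multicast_snooping"
    if [ -r "$attr" ]; then
        printf '%s multicast_snooping=%s\n' "$bridge" "$(cat "$attr")"
    else
        printf '%s: not a bridge or not present\n' "$bridge"
        return 1
    fi
}

# Example (on the switch): snooping_state br-lan2
```

Comparing the value on the original bridge against the new per-VLAN bridges would show whether snooping is really the variable that changed.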

Here is my /etc/config/network configuration file now. It looks a mess, but it works, and since I didn’t want to make huge changes to the config and lock myself out (again), it’s left like this. There is for sure a better way to do this, but these changes are good enough for me!

Receiving packets on pfSense now

Packets are arriving on the right interfaces now. Only multicast packets for the relevant VLAN are captured, I believe because VMware is stripping the other VLANs from the Port Group.

CARP Packet Capture from pfSense working

tcpdump on the switch now

All CARP packets for all VLANs are now arriving at the switch port shown earlier, and CARP packets from all VLANs are echoed on all ports.

# tcpdump -i eth1 -c 10 -n -e -T carp carp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
20:04:38.379521 00:00:5e:00:01:63 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 15, p 0, ethertype IPv4, 192.168.99.1 > 224.0.0.18: CARPv2-advertise 36: vhid=99 advbase=1 advskew=0 authlen=7 counter=4683731516847153633
20:04:38.379922 00:00:5e:00:01:05 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 2, p 0, ethertype IPv4, 192.168.5.1 > 224.0.0.18: CARPv2-advertise 36: vhid=5 advbase=1 advskew=0 authlen=7 counter=9456878584255914769
20:04:38.380439 00:00:5e:00:01:02 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 1, p 0, ethertype IPv4, 192.168.2.1 > 224.0.0.18: CARPv2-advertise 36: vhid=2 advbase=1 advskew=0 authlen=7 counter=3809067134136217446
20:04:38.380625 00:00:5e:00:01:01 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 9, p 0, ethertype IPv4, 192.168.1.1 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=13620309968023688989
20:04:38.380912 00:00:5e:00:01:0f > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 9, p 0, ethertype IPv4, 10.1.15.2 > 224.0.0.18: CARPv2-advertise 36: vhid=15 advbase=1 advskew=0 authlen=7 counter=13306912573729581052
20:04:39.389533 00:00:5e:00:01:02 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 1, p 0, ethertype IPv4, 192.168.2.1 > 224.0.0.18: CARPv2-advertise 36: vhid=2 advbase=1 advskew=0 authlen=7 counter=3809067134136217446
20:04:39.389852 00:00:5e:00:01:05 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 2, p 0, ethertype IPv4, 192.168.5.1 > 224.0.0.18: CARPv2-advertise 36: vhid=5 advbase=1 advskew=0 authlen=7 counter=9456878584255914769
20:04:39.390086 00:00:5e:00:01:63 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 15, p 0, ethertype IPv4, 192.168.99.1 > 224.0.0.18: CARPv2-advertise 36: vhid=99 advbase=1 advskew=0 authlen=7 counter=4683731516847153633
20:04:39.390316 00:00:5e:00:01:0f > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 9, p 0, ethertype IPv4, 10.1.15.2 > 224.0.0.18: CARPv2-advertise 36: vhid=15 advbase=1 advskew=0 authlen=7 counter=13306912573729581052
20:04:39.390533 00:00:5e:00:01:01 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 9, p 0, ethertype IPv4, 192.168.1.1 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=13620309968023688989

tcpdump on a Linux endpoint now

tcpdump from a Linux machine now shows CARP packets from all VLANs on its physical port. Worth mentioning that the port is configured untagged on VLAN 15.

# tcpdump -i eth0 -c 10 -n -e -T carp carp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
18:15:14.122660 00:00:5e:00:01:63 > 01:00:5e:00:00:12, ethertype IPv4 (0x0800), length 70: 192.168.99.1 > 224.0.0.18: CARPv2-advertise 36: vhid=99 advbase=1 advskew=0 authlen=7 counter=4683731516847153633
18:15:14.122965 00:00:5e:00:01:02 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 1, p 0, ethertype IPv4, 192.168.2.1 > 224.0.0.18: CARPv2-advertise 36: vhid=2 advbase=1 advskew=0 authlen=7 counter=3809067134136217446
18:15:14.123263 00:00:5e:00:01:05 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 2, p 0, ethertype IPv4, 192.168.5.1 > 224.0.0.18: CARPv2-advertise 36: vhid=5 advbase=1 advskew=0 authlen=7 counter=9456878584255914769
18:15:14.123316 00:00:5e:00:01:0f > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 9, p 0, ethertype IPv4, 10.1.15.2 > 224.0.0.18: CARPv2-advertise 36: vhid=15 advbase=1 advskew=0 authlen=7 counter=13306912573729581052
18:15:14.123878 00:00:5e:00:01:01 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 9, p 0, ethertype IPv4, 192.168.1.1 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=13620309968023688989
18:15:15.134075 00:00:5e:00:01:05 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 2, p 0, ethertype IPv4, 192.168.5.1 > 224.0.0.18: CARPv2-advertise 36: vhid=5 advbase=1 advskew=0 authlen=7 counter=9456878584255914769
18:15:15.134351 00:00:5e:00:01:02 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 1, p 0, ethertype IPv4, 192.168.2.1 > 224.0.0.18: CARPv2-advertise 36: vhid=2 advbase=1 advskew=0 authlen=7 counter=3809067134136217446
18:15:15.134627 00:00:5e:00:01:63 > 01:00:5e:00:00:12, ethertype IPv4 (0x0800), length 70: 192.168.99.1 > 224.0.0.18: CARPv2-advertise 36: vhid=99 advbase=1 advskew=0 authlen=7 counter=4683731516847153633
18:15:15.134823 00:00:5e:00:01:01 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 9, p 0, ethertype IPv4, 192.168.1.1 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=13620309968023688989
18:15:15.134831 00:00:5e:00:01:0f > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 74: vlan 9, p 0, ethertype IPv4, 10.1.15.2 > 224.0.0.18: CARPv2-advertise 36: vhid=15 advbase=1 advskew=0 authlen=7 counter=13306912573729581052
10 packets captured
10 packets received by filter
0 packets dropped by kernel
3 packets dropped by interface
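With several VLANs mixed together in one capture, a per-VLAN tally makes it quicker to confirm that every VLAN is represented. Below is a minimal sketch that reads `tcpdump -e` output on stdin and counts CARP advertisements per VLAN tag. The function name `carp_by_vlan` is made up for this example, and frames without an 802.1Q tag are counted as "untagged".

```shell
#!/bin/sh
# Hypothetical helper: count CARP advertisements per VLAN tag from
# 'tcpdump -e' text output on stdin. Frames with no 'vlan N,' field
# are grouped under the label "untagged".
carp_by_vlan() {
    awk '/CARPv2-advertise/ {
        tag = "untagged"
        for (i = 1; i < NF; i++)
            if ($i == "vlan") {
                tag = $(i + 1)
                sub(/,$/, "", tag)   # drop the trailing comma
                break
            }
        count[tag]++
    }
    END { for (t in count) print t, count[t] }' | sort
}

# Example: tcpdump -i eth0 -c 10 -n -e -T carp carp | carp_by_vlan
```

For the ten packets captured above, this would give 2 for VLAN 1, 2 for VLAN 2, 4 for VLAN 9 and 2 untagged (the VLAN 15 traffic, since this port is untagged on VLAN 15).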

Wireshark on Windows inside a VMware virtual machine only shows CARP packets on the VLAN that its port group is configured for. Presumably VMware is stripping irrelevant VLANs. To show CARP traffic in Wireshark, select a VRRP packet, right-click, choose Decode As and pick CARP from the list.

Wireshark Windows

The CARP interfaces all now show Master/Backup correctly. Failover is also working as expected.

Backup CARP

Written with StackEdit.
