It's once again me with a very specific issue:
Setup:
- two routers (80-gw01/2) (RouterOS 7.12.1, independent config, no vrrp)
- two firewalls (80-fw01/2) (HA-config, one configuration on cluster of two, active-passive)
- switchstack (80-csw)
connections (see attached diagram):
- lacp/bonding gw<>gw
- lacp/bonding gw01<>fw01
- lacp/bonding gw02<>fw02
- ha-links fw01<>fw02
- lacp/bonding gw01/2 <> csw
ospf config:
Code:
/routing ospf interface-template
add area=backbonev4 disabled=no interfaces=bonding_cluster use-bfd=yes
add area=backbonev4 cost=10 disabled=no interfaces=bonding_fw use-bfd=no
add area=backbonev4 cost=100 disabled=no interfaces=vlan2 use-bfd=no
The primary router and the primary firewall see each other both over the direct bonding_fw link and over the switched vlan2.
Now, in case of firewall updates, we need to switch from primary to backup.
During this, the primary firewall briefly drops its Ethernet connections, but they come back up.
After the backup firewall has taken over, the OSPF session to the primary fw stays dead (as the primary is functionally dead at this point).
The backup firewall is now the master, and its only connection to the primary router is via vlan2 (which is by design and fine).
Unfortunately, since the backup firewall uses the same router-id and the same IP address on its vlan2 interface, gw01 thinks it's the same router and tries to continue where it left off. But it isn't the same router. It's a different peer that knows nothing about the previous OSPF state, and as such:
Code:
11:15:21 route,ospf,info ospfv4 { version: 2 router-id: 1.1.1.1 } backbonev4 { 0.0.0.0 } interface { broadcast 1.2.3.4%vlan2 } neighbor { router-id: 5.5.5.5 state: ExStart } state change to Exchange
11:15:21 route,ospf,info ospfv4 { version: 2 router-id: 1.1.1.1 } backbonev4 { 0.0.0.0 } interface { broadcast 1.2.3.4%vlan2 } neighbor { router-id: 5.5.5.5 state: Exchange } exchange lsdb size 105
11:15:21 route,ospf,info ospfv4 { version: 2 router-id: 1.1.1.1 } backbonev4 { 0.0.0.0 } interface { broadcast 1.2.3.4%vlan2 } neighbor { router-id: 5.5.5.5 state: Exchange } sequence mismatch
11:15:21 route,ospf,info ospfv4 { version: 2 router-id: 1.1.1.1 } backbonev4 { 0.0.0.0 } interface { broadcast 1.2.3.4%vlan2 } neighbor { router-id: 5.5.5.5 state: Exchange } state change to ExStart
11:15:26 route,ospf,info ospfv4 { version: 2 router-id: 1.1.1.1 } backbonev4 { 0.0.0.0 } interface { broadcast 1.2.3.4%vlan2 } neighbor { router-id: 5.5.5.5 state: ExStart } negotiation done
Router-id of fw-cluster: 5.5.5.5
ip of fw in vlan2: 1.2.3.4
So, what can I do here to fix this issue? I can fix it manually by temporarily disabling the OSPF interface-template for vlan2, then reenabling it - but manual interaction defeats the dynamic and redundant approach.
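For reference, the manual workaround amounts to roughly the following sketch. The `find` filter is an assumption about how the vlan2 template can be matched (adjust it to your own config), and the 5-second delay is an arbitrary value, not something I have tuned:

```
# Sketch only: bounce the OSPF interface-template for vlan2 on gw01.
# Assumption: the template can be matched by its "interfaces" value.
/routing ospf interface-template set [find where interfaces~"vlan2"] disabled=yes
# Arbitrary pause so the neighbor state is torn down before re-enabling.
:delay 5s
/routing ospf interface-template set [find where interfaces~"vlan2"] disabled=no
```

This restarts the adjacency on vlan2 from scratch, but it still has to be triggered by hand (or by some script watching for the failover).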
I don't see any additional timers to configure, and I can't use BFD because this particular firewall doesn't support it, although I'm quite sure it wouldn't even help.
Any ideas, my friends?
Thank you so much,
Irrwitzer