fail over
42 Topics
Moving F5 from switch A to switch B

We have an LTM 1600 pair (HA) currently connected to switch A. We recently added a new switch B, and I want to move the F5s from switch A to switch B. (Note: we are not physically moving the F5s, just replacing the switch.) I see two options - please let me know if anything is wrong with either:

1. Shut down the standby F5, cable it to switch B, bring it back up, fail over to make it Active, then move the remaining unit.
2. Do not shut down the standby; just unplug its cables, move them to switch B, fail over to the unit on switch B, then move the remaining F5.
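A minimal sketch of how the cutover in option 1 could be checked from tmsh; the commands are standard, and no device or group names from the original post are assumed:

    # on the unit staying on switch A, before touching any cables:
    tmsh show cm sync-status          # confirm the pair is In Sync
    tmsh show cm failover-status      # confirm which unit is Active

    # after the standby has been re-cabled to switch B and is back online,
    # run this on the current Active to force it to Standby:
    tmsh run sys failover standby

    # verify the unit on switch B is now Active before moving the second box:
    tmsh show cm failover-status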
BIG-IP Sync-Failover - Sync Failed

Hi, in a project we're running a device group in Sync-Failover mode with the Manual Sync type. After a change on the Active unit, trying to sync from the Active unit to the device group fails with the information below:

Sync Summary
  Status: Sync Failed
  Summary: A validation error occurred while syncing to a remote device
  Details: Sync error on 2nd-unit: Load failed from 1st-unit
           01070110:3: Node address 'node' is referenced by a member of pool 'pool'.
  Recommended action: Review the error message and determine corrective action on the device

We're totally sure that nothing had been changed manually on the 2nd unit, and both units were in sync before the change on the 1st unit. The Last Sync Type field for both units shows Manual Full Node. I couldn't find anything on this case - is it safe to just adjust the configuration on the 2nd unit and then sync from the 2nd unit to the device group? Many thanks in advance!
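Before touching the 2nd unit's configuration, a rough sketch of how one might locate the offending reference from tmsh on that unit; 'node' and 'pool' below stand for the real object names from the error message, and <device-group> is a placeholder:

    # find which pool on the 2nd unit still references the node from the error:
    tmsh list ltm pool one-line | grep node
    tmsh list ltm node

    # once the stale reference is resolved, push the configuration again:
    tmsh run cm config-sync to-group <device-group>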
GTM iRule and log local0. crashing BIG-IP

Hi, I have an iRule like the one below attached to a Wide IP:

when DNS_REQUEST {
    # check where the LDNS is; if the condition below is true, it is in DR1
    if { [active_members location_pl] < 1 } {
        log local0. "Active members in \"location_pl\" - [active_members location_pl]"
        # check if there are any active members in the DR1 pool/location
        if { [active_members dr_a_pl] > 0 } {
            pool dr_a_pl
            log local0. "Active members in \"dr_a_pl\" - [active_members dr_a_pl], selected pool [LB::server pool]"
            log local0. "Switching to DR1"
        } else {
            # use the DC1 pool
            pool dc_a_pl
        }
    }
}

When executed, it immediately crashes my BIG-IP VE v13.0.0 HF2 (failover and so on). When every log local0. is changed to log local2., everything works fine. From the GTM command reference for the log command it seems there should be no problem using the LTM log facility instead of the GTM one - am I wrong, or is this some kind of bug? Piotr
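A small sketch of where those log statements should land, assuming the usual BIG-IP syslog facility mapping (local0 to /var/log/ltm, local2 to /var/log/gtm) - worth confirming on your version while reproducing the issue:

    # log local0. entries go to the LTM log:
    tail -f /var/log/ltm

    # the working log local2. variant writes to the GTM log instead:
    tail -f /var/log/gtm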
Active was down, Standby took over, then Active went up, conflict happened.

Hello, I have an issue with my Active/Standby F5 devices. The Active node (F5_A) lost its network connection and the Standby node (F5_B) took over as Active. After 10 minutes F5_A came back online, so I ended up with Active/Active devices. Everything failed because of this, and I had to force F5_B to Standby to get things online again. Why did this conflict happen?

This article matches the setup we have right now, except that we use network failover because the units are in different locations: http://itadminguide.com/configure-high-availability-activestandby-of-big-ip-f5-ltms/

Auto failback is disabled on both devices. I saw the logs below when F5_A came back online; I am not sure what the expected behavior is once it returns.

Sep 7 22:39:12 f5_B notice sod[7345]: 010c007e:5: Not receiving status updates from peer device /Common/f5_A (10.41.253.44) (Disconnected).
Sep 7 22:39:12 f5_B notice sod[7345]: 010c006d:5: Leaving Standby for Active (best load): NextActive:.
Sep 7 22:39:12 f5_B notice sod[7345]: 010c0053:5: Active for traffic group /Common/only_4751.
Sep 7 22:39:12 f5_B notice sod[7345]: 010c006d:5: Leaving Standby for Active (best load): NextActive:.
Sep 7 22:39:12 f5_B notice sod[7345]: 010c0053:5: Active for traffic group /Common/prefer_4751.
Sep 7 22:39:12 f5_B notice sod[7345]: 010c006d:5: Leaving Standby for Active (best load): NextActive:.
Sep 7 22:39:12 f5_B notice sod[7345]: 010c0053:5: Active for traffic group /Common/prefer_MDR.
Sep 7 22:39:12 f5_B notice sod[7345]: 010c006d:5: Leaving Standby for Active (best load): NextActive:.
Sep 7 22:39:12 f5_B notice sod[7345]: 010c0053:5: Active for traffic group /Common/traffic-group-1.
Sep 7 22:39:12 f5_B notice sod[7345]: 010c0019:5: Active
Sep 7 22:49:10 f5_B notice sod[7345]: 010c007f:5: Receiving status updates from peer device /Common/f5_A (10.41.253.44) (Online).
Sep 7 22:49:10 f5_B notice tmm1[21172]: 01340001:5: HA Connection with peer 10.70.1.236:32771 for traffic-group /Common/only_4751 established.
Sep 7 22:49:10 f5_B notice tmm3[21172]: 01340001:5: HA Connection with peer 10.70.1.236:32769 for traffic-group /Common/only_4751 established.
Sep 7 22:49:10 f5_B notice tmm2[21172]: 01340001:5: HA Connection with peer 10.70.1.236:32768 for traffic-group /Common/only_4751 established.
Sep 7 22:49:10 f5_B notice tmm[21172]: 01340001:5: HA Connection with peer 10.70.1.236:32770 for traffic-group /Common/only_4751 established.
Sep 7 22:49:10 f5_B notice tmm[21172]: 01340001:5: HA Connection with peer 10.70.1.236:32771 for traffic-group /Common/prefer_4751 established.
Sep 7 22:49:10 f5_B notice tmm2[21172]: 01340001:5: HA Connection with peer 10.70.1.236:32769 for traffic-group /Common/prefer_4751 established.
Sep 7 22:49:10 f5_B notice tmm1[21172]: 01340001:5: HA Connection with peer 10.70.1.236:32770 for traffic-group /Common/prefer_4751 established.
Sep 7 22:49:10 f5_B notice tmm3[21172]: 01340001:5: HA Connection with peer 10.70.1.236:32768 for traffic-group /Common/prefer_4751 established.
Sep 7 22:49:10 f5_B notice tmm3[21172]: 01340001:5: HA Connection with peer 10.70.1.236:32769 for traffic-group /Common/prefer_MDR established.
Sep 7 22:49:10 f5_B notice tmm1[21172]: 01340001:5: HA Connection with peer 10.70.1.236:32771 for traffic-group /Common/prefer_MDR established.
Sep 7 22:49:10 f5_B notice tmm2[21172]: 01340001:5: HA Connection with peer 10.70.1.236:32768 for traffic-group /Common/prefer_MDR established.
Sep 7 22:49:10 f5_B notice tmm[21172]: 01340001:5: HA Connection with peer 10.70.1.236:32770 for traffic-group /Common/prefer_MDR established.
Sep 7 22:49:10 f5_B notice tmm3[21172]: 01340001:5: HA Connection with peer 10.70.1.236:32770 for traffic-group /Common/traffic-group-1 established.
Sep 7 22:49:10 f5_B notice tmm1[21172]: 01340001:5: HA Connection with peer 10.70.1.236:32768 for traffic-group /Common/traffic-group-1 established.
Sep 7 22:49:10 f5_B notice tmm2[21172]: 01340001:5: HA Connection with peer 10.70.1.236:32771 for traffic-group /Common/traffic-group-1 established.
Sep 7 22:49:10 f5_B notice tmm[21172]: 01340001:5: HA Connection with peer 10.70.1.236:32769 for traffic-group /Common/traffic-group-1 established.
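For reference, a sketch of the commands typically used to inspect and recover from a dual-Active state like this; run the checks on both units, and the force-to-standby on the unit that should give up the Active role (as was done to F5_B above):

    # how each unit currently sees its own state and its peer:
    tmsh show sys failover
    tmsh show cm failover-status

    # on the unit that should give up the Active role:
    tmsh run sys failover standby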
Failover status error on internal interface

In my VE lab environment I have two VEs in a sync-failover configuration. I have two failover networks: the (only) inside VLAN and the management network. However, something seems wrong:

root@(bigip2)(cfg-sync In Sync)(Standby)(/Common)(tmos) show cm failover-status
--------------------
CM::Failover Status
--------------------
Color    gray
Status   STANDBY
Summary  1/1 standby
Details

--------------------------------------------------------------------------------------------
CM::Failover Connections
  Local Failover Address   Remote Device      Packets Received  Transitions  Last Packet           Status
--------------------------------------------------------------------------------------------
  10.1.20.2:1026           bigip1.study.lab   0                 1            -                     Error
  192.168.14.112:1026      bigip1.study.lab   774520            1            2018-Nov-16 08:11:22  Ok

Warning(s): Only receiving failover updates from bigip1.study.lab on 1 failover interfaces, out of 2.

When I tcpdump the inside VLAN (which carries the 10.1.20.0/24 network) I see failover packets going both ways (output snipped):

root@(bigip2)(cfg-sync In Sync)(Standby)(/Common)(tmos) tcpdump -nni inside.vlan port 1026

out slot1/tmm0 lis= 08:07:26.132896 IP 10.1.20.1.63468 > 10.1.20.2.1026: failover_packet {
    failover_packet_cluster_mgmt_ip ip_address 192.168.14.113
    failover_packet_slot_id uword 0
    failover_packet_state ulong 7
    failover_packet_sub_state ulong 0
    failover_packet_monitor_fault ulong 0
    failover_packet_hop_cnt uword 1
    failover_packet_peer_signal ulong 0
    failover_packet_version ulong 2
    failover_packet_msg_bits ulong 2
    failover_packet_traffic_grp_score ulong 5118
    failover_packet_device_load ulong 1
    failover_packet_device_capacity ulong 0
    failover_packet_traffic_group_load ulong 0
    failover_packet_build_num ulong 4158237899
    failover_packet_next_active ulong 0
    failover_packet_traffic_grp string `/Common/traffic-group-1`
    failover_packet_previous_active ulong 0
    failover_packet_active_reason ulong 8
    failover_packet_left_active_reason ulong 0
}

in slot1/tmm0 lis= 08:07:26.133414 IP 10.1.20.2.8775 > 10.1.20.1.1026: failover_packet {
    failover_packet_cluster_mgmt_ip ip_address 192.168.14.112
    failover_packet_slot_id uword 0
    failover_packet_state ulong 5
    failover_packet_sub_state ulong 0
    failover_packet_monitor_fault ulong 0
    failover_packet_hop_cnt uword 2
    failover_packet_peer_signal ulong 0
    failover_packet_version ulong 2
    failover_packet_msg_bits ulong 2
    failover_packet_traffic_grp_score ulong 5102
    failover_packet_device_load ulong 1
    failover_packet_device_capacity ulong 0
    failover_packet_traffic_group_load ulong 1
    failover_packet_build_num ulong 4158237899
    failover_packet_next_active ulong 1
    failover_packet_traffic_grp string `/Common/traffic-group-1`
    failover_packet_previous_active ulong 1
    failover_packet_active_reason ulong 0
    failover_packet_left_active_reason ulong 1
}

AFM is provisioned; that's the only thing I can see that would stop the packets. The default action is set to ALLOW (for now). I've also tried setting a Security Policy on the self-IP that allows the failover traffic, but that does not fix the problem. Why does it show an error on one interface?
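A short sketch of the checks that usually narrow this down - confirming which unicast failover addresses are defined and whether the self-IP port lockdown permits them; the hostnames match the output above, everything else is generic:

    # confirm both failover unicast addresses are defined on each device object:
    tmsh list cm device bigip1.study.lab unicast-address
    tmsh list cm device bigip2.study.lab unicast-address

    # confirm the port lockdown on the 10.1.20.x self-IPs allows the failover traffic:
    tmsh list net self allow-service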
Standby shows as offline in HA pair in version 12.1.2

Hello all, I know this topic has been discussed before, but for older code versions, and none of those solutions worked for me - I tried pretty much all of them. I have two 3900-series boxes in an HA pair: network failover, port lockdown set to "Allow All", and the boxes directly connected on the HA VLAN with no device in between. After upgrading from 12.0.0 to 12.1.2, the "Overview" page on the active box shows the standby peer as "offline". If I fail over to the other box, the new active box again shows the standby peer as "offline". Everything looks fine from the standby box's "Overview". Both the failover group and the trust group show a full sync, and the HA status on both boxes shows "In Sync", so no issues there. I have tried multiple failovers, reboots from the CLI, power cycles, changing the port lockdown, and even forcing a configuration reload, to no avail - still the same. Does anybody see the same behaviour? I suspect this is a bug in the 12.1.2 code version.
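A quick sketch of how to compare the two units' view of each other while the Overview page shows the peer as offline; it assumes the failover daemon (sod) messages land in /var/log/ltm as in the earlier post:

    # run on both boxes and compare:
    tmsh show cm failover-status
    tmsh show cm sync-status

    # recent failover daemon messages about the peer:
    grep -i sod /var/log/ltm | tail -20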
F5 Gratuitous-ARP issue when failover

Hi, last night we upgraded an F5 from v11.5.4 to v12.1.3. When we failed over from the old unit (v11.5.4) to the new unit (v12.1.3), some IPs had more request timeouts than the rest (we pinged the IP of each VS, about 20 IPs, during the failover). From my understanding, the F5 sends gratuitous ARP (GARP) to its neighbours when it becomes Active. Is it possible that some of the GARPs were dropped, so those IPs saw a longer outage because the neighbour was still using the old ARP entry? Or is it because the neighbouring unit did not accept some of the GARPs from the F5? Is there any other possibility that would stop the neighbouring unit from learning the new ARP entries as expected? Thank you
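One way to confirm whether the newly Active unit actually sends the GARPs is to capture ARP on the relevant VLAN during a test failover; the VLAN name 'external' and the address 10.0.0.10 below are placeholders, not values from the post:

    # on the unit about to become Active, capture gratuitous ARPs for one VS address:
    tcpdump -nni external arp and host 10.0.0.10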
ARP/MAC Tables Not Updating on Core Switches After F5 LTM Failover (GARP Issue?)

We have two F5 LTM 5250v appliances, each running 2 vCMP instances, configured as HA pairs (Active/Standby). Each F5 5250v has a 10G uplink to two core switches (Cisco Nexus 7010), configured as an LACP port-channel on the F5 side and a Port-Channel/vPC on the Nexus side:

Port-Channel127/vPC127 = F5ADC01
Port-Channel128/vPC128 = F5ADC02

When I look at the MAC address tables on both 7K1 and 7K2, I can see all the individual F5 MACs for each VLAN configured on the F5 vCMP instances. The problem occurs during automatic or manual failover: the MAC addresses for the virtual servers are not updated. If F5ADC01 is Active and we force it to Standby, it immediately goes Standby and F5ADC02 immediately takes over the Active role. However, the ARP tables on the Nexus 7K core switches are not updated, so all the virtual servers keep the MAC address associated with F5ADC01.

We have multiple partitions on each vCMP instance, with several VLANs associated with each partition. Each partition has a single route domain that the VLANs are allocated to. For traffic to virtual servers we use SNAT Auto Map (translating to the floating self-IP) and Auto Last Hop, so return traffic passes through the correct source VLAN. We are not using MAC masquerading.

The ARP timeout on the Nexus 7Ks is 1500 seconds (the default), so it takes 25 minutes after a failover for the network to fully recover: eventually the ARP entries for all virtual servers age out and are refreshed with the correct MAC address. Obviously this is not acceptable. I found a Solution article that describes when GARPs can be missed after failover: SOL7332: Gratuitous ARPs may be lost after a BIG-IP failover event. We have confirmed the upstream core switches are not dropping any GARPs. As a test, I manually disabled all virtual servers and then re-enabled them, and all MACs updated immediately.

I have opened a support case with F5, but we have yet to determine where the issue lies. Does anybody have any ideas what the issue might be? If I need to provide more information about our configuration, let me know. We are pretty new to the F5 platform; we recently migrated from the Cisco ACE30 platform, where failover worked perfectly with a similar cabling setup (two port-channels to two separate Catalyst 6509 switches with an ACE30 module in each switch) and the MAC tables/ARP caches updated immediately after failover. Thank you!
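Since the post notes MAC masquerading is not in use, here is a rough sketch of how it is typically enabled per traffic group so the virtual-server MAC stays constant across failovers; the MAC value is just an example, and the exact tmsh property name should be verified against F5's MAC masquerade documentation for your version:

    # assign a locally administered MAC to the traffic group (example address only):
    tmsh modify cm traffic-group traffic-group-1 mac 02:01:23:45:67:89
    tmsh save sys config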
SSL Mirroring when using Proxy SSL

Hi guys, to make failover as seamless as possible we use the SSL session mirroring feature. I haven't found any documentation on whether it makes sense to use it together with the Proxy SSL feature. What's your opinion, and why? Thanks and regards!
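A quick way to see which mirroring- and Proxy-SSL-related options a given version actually exposes, rather than guessing option names; the virtual server and profile names below are placeholders:

    # connection mirroring is configured on the virtual server:
    tmsh list ltm virtual my_vs mirror

    # list the client-ssl profile's full property set and look for mirroring / proxy-ssl options:
    tmsh list ltm profile client-ssl my_clientssl all-properties | grep -iE 'mirror|proxy-ssl'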