Forum Discussion

Fahad_130290
Nimbostratus
Sep 10, 2014

Odd Viprion behaviour

We have a pair of Viprions, each with two blades in Slot1 and Slot2 respectively.

 

On the standby Viprion, Slot2 got abruptly rebooted and was not able to complete initialization afterwards. We saw the following logs.

 

Sep 10 05:14:11 slot2/lb2 notice mcpd[5544]: 01070413:5: Updated existing subscriber tmrouted with new filter class 8004000.
Sep 10 05:14:11 slot2/lb2 err mcpd[5544]: 0107092b:3: Received error result from peer mcpd, result: result { result_code 17237778 result_message "01070712:3: Caught configuration exception (0), Failed to link files existing(/config/ssl/ssl.crt/default.crt) new(/config/.snapshots_d/certificate_d/1410351251_:Common:default.crt_1) errno(2)(No such file or directory). - sys/validation/FileObject.cpp, line 1057." }
Sep 10 05:14:12 slot2/lb2 info mprov:28542:: Invoked as: /usr/bin/mprov.pl (pid=28542) --quiet --legacy
Sep 10 05:14:12 slot2/lb2 info mprov:28542:: 'Provisioning (legacy update) successful.'

 

We noticed that on Slot1 of LB2 the file /config/ssl/ssl.crt/default.crt was not present. Thinking that the actual culprit here was Slot1, we decided to perform a full_box_reboot on Slot1. This ironically meant that on LB2 both slots would be unavailable.
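
To confirm which blades actually had the file, a check like the following could be run from the primary blade. This is only a sketch, assuming the VIPRION clsh utility (which runs a shell command on every blade in the chassis) is available:

# Run from the primary blade; clsh executes the command on all blades in the cluster
clsh ls -l /config/ssl/ssl.crt/default.crt
# Any blade where the file is missing should report "No such file or directory"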

 

Now when we did a full_box_reboot on LB2 (the standby box), Slot1 on LB1 (Active) also got rebooted (we don't know what caused it) and both LB1 and LB2 went into Standby mode.

 

Sep 10 05:05:51 slot2/lb1 notice clusterd[4856]: 013a0006:5: Slot 1 failed (heartbeat).
Sep 10 05:05:51 slot2/lb1 err clusterd[4856]: 013a0014:3: Blade 2: blade 1 FAILED
Sep 10 05:05:51 slot2/lb1 notice sod[5372]: 01140029:5: HA min_up_cluster_member clusterd fails action is failover.
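
At that point the blade and failover state could be checked from tmsh (a sketch; these are standard TMOS/VIPRION commands, output details vary by version):

# Show the chassis cluster and the state of each blade/slot
tmsh show sys cluster
# Show whether this unit is currently Active or Standby
tmsh show sys failover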

 

Our setup is configured with the min_up_cluster_member parameter, which means that if 1 of the 2 blades goes down, the unit will force itself to Standby. This would be the reason why LB1 forced itself to Standby, but in our case the standby LB2 was not in a state to take up the active role.
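
For reference, the threshold behind that log message can be viewed in the chassis cluster configuration (a sketch assuming the tmsh sys cluster component; the exact property names, e.g. min-up-members and min-up-members-action, are assumptions and may differ by TMOS version):

# List the chassis cluster configuration, including the minimum-up-members threshold and its action
tmsh list sys cluster default all-properties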

 

Is this normal behavior?

 

1 Reply

  • BinaryCanary_19
    Historic F5 Account

    When two boxes go into standby mode, the reason is usually that VLAN failsafe is misconfigured.

     

    By misconfigured, I mean it is configured on a VLAN that is tied to a device that went down. One typical example is the HA VLAN. Naturally, when you reboot the standby device, this VLAN will go down and trigger a failsafe action on the active box (which is typically "go standby").
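
    A hedged sketch of how to review that with tmsh (the VLAN name HA_vlan is hypothetical; verify the failsafe property names on your TMOS version):

    # List every VLAN with its failsafe settings to see where failsafe is enabled
    tmsh list net vlan all-properties | grep -E 'net vlan|failsafe'

    # If failsafe on the HA VLAN is what forced the active box to standby,
    # it can be disabled on that VLAN and the change saved
    tmsh modify net vlan HA_vlan failsafe disabled
    tmsh save sys config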