Forum Discussion
TomSu_93471
Nimbostratus
Jan 30, 2012
Splitting 2 production Viprion clusters (Active/Standby) into two standalone clusters
Hi,
I need to split 2 production Viprion clusters (redundant pair, Active/Standby) into two standalone clusters.
Since this is an Active/Standby config, I suppose I could just force the standby unit offline and uncable it, then reset its config and configure it as a separate standalone cluster. However, I wonder about the active cluster, which will keep thinking it is part of an Active/Standby pair, waiting for the standby to come up again some day... I'm wondering if I might run into issues later when upgrading the system, relicensing, changing the configs, etc.?
If someone could advise on the best way to split two redundant clusters (Active/Standby) into two separate standalone clusters, I would be grateful.
Cheers,
Tom
17 Replies
- nitass
Employee
i think just removing relevant configuration such as mirroring and floating self IPs, and changing the high availability setting at System > Platform to Single Device, could be okay.
- TomSu_93471
Nimbostratus
yes, basically it should be doable, but I just don't want to have any outage. I think that setting System > Platform > HA to Single Device should remove all the other configs, as normally on a node which is set to Single Device you don't even have the options available for network mirroring and all that redundancy stuff.
Question is, has anyone done this and can confirm nothing bad will happen? like for example license re-activation or some service/module restarts that will affect traffic..
- nitass
Employee
i did test on my pair. nothing was restarted, but i noticed sod went offline and came back to active in the ltm log. it happened in a second. anyway, i do not have much traffic, so i am not sure if it affects anything. let us see if somebody here has done it before.

[root@B1600-R66-S17:Active] config tail -f /var/log/ltm
(1) Jan 30 07:23:16 local/B1600-R66-S17 notice sod[3422]: 010c003e:5: Offline
Jan 30 07:23:16 local/B1600-R66-S17 notice mcpd[3453]: 01070413:5: Updated existing subscriber tmrouted with new filter class 800c000.
(2) Jan 30 07:23:16 local/B1600-R66-S17 notice sod[3422]: 010c0051:5: Active (not redundant)
Jan 30 07:23:16 local/B1600-R66-S17 notice mcpd[3453]: 01070413:5: Updated existing subscriber tmrouted with new filter class 4800d210.
Jan 30 07:23:16 local/B1600-R66-S17 notice mcpd[3453]: 01070410:5: Removed subscription with subscriber id bgpd
Jan 30 07:23:20 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 1.1.1.19 (00:01:d7:b4:60:44) on vlan 3422
Jan 30 07:23:21 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 200.200.200.19 (00:01:d7:b4:60:43) on vlan 423
Jan 30 07:23:21 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 2.2.2.19 (00:01:d7:b4:60:45) on vlan 3421
Jan 30 07:23:21 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 1.1.1.19 (00:01:d7:b4:60:44) on vlan 3422
Jan 30 07:23:22 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 200.200.200.19 (00:01:d7:b4:60:43) on vlan 423
Jan 30 07:23:22 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 2.2.2.19 (00:01:d7:b4:60:45) on vlan 3421
Jan 30 07:23:22 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 1.1.1.19 (00:01:d7:b4:60:44) on vlan 3422
Jan 30 07:23:23 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 200.200.200.19 (00:01:d7:b4:60:43) on vlan 423
Jan 30 07:23:23 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 2.2.2.19 (00:01:d7:b4:60:45) on vlan 3421
Jan 30 07:23:23 local/tmm info tmm[5113]: 01190004:6: Per-invocation log rate exceeded; throttling.
- hoolio
Cirrostratus
I think you'd want to disable the switch ports that the second Viprion is connected to, change the first Viprion's config to a single unit and then reconfigure the second Viprion for its new role. Make sure to disable the failover VLAN last so the second Viprion doesn't go active while it can ARP for the shared IP addresses (VIPs, SNATs, etc). Doing it in this order should avoid any effect on the production traffic. Though to be safe, it would be best to do this work during a maintenance window.
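That sequence could be sketched roughly as follows (a hypothetical outline only, not a tested procedure; the bigpipe syntax matches the v10-era commands shown later in this thread, the db key comes from the output there, and the GUI path may differ on your version):

```shell
# 1. On the upstream switches: shut down the ports facing the second
#    (standby) Viprion - switch-side commands, nothing to run on the BIG-IP.

# 2. On the remaining active Viprion: change to a standalone unit via
#    System > Platform > High Availability > Single Device in the GUI,
#    then verify the redundancy flag flipped:
b db | grep -i redundant
#    should now show: Failover.IsRedundant = false

# 3. Confirm the unit logged "Active (not redundant)":
tail -20 /var/log/ltm

# 4. Only now disable/uncable the failover VLAN toward the old peer, so the
#    second unit never had a chance to go active and ARP for the shared IPs.
```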
Aaron
- TomSu_93471
Nimbostratus
@nitass, many thanks for continued support on this topic.
Could you please describe what you did exactly?
From the piece of log you pasted I can see some conflicts happened?
- TomSu_93471
Nimbostratus
Thanks Aaron.
I can power off the second Viprion chassis completely. I will have to do it anyway, since it will be moved to a different location before being set up in its new role. So I guess things like a cluster split-brain, where both clusters claim to be the active one, shouldn't be a problem for me. I'm just wondering how to change the first F5, which will remain the active one in live service the whole time, to Single Device and get rid of all the cluster-to-cluster redundancy config elements and nothing more, so that all its services stay intact.
Cheers,
Tom
- nitass
Employee
"Could you please describe what you did exactly? From the piece of log you pasted I can see some conflicts happened?"
i did change the high availability setting to single device. the address conflict happened because the peer unit was still there and went active.
- TomSu_93471
Nimbostratus
@nitass
Thanks for confirmation.
1) Regarding the sod process, I was able to find this on AskF5:
" The sod process is the high availability management daemon for the BIG-IP system. The heartbeat timeout for the sod process is 60 seconds, and the default failover action for redundant systems is restart all. Therefore, if the sod process does not increment its heartbeat in 60 seconds, the BIG-IP system restarts all system services.
When the heartbeat timeout for the sod process expires, the system logs the following message to the /var/log/ltm file:
overdog[1628]: 01140029:5: HA daemon_heartbeat sod fails action is restart all.
"
so as far as I understood it, in your case on the active cluster sod was restarted due to your config change to Single Device, which makes sense: since sod is the HA daemon, it needs to be restarted after such a config change. And on the redundant node the heartbeat failed, so it restarted all its services and they came up in the active state?
2) When you changed the setting to Single Device, was all the related HA config deleted automatically? I mean things like:
- gateway failsafe / VLAN failsafe
- config sync mechanism
- network mirroring
- mirror persistence
etc.? I'd like to know if there are any leftovers in the config which must be manually removed so they don't cause issues later.
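(For anyone checking for such leftovers on the surviving unit, something along these lines might work; this is an untested sketch using the same v10-era bigpipe syntax as elsewhere in this thread, and the grep patterns are just guesses at the relevant db keys:)

```shell
# any failover/peer configuration still present?
b failover list

# state mirroring addresses still pointing at the old peer?
b db | grep -i StateMirror

# redundancy flag after the change to Single Device
b db | grep -i redundant

# floating self IPs, the config sync peer, and VLAN/gateway failsafe would
# still need a manual review - they are separate objects and may not be
# touched by the platform setting
```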
Thanks, Tom
- nitass
Employee
i do not think sod was restarted when changing the ha setting to single device.
i did it again today and this is the log. also, i am able to see the ha configuration is still there.

[root@B1600-R66-S17:Active] config b failover list
failover {
   network failover enable
   peer mgmt addr 172.28.66.18
   redundant enable
   standby link down time 0
   unicast peer
   peer {
      dest addr 172.28.66.18
      port 1026
      source addr 172.28.66.17
   }
}
[root@B1600-R66-S17:Active] config b db|grep -i mirror|grep -i addr
StateMirror.Ipaddr = 200.200.200.17
StateMirror.PeerIpaddr = 200.200.200.18
StateMirror.Secondary.Ipaddr = ::
StateMirror.Secondary.PeerIpaddr = ::
[root@B1600-R66-S17:Active] config bigstart status sod
sod     run (pid 3422) 22 hours
[root@B1600-R66-S17:Active] config cat /var/log/ltm
Jan 30 23:45:45 local/B1600-R66-S17 notice sod[3422]: 010c003e:5: Offline
Jan 30 23:45:45 local/B1600-R66-S17 notice mcpd[3453]: 01070413:5: Updated existing subscriber tmrouted with new filter class 800c000.
Jan 30 23:45:45 local/B1600-R66-S17 notice sod[3422]: 010c0051:5: Active (not redundant)
Jan 30 23:45:45 local/B1600-R66-S17 notice mcpd[3453]: 01070413:5: Updated existing subscriber tmrouted with new filter class 4800d210.
Jan 30 23:45:45 local/B1600-R66-S17 notice mcpd[3453]: 01070410:5: Removed subscription with subscriber id bgpd
Jan 30 23:45:48 local/tmm info tmm[5113]: 01190004:6: Resuming log processing at this invocation; held 6 messages.
Jan 30 23:45:48 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 1.1.1.19 (00:01:d7:b4:60:44) on vlan 3422
Jan 30 23:45:48 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 200.200.200.19 (00:01:d7:b4:60:43) on vlan 423
Jan 30 23:45:49 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 2.2.2.19 (00:01:d7:b4:60:45) on vlan 3421
Jan 30 23:45:49 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 1.1.1.19 (00:01:d7:b4:60:44) on vlan 3422
Jan 30 23:45:49 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 200.200.200.19 (00:01:d7:b4:60:43) on vlan 423
Jan 30 23:45:50 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 2.2.2.19 (00:01:d7:b4:60:45) on vlan 3421
Jan 30 23:45:50 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 1.1.1.19 (00:01:d7:b4:60:44) on vlan 3422
Jan 30 23:45:50 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 200.200.200.19 (00:01:d7:b4:60:43) on vlan 423
Jan 30 23:45:51 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 2.2.2.19 (00:01:d7:b4:60:45) on vlan 3421
Jan 30 23:45:51 local/tmm warning tmm[5113]: 01190004:4: address conflict detected for 1.1.1.19 (00:01:d7:b4:60:44) on vlan 3422
Jan 30 23:45:51 local/tmm info tmm[5113]: 01190004:6: Per-invocation log rate exceeded; throttling.
[root@B1600-R66-S17:Active] config bigstart status sod
sod     run (pid 3422) 22 hours
[root@B1600-R66-S17:Active] config b failover list
failover {
   network failover enable
   peer mgmt addr 172.28.66.18
   redundant disable
   standby link down time 0
   unicast peer
   peer {
      dest addr 172.28.66.18
      port 1026
      source addr 172.28.66.17
   }
}
[root@B1600-R66-S17:Active] config b db|grep -i mirror|grep -i addr
StateMirror.Ipaddr = 200.200.200.17
StateMirror.PeerIpaddr = 200.200.200.18
StateMirror.Secondary.Ipaddr = ::
StateMirror.Secondary.PeerIpaddr = ::
[root@B1600-R66-S17:Active] config b db|grep -i redundant
Failover.IsRedundant = false
- TomSu_93471
Nimbostratus
ok, so it looks like the config is preserved and just the individual settings/flags in the config are changed..
When you change the setting back to Redundant Pair, do you get all the settings back in place? and does cluster redundancy work again without any other changes necessary, just as it was before the change to Single Device?
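A minimal before/after check for that, assuming the same bigpipe commands nitass used above, might look like this (untested sketch; the "true" value after switching back is my assumption, the "false" value matches the output earlier in the thread):

```shell
# while standalone: redundancy flag should read false, per nitass's test
b db | grep -i redundant

# change System > Platform back to Redundant Pair in the GUI, then re-check:
b db | grep -i redundant      # presumably flips back to true
b failover list               # peer mgmt addr / unicast peer should still be there
b db | grep -i StateMirror    # mirroring addresses appeared to be preserved
```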
