
LTM Connection to Dual Switches

jfrizzell_43066
Nimbostratus
Hello Everyone,

I am hoping that someone can help me understand which connection type is best for F5. We currently have two F5s in an active/failover cluster. In our environment, we are moving away from access ports with a single HTTP/HTTPS VIP to multiple VLANs. As part of this setup, I have done the following:

- Created 4 VLANs
- Created Self-IPs on each unit, plus one Floating IP

The current network setup is displayed in the attached Diagram-1, which has LTM-01 and LTM-02 split between multiple switches. Here is what I have done to test the new VLAN setup. On both switches, I have set the ports connecting to 1.4 on both LTMs to down. I created trunk ports on both switches connecting to ports 1.3. I was successful in reaching the self-IPs and the HTTP/HTTPS VIPs.

Is it preferable to leave the LTM ports connected as in Diagram-1 and change the access ports to trunk ports? Doing so would leave me with 4 trunk ports.

OR

Should I re-cable according to Diagram-2 and configure the switch with port channels?

I am just looking for the best performance and redundancy. Any feedback would be greatly appreciated.

Thanks,

Jeremy

25 REPLIES

Techgeeeg
Nimbostratus
Hi Jeremy,

(I am assuming that you have configured failover via serial as well as the network.)

I would not recommend the connectivity shown in Diagram-2; you should keep it distributed between switches 1 and 2 as shown in Diagram-1. Suppose the switch that the active unit is connected to fails. In that situation, your standby unit will not switch to active mode, because it will continue getting the signal over the serial cable. Also, the failure of a switch should not by itself cause the units to fail over. So in this case all of your traffic processing will be stuck unless you intervene manually. You should strictly follow what you have shown in Diagram-1.

Regards,

mikand_61525
Nimbostratus
Can't you change the conditions under which HA fails over, then?

For example, if F5_1 can reach 0/2 servers (server1 on switch1 connected to F5_1, server2 on switch2 connected to F5_2, and switch1 connected to switch2) while F5_2 can reach 2/2, then it should fail over (even if the serial link is still functional, given that serial is being used)?

Nathan_Houck_65
Nimbostratus
I believe Diagram-2 looks like it should suffice, but make sure you use VLAN failsafe so the BIG-IPs will fail over if the switch one of them is connected to goes down.

hoolio
Cirrostratus
I'd suggest HA groups over VLAN failsafe, as the former provides faster failover and more intelligence to prevent failover loops if both units experience the same failsafe event.

Manual Chapter: Understanding Fast Failover

http://support.f5.com/kb/en-us/products/big-ip_ltm/manuals/product/tmos-redundant-systems-config-11-1-0/8.html?sr=18822521

Aaron

Techgeeeg
Nimbostratus
Mikand, I really want to understand the scenario you are describing here. Can you explain it in more detail? I didn't fully understand your reply.

Nathan, are you really sure the devices will fail over if a switch goes down? I believe both units may end up in the Active state.

mikand_61525
Nimbostratus
Techgeeeg: I think hoolio just explained that.

Instead of relying on whether the serial cable is functioning, you can use various monitors to decide when the F5 unit should fail over.

If, for example, you have this setup:

F5_1 -> switch1 -> server1

F5_2 -> switch2 -> server2

with cables between switch1 and switch2, you could end up in a situation where F5_1 is active but switch1 is broken. In this situation server1 can reach neither F5_1 nor F5_2, while server2 can reach F5_2, but to no avail, since this unit is still passive (and just ignores the packets).

If you now use monitors to trigger HA instead of the status of the serial cable, you would see that F5_1 can reach neither server1 nor server2, BUT F5_2 can at least reach server2. So in this particular case I would prefer that F5_2 becomes active (and sends an SNMP trap saying that something bad happened, since all redundancy is lost for the moment).
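
The monitor-based decision described above can be sketched roughly in Python. This is only an illustration of the reasoning, not BIG-IP configuration or any F5 API: the `LINKS` topology, the function names, and the "standby reaches more servers than active" rule are all assumptions made for the example.

```python
from collections import deque

# Hypothetical topology taken from the example above.
LINKS = [
    ("F5_1", "switch1"),
    ("F5_2", "switch2"),
    ("switch1", "switch2"),
    ("switch1", "server1"),
    ("switch2", "server2"),
]
SERVERS = ("server1", "server2")


def reachable_servers(start, failed=frozenset()):
    """Return the set of servers that `start` can reach, skipping failed nodes."""
    if start in failed:
        return set()
    # Build an adjacency map from the links whose endpoints are both alive.
    adjacency = {}
    for a, b in LINKS:
        if a in failed or b in failed:
            continue
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    # Plain breadth-first search from the starting unit.
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen & set(SERVERS)


def should_fail_over(active, standby, failed=frozenset()):
    """Fail over only when the standby unit can reach strictly more servers."""
    return len(reachable_servers(standby, failed)) > len(reachable_servers(active, failed))


if __name__ == "__main__":
    # switch1 is broken while F5_1 is active: F5_2 should take over.
    print(should_fail_over("F5_1", "F5_2", failed={"switch1"}))
```

With this rule, a dead server alone does not trigger a failover as long as the active unit can still reach the remaining server through the inter-switch link.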

Techgeeeg
Nimbostratus
Thanks Mikand, I get it now. But this setup seems like a workaround, and I feel you have to set up a lot of unnecessary things in the form of monitors to get a setup that works in all situations. Also, if there is a problem with server2 while server1 is perfectly healthy, then the failure of switch1 along with server2 will leave the setup with nowhere to go. So what do you think now: is Diagram-1 the better design, or should Diagram-2 be followed?

jfrizzell_43066
Nimbostratus
Thanks for all the feedback so far on this topic. To give you an idea of our failover, we use serial and LAN. I can remove the serial and use the LAN if that seems best.

mikand_61525
Nimbostratus
Techgeeeg: Unfortunately I can't open your drawings here. Is it possible for you to publish them on bayimg.com or similar (and post links)?

If F5_2 is currently active and server2 dies, this is a non-issue, since F5_2 will still be able to reach server1; the flow will then be F5_2 -> switch2 -> switch1 -> server1. Sure, non-optimal, but still functional. Not until switch2 fails is there a need (in this case) to perform a failover so that F5_1 becomes active.

Setting up failover can, depending on the surrounding design, lead you into one trap or another. CARP (as an example) often doesn't fail over at all if none of the machines in the failover cluster has 100% reachability to the hosts the boxes monitor. This is why you should think twice when you set up failover about which monitors to use and in which scenarios to fail over. Even though F5 has state sync, there is always a possibility of lost packets in the network when failover occurs, so a common rule of thumb is not to fail over unless you really need to.

Techgeeeg
Nimbostratus
Well, mikand, I would prefer that you have a look at both of the diagrams. Then I believe your reply will be more accurate, and I would love to understand your point about Diagram-2 being a better option than Diagram-1. The query basically came from mikand, and I am referring to the diagrams attached here, nothing else. I believe you can open the two diagrams.

jfrizzell_43066
Nimbostratus
So I posted the diagrams on bayimg and the links are below:

http://bayimg.com/KAMpmaada (Diagram-1)

http://bayimg.com/KamPnaADA (Diagram-2)

Techgeeeg - I set up the switching and LTMs as shown in Diagram-1, but had an issue. As I described in the original post, the two switch ports connecting to LTM-01 & LTM-02 1.1, configured as trunks, worked great. To finish the configuration, I configured the remaining switch ports that connect to LTM-01 & LTM-02 1.2 as trunk ports and enabled them. The two switches correctly went through spanning tree and placed ports as active/blocked. After enabling all four ports, I made a connection to the website and the load time was 6 seconds. I disabled the two ports on the switch that lead to LTM-01 & LTM-02 1.2 and the load time was less than 1 second. I tried disabling the two ports and adding them back a number of times, but the result was exactly the same: a load delay whenever they were enabled. Any ideas on why this might be occurring?

I spoke with F5 support and the engineer told me to go with Diagram-2 and enable failsafe. Basically, the failsafe method checks that the VLAN is continuously passing traffic and, if it isn't, the F5 will fail over. Additionally, he said the best method for failover detection was the serial cable.
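
As a rough illustration of the failsafe idea as support described it (fail over once a VLAN has been silent for too long), here is a minimal Python sketch. The class name, the method names, and the 30-second timeout are assumptions for the example; this is not how BIG-IP implements the feature internally.

```python
from dataclasses import dataclass


@dataclass
class VlanFailsafe:
    """Toy model of a VLAN failsafe check (names and values are illustrative)."""

    timeout_s: float             # how long the VLAN may stay silent
    last_traffic_s: float = 0.0  # timestamp of the last traffic seen

    def saw_traffic(self, now_s: float) -> None:
        # Any traffic observed on the VLAN resets the silence timer.
        self.last_traffic_s = now_s

    def should_fail_over(self, now_s: float) -> bool:
        # Fail over once the VLAN has been silent for the full timeout.
        return (now_s - self.last_traffic_s) >= self.timeout_s


if __name__ == "__main__":
    failsafe = VlanFailsafe(timeout_s=30.0)
    failsafe.saw_traffic(now_s=100.0)
    print(failsafe.should_fail_over(now_s=120.0))  # still within the timeout
    print(failsafe.should_fail_over(now_s=130.0))  # silent for the full timeout
```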

 

 

At this point, I am struggling with which option I should take as I see valid arguments for both sides.

mikand_61525
Nimbostratus
jfrizzell: Thanks

Techgeeeg: No need to be upset when you ask for advice. I have seen others in this forum successfully upload pictures in such a way that one doesn't need to manually download them first in order to see them.

 

 

The good thing about Diagram-2 is that you will use LACP, which raises total throughput. But verify how the LACP hashing is performed and choose srcip+srcport+dstip+dstport to fully utilize all the cables in the LACP group. If you use the default, which is just srcmac+dstmac, then only one cable will be used between the F5 and each server (for all sessions).
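
To show why the hash input matters, here is a small Python sketch. It is an assumption-laden illustration: real switches use their own (usually XOR-based) hash functions, and the SHA-256 stand-in, the addresses, and the function names below are invented for the example. The point is only that a MAC-only key is identical for every session between the same two devices, while a 4-tuple key changes with the source port.

```python
import hashlib


def pick_link(key: str, n_links: int) -> int:
    """Map a hash key onto one of n_links member links (stand-in hash)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_links


def mac_key(flow: dict) -> str:
    # srcmac+dstmac: the same for every session between two devices.
    return f"{flow['src_mac']}|{flow['dst_mac']}"


def four_tuple_key(flow: dict) -> str:
    # srcip+srcport+dstip+dstport: varies per session.
    return f"{flow['src_ip']}:{flow['src_port']}|{flow['dst_ip']}:{flow['dst_port']}"


# 100 hypothetical sessions from one F5 to one server, differing only in source port.
flows = [
    {
        "src_mac": "00:01:d7:00:00:01",
        "dst_mac": "00:50:56:00:00:02",
        "src_ip": "10.0.0.1",
        "src_port": 40000 + i,
        "dst_ip": "10.0.0.2",
        "dst_port": 80,
    }
    for i in range(100)
]

links_used_mac = {pick_link(mac_key(f), 2) for f in flows}
links_used_l4 = {pick_link(four_tuple_key(f), 2) for f in flows}

if __name__ == "__main__":
    print("links used with MAC hashing:", len(links_used_mac))
    print("links used with 4-tuple hashing:", len(links_used_l4))
```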

 

 

On the other hand, you need to fail over if switch1 dies while F5_1 is the active unit (and a switch can die not only at the connectivity level but also from misconfiguration and other causes).

 

 

You seem to have 4 interfaces on your LTMs. Is it possible for you to use all 4?

So the setup would be:

LTM01: int0 + int1 -> LACP (towards switch1)
LTM01: int2 + int3 -> LACP (towards switch2)

LTM02: int0 + int1 -> LACP (towards switch1)
LTM02: int2 + int3 -> LACP (towards switch2)

SW01: int47 + int48 -> LACP (towards switch2)
SW02: int47 + int48 -> LACP (towards switch1)

Hamish
Cirrocumulus
You'd also have to check whether LACP is supported across the two separate switches that you're using. Most likely not, though. Very few do.

However, if you're using something like Cisco 3750 series switches (e.g. 2x 3750Es), you can stack them to make a single LOGICAL switch, AND you can perform LACP across both of them. (LACP balancing modes on the 3750 are a bit limited, though. Most of the 'cheaper' ones default to MAC headers rather than using IP src/dest/ports, and full IP hashing may not be available. I can't remember the complete list the 3750 supports.)

However, assuming you have the extra ports on the BIG-IP units, you could do Diagram-1 WITH LACP to each port: the best of both worlds, relying on spanning tree for link availability. (However, that does mean you need a reconvergence when you lose a link, which may be more disruption than you think is worth it.)

I'll reiterate what Nathan and Aaron said above: use VLAN failsafe to ensure the LTM units fail over if you lose VLAN connectivity for any reason.

[Note: You don't specify whether the switches are Layer-2 only, or whether you have SVIs on them. You may want to think about GLBP vs HSRP, for instance. Or even whether spending extra on something like a 6506 would be better economy than two separate switches in the data centre, or even 2x 6506s 🙂]

H

jfrizzell_43066
Nimbostratus
Looking at the F5 support page, I just read that vPC support starts in BIG-IP version 10.1.0, and we are currently running an older version. This is the reason why the vPC would not work.

Techgeeeg
Nimbostratus
Jfrizzell, yes, you have to create a trunk and place both of the ports in the trunk group you create. Also, as you said above, you checked things with F5 Support; the setup in the first diagram had a delay, and it started working fine with the second setup. To me this is a workaround: support is telling you to go with the second diagram only because it works, and the reason the Diagram-1 setup did not work was never found.

jfrizzell_43066
Nimbostratus
Techgeeeg:

I knew there was a disconnect somewhere. I am going to remove the tagged ports and then create trunks for each port and try again. I will keep you posted on the results. Keeping with Diagram-1, I will probably add what Hamish recommended with the additional ports and LACP when I can free up those additional ports.

Hamish
Cirrocumulus
Ah... vPCs to BIG-IPs work fine. They don't even know that there's a vPC involved, because the vPC is at the Nexus end. The BIG-IP end is just a normal BIG-IP LACP trunk.

FWIW I have vPCs to lots of devices (including BIG-IP LTMs and Checkpoint VSXs) from Nexus 7010s on both 10Gb and 1Gb links. They work fine. A little strange when debugging, because they show a few funny bits about MAC addresses and the ports they see it down, but that's at the Nexus end.

Also be careful of the terminology between Cisco and F5 equipment. A Cisco trunk is F5 VLAN tagging. A Cisco port-channel (or even EtherChannel, depending on IOS/NX-OS version) is an F5 trunk. So if you just say 'trunk' it's a bit ambiguous (and very prone to confusion; try getting some Cisco and F5 guys together in a room to talk about link aggregation and watch the confusion start 🙂).

H

mikand_61525
Nimbostratus
Hamish: I think most manageable switches do support LACP these days (which is the IEEE standard for bundling interfaces).

The only ones that usually don't are the non-manageable switches (but there are also non-manageable switches that have "LACP passive" (or "LACP active", for that matter) set, so they can bundle interfaces without any need to manually configure the device).

 

mikand_61525
Nimbostratus
jfrizzell: Your config from switch-01, is that a copper port (RJ45)?

I ask because "speed 1000" can be problematic: the IEEE standard says you need to use auto/auto for speed/duplex at gigabit and higher speeds. I have seen this confusion happen between a Cisco switch and an HP switch, and it was fixed by setting auto/auto on both ends (even if it felt wrong given how badly autoneg worked back in the 10/100 days for some equipment :P).

Hamish
Cirrocumulus
The 'problematic' configurations when specifying speed and duplex on a Cisco and auto on the connecting interface are due to the fact that Cisco reads the specs slightly differently from others. When you specify duplex on a Cisco switch port, the switch no longer advertises the duplex to the connected port. Speed is easy (that's voltage). But duplex needs advertising. However, Cisco reads the spec as saying that if duplex is hard-set, then you don't advertise any more.

The sad part is that if you hard-set the Cisco switch port to full duplex and have auto/auto on the connected port, then speed is detected by voltage, but because there's no advertising, the connected port chooses half duplex (because a half-duplex hub doesn't advertise).

A connection that's full duplex at one end and half at the other then generates unexpected collisions. It'll work fine at low speeds, but if you try to push too much data through it, it will just crawl. Also, some versions of Cisco software (CatOS especially, but you can configure IOS to do the same) will disable the switch port if it's getting errors (and collisions on a full-duplex port are an error).

Most other systems still advertise when you hard-set duplex, e.g. Nokia, AIX, Solaris. It's safe to say that unless you set a Cisco to auto duplex, you'll probably get problems (unless you're willing to put up with the pain of hard-setting ALL your devices. Not sure why you would, but I have seen it done).
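
The behaviour described above can be written down as a tiny Python model. It only encodes the rule from this post (an auto/auto port that hears no duplex advertisement falls back to half duplex); the function names and the boolean flag are assumptions for the example, not any vendor API.

```python
def peer_duplex(hard_set_duplex: str, advertises_when_hard_set: bool) -> str:
    """Duplex chosen by an auto/auto port facing a port with hard-set duplex."""
    if hard_set_duplex not in ("full", "half"):
        raise ValueError("hard_set_duplex must be 'full' or 'half'")
    if advertises_when_hard_set:
        # The auto side hears the advertisement and matches it.
        return hard_set_duplex
    # No advertisement heard: the auto side assumes a half-duplex hub.
    return "half"


def has_duplex_mismatch(hard_set_duplex: str, advertises_when_hard_set: bool) -> bool:
    """True when the two ends of the link end up with different duplex."""
    return peer_duplex(hard_set_duplex, advertises_when_hard_set) != hard_set_duplex


if __name__ == "__main__":
    # Hard-set full duplex without advertising, as described for the Cisco port.
    print(peer_duplex("full", advertises_when_hard_set=False))
    print(has_duplex_mismatch("full", advertises_when_hard_set=False))
```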

 

 

H

 

Techgeeeg
Nimbostratus
Hamish, I really liked your clarification of trunk vs. VLAN tagging and of how ports advertise duplex on F5 and Cisco. Do you have any particular document that explains this? Can you provide a link?

Mikand, I really liked your info on auto/auto and link aggregation. Do you have any document stating this, I mean any further systematic details?

mikand_61525
Nimbostratus
Techgeeeg: You mean regarding the LACP active and LACP passive modes?

When LACP active is set, the unit will send LACP packets every now and then (at least at the moment a link comes up) to inform the other side that this unit wants to do LACP (instead of letting STP (spanning tree) disable the "looping" interface, if you have STP enabled).

The other unit must be in either LACP active or LACP passive mode in order for this LACP trunk to form.

So, apart from setting up the bundling manually (which I would recommend, because then you know where you expect to have a bundle, at least by manually setting dedicated interfaces to LACP active mode), the auto feature works as follows:

unit1: LACP active
unit2: LACP passive
= LACP trunk will form

unit1: LACP active
unit2: LACP active
= LACP trunk will form

unit1: LACP passive
unit2: LACP passive
= no LACP trunk will form (loop occurs unless you have STP enabled)

unit1: LACP active
unit2: no LACP
= no LACP trunk will form (loop occurs unless you have STP enabled)

unit1: LACP passive
unit2: no LACP
= no LACP trunk will form (loop occurs unless you have STP enabled)
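
The combinations above boil down to one rule: both ends must run LACP, and at least one of them must be active. A minimal Python sketch of that rule follows; the `LacpMode` enum and the function name are made up for the illustration.

```python
from enum import Enum


class LacpMode(Enum):
    ACTIVE = "active"
    PASSIVE = "passive"
    OFF = "no LACP"


def lacp_trunk_forms(unit1: LacpMode, unit2: LacpMode) -> bool:
    """Both ends must speak LACP, and at least one must be active."""
    if LacpMode.OFF in (unit1, unit2):
        return False
    return LacpMode.ACTIVE in (unit1, unit2)


if __name__ == "__main__":
    # Print the full matrix of mode combinations.
    for a in LacpMode:
        for b in LacpMode:
            result = "trunk forms" if lacp_trunk_forms(a, b) else "no trunk"
            print(f"unit1: {a.value:8} unit2: {b.value:8} = {result}")
```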

jfrizzell_43066
Nimbostratus
Thank you Techgeeeg, Mikand, and Hamish for your time on this issue. Your feedback and guidance are truly appreciated. Just as a final note, I will explain the reason behind the speed 1000. On the Nexus 5548UP, when I installed the GBIC and issued a no shutdown on the port, it went into an invalid state. The only way to bring it out of invalid was to place speed 1000 into the configuration. Maybe something odd or a bug.

Thong_196816
Nimbostratus
Hi. Can you insert the diagram? I am unable to read your diagram link. Thank you.