BIG-IP 11.4 Behavior Change: Global Data Now Partitioned by Traffic Group
Prior to v11, active-standby and active-active pairs were your only options for failover configurations. Traffic groups were introduced in BIG-IP version 11 to let administrators group configuration objects that fail over together to another device in a device group, scaling beyond two devices so that any number of devices can handle the various traffic groups. Each traffic group can have a different device next in line for failover, so the backup copy of a given piece of data may need to be mirrored to a different place. That means that after a failover event, two pieces of data might no longer live on the same device, because the first one failed over to one box and the second failed over to another. Beginning in 11.4, the mirroring infrastructure was changed to handle traffic groups correctly.
As a result, all global state inside the TMM was split up by traffic group. This way, the BIG-IP always sees a consistent state for each traffic group, no matter how the box is configured (that is, a piece of data can never see another piece of data that won't necessarily follow it via mirroring). This includes both the persistence and session databases, and so naturally affects the session table as well. In short, the BIG-IP now acts as if there were a separate copy of the persistence and session databases for each traffic group; data in one traffic group cannot be accessed from any other. If, after an upgrade to 11.4, persistence stops working or the table command stops sharing data across virtuals, this change is a primary suspect.
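As a rough illustration of the "table command stops sharing data across virtuals" case, here is a minimal sketch of two iRule fragments that store and look up the same session-table entry from two different virtual servers. The key name shared_key, the stored value, and the log messages are illustrative only, not taken from the original scenario:
# iRule on the first virtual server: writes an entry to the session table
when CLIENT_ACCEPTED {
    # hypothetical key and value, with a one-hour timeout
    table set shared_key [IP::client_addr] 3600
}
# iRule on the second virtual server: tries to read the same entry
when CLIENT_ACCEPTED {
    set stored [table lookup shared_key]
    if { $stored ne "" } {
        log local0. "found shared entry: $stored"
    } else {
        log local0. "no entry found -- the two virtual addresses may be in different traffic groups"
    }
}
On 11.4 and later, the lookup in the second iRule only returns the value if both virtual servers' virtual addresses belong to the same traffic group.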
One scenario that was diagnosed had two virtual servers using different iRules to access the same data via the session table. It worked sometimes, but not always. Comparing the virtual server configurations did not reveal the issue, because the traffic group assignment is attached to the virtual address object, not the virtual server. Consider the following virtual addresses:
ltm virtual-address /Common/10.120.37.1 {
    address 10.120.37.1
    mask 255.255.255.255
    traffic-group /Common/traffic-group-local-only
}
ltm virtual-address /Common/10.120.37.2 {
    address 10.120.37.2
    mask 255.255.255.255
    traffic-group /Common/traffic-group-1
}
If you create two virtual servers that both use the first virtual address, or both use the second, all is well. If you create one with the first virtual address and the other with the second, neither one will see the other's data, because the data is associated with different traffic groups.
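If the two virtual servers are meant to share data, one way to check and, where it suits your failover design, align the assignments is through tmsh. This is a sketch only; moving a virtual address into a floating traffic group changes which device handles it after failover, so confirm the assignment fits your deployment first:
# show each virtual address and its traffic-group assignment
tmsh list ltm virtual-address
# move the first address into the same traffic group as the second
tmsh modify ltm virtual-address /Common/10.120.37.1 traffic-group /Common/traffic-group-1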
A quick way to diagnose this issue, at least for the session database, is to set the "tmm.sessiondb.match_ha_unit" DB variable to false and see if that fixes things. If it does, then you know that this partitioning of global state is one of the sources of your problem. There is currently no corresponding DB variable for persistence entries, and no way to see via tmsh which traffic group a particular persistence entry is associated with, though both issues are known and under consideration for future releases.
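For reference, checking and setting that DB variable from the command line looks roughly like this (diagnostic use only; the variable name comes from the text above, and the commands use standard tmsh sys db syntax):
# display the current value of the session-db matching variable
tmsh list sys db tmm.sessiondb.match_ha_unit
# turn off traffic-group matching for session-db lookups
tmsh modify sys db tmm.sessiondb.match_ha_unit value false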
Much thanks to Spark for this info!
- ryan_126547: Are you sure this started in v11.4? I believe it started in 11.1. I'm running the earlier version 11.2 and it has traffic-groups like you discussed.
- JRahm: Traffic groups existed before 11.4, but the behavior of separating the global data by traffic group is new in 11.4.
- Being aware of this behaviour is relevant with SNAT deployments.
- netsecurity_666: Hello, I actually have this problem on LTM version 12.0. I need to cross-reference persistence records across different traffic-groups. I tried setting the DB variable tmm.sessiondb.match_ha_unit to false, but without success. Do you know if this behaviour is to be "fixed" or has been "fixed" in current or upcoming releases? Thanks, Andrea