BIG-IP : iControl : LocalLBDataGroupFile.set_local_path() : swap large file under load

Question

F5 BIG-IP v11.4.1 (Build 635.0) LTM on ESXi
I have a .NET C# app that uses iControl to perform the following sequence :
• transfer data-file to staging location on BIG-IP device
• recache data-group with contents of this data-file
The specific iControl API used for the recache is :
LocalLBDataGroupFile.set_local_path()
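
For readers who want to picture the two-step sequence, here is a minimal Python sketch (the app in the question is .NET, so this is only an illustration). It assumes the file transfer is chunked in the style of iControl's System.ConfigSync.upload_file, whose chain-type names (FILE_FIRST, FILE_MIDDLE, FILE_LAST, FILE_FIRST_AND_LAST) are used here from memory and should be checked against the iControl reference; the question does not say which transfer call was actually used. The `Transport` class and its `upload_chunk` method are hypothetical wrappers, not a real library API, and the 64 KB chunk size is an arbitrary choice.

```python
from typing import Iterator, List, Protocol, Tuple


def chain_chunks(data: bytes, chunk_size: int = 64 * 1024) -> Iterator[Tuple[str, bytes]]:
    """Split data into (chain_type, chunk) pairs for a chunked upload."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # An empty file still needs one (empty) chunk so the upload is opened and closed.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    last = len(chunks) - 1
    for i, chunk in enumerate(chunks):
        if last == 0:
            chain = "FILE_FIRST_AND_LAST"
        elif i == 0:
            chain = "FILE_FIRST"
        elif i == last:
            chain = "FILE_LAST"
        else:
            chain = "FILE_MIDDLE"
        yield chain, chunk


class Transport(Protocol):
    """Hypothetical wrapper around an iControl client (assumption, not a real API)."""

    def upload_chunk(self, staged_path: str, chain_type: str, chunk: bytes) -> None: ...

    def set_local_path(self, files: List[str], paths: List[str]) -> None: ...


def update_data_group_file(t: Transport, dg_file: str, staged_path: str, data: bytes) -> None:
    # Step 1: transfer the data-file to the staging location, chunk by chunk.
    for chain_type, chunk in chain_chunks(data):
        t.upload_chunk(staged_path, chain_type, chunk)
    # Step 2: recache the data-group-file from the staged file.
    t.set_local_path([dg_file], [staged_path])
```

Keeping the chunking logic separate from the transport makes it possible to exercise the sequence against a recording stub before pointing it at a real device.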
This operation has been 100% consistently successful in non-prod environments with very low traffic to perform data-group-file updates of up to 2M records.  
Non-prod config : single stand-alone device ( no HA pair ).
The operation has also been 100% consistently successful in prod environments with high traffic to update a non-live data-group-file up to 300K records.
Prod config : HA pair consisting of 2-node sync-failover device-group with auto-sync disabled.
NOTE: By "live data-group-file" I mean an enabled virtual-server has an assigned iRule that references the data-group ( performs matches against data-group maps ).  By "non-live data-group-file" I mean that the data-group exists but either is not referenced by any iRules, or iRules that reference it currently are not assigned to any enabled virtual-server.
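
The live / non-live distinction above can be written as a small predicate. This is only a sketch of the definition in Python; the `IRule` and `VirtualServer` classes are hypothetical stand-ins for configuration data, not iControl types.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class IRule:
    name: str
    data_groups: Set[str] = field(default_factory=set)  # data-groups the iRule references


@dataclass
class VirtualServer:
    name: str
    enabled: bool
    irules: List[IRule] = field(default_factory=list)  # iRules assigned to this virtual-server


def is_live(data_group: str, virtual_servers: List[VirtualServer]) -> bool:
    """True if an enabled virtual-server has an assigned iRule that references the data-group."""
    return any(
        vs.enabled and any(data_group in rule.data_groups for rule in vs.irules)
        for vs in virtual_servers
    )
```

A data-group referenced only by unassigned iRules, or only by iRules on disabled virtual-servers, comes out as non-live.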
Here is where the problem occurs : 
When the operation is run in prod environments with high traffic (40-60% baseline cpu utilization, 200-400 Mbps baseline throughput) to update a live data-group-file ( 100K+ records ) the iRule fails.
Exactly how the iRule fails is unknown and currently is under investigation by F5 Support, however here are some data-points :
• the file-transfer and data-group recache iControl calls return success to the C# caller
• requests that the iRule normally conditionally rewrites to various backend pools no longer arrive at those servers
• BIG-IP logs contain zero errors related to either the iControl operation or the iRule
• public client requests that should be processed by the iRule display generic Akamai 500 error-pages
NOTE: I have a test that removes Akamai from the equation, but have not yet had an opportunity to run it.
My understanding is that LocalLBDataGroupFile.set_local_path() was re-designed/coded for 11.4 and was lab-tested up to 1M records. However, I wonder if any testing was performed in an environment with significant load ? 
Through trial-and-error I discovered the following workaround (for an HA pair only) : 
• create a/b pair of data-groups and corresponding set of a/b iRules that are identical except that "a" iRule references "a" data-group, and "b" iRule references "b" data-group 
• on active node, initially configure virtual-server to use "a" iRule
• use the C# application to update the "b" data-group-file 
( NOTE: possibly this could also be accomplished via the admin browser, but above 100K records the time-lags and potential impact on prod operations become concerning. )
• if you now swap the "b" iRule in on the virtual-server ( effectively swapping in the "b" data-group ), the iRule begins to behave strangely ( requests are swallowed and never routed to a backend pool, although no errors appear in the LTM logs )
• however, the following "trick" seems to work :
  • sync active to standby
  • promote standby to active
  • on new active, swap-in "b" iRule to the virtual-server
  • reboot new standby
  • sync new active to new standby
Somehow the sync operation "cures" the issues induced by swapping the live iRule to point to a just-updated data-group.
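
To make the ordering of the a/b workaround explicit, here is a small Python sketch that produces the step list for whichever slot is currently live. It only encodes the sequence described above as text; the function names are hypothetical and nothing here calls a BIG-IP.

```python
from typing import List


def other_slot(slot: str) -> str:
    """Return the idle slot of the a/b pair."""
    if slot not in ("a", "b"):
        raise ValueError(f"slot must be 'a' or 'b', got {slot!r}")
    return "b" if slot == "a" else "a"


def plan_ab_swap(live_slot: str) -> List[str]:
    """Ordered steps to move traffic from the live slot to the freshly updated idle slot."""
    idle = other_slot(live_slot)
    return [
        f'update "{idle}" data-group-file on the active node',
        "sync active to standby",
        "promote standby to active",
        f'on new active, swap-in "{idle}" iRule to the virtual-server',
        "reboot new standby",
        "sync new active to new standby",
    ]


if __name__ == "__main__":
    for number, step in enumerate(plan_ab_swap("a"), start=1):
        print(f"{number}. {step}")
```

After a completed run the roles flip, so the next update would call `plan_ab_swap("b")`.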
So in summary, it seems that in a high-load environment, attempting to swap new contents into a live data-group somehow induces a failure case for iRule lookups against that data-group.
The failure symptoms are identical both for the technique of re-caching the live data-group with new contents ( iControl API LocalLBDataGroupFile.set_local_path() ), and for the iRule a/b swap technique.
However, an active-to-standby sync operation seems to "cure" whatever bad-state the data-group has been put into.
Can anyone provide insights as to why swapping-in new contents to a large data-group-file associated with an iRule assigned to a VIP under heavy load would cause iRule data-group lookup failures ?

samstep · Answer

I suggest you raise a support call with F5 if you experience any "site down" type failures. I have seen TMM crash, dump core and restart under heavy load with some iControl calls before, and the impact varies from version to version.
You need to examine the LTM logs from the time the VIP went down - there should be plenty of log messages indicating the reason for the service interruption.
