Populating Tables With CSV Data Via Sideband Connections
Introduction
Datagroups and tables are the two primary methods we have in iRules for organizing key and value pairs. Both can be reused for subsequent connections. Datagroups have the advantage of being directly editable from the BIG-IP user interface, however they cannot be modified from within an iRule. This would open a potential security hole by allowing BIG-IP filesystem access from an iRule. Tables on the other hand must be populated from within an iRule, which allows for tracking user state, session data, etc. Table cannot be modified from any user interface though. With the advent of sideband connections, we can now make a connection to a separate service retrieve structured data and populate a table.
Tables were introduced in BIG-IP version 10 as a means to store key and value pairs across connections and sessions. Because tables are stored in memory, they are ultra-low latency and can handle large volumes of transactions. There are many cases when we want to functionality of a datagroup without requiring administrative access to the BIG-IP. With the introduction of sideband connections in version 11, we now have the ability to retrieve data from other sources. We can then parse that data and populate the table.
For this example, we’ve chosen the CSV (comma-separated value) file format to store the tabular data. CSV files are one of the oldest file structures for formatting tabular data. The standard is loosely described in RFC 4180, but there have been several interpretations of CSV structure in the past. We will be using the most common format whereby there is no header line, each field is separated by a comma, and each line terminated with carriage return and line feed (CR+LF).
Session Continuity, Table Locking, and Graceful Failures
The simplest implementation of this functionality would be to periodically delete all table data, retrieve new data, parse, and insert. This method presents a number of challenges for any virtual that is handling connections at a faster rate than the data retrieval can complete and even then this would be a poor implementation. As a remedy, we conduct the process differently, by only deleting old records after all new records have been added. This insures that there are no interruptions in the user experience.
Locking is another consideration that must be taken for high volume implementations. In order to prevent table corruption, we will lock the table to prevent two instances from modifying the same table simultaneously. We handle the locking mechanism using an entry in another table with a timeout and lifetime equal to the table’s cache timeout.
Because we are modifying a potentially mission critical table, we want to handle any interruption in updates gracefully. Therefore, no existing data will be removed until we receive a valid set of new data. If a failure occurs, the next update attempt will not commence until the next scheduled refresh.
Caching
There is a small, but measurable amount of latency in triggering retrieval of new records. This latency was measured as less than one second for a sideband connection from an LTM virtual edition to a web server on the same LAN. In order to avoid each connection from performing this operation, we added a caching mechanism so updates are only initiated periodically. The length of time you’ll want to cache records is dependent on your application. Some content such as a sitemap or a robot.txt file could be updated once daily. Other examples such as redirects or ACLs may need to be refreshed more regularly. This option is tunable via a static variable in RULE_INIT.
Putting It All Together
When we condense all of those provisions into a single process, we wind up with something that looks similar to the diagram below. This process outlines the procedure for importing CSV data. Each additional request during the cache period will bypass the import process and retrieve data directly from the table.
Conclusion
The ability to populate tables with data via a sideband connection opens up many possibilities for handling application logic on the LTM. The biggest advantage is having an off-box method for adding data to iRules. We mentioned a few of the use cases such as redirects, sitemaps, and ACLs, but the list is really limitless. In addition to being able to manually edit the CSV files and import data to the LTM, there is now the capability to ingest dynamically generated data of any format. While the ability to perform redirects with an iRule may seem mundane, the underlying code opens up endless possibilities all of which translate to lower management effort. Until next time, happy coding!
Code and References
CSV Tabular Data Sideband Importer iRule - Documentation and code for the iRule used in this Tech Tip
RFC 4180 - Common Format and MIME Type for Comma-Separated Values (CSV) Files
- L4L7_53191NimbostratusGeorge: fantastic. These sideband connections are amazing functionality...
- George_Watkins_Historic F5 AccountMatt: Aren't they awesome? We've had a ton of ideas for sideband connection Tech Tips stuck in limbo. Now that we've got sideband fucntionality we are working our way down the "if we only had sideband connections" Tech Tip list.
- JRahmAdminStellar work, George!
- Colin_Walker_12Historic F5 AccountWicked stuff, nice man!
- ahmed_2340NimbostratusIs this version specific ? I get the error undefined procedure set static:
- whswhswhs124_98Nimbostratus'
Really great stuff!
Because of some errors [use curly braces to avoid double substitution] like the ones in https://support.f5.com/csp/article/K57410758 I have replaced line 29, 62 and 63.
Old code:
if { [expr [clock seconds]-$last_refresh] > $static::db_cache_timeout } {
---------------------
set key [string range $field 0 [expr $sep_offset - 1]]
set value [string range $field [expr $sep_offset + 1] end]
New code:
set time_now [clock seconds]
set test_var [expr {${time_now} - ${last_refresh}}]
if { ${test_var} > ${static::db_cache_timeout} } {
---------------
set key [string range $field 0 [expr {${sep_offset} - 1}]]
set value [string range $field [expr {${sep_offset} + 1}] end]
- Justin1Nimbostratus
Can someone tell me what the CSV file should look like. Seems it just needs to be <something, url>
In linux a cvs file looks like:
Data,STATUS
1.1.1.1,BAD
2.2.2.2,GOOD
,C
As you see it is just comma separation and you should never miss the comma as the last example " ,C " means that the " Data " column is empty and the value for the " STATUS " column is " C ".