F5 BIG-IP and ENTRUST nShield HSM SSL key/cert auto synchronization between HA peers with iCall
A long time ago, in a galaxy far away, I had to configure a connection to an nShield HSM on an active/standby pair for a client, and now I will share a useful tip.
Code version:
The code was tested on 15.1.8.
Main Article:
For more information about the RFS and the client agent, I suggest reading the vendor's documentation.
Useful F5 links for F5 and nShield Integration for GTM and LTM:
- https://my.f5.com/manage/s/article/K000135349
- https://techdocs.f5.com/en-us/bigip-15-1-0/big-ip-system-and-nshield-hsm-implementation/setting-up-t...
The nShield architecture includes a component called the Remote File System (RFS) that stores and manages the encrypted key files. The RFS can be installed on the BIG-IP system or on another server on your network.
In this setup, the HSM agent/client is installed on the Linux host system of the F5 devices, and the F5 devices are also the RFS servers.
Once the HSM agent and the RFS are installed on the BIG-IP, the RFS commands are available for use on it.
The issue I solved with the iCall script is that when you create a new HSM key for a BIG-IP HA pair, you must run the command ‘rfs-sync --update’ on all standby BIG-IP devices (the devices where the cert/key was not created or changed) to update the local Thales encrypted file object cache. Without this step, SSL traffic using this key will fail when the BIG-IP fails over to one of the unsynced standby devices.
When you create the key and cert on the active F5 device, "rfs-sync --commit" and "rfs-sync --update" run automatically on it, but "rfs-sync --update" does not run on the standby devices. The iCall script closes that gap: it is triggered on the standby devices when you run the normal config sync.
The iCall handler subscribes to an event called "HA_EVENT", which is defined in the custom alerts file (/config/user_alert.conf), and its script runs the command with the full path, "/opt/nfast/bin/rfs-sync --update", to pick up any update in the RFS.
I suggest reading the links below, which explain iCall (one is from JRahm) and the HA logs. The last one is mine, from the time before I learned proper article formatting 😅, and it also shows how to run scripts not only with iCall but also during HA events and so on.
- Run tcpdump on event | DevCentral
- What is iCall? | DevCentral
- https://my.f5.com/manage/s/article/K34291400
- https://my.f5.com/manage/s/article/K3727
- https://my.f5.com/manage/s/article/K11127
- Knowledge sharing: Ways to trigger and schedule scripts on the F5 BIG-IP devices. | DevCentral
tmsh list sys icall
sys icall handler triggered ha-handler {
    script ha-script
    subscriptions {
        ha-subscription {
            event-name HA_EVENT
        }
    }
}
sys icall script ha-script {
    app-service none
    definition {
        exec /bin/bash -c "logger -p local0.notice 'yes'"
        exec /bin/bash -c "/opt/nfast/bin/rfs-sync --update"
    }
    description none
    events none
}
cat /config/user_alert.conf
alert HA_EVENT "(.*)Sync of device group(.*)" {
    snmptrap OID=".1.3.6.1.4.1.3375.2.4.0.500"
}
- You can use tmsh::log "yes" to log to /var/log/ltm as shown in:
iCall script to validate Virtual Server and node with same IP addresses
Testing:
This can be tested even without an HSM (I don't have one at home) by using "logger -p" to inject the log message. I have added "yes" to the logs as a test marker 😎
logger -p local0.notice "010714a0:5: Sync of device group /Common/Failover"
cat /var/log/ltm
............
Jul 1 08:34:07 bigip1.com notice root[23762]: yes
Jul 1 08:34:53 bigip1.com notice root[23940]: 010714a0:5: Sync of device group /Common/Failover
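Before injecting a line with logger, the alert pattern can also be checked offline against a sample message. This is only a minimal sketch: it assumes the pattern from user_alert.conf behaves like a POSIX extended regular expression, and the helper name matches_ha_event is hypothetical, not part of any F5 tooling.

```shell
#!/bin/bash
# Pattern copied from the alert definition in /config/user_alert.conf.
HA_PATTERN='(.*)Sync of device group(.*)'

# Hypothetical helper: succeeds when a log line matches the HA_EVENT pattern
# (assumption: grep -E is a close enough stand-in for the alert matching).
matches_ha_event() {
    printf '%s\n' "$1" | grep -Eq "$HA_PATTERN"
}

sample='010714a0:5: Sync of device group /Common/Failover'
if matches_ha_event "$sample"; then
    echo "match"
else
    echo "no match"
fi
```

If the sample prints "no match", the injected logger line would most likely not fire the alert either, so it is worth fixing the pattern first.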
Using Linux bash script:
In some versions you can trigger a bash or sh script with more advanced logic from the iCall script, for example /bin/sh -c "/var/tmp/ha_script". On 17.1.x, however, I saw issues when triggering a bash script from iCall, issues that in previous versions I solved by adding "HOME=/root <linux bash command>".
The "What is iCall?" article shows that you can use "if/else" or "for" loops inside the iCall script, but I find it easier to use bash for advanced logic. The good thing is that, as with everything on F5, there is usually more than one way to do things, and in this case you can actually trigger a bash script from the log messages directly in user_alert.conf!
- What is iCall? | DevCentral
- iCall script triggers error need ${HOME} to run | DevCentral
- Running a command or custom script based on a syslog message
cat /var/tmp/ha_script
#!/bin/bash
logger -p local0.notice "yes"
/opt/nfast/bin/rfs-sync --update
cat /config/user_alert.conf
alert HA_EVENT "(.*)Sync of device group(.*)" {
    exec command="/var/tmp/ha_script"
}
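As an example of the "advanced logic" that is easier in bash, /var/tmp/ha_script could check that the rfs-sync binary exists and log the result of the update. This is a hedged sketch rather than the script I ran in production: the RFS_SYNC override variable, the function names and the log texts are my own assumptions for illustration.

```shell
#!/bin/bash
# Variant of /var/tmp/ha_script with basic checks.
# RFS_SYNC is a hypothetical override for dry runs; the default is the
# full path used elsewhere in this article.
RFS_SYNC="${RFS_SYNC:-/opt/nfast/bin/rfs-sync}"

# Log through syslog like the original script; fall back to stderr if
# logger is missing or fails.
log_msg() {
    logger -p local0.notice "ha_script: $1" 2>/dev/null || echo "ha_script: $1" >&2
}

# Run the RFS update and log whether it worked.
run_rfs_update() {
    if [ ! -x "$RFS_SYNC" ]; then
        log_msg "rfs-sync not found at $RFS_SYNC, skipping"
        return 1
    fi
    if "$RFS_SYNC" --update; then
        log_msg "rfs-sync --update succeeded"
    else
        log_msg "rfs-sync --update failed"
        return 1
    fi
}

# The failure is already logged, so it is not propagated any further.
run_rfs_update || true
```

Searching /var/log/ltm for "ha_script:" after a config sync then shows whether the update actually ran on the standby device.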
Extra Notes:
- Whether the iCall script needs the full path /opt/nfast/bin/rfs-sync --update or just rfs-sync --update depends in some cases on the BIG-IP version.
- In the release notes I saw a new bug, https://cdn.f5.com/product/bugtracker/ID1429897.html, which is fixed in the latest 17.1.x versions: if the RFS is on the BIG-IP, 'rfs-sync -c' also needs to be run on the F5 device that created the key/cert after they are created. The 'rfs-sync -c' step can be automated in the same way I have shown. My iCall script also works for BIG-IP devices that use an external RFS: after the key/cert is created and committed from an F5 device (usually the active one), start an HA config sync, which then triggers 'rfs-sync --update' on the other F5 devices.
- Another nice way, if you are using something like Ansible, is to trigger the RFS update command on all F5 devices in a cluster through the API, as F5 supports bash commands over the API and not only from the CLI.
Example:
curl -sku admin:XXX https://XXXX/mgmt/tm/util/bash -H "Content-Type: application/json" -X POST -d '{"command":"run", "utilCmdArgs":"-c \"/opt/nfast/bin/rfs-sync --update\""}'
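To cover every device in the cluster, the same call can be wrapped in a loop. This is only a sketch under assumptions: the host names, the BIGIP_HOSTS and BIGIP_CREDS variables and the function names are hypothetical placeholders, and the request body is the one from the single-device example above.

```shell
#!/bin/bash
# Hypothetical device list and credentials; replace with real values.
BIGIP_HOSTS="${BIGIP_HOSTS:-bigip1.example.com bigip2.example.com}"
BIGIP_CREDS="${BIGIP_CREDS:-admin:XXX}"

# Emit the same JSON body as the single-device curl example.
rfs_update_payload() {
    printf '%s' '{"command":"run", "utilCmdArgs":"-c \"/opt/nfast/bin/rfs-sync --update\""}'
}

# POST the payload to each host in turn; report failures but keep going.
rfs_update_all() {
    local host rc=0
    for host in $BIGIP_HOSTS; do
        # --fail makes curl return non-zero on HTTP errors such as 401.
        if curl -sk --fail --max-time 30 -u "$BIGIP_CREDS" \
            "https://$host/mgmt/tm/util/bash" \
            -H "Content-Type: application/json" -X POST \
            -d "$(rfs_update_payload)" >/dev/null; then
            echo "$host: request sent"
        else
            echo "$host: request failed" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

Calling rfs_update_all after the key/cert is committed on the active device sends the update request to every listed host. A non-zero return means at least one device still needs a manual check.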