Forum Discussion
Monitoring RAID status?
We just upgraded our BigIP 6900 from 10.2.3 to 11.2.1. Whether it was caused by the upgrade touching a previously unused part of the disk or just by bad karma, one of the two hard drives went belly up:
sd 1:0:0:0: [sdb] Unhandled sense code
sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 1:0:0:0: [sdb] Sense Key : Medium Error [current] [descriptor]
...
cat /proc/mdstat
Personalities : [raid0] [raid1]
md13 : active raid1 dm-28[0]
3145664 blocks [2/1] [U_]
...
While this is certainly manageable with F5 support, the fact that we noticed it only by pure chance (while accessing the standby device via the web interface) concerns us. We can monitor a lot of things on the F5 via SNMP, including temperature, memory and disk space, but /proc/mdstat does not seem to be accessible through the SNMP agent.
On a vanilla Net-SNMP agent, one could add a custom OID and script, but the big "THIS IS AN AUTO-GENERATED FILE -- DO NOT EDIT!!!" warning makes that idea moot.
Has anyone solved this yet?
- nukleus_145085Nimbostratus
We are currently looking into deploying a daemon (i.e. caching) version of the check_mk agent on the devices.
Another option would be to use a cron job and SSH public-key authentication to push the contents of /proc/mdstat and the smartctl output to a remote server, but that would mean adding a ton of monitoring to make sure the cron job is still running, e.g. by inspecting the result and comparing modification timestamps.
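For whichever transport ends up carrying /proc/mdstat, the check itself could look roughly like the sketch below. It assumes the usual mdstat layout shown in the first post (an "mdN : ..." line followed by a line containing something like "[2/1] [U_]"); the function name degraded_arrays is made up for illustration.

```python
import re

# Matches the member summary on the "blocks" line, e.g. "[2/1] [U_]".
MEMBERS = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
# Matches the start of an array stanza, e.g. "md13 : active raid1 dm-28[0]".
ARRAY = re.compile(r"^(md\d+)\s*:")


def degraded_arrays(mdstat_text):
    """Return {array_name: member_map} for every array with a missing member.

    A "_" in the member map (e.g. "U_") marks a slot without a working disk.
    """
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        array = ARRAY.match(line)
        if array:
            current = array.group(1)
            continue
        members = MEMBERS.search(line)
        if members and current:
            if "_" in members.group(3):
                degraded[current] = members.group(3)
            current = None
    return degraded
```

Feeding it the contents of /proc/mdstat (read locally or pushed to the monitoring host) returns an empty dict when all arrays are complete, so a wrapper script only has to alert on a non-empty result.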
- bdraschk_114903Nimbostratus
I actually opened a case with F5 to get the RAID status added to the SNMP MIB, and they sent me information showing that it's already in there:
config snmpwalk -v 2c -c public 127.0.0.1 sysPhysicalDiskTable
F5-BIGIP-SYSTEM-MIB::sysPhysicalDiskSerialNumber."WD-WCAT1E407050" = STRING: WD-WCAT1E407050
F5-BIGIP-SYSTEM-MIB::sysPhysicalDiskSerialNumber."WD-WCAT1E408695" = STRING: WD-WCAT1E408695
F5-BIGIP-SYSTEM-MIB::sysPhysicalDiskSerialNumber."B92341DDYGKJ0908HV00" = STRING: B92341DDYGKJ0908HV00
F5-BIGIP-SYSTEM-MIB::sysPhysicalDiskSlotId."WD-WCAT1E407050" = INTEGER: 0
F5-BIGIP-SYSTEM-MIB::sysPhysicalDiskSlotId."WD-WCAT1E408695" = INTEGER: 0
F5-BIGIP-SYSTEM-MIB::sysPhysicalDiskSlotId."B92341DDYGKJ0908HV00" = INTEGER: 0
F5-BIGIP-SYSTEM-MIB::sysPhysicalDiskName."WD-WCAT1E407050" = STRING: HD1
F5-BIGIP-SYSTEM-MIB::sysPhysicalDiskName."WD-WCAT1E408695" = STRING:
F5-BIGIP-SYSTEM-MIB::sysPhysicalDiskName."B92341DDYGKJ0908HV00" = STRING: CF1
F5-BIGIP-SYSTEM-MIB::sysPhysicalDiskIsArrayMember."WD-WCAT1E407050" = INTEGER: true(1)
F5-BIGIP-SYSTEM-MIB::sysPhysicalDiskIsArrayMember."WD-WCAT1E408695" = INTEGER: true(1)
F5-BIGIP-SYSTEM-MIB::sysPhysicalDiskIsArrayMember."B92341DDYGKJ0908HV00" = INTEGER: false(0)
F5-BIGIP-SYSTEM-MIB::sysPhysicalDiskArrayStatus."WD-WCAT1E407050" = INTEGER: ok(1)
F5-BIGIP-SYSTEM-MIB::sysPhysicalDiskArrayStatus."WD-WCAT1E408695" = INTEGER: missing(3)
F5-BIGIP-SYSTEM-MIB::sysPhysicalDiskArrayStatus."B92341DDYGKJ0908HV00" = INTEGER: undefined(0)
The table is indexed by the disks' serial numbers, which is somewhat strange: any check that queries a specific serial number has to be adapted after a drive is replaced.
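One way around the serial-number indexing is to walk the whole table and evaluate every row, instead of querying fixed OIDs. The sketch below is only an illustration under that assumption: it parses text in the format of the snmpwalk output above, and the function name failed_array_members is made up. I only know the status values that appear in this output (ok, missing, undefined), so it simply treats anything other than ok on an array member as a problem.

```python
import re

# Picks the two relevant columns out of snmpwalk text such as:
#   ...sysPhysicalDiskArrayStatus."WD-WCAT1E408695" = INTEGER: missing(3)
ROW = re.compile(
    r'sysPhysicalDisk(IsArrayMember|ArrayStatus)\."([^"]+)"'
    r"\s*=\s*INTEGER:\s*(\w+)\(\d+\)"
)


def failed_array_members(snmpwalk_output):
    """Return {serial: status} for array members whose status is not "ok".

    Disks that are not array members (like the CF card above, which reports
    undefined) are ignored. No serial number is hard-coded, so the check
    keeps working after a drive swap.
    """
    is_member = {}
    status = {}
    for column, serial, value in ROW.findall(snmpwalk_output):
        if column == "IsArrayMember":
            is_member[serial] = value == "true"
        else:
            status[serial] = value
    return {
        serial: value
        for serial, value in status.items()
        if is_member.get(serial, False) and value != "ok"
    }
```

Run against the output above, this reports only the WD-WCAT1E408695 disk with status missing; the text can come from running snmpwalk on the monitoring host and passing its standard output to the function.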