Forum Discussion

RPriesing_79448
Nov 18, 2009

BIG3D restarts frequently after Mgmt Pack install

We just installed the latest licensed version of the Management Pack with OpsMgr 2007 R2 and noticed in the syslog that big3d is restarting frequently, every minute or two. Should this be happening? Our devices are at "BIG-IP 9.3.1 Build 37.1" in an Active/Passive configuration. Both devices have been discovered through OpsMgr, and monitoring has been enabled via the command line.
  • Adding discovery log:

    Execute device discovery:Success
    Attempt to connect to the iControl device socket:Success
    Retrieve device information:Success
    Verify device is supported:Success
    Retrieve minimum supported device version:Success
    Calculate clock skew:Success
    Retrieve device management address:Success
    Validate unique device management address:Success
    Checking license expiration:Success
    Retrieve device self address:Success
    Retrieve device failover info:Success
    Retrieve device features:Success
    Create persisted iQuery connection:Success
    Attempt to connect to the iQuery device socket:Success
    Retrieve Big3d Version:Success
    Disconnecting existing Big3d connections:Success
    Recreate the iQuery connection with updated Big3d:Success
    Retrieve device information:Success
    Verify device is supported:Success
    Retrieve minimum supported device version:Success
    Calculate clock skew:Success
    Retrieve device management address:Success
    Validate unique device management address:Success
    Checking license expiration:Success
    Retrieve device self address:Success
    Retrieve device failover info:Success
    Retrieve device features:Success
    Create persisted iQuery connection:Success
    Attempt to connect to the iQuery device socket:Success
    Retrieve Big3d Version:Success
    Store device Configuration:Success
    Store device in Database:Success
    Create device configuration:Success
    Retrieve configuration from device:Success
    Retrieve device trunks:Success
    Retrieve device interfaces:Success
    Retrieve device cpus:Success
    Retrieve device chassis:Success
    Retrieve device partitions:Success
    Retrieve device database variables:Success
    Retrieve device trunk to interface relationships:Success
    Retrieve device LTM virtual addresses:Success
    Retrieve device LTM virtual servers:Success
    Retrieve device LTM iRules:Success
    Retrieve device LTM nodes:Success
    Retrieve device LTM pools:Success
    Retrieve device LTM pool members:Success
    Retrieve device HTTP classes:Success
    Retrieve device GTM data centers:Success
    Retrieve device GTM servers:Success
    Retrieve device GTM virtual servers:Success
    Retrieve device GTM Link:Success
    Retrieve device GTM distributed applications:Success
    Retrieve device GTM wide IP's:Success
    Retrieve device GTM pools:Success
    Retrieve device GTM pool virtual servers (members):Success
    Retrieve device GTM wide IP to pool relationships:Success
    Retrieve device GTM distributed application to wide IP relationships:Success
    Save device configuration in Database:Success
    Verify device authenticity after saving the device:Success
    Wait for verification of device configuration change completion:Success
    Setting initial object states:Success
    Set device as connected:Success
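
A discovery log this long is easier to check mechanically than by eye. The following is a minimal sketch, assuming each entry has the `step:status` shape seen in the log above; the function names are made up for illustration, and any status text other than `Success` is an assumption, since the log in this thread shows only successes. It lists steps that did not succeed and steps that appear more than once (here, the ones repeated after the iQuery connection is recreated with the updated big3d).

```python
from collections import Counter


def parse_discovery_log(text):
    """Return (step, status) pairs from lines shaped like 'step:status'."""
    steps = []
    for line in text.splitlines():
        line = line.strip()
        if ":" not in line:
            continue
        # Split on the last colon so a step name may itself contain ':'.
        step, status = line.rsplit(":", 1)
        if not status.strip():
            continue  # e.g. a heading such as "Adding discovery log:"
        steps.append((step.strip(), status.strip()))
    return steps


def summarize(steps):
    """Return (steps that did not report Success, steps seen more than once)."""
    failed = [step for step, status in steps if status != "Success"]
    counts = Counter(step for step, _ in steps)
    repeated = sorted(step for step, n in counts.items() if n > 1)
    return failed, repeated


# Abbreviated excerpt of the log posted above.
sample = """\
Execute device discovery:Success
Retrieve Big3d Version:Success
Disconnecting existing Big3d connections:Success
Recreate the iQuery connection with updated Big3d:Success
Retrieve Big3d Version:Success
Set device as connected:Success
"""

failed, repeated = summarize(parse_discovery_log(sample))
print("failed:", failed)
print("repeated:", repeated)
```

Pasting the full log into `sample` would show every step that runs twice during discovery.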
  • Julian_Balog_34
    Historic F5 Account
    Your problem may be related to a known issue with the F5 Management Pack collecting CPU stats from an F5 device running BIG-IP 9.3.1. We will have a fix for this issue in the next release of the F5 Management Pack (around mid-December this year). As a workaround, you can disable the collection of the CPU stats. The SCOM collection rules for the CPU stats are enabled by default in the F5 Management Pack; to disable them, you'll have to create the appropriate overrides. These are the rules you'll need to override and disable:

    - Device CPU Fan Speed
    - Device CPU Processor Utilization
    - Device CPU Temperature

    I would suggest the following steps to disable these rules:

    1. Open the SCOM 2007 Management Console and select the Authoring section.
    2. Look for the F5 device CPU rules listed above.
    3. For each of these rules: select the rule > right-click > Overrides > Disable the Rule > For all objects of class: F5 Management Pack Monitoring Service > click Yes when prompted to override.
    4. Go to the Monitoring section of the SCOM 2007 Management Console. Browse to Monitoring > F5 Networks > F5 Actions and select the "Refresh F5 Device Collection Rules" task under "F5 Action Tasks" on the right.
    5. Run the "Refresh F5 Device Collection Rules" task.

    Once the refresh task completes successfully, the F5 Management Pack should no longer collect the F5 device CPU stats, and hopefully your problem will go away.

    * * *

    This workaround assumes that both devices in your redundant pair run the same BIG-IP v9.3.1 platform. If you need to disable the rules for specific devices only, let us know and we'll assist you.

    Thanks,
    Julian

  • No luck, still getting the restarts and the corresponding 300/301 events in the Windows event log. Attaching the SystemInformation log from our OpsMgr server. Both F5 nodes show up as Healthy in the OpsMgr console, and both nodes are at the same version, 9.3.1.

    big3d version: big3d Version 10.0.1 for linux.

    Thanks
  • Something I just noticed: when I run remote commands like iControlSysteminfo.exe, the connection fails with my service account, which is an Administrator on the F5 boxes, but succeeds with another Administrator account. I can telnet into the LTMs with the service account and do anything, and I can also connect through the web GUI and manage the LTMs with it. Are there hidden permissions, or permissions that may not have been set when the account was created, that I can check?
  • Julian_Balog_34
    Historic F5 Account
    Could you send us the [device] logs found in:

    /var/log/ltm
    /var/log/gtm

    These logs would possibly give us more information about what's going on with the big3d. You can email them to managementpack(at)f5.com.

    Thank you.

    Julian
  • Julian_Balog_34
    Historic F5 Account
    I went through the logs you sent me, and it appears that big3d bounces regularly, every 5-6 minutes or so. Does this still happen if you stop the F5 Monitoring service? I still think the problem may be related to collecting certain rules, and I would suggest turning off all of the rules collected by the F5 Management Pack. I'll work on a PowerShell script that disables all of those rules, to save you the hassle of going through the enabled rules manually and overriding each one to disabled. If you don't have that many, though, you can disable them by hand and see whether the problem persists.

    Julian
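
To compare the bounce interval with and without the F5 Monitoring service running, the gaps between restart entries in /var/log/ltm can be measured with a few lines of Python. This is only a sketch under stated assumptions: it expects the classic syslog timestamp prefix (for example `Nov 18 10:01:02`), and `restart_intervals` and its `marker` argument are hypothetical names. The exact wording big3d logs on restart is not shown in this thread, so `marker` has to be taken from your own log.

```python
from datetime import datetime


def restart_intervals(lines, marker, year):
    """Return the seconds between consecutive log lines that contain `marker`.

    Assumes each line starts with a 15-character classic syslog timestamp.
    Syslog omits the year, so the caller supplies it.
    """
    times = []
    for line in lines:
        if marker not in line:
            continue
        stamp = line[:15]
        try:
            times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
        except ValueError:
            continue  # skip lines without a parsable timestamp prefix
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]
```

Passing an open file handle for a copy of the ltm log as `lines`, together with the text that identifies a big3d restart as `marker`, returns one number per gap; values clustering around 300 to 360 would match the 5-6 minute cycle described above.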
  • The restarts do continue when we enable monitoring. Per your earlier recommendation I disabled the CPU rules, but I'm not sure how many rules are left from the default install, so a script would be handy. Thanks
  • Julian_Balog_34
    Historic F5 Account
    I've attached a PowerShell script (Set-F5MPRules.ps1) that you can use to disable all the rules in the F5 Management Pack. Here are the steps I recommend for using it:

    1. If you have an override management pack where the F5 Management Pack rule overrides are defined, make a backup of the related XML file. Normally this file would be located in the SCOM installation folder. In the steps below, "OverrideMP" stands for the SCOM ID of the override management pack; if you already have one, use its actual name (ID) instead.

    2. Copy the attached file (Set-F5MPRules.ps1) to an arbitrary folder, preferably on the SCOM Root Management Server.

    3. Open the Operations Manager Shell (i.e. the SCOM PowerShell command prompt), change to the folder where you copied the script, and execute Set-F5MPRules.ps1 with the following command:

        
        .\Set-F5MPRules.ps1 "OverrideMP" $false
        

    where

    OverrideMP = the ID of the override management pack where your existing F5 Management Pack rule overrides are defined. If you don't have any F5 Management Pack rule overrides defined, you can give the override management pack any name you like.

    $false = overrides all the F5 Management Pack rules (collection rules and threshold rules) to disabled.

    Both parameters are mandatory.

    You should get output similar to the capture attached to this post. The script first processes any existing F5 Management Pack rule overrides, which should run pretty fast. It then processes the rules that are enabled by default in the F5 Management Pack and overrides them to disabled, which could take a bit longer. Once the script has finished, run the "Refresh F5 Collection Rules" and "Refresh F5 Threshold Rules" tasks and see if the big3d bouncing problem goes away. Afterwards, you can revert your override management pack to the backup you made, to restore your initial F5 Management Pack rule state.

    Let me know how it goes.

    Thanks!

    Julian

  • Ran the script and then re-enabled monitoring. Still getting the same issue/errors.

    Here's the event from the F5 Monitoring Log on the OpsMgr server:

    Failed to discover device at address: 10...
    Network-related failure has occurred: [Category]SecureSocketLayer:[Type]ConnectFailure;LastError=SSL IO Error 495568656: SYSCALL:

    F5Networks.Protocols.iQuery.iQueryException: [Category]SecureSocketLayer:[Type]ConnectFailure;LastError=SSL IO Error 495568656: SYSCALL:
    at F5Networks.Protocols.iQuery.iQuerySocketBase.Connect()
    at F5Networks.ManagementPack.Services.DeviceMonitor._CompleteDiscovery(DeviceDiscoveredEventArgs successContext)
    at F5Networks.ManagementPack.Services.DeviceMonitor._DeviceDiscovered(Object sender, DeviceDiscoveredEventArgs successContext)
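
When many of these events pile up in the F5 Monitoring Log, it can help to split the message into its parts and count them by category and type. The sketch below assumes the `[Category]…:[Type]…;LastError=…` layout inferred from the single event quoted above; it is not a documented format, and `parse_iquery_error` is a made-up helper name.

```python
import re

# Layout inferred from the one event quoted in this thread (an assumption).
ERROR_RE = re.compile(
    r"\[Category\](?P<category>[^:]+):"
    r"\[Type\](?P<type>[^;]+);"
    r"LastError=(?P<last_error>.*)"
)


def parse_iquery_error(message):
    """Return the category, type and last error of a message, or None."""
    match = ERROR_RE.search(message)
    if match is None:
        return None
    return {name: value.strip() for name, value in match.groupdict().items()}


line = (
    "F5Networks.Protocols.iQuery.iQueryException: "
    "[Category]SecureSocketLayer:[Type]ConnectFailure;"
    "LastError=SSL IO Error 495568656: SYSCALL:"
)
print(parse_iquery_error(line))
```

Lines that do not match, such as the stack frames beginning with `at`, return `None` and can simply be skipped.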
  • Julian_Balog_34
    Historic F5 Account
    Now that we can assume no stats are being collected from the device, could you please enable verbose logging? Click here for details (see the "Verbose Logging Support" section).

    Stop the F5 Monitoring Service, enable verbose logging, delete the existing trace.log file (in the Program Files/F5 Networks/Management Pack/log folder), and start the F5 Monitoring Service again. Let it run for at least a few big3d bouncing cycles, then zip the trace.log file and send it to us at managementpack(at)f5.com.

    Thank you,

    Julian