Forum Discussion

netteam_66633
Jan 10, 2011

F5 monitoring agent not stable after upgrading MP to 2.1.5.440 x64

Hi,

 

 

We are having issues with the F5 monitoring service running on the SCOM 2007 R2 RMS server. We've successfully upgraded the MP to the latest available release, 2.1.5.440 x64, but we had this issue before the upgrade too. From time to time the F5 monitoring service on the RMS stops. Occasionally when this happens I also have to restart the System Center Management service. Both the Data Source Server and the Operations Manager Connector are in a critical state.

The following Event IDs are logged:

201:

Unable to connect to data source: The PerformanceDataSourceConnector connection to Operations Manager Health Service host HealthService could not be established: Failed to connect to an IPC Port: The system cannot find the file specified.
:HealthService.

401:

The EventDataSourceConnector connection to Operations Manager Health Service host localhost was lost: Failed to connect to an IPC port: The system cannot find the file specified.

806:

Unable to process device [F5 device [x.x.x.x]] statistics due to data failure: The PerformanceDataSourceConnector connection to Operations Manager host Health Service could not be established: Failed to connect to an IPC Port: The system cannot find the file specified.
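
All three events appear to end in the same IPC failure, which suggests a single underlying cause (the Health Service endpoint not being reachable) rather than three separate problems. The following is a minimal Python sketch that groups the quoted messages by their trailing failure reason. The regular expression is inferred only from the three messages in this post and is an assumption, not a documented F5 or SCOM message format; `group_by_reason` and `EVENTS` are hypothetical names used for illustration.

```python
import re
from collections import defaultdict

# Pattern inferred from the three messages quoted in this post (an assumption,
# not a documented message format).
CONNECTOR_RE = re.compile(
    r"The (?P<connector>\w+) connection to Operations Manager (?P<target>.+?) "
    r"(?P<state>could not be established|was lost): (?P<reason>.+)"
)


def group_by_reason(events):
    """Map each normalized failure reason to the event IDs that report it."""
    groups = defaultdict(list)
    for event_id, message in events.items():
        match = CONNECTOR_RE.search(message)
        # Lowercase the reason so "IPC Port" and "IPC port" compare equal.
        reason = match.group("reason").strip().lower() if match else "unparsed"
        groups[reason].append(event_id)
    return dict(groups)


# Messages as quoted in the post; the trailing ":HealthService." fragment
# of event 201 is left out.
EVENTS = {
    201: "Unable to connect to data source: The PerformanceDataSourceConnector "
         "connection to Operations Manager Health Service host HealthService "
         "could not be established: Failed to connect to an IPC Port: "
         "The system cannot find the file specified.",
    401: "The EventDataSourceConnector connection to Operations Manager "
         "Health Service host localhost was lost: Failed to connect to an "
         "IPC port: The system cannot find the file specified.",
    806: "Unable to process device [F5 device [x.x.x.x]] statistics due to "
         "data failure: The PerformanceDataSourceConnector connection to "
         "Operations Manager host Health Service could not be established: "
         "Failed to connect to an IPC Port: The system cannot find the file "
         "specified.",
}

for reason, ids in group_by_reason(EVENTS).items():
    print(ids, "->", reason)
```

If the grouping yields a single reason for all event IDs, the troubleshooting can focus on why the monitoring service cannot reach the Health Service IPC port.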

 

 

  • Julian_Balog_34
    Historic F5 Account
    Hi,

     

     

    Please send us the following files, archived, at managementpack(at)f5(dot)com:

    - F5 Monitoring Log event log
    - OperationsManager log event log
    - trace.log file (in the Program Files\F5 Networks\Management Pack\Log folder)

    In anticipation of the next occurrence of the issue, I'd like to get as much trace logging as possible on the F5 Monitoring Service crash. Please enable verbose logging on the F5 Monitoring Service. See the 'Verbose Logging Support' section in this article: http://devcentral.f5.com/wiki/default.aspx/MgmtPack/GeneralTroubleshooting.html

    Prepare a clean slate (not mandatory, but it will make browsing the logs easier later):

    - Clear the F5 Monitoring Log.
    - Clear the Operations Manager log.
    - Restart the F5 Monitoring Service.

    Wait for the crash to happen, then zip and send us the following:

    - F5 Monitoring Log event log
    - OperationsManager log event log
    - trace.log file (in the Program Files\F5 Networks\Management Pack\Log folder)

     

     

    Thank you!

     

    Julian
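
The "zip and send" step requested above can be scripted so the same three files are collected after every crash. The following is a minimal Python sketch, assuming the two event logs have already been exported to files (for example from Event Viewer) and that the caller supplies the real paths. `archive_logs` is a hypothetical helper name, not part of the F5 Management Pack.

```python
import zipfile
from pathlib import Path


def archive_logs(paths, archive_path):
    """Zip the given log files and return the paths that were not found.

    `paths` are supplied by the caller, e.g. the exported event logs and
    trace.log; nothing here locates the files automatically.
    """
    missing = []
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in map(Path, paths):
            if path.is_file():
                # Store only the file name so the archive has a flat layout.
                zf.write(path, arcname=path.name)
            else:
                missing.append(str(path))
    return missing
```

A call such as `archive_logs([f5_log_export, opsmgr_log_export, trace_log], "f5-logs.zip")` writes the archive and returns any paths that could not be found, which makes it easy to spot a missing trace.log before sending the archive. The variable names and the archive name in that call are placeholders.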

     

     

     

  • Hi,

     

     

    I've collected the logs, archived them, and sent them for troubleshooting. I've also enabled verbose logging, cleared the logs, and restarted the F5 Monitoring Service.

    Let me give you a quick overview of our environment: we have 4 BIG-IP 6400 appliances in 2 datacenters that we would like to monitor. After running discovery, when I tried to open the performance view I noticed that I don't see all the rules for each device. For example, on the active device in the first datacenter I can see only one Device CPU Processor Utilization (target 0:0), Global Server Current Connections, Server - Current Connections, and Client - Current Connections, and nothing else. When I select any of these items, no data is displayed. On the standby unit I see more rules, such as both CPUs and partitions.

    The local account on each unit has the administrator role. When I tried to re-run discovery I got a message saying that the service was not running, although it was running. I've restarted the service and tried again.

     

     

    Thanks,

     

  • No overrides have been created so far. Everything is running with the default configuration.
  • Every time I restart the F5 monitoring service on the RMS server I get these alerts:

    Alert: Root Management Server Unavailable. Priority

    Last modified time: 1/13/2011 9:08:04 PM Alert description: The root management server (HealthService) is running but has reported limited functionality soon after 1/13/2011 9:06:08 PM. The specific reason code is 43 and description is " System workflows essential to running of the product have failed to load. ".

    followed by

    Alert: Device Configuration Update Failure Priority

    Last modified time: 1/14/2011 9:25:18 AM Alert description: Failed to update device configuration. Attempt to reconnect.