Forum Discussion

Joe_44512
Dec 08, 2010

F5 SCOM Management Pack Not Collecting All Data

I have the F5 SCOM Management Pack (version 2.0.0.516) installed on SCOM 2007 R2 (version 6.1.7221.0). I created overrides on specific rules to enable data collection, such as the 'LTM PoolMember Server Current Connections' rule.

 

 

After enabling this rule, I created a performance view filtered by specific rules that begin with the letters 'LTM'. Once the performance view was created, I waited minutes, then days, for SCOM to collect performance data. Only about 5 of the nodes in this performance view show data in the graph. All the other nodes show a blank white area where the graph should be. (Attached image)

 

 

Can someone please provide some guidance as to why I am collecting data for certain nodes and not others that all fall under the same rule? I need to collect data for all the nodes listed in this performance view.

 

 

In addition, the problem is specific to this environment. I have the same version of SCOM and the F5 Management Pack installed in a different, isolated environment with the same model devices and performance views. Everything in that environment is collecting and graphing data as expected. What is wrong in my non-functional environment?

 

 

Thank you in advance,

 

Joe

 

7 Replies

  • Julian_Balog_34
    Historic F5 Account
    Hi Joe,

     

     

    Please make sure you applied the right procedure for creating the overrides for the F5 LTM collection rules. This article will tell you the details: http://devcentral.f5.com/wiki/default.aspx/MgmtPack/PerformanceCollectionAndMonitoring.html

     

    Assuming that your rule overrides are set up correctly, please run the 'Refresh F5 Device Collection Rules' task, accessible under the F5 Actions >> F5 Action Tasks, in the SCOM Management Console. Make sure the task completes successfully.

     

     

    If you still can't see the performance data coming in, you can do the following:

     

     

    - analyze the Operations Manager event log, and let us know if there are any errors or warnings that may be relevant to this issue; you can also archive the logs and send them to managementpack(at)f5(dot)com.

     

    - save the override pack for the F5 collection rules and send it to us at: managementpack(at)f5(dot)com

     

    - use the Workflow Analyzer tool (in the System Center Operations Manager 2007 R2 Authoring Resource Kit), and make sure the workflows that collect the F5 rules are working properly (they should show up green). You can download the SCOM authoring resource kit here: http://www.microsoft.com/downloads/en/details.aspx?FamilyID=9104af8b-ff87-45a1-81cd-b73e6f6b51f0&displaylang=en

     

    - enable verbose logging on the F5 Management Pack; see this article for more details: http://devcentral.f5.com/wiki/default.aspx/MgmtPack/GeneralTroubleshooting.html. After enabling verbose logging, run the 'Refresh F5 Device Collection Rules' task again and wait for the expected cycle to deliver a first round of stats. Then zip the trace.log file (found in the Program Files\F5 Networks\Management Pack\log folder) and send it to us at managementpack(at)f5(dot)com.

     

    - perform lower-level troubleshooting using the Operations Manager debug traces (we can assist you with this) to find out whether there are any internal SCOM errors when processing these rules.

     

     

    Let us know.

     

     

    Thank you.

     

    Julian

     

  • I have followed the instructions. When running the Workflow Analyzer from the Authoring Resource Kit I did not see any errors, and running the rule collection refresh did not force data to be collected. However, when looking at the view 'F5 Management Pack Monitoring Services' in SCOM, the F5.MonitoringService is in a critical state. Below are the details from the State Change Event:

     

     

    Context:
    Date and Time: 10/8/2010 6:10:22 PM
    Log Name: F5 Monitoring Log
    Source: F5 Events
    Event Number: 401
    Level: 2
    Logging Computer:
    User: N/A
    Description: The PerformanceDataSourceConnector connection to Operations Manager Health Service host localhost was lost: Failed to write to an IPC Port: The pipe is being closed.
    Event Data: The PerformanceDataSourceConnector connection to Operations Manager Health Service host localhost was lost: Failed to write to an IPC Port: The pipe is being closed.
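When comparing State Change Events like the one above across environments, it can help to split the context text into its labelled fields. The following is only a minimal sketch: the label list is taken from this single event, and the function name `parse_event_context` is hypothetical, not part of SCOM or the F5 Management Pack.

```python
import re

# Field labels as they appear in the State Change Event quoted in this thread.
# Assumption: other events use the same labels; extend the list if they do not.
LABELS = [
    "Date and Time", "Log Name", "Source", "Event Number", "Level",
    "Logging Computer", "User", "Description", "Event Data",
]
LABEL_RE = re.compile(r"\b(" + "|".join(re.escape(l) for l in LABELS) + r"):\s*")


def parse_event_context(text):
    """Split 'Label: value' runs into a dict, whether the labels are
    separated by spaces or by line breaks."""
    fields = {}
    matches = list(LABEL_RE.finditer(text))
    for i, m in enumerate(matches):
        # A value runs until the next known label (or the end of the text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        fields[m.group(1)] = text[m.end():end].strip()
    return fields


sample = (
    "Date and Time: 10/8/2010 6:10:22 PM Log Name: F5 Monitoring Log "
    "Source: F5 Events Event Number: 401 Level: 2 Logging Computer: User: N/A "
    "Description: The PerformanceDataSourceConnector connection to Operations "
    "Manager Health Service host localhost was lost: Failed to write to an IPC "
    "Port: The pipe is being closed."
)
fields = parse_event_context(sample)
print(fields["Event Number"], "|", fields["Date and Time"])
```

Colons inside the description (for example after "IPC Port") are left alone because only the known labels are treated as field boundaries.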

     

     

    Can this be the issue? Failed to write to an IPC Port: The pipe is being closed

     

     

    Let me know what data or logs may be useful to send over to better troubleshoot this issue.

     

     

    Thanks!

     

  • Julian_Balog_34
    Historic F5 Account
    The time stamp of the log trace that you mentioned in your post seems to be 3 months old. Is this still the error you are getting now, while the F5.MonitoringService monitor is in a critical state? Please send us the verbose trace logs for the F5 Monitoring Service (see my previous post on how to collect them). This would give us more information about the error(s).

     

     

    To enable verbose logging on the F5 Management Pack, see this article for more details: http://devcentral.f5.com/wiki/default.aspx/MgmtPack/GeneralTroubleshooting.html. After enabling verbose logging, restart the service, run the 'Refresh F5 Device Collection Rules' task again, and wait for the expected cycle to deliver a first round of stats. Then zip the trace.log file (found in the Program Files\F5 Networks\Management Pack\log folder) and send it to us at managementpack(at)f5(dot)com.

     

     

    Thank you.

     

    Julian
  • Julian_Balog_34
    Historic F5 Account
    From the trace logs you've sent us I can see that statistics are being collected from the F5 device. The only errors I saw in the trace log were: "Device x.x.x.x has received configuration iQuery on the stats connection; may indicate stats request for invalid object." These errors cleared up right after your service configuration update for enabling verbose logging, so I assume the F5 Monitoring Service may have had a stale device configuration state cached.

     

     

    Regarding the critical state on the F5.MonitoringService monitor: this could have been triggered by the 401 event ID error (which I assume you would see in your F5 Monitoring Log in the Event Viewer). The critical state of this monitor should be reset automatically by a 'paired' 202 event ID on a successful service start-up or restart, but unfortunately this doesn't work as expected in your case (the related bug has recently been fixed). So please go ahead and manually close/resolve the related alert on the F5.MonitoringService monitor, then restart the F5 Monitoring Service (using the Windows Services Control Manager MMC snap-in).
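The paired-event behaviour described above can be pictured with a small model. This is a simplified sketch based only on this post (401 raises the critical state, a later 202 resets it); it is not the management pack's actual monitor definition, and `expected_monitor_state` is a hypothetical helper name.

```python
def expected_monitor_state(event_ids):
    """Return the state the monitor is expected to end in, given the
    event IDs from the F5 Monitoring Log in chronological order.

    Simplified model: 401 -> critical, 202 -> healthy, all other IDs ignored.
    """
    state = "healthy"
    for event_id in event_ids:
        if event_id == 401:
            state = "critical"
        elif event_id == 202:
            state = "healthy"
    return state


# A 401 followed by a successful start-up (202) should leave the monitor healthy.
print(expected_monitor_state([401, 202]))
```

If the log shows a 202 after the last 401 but the monitor is still red in SCOM, the state shown in the console disagrees with this model, which is the situation the manual alert reset is meant to work around.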

     

     

    After the service restart, check the F5 Monitoring Log in the Event Viewer and make sure there are no errors (Event ID 201 or 401). Then make sure you have a healthy state on the F5.MonitoringService monitor in SCOM. If it does stay green, you should be fine with the health of the F5 Management Pack.

     

     

    If you still get the red state on the F5 Monitoring health, let us know. Otherwise you should be seeing the stats enabled by default for the F5 Management Pack coming in. If you plan on overriding other rules for the F5 Management Pack, make sure you follow the appropriate steps described here: http://devcentral.f5.com/wiki/default.aspx/MgmtPack/PerformanceCollectionAndMonitoring.html

     

     

    Let us know the results.

     

    Thank you.

     

    Julian
  • Thanks Julian - I cleared the alert and restarted the service as you recommended and the F5.MonitoringService is still in Critical state with "Operations Manager Connector - F5.MonitoringService (F5 Management Pack Monitoring Service)" in error.

     

     

    The Event ID within SCOM is 401 as you stated, but in the F5 Monitoring Log I only see Event ID 701 entries (DeviceConfig: Unknown config item was encountered; ignored: conn_limit = 0, 0 items).

     

     

    The 401 error still remains:

     

     

    Event Data: The PerformanceDataSourceConnector connection to Operations Manager Health Service host localhost was lost: Failed to write to an IPC Port: The pipe is being closed.

     

     

    I am now collecting more data than I was before, but still not everything. I checked that the overrides were done properly using the documentation you provided. As of now, data is showing for most of the "LTM Pool Member Server Current Connections" instances, but some data for this rule is still not being collected. In addition, I am not seeing any data coming in for the "LTM Node Server Current Connections" rule: even though all the nodes are listed in the SCOM view, no data is graphed when a report is run.

     

     

    Thanks again and please let me know if my post is not clear.

     

     

    Joe
  • Julian_Balog_34
    Historic F5 Account
    Hi Joe,

     

     

    Thanks for the update. The post is very clear. I believe the critical state that you're seeing for the F5.MonitoringService monitor may not be accurate, since you don't see any 401 event ID errors in the event log. This monitor should only be triggered by a current 401 event ID error. The 704 event ID warnings about the unknown config items encountered are not critical. This issue should be fixed in a more recent build/release of the F5 Management Pack (v2.1.5.440).

     

     

    First I'd like to troubleshoot the apparently inaccurate health state of the F5 Monitoring Service in SCOM. This may reflect a stale state in SCOM's Health State Configuration Cache. I would recommend the following steps:

     

     

    1. Close the alert regarding the 401 event ID error in SCOM.

     

    2. Stop the F5 Monitoring Service.

     

    3. Enable verbose logging support (see http://devcentral.f5.com/wiki/default.aspx/MgmtPack/GeneralTroubleshooting.html).

     

    4. Stop the SCOM services (Health, Config and SDK).

     

    5. Delete the SCOM health state config cache file (OpsMgrConnector.Config.xml), located in the \Program Files\System Center Operations Manager 2007\Health Service State\Connector Configuration Cache\ folder.

     

    6. Restart the SCOM services (SDK, Health, Config).

     

    7. Start the F5 Monitoring Service.

     

     

    Make sure the health state config cache file has been created in the location mentioned in step 5. If this file has not been created automatically, we need to take a deeper look at why the SCOM SDK connector cache wouldn't update. If the config cache is successfully created and you still get the red state on the F5 Monitoring Service monitor, please zip and email the trace.log file located in \Program Files\F5 Networks\Management Pack\Log to managementpack(at)f5(dot)com.
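The check that the cache file was recreated can also be scripted. The sketch below is only an illustration: `cache_was_recreated` is a hypothetical helper, and it assumes the file sits directly in the Connector Configuration Cache folder named in step 5 and that you noted the time at which the SCOM services were restarted.

```python
from pathlib import Path

# File name as given in step 5 of this post.
CACHE_FILE_NAME = "OpsMgrConnector.Config.xml"


def cache_was_recreated(cache_dir, restarted_at):
    """Return True if the cache file exists in cache_dir and was last
    modified at or after restarted_at (a POSIX timestamp, e.g. time.time()
    captured just before restarting the SCOM services)."""
    candidate = Path(cache_dir) / CACHE_FILE_NAME
    return candidate.is_file() and candidate.stat().st_mtime >= restarted_at
```

A typical call would pass the full folder path from step 5 (the drive letter is an assumption, since the post does not state it) together with the timestamp recorded before step 6. A False result means either the file is missing or it is an older copy that was not regenerated.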

     

     

    Also, please send us the F5 Monitoring Log (zipped) and the override management pack (XML file) for the F5 Management Pack overrides.

     

     

    Thank you.

     

    Julian

     

  • I followed the steps you recommended and I'm still getting the critical alert. I have captured the logs and will send them via email. I also tried running the "Refresh F5 Device Collection" action and got the following error in the Task Output window (assuming it is related to the same issue, it may be useful):

     

     

    The Event Policy for the process started at 5:01:22 PM has detected errors in the output. The 'StdErr' policy expression:

    \a+

    matched the following output:

    Monitoring Service is not available

    Command executed: "C:\Windows\system32\windowspowershell\v1.0\powershell.exe" -NonInteractive -Command "& '"C:\Program Files\System Center Operations Manager 2007\Health Service State\Monitoring Host Temporary Files 13\35921\RunF5MPCmd.ps1"' '-RefreshCollectionRules'"

    Working Directory: C:\Program Files\System Center Operations Manager 2007\Health Service State\Monitoring Host Temporary Files 13\35921\

    One or more workflows were affected by this.

    Workflow name: F5.Task_RefreshCollectionRules.Service

    Instance name: F5.MonitoringService

    Instance ID: {3E056931-A3C5-BC1E-E34E-D7D1E3B410BE}

    Management group: PRD_CD

    Error Code: -2130771918 (Unknown error (0x80ff0032)).
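The task output reports the same error code twice: once as a signed decimal and once in hexadecimal. The two are the same 32-bit value, which matters when searching logs or forums where only one form appears. A minimal sketch of the conversion (the helper names are made up for illustration):

```python
def to_unsigned_hex(code):
    """Render a signed 32-bit error code as 8-digit hexadecimal."""
    return f"0x{code & 0xFFFFFFFF:08x}"


def to_signed32(value):
    """Interpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


print(to_unsigned_hex(-2130771918))  # 0x80ff0032
print(to_signed32(0x80FF0032))       # -2130771918
```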