Forum Discussion
boczon_108037
Nimbostratus
Feb 07, 2011
Hard Drive failures
Recently we have had issues with hard drive failures on a few 6900 devices. I was just curious whether anyone else is experiencing similar issues and what you are doing to monitor and alert on those failures.
I am able to pick up some of the errors in syslog but am not sure if all of the errors are hitting my syslog.
Thanks
13 Replies
- hoolio
Cirrostratus
I haven't heard of any significant numbers of drive failures on 6900 units. I expect that any drive failures would be logged locally and via syslog if you have it configured. Are you seeing intermittent syslog delivery?
Aaron - boczon_108037
Nimbostratus
I do have syslogging configured. I wasn't sure if all of the errors would be sent to syslog. If they are, I was curious whether anyone has worked out what all of the possible logged events would be. I was trying to configure some filters for alerting that would capture the correct events.
I have started adding filters for "err kernel" and was looking to add anything that would capture disk or RAID errors.
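A minimal sketch of that kind of filter, assuming syslog lines arrive as plain text. The "err kernel" match comes from the post above; the disk and RAID keyword list is a hypothetical starting point, since the exact messages a 6900 logs on a drive failure are not confirmed in this thread.

```python
import re

# Match the "err kernel" severity/facility pair mentioned in the post.
KERNEL_ERR = re.compile(r"\berr kernel\b", re.IGNORECASE)

# Hypothetical keyword list (an assumption, not taken from F5 documentation).
# Tune it against the messages your own units actually log.
DISK_KEYWORDS = re.compile(
    r"\b(raid|md\d+|sd[a-z]\d*|ata\d+|i/o error|smart|disk)\b",
    re.IGNORECASE,
)


def is_disk_alert(line: str) -> bool:
    """Return True if a syslog line is a kernel error or mentions disk/RAID terms."""
    return bool(KERNEL_ERR.search(line) or DISK_KEYWORDS.search(line))


def filter_alerts(lines):
    """Yield only the lines worth alerting on, with trailing newlines removed."""
    for line in lines:
        if is_disk_alert(line):
            yield line.rstrip("\n")
```

`filter_alerts` accepts any iterable of lines, so it can be fed an open log file or `sys.stdin`. Starting with a broad keyword list and trimming the false positives is probably safer than starting narrow and missing a failure.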
In the last couple of months I replaced three complete units (all 6900) and a drive that was in one of the replacement units.
I logged cases for everything, but support has not found anything that would be causing so many failures. - Christopher_Boo
Cirrostratus
These are hard drives and not compact flash storage you are referring to? I ask because I ran into a weird bug a few months ago upgrading to 10.2. The upgrade set my compact flash storage as swap. I didn't catch it until a couple of months later, but continued use as swap would likely result in premature failure. - boczon_108037
Nimbostratus
Yes, it is the hard drive. We saw the same issue and made the change documented in sol12170. We started experiencing the hard disk issues sometime after that.
After I replaced the units I did not modify them to move the swap space. I was trying to figure out the actual root cause before I made any additional changes. These devices were up and running for close to 2 years with the swap space on the CF card without issue, so I figured it would be OK to keep running that way for a while. - Christopher_Boo
Cirrostratus
Cool. I was advised by support (and it does make sense) that swap file on compact flash would affect performance and could result in shortened life of the flash. I had some performance issues with ASM running on the same box and it was suggested as a contributing factor. - Chris_Miller
Altostratus
I'm on my 5th HD failure in 9 months on 6900s/8900s. First time, support asked me to RMA the drive. Second one, they asked me to RMA the entire box. Still waiting to see what happens this time.
On mine, I'll log on to a box, see a drive has failed in the RAID array, and go from there. I can simply remove it from the array, reboot the box, and re-add it to the array, and there are no problems. Definitely something wacky. - Christopher_Boo
Cirrostratus
I'd be checking the quality of the power going to the units and making sure there aren't any ventilation issues. Sure sounds like something external is a factor to me. - boczon_108037
Nimbostratus
Good point. I can double-check power and ventilation, but I think it's OK.
The devices I had issues with are in two different data centers. Both data centers are industrial strength and loaded with tons of other gear that is not experiencing any issues.
Things like power, temperature, and air flow are monitored 24 x 7. - Chris_Miller
Altostratus
Hah...they had me do both Power Supplies and both HDs in one of my units as well, along with a clean install. Definitely no power/ventilation issues on our side either. - Daniel_23711
Nimbostratus
I would check the drive manufacturer. Seagate had a lot of issues with SATA drives in the 2009 to early 2010 time frame.
