Attachment | Size |
---|---|
cmts-0-config.txt | 200.1 KB |
cmts-0-pfx-crash-log.txt | 32.05 KB |
cmts-1-config.txt | 69.54 KB |
cmts-1-pfx-crash-log.txt | 32.12 KB |
cmts-1-stack-trace.txt | 269.15 KB |
cmts-2-config.txt | 241.94 KB |
cmts-2-pfx-crash-log.txt | 32.01 KB |
Hello,
In the past two weeks, we've had 4 different UBR10Ks with PRE4s crash for unknown reasons. They're all on 12.2(33)SCG6 and have run it with no issues for about two years. We didn't make any configuration changes or push any new features leading up to this; the crashes just seemed to happen out of nowhere.
I've attached running-configs and crash logs from 3 of the CMTSs. I also managed to get a 'show stack' from one of them because it has redundant PREs and wasn't cold rebooted.
Some additional background - We contract with a third party for our CMTS/Modem NMS that our CSRs use. It's mostly SNMP-based, but they do have limited read-only SSH access to screen-scrape a few things. I only mention this because maybe they started using a different command that exposed some bug. This is completely grasping at straws, but it's the only thing I could think of that may have changed.
Anyway, has anyone had similar issues with this version of IOS, or any version with a PRE4 for that matter? Any experience with a show command that crashed the PRE, and if so, what command was it? If anyone can find anything significant in the logs/configs I attached, that would be amazing. I've been banging my head against this for the past couple of weeks.
If you need more information let me know. I can also be reached off-list - skyler dot blumer at zitomedia dot com.
These types of faults can be painful.
TAC often blames them on solar radiation, etc.
As long as they happen rarely, it is probably nothing to worry about.
If they start happening continuously, then it is probably a sign of failing hardware.
I think the uBR10k gear is getting a bit too old
We've recently had assorted 20x20 linecards, PRE4, and RF switches fail
We've been installing some cBR8s and now have a fair few uBR10k parts, which are being kept as spares for this situation
uBR10k gear is selling cheap, so you can probably pick up more used parts easily
Could also be a software issue
Any chance all your devices had the same sort of uptime when they crashed?
Could be some sort of resource leak that takes a long time
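If you want to check the uptime idea quickly, here is a minimal Python sketch. It assumes you can pull the last boot time and the crash time for each box from the crash logs or syslog. The function names and the 10% tolerance are my own assumptions, not anything from Cisco tooling. It only tells you whether the uptimes at crash sit close together, which would hint at a slow leak or a timer/counter issue rather than random hardware faults.

```python
from datetime import datetime
from statistics import median


def uptime_days(last_boot: datetime, crash_time: datetime) -> float:
    """Uptime at the moment of the crash, in days."""
    return (crash_time - last_boot).total_seconds() / 86400.0


def uptimes_cluster(uptimes, tolerance=0.10):
    """Return True if every uptime lies within `tolerance` (a fraction)
    of the median uptime. The 10% default is an arbitrary assumption."""
    if len(uptimes) < 2:
        raise ValueError("need at least two crashes to compare")
    mid = median(uptimes)
    return all(abs(u - mid) <= tolerance * mid for u in uptimes)
```

Feed it one uptime per crashed CMTS. If it returns True, comparing memory and buffer usage on a long-running box against a freshly rebooted one would be the next thing I'd look at.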
SCG6 is a bit old (Nov 2013). We had it at one stage but had some problems.
Currently using SCH6 for last year or two, has been stable for us.
Confirmed: our last firmware was SCH5, and we had no problems.