I'm having an issue with our local system. The CMTS is a 7246VXR running IOS ubr7200-ik8su2-mz.123-23 on an NPE400, with 1 of 4 installed cards active (all cards are MC16C) and ~350 active modems on the system. The DHCP / DNS / TFTP system is CNR v5.5.13.
I'm seeing SBG900, 901, and 940 model modems (perhaps any model, but these are the ones I can confirm) drop offline for a few minutes at a time and then come back on. This doesn't seem to be happening plant-wide, or at least customers aren't being vocal about it. I've looked into the logs of modems that I know are affected, and this is what I am seeing:
NOTICE: DHCP Renew Failed - Reinitializing MAC
CRITICAL: No Ranging Response Received - T3 Time-out (US 3)
CRITICAL: DHCP Failed - Requested Info not supported.
There are other items in the logs, but this bit repeats twice a day, every day. Am I to assume the modem is kicking offline due to the DHCP issue, then finally getting a lease back from DHCP a few minutes later? The modem never goes fully offline; it just drops its online light, then starts blinking again like it is trying to sync back up to the system.
What is the lease time for the modems from the DHCP server? They run on a private network and should have at least a 2 to 3 week lease time. Have you tried increasing the lease time to see if the problem happens less often? Is the CPU load on the DHCP server normal or on the high side?
~Carl
Took a look at the DHCP. Seems the private addresses are running on a 24H lease (which I agree seems low considering the size of said pool and that it is private), with an 8H lease on the public addresses being handed out to end devices on the network. DHCP server CPU load is practically idle (dual-core SPARCs, 15% high usage, 1% low).
Guess the next step to troubleshoot would be to increase the CPE private address lease time?
Up the modem IP network, for example from a /22 to a /20. It's private anyway. Also increase the lease time from a day to 3 weeks.
The CPE network wouldn't cause modems to reset. If you got calls from customers whose modems are online but who cannot surf the web, or who get an address that is not from your scope, then I would look at something on the CPE side.
Moved the lease time up to 21 days (3 weeks) this afternoon. The private network was already a /20 (like I said, definitely big enough to not need a lease time that low). Haven't really been getting any alarming call volume about online modems that have no external route, and anything along those lines has been a simple reset fix, so I'd feel safe saying that is definitely not the issue here.
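For a rough sense of the pool sizes being discussed, here is a minimal Python sketch comparing a /22 and a /20 against the ~350 active modems mentioned in the first post. The 10.0.0.0 base address is only a placeholder, not the real modem network.

```python
import ipaddress

# Placeholder base address (an assumption); only the prefix length matters here.
BASE = "10.0.0.0"
ACTIVE_MODEMS = 350  # approximate count from the first post


def usable_hosts(prefix_len: int) -> int:
    """Usable host addresses in a subnet (network and broadcast excluded)."""
    net = ipaddress.ip_network(f"{BASE}/{prefix_len}", strict=False)
    return net.num_addresses - 2


for prefix in (22, 20):
    hosts = usable_hosts(prefix)
    share = ACTIVE_MODEMS / hosts
    print(f"/{prefix}: {hosts} usable addresses, {share:.1%} used by modems")
```

Either size leaves plenty of headroom for 350 modems, which is why a long lease on the modem scope carries little risk of exhausting the pool.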
Will look over the logs and check on the known modems to see if this clears up the issue.
Thanks for your help, it is greatly appreciated.
Checked on one of the affected modems today. It had kicked offline at 14:08, right at the time the lease was set to expire. Here is the odd thing, though: I have the leases set to expire in 21 days now, and when I checked the lease table in CNR, this modem's lease was set to expire TOMORROW.
Checking the modem's logs showed a new set of errors:
Wed Aug 03 14:09:43 2011 Critical (3) DHCP FAILED - Requested Info not supported.
Wed Aug 03 14:09:31 2011 Notice (6) T4 No Station Maint Timeout - Reinitialize MAC...
Wed Aug 03 14:09:31 2011 Critical (3) Received Response to Broadcast Maintenance Request, But no Un...
Wed Aug 03 14:09:12 2011 Critical (3) DHCP FAILED - Discover sent, no offer received
Wed Aug 03 14:08:59 2011 Critical (3) Started Unicast Maintenance Ranging - No Response received - ...
Wed Aug 03 14:08:56 2011 Critical (3) DHCP FAILED - Discover sent, no offer received
Wed Aug 03 14:08:56 2011 Critical (3) Started Unicast Maintenance Ranging - No Response received - ...
Wed Aug 03 14:08:48 2011 Critical (3) DHCP FAILED - Discover sent, no offer received
Wed Aug 03 14:08:47 2011 Critical (3) Started Unicast Maintenance Ranging - No Response received - ...
Wed Aug 03 14:08:13 2011 Critical (3) No Ranging Response received - T3 time-out (US 3)
Wed Aug 03 14:08:07 2011 Notice (6) Dhcp Renew Failed - Reinitialize MAC...
And here is the kicker: I checked the Connection tab of this modem, and I'm getting this at the bottom:
CM IP Address****Duration***********Expires
---.---.---.---******D: -- H: -- M: -- S: --***--- --- -- --:--:-- ----
*** characters added for spacing. I know this modem at least got a lease via dhcp as it is active in CNR, and I can successfully remote login to the modem at the office via the internal ip assigned by CNR.
Any new suggestions?
Also, I'm adding the CNR DHCP log entries for that particular modem over the last 24H (two renewal attempts) for further reference. I noticed it is still applying a 24H lease both times, so I'm wondering if I have an upstream policy still set at 24H... will look into that bit. I also noticed it denies the modem a lease at first, and then a bit later allows it the lease. Think that is what's causing the few minutes of offline activity? And why would it deny the modem a lease it is actively using?
08/03/2011 2:07:54 name/dhcp/1 Activity Protocol 0 05019 10.17.17.82 Client Host: '6489' CID: 01:00:25:f1:b2:1c:4b packet 'R3370' tried to renew a lease which was deactivated. Renewal was denied.
08/03/2011 2:08:06 name/dhcp/1 Activity Protocol 0 05265 10.17.17.82 Lease offered to Host: '6489' CID: 01:00:25:f1:b2:1c:4b packet 'R3371' until Thu, 04 Aug 2011 02:08:05 -0400. 1 ms (NOTE: client had existing lease and it remains leased).
08/03/2011 2:08:08 name/dhcp/1 Activity Protocol 0 04994 10.17.17.82 Lease granted to Host: '6489' CID: 01:00:25:f1:b2:1c:4b packet 'R3372' until Thu, 04 Aug 2011 02:08:05 -0400. 1 ms.
08/03/2011 14:08:06 name/dhcp/1 Activity Protocol 0 05019 10.17.17.82 Client Host: '6489' CID: 01:00:25:f1:b2:1c:4b packet 'R6336' tried to renew a lease which was deactivated. Renewal was denied.
08/03/2011 14:09:41 name/dhcp/1 Activity Protocol 0 05265 10.17.17.82 Lease offered to Host: '6489' CID: 01:00:25:f1:b2:1c:4b packet 'R6338' until Thu, 04 Aug 2011 14:09:40 -0400. 1 ms (NOTE: client had existing lease and it remains leased).
08/03/2011 14:09:43 name/dhcp/1 Activity Protocol 0 04994 10.17.17.82 Lease granted to Host: '6489' CID: 01:00:25:f1:b2:1c:4b packet 'R6339' until Thu, 04 Aug 2011 14:09:40 -0400. 1 ms.
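To put numbers on the pattern in these entries, here is a small Python sketch that measures how long the modem waits between each denied renewal and the next granted lease, and how far apart the two denials are. The timestamps and outcomes are copied from the log above; the "denied" / "offered" / "granted" labels are just shorthand for the three message types.

```python
from datetime import datetime

# Timestamps copied from the CNR log entries; outcome labels are shorthand.
EVENTS = [
    ("08/03/2011 2:07:54", "denied"),
    ("08/03/2011 2:08:06", "offered"),
    ("08/03/2011 2:08:08", "granted"),
    ("08/03/2011 14:08:06", "denied"),
    ("08/03/2011 14:09:41", "offered"),
    ("08/03/2011 14:09:43", "granted"),
]


def parse(ts: str) -> datetime:
    """Parse the MM/DD/YYYY H:MM:SS timestamps used in the log."""
    return datetime.strptime(ts, "%m/%d/%Y %H:%M:%S")


def outage_seconds(events):
    """Seconds between each denied renewal and the next granted lease."""
    gaps = []
    denied_at = None
    for ts, outcome in events:
        when = parse(ts)
        if outcome == "denied":
            denied_at = when
        elif outcome == "granted" and denied_at is not None:
            gaps.append(int((when - denied_at).total_seconds()))
            denied_at = None
    return gaps


denials = [parse(ts) for ts, outcome in EVENTS if outcome == "denied"]

print(outage_seconds(EVENTS))      # [14, 97]
print(denials[1] - denials[0])     # 12:00:12
```

The two denials land almost exactly 12 hours apart, which is half of the 24H lease, and the second gap of about a minute and a half lines up with the drop seen in the modem's own log.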
Got the leases set up on a 21 day rotation, and lo and behold, no more twice-daily resets. Looks like DHCP is the culprit.
Any suggestions to squash the bug completely? I'm assuming that now that the lease is up from 24H, it will be doing the same thing, just every 10 1/2 days instead of every 12H?
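The 12H and 10 1/2 day figures follow from the default DHCP renewal timers in RFC 2131, where a client first tries to renew at T1 = 50% of the lease and falls back to rebinding at T2 = 87.5%. A minimal Python sketch of that arithmetic, assuming the server does not override the timers (DHCP options 58 and 59 can change them):

```python
from datetime import timedelta


def renewal_timers(lease: timedelta):
    """Return (T1, T2) using the RFC 2131 defaults of 0.5 and 0.875 of the lease."""
    return lease * 0.5, lease * 0.875


for label, lease in (("24 hour", timedelta(hours=24)),
                     ("21 day", timedelta(days=21))):
    t1, t2 = renewal_timers(lease)
    print(f"{label} lease: renew (T1) after {t1}, rebind (T2) after {t2}")
```

With a 21 day lease the first renewal attempt moves out to 10 days and 12 hours, so if the server keeps denying renewals the same drop would be expected on that schedule rather than twice a day.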
Thanks for all the help, Carl.
Glad it helped find the root cause. A better DHCP server or longer lease times are the only options you have.
Good luck, and thanks for your re-post.
~Carl