Cisco Bug: CSCvv95834 - VSM: Diagnostic DPSwitchLoopback test failure does not trigger a PFM action (automatic reboot)

Last Modified

Oct 23, 2020

Products (1)

  • Cisco ASR 9000 Series Aggregation Services Routers

Known Affected Releases

6.4.2.BASE

Description (partial)

Symptom:
- Customer noticed traffic loss associated with a particular VSM card at the exact time this message was generated:
-- LC/0/1/CPU0:Sep 17 10:33:26.589 UTC: canb-server-lc[131]: %PLATFORM-CANB_SERVER-6-OPERATION_INFO : send_drv_cbc_reset.2163, Info - CPU-ctrl reset CBC


- The problem cleared as soon as the affected VSM was removed from the chassis;
- As per the solution design, removing the card forces the affected traffic to move to the other VSM card;
- The log message stopped when we removed the VSM, and service was restored.
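
The CBC reset message quoted above can be picked out of a saved log with a short script. The following is a minimal sketch, assuming the log lines follow the same layout as the single sample in this report; the names `CBC_RESET` and `find_cbc_resets` are illustrative and not part of any Cisco tool.

```python
import re

# Pattern built from the one sample line in this report; other releases
# may format the message differently (assumption).
CBC_RESET = re.compile(
    r"(?P<node>[\w/]+):(?P<time>\w{3}\s+\d+\s+[\d:.]+)\s+\w+:\s+"
    r"(?P<process>[\w-]+)\[\d+\]:\s+"
    r"%PLATFORM-CANB_SERVER-6-OPERATION_INFO\s*:.*CPU-ctrl reset CBC"
)


def find_cbc_resets(log_text):
    """Return (node, timestamp) pairs for every CBC reset message found."""
    hits = []
    for line in log_text.splitlines():
        m = CBC_RESET.search(line)
        if m:
            hits.append((m.group("node"), m.group("time")))
    return hits


SAMPLE_LOG = (
    "LC/0/1/CPU0:Sep 17 10:33:26.589 UTC: canb-server-lc[131]: "
    "%PLATFORM-CANB_SERVER-6-OPERATION_INFO : send_drv_cbc_reset.2163, "
    "Info - CPU-ctrl reset CBC"
)

print(find_cbc_resets(SAMPLE_LOG))
```

Correlating the returned timestamps with traffic-loss records is one way to confirm that the message and the outage coincide.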


- At that time, we noticed that this node was reporting a diag failure associated with the affected VSM card:
------
[SNIP]

A9K-VSM-500 0/1/CPU0:
 
  Overall diagnostic result: MINOR ERROR
  Diagnostic level at card bootup: bypass
 
  Test results: (. = Pass, F = Fail, U = Untested)
 
  1  ) LcEobcHeartbeat -----------------> .
  2  ) FIAScratchRegister --------------> .
  3  ) CUPOLAScratchRegister -----------> .
  4  ) NPULoopback ---------------------> .
  5  ) DPSwitchLoopback ----------------> F

[SNIP]
------
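
Because the failed test did not trigger any automatic action, an external check of the diagnostic output is one possible stopgap. The sketch below assumes the result rows keep the layout shown above (`number ) TestName ----> result`); `failed_tests` is a hypothetical helper, not a Cisco utility.

```python
import re

# Matches rows such as "  5  ) DPSwitchLoopback ----------------> F".
# Result legend from the output: . = Pass, F = Fail, U = Untested.
ROW = re.compile(r"^\s*(\d+)\s*\)\s*(\S+)\s*-+>\s*([.FU])\s*$")


def failed_tests(diag_text):
    """Return the names of all tests marked F (Fail)."""
    failed = []
    for line in diag_text.splitlines():
        m = ROW.match(line)
        if m and m.group(3) == "F":
            failed.append(m.group(2))
    return failed


SAMPLE_DIAG = """\
  1  ) LcEobcHeartbeat -----------------> .
  2  ) FIAScratchRegister --------------> .
  3  ) CUPOLAScratchRegister -----------> .
  4  ) NPULoopback ---------------------> .
  5  ) DPSwitchLoopback ----------------> F
"""

print(failed_tests(SAMPLE_DIAG))
```

Run against the output captured in this report, the helper flags only DPSwitchLoopback.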

Conditions:
- The error message "Info - CPU-ctrl reset CBC" seems to be related to the CBC heartbeat mechanism;
-- It looks like there was a heartbeat failure and, as a recovery action, the software tried to reset the CBC;
- Heartbeat messages are exchanged only when the card is up and running;
- The heartbeat does not go over the EOBC link. The packet path is as follows: Active RP (cbc_server) -> UART -> local CBC -> CAN bus -> remote CBC (LC)

- Apparently, the keepalives for both the CBC and DPSwitchLoopback failed at the same time.

- When the DPSwitchLoopback diagnostic fails, the failure is not registered with the PFM; therefore, the automatic reload is never executed;
-- Registering the failure with the PFM would eliminate the need for manual intervention during a failure and avoid the blackholing of traffic.
- Besides the PFM registration code, this DDTS should also implement commands that are run automatically before the recovery action is performed.
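
The requested behavior can be illustrated with a small model. This is only a sketch of the idea, not Cisco's PFM code: the `FaultManager` class, its methods, and the example pre-recovery steps are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FaultManager:
    """Hypothetical stand-in for the PFM (not the real implementation)."""
    actions: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def register(self, test_name, pre_recovery, recovery):
        # Associate a diagnostic test with data-collection steps and a
        # recovery action to run when that test fails.
        self.actions[test_name] = (list(pre_recovery), recovery)

    def on_diag_failure(self, test_name, location):
        entry = self.actions.get(test_name)
        if entry is None:
            # Behavior described in this bug: the failure is not
            # registered, so no recovery is triggered.
            self.log.append(f"{location}: {test_name} failed, no action")
            return False
        pre_recovery, recovery = entry
        for step in pre_recovery:
            # Collect information before the card state is lost.
            self.log.append(f"{location}: collect {step}")
        self.log.append(f"{location}: {recovery}")
        return True


pfm = FaultManager()
before = pfm.on_diag_failure("DPSwitchLoopback", "0/1/CPU0")

# Illustrative registration; the step names are placeholders.
pfm.register("DPSwitchLoopback", ["diagnostic result", "recent logs"], "reload")
after = pfm.on_diag_failure("DPSwitchLoopback", "0/1/CPU0")

print(before, after)
for entry in pfm.log:
    print(entry)
```

The first call models the current behavior (failure seen, nothing done); the second models the requested one, where collection steps run first and the reload follows.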