Cisco Bug: CSCut52652 - NCS6K MC Many Fabric Asic Errors even without Traffic
Aug 06, 2018
- Cisco Network Convergence System 6000 Series Routers
Known Affected Releases
Symptom:
This problem affects multi-chassis systems more than single-chassis ones. When the fabric is congested, SFE link flaps and cell loss are seen.

Link flaps are indicated by 'RxLostOfSyncCh' interrupts (where # stands for a valid number):
MAC_##.Interrupt_Register.MAC_##.Interrupt_Register2.RxLostOfSyncCh#

Cell loss is indicated by reassembly errors at the destination FIAs:
EGQ.Interrupt_Register.EGQ.Packet_Reassembly_Interrupt_Register

The system is otherwise relatively free of link errors (such as CRC errors, transmit errors, and decode errors).

Conditions:
When congested, fabric ASICs send link-level flow control messages to their upstream ASICs. At the upstream ASICs, these messages raise link-level halt interrupts, which are classified as non-error interrupts. The software driver logic that monitors repeated MAC errors (which indicate noise) unnecessarily counts these informational interrupts as well. When the count reaches a threshold, the perfectly healthy link is reset, causing cell loss.

Sustained congestion can occur when all slices are carrying almost line-rate traffic AND there is a significant amount of multicast traffic AND fewer than 5 planes are UP in the system.

Multi-chassis systems are more susceptible to this problem than single-chassis routers. In single-chassis routers, momentary congestion can be seen upon LC insertion or reload when the fix for CSCus77973 is not present.
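The counting behavior described under Conditions can be illustrated with a minimal Python sketch. This is not the actual driver code: the class name, the interrupt names, and the threshold value are all assumptions made for illustration, since the bug text does not give them. The sketch only shows how counting informational link-level halt interrupts toward an error threshold resets a healthy link, and how excluding them avoids the reset.

```python
from dataclasses import dataclass
from enum import Enum, auto


class InterruptKind(Enum):
    # Hypothetical classification; the real driver's names are not in the bug text.
    CRC_ERROR = auto()
    DECODE_ERROR = auto()
    RX_LOST_OF_SYNC = auto()
    LINK_LEVEL_HALT = auto()  # raised by flow control, informational only


# Interrupts that do not indicate noise on the link.
INFORMATIONAL = frozenset({InterruptKind.LINK_LEVEL_HALT})


@dataclass
class LinkErrorMonitor:
    threshold: int = 3                 # assumed value, for illustration only
    ignore_informational: bool = True  # False reproduces the faulty behavior
    error_count: int = 0
    resets: int = 0

    def on_interrupt(self, kind: InterruptKind) -> bool:
        """Count one interrupt; return True if it triggers a link reset."""
        if self.ignore_informational and kind in INFORMATIONAL:
            return False
        self.error_count += 1
        if self.error_count >= self.threshold:
            self.error_count = 0
            self.resets += 1
            return True
        return False


def simulate(ignore_informational: bool, halts: int = 10) -> int:
    """Feed only flow-control halt interrupts and return the number of resets."""
    monitor = LinkErrorMonitor(ignore_informational=ignore_informational)
    for _ in range(halts):
        monitor.on_interrupt(InterruptKind.LINK_LEVEL_HALT)
    return monitor.resets


if __name__ == "__main__":
    print("resets when halts are counted:", simulate(ignore_informational=False))
    print("resets when halts are ignored:", simulate(ignore_informational=True))
```

Under sustained congestion the halt interrupts keep arriving, so in the faulty variant the counter keeps crossing the threshold and the link is reset repeatedly even though no real MAC error occurred.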