Cisco Bug: CSCvu78495 - ACI leaf crashed due to the kernel-panic

Last Modified

Oct 13, 2020

Products (2)

  • Cisco Nexus 9000 Series Switches

Known Affected Releases

14.1(2g)

Description (partial)

The switch crashed due to a memory error in the hardware.

Symptom:
A Cisco ACI switch reloads due to a "Kernel Panic" similar to the following output:

*************** module reset reason (1) *************
0) At 2020-07-24T20:00:55.050+00:00
    Reason: kernel-panic
    Service:system crash
    Version: 14.2(3q)


show logging onboard stack-trace | egrep 'MACHINE CHECK ERROR|Kernel panic'
<0>[26788898.983262] MACHINE CHECK ERROR 
<0>[26788900.279258] Kernel panic - not syncing: FPGA watchdog
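
When the stack-trace output has been saved to a text file for offline review, the same two signatures can be picked out with a short script. The following is only a minimal sketch in Python, not a Cisco tool: the function name `find_panic_lines` is hypothetical, and the line layout (`<level>[timestamp] message`) is assumed from the sample output above.

```python
import re

# The same two signatures used in the egrep pattern above.
SIGNATURE = re.compile(r"MACHINE CHECK ERROR|Kernel panic")

# Assumed line layout, taken from the sample output: "<level>[timestamp] message".
LINE = re.compile(r"^<(\d+)>\[\s*(\d+\.\d+)\]\s*(.*)$")


def find_panic_lines(text):
    """Return (timestamp, message) pairs for lines that match either signature."""
    hits = []
    for raw in text.splitlines():
        if not SIGNATURE.search(raw):
            continue  # not one of the two signatures
        match = LINE.match(raw.strip())
        if match:
            hits.append((float(match.group(2)), match.group(3).strip()))
    return hits


sample = """\
<0>[26788898.983262] MACHINE CHECK ERROR
<0>[26788900.279258] Kernel panic - not syncing: FPGA watchdog
"""

for timestamp, message in find_panic_lines(sample):
    print(f"{timestamp:.6f}  {message}")
```

Lines that match a signature but do not follow the assumed layout are skipped rather than guessed at, so an empty result does not by itself rule out a panic.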

Conditions:
PCIe errors from PCIe devices such as the ASIC, FPGA, and NIC are common. These errors can be either correctable (soft) or fatal (hard). The PCIe Advanced Error Reporting (AER) driver handles all soft errors, and a kernel crash is forced on hard errors. Occasional PCIe soft errors are therefore expected and are not a concern.

DRAM correctable errors (CE) are also common. Most are 1-bit ECC errors, which the driver software corrects. On a fatal DRAM error, the switch is forced to reboot through a kernel panic.

Similarly, CPU machine check errors (MCE) can be either soft or hard. The kernel handles soft errors and invokes a kernel panic on a hard error.
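
The handling described in the three paragraphs above follows one pattern: a soft error is absorbed by a handler, and a hard error ends in a kernel panic. The following Python sketch only restates that pattern as a lookup. It is an illustration, not switch software, and the names `Severity`, `SOFT_HANDLER`, and `expected_outcome` are hypothetical.

```python
from enum import Enum


class Severity(Enum):
    SOFT = "correctable"
    HARD = "fatal"


# Which component absorbs a soft error, per error source, as stated in the text.
SOFT_HANDLER = {
    "pcie": "PCIe AER driver",
    "dram": "driver software",
    "cpu_mce": "kernel",
}


def expected_outcome(source, severity):
    """Return the expected handling for an error of the given source and severity."""
    if source not in SOFT_HANDLER:
        raise ValueError(f"unknown error source: {source}")
    if severity is Severity.HARD:
        # Hard errors from all three sources end in a kernel panic.
        return "kernel panic"
    return f"handled by {SOFT_HANDLER[source]}"


for source in SOFT_HANDLER:
    for severity in Severity:
        print(f"{source:8} {severity.value:12} {expected_outcome(source, severity)}")
```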