Cisco Bug: CSCvr96728 - Servers running RHEL with async drivers lose network connections during heavy load

Last Modified

May 31, 2020

Products (1)

  • Cisco Unified Computing System

Known Affected Releases

4.0(4c)C 4.0(4e)C

Description (partial)

Symptom:
In the VIC card OBFL, one of the vNICs is reported in a hung state:

200412-09:05:58.321543 mcp.hang_notify ERROR: vnic15: hang_notify
200412-09:17:43.339992 mcp.hang_notify ERROR: vnic15: hang_notify
200412-09:20:43.339380 mcp.hang_notify ERROR: vnic15: hang_notify
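
To see at a glance which vNICs are affected and how often, the hang_notify entries can be tallied. The following is a minimal sketch, assuming the OBFL output has been saved to a text file in the format shown above; the function name count_vnic_hangs and the file name are hypothetical, not part of any Cisco tool.

```shell
# Count hang_notify errors per vNIC in a saved copy of the VIC OBFL log.
# Reads the file named in $1, or standard input when no file is given.
count_vnic_hangs() {
    awk '/mcp\.hang_notify ERROR:/ {
             vnic = $4              # fourth field looks like "vnic15:"
             sub(/:$/, "", vnic)    # drop the trailing colon
             hits[vnic]++
         }
         END { for (v in hits) print v, hits[v] }' "${1:--}" | sort
}

# Usage (file name is an assumption):
#   count_vnic_hangs vic_obfl.txt
```

Each output line is a vNIC name followed by the number of hang_notify entries found for it.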

On the RHEL CLI, run the following commands and compare the output with the backtrace below:

ServerX# abrt-cli list
reason:         WARNING: CPU: 18 PID: 0 at net/sched/sch_generic.c:356 dev_watchdog+0x248/0x260

ServerX# grep -ia -A 30 'NETDEV WATCHDOG:' /var/log/messages*

Apr  9 00:34:37 ServerX systemd[1]: oracle-orachkscheduler.service failed.
Apr  9 00:34:43 ServerX kernel: [429310.499575] ------------[ cut here ]------------
Apr  9 00:34:43 ServerX kernel: [429310.499617] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:356 dev_watchdog+0x248/0x260
Apr  9 00:34:43 ServerX kernel: [429310.499649] NETDEV WATCHDOG: eth2 (enic): transmit queue 5 timed out
Apr  9 00:34:43 ServerX kernel: [429310.499672] Modules linked in: oracleacfs(POE) oracleadvm(POE) oracleoks(POE) mmfs26(OE) mmfslinux(OE) tracedev(OE) tcp_diag udp_diag inet_diag unix_diag rpcsec_gss_krb5 nfsv4 dns_resolver nfs lockd grace fscache ktap_104955(OE) ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter oracleasm(O) falcon_lsm_serviceable(PE) falcon_nf_netcontain(PE) falcon_lsm_pinned_9205(E) iTCO_wdt iTCO_vendor_support mxm_wmi sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd joydev pcspkr lpc_ich sg ipmi_ssif ipmi_si ipmi_devintf ipmi_msghandler wmi acpi_power_meter pcc_cpufreq binfmt_misc auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 dm_service_time sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm fnic crct10dif_pclmul crct10dif_common libfcoe crc32c_intel libfc megaraid_sas scsi_transport_fc enic(OE) drm_panel_orientation_quirks scsi_tgt dm_multipath dm_mirror dm_region_hash dm_log dm_mod
Apr  9 00:34:43 ServerX kernel: [429310.508868] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Tainted: P           OE  ------------   3.10.0-1062.12.1.el7.x86_64 #1
Apr  9 00:34:43 ServerX kernel: [429310.511088] Hardware name: Cisco Systems Inc UCSB-B200-M4/UCSB-B200-M4, BIOS B200M4.4.0.1d.0.1003181546 10/03/2018
Apr  9 00:34:43 ServerX kernel: [429310.512236] Call Trace:
Apr  9 00:34:43 ServerX kernel: [429310.513352]  <IRQ>  [<ffffffffae97ac43>] dump_stack+0x19/0x1b
Apr  9 00:34:43 ServerX kernel: [429310.514490]  [<ffffffffae29b958>] __warn+0xd8/0x100
Apr  9 00:34:43 ServerX kernel: [429310.515743]  [<ffffffffae29b9df>] warn_slowpath_fmt+0x5f/0x80
Apr  9 00:34:43 ServerX kernel: [429310.516958]  [<ffffffffae87cf98>] dev_watchdog+0x248/0x260
Apr  9 00:34:43 ServerX kernel: [429310.518111]  [<ffffffffae87cd50>] ? dev_deactivate_queue.constprop.27+0x60/0x60
Apr  9 00:34:43 ServerX kernel: [429310.519309]  [<ffffffffae2ac358>] call_timer_fn+0x38/0x110
Apr  9 00:34:43 ServerX kernel: [429310.520597]  [<ffffffffae87cd50>] ? dev_deactivate_queue.constprop.27+0x60/0x60
Apr  9 00:34:43 ServerX kernel: [429310.521704]  [<ffffffffae2ae7bd>] run_timer_softirq+0x24d/0x300
Apr  9 00:34:43 ServerX kernel: [429310.522790]  [<ffffffffae2a5305>] __do_softirq+0xf5/0x280
Apr  9 00:34:43 ServerX kernel: [429310.523917]  [<ffffffffae99142c>] call_softirq+0x1c/0x30
Apr  9 00:34:43 ServerX kernel: [429310.525113]  [<ffffffffae22f715>] do_softirq+0x65/0xa0
Apr  9 00:34:43 ServerX kernel: [429310.526264]  [<ffffffffae2a5685>] irq_exit+0x105/0x110
Apr  9 00:34:43 ServerX kernel: [429310.527324]  [<ffffffffae9929d8>] smp_apic_timer_interrupt+0x48/0x60
Apr  9 00:34:43 ServerX kernel: [429310.528392]  [<ffffffffae98eefa>] apic_timer_interrupt+0x16a/0x170
Apr  9 00:34:43 ServerX kernel: [429310.529467]  <EOI>  [<ffffffffae7c1a67>] ? cpuidle_enter_state+0x57/0xd0
Apr  9 00:34:43 ServerX kernel: [429310.530498]  [<ffffffffae7c1a5d>] ? cpuidle_enter_state+0x4d/0xd0
Apr  9 00:34:43 ServerX kernel: [429310.531552]  [<ffffffffae7c1bbe>] cpuidle_idle_call+0xde/0x230
Apr  9 00:34:43 ServerX kernel: [429310.532609]  [<ffffffffae237c6e>] arch_cpu_idle+0xe/0xc0
Apr  9 00:34:43 ServerX kernel: [429310.533618]  [<ffffffffae30159a>] cpu_startup_entry+0x14a/0x1e0
Apr  9 00:34:43 ServerX kernel: [429310.534521]  [<ffffffffae969b97>] rest_init+0x77/0x80
Apr  9 00:34:43 ServerX kernel: [429310.535457]  [<ffffffffaef891cb>] start_kernel+0x450/0x471
Apr  9 00:34:43 ServerX kernel: [429310.536381]  [<ffffffffaef88b7b>] ? repair_env_string+0x5c/0x5c
Apr  9 00:34:43 ServerX kernel: [429310.537289]  [<ffffffffaef88120>] ? early_idt_handler_array+0x120/0x120
Apr  9 00:34:43 ServerX kernel: [429310.538160]  [<ffffffffaef8872f>] x86_64_start_reservations+0x24/0x26
Apr  9 00:34:43 ServerX kernel: [429310.539110]  [<ffffffffaef88885>] x86_64_start_kernel+0x154/0x177
Apr  9 00:34:43 ServerX kernel: [429310.540012]  [<ffffffffae2000d5>] start_cpu+0x5/0x14
Apr  9 00:34:43 ServerX kernel: [429310.540870] ---[ end trace 871fb6de6d19fa49 ]---
Apr  9 00:34:43 ServerX kernel: [429310.571406] enic 0000:0d:00.0 eth2: vNIC resources used: wq 8 rq 8 cq 16 qp 8 intr 10 rq_desc 512 wq_desc 256 intr mode MSI-X
Apr  9 00:34:43 ServerX kernel: [429310.575852] enic 0000:0d:00.0 eth2: Link UP
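
When the grep output is long, it can help to reduce it to the interface, driver, and transmit queue named in each NETDEV WATCHDOG line. The following is a rough sketch that assumes the message format shown in the log above; the function name summarize_tx_timeouts is hypothetical.

```shell
# Summarise NETDEV WATCHDOG timeouts as "interface driver queue count".
# Reads syslog text from the file named in $1, or standard input.
summarize_tx_timeouts() {
    cat "${1:--}" |
        sed -n 's/.*NETDEV WATCHDOG: \([^ ]*\) (\([^)]*\)): transmit queue \([0-9]*\) timed out.*/\1 \2 \3/p' |
        sort | uniq -c |
        awk '{ print $2, $3, $4, $1 }'   # move the count to the last column
}

# Usage on the affected host:
#   grep -ia 'NETDEV WATCHDOG:' /var/log/messages* | summarize_tx_timeouts
```

Repeated timeouts on an interface whose driver column reads enic would be consistent with the symptom described here, though the backtrace should still be compared by hand.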

Conditions:
RHEL 7.6 and above.
enic driver 3.2.210.27 and above.
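
A host can be checked against these two conditions by comparing its RHEL release and enic driver version with the listed minimums. The following is a sketch only: the helper name version_at_least is hypothetical, it relies on the version sort (sort -V) from GNU coreutils, and the commented usage assumes the usual /etc/redhat-release format and that modinfo reports a version for the enic module.

```shell
# Return success when version $1 is the same as or newer than version $2.
# Uses "sort -V" (version sort) from GNU coreutils.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Usage on the host (commented out so the helper can be sourced safely):
#   rhel=$(sed 's/.*release \([0-9.]*\).*/\1/' /etc/redhat-release)
#   enic=$(modinfo -F version enic)
#   if version_at_least "$rhel" 7.6 && version_at_least "$enic" 3.2.210.27; then
#       echo "host matches the listed conditions"
#   fi
```

Matching both conditions only indicates that the host is in scope for this bug; it does not confirm that the bug has been hit.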