Cisco Bug: CSCvv87025 - constant OVS reprograming after Docker EE CNI upgrade from 4.1.1 to 5.0.2

Last Modified

Oct 06, 2020

Products (1)

  • Cisco Application Policy Infrastructure Controller (APIC)

Known Affected Releases

4.2(2f)

Description (partial)

Symptom:
After the upgrade, some pods started to crash and took about 1-2 hours to recover. From that point on, every newly scheduled pod took several minutes to resolve ARP.
During troubleshooting, we noticed that the OVS data plane rules did not appear during the outage.
The endpoint file is created immediately after the pod is scheduled, with its MAC address, IP address, and veth interface.
We also noticed that restarting the node again causes all pods to crash for 1-2 hours, after which they recover.
Monitoring shows opflex-agent CPU utilisation at 90-100% and memory consumption of 8 GB.
In addition, the openvswitch logs show a large number of flow_mod events, particularly on the same node (paasworkert01):
25/09/2020 08:34	aci-containers-openvswitch (aci-containers-openvswitch-4m78v)	2020-09-25T06:34:00.680Z|17633|connmgr|INFO|br-int<->unix#2: 1360 flow_mods in the 1 s starting 10 s ago (680 adds, 680 deletes)
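
To gauge how often this happens across a larger log, connmgr lines like the one above can be parsed for their flow_mod counts. The following is a minimal Python sketch, not part of the bug report. The names `FlowModBurst`, `parse_flow_mods` and `looks_like_churn`, and the threshold of 500 flow_mods per second, are hypothetical choices for illustration. The pattern only covers the exact message form shown above, and other connmgr wordings will not match.

```python
import re
from typing import NamedTuple, Optional


class FlowModBurst(NamedTuple):
    """Counts extracted from one connmgr flow_mods summary line (hypothetical helper type)."""
    connection: str   # e.g. "br-int<->unix#2"
    total: int        # total flow_mods reported
    window_s: int     # length of the reporting window in seconds
    adds: int
    deletes: int

    @property
    def rate_per_s(self) -> float:
        return self.total / self.window_s


# Assumption: matches only the message form quoted in the symptom text.
_FLOW_MODS = re.compile(
    r"\|connmgr\|INFO\|(?P<conn>\S+): (?P<total>\d+) flow_mods "
    r"in the (?P<window>\d+) s starting \d+ s ago "
    r"\((?P<adds>\d+) adds, (?P<deletes>\d+) deletes\)"
)


def parse_flow_mods(line: str) -> Optional[FlowModBurst]:
    """Return the parsed counts, or None if the line is not a matching summary."""
    m = _FLOW_MODS.search(line)
    if m is None:
        return None
    return FlowModBurst(
        connection=m.group("conn"),
        total=int(m.group("total")),
        window_s=int(m.group("window")),
        adds=int(m.group("adds")),
        deletes=int(m.group("deletes")),
    )


def looks_like_churn(burst: FlowModBurst, min_rate: float = 500.0) -> bool:
    """Heuristic (an assumption, not a Cisco rule): a high rate with equal
    adds and deletes suggests the same flows are being removed and re-added."""
    return burst.rate_per_s >= min_rate and burst.adds == burst.deletes


if __name__ == "__main__":
    sample = (
        "2020-09-25T06:34:00.680Z|17633|connmgr|INFO|br-int<->unix#2: "
        "1360 flow_mods in the 1 s starting 10 s ago (680 adds, 680 deletes)"
    )
    burst = parse_flow_mods(sample)
    print(burst, looks_like_churn(burst))
```

Equal add and delete counts at a sustained high rate would be consistent with the repeated reprogramming described in the title, although the log line alone does not identify the cause.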

Conditions:
Docker EE (UCP3.2.6) ACI CNI upgrade from 4.1.1 to 5.0.2.
APIC release: 4.2(2f)