Cisco Bug: CSCvh78443 - IO pool exhaustion by IPC while checkpointing auth states in scale 6800/IA setup
Sep 17, 2019
- Cisco Catalyst 6000 Series Switches
Known Affected Releases
Symptom:
IPC can exhaust the I/O memory pool while checkpointing the authentication state of clients on a large-scale VSS + 6800IA setup. The VSS could crash as a result. One or more of the following logs could be seen when this problem hits:

%AAA-SW1_STBY-3-ACCT_LOW_IO_MEM_TRASH: AAA unable to handle accounting requests due to insufficient I/O memory and could be trashing the queued accounting records

XDR-SW1_STBY-3-XDROOS: Received an out of sequence IPC message. Expected 594 but got 639 from slot RP (63).
-Traceback= <>

%IPC-SW1_STBY-5-WATERMARK: 57928 messages pending in rcv for the port CHKPT:STANDBY SP(3050000.22) from source seat 10000,

%CHKPT-SW1_STBY-5-HIGHBUFFER: Checkpoint client using Large No. of Buffers in domain 0are (AUTH MGR CHKPT CLIENT/40/2/3)

%SYS-SW1_STBY-2-MALLOCFAIL: Memory allocation of 4308 bytes failed from 0xD0ECA1, alignment 64
Pool: I/O Free: 776 Cause: Not enough free memory
Alternate Pool: None Free: 0 Cause: No Alternate pool
-Process= "Pool Manager", ipl= 0, pid= 11
-Traceback=

Conditions:
Scaled VSS + IA setup with FEX ports configured for authentication (dot1x, MAB, etc.) and an aggressive inactivity timer (15 seconds). The customer who reported this issue had the following configuration:

    service-template INACTIVITY_TIMER
     inactivity-timer 15
    interface <IA interface>
     switchport port-security aging time 1
     switchport port-security aging type inactivity

The aggressive inactivity timer (15 seconds) caused around 60k checkpoint messages (the sync messages between the active and standby supervisors) in a short span of time, due to the heavy churn of end-user authentication events. Messages were therefore enqueued much faster than the dequeuing process, cpf_msg_rcvq_process, could drain them. Each inactivity-timer expiry causes a session deletion and a session creation, which adds two sync messages.
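The enqueue-versus-dequeue imbalance described in the conditions can be illustrated with a rough back-of-the-envelope model. This is only a sketch under stated assumptions: the session count (3000) and the drain rate (100 messages per second) are hypothetical values chosen for illustration and do not come from the bug report. Only the 15-second timer and the two sync messages per expiry are taken from the text above. The function names are likewise invented for this example.

```python
def sync_messages_per_second(sessions: int,
                             inactivity_timer_s: float,
                             msgs_per_expiry: int = 2) -> float:
    """Worst-case checkpoint message rate.

    Assumes every session goes idle and expires once per timer period,
    and each expiry produces a delete plus a create (two sync messages).
    """
    return sessions * msgs_per_expiry / inactivity_timer_s


def backlog_after(seconds: float,
                  enqueue_rate: float,
                  dequeue_rate: float) -> float:
    """Pending messages after `seconds`, for constant rates and an empty start.

    The backlog only grows when messages arrive faster than they are drained.
    """
    return max(0.0, (enqueue_rate - dequeue_rate) * seconds)


# Hypothetical inputs (NOT from the bug report): 3000 authenticated sessions,
# standby able to drain 100 checkpoint messages per second.
enqueue = sync_messages_per_second(sessions=3000, inactivity_timer_s=15)
pending = backlog_after(seconds=60, enqueue_rate=enqueue, dequeue_rate=100.0)
print(f"enqueue rate: {enqueue:.0f} msg/s, backlog after 60 s: {pending:.0f}")
```

Under this simple model, lengthening the inactivity timer lowers the enqueue rate proportionally, which is why an aggressive 15-second value is singled out as the trigger.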