Cisco Bug: CSCvt89709 - 4.0.2b -- DR pairing code causing deadlock in stMgr

Last Modified

Jul 10, 2020

Products (1)

  • Cisco HyperFlex HX-Series

Known Affected Releases

4.0(2b)

Description (partial)

Symptom:
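In a cluster configured with a DR network, stMgr may fail to initialize and remain stuck in a deadlock while enabling the iptables rules (this can be verified from stMgr.log).

During stMgr initialization, while the DR network firewall rules are being enabled and callbacks are being handled, a recursive Thrift call that fetches the DR network configuration leads to the deadlock. In the thread dump excerpt that follows, "main" holds the StMgrImpl lock and waits for the DRPairFirewallController lock, while "Curator-PathChildrenCache-0" holds the DRPairFirewallController lock and waits for the StMgrImpl lock.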


"main":
at com.storvisor.sysmgmt.stMgr.DRPairFirewallController.peerListChanged(DRPairFirewallController.scala:276)
- waiting to lock <0x00000000f382f9e8> (a com.storvisor.sysmgmt.stMgr.DRPairFirewallController)
at com.storvisor.sysmgmt.stMgr.DRPairFirewallConfigMgr.triggerCacheChangedEvent(DRPairFirewallConfigMgr.scala:167)
at com.storvisor.sysmgmt.stMgr.StMgrImpl.enableIPTablesInt(StMgrImpl.scala:5131)
at com.storvisor.sysmgmt.stMgr.StMgrImpl.com$storvisor$sysmgmt$stMgr$StMgrImpl$$enableIPTables(StMgrImpl.scala:5107)
- locked <0x00000000f382f830> (a com.storvisor.sysmgmt.stMgr.StMgrImpl)
at com.storvisor.sysmgmt.stMgr.StMgrImpl.<init>(StMgrImpl.scala:655)
at com.storvisor.sysmgmt.stMgr.StMgrImplFactory$.<init>(Server.scala:452)
"Curator-PathChildrenCache-0":
at com.storvisor.sysmgmt.stMgr.StMgrImpl.StMgrAPIWrapper$lzycompute$1(StMgrImpl.scala:351)
- waiting to lock <0x00000000f382f830> (a com.storvisor.sysmgmt.stMgr.StMgrImpl)
at com.storvisor.sysmgmt.stMgr.StMgrImpl.StMgrAPIWrapper(StMgrImpl.scala:351)
at com.storvisor.sysmgmt.stMgr.StMgrImpl.getDrNetworkConfig(StMgrImpl.scala:15352)
at com.storvisor.sysmgmt.stMgr.DRPairFirewallController.getLocalIpPool(DRPairFirewallController.scala:270)
- locked <0x00000000f382f9e8> (a com.storvisor.sysmgmt.stMgr.DRPairFirewallController)
at com.storvisor.sysmgmt.stMgr.DRPairFirewallController.$anonfun$childEvent$1(DRPairFirewallController.scala:263)
at com.storvisor.sysmgmt.stMgr.DRPairFirewallController.childEvent(DRPairFirewallController.scala:262)
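
The two stacks above show a lock-order inversion: each thread holds one monitor and waits for the one the other thread holds. The following is a minimal Java sketch of that pattern, not the stMgr code. The class name, thread names, and lock objects are hypothetical stand-ins, and the latch exists only to make the otherwise rare interleaving happen every time.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockOrderDeadlockSketch {

    // Builds a thread that takes `first`, waits until both threads hold
    // their first lock, then tries to take `second`.
    private static Thread lockInOrder(String name, Object first, Object second,
                                      CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Never reached: the other thread holds `second`.
                }
            }
        }, name);
        t.setDaemon(true); // lets the JVM exit although the threads stay blocked
        return t;
    }

    // Returns how many of the two threads the JVM reports as monitor-deadlocked.
    static int countDeadlockedThreads() {
        // Hypothetical stand-ins for the two monitors named in the thread dump.
        Object stMgrImpl = new Object();
        Object firewallController = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        // Plays the role of "main": StMgrImpl lock first, then the controller lock.
        Thread init = lockInOrder("init-thread", stMgrImpl, firewallController, bothHoldFirst);
        // Plays the role of the Curator callback: the same locks in the opposite order.
        Thread callback = lockInOrder("callback-thread", firewallController, stMgrImpl, bothHoldFirst);
        init.start();
        callback.start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        try {
            for (int attempt = 0; attempt < 100; attempt++) {
                long[] ids = threads.findMonitorDeadlockedThreads();
                int ours = 0;
                if (ids != null) {
                    for (long id : ids) {
                        if (id == init.getId() || id == callback.getId()) {
                            ours++;
                        }
                    }
                }
                if (ours == 2) {
                    return ours;
                }
                Thread.sleep(50); // give both threads time to block
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + countDeadlockedThreads());
    }
}
```

Running `main` should report both threads as deadlocked. A thread dump of a JVM in this state (for example from `jstack`) shows the same paired "locked" and "waiting to lock" entries as the excerpt above. A common remedy for this class of bug is to acquire the locks in one consistent order, or to avoid invoking callbacks while holding a lock. Whether the actual fix takes either approach is not stated here.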

Conditions:
While upgrading to HX 4.0(2b), a cluster configured with a DR network may hit this race condition, which results in the deadlock. The race is observed very rarely.