Cisco Bug: CSCur11134 - BGP coredump on ASR9k running 5.1.2

Last Modified

Jan 14, 2017

Products (1)

  • Cisco ASR 9000 Series Aggregation Services Routers

Known Affected Releases

5.1.2.BASE

Description (partial)

$$IGNORE
From: Stephen Cobb (sfc) 
Sent: Wednesday, October 01, 2014 2:04 PM
To: Sunder Gabbita -X (sungabbi - GLOW NETWORKS INC at Cisco); Santosh Sharma (santsha2); Gerald Quinn -X (gerquinn - COMPUCOM SYSTEMS INC at Cisco); Mahendra H N (mahhn); Rayen Mohanty (ramohant)
Cc: Ashish Rastogi (ashisras); Rohith Ramannagari -X (roramann - GLOW NETWORKS INC at Cisco); viking-infra-india(mailer list); Tricia Bill (trbill); cs-asr9k(mailer list); q-os-dev(mailer list)
Subject: Re: BGP coredump on ASR9k running 5.1.2 - [CC265]
Importance: High

Yes, there should be a ddts, if there is not one already. cdets comp is infra-time, DE Mgr arunprab.

This email chain should be put into the ddts. If there are any core files available, location of those cores and the relevant workspaces should be included.

Thanks//Steve

From: "Sunder Gabbita -X (sungabbi - GLOW NETWORKS INC at Cisco)" <sungabbi@cisco.com>
Date: Wednesday, October 1, 2014 8:34 AM
To: sfc <sfc@cisco.com>, "Santosh Sharma (santsha2)" <santsha2@cisco.com>, "Gerald Quinn -X (gerquinn - COMPUCOM SYSTEMS INC at Cisco)" <gerquinn@cisco.com>, "Mahendra H N (mahhn)" <mahhn@cisco.com>, "Rayen Mohanty (ramohant)" <ramohant@cisco.com>
Cc: "Ashish Rastogi (ashisras)" <ashisras@cisco.com>, "Rohith Ramannagari -X (roramann - GLOW NETWORKS INC at Cisco)" <roramann@cisco.com>, "viking-infra-india(mailer list)" <viking-infra-india@cisco.com>, "Tricia Bill (trbill)" <trbill@cisco.com>, "cs-asr9k(mailer list)" <cs-asr9k@cisco.com>, "q-os-dev(mailer list)" <q-os-dev@cisco.com>
Subject: RE: BGP coredump on ASR9k running 5.1.2 - [CC265]

Hi Infra team,
 
Please provide your updates/comments. If a bug can be opened to address this fix, please provide the component and I can open one.
 
Thanks,
Sunder
 
From: Sunder Gabbita -X (sungabbi - GLOW NETWORKS INC at Cisco) 
Sent: Monday, September 29, 2014 11:21 AM
To: Stephen Cobb (sfc); Santosh Sharma (santsha2); Gerald Quinn -X (gerquinn - COMPUCOM SYSTEMS INC at Cisco); Mahendra H N (mahhn); Rayen Mohanty (ramohant)
Cc: Ashish Rastogi (ashisras); Rohith Ramannagari -X (roramann - GLOW NETWORKS INC at Cisco); viking-infra-india(mailer list); Tricia Bill (trbill); cs-asr9k(mailer list); q-os-dev(mailer list)
Subject: RE: BGP coredump on ASR9k running 5.1.2 - [CC265]
 
Hi Steve,
 
Thank you for your analysis and comments.
 
Hi infra/infra team,
 
As per Steve's analysis, I assume further investigation needs to happen by the reboot API owner (infra).
Please provide your feedback and further plan of action.
 
Thanks
Sunder
 
From: Stephen Cobb (sfc) 
Sent: Thursday, September 25, 2014 3:58 PM
To: Santosh Sharma (santsha2); Gerald Quinn -X (gerquinn - COMPUCOM SYSTEMS INC at Cisco); Mahendra H N (mahhn); Sunder Gabbita -X (sungabbi - GLOW NETWORKS INC at Cisco); Rayen Mohanty (ramohant)
Cc: Ashish Rastogi (ashisras); Rohith Ramannagari -X (roramann - GLOW NETWORKS INC at Cisco); viking-infra-india(mailer list); Tricia Bill (trbill); cs-asr9k(mailer list); q-os-dev(mailer list)
Subject: Re: BGP coredump on ASR9k running 5.1.2 - [CC265]
Importance: High
 
 
A couple of things - but I am not involved in XR Classic, so won't have more time for this issue.
 
The BT that Rayen posted is pretty clear - the thread calling ctime..getenv has deadlocked with itself. (Although it is not clear why the earlier getenv call was interrupted by a signal.)
The BT that Saikat posted is not clear to me, and would need a closer look at the core.
 
The ctime function is called by reboot_checkpoint_dump - that is in a signal handler. The ctime function is NOT listed in the POSIX standard as async-signal-safe, so it should not be called from a signal handler. From looking at the QNX libc code, it is 100% clear that getenv is not signal-safe.
 
So:
The owner of the reboot API - that is the infra/infra component - should remove the ctime call from the signal handler.
The owner of the reboot API should examine the core that has the BT that Saikat posted, to be certain to understand the cause of that deadlock.
If necessary, consult with QNX team for confirmation of what APIs are/are not signal safe, and to be certain that the causes of the deadlock are correctly understood - I am only looking at the BTs posted, and I think I see the problems, but due diligence is always required. Since the deadlocks originate from use of ctime in reboot API, infra/infra needs to evaluate reboot API carefully.
 
This is a link to a list of Posix async-signal-safe functions:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html
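
For illustration only, here is a minimal sketch of what a handler that avoids ctime could look like. It is not the actual reboot API code: the names format_decimal, checkpoint_signal_handler and install_checkpoint_handler are hypothetical, and the sketch assumes it is enough to log the raw epoch seconds from the handler and convert them to a readable date later, outside signal context. It relies only on time() and write(), both of which are on the POSIX async-signal-safe list.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not from the bug report): store `value` as decimal
 * digits in buf using plain arithmetic only, so no libc formatting routine
 * (and no lock) is involved. Returns the number of characters stored; no
 * NUL terminator is added. If cap is too small, only the leading digits
 * are kept. */
size_t format_decimal(unsigned long long value, char *buf, size_t cap)
{
    char tmp[20];                 /* a 64-bit value has at most 20 digits */
    size_t n = 0;

    do {
        tmp[n++] = (char)('0' + (int)(value % 10));
        value /= 10;
    } while (value != 0 && n < sizeof tmp);

    size_t len = 0;
    while (n > 0 && len < cap)
        buf[len++] = tmp[--n];    /* digits were produced in reverse order */
    return len;
}

/* Hypothetical handler: records when the signal arrived without calling
 * ctime(), localtime(), getenv() or printf(). */
void checkpoint_signal_handler(int signo)
{
    static const char prefix[] = "checkpoint: signal at epoch ";
    char digits[20];
    int saved_errno = errno;      /* write() may change errno */
    time_t now = time(NULL);      /* time() is async-signal-safe */
    size_t len = format_decimal((unsigned long long)now, digits, sizeof digits);

    (void)signo;
    /* Best-effort logging: the results of write() are deliberately ignored. */
    (void)write(STDERR_FILENO, prefix, sizeof prefix - 1);
    (void)write(STDERR_FILENO, digits, len);
    (void)write(STDERR_FILENO, "\n", 1);

    errno = saved_errno;
}

/* Hypothetical installer: registers the handler for `signo`.
 * Returns 0 on success, -1 on failure (as sigaction() does). */
int install_checkpoint_handler(int signo)
{
    struct sigaction sa;

    sa.sa_handler = checkpoint_signal_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, NULL);
}
```

A caller would invoke install_checkpoint_handler(SIGUSR1) once at startup. The human-readable timestamp can then be produced with ctime in normal (non-signal) context, where taking libc locks is safe.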
 
Thanks//Steve
 
From: "Santosh Sharma (santsha2)" <santsha2@cisco.com>
Date: Wednesday, September 24, 2014 11:48 PM
To: "Santosh Sharma (santsha2)" <santsha2@cisco.com>, "Gerald Quinn -X (gerquinn - COMPUCOM SYSTEMS INC at Cisco)" <gerquinn@cisco.com>, "Mahendra H N (mahhn)" <mahhn@cisco.com>, "Sunder Gabbita -X (sungabbi - GLOW NETWORKS INC at Cisco)" <sungabbi@cisco.com>, "Rayen Mohanty (ramohant)" <ramohant@cisco.com>
Cc: "Ashish Rastogi (ashisras)" <ashisras@cisco.com>, "Rohith Ramannagari -X (roramann - GLOW NETWORKS INC at Cisco)" <roramann@cisco.com>, "viking-infra-india(mailer list)" <viking-infra-india@cisco.com>, "Tricia Bill (trbill)" <trbill@cisco.com>, "cs-asr9k(mailer list)" <cs-asr9k@cisco.com>, "q-os-dev(mailer list)" <q-os-dev@cisco.com>
Subject: Re: BGP coredump on ASR9k running 5.1.2 - [CC265]
 
+q-os-dev
 
Thanks,
Santosh


Symptom: BGP coredump on ASR9k running 5.1.2

Conditions: Not known