Cisco Bug: CSCut60891 - The RSS of dmserver is increasing

Last Modified

Oct 31, 2018

Products (1)

  • Cisco Unified Computing System

Known Affected Releases

3.0(2b)A

Description (partial)

Symptom:
2204XP IOM, firmware 2.2(1b)
=======
The IOM crashes because of a memory leak in the thermal process.
=======

The thermal logs show mc_get_reading errors.

IOCard1_TechSupport/techsupport_detailed_iocard1/cmc/log/thermal


2016-05-05T14:33:32.076510+00:00 CMC NOCSN_thermal-5-CMC  0:slot_owner_clear:B4 no longer owns slot 3
2016-05-05T14:33:32.076608+00:00 CMC NOCSN_thermal-3-CMC  0:sg_update_readings:b4 Error during mc_get_reading,
2016-05-05T14:33:58.541572+00:00 CMC NOCSN_thermal-5-CMC  OBFL:0:sg_get_blade_profile:Using dynamic thermal profile for b4, r=1 n=12
2016-05-05T14:34:17.118721+00:00 CMC NOCSN_thermal-5-CMC  OBFL:0:sg_disconnect:b4 temperature maxima:[b4=UCSB-B22-M3,FCHxxxxxxxx[,][,][,,,,,,,,,,,]]
2016-05-05T14:34:17.118889+00:00 CMC NOCSN_thermal-5-CMC  0:slot_owner_clear:B4 no longer owns slot 3
2016-05-05T14:34:17.118986+00:00 CMC NOCSN_thermal-3-CMC  0:sg_update_readings:b4 Error during mc_get_reading,
2016-05-05T14:34:43.915809+00:00 CMC NOCSN_thermal-5-CMC  OBFL:0:sg_get_blade_profile:Using dynamic thermal profile for b4, r=1 n=12
2016-05-05T14:34:55.416844+00:00 CMC NOCSN_thermal-5-CMC  OBFL:0:sg_disconnect:b4 temperature maxima:[b4=UCSB-B22-M3,FCHxxxxxxxx[,][,][,,,,,,,,,,,]]
2016-05-05T14:34:55.417012+00:00 CMC NOCSN_thermal-5-CMC  0:slot_owner_clear:B4 no longer owns slot 3
2016-05-05T14:34:55.417109+00:00 CMC NOCSN_thermal-3-CMC  0:sg_update_readings:b4 Error during mc_get_reading,
2016-05-05T14:35:11.057220+00:00 CMC NOCSN_thermal-5-CMC  OBFL:0:sg_get_blade_profile:Using dynamic thermal profile for b4, r=1 n=12
2016-05-05T14:35:29.159557+00:00 CMC NOCSN_thermal-5-CMC  OBFL:0:sg_disconnect:b4 temperature maxima:[b4=UCSB-B22-M3,FCHxxxxxxxx[,][,][,,,,,,,,,,,]]
2016-05-05T14:35:29.159727+00:00 CMC NOCSN_thermal-5-CMC  0:slot_owner_clear:B4 no longer owns slot 3
2016-05-05T14:35:29.159824+00:00 CMC NOCSN_thermal-3-CMC  0:sg_update_readings:b4 Error during mc_get_reading,
2016-05-05T14:35:47.767503+00:00 CMC NOCSN_thermal-5-CMC  OBFL:0:sg_get_blade_profile:Using dynamic thermal profile for b4, r=1 n=12
2016-05-05T14:36:00.263879+00:00 CMC NOCSN_thermal-5-CMC  OBFL:0:sg_disconnect:b4 temperature maxima:[b4=UCSB-B22-M3,FCHxxxxxxxx[,][,][,,,,,,,,,,,]]
2016-05-05T14:36:00.264050+00:00 CMC NOCSN_thermal-5-CMC  0:slot_owner_clear:B4 no longer owns slot 3
2016-05-05T14:36:00.264148+00:00 CMC NOCSN_thermal-3-CMC  0:sg_update_readings:b4 Error during mc_get_reading,
2016-05-05T14:36:32.518553+00:00 CMC NOCSN_thermal-5-CMC  OBFL:0:sg_get_blade_profile:Using dynamic thermal profile for b4, r=1 n=12
2016-05-05T14:36:50.594339+00:00 CMC NOCSN_thermal-5-CMC  OBFL:0:sg_disconnect:b4 temperature maxima:[b4=UCSB-B22-M3,FCHxxxxxxxx[,][,][,,,,,,,,,,,]]
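
A minimal sketch of how the repeating pattern in the log excerpt above could be tallied, assuming the thermal log keeps the same syslog-style layout shown here. The regular expression and the function name `summarize_thermal_log` are illustrative assumptions, not part of any Cisco tool.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
#   <timestamp> CMC NOCSN_thermal-<severity>-CMC  [OBFL:]<n>:<function>:<message>
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+CMC\s+NOCSN_thermal-(?P<sev>\d)-CMC\s+"
    r"(?:OBFL:)?\d+:(?P<func>\w+):(?P<msg>.*)$"
)


def summarize_thermal_log(lines):
    """Count log messages per (severity, function) pair."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(int(match.group("sev")), match.group("func"))] += 1
    return counts


# Three lines copied from the excerpt above.
sample = [
    "2016-05-05T14:33:32.076510+00:00 CMC NOCSN_thermal-5-CMC  0:slot_owner_clear:B4 no longer owns slot 3",
    "2016-05-05T14:33:32.076608+00:00 CMC NOCSN_thermal-3-CMC  0:sg_update_readings:b4 Error during mc_get_reading,",
    "2016-05-05T14:33:58.541572+00:00 CMC NOCSN_thermal-5-CMC  OBFL:0:sg_get_blade_profile:Using dynamic thermal profile for b4, r=1 n=12",
]

counts = summarize_thermal_log(sample)
for (severity, function), total in sorted(counts.items()):
    print(f"severity={severity} function={function} count={total}")
```

A steadily growing count for severity 3 `sg_update_readings` entries would match the mc_get_reading error pattern described in this bug.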



===========================

Low memory is observed on the IOM; note the memory used by the thermal process.

cmc0:/obfl# cat mem_low_emerg.log
Linux cmc0 2.6.10_mvl401-832xmds #2 Wed Dec 11 03:57:25 PST 2013 ppc unknown
CMC Version: 2.2(1b)
 07:00:24 up 14 days, 12:53, load average: 2.15, 2.07, 2.03
              total         used         free       shared      buffers
  Mem:       515472       507312         8160            0            0
 Swap:            0            0            0
Total:       515472       507312         8160
MemTotal:       515472 kB
MemFree:          8160 kB
Buffers:             0 kB
Cached:          96472 kB
SwapCached:          0 kB
Active:         469888 kB
Inactive:         3568 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       515472 kB
LowFree:          8160 kB
SwapTotal:           0 kB
SwapFree:            0 kB
Dirty:               0 kB
Writeback:           0 kB
Mapped:         393340 kB
Slab:            16812 kB
CommitLimit:    257736 kB
Committed_AS:   654424 kB
PageTables:       1188 kB
VmallocTotal:   352256 kB
VmallocUsed:    245000 kB
VmallocChunk:   107168 kB
Filesystem                Size      Used Available Use% Mounted on
none                     40.0M    752.0k     39.3M   2% /var/cmc
/dev/mtdblock4           32.0M     16.3M     15.7M  51% /obfl
/dev/mtdblock6           59.0M      1.5M     57.5M   3% /ws
/dev/mtdblock5           61.0M     32.1M     28.9M  53% /flash
tmpfs                    32.0M      8.2M     23.8M  26% /dev/shm
none                      5.0M      4.0k      5.0M   0% /var/sysmgr
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1   2432   600 ?        S    Apr18   0:01 init
root         2  0.0  0.0      0     0 ?        S<   Apr18   0:02 [ksoftirqd/0]
root         3  0.0  0.0      0     0 ?        S<   Apr18   0:11 [desched/0]
root         4  0.0  0.0      0     0 ?        S<   Apr18   0:04 [events/0]
root         5  0.0  0.0      0     0 ?        S<   Apr18   0:00 [khelper]
root        10  0.0  0.0      0     0 ?        S<   Apr18   8:59 [kthread]
root       110  0.0  0.0      0     0 ?        S    Apr18   0:00 [kswapd0]
root       114  0.0  0.0      0     0 ?        S    Apr18   0:00 [pciehpd_event]
root       776  0.0  0.0      0     0 ?        S    Apr18   0:00 [mtdblockd]
root       824  0.0  0.0      0     0 ?        SN   Apr18   0:05 [jffs2_gcd_mtd4]
root       839  0.0  0.0      0     0 ?        SN   Apr18   0:00 [jffs2_gcd_mtd6]
root       953  0.0  0.0      0     0 ?        SN   Apr18   0:03 [jffs2_gcd_mtd5]
root      1010  0.0  0.1   2808   732 ?        Ss   Apr18   0:00 /usr/sbin/inetd
root      1162  0.0  0.1   2208   704 ?        Ss   Apr18  16:54 /nuova/bin/obfllogger --write -F
root      1170  0.0  0.4   8948  2380 ?        Sl   Apr18   2:08 /sbin/rsyslogd -c3 -i/var/run/rsyslogd.pid
root      1182  0.0  0.0   2432   500 ?        Ss   Apr18   0:00 /usr/sbin/telnetd
root      1293  0.1  0.1   2124   720 ?        Ss   Apr18  22:33 /nuova/bin/pmon -f /etc/pmon.conf
root      1294  0.8  0.4  15168  2068 ?        Sl   Apr18 177:54 platform_ohms -F
root      1300  2.8  0.2  17100  1368 ?        Dl   Apr18 588:35 dmserver -F
root      1327 43.3  0.3  17400  1844 ?        Sl   Apr18 9068:06 ipmiserver -F 
root      1328  5.9  0.3   6264  1688 ?        Sl   Apr18 1238:10 cmc_manager -F
root      1329  0.0  0.1   3624   996 ?        S    Apr18   7:08 cluster_manager -F
root      1330  0.0  0.1   2788   852 ?        S    Apr18   8:08 updated -F
root      1331  0.0  0.2  21508  1228 ?        Sl   Apr18   5:47 bmcd -F
root      1351  0.1 64.5 344988 332940 ?       Sl   Apr18  32:24 thermal -F << High mem usage by thermal process
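
The process listing above can be checked mechanically. The following is a sketch only, assuming the `ps aux`-style column order shown in mem_low_emerg.log; the helper names `parse_ps_aux` and `top_memory_process` are hypothetical.

```python
def parse_ps_aux(lines):
    """Yield (command, rss_kb, mem_percent) from ps aux-style lines."""
    for line in lines:
        fields = line.split(None, 10)
        # Skip the header row and anything that is not a full process row.
        if len(fields) < 11 or fields[0] == "USER":
            continue
        yield fields[10].strip(), int(fields[5]), float(fields[3])


def top_memory_process(lines):
    """Return the row with the largest resident set size (RSS)."""
    return max(parse_ps_aux(lines), key=lambda row: row[1])


# Rows copied from the listing above (trailing annotation removed).
sample = [
    "USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND",
    "root      1300  2.8  0.2  17100  1368 ?        Dl   Apr18 588:35 dmserver -F",
    "root      1327 43.3  0.3  17400  1844 ?        Sl   Apr18 9068:06 ipmiserver -F ",
    "root      1351  0.1 64.5 344988 332940 ?       Sl   Apr18  32:24 thermal -F",
]

MEM_TOTAL_KB = 515472  # MemTotal from the same log

command, rss_kb, mem_percent = top_memory_process(sample)
rss_share = 100.0 * rss_kb / MEM_TOTAL_KB
print(f"{command}: RSS {rss_kb} kB, {rss_share:.1f}% of MemTotal")
```

On this sample the largest consumer is `thermal -F`, whose 332940 kB RSS is roughly 64.6% of the 515472 kB MemTotal, in line with the 64.5 shown in the %MEM column.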


===================
Core dump:

Kernel access of bad area (sig 11)

Kernel coredump registers:
NIP: C003C970 LR: C003C9BC SP: DAB43DC0 REGS: dab43d10 TRAP: 0300    Tainted: P     
MSR: 00009032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
DAR: 00000078, DSISR: 20000000
TASK = daaaa070[1315] 'dmserver' THREAD: dab42000
Last syscall: 4 
GPR00: 00000A5C DAB43DC0 DAAAA070 C094EB70 00002911 00000002 00002911 C02102C8 
GPR08: 00000000 C094E804 00000000 C0410000 00000000 1004BA9C 1FF75000 00000000 
GPR16: 100047C4 10043A20 30428458 10040000 10040000 10040000 C0410000 C0410000 
GPR24: 460A4D4B C094EB70 C0210000 00000007 C094E7B0 0000047B 00000000 C094E7B0 
ERR_DETECT=0x0 ERR_SBE=0x10000 AER=0x4 AEATR=0x200020c AEADR=0x1855c320
CAPTHI =0x0 CAPLO=0x0 CAPECC=0x0 SRR0=0x78 SRR1=0x20000000 HID0=0xc000c000 


Conditions:
Firmware earlier than 3.1(1e) or 2.2(7b).

Low memory is caused by the thermal process.

You can check the overall free memory by connecting to the IOM and running the commands below:

#connect local-mgmt a|b << repeat for both fabric sides, a and b
#connect iom x << x is the chassis number
#show platform software cmcctrl process meminfo

fex-1# sh platform software cmcctrl process meminfo
MemTotal:       515472 kB
MemFree:        182464 kB << anything less than 14000 kB (about 14 MB) is a critical state.
Buffers:             0 kB
Cached:          94836 kB
SwapCached:          0 kB
Active:         296236 kB
Inactive:         3276 kB
HighTotal:           0 kB
HighFree:            0 kB
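
As a rough illustration of the MemFree check described above, the sketch below parses saved meminfo output and compares MemFree with the 14000 kB figure quoted in this bug. The function names and the idea of saving the CLI output to text first are assumptions for the example.

```python
CRITICAL_FREE_KB = 14000  # threshold quoted in this bug's notes


def parse_meminfo(text):
    """Map each meminfo key to its value in kB."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        # Only the first token is the number; trailing notes are ignored.
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def memfree_is_critical(text, threshold_kb=CRITICAL_FREE_KB):
    """Return True when MemFree is below the threshold."""
    return parse_meminfo(text)["MemFree"] < threshold_kb


# Values copied from the two outputs in this bug report.
healthy = "MemTotal:       515472 kB\nMemFree:        182464 kB\n"
low = "MemTotal:       515472 kB\nMemFree:          8160 kB\n"

print("healthy sample critical:", memfree_is_critical(healthy))
print("low-memory sample critical:", memfree_is_critical(low))
```

With these inputs the 182464 kB sample is not flagged, while the 8160 kB value from mem_low_emerg.log is.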