Cisco Bug: CSCvu59097 - Incorrect dbreplication status showing for Out-of-Sync issue

Sep 29, 2020

  • Cisco Unified Communications Manager (CallManager)

10.5(2.10000.5) 11.5(1.10000.6) 12.0(1.10000.10) 12.5(1.10000.22)

The CUCM dbreplication runtimestate status shows a value of 2 even though tables are out of sync between nodes.

As per Cisco documentation, a dbreplication state value of 2 means that the logical connections are established and the tables match those on the other servers in the cluster. In this case, however, there were out-of-sync tables between nodes.

Here is the detailed analysis from DBMON logs of subscriber sub1: x01gucmsub01a
09:04:13.154 |   MaintenanceTask::checkRTMT ReplicationDynamic modify query UPDATE replicationdynamic set datetimestamp = 1590541453 WHERE fkprocessnode = '0d6bdad8-f01a-b68f-f304-9836ee448b3a'
DBMon updates its own (self) datetimestamp in the replicationdynamic table.
09:04:13.159 |   MaintenanceTask::checkRTMT Below is the summary of current_time - table_time...
09:04:13.181 |   MaintenanceTask::checkRTMT Against node[] time-diff 330269
DBMon compares its own datetimestamp with the datetimestamp of every other node.
If the time-diff exceeds the repltimeout timer, as it does above, the code moves on to the next check.
09:04:13.181 |-->MaintenanceTask::checkCdrListServ 
09:04:13.182 |   MaintenanceTask::checkCdrListServ cmd: cdr list serv|grep -i local
09:04:13.332 |   MaintenanceTask::checkCdrListServ line read of cdr list serv: g_10_ccm12_5_1_11005_1   10 Active   Local           0     
The local cdr server name is fetched and checked to confirm that it is Active.
09:04:13.332 |   MaintenanceTask::checkCdrListServ localhost: g_10_ccm12_5_1_11005_1
09:04:13.332 |   MaintenanceTask::checkCdrListServ cmd: cdr list serv |awk '{if ($2 == 11) print}'
09:04:13.456 |   MaintenanceTask::checkCdrListServ line: [g_11_ccm12_5_1_11005_1   11 Active   Hi 
] readbuff1: [g_11_ccm12_5_1_11005_1] remotehost: [g_11_ccm12_5_1_11005_1]
09:04:13.456 |   MaintenanceTask::checkCdrListServ Setting 0 as cdr list serv do not find Active and Connected, remote server is bad...
Since the local node is Active and has no issues, DBMon then checks whether the remote servers in the cluster are both Active and Connected.
Here is the most important part:
If the remote host is Active and Connected, the local host is assumed to be bad and the self-node RTMT value is set to 3. In the case above, however, the cdr connectivity from sub01 to node g_11_ccm12_5_1_11005_1 shows Disconnect, so the RTMT counter value of sub01 was not changed to 3.
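
The decision path walked through in the log analysis can be sketched roughly as follows. This is a minimal illustration in Python, not the actual DBMon code: the names `CdrServer`, `parse_cdr_line`, and `next_rtmt_value` are hypothetical, the `repl_timeout` value of 300 is an assumed setting, the remote status string `Disconnect` is taken from the narrative rather than from the (truncated) log line, and the behavior when the local server is not Active is an assumption because the analysis does not describe it.

```python
from dataclasses import dataclass

GOOD = 2  # state that sub01 kept reporting in this bug
BAD = 3   # state the analysis says is set when the local host is assumed bad


@dataclass
class CdrServer:
    """One row of `cdr list serv` output (hypothetical helper type)."""
    name: str
    group_id: int
    state: str   # e.g. "Active"
    status: str  # e.g. "Local", "Connected", "Disconnect"


def parse_cdr_line(line: str) -> CdrServer:
    """Split a whitespace-separated `cdr list serv` row into its first four columns."""
    fields = line.split()
    status = fields[3] if len(fields) > 3 else ""
    return CdrServer(fields[0], int(fields[1]), fields[2], status)


def next_rtmt_value(current: int, time_diff: int, repl_timeout: int,
                    local: CdrServer, remote: CdrServer) -> int:
    """Return the RTMT replication value after one check against a remote node."""
    # Step 1: nothing further is checked unless the timestamp gap exceeds the timer.
    if time_diff <= repl_timeout:
        return current
    # Step 2: the local cdr server must be Active.
    # Assumption: if it is not, the value is left unchanged.
    if local.state != "Active":
        return current
    # Step 3: only an Active and Connected remote marks the local node as bad.
    if remote.state == "Active" and remote.status == "Connected":
        return BAD
    # Remote is not Connected: the value is left as-is, which is the
    # situation described in this bug.
    return current


local = parse_cdr_line("g_10_ccm12_5_1_11005_1   10 Active   Local           0")
remote = parse_cdr_line("g_11_ccm12_5_1_11005_1   11 Active   Disconnect")
print(next_rtmt_value(GOOD, 330269, 300, local, remote))  # prints 2
```

With the disconnected remote, the sketch leaves the value at 2 even though the time-diff of 330269 is far above the assumed timer, which mirrors the symptom reported here.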

This is a rare corner case, as the issue seems to appear when port-level connectivity exists between two subscribers.
