Cisco Bug: CSCun74286 - MCQ of VEQ block returns dummy data when the queue is not empty

Last Modified

Nov 20, 2016

Products (1)

  • Cisco Nexus 7000 Series Switches

Known Affected Releases

1

Description (partial)

Symptom:
The test sends many enqueues to reach a full-queue condition. However, before the queue is completely full, the rtl incorrectly returns dummy data in response to a dequeue even though the queue is not empty.

On 03/13/2014 the run is at
==========
/auto/nvbu-asic20/users/icozzani/voq/2014_02_21/ip_waverider/sim/linked_list/veq/runs
test_basic_random_Nenq_Ndeq_full_930432259_p0_FAIL_RTL_DUMMY

A dummy is returned on port 0 queue 5 when there are 121 entries in that queue.

@    3154.243 ns: [MSG] num_entries[port_sel=0][pri=0] = 84
@    3154.243 ns: [MSG] num_entries[port_sel=0][pri=1] = 102
@    3154.243 ns: [MSG] num_entries[port_sel=0][pri=2] = 111
@    3154.243 ns: [MSG] num_entries[port_sel=0][pri=3] = 105
@    3154.243 ns: [MSG] num_entries[port_sel=0][pri=4] = 117
@    3154.243 ns: [MSG] num_entries[port_sel=0][pri=5] = 121
@    3154.243 ns: [MSG] num_entries[port_sel=0][pri=6] = 102
@    3154.243 ns: [MSG] num_entries[port_sel=0][pri=7] = 112
@    3154.243 ns: [MSG]         Generating DEQ = 146 - DEQ pid(0) qid(6) ipg(0x1)
@    3154.243 ns: [MSG] [mcq_ref_t::process_deq_cmd] Received DEQ pid(0) qid(3)
@    3154.243 ns: [MSG] [mcq_ref_t::process_deq_cmd] Popped list entry DEQ RSP portid(0) qid(3) pkt_ptr(0x5a) pkt_len(5595) err_drop(1) is_mr(0)
@    3154.243 ns: [ERROR] [mcq_ref_t::check] Data mismatch RTL: (DEQ RSP portid(0) qid(5) DUMMY) REF EXP: (DEQ RSP portid(0) qid(5) pkt_ptr(0x3c) pkt_len(9398) err_drop(1) is_mr(0))
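
The check described above (a DUMMY response for a queue that the log reports as non-empty) can be automated when scanning long runs. The following is a minimal sketch, not part of the testbench: the function name `find_unexpected_dummies` is hypothetical, and it assumes that the `pri` index in the `num_entries` messages corresponds to the `qid` in the DEQ RSP messages, as the report implies for queue 5.

```python
import re

# Patterns taken from the log line formats quoted in this report.
NUM_ENTRIES_RE = re.compile(r"num_entries\[port_sel=(\d+)\]\[pri=(\d+)\] = (\d+)")
DUMMY_RE = re.compile(r"RTL: \(DEQ RSP portid\((\d+)\) qid\((\d+)\) DUMMY\)")


def find_unexpected_dummies(log_lines):
    """Return (port, qid, entries) for each RTL DUMMY on a non-empty queue.

    Assumption: pri in num_entries maps one-to-one to qid in DEQ RSP.
    """
    occupancy = {}   # (port, queue) -> last reported number of entries
    unexpected = []
    for line in log_lines:
        m = NUM_ENTRIES_RE.search(line)
        if m:
            port, pri, count = map(int, m.groups())
            occupancy[(port, pri)] = count
            continue
        m = DUMMY_RE.search(line)
        if m:
            port, qid = map(int, m.groups())
            count = occupancy.get((port, qid))
            if count:  # known and greater than zero
                unexpected.append((port, qid, count))
    return unexpected


sample = [
    "@    3154.243 ns: [MSG] num_entries[port_sel=0][pri=5] = 121",
    "@    3154.243 ns: [ERROR] [mcq_ref_t::check] Data mismatch RTL: "
    "(DEQ RSP portid(0) qid(5) DUMMY) REF EXP: (DEQ RSP portid(0) qid(5) "
    "pkt_ptr(0x3c) pkt_len(9398) err_drop(1) is_mr(0))",
]
print(find_unexpected_dummies(sample))
```

Queues whose occupancy was never reported are skipped rather than flagged, so the sketch only reports cases the log itself can confirm.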


On 03/13/2014 From PD:
=================
I think I know what is causing this.
Can you please rerun the test by skipping the dummy descriptor and proceeding without halting the test?
Thanks
-PD


[icozzani] On 03/14/2014
==========
I have changed the checker so that the test no longer ends when it compares a dummy from rtl with correct data from ref.
The run is at
/auto/nvbu-asic20/users/icozzani/voq/2014_02_21/ip_waverider/sim/linked_list/veq/runs/
test_basic_random_Nenq_Ndeq_full_930432259_p0_disable_dummy_check_fail

The first DUMMY is at time 3154 from p0 q5
@    3154.243 ns: [MSG] [mcq_deq_mon_t::run] Received DEQ RSP portid(0) qid(5) DUMMY

the test does not stop, just prints a warning and continues:
@    3154.243 ns: [WARN] [mcq_ref_t::check] WARN RTL DUMMY Data mismatch (DEQ RSP portid(0) qid(5) DUMMY) EXP (DEQ RSP portid(0) qid(5) pkt_ptr(0x3c) pkt_len(9398) err_drop(1) is_mr(0))

===> No more data comes from port0 queue5 after that: why does the correct data not come back?

Then there is a second DUMMY at time 3164 from port0 queue6
@    3162.247 ns: [MSG] [mcq_ref_t::process_deq_cmd] Received DEQ pid(0) qid(6)
@    3162.247 ns: [MSG] [mcq_ref_t::process_deq_cmd] Popped list entry DEQ RSP portid(0) qid(6) pkt_ptr(0x4a) pkt_len(14706) err_drop(1) is_mr(0)


@    3164.915 ns: [MSG] [mcq_deq_mon_t::run] Received DEQ RSP portid(0) qid(6) DUMMY

with the following warning printed
@    3164.915 ns: [WARN] [mcq_ref_t::check] WARN RTL DUMMY Data mismatch
(DEQ RSP portid(0) qid(6) DUMMY)
EXP (DEQ RSP portid(0) qid(6) pkt_ptr(0x4a) pkt_len(14706) err_drop(1) is_mr(0))


the test continues
@    3167.583 ns: [MSG] [mcq_ref_t::process_deq_cmd] Received DEQ pid(0) qid(6)
ref pops the next entry from p0 q6
@    3167.583 ns: [MSG] [mcq_ref_t::process_deq_cmd] Popped list entry DEQ RSP portid(0) qid(6) pkt_ptr(0x4c) pkt_len(9940) err_drop(1) is_mr(0)

the checker finally gets the correct entry from rtl for p0 q6
@    3170.251 ns: [MSG] [mcq_deq_mon_t::run] Received DEQ RSP portid(0) qid(6) pkt_ptr(0x4a) pkt_len(14706) err_drop(1) is_mr(0)

but ref had already popped the entry at ptr 4a and also the next one at ptr 4c, so the data mismatch because ref has removed the data corresponding to the dummy and cannot use it anymore.

@    3170.251 ns: [ERROR] [mcq_ref_t::check] 111 Data mismatch RTL: (DEQ RSP portid(0) qid(6) pkt_ptr(0x4a) pkt_len(14706) err_drop(1) is_mr(0)) REF EXP: (DEQ RSP portid(0) qid(6) pkt_ptr(0x4c) pkt_len(9940) err_drop(1) is_mr(0))


Further change:
==========
I have changed the checker to do PEEK instead of GET on the data in the mailbox. The idea is that a peek does not remove the data from the mailbox: every time rtl and ref match, I do a get to actually remove the data, so things behave as before, but when the rtl returns a dummy, I keep the ref data there to wait for the correct data coming from rtl.

This only works if the correct data is exactly the next one to come. Remember that after the 1st dummy from p0 q5 we do NOT get the correct data?

For this reason, my change does not work either: after the DUMMY from port0 q5, the next data coming back from RTL is not the correct one from port0 q5 but is from port0 q3, so the rtl and ref data mismatch again.
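
To make the PEEK-based behaviour concrete, here is a minimal Python model of the idea. It is a sketch under assumptions, not the code in mcq_ref.sv: the function `check_with_peek`, the tuple layout `(portid, qid, pkt_ptr)` and the `DUMMY` marker are all hypothetical, and a `deque` stands in for the mailbox.

```python
from collections import deque

DUMMY = "DUMMY"  # hypothetical marker for a dummy response


def check_with_peek(expected, rtl_responses):
    """Compare RTL responses against ref data, peeking before removing.

    Each item is a (portid, qid, pkt_ptr) tuple; pkt_ptr is DUMMY for a
    dummy response. The ref entry is removed only when it matches.
    """
    mailbox = deque(expected)
    results = []
    for rsp in rtl_responses:
        if not mailbox:
            results.append("unexpected")
            continue
        head = mailbox[0]            # peek: look without removing
        if rsp == head:
            mailbox.popleft()        # get: remove only on a match
            results.append("match")
        elif rsp[2] == DUMMY and rsp[:2] == head[:2]:
            results.append("dummy")  # keep the ref entry and wait
        else:
            results.append("mismatch")
    return results


# q6 case: the correct data is the next response after the dummy.
print(check_with_peek([(0, 6, 0x4A), (0, 6, 0x4C)],
                      [(0, 6, DUMMY), (0, 6, 0x4A)]))

# q5 case: after the dummy, the next response is from q3, not q5.
print(check_with_peek([(0, 5, 0x3C), (0, 3, 0x5A)],
                      [(0, 5, DUMMY), (0, 3, 0x5A)]))
```

In this model the q6 sequence recovers after the dummy, while the q5 sequence reports a mismatch on the q3 response because the q5 entry is still at the head of the mailbox.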

The second run (with PEEK instead of GET) is at

/auto/nvbu-asic20/users/icozzani/voq/2014_02_21/ip_waverider/sim/linked_list/veq/runs/test_basic_random_Nenq_Ndeq_full_930432259_p0_disable_dummy_peek_fail


we have the DUMMY
@    3154.243 ns: [WARN] [mcq_ref_t::check] WARN RTL DUMMY Data mismatch (DEQ RSP portid(0) qid(5) DUMMY) EXP (DEQ RSP portid(0) qid(5) pkt_ptr(0x3c) pkt_len(9398) err_drop(1) is_mr(0))

then we do not get the correct data from q5; we get data from q3:
@    3156.911 ns: [MSG] [mcq_deq_mon_t::run] Received DEQ RSP portid(0) qid(3) pkt_ptr(0x5a) pkt_len(5595) err_drop(1) is_mr(0)

ref still has the data corresponding to the dummy from p0 q5, hence the mismatch:
@    3156.911 ns: [ERROR] [mcq_ref_t::check] Data mismatch (DEQ RSP portid(0) qid(3) pkt_ptr(0x5a) pkt_len(5595) err_drop(1) is_mr(0)) EXP (DEQ RSP portid(0) qid(5) pkt_ptr(0x3c) pkt_len(9398) err_drop(1) is_mr(0))

This cannot be untangled in the checker because the data corresponding to the 1st dummy from p0 q5 does not come back from rtl; rtl sends the data from p0 q3 instead.


CVS update:
This change is in
/ip_waverider/sim/linked_list/veq/tb/mcq_ref.sv cvs version 1.5

The original ref that halts the test when rtl returns an incorrect dummy frame is cvs version 1.4


Update 03/21/2014
=============
New rtl from 
/auto/waverider-dump2/bpriyada/wvr/ws/2014_03_13/ip_waverider/rtl/ipwvr_veq/ipwvr_veq_mcq_port2x_1xdesc_ctrl.sv.vp

The issue is not fixed: flow control is asserted so enq cannot proceed, and deq has not started yet.

Tried several configurations:
8 descr/page:
test fails as before

16 descr/page, 8 queues:
passes up to 4000 enq
hangs at 4150 enq

32 descr/page, 8 queues:
passes up to 8000 enq
hangs at 8164 enq

32 descr/page, 1 queue:
passes up to 8000 enq
hangs at 8305 enq


03/28/2014
========
When a queue is empty and there is a DEQ request, rtl should send back a dummy frame, but instead it sends an empty response with all fields set to zero. As the dummy bit is not flagged, the rsp data received is compared with ref and the test fails.
This fix should go in the next rtl release.
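
Until that fix is available, a checker could tell the two cases apart before comparing against ref. The following is a rough sketch only: the `DeqRsp` class, its `dummy` field and the `classify` function are hypothetical names, the payload fields are those printed in the logs above, and it assumes that "all fields set to zero" refers to pkt_ptr, pkt_len, err_drop and is_mr.

```python
from dataclasses import dataclass


@dataclass
class DeqRsp:
    """Hypothetical model of a DEQ response as printed in the logs."""
    portid: int
    qid: int
    pkt_ptr: int = 0
    pkt_len: int = 0
    err_drop: int = 0
    is_mr: int = 0
    dummy: bool = False  # assumed name for the dummy bit


def classify(rsp):
    """Separate a flagged dummy from an all-zero response with no flag."""
    if rsp.dummy:
        return "dummy"
    payload = (rsp.pkt_ptr, rsp.pkt_len, rsp.err_drop, rsp.is_mr)
    if payload == (0, 0, 0, 0):
        # Assumption: this is the empty response described above.
        return "zeroed_without_dummy"
    return "data"


print(classify(DeqRsp(portid=0, qid=5, dummy=True)))
print(classify(DeqRsp(portid=0, qid=5)))
print(classify(DeqRsp(portid=0, qid=5, pkt_ptr=0x3C, pkt_len=9398, err_drop=1)))
```

A response classified as `zeroed_without_dummy` could then be reported separately instead of being compared with ref data as if it were a real descriptor.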

Conditions:
runll -tn test_basic_random_Nenq_Ndeq_full -uq -dump -vo "+define+AV_MEM28_GBL_BLACKOUT=1 +min_enq_port_id=0 +max_enq_port_id=0 +min_deq_port_id=0 +max_deq_port_id=0 +num_enq=1000 +min_deq_ipg=1 +max_deq_ipg=1" -seed 930432259