You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
======================================================================
FAIL: runTest (testcases.OpTestPCI.PcieLinkErrorsSkiroot)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/afs/rchland.ibm.com/projects/esw/dvt/git/apolloprep/op-test-framework/testcases/OpTestPCI.py", line 191, in runTest
self.pcie_link_errors()
File "/afs/rchland.ibm.com/projects/esw/dvt/git/apolloprep/op-test-framework/testcases/OpTestPCI.py", line 80, in pcie_link_errors
self.assertTrue( len(total_entries) == 0, "pcie link down/timeout Errors in OPAL log:\n%s" % msg)
AssertionError: pcie link down/timeout Errors in OPAL log:
[ 118.408432659,3] PHB#0030[8:0]: TRACE: Timeout waiting for link up.
Boot log (from a different boot, same failure failure mode):
# cat /sys/firmware/opal/msglog | grep 'PHB#0030'
[ 113.328283004,7] PHB#0030:00:00.0 PCI: Registered PHB
[ 113.328284700,6] PHB#0030[8:0]: Found /pciex@620c3c0000000 @0x620c3c0000000
[ 113.328286204,6] PHB#0030[8:0]: M32 [0x000620c000000000..0x000620c07fffffff]
[ 113.328287413,6] PHB#0030[8:0]: M64 [0x0006200000000000..0x0006203fffffffff]
[ 113.328289896,7] PHB#0030[8:0]: Core revision 0xa40002
[ 113.328291181,7] PHB#0030[8:0]: Found 512 max PEs and 4096 IRQs
[ 113.328292319,6] PHB#0030[8:0]: Max link speed: GEN3
[ 113.328293163,7] PHB#0030[8:0]: Override lane equalization settings:
[ 113.328294156,7] PHB#0030[8:0]: 0x5454545454545454 0x5454545454545454
[ 113.328295401,7] PHB#0030[8:0]: 0x5454545454545454 0x5454545454545454
[ 113.328296521,7] PHB#0030[8:0]: 0x7777777777777777 0x7777777777777777
[ 113.328412018,7] PHB#0030[8:0]: Initializing PHB4...
[ 113.328413302,7] PHB#0030[8:0]: Default system config: 0x4410000040400000
[ 113.328414715,7] PHB#0030[8:0]: New system config : 0x4410000030400000
[ 113.328415956,7] PHB#0030[8:0]: Initial PHB CRESET is 0xe000000000000000
[ 113.328417754,7] PHB#0030[8:0]: Waiting for DLP PG reset to complete...
[ 113.328931741,7] PHB#0030[8:0]: Reset state SRC_ID: 0ff8000000000000
[ 113.329255294,7] PHB#0030[8:0]: Initialization complete
[ 113.345615008,7] PHB#0030[8:0]: FRESET: Starts
[ 113.345810088,7] PHB#0030[8:0]: FRESET: Prepare for link down
[ 113.345824006,7] PHB#0030[8:0]: FRESET: Assert skipped
[ 113.345835344,7] PHB#0030[8:0]: FRESET: Deassert
[ 113.345852423,3] PHB#0030[8:0]: TRACE:0x0000001101000000 0ms GEN1:x16:detect
[ 116.345848015,3] PHB#0030[8:0]: TRACE: Timeout waiting for link up.
[ 116.345949079,7] PHB#0030[8:0]: LINK: Start polling
[ 117.346501203,7] PHB#0030[8:0]: LINK: No in-band presence
[ 117.350916395,7] PHB#0030:00:00.0 Link down
[ 117.351057838,7] PHB#0030:00:00.0 Scanning (upstream only)...
[ 117.351118551,7] PHB#0030:00:00.0 Found VID:1014 DEV:04c1 TYP:4 MF- BR+ EX+
[ 117.351343054,7] DT-SLOT: PHB#0030:00:00.0 Finding slot
[ 117.351360918,7] DT-SLOT: PHB#0030:00:00.0 Slot found /ibm,pcie-slots/root-complex@8,0
[ 117.413082227,5] PHB#0030:00:00.0 [ROOT] 1014 04c1 R:00 C:060400 B:00..00 SLOT=SLOT2
[ 122.326533258,7] PHB#0030[8:0]: Purging all IODA tables...
My diagnoses:
Clearing the pci-tracing nvram option is the fix.
If the trace ends due to a timeout we print the final status at PR_ERROR. The trace timeout error strings are similar to the strings printed when there's a link training failure so the test fails spuriously when pci-tracing is enabled.
For reference, the two messages printed are:
PHBERR(p, "TRACE: Timeout waiting for link up.\n");
PHBERR(p, "LINK: Timeout waiting for link up\n");
The TRACE: messages indicates that the pci-tracing ended with the link being down. This isn't really an error and op-test shouldn't flag it as such. The second is only printed if we detect an electrical link (i.e there's a card in the slot), but timeout waiting for the link finish training.
I'll bump the log level of the TRACE: message to PR_NOTICE (lvl 5), but op-test should be made a more robust against this sort of spurious failure too.
Looks like the test itself in testcases/OpTestPCI.py:322 is just scanning for the string "Timeout waiting for". This should be made more specific to the sort of errors the test is looking for.
The text was updated successfully, but these errors were encountered:
When a sufficient amount of debugging has been enabled[0] Skiboot will
produce messages of the form
PHB#0030[8:0]: TRACE: Timeout waiting for link up
This is harmless and only represents an empty slot so filter it out on
all platforms.
Fixesopen-power#491
[0] nvram -p ibm,skiboot --update-config pci-tracing=true
Signed-off-by: Samuel Mendoza-Jonas <[email protected]>
Copied from an internal bug report:
Failure message:
Boot log (from a different boot, same failure failure mode):
My diagnoses:
Looks like the test itself in
testcases/OpTestPCI.py:322
is just scanning for the string "Timeout waiting for". This should be made more specific to the sort of errors the test is looking for.The text was updated successfully, but these errors were encountered: