Score:2

HBA card seems disabled, link appears downgraded. Not sure why

I have an HBA installed in an x8 PCIe slot, but its link is showing as x4 (downgraded). In addition, the card seems to be disabled. I assume neither is normal and that this is why my setup is not working. I am trying to find the cause and get the HBA working with my JBOD enclosure. Currently the SAS cables are showing a fault, and I'm assuming the HBA is to blame.

Additional background

Note: I have 8 sticks of RAM, one per memory channel of CPU_0. The other 24 slots are empty. I mention this in case it has an effect.

Finding my HBA card

SAS 9305-16e Host Bus Adapter

root@EPY00:~# lspci | grep -i broad
c1:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 (rev 01)

Searching dmesg for my HBA card. The output reports that the card's PCIe link is limited to x4. Not sure why.

root@EPY00:~# dmesg | grep c1:00.0
[    2.337229] pci 0000:c1:00.0: [1000:00c9] type 00 class 0x010700
[    2.337241] pci 0000:c1:00.0: reg 0x10: [io  0xd000-0xd0ff]
[    2.337252] pci 0000:c1:00.0: reg 0x14: [mem 0x9c100000-0x9c10ffff 64bit]
[    2.337274] pci 0000:c1:00.0: reg 0x30: [mem 0x9c000000-0x9c0fffff pref]
[    2.337361] pci 0000:c1:00.0: supports D1 D2
[    2.337410] pci 0000:c1:00.0: 31.504 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x4 link at 0000:c0:01.1 (capable of 63.008 Gb/s with 8.0 GT/s PCIe x8 link)
[    2.780056] pci 0000:c1:00.0: Adding to iommu group 87
[    4.159479] mpt3sas 0000:c1:00.0: enabling device (0000 -> 0002)
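
The bandwidth figures in the "limited by" line can be reproduced by hand, which helps confirm what the kernel is reporting. The sketch below assumes a PCIe 3.0 link (8.0 GT/s with 128b/130b encoding) and integer arithmetic; the function name `pcie_gen3_mbps` is made up for illustration.

```shell
#!/bin/sh
# Usable bandwidth in Mb/s of a PCIe 3.0 link with the given lane count.
# Assumes 8000 MT/s per lane and 128b/130b encoding, with integer division.
pcie_gen3_mbps() {
    lanes=$1
    per_lane=$(( 8000 * 128 / 130 ))   # 7876 Mb/s per lane after encoding overhead
    echo $(( per_lane * lanes ))
}

pcie_gen3_mbps 4   # negotiated width (x4)
pcie_gen3_mbps 8   # width the card is capable of (x8)
```

This prints 31504 and 63008, matching the 31.504 Gb/s and 63.008 Gb/s in the log. The message therefore says the card negotiated half of its lanes; the 8.0 GT/s speed is the same in both figures.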

Observing the link. I see that the link is downgraded. Not sure what this means. Guessing this may be the root issue?

root@EPY00:~# lspci -vv -s 0000:c0:01.1
c0:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin ? routed to IRQ 70
        NUMA node: 1
        IOMMU group: 87
        Bus: primary=c0, secondary=c1, subordinate=c1, sec-latency=0
        I/O behind bridge: 0000d000-0000dfff [size=4K]
        Memory behind bridge: 9c000000-9c1fffff [size=2M]
        Prefetchable memory behind bridge: [disabled]
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
        BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16+ MAbort- >Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [58] Express (v2) Root Port (Slot+), MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0
                        ExtTag+ RBE+
                DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 512 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 16GT/s, Width x8, ASPM L1, Exit Latency L1 <64us
                        ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s (downgraded), Width x4 (downgraded)
                        TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
                SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
                        Slot #3, PowerLimit 75.000W; Interlock- NoCompl+
                SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
                        Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
                SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
                        Changed: MRL- PresDet- LinkState-
                RootCap: CRSVisible+
                RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible+
                RootSta: PME ReqID 0000, PMEStatus- PMEPending-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR-
                         10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt+ EETLPPrefix+, MaxEETLPPrefixes 1
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS- LN System CLS Not Supported, TPHComp- ExtTPHComp- ARIFwd+
                         AtomicOpsCap: Routing- 32bit+ 64bit+ 128bitCAS-
                DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis- LTR- OBFF Disabled, ARIFwd+
                         AtomicOpsCtl: ReqEn- EgressBlck-
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
                LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
                         EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported
        Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000fee00000  Data: 0000
        Capabilities: [c0] Subsystem: Gigabyte Technology Co., Ltd Starship/Matisse GPP Bridge
        Capabilities: [c8] HyperTransport: MSI Mapping Enable+ Fixed+
        Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [270 v1] Secondary PCI Express
                LnkCtl3: LnkEquIntrruptEn- PerformEqu-
                LaneErrStat: 0
        Capabilities: [370 v1] L1 PM Substates
                L1SubCap: PCI-PM_L1.2- PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1+ L1_PM_Substates+
                L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
                L1SubCtl2:
        Capabilities: [380 v1] Downstream Port Containment
                DpcCap: INT Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+
                DpcCtl: Trigger:0 Cmpl- INT- ErrCor- PoisonedTLP- SwTrigger- DL_ActiveErr-
                DpcSta: Trigger- Reason:00 INT- RPBusy- TriggerExt:00 RP PIO ErrPtr:1f
                Source: 0000
        Capabilities: [400 v1] Data Link Feature <?>
        Capabilities: [410 v1] Physical Layer 16.0 GT/s <?>
        Capabilities: [440 v1] Lane Margining at the Receiver <?>
        Kernel driver in use: pcieport
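
The bridge above advertises 16 GT/s, while the dmesg line for the HBA says the card itself is capable of 8.0 GT/s x8, so the "downgraded" speed is probably expected and only the x4 width looks wrong. One way to compare negotiated and maximum values per device is to read the link attributes the kernel exposes in sysfs. This is a minimal sketch: the function name `pci_link_report` and the optional second argument (an alternative sysfs root) are illustrative, not part of any tool.

```shell
#!/bin/sh
# Print negotiated vs. maximum link width and speed for one PCI device.
# $1: PCI address, e.g. 0000:c1:00.0
# $2: sysfs root (optional, defaults to /sys/bus/pci/devices)
pci_link_report() {
    dev=$1
    root=${2:-/sys/bus/pci/devices}
    d="$root/$dev"
    if [ ! -r "$d/current_link_width" ]; then
        echo "no link info for $dev" >&2
        return 1
    fi
    cur_w=$(cat "$d/current_link_width")
    max_w=$(cat "$d/max_link_width")
    cur_s=$(cat "$d/current_link_speed")
    max_s=$(cat "$d/max_link_speed")
    echo "$dev: width x$cur_w of x$max_w, speed $cur_s of $max_s"
    # Only the width is compared numerically; speeds are printed as reported.
    if [ "$cur_w" -lt "$max_w" ]; then
        echo "$dev: running below its maximum width"
    fi
}

# Example calls for this system (addresses taken from the output above):
#   pci_link_report 0000:c1:00.0   # the HBA
#   pci_link_report 0000:c0:01.1   # the root port above it
```

If the HBA reports x4 of x8, the usual suspects are the physical side: card seating, a slot that is only wired or bifurcated for x4 in the BIOS, or a riser.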

Possibly unrelated, but also observed the following.

root@EPY00:~# dmesg | grep -i pci | grep -i bar
[    2.314469] pci 0000:63:00.0: VF(n) BAR0 space: [mem 0x18090f60000-0x18090f7ffff 64bit pref] (contains BAR0 for 8 VFs)
[    2.314469] pci 0000:63:00.0: VF(n) BAR3 space: [mem 0x18090f40000-0x18090f5ffff 64bit pref] (contains BAR3 for 8 VFs)
[    2.314648] pci 0000:63:00.1: VF(n) BAR0 space: [mem 0x18090f20000-0x18090f3ffff 64bit pref] (contains BAR0 for 8 VFs)
[    2.314668] pci 0000:63:00.1: VF(n) BAR3 space: [mem 0x18090f00000-0x18090f1ffff 64bit pref] (contains BAR3 for 8 VFs)
[    2.326651] pci 0000:66:00.0: BAR 0: assigned to efifb
[    2.381614] pci 0000:00:01.1: BAR 14: assigned [mem 0xf6000000-0xf61fffff]
[    2.381616] pci 0000:00:01.1: BAR 15: assigned [mem 0x300f1000000-0x300f11fffff 64bit pref]
[    2.381617] pci 0000:00:01.2: BAR 14: assigned [mem 0xf6200000-0xf63fffff]
[    2.381619] pci 0000:00:01.2: BAR 15: assigned [mem 0x300f1200000-0x300f13fffff 64bit pref]
[    2.381622] pci 0000:00:01.1: BAR 13: assigned [io  0x1000-0x1fff]
[    2.381623] pci 0000:00:01.2: BAR 13: assigned [io  0x2000-0x2fff]
[    2.381826] pci 0000:60:03.1: BAR 15: assigned [mem 0x10091000000-0x100911fffff 64bit pref]
[    2.381828] pci 0000:60:03.2: BAR 15: assigned [mem 0x10091200000-0x100913fffff 64bit pref]
[    2.381829] pci 0000:60:03.1: BAR 13: no space for [io  size 0x1000]
[    2.381830] pci 0000:60:03.1: BAR 13: failed to assign [io  size 0x1000]
[    2.381831] pci 0000:60:03.2: BAR 13: no space for [io  size 0x1000]
[    2.381832] pci 0000:60:03.2: BAR 13: failed to assign [io  size 0x1000]
[    2.381833] pci 0000:60:03.2: BAR 13: no space for [io  size 0x1000]
[    2.381834] pci 0000:60:03.2: BAR 13: failed to assign [io  size 0x1000]
[    2.381835] pci 0000:60:03.1: BAR 13: no space for [io  size 0x1000]
[    2.381835] pci 0000:60:03.1: BAR 13: failed to assign [io  size 0x1000]
[    2.381947] pci 0000:80:01.1: BAR 14: assigned [mem 0x90000000-0x901fffff]
[    2.381949] pci 0000:80:01.1: BAR 15: assigned [mem 0x581b1000000-0x581b11fffff 64bit pref]
[    2.381949] pci 0000:80:01.2: BAR 14: assigned [mem 0x90200000-0x903fffff]
[    2.381951] pci 0000:80:01.2: BAR 15: assigned [mem 0x581b1200000-0x581b13fffff 64bit pref]
[    2.381952] pci 0000:80:01.1: BAR 13: assigned [io  0x9000-0x9fff]
[    2.381952] pci 0000:80:01.2: BAR 13: assigned [io  0xa000-0xafff]
[    2.382039] pci 0000:a0:03.1: BAR 15: assigned [mem 0x501b1000000-0x501b11fffff 64bit pref]
[    2.382040] pci 0000:a0:03.2: BAR 15: assigned [mem 0x501b1200000-0x501b13fffff 64bit pref]
[    2.382041] pci 0000:a0:03.3: BAR 14: assigned [mem 0x96000000-0x961fffff]
[    2.382042] pci 0000:a0:03.3: BAR 15: assigned [mem 0x501b1400000-0x501b15fffff 64bit pref]
[    2.382043] pci 0000:a0:03.4: BAR 14: assigned [mem 0x96200000-0x963fffff]
[    2.382044] pci 0000:a0:03.4: BAR 15: assigned [mem 0x501b1600000-0x501b17fffff 64bit pref]
[    2.382045] pci 0000:a0:03.1: BAR 13: assigned [io  0xc000-0xcfff]
[    2.382046] pci 0000:a0:03.2: BAR 13: no space for [io  size 0x1000]
[    2.382046] pci 0000:a0:03.2: BAR 13: failed to assign [io  size 0x1000]
[    2.382047] pci 0000:a0:03.3: BAR 13: no space for [io  size 0x1000]
[    2.382048] pci 0000:a0:03.3: BAR 13: failed to assign [io  size 0x1000]
[    2.382049] pci 0000:a0:03.4: BAR 13: no space for [io  size 0x1000]
[    2.382049] pci 0000:a0:03.4: BAR 13: failed to assign [io  size 0x1000]
[    2.382051] pci 0000:a0:03.4: BAR 13: assigned [io  0xc000-0xcfff]
[    2.382052] pci 0000:a0:03.3: BAR 13: no space for [io  size 0x1000]
[    2.382053] pci 0000:a0:03.3: BAR 13: failed to assign [io  size 0x1000]
[    2.382054] pci 0000:a0:03.2: BAR 13: no space for [io  size 0x1000]
[    2.382054] pci 0000:a0:03.2: BAR 13: failed to assign [io  size 0x1000]
[    2.382055] pci 0000:a0:03.1: BAR 13: no space for [io  size 0x1000]
[    2.382056] pci 0000:a0:03.1: BAR 13: failed to assign [io  size 0x1000]
[    2.382218] pci 0000:e0:03.1: BAR 14: assigned [mem 0xa0000000-0xa01fffff]
[    2.382219] pci 0000:e0:03.1: BAR 15: assigned [mem 0x40151000000-0x401511fffff 64bit pref]
[    2.382220] pci 0000:e0:03.2: BAR 14: assigned [mem 0xa0200000-0xa03fffff]
[    2.382222] pci 0000:e0:03.2: BAR 15: assigned [mem 0x40151200000-0x401513fffff 64bit pref]
[    2.382222] pci 0000:e0:03.1: BAR 13: assigned [io  0xe000-0xefff]
[    2.382223] pci 0000:e0:03.2: BAR 13: no space for [io  size 0x1000]
[    2.382224] pci 0000:e0:03.2: BAR 13: failed to assign [io  size 0x1000]
[    2.382225] pci 0000:e0:03.2: BAR 13: assigned [io  0xe000-0xefff]
[    2.382226] pci 0000:e0:03.1: BAR 13: no space for [io  size 0x1000]
[    2.382227] pci 0000:e0:03.1: BAR 13: failed to assign [io  size 0x1000]

Bonus command on driver

root@EPY00:~# modinfo mpt3sas                  
filename:       /lib/modules/5.10.0-9-amd64/kernel/drivers/scsi/mpt3sas/mpt3sas.ko
alias:          mpt2sas
version:        35.100.00.00
license:        GPL
description:    LSI MPT Fusion SAS 3.0 Device Driver
author:         Avago Technologies <[email protected]>
srcversion:     2D6BBDB9CE0F1B2FA0B159D
alias:          pci:v00001000d000000E7sv*sd*bc*sc*i*
alias:          pci:v00001000d000000E4sv*sd*bc*sc*i*
alias:          pci:v00001000d000000E6sv*sd*bc*sc*i*
alias:          pci:v00001000d000000E5sv*sd*bc*sc*i*
alias:          pci:v00001000d000000B2sv*sd*bc*sc*i*
alias:          pci:v00001000d000000E3sv*sd*bc*sc*i*
alias:          pci:v00001000d000000E0sv*sd*bc*sc*i*
alias:          pci:v00001000d000000E2sv*sd*bc*sc*i*
alias:          pci:v00001000d000000E1sv*sd*bc*sc*i*
alias:          pci:v00001000d000000D1sv*sd*bc*sc*i*
alias:          pci:v00001000d000000ACsv*sd*bc*sc*i*
alias:          pci:v00001000d000000ABsv*sd*bc*sc*i*
alias:          pci:v00001000d000000AAsv*sd*bc*sc*i*
alias:          pci:v00001000d000000AFsv*sd*bc*sc*i*
alias:          pci:v00001000d000000AEsv*sd*bc*sc*i*
alias:          pci:v00001000d000000ADsv*sd*bc*sc*i*
alias:          pci:v00001000d000000C3sv*sd*bc*sc*i*
alias:          pci:v00001000d000000C2sv*sd*bc*sc*i*
alias:          pci:v00001000d000000C1sv*sd*bc*sc*i*
alias:          pci:v00001000d000000C0sv*sd*bc*sc*i*
alias:          pci:v00001000d000000C8sv*sd*bc*sc*i*
alias:          pci:v00001000d000000C7sv*sd*bc*sc*i*
alias:          pci:v00001000d000000C6sv*sd*bc*sc*i*
alias:          pci:v00001000d000000C5sv*sd*bc*sc*i*
alias:          pci:v00001000d000000C4sv*sd*bc*sc*i*
alias:          pci:v00001000d000000C9sv*sd*bc*sc*i*
alias:          pci:v00001000d00000095sv*sd*bc*sc*i*
alias:          pci:v00001000d00000094sv*sd*bc*sc*i*
alias:          pci:v00001000d00000091sv*sd*bc*sc*i*
alias:          pci:v00001000d00000090sv*sd*bc*sc*i*
alias:          pci:v00001000d00000097sv*sd*bc*sc*i*
alias:          pci:v00001000d00000096sv*sd*bc*sc*i*
alias:          pci:v00001000d0000007Esv*sd*bc*sc*i*
alias:          pci:v00001000d000002B1sv*sd*bc*sc*i*
alias:          pci:v00001000d000002B0sv*sd*bc*sc*i*
alias:          pci:v00001000d0000006Esv*sd*bc*sc*i*
alias:          pci:v00001000d00000087sv*sd*bc*sc*i*
alias:          pci:v00001000d00000086sv*sd*bc*sc*i*
alias:          pci:v00001000d00000085sv*sd*bc*sc*i*
alias:          pci:v00001000d00000084sv*sd*bc*sc*i*
alias:          pci:v00001000d00000083sv*sd*bc*sc*i*
alias:          pci:v00001000d00000082sv*sd*bc*sc*i*
alias:          pci:v00001000d00000081sv*sd*bc*sc*i*
alias:          pci:v00001000d00000080sv*sd*bc*sc*i*
alias:          pci:v00001000d00000065sv*sd*bc*sc*i*
alias:          pci:v00001000d00000064sv*sd*bc*sc*i*
alias:          pci:v00001000d00000077sv*sd*bc*sc*i*
alias:          pci:v00001000d00000076sv*sd*bc*sc*i*
alias:          pci:v00001000d00000074sv*sd*bc*sc*i*
alias:          pci:v00001000d00000072sv*sd*bc*sc*i*
alias:          pci:v00001000d00000070sv*sd*bc*sc*i*
depends:        scsi_mod,scsi_transport_sas,raid_class
retpoline:      Y
intree:         Y
name:           mpt3sas
vermagic:       5.10.0-9-amd64 SMP mod_unload modversions 
sig_id:         PKCS#7
signer:         Debian Secure Boot CA
sig_key:        4B:6E:F5:AB:CA:66:98:25:17:8E:05:2C:84:66:7C:CB:C0:53:1F:8C
sig_hashalgo:   sha256
signature:      96:D9:EB:25:37:10:96:E1:BD:55:F1:66:9C:87:2A:C1:E8:B1:9A:A1:
                28:42:A8:DD:EF:25:B8:DF:BA:1D:B2:FC:E5:45:42:6D:DC:2B:77:02:
                6A:55:29:F0:08:04:3E:A2:42:53:1E:F8:F0:EF:07:4F:D0:F4:74:93:
                35:3E:E3:1E:AC:01:25:0F:87:4D:94:71:B1:6D:1C:4B:10:EF:C3:6E:
                BA:B5:58:37:19:CC:35:99:CB:1C:00:35:60:4A:39:CA:8E:53:99:40:
                3C:03:FE:4A:FE:44:2E:72:F6:F3:62:FC:89:CA:4A:88:C3:83:A6:D2:
                66:56:47:FA:FC:47:1D:F7:E1:FB:2D:A9:DD:E2:E2:B8:BC:19:A7:64:
                51:99:36:FD:53:6A:40:5B:75:A3:03:57:4E:6C:03:62:D1:BC:68:31:
                E2:52:71:75:69:92:E4:72:BB:21:7E:F5:D3:E4:27:1C:95:25:36:00:
                8E:63:02:CB:D3:4E:9B:03:D2:A7:A0:BD:43:93:3C:32:E0:F1:8D:E9:
                EA:D0:6B:56:1B:C6:61:43:97:4B:EB:57:B7:1D:FB:EA:4B:5F:DA:1E:
                A1:9F:9E:E3:C8:7A:6F:4A:A5:82:7C:51:05:78:4E:25:BF:74:4E:A6:
                FC:86:1C:CD:52:37:D5:9E:83:41:C9:0F:1A:5D:1C:EB
parm:           logging_level: bits for enabling additional logging info (default=0)
parm:           max_sectors:max sectors, range 64 to 32767  default=32767 (ushort)
parm:           missing_delay: device missing delay , io missing delay (array of int)
parm:           max_lun: max lun, default=16895  (ullong)
parm:           hbas_to_enumerate: 0 - enumerates both SAS 2.0 & SAS 3.0 generation HBAs
                  1 - enumerates only SAS 2.0 generation HBAs
                  2 - enumerates only SAS 3.0 generation HBAs (default=0) (ushort)
parm:           diag_buffer_enable: post diag buffers (TRACE=1/SNAPSHOT=2/EXTENDED=4/default=0) (int)
parm:           disable_discovery: disable discovery  (int)
parm:           prot_mask: host protection capabilities mask, def=7  (int)
parm:           enable_sdev_max_qd:Enable sdev max qd as can_queue, def=disabled(0) (bool)
parm:           max_queue_depth: max controller queue depth  (int)
parm:           max_sgl_entries: max sg entries  (int)
parm:           msix_disable: disable msix routed interrupts (default=0) (int)
parm:           smp_affinity_enable:SMP affinity feature enable/disable Default: enable(1) (int)
parm:           max_msix_vectors: max msix vectors (int)
parm:           irqpoll_weight:irq poll weight (default= one fourth of HBA queue depth) (int)
parm:           mpt3sas_fwfault_debug: enable detection of firmware fault and halt firmware - (default=0)
parm:           perf_mode:Performance mode (only for Aero/Sea Generation), options:
                0 - balanced: high iops mode is enabled &
                interrupt coalescing is enabled only on high iops queues,
                1 - iops: high iops mode is disabled &
                interrupt coalescing is enabled on all queues,
                2 - latency: high iops mode is disabled &
                interrupt coalescing is enabled on all queues with timeout value 0xA,
                default - default perf_mode is 'balanced' (int)
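
Filtering dmesg by PCI address alone, as done earlier, only catches the lines the kernel tags with that address. The driver typically logs its own initialization under a name such as `mpt3sas_cm0`, so those messages are missed. A small sketch that keeps both kinds of lines follows; the helper name `hba_log_lines` is made up, and the default address is the one from this system.

```shell
#!/bin/sh
# Keep kernel log lines that mention the mpt3sas driver or the HBA's PCI address.
# Reads log text on stdin. $1: PCI address (defaults to 0000:c1:00.0).
hba_log_lines() {
    addr=${1:-0000:c1:00.0}
    # -F treats the patterns as fixed strings, so the dots in the address are literal.
    grep -i -F -e 'mpt3sas' -e "$addr"
}

# Typical use (reading dmesg may require root):
#   dmesg | hba_log_lines 0000:c1:00.0
```

Firmware faults or port enable failures, if any, would be expected to show up in these driver lines rather than in the PCI enumeration lines.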
5.10.0-9 smells like Debian, not Ubuntu. Note that Ubuntu backports several modules, including the Broadcom/LSI SAS drivers. However, none of that is very helpful for you here; your HBA controller is an older generation and should just work. Have you tried installing the StorCLI tool and listing the controller status for clues? Also, I've seen PCIe bus instabilities with similar symptoms, which turned out to be caused by non-genuine cards bought on eBay...
Don't scare me like that :( I bought the card on eBay. Is there anything I can check to see if it's non-genuine? Looking into checking with StorCLI now.
Oof, red flag for your purchase channel. I don't have specific advice on where to look; these eBay sellers try different approaches every time. Google `counterfeit lsi card ebay` and look for clues. It appears you can also ask LSI/Broadcom support to verify the serial number. To me it just smells like one of those vague PCIe instabilities caused by poor soldering or cheap board components that occur with these counterfeit cards.