I just came across this question and its excellent answers ("How to check how many lanes are used by the PCIe card?"). I am looking at the output of lspci -vv for a GTX 1050 Ti graphics card and I am not entirely sure that I am interpreting it correctly. I would expect the card to use all 16 lanes of an x16 PCIe 3.0 slot at PCIe 3.0 speed. Both the card and the mainboard should (allegedly) support it. CUDA performance is a lot lower than expected, so I am trying to locate the bottleneck. The (hopefully) relevant sections of the output of lspci -vv:

01:00.0 VGA compatible controller: NVIDIA Corporation GP107 [GeForce GTX 1050 Ti] (rev a1) (prog-if 00 [VGA controller])
[...]
        Capabilities: [78] Express (v2) Legacy Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <512ns, L1 <16us
                        ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR+, OBFF Via message
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
[...]

The sections LnkCap: Port #0, Speed 5GT/s, Width x16 [...] and LnkSta: Speed 2.5GT/s, Width x16 [...] as well as the phrase Express (v2) Legacy Endpoint make it appear as if this is a connection running on all 16 lanes at PCIe 2.0 speeds because some component is only PCIe-2.0-capable ... Am I correct in this assumption? If not, how should I interpret this output?

EDIT: For what it is worth, this is a PCIe-2-capable connection running at PCIe-1 speed.
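The comparison behind that conclusion (LnkCap is what the device can do, LnkSta is what the link is doing right now) can be automated. The following is only a minimal sketch: the function name link_summary and the generation table are illustrative, and the sample text is the two relevant lines from the output above.

```python
import re

# Maps the per-lane signalling rate printed by lspci to a PCIe generation.
GEN_BY_SPEED = {"2.5GT/s": 1, "5GT/s": 2, "8GT/s": 3, "16GT/s": 4}

# Matches "LnkCap:" or "LnkSta:" lines only; the colon excludes LnkSta2/LnkCtl2.
LINK_RE = re.compile(r"(LnkCap|LnkSta):.*?Speed (\S+?GT/s).*?Width x(\d+)")

def link_summary(lspci_text):
    """Return {'LnkCap': (speed, width), 'LnkSta': (speed, width)} for one device."""
    found = {}
    for line in lspci_text.splitlines():
        m = LINK_RE.search(line)
        if m and m.group(1) not in found:
            found[m.group(1)] = (m.group(2), int(m.group(3)))
    return found

sample = """\
        LnkCap: Port #0, Speed 5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <512ns, L1 <16us
        LnkSta: Speed 2.5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
"""

summary = link_summary(sample)
for name, (speed, width) in summary.items():
    print(f"{name}: gen{GEN_BY_SPEED[speed]} ({speed}) x{width}")
# LnkCap: gen2 (5GT/s) x16
# LnkSta: gen1 (2.5GT/s) x16
```

Feeding it the text for a single device (for example from lspci -vv -s 01:00.0) keeps the first LnkCap and LnkSta lines it sees, so it is not meant for the output of several devices at once.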

s-m-e
  • Yes, it's running 16 lanes at gen1 speeds, while the link on the card is capable of gen2 speeds. As to why, the first place to look is the capabilities of the bridge this card is behind (use lspci -t). "Legacy Endpoint" refers to a device that can claim I/O resources for BARs and other things a "pure" endpoint can't, and VGA legacy support needs that. This doesn't say anything about speed. – dirkt Jan 19 '19 at 13:47
  • @dirkt Thanks for your help. Yes, the PCI bridge also reports LnkCap: Port #2, Speed 5GT/s, Width x16. So I guess at least the board is limited to PCIe-2. The question then changes to why is it dropping to 2.5GT/s ... – s-m-e Jan 19 '19 at 14:10
  • Yes, that's indeed the question. Possibly digging through the PCIe specification about downgrading links might bring up something. – dirkt Jan 19 '19 at 14:27
  • Possibly relevant question about link speed negotiation. – dirkt Jan 19 '19 at 14:37

1 Answer

Please put some graphical load on the card and run lspci -vv at the same time.
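One way to watch the link while the load is running, without re-running lspci by hand, is to poll the link attributes that Linux exposes in sysfs. This is only a sketch under assumptions: the helper names read_link and watch_link are made up for illustration, the device address 0000:01:00.0 assumes PCI domain 0000 for the 01:00.0 card from the question, and the exact text of the values (for example "2.5 GT/s" versus "2.5 GT/s PCIe") depends on the kernel version.

```python
import time
from pathlib import Path

LINK_ATTRS = ("current_link_speed", "max_link_speed",
              "current_link_width", "max_link_width")

def read_link(device, sysfs_root="/sys/bus/pci/devices"):
    """Read current and maximum link speed/width for one PCI device from sysfs."""
    dev = Path(sysfs_root) / device
    return {name: (dev / name).read_text().strip() for name in LINK_ATTRS}

def watch_link(device, samples=10, interval=1.0, sysfs_root="/sys/bus/pci/devices"):
    """Print the link state once per interval, e.g. while a GPU benchmark runs."""
    for _ in range(samples):
        state = read_link(device, sysfs_root)
        print(f"{state['current_link_speed']} x{state['current_link_width']} "
              f"(max {state['max_link_speed']} x{state['max_link_width']})")
        time.sleep(interval)

# Example call (assumed address, see above):
# watch_link("0000:01:00.0", samples=30, interval=0.5)
```

If the printed speed rises once the load starts and falls back when it stops, the low idle speed is power management rather than a limit of the slot.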

To me it looks like this fairly modern card goes into a power-saving mode and switches to a lower link speed (LnkSta: Speed 2.5GT/s, Width x16) because of LnkCtl: ASPM L0s L1 Enabled.

Still, you could check the BIOS for a PCIe slot generation setting, and you could also try a different slot.

As an example, here is the status of a working PCIe Gen 3 card with a downgraded link:

05:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 710] (rev a1) (prog-if 00 [VGA controller])
...
    Capabilities: [78] Express (v2) Legacy Endpoint, MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
            RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
        LnkCap: Port #2, Speed 8GT/s, Width x8, ASPM L0s L1, Exit Latency L0s <512ns, L1 <4us
            ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s (downgraded), Width x4 (downgraded)
            TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range AB, TimeoutDis+, NROPrPrP-, LTR-
             10BitTagComp-, 10BitTagReq-, OBFF Not Supported, ExtFmt-, EETLPPrefix-
             EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
             FRS-
             AtomicOpsCap: 32bit- 64bit- 128bitCAS-
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
             AtomicOpsCtl: ReqEn-
        LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
             EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
...
        Status: InProgress-
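For a sense of scale, the theoretical bandwidth of the links quoted in this thread can be estimated from the signalling rate, the line encoding, and the lane count. This is a rough sketch only: the function name link_bandwidth_gb_s is illustrative, and the figures are per direction and ignore packet and protocol overhead, so real throughput is lower.

```python
# Line-encoding efficiency for each per-lane signalling rate (GT/s).
ENCODING = {
    2.5: 8 / 10,     # PCIe 1.x uses 8b/10b
    5.0: 8 / 10,     # PCIe 2.x uses 8b/10b
    8.0: 128 / 130,  # PCIe 3.x uses 128b/130b
}

def link_bandwidth_gb_s(speed_gt_s, width):
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    bits_per_second = speed_gt_s * 1e9 * ENCODING[speed_gt_s] * width
    return bits_per_second / 8 / 1e9

for label, speed, width in [
    ("GTX 1050 Ti LnkSta", 2.5, 16),
    ("GTX 1050 Ti LnkCap", 5.0, 16),
    ("GT 710 LnkSta", 2.5, 4),
    ("GT 710 LnkCap", 8.0, 8),
]:
    print(f"{label}: {link_bandwidth_gb_s(speed, width):.2f} GB/s")
```

By this estimate the GTX 1050 Ti link is at about 4 GB/s of a possible 8 GB/s, and the GT 710 link at about 1 GB/s of a possible 7.9 GB/s.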

Arunas Bart