
I'm trying to monitor AMD GPUs on a system running AMDGPU-PRO 18.10 and Linux kernel 4.4.0.

I am reading values from:

/sys/kernel/debug/dri/$X/amdgpu_pm_info

where $X is a card index.

I am also reading the pp_dpm_mclk values from another directory, found under

/sys/class/drm/card$X/

I have 2 questions about this.

Does $X in both these cases refer to the same card? E.g. is /sys/class/drm/card0/device/pp_dpm_mclk returning information about the same card as /sys/kernel/debug/dri/0/amdgpu_pm_info?

Will this be true every boot/if I add or remove cards?

Finally, should I be using /sys/devices/pci0000:00 to access pp_dpm_mclk rather than the symlinks in /sys/class/drm? If so, how can I find out which card in /sys/devices/pci0000:00 corresponds to the cards in /sys/kernel/debug/dri ?

Thanks

keda

1 Answer


For the first question, the answer is yes.
/sys/kernel/debug/dri/0 corresponds to the card at /sys/class/drm/card0, /sys/kernel/debug/dri/1 to card1, and so on.
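
If you want to double-check this pairing on your own machine, one option is to read the PCI address that debugfs reports for each index. The following is only a sketch: it assumes each /sys/kernel/debug/dri/$X directory contains a `name` file with a `dev=<PCI address>` field, which I have not confirmed for kernel 4.4.0, and the function names are made up for illustration. Reading debugfs normally requires root.

```python
import re

# Assumption: the debugfs "name" file looks roughly like
#   "amdgpu dev=0000:65:00.0 unique=0000:65:00.0"
# The exact layout may differ between kernel versions.
DEV_RE = re.compile(r"\bdev=(\S+)")


def pci_from_debugfs_name(text):
    """Return the value of the dev= field, or None if it is absent."""
    match = DEV_RE.search(text)
    return match.group(1) if match else None


def debugfs_pci_address(index, debug_root="/sys/kernel/debug/dri"):
    """Read <debug_root>/<index>/name and return the PCI address it reports."""
    with open(f"{debug_root}/{index}/name") as handle:
        return pci_from_debugfs_name(handle.read())
```

Calling `debugfs_pci_address(0)` and comparing the result with the target of /sys/class/drm/card0/device should show whether both indexes point at the same PCI device.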

Will this be true every boot/if I add or remove cards?

Take my own machine as an example: I have 3 PCIe x16 slots on my motherboard. This is the order in which they are physically laid out on the board.

  PCIEx16 [================] bus 0000:65:00.0 First slot
  PCIEx16 [================] bus 0000:17:00.0 Second slot
  PCIEx16 [================] bus 0000:15:00.0 Third slot

If you have a single video card plugged into bus 65, it will be card0. But if you add a second video card on bus 17, all the cards in /sys/class/drm/card$X are renumbered.

card0 will be bus 17, and card1 bus 65.
The same happens with one more card on bus 15:
card0 is bus 15, card1 is bus 17, card2 is bus 65.

So the card number depends on which PCIe slots the video cards are plugged into and on how many video cards are currently installed on the motherboard. In this example the cards end up numbered in ascending order of bus address.
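
Because the card index can shift like this, a monitoring script may prefer to identify each GPU by its PCI address rather than by its index. A minimal sketch, assuming the usual sysfs layout where /sys/class/drm/card$X/device resolves to a directory named after the PCI address (the function name `map_cards_to_pci` is hypothetical):

```python
import os
import re

# Match "card0", "card1", ... but not connector entries such as "card0-DP-1".
CARD_RE = re.compile(r"^card(\d+)$")


def map_cards_to_pci(drm_root="/sys/class/drm"):
    """Return {card index: PCI address}, or {} if drm_root does not exist."""
    mapping = {}
    if not os.path.isdir(drm_root):
        return mapping
    for entry in os.listdir(drm_root):
        match = CARD_RE.match(entry)
        if not match:
            continue
        device_link = os.path.join(drm_root, entry, "device")
        # The last component of the resolved path is the PCI address.
        pci_address = os.path.basename(os.path.realpath(device_link))
        mapping[int(match.group(1))] = pci_address
    return mapping


if __name__ == "__main__":
    for index, pci in sorted(map_cards_to_pci().items()):
        print(f"card{index} -> {pci}")
```

Running this once per boot and keying your measurements on the PCI address keeps the data consistent even if a card is added or removed and the indexes shift.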

Finally, should I be using /sys/devices/pci0000:00 to access pp_dpm_mclk rather than the symlinks in /sys/class/drm? If so, how can I find out which card in /sys/devices/pci0000:00 corresponds to the cards in /sys/kernel/debug/dri ?

/sys/class/drm/card0/device is a symlink to /sys/devices/pci0000:00/0000:00:$PCI.0/subsystem/devices/0000:$PCI:00.0

Both paths lead to the same directory, so you can use either one.
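
Since the two paths are equivalent, the same file can also be reached by PCI address, which does not change when cards are renumbered. The sketch below assumes that pp_dpm_mclk lists one level per line in the form `0: 300Mhz` with a trailing `*` on the active level, and that /sys/bus/pci/devices/<address> points at the same device directory; the function names are illustrative only.

```python
import os
import re

# Assumed line format: "<level>: <clock>Mhz" with an optional "*" marking the active level.
LEVEL_RE = re.compile(r"^(\d+):\s*(\d+)\s*Mhz\s*(\*?)\s*$", re.IGNORECASE)


def parse_dpm_levels(text):
    """Parse pp_dpm_* contents into a list of (level, mhz, is_active) tuples."""
    levels = []
    for line in text.splitlines():
        match = LEVEL_RE.match(line.strip())
        if match:
            levels.append((int(match.group(1)), int(match.group(2)), match.group(3) == "*"))
    return levels


def read_mclk_levels(pci_address, pci_root="/sys/bus/pci/devices"):
    """Read pp_dpm_mclk for the device at the given PCI address."""
    path = os.path.join(pci_root, pci_address, "pp_dpm_mclk")
    with open(path) as handle:
        return parse_dpm_levels(handle.read())
```

For the first slot in the example above, this would be called as `read_mclk_levels("0000:65:00.0")`.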

Rui F Ribeiro