Here are some answers to your questions:
Is there somewhere that lists the options and how to set them and what they do?
(Short answer, so I'm placing this first)
modinfo amdgpu
Look for parm:
in the output. These are all the available parameters for this kernel module. The Linux kernel documentation also has some good information about them.
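To see only the parameters without the rest of the module metadata, you can filter the output. A minimal sketch; `list_parms` is a hypothetical helper name, not part of any tool:

```shell
# list_parms
# From modinfo-style output on stdin, print only the parameter entries,
# with the leading "parm:" label and its padding removed.
list_parms() {
    sed -n 's/^parm:[[:space:]]*//p'
}

# Usage (the amdgpu module must be installed for the running kernel):
#   modinfo amdgpu | list_parms
```

`modinfo -p amdgpu` prints a similar list directly.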
How do you set amdgpu options?
(Longer answer, because there are many ways)
As we saw above, amdgpu
is the name of the open-source AMD graphics driver that lives in the Linux kernel source tree. It is included with Ubuntu's stock kernel.
Kernel modules (a.k.a. drivers) have parameters which can be set in multiple ways:
- Set via Grub Kernel command-line
- There are two ways to do this depending on whether you want the options to persist across reboots or not.
- Temporary method via GRUB command line
- Start your system and wait for the GRUB menu to show (if you don't see a GRUB menu, press and hold the left
Shift
key right after starting the system; on many UEFI systems, pressing Esc is what brings up the menu instead). Whether the menu shows on its own depends on how GRUB is configured, so some systems skip this screen and others do not.
- At the GRUB kernel selection screen, highlight the kernel version entry you want to use.
- Press
e
to edit that kernel command line.
- The line you want to find looks like this:
linux /boot/vmlinuz-6.2.0-20-generic root=UUID=1234567-ABCD ro quiet splash
- Add your kernel options and kernel module options at the end of this line.
- Kernel-level parameters can be passed directly (e.g.
acpi=off
, nomodeset
, etc...)
- Kernel module-level parameters can be passed using the
modulename.param=value
syntax (e.g. amdgpu.dpm=0
, amdgpu.aspm=0
, etc...)
- Persistent method via generated GRUB config command line
- Edit the
/etc/default/grub
file as root
(e.g.: sudo vi /etc/default/grub
, or sudo nano /etc/default/grub
)
- Find the line with
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
- Inside the double quotes, after the existing options, add your Linux kernel boot parameters and/or module-level parameters, separated by spaces.
- Note: The syntax for Module-level parameters is the same as the Temporary GRUB command line method above. (e.g.
amdgpu.dpm=0
, amdgpu.aspm=0
, etc...)
- Update Grub:
sudo update-grub
- Reboot, and your parameters should now be added every time the kernel boots. (You can verify this by pressing
e
at the GRUB menu to view the boot line, as described above.)
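The edit to /etc/default/grub and the check after reboot can be sketched roughly as follows. Both helper names are hypothetical. `append_grub_params` only prints the modified file instead of changing it, so you can review the result first, and it assumes the parameters contain no `|`, `&` or backslash characters.

```shell
# append_grub_params FILE "PARAMS"
# Print FILE with PARAMS added just before the closing quote of the
# GRUB_CMDLINE_LINUX_DEFAULT line. The file itself is not modified.
append_grub_params() {
    sed "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\".*\)\"|\1 $2\"|" "$1"
}

# has_param "CMDLINE" PARAM
# Succeed if PARAM appears as a whole space-separated word in CMDLINE.
has_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage:
#   append_grub_params /etc/default/grub "amdgpu.dpm=0 amdgpu.aspm=0"
#   has_param "$(cat /proc/cmdline)" amdgpu.dpm=0 && echo "parameter is active"
```

/proc/cmdline holds the command line the running kernel was booted with, so it is a quick way to confirm the parameters arrived without going back into the GRUB menu.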
Set via Modprobe Drop-In directory
This method is also persistent, and applies slightly later in the boot process, when modprobe
is loading kernel modules.
- You cannot set kernel-level parameters this way, only module-level parameters.
- This only works for loadable kernel modules, i.e. drivers built as modules rather than compiled into the kernel image. (See the Gentoo Wiki for details.)
- Note: The syntax for these config files is a bit different, as you do not need the
modulename.param
syntax here.
(See man modprobe.d
for full documentation of /etc/modprobe.d
Drop-In config file syntax.)
Add a new Drop-In config file for your GPU
For example, to set both dpm=0
and aspm=0
:
echo 'options amdgpu dpm=0 aspm=0' | sudo tee /etc/modprobe.d/amdgpu-options.conf
Regenerate the initramfs (so the options also apply if amdgpu is loaded from the initramfs early in boot)
sudo update-initramfs -u -k all
Reboot!
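To confirm that modprobe picks up the drop-in file (this works even before the reboot), you can filter the configuration dump printed by `modprobe -c`. A small sketch; `module_options` is a hypothetical helper name:

```shell
# module_options MODULE
# From a modprobe configuration dump on stdin, print the "options"
# lines that apply to MODULE.
module_options() {
    grep "^options $1 "
}

# Usage:
#   modprobe -c | module_options amdgpu
```

If the drop-in was read, the output should include the options line you wrote to the file.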
Loading a Module with Temporary Changes
This usually works for testing temporary changes on plug-and-play devices.
However, it may not be ideal for something like a GPU, which is in use very early in the UEFI -> kernel boot -> init phases.
If your system has an integrated graphics card (e.g. Intel Corporation HD Graphics 630 or similar), this could be helpful when diagnosing or testing kernel module parameters for the secondary GPU.
sudo modprobe <module_name> [parameter=value]
Where [parameter=value]
represents a list of customized parameters available to that module, and <module_name>
would be the name of the kernel module (amdgpu
in this case).
See Red Hat's documentation for more detailed information.
Testing Temporary Kernel Module Parameters on a Dual-GPU System
The last method can be useful when testing a system which has both an integrated GPU and a secondary PCIe GPU (such as AMD / Nvidia / Intel Arc). It is especially helpful when diagnosing basic card initialization issues, when using VFIO and/or IOMMU, and in other use cases. Note: If you're unsure about these more advanced topics, try one of the easier methods above first.
To follow this method, you usually need to go into the motherboard's BIOS (assuming it supports this) and set the integrated GPU as the primary / default display GPU. Then boot into Linux and watch the kernel log messages in one terminal while unloading the kernel module and resetting the secondary PCIe GPU in another terminal.
For an AMD secondary GPU using the amdgpu
kernel module, the process looks like this:
Open a terminal and run: sudo dmesg -H --nopager --follow
- Look for messages from your GPU driver (e.g.
amdgpu
). There may be some helpful error messages to diagnose the issue.
- It may be helpful to press
enter
a few times in this terminal to give some spacing so new messages will be easily visible at the end.
Open another terminal and run: sudo rmmod amdgpu
(or whichever driver name or kernel module the secondary GPU uses)
- Check that the module has been unloaded with:
lsmod | grep -i amdgpu
- You should see no output if it is not currently loaded in the kernel.
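The same check can be scripted by reading /proc/modules, which is where lsmod gets its data. A minimal sketch with a hypothetical helper name; the optional second argument exists only so the function can be pointed at a different file:

```shell
# module_loaded NAME [FILE]
# Succeed if a module called NAME is listed in FILE (default: /proc/modules).
# Each line of that file starts with the module name followed by a space.
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}

# Usage:
#   if module_loaded amdgpu; then echo "still loaded"; else echo "unloaded"; fi
```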
Find the PCIe bus ID of the secondary GPU:
Run: sudo lspci
Look for AMD
in the output, for example on my system I see:
$ sudo lspci | grep -i amd
01:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch (rev c1)
02:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 23 [Radeon RX 6600/6600 XT/6600M] (rev c1)
03:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller
On this system, the AMD RX 6600 shows up on PCI bus ID: 03:00.0
Note that internal to the GPU card, there are multiple PCIe ports / switches, and an Intel HDA-based HDMI audio device which we can ignore. (The switches are essentially pass-through to the GPU + Intel HDA sound card. The sound card uses snd_hda_intel
Kernel module in this case)
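If you would rather not pick the address out by eye, the VGA entry can be filtered from the lspci output. A rough sketch; `amd_vga_addr` is a hypothetical helper, and it assumes the card is listed as a "VGA compatible controller" with the "AMD/ATI" vendor tag, as in the output above:

```shell
# amd_vga_addr
# From lspci output on stdin, print the address (first column) of each
# AMD/ATI device listed as a VGA compatible controller.
amd_vga_addr() {
    awk '/VGA compatible controller/ && /AMD\/ATI/ { print $1 }'
}

# Usage:
#   lspci | amd_vga_addr       # short form: bus:device.function
#   lspci -D | amd_vga_addr    # -D also prints the PCI domain
```

The form with the domain is the one used for directory names under /sys/bus/pci/devices/.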
Simulate removal of the PCIe device using the bus ID found above:
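On Linux this is normally done by writing 1 to the device's remove file in sysfs. A minimal sketch, assuming the GPU is in PCI domain 0000 (the usual case on desktop systems); `pci_remove_path` is a hypothetical helper that only builds the path:

```shell
# pci_remove_path ADDR
# Print the sysfs "remove" file for the PCI device at ADDR.
# ADDR may be given with a domain (0000:03:00.0) or without one (03:00.0);
# in the second case, domain 0000 is assumed.
pci_remove_path() {
    case $1 in
        *:*:*) addr=$1 ;;        # already includes a domain
        *)     addr=0000:$1 ;;   # assumption: PCI domain 0000
    esac
    printf '/sys/bus/pci/devices/%s/remove\n' "$addr"
}

# Usage (detaches the device from the running system until the next rescan):
#   echo 1 | sudo tee "$(pci_remove_path 03:00.0)"
```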
Rescan the PCIe bus, and immediately reload the GPU driver module with params:
# The semicolon separates two commands and runs them in quick succession
# The reasoning here is that once you write '1' to 'rescan' via sysfs, the kernel might auto-load the amdgpu module without your specified parameters.
# As such, sometimes it's best to use /etc/modprobe.d or another method for specifying parameters, although reboots can be slower to test.
echo 1 | sudo tee /sys/bus/pci/rescan ; sudo modprobe amdgpu dpm=0 aspm=0
Check that the loaded module's parameters are set the way you intended:
module=amdgpu
# Print every parameter of the loaded module together with its current value
for f in /sys/module/"$module"/parameters/*; do
    printf 'Parameter: %s --> ' "${f##*/}"
    sudo cat "$f"
done
- If the settings do not match what you passed to
modprobe
, the driver may have loaded automatically before your options could be applied.
- If the
modprobe param=foo
settings did not work, try using the /etc/modprobe.d/
method for setting the option instead, then retry.
Check your dmesg
output in the other terminal.
- Are the previous errors still there?
- Has anything changed or appeared since the parameter value was changed?
Repeat and tweak parameters until you find something that might solve the issue (or that crashes the kernel completely and forces a reboot!)