This post led me to the conclusion that I need to choose the older kernel in grub.
This is one of the two valid solutions.
You should report the bug to Canonical if it hasn't been reported already. Meanwhile, use the older kernel (by freezing upgrades via Synaptic) and wait until an updated kernel that works on your system is ready.
You can try a PPA mainline kernel though.
The second valid solution is installing amdgpu-pro from AMD, which is an alternate driver. However, I would not recommend it: a lot of users end up having trouble with it, and if you're not fluent with the CLI, it is hard to restore a graphical interface when things go wrong.
It's hit or miss.
radeon vs amdgpu
You can't use the radeon driver for your GPU. That driver is an older one written to work with Radeon HD 2000-6000 series and first-generation GCN cards (e.g. Radeon HD 7770, Radeon R9 280, etc).
The amdgpu driver is a newer driver that works with all GCN and RDNA chips.
Your Vega chip belongs to the last generation of the GCN architecture, and it won't work with radeon. Trying to force it could either not work or lead to HW damage.
Driver architecture explanation
It seems you are confused, so I'll explain what's going on. Your AMD drivers in Linux have two components: kernel space and user space.
Kernel-space drivers ship with the kernel, so when a bug is introduced, it is tied to that kernel version. Kernel bugs like these often get fixed quickly upstream (in days or weeks), but Ubuntu takes its time (often a few months) because of how it backports changes from the upstream kernel.
From what you describe, this is the cause of the problem.
User-space drivers are shipped separately in the Mesa and X11 packages. Since the problem lies in the kernel part of the drivers, messing with these packages won't fix it, and you risk breaking something by accident.
Update
OK, from your reply, you downgraded your kernel and it still didn't work.
First, make sure you're actually running the older kernel:
uname -r
You can try multiple kernel versions by going to GRUB -> Advanced options for Ubuntu and selecting the other versions.
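To see which versions you can expect in that GRUB submenu, a small sketch like the one below may help. It assumes Ubuntu's usual layout of one /boot/vmlinuz-<version> file per installed kernel; the function name list_kernels and its two optional arguments are made up for illustration.

```shell
# List kernel versions that have an image in a boot directory and mark
# the one currently running. Name and arguments are illustrative only.
list_kernels() {
    bootdir="${1:-/boot}"
    running="${2:-$(uname -r)}"
    for image in "$bootdir"/vmlinuz-*; do
        # Skip the unexpanded pattern when no kernel image is found.
        [ -e "$image" ] || continue
        # Strip the directory and the "vmlinuz-" prefix to get the version.
        version="${image##*/vmlinuz-}"
        if [ "$version" = "$running" ]; then
            echo "$version (running)"
        else
            echo "$version"
        fi
    done
}

# Example (not run here):
#   list_kernels
```

If the version marked as running is still the newest one, the downgrade did not take effect.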
OK, assuming that doesn't work, here are the next things to try/suspect:
Your Xorg logs
Post the contents of /var/log/Xorg.0.log. It may contain valuable info.
Since you boot to a black screen, you can try switching to a tty (Ctrl + Alt + F2).
If the keyboard is unresponsive, then boot into safe mode, install openssh-server, reboot into normal mode, and control your computer from a second computer.
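The Xorg log is long, so it can help to look at the error and warning lines first. Xorg marks those with (EE) and (WW). The sketch below is only a filter under that assumption; the function name xorg_problems is made up, and on some setups the log lives under ~/.local/share/xorg/ instead of /var/log.

```shell
# Print only the error (EE) and warning (WW) lines of an Xorg log.
# The function name is illustrative; the default path is the log named above.
xorg_problems() {
    log="${1:-/var/log/Xorg.0.log}"
    # Match the marker at the start of a line, optionally after a
    # "[   12.345] " timestamp, so the marker legend in the header is skipped.
    grep -E '^(\[[ 0-9.]*\] )?\((EE|WW)\)' "$log"
}

# Example (not run here):
#   xorg_problems
#   xorg_problems ~/.local/share/xorg/Xorg.0.log
```

Still post the full log if you can; the filtered lines are just a quicker place to start reading.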
Make sure amdgpu module is loaded
Run:
lsmod | grep amdgpu
It should have hits and amdgpu should be in use.
If it's not there then run:
sudo modprobe amdgpu
If that fails, check dmesg.
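The module check can also be wrapped so it is easy to repeat over SSH. This is a minimal sketch, assuming the usual /proc/modules format (module name first, then a space), which is the data lsmod prints; the function name module_loaded and its optional second argument are hypothetical.

```shell
# Succeed if a kernel module appears in a /proc/modules-style listing.
# Name and optional listing argument are illustrative only.
module_loaded() {
    module="$1"
    listing="${2:-/proc/modules}"
    # Each line starts with the module name followed by a space.
    grep -q "^${module} " "$listing"
}

# Example (not run here):
#   module_loaded amdgpu || sudo modprobe amdgpu
#   sudo dmesg | grep -i amdgpu | tail -n 20
```

The last commented line narrows dmesg down to the most recent amdgpu messages, which is usually where a failed load explains itself.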
Restart your DM
Sometimes there is a race condition in loading your drivers, and the real problem is that your DM started before your GPU drivers were ready, so the DM crashed.
I've personally run into this problem: half of the time, at random, I would boot into a black screen and the keyboard would be dead. However, restarting the DM from an SSH session would restore my computer.
I don't know which DM you use, so one of these commands should be the one for you:
sudo service lightdm restart
sudo service gdm3 restart
sudo service sddm restart
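If you would rather not guess, Debian and Ubuntu normally record the default DM in /etc/X11/default-display-manager. The sketch below reads that file; it assumes the file exists on your install, and the function name default_dm is made up.

```shell
# Print the name of the default display manager recorded by Debian/Ubuntu.
# The function name and optional file argument are illustrative only.
default_dm() {
    file="${1:-/etc/X11/default-display-manager}"
    # The file holds a full path such as /usr/sbin/gdm3; keep only the name.
    [ -r "$file" ] && basename "$(head -n 1 "$file")"
}

# Example (not run here):
#   sudo service "$(default_dm)" restart
```

On systemd-based releases, sudo systemctl restart display-manager should have the same effect, since the installed DM is normally aliased to that unit name.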
Add amdgpu firmware to initramfs so it's available early
This is a fix for the race condition mentioned in the previous item.
Run the following:
echo "amdgpu" | sudo tee --append /etc/initramfs-tools/modules
sudo update-initramfs -c -k $(uname -r)
To check that it worked, run:
lsinitramfs /boot/initrd.img-$(uname -r) | grep amdgpu
It should return multiple hits (where previously there should have been none).
Then reboot and cross your fingers that things work.
You can undo this change by removing the amdgpu line from /etc/initramfs-tools/modules and running update-initramfs again.
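One caveat with the tee --append step: running it twice leaves two amdgpu lines in the file. A guarded append avoids that. This is only a sketch; the function name add_module_once is made up, and because it writes to the file directly, it has to be run from a root shell (for example after sudo -i) when pointed at the real file.

```shell
# Append a module name to a modules file only if it is not already listed.
# Name and arguments are illustrative; on Ubuntu the real file is
# /etc/initramfs-tools/modules, which only root can write to.
add_module_once() {
    module="$1"
    file="$2"
    # -x matches whole lines only, -F treats the name as a literal string.
    if grep -qxF "$module" "$file" 2>/dev/null; then
        echo "$module already listed in $file"
    else
        printf '%s\n' "$module" >> "$file"
        echo "added $module to $file"
    fi
}

# Example (not run here, needs a root shell):
#   add_module_once amdgpu /etc/initramfs-tools/modules
```

You still need to regenerate the initramfs afterwards, as described above.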
Try another firmware package
The package linux-firmware provides all the firmware blobs for your HW, including AMD's GPUs.
An older version may fix your problem, or a newer one may.
You can also try the latest ones from https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
You can roll back from the git versions via sudo apt install --reinstall linux-firmware (since they are all files that go into /lib/firmware).
Check from a LiveUSB
If you are still unable to fix it, create a LiveUSB and boot from it. If things work there, you may be tempted to reinstall Ubuntu (a valid option).
If the LiveUSB shows the same problem, you may have to consider HW damage (your iGPU is malfunctioning, the cable to your monitor is bad or loose, your monitor is bad, or the connector/port is). If this is the case, try Windows as well.