Hotplug support for eGPU on Linux

As noted previously, I have an eGPU set up with my laptop. Today I figured out how to successfully unplug it without the system freezing up.

Initially, I was quite confused as to why I was unable to just unplug the eGPU. It works on Windows, after all. It turns out that a Thunderbolt disconnection (at least with my fully updated Arch Linux) does not gracefully remove the PCI devices from the system. Everything that was relying on the eGPU, or on any other peripheral in the eGPU box such as the wired Ethernet, fails horribly. Threads lock up waiting for a reply to arrive on the PCI bus, but none ever does.

So the fix is really quite simple. Use the kernel's PCI sysfs interface to gracefully remove the entire Thunderbolt chain from the PCI tree before unplugging the cable. The easiest way is to echo “1” to the “remove” file of the lowest-numbered Thunderbolt PCI device. (Note: Depending on your system, this might not be a clever idea if you happen to have more than one Thunderbolt port, since the other port's devices may sit behind the same controller. It might be worth removing only the attached devices instead. Check your lspci output first.)
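
To find the address of that lowest-numbered device, something like the following might help. It is only a rough sketch: `thunderbolt_addresses` is a helper name I made up, and it assumes that the Thunderbolt devices actually have "Thunderbolt" in their lspci description, which may not hold for every controller.

```shell
#!/bin/bash

# Hypothetical helper: read lspci output on stdin and print the PCI
# addresses of every line that mentions Thunderbolt, lowest first.
# Assumption: the device description contains the word "Thunderbolt".
thunderbolt_addresses() {
    grep -i 'thunderbolt' | awk '{print $1}' | LC_ALL=C sort
}

# Possible usage (lspci -D always prints the PCI domain, so the
# addresses match the directory names in /sys/bus/pci/devices):
#   lspci -D | thunderbolt_addresses | head -n 1
```

The first address printed is the candidate for the path used in the script. Compare it against the full lspci output before trusting it.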

I wrote up a little script to automate this:

#!/bin/bash

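# Seconds to wait for the cable to be unplugged before rescanning
secs=5
# sysfs path of the Thunderbolt PCI device; adjust to match your lspci output
tbt_chain=/sys/bus/pci/devices/0000:01:00.0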

echo "Unplug eGPU script started."
if [ "$(id -u)" != "0" ]; then
	echo "Please run using sudo. Exiting."
	exit 1
fi
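if [ -e "$tbt_chain/remove" ]
then
	# Detach the device and everything behind it from the PCI tree
	echo 1 > "$tbt_chain/remove"
	echo "Thunderbolt chain removed from PCI tree. Please unplug eGPU now."
	# Count down on a single line, clearing it on each update
	while [ "$secs" -gt 0 ]; do
		echo -ne "$secs to rescan...\033[0K\r"
		sleep 1
		: $((secs--))
	done
	# Ask the kernel to rescan the PCI bus
	echo 1 > /sys/bus/pci/rescan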
	echo "Rescanned the PCI bus. Completed."
	exit 0
else
	echo "eGPU does not appear to be attached. Exiting."
	exit 1
fi
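
For systems with more than one Thunderbolt port, where removing the whole chain is too heavy-handed, it may be enough to remove only the devices sitting below a given bridge. The sketch below lists them. `pci_children` is a hypothetical helper of mine, and it relies on child devices appearing in sysfs as subdirectories of the bridge, named by their PCI address.

```shell
#!/bin/bash

# Hypothetical helper: print the PCI addresses of the devices directly
# below the bridge whose sysfs directory is given as the first argument.
pci_children() {
    local bridge_dir=$1 entry name
    for entry in "$bridge_dir"/*/; do
        name=$(basename "$entry")
        # Keep only entries named like a PCI address, e.g. 0000:02:00.0;
        # this skips other subdirectories such as "power" or "driver".
        if [[ $name =~ ^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$ ]]; then
            echo "$name"
        fi
    done
}

# Possible usage (the address is an example, check your own lspci output):
#   pci_children /sys/bus/pci/devices/0000:01:00.0
```

Each address it prints has its own "remove" file under /sys/bus/pci/devices, so the same echo trick as in the script should apply to the children individually. I have not tested this on a multi-port machine.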

Author: jpamills

Website: www.jpamills.dk
