I was fighting with Unsloth / Nvidia CUDA / Python versioning today. I eventually gave up on Unsloth and managed to use PyTorch directly - however, as part of the journey I found that Microsoft, Nvidia and Linux have made GPU paravirtualisation so smooth it's invisible.

First of all, it just works in WSL - install the Nvidia drivers (inside WSL) and you magically get full GPU support, with no prior configuration of how much of the GPU you are taking; it's shared dynamically and very smoothly. Once the drivers are installed you can use the GPU as if it were local.
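If you want a quick sanity check, the usual tools are visible from inside the WSL distribution straight away - something like this (the PyTorch check obviously assumes you have PyTorch installed):-

# the GPU shows up inside WSL without any pass-through configuration
nvidia-smi
# and CUDA is visible to anything built against it, e.g. PyTorch
python3 -c "import torch; print(torch.cuda.is_available())"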

This is clearly intended to handle container tech, which brings us on to the Nvidia Container Toolkit (providing the local libraries for the containers) and Podman GPU support. The recipe is:-

  • Install CUDA and the Nvidia drivers on the host
  • Install CUDA and the Nvidia drivers on the WSL instance / VM, or even the Nvidia Container Toolkit.
  • Install the Nvidia Container Toolkit on the docker machine/podman machine in WSL ("podman machine ssh" to get into it) - see the sketch after this list.
  • Use "--gpus all" on podman when running a container to enable the GPUs to be passed through to the container!

Overall I was very surprised, as Nvidia historically put a lot of roadblocks in the way of paravirtualisation of their cards.  It's great to see this working.

Regarding the situation with Unsloth, I found it was breaking even though the libraries were installed - I think different bits have different dependencies. It could do with some re-engineering to guarantee its consistency; even their own containers weren't working, and I tried several tags.

Instead, I created my own micro-environment to work in with a Dockerfile based on Nvidia's PyTorch image:-

FROM nvcr.io/nvidia/pytorch:25.09-py3
EXPOSE 8888
WORKDIR /app
RUN pip install jupyter
CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root"]

Once this had been built with a quick:-

podman build -t aiplatform .

I could run the container using

podman run --gpus all -p 8888:8888 aiplatform

And then access a Jupyter notebook with a fully functional CUDA-enabled PyTorch in it.  This allowed me to get on and sort out what I wanted (I'm porting some of my old AI tech into the current LLM model ecosystem).
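A quick way to confirm the GPU really is visible from inside the container is a trivial check like this (nothing specific to my setup), run from a notebook cell or a terminal:-

python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"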

One thing I did find is that if I used the Docker CLI I wasn't able to use the --gpus parameter. If I wanted to use the docker command, I had to remove the Docker CLI and then symlink podman to docker with a quick:-

mklink "C:\Program Files\RedHat\Podman\docker.exe" "C:\Program Files\RedHat\Podman\podman.exe"
Permalink 

I've upgraded the site to the latest .NET version - importantly, there was a vulnerability in Kestrel (the embedded webserver), so it's worth keeping up to date.

This is also a good proof of life for anyone reading.  Hope everyone is well!

Outside of work (for my own company), I'm currently working on building AI-related agents and components to build a highly persistent and adaptive AI - this is all the rage, so nothing special, and there are tons of examples of agentic development going on, even though I've been working on it on and off since 1998.  It's good to see ideas come to fruition gradually, even if I'm not always the one making the achievement first - hopefully there'll be some achievements worth talking about soon.

Inside work (at my day-to-day employer), we're building machine learning for a very specific purpose and maintaining a suite of applications for internal use - there's not much to talk about.

Permalink 

While .NET has fully native Postgres support using the Npgsql library, it requires telling the connection that you want read-write intent in code rather than in the connection string, so you end up with a normal connection string containing a list of servers:-

    Host=server1.example.org,server2.example.org;Username=myusername;Password=mypassword;Database=mydatabase

By default this connection string will use both server1 and server2 to handle all requests; however, if you use NpgsqlDataSourceBuilder and call CreateConnection, you can pass TargetSessionAttributes.ReadWrite as a parameter to guarantee you get a writeable master server:-

    var dsBuilder = new NpgsqlDataSourceBuilder(Configuration.GetValue<string>("Database:ConnectionString"));
    var ds = dsBuilder.BuildMultiHost().CreateConnection(TargetSessionAttributes.ReadWrite);

This enables you to have both read and read-write intent on your application's connections to the database.

But what happens if you're not using .Net and want to interact with the writeable server first?

It turns out there's a target_session_attrs parameter:-

    postgresql://server1.example.org,server2.example.org/mydatabase?target_session_attrs=any
    postgresql://server1.example.org,server2.example.org/mydatabase?target_session_attrs=read-write

And this is an option on the ODBC driver too:-

    Driver={PostgreSQL};Server=server1.example.org,server2.example.org;Port=5432;Database=mydatabase;Uid=myusername;Pwd=mypassword;target_session_attrs=read-write;

There are other parameters you can use here too:-

  • any - Connect to any server; this is the default.
  • read-write - Connect to any server that is writeable.
  • read-only - Connect to any server that is in read-only mode; you can either have a server set to standby or set to read-only explicitly.
  • primary - Connect to the primary server.
  • standby - Connect to any standby server.
  • prefer-standby - Try to connect to a standby server first, and then use the primary if no connection could be made. I have read it falls back to any, so if you have no replication it will still find a system.
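If you want to see which kind of server you've actually landed on, psql accepts the same URI format, so a quick test looks something like this (using the placeholder server names from above):-

    # connects to whichever host satisfies read-write, then asks if it is a standby (f = writeable primary)
    psql "postgresql://server1.example.org,server2.example.org/mydatabase?target_session_attrs=read-write" -c "SELECT pg_is_in_recovery();"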
Permalink 

During my switch to Postgres, I also deployed an active/passive physical replication cluster, at the time on Postgres 16.  I wanted to upgrade to Postgres 17, so I did a backup first:-

    pg_dumpall > 20241209-PreUpgradeBackup.bak

Created a new Postgres 17 server, and restored the dump there, to allow me to check and test that PG17 works with everything:-

    /usr/local/pgsql/bin/psql -d postgres -f ~/backups/20241209-PreUpgradeBackup.bak

But this would require a period of data loss in production, which could be unacceptable. When upgrading Postgres it is also important to rebuild your replicas from the new instance, which provides the opportunity to do a seamless, zero-downtime upgrade - and it's also a good way to upgrade each server separately.  I also took this as an opportunity to test my failover processes, so I removed my master server from the pool [1] (so I could bring it back up if there was a problem with the upgrade) and ran this on a secondary server:-

    SELECT pg_promote(); 

This makes the instance writeable, instantly failing over (using the connection string approach discussed in my previous blog post on migrating to Postgres, the applications now switch to writing to this location). I checked everything ran smoothly for DR purposes, installed Postgres 17 on a new instance, configured the pg_hba.conf file to allow replication from the new master, and ran:-

    pg_basebackup -h yourmasterservername.yourdomain.com -U replica_user -X stream -C -S yourreplicationslotname -v -R -W -D /var/lib/postgresql/17/data/
Remember to run this as your postgres user, or you will need to chown -R postgres /var/lib/postgresql/17/data afterwards.
 
This will recreate the entire data directory and populate it with a clone of the master server, setting the server to a restoring state so that it's a valid secondary.  If you want to have your replicas support reading, then set the hot_standby setting to on in the postgresql.conf file.  You can also then wipe your original master, and re-add it as a secondary if you care about re-using the machine/VM.
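To satisfy yourself that the new pairing is healthy, a couple of quick checks using standard Postgres functions and views (nothing specific to this setup):-

    # on the new standby - should return 't', i.e. it is in recovery and read-only
    psql -c "SELECT pg_is_in_recovery();"
    # on the new primary - one row per connected standby, state should be 'streaming'
    psql -c "SELECT client_addr, state, sync_state FROM pg_stat_replication;"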
 
[1] - The pooling I'm using is a custom-written utility that monitors the servers and updates my DNS automatically when there's a change in state.  If you are in Azure there's a Virtual Endpoints feature that does this, or you can use a Postgres proxy - there are a few of them around.  If you are in an enterprise situation then your load balancers can probably handle this.

Permalink 

Once you've installed Podman, run the following command in Windows Terminal to get the Docker CLI installed:-

    winget.exe install --id Docker.DockerCLI

With this added, Visual Studio will stop complaining about the lack of Docker Desktop and start to run things; the container tools will also work.  You can also change the settings which start/stop Docker Desktop in Visual Studio's Tools > Options menu.
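The Docker CLI still needs an engine to talk to, which is where the podman machine comes in - roughly like this (the exact wiring of the Docker-compatible endpoint depends on your Podman version, so treat it as a sketch):-

    podman machine init
    podman machine start
    # the CLI should now talk to podman's Docker-compatible API endpoint
    docker version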

For those in a corporate environment, this should save a lot of money on Docker Desktop seats!

Permalink 

After more issues with MS SQL AlwaysOn due to a fibre provider problem bringing one node down, I decided to move my blog to run off a Postgres cluster with read/write targeting.  Thankfully I'd been planning to do this for a long time and had already set up the database, schema and tables, and replicated the data.

I found a few interesting things in doing this and got to use the new keyed services feature in .NET 8, since I wanted the option of alternating between MS SQL and Postgres, and between read and read/write contexts. I had to make the following changes:-

  1. I created two derived classes from my EF Core model and moved all the Fluent operations into them; initially I just duplicated them.
  2. In the Postgres implementation, I added the SnakeCaseNamingConvention, a default schema and the citext extension (for case-insensitive page names).
  3. Moved all interactions with both EF Core contexts to an interface and extracted all used methods and DbSet<T> declarations to the interface.
  4. In the Startup, I put a condition on which Database:Platform setting was supplied and then set up a scoped database context.  Postgres is very interesting here in that you explicitly set your TargetSessionAttributes when setting up your connection.  Historically it was in the connection string, like with MS SQL, but this limits you to only one mode of operation per connection string - now it's dynamic and driven through a DataSourceBuilder.
  5. I also then registered a keyed scoped context with the read-write property set.
  6. Then tagged the constructor parameters that needed the new ReadWrite instance of the context.
  7. Ran things and found it broke in surprisingly few places, mostly where DateTime.Now was used.  I converted these to DateTime.UtcNow, as it should have been all along - and it was happy and worked.

Let's go into more depth on these steps...

Context Changes

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    //Requires the addition of the EFCore.NamingConventions Nuget Package
    optionsBuilder = optionsBuilder.UseSnakeCaseNamingConvention();
    base.OnConfiguring(optionsBuilder);
}

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    //Since I'm using one database with multiple sub-schemas now.
    //MSSQL defaults to dbo and schemas are less strict; Postgres really likes having one specified.
    modelBuilder.HasDefaultSchema("blog");
    //Provides case insensitivity
    modelBuilder.HasPostgresExtension("citext");

    //... the rest of your modelBuilder Fluent entity setup

    modelBuilder.Entity<PageContent>(entity => {
        //Enable case insensitive searching on this field; the column in the database has been changed to this type too.
        entity.Property(e => e.PageId).HasColumnType("citext");
        // ... more setup
    });
}

Base Context Changes

The base context needed a new constructor to allow it to accept both the MSSQL and the Postgres DbContextOptions<T> types, and it needed to implement the new interface - then it was just a list of DbSet<T> properties for the various table representations:-

public partial class NullifyDBEntities : DbContext, INullifyBlog
{
  public NullifyDBEntities() { }
  public NullifyDBEntities(DbContextOptions<NullifyDBEntities> options) : base(options) { }
  protected NullifyDBEntities(DbContextOptions options) : base(options) { }

It was crucial to make the new constructor protected so that the EF Core components don't break, as they expect to be given only the generic options type that matches the current class.  Since this base context isn't ever going to be used directly, I'm tempted to lock it down further, but I'm wary about the rest of EF being a little unhappy with that.

The New Interface

While most of the interface is specific to my data types for the blog, there were also a number of operations I wanted to expose on the interface to make them easier to call.  Because of that, I exposed things like SaveChanges, Add, Attach, Update, Remove and SaveChangesAsync - plus inherited from IDisposable and IAsyncDisposable.  Nothing worth providing a code example for, but for completeness:-

public interface INullifyBlog : IInfrastructure<IServiceProvider>, IResettableService, IDisposable, IAsyncDisposable
{
  DbSet<PageContent> PageContent { get; set; }
  // ... lots of other DbSet<T> properties

  int SaveChanges();
  // ... and then a few more utility methods.
}

Controller Changes

The controller changes were extremely minimal; I didn't need to change any business logic - just the type the constructor received (and the property the base class stored), and I added the new FromKeyedServices attribute where needed.

    [Authorize(Roles = "Editor")]
    public class ManageArticlesController : BaseController
    {

        //The new attribute that provides keyed services.
        public ManageArticlesController([FromKeyedServices("ReadWrite")] INullifyBlog context) : base(context)
        {
        }

This was surprising - not having to change reams of historic code was great.

One thing I did find when I ran it the first time was that I had runtime errors: I had to use UTC for all my dates and times.  Thankfully there were only a few places where I'd used local times to allow future-dated posting:-

[ResponseCache(Duration = 60, VaryByHeader = "id", VaryByQueryKeys = new string[] { "User" })]
public ActionResult Index(int id)
{
  var article = (from a in Database.NewsArticle
                where a.Id == id
               && a.PostedAt < DateTime.UtcNow
                select new NewsArticleViewModel { Article = a, CommentCount = a.Comment.Count, Author = a.U.Username, CanEdit = false }).SingleOrDefault();

  if (article == default(NewsArticleViewModel))
  {
    return StatusCode(404, "Unable to find an article with that ID");
  }
  else
  {
    return View(article);
  }
}

The people writing the Postgres client library have quite rightly made the choice to warn you if you use a date affected by a timezone - the number of times that not using UTC in a database has bitten people is really high.

Startup Changes

And in the startup class I added support for the new Postgres-based data context, while retaining the original MSSQL context if needed.

    if (Configuration.GetValue<string>("Database:Platform") == "Postgres")
    {
        services.AddDbContext<NullifyDbPGContext>(
            options => options.UseNpgsql(Configuration.GetValue<string>("Database:ConnectionString"))
        );
        services.AddScoped<INullifyBlog, NullifyDbPGContext>();

        //Enables read-write operations on only those servers that support it, and automatically selects that server when connecting, otherwise it will use any working server in the cluster.
        //The new keyed attribute here makes things really neat.
        services.AddKeyedScoped<INullifyBlog>("ReadWrite", new Func<IServiceProvider, object, NullifyDbPGContext>((services, o) =>
        {
            //Uses the Postgres specific data source builder to connect to the correct servers on demand.
            var dsBuilder = new NpgsqlDataSourceBuilder(Configuration.GetValue<string>("Database:ConnectionString"));
            var ds = dsBuilder.BuildMultiHost().CreateConnection(TargetSessionAttributes.ReadWrite);
            var builder = new DbContextOptionsBuilder<NullifyDbPGContext>()
                                .UseNpgsql(ds);
            var context = new NullifyDbPGContext(builder.Options);
            return context;
        }));
    }

Docker Environment Changes

And finally, the Docker container can now have Database__ConnectionString and Database__Platform supplied so that these settings can be set per environment.  Importantly, I moved from using : in the environment variable names to __ so that it's consistent across all platforms:-

services:
  www:
    image: ServiceProvider.tld/ImageName:Tag
    ports:
      - "80:8080"
    environment:
      Database__ConnectionString: YourConnectionStringHere
      Database__Platform: Postgres
    #Your replication settings and network properties here

And a reminder that the container's internal port number has changed from port 80 in prior .NET versions' Docker support to port 8080.  This is accompanied by the containers no longer running the dotnet process as root, so they're more secure but also harder to troubleshoot.  There's still an option to revert this behaviour, but I don't recommend you do it.
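If you really do need the old port back (the non-root user is a separate concern), my understanding is that the .NET 8 images read an ASPNETCORE_HTTP_PORTS environment variable, so something like this would revert it - the image name is the placeholder from above:-

# not recommended - puts the internal listening port back to 80
docker run -e ASPNETCORE_HTTP_PORTS=80 -p 80:80 ServiceProvider.tld/ImageName:Tag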

Conclusion

Overall, this was a surprisingly light amount of work, all done in an hour with more time spent modernising the deployment pipeline and writing this post - especially given I'd completely redone the schema too to use lowercase snake case during the database migration.  Good fun!

And this codebase has now been upgraded/modernised significantly every year since August 1999, making it 25 years old - which is apt, as it was my 41st birthday a few days ago.

Permalink 

Upgrading to .NET 8 revealed a gotcha in the new Docker containers: they changed the port number of the webserver inside the container from port 80 to port 8080, so you need to update any compose configurations or port mappings you have.
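For example, where a mapping used to pass port 80 straight through, it now needs to target 8080 inside the container - something along these lines (the image name is just a placeholder):-

# host port 80 now maps to the container's new internal port 8080
docker run -p 80:8080 myregistry.tld/mysite:latest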

The link discussing this is here.

Permalink 

I'll preface this by saying this post is just for me; it's for Alpine Linux and specific to one use case where I need a static file webserver but want to reconfigure the domains occasionally.

You can set up self-reloading config if you have a script that watches for file changes to the conf files. Just apk add inotify-tools, then create a script like this in /etc/inotify/nginx-reload.sh and make it executable with chmod u+x nginx-reload.sh:-

#!/bin/sh
# plain sh, since Alpine doesn't ship bash by default

set -e

# the outer loop restarts the watch if inotifywait ever exits
while true; do
    # wait for any .conf file under /var/www to be modified, then reload nginx
    while inotifywait /var/www -e modify --include '.*\.conf'; do
        nginx -s reload
    done
done

You then need to set this up to run on startup using openrc. Put the following in /etc/init.d/nginxreload and again, chmod it u+x:-

#!/sbin/openrc-run

name="Nginx Reloader"
command="/etc/inotify/nginx-reload.sh"
command_args=""
command_user="root"
command_background=true
pidfile="/etc/inotify/nginx-reload.pid"

depend() {
        need net localmount
}

Now run:-

rc-update add nginxreload default
service nginxreload start

And any edits to the conf files specified result in nginx reloading its configuration.

Permalink 

Are you having issues with loading the designer and getting InvalidOperationException and a NamedPipeTimeout trying to connect?

Open a PowerShell in Admin mode and run this to set an exclusion in Defender for the design server:-

Add-MpPreference -ExclusionProcess 'DesignToolsServer.exe'
Permalink 

I just upgraded this site to .NET 6 - I thought it was overdue, with the release of Visual Studio 2022 having happened a while ago now.  Great to see it was a quick and easy upgrade, and the performance has improved even more on the new version too.  I just updated the build version, compiled, then fixed some warnings caused by RenderPartial - I'm pleased to see they sorted that deadlock!

I also took the time to switch to Alpine Linux for the Docker container hosting it, as it's my preferred Linux distribution; however, I found I needed to solve the ICU issue (the MS SQL Server client expects to have localisation libraries) by getting the packages during the container build:-

FROM mcr.microsoft.com/dotnet/aspnet:6.0-alpine AS base
WORKDIR /app
EXPOSE 80
EXPOSE 443
RUN apk add --no-cache icu-libs
ENV DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=false

Oddly, when building the ARM64 containers for the dev version of the site (which runs on a pair of Raspberry Pis), I also encountered a problem with the cross-compilation of Docker containers, but that was easily fixed by running the below on the host building the containers as part of the pipeline:-

docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
This allowed the ARM components inside the 6.0-alpine-arm64v8 container to work so the apk command didn't error.
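For reference, once the QEMU handlers are registered a cross-build is just a normal build with the target platform specified - a rough sketch, with a made-up tag name:-

docker build --platform linux/arm64 -t mysite:arm64-dev .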
 
Other than the above it was seamless, and it's nice to switch to an even smaller container size.
 
Permalink 

Happy new year to anyone still reading this.  I've been writing on this blog for more than 21 years now, varying from extensive articles to years where I've not felt the need to say anything at all (including last year!).  Just this week I completely rebuilt this blog to run on .NET 5 and be hosted under Docker - no changes to paths, content, etc., but it's good to keep things current and modern, and with each rewrite I distill it further and make it simpler.

With the move to a more digital way of running businesses and all the working from home, it's an interesting time for a technologist.  A decade ago this would have been possible, but unlikely to be something we'd end up doing for real or for such a prolonged period; now it's suddenly normal.  It seemed appropriate to confirm all is well for anyone that still does look.

Life/work update: I've been managing a team of developers for the last few years now; I'm still staying as technical as I can and run regular training sessions for my team.  I'm doing fairly well.

Permalink 

Over the last year I went permanent, and I've been a Delivery Lead managing about 30 people and a swathe of business applications.  It's good fun, and I've still got the ability to be really deeply technical or help my team members with actual programming at times - I'm glad to have a great team who can do most things on their own.

I've just updated this site with the latest version of the code that runs it (I did a complete re-write, though the layout is exactly the same for the moment).  If anyone spots anything broken please e-mail me.

I'm still around and doing fine.  My company (Fizl Ltd) is still in existence and sitting there ready for me to use for things.

Permalink 

I can't believe it's still a thing; I've encountered this error repeatedly since the release of Windows Identity Foundation and have just hit it again with Azure AD and MVC.  You just have to add this somewhere before using an AntiForgeryToken in MVC:-

AntiForgeryConfig.UniqueClaimTypeIdentifier = "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name";

To get past the error:-

A claim of type 'http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier' or 'http://schemas.microsoft.com/accesscontrolservice/2010/07/claims/identityprovider' was not present on the provided ClaimsIdentity.

Permalink 

Well, I had a very bad year, with my father passing away and various business-related things like the umbrella company I was using going insolvent and taking an amount of my money with it (and I expect a few more road bumps over the next few months as things settle down), so I'm hoping for a better year going forward.  Many of the things that happened have just pushed me to adapt.

I know I haven't posted blog articles properly in years, as I've been constantly trying to avoid posting work-related items (I've been contracting for others).  I've been doing some interesting research of my own recently and hope to talk more about that instead soon, though there are a few hurdles to overcome before there's anything useful produced from it.  I do need to get back in the habit of doing my own things, though; I have buried my head into my customers' work like it was sand, without thinking of my own reality, and that needs to change.

I have lots of ideas for Fizl Ltd for the future too, so I wish any readers that remain well and hope you will remain patient - pretty easy with an RSS reader.

First things first though, this site needs an upgrade and some new content...

Permalink 

I'm still alive and contracting.  I've been radio silent for some time because I don't really want to talk about anything related to my employer on here, and I hadn't done much personal software development for a while.  About six months ago I started doing some again, so hopefully I'll resume posting.

I've just finished a full set of server upgrades for my off-site physical servers (nice new hardware, and a move to Windows Server 2012 R2).  If you were hosted on them and haven't contributed for more than ten years or so, your accounts may not work when you try to get in, as I didn't migrate those people.  I may still have copies of your content if you ask before the old servers are turned off.

I still have spare capacity for friends to use on the servers (much more than previously, actually).  For those of you using the TeamSpeak server, it's still on the same address, beta.nullify.net, but the IP will have changed.  Let me know if there are any issues.

Permalink