Harnessing CUDA on Windows with Python in WSL2: A Seamless GPU Workflow
Whether you are an aspiring or a professional data scientist who wants to dabble with deep learning models on a local Windows laptop powered by an NVIDIA GPU, this post is for you. Working locally has its own pros beyond the drastic reduction in cumulative cloud costs (even if those costs shrink by a big margin with every passing year), at least until you are done with your POCs, serving a distilled version of a DL model with Gradio/Streamlit from your Jupyter notebooks.
Why WSL2? Windows is my primary OS, equipped with an NVIDIA GPU that I want to leverage for CUDA-accelerated tasks. Meanwhile, Linux offers a robust ecosystem for Python development. Simply put, I would like to combine the flexibility of Linux tools with the hardware power of a Windows machine. And WSL2 is a game changer that bridges this gap.
By integrating CUDA into WSL2, I can:
- Accelerate Compute-Intensive Tasks: Run deep learning models (e.g., PyTorch, TensorFlow) or data pipelines faster using GPU parallelization.
- Streamline Development: Use Linux tools (e.g., Bash, pip, poetry) seamlessly while staying within my Windows ecosystem.
- Prototype Locally: Test GPU-powered applications—like a Gradio app with a distilled LLM—on my laptop before deploying to a cloud or server.
But does it work as promised? To find out, I needed a reliable way to verify that CUDA is correctly installed and accessible from WSL2. This blog post walks through that process.
Step 1: Verify CUDA on Windows and its visibility in WSL2
Before testing in WSL2, ensure CUDA works on the Windows host by executing `nvcc --version` in your Windows terminal. This failed for me even though CUDA was installed via the NVIDIA app (the successor to GeForce Experience). I realized this installation doesn't include the CUDA Toolkit. The CUDA Toolkit is a separate installation package that includes the CUDA compiler (nvcc), libraries, and other tools for developing CUDA applications. Just so you know, nvcc stands for "NVIDIA CUDA Compiler"; it is the compiler driver.
To install the CUDA Toolkit:
- Visit the NVIDIA CUDA download page.
- Select the correct version: Choose the version that matches your NVIDIA driver version.
- Download the CUDA Toolkit: Select the "Network Installer" or "Local Installer" option, depending on your preference. In my case it looked like below:
- Run the installer: Follow the installation prompts to install the CUDA Toolkit.
If the CUDA Toolkit installation is successful, executing `nvcc --version` in your Windows terminal should show the version details of the CUDA compiler.
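If you prefer scripting such checks, here is a minimal Python sketch (my own helper, not part of any NVIDIA tooling) that reports whether nvcc is reachable and prints its version banner; it works from either a Windows or a WSL2 Python interpreter:

```python
import shutil
import subprocess

# Look for the nvcc binary on the PATH; None means the CUDA Toolkit
# (which ships nvcc) is missing or simply not on the PATH.
nvcc_path = shutil.which("nvcc")

if nvcc_path is None:
    print("nvcc not found: install the CUDA Toolkit, not just the driver.")
else:
    # Equivalent to running `nvcc --version` in the terminal.
    banner = subprocess.run([nvcc_path, "--version"],
                            capture_output=True, text=True)
    print(banner.stdout)
```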
It is now important to confirm the GPU's visibility in both Windows and WSL2: running the command `nvidia-smi` in your Windows terminal and in your WSL2 terminal should show the same output, since WSL2 shares the Windows host's NVIDIA driver. In my case the two outputs matched exactly.
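To make the comparison between the two environments less eyeball-driven, a small sketch like the one below (again, my own addition) queries just the fields that should match on both sides; the `--query-gpu` flags are standard `nvidia-smi` options:

```python
import subprocess

# WSL2 talks to the same Windows NVIDIA driver, so the GPU name,
# driver version, and total memory should be identical in both shells.
result = subprocess.run(
    ["nvidia-smi",
     "--query-gpu=name,driver_version,memory.total",
     "--format=csv,noheader"],
    capture_output=True, text=True,
)
print(result.stdout.strip())
```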
Step 2: Test CUDA with Python (using PyTorch) in WSL2
Step 2.1: Install PyTorch with CUDA support, matching your Windows CUDA version (12.8 in my case, as can be seen in the earlier screenshot). This step is tricky and crucial, so make sure you know the correct CUDA version: the suffix at the end of the index URL in the torch installation command denotes this version (e.g., cu128 for CUDA 12.8). In my case the PyTorch installation command looked like
`pip install torch --index-url https://download.pytorch.org/whl/cu128`.
Check PyTorch’s installation page for the latest command.
Step 2.2: Test CUDA availability by running a short check script; you should see output like the corresponding comments. When I ran it on my old rig, everything came back positive.
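A minimal sketch of such a check looks like the following (the CUDA version and GPU name in the comments are illustrative; yours will differ):

```python
import torch

# True means PyTorch can reach the GPU through WSL2's CUDA stack.
print(torch.cuda.is_available())       # True

# CUDA version the installed wheel was built against; should match
# the cuXXX suffix you picked during installation (cu128 -> 12.8).
print(torch.version.cuda)              # 12.8

# Name of the first detected GPU (yours will differ).
print(torch.cuda.get_device_name(0))   # e.g. NVIDIA GeForce RTX ...

# End-to-end sanity check: a small matrix multiply on the GPU.
x = torch.rand(1000, 1000, device="cuda")
print((x @ x).device)                  # cuda:0
```

If `is_available()` returns False, the usual suspects are a wheel suffix that doesn't match your CUDA version or running the script from a Python environment outside WSL2.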
If you have come this far, bingo! You are all set to run distilled DL models on your local machine.