Training PyTorch Models on Your Local GPU and Deploying to Hugging Face

Training deep learning models can be a time-consuming and resource-intensive task, especially when dealing with large datasets and complex architectures. Many developers and researchers rely on cloud-based platforms like Google Colab to train their models, but this approach comes with its own set of challenges. In this article, we’ll explore how to set up PyTorch to use your own local GPU and deploy your trained models to Hugging Face, a popular platform for sharing and hosting machine-learning models.

Why Not Google Colab?

Google Colab is a popular choice for training deep learning models thanks to its free access to GPU resources. However, it comes with several limitations, at least in the free tier, that can hinder productivity and efficiency. One major issue is long training times, especially for large models and datasets. Colab’s GPUs are shared among multiple users, so you may not always get their full computational power, leading to slower training.

Another frustrating aspect of using Colab is the runtime disconnection issue. After a certain period of inactivity, Colab disconnects the runtime, causing your training progress to be lost. This can be particularly annoying when you’re in the middle of a long training session and have to start over from scratch.

Setting Up PyTorch to Use Your Local GPU

To overcome these limitations, you can leverage the power of your own local GPU for training PyTorch models. Here’s a step-by-step guide on how to set up PyTorch to use your local GPU:

Step 1: Install CUDA Toolkit
— Visit the NVIDIA CUDA Toolkit download page: https://developer.nvidia.com/cuda-downloads
— Select “Windows” as the operating system and choose the appropriate version and installer type.
— Download and run the installer, following the installation instructions.
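
Before moving on, it’s worth confirming the toolkit installed correctly. Assuming the installer and your NVIDIA driver are set up, both of these commands should succeed in a terminal:

nvidia-smi
nvcc --version

nvidia-smi reports the driver version and the GPUs it can see; nvcc --version prints the CUDA compiler version, which should match the toolkit you just installed.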

Step 2: Initialize and activate a virtual environment in your project

virtualenv venv
source venv/Scripts/activate

Note: recent versions of virtualenv no longer accept the --no-site-packages flag, since isolated environments are the default. On Linux/macOS the activation script is venv/bin/activate rather than venv/Scripts/activate.

Step 3: Install PyTorch with CUDA support
Run the following command to install PyTorch with CUDA support using pip3:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

Note: Replace cu121 with the CUDA version tag that matches your installed CUDA version (e.g., cu121 for CUDA 12.1, cu118 for CUDA 11.8).

Step 4: Verify the installation in your project
Run the following code to check if PyTorch is successfully using the GPU:

import torch
print(torch.cuda.is_available())

If the output is True, PyTorch is set up to use the GPU. Running torch.cuda.get_device_name(0) should also print the name of your GPU.
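
As a final sanity check, you can move a small tensor onto the GPU and confirm where it lives (a minimal sketch; the tensor shape is arbitrary):

import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
x = torch.rand(3, 3).to(device)  # allocate a random tensor and move it to the GPU
print(x.device)  # should print cuda:0 when the GPU is being used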

Step 5: Install other dependencies in your project

pip3 install fastai fastbook nbdev gradio

With these steps, you should be all set to work on your local machine and train PyTorch models using your own GPU.

Note: When executing a cell in VSCode for the first time, it might ask you to select an interpreter. Be sure to select the venv interpreter.
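
To tie the setup together, here is a minimal, hypothetical training sketch using fastai (the cat/dog classifier from the fastai course, trained on the Oxford-IIIT Pet dataset); it produces the my_model.pkl file used in the deployment section below:

from fastai.vision.all import *

def is_cat(f): return f[0].isupper()  # in this dataset, cat breeds are capitalized

# download the dataset and build the dataloaders
path = untar_data(URLs.PETS)/'images'
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(192))

# fine-tune a pretrained ResNet; fastai uses the GPU automatically when available
learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(1)

learn.export('my_model.pkl')  # serialize the trained model for deployment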

Deploying Models to Hugging Face

Once you have trained your PyTorch model locally, you can deploy it to Hugging Face for easy sharing and hosting. Go to https://huggingface.co/spaces/ and create a new Space, then clone it to your machine. Add your trained model (in .pkl format) and sample test images to the root directory of the Space. Finally, add an app.py file to the root directory; its contents follow after the clone example below.
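
Cloning a Space works like any other git repository; the username and Space name here are placeholders:

git clone https://huggingface.co/spaces/<username>/<space-name>

And here is app.py: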

from fastai.vision.all import *
import gradio as gr

my_labels = ( ... ) # list your custom labels here

model = load_learner('my_model.pkl') # load your trained model

def recognize_image(image):
    pred, idx, probs = model.predict(image)
    return dict(zip(my_labels, map(float, probs)))

# gr.inputs.Image and gr.outputs.Label were removed in newer Gradio releases;
# the current API exposes the components directly
image = gr.Image(width=192, height=192)
label = gr.Label(num_top_classes=5)

examples = [ ... ] # list your test images here

iface = gr.Interface(fn=recognize_image, inputs=image, outputs=label, examples=examples)
iface.launch(inline=False)
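
One more file is needed before pushing: a Gradio Space installs extra Python dependencies from a requirements.txt in the repository root, so add one listing at least fastai (Gradio itself is provided by the Space’s Gradio SDK):

fastai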

Now commit and push to the remote space and your model should be up and running! However, if you’re working on a Windows machine, you might encounter the following error when deploying to Hugging Face:

NotImplementedError: cannot instantiate 'WindowsPath' on your system

To resolve this issue, add the following code near the top of your app.py file, before load_learner is called:

import platform
import pathlib

# Hugging Face Spaces run on Linux, so alias WindowsPath to PosixPath
# to allow unpickling a model that was exported on Windows
if platform.system() == 'Linux':
    pathlib.WindowsPath = pathlib.PosixPath

This snippet checks whether the operating system is Linux (which Hugging Face’s servers run) and aliases pathlib.WindowsPath to pathlib.PosixPath, so that paths pickled on Windows can still be loaded.

Conclusion

Training PyTorch models on your local GPU offers several advantages over relying on cloud-based platforms like Google Colab. By following the steps outlined in this article, you can set up PyTorch to leverage the power of your own GPU, resulting in faster training times and eliminating the frustration of runtime disconnections. Additionally, deploying your trained models to Hugging Face allows for easy sharing and hosting, making your models accessible to a wider audience. With these techniques, you can streamline your deep learning workflow and focus on building innovative models without the hassle of cloud-based limitations.