
Building and Deploying a Hugging Face Model with Docker

Objective

By the end of this lesson, you will be able to build and deploy a Hugging Face model for natural language processing (NLP) tasks using Docker. We’ll use Python and the Hugging Face Transformers library for building the model and Docker for deploying it.

Prerequisites

  • Basic understanding of Python.
  • Familiarity with machine learning concepts.
  • An environment with Python 3.8+ installed.
  • Docker installed on your system.

Step 1: Setting Up the Environment

  1. Create a virtual environment:
   python -m venv hf_env
   source hf_env/bin/activate  # On Windows use `hf_env\Scripts\activate`
  2. Install the necessary libraries (you only need one of the two backends; a requirements.txt sketch follows this list):
   pip install transformers
   pip install torch  # If using PyTorch as the backend
   pip install tensorflow  # If using TensorFlow as the backend
   pip install datasets
   pip install fastapi
   pip install "uvicorn[standard]"
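
If you prefer a single, reproducible install, the same packages can live in a requirements.txt file. This is a minimal sketch; the file name and the unpinned versions are choices of this example, and you only need one of the two backends:

   # requirements.txt
   transformers
   torch              # or tensorflow, if you use that backend instead
   datasets
   fastapi
   uvicorn[standard]

Install everything at once with pip install -r requirements.txt.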

Step 2: Loading and Fine-Tuning a Pre-trained Model

  1. Choose a pre-trained model:
   from transformers import AutoModelForSequenceClassification, AutoTokenizer

   model_name = "distilbert-base-uncased-finetuned-sst-2-english"
   model = AutoModelForSequenceClassification.from_pretrained(model_name)
   tokenizer = AutoTokenizer.from_pretrained(model_name)
  2. Prepare the dataset:
   from datasets import load_dataset

   dataset = load_dataset("imdb")
  3. Tokenize the dataset:
   def tokenize_function(examples):
       return tokenizer(examples["text"], padding="max_length", truncation=True)

   tokenized_datasets = dataset.map(tokenize_function, batched=True)
  4. Fine-tune the model (an evaluation-metrics sketch follows this list):
   from transformers import Trainer, TrainingArguments

   training_args = TrainingArguments(
       output_dir="./results",
       evaluation_strategy="epoch",
       learning_rate=2e-5,
       per_device_train_batch_size=16,
       per_device_eval_batch_size=16,
       num_train_epochs=3,
       weight_decay=0.01,
   )

   trainer = Trainer(
       model=model,
       args=training_args,
       train_dataset=tokenized_datasets["train"],
       eval_dataset=tokenized_datasets["test"],
   )

   trainer.train()
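
To track accuracy during evaluation, you can pass a compute_metrics function to the Trainer. This is a minimal sketch using plain NumPy; the function name and the choice to report only accuracy are assumptions of this example:

   import numpy as np

   def compute_metrics(eval_pred):
       # The Trainer hands over the model's logits and the reference labels
       logits, labels = eval_pred
       predictions = np.argmax(logits, axis=-1)
       return {"accuracy": (predictions == labels).mean()}

   trainer = Trainer(
       model=model,
       args=training_args,
       train_dataset=tokenized_datasets["train"],
       eval_dataset=tokenized_datasets["test"],
       compute_metrics=compute_metrics,
   )

If a full fine-tuning run is too slow for a first pass, you can also train on a subset, for example tokenized_datasets["train"].shuffle(seed=42).select(range(2000)).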

Step 3: Saving the Model

  1. Save the model and tokenizer (a quick reload check follows this list):
   model.save_pretrained("./model")
   tokenizer.save_pretrained("./model")
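
To confirm the saved files load correctly, you can reload them through a pipeline before building the API; the example sentence is just an illustration:

   from transformers import pipeline

   # Reload the model and tokenizer from the directory we just saved
   classifier = pipeline("text-classification", model="./model", tokenizer="./model")
   print(classifier("This movie was surprisingly good."))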

Step 4: Creating an API with FastAPI

  1. Set up FastAPI:
   mkdir my_api
   cd my_api
   touch main.py
  2. Write the FastAPI code in main.py (an optional typed-input variant follows this list):
   from fastapi import FastAPI, Request
   from transformers import AutoModelForSequenceClassification, AutoTokenizer
   import torch

   app = FastAPI()

   model = AutoModelForSequenceClassification.from_pretrained("./model")
   tokenizer = AutoTokenizer.from_pretrained("./model")

   @app.post("/predict")
   async def predict(request: Request):
       input_data = await request.json()
       inputs = tokenizer(input_data['text'], return_tensors="pt", padding=True, truncation=True)
       with torch.no_grad():
           outputs = model(**inputs)
       predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
       return {"predictions": predictions.tolist()}

Step 5: Creating a Dockerfile

  1. Create a Dockerfile next to main.py, and copy the saved model directory into the same folder so that COPY . /app includes it (a layer-caching variant follows this list):
   # Use the official Python image from the Docker Hub
   FROM python:3.8-slim

   # Set the working directory
   WORKDIR /app

   # Copy the current directory contents into the container at /app
   COPY . /app

   # Install the necessary libraries
   RUN pip install --no-cache-dir transformers torch fastapi "uvicorn[standard]"

   # Expose the port the app runs on
   EXPOSE 8000

   # Run the application
   CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
  2. Build the Docker image:
   docker build -t huggingface-api .
  3. Run the Docker container:
   docker run -p 8000:8000 huggingface-api
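
A common refinement, assuming you created the requirements.txt sketched in Step 1, is to install the dependencies before copying the rest of the code so Docker can cache the heavy install layer between code changes:

   FROM python:3.8-slim

   WORKDIR /app

   # Installing dependencies first lets Docker reuse this layer when only the code changes
   COPY requirements.txt .
   RUN pip install --no-cache-dir -r requirements.txt

   # Copy the application code and the saved model afterwards
   COPY . /app

   EXPOSE 8000
   CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]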

Step 6: Testing the API

  1. Send a request to the API (a Python alternative follows this list):
   curl -X 'POST' \
     'http://127.0.0.1:8000/predict' \
     -H 'accept: application/json' \
     -H 'Content-Type: application/json' \
     -d '{"text": "I love using Hugging Face models!"}'
  2. Check the response:
    The response should contain the prediction probabilities for the given input text.
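
If you prefer to test from Python instead of curl, the requests library can send the same payload; the URL and example sentence mirror the curl call above:

   import requests

   # Send the same JSON payload the curl example uses
   response = requests.post(
       "http://127.0.0.1:8000/predict",
       json={"text": "I love using Hugging Face models!"},
   )
   print(response.json())  # {"predictions": [[...]]}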

Conclusion

In this lesson, we’ve walked through the steps to build and deploy a Hugging Face model using Docker. You learned how to set up your environment, fine-tune a pre-trained model, save it, create an API using FastAPI, containerize it with Docker, and test the deployed model.
