Hugging Face Inference API
Hugging Face is an open source hub for AI/ML models and tools. With over 100,000 machine learning models available, Hugging Face provides a great way to integrate specialized AI & ML tasks into your application.
There are 3 ways to use Hugging Face models in your application:
- Use the Transformers Python library to perform inference in a Python backend.
- Generate embeddings directly in Edge Functions using Transformers.js.
- Use Hugging Face's hosted Inference API to execute AI tasks remotely on Hugging Face servers. This guide will walk you through this approach.
AI tasks
Below are some of the types of tasks you can perform with Hugging Face:
Natural language
Computer vision
- Image to text
- Text to image
- Image classification
- Video classification
- Object detection
- Image segmentation
Audio
See a full list of tasks.
Access token
First generate a Hugging Face access token for your app:
https://huggingface.co/settings/tokens
Name your token after the app it's being used for and the environment. For example, if you are building an image generation app you might create 2 tokens:
- "My Image Generator (Dev)"
- "My Image Generator (Prod)"
Since we will be using this token for the inference API, choose the read role.
Though it is possible to use the Hugging Face inference API today without an access token, you may be rate limited.
To ensure you don't experience any unexpected downtime or errors, we recommend creating an access token.
Edge Functions
Edge Functions are server-side TypeScript functions that run on-demand. Since Edge Functions run on a server, you can safely give them access to your Hugging Face access token.
You will need the datafuse CLI installed for the following commands to work.
To create a new Edge Function, navigate to your local project and initialize Datafuse if you haven't already:
datafuse init
Then create an Edge Function:
datafuse functions new text-to-image
Create a file called .env.local to store your Hugging Face access token:
HUGGING_FACE_ACCESS_TOKEN=<your-token-here>
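If the variable is not set, Deno.env.get() returns undefined rather than throwing, so a missing token can go unnoticed until requests start failing or being rate limited. A minimal sketch of failing fast instead, assuming a hypothetical helper named requireEnv (it is not part of Deno or the Hugging Face client). The lookup is passed in as a parameter so the helper can be exercised without a real environment:

```typescript
// Hypothetical helper: look up a required environment variable and fail
// fast with a clear message when it is missing or blank.
type EnvLookup = (name: string) => string | undefined

function requireEnv(name: string, lookup: EnvLookup): string {
  const value = lookup(name)
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`)
  }
  return value
}

// In an Edge Function the lookup would be Deno's own accessor, for example:
// const token = requireEnv('HUGGING_FACE_ACCESS_TOKEN', (n) => Deno.env.get(n))
```

Whether to throw at startup or return an error response per request is a design choice; throwing early tends to surface a misconfigured .env.local file sooner.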
Let's modify the Edge Function to import Hugging Face's inference client and perform a text-to-image request:
import { serve } from 'https://deno.land/std@0.168.0/http/server.ts'
import { HfInference } from 'https://esm.sh/@huggingface/inference@2.3.2'

const hf = new HfInference(Deno.env.get('HUGGING_FACE_ACCESS_TOKEN'))

serve(async (req) => {
  const { prompt } = await req.json()

  const image = await hf.textToImage(
    {
      inputs: prompt,
      model: 'stabilityai/stable-diffusion-2',
    },
    {
      use_cache: false,
    }
  )

  return new Response(image)
})
- This function creates a new instance of HfInference using the HUGGING_FACE_ACCESS_TOKEN environment variable.
- It expects a POST request that includes a JSON request body. The JSON body should include a parameter called prompt that represents the text-to-image prompt that we will pass to Hugging Face's inference API.
- Next we call textToImage(), passing in the user's prompt along with the model that we would like to use for the image generation. Today Hugging Face recommends stabilityai/stable-diffusion-2, but you can change this to any other text-to-image model. You can see a list of which models are supported for each task by navigating to their models page and filtering by task.
- We set use_cache to false so that repeat queries with the same prompt will produce new images. If the task and model you are using are deterministic (will always produce the same result based on the same input), consider setting use_cache to true for faster responses.
- The image result returned from the API will be a Blob. We can pass the Blob directly into a new Response(), which will automatically set the content type and body of the response from the image.
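The function above destructures prompt from the body without checking it. A minimal sketch of validating the parsed JSON before calling textToImage(), assuming a hypothetical helper named parsePrompt and an illustrative length limit (MAX_PROMPT_LENGTH is an assumption for this example, not a Hugging Face requirement):

```typescript
// Result of validating the JSON body: either a usable prompt or an error message.
type PromptResult =
  | { ok: true; prompt: string }
  | { ok: false; error: string }

// Hypothetical limit chosen for illustration only.
const MAX_PROMPT_LENGTH = 500

// Hypothetical helper: check that the parsed body is an object with a
// non-empty string "prompt" and return the trimmed prompt.
function parsePrompt(body: unknown): PromptResult {
  if (typeof body !== 'object' || body === null) {
    return { ok: false, error: 'Request body must be a JSON object' }
  }
  const prompt = (body as Record<string, unknown>).prompt
  if (typeof prompt !== 'string' || prompt.trim() === '') {
    return { ok: false, error: '"prompt" must be a non-empty string' }
  }
  if (prompt.length > MAX_PROMPT_LENGTH) {
    return { ok: false, error: `"prompt" must be at most ${MAX_PROMPT_LENGTH} characters` }
  }
  return { ok: true, prompt: prompt.trim() }
}
```

Inside the handler, the result of parsePrompt(await req.json()) could be checked first, returning a 400 response with the error message when ok is false and passing prompt on to textToImage() otherwise.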
Finally let's serve the Edge Function locally to test it:
datafuse functions serve --env-file .env.local --no-verify-jwt
Remember to pass in the .env.local file using the --env-file parameter so that the Edge Function can access the HUGGING_FACE_ACCESS_TOKEN.
For demo purposes we set --no-verify-jwt to make it easy to test the Edge Function without passing in a JWT. In a real application you will need to pass the JWT as a Bearer token in the Authorization header.
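As a rough illustration of what that looks like from a client, here is a minimal sketch that builds the request with the JWT attached. The helper name buildInvokeRequest and its parameters are hypothetical; the /functions/v1/text-to-image path is assumed to match the function created in this guide, and how you obtain the JWT depends on your auth setup:

```typescript
// Plain description of an HTTP request, shaped so it can be passed to fetch().
interface InvokeRequest {
  url: string
  init: { method: string; headers: Record<string, string>; body: string }
}

// Hypothetical helper: build a POST request for the text-to-image function,
// sending the caller's JWT as a Bearer token in the Authorization header.
function buildInvokeRequest(baseUrl: string, jwt: string, prompt: string): InvokeRequest {
  return {
    // Strip trailing slashes so the path is joined with exactly one "/".
    url: `${baseUrl.replace(/\/+$/, '')}/functions/v1/text-to-image`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${jwt}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ prompt }),
    },
  }
}

// Example usage (not executed here):
// const { url, init } = buildInvokeRequest('http://localhost:54321', userJwt, 'Llama wearing sunglasses')
// const image = await (await fetch(url, init)).blob()
```

Keeping the request construction separate from the fetch() call makes the header logic easy to check without a running server.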
At this point, you can make an API request to your Edge Function using your preferred frontend framework (Next.js, React, Expo, etc.). We can also test from the terminal using curl:
curl --output result.jpg --location --request POST 'http://localhost:54321/functions/v1/text-to-image' \
--header 'Content-Type: application/json' \
--data '{"prompt":"Llama wearing sunglasses"}'
In this example, your generated image will be saved to result.jpg.
Next steps
You can now create an Edge Function that invokes a Hugging Face task using your model of choice.
Try running some other AI tasks.
Resources
- Official Hugging Face site.
- Official Hugging Face JS docs.
- Generate image captions using Hugging Face.