Getting Started with Inference Providers
Hugging Face Inference Providers unifies 15+ inference partners under a single, OpenAI‑compatible endpoint.
Move from prototype to production with the same unified API and no infrastructure to manage.
Hugging Face Inference Partners
- Groq
- Novita
- Nebius AI
- Cerebras
- SambaNova
- Nscale
- fal
- Hyperbolic
- Together AI
- Fireworks
- Featherless AI
- Zai
- Replicate
- Cohere
- Scaleway
- Public AI
- OVHcloud AI Endpoints
- WaveSpeed
- HF Inference API
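Under the hood, every partner above is reached through the same base URL, https://router.huggingface.co/v1, with standard OpenAI-style routes, so no vendor-specific SDK is required. Here is a minimal sketch with plain requests (assuming your Hugging Face token is set in the HF_TOKEN environment variable) showing that a chat completion is just one POST to the standard path:
import os
import requests

# The router exposes the standard OpenAI REST paths under /v1,
# authenticated with a Hugging Face token.
response = requests.post(
    "/static-proxy?url=https%3A%2F%2Frouter.huggingface.co%2Fv1%2Fchat%2Fcompletions%26quot%3B,
    headers={"Authorization": f"Bearer {os.environ['HF_TOKEN']}"},
    json={
        "model": "moonshotai/Kimi-K2-Instruct-0905",
        "messages": [{"role": "user", "content": "Say hello!"}],
    },
)
print(response.json()["choices"][0]["message"]["content"])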
Your first LLM call
Now let's make your first inference request to an LLM through the official OpenAI Python SDK, using moonshotai/Kimi-K2-Instruct-0905. As above, the example reads your Hugging Face token from the HF_TOKEN environment variable.
import os
from openai import OpenAI

# The router speaks the OpenAI protocol, so the official OpenAI client
# works out of the box; only the base URL and the API key change.
client = OpenAI(
    base_url="/static-proxy?url=https%3A%2F%2Frouter.huggingface.co%2Fv1%26quot%3B,
    api_key=os.environ["HF_TOKEN"],
)

completion = client.chat.completions.create(
    model="moonshotai/Kimi-K2-Instruct-0905",
    messages=[
        {
            "role": "user",
            "content": "Summarize the plot of 'Matrix'."
        }
    ],
)

print(completion.choices[0].message)
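By default, the router picks an available provider for the requested model. You can also pin one of the partners above by appending its name to the model id with a colon. A short sketch, reusing the client from above and assuming Groq is currently among the providers serving this model (the model page on the Hub lists which partners actually host it):
completion = client.chat.completions.create(
    # The ":groq" suffix pins the request to Groq (assumption: Groq
    # currently serves this model; check the model page on the Hub).
    model="moonshotai/Kimi-K2-Instruct-0905:groq",
    messages=[
        {
            "role": "user",
            "content": "Summarize the plot of 'Matrix'."
        }
    ],
)
print(completion.choices[0].message)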
Generate an image
Next, let's generate an image with black-forest-labs/FLUX.1-dev, routed through the Together AI provider.
import os
from huggingface_hub import InferenceClient

# Pin the Together AI provider explicitly; authentication still uses
# the same Hugging Face token.
client = InferenceClient(
    provider="together",
    api_key=os.environ["HF_TOKEN"],
)

# output is a PIL.Image object
image = client.text_to_image(
    "A fantasy forest with glowing mushrooms",
    model="black-forest-labs/FLUX.1-dev",
)
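You don't have to hard-code a partner here either: InferenceClient also accepts provider="auto", which lets Hugging Face route the request to an available provider for the model. A sketch under that assumption (recent huggingface_hub releases support "auto"), also showing that the returned PIL image can be saved directly:
import os
from huggingface_hub import InferenceClient

# "auto" lets Hugging Face choose an available provider for the model,
# instead of pinning one explicitly.
client = InferenceClient(
    provider="auto",
    api_key=os.environ["HF_TOKEN"],
)

image = client.text_to_image(
    "A fantasy forest with glowing mushrooms",
    model="black-forest-labs/FLUX.1-dev",
)

# The result is a PIL.Image, so saving it is a single call.
image.save("fantasy_forest.png")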
Start using Inference Providers today
You can browse compatible models and run inference directly in their model card widgets.
Get PRO to instantly get 20x more included monthly credits and unlock pay-as-you-go billing!