The Inference API
DuguetLabs serves frontier open-source and proprietary models from a single OpenAI-compatible endpoint. Drop-in for any OpenAI or OpenRouter client — change the base URL, keep your code.
Base URL: https://api.duguetlabs.com/v1
Auth: Bearer dg_...
Format: OpenAI-compatible JSON / SSE

Quickstart
Three minutes from zero to your first token.
- Get a key. Sign up for a free API key (no card required). You'll receive $5 in prepaid credit.
- Pick a model. Browse the models catalogue. Start with duguet-ai/llama-3.1-8b ($0.05 / $0.08 per MTok) for day-to-day work.
- Fire a request. Your existing OpenAI SDK works. Just change the base URL.
curl https://api.duguetlabs.com/v1/chat/completions \
-H "Authorization: Bearer $DUGUET_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "duguet-ai/llama-3.1-8b",
"messages": [
{ "role": "user", "content": "Hello from DuguetLabs." }
]
}'
import os

from openai import OpenAI
client = OpenAI(
base_url="https://api.duguetlabs.com/v1",
api_key=os.environ["DUGUET_API_KEY"],
)
resp = client.chat.completions.create(
model="duguet-ai/llama-3.1-8b",
messages=[{"role": "user", "content": "Hello from DuguetLabs."}],
)
print(resp.choices[0].message.content)
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://api.duguetlabs.com/v1",
apiKey: process.env.DUGUET_API_KEY,
});
const resp = await client.chat.completions.create({
model: "duguet-ai/llama-3.1-8b",
messages: [{ role: "user", content: "Hello from DuguetLabs." }],
});
console.log(resp.choices[0].message.content);
Authentication
All requests must include an API key in the Authorization
header using the Bearer scheme. Keys start with dg_ and
are shown only once at signup, so save them securely.
Authorization: Bearer dg_177984c7a0e3ef28tHi0zHgQXdjYLHdLqm87M3kPmQ6OLSYf
Inspect a key's state (usage, remaining credit, call count) at
GET /v1/auth/key.
$ curl https://api.duguetlabs.com/v1/auth/key \
-H "Authorization: Bearer $DUGUET_API_KEY"
{
"data": {
"label": "you@company.com",
"usage": 0.0014,
"limit": 5.00,
"is_free_tier": true,
"rate_limit": { "requests": 500, "interval": "60s" },
"tokens_used": { "prompt": 2431, "completion": 802 },
"calls": 17
}
}
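The fields above are enough to track spend client-side. A minimal sketch, assuming the response shape shown (the helper name remaining_credit is ours, not part of the API):

```python
import json

def remaining_credit(key_info: dict) -> float:
    """Remaining prepaid credit from a GET /v1/auth/key response body."""
    data = key_info["data"]
    return round(data["limit"] - data["usage"], 6)

# Example payload matching the shape shown above.
payload = json.loads("""
{
  "data": {
    "label": "you@company.com",
    "usage": 0.0014,
    "limit": 5.00,
    "is_free_tier": true
  }
}
""")
print(remaining_credit(payload))  # 4.9986
```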
Chat completions
POST /v1/chat/completions — OpenAI-compatible. All
standard parameters are accepted: messages,
temperature, top_p, max_tokens,
stream, stop, seed,
frequency_penalty, presence_penalty.
The response includes OpenRouter-compatible extensions:
- provider — always "duguet-ai"
- native_finish_reason — upstream raw reason, preserved alongside the normalised finish_reason
- usage.cost — dollar cost of this call, computed from the per-model price
Request example — with tool use
{
"model": "duguet-ai/llama-3.3-70b",
"messages": [
{ "role": "system", "content": "You are a research assistant." },
{ "role": "user", "content": "Summarise the paper at arxiv.org/abs/2401.XXX" }
],
"temperature": 0.3,
"max_tokens": 1024,
"tools": [{
"type": "function",
"function": {
"name": "fetch_url",
"description": "Fetch the text of a URL",
"parameters": {
"type": "object",
"properties": { "url": { "type": "string" } },
"required": ["url"]
}
}
}]
}
Response
{
"id": "chatcmpl-9ecf…",
"object": "chat.completion",
"created": 1776293537,
"model": "duguet-ai/llama-3.3-70b",
"provider": "duguet-ai",
"choices": [{
"index": 0,
"message": {
"role": "assistant",
"content": "…"
},
"finish_reason": "stop",
"native_finish_reason": "stop"
}],
"usage": {
"prompt_tokens": 142,
"completion_tokens": 387,
"total_tokens": 529,
"cost": 0.000197,
"is_byok": false
}
}
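When the model answers with tool calls instead of content, the client is expected to execute the requested tools and send the results back as "tool" role messages, then call the API again. A hedged sketch of that loop, assuming a client configured as in the quickstart (the helper names run_with_tools, tool_result_message, and TOOL_IMPLS are ours, not part of the API):

```python
import json
import urllib.request

# Hypothetical local implementation of the fetch_url tool declared in the request.
def fetch_url(url: str) -> str:
    with urllib.request.urlopen(url, timeout=10) as r:
        return r.read().decode("utf-8", errors="replace")

# Dispatch table: tool name -> callable taking the parsed arguments dict.
TOOL_IMPLS = {"fetch_url": lambda args: fetch_url(args["url"])}

def tool_result_message(call) -> dict:
    """Run one tool call and wrap its output as a 'tool' role message."""
    args = json.loads(call.function.arguments)
    return {
        "role": "tool",
        "tool_call_id": call.id,
        "content": TOOL_IMPLS[call.function.name](args),
    }

def run_with_tools(client, model, messages, tools) -> str:
    """Loop until the model stops requesting tools, then return its text.

    'client' is any OpenAI-compatible client pointed at the base URL above.
    """
    while True:
        resp = client.chat.completions.create(
            model=model, messages=messages, tools=tools
        )
        msg = resp.choices[0].message
        if not msg.tool_calls:
            return msg.content
        messages.append(msg)  # keep the assistant turn that requested the tools
        messages.extend(tool_result_message(c) for c in msg.tool_calls)
```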
Streaming
Pass "stream": true to receive a Server-Sent-Events
stream of chat.completion.chunk objects. The stream
ends with data: [DONE]. All chunks carry
provider and the prefixed model name.
import os

from openai import OpenAI
client = OpenAI(
base_url="https://api.duguetlabs.com/v1",
api_key=os.environ["DUGUET_API_KEY"],
)
stream = client.chat.completions.create(
model="duguet-ai/mistral-large-3",
messages=[{"role": "user", "content": "Write a haiku about sovereignty."}],
stream=True,
)
for chunk in stream:
delta = chunk.choices[0].delta.content or ""
print(delta, end="", flush=True)
Embeddings
POST /v1/embeddings — OpenAI-compatible. Returns
vector representations suitable for semantic search and RAG.
curl https://api.duguetlabs.com/v1/embeddings \
-H "Authorization: Bearer $DUGUET_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "duguet-ai/nomic-embed",
"input": "Sovereignty means the compute stays where the data lives."
}'
Batching: input accepts an array of strings to
embed in one request.
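For semantic search, the usual pattern is to embed a batch of documents plus the query, then rank by cosine similarity. A minimal sketch (embed_batch and cosine are illustrative helpers; in the response, the vectors live at resp.data[i].embedding, in input order):

```python
import math

def cosine(a, b) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def embed_batch(client, texts):
    """One request, one vector per input string, returned in order."""
    resp = client.embeddings.create(model="duguet-ai/nomic-embed", input=texts)
    return [item.embedding for item in resp.data]
```

To rank documents against a query: embed them all in one call, then sort by cosine(query_vec, doc_vec) descending.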
Models
Thirteen carefully chosen SKUs: five frontier open-source models, three
proprietary, four self-hosted on our sovereign A100, and one embedding model.
The full live list is at GET /v1/models.
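Since the endpoint is OpenAI-compatible, the live list is also reachable through the SDK's models API. A small sketch, assuming a client configured as in the quickstart (list_model_ids is our helper name):

```python
# "client" is any OpenAI-compatible client pointed at the base URL above.
def list_model_ids(client):
    """Return the ids of all currently served models (GET /v1/models)."""
    return sorted(m.id for m in client.models.list().data)
```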
Rate limits
- Free tier: 500 requests / minute per key, across all models.
- Paid accounts: bespoke, configured per contract.
- Signup rate limit: 3 signups / minute per source IP.
When a request would exceed the limit, the API returns HTTP
429 with a Retry-After header.
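A well-behaved client honours Retry-After when it is present and falls back to exponential backoff with jitter otherwise. A minimal sketch (retry_after_seconds and call_with_retries are our helper names; send() stands in for whatever HTTP call your client makes):

```python
import random
import time

def retry_after_seconds(headers: dict, attempt: int, cap: float = 60.0) -> float:
    """Honour a Retry-After header; else exponential backoff with jitter."""
    ra = headers.get("Retry-After")
    if ra is not None:
        return float(ra)
    return min(cap, (2 ** attempt) + random.random())

def call_with_retries(send, max_attempts: int = 5):
    """send() returns (status, headers, body); retry only on HTTP 429."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        time.sleep(retry_after_seconds(headers, attempt))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```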
Errors
All errors follow the OpenAI shape:
{
"error": {
"message": "Invalid or missing API key",
"type": "authentication_error"
}
}
| Status | Type | When |
|---|---|---|
| 400 | invalid_request_error | Malformed body or unsupported parameter. |
| 401 | authentication_error | Missing, invalid, or disabled API key. |
| 402 | insufficient_quota | Credit exhausted. Top up via email for now. |
| 404 | model_not_found | Unknown model id. See /v1/models. |
| 429 | rate_limit_exceeded | Slow down. |
| 5xx | upstream_error | A backend provider is misbehaving. We'll tell you which. |
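Because every error body follows the same shape, client-side handling reduces to one small parser over the table above. A minimal sketch (classify_error is our helper name, not part of the API):

```python
# Fallback mapping for responses whose body could not be parsed.
STATUS_TYPES = {
    400: "invalid_request_error",
    401: "authentication_error",
    402: "insufficient_quota",
    404: "model_not_found",
    429: "rate_limit_exceeded",
}

def classify_error(status: int, body: dict):
    """Return (error_type, message) from an HTTP status and parsed body."""
    err = body.get("error", {})
    etype = err.get("type") or STATUS_TYPES.get(status, "upstream_error")
    return etype, err.get("message", "")
```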
OpenRouter compatibility
Every response body mirrors OpenRouter's extensions: provider,
native_finish_reason, usage.cost,
usage.is_byok. Model IDs use the duguet-ai/
prefix (drop the prefix, we still recognise the bare name).
If you're migrating from OpenRouter, set
base_url to https://api.duguetlabs.com/v1
and adjust model names. That's the whole migration.
Support
One inbox. Answered by the person who wrote this.