Transcription API
Launch a persistent Whisper endpoint for audio-to-text workloads using badgr serve --image. The container stays running, so every request hits a warm model with no cold start.
Prerequisites: run npm install -g badgr-cli, then badgr login. See the setup guide for details.
1. Start the endpoint
Serve a Whisper API container. Pass model size and any other config via --env:
badgr serve \
  --image ghcr.io/my-org/whisper-api:latest \
  --gpu RTX_4090 \
  --env MODEL=large-v3 \
  --env LANGUAGE=en
The RTX 4090 has 24 GB of VRAM, which is enough to hold Whisper large-v3, and it is the most cost-effective option. Badgr prints the endpoint URL once the model is loaded (usually under 2 minutes).
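Because the URL is only useful once the model has finished loading, a client script may want to wait before sending audio. The following is a minimal sketch of such a polling loop, not something badgr provides: wait_until_ready and the probe callable are hypothetical names, and probe stands in for whatever readiness check your container supports (this recipe does not specify one). The 120-second default simply mirrors the "under 2 minutes" estimate above.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 120.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call probe() until it returns True or timeout_s has elapsed.

    probe is a hypothetical readiness check supplied by the caller.
    Returns True if the endpoint became ready, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            if probe():
                return True
        except Exception:
            # Connection errors while the container starts count as "not ready yet".
            pass
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Passing sleep and clock as parameters keeps the loop easy to test without real waiting; in normal use the defaults are enough.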
2. Transcribe audio
If your container follows the OpenAI audio transcriptions API shape, any OpenAI client works:
Python
from openai import OpenAI

client = OpenAI(
    api_key="your-badgr-api-key",
    base_url="https://dep-a1b2c3.api.badgr.ai/v1",  # printed by badgr serve
)

with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=f,
        response_format="text",
    )

print(transcript)
Node.js
import OpenAI from "openai";
import fs from "fs";

const client = new OpenAI({
  apiKey: process.env.BADGR_API_KEY,
  baseURL: "https://dep-a1b2c3.api.badgr.ai/v1",
});

const transcript = await client.audio.transcriptions.create({
  model: "whisper-1",
  file: fs.createReadStream("audio.mp3"),
  response_format: "text",
});

console.log(transcript);
curl
curl https://dep-a1b2c3.api.badgr.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer $BADGR_API_KEY" \
  -F model=whisper-1 \
  -F file=@audio.mp3 \
  -F response_format=text
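Since the endpoint stays warm, it is natural to push many files through a single client. The sketch below assumes the same OpenAI-compatible shape as the examples above and that one client object can be shared across threads. transcribe_file, transcribe_all, and max_workers are illustrative names, not part of badgr or the OpenAI SDK.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Any, Dict, Iterable


def transcribe_file(client: Any, path: Path, model: str = "whisper-1") -> str:
    """Send one audio file to the endpoint and return the plain-text transcript."""
    with path.open("rb") as f:
        return client.audio.transcriptions.create(
            model=model,
            file=f,
            response_format="text",
        )


def transcribe_all(
    client: Any, paths: Iterable[Path], max_workers: int = 4
) -> Dict[str, str]:
    """Transcribe several files concurrently; returns {filename: transcript}."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with paths.
        texts = list(pool.map(lambda p: transcribe_file(client, p), paths))
    return {p.name: t for p, t in zip(paths, texts)}
```

With the client from the Python example, a call might look like transcribe_all(client, Path("recordings").glob("*.mp3")), where the recordings directory is a placeholder. Raising max_workers only helps if the container can process requests in parallel or if several instances are running.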
3. Stop billing
badgr down <deployment-id>
This terminates the GPU instance and stops billing. Run badgr receipts afterwards to see what the deployment cost.
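For a rough upper bound before the receipt is available, the --max-price and --max-cost limits listed under Options can be combined with simple arithmetic. This sketch assumes spend accrues linearly at the hourly price for each instance, which this recipe does not state, so treat the result as an estimate. The function name is illustrative.

```python
def max_runtime_hours(
    max_cost_usd: float, price_per_hour_usd: float, count: int = 1
) -> float:
    """Hours until total spend reaches max_cost_usd.

    Assumes linear billing at price_per_hour_usd per instance (an assumption,
    not documented behaviour).
    """
    if price_per_hour_usd <= 0 or count < 1:
        raise ValueError("price must be positive and count at least 1")
    return max_cost_usd / (price_per_hour_usd * count)


# With --max-cost 5.00 and instances priced at the --max-price cap of 2.00 USD/hr:
print(max_runtime_hours(5.00, 2.00))           # 2.5
print(max_runtime_hours(5.00, 2.00, count=2))  # 1.25
```

Under those assumptions, a single instance at the price cap would be stopped automatically after about two and a half hours.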
Options
--image <ref>       Your Docker image with faster-whisper or whisper.cpp
--gpu <type>        RTX_4090 is the cost-effective choice for Whisper large-v3
--env KEY=VALUE     Set environment variables (MODEL, LANGUAGE, etc.); repeatable
--count <n>         Multiple instances for higher concurrency
--max-price 2.00    Hard price cap in USD/hr
--max-cost 5.00     Auto-stop when total spend reaches this amount
--no-wait           Return immediately; use badgr logs to confirm startup
Next steps