Provider Guide
OpenMontage connects to external providers only through dedicated tools. The agent never calls general-purpose LLM APIs at runtime. Each provider is configured via environment variables in .env. Keys are optional; add only those you need. The tool registry discovers available providers automatically.
See Configuring Providers for full .env setup and Tool Registry for how selectors route requests.
Quick Start
Start with free options and add paid providers as needed.
| Step | Cost | Provider | Unlocks |
|---|---|---|---|
| 1 | $0 | Pexels + Pixabay | Stock photos and videos |
| 2 | $0 | Google | TTS (700+ voices, 1M chars/month free) + Imagen images + $300 new-account credit |
| 3 | $0 | ElevenLabs | Premium TTS + music + SFX (10K chars/month free) |
| 4 | $0 | Piper (local) | Fully offline TTS |
| 5 | ~$0.03/image | fal.ai | FLUX images + Kling/Veo/MiniMax video + Recraft |
| 6 | ~$0.04/image | OpenAI | DALL-E 3 images + TTS |
| 7 | $12/month | Runway | Gen-4 video |
| 8 | pay-as-you-go | HeyGen | Avatar and multi-model video |
| 9 | pay-as-you-go | Suno | Full song generation |
| 10 | $0 + GPU | Local video | WAN, Hunyuan, CogVideo, LTX |
Environment Variables
Copy the template and edit:
cp .env.example .env
Key variables (add only what you use):
# Free stock
PEXELS_API_KEY=
PIXABAY_API_KEY=
# Google (TTS + Imagen)
GOOGLE_API_KEY=
# Voice and music
ELEVENLABS_API_KEY=
OPENAI_API_KEY=
XAI_API_KEY=
DOUBAO_SPEECH_API_KEY=
DOUBAO_SPEECH_VOICE_TYPE=zh_female_vv_uranus_bigtts
# Multi-model gateway
FAL_KEY=
# Video
HEYGEN_API_KEY=
RUNWAY_API_KEY=
SUNO_API_KEY=
# Local GPU
VIDEO_GEN_LOCAL_ENABLED=true
VIDEO_GEN_LOCAL_MODEL=wan2.1-1.3b
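Before running the registry, it can help to see which of the keys above are actually set. The following is a minimal sketch, not part of OpenMontage: the function name configured_keys and the grouping dictionary are illustrative assumptions, while the variable names come from the template above. It reads the process environment, so it assumes .env has already been loaded (by your shell or a dotenv loader).

```python
import os

# Variable names are taken from the template above.
# The grouping and the function name are illustrative, not OpenMontage APIs.
PROVIDER_KEYS = {
    "stock": ["PEXELS_API_KEY", "PIXABAY_API_KEY"],
    "google": ["GOOGLE_API_KEY"],
    "voice_and_music": [
        "ELEVENLABS_API_KEY",
        "OPENAI_API_KEY",
        "XAI_API_KEY",
        "DOUBAO_SPEECH_API_KEY",
    ],
    "gateway": ["FAL_KEY"],
    "video": ["HEYGEN_API_KEY", "RUNWAY_API_KEY", "SUNO_API_KEY"],
}


def configured_keys(env=None):
    """Return {group: [key names that have a non-empty value]}."""
    env = os.environ if env is None else env
    return {
        group: [name for name in names if env.get(name, "").strip()]
        for group, names in PROVIDER_KEYS.items()
    }


if __name__ == "__main__":
    for group, names in configured_keys().items():
        print(f"{group}: {', '.join(names) if names else '(none set)'}")
```

An empty or whitespace-only value counts as unset, which matches the blank entries in the template.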
Cloud Providers
xAI (Grok)
Tools unlocked: grok_image, grok_video
Setup:
- Create an xAI developer account.
- Generate an API key in the developer console.
- Add XAI_API_KEY=xai-... to .env.
Best for: Image editing, style transfer, short reference-image video.
Pricing:
- grok-imagine-image: $0.02 per image
- Input images (edits): $0.002 per image
- Video at 480p: $0.05/sec
- Video at 720p: $0.07/sec
fal.ai
Tools unlocked: flux_image, recraft_image, kling_video, veo_video, minimax_video
Setup:
- Sign up at fal.ai (GitHub or Google).
- Create a key at fal.ai/dashboard/keys.
- Add FAL_KEY=your-key-here to .env.
Pricing (pay-as-you-go):
- FLUX Pro v1.1: $0.05/image (20 images per $1)
- FLUX Dev: $0.03/image (33 images per $1)
- Kling 2.5 Turbo Pro: $0.07/sec (14 seconds per $1)
- Veo 3: $0.40/sec (2.5 seconds per $1)
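The per-dollar figures in parentheses follow directly from the unit prices. As a rough planning aid, the sketch below reproduces that arithmetic. The prices are copied from the list above and may change; the names FAL_PRICES, estimate_cost, and units_per_dollar are hypothetical helpers, not OpenMontage or fal.ai APIs.

```python
# Unit prices in USD, copied from the list above; the unit is "image" or "second".
# These helper names are hypothetical and exist only for this sketch.
FAL_PRICES = {
    "FLUX Pro v1.1": (0.05, "image"),
    "FLUX Dev": (0.03, "image"),
    "Kling 2.5 Turbo Pro": (0.07, "second"),
    "Veo 3": (0.40, "second"),
}


def estimate_cost(model, units):
    """Estimated USD cost for `units` images or seconds of the given model."""
    price, _unit = FAL_PRICES[model]
    return round(price * units, 4)


def units_per_dollar(model, budget=1.0):
    """How many images or seconds a budget buys at the listed price."""
    price, _unit = FAL_PRICES[model]
    return budget / price


if __name__ == "__main__":
    for model, (_price, unit) in FAL_PRICES.items():
        print(f"{model}: {units_per_dollar(model):.1f} {unit}s per $1")
```

For example, estimate_cost("Veo 3", 8) returns 3.2, so an 8-second Veo 3 clip costs about $3.20 at the listed rate.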
ElevenLabs
Tools unlocked: elevenlabs_tts, music_gen
Setup:
- Sign up at elevenlabs.io.
- Go to Profile > API Keys and create a key.
- Add ELEVENLABS_API_KEY=xi_your-key-here to .env.
Pricing:
- Free: 10,000 characters/month (2-3 minutes narration)
- Starter: $5/mo for 30,000 characters
Google
Tools unlocked: google_tts, google_imagen
Setup:
- Go to Google AI Studio and create an API key.
- Enable Text-to-Speech API and Generative Language API in the Cloud console.
- Add GOOGLE_API_KEY=AIza... to .env.
TTS pricing (free tier 1M chars/month per voice type):
- Standard: $4.00 per 1M chars after free tier
- WaveNet/Neural2: $16.00 per 1M chars
Imagen pricing:
- Fast: $0.02/image
- Standard: $0.04/image
New accounts receive $300 in credits.
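To see how the TTS free tier affects a monthly bill, the sketch below charges only for characters beyond the free allowance. It assumes the free tier is consumed first and that usage beyond it is billed linearly at the per-million rate listed above. The function name google_tts_cost is a hypothetical helper, and the $300 credit is not modeled.

```python
def google_tts_cost(chars, rate_per_million, free_chars=1_000_000):
    """Estimated monthly USD cost for one voice type.

    Assumption: the first `free_chars` characters are free and the
    remainder is billed linearly at `rate_per_million` USD per 1M chars.
    """
    billable = max(0, chars - free_chars)
    return billable * rate_per_million / 1_000_000


if __name__ == "__main__":
    # Rates copied from the list above.
    print("Standard, 800K chars:", google_tts_cost(800_000, 4.00))
    print("WaveNet/Neural2, 1.5M chars:", google_tts_cost(1_500_000, 16.00))
```

Under these assumptions, 1.5M WaveNet/Neural2 characters in a month leave 500K billable characters, or $8.00.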
OpenAI
Tools unlocked: openai_tts, openai_image
Setup:
- Create an account at platform.openai.com.
- Add a payment method.
- Create a secret key and add OPENAI_API_KEY=sk-... to .env.
Pricing:
- TTS: $15.00 per 1M characters (tts-1)
- DALL-E 3: $0.040 per 1024x1024 standard image
Runway
Tools unlocked: runway_video
Setup:
- Create a developer account at dev.runwayml.com.
- Subscribe to Standard or higher.
- Generate an API key and add RUNWAY_API_KEY=key_... to .env.
Pricing:
- Standard: $12/mo for 625 credits (~25 seconds Gen-4)
- Gen-4 Turbo: ~$0.05 per second
Free tier: 125 one-time credits.
HeyGen
Tools unlocked: heygen_video
Setup:
- Register at app.heygen.com.
- Generate an API key in settings.
- Add prepaid balance, then add HEYGEN_API_KEY=your-key-here to .env.
Pricing:
- Avatar video (Engine III): $0.017/sec
- Prompt to Video: $0.033/sec
Suno
Tools unlocked: suno_music
Setup:
- Create a Suno account.
- Obtain an API key from sunoapi.org.
- Add SUNO_API_KEY=your-key-here to .env.
Pricing: 1 credit = $0.005. Free tier: 50 credits/day (non-commercial).
Pexels and Pixabay
Tools unlocked: pexels_image, pexels_video, pixabay_image, pixabay_video
Setup:
- Create free accounts at pexels.com and pixabay.com.
- Copy the displayed API keys.
- Add PEXELS_API_KEY= and PIXABAY_API_KEY= to .env.
Pricing: Completely free. No attribution required. Commercial use allowed.
Local Options
Local providers require no API keys and incur no cost.
Piper TTS
Tool: piper_tts
Setup:
pip install piper-tts
Download a voice model on first use (e.g., en_US-lessac-medium). Fully offline.
Remotion
Tool: video_compose (auto-routes when needed)
Setup:
cd remotion-composer && npm install
Requires Node.js 18+. Renders animated text cards, charts, and image scenes.
HyperFrames
Tool: hyperframes_compose or video_compose with render_runtime=hyperframes
Setup:
npx --yes hyperframes doctor
Requires Node.js ≥ 22 and FFmpeg. Used for kinetic typography and character rigs.
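If you want to confirm the two prerequisites yourself, independently of the doctor command, a minimal sketch follows. It only checks that node and ffmpeg are on PATH and that the Node.js major version meets the minimum. The function names are hypothetical, and this is not a description of what hyperframes doctor does internally.

```python
import re
import shutil
import subprocess


def parse_node_major(version_output):
    """Extract the major version from `node --version` output such as 'v22.3.0'."""
    match = re.match(r"v?(\d+)\.", version_output.strip())
    return int(match.group(1)) if match else None


def check_runtime(min_node_major=22):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    node = shutil.which("node")
    if node is None:
        problems.append("node not found on PATH")
    else:
        out = subprocess.run(
            [node, "--version"], capture_output=True, text=True
        ).stdout
        major = parse_node_major(out)
        if major is None or major < min_node_major:
            found = out.strip() or "unknown"
            problems.append(f"Node.js {min_node_major}+ required, found {found}")
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_runtime()
    print("OK" if not issues else "\n".join(issues))
```

Calling check_runtime(min_node_major=18) applies the lower Node.js requirement stated for Remotion, although that check also reports a missing FFmpeg.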
Local Video Generation
Tools: wan_video, hunyuan_video, cogvideo_video, ltx_video_local
Setup:
make install-gpu
Then set in .env:
VIDEO_GEN_LOCAL_ENABLED=true
VIDEO_GEN_LOCAL_MODEL=wan2.1-1.3b
Requires NVIDIA GPU. Models vary by VRAM (6GB+ for entry-level).
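To check whether a machine clears the entry-level VRAM bar before enabling local generation, the sketch below queries nvidia-smi for total memory per GPU. The helper names are hypothetical, and treating "6GB" as 6 × 1024 MiB is an assumption; actual requirements depend on the model you select.

```python
import shutil
import subprocess


def parse_vram_mib(nvidia_smi_output):
    """Parse one integer MiB value per line, as printed by the query below."""
    return [
        int(line)
        for line in nvidia_smi_output.splitlines()
        if line.strip().isdigit()
    ]


def gpu_vram_mib():
    """Total memory (MiB) of each visible NVIDIA GPU, or [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
    )
    return parse_vram_mib(result.stdout) if result.returncode == 0 else []


def meets_entry_level(vram_mib, min_gib=6):
    """True if any GPU has at least `min_gib` GiB (assumed threshold)."""
    return any(mib >= min_gib * 1024 for mib in vram_mib)


if __name__ == "__main__":
    vram = gpu_vram_mib()
    print("GPUs (MiB):", vram or "none detected")
    print("Entry-level capable:", meets_entry_level(vram))
```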
Local Diffusion
Tool: local_diffusion
Setup:
pip install diffusers transformers accelerate torch
First run downloads the model (~4GB). Runs offline on GPU.
Verification
Run preflight before production:
make preflight
Or:
python -c "
from tools.tool_registry import registry
import json
registry.discover()
print(json.dumps(registry.provider_menu_summary(), indent=2))
"
This reports configured capabilities, missing keys, and available runtimes.
All paid operations go through cost estimation and reservation. See Cost Tracking for details.