OpenMontage

OpenMontage includes over 50 Python tools for video production. The tool registry auto-discovers them at runtime and groups them by category. Each tool declares its provider, runtime (LOCAL, API, or LOCAL_GPU), tier, and schemas. Selector tools (tts_selector, image_selector, video_selector) sit in front of a family of providers so a request can fall back gracefully when one provider is unavailable. Inspect the live set with this preflight command:

python -c "
from tools.tool_registry import registry
import json
registry.discover()
print(json.dumps(registry.provider_menu_summary(), indent=2))
"

See the tool system for the BaseTool contract and selector pattern.
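As a rough illustration of the metadata described above, the sketch below groups a few tools by category. It is not the real registry code: `ToolInfo`, `Runtime`, `group_by_category`, the tier values, and the runtime assigned to each tool are all hypothetical names and assumptions chosen for the example. Only the tool names and the three runtime labels come from this page.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Runtime(Enum):
    # The three runtime labels listed on this page.
    LOCAL = "local"
    API = "api"
    LOCAL_GPU = "local_gpu"


@dataclass(frozen=True)
class ToolInfo:
    """Hypothetical metadata record; the real BaseTool contract may differ."""
    name: str
    category: str
    provider: str
    runtime: Runtime
    tier: str  # assumed values such as "free" or "paid"


def group_by_category(tools):
    """Group tool names by category, sorted for stable output."""
    grouped = defaultdict(list)
    for tool in tools:
        grouped[tool.category].append(tool.name)
    return {category: sorted(names) for category, names in sorted(grouped.items())}


# Illustrative entries only; runtimes and tiers here are assumptions.
TOOLS = [
    ToolInfo("piper_tts", "audio", "piper", Runtime.LOCAL, "free"),
    ToolInfo("elevenlabs_tts", "audio", "elevenlabs", Runtime.API, "paid"),
    ToolInfo("upscale", "enhancement", "real-esrgan", Runtime.LOCAL_GPU, "free"),
]

menu = group_by_category(TOOLS)
print(menu)
```

The actual `registry.provider_menu_summary()` output may be shaped differently; run the preflight command to see what it returns in your checkout.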

Analysis (4 tools)

These tools extract structure and semantics from source footage or references.

  • transcriber (WhisperX): word-level transcription and speaker diarization.
  • scene_detect: scene boundary detection via PySceneDetect + FFmpeg.
  • frame_sampler: keyframe or uniform frame extraction with metadata.
  • video_understand (CLIP/BLIP-2): semantic video understanding and embedding.
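To make "uniform frame extraction" concrete, here is a minimal sketch of how evenly spaced sample timestamps could be computed. The function name `uniform_timestamps` and the choice to sample at the centre of each slice are assumptions for illustration; frame_sampler itself may choose timestamps differently.

```python
def uniform_timestamps(duration_s: float, count: int) -> list[float]:
    """Return `count` timestamps, one at the centre of each equal-width slice."""
    if duration_s <= 0 or count <= 0:
        return []
    step = duration_s / count
    # Centre-of-slice sampling avoids the very first and very last frame,
    # which are often black or mid-transition.
    return [round((i + 0.5) * step, 3) for i in range(count)]


print(uniform_timestamps(10.0, 4))
```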

Audio (8 tools)

Tools for voice, music, and post-processing.

  • elevenlabs_tts, google_tts, openai_tts, piper_tts: text-to-speech (ElevenLabs, Google Cloud, OpenAI, local Piper).
  • tts_selector: ranks and routes among available TTS providers.
  • music_gen: background music generation.
  • audio_mixer: multi-track mixing and level balancing.
  • audio_enhance: noise reduction and clarity processing.
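The ranking and routing that tts_selector performs could look roughly like the sketch below. This is an assumption-laden illustration, not the project's selector: `Provider`, `select_provider`, the `rank` field, and the availability flags are hypothetical, and the real ranking criteria are not documented on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    available: bool  # e.g. API key present or local model installed (assumed)
    rank: int        # lower is preferred; the ranking criteria are assumed


def select_provider(providers, preferred=None):
    """Return the preferred provider if usable, else the best-ranked usable one."""
    usable = [p for p in providers if p.available]
    if not usable:
        raise RuntimeError("no TTS provider is available")
    for provider in usable:
        if provider.name == preferred:
            return provider
    return min(usable, key=lambda p: p.rank)


PROVIDERS = [
    Provider("elevenlabs_tts", available=False, rank=1),
    Provider("openai_tts", available=True, rank=2),
    Provider("piper_tts", available=True, rank=3),
]

# The preferred provider is unavailable, so the next-ranked one is used.
chosen = select_provider(PROVIDERS, preferred="elevenlabs_tts")
print(chosen.name)
```

Keeping a local provider such as piper_tts in the list gives the selector something to fall back on when no API provider is usable.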

Avatar (2 tools)

  • talking_head (SadTalker/MuseTalk): animated talking-head video from audio and reference image.
  • lip_sync (Wav2Lip): lip synchronization on existing video.

Enhancement (5 tools)

Visual cleanup and quality upgrades.

  • upscale (Real-ESRGAN): resolution increase.
  • bg_remove (rembg/U2Net): background removal.
  • face_enhance: face detail improvement.
  • face_restore (CodeFormer/GFPGAN): face restoration.
  • color_grade (FFmpeg LUTs): color correction and grading.

Graphics (13 tools)

Image and graphic asset creation.

  • flux_image, grok_image, google_imagen, openai_image, recraft_image, local_diffusion: image generation (fal.ai, xAI, Google, OpenAI, Recraft, local diffusion).
  • pexels_image, pixabay_image: stock image retrieval.
  • image_selector: routes image requests across providers.
  • code_snippet: code block graphics.
  • diagram_gen: diagram generation.
  • math_animate (ManimCE): mathematical animation.
  • image_gen (deprecated).

Subtitle (1 tool)

  • subtitle_gen: SRT/VTT generation from timestamped text.
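For readers unfamiliar with the SRT format, the following sketch renders timestamped text as SRT cues. It is a standalone illustration rather than subtitle_gen's implementation; `srt_timestamp`, `to_srt`, and the `(start_s, end_s, text)` tuple shape are hypothetical.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues) -> str:
    """Render (start_s, end_s, text) tuples as an SRT document."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        # Each cue is: sequence number, time range, text, then a blank line.
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


print(to_srt([(0.0, 1.5, "Hello"), (1.5, 3.0, "World")]))
```

WebVTT is close to this layout but starts with a `WEBVTT` header line and uses a period instead of a comma before the milliseconds.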

Video (18 tools)

Video generation, stock retrieval, and editing primitives.

  • grok_video, heygen_video, higgsfield_video, veo_video, kling_video, runway_video, minimax_video: cloud video generation (xAI, HeyGen, Higgsfield, fal.ai Veo/Kling/MiniMax, Runway).
  • wan_video, hunyuan_video, cogvideo_video, ltx_video_local, ltx_video_modal: local GPU video models.
  • pexels_video, pixabay_video: stock video retrieval.
  • video_selector: routes video requests.
  • video_compose (FFmpeg): scene composition; dispatches to Remotion or HyperFrames when the proposal locks one of those runtimes.
  • video_stitch, video_trimmer: concat, trim, and basic assembly.
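Trimming and concatenation of the kind listed above are commonly done with FFmpeg stream copy. The sketch below only builds the command lines and does not run them. The helper names (`trim_command`, `concat_list`, `concat_command`) are hypothetical, and it is an assumption that video_trimmer and video_stitch use these particular FFmpeg options.

```python
def trim_command(src: str, dst: str, start_s: float, end_s: float) -> list[str]:
    """Build an ffmpeg argv that copies [start_s, end_s] without re-encoding."""
    duration = end_s - start_s
    if duration <= 0:
        raise ValueError("end_s must be greater than start_s")
    # -ss before -i seeks in the input; -t limits the output duration.
    return ["ffmpeg", "-y", "-ss", f"{start_s:.3f}", "-i", src,
            "-t", f"{duration:.3f}", "-c", "copy", dst]


def concat_list(paths) -> str:
    """Render the list file read by ffmpeg's concat demuxer.

    Paths containing single quotes would need escaping; not handled here.
    """
    return "".join(f"file '{path}'\n" for path in paths)


def concat_command(list_file: str, dst: str) -> list[str]:
    """Build an ffmpeg argv that joins the clips named in list_file."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", dst]


print(" ".join(trim_command("in.mp4", "clip.mp4", 2.0, 5.5)))
print(concat_list(["a.mp4", "b.mp4"]), end="")
print(" ".join(concat_command("clips.txt", "joined.mp4")))
```

Stream copy is fast but cuts land on keyframes, and concatenated clips must share codec parameters; re-encoding avoids both limits at the cost of time.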

The composition runtime is chosen and locked at the proposal stage, and video_compose enforces that choice. See composition runtimes and the provider guide for setup and selection details.