Command Line Interface#

`scholium generate`#

scholium generate <slides.md> <output.mp4> [OPTIONS]

Generate an instructional video from markdown slides with embedded narration.

Arguments#

slides.md: Path to markdown file with embedded :::notes::: blocks.
output.mp4: Path for output video file.

Options#

Option	Description	Default
`--provider`	TTS provider: `piper`, `elevenlabs`, `coqui`, `openai`, `bark`, `f5tts`, `styletts2`, `tortoise`	`piper`
`--voice`	Voice name or ID (see note below)	from config
`--model`	TTS model ID	from config
`--config`	Path to configuration file	`config.yaml`
`--speed RATE`	Speech rate multiplier (0.1–5.0; 1.0=normal, 0.9=10% slower)	from config
`--quality PRESET`	Quality preset: `fast`, `balanced`, `best`	from config
`--slides RANGE`	Process only a subset of slides, e.g. `5` or `3-7` (1-indexed pages)	all
`--dry-run`	Parse narration and print it; skip all generation	false
`--resume`	Skip audio generation for slides whose temp files already exist	false
`--section-duration`	Duration for silent slides (seconds)	`3.0`
`--verbose`	Show detailed progress output	false
`--keep-temp`	Keep temporary files for debugging	false
`--no-pdf`	Do not save slides as PDF alongside the video	false
`--play`	Play video after generation	false
`--audio-only`	Generate audio segments only (no video)	false
`--open-dir`	Open output directory after generation	false
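To make the `--slides RANGE` format concrete, the sketch below expands a value such as `5` or `3-7` into the 1-indexed page numbers it covers. `expand_slides` is a hypothetical helper written for illustration only; it is not part of Scholium and may not match how the tool validates ranges internally.

```shell
# Hypothetical helper (not part of Scholium): expand a --slides value
# such as "5" or "3-7" into the 1-indexed page numbers it selects.
expand_slides() (
  range=$1
  case $range in
    # Reject empty input, non-digit characters, stray or repeated hyphens,
    # and numbers starting with 0 (pages are 1-indexed).
    '' | *[!0-9-]* | -* | *- | *-*-* | 0* | *-0*)
      echo "invalid slide range: $range" >&2
      exit 1
      ;;
  esac
  start=${range%-*}   # text before the hyphen (or the whole value)
  end=${range#*-}     # text after the hyphen (or the whole value)
  if [ "$end" -lt "$start" ]; then
    echo "invalid slide range: $range" >&2
    exit 1
  fi
  i=$start
  while [ "$i" -le "$end" ]; do
    printf '%s\n' "$i"
    i=$((i + 1))
  done
)
```

Calling `expand_slides 3-7` prints the numbers 3 through 7, one per line, while a reversed range such as `7-3` is rejected with a non-zero exit status.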

Note on --voice: What --voice expects depends on the provider:

Piper — voice model name, e.g. en_US-lessac-medium

ElevenLabs — the Voice ID (not the display name), e.g. Xb7hH8MSUJpSbSDYk0k2. Run scholium list-voices --provider elevenlabs to find IDs.

OpenAI — built-in voice name: alloy, echo, fable, onyx, nova, shimmer

Coqui / F5-TTS / StyleTTS2 / Tortoise — name of a registered voice from scholium list-voices

Note on --quality: The preset maps to provider-specific settings automatically:

Provider	`fast`	`balanced`	`best`
piper	`quality: low`	`quality: medium`	`quality: high`
openai	model `tts-1`	model `tts-1`	model `tts-1-hd`
elevenlabs	turbo model	multilingual v2	multilingual v2
bark	`model: small`	`model: small`	`model: large`
tortoise	`ultra_fast` preset	`fast` preset	`high_quality` preset
styletts2	3 diffusion steps	5 steps	10 steps
f5tts	`vocoder: vocos`	`vocoder: vocos`	`vocoder: bigvgan`

Run scholium providers info PROVIDER to see the exact mapping for your provider.

Note on --speed: For piper and openai, speed is passed natively to the provider. For all other providers, Scholium applies a pitch-preserving time-stretch via ffmpeg’s atempo filter after generation.
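As a rough illustration of the time-stretch step, the sketch below builds an ffmpeg audio filter string for a given rate. A single `atempo` instance does not accept factors below 0.5, so slower rates have to be reached by chaining several instances. The `atempo_chain` helper is hypothetical, and keeping every stage within 0.5 to 2.0 is a conservative assumption; Scholium's actual filter construction is not documented here and may differ.

```shell
# Hypothetical helper (not Scholium's own code): build an ffmpeg audio filter
# string that reaches RATE by chaining atempo stages, each within 0.5-2.0.
atempo_chain() {
  awk -v rate="$1" 'BEGIN {
    rate += 0                      # force numeric interpretation
    if (rate <= 0) exit 1          # guard against non-positive or non-numeric input
    chain = ""
    while (rate < 0.5) { chain = chain "atempo=0.5,"; rate /= 0.5 }
    while (rate > 2.0) { chain = chain "atempo=2.0,"; rate /= 2.0 }
    printf "%satempo=%g\n", chain, rate
  }'
}

# Example use with real ffmpeg options (-i, -filter:a); file names are placeholders:
#   ffmpeg -i narration.wav -filter:a "$(atempo_chain 0.9)" narration_slow.wav
```

For the common case of `--speed 0.9` the chain is a single `atempo=0.9` stage; only rates below 0.5 or above 2.0 produce more than one stage.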

Examples#

# Basic generation
scholium generate lecture.md output.mp4

# Custom voice
scholium generate lecture.md output.mp4 --voice en_US-amy-medium

# Different provider
scholium generate lecture.md output.mp4 --provider elevenlabs --voice Xb7hH8MSUJpSbSDYk0k2

# Slow down speech by 10%, use best quality
scholium generate lecture.md output.mp4 --speed 0.9 --quality best

# Preview narration without generating anything (fast, no pandoc/ffmpeg)
scholium generate lecture.md output.mp4 --dry-run

# Re-generate only slide 5
scholium generate lecture.md output.mp4 --slides 5

# Re-generate slides 3 through 7
scholium generate lecture.md output.mp4 --slides 3-7

# Resume an interrupted run (skips existing audio files in ./temp/)
scholium generate lecture.md output.mp4 --resume --keep-temp

# Verbose with temp files kept
scholium generate lecture.md output.mp4 --verbose --keep-temp

# Audio-only (no video encoding)
scholium generate lecture.md output/ --audio-only
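When several decks need rendering, the single-file commands above can be wrapped in a loop. The following is a minimal sketch, assuming one `.md` file per lecture in a source directory. `generate_all` and the `SCHOLIUM` override variable are hypothetical names introduced for this example; only the `generate` subcommand and the `--verbose` flag come from the reference above.

```shell
# Hypothetical wrapper: run `scholium generate` for every .md file in a directory.
# SCHOLIUM can be overridden (e.g. SCHOLIUM=echo) to preview the commands.
SCHOLIUM=${SCHOLIUM:-scholium}

generate_all() (
  src_dir=$1
  out_dir=$2
  mkdir -p "$out_dir" || exit 1
  for slides in "$src_dir"/*.md; do
    [ -e "$slides" ] || continue          # no matches: the glob stays literal
    name=$(basename "$slides" .md)
    # Stop at the first failure so a broken deck is noticed immediately.
    "$SCHOLIUM" generate "$slides" "$out_dir/$name.mp4" --verbose || exit 1
  done
)
```

Running `generate_all lectures videos` would write `videos/<name>.mp4` for each `lectures/<name>.md`. Setting `SCHOLIUM=echo` first prints the commands instead of running them, which is a cheap way to check paths before a long render.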

`scholium train-voice`#

scholium train-voice --name NAME --provider PROVIDER --sample AUDIO [OPTIONS]

Required Options#

Option	Description
`--name`	Name for the voice
`--provider`	TTS provider (`coqui`, `f5tts`, `styletts2`, or `tortoise`)
`--sample`	Path to reference audio file (5–15 s recommended)

Optional Options#

Option	Description	Default
`--description`	Description of the voice	auto-generated
`--language`	Language code	`en`
`--config`	Configuration file	`config.yaml`

Example#

scholium train-voice \
  --name my_voice \
  --provider f5tts \
  --sample recording.wav \
  --description "My teaching voice"
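Since the reference clip is recommended to be 5 to 15 seconds long, it can help to check its duration before training. The sketch below is one way to do that with `ffprobe`, which ships with ffmpeg. Both function names are hypothetical, and Scholium itself may or may not perform a similar check.

```shell
# Hypothetical helpers (not part of Scholium) for checking a reference clip
# against the recommended 5-15 second length before training a voice.

# Succeeds when the given number of seconds lies within the recommended range.
in_recommended_range() {
  awk -v d="$1" 'BEGIN { d += 0; exit !(d >= 5 && d <= 15) }'
}

# Print the clip duration in seconds using ffprobe (ships with ffmpeg).
clip_duration() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Example use; recording.wav is a placeholder file name:
#   in_recommended_range "$(clip_duration recording.wav)" ||
#     echo "warning: sample is outside the recommended 5-15 s" >&2
```

The range check is kept separate from the `ffprobe` call so it can be reused with durations obtained some other way.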

`scholium list-voices`#

scholium list-voices [--provider PROVIDER] [--config PATH]

List available voices. Behaviour depends on whether --provider is given.

Without `--provider` (default)#

Lists all voices registered in the local voice library:

scholium list-voices

Voices directory: ~/.local/share/scholium/voices

Available voices:
  • my_voice
    Provider: f5tts
    Description: My teaching voice

With `--provider piper`#

Lists all built-in Piper voices and shows which are already downloaded locally:

scholium list-voices --provider piper

Piper voices directory: ~/.local/share/piper/voices

Known voices (9 total):

  Voice                             Status
  --------------------------------------------------
  en_US-lessac-medium               downloaded
  en_US-lessac-low                  auto-downloads on first use
  en_US-lessac-high                 auto-downloads on first use
  ...

Use a voice:
  scholium generate slides.md output.mp4 --provider piper --voice <name>

Full catalogue (900+ voices):
  https://huggingface.co/rhasspy/piper-voices

Undownloaded voices are fetched automatically the first time they are used.

With `--provider elevenlabs`#

Queries the ElevenLabs API and lists every voice on your account with its Voice ID:

scholium list-voices --provider elevenlabs

ElevenLabs voices (42 total):
  Name                            Voice ID                  Category
  ------------------------------  ------------------------  --------
  Alice                           Xb7hH8MSUJpSbSDYk0k2     premade
  Antoni                          ErXwobaYiN019PkySvjV      premade
  Colin                           ZGuEOd751j7qVTkXR73w      premade
  ...

Use the Voice ID (not the name) with --voice or in config.yaml:
  voice: "Xb7hH8MSUJpSbSDYk0k2"   # Alice

Requires ELEVENLABS_API_KEY to be set in the environment.
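Because `--voice` needs the Voice ID rather than the display name, a small filter over the listing can save copying IDs by hand. This is a sketch that assumes the three-column layout shown above (Name, Voice ID, Category) and a single-word voice name; `voice_id_for` is a hypothetical helper, and the real output format may change between versions.

```shell
# Hypothetical helper: read `scholium list-voices --provider elevenlabs` output
# on stdin and print the Voice ID for an exact, single-word voice name.
# Assumes the column layout shown above (Name, Voice ID, Category).
voice_id_for() {
  awk -v name="$1" '$1 == name && NF == 3 { print $2; found = 1; exit }
                    END { exit !found }'
}

# Example use (requires ELEVENLABS_API_KEY and network access):
#   id=$(scholium list-voices --provider elevenlabs | voice_id_for Alice)
#   scholium generate lecture.md output.mp4 --provider elevenlabs --voice "$id"
```

The function exits with a non-zero status when no row matches, so a typo in the name fails the lookup instead of passing an empty ID to `--voice`.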

`scholium regenerate-embeddings`#

scholium regenerate-embeddings --voice NAME [OPTIONS]

Pre-compute speaker embeddings for a Coqui voice to speed up future generation.

Example#

scholium regenerate-embeddings --voice my_voice

`scholium config init`#

scholium config init [OPTIONS]

Create a config.yaml in the current directory with every supported setting included, annotated with comments explaining each option.

Options#

Option	Description	Default
`--path PATH`	Write to a different location	`config.yaml`
`--force`	Overwrite an existing file	false

Example#

# Generate a config file in the current directory
scholium config init

# Write to a custom location
scholium config init --path project/settings.yaml

# Overwrite an existing file at a custom location
scholium config init --path project/settings.yaml --force

Edit only the settings you want to change — everything else defaults to sensible values.

`scholium config show`#

scholium config show [OPTIONS]

Print the effective configuration: built-in defaults merged with your config.yaml and any environment-variable overrides. API keys are masked as *** so the output is safe to share or log.

Options#

Option	Description	Default
`--path PATH`	Config file to inspect	`config.yaml`

Example#

# Inspect config in current directory
scholium config show

# Inspect a config in a different location
scholium config show --path ~/lectures/config.yaml

`scholium providers list`#

scholium providers list

Show all available TTS providers and their installation status.

`scholium providers info`#

scholium providers info PROVIDER

Show detailed information about a specific provider.

scholium providers info f5tts

Configuration File#

Use scholium config init to generate a fully annotated config.yaml, or create one manually. Place it in the same directory as your slides and it is picked up automatically.

For a complete reference of every setting — including provider-specific speed and quality controls — see Advanced Configuration.

# TTS settings
tts_provider: "piper"
voice: "en_US-lessac-medium"

# Provider-specific settings
piper:
  quality: "medium"
  speed: 1.0       # 0.1–5.0  (lower = slower)

elevenlabs:
  api_key: ""          # Leave empty — use ELEVENLABS_API_KEY env var
  model: "eleven_multilingual_v2"
  stability: 0.5       # 0.0–1.0  (optional)
  similarity_boost: 0.75  # 0.0–1.0  (optional)

coqui:
  model: "tts_models/multilingual/multi-dataset/xtts_v2"

openai:
  api_key: ""          # Leave empty — use OPENAI_API_KEY env var
  model: "tts-1"
  speed: 1.0           # 0.25–4.0

bark:
  model: "small"

f5tts:
  model: "F5-TTS"
  # model_path: "f5tts/my_voice/sample.wav"   # relative to voices_dir
  # ref_text: "Words spoken in the reference clip."

styletts2:
  alpha: 0.3
  beta: 0.7
  diffusion_steps: 5
  # model_path: "styletts2/my_voice/sample.wav"

tortoise:
  preset: "fast"
  # model_path: "tortoise/my_voice/sample.wav"

# Timing defaults
timing:
  default_pre_delay: 1.0
  default_post_delay: 2.0
  min_slide_duration: 4.0
  silent_slide_duration: 3.0

# Video settings
resolution: [1920, 1080]
fps: 30

# Paths
voices_dir: "~/.local/share/scholium/voices"
temp_dir: "./temp"
output_dir: "./output"

# Options
keep_temp_files: false
verbose: true

Environment Variables#

export ELEVENLABS_API_KEY="your_key"
export OPENAI_API_KEY="your_key"
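In scripts it can be useful to fail early when the key for a cloud provider is missing, rather than discovering it partway through a run. The guard below is a minimal sketch; `require_key` is a hypothetical helper and not something Scholium provides.

```shell
# Hypothetical guard (not part of Scholium): fail early when the API key
# variable a cloud provider needs is missing or empty.
# The argument must be a plain variable name, since it is passed to eval.
require_key() {
  eval "key_value=\${$1:-}"     # indirect lookup of the variable named in $1
  if [ -z "$key_value" ]; then
    echo "error: $1 is not set" >&2
    return 1
  fi
}

# Example use:
#   require_key OPENAI_API_KEY &&
#     scholium generate lecture.md output.mp4 --provider openai --voice alloy
```

The same guard works for `ELEVENLABS_API_KEY` before calling `scholium list-voices --provider elevenlabs`.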
