scholium.tts_engine.TTSEngine
- class TTSEngine(provider_name, provider_config=None, voices_dir=None, config=None, quality_preset=None, speed_override=None)
Bases: object
Manages TTS provider and audio generation.
Initialize TTS engine.
- Parameters:
  - provider_name (str) – Name of the TTS provider (‘piper’, ‘elevenlabs’, ‘coqui’, ‘openai’, ‘bark’, ‘f5tts’, ‘styletts2’, ‘tortoise’).
  - provider_config (Dict[str, Any]) – Configuration for the provider.
  - voices_dir (str) – Directory for storing voice models and trained voices.
  - config – Config object for accessing global settings.
  - quality_preset (Optional[str]) – High-level quality preset: ‘fast’, ‘balanced’, or ‘best’. Overrides the matching provider config key(s).
  - speed_override (Optional[float]) – Speech rate multiplier (0.1–5.0). For providers that accept a native speed setting (piper, openai) this is wired through the provider config; for all others it is applied as a pitch-preserving ffmpeg atempo post-process step.
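The speed_override fallback described above can be pictured with a small sketch. This is not scholium's actual implementation: atempo_filter is a hypothetical helper, and the choice to keep each atempo stage within 0.5 to 2.0 is an assumption, based on the ffmpeg documentation's advice to daisy-chain atempo instances for larger tempo changes.

```python
from typing import List


def atempo_filter(speed: float) -> str:
    """Build an ffmpeg audio-filter string for a speech rate multiplier.

    Hypothetical helper, not part of scholium. Each atempo stage is kept
    within [0.5, 2.0] and stages are chained so that their product equals
    the requested multiplier.
    """
    # Range taken from the documented speed_override bounds.
    if not 0.1 <= speed <= 5.0:
        raise ValueError(f"speed must be within 0.1-5.0, got {speed}")

    factors: List[float] = []
    remaining = speed
    # Peel off full 2.0x stages while the remaining factor is too large.
    while remaining > 2.0:
        factors.append(2.0)
        remaining /= 2.0
    # Peel off full 0.5x stages while the remaining factor is too small.
    while remaining < 0.5:
        factors.append(0.5)
        remaining /= 0.5
    factors.append(remaining)

    return ",".join(f"atempo={factor:.6g}" for factor in factors)


if __name__ == "__main__":
    for multiplier in (0.25, 1.5, 4.0):
        print(multiplier, "->", atempo_filter(multiplier))
```

The resulting string could be passed to ffmpeg as the value of its -filter:a option. Providers with native speed support (piper, openai) would not need this step, since the multiplier goes through the provider config instead.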
Methods
Generate audio from text.
Generate audio for multiple narration segments.
- generate_segments(segments, voice_config, output_dir, progress_callback=None, resume=False)
Generate audio for multiple narration segments.
- Parameters:
  - segments (List[Dict[str, Any]]) – List of segment dicts, each containing at least text (str) and slide_number (int), and optionally min_duration, pre_delay, post_delay, fixed_duration.
  - voice_config (Dict[str, Any]) – Voice configuration passed to the TTS provider.
  - output_dir (str) – Directory where individual audio files are saved.
  - progress_callback – Optional zero-argument callable invoked after each segment is processed (useful for progress bars).
  - resume (bool) – When True, skip TTS generation for segments whose audio file already exists on disk (useful for resuming an interrupted run).
- Returns:
List of enriched segment dicts, each containing all original keys plus:
    {
        "audio_path": "/path/to/audio_0000.mp3",
        "audio_duration": 5.2,
        "duration": 7.2,          # includes pre/post delays
        "fixed_duration": None,   # if specified
        "min_duration": 10.0,     # if specified
        "pre_delay": 1.0,
        "post_delay": 1.0,
    }
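As a rough illustration of working with these values, the sketch below builds a zero-argument progress callback and sums the duration field of enriched segment dicts. ProgressCounter and total_duration are hypothetical helpers, not part of scholium, and the sample dicts simply reuse the illustrative values shown above. The call to generate_segments is left as a comment because it needs a configured engine.

```python
from typing import Any, Dict, List


class ProgressCounter:
    """Zero-argument callable suitable as progress_callback (hypothetical)."""

    def __init__(self, total: int) -> None:
        self.total = total
        self.done = 0

    def __call__(self) -> None:
        # Called once per processed segment.
        self.done += 1


def total_duration(enriched: List[Dict[str, Any]]) -> float:
    """Sum the 'duration' field, which already includes pre/post delays."""
    return sum(segment["duration"] for segment in enriched)


if __name__ == "__main__":
    counter = ProgressCounter(total=2)

    # With a real engine, the call would look roughly like this:
    # enriched = engine.generate_segments(
    #     segments, voice_config, output_dir,
    #     progress_callback=counter, resume=True,
    # )

    # Stand-in data using the sample values from the documentation.
    enriched = [
        {"audio_path": "/path/to/audio_0000.mp3", "audio_duration": 5.2,
         "duration": 7.2, "pre_delay": 1.0, "post_delay": 1.0},
        {"audio_path": "/path/to/audio_0001.mp3", "audio_duration": 5.2,
         "duration": 7.2, "pre_delay": 1.0, "post_delay": 1.0},
    ]
    for _ in enriched:
        counter()  # simulate the per-segment callback

    print(f"{counter.done}/{counter.total} segments, "
          f"{total_duration(enriched):.1f} s in total")
```

Passing resume=True, as in the commented call, would let an interrupted run pick up again without regenerating audio files that already exist on disk.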
- Return type: