Worked Example: The Demo Video

The video on the Scholium home page was generated by Scholium itself: a self-referential demo that exercises the main features of the format. This page walks through the source file slide by slide.

The complete source is at docs/demo/lecture.md.

Generating the Demo

From the project root:

# Full regeneration (TTS required — takes ~30 s)
bash docs/demo/build_demo.sh

# Regenerate the terminal GIF only (fast — Pillow only)
bash docs/demo/build_demo.sh --gif-only

The Source File

YAML Frontmatter

---
title: "Scholium"
header-includes: |
  \titlegraphic{\includegraphics[width=0.45\textwidth]{logo-horizontal.png}}
  \logo{\includegraphics[height=0.35cm]{logo-horizontal.png}}
title_notes: |
  [PRE 1s]
  Scholium converts markdown slides into narrated instructional videos.
  Everything you need — slides, narration, and timing — lives in one
  plain text file. Let me show you how it works.
---

Points of interest:

  • header-includes passes raw LaTeX to the Beamer template. \titlegraphic places the full logo on the title slide; \logo places a small version at the bottom-right corner of every content slide.

  • title_notes narrates the title slide. [PRE 1s] adds a one-second pause before narration begins, letting the title settle before the voice starts.

Slide 1: The Format

# The Format

Write slides and narration together in one file:

```markdown
# Slide Title

::: notes
This narration is spoken aloud.
:::
```

::: notes
A Scholium lecture file is plain markdown.
Each slide contains its content and narration in the same file.
The triple-colon notes block holds the text that gets spoken.
One file. Everything in one place.
:::

The fenced code block shows the format as an example — Scholium’s parser correctly ignores the ::: notes inside the fence and reads only the real notes block that follows it.
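To make the fence handling concrete, here is a minimal Python sketch of a fence-aware notes extractor. It is not Scholium's actual parser: the function name `extract_notes`, the regular expression, and the exact matching rules are assumptions made for illustration. The sample input is assembled from a `TICKS` constant so that the example itself contains no literal nested fences.

```python
import re

# Matches a line that opens (or closes) a fenced code block.
FENCE = re.compile(r"^\s*(`{3,}|~{3,})")
TICKS = "`" * 3  # built programmatically to avoid nesting fences in this example


def extract_notes(markdown: str) -> list[str]:
    """Return the text of each notes block, ignoring anything inside code fences.

    Hypothetical helper: a sketch of the idea, not Scholium's implementation.
    """
    notes: list[str] = []
    current = None  # lines of the notes block being collected, if any
    fence = None    # marker that opened the current code fence, if any
    for line in markdown.splitlines():
        stripped = line.strip()
        match = FENCE.match(line)
        if fence is not None:
            # Inside a fence, only a bare closing marker of the same kind matters.
            if (match and match.group(1)[0] == fence[0]
                    and len(match.group(1)) >= len(fence)
                    and stripped == match.group(1)):
                fence = None
            continue
        if match:
            fence = match.group(1)
            continue
        if current is None:
            if stripped == "::: notes":
                current = []
        elif stripped == ":::":
            notes.append("\n".join(current).strip())
            current = None
        else:
            current.append(line)
    return notes


sample = "\n".join([
    "# The Format",
    "",
    TICKS + "markdown",
    "# Slide Title",
    "",
    "::: notes",
    "This narration is spoken aloud.",
    ":::",
    TICKS,
    "",
    "::: notes",
    "A Scholium lecture file is plain markdown.",
    ":::",
])

print(extract_notes(sample))
```

Only the second notes block is returned, because the first one sits between the opening and closing fence markers and is treated as literal slide content.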

Slide 2: One Command (Incremental Reveals)

# One Command

```bash
scholium generate lecture.md output.mp4
```

>- Parses slides and narration from your markdown
>- Synthesises speech with your chosen TTS voice
>- Combines slides and audio into a finished video

::: notes
To generate a video, run one command: scholium generate, your lecture file,
and the output filename.

Scholium synthesises narration using whichever text-to-speech engine you
choose — from fast local engines to high-quality cloud APIs.

Finally, it combines the rendered slides and audio into a finished video,
ready to share with students.
:::

The >- prefix creates incremental reveals: Pandoc/Beamer renders three cumulative PDF pages, one per bullet. The notes block contains three paragraphs separated by blank lines, and each paragraph is narrated while its corresponding page is shown.

See Incremental Lists (Bullet-by-Bullet Reveals) for the full specification.
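The pairing of narration paragraphs with cumulative reveals can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Scholium's code: `pair_reveals` is a hypothetical name, and raising an error when the counts differ is an assumed policy that the real tool may not share. The second narration paragraph is abridged here.

```python
import re


def pair_reveals(bullets, notes):
    """Pair each cumulative reveal with one blank-line-separated paragraph.

    Hypothetical helper; the mismatch handling below is an assumption.
    """
    paragraphs = [" ".join(p.split()) for p in re.split(r"\n\s*\n", notes.strip())]
    if len(paragraphs) != len(bullets):
        raise ValueError(
            f"{len(bullets)} bullets but {len(paragraphs)} narration paragraphs"
        )
    # Page i shows bullets 1..i, so each entry carries the visible bullets so far.
    return [(bullets[: i + 1], text) for i, text in enumerate(paragraphs)]


bullets = [
    "Parses slides and narration from your markdown",
    "Synthesises speech with your chosen TTS voice",
    "Combines slides and audio into a finished video",
]
notes = """\
To generate a video, run one command: scholium generate, your lecture file,
and the output filename.

Scholium synthesises narration using whichever text-to-speech engine you
choose.

Finally, it combines the rendered slides and audio into a finished video,
ready to share with students.
"""

for visible, narration in pair_reveals(bullets, notes):
    print(len(visible), "bullet(s) visible:", narration[:40])
```

Each tuple corresponds to one rendered PDF page: the list of bullets visible on that page and the paragraph spoken while it is on screen.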

Slide 3: Timing Control

# Timing Control

Fine-tune when narration plays and how long slides show:

```markdown
::: notes
[PRE 1s] [POST 2s] [MIN 10s]
Your narration here. [PAUSE 0.5s] More narration.
:::
```

- `[PRE 1s]` — pause before narration begins
- `[POST 2s]` — hold after narration ends
- `[MIN 10s]` — minimum slide duration
- `[PAUSE 0.5s]` — mid-narration silence

::: notes
[PRE 0.5s]
Scholium gives you precise timing control inside the notes block.
PRE adds a pause before narration begins — useful to let the opening slide breathe.
POST holds the slide after narration finishes, for emphasis.
MIN sets a minimum display duration for the whole slide.
And PAUSE inserts a brief silence mid-narration.
[POST 0.5s]
:::

This slide itself uses [PRE 0.5s] and [POST 0.5s]. The directives are stripped from the spoken narration and applied as timing parameters on the slide. See Timing Control for the full directive reference.
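As a rough illustration of what "stripped from the spoken narration" could look like, the Python sketch below separates the four directives from the text. It is an assumption-laden example rather than Scholium's parser: the name `parse_notes`, the regular expression, and the choice to represent a PAUSE as a float between text segments are all hypothetical.

```python
import re

# Assumed directive syntax, based on the four forms shown on this slide.
DIRECTIVE = re.compile(r"\[(PRE|POST|MIN|PAUSE)\s+(\d+(?:\.\d+)?)s\]")


def parse_notes(notes):
    """Split notes into slide-level timing and a sequence of spoken segments.

    Returns (timing, segments). Segments are strings to speak, with floats
    marking PAUSE silences in seconds. Hypothetical helper for illustration.
    """
    timing = {}
    segments = []
    pos = 0
    for match in DIRECTIVE.finditer(notes):
        text = " ".join(notes[pos:match.start()].split())
        if text:
            segments.append(text)
        name, seconds = match.group(1), float(match.group(2))
        if name == "PAUSE":
            segments.append(seconds)        # silence stays in sequence
        else:
            timing[name.lower()] = seconds  # PRE, POST, MIN apply to the slide
        pos = match.end()
    tail = " ".join(notes[pos:].split())
    if tail:
        segments.append(tail)
    return timing, segments


timing, segments = parse_notes(
    "[PRE 1s] [POST 2s] [MIN 10s]\n"
    "Your narration here. [PAUSE 0.5s] More narration."
)
print(timing)
print(segments)
```

Keeping PAUSE inline while lifting PRE, POST, and MIN into a separate mapping mirrors the distinction the slide draws between mid-narration silence and slide-level timing.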

Slide 4: This Video

# This Video

This video was generated by Scholium.

- **Source:** about 90 lines of markdown
- **Voice:** Piper TTS, running locally on CPU
- **Time:** under 30 seconds to produce

::: notes
This video was generated by Scholium itself, from about ninety lines of markdown.
The voice you are hearing is Piper — a fast, local text-to-speech engine.
No API key. No cloud service. No waiting.
That is the Scholium workflow: write markdown, run one command, get a video.
:::

The penultimate slide is self-referential — it describes the video that contains it, giving the viewer a concrete sense of the tool’s speed and simplicity.

Slide 5: Get Started

# Get Started

\vfill

\begin{center}
\includegraphics[width=0.5\textwidth]{logo-horizontal.png}

\bigskip

\Large\texttt{pip install scholium[piper]}

\smallskip

\normalsize\texttt{ccaprani.github.io/scholium}
\end{center}

\vfill

::: notes
[PRE 0.5s]
Ready to try it? One command, and you are ready to go.
[PAUSE 1s]
Full documentation at the address on screen.
[POST 2s]
:::

Points of interest:

  • \vfill before and after the center block vertically centres the content on the Beamer frame, preventing the default top-alignment.

  • \includegraphics reuses the same logo-horizontal.png that header-includes placed on the title slide — no extra file needed.

  • [PRE 0.5s] lets the slide settle before speaking; [POST 2s] holds the frame after narration ends so viewers can read the URL before the video stops.
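The way these directives add up to an on-screen duration can be sketched as simple arithmetic. The formula below is an assumption, not taken from the Scholium documentation: it supposes that PRE, the narration, any PAUSE silences, and POST play back to back, and that MIN acts as a floor on the total. The narration lengths used in the example (3.0 s and 2.5 s) are made-up placeholders, since real values depend on the TTS output.

```python
def slide_duration(narration_seconds, pre=0.0, post=0.0, pauses=(), minimum=0.0):
    """Estimate how long a slide stays on screen, in seconds.

    Assumed model: pre + narration + pauses + post, but never less than
    `minimum`. Hypothetical helper; Scholium may compute this differently.
    """
    spoken = sum(narration_seconds) + sum(pauses)
    return max(minimum, pre + spoken + post)


# Final slide: [PRE 0.5s], two sentences split by [PAUSE 1s], then [POST 2s].
# The 3.0 s and 2.5 s narration lengths are placeholder values.
total = slide_duration([3.0, 2.5], pre=0.5, post=2.0, pauses=[1.0])
print(f"Estimated duration: {total:.1f} s")
```

Under this model the two-second POST hold makes up a noticeable share of the final slide's time, which is what gives viewers room to read the URL.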