StreamMDX streaming renderer
Benchmarks

Reproducible streaming and static markdown comparisons

This page distinguishes live incremental behavior from one-shot static rendering. The goal is not to publish a single vanity number, but to let you inspect how different renderers behave under the same fixtures, chunk cadence, and browser session.

Fixture driven • Seeded scenarios • Live incremental • Static content classes • Browser-local measurements • Perf harness methodology • Streamdown comparison notes
How to read this page
  • Results are local and hardware-dependent. Compare engines under the same browser, fixture, and scheduler settings.
  • Live incremental numbers answer a different question than static rendering. Both matter and should be read separately.
  • Memory, bundle size, and worker-hosting tradeoffs belong alongside latency numbers; they are not interchangeable metrics.
Reproduce locally
npm run docs:dev
npm run perf:harness -- --fixture naive-bayes --scenario S2_typical --runs 3 --warmup 1
npm run perf:compare -- --base tmp/perf-runs/<base>/summary.json --candidate tmp/perf-runs/<candidate>/summary.json
Benchmark frame
Metric definitions
First visible render

Time from emitted delta to the first observable DOM mutation for an engine.

Why it matters: This is the most user-visible latency metric during streaming.

Final convergence

Time from emitted delta to the final stable DOM state for that update window.

Why it matters: This captures whether the renderer settles quickly or churns after visible output appears.
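The two definitions above can be sketched as a small calculation. This is a minimal illustration, not the harness's actual code: the `DeltaObservation` shape and the `measureDelta` name are hypothetical, and it assumes the "final stable DOM state" can be approximated by the last mutation timestamp recorded inside the update window.

```typescript
// Hypothetical shape: one emitted delta plus the timestamps (ms, same clock)
// of every DOM mutation observed for it within its update window.
interface DeltaObservation {
  emittedAt: number;
  mutationTimes: number[];
}

interface DeltaLatency {
  firstVisibleMs: number;     // emit -> first observable mutation
  finalConvergenceMs: number; // emit -> last mutation in the window (assumed stable)
}

// Returns null when no mutation was observed, so an empty window is never
// reported as a zero-latency sample.
function measureDelta(obs: DeltaObservation): DeltaLatency | null {
  if (obs.mutationTimes.length === 0) return null;
  let first = Infinity;
  let last = -Infinity;
  for (const t of obs.mutationTimes) {
    if (t < first) first = t;
    if (t > last) last = t;
  }
  return {
    firstVisibleMs: first - obs.emittedAt,
    finalConvergenceMs: last - obs.emittedAt,
  };
}
```

For a delta emitted at 100 ms with mutations at 104, 103, and 109 ms, this yields a first visible render of 3 ms and a final convergence of 9 ms. A large gap between the two is the "churn after visible output" the metric is meant to expose.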

Patch-to-DOM latency

Measured time across the ingest, scheduling, and commit path before content becomes visible.

Why it matters: It exposes scheduler pressure and batching behavior under real incremental streams.
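One way to read this metric is as a sum of stages, matching the emit→ingest, ingest→commit, and emit→commit columns reported by the live lab. The sketch below is an assumption about how such a split could be computed; `PatchTimestamps` and `splitStages` are illustrative names, not part of any published API.

```typescript
// Hypothetical timestamps (ms, same clock) recorded for one delta.
interface PatchTimestamps {
  emittedAt: number;   // delta handed to the engine
  ingestedAt: number;  // engine accepted the delta
  committedAt: number; // content committed to the DOM
}

interface StageBreakdown {
  emitToIngestMs: number;
  ingestToCommitMs: number;
  emitToCommitMs: number;
}

// Splits total patch-to-DOM latency into its two stages.
// Throws if the timestamps are out of order, which usually means the
// stages were captured on different clocks.
function splitStages(t: PatchTimestamps): StageBreakdown {
  if (t.ingestedAt < t.emittedAt || t.committedAt < t.ingestedAt) {
    throw new RangeError("timestamps must satisfy emittedAt <= ingestedAt <= committedAt");
  }
  return {
    emitToIngestMs: t.ingestedAt - t.emittedAt,
    ingestToCommitMs: t.committedAt - t.ingestedAt,
    emitToCommitMs: t.committedAt - t.emittedAt,
  };
}
```

Keeping the stages separate makes it easier to tell scheduling delay (emit→ingest) apart from parse and render cost (ingest→commit) when a run looks slow.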

Static render timing

One-shot render timing for prose, tables, code, and mixed markdown fixtures.

Why it matters: It shows how engines behave outside the delta-stream case and catches content-class cliffs.

Fixture classes
Static content classes
Prose heavy

Narrative markdown with headings, nested lists, links, and inline emphasis.

Table heavy

Dense table markup where row/cell integrity and stable layout matter more than raw token count.

Code heavy

Multiple fenced blocks with different languages, where syntax-highlighting cost becomes visible.

Mixed rich markdown

A combined fixture with tables, tasks, inline code, links, and surrounding prose.

Rich feature stress

A capability workload with math, MDX, HTML, tables, code, and footnotes. It is not a parity fixture for every engine.

Runtime cost
Memory and bundle terminology
Shipped client bundle

The JavaScript transferred to the browser for a page route before optional worker assets are considered.

Hosted worker asset

The separately served worker bundle used by StreamMDX in production when parsing is isolated off the main thread.

Runtime loaded code

Everything the browser eventually executes during a benchmark session, including lazily loaded chunks and worker code.

Peak memory

The highest memory sample observed during a local browser run. It is environment-dependent and should only be compared inside the same session class.

Scheduling
Scheduler / jitter modes
CI locked

Claim-grade mode. Keeps chunk cadence, order, workload, and StreamMDX scheduling deterministic enough for reproducible local comparisons.

Explore

Diagnosis mode. Lets you vary chunking, interval, ordering, and workload to find cliffs without treating the results as published baselines.

The live comparison lab exposes these modes directly. Use CI locked for reproducible comparisons and Explore to characterize scheduler sensitivity without turning the result into a public claim.
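A deterministic chunk cadence of the kind CI locked mode relies on can be sketched as a fixed-size split with a fixed emit interval. This is an illustration under stated assumptions, not the lab's implementation: `planEmits` and `EmitPlanEntry` are hypothetical names, and the split is by UTF-16 code units, which can cut a surrogate pair in half.

```typescript
// One planned emission: which text to send and when, relative to run start.
interface EmitPlanEntry {
  index: number;
  text: string;
  emitAtMs: number;
}

// Splits a fixture into fixed-size chunks and assigns each a scheduled
// emit time. The same inputs always produce the same plan.
function planEmits(fixture: string, chunkSize: number, intervalMs: number): EmitPlanEntry[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const plan: EmitPlanEntry[] = [];
  let index = 0;
  for (let start = 0; start < fixture.length; start += chunkSize) {
    plan.push({
      index,
      text: fixture.slice(start, start + chunkSize),
      emitAtMs: index * intervalMs,
    });
    index++;
  }
  return plan;
}
```

With the values from the lab's profile snapshot (chunk=42, interval=32ms), an 8,165-character input yields ceil(8165 / 42) = 195 entries, which agrees with the delta counter the lab displays. Comparing each entry's scheduled `emitAtMs` with the time it was actually emitted is one plausible way to derive a timer-drift figure.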

Reading the results
Parity workloads vs capability workloads

The benchmark surface now includes both parity workloads and one rich feature stress workload. The parity workloads are the fair StreamMDX/Streamdown/react-markdown comparison set. The rich stress case exists to show how StreamMDX behaves when math, MDX, HTML, tables, code, and footnotes are all active in the same document. Unsupported cells are marked explicitly instead of being counted as comparable runs.

Parity workloads

Common-markdown fixtures used for direct StreamMDX/Streamdown/react-markdown comparisons under the same browser session, scheduler mode, and scenario.

Capability workloads

Richer workloads that exercise StreamMDX-specific features such as mixed MDX, math, HTML, footnotes, and worker-aware composition. These are shown for behavior inspection, not direct cross-engine claims.
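The rule that unsupported cells are not counted as comparable runs can be made concrete with a small winner-selection sketch. The `Cell` type and `finalP50Winner` function are hypothetical, and the requirement of at least two measured engines before naming a winner is an assumption that follows from treating capability workloads as non-comparative.

```typescript
// A result cell is either a real measurement or explicitly unsupported.
type Cell =
  | { kind: "measured"; finalP50Ms: number }
  | { kind: "unsupported" };

interface EngineCell {
  engine: string;
  cell: Cell;
}

// Picks the engine with the lowest final p50 among measured cells.
// Returns null when fewer than two engines were measured, because a lone
// supported engine has nothing to be compared against (assumption).
function finalP50Winner(row: EngineCell[]): string | null {
  let winner: string | null = null;
  let best = Infinity;
  let measured = 0;
  for (const entry of row) {
    if (entry.cell.kind !== "measured") continue;
    measured++;
    if (entry.cell.finalP50Ms < best) {
      best = entry.cell.finalP50Ms;
      winner = entry.engine;
    }
  }
  return measured >= 2 ? winner : null;
}
```

Under this sketch, a rich-stress row where only StreamMDX is measured produces no winner, while a parity row with three measured engines does.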

Claim discipline
What this page can and cannot claim
  • It can compare renderers fairly on the shared parity fixtures under the same local browser session and scheduler mode.
  • It can show how StreamMDX behaves on richer feature workloads that other engines in this lab do not fully support.
  • It cannot justify universal cross-machine latency or memory superiority claims outside this methodology envelope.
Coverage
Why these five static classes are the public set

The current five-class public set is intentionally final for the active plan: four parity-friendly classes (prose, tables, code, mixed) plus one explicitly marked capability stress class (rich). Adding more public classes is deferred until a distinct behavior family appears that the current set does not already expose.

Live renderer comparison lab (sequential)

Renderers run one by one to minimize CPU contention. Each engine gets an unscored warmup pass before the scored pass. Metrics are split into first-visible commit and final-stable commit.
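The sequential ordering described above can be sketched as a flat pass plan. This is a simplified model with hypothetical names (`Pass`, `planPasses`); it assumes every scored pass is preceded by its own warmup pass for the same engine, which the page implies but does not spell out for multi-cycle runs.

```typescript
// One pass of one engine over the fixture stream.
interface Pass {
  engine: string;
  cycle: number; // 1-based scored run cycle
  phase: "warmup" | "scored";
}

// Builds the full sequential plan: for each cycle, each engine runs
// alone, first unscored (warmup) and then scored.
function planPasses(engines: string[], scoredCycles: number): Pass[] {
  const passes: Pass[] = [];
  for (let cycle = 1; cycle <= scoredCycles; cycle++) {
    for (const engine of engines) {
      passes.push({ engine, cycle, phase: "warmup" });
      passes.push({ engine, cycle, phase: "scored" });
    }
  }
  return passes;
}
```

For three engines and one scored cycle this produces 6 passes, 3 of them scored, which is consistent with the "Measured passes: 3" and "Total passes incl. warmup: 6" counters shown by the lab.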

State: idle
Active engine: -
Phase: warmup
Run: 0 / 1
Measured passes: 3
Total passes incl. warmup: 6
Delta: 0 / 195
Chars: 0 / 8,165
Throughput: 0 chars/s
1. StreamMDX • 2. Streamdown • 3. react-markdown
Current order: -
Methodology mode
Freeform tuning enabled for diagnosis.
Active profile snapshot: chunk=42, interval=32ms, repeats=16, runs=1, order=fixed, workload=parity-gfm. Differs from CI profile.
A scored run is one full cycle across all renderers (then repeated for the configured run count).
Chunk size (chars): 42
Emit interval (ms): 32
Fixture repeats: 16
Scored run cycles: 1
Order mode
Benchmark profile
Chart metric
Split mode shows both metrics.
Chart layout
[Chart: per-delta latency (first visible commit - emit time). X axis: delta index. Y axis: latency (ms), Y max: 10 ms.]
[Chart: per-delta latency (final stable commit - emit time). X axis: delta index. Y axis: latency (ms), Y max: 10 ms.]
Legend: StreamMDX, Streamdown, react-markdown
Renderer | Samples | First paint p50 | First paint p95 | Final stable p50 | Final stable p95 | Run p50 | Throughput p50
StreamMDX | 0 | - | - | - | - | - | -
Streamdown | 0 | - | - | - | - | - | -
react-markdown | 0 | - | - | - | - | - | -
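The p50 and p95 columns summarize per-delta samples. The page does not say which percentile definition the harness uses, so the sketch below assumes the nearest-rank method; `percentile` is an illustrative helper, and an interpolating definition would give slightly different values on small sample sets.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
// Returns null for an empty sample set, mirroring the "-" cells above.
function percentile(samples: number[], p: number): number | null {
  if (p <= 0 || p > 100) {
    throw new RangeError("p must be in the range (0, 100]");
  }
  if (samples.length === 0) return null;
  const sorted = samples.slice().sort((a, b) => a - b); // copy, then numeric sort
  const rank = Math.ceil((p / 100) * sorted.length);    // 1-based rank
  const value = sorted[rank - 1];
  return value === undefined ? null : value;
}
```

For the samples 5, 1, 3, 2, 4 this gives a p50 of 3 and a p95 of 5. With only a handful of samples, p95 collapses to the maximum, which is one reason to read tail numbers from short runs cautiously.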
Metric leaders
First paint p50: - • Final stable p50: - • Run p50: - • Throughput p50: -
StreamMDX: 0 / 4 metric wins • Streamdown: 0 / 4 metric wins • react-markdown: 0 / 4 metric wins
CI gate (StreamMDX vs Streamdown)
PENDING • pass=0 fail=0 pending=4
Run in CI locked mode before claiming results.
  • First paint p50 <= Streamdown
  • Final stable p50 <= Streamdown
  • Run p50 <= Streamdown
  • Throughput p50 >= Streamdown
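The four gate conditions can be sketched as a small evaluator. The types and the `evaluateGate` function are hypothetical, and the aggregation rule (any failure fails the gate, otherwise any missing value keeps it pending) is an assumption; the page only shows the all-pending state.

```typescript
// Hypothetical per-engine summary; null means "no samples yet".
interface EngineSummary {
  firstPaintP50Ms: number | null;
  finalStableP50Ms: number | null;
  runP50Ms: number | null;
  throughputP50: number | null; // chars/s, higher is better
}

type CheckStatus = "pass" | "fail" | "pending";

interface GateResult {
  status: "PASS" | "FAIL" | "PENDING";
  pass: number;
  fail: number;
  pending: number;
}

// Compares one metric; latency metrics pass when lower or equal,
// throughput passes when higher or equal.
function checkMetric(candidate: number | null, baseline: number | null, higherIsBetter: boolean): CheckStatus {
  if (candidate === null || baseline === null) return "pending";
  const ok = higherIsBetter ? candidate >= baseline : candidate <= baseline;
  return ok ? "pass" : "fail";
}

function evaluateGate(streamMdx: EngineSummary, streamdown: EngineSummary): GateResult {
  const checks: CheckStatus[] = [
    checkMetric(streamMdx.firstPaintP50Ms, streamdown.firstPaintP50Ms, false),
    checkMetric(streamMdx.finalStableP50Ms, streamdown.finalStableP50Ms, false),
    checkMetric(streamMdx.runP50Ms, streamdown.runP50Ms, false),
    checkMetric(streamMdx.throughputP50, streamdown.throughputP50, true),
  ];
  const count = (s: CheckStatus): number => checks.filter((c) => c === s).length;
  const pass = count("pass");
  const fail = count("fail");
  const pending = count("pending");
  const status = fail > 0 ? "FAIL" : pending > 0 ? "PENDING" : "PASS";
  return { status, pass, fail, pending };
}
```

With no samples on either side, all four checks are pending and the result matches the idle state shown above: PENDING with pass=0, fail=0, pending=4.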
Renderer | Emit→ingest p50 | Ingest→commit p50 | Emit→commit p50 | Append overhead p50 | Timer drift p50 / p95
StreamMDX | - | - | - | - | - / -
Streamdown | - | - | - | - | - / -
react-markdown | - | - | - | - | - / -
StreamMDX
Incremental worker parser + patch renderer
No stream data yet.
Streamdown
Drop-in streaming replacement for react-markdown
No stream data yet.
react-markdown
Baseline markdown renderer (full re-render on updates)
No stream data yet.

Method note: each engine runs a warmup pass and then a scored pass in isolation. Commit timing capture is unified across engines via layout-effect commit hooks to avoid instrumentation bias. Use CI locked mode for claim-grade runs; use Explore mode to diagnose bottlenecks. "First paint" measures the earliest visible commit; "final stable" reflects how quickly each delta settles after downstream formatting and render passes.

Static render comparison (content types)

Measures one-shot static rendering across common-markdown content classes plus one richer StreamMDX-only stress workload. Times are captured per engine as first mutation and final settled mutation.

State: idle
Iterations: 3
Progress: 0 / 0
Active fixture: -
Active engine: -
Timeouts: 0
Iterations per fixture: 3
Prose heavy • Table heavy • Code heavy • Mixed rich markdown • Rich feature stress
Content type | Description | StreamMDX first/final p50 | Streamdown first/final p50 | react-markdown first/final p50 | Final p50 winner
Prose heavy | Long narrative text with headings, lists, and inline formatting. | - / - (n=0) | - / - (n=0) | - / - (n=0) | -
Table heavy | Large table blocks with short explanatory text. | - / - (n=0) | - / - (n=0) | - / - (n=0) | -
Code heavy | Multiple fenced code blocks with surrounding markdown. | - / - (n=0) | - / - (n=0) | - / - (n=0) | -
Mixed rich markdown | Lists, quotes, links, tasks, and tables in one document. | - / - (n=0) | - / - (n=0) | - / - (n=0) | -
Rich feature stress | Math, MDX, HTML, tables, and code in one workload. Timed only where the engine exposes those capabilities in this harness. | - / - (n=0) | unsupported in this harness | unsupported in this harness | -