Reproducible streaming and static markdown comparisons
This page distinguishes live incremental behavior from one-shot static rendering. The goal is not to publish a single vanity number, but to let you inspect how different renderers behave under the same fixtures, chunk cadence, and browser session.
- Results are local and hardware-dependent. Compare engines under the same browser, fixture, and scheduler settings.
- Live incremental numbers answer a different question than static rendering numbers. Both matter and should be read separately.
- Memory, bundle size, and worker-hosting tradeoffs belong alongside latency numbers; they are not interchangeable metrics.
Time from emitted delta to the first observable DOM mutation for an engine.
Why it matters: This is the most user-visible latency metric during streaming.
Time from emitted delta to the final stable DOM state for that update window.
Why it matters: This captures whether the renderer settles quickly or churns after visible output appears.
Measured time across the ingest, scheduling, and commit path before content becomes visible.
Why it matters: It exposes scheduler pressure and batching behavior under real incremental streams.
One-shot render timing for prose, tables, code, and mixed markdown fixtures.
Why it matters: It shows how engines behave outside the delta-stream case and catches content-class cliffs.
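As a rough illustration of the two delta-latency metrics above (time to first observable mutation and time to final stable state), the sketch below derives both from plain timestamps. The `DeltaWindow` shape and the `measureDelta` name are hypothetical, not part of the lab's code, and the sketch assumes mutation timestamps have already been attributed to a single update window.

```typescript
// Hypothetical input: one emitted delta plus the DOM mutation timestamps
// attributed to its update window (all values in milliseconds).
interface DeltaWindow {
  emittedAt: number;
  mutationTimes: number[];
}

interface DeltaLatency {
  firstVisibleMs: number | null; // emit -> first observable mutation
  finalStableMs: number | null;  // emit -> last mutation in the window
}

function measureDelta(w: DeltaWindow): DeltaLatency {
  // Ignore mutations that happened before the delta was emitted.
  const after = w.mutationTimes.filter((t) => t >= w.emittedAt);
  if (after.length === 0) {
    return { firstVisibleMs: null, finalStableMs: null };
  }
  return {
    firstVisibleMs: Math.min(...after) - w.emittedAt,
    finalStableMs: Math.max(...after) - w.emittedAt,
  };
}

const example = measureDelta({ emittedAt: 100, mutationTimes: [104, 112, 130] });
console.log(example); // { firstVisibleMs: 4, finalStableMs: 30 }
```

A large gap between the two values for the same delta is the "churn after visible output appears" case that the final-stable metric is meant to expose.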
Narrative markdown with headings, nested lists, links, and inline emphasis.
Dense table markup where row/cell integrity and stable layout matter more than raw token count.
Multiple fenced blocks with different languages, where syntax-highlighting cost becomes visible.
A combined fixture with tables, tasks, inline code, links, and surrounding prose.
A capability workload with math, MDX, HTML, tables, code, and footnotes. It is not a parity fixture for every engine.
The JavaScript transferred to the browser for a page route before optional worker assets are considered.
The separately served worker bundle used by StreamMDX in production when parsing is isolated off the main thread.
Everything the browser eventually executes during a benchmark session, including lazily loaded chunks and worker code.
The highest memory sample observed during a local browser run. It is environment-dependent and should only be compared inside the same session class.
Claim-grade mode. Keeps chunk cadence, order, workload, and StreamMDX scheduling deterministic enough for reproducible local comparisons.
Diagnosis mode. Lets you vary chunking, interval, ordering, and workload to find cliffs without treating the results as published baselines.
The live comparison lab exposes these modes directly. Use CI locked for reproducible comparisons and Explore to characterize scheduler sensitivity without turning the result into a public claim.
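One way to picture the difference between the two modes is a settings resolver that ignores overrides in CI locked mode and accepts them in Explore mode. This is a minimal sketch under stated assumptions: the type names, the `resolveSettings` function, and the values in `LOCKED_SETTINGS` are placeholders for illustration, not the lab's actual defaults or API.

```typescript
type RunMode = "ci-locked" | "explore";

interface RunSettings {
  chunkSize: number;              // characters per emitted delta
  intervalMs: number;             // delay between deltas
  ordering: "fixed" | "shuffled"; // order in which chunks are emitted
  workload: string;               // fixture id, e.g. "prose"
}

// Placeholder values only; the real locked settings are not stated on this page.
const LOCKED_SETTINGS: RunSettings = {
  chunkSize: 32,
  intervalMs: 16,
  ordering: "fixed",
  workload: "prose",
};

function resolveSettings(
  mode: RunMode,
  overrides: Partial<RunSettings> = {},
): RunSettings {
  // CI locked: overrides are dropped so every run uses the same cadence.
  if (mode === "ci-locked") return { ...LOCKED_SETTINGS };
  // Explore: any knob may be varied to look for cliffs.
  return { ...LOCKED_SETTINGS, ...overrides };
}

// Only CI locked runs are treated as claim-grade in this sketch.
function isClaimGrade(mode: RunMode): boolean {
  return mode === "ci-locked";
}
```

Dropping overrides in the locked path, rather than validating them, keeps a claim-grade run from silently inheriting an exploratory setting.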
The benchmark surface now includes both parity workloads and one rich feature stress workload. The parity workloads are the fair StreamMDX/Streamdown/react-markdown comparison set. The rich stress case exists to show how StreamMDX behaves when math, MDX, HTML, tables, code, and footnotes are all active in the same document. Unsupported cells are marked explicitly instead of being counted as comparable runs.
Common-markdown fixtures used for direct StreamMDX/Streamdown/react-markdown comparisons under the same browser session, scheduler mode, and scenario.
Richer workloads that exercise StreamMDX-specific features such as mixed MDX, math, HTML, footnotes, and worker-aware composition. These are shown for behavior inspection, not direct cross-engine claims.
- It can compare renderers fairly on the shared parity fixtures under the same local browser session and scheduler mode.
- It can show how StreamMDX behaves on richer feature workloads that other engines in this lab do not fully support.
- It cannot justify universal cross-machine latency or memory superiority claims outside this methodology envelope.
The current five-class public set is intentionally final for the active plan: four parity-friendly classes (prose, tables, code, mixed) plus one explicitly marked capability stress class (rich). Adding more public classes is deferred until a distinct behavior family appears that the current set does not already expose.
Live renderer comparison lab (sequential)
Renderers run one at a time to minimize CPU contention. Each engine gets an unscored warmup pass before the scored pass. Metrics are split into first-visible commit and final-stable commit.
| Renderer | Samples | First paint p50 | First paint p95 | Final stable p50 | Final stable p95 | Run p50 | Throughput p50 |
|---|---|---|---|---|---|---|---|
| StreamMDX | 0 | - | - | - | - | - | - |
| Streamdown | 0 | - | - | - | - | - | - |
| react-markdown | 0 | - | - | - | - | - | - |
| Renderer | Emit→ingest p50 | Ingest→commit p50 | Emit→commit p50 | Append overhead p50 | Timer drift p50 / p95 |
|---|---|---|---|---|---|
| StreamMDX | - | - | - | - | - / - |
| Streamdown | - | - | - | - | - / - |
| react-markdown | - | - | - | - | - / - |
Method note: each engine runs a warmup pass and then a scored pass in isolation. Commit timing is captured the same way across engines, via layout-effect commit hooks, to avoid instrumentation bias. Use CI locked mode for claim-grade runs and Explore mode to diagnose bottlenecks. "First paint" measures the earliest visible commit; "final stable" reflects how quickly each delta settles after downstream formatting and render passes.
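The p50 and p95 columns in the tables above are percentile summaries of per-delta samples. The sketch below uses the nearest-rank method, which is an assumption: the page does not say which percentile definition the lab uses. The `percentile` and `formatMs` helpers are hypothetical names. An empty sample set yields `null`, which is rendered as `-`, matching rows with zero samples.

```typescript
// Nearest-rank percentile over a list of latency samples (milliseconds).
// Returns null when there are no samples, so the caller can show "-".
function percentile(samples: number[], p: number): number | null {
  if (samples.length === 0) return null;
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing to limit floating-point rounding in the rank.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

function formatMs(value: number | null): string {
  return value === null ? "-" : `${value.toFixed(1)} ms`;
}

const scored = [4, 1, 3, 2]; // scored-pass samples only; warmup is excluded
console.log(formatMs(percentile(scored, 50))); // "2.0 ms"
console.log(formatMs(percentile(scored, 95))); // "4.0 ms"
console.log(formatMs(percentile([], 50)));     // "-"
```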
Static render comparison (content types)
Measures one-shot static rendering across common-markdown content classes plus one richer StreamMDX-only stress workload. Times are captured per engine as first mutation and final settled mutation.
| Content type | Description | StreamMDX first/final p50 | Streamdown first/final p50 | react-markdown first/final p50 | Final p50 winner |
|---|---|---|---|---|---|
| Prose heavy | Long narrative text with headings, lists, and inline formatting. | - / - (n=0) | - / - (n=0) | - / - (n=0) | - |
| Table heavy | Large table blocks with short explanatory text. | - / - (n=0) | - / - (n=0) | - / - (n=0) | - |
| Code heavy | Multiple fenced code blocks with surrounding markdown. | - / - (n=0) | - / - (n=0) | - / - (n=0) | - |
| Mixed rich markdown | Lists, quotes, links, tasks, and tables in one document. | - / - (n=0) | - / - (n=0) | - / - (n=0) | - |
| Rich feature stress | Math, MDX, HTML, tables, and code in one workload. Timed only where the engine exposes those capabilities in this harness. | - / - (n=0) | unsupported in this harness | unsupported in this harness | - |
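The "Final p50 winner" column could be derived along the following lines. This is a sketch under two assumptions that the page does not state: that the winner is the engine with the lowest final p50, and that no winner is declared unless at least two engines have comparable results. Unsupported cells and cells with `n = 0` are skipped rather than counted. The `Cell` type and `finalP50Winner` function are hypothetical.

```typescript
type EngineName = "StreamMDX" | "Streamdown" | "react-markdown";

type Cell =
  | { kind: "unsupported" }
  | { kind: "measured"; finalP50Ms: number | null; n: number };

function finalP50Winner(row: Record<EngineName, Cell>): EngineName | null {
  let winner: EngineName | null = null;
  let best = Infinity;
  let comparable = 0;

  for (const engine of Object.keys(row) as EngineName[]) {
    const cell = row[engine];
    if (cell.kind !== "measured") continue; // unsupported: never compared
    const value = cell.finalP50Ms;
    if (cell.n === 0 || value === null) continue; // no samples yet
    comparable += 1;
    if (value < best) {
      best = value;
      winner = engine;
    }
  }

  // Assumption: a single measured engine is not a cross-engine result.
  return comparable >= 2 ? winner : null;
}
```

Under these assumptions the rich feature stress row never reports a winner while only one engine supports it, which is consistent with treating that workload as behavior inspection rather than a cross-engine claim.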