CSV or TSV — any shape, any size. Datum will figure it out.
Datum is a statistical autopsy engine. Drop in any CSV or TSV file and it performs a full diagnostic: profiling every column, discovering distributions, computing correlations, flagging outliers, and generating a human-readable narrative. Every computation runs in your browser; nothing leaves your machine.
⊘ Zero AI · Zero Server · Pure Statistics
Sturges' rule for the bin count: k = ⌈log₂(n) + 1⌉.
Pearson correlation: r = Σ(xᵢ−x̄)(yᵢ−ȳ) / √(Σ(xᵢ−x̄)² · Σ(yᵢ−ȳ)²). Rendered as a heatmap with diverging color scale.
Shannon entropy: H = −Σ pᵢ log₂(pᵢ), to measure information content.
The narrative is template-driven, not AI-generated. Datum uses conditional logic over computed statistics to select sentence fragments and compose paragraphs. For example: if skewness > 1.0, it writes "heavily right-skewed"; if the strongest correlation has |r| > 0.7, it highlights the pair. The writing is deterministic: the same dataset always produces the same report.
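The formulas and template rules above can be sketched in a few lines. This is an illustration only, written in Python for readability rather than in the browser code Datum actually ships. The function names, the population form of skewness, the handling of constant columns, and the exact sentence wording beyond "heavily right-skewed" are all assumptions, not Datum's own implementation.

```python
import math
from collections import Counter


def sturges_bins(n):
    """Bin count k = ceil(log2(n) + 1) for n observations."""
    return math.ceil(math.log2(n) + 1)


def pearson_r(xs, ys):
    """Pearson correlation, term by term as in the formula above."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    denom = math.sqrt(sxx * syy)
    # Assumption: report 0.0 when a column is constant (r is undefined there).
    return cov / denom if denom else 0.0


def shannon_entropy(values):
    """H = -sum(p_i * log2(p_i)) over the observed value frequencies."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())


def skewness(xs):
    """Population skewness m3 / m2**1.5 (assumed; the estimator is not stated)."""
    mean = sum(xs) / len(xs)
    m2 = sum((x - mean) ** 2 for x in xs) / len(xs)
    m3 = sum((x - mean) ** 3 for x in xs) / len(xs)
    return m3 / m2 ** 1.5 if m2 else 0.0


def narrate(column, skew, correlations):
    """Compose sentences from fixed rules; same inputs give the same text.

    correlations maps (column_a, column_b) pairs to their r value.
    """
    sentences = []
    if skew > 1.0:
        sentences.append(f"{column} is heavily right-skewed.")
    if correlations:
        (a, b), r = max(correlations.items(), key=lambda kv: abs(kv[1]))
        if abs(r) > 0.7:
            # Hypothetical wording for the highlighted pair.
            sentences.append(f"{a} and {b} are strongly correlated (r = {r:.2f}).")
    return " ".join(sentences)
```

Because `narrate` contains no randomness and no model, calling it twice with the same statistics yields the same sentence, which is the determinism property described above.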
This is a deliberate constraint. Every number in the report is mathematically verifiable. There are no hallucinations, no probabilistic summaries, no model weights. The formulas are classical statistics: the same math used in R, SciPy, and academic papers for decades. The constraint shows that meaningful data storytelling doesn't require LLMs; it requires thoughtful computation and good writing templates.
Datum draws on Tukey's Exploratory Data Analysis (1977), the pandas-profiling philosophy of automated EDA, Welford's algorithm for online variance, and the editorial tradition of data journalism, which presents statistical findings as narrative rather than dashboards.
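For readers unfamiliar with Welford's algorithm, here is a minimal sketch of the single-pass mean and variance update it refers to. The class and method names are hypothetical, and reporting the sample variance (n − 1 denominator) is an assumption; nothing here is taken from Datum's source.

```python
class RunningStats:
    """Welford's online algorithm: one pass, no stored values."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def push(self, x):
        self.n += 1
        delta = x - self.mean            # deviation from the old mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)  # uses the updated mean

    @property
    def variance(self):
        """Sample variance; 0.0 until at least two values have arrived."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.push(value)
```

The update avoids subtracting two large sums of squares, which is why it tends to be more numerically stable than the textbook two-sum formula when values arrive as a stream, such as rows parsed from a large file.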