How it
works.
The case is answered on the dashboard. This page shows the machinery behind it: the data pipeline, the engine choice, and the actual SQL.
Three lanes, one loop.
Static files on a CDN, an in-browser SQL engine, and a parquet file that never leaves the user's device. The same loop Power BI runs — without the desktop tool, the .pbix file, or the workspace setup.
- 15 Excel files~94 MB raw
- Python ETLpandas + openpyxl
- sales.parquet5.8 MB · ZSTD
- Static HTML/CSS/JSNext.js build
- duckdb.wasm~3 MB · lazy
- sales.parquet5.8 MB · cached
- React UIdate picker → event
- DuckDB-WASMSQL window funcs
- Recharts vizKPIs + chart
The same loop Power BI runs — load, aggregate, render — but compiled into a few megabytes of static files. No server, no scheduler, no license. Every visitor gets the same engine running on their device.
Excel → Parquet, in six steps.
The data pipeline runs once, offline. It produces a single 5.8 MB parquet that downstream pages query live.
Read
15 .xlsx files via pandas + openpyxl. Header at row 3 (rows 1–2 are title/blank).
pd.read_excel(path, sheet_name='Günlük Satis Raporu', header=2)Rename
Turkish source columns → canonical English. Drop the always-null Metrics column.
df.rename(columns=COLUMN_MAP)Coerce
Numeric columns to Int32/Int64/Float32/Float64 with errors=coerce → NaN. Day to date32.
df['day'] = pd.to_datetime(df['day']).dt.dateConcatenate
All 15 frames into one DataFrame. 1,784,613 rows × 11 columns.
df = pd.concat(frames, ignore_index=True)Validate
Null rates, duplicate (store, product, day) keys, p99.9 outliers, negatives, zero-sales rows. Honest report.
validation.json → 6 anomalies surfacedWrite
Parquet with ZSTD level 9, dictionary-encoded strings (vendor/store/product_name), 64k row groups for mobile streaming.
pq.write_table(t, path, compression='zstd', compression_level=9)Why DuckDB-WASM.
Three real alternatives. One fit.
Pre-aggregate to JSON
- Tiny payload (~100 KB)
- Zero engine bundle
- Every new question = new aggregation + redeploy
- Schema locked at build time
- No flexibility for the date filter case requirement
Backend API + SQL DB
- Server-side SQL flexibility
- Hide raw data
- Needs hosting & uptime
- Network round-trip per query
- Breaks offline
- Adds auth, CORS, rate-limit complexity
DuckDB-WASM in browser
- Real SQL flexibility, full window functions
- Offline-first after first load (SW cache)
- Zero backend, zero scheduling, zero ops
- URL-shareable, embed-anywhere
- ~3 MB engine bundle (paid once, cached forever)
One SQL statement. Four windows.
This is the exact query that runs every time you move the date picker on the dashboard. No paraphrase, no simplification.
WITH sel AS (
SELECT CAST('2022-08-16' AS DATE) AS sel_date -- ❶ user picks the date
)
SELECT
-- ❷ YTD: Jan 1 of selected year → selected date
SUM(CASE WHEN day BETWEEN date_trunc('year', (SELECT sel_date FROM sel))
AND (SELECT sel_date FROM sel)
THEN net_sales END) AS ytd,
-- ❸ MTD: 1st of selected month → selected date
SUM(CASE WHEN day BETWEEN date_trunc('month', (SELECT sel_date FROM sel))
AND (SELECT sel_date FROM sel)
THEN net_sales END) AS mtd,
-- ❹ Previous-year mirror windows (offset by INTERVAL 1 YEAR)
SUM(CASE WHEN day BETWEEN date_trunc('year', (SELECT sel_date FROM sel) - INTERVAL 1 YEAR)
AND (SELECT sel_date FROM sel) - INTERVAL 1 YEAR
THEN net_sales END) AS ytd_py,
SUM(CASE WHEN day BETWEEN date_trunc('month', (SELECT sel_date FROM sel) - INTERVAL 1 YEAR)
AND (SELECT sel_date FROM sel) - INTERVAL 1 YEAR
THEN net_sales END) AS mtd_py
FROM sales;The user-selected date enters as a CTE. Everything else references it — there's no max(day) trick.
date_trunc('year', sel) gives Jan 1 of the selected year. Range: [Jan 1 → selected date].
date_trunc('month', sel) gives the 1st of the selected month. Range: [month-start → selected date].
sel − INTERVAL 1 YEAR shifts the entire window back one year, preserving day-of-year. PY% is then (curr − prev) / prev on the client.
Source columns mapped to canonical names.
Turkish source columns are renamed; types are explicit; nothing is implicit.
| Source (TR) | Canonical (EN) | Type |
|---|---|---|
| Satıcı | vendor | string |
| Satıcı Ürün Kodu | product_code | int32 |
| Alıcı Ürün Kodu | buyer_product_code | int32 |
| Satıcı Ürün Adı | product_name | string |
| Barkod | barcode | int64 |
| Mağaza | store | string |
| Satıcı Teslim Noktası Kodu | delivery_point | int32 |
| Gün_ | day | date32 |
| Metrics | — | — |
| Satış Miktarı | qty | float32 |
| Kg/Lt | kg_lt | float32 |
| Net Satis Tutari (TL) | net_sales | float64 |
Method is in place.
Now let's ask what the data actually shows →