[RFC] SLIM Data Type - Compact JSON Alternative (17-62% smaller)
| От | Marco Matteucci |
|---|---|
| Тема | [RFC] SLIM Data Type - Compact JSON Alternative (17-62% smaller) |
| Дата | |
| Msg-id | CALOXXf5iPrp3wu88T_vzQ6aH0M2T5kYm-wySjhrP8PFKw6pqJQ@mail.gmail.com обсуждение исходный текст |
| Список | pgsql-hackers |
Hi hackers,
I'd like to propose a new data type for PostgreSQL: SLIM (Structured Lightweight Interchange Markup), a compact alternative to JSONB that achieves 17-62% storage reduction while remaining human-readable.
== The Problem ==
JSONB is great, but has inherent inefficiencies:
- Redundant field names in arrays of objects
- Verbose syntax (quotes, colons, brackets)
- Boolean representation uses 4-5 bytes each
- No format-level compression
For databases with millions of JSON documents, this adds up.
== SLIM Format ==
JSON: {"users":[{"id":1,"name":"Alice","active":true},{"id":2,"name":"Bob","active":false}]}
SLIM: {users:|2|id#,name,active?|1,Alice,T|2,Bob,F}
Savings: 45%
Key optimizations:
- Table format for arrays: schema once, data in rows
- Type markers: # (number), ? (boolean), ! (null)
- Compact booleans: T/F instead of true/false
- No quotes for simple strings
== Real-World Benchmarks ==
Tested on a production TimescaleDB database (74.9 million rows, 74 GB):
Format | Sample | JSON | SLIM | Savings
--------------------------|-----------|---------|---------|--------
Object (token_metrics) | 100K rows | 16 MB | 13 MB | 18.7%
Object (wallet_activity) | 100K rows | 20 MB | 16 MB | 17.0%
Table format (arrays) | 10K rows | 580 KB | 222 KB | 61.7%
Extrapolated on 74 GB: 12.5-45.9 GB savings depending on data structure.
== Implementation ==
I've built a working PostgreSQL extension (pg_slim) that provides:
- Native SLIM data type with TOAST support
- Implicit casts to/from JSONB and TEXT
- Full operator support: ->, ->>, @>, <@, ?, ?|, ?&
- B-tree and hash indexing
- Comparison operators for sorting
Example usage:
CREATE TABLE documents (id SERIAL PRIMARY KEY, data SLIM);
INSERT INTO documents (data) VALUES ('{name:Alice,age:#30}'::slim);
INSERT INTO documents (data) VALUES (slim_encode('{"name":"Bob"}'::jsonb));
SELECT data->>'name' FROM documents; -- Works like JSONB
SELECT * FROM documents WHERE data @> '{name:Alice}'::slim;
The extension compiles and passes tests on PostgreSQL 15.
== Questions for Discussion ==
1. Is there community interest in a SLIM type?
2. Should this remain an extension or be considered for contrib/core?
3. Which index types should be prioritized (GIN support is planned)?
4. Any concerns about the format design?
== Links ==
- Extension: https://github.com/matteuccimarco/pg-slim
- Full RFC: https://github.com/matteuccimarco/pg-slim/blob/main/RFC-SLIM-TYPE.md
- SLIM spec: https://github.com/matteuccimarco/slim-protocol-core
- Benchmarks: https://github.com/matteuccimarco/pg-slim/blob/main/BENCHMARK-RESULTS.md
Looking forward to your feedback.
Best regards,
Marco Matteucci
В списке pgsql-hackers по дате отправления: