An encoder & decoder implementation of base-20, inspired by Kaktovik numerals, in a number of languages.

libb20 -- Base-20 Encoding Library (Kaktovik-Inspired)

libb20 is a multi-language reference implementation and emerging library for base-20 text and binary encoding, inspired by the Kaktovik numeral system.

It provides a modern, secure, and cross-compatible way to represent binary data using:

  • ASCII-safe alphabets (e.g. A-T)
  • Unicode digit alphabets (e.g. Kaktovik digits U+1D2C0-U+1D2D3)
  • Custom user-defined 20-symbol alphabets
  • Strict validation & big-endian canonical format
  • Binary-digits mode for clean, compact 8-bit pipelines
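As an illustration of the alphabet options above, here is a minimal Python sketch. `make_alphabet` and `validate_alphabet` are hypothetical helper names, not part of the libb20 API. The idea that an alphabet is 20 consecutive code points starting at a zero symbol is an assumption inferred from the `ZERO="A"` parameter used in the examples further down.

```python
def make_alphabet(zero: str) -> list[str]:
    """Return 20 consecutive code points starting at `zero` (hypothetical helper)."""
    if len(zero) != 1:
        raise ValueError("zero must be a single Unicode scalar value")
    return [chr(ord(zero) + i) for i in range(20)]


def validate_alphabet(symbols) -> dict[str, int]:
    """Strictly check a custom alphabet and return a symbol -> digit value map."""
    symbols = list(symbols)
    if len(symbols) != 20:
        raise ValueError("alphabet must contain exactly 20 symbols")
    if any(len(s) != 1 for s in symbols):
        raise ValueError("each symbol must be a single code point")
    if len(set(symbols)) != 20:
        raise ValueError("alphabet contains duplicate symbols")
    return {s: i for i, s in enumerate(symbols)}


ASCII_AT = make_alphabet("A")            # 'A'..'T'
KAKTOVIK = make_alphabet("\U0001D2C0")   # U+1D2C0..U+1D2D3
```

Rejecting duplicates and wrong-sized alphabets up front keeps decoding unambiguous, which is in the spirit of the strict validation listed above.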

The core objective is to create an educational, well-documented toolkit that demonstrates numeric encoding concepts across languages, while offering practical utility for real-world applications.

Features

  • Reference implementations in C90, C++20, Python 3, TypeScript
  • Text mode: <len-digits>-<payload-digits> format
  • Binary-digits mode: [len-digits] 0xFF [payload-digits]
  • Strict digit parsing - never silently ignores input
  • Big-Endian (MSB-first) canonical representation
  • Unicode support, including Kaktovik digits
  • Educational commentary & clear internal structure
  • Suitable for embedding in applications & tools
  • Designed to expand into a reusable multi-language library
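To make the text-mode framing concrete, here is a rough Python sketch. It is not the reference implementation and rests on several assumptions: that `<len-digits>` is the payload byte count written in base-20, that `<payload-digits>` is the payload read as one big-endian integer, that the alphabet is A-T, and that leading zero digits are rejected as non-canonical. The names `encode_text` and `decode_text` are hypothetical; the real wire format is whatever the sources in `src` define.

```python
DIGITS = "ABCDEFGHIJKLMNOPQRST"  # assumed A-T alphabet, 'A' = 0


def to_base20(n: int) -> str:
    """Write a non-negative integer as MSB-first base-20 digits."""
    if n == 0:
        return DIGITS[0]
    out = []
    while n:
        n, r = divmod(n, 20)
        out.append(DIGITS[r])
    return "".join(reversed(out))


def from_base20(s: str) -> int:
    """Strictly parse MSB-first base-20 digits; reject anything unexpected."""
    if not s:
        raise ValueError("empty digit run")
    if len(s) > 1 and s[0] == DIGITS[0]:
        raise ValueError("non-canonical leading zero digit")
    n = 0
    for ch in s:
        v = DIGITS.find(ch)
        if v < 0:
            raise ValueError(f"invalid digit {ch!r}")
        n = n * 20 + v
    return n


def encode_text(data: bytes) -> str:
    # The length prefix preserves leading zero bytes, which the integer
    # payload alone would lose.
    return to_base20(len(data)) + "-" + to_base20(int.from_bytes(data, "big"))


def decode_text(text: str) -> bytes:
    len_part, sep, payload = text.partition("-")
    if sep != "-":
        raise ValueError("missing '-' separator")
    length = from_base20(len_part)
    value = from_base20(payload)
    try:
        return value.to_bytes(length, "big")
    except OverflowError:
        raise ValueError("payload does not fit the declared length") from None
```

Under these assumptions, `encode_text(b"\x01\x00")` yields `C-MQ`: the length 2 is `C`, and 256 = 12·20 + 16 is `MQ`.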

📦 Status

This project currently serves as a reference + teaching implementation. Library maturation and packaging are in progress.

Supported and planned languages:

| Language   | Status          |
|------------|-----------------|
| C90        | [x] (reference) |
| C++20      | [x] (reference) |
| Python     | [x] (reference) |
| TypeScript | [x] (reference) |
| Rust       | [ ] planned     |
| Ruby       | [ ] planned     |
| Lua        | [ ] planned     |

Future targets: Go, Java, Zig -- community suggestions welcome.

📚 Example Usage

Python

from kb20py import kb20_encode_ascii, kb20_decode_ascii

data = "Hello, world!".encode()
enc = kb20_encode_ascii(data, ZERO="A")
dec = kb20_decode_ascii(enc, ZERO="A")

assert dec == data

TypeScript

import { kb20EncodeAscii, kb20DecodeAscii } from "./kb20ts.ts";

const text = "this is test text";
const enc = kb20EncodeAscii(new TextEncoder().encode(text), {ZERO:"A"});
const dec = kb20DecodeAscii(enc, {ZERO:"A"});

console.assert(new TextDecoder().decode(dec) === text);

C90 (CLI)

echo "hello, encoder!" | ./kb20c encode > out.b20
./kb20c decode < out.b20

🧠 Why Base-20?

This project blends:

  • mathematical clarity (Horner folding, MSB-first encoding)
  • cultural appreciation of indigenous numeric systems
  • practical encoding design (text + binary modes)
  • cross-language software engineering pedagogy
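One plausible reading of "Horner folding" is sketched below in Python: each input byte is folded into a running base-20 digit array by multiplying the array by 256 and adding the byte, which is how a language without big integers (such as C90) might perform the conversion. This is an assumption about the technique, not a transcription of the reference code, and `fold_bytes_to_base20` is a hypothetical name.

```python
def fold_bytes_to_base20(data: bytes) -> list[int]:
    """Fold bytes MSB-first into base-20 digit values, returned MSB-first.

    Horner's rule: value = (...((b0 * 256 + b1) * 256 + b2)...) * 256 + bn,
    evaluated directly on an array of base-20 digits.
    """
    digits = [0]  # least-significant digit first while folding
    for byte in data:
        carry = byte
        # Multiply the whole number by 256 and add the new byte.
        for i in range(len(digits)):
            carry += digits[i] * 256
            digits[i] = carry % 20
            carry //= 20
        # Any remaining carry extends the number with new high digits.
        while carry:
            digits.append(carry % 20)
            carry //= 20
    digits.reverse()  # switch to MSB-first for output
    return digits
```

For example, `fold_bytes_to_base20(b"\x01\x00")` returns `[12, 16]`, since 256 = 12·20 + 16. Leading zero bytes do not contribute digits, so a format built on this fold needs a separate length field to restore them.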

It is not a novelty cipher -- it is a serious educational + technical project.

🧪 Testing & Verification

Each implementation is round-tripped against the others to ensure:

  • identical encoding semantics
  • strict error handling
  • consistent big-endian canonical output
  • Unicode scalar correctness
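The checks above could be driven by a small harness along the following lines. This is only a sketch: `edge_case_samples`, `check_round_trip`, and `check_agreement` are hypothetical names, and the codecs are passed in as plain callables because this README does not document how each implementation is invoked from a test.

```python
import os


def edge_case_samples():
    """Yield inputs that tend to expose framing and leading-zero bugs."""
    yield b""
    yield b"\x00"
    yield b"\x00\x00\x01"
    yield b"\xff" * 8
    yield "Hello, world!".encode()
    yield os.urandom(32)  # one random sample per run


def check_round_trip(encode, decode, samples):
    """Return the samples for which decode(encode(x)) != x or an error is raised."""
    failures = []
    for data in samples:
        try:
            ok = decode(encode(data)) == data
        except Exception:
            ok = False
        if not ok:
            failures.append(data)
    return failures


def check_agreement(encoders, samples):
    """Return (sample, outputs) pairs on which the named encoders disagree."""
    mismatches = []
    for data in samples:
        outputs = {name: enc(data) for name, enc in encoders.items()}
        if len(set(outputs.values())) > 1:
            mismatches.append((data, outputs))
    return mismatches
```

The harness can be exercised with any stand-in codec, for instance `check_round_trip(bytes.hex, bytes.fromhex, edge_case_samples())`, before wiring in the actual encoders and decoders.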

Test harnesses & CI coming soon.

🎮 Future Integrations

This library will be embedded into the Ctrl-Alt-Compete game project as an in-game encoding mechanic (and puzzle system). This is part of the broader Foster open-learning initiative.

🤝 Contributing

Contributions welcome -- especially:

  • new language bindings
  • API docs & examples
  • educational commentary & tutorials
  • fuzzing / CI test suites

📄 License

This project is licensed under the BSD 3-Clause License, allowing for both personal and commercial use while ensuring proper attribution to contributors. Please see license details in LICENSE.


🌱 Project Philosophy

Teach. Preserve. Build. Numeric systems are cultural artifacts -- software is a form of language. libb20 exists to explore both.