feat: libb20 C++20 #12

Merged
sezieru merged 11 commits from feat/lib/cpp20/main into main 2025-11-05 07:00:48 +00:00

🚀 PR: Introduce C++20 libb20 Library and Reference CLI

Branch: feat/lib/cpp20/main
Version Target: v0.5.0


Summary

This PR adds the full C++20 implementation of the libb20 base-20 codec library, complete with an idiomatic C++ API, UTF-8–aware text mode, binary-digit mode, and a new kb20-cpp20ref command-line reference tool.

It modernizes the original C90 codebase while preserving 100% on-wire compatibility with the base-20 format defined by the C90 reference implementation.


🔧 Key Features

  • C++20 Core Library (libb20)

    • b20::codec class with clean encode/decode interface.
    • b20::status, error_category(), and status_message() for std::error_code integration.
    • b20::alphabet and b20::options for configurable alphabets and binary-digit modes.
    • Stream manipulators:
      std::cin >> b20::encode(codec) >> text;   // text is a std::string; receives the encoded stream
      std::cout << b20::decode(codec) << text;  // text is a std::string; decoded bytes are written out
  • Reference CLI (kb20-cpp20ref)

    • Command-line parity with the C90 reference (encode / decode).
    • Supports both text and binary-digits modes.
    • Options: --text, --bin|--binary-digits, --zero <utf8-scalar>, --alphabet <20-utf8-scalars>.
    • Reads from stdin, writes to stdout (binary-safe).
    • Uses new helper libraries for robust I/O and UTF-8 parsing.
  • Generic Helper Libraries (contrib/)

    • contrib/io_helpers: robust read_all / write_all / read_exact / read_until utilities.
    • contrib/utf8_helpers: compact UTF-8 decoder/encoder with parse_n_exact and parse_prefix_n.
    • Fully decoupled from libb20 — usable across tools and projects.
  • Bugfixes / Parity

    • Corrected remainder logic in base-20 division (cur % base instead of cur & base; a bitwise AND with 20 is not a remainder).
    • Validates alphabets (no '-', no surrogates, 20 distinct scalars).
    • Confirmed output parity with the C90 reference:
      echo "this is test text" | kb20-cpp20ref encode → matches C90 output (S-…).
  • Build System

    • Updated Makefile with C++20 flags, cleaning, and debug target.
    • Includes contrib/ path for helper libs.
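
The std::error_code integration listed under the core library usually follows the standard <system_error> pattern: an enum, a category singleton, a make_error_code overload found by ADL, and an is_error_code_enum specialization. The sketch below illustrates that pattern only. The namespace `demo`, the enumerator names, and the message strings are hypothetical and not taken from libb20.hpp; only the names status, error_category(), and status_message() appear in this PR.

```cpp
#include <string>
#include <system_error>
#include <type_traits>

namespace demo {  // hypothetical namespace, not libb20's declarations

// Enumerators are illustrative assumptions; 0 must mean success.
enum class status { ok = 0, invalid_digit, invalid_alphabet, truncated_input };

inline const char* status_message(status s) noexcept {
    switch (s) {
        case status::ok:               return "ok";
        case status::invalid_digit:    return "invalid base-20 digit";
        case status::invalid_alphabet: return "invalid alphabet";
        case status::truncated_input:  return "truncated input";
    }
    return "unknown status";
}

// One category object per program; error_code compares categories by address.
inline const std::error_category& error_category() noexcept {
    struct category final : std::error_category {
        const char* name() const noexcept override { return "demo.b20"; }
        std::string message(int ev) const override {
            return status_message(static_cast<status>(ev));
        }
    };
    static const category instance{};
    return instance;
}

// Found by ADL when a status is converted to std::error_code.
inline std::error_code make_error_code(status s) noexcept {
    return {static_cast<int>(s), error_category()};
}

}  // namespace demo

namespace std {
// Opts the enum in to implicit conversion to std::error_code.
template <>
struct is_error_code_enum<demo::status> : true_type {};
}  // namespace std
```

With this in place, `std::error_code ec = demo::status::invalid_digit;` compiles, `ec` tests true in a boolean context, and `ec.message()` returns the text from status_message().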

🧩 Directory Layout

src/cpp/
├── contrib/
│   ├── io_helpers.cpp
│   ├── io_helpers.hpp
│   ├── utf8_helpers.cpp
│   └── utf8_helpers.hpp
├── kb20-cpp20ref.cpp
├── libb20.cpp
├── libb20.hpp
└── Makefile

✅ Testing & Verification

  • Verified encoded/decoded parity with the existing C90 CLI (kb20-c90ref).
  • Confirmed correct UTF-8 parsing for single- and multi-byte scalars.
  • CI builds clean with clang++ -std=c++20 -Wall -Wextra -Wpedantic -O2.
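
For context on the single- and multi-byte cases checked above, here is a minimal sketch of decoding one UTF-8 scalar. It is not the code in contrib/utf8_helpers: `decode_scalar` and `decoded_scalar` are hypothetical names, and the strictness choices (rejecting overlong forms, surrogates, and values above U+10FFFF) are assumptions about what correct parsing means here.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical result type: the scalar value and how many bytes it used.
struct decoded_scalar {
    char32_t cp;
    std::size_t len;
};

// Decode the first UTF-8 scalar in s, or return nullopt if it is malformed.
inline std::optional<decoded_scalar> decode_scalar(std::string_view s) {
    if (s.empty()) return std::nullopt;
    const auto b0 = static_cast<unsigned char>(s[0]);

    std::size_t len = 0;
    char32_t cp = 0;
    char32_t min = 0;  // smallest value that legitimately needs `len` bytes
    if (b0 < 0x80)                { len = 1; cp = b0;        min = 0x0; }
    else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; min = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }
    else return std::nullopt;     // stray continuation byte or invalid lead byte

    if (s.size() < len) return std::nullopt;  // truncated sequence
    for (std::size_t i = 1; i < len; ++i) {
        const auto b = static_cast<unsigned char>(s[i]);
        if ((b & 0xC0) != 0x80) return std::nullopt;  // not a continuation byte
        cp = (cp << 6) | (b & 0x3F);
    }

    if (cp < min) return std::nullopt;                       // overlong encoding
    if (cp >= 0xD800 && cp <= 0xDFFF) return std::nullopt;   // surrogate
    if (cp > 0x10FFFF) return std::nullopt;                  // outside Unicode
    return decoded_scalar{cp, len};
}
```

A flag such as --zero would, under these assumptions, accept its argument only when the first scalar decodes and its length equals the argument's full byte length.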
sezieru self-assigned this 2025-11-05 07:00:24 +00:00
- Introduce b20::status + error_category() + status_message() (std::error_code integration)
- Add b20::alphabet (defaults: [A..T] contiguous, [0-9A-J] mapped) with validate_text()
- Add b20::options { binary_digits, alphabet }
- Implement b20::codec interface:
  - no-throw encode()/decode() (bytes<->bytes) + throwing overloads
  - text helpers: encode_text()/decode_text()
  - binary-digits helpers: encode_binary()/decode_binary()
- Provide two-step stream manipulators:
  - istream >> b20::encode(codec) >> std::string (encode remaining stream -> text)
  - ostream << b20::decode(codec) << std::string (decode text -> raw bytes)
- Keep core I/O-agnostic; future helpers (read_all, etc.) will live in a separate header

Refs: feat/lib/cpp20/main
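
The two-step manipulators described in this commit can be built with a small proxy type: the first `>>` binds the stream to the codec, and the second `>>` does the work. The sketch below shows that general technique under stated assumptions. `toy_codec` is a hypothetical stand-in that hex-encodes its input; it is not b20::codec, and `encode_tag` and `encode_proxy` are invented names, so libb20's actual implementation may look different.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <string_view>

// Hypothetical stand-in codec: hex-encodes bytes. Not the base-20 codec.
struct toy_codec {
    std::string encode_text(std::string_view raw) const {
        static constexpr char hex[] = "0123456789abcdef";
        std::string out;
        out.reserve(raw.size() * 2);
        for (unsigned char b : raw) {
            out += hex[b >> 4];
            out += hex[b & 0x0F];
        }
        return out;
    }
};

// Step 0: encode(codec) only remembers which codec to use.
struct encode_tag {
    const toy_codec* codec;
};
inline encode_tag encode(const toy_codec& c) { return {&c}; }

// Step 1: stream >> tag yields a proxy holding both the stream and the codec.
struct encode_proxy {
    std::istream& in;
    const toy_codec& codec;
};
inline encode_proxy operator>>(std::istream& in, encode_tag t) {
    return {in, *t.codec};
}

// Step 2: proxy >> string reads the remaining stream and stores the encoding.
inline std::istream& operator>>(encode_proxy p, std::string& out) {
    const std::string raw{std::istreambuf_iterator<char>(p.in),
                          std::istreambuf_iterator<char>()};
    out = p.codec.encode_text(raw);
    return p.in;
}
```

Reading through std::istreambuf_iterator keeps whitespace and newlines, which matters for a codec that must see every input byte. The output direction would mirror this with an std::ostream proxy.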
- Implement error_category() and status_message()
- Enforce alphabet validation (20 distinct scalars, no '-', no surrogates)
- Implement codec encode/decode (no-throw) + throwing overloads
- Add text helpers (encode_text/decode_text) and binary-digits helpers
- Provide two-step stream proxies:
  - istream >> b20::encode(codec) >> std::string
  - ostream << b20::decode(codec) << std::string
- Preserve C90 on-wire compatibility (5 bytes <-> 10 digits; big-endian folds)

Refs: feat/lib/cpp20/main
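
The 5-byte to 10-digit pairing works because 20^10 (10,240,000,000,000) exceeds 2^40 (1,099,511,627,776), while 20^9 does not. The sketch below shows the arithmetic for one full block. It is an illustration, not the library's code: `fold_be`, `encode_block`, and `decode_block` are hypothetical names, most-significant-digit-first order is an assumption, the contiguous A..T alphabet is the default mentioned in this PR, and framing details of the wire format (prefixes, partial blocks) are not reproduced.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

constexpr std::uint64_t kBase = 20;
constexpr std::size_t kBlockBytes = 5;
constexpr std::size_t kBlockDigits = 10;

using block = std::array<std::uint8_t, kBlockBytes>;

// Fold 5 bytes into one integer, most significant byte first (big-endian).
constexpr std::uint64_t fold_be(const block& in) {
    std::uint64_t cur = 0;
    for (std::uint8_t b : in) cur = (cur << 8) | b;
    return cur;
}

// Emit 10 digits; digit order and the 'A'..'T' mapping are assumptions.
inline std::string encode_block(const block& in) {
    std::uint64_t cur = fold_be(in);
    std::string out(kBlockDigits, 'A');
    for (std::size_t i = kBlockDigits; i-- > 0;) {
        // Remainder via %, not &: 20 is not a power of two, so a mask is wrong.
        out[i] = static_cast<char>('A' + cur % kBase);
        cur /= kBase;
    }
    return out;
}

// Inverse of encode_block; nullopt on bad length, bad digit, or overflow.
inline std::optional<block> decode_block(std::string_view digits) {
    if (digits.size() != kBlockDigits) return std::nullopt;
    std::uint64_t cur = 0;
    for (char c : digits) {
        if (c < 'A' || c > 'T') return std::nullopt;
        cur = cur * kBase + static_cast<std::uint64_t>(c - 'A');
    }
    if ((cur >> 40) != 0) return std::nullopt;  // does not fit in 5 bytes
    block out{};
    for (std::size_t i = kBlockBytes; i-- > 0;) {
        out[i] = static_cast<std::uint8_t>(cur & 0xFF);
        cur >>= 8;
    }
    return out;
}
```

Because 20^10 is larger than 2^40, some 10-digit strings (for example ten `T` digits) have no 5-byte preimage, which is why the decoder in this sketch rejects them.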
- util_utf8: status, decode_one(+exact), encode_cp, parse_n_exact<N>, parse_prefix_n<N>
- util_io: read_all/read_line/read_exact/read_until, write_all (std::errc)
- No libb20 dependency; reusable across tools
- Kept separate from core; for CLIs/demos
- alphabet_from_zero/map wrappers (validated via validate_text)
- Added comments to help indicate which headers supply used features
- Text (default) and --bin modes
- --zero <scalar> / --alphabet <20 scalars> (text mode only)
- Reads stdin, writes stdout (binary-safe); C90 parity UX
- Utilize util_io::read_all()/write_all() for robust stdin/stdout
- Utilize util_utf8::decode_one_exact(), parse_n_exact<20>() for flags
- Alphabet config/validation via core lib only
- Default & debug builds covered (overridable CXX/CXXFLAGS)
- Targets: kb20-cpp20ref, clean, debug
- Links core (libb20) and generic helpers (io/utf8) by default
- Move io_helpers.{hpp,cpp} and utf8_helpers.{hpp,cpp} into contrib/
- Keeps core lib clean; helpers remain optional/reusable
- Add -Icontrib to INCL
- Ensure all compilation targets reference INCL
- Update LIBS_SRC to point to new source paths
- CLI ref unmodified, relying on INCL
- No functional changes
Reference: fosster/libb20#12