Feat/wkb rewrite #45

Draft
wietzesuijker wants to merge 6 commits into geoarrow:main from wietzesuijker:feat/wkb-rewrite

@wietzesuijker

Summary

Rewrites the WKB parser from scratch, replacing the @loaders.gl/wkt dependency with direct DataView parsing. Adds a WKB encoder (toWkb) for round-trip capability. Modeled on geoarrow-rs.

Follows up on the review feedback in #44.

What changed

| File | What it does |
| --- | --- |
| types.ts | WkbType enum (1-6), Dimension enum, coordSize() |
| header.ts | ISO WKB + EWKB header parsing (endianness, type, dim, SRID) |
| capacity.ts | Pre-calculate buffer sizes from WKB structure without parsing coordinates |
| reader.ts | parseWkb(data): two-pass (scan + fill), auto-detects type and dimensions |
| writer.ts | toWkb(data): encodes GeoArrow arrays to ISO WKB (little-endian) |
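
To make the header.ts row concrete, here is a minimal sketch of reading an ISO WKB or EWKB header with a DataView. It is not the code from this PR: `WkbHeader` and `readWkbHeader` are hypothetical names, and the only assumptions are the standard layouts (ISO encodes dimensions as `base + 1000 * dim`, EWKB uses high-bit flags for Z, M, and SRID).

```typescript
// Hypothetical sketch, not the PR's header.ts.
interface WkbHeader {
  littleEndian: boolean;
  baseType: number; // 1..6: Point .. MultiPolygon
  hasZ: boolean;
  hasM: boolean;
  srid: number | null; // only present in EWKB
  byteLength: number; // bytes consumed by the header
}

// EWKB flag bits stored in the high bits of the type word.
const EWKB_Z = 0x80000000;
const EWKB_M = 0x40000000;
const EWKB_SRID = 0x20000000;

function readWkbHeader(view: DataView, offset = 0): WkbHeader {
  if (offset + 5 > view.byteLength) {
    throw new RangeError("WKB header truncated");
  }
  const order = view.getUint8(offset);
  if (order !== 0 && order !== 1) {
    throw new RangeError(`Invalid WKB byte order: ${order}`);
  }
  const littleEndian = order === 1; // 1 = NDR (little), 0 = XDR (big)
  const rawType = view.getUint32(offset + 1, littleEndian);

  // EWKB: dimensions and SRID presence are flags on the type word.
  let hasZ = (rawType & EWKB_Z) !== 0;
  let hasM = (rawType & EWKB_M) !== 0;
  const hasSrid = (rawType & EWKB_SRID) !== 0;

  // ISO: the remaining code is base + 1000 (Z), 2000 (M), or 3000 (ZM).
  const code = rawType & 0x0fffffff;
  const isoDim = Math.floor(code / 1000);
  const baseType = code % 1000;
  if (isoDim > 3 || baseType < 1 || baseType > 6) {
    throw new RangeError(`Unsupported WKB type code: ${code}`);
  }
  if (isoDim === 1 || isoDim === 3) hasZ = true;
  if (isoDim === 2 || isoDim === 3) hasM = true;

  let srid: number | null = null;
  let byteLength = 5;
  if (hasSrid) {
    if (offset + 9 > view.byteLength) {
      throw new RangeError("EWKB SRID truncated");
    }
    srid = view.getUint32(offset + 5, littleEndian);
    byteLength = 9;
  }
  return { littleEndian, baseType, hasZ, hasM, srid, byteLength };
}
```

Coordinates start at `offset + byteLength` for a Point; the other types continue with a uint32 count.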

What's new vs the old parser

  • Zero dependencies: @loaders.gl/wkt and @loaders.gl/schema removed
  • All 6 OGC types: Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon
  • Auto-detect: geometry type and coordinate dimensions read from WKB header (no caller args)
  • EWKB: PostGIS extended WKB with SRID and Z/M flags
  • Null handling: Arrow validity bitmap instead of throwing
  • Round-trip: toWkb(parseWkb(wkb)) produces identical bytes
  • Bounds checking: malformed WKB throws instead of allocating unbounded buffers
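
The null-handling and bounds-checking bullets can be sketched as follows. This is an illustration only: `buildValidity` and `readCheckedCount` are hypothetical helpers, not functions from this PR. The first builds an Arrow-style validity bitmap (bit `i` set means row `i` is valid, least-significant bit first), and the second rejects an element count that could not possibly fit in the remaining bytes before any buffer is allocated.

```typescript
// Hypothetical helpers, not the PR's implementation.

// Build an Arrow-style validity bitmap: one bit per row, LSB first.
function buildValidity(
  values: ReadonlyArray<Uint8Array | null>,
): { bitmap: Uint8Array; nullCount: number } {
  const bitmap = new Uint8Array(Math.ceil(values.length / 8));
  let nullCount = 0;
  for (let i = 0; i < values.length; i++) {
    if (values[i] === null) {
      nullCount++; // leave the bit at 0 to mark the row as null
    } else {
      bitmap[i >> 3] |= 1 << (i & 7);
    }
  }
  return { bitmap, nullCount };
}

// Read a uint32 element count and verify that the declared number of
// items can fit in what is left of the buffer.
function readCheckedCount(
  view: DataView,
  offset: number,
  littleEndian: boolean,
  minBytesPerItem: number,
): number {
  if (offset + 4 > view.byteLength) {
    throw new RangeError("WKB truncated before element count");
  }
  const count = view.getUint32(offset, littleEndian);
  const remaining = view.byteLength - (offset + 4);
  if (count * minBytesPerItem > remaining) {
    throw new RangeError(
      `WKB declares ${count} items but only ${remaining} bytes remain`,
    );
  }
  return count;
}
```

For a 2D LineString, for example, `minBytesPerItem` would be 16 (two float64 values per coordinate).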

Round-trip tested against geoarrow-data fixtures for all 6 types.

AI (Claude) supported my development of this PR.

kylebarron and others added 6 commits March 30, 2026 14:27
* Import from apache arrow root level

* Fix lint

* Consolidate imports to use the main one
Replace @loaders.gl/wkt-based WKB parser with zero-dependency
implementation modeled on geoarrow-rs. Direct DataView parsing,
auto-detect type/dim from WKB headers, null handling via Arrow
validity bitmap, round-trip encode/decode for all 6 OGC types.
Bounds-checked against malformed input. ISO WKB + EWKB support.

63 tests: unit, round-trip, and geoarrow-data corpus.
@kylebarron

Member

This is cool but not really reviewable as one big PR. If you want to work on this for geoarrow-js, I think we'd need to start with

  • A dedicated PR with a detailed markdown description of the work plan, including, for example, how the different parts of geoarrow-rs are planned to map onto geoarrow-js.
  • A series of bite-size PRs that each solve one part of the plan.

Part of the TypeScript design also needs some thought. For example, here:

      const cap = capacity as PointCapacity;

An effective type-driven design should never need a cast like this. It can be handled with overloads, discriminated unions, etc.
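
As an illustration of the discriminated-union option, here is a minimal sketch. Only the name `PointCapacity` comes from the snippet above; its fields, the other capacity types, the `kind` tag, and `coordBufferLength` are all assumptions made for the example.

```typescript
// Hypothetical shapes: the real capacity types in the PR may differ.
interface PointCapacity {
  kind: "point";
  geomCount: number;
}

interface LineStringCapacity {
  kind: "linestring";
  geomCount: number;
  coordCount: number;
}

interface PolygonCapacity {
  kind: "polygon";
  geomCount: number;
  ringCount: number;
  coordCount: number;
}

type Capacity = PointCapacity | LineStringCapacity | PolygonCapacity;

// Switching on the `kind` tag narrows `cap` in each branch,
// so no `as PointCapacity` cast is needed.
function coordBufferLength(cap: Capacity, coordSize: number): number {
  switch (cap.kind) {
    case "point":
      // One coordinate per geometry.
      return cap.geomCount * coordSize;
    case "linestring":
    case "polygon":
      return cap.coordCount * coordSize;
    default: {
      // Compile-time exhaustiveness check: adding a new variant
      // without handling it makes this assignment fail to type-check.
      const unreachable: never = cap;
      return unreachable;
    }
  }
}
```

Adding a new variant to `Capacity` then produces a compile error at the `never` assignment until the switch handles it.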

@wietzesuijker wietzesuijker marked this pull request as draft April 1, 2026 18:32
@wietzesuijker

Author

Makes sense. Converting to draft. I opened an issue with the work plan, geoarrow-rs mapping, and a proposed fix for the type casts: #46.


4 participants