-
Notifications
You must be signed in to change notification settings - Fork 49
[SEA-NodeJS] (2/8) napi-rs binding consumer — TS loader + build script #380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
9851f73
7e483b2
f735b0d
72a8109
3f09081
ca312b3
6eb7690
1a8ab56
d6101aa
cee1e14
87bc9ce
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,219 @@ | ||
| // Copyright (c) 2026 Databricks, Inc. | ||
| // | ||
| // Licensed under the Apache License, Version 2.0 (the "License"); | ||
| // you may not use this file except in compliance with the License. | ||
| // You may obtain a copy of the License at | ||
| // | ||
| // http://www.apache.org/licenses/LICENSE-2.0 | ||
| // | ||
| // Unless required by applicable law or agreed to in writing, software | ||
| // distributed under the License is distributed on an "AS IS" BASIS, | ||
| // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| // See the License for the specific language governing permissions and | ||
| // limitations under the License. | ||
|
|
||
| /** | ||
| * Lazy loader for the SEA (Statement Execution API) native binding. | ||
| * | ||
| * Mirrors the load-failure-tolerant pattern of `lib/utils/lz4.ts`: the | ||
| * `.node` artifact ships via per-platform optional dependencies | ||
| * (`@databricks/sql-kernel-<triple>`), so its absence must not crash | ||
| * a Thrift-only consumer of the driver. Callers that actually need | ||
| * SEA construct a {@link SeaNativeLoader} (or use the process-global | ||
| * {@link getSeaNative}) which throws a structured error if the binding | ||
| * could not be loaded. | ||
| * | ||
| * M0 publishes a single triple (`linux-x64-gnu`); see | ||
| * `native/sea/README.md` for the supported-platform policy. | ||
| */ | ||
|
|
||
| import type { | ||
| Connection as NativeConnection, | ||
| Statement as NativeStatement, | ||
| ConnectionOptions as NativeConnectionOptions, | ||
| ArrowBatch as NativeArrowBatch, | ||
| ArrowSchema as NativeArrowSchema, | ||
| } from '../../native/sea'; | ||
|
|
||
| // SEA-prefixed re-exports. The kernel-generated `.d.ts` keeps the | ||
| // napi-rs default names (`ConnectionOptions`, `ArrowBatch`, …); we | ||
| // disambiguate on the TS-wrapper side so these never collide with the | ||
| // Thrift-side `ConnectionOptions` (lib/contracts/IDBSQLClient.ts) or | ||
| // `ArrowBatch` (lib/result/utils.ts) when imported elsewhere. | ||
| export type SeaConnectionOptions = NativeConnectionOptions; | ||
| export type SeaArrowBatch = NativeArrowBatch; | ||
| export type SeaArrowSchema = NativeArrowSchema; | ||
| export type SeaConnection = NativeConnection; | ||
| export type SeaStatement = NativeStatement; | ||
|
|
||
| /** | ||
| * The full native binding surface, derived from the generated module | ||
| * so it can never drift from the `.d.ts` contract: when the kernel | ||
| * adds or renames a free function / class, this type follows | ||
| * automatically and `defaultRequire`'s cast stays correct. | ||
| */ | ||
| export type SeaNativeBinding = typeof import('../../native/sea'); | ||
|
|
||
| const MIN_NODE_MAJOR = 18; | ||
|
|
||
| function detectNodeMajor(): number { | ||
| // `process.version` is `vX.Y.Z`; parseInt stops at the first non-digit. | ||
| return parseInt(process.version.slice(1), 10); | ||
| } | ||
|
|
||
| function platformLabel(): string { | ||
| return `${process.platform}-${process.arch}`; | ||
| } | ||
|
|
||
| function loadFailureHint(err: NodeJS.ErrnoException): string { | ||
| const platform = platformLabel(); | ||
| // Do not name a concrete package: the published name uses the napi-rs | ||
| // triple (e.g. `-linux-x64-gnu` / `-linux-x64-musl` / `-win32-x64-msvc`), | ||
| // not the bare `${platform}` shown here, so a literal example would | ||
| // 404. Point at the README's supported-triple list instead. | ||
| const installHint = | ||
| 'Install the matching @databricks/sql-kernel-* optional dependency for your platform ' + | ||
| '(see native/sea/README.md for the supported triples; M0 ships linux-x64-gnu only).'; | ||
| if (err.code === 'MODULE_NOT_FOUND') { | ||
| return `SEA native binding not installed for platform ${platform} on Node ${process.version}. ${installHint}`; | ||
| } | ||
| if (err.code === 'ERR_DLOPEN_FAILED') { | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Low — debuggability. When dlopen fails (binding installed but won't load), real-world causes include musl-vs-glibc, Node ABI version mismatch, architecture mismatch (x64 binary on arm64 via Rosetta). The current hint says "likely a libc or Node ABI mismatch" without:
Suggested fix: return `SEA native binding present but failed to dlopen on platform ${platform} / Node ${process.version}: ${err.message}. Common causes: glibc/musl mismatch (e.g., Alpine Linux), Node ABI mismatch (try \`rm -rf node_modules && npm install\`), or architecture mismatch.`;Found by: ops, devils-advocate. Posted by code-review-squad ·
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in |
||
| // Surface the underlying dlerror string (e.g. `GLIBC_2.32 not found`) | ||
| // plus concrete remediation — without it the cause is invisible. | ||
| return ( | ||
| `SEA native binding present but failed to dlopen on platform ${platform} / Node ${process.version}: ` + | ||
| `${err.message}. Common causes: glibc/musl mismatch (e.g. Alpine Linux — install the -musl variant), ` + | ||
| `Node ABI mismatch (try \`rm -rf node_modules && npm install\`), or CPU-architecture mismatch. ` + | ||
| `The binding requires Node >=${MIN_NODE_MAJOR}.` | ||
| ); | ||
| } | ||
| return `SEA native binding failed to load on platform ${platform} / Node ${process.version}: ${err.message}`; | ||
| } | ||
|
|
||
| /** | ||
| * Default loader: resolves `native/sea/index.js` (the napi-rs router), | ||
| * which selects the per-platform `.node`. `.js` is omitted so eslint's | ||
| * `import/extensions` rule accepts the call. | ||
| */ | ||
| function defaultRequire(): SeaNativeBinding { | ||
| // eslint-disable-next-line @typescript-eslint/no-var-requires, global-require | ||
| return require('../../native/sea') as SeaNativeBinding; | ||
| } | ||
|
|
||
| /** | ||
| * Verify the loaded module exposes the surface the driver depends on. | ||
| * Catches kernel-side renames at load time rather than letting them | ||
| * surface as `undefined is not a function` deep in a call path. | ||
| */ | ||
| function assertBindingShape(binding: SeaNativeBinding): void { | ||
| const missing: string[] = []; | ||
| if (typeof binding.version !== 'function') missing.push('version'); | ||
| if (typeof binding.openSession !== 'function') missing.push('openSession'); | ||
| if (typeof binding.Connection !== 'function') missing.push('Connection'); | ||
| if (typeof binding.Statement !== 'function') missing.push('Statement'); | ||
| if (missing.length > 0) { | ||
| throw new Error( | ||
| `SEA native binding loaded but is missing expected export(s): ${missing.join(', ')}. ` + | ||
| `The kernel-generated binding and the JS loader are out of sync.`, | ||
| ); | ||
| } | ||
| } | ||
|
|
||
| /** | ||
| * Loads and caches the SEA native binding. Exposed as a class with an | ||
| * injectable `load` seam so consumers (e.g. `SeaBackend`) can be unit | ||
| * tested with a stub binding instead of requiring a real `.node` on the | ||
| * test machine. Most production code uses the process-global default | ||
| * via {@link getSeaNative} / {@link tryGetSeaNative}. | ||
| */ | ||
| export class SeaNativeLoader { | ||
| private cached: SeaNativeBinding | null | undefined; | ||
|
|
||
| private cachedError: Error | undefined; | ||
|
|
||
| /** | ||
| * @param load injectable module-require seam (stub a binding in tests) | ||
| * @param nodeMajor injectable Node-major detector. Defaults to reading the | ||
| * live `process.version`; injected in unit tests so the | ||
| * load/shape branches are exercised independently of the | ||
| * runner's actual Node version (the matrix spans 14–20). | ||
| */ | ||
| constructor( | ||
| private readonly load: () => SeaNativeBinding = defaultRequire, | ||
| private readonly nodeMajor: () => number = detectNodeMajor, | ||
| ) {} | ||
|
|
||
| private tryLoad(): SeaNativeBinding | undefined { | ||
| const nodeMajor = this.nodeMajor(); | ||
| // Fail closed: if we cannot determine the Node major (NaN) or it is | ||
| // below the floor, refuse the load and fall back to Thrift. | ||
| if (!Number.isFinite(nodeMajor) || nodeMajor < MIN_NODE_MAJOR) { | ||
| this.cachedError = new Error( | ||
| `SEA native binding requires Node >=${MIN_NODE_MAJOR}; running Node ${process.version}. ` + | ||
| `Continue using the Thrift backend on this runtime.`, | ||
| ); | ||
| return undefined; | ||
| } | ||
|
|
||
| try { | ||
| const binding = this.load(); | ||
| assertBindingShape(binding); | ||
| return binding; | ||
| } catch (err) { | ||
| if (err instanceof Error && 'code' in err) { | ||
| this.cachedError = new Error(loadFailureHint(err as NodeJS.ErrnoException)); | ||
| } else if (err instanceof Error) { | ||
| // Shape-check failure or any other Error — preserve its message. | ||
| this.cachedError = err; | ||
| } else { | ||
| this.cachedError = new Error(`SEA native binding failed to load with non-standard error: ${String(err)}`); | ||
| } | ||
| return undefined; | ||
| } | ||
| } | ||
|
|
||
| /** | ||
| * Returns the loaded native binding. Throws a structured error if the | ||
| * binding is unavailable on this platform / Node version. | ||
| */ | ||
| get(): SeaNativeBinding { | ||
| if (this.cached === undefined) { | ||
| this.cached = this.tryLoad() ?? null; | ||
| } | ||
| if (this.cached === null) { | ||
| throw this.cachedError ?? new Error('SEA native binding unavailable'); | ||
| } | ||
| return this.cached; | ||
| } | ||
|
|
||
| /** | ||
| * Returns the loaded binding or `undefined` if it could not be | ||
| * loaded. Use this for capability-detection at startup; use | ||
| * {@link get} at the point where SEA is actually required. | ||
| */ | ||
| tryGet(): SeaNativeBinding | undefined { | ||
| if (this.cached === undefined) { | ||
| this.cached = this.tryLoad() ?? null; | ||
| } | ||
| return this.cached ?? undefined; | ||
| } | ||
| } | ||
|
|
||
| // Process-global default instance + thin convenience wrappers. | ||
| const defaultLoader = new SeaNativeLoader(); | ||
|
|
||
| /** | ||
| * Returns the loaded native binding from the process-global loader. | ||
| * Throws a structured error if the binding is unavailable. | ||
| */ | ||
| export function getSeaNative(): SeaNativeBinding { | ||
| return defaultLoader.get(); | ||
| } | ||
|
|
||
| /** | ||
| * Returns the loaded binding from the process-global loader, or | ||
| * `undefined` if it could not be loaded. | ||
| */ | ||
| export function tryGetSeaNative(): SeaNativeBinding | undefined { | ||
| return defaultLoader.tryGet(); | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,87 @@ | ||
| # `native/sea/` — consumer-side directory for the Rust napi binding | ||
|
|
||
| **The Rust binding source lives in the kernel repo** at | ||
| `databricks-sql-kernel/napi/`. Building it requires a local checkout | ||
| of that repo — see "Build for local dev" below. The published npm | ||
| package is `@databricks/sql-kernel-<triple>`. | ||
|
|
||
| ## Workspace topology | ||
|
|
||
| The napi crate is a **standalone Cargo workspace** (`[workspace] | ||
| members = ["."]` in `napi/Cargo.toml`), **not** a sibling of `pyo3/` | ||
| in the kernel root workspace. | ||
|
|
||
| The reason is Cargo feature unification. pyo3 builds the kernel with | ||
| the default `tls-native` feature (system OpenSSL via `native-tls`). | ||
| The napi crate has to opt INTO `tls-rustls` instead: napi modules are | ||
| loaded into Node.js processes that statically link OpenSSL 3.x, and | ||
| dynamically linking the system's OpenSSL 1.1 (which `native-tls` | ||
| pulls in on Linux) collides with Node's symbols at module-load time | ||
| and segfaults the process before any Rust code runs. `rustls` is | ||
| pure Rust + `ring` and avoids the conflict entirely. | ||
|
|
||
| If napi lived in the same workspace as pyo3, `cargo build | ||
| --workspace` would unify the kernel's feature set to `tls-native ∪ | ||
| tls-rustls`, link both TLS stacks into the resulting napi cdylib, | ||
| and reintroduce the same clash. Standalone-workspace is the fix. | ||
|
|
||
| ## What lives in this directory | ||
|
|
||
| - `index.d.ts` — TypeScript declarations consumed by `lib/sea/`. | ||
| Generated by napi-rs from the Rust source; checked in as the | ||
| consumer-facing type contract. | ||
| - `index.js` — napi-rs's per-platform router shim. Gitignored; | ||
| populated by `npm run build:native` for local dev. In published | ||
| tarballs it ships alongside the `.d.ts` and `require()`s the | ||
| right `@databricks/sql-kernel-<triple>` optional dependency. | ||
| - `index.*.node` — the actual native binary, one per platform. | ||
| Gitignored. In production these live in the per-triple optional | ||
| dependencies (`@databricks/sql-kernel-linux-x64-gnu`, etc.); for | ||
| local dev `npm run build:native` copies one into this directory. | ||
|
|
||
| ## Build for local dev | ||
|
|
||
| ```bash | ||
| # From the nodejs repo root: | ||
| export DATABRICKS_SQL_KERNEL_REPO=/path/to/your/databricks-sql-kernel | ||
| npm run build:native # release build (default) | ||
| BUILD_PROFILE= npm run build:native # debug build (empty BUILD_PROFILE drops --release) | ||
| ``` | ||
|
|
||
| `DATABRICKS_SQL_KERNEL_REPO` points at the kernel repo root (the | ||
| directory containing `napi/`) and is required when your kernel | ||
| checkout isn't at `../../databricks-sql-kernel` relative to the | ||
| nodejs repo. | ||
|
|
||
| ## Production load path | ||
|
|
||
| At release time the kernel's CI publishes | ||
| `@databricks/sql-kernel-<triple>` npm packages — one per supported | ||
| platform — each containing a single `.node` binary. The nodejs | ||
| driver lists them as `optionalDependencies`; npm installs only the | ||
| one matching the consumer's `process.platform` / `process.arch`. | ||
| `native/sea/index.js` (the napi-rs router) then `require()`s the | ||
| installed package at load time. | ||
|
|
||
| ## Supported platforms (M0) | ||
|
|
||
| M0 publishes a **single** triple: **`linux-x64-gnu`** (package | ||
| `@databricks/sql-kernel-linux-x64-gnu`). It is the only entry in the | ||
| driver's `optionalDependencies`. | ||
|
|
||
| On every other platform (macOS, Windows, linux-arm64, linux-x64-musl | ||
| / Alpine, …) the SEA binding is simply absent: `SeaNativeLoader` | ||
| returns `undefined` from `tryGet()` / throws a structured | ||
| `MODULE_NOT_FOUND` hint from `get()`, and the driver continues to use | ||
| the Thrift backend exclusively. This is expected, not a regression — | ||
| additional triples are added to `optionalDependencies` as the kernel | ||
| CI starts publishing them in later milestones. | ||
|
|
||
| ## Supply-chain note | ||
|
|
||
| The unpublished triple names (`@databricks/sql-kernel-darwin-arm64`, | ||
| `…-win32-x64-msvc`, etc.) referenced by the router are **not** | ||
| squat-able: `@databricks` is a Databricks-owned npm scope, and npm | ||
| only allows org members to publish under a scope it owns. A third | ||
| party therefore cannot register `@databricks/sql-kernel-*` and have | ||
| the router autoload it. No placeholder packages are required. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
High — release-engineering landmine that crashes every consumer the moment PR 3/8+ wires SEA in.
Verified at HEAD
548a14b8:git ls-tree native/sea/shows ONLYREADME.mdandindex.d.ts— noindex.js..gitignore:17ignoresnative/sea/index.js(it's regenerated bynpm run build:native)..npmignore:10allow-lists!native/sea/index.jsto ship in the tarball.package.json scriptshas noprepackorprepublishOnlyhook that runsbuild:native.lib/sea/SeaNativeLoader.ts:82doesrequire('../../native/sea')— Node's resolver looks forindex.jsin that directory.Result: any
npm publishfrom a workspace that didn't manually runnpm run build:nativefirst ships a tarball missing the napi-rs router. Consumers getMODULE_NOT_FOUNDwith a misleading hint that tells them to install the optional dep (which is already installed) instead of pointing at the missing router shim.PR #380 doesn't crash anything today because nothing calls
getSeaNative(). The moment PR 3/8 wiresSeaBackend.connect()to the loader, every install bricks.native/sea/index.js(analogous to the committedindex.d.ts— the file is small and stable), or (b) add aprepackscript that runsbuild:nativegated onDATABRICKS_RELEASE_BUILD=1, with an explicit[ -f native/sea/index.js ] || (echo "Missing native router; run npm run build:native first" && exit 1)assertion that fails the pack if it's still missing.Found by: security, ops, maintainability, devils-advocate (4 reviewers).
Posted by code-review-squad ·
/full-reviewThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in
3216019(C1). Committed the napi-rs routernative/sea/index.js(un-ignored, analogous to the committedindex.d.ts) and added aprepackassertion that hard-fails the pack if the router is missing:Companion kernel fix databricks-sql-kernel#93 corrects the base package name so the router’s require paths resolve. A publish can no longer ship a tarball missing the router.