Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions .gitattributes
Original file line number Diff line number Diff line change
Expand Up @@ -31,3 +31,10 @@ Dockerfile* text
#
.gitattributes export-ignore
.gitignore export-ignore

# napi-rs auto-generates these files from the kernel's `napi-binding/napi/`
# crate; regenerated by `npm run build:native`. Tell git/GitHub they're
# machine-generated so they collapse in diffs and are excluded from
# blame and language stats.
native/sea/index.d.ts linguist-generated=true
native/sea/index.js linguist-generated=true
9 changes: 9 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -10,3 +10,12 @@ coverage_unit
dist
*.DS_Store
lib/version.ts

# SEA native binding — copied/generated from kernel workspace by `npm run build:native`.
# The committed contract is `native/sea/index.d.ts` (TypeScript declarations) and
# `native/sea/index.js` (the napi-rs platform router — small, stable, and required in
# the publish tarball so a missing build step can't ship a tarball that can't load).
# The `.node` binaries are large per-platform artifacts and must NOT be committed;
# in production they arrive via the `@databricks/sql-kernel-<triple>` optional deps.
native/sea/index.node
native/sea/index.*.node
7 changes: 7 additions & 0 deletions .npmignore
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,13 @@
!dist/**/*
!thrift/**/*

# SEA napi-rs router shim + TypeScript declarations. The router (index.js)
# selects the per-platform `.node` artifact from `@databricks/sql-kernel-*`
# optionalDependencies (populated when the kernel CI publishes them);
# the .d.ts is the consumer-facing type contract.
!native/sea/index.js
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High — release-engineering landmine that crashes every consumer the moment PR 3/8+ wires SEA in.

Verified at HEAD 548a14b8:

  • git ls-tree native/sea/ shows ONLY README.md and index.d.ts — no index.js.
  • .gitignore:17 ignores native/sea/index.js (it's regenerated by npm run build:native).
  • .npmignore:10 allow-lists !native/sea/index.js to ship in the tarball.
  • package.json scripts has no prepack or prepublishOnly hook that runs build:native.
  • The loader at lib/sea/SeaNativeLoader.ts:82 does require('../../native/sea') — Node's resolver looks for index.js in that directory.

Result: any npm publish from a workspace that didn't manually run npm run build:native first ships a tarball missing the napi-rs router. Consumers get MODULE_NOT_FOUND with a misleading hint that tells them to install the optional dep (which is already installed) instead of pointing at the missing router shim.

PR #380 doesn't crash anything today because nothing calls getSeaNative(). The moment PR 3/8 wires SeaBackend.connect() to the loader, every install bricks.

  • Suggested fix: either (a) commit native/sea/index.js (analogous to the committed index.d.ts — the file is small and stable), or (b) add a prepack script that runs build:native gated on DATABRICKS_RELEASE_BUILD=1, with an explicit [ -f native/sea/index.js ] || (echo "Missing native router; run npm run build:native first" && exit 1) assertion that fails the pack if it's still missing.

Found by: security, ops, maintainability, devils-advocate (4 reviewers).


Posted by code-review-squad · /full-review

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 3216019 (C1). Committed the napi-rs router native/sea/index.js (un-ignored, analogous to the committed index.d.ts) and added a prepack assertion that hard-fails the pack if the router is missing:

"prepack": "test -f native/sea/index.js || { echo 'ERROR: native/sea/index.js (napi-rs router) is missing …' >&2; exit 1; }"

Companion kernel fix databricks-sql-kernel#93 corrects the base package name so the router’s require paths resolve. A publish can no longer ship a tarball missing the router.

!native/sea/index.d.ts

!LICENSE
!NOTICE
!package.json
Expand Down
6 changes: 6 additions & 0 deletions .prettierignore
Original file line number Diff line number Diff line change
Expand Up @@ -11,3 +11,9 @@ coverage
dist
thrift
package-lock.json

# Generated by napi-rs from the kernel's `napi-binding/napi/` crate;
# regenerated by `npm run build:native`. Format follows napi-rs's
# defaults (no semicolons), not this repo's prettier config.
native/sea/index.d.ts
native/sea/index.js
219 changes: 219 additions & 0 deletions lib/sea/SeaNativeLoader.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,219 @@
// Copyright (c) 2026 Databricks, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

/**
* Lazy loader for the SEA (Statement Execution API) native binding.
*
* Mirrors the load-failure-tolerant pattern of `lib/utils/lz4.ts`: the
* `.node` artifact ships via per-platform optional dependencies
* (`@databricks/sql-kernel-<triple>`), so its absence must not crash
* a Thrift-only consumer of the driver. Callers that actually need
* SEA construct a {@link SeaNativeLoader} (or use the process-global
* {@link getSeaNative}) which throws a structured error if the binding
* could not be loaded.
*
* M0 publishes a single triple (`linux-x64-gnu`); see
* `native/sea/README.md` for the supported-platform policy.
*/

import type {
Connection as NativeConnection,
Statement as NativeStatement,
ConnectionOptions as NativeConnectionOptions,
ArrowBatch as NativeArrowBatch,
ArrowSchema as NativeArrowSchema,
} from '../../native/sea';

// SEA-prefixed re-exports. The kernel-generated `.d.ts` keeps the
// napi-rs default names (`ConnectionOptions`, `ArrowBatch`, …); we
// disambiguate on the TS-wrapper side so these never collide with the
// Thrift-side `ConnectionOptions` (lib/contracts/IDBSQLClient.ts) or
// `ArrowBatch` (lib/result/utils.ts) when imported elsewhere.
export type SeaConnectionOptions = NativeConnectionOptions;
export type SeaArrowBatch = NativeArrowBatch;
export type SeaArrowSchema = NativeArrowSchema;
export type SeaConnection = NativeConnection;
export type SeaStatement = NativeStatement;

/**
* The full native binding surface, derived from the generated module
* so it can never drift from the `.d.ts` contract: when the kernel
* adds or renames a free function / class, this type follows
* automatically and `defaultRequire`'s cast stays correct.
*/
export type SeaNativeBinding = typeof import('../../native/sea');

const MIN_NODE_MAJOR = 18;

function detectNodeMajor(): number {
// `process.version` is `vX.Y.Z`; parseInt stops at the first non-digit.
return parseInt(process.version.slice(1), 10);
}

function platformLabel(): string {
return `${process.platform}-${process.arch}`;
}

function loadFailureHint(err: NodeJS.ErrnoException): string {
const platform = platformLabel();
// Do not name a concrete package: the published name uses the napi-rs
// triple (e.g. `-linux-x64-gnu` / `-linux-x64-musl` / `-win32-x64-msvc`),
// not the bare `${platform}` shown here, so a literal example would
// 404. Point at the README's supported-triple list instead.
const installHint =
'Install the matching @databricks/sql-kernel-* optional dependency for your platform ' +
'(see native/sea/README.md for the supported triples; M0 ships linux-x64-gnu only).';
if (err.code === 'MODULE_NOT_FOUND') {
return `SEA native binding not installed for platform ${platform} on Node ${process.version}. ${installHint}`;
}
if (err.code === 'ERR_DLOPEN_FAILED') {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low — debuggability.

When dlopen fails (binding installed but won't load), real-world causes include musl-vs-glibc, Node ABI version mismatch, architecture mismatch (x64 binary on arm64 via Rosetta). The current hint says "likely a libc or Node ABI mismatch" without:

  • Including the underlying err.message (which carries the actual dlerror string, e.g. GLIBC_2.32 not found).
  • Suggesting concrete remediation (rm -rf node_modules, install musl variant, etc.).

Suggested fix:

return `SEA native binding present but failed to dlopen on platform ${platform} / Node ${process.version}: ${err.message}. Common causes: glibc/musl mismatch (e.g., Alpine Linux), Node ABI mismatch (try \`rm -rf node_modules && npm install\`), or architecture mismatch.`;

Found by: ops, devils-advocate.


Posted by code-review-squad · /full-review

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 3216019 (C11). The dlopen hint now includes the underlying err.message (the real dlerror, e.g. GLIBC_2.32 not found) plus concrete remediation: glibc/musl mismatch (install the -musl variant on Alpine), Node ABI mismatch (rm -rf node_modules && npm install), and CPU-architecture mismatch.

// Surface the underlying dlerror string (e.g. `GLIBC_2.32 not found`)
// plus concrete remediation — without it the cause is invisible.
return (
`SEA native binding present but failed to dlopen on platform ${platform} / Node ${process.version}: ` +
`${err.message}. Common causes: glibc/musl mismatch (e.g. Alpine Linux — install the -musl variant), ` +
`Node ABI mismatch (try \`rm -rf node_modules && npm install\`), or CPU-architecture mismatch. ` +
`The binding requires Node >=${MIN_NODE_MAJOR}.`
);
}
return `SEA native binding failed to load on platform ${platform} / Node ${process.version}: ${err.message}`;
}

/**
* Default loader: resolves `native/sea/index.js` (the napi-rs router),
* which selects the per-platform `.node`. `.js` is omitted so eslint's
* `import/extensions` rule accepts the call.
*/
function defaultRequire(): SeaNativeBinding {
// eslint-disable-next-line @typescript-eslint/no-var-requires, global-require
return require('../../native/sea') as SeaNativeBinding;
}

/**
* Verify the loaded module exposes the surface the driver depends on.
* Catches kernel-side renames at load time rather than letting them
* surface as `undefined is not a function` deep in a call path.
*/
function assertBindingShape(binding: SeaNativeBinding): void {
const missing: string[] = [];
if (typeof binding.version !== 'function') missing.push('version');
if (typeof binding.openSession !== 'function') missing.push('openSession');
if (typeof binding.Connection !== 'function') missing.push('Connection');
if (typeof binding.Statement !== 'function') missing.push('Statement');
if (missing.length > 0) {
throw new Error(
`SEA native binding loaded but is missing expected export(s): ${missing.join(', ')}. ` +
`The kernel-generated binding and the JS loader are out of sync.`,
);
}
}

/**
* Loads and caches the SEA native binding. Exposed as a class with an
* injectable `load` seam so consumers (e.g. `SeaBackend`) can be unit
* tested with a stub binding instead of requiring a real `.node` on the
* test machine. Most production code uses the process-global default
* via {@link getSeaNative} / {@link tryGetSeaNative}.
*/
export class SeaNativeLoader {
private cached: SeaNativeBinding | null | undefined;

private cachedError: Error | undefined;

/**
* @param load injectable module-require seam (stub a binding in tests)
* @param nodeMajor injectable Node-major detector. Defaults to reading the
* live `process.version`; injected in unit tests so the
* load/shape branches are exercised independently of the
* runner's actual Node version (the matrix spans 14–20).
*/
constructor(
private readonly load: () => SeaNativeBinding = defaultRequire,
private readonly nodeMajor: () => number = detectNodeMajor,
) {}

private tryLoad(): SeaNativeBinding | undefined {
const nodeMajor = this.nodeMajor();
// Fail closed: if we cannot determine the Node major (NaN) or it is
// below the floor, refuse the load and fall back to Thrift.
if (!Number.isFinite(nodeMajor) || nodeMajor < MIN_NODE_MAJOR) {
this.cachedError = new Error(
`SEA native binding requires Node >=${MIN_NODE_MAJOR}; running Node ${process.version}. ` +
`Continue using the Thrift backend on this runtime.`,
);
return undefined;
}

try {
const binding = this.load();
assertBindingShape(binding);
return binding;
} catch (err) {
if (err instanceof Error && 'code' in err) {
this.cachedError = new Error(loadFailureHint(err as NodeJS.ErrnoException));
} else if (err instanceof Error) {
// Shape-check failure or any other Error — preserve its message.
this.cachedError = err;
} else {
this.cachedError = new Error(`SEA native binding failed to load with non-standard error: ${String(err)}`);
}
return undefined;
}
}

/**
* Returns the loaded native binding. Throws a structured error if the
* binding is unavailable on this platform / Node version.
*/
get(): SeaNativeBinding {
if (this.cached === undefined) {
this.cached = this.tryLoad() ?? null;
}
if (this.cached === null) {
throw this.cachedError ?? new Error('SEA native binding unavailable');
}
return this.cached;
}

/**
* Returns the loaded binding or `undefined` if it could not be
* loaded. Use this for capability-detection at startup; use
* {@link get} at the point where SEA is actually required.
*/
tryGet(): SeaNativeBinding | undefined {
if (this.cached === undefined) {
this.cached = this.tryLoad() ?? null;
}
return this.cached ?? undefined;
}
}

// Process-global default instance + thin convenience wrappers.
const defaultLoader = new SeaNativeLoader();

/**
* Returns the loaded native binding from the process-global loader.
* Throws a structured error if the binding is unavailable.
*/
export function getSeaNative(): SeaNativeBinding {
return defaultLoader.get();
}

/**
* Returns the loaded binding from the process-global loader, or
* `undefined` if it could not be loaded.
*/
export function tryGetSeaNative(): SeaNativeBinding | undefined {
return defaultLoader.tryGet();
}
87 changes: 87 additions & 0 deletions native/sea/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,87 @@
# `native/sea/` — consumer-side directory for the Rust napi binding

**The Rust binding source lives in the kernel repo** at
`databricks-sql-kernel/napi/`. Building it requires a local checkout
of that repo — see "Build for local dev" below. The published npm
package is `@databricks/sql-kernel-<triple>`.

## Workspace topology

The napi crate is a **standalone Cargo workspace** (`[workspace]
members = ["."]` in `napi/Cargo.toml`), **not** a sibling of `pyo3/`
in the kernel root workspace.

The reason is Cargo feature unification. pyo3 builds the kernel with
the default `tls-native` feature (system OpenSSL via `native-tls`).
The napi crate has to opt INTO `tls-rustls` instead: napi modules are
loaded into Node.js processes that statically link OpenSSL 3.x, and
dynamically linking the system's OpenSSL 1.1 (which `native-tls`
pulls in on Linux) collides with Node's symbols at module-load time
and segfaults the process before any Rust code runs. `rustls` is
pure Rust + `ring` and avoids the conflict entirely.

If napi lived in the same workspace as pyo3, `cargo build
--workspace` would unify the kernel's feature set to `tls-native ∪
tls-rustls`, link both TLS stacks into the resulting napi cdylib,
and reintroduce the same clash. Standalone-workspace is the fix.

## What lives in this directory

- `index.d.ts` — TypeScript declarations consumed by `lib/sea/`.
Generated by napi-rs from the Rust source; checked in as the
consumer-facing type contract.
- `index.js` — napi-rs's per-platform router shim. Gitignored;
populated by `npm run build:native` for local dev. In published
tarballs it ships alongside the `.d.ts` and `require()`s the
right `@databricks/sql-kernel-<triple>` optional dependency.
- `index.*.node` — the actual native binary, one per platform.
Gitignored. In production these live in the per-triple optional
dependencies (`@databricks/sql-kernel-linux-x64-gnu`, etc.); for
local dev `npm run build:native` copies one into this directory.

## Build for local dev

```bash
# From the nodejs repo root:
export DATABRICKS_SQL_KERNEL_REPO=/path/to/your/databricks-sql-kernel
npm run build:native # release build (default)
BUILD_PROFILE= npm run build:native # debug build (empty BUILD_PROFILE drops --release)
```

`DATABRICKS_SQL_KERNEL_REPO` points at the kernel repo root (the
directory containing `napi/`) and is required when your kernel
checkout isn't at `../../databricks-sql-kernel` relative to the
nodejs repo.

## Production load path

At release time the kernel's CI publishes
`@databricks/sql-kernel-<triple>` npm packages — one per supported
platform — each containing a single `.node` binary. The nodejs
driver lists them as `optionalDependencies`; npm installs only the
one matching the consumer's `process.platform` / `process.arch`.
`native/sea/index.js` (the napi-rs router) then `require()`s the
installed package at load time.

## Supported platforms (M0)

M0 publishes a **single** triple: **`linux-x64-gnu`** (package
`@databricks/sql-kernel-linux-x64-gnu`). It is the only entry in the
driver's `optionalDependencies`.

On every other platform (macOS, Windows, linux-arm64, linux-x64-musl
/ Alpine, …) the SEA binding is simply absent: `SeaNativeLoader`
returns `undefined` from `tryGet()` / throws a structured
`MODULE_NOT_FOUND` hint from `get()`, and the driver continues to use
the Thrift backend exclusively. This is expected, not a regression —
additional triples are added to `optionalDependencies` as the kernel
CI starts publishing them in later milestones.

## Supply-chain note

The unpublished triple names (`@databricks/sql-kernel-darwin-arm64`,
`…-win32-x64-msvc`, etc.) referenced by the router are **not**
squat-able: `@databricks` is a Databricks-owned npm scope, and npm
only allows org members to publish under a scope it owns. A third
party therefore cannot register `@databricks/sql-kernel-*` and have
the router autoload it. No placeholder packages are required.
Loading
Loading