feat(sea): expose maxConnections + lock complex types as Arrow default#392
Open
msrathore-db wants to merge 6 commits into
Open
feat(sea): expose maxConnections + lock complex types as Arrow default#392msrathore-db wants to merge 6 commits into
msrathore-db wants to merge 6 commits into
Conversation
4 tasks
Contributor
Author
|
Re-review requested — round-3 fixups landed; DA round-3 sign-off received. Round-3 final tip: This PR carries F1 ( F1 (maxConnections):
F2 (Arrow complex types):
Gates: Co-authored-by: Isaac |
43893da to
cea1925
Compare
This was referenced May 31, 2026
Closed
…i binding Surfaces the kernel's `HttpConfig::pool_max_idle_per_host` knob through `ConnectionOptions.maxConnections` on the public client options, forwards it through `buildSeaConnectionOptions` into the napi `openSession` call. When omitted, the kernel's default (100) applies. Mirrors the Python connector's `max_connections` kwarg on the SEA backend. The NodeJS Thrift backend does NOT currently expose this knob — it remains a no-op on that path, documented inline. No break to Thrift parity (Thrift never had it). Plumbing: - `IDBSQLClient.ConnectionOptions` gains `maxConnections?: number` with a doc note that the Thrift backend ignores it today. - `SeaSessionDefaults` (already the shared base across auth-mode variants in `SeaNativeConnectionOptions`) gains the field so all three auth modes (PAT, OAuth M2M, OAuth U2M) round-trip it. - `buildSeaConnectionOptions` validates positive-integer-ness at the JS boundary (rejects 0, negative, non-integer, NaN) so a misuse fails here with a clear `HiveDriverError` instead of inside the kernel with a cryptic napi error. - `native/sea/index.d.ts` (the committed copy of the napi-rs generated d.ts) gets the `maxConnections?: number` field on `ConnectionOptions`, in step with the kernel branch `msrathore/krn-max-connections` (commit `0c5ca3f`). Tests: - `tests/unit/sea/auth-max-connections.test.ts` — 9 cases covering omission, valid values, boundary (1), and rejection of 0, negative, fractional, NaN; plus the OAuth M2M / U2M arms. Cross-PR dependency: stacks on `msrathore/sea-auth-u2m` (current NodeJS tip) and requires kernel branch `msrathore/krn-max-connections` (napi binding side of the same change). Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
Adds an end-to-end test against pecotesting (gated by `DATABRICKS_PECOTESTING_*` env vars; skipped when absent) that confirms ARRAY / MAP / STRUCT and nested ARRAY<STRUCT> come back as **native Arrow** shapes (List / Map / Struct) — not Utf8 JSON strings. The kernel's `ResultConfig::complex_types_as_json` defaults to `false` (Arrow-native), and the SEA wire request hardcodes `format = ARROW_STREAM`, so this is the existing default. The test locks the contract: a regression that flips the default (or that an upstream change wraps the result post-processor in the JSON pass) would fail this assertion immediately. Matches the NodeJS Thrift backend's `complexTypesAsArrow=true` default — see `DBSQLSession.getArrowOptions` where `useArrowNativeTypes=true` propagates to `complexTypesAsArrow`. Mirrors the kernel-side e2e at `tests/v0_execute_e2e.rs::complex_types_as_json_flag_stringifies_complex_columns` which exercises both the Arrow-native and JSON-string paths. Decision — no opt-in toggle exposed at the JS layer: neither Python `use_sea` nor NodeJS Thrift exposes a `complexTypesAsJson` knob to end users; the kernel's `ResultConfig.complex_types_as_json` remains internal until a consumer needs it. Adding the toggle now would invite drift from the kernel; we revisit when a consumer asks. Matrix row 11 of section 3 stays as "implemented — native Arrow default matches Thrift behaviour". Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
…rict typeId asserts
DA round 1 findings on F1 (max_connections) and F2 (complex types
as Arrow), addressed in a single fixup because both touch the F1
NodeJS branch.
F1 — M1 upper-bound: the JS-layer validation accepted any positive
integer but the napi binding accepts `u32`. A value > 2^32 - 1
would either silently truncate at FFI or throw a cryptic kernel
error. Add an explicit upper-bound check with a clear
`HiveDriverError` message that points at typical pool sizes.
Three new unit tests: accept MAX_U32, reject MAX_U32+1, reject
Number.MAX_SAFE_INTEGER.
F1 — M2 mock test: the existing unit suite verified the
buildSeaConnectionOptions output shape but did not exercise the
end-to-end SeaBackend.connect → openSession → napi.openSession
round-trip. Add two mock-binding tests via the existing
`makeFakeBinding` helper that:
1. Confirm maxConnections is forwarded to the napi openSession call
2. Confirm maxConnections is `undefined` (not 0) when omitted, so
the napi binding's `Option<u32>` reads None and the kernel
default of 100 applies.
F2 — H1 skip-gate: the e2e test imported `getSeaNative` at module
top-level, which transitively ran `SeaNativeLoader.ts`'s
module-level `require('../../native/sea/index.js')`. When the
`.node` artifact isn't built (`yarn build:native` not run), that
require throws MODULE_NOT_FOUND BEFORE mocha can invoke `before()`,
crashing test discovery for the whole suite. Fix:
1. Verify the native artifact exists in `before()` and skip with
a clear "run yarn build:native" message if absent
2. Lazy-require the napi shim via `createRequire(import.meta.url)`
inside `loadBinding()` so the require lives outside module-load
path. `createRequire` (vs bare `require`) handles the ESM
reparse path mocha 11+ takes for ts-node-emitted files
(MODULE_TYPELESS_PACKAGE_JSON warning).
F2 — M2 strict assertions: prior regex-based assertions
(`/List/i`, `/Map|List/i`, etc.) were too permissive — they would
accept LargeList or FixedSizeList where the kernel must return
plain `List`. Switch to arrow-js `Type` enum comparisons via
`.type.typeId`. Adds:
- Strict typeId equality for c_arr (List), c_map (Map), c_struct
(Struct), c_arr_struct (List)
- Nested-structural check that c_arr_struct's child is Struct
- Row-count sanity assert (literal SELECT → 1 row)
F2 — live e2e: ran against pecotesting (`DATABRICKS_PECOTESTING_*`)
with the freshly-built napi binary. All assertions passed in 655ms.
The kernel default `complex_types_as_json=false` is verified to
produce native Arrow shapes for ARRAY/MAP/STRUCT and nested
ARRAY<STRUCT> end-to-end.
Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
mocha 11+'s MODULE_TYPELESS_PACKAGE_JSON reparse-as-ESM path breaks `import * as fs from 'fs'` / `import * as path from 'path'`: under ESM, the namespace import doesn't expose the CJS module's exports as own properties, so `path.resolve(...)` was undefined at runtime. Switch to named imports (`resolve as resolvePath` aliased to keep the call site readable). Same fix applied to F4 and F3b in their respective branches. Verified live e2e passes against pecotesting after the fix. Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
DA round 2 findings on F2 / F4 / F3b H1 had a common root cause
the team-lead correctly flagged: `SeaNativeLoader.ts` does
`const native = require('../../native/sea/index.js')` at module
load, so any import from this module triggers the `.node`
artifact load. When `yarn build:native` hasn't run, that throws
MODULE_NOT_FOUND BEFORE mocha can fire `before()` skip-gates,
crashing test discovery for every e2e suite that imports
anything from this module — including transitive imports via
`DBSQLClient` → `SeaBackend` → `SeaNativeLoader`.
Round 1 dance: every e2e test got a `createRequire` +
filesystem-check workaround. Round 2 fixes the actual defect:
the loader itself memoises the require behind `getSeaNative()`,
so importing the module is free for code paths that never reach
SEA. Filesystem-check guards in the e2e tests stay as defense-
in-depth (they produce a friendlier "run yarn build:native"
diagnostic than napi-rs's raw MODULE_NOT_FOUND).
Verified the fix by moving the .node artifact aside and running
the F2 e2e: previously CRASHED at file load with
MODULE_NOT_FOUND; now SKIPS cleanly via the suite's `before()`
gate with an actionable message. Restored binary, re-ran the
test: passes against pecotesting in 480ms.
Also addresses F2 M1 lint:
- `lib/sea/SeaNativeLoader.ts:131` — added `import/extensions`
to the eslint-disable comment on the lazy-require, since the
`.js` extension is required (napi shim is CommonJS, not TS)
- `tests/e2e/sea/complex-types-e2e.test.ts:148` — switched to
destructuring `const { schema } = table` per prefer-
destructuring rule
Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
Adds an end-to-end test against pecotesting (gated by `DATABRICKS_PECOTESTING_*` env vars; skipped when absent) that verifies `maxConnections` flows from `ConnectionOptions` through `buildSeaConnectionOptions` through the napi `openSession` into the kernel's `HttpConfig::pool_max_idle_per_host`, and that a session opened with a custom value round-trips a real query. Combined with the existing unit + mock-binding tests, this proves end-to-end: - Unit tests pin the JS-side value validation + napi-shape (mock binding records the openSession args). - This e2e proves the value actually reaches the kernel without breaking the wire path against a live warehouse. Three cases: 1. `maxConnections=1` — minimum bound the JS layer accepts. 2. `maxConnections=200` — large pool size, exercises the high end. 3. Omitted — kernel default (100) regression check. All three pass against pecotesting (~500ms each). DA round-1 F1 "L2 live" finding addressed. Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
6f80c11 to
747cbdc
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two features bundled into a single SEA-side PR:
maxConnections(F1) — exposes the kernel'sHttpConfig::pool_max_idle_per_hostknob throughConnectionOptions.maxConnectionson the public client options. Forwarded throughbuildSeaConnectionOptionsinto the napiopenSessioncall. When omitted, the kernel default (100) applies. Mirrors the Python connector'smax_connectionskwarg on the SEA backend. NodeJS Thrift backend does NOT expose this knob — it's a SEA-only extension; no parity break.Complex types as Arrow default lock-in (F2) — adds a regression e2e test against pecotesting that selects ARRAY/MAP/STRUCT/nested ARRAY and asserts the returned Arrow schema fields are
List/Map/Struct/List(notUtf8JSON strings). KernelResultConfig::complex_types_as_jsonalready defaults tofalse(Arrow-native), and the SEA wire request hardcodesformat = "ARROW_STREAM"— so this is the existing default. The test locks the contract.Plumbing — F1
IDBSQLClient.ConnectionOptions.maxConnections?: numberwith doc note that Thrift backend currently ignores itSeaSessionDefaults(the shared base across auth-mode variants) carries the field so all three auth modes (PAT, OAuth M2M, OAuth U2M) round-trip itbuildSeaConnectionOptionsvalidates positive-integer-ness at the JS boundary (rejects 0, negative, non-integer, NaN) withHiveDriverErrornative/sea/index.d.tsadds the field in step with kernel branchmsrathore/krn-max-connectionsCross-repo dependency
Requires kernel branch
msrathore/krn-max-connections(PR https://github.com/databricks/databricks-sql-kernel/pull/74) for the napi-binding side.Downstream fixes / reviewer note
queryTags,rowLimit, andstatementConfare fixed later in feat(sea): Thrift-parity — params, intervals, getInfo, SQL-error class, input validation + queryTags #403/feat(sea): wire rowLimit + statementConf + TIMESTAMP_NTZ/LTZ params #408. This PR is limited tomaxConnectionsplus the complex-types-as-Arrow default lock.openSession, neutral operation DTOs, declaredflatbuffers, single IPC duration rewrite, cancel/close failure-state rollback, closed fetch →OperationStateError(Closed)).Test plan
tests/unit/sea/auth-max-connections.test.ts— 9 cases passing (omit, valid, boundary 1, 0-reject, neg-reject, non-int-reject, NaN-reject, M2M arm, U2M arm)auth-pat.test.tsregression check — 8/8 passSeaOperationBackend.ts+ 2 from missinglib/version.tsgenerated by prepare hook)IDBSQLClient.ts,SeaAuth.ts,SeaBackend.ts, new tests)tests/e2e/sea/complex-types-e2e.test.tsruns against pecotesting when env vars set; pending merge of kernel PR to test full path