Support terraform references in direct engine#5392
Open
denik wants to merge 46 commits into
Open
Conversation
The script takes the DABs fields file and Terraform provider schema JSON, applies key renames (plural→singular) matching the tfdyn converter logic, and produces a bidirectional map showing which fields match, are renamed, or exist only in one schema. Co-authored-by: Isaac
…irst Previously, dabs_only entries showed the expected tf_path which was misleading (e.g., volumes.full_name appeared to match but doesn't exist in the TF schema). Now dabs_only entries show ? for tf_path. Also reorder columns to: status, dabs_path, tf_resource, tf_path. Co-authored-by: Isaac
Previously every descendant of a renamed field (e.g., tasks→task) was also marked "renamed". Now only the field where the rename occurs is printed. Inner renames (e.g., tasks.*.libraries→task.*.library) are still shown. Co-authored-by: Isaac
…cle, grants) These are DABs additions that don't exist in Terraform schema. Suppressing them reduces noise in dabs_only output (782 → 632 entries). Co-authored-by: Isaac
- Add bundle/terraform_dabs_map package that builds a bidirectional map between DABs resource fields and Terraform resource fields - Add 'databricks bundle debug schema-map' command that prints the mapping - Extend TF schema codegen to emit resource_schemas.go: a typed probe struct (ResourceSchemas) used to enumerate all TF resource struct types via reflection without a manual registry - Add acceptance test that records command output Co-authored-by: Isaac
- Add bundle/terraform_dabs_map/cmd/generate/main.go: generator that computes DABs↔Terraform field mapping at build time using a stemming algorithm (ies→y, s→"", git_→"") instead of hardcoded rename rules. Conflict detection warns when multiple DABs paths stem to the same TF path. - Add bundle/terraform_dabs_map/generated.go: three precomputed maps (TerraformToDABsFieldMap, DABsOnlyFields, TerraformOnlyFields) plus MatchCounts, all keyed by DABs group name. Only records renames where the leaf field name itself differs; child paths of renamed parents count as matches (same leaf name, different ancestor). - Remove bundle/terraform_dabs_map/map.go and renames.go: the runtime Build() function and hardcoded renameRules are replaced by the generator. - Update cmd/bundle/debug/schema_map.go: print one summary line per resource instead of a full field listing. - Add generate-schema-map task to Taskfile.yml. - Update acceptance test output to match new summary format. Co-authored-by: Isaac
…to test - Add modified_status to dabsKnownFields (DABs-internal deployment tracking field with no Terraform concept) and provider_config to tfKnownFields (Terraform provider metadata with no DABs concept), removing noise from the dabs-only and tf-only maps. - Replace bundle/terraform_dabs_map/cmd/generate/main.go with bundle/terraform_dabs_map/generate_test.go: TestGenerateSchemaMap recomputes generated.go and either overwrites it (with -update) or fails if the file is stale. Stats (matches/renames/dabs-only/tf-only per group) are printed via t.Logf when -update is passed. - Add Groups var to generated.go listing the processed 2-level groups; the CLI command now iterates over Groups instead of GroupToTerraformName to avoid printing the 3-level groups (permissions, grants, secret_acls). - Drop MatchCounts from generated.go; match counts are printed during generation only (via t.Logf -v) and not needed at runtime. - Update Taskfile generate-schema-map to use the test-based approach. Co-authored-by: Isaac
Replace the hand-rolled flag + assert.Equal with testdiff.OverwriteMode and testdiff.AssertEqualTexts so failures show a unified diff and the -update flag is consistent with the rest of the test suite. Co-authored-by: Isaac
Groups was only needed to distinguish 2-level groups (with adapters) from 3-level groups (permissions, grants, secret_acls) in GroupToTerraformName. The three generated maps encode this already: groups without adapters produce no entries in any map, so the union of all map keys is exactly the set of processed 2-level groups. Groups with all-zero entries (e.g. registered_models) disappear from the CLI output — a line reading "0 renames, 0 dabs-only, 0 tf-only" carries no information. Co-authored-by: Isaac
…level suppression - allStems: replace O(2^n) cartesian product with O(n+1) approach: try each segment independently, then try all-at-once as fallback for multi-renames like tasks.libraries → task.library. - Add tfKnownSegments for any-level TF path suppression (vs. tfKnownFields which only suppresses at the top level). Add provider_config to it since it appears nested at many depths and is never a DABs concept. - Add hasKnownSegment helper to check any segment in a path against a set. Co-authored-by: Isaac
matchToTF processes one segment at a time: segmentVariants() is called on a single field name at each level, and once the head segment is mapped the tail is resolved recursively against the filtered sub-tree. This is strictly more correct than the O(n+1) allStems approach. allStems tried each segment independently and added an all-at-once fallback, but the fallback applied all stems simultaneously — so tasks.libraries.maven.coordinates would get stemmed to task.library.maven.coordinate, missing the TF field task.library.maven.coordinates (where coordinates is spelled the same on both sides). matchToTF finds this correctly because it stops stemming a segment once a prefix match is found, rather than stemming all segments together. Co-authored-by: Isaac
segmentVariants returned up to 4 forms to cover combinations of git_-stripping and ies/s suffixing. In practice those intermediate forms (just-stripped or just-singularized) are never the correct TF name; the fully-stemmed form always suffices. stem() applies both transformations in sequence and returns a single string. matchToTF tries the original and stem(original) — at most two candidates per segment, no slice allocation needed in the common case. Co-authored-by: Isaac
Co-authored-by: Isaac
Replace the flat dotted-path maps with nested Go types:
FieldSet map[string]FieldSet
Each key is a single segment; empty child marks a leaf.
Used for DABsOnlyFields and TerraformOnlyFields.
RenameTree map[string]RenameNode
RenameNode struct { DABs string; Children RenameTree }
Navigate with TF segments; DABs is the corresponding DABs name when it
differs. Both fields may be set (e.g. TF 'task'→DABs 'tasks' while
'task.library'→'tasks.libraries' lives in Children).
Both types carry a Len() method so schema_map.go can count leaves without
knowing the nesting depth.
Co-authored-by: Isaac
Co-authored-by: Isaac
postgres_branches/catalogs/endpoints/projects all expose TF fields that are nested under a "spec" object in Terraform but flat in DABs. The generator now auto-detects such wrappers (Step 3): a TF segment is marked Unwrap when ALL of its sub-fields are accounted for by unmatched DABs fields, which rejects output-only wrappers like "status" that carry extra computed fields. RenameNode gains an Unwrap bool field; TerraformPathToDABs skips the segment without emitting it to the DABs path when Unwrap is true. Co-authored-by: Isaac
After matching against InputConfigType, remaining TF-only fields are compared against RemoteType (the shape returned by DoRead in the direct engine). Fields present in both TF schema and RemoteType but absent from user-facing InputConfigType are classified as computed: server-generated outputs the direct engine can read, but the user doesn't configure. This cleans up TerraformOnlyFields significantly for postgres resources — all are now in TerraformComputedFields (timestamps, status, uid, etc.), leaving only "purge_on_delete" as a genuinely TF-specific field. Other resources (pipelines, experiments, schemas, sql_warehouses, etc.) also gain non-zero computed counts. Jobs remain at 257 tf-only / 0 computed because those fields are structural mismatches (legacy TF top-level task syntax vs DABs tasks[]) rather than RemoteType coverage gaps. Co-authored-by: Isaac
… filter The TerraformComputedFields variable had no consumer beyond the debug command. The useful work was suppressing false positives from TerraformOnlyFields (server-generated outputs that are in RemoteType but not in user-facing InputConfigType). Keep the RemoteType walk as a silent filter inside buildGroup; drop the generated variable entirely. Co-authored-by: Isaac
Generate DABsToTerraformRenameMap (inverse of TerraformToDABsFieldMap renames) and DABsToTerraformWrappers (TF prefix to prepend for groups with Unwrap) via a new invertRenames helper in the generator. Add DABsPathToTerraform which mirrors TerraformPathToDABs: renames segments using DABsToTerraformRenameMap and prepends the spec wrapper for postgres resources via DABsToTerraformWrappers. Extend TestTerraformPathToDABs with a roundtrip bool field — when true the test also verifies DABsPathToTerraform(output) == input. All mapped paths roundtrip correctly; TF-computed paths (status.*, timestamps) are marked roundtrip:false since they pass through TF→DABs unchanged but would get the spec wrapper incorrectly added on the way back. Add TestDABsPathToTerraform covering renames, wrapper prepend, unknown segments, array indices, and unknown groups. Co-authored-by: Isaac
…de-only fields - Add FieldSet.Contains to check if a path is covered by the set - TerraformPathToDABs returns error for paths in TerraformOnlyFields - DABsPathToTerraform returns error for paths in DABsOnlyFields - Test table: named fields, one field per line, terrPath/dabsPath names, noRoundtrip instead of roundtrip, expectErr for one-side-only cases - Drop sortedKeys helper; use slices.Sorted(maps.Keys(...)) directly Co-authored-by: Isaac
Test that map-typed TF-only fields (spark_conf, base_parameters) reject any runtime key via the '*' wildcard in FieldSet.Contains. Co-authored-by: Isaac
Terraform represents single nested blocks as lists and requires [0] indexing to access their fields (e.g. python_wheel_task[0].entry_point). DABs models these as plain structs. Make getValue() and validateNodeSlice() ignore [0] on struct values so TF-style paths work transparently against DABs Go structs. Co-authored-by: Isaac
The Python script and its TSV output were scaffolding used to discover rename rules and one-sided fields. All useful data is now in the Go version (bundle/terraform_dabs_map/generated.go), which is more complete and serves as the runtime source of truth. Co-authored-by: Isaac
…mode
Add TFOnlyReferences mutator to the initialize phase. It walks the full
config looking for ${resources.*} references and, for each field path,
checks TerraformOnlyFields to see if the field has no DABs equivalent.
In direct mode the reference cannot be resolved at deploy time, so the
mutator returns an error early (during validate / plan / deploy).
In Terraform mode the reference is handled by Terraform itself, but it
will break if the bundle is later migrated to the direct engine, so a
warning is emitted to alert the author.
Tests:
- Four unit tests covering error/warning/no-op/renamed-field cases.
- tf_path_only_error acceptance test updated: validate now emits
per-engine diagnostics (2>&1 added to capture stderr alongside
the bundle summary), and plan/deploy also surface the new early
error/warning before reaching their own resolution logic.
Co-authored-by: Isaac
…pelines into tf_path_renames
All three tests exercise the same pattern: a cross-resource ${resources.*}
reference that uses a TF field name (or TF list-block syntax) that the
direct engine translates via TerraformPathToDABs. Merging them removes
duplication and exercises all three rename types in a single deploy:
- git_source[0].branch → git_source.git_branch (rename + [0] on struct)
- task[0].library[0].whl → tasks[0].libraries[0].whl (two renames)
- cluster[0].label → clusters[0].label (pipeline rename)
Co-authored-by: Isaac
The message previously read: "resources.jobs.src.always_running": always_running: Terraform-only field; ... The field name appeared twice: once in the quoted reference, and again as the p[3:] prefix. Drop the redundant prefix. Co-authored-by: Isaac
Groups with a 'spec' structural wrapper showed '0 renames, 0 dabs-only, 0 tf-only' without any indication that spec-unwrapping was applied. Add the unwrap count so the summary is complete. Co-authored-by: Isaac
TestTerraformPathToDABs already checks the roundtrip for every non-error case. Replace TestDABsPathToTerraform with a smaller TestDABsPathToTerraformErrors covering only the DABs-only error cases that the roundtrip never reaches. Co-authored-by: Isaac
Dead code; nothing in the codebase calls them. Co-authored-by: Isaac
Co-authored-by: Isaac
Co-authored-by: Isaac
Contributor
Approval status: pending
|
Co-authored-by: Isaac
andrewnester
reviewed
Jun 1, 2026
Co-authored-by: Isaac
Collaborator
|
Commit: 25bf471 |
…url_ref script Co-authored-by: Isaac
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Changes
Why
Enables seamless and unattended migration to direct engine.
Tests
New acceptance and unit tests.,