Skip to content

Codex: add sync plan/apply commands backed by shared deploying lib#3457

Open
reakaleek wants to merge 16 commits into
mainfrom
candied-soybean
Open

Codex: add sync plan/apply commands backed by shared deploying lib#3457
reakaleek wants to merge 16 commits into
mainfrom
candied-soybean

Conversation

@reakaleek

Copy link
Copy Markdown
Member

Why

Codex produces a static site to .artifacts/codex/docs but had no way to incrementally sync that output to S3 — only the assembler had that capability. The sync strategies were tightly coupled to AssembleContext, making them unusable from codex without duplication.

What

The incremental S3 sync machinery (plan strategy, apply strategy, plan model, validator, and orchestrator) is extracted from Elastic.Documentation.Assembler into a new neutral Elastic.Documentation.Deploying assembly. A small IDocsSyncContext interface captures the five properties the strategies actually use — ReadFileSystem, WriteFileSystem, OutputDirectory, Collector, and EnvironmentName — which both AssembleContext and CodexContext now implement.

Two new commands are wired under codex sync:

  • codex sync plan <codex.yml> --s3-bucket-name <bucket> [--environment <env>] --out <file> — diffs local output against S3 and writes a plan
  • codex sync apply <codex.yml> --s3-bucket-name <bucket> [--environment <env>] --plan <file> — executes the plan

The assembler's assembler deploy plan/apply surface is unchanged; DeployCommands now builds its AssembleContext locally before calling the generalized service.

How

IDocsSyncContext is the only new abstraction introduced. The strategies never needed more than ReadFileSystem, WriteFileSystem, OutputDirectory, Collector, and one environment-name log string — so the interface is deliberately narrow. Both context types already exposed all five; no data was restructured.

Test plan

  • dotnet build docs-builder.slnx — clean build
  • docs-builder assembler deploy plan --help — flags unchanged from before
  • docs-builder codex sync plan --help / docs-builder codex sync apply --help — new commands present with correct flags

reakaleek and others added 3 commits June 2, 2026 17:56
Extract the assembler's incremental S3 sync machinery into a new
Elastic.Documentation.Deploying assembly, decouple it from
AssembleContext via IDocsSyncContext, and wire `codex sync plan` /
`codex sync apply` commands that reuse the same logic.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@reakaleek reakaleek requested a review from a team as a code owner June 2, 2026 16:01
@reakaleek reakaleek requested a review from technige June 2, 2026 16:01
@coderabbitai

coderabbitai Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR adds a new Elastic.Documentation.Deploying service project, introduces IDocsSyncContext as a shared context for S3 sync operations, refactors sync planning and apply code to use that context instead of AssembleContext, and adds a new codex sync CLI with plan and apply subcommands. CodexContext and AssembleContext are updated to implement IDocsSyncContext. IncrementalDeployService is moved to the deploying namespace and refactored to accept IDocsSyncContext with optional AWS dependencies. Sync strategies and validator are relocated and updated to depend on IDocsSyncContext. Project references, CLI schema, and integration tests are added to wire the new functionality end-to-end.

Sequence Diagram(s)

sequenceDiagram
  participant CLI as CLI/CodexSyncCommand
  participant ServiceInvoker
  participant IncrementalDeployService
  participant AwsS3SyncPlanStrategy
  participant AwsS3SyncApplyStrategy
  participant S3 as AWS S3

  CLI->>ServiceInvoker: invoke Plan(context, s3Bucket, out, deleteThreshold)
  ServiceInvoker->>IncrementalDeployService: Plan(context, s3Bucket, out, deleteThreshold)
  IncrementalDeployService->>AwsS3SyncPlanStrategy: Plan(deleteThreshold)
  AwsS3SyncPlanStrategy-->>IncrementalDeployService: SyncPlan
  IncrementalDeployService->>CLI: write plan via context.WriteFileSystem

  CLI->>ServiceInvoker: invoke Apply(context, s3Bucket, planFile)
  ServiceInvoker->>IncrementalDeployService: Apply(context, s3Bucket, planFile)
  IncrementalDeployService->>IncrementalDeployService: read plan via context.ReadFileSystem
  IncrementalDeployService->>AwsS3SyncApplyStrategy: Apply(sync plan)
  AwsS3SyncApplyStrategy->>S3: upload/delete objects per plan
  AwsS3SyncApplyStrategy-->>IncrementalDeployService: result
  IncrementalDeployService-->>CLI: return success/failure
Loading

Possibly related PRs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 27.78% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely describes the main change: extracting sync machinery into a shared deploying library and adding new Codex sync commands.
Description check ✅ Passed The description is detailed and directly related to the changeset, explaining the motivation, what was extracted, new commands added, and the test plan.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch candied-soybean

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/services/Elastic.Documentation.Deploying/IncrementalDeployService.cs (1)

79-83: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Treat an empty plan as success.

This branch returns false without emitting an error, so ServiceInvoker.InvokeAsync() turns a no-op sync into a failed command/exit code 1. An already-synced bucket should log and return true here, not fail the apply step.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/services/Elastic.Documentation.Deploying/IncrementalDeployService.cs`
around lines 79 - 83, The branch treating an empty plan as failure should be
changed to signal success: in IncrementalDeployService where you check
plan.TotalSyncRequests == 0 (the block that currently calls
_logger.LogInformation("Plan has no files to sync, skipping incremental
synchronization"); return false;), change the return to true so an empty/no-op
plan is considered successful and ServiceInvoker.InvokeAsync() won't treat it as
a failed command; keep the existing log message and ensure the method
(IncrementalDeployService's apply/invoke method) returns true for this case.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/cli-schema.json`:
- Around line 4771-4831: The schema documents the command "docs-builder codex
sync apply" but currently lists the "plan" parameter as optional and omits it
from the "usage" string; make the CLI contract accurate by marking the "plan"
parameter (parameter name "plan") as required and adding it into the "usage"
string (e.g., include "--plan <path>" or similar) so the usage, parameters, and
validations for "plan" are consistent with the command behavior; update the
"plan" object's "required" field to true and amend the "usage" value for
"docs-builder codex sync apply" to include the --plan flag.

In `@src/Elastic.Codex/CodexContext.cs`:
- Around line 35-36: EnvironmentName currently uses the null-coalescing operator
on Configuration.Environment so an empty string will not fall back to "codex";
change the logic in the EnvironmentName getter to treat empty/whitespace the
same as null (e.g., use string.IsNullOrWhiteSpace(Configuration.Environment) or
string.IsNullOrEmpty) and return "codex" when empty, matching the normalization
done by IndexNamespace (lines referencing IndexNamespace) so sync logs don't
report a blank environment.

In `@src/tooling/docs-builder/Commands/Codex/CodexSyncCommand.cs`:
- Around line 35-36: The plan command in CodexSyncCommand is passing an empty
string for the --out path into IncrementalDeployService.Plan(), which causes
Plan() to skip serialization and produces no reusable plan for codex sync apply;
change the command so that when the --out option is omitted it writes the
serialized plan to stdout (instead of passing ""), or make the --out option
required; specifically update the logic around the --out handling in
CodexSyncCommand and the call to IncrementalDeployService.Plan() so that a
non-empty path or a stdout stream is provided and ensure the plan serialization
step is invoked so codex sync apply --plan can consume the result.
- Around line 61-69: resolvedEnvironment is computed but never used, so the CLI
--environment/ENVIRONMENT fallback never affects the sync target; fix by wiring
resolvedEnvironment into the CodexContext or the command state passed to
IncrementalDeployService. Specifically, update the CodexContext instantiation
(new CodexContext(...)) to pass resolvedEnvironment instead of the null
environment argument (or include resolvedEnvironment in the tuple passed to
serviceInvoker.AddCommand and forward it into the Plan call), and apply the same
change in the other code path referenced (lines 109-118) so Plan receives the
intended environment.

---

Outside diff comments:
In `@src/services/Elastic.Documentation.Deploying/IncrementalDeployService.cs`:
- Around line 79-83: The branch treating an empty plan as failure should be
changed to signal success: in IncrementalDeployService where you check
plan.TotalSyncRequests == 0 (the block that currently calls
_logger.LogInformation("Plan has no files to sync, skipping incremental
synchronization"); return false;), change the return to true so an empty/no-op
plan is considered successful and ServiceInvoker.InvokeAsync() won't treat it as
a failed command; keep the existing log message and ensure the method
(IncrementalDeployService's apply/invoke method) returns true for this case.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: af4bc7a6-d176-4105-9a1f-79c7444fb441

📥 Commits

Reviewing files that changed from the base of the PR and between e010a12 and c816493.

📒 Files selected for processing (17)
  • docs-builder.slnx
  • docs/cli-schema.json
  • src/Elastic.Codex/CodexContext.cs
  • src/Elastic.Codex/Elastic.Codex.csproj
  • src/services/Elastic.Documentation.Assembler/AssembleContext.cs
  • src/services/Elastic.Documentation.Assembler/Elastic.Documentation.Assembler.csproj
  • src/services/Elastic.Documentation.Deploying/Elastic.Documentation.Deploying.csproj
  • src/services/Elastic.Documentation.Deploying/IncrementalDeployService.cs
  • src/services/Elastic.Documentation.Deploying/Synchronization/AwsS3SyncApplyStrategy.cs
  • src/services/Elastic.Documentation.Deploying/Synchronization/AwsS3SyncPlanStrategy.cs
  • src/services/Elastic.Documentation.Deploying/Synchronization/DocsSync.cs
  • src/services/Elastic.Documentation.Deploying/Synchronization/DocsSyncPlanValidator.cs
  • src/services/Elastic.Documentation.Deploying/Synchronization/IDocsSyncContext.cs
  • src/tooling/docs-builder/Commands/Assembler/DeployCommands.cs
  • src/tooling/docs-builder/Commands/Codex/CodexSyncCommand.cs
  • src/tooling/docs-builder/Program.cs
  • tests-integration/Elastic.Documentation.IntegrationTests/DocsSyncTests.cs

Comment thread docs/cli-schema.json Outdated
Comment thread src/Elastic.Codex/CodexContext.cs Outdated
Comment thread src/tooling/docs-builder/Commands/Codex/CodexSyncCommand.cs
Comment thread src/tooling/docs-builder/Commands/Codex/CodexSyncCommand.cs Outdated
…tName fallback, required --plan

- Return true (not false) when apply plan is empty — no-op sync is success
- Wire resolvedEnvironment into CodexContext so --environment flag takes effect
- Use IsNullOrEmpty instead of ?? for EnvironmentName, consistent with IndexNamespace
- Make --plan required in codex sync apply (non-nullable FileInfo, no manual null check)
- Regenerate CLI schema

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@reakaleek

Copy link
Copy Markdown
Member Author

Outside-diff comment (IncrementalDeployService.cs:79-83): Valid. ServiceInvoker emits a global error when a command returns false without having emitted any errors itself, turning a no-op apply (already-synced bucket) into exit code 1. Changed return false to return true for the empty-plan path. Fixed in 01d1348d2.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/services/Elastic.Documentation.Deploying/IncrementalDeployService.cs`:
- Around line 79-85: The early return for a no-op currently happens before
running the DocsSyncPlanValidator, so malformed plans with TotalSyncRequests ==
0 are incorrectly treated as successful; move the short-circuit that checks
plan.TotalSyncRequests to after creating and running
DocsSyncPlanValidator.Validate(plan) (references: Apply(),
DocsSyncPlanValidator, validationResult, plan.TotalSyncRequests) and only return
true for the no-op when validationResult indicates the plan is valid; if
validation fails, surface the validation failure (return false or propagate the
error consistent with existing Apply() error handling).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: cd8dc5c8-00f1-461e-8aa5-28eedf3636b6

📥 Commits

Reviewing files that changed from the base of the PR and between c816493 and 01d1348.

📒 Files selected for processing (4)
  • docs/cli-schema.json
  • src/Elastic.Codex/CodexContext.cs
  • src/services/Elastic.Documentation.Deploying/IncrementalDeployService.cs
  • src/tooling/docs-builder/Commands/Codex/CodexSyncCommand.cs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/Elastic.Codex/CodexContext.cs
  • docs/cli-schema.json

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The environment flag had no effect on sync routing or bucket selection;
it only appeared in a single log message. CodexContext already derives
EnvironmentName from codex.yml, which is sufficient.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Remove dead File.Exists guard in IncrementalDeployService.Apply
  ([Existing] on planFile makes it unreachable at all call sites)
- Extract LoadContext helper in CodexSyncCommand to eliminate
  copy-pasted config-loading preamble across Plan and Apply
- IndexNamespace delegates to EnvironmentName to remove duplicate
  IsNullOrEmpty check in CodexContext
- Replace O(n²) AddRequests.Contains with a HashSet in Upload()
- Merge metrics and copy loops in Upload() into a single pass
- Remove DeleteRequests.ToList() — already an IReadOnlyList
- Overlap S3 listing with local file scan in AwsS3SyncPlanStrategy.Plan

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@reakaleek reakaleek temporarily deployed to integration-tests June 2, 2026 21:19 — with GitHub Actions Inactive
IncrementalDeployService accepted no injectable S3 dependencies, making
the plan→apply flow untestable without hitting real AWS. Adding optional
IAmazonS3, ITransferUtility, and IS3EtagCalculator parameters (all
defaulting to null so the two CLI call sites remain unchanged) opens a
seam for tests to drive the full round-trip.

The new IncrementalDeployRoundTripTests cover both AssembleContext and
CodexContext: plan writes a JSON plan file to a MockFileSystem, apply
reads it back, and assertions verify every sync category (add, update,
skip, delete) against mocked S3 and a deterministic ETag calculator.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
AmazonS3Client is expensive to construct (credential resolution,
connection pool allocation). Plan and Apply each called the ?? fallback
independently, creating two separate clients per round-trip in production.
Resolving to a field at construction shares the pool across both calls.

Also drop the intermediate diagnosticsOutputs variable (use the
collection-literal form the rest of the codebase uses) and remove the
redundant collector parameter from RunRoundTrip — context.Collector
is the same value.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

@Mpdreamz Mpdreamz left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functional review — traced the assembler deploy path end-to-end against the production workflow in .shared.assemble-build-and-deploy.yml. That path is safe: same data, just different wiring. Found two issues in the new codex path and one note on the empty-plan behavior change.

var configFile = fs.FileInfo.New(config.FullName);
var codexConfig = CodexConfiguration.Load(configFile);
return (new CodexContext(codexConfig, configFile, collector, fs, fs, null, null),
new IncrementalDeployService(logFactory, githubActionsService));

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: fs (FileSystemFactory.RealRead) is passed as both readFileSystem and writeFileSystem. RealRead doesn't have AllowedSpecialFolders = Temp, but AwsS3SyncApplyStrategy.Upload() creates a temp staging directory via context.WriteFileSystem.Path.GetTempPath(). This will fail at runtime with a scope violation when apply tries to stage files.

The round-trip test doesn't catch it because MockFileSystem is wrapped with ScopeCurrentWorkingDirectoryForWrite which does include temp.

The second argument should be FileSystemFactory.RealWrite (same as DeployCommands does for the assembler path):

Suggested change
new IncrementalDeployService(logFactory, githubActionsService));
return (new CodexContext(codexConfig, configFile, collector, fs, FileSystemFactory.RealWrite, null, null),

@reakaleek reakaleek Jun 15, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid. RealRead (no AllowedSpecialFolders.Temp) was passed as both read and write filesystem; AwsS3SyncApplyStrategy stages uploads under WriteFileSystem.Path.GetTempPath(), so any non-empty apply would hit a scope violation at runtime. Changed to FileSystemFactory.RealWrite (matches how DeployCommands wires the assembler path). Fixed in f190e25.

}
var planJson = await readFileSystem.File.ReadAllTextAsync(planFile, ctx);
var planJson = await context.ReadFileSystem.File.ReadAllTextAsync(planFile, ctx);
var plan = SyncPlan.Deserialize(planJson);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old code had a File.Exists guard here (with collector.EmitError + return false) before reading the plan file. The [Existing] CLI attribute protects the current entry points, but the service method itself lost its defensive check. If called programmatically in the future, a missing plan file would throw an unhandled FileNotFoundException instead of returning a clean diagnostic error.

Worth restoring the guard above this line for defense-in-depth:

if (!context.ReadFileSystem.File.Exists(planFile))
{
    collector.EmitError(planFile, "Plan file does not exist.");
    return false;
}

@reakaleek reakaleek Jun 15, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid. Restored the File.Exists guard before ReadAllTextAsync. The [Existing] attribute protects the CLI entry point, but Apply() is public — a programmatic caller with a missing path would get an unhandled FileNotFoundException instead of a clean diagnostic error. Fixed in 64dd72e.

_logger.LogInformation("Plan has no files to sync, skipping incremental synchronization");
return false;
return true;
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct fix — the old return false here meant an empty plan (nothing to sync) caused the apply CI step to fail with a non-zero exit code, which is wrong since a no-op sync is a success.

Worth noting: the workflow still calls apply on empty plans because plan-valid gates on validation, not on whether there are changes. That's fine since apply handles it gracefully, just something to be aware of.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — no change needed. The no-op return true is the correct behaviour (an already-in-sync bucket is a success, not an error), and apply handling empty plans gracefully means the CI workflow's unconditional apply step continues to work without special-casing.

@Mpdreamz Mpdreamz left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comments

reakaleek and others added 2 commits June 15, 2026 13:18
RealRead was passed as both read and write, but RealRead lacks
AllowedSpecialFolders.Temp. AwsS3SyncApplyStrategy stages uploads
under WriteFileSystem.Path.GetTempPath(), so any non-empty apply
would hit a scope violation at runtime.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The [Existing] CLI attribute protects current entry points, but
Apply() is a public method. A programmatic caller passing a
nonexistent path would get an unhandled FileNotFoundException;
emit a clean diagnostic error instead, consistent with how the
service handles other invalid-input conditions.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Duplicate 7-line log block in Plan() and Apply() extracted to a
  private LogPlanSummary(SyncPlan) helper — one place to maintain
  the format going forward.
- Interlocked.Add on a local variable inside a sequential foreach
  replaced with plain +=; the batch loop has no concurrent writers.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@Mpdreamz

Copy link
Copy Markdown
Member

I guess the only thing I am unsure about is:

https://github.com/elastic/docs-internal-workflows/blob/main/.github/workflows/.shared.assemble-build-and-deploy.yml#L158

steps.deploy-plan.outputs.plan-valid == 'true' will now be true for files with 0 changes.

@reakaleek

Copy link
Copy Markdown
Member Author

I guess the only thing I am unsure about is:

https://github.com/elastic/docs-internal-workflows/blob/main/.github/workflows/.shared.assemble-build-and-deploy.yml#L158

steps.deploy-plan.outputs.plan-valid == 'true' will now be true for files with 0 changes.

0 files should be a valid plan though.. it never happens because of the llm.zip

Regardless, an apply to 0 files should be a noop.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants