jac-scale --scale deploys silently no-op: hyphenated bool Args register a phantom name-with-hyphen: True namespace key (PR #6024 regression) #6115

@udithishanka

Problem

Any boolean CLI Arg whose name contains a hyphen (e.g. dry-run, show-yaml, enable-tls) gets a phantom namespace key, spelled with the hyphen and defaulting to True. HookContext.get_arg("the-name", False) returns that phantom value even when the flag was never passed on the command line, so pre-hooks that check context.get_arg("dry-run", False) see True on every invocation.

Hit in production on jaseci-labs/jacBuilder@dev: every jac start main.jac --scale --experimental run since 6cff699d2 (PR #6024) silently takes the dry-run branch in jac-scale/jac_scale/plugin.jac:147, prints the plan, sets cancel_execution=True, and returns before apply_manifests ever runs. The deploy reports success because the workflow follows it with a hard-coded echo "Deployment spec applied". The cluster had been frozen on a 43h-old image for ~16h before anyone noticed.

Root cause

jac/jaclang/cli/impl/registry.impl.jac:289-304 adds an auto-generated --no-X companion for every bool flag:

} elif arg.kind == ArgKind.FLAG or arg.typ == bool {
    kwargs["action"] = "store_true";
    kwargs["default"] = arg.default if arg.default is not None else False;
    if arg.short {
        parser.add_argument(f"-{arg.short}", f"--{arg.name}", **kwargs);
    } else {
        parser.add_argument(f"--{arg.name}", **kwargs);
    }
    # Add --no-X variant
    no_kwargs = {
        "action": "store_false",
        "dest": arg.name,                         # <-- preserves hyphens
        "help": f"Disable {arg.name}"
    };
    parser.add_argument(f"--no-{arg.name}", **no_kwargs);
}

argparse handles this name pair inconsistently:

  • --dry-run (store_true): no explicit dest, argparse auto-derives → namespace attribute dry_run (underscore), default False.
  • --no-dry-run (store_false): explicit dest="dry-run" (hyphen) → namespace attribute dry-run (literal hyphen), default True (the implicit default of store_false).

Then jac/jaclang/cli/impl/cli.impl.jac:81 does args_dict = vars(args), which surfaces both entries. Result for any invocation that omits both flags:

{'dry_run': False, 'dry-run': True}

jac/jaclang/cli/impl/command.impl.jac:65-67 does a plain self.args.get(name, default). There's no name normalisation, so a caller asking for the hyphen form gets the phantom-True from the --no-X companion.

PR #6024 introduced the first plugin caller that reads a hyphenated bool by its hyphen form:

dry_run = context.get_arg("dry-run", False);  # always True

enable_tls doesn't trip the bug because that caller asks for "enable_tls" (underscore), which hits the store_true attribute, not the store_false phantom.

Minimal reproduction (pure Python argparse, no Jac runtime needed)

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--dry-run", action="store_true", default=False)
parser.add_argument("--no-dry-run", action="store_false", dest="dry-run")

args = parser.parse_args([])          # nothing passed
print(vars(args))                     # {'dry_run': False, 'dry-run': True}
print(vars(args).get("dry-run"))      # True  <-- the bug

End-to-end reproduction on a jac-scale deploy

git clone --depth 1 --branch main https://github.com/jaseci-labs/jaseci.git /tmp/jaseci
uv venv /tmp/repro --python 3.12
uv pip install --python /tmp/repro/bin/python \
    -e /tmp/jaseci/jac -e /tmp/jaseci/jac-scale -e /tmp/jaseci/jac-client

mkdir /tmp/dry-run-repro && cd /tmp/dry-run-repro
cat > jac.toml <<'TOML'
[project]
name = "dry-run-repro"
version = "0.1.0"
entry-point = "main.jac"

[plugins.scale.microservices]
enabled = true
[plugins.scale.microservices.routes]
foo = "/api/foo"
TOML

cat > main.jac <<'JAC'
walker foo { can run with `root entry { report "ok"; } }
JAC

/tmp/repro/bin/jac start main.jac --scale --experimental

Expected: deploy or fail with a real cluster error.
Actual: prints === jac scale plan: dry-run === and exits. No build, no apply. From the workflow log on jaseci-labs/jacBuilder@dev (2026-05-22T11:45 UTC):

Installing Jaseci packages from repository (experimental mode)...
ℹ Generated 3 pod-spec stubs (1 gateway + 2 services) | Context: {'image': 'python:3.12-slim', 'namespace': 'jac-builder-dev'}

=== jac scale plan: dry-run ===
...
To see the raw YAML manifests, re-run with --show-yaml
Deployment spec applied by jac-scale.   # <-- this line is the workflow's hardcoded echo, NOT jac-scale

Total runtime ~3s. Earlier deploys against the same workflow on the same branch (pre-PR-#6024 jac-scale) took 4-5 minutes and emitted Applied PVC, Applied Service, Applied Deployment log lines. None of those appear after #6024 because the dry-run branch returns before apply_manifests.

Blast radius

Any pre-hook that reads a hyphenated bool arg by its hyphen form is currently broken on main. Searching the tree, the relevant callers I found are:

  • jac-scale/jac_scale/plugin.jac:146: dry_run = context.get_arg("dry-run", False); -- silent no-op on every --scale deploy.
  • jac-scale/jac_scale/plugin.jac:158 (indirect): show_yaml = context.get_arg("show_yaml", False); -- uses the underscore form, so it happens to dodge the bug, but the phantom show-yaml key is still present in the namespace.

Other plugins that register hyphenated bool args (enable-tls etc.) currently dodge the bug only by asking for the underscore form. Anyone copy-pasting the Arg.create("dry-run", typ=bool, ...) pattern is one typo away from the same silent-True footgun.
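
One way to catch this class of bug before it ships is a regression check that parses an empty command line and flags any namespace key containing a hyphen. The sketch below is pure Python argparse; the helper name phantom_keys is hypothetical, and it assumes the parser under test has no required arguments (otherwise parse_args([]) exits).

```python
import argparse


def phantom_keys(parser: argparse.ArgumentParser) -> list[str]:
    """Return namespace keys containing a hyphen when nothing is passed.

    Hypothetical helper: assumes the parser has no required arguments.
    """
    namespace = vars(parser.parse_args([]))
    return sorted(key for key in namespace if "-" in key)


# Mimic the current registry behaviour for three hyphenated bool flags.
parser = argparse.ArgumentParser()
for name in ("dry-run", "show-yaml", "enable-tls"):
    parser.add_argument(f"--{name}", action="store_true", default=False)
    parser.add_argument(f"--no-{name}", action="store_false", dest=name)

found = phantom_keys(parser)
print(found)  # ['dry-run', 'enable-tls', 'show-yaml']
```

An empty result from such a check would indicate that every flag and its companion share one underscore-form attribute.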

Suggested fix

Normalise the dest of the --no-X companion to match argparse's auto-derivation for the positive form (replace hyphens with underscores):

# jac/jaclang/cli/impl/registry.impl.jac
no_kwargs = {
    "action": "store_false",
    "dest": arg.name.replace("-", "_"),
    "help": f"Disable {arg.name}"
};

That collapses both flags onto a single namespace attribute (dry_run), the phantom-True entry goes away, and vars(args) cleanly maps --dry-run / --no-dry-run onto one boolean.
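
The effect of the normalised dest can be checked in plain Python argparse, without the Jac runtime. This is a minimal sketch rather than the registry code itself: build_parser and normalise_dest are hypothetical names used only to compare the current and proposed behaviour side by side.

```python
import argparse


def build_parser(normalise_dest: bool) -> argparse.ArgumentParser:
    """Register --dry-run and its --no-dry-run companion (hypothetical helper)."""
    name = "dry-run"
    parser = argparse.ArgumentParser()
    # Positive flag: no explicit dest, so argparse derives "dry_run".
    parser.add_argument(f"--{name}", action="store_true", default=False)
    # Companion flag: raw name (current behaviour) or underscore form (fix).
    dest = name.replace("-", "_") if normalise_dest else name
    parser.add_argument(f"--no-{name}", action="store_false", dest=dest)
    return parser


buggy = vars(build_parser(normalise_dest=False).parse_args([]))
fixed = vars(build_parser(normalise_dest=True).parse_args([]))
fixed_on = vars(build_parser(normalise_dest=True).parse_args(["--dry-run"]))
fixed_off = vars(
    build_parser(normalise_dest=True).parse_args(["--dry-run", "--no-dry-run"])
)

print(buggy)      # {'dry_run': False, 'dry-run': True}
print(fixed)      # {'dry_run': False}
print(fixed_on)   # {'dry_run': True}
print(fixed_off)  # {'dry_run': False}
```

When two actions share a dest, argparse applies the default of whichever action was registered first, so the positive flag's False default wins here because it is added before the companion.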

Belt-and-suspenders option: have HookContext.get_arg try both name and name.replace("-", "_") so plugins that already shipped with either spelling continue to work.
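
A rough Python sketch of that fallback lookup follows. It is not the actual HookContext code: the standalone get_arg function and the _MISSING sentinel are hypothetical. It also assumes the underscore spelling is tried first, since checking the hyphen form first would still return the phantom True on a tree where the registry has not been fixed.

```python
from typing import Any

_MISSING = object()  # sentinel so that stored None/False values are honoured


def get_arg(args: dict[str, Any], name: str, default: Any = None) -> Any:
    """Look up an arg, preferring the underscore spelling argparse derives.

    Hypothetical stand-in for HookContext.get_arg operating on a plain dict.
    """
    normalised = name.replace("-", "_")
    for key in (normalised, name):
        value = args.get(key, _MISSING)
        if value is not _MISSING:
            return value
    return default


# Namespace as produced today when neither flag is passed.
args = {"dry_run": False, "dry-run": True}
print(get_arg(args, "dry-run", False))  # False
print(get_arg(args, "dry_run", False))  # False
```

Falling back to the literal name keeps any key that genuinely exists only in hyphen form reachable.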

Either approach unblocks the silent-no-op deploy on jacBuilder@dev, and on anything else that relies on --scale actually applying manifests.

Environment

  • jaseci-labs/jaseci@main HEAD (verified at 5eaee35c4 / earlier 7dcc7d4d3).
  • Python 3.12, argparse stdlib (no version dependency; bug is in our argparse usage, not in argparse itself).
  • Reproduces on macOS arm64 and ubuntu-24.04 GitHub Actions runners.

Workaround until fixed

In jac-scale/jac_scale/plugin.jac:146, change:

dry_run = context.get_arg("dry-run", False);

to:

dry_run = context.get_arg("dry_run", False);

That sidesteps the phantom-True until the registry fix lands. Downstream consumers depending on --scale actually applying manifests (jacBuilder dev/prod, anyone else on the K-track v1 path) should pin to a commit before 6cff699d2 or apply the one-line workaround above.


Found while debugging why jaseci-labs/jacBuilder@dev was throwing Cannot read properties of undefined (reading '__jacUnsafeHtml') (#6054) again post-fix. The PR-#6062 fix was in main but every deploy was silently no-opping, so the cluster never picked it up.
