Investigate AST-level hashing in test_patches.py to distinguish substantive from cosmetic changes

## Summary

`tests/test_patches.py` currently hashes botocore function **bytes** (SHA1 of source text). This catches every change — substantive behavior shifts AND cosmetic edits (docstrings, whitespace, type hints, import reorder) alike. The result: contributors can't tell from a hash-fail alone whether a real port is needed or just a hash bump.

Proposal: add an **AST-level hash alongside the existing byte hash**. Two signals, different intents:

| Hash | What it catches | Enforcement meaning |
|-|-|-|
| **Byte hash (existing)** | All changes, including cosmetic | Aiobotocore should mirror 1:1 for diff minimality (the repo's core principle). A mismatch means the override may be out of sync even for docstrings/whitespace. |
| **AST hash (new)** | Substantive changes only — signature, control flow, call structure | A mismatch means a real functional port is needed, not just a mechanical mirror. |

With both, a pure-docstring botocore change busts the byte hash but not the AST hash, so the contributor knows to apply the cosmetic mirror + bump the byte hash but doesn't need to do behavioral porting work. A real body change busts both, signaling both mirror and port.

## Why this matters now

- The `/aiobotocore-bot:check-async-need` classifier distinguishes substantive from cosmetic per Step 2/3 (refined in #1565). The byte hash doesn't have that distinction baked in.
- A recent case (PR #1480) had a mix: botocore's `_apply_request_trailer_checksum` had a real body change AND adjacent classes got docstrings/type hints. The byte hash fires on both; the classifier now correctly only port-flags the body change. An AST hash would let the test suite encode the same distinction.
- Contributors doing a relax upgrade (no-port) with only cosmetic botocore changes upstream currently have to update hashes for no functional reason — looks like a port but isn't.

## Proposed approach (sketch)

For each entry in `test_patches.py` track two hashes:

```python
# before
'hash': 'a1b2c3...'  # SHA1 of source bytes

# after
'byte_hash': 'a1b2c3...'   # unchanged; mirrors minimality principle
'ast_hash':  'd4e5f6...'   # normalized AST hash; behavior-only signal
```

AST normalization:

1. Load the botocore source module.
2. Extract the function/class via `ast.parse` → find by name.
3. Normalize the AST:
   - Strip module-level and function-level docstrings (first `Expr(Constant(str))`).
   - Strip pure type annotations (`ast.arg.annotation`, `FunctionDef.returns`).
   - Unparse back to source with a canonical formatter (`ast.unparse`).
4. Hash the normalized unparsed source.

On test failure, the contributor sees which hash(es) failed and gets a clear signal: byte-only = cosmetic mirror + hash bump, both = real port needed.

## Risks / considerations

- **False negatives**: an AST-equivalent refactor (`if cond: return a\nreturn b` → `return a if cond else b`) busts the byte hash but equals the AST hash. Usually that's fine — behavior is the same — but edge cases exist (walrus-in-expr-context, tuple-unpacking reorder). Worth auditing the worst 5-10 AST-equivalence classes against aiobotocore's actual overrides before committing.
- **Ordering stability**: `ast.unparse` output can vary across Python versions. Pin the hashing-Python version in `test_patches.py` or bake it into the normalization.
- **Debuggability**: when the AST hash fails, the raw byte diff is harder to map to. Need a small diff tool that shows normalized forms side-by-side on failure.

## Non-goals

- Replacing the byte hash. Aiobotocore's "minimize divergence from botocore" principle (see docs/override-patterns.md) means byte-level fidelity matters — the AST hash is additive signal, not replacement.
- Replacing the check-async-need classifier. The hash tests and the classifier serve different purposes (mechanical coupling vs async-need judgment). All three signals should coexist.
- Changing aiobotocore's override model. The hashing is CI-side enforcement, not a runtime concern.

## Related

- PR #1563 introduced `/aiobotocore-bot:check-async-need` which encodes similar logic in prompt form.
- PR #1565 refined the classifier's "substantive change" criterion after a real miss on PR #1480.
- `CONTRIBUTING.rst` §"Hashes of Botocore Code" is the current documentation for test_patches.py behavior.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Investigate AST-level hashing in test_patches.py to distinguish substantive from cosmetic changes #1566

Summary

Why this matters now

Proposed approach (sketch)

Risks / considerations

Non-goals

Related

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Hash	What it catches	Enforcement meaning
Byte hash (existing)	All changes, including cosmetic	Aiobotocore should mirror 1:1 for diff minimality (the repo's core principle). A mismatch means the override may be out of sync even for docstrings/whitespace.
AST hash (new)	Substantive changes only — signature, control flow, call structure	A mismatch means a real functional port is needed, not just a mechanical mirror.

Uh oh!

Investigate AST-level hashing in test_patches.py to distinguish substantive from cosmetic changes #1566

Description

Summary

Why this matters now

Proposed approach (sketch)

Risks / considerations

Non-goals

Related

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions