Open
218 commits
7e1aa18
Add BioScript library shim infrastructure
madhavajay May 11, 2026
322870a
Add VNtyper BioScript port scaffolding
madhavajay May 11, 2026
05ca7ce
Add bcftools wrapper surface
madhavajay May 11, 2026
8e19017
Gate VNtyper integration data tests
madhavajay May 11, 2026
001122e
Split runtime tool method bindings
madhavajay May 11, 2026
35bb010
Document VNtyper upstream inventory
madhavajay May 11, 2026
0e7111f
Add VNtyper region helpers
madhavajay May 11, 2026
b2623fc
Add VNtyper command planner
madhavajay May 11, 2026
7bad876
Mark VNtyper port skeleton milestone
madhavajay May 11, 2026
6a195cf
Add optional VNtyper upstream scoring parity test
madhavajay May 11, 2026
cdc4bd6
Record VNtyper port decisions
madhavajay May 11, 2026
d16d70c
Add VNtyper expected fixture outputs
madhavajay May 11, 2026
5ae3165
Expand VNtyper structured report JSON
madhavajay May 11, 2026
c820af5
Add VNtyper adVNTR report fields
madhavajay May 11, 2026
678140b
Add initial VNtyper HTML report
madhavajay May 11, 2026
3925ef4
Record tool command planning timings
madhavajay May 11, 2026
5fe525f
Triage VNtyper optional modules
madhavajay May 11, 2026
2c3e441
Document Kestrel native port spike
madhavajay May 11, 2026
152a932
Gate VNtyper full pipeline prerequisites
madhavajay May 11, 2026
6b7ec02
Add interactive VNtyper report controls
madhavajay May 11, 2026
d43990d
Gate upstream VNtyper reference tests
madhavajay May 11, 2026
24dd37e
Add VNtyper IGV report session
madhavajay May 11, 2026
677bed0
Port VNtyper scoring unit cases
madhavajay May 11, 2026
984b562
Port VNtyper flagging unit cases
madhavajay May 11, 2026
2eabab2
Port VNtyper motif filtering unit cases
madhavajay May 11, 2026
d1fdddc
Add VNtyper large-data command plan tests
madhavajay May 11, 2026
c9d7ddc
Add VNtyper expected output planning harness
madhavajay May 11, 2026
49fbb68
Add VNtyper external pipeline runner
madhavajay May 11, 2026
fc3c18f
Wire VNtyper expected generator to pipeline runner
madhavajay May 11, 2026
2c26f98
Tighten VNtyper full pipeline gate
madhavajay May 11, 2026
ddcdfea
Add Kestrel jar build helper
madhavajay May 11, 2026
022ff91
Add native indexed BAM fetch support
madhavajay May 11, 2026
11e7ab1
Run VNtyper depth command in external pipeline
madhavajay May 11, 2026
c72225d
Summarize VNtyper depth output in reports
madhavajay May 11, 2026
ebb37ad
Resolve Kestrel jar from ignored test tools
madhavajay May 11, 2026
9448760
Add VNtyper FASTQ Kestrel generation path
madhavajay May 11, 2026
13248fa
Gate generated VNtyper FASTQ expected outputs
madhavajay May 11, 2026
193c5ab
Add native BAM depth summary
madhavajay May 11, 2026
355c3ac
Add native BAM region slicing
madhavajay May 11, 2026
63f5134
Mark selected native replacement milestone complete
madhavajay May 11, 2026
1c3d0fc
Add native BAM FASTQ extraction
madhavajay May 11, 2026
4a8d288
Expose native samtools helpers to Python
madhavajay May 11, 2026
1ae2c79
Wire native samtools into VNtyper runner
madhavajay May 11, 2026
6e3f853
Gate native VNtyper BAM integration test
madhavajay May 11, 2026
53e7bfe
Add template-based BAM FASTQ extraction
madhavajay May 11, 2026
5b71f11
Include unmapped reads in native BAM FASTQ extraction
madhavajay May 11, 2026
bf809eb
Fix native VNtyper BAM FASTQ path
madhavajay May 11, 2026
098ba70
Strengthen VNtyper native BAM report gate
madhavajay May 11, 2026
9d21693
Add VNtyper external BAM gate
madhavajay May 11, 2026
f37ac80
Port Kestrel variant VCF normalization
madhavajay May 11, 2026
173bcdd
Add native Kestrel kmer counter
madhavajay May 11, 2026
cacf754
Add native Kestrel active region types
madhavajay May 11, 2026
cb91ffe
Add native Kestrel active region detector
madhavajay May 11, 2026
0008024
Add native Kestrel haplotype alignment surface
madhavajay May 11, 2026
d40b6f8
Add native Kestrel explicit haplotype caller
madhavajay May 11, 2026
bcb895c
Add native Kestrel kmer haplotype assembler
madhavajay May 11, 2026
ec81c7f
Add native Kestrel reads to VCF path
madhavajay May 11, 2026
68b55a7
Expose native Kestrel sequence caller to Python
madhavajay May 11, 2026
84a8377
Add native Kestrel FASTQ caller path
madhavajay May 11, 2026
294cf9e
Add Kestrel haplotype repeat controls
madhavajay May 11, 2026
fd3966d
Port Kestrel refreader fixtures to Rust
madhavajay May 11, 2026
ec49ea4
Vendor Kestrel publication cases
madhavajay May 11, 2026
5995fd8
Add Kestrel detector end-anchor controls
madhavajay May 11, 2026
55b8f00
Add Kestrel detector recovery decay
madhavajay May 11, 2026
d0339da
Add Kestrel right-scan peak recovery
madhavajay May 11, 2026
8239fa2
Add Kestrel detector scan limit
madhavajay May 11, 2026
4887c6c
Add Kestrel right anchor recovery
madhavajay May 11, 2026
8f216f2
Add Kestrel left peak suppression
madhavajay May 11, 2026
40e4652
Limit Kestrel left open scans
madhavajay May 11, 2026
d3c68b6
Add Kestrel left scan recovery discard
madhavajay May 11, 2026
5d7de35
Split Kestrel detector scan helpers
madhavajay May 11, 2026
2138973
Add Kestrel ambiguous region control
madhavajay May 11, 2026
5425823
Add Kestrel scan max gap control
madhavajay May 11, 2026
daef047
Add Kestrel Java parity gate
madhavajay May 11, 2026
afd9f51
Extend Kestrel Java parity gate
madhavajay May 11, 2026
1051d9e
Document Kestrel sparse-read parity gap
madhavajay May 11, 2026
79aa528
Use read kmer transitions for Kestrel assembly
madhavajay May 11, 2026
5f65bc7
Expand Kestrel Java parity fixtures
madhavajay May 11, 2026
72fc69c
Port Kestrel alignment weight gap limits
madhavajay May 11, 2026
147aea1
Port Kestrel alignment weight parser
madhavajay May 11, 2026
57042d8
Add k20 Kestrel Java parity fixture
madhavajay May 11, 2026
db8c31a
Add Kestrel indel Java parity fixtures
madhavajay May 11, 2026
b62c4c2
Match Kestrel mixed-depth active region DP
madhavajay May 11, 2026
e16a682
Expand Kestrel mixed-depth Java parity
madhavajay May 11, 2026
21bfff0
Add native Kestrel multi-reference calls
madhavajay May 11, 2026
8be6559
Expose multi-reference Kestrel Python wrapper
madhavajay May 11, 2026
11539df
Add Kestrel multi-reference Java parity
madhavajay May 11, 2026
b0ba7e3
Add Kestrel FASTA reference loader
madhavajay May 11, 2026
d8cdf01
Wire native Kestrel into VNtyper runner
madhavajay May 11, 2026
c706f89
Record native Kestrel FASTQ scaling gap
madhavajay May 11, 2026
3c6d17e
Speed up native Kestrel kmer counting
madhavajay May 11, 2026
293c7be
Bound native Kestrel VNtyper runner
madhavajay May 11, 2026
d8109d4
Parse named Kestrel sample columns
madhavajay May 11, 2026
e698f27
Apply VNtyper motif filtering in port
madhavajay May 11, 2026
dd29278
Align VNtyper Kestrel report classification
madhavajay May 11, 2026
02419a6
Add native Kestrel alignment scoring
madhavajay May 11, 2026
872e1c0
Prune native Kestrel haplotypes by alignment score
madhavajay May 11, 2026
dcceff2
Record native Kestrel pruning gap
madhavajay May 11, 2026
14764e2
Cover native Kestrel alternate haplotype pruning
madhavajay May 11, 2026
f666d91
Match Kestrel active-region overlap guard
madhavajay May 11, 2026
918ac3f
Add vendored Rust bioinformatics facades
madhavajay May 13, 2026
a7d1f09
Test native bcftools Python wrapper
madhavajay May 13, 2026
d9a93c6
Expand native bcftools facade
madhavajay May 13, 2026
3d38af0
Bind native bcftools runtime methods
madhavajay May 13, 2026
a8d56fa
Document BioScript library dependency graph
madhavajay May 13, 2026
56c8564
Document Python backend policies
madhavajay May 13, 2026
89b8ad8
Cover missing native Python wrappers
madhavajay May 13, 2026
839d940
Add Kestrel native file runner
madhavajay May 13, 2026
e6bf32f
Use Kestrel run facade in VNtyper port
madhavajay May 13, 2026
3d628bd
Add minimal VNtyper port entrypoints
madhavajay May 13, 2026
0efa1bb
Wire native bcftools sort into VNtyper FASTQ path
madhavajay May 13, 2026
7888f42
Add native bcftools switch to VNtyper BAM path
madhavajay May 13, 2026
4c5d58a
Wire samtools-rs into BioScript libs
madhavajay May 13, 2026
1ff5f88
Add tiny Samtools facade fixture test
madhavajay May 13, 2026
5cb5131
Add Python native facade smoke tests
madhavajay May 13, 2026
3f42575
Bind native Samtools methods in runtime
madhavajay May 13, 2026
c2fda4e
Expose familiar Samtools command facades
madhavajay May 13, 2026
e88d6ac
Update runtime TODO status
madhavajay May 13, 2026
93e60ec
Update library support TODO status
madhavajay May 13, 2026
a0d0e5d
Add bcftools view facade
madhavajay May 13, 2026
0eeaec4
Update VNtyper facade TODO status
madhavajay May 13, 2026
12c29a0
Route pyfaidx facade through htslib-rs
madhavajay May 13, 2026
686c24e
Expose native pyfaidx Python facade
madhavajay May 13, 2026
64c437e
Verify all-native VNtyper BAM path
madhavajay May 13, 2026
c591290
Centralize VNtyper port config
madhavajay May 13, 2026
7e189b8
Cover BCFtools native error path
madhavajay May 13, 2026
5a4b969
Route pysam fetch through htslib facade
madhavajay May 13, 2026
74cf060
Update library support TODO status
madhavajay May 13, 2026
ca770ec
Add focused upstream facade parity tests
madhavajay May 13, 2026
7208235
Finish library support TODO
madhavajay May 13, 2026
981b8b3
Reset TODO for native VNtyper port
madhavajay May 13, 2026
a4f5d26
Establish native VNtyper baseline gates
madhavajay May 13, 2026
2e8567b
Add runnable VNtyper BioScript plan
madhavajay May 13, 2026
b2d3742
Add native FASTQ VNtyper parity gate
madhavajay May 13, 2026
4da9c0f
Add VNtyper native facade test
madhavajay May 13, 2026
23d41c8
Add VNtyper runtime program test
madhavajay May 13, 2026
b8a2091
Add VNtyper FASTQ BioScript plan
madhavajay May 13, 2026
ad6cd22
Split all-native BAM parity prerequisites
madhavajay May 13, 2026
dbf94f3
Record FASTQ native parity gap
madhavajay May 13, 2026
b911d76
Add VNtyper VCF parser tests
madhavajay May 13, 2026
8d31f5c
Tighten VNtyper parity skip gates
madhavajay May 13, 2026
fcce287
Mark tool planners as planning APIs
madhavajay May 13, 2026
e276c62
Record BCFtools view filter decision
madhavajay May 13, 2026
86a3901
Record FASTQ fixture expectation status
madhavajay May 13, 2026
d2db51e
Record VNtyper HTML report gate
madhavajay May 13, 2026
1fcb4ee
Fix native samtools FASTQ singleton parity
madhavajay May 13, 2026
27f515f
Record VNtyper output parity deltas
madhavajay May 13, 2026
4d5ed5c
Reconfirm native FASTQ Kestrel parity gap
madhavajay May 13, 2026
39e0e33
Add Kestrel VNtyper FASTQ parity gate
madhavajay May 13, 2026
21dcd86
Record native FASTQ execution path
madhavajay May 13, 2026
fdd58cb
Tighten VNtyper upstream test map
madhavajay May 13, 2026
01dbc86
Default matching runtime tool calls to native facades
madhavajay May 13, 2026
ff8f9d2
Default samtools runtime calls to native facades
madhavajay May 13, 2026
dec79f9
Add native Kestrel runtime execution
madhavajay May 13, 2026
26f92e6
Record facade replacement regression coverage
madhavajay May 13, 2026
180b02d
Run VNtyper FASTQ BioScript through native facades
madhavajay May 13, 2026
2263b6c
Record VNtyper scaffold test retention
madhavajay May 13, 2026
2843e4a
Record native FASTQ runtime coverage in VNtyper map
madhavajay May 13, 2026
97df779
Add VNtyper Kestrel call rows to vcf facade
madhavajay May 13, 2026
054878b
Build VNtyper report JSON from vcf facade
madhavajay May 13, 2026
850d3db
Materialize VNtyper Kestrel TSV in runtime slice
madhavajay May 13, 2026
bc88c56
Add native BAM VNtyper runtime slice
madhavajay May 13, 2026
03f9383
Thread VNtyper report context through vcf facade
madhavajay May 13, 2026
aadef16
Promote VNtyper BAM entry point to native runtime
madhavajay May 13, 2026
d3e4af2
Close VNtyper syntax and upstream map TODOs
madhavajay May 13, 2026
92f3f7e
Refresh native FASTQ parity blocker evidence
madhavajay May 13, 2026
6913323
Record Kestrel engine FASTQ parity failure
madhavajay May 13, 2026
70e4b1c
Parameterize VNtyper Kestrel runtime settings
madhavajay May 13, 2026
0393450
Document native FASTQ false-positive diagnostics
madhavajay May 13, 2026
99c8f1d
Add native FASTQ parity failure context
madhavajay May 13, 2026
5d381a8
Add normalized FASTQ output parity fingerprints
madhavajay May 13, 2026
9b93635
Audit remaining TODO parity blockers
madhavajay May 13, 2026
e096caa
Document VNtyper Kestrel limit parity gap
madhavajay May 13, 2026
f8a56f7
Add normalized BAM output parity gate
madhavajay May 13, 2026
5fe688d
Match VNtyper motif filters in vcf facade
madhavajay May 14, 2026
c69c318
Record Kestrel parity diff examples
madhavajay May 14, 2026
2a5cee5
Document Kestrel graph traversal parity gap
madhavajay May 14, 2026
40f86d1
Record discarded Kestrel graph traversal attempt
madhavajay May 14, 2026
9ef3332
Document Kestrel parity artifact retention
madhavajay May 14, 2026
07758da
Record VNtyper Kestrel delta summary
madhavajay May 14, 2026
01b4f69
Document Kestrel parity limit overrides
madhavajay May 14, 2026
6ae6abb
Record upstream-limit Kestrel timeout
madhavajay May 14, 2026
7a5ea8f
Document Kestrel VNtyper parity progress
madhavajay May 14, 2026
f9aecf4
Update fix-kestrel.md with kmercount filter findings
madhavajay May 14, 2026
cfeedf7
Update fix-kestrel.md with progress summary table and verification notes
madhavajay May 14, 2026
c1a04b1
Add break-cause counter findings and next-step hypothesis to fix-kest…
madhavajay May 14, 2026
37a52c4
Record seq_limit knob experiment results in fix-kestrel.md
madhavajay May 14, 2026
9dbb2a7
Document diagnostic test findings and next-step options
madhavajay May 14, 2026
5672c9c
Document haplotype_built and min_depth investigation in fix-kestrel.md
madhavajay May 14, 2026
48280a7
Document iter 25-40 cycle pattern in J-R outer iters
madhavajay May 14, 2026
799d335
Record aggressive dedup experiment and session summary
madhavajay May 14, 2026
fac2b16
Record shape dedup experiment and exhaustive knob inventory
madhavajay May 14, 2026
26cafc8
Record cap-sweep diagnostic: bug exists at every cap level
madhavajay May 14, 2026
ad522bb
Pinpoint iter 4 divergence with Java instrumentation + Rust restore t…
madhavajay May 14, 2026
e2d43f1
Trace shows Java's iter 4 = Rust's iter 6 (G-alt restore)
madhavajay May 14, 2026
95968c2
ROOT CAUSE FIX: Java's nState accounting (save -1 on evict, NOT on pop)
madhavajay May 14, 2026
7d1eb7c
Analyze post-fix parity gap: 1028 missing has DP value differences
madhavajay May 14, 2026
434e1b6
Session conclusion: root cause solved, 78% of parity gap closed
madhavajay May 14, 2026
b077aad
Consolidate session summary in fix-kestrel.md
madhavajay May 14, 2026
ccca069
Advance kestrel-rs to merged parity fix; add APOL1 pysam proof
madhavajay May 15, 2026
de8f672
TODO: set test-vntyper.sh Java<->Rust parity as current priority
madhavajay May 15, 2026
76a1c1a
test-vntyper.sh: Java↔Rust VNtyper output parity tool
madhavajay May 15, 2026
5a67eb6
VNtyper: all-fixture upstream correctness + Java↔Rust parity
madhavajay May 15, 2026
7993469
Regenerate rust/Cargo.lock after merging origin/main
madhavajay May 15, 2026
dc3d71b
Update VNtyper Rust tests for kestrel-rs Java-parity output
madhavajay May 15, 2026
96483cf
Add VNtyper advanced-assay example + runtime support for native assays
madhavajay May 15, 2026
b8b19bb
updating libs
madhavajay May 19, 2026
04dba67
adding repoverse
madhavajay May 19, 2026
b0932e0
remove
madhavajay May 19, 2026
7e89281
adding repoverse
madhavajay May 19, 2026
c59bb3f
updating samtools
madhavajay May 19, 2026
381997e
adding extra noodles checkout
madhavajay May 27, 2026
ae29304
updating submodule
madhavajay May 27, 2026
5ec071f
fixing path
madhavajay May 27, 2026
fc18e99
fixes
madhavajay May 27, 2026
69beb85
Merge remote-tracking branch 'origin/main' into madhava/libs
madhavajay May 27, 2026
8e65283
Bump htslib-rs/bcftools-rs/samtools-rs to current main; cargo fmt
madhavajay May 27, 2026
3e4b32c
Untrack macOS-only cargo env; fix clippy::pedantic warnings
madhavajay May 27, 2026
2 changes: 2 additions & 0 deletions .cargo/config.toml
@@ -0,0 +1,2 @@
[net]
git-fetch-with-cli = true
5 changes: 5 additions & 0 deletions .gitignore
@@ -81,3 +81,8 @@ examples/**/.nextflow*
# Read-only reference material (htslib/rust-htslib source trees used during
# the CRAM/noodles migration; not part of the build).
context/
/repos/

# Local-only cargo env overrides (macOS Xcode CC/CXX/SDKROOT etc.).
# Belongs per-developer; recreate locally if your toolchain needs it.
rust/.cargo/config.toml
30 changes: 29 additions & 1 deletion .gitmodules
@@ -5,4 +5,32 @@
[submodule "noodles"]
path = noodles
url = git@github.com:madhavajay/noodles.git
-branch = madhava/streaming-slice-records
+branch = madhava/bioscript
[submodule "vendor/python/pysam"]
path = vendor/python/pysam
url = https://github.com/pysam-developers/pysam.git
[submodule "vendor/python/pyfaidx"]
path = vendor/python/pyfaidx
url = https://github.com/mdshw5/pyfaidx.git
[submodule "ports/vntyper/vntyper"]
path = ports/vntyper/vntyper
url = https://github.com/madhavajay/VNtyper.git
[submodule "ports/vntyper/kestrel"]
path = ports/vntyper/kestrel
url = https://github.com/paudano/kestrel.git
[submodule "ports/vntyper/kescases"]
path = ports/vntyper/kescases
url = https://github.com/paudano/kescases.git
[submodule "vendor/rust/kestrel-rs"]
path = vendor/rust/kestrel-rs
url = git@github.com:madhavajay/kestrel-rs.git
[submodule "vendor/rust/htslib-rs"]
path = vendor/rust/htslib-rs
url = git@github.com:madhavajay/htslib-rs.git
branch = main
[submodule "vendor/rust/bcftools-rs"]
path = vendor/rust/bcftools-rs
url = git@github.com:madhavajay/bcftools-rs.git
[submodule "vendor/rust/samtools-rs"]
path = vendor/rust/samtools-rs
url = git@github.com:madhavajay/samtools-rs.git
34 changes: 34 additions & 0 deletions .repoverse.yaml
@@ -0,0 +1,34 @@
version: 1
defaults:
remote: github
revision: main
scheme: ssh
remotes:
github:
host: github.com
projects:
- name: madhavajay/htslib-rs
path: repos/htslib-rs
revision: main
- name: madhavajay/noodles
path: repos/noodles
revision: madhava/bioscript
provides:
- madhavajay/htslib-rs
- madhavajay/noodles
links:
- repo: madhavajay/htslib-rs
at: vendor/rust/bcftools-rs/repos/htslib-rs
branch: main
- repo: madhavajay/htslib-rs
at: vendor/rust/htslib-rs
branch: main
- repo: madhavajay/htslib-rs
at: vendor/rust/samtools-rs/repos/htslib-rs
branch: main
- repo: madhavajay/noodles
at: noodles
branch: madhava/bioscript
- repo: madhavajay/noodles
at: repos/htslib-rs/repos/noodles
branch: madhava/bioscript
877 changes: 877 additions & 0 deletions TODO.md

Large diffs are not rendered by default.

97 changes: 97 additions & 0 deletions bioscripts/apol1-new.py
@@ -0,0 +1,97 @@
from bioscript import pysam


G1_SITE_1 = bioscript.variant(
    rsid="rs73885319",
    grch37="22:36661906-36661906",
    grch38="22:36265860-36265860",
    ref="A",
    alt="G",
    kind="snp",
)

G1_SITE_2 = bioscript.variant(
    rsid="rs60910145",
    grch37="22:36662034-36662034",
    grch38="22:36265988-36265988",
    ref="T",
    alt="G",
    kind="snp",
)

G2_SITE = bioscript.variant(
    rsid=["rs71785313", "rs1317778148", "rs143830837"],
    grch37="22:36662046-36662051",
    grch38="22:36266000-36266005",
    ref="I",
    alt="D",
    kind="deletion",
    deletion_length=6,
    motifs=["TTATAA", "ATAATT"],
)


def count_char(text, needle):
    if text is None:
        return 0
    total = 0
    for ch in text:
        if ch == needle:
            total = total + 1
    return total


def count_non_ref(text, ref):
    if text is None:
        return 0
    total = 0
    for ch in text:
        if ch != ref and ch != "-":
            total = total + 1
    return total


def classify_apol1(genotypes):
    site1 = genotypes.lookup_variant(G1_SITE_1)
    site2 = genotypes.lookup_variant(G1_SITE_2)
    g2 = genotypes.lookup_variant(G2_SITE)

    if site1 is None and site2 is None and g2 is None:
        return "G-/G-"

    d_count = count_char(g2, "D")
    site1_variants = count_non_ref(site1, "A")
    site2_variants = count_non_ref(site2, "T")

    has_g1 = site1_variants > 0 and site2_variants > 0
    if has_g1:
        g1_total = site1_variants + site2_variants
    else:
        g1_total = 0

    if d_count == 2:
        return "G2/G2"
    if d_count == 1:
        if g1_total >= 2:
            return "G2/G1"
        return "G2/G0"
    if g1_total == 4:
        return "G1/G1"
    if g1_total >= 2:
        return "G1/G0"
    return "G0/G0"


def main():
    genotypes = bioscript.load_genotypes(input_file)
    status = classify_apol1(genotypes)
    rows = [{
        "participant_id": participant_id,
        "apol1_status": status,
    }]
    bioscript.write_tsv(output_file, rows)
    print(status)


if __name__ == "__main__":
    main()
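
The decision rules in `classify_apol1` can be exercised without the BioScript runtime. The sketch below is a minimal standalone version, assuming each `lookup_variant` call returns a short genotype string such as "AG" or "DI", or `None` when the site is absent. That return shape is inferred from `count_char` and `count_non_ref` and is not confirmed by this diff. The name `classify_from_strings` is hypothetical.

```python
def count_char(text, needle):
    """Count occurrences of needle in text; None counts as zero."""
    if text is None:
        return 0
    return sum(1 for ch in text if ch == needle)


def count_non_ref(text, ref):
    """Count alleles that are neither the reference base nor a '-' gap."""
    if text is None:
        return 0
    return sum(1 for ch in text if ch != ref and ch != "-")


def classify_from_strings(site1, site2, g2):
    """Hypothetical stand-in for classify_apol1 that takes raw genotype strings."""
    if site1 is None and site2 is None and g2 is None:
        return "G-/G-"
    d_count = count_char(g2, "D")
    site1_variants = count_non_ref(site1, "A")
    site2_variants = count_non_ref(site2, "T")
    # G1 alleles only count when both G1 sites carry a non-reference allele.
    has_g1 = site1_variants > 0 and site2_variants > 0
    g1_total = site1_variants + site2_variants if has_g1 else 0
    if d_count == 2:
        return "G2/G2"
    if d_count == 1:
        return "G2/G1" if g1_total >= 2 else "G2/G0"
    if g1_total == 4:
        return "G1/G1"
    if g1_total >= 2:
        return "G1/G0"
    return "G0/G0"
```

Under these assumptions, a sample that is non-reference at only one of the two G1 sites, or that is missing one of them, is classified as carrying no G1 allele, because `has_g1` requires both sites to be non-reference.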
64 changes: 64 additions & 0 deletions bioscripts/apol1-pysam-proof.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
from bioscript import pysam


APOL1_SITES = [
    {
        "key": "G1_SITE_1",
        "chrom": "22",
        "start": 36265859,
        "stop": 36265860,
        "ref": "A",
        "alt": "G",
    },
    {
        "key": "G1_SITE_2",
        "chrom": "22",
        "start": 36265987,
        "stop": 36265988,
        "ref": "T",
        "alt": "G",
    },
    {
        "key": "G2_SITE",
        "chrom": "22",
        "start": 36265999,
        "stop": 36266005,
        "ref": "TTATAA",
        "alt": "<DEL:6>",
    },
]


def count_region_reads(bam, site):
    total = 0
    for read in bam.fetch(site["chrom"], site["start"], site["stop"]):
        if not read.is_unmapped:
            total = total + 1
    return total


def main():
    bam = pysam.AlignmentFile(
        input_file,
        "rc",
        reference_filename=reference_file,
        index_filename=input_index,
    )
    rows = []
    for site in APOL1_SITES:
        rows.append(
            {
                "participant_id": participant_id,
                "variant_key": site["key"],
                "chrom": site["chrom"],
                "start": str(site["start"]),
                "stop": str(site["stop"]),
                "depth": str(count_region_reads(bam, site)),
                "proof_status": "region_fetch_only",
            }
        )
    bioscript.write_tsv(output_file, rows)


if __name__ == "__main__":
    main()
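
The `start`/`stop` values in `APOL1_SITES` match what you get by converting the 1-based inclusive GRCh38 regions from `bioscripts/apol1-new.py` into the 0-based half-open intervals that `pysam.AlignmentFile.fetch` expects. A minimal sketch of that conversion follows. The helper name `region_to_fetch_args` is hypothetical and is not part of BioScript or pysam.

```python
def region_to_fetch_args(region):
    """Convert 'chrom:start-end' (1-based, inclusive) to (chrom, start, stop)
    in 0-based half-open form."""
    chrom, span = region.split(":")
    first, last = span.split("-")
    # Only the start shifts by one; the inclusive 1-based end equals the
    # exclusive 0-based stop.
    return chrom, int(first) - 1, int(last)
```

Note that `count_region_reads` counts mapped reads overlapping each interval, so the `depth` column is a read count per region rather than a per-base depth, which is consistent with the `region_fetch_only` status label.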
84 changes: 84 additions & 0 deletions bioscripts/examples/vntyper/assay.yaml
@@ -0,0 +1,84 @@
schema: bioscript:assay:1.0
version: "1.0"
name: vntyper_muc1
label: MUC1 VNTR (VNtyper)
summary: >
Advanced assay that genotypes the MUC1 VNTR frameshift (ADTKD-MUC1) from an
aligned genome. It slices the MUC1 region, extracts reads, runs mapping-free
Kestrel genotyping, and applies VNtyper post-processing to call the
pathogenic cytosine insertion. Requires aligned input (BAM/CRAM) — it cannot
run from SNP-chip or VCF genotypes.
tags:
- type:risk
- gene:MUC1
- kind:vntr
support:
input:
- bam
- cram
members:
- kind: variant
path: muc1-vntr.yaml
version: "1.0"
analyses:
- id: vntyper_muc1
kind: bioscript
path: vntyper.py
output_format: tsv
label: MUC1 VNTR genotype
derived_from:
- muc1-vntr.yaml
assets:
- id: muc1_reference
path: assets/muc1_motifs.fa
emits:
- key: vntyper_outcome
label: MUC1 VNTR outcome
value_type: string
format: badge
- key: vntyper_status
label: MUC1 VNTR status
value_type: string
- key: vntyper_confidence
label: Kestrel confidence
value_type: string
format: badge
- key: vntyper_variant
label: Called variant
value_type: string
- key: vntyper_alt_depth
label: Alternate-variant depth
value_type: string
logic:
source:
name: VNtyper / Kestrel
url: https://github.com/hassansaei/VNtyper
description: >
The MUC1 region is sliced from the aligned input, converted to FASTQ,
and genotyped with mapping-free Kestrel against the MUC1 motif
reference. VNtyper motif/frameshift/confidence post-processing selects
the called variant; vntyper_status is positive when a valid
High_Precision* / High_Precision frameshift passes the filters.
findings:
- schema: bioscript:pgx-label:1.0
id: muc1_vntr_positive_finding
label: MUC1 VNTR pathogenic frameshift detected
authority_type: clinical_annotation
binding:
source: analysis
analysis_id: vntyper_muc1
key: vntyper_status
operator: equals
value: positive
regulatory_sources:
- "ClinVar"
pgx_action_level: "Informative"
evidence:
source: VNtyper
kind: method_annotation
id: ADTKD-MUC1
url: https://github.com/hassansaei/VNtyper
notes: >
A positive call indicates a MUC1 VNTR frameshift consistent with
ADTKD-MUC1; confirm with an orthogonal method (e.g. SNaPshot or
long-read sequencing).
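
The `findings` entry binds to the `vntyper_status` value emitted by the `vntyper_muc1` analysis and fires when it equals `positive`. How BioScript evaluates bindings is not shown in this diff, so the sketch below is only one plausible reading: `binding_matches` and the `outputs_by_analysis` mapping (analysis id to a list of emitted row dicts) are hypothetical names introduced for illustration.

```python
def binding_matches(binding, outputs_by_analysis):
    """Evaluate one finding binding against emitted analysis rows.

    Hypothetical helper; the real BioScript evaluation logic may differ.
    """
    if binding["source"] != "analysis":
        raise ValueError("only analysis-sourced bindings are sketched here")
    if binding["operator"] != "equals":
        raise ValueError("only the 'equals' operator is sketched here")
    rows = outputs_by_analysis.get(binding["analysis_id"], [])
    # The finding applies when any emitted row carries the expected value.
    return any(row.get(binding["key"]) == binding["value"] for row in rows)


# Binding copied from the assay's findings entry.
BINDING = {
    "source": "analysis",
    "analysis_id": "vntyper_muc1",
    "key": "vntyper_status",
    "operator": "equals",
    "value": "positive",
}
```

With this reading, a row such as `{"vntyper_status": "positive"}` under `vntyper_muc1` would trigger the finding, and a missing analysis or any other status value would not.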