maria-kromany/Finding-ORFs-with-Python

This project identifies open reading frames (ORFs) in DNA sequences: stretches that could code for proteins and are therefore candidate gene regions. An ORF begins with the start codon "ATG" and ends at the first in-frame stop codon ("TAA", "TAG", or "TGA"). Locating these segments is a first step toward studying gene function and better understanding a DNA sequence.
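To make the definition concrete, here is a minimal sketch of extracting a single ORF from a string. The name `first_orf` is hypothetical and is not one of the repository's functions; it only considers the first "ATG" it finds and reads codon by codon until a stop codon appears.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_orf(dna):
    """Return the ORF that begins at the first ATG, or "" if no in-frame stop follows.

    Hypothetical helper for illustration only; later ATGs are not tried.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return ""
    # Step through the sequence three bases at a time, staying in frame with the ATG.
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return dna[start:i + 3]  # include the stop codon
    return ""

print(first_orf("ccATGAAATGGTAAgg"))  # ATGAAATGGTAA
```

Note that the "ATG" embedded in "AAATGG" is ignored because it is not in frame with the start codon.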

The key tasks include:

- Parsing DNA sequences from FASTA files, using the read_one_seq_fasta function to extract sequence data.
- Calculating GC content via the gc_content function, which computes the percentage of guanine (G) and cytosine (C) bases. This helps filter biologically relevant ORFs.
- Identifying ORFs in both DNA strands. Functions like get_orf, one_frame, forward_frames, and reverse_complement work together to locate ORFs across multiple reading frames.
- Applying thresholds with the gene_finder function to filter ORFs based on minimum length and GC content, narrowing down significant gene candidates.
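The tasks above could fit together roughly as follows. This is a sketch, not the repository's actual code: the function names come from the description, but the signatures, the return types, the choice to report non-overlapping ORFs within a frame, and the `min_len`/`min_gc` parameter names are all assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def read_one_seq_fasta(path):
    """Read a single-sequence FASTA file, skipping the '>' header line."""
    with open(path) as handle:
        lines = [line.strip() for line in handle if not line.startswith(">")]
    return "".join(lines).upper()

def gc_content(dna):
    """Percentage (0-100) of bases that are G or C."""
    if not dna:
        return 0.0
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)

def reverse_complement(dna):
    """Complement each base, then reverse, giving the opposite strand 5'->3'."""
    return dna.translate(COMPLEMENT)[::-1]

def get_orf(dna):
    """Given a sequence starting at ATG, return it up to and including
    the first in-frame stop codon, or "" if there is none."""
    for i in range(0, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return dna[:i + 3]
    return ""

def one_frame(dna):
    """Collect non-overlapping ORFs in a single reading frame (assumed behavior)."""
    orfs = []
    i = 0
    while i <= len(dna) - 3:
        if dna[i:i + 3] == "ATG":
            orf = get_orf(dna[i:])
            if orf:
                orfs.append(orf)
                i += len(orf)  # resume scanning after this ORF
                continue
        i += 3
    return orfs

def forward_frames(dna):
    """ORFs from the three frames (offsets 0, 1, 2) of one strand."""
    return [orf for offset in range(3) for orf in one_frame(dna[offset:])]

def gene_finder(dna, min_len, min_gc):
    """Scan both strands and keep ORFs meeting the length and GC thresholds."""
    candidates = forward_frames(dna) + forward_frames(reverse_complement(dna))
    return [orf for orf in candidates
            if len(orf) >= min_len and gc_content(orf) >= min_gc]

print(gene_finder("ATGGGCTAA", min_len=9, min_gc=30.0))  # ['ATGGGCTAA']
```

Scanning the reverse complement with the same forward-frame routine covers all six reading frames without needing a separate backward scanner.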

The identified ORFs can then be cross-checked against databases such as GenBank to look up their biological functions, assess their relevance as genes, and check whether they occur in the human genome, connecting the computational analysis to real genetic insights.
