Description
- exploit knowledge of the fundamental principles of molecular biology to analyze contributions of DNA and proteins to human physiology and behavior
- identify, select, and query the appropriate database to best answer the bioinformatics question at hand, and effectively and accurately interpret the outcome
For the gene annotation assignment, you will be assigned an unknown DNA sequence that you will have to identify and annotate by completing the following tasks. Your unknown gene can be found by using the following identification numbers: NCBI Gene ID: 3043 and Ensembl gene ID: ENSG00000244734.
You must include screenshots at each step.
- Identify the open reading frame (ORF). Describe the strand (positive or negative) and the reading frame (1, 2, or 3), and explain what each means.
- List the top three hits, providing statistics such as E-value, identity score, and alignment length. Use the top hit to describe how the query aligns with the subject, defining terms as necessary. Use a screenshot to support your explanation.
- Coding sequence (CDS): State the coordinates of the start and stop codons, or state that the gene is not protein coding.
- Genomic location: State the chromosome on which the gene is found and its position on that chromosome.
- List of exons
- What are the gene name, organism, and gene ID for this DNA sequence? You can use any of the genome browsers already covered. Explain why you concluded that the sequence belongs to the gene you have named.
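To make the strand and frame terminology in the ORF task concrete, here is a minimal Python sketch of a six-frame ORF scan. It is an illustration only and not the tool you are expected to use: the function names, the toy sequence, and the frame numbering (1 to 3, counted from the first base of whichever strand is being read) are assumptions made for this example, and database tools may report frames and coordinates differently.

```python
# Illustrative six-frame ORF scan; names and conventions here are assumptions,
# not the behavior of any particular database tool.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the negative-strand sequence, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=1):
    """Find ATG...stop ORFs in all six frames.

    Returns tuples (strand, frame, start, end, length_nt), longest first.
    start and end are 1-based positions on the strand that was scanned,
    and end includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            # Step through the sequence one codon at a time in this frame.
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3  # codons before the stop
                    if n_codons >= min_codons:
                        orfs.append((strand, frame + 1, start + 1, i + 3, i + 3 - start))
                    start = None  # look for the next ORF in this frame
    return sorted(orfs, key=lambda orf: orf[4], reverse=True)


# Toy sequence (not the assigned gene): one ORF on the positive strand, frame 3.
toy = "CCATGAAATTTTAGGG"
for orf in find_orfs(toy):
    print(orf)
```

In the toy sequence the ATG begins at the third base, so the ORF is read in frame 3 of the positive strand. The same logic applied to the reverse complement gives the negative-strand frames.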
Do not cut and paste your database results to report your findings. Write two to three paragraphs reporting the information you have obtained about this gene. Imagine this is a report you will present to an employer or a group of researchers interested in this gene.
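When you describe the alignment statistics in your report, it may help to see how the identity score relates to the aligned sequences. The sketch below assumes you have copied a gapped query line and subject line from an alignment; the function name and the example strings are made up for illustration. It counts identities, mismatches, and gaps, and reports percent identity as identities divided by the alignment length. The E-value cannot be derived this way, because it also depends on the scoring scheme and the size of the database searched.

```python
def alignment_stats(aligned_query, aligned_subject):
    """Summarize a pairwise alignment given as two gapped strings of equal length."""
    if len(aligned_query) != len(aligned_subject) or not aligned_query:
        raise ValueError("aligned sequences must be non-empty and of equal length")

    pairs = list(zip(aligned_query.upper(), aligned_subject.upper()))
    # A column is a gap if either sequence has '-'; otherwise it is an
    # identity (same base) or a mismatch (different bases).
    gaps = sum(1 for q, s in pairs if q == "-" or s == "-")
    identities = sum(1 for q, s in pairs if q == s and q != "-")
    length = len(pairs)
    return {
        "length": length,
        "identities": identities,
        "mismatches": length - identities - gaps,
        "gaps": gaps,
        "percent_identity": round(100 * identities / length, 1),
    }


# Made-up alignment lines, not results for the assigned gene.
stats = alignment_stats("ATGGTGCA-CTG", "ATGGTGCATCTA")
print(stats)
```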