Data analysis homework in Python
Biopython
1. By accident, a bioinformatics lab mixed up DNA sequences in a single data file for two
organisms: a fruit fly (Drosophila melanogaster) and a grape (Vitis vinifera). Your goal is
to figure out which sequences belong to each organism while also learning about the NCBI
databases, specifically BLAST. Create a Python program that reads the sequences from a
text file (one per line), performs a BLAST search if not previously done, stores the results
of each search in a separate file, and then performs analytics on the search results to help
solve our sequence mix-up problem. Details:
a. Put a comment at the top of the Python file called lab5q1 with your name and student
number(s).
b. Download the data file called input.txt and place it in your local directory for reading.
Each line of the data file contains a DNA sequence. There are 10 sequences in the file.
c. (1 mark) Read each line of the data file and print the first 20 characters in the sequence
and the length of the query sequence to the console.
# Sample output for a single sequence:
Sequence: GGCTGCGGAGACGTTGAAGG Length: 560
d. (1/2 mark) Define a counter called count to keep track of how many sequences have
been processed (the total count should be 10 once your program is complete).
e. (2 marks) Using a try-except clause, look for a previously created BLAST result file
named dna_lab5_<count>.xml (for example, dna_lab5_1.xml). Start the search
from count=1.
f. (1 mark) If the file exists, open it and print “Using saved file”.
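
Steps c through f can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a model answer: the function names describe_sequence, result_filename and process_sequences, the result_dir parameter, and the returned status list are all hypothetical and were chosen only for this example. Catching FileNotFoundError is one reasonable reading of "try-except clause". The BLAST search itself is left as a placeholder comment.

```python
from pathlib import Path


def describe_sequence(sequence):
    """Build the console line for step c: first 20 characters plus total length."""
    return f"Sequence: {sequence[:20]} Length: {len(sequence)}"


def result_filename(count):
    """Name of the saved BLAST result file for the count-th sequence (step e)."""
    return f"dna_lab5_{count}.xml"


def process_sequences(input_path="input.txt", result_dir="."):
    """Read one sequence per line, count them, and check for saved result files.

    Returns (count, statuses); statuses is a hypothetical helper list that
    records "saved" or "missing" for each sequence, in file order.
    """
    count = 0  # step d: number of sequences processed so far
    statuses = []
    with open(input_path) as handle:
        for line in handle:
            sequence = line.strip()
            if not sequence:
                continue  # ignore blank lines (an assumption about the input)
            count += 1  # the first sequence gets count=1
            print(describe_sequence(sequence))
            result_path = Path(result_dir) / result_filename(count)
            try:
                result_handle = open(result_path)  # step e: look for a saved file
            except FileNotFoundError:
                # No saved result yet: the BLAST search and the code that
                # stores its output would go here (not shown in this sketch).
                statuses.append("missing")
            else:
                with result_handle:
                    print("Using saved file")  # step f
                    statuses.append("saved")
    return count, statuses
```

Calling process_sequences() with no arguments reads input.txt from the current directory and looks for the result files there as well. With the assignment's data file, the returned count should be 10.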