Getting Python to read a text file in chunks of 3 (codons) and give me an output
I have a text file containing 3 columns: the stop codon, the skipping context, and the sequence of 102 bases that comes after the skipping context. It looks a bit like this:
tag gttagct ctcgtggtcctcaaggactcagaaaccaggctcgaggcctatcccagcaagtgctgctctgctctgcccaccctgggttctgcattcctatgggtgaccc
tag gttagct cttattcccagtgccagctttctctcctcacatcctcataatggatgctgactgtgttgggggacagaagggacttggcagagctttgctcatgccactc
tag gttagct ctattgtgtaactgagcaattcttttcactcttgtgactatctcagtcctctgctgttttgtaactggtttacctctatagtttatttatttttaaatta
etc...
I want to know how I can write a program that reads the 3rd column of the text file (i.e. the 102-base sequence). It needs to read the sequence in chunks of three, pick out any stop codons ('tag', 'tga', or 'taa'), and create a list or table or similar that tells me whether each sequence contains any of these stop codons and, if so, how many.
So far I have got Python to read the 3rd column of the text file:
infile = open('test stop codon plus 102.txt', 'r')
outfile = open('tag plus 102 reading inframe.txt', 'w')
for line in infile:
    # Drop the trailing newline, then split the line on tabs
    parts = line.rstrip('\n').split('\t')
    stopcodon = parts[0]
    skippingcontext = parts[1]
    plus102 = parts[2]
But I'm not sure where to go next.
Thanks in advance!
To read the 102-nt sequence three bases at a time:
by3 = [plus102[i:i+3] for i in range(0, len(plus102), 3)]
To find the position (in the sequence) of any stop codons in it:
stops = [(3*i, x) for i, x in enumerate(by3) if x in ["tag", "tga", "taa"]]
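Since the question also asks how many stop codons each sequence contains, here is a minimal sketch of one way to count them, assuming the sequences are lowercase and read in frame from position 0. The name count_stops is made up for illustration and is not part of the original code.

```python
from collections import Counter

STOP_CODONS = ("tag", "tga", "taa")

def count_stops(seq):
    """Count in-frame stop codons in seq, reading from position 0 in steps of 3."""
    codons = [seq[i:i+3] for i in range(0, len(seq), 3)]
    # Counter tallies how many times each stop codon occurs
    return Counter(c for c in codons if c in STOP_CODONS)

counts = count_stops("atgtagccctgataa")
print(sum(counts.values()), dict(counts))
```

Inside the loop over the file you would call count_stops(plus102) for each line; sum(counts.values()) gives the total per sequence and counts["tag"] the number of one particular codon.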
Do you need to consider phase also?
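If phase here means reading frame, something like the following rough sketch would scan all three frames rather than only the one starting at position 0. The function name stops_by_frame is hypothetical, and lowercase input is assumed.

```python
STOP_CODONS = {"tag", "tga", "taa"}

def stops_by_frame(seq):
    """Return {frame: [(position, codon), ...]} for reading frames 0, 1 and 2."""
    result = {}
    for frame in range(3):
        hits = []
        # Stop at len(seq) - 2 so that only complete codons are examined
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i+3]
            if codon in STOP_CODONS:
                hits.append((i, codon))
        result[frame] = hits
    return result

print(stops_by_frame("ataggtgac"))
```

If only the in-frame codons matter, result[0] is the same information as the stops list above.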
To write the results to a file:
g = open("outfile.txt", "w")
for (i, x) in stops:
    g.write("stop codon " + x + " found @ position " + str(i) + "\n")
g.close()
You may also want to consider string formatting, tab-delimited output (see join), etc.
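As a rough illustration of the tab-delimited idea, the sketch below builds one output line per sequence with str.join. The function name format_row, the column layout and the codon@position notation are all assumptions made for the example, not something the question specifies.

```python
import sys

def format_row(seq_id, stops):
    """Build one tab-delimited line: id, number of stops, then codon@position entries."""
    fields = [str(seq_id), str(len(stops))]
    fields.extend("{0}@{1}".format(codon, pos) for pos, codon in stops)
    return "\t".join(fields) + "\n"

# stops has the same shape as the list built above: (position, codon) pairs
stops = [(3, "tag"), (9, "tga")]
sys.stdout.write(format_row(1, stops))
```

Passing the same string to g.write(...) instead of sys.stdout.write(...) would put it in the output file, and the result opens cleanly in a spreadsheet.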