Gene Finding Exercise 1

Save the sequence found in "UnknownSeq.txt" on your computer's desktop.

Pick three of the gene finding programs listed below and submit the sequence to each one.

FGENEH http://genomic.sanger.ac.uk/gf/gf.shtml
GeneID http://www1.imim.es/geneid.html
GENSCAN http://genes.mit.edu/GENSCAN.html
GRAIL http://compbio.ornl.gov/tools/index.shtml
GRAIL-EXP http://compbio.ornl.gov/grailexp
MZEF http://www.cshl.org/genefinder
PROCRUSTES http://www-hto.usc.edu/software/procrustes
HMMgene http://www.cbs.dtu.dk/services/HMMgene

Question 1: How many exons are in the unknown sequence?

Question 2: What are the start and end points for each exon?

Question 3: Do the three gene finding programs agree on the above answers or are there discrepancies?
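
One way to check agreement for Question 3 is to collect each program's exon coordinates by hand and compare them as sets. The sketch below assumes you have transcribed the predictions into (start, end) pairs; the program names are from the list above, but the coordinates are made-up placeholders, not real output for UnknownSeq.txt, and compare_exon_predictions is a hypothetical helper name.

```python
def compare_exon_predictions(predictions):
    """Compare exon calls from several programs.

    predictions maps a program name to a list of (start, end) tuples.
    Returns (agreed, report): agreed is the sorted list of exons that
    every program predicted with identical coordinates, and report maps
    each distinct exon to the sorted list of programs that predicted it.
    """
    exon_sets = {name: set(exons) for name, exons in predictions.items()}
    if not exon_sets:
        return [], {}
    all_exons = set().union(*exon_sets.values())
    agreed = set.intersection(*exon_sets.values())
    report = {
        exon: sorted(name for name, exons in exon_sets.items() if exon in exons)
        for exon in sorted(all_exons)
    }
    return sorted(agreed), report


# Placeholder coordinates for illustration only (assumption, not real results).
example_predictions = {
    "GENSCAN": [(101, 250), (400, 520)],
    "GeneID": [(101, 250), (400, 523)],
    "HMMgene": [(101, 250), (400, 520)],
}

agreed, report = compare_exon_predictions(example_predictions)
print("Exons all programs agree on:", agreed)
for exon, programs in report.items():
    print(exon, "predicted by", ", ".join(programs))
```

Exact matching is deliberately strict: two programs that differ by a few bases at a splice site are reported as a discrepancy, which is usually what you want to look at more closely.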

Question 4: What other elements can you identify with these programs? Look for poly(A) sites, GC content, etc.
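
Some of these features can also be checked by hand as a sanity test on the programs' output. The sketch below computes GC content and scans for the common polyadenylation signal AATAAA. It is a minimal illustration, assuming a plain DNA string on the forward strand; the function names are hypothetical, and real poly(A) prediction uses more context than a single hexamer.

```python
def gc_content(seq):
    """Return the fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    gc = sum(1 for base in seq if base in "GC")
    return gc / len(seq)


def find_polya_signals(seq, motif="AATAAA"):
    """Return 1-based start positions of every occurrence of motif in seq."""
    seq = seq.upper()
    positions = []
    index = seq.find(motif)
    while index != -1:
        positions.append(index + 1)  # convert 0-based index to 1-based position
        index = seq.find(motif, index + 1)
    return positions


# Toy sequence for illustration only (assumption, not UnknownSeq.txt).
toy = "CCAATAAAGG"
print("GC content:", gc_content(toy))
print("AATAAA found at:", find_polya_signals(toy))
```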

Question 5: Can you translate the sequence into a protein? What is the length of the protein sequence?
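
To translate, the predicted exons first have to be joined into a single coding sequence. The sketch below shows the idea using the standard genetic code. It assumes the gene is on the forward strand, that exon coordinates are 1-based and inclusive (as most of the programs report them), and that the first exon starts at the start codon; splice_exons and translate are hypothetical helper names, and the toy sequence is invented.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def splice_exons(seq, exons):
    """Join exons given as 1-based inclusive (start, end) pairs."""
    return "".join(seq[start - 1:end] for start, end in sorted(exons))


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon.

    Codons containing anything other than A, C, G, T become 'X'.
    An incomplete trailing codon is ignored.
    """
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Toy example (assumption): two exons separated by a short intron.
toy = "CCATGGCCGTAAGTAGTTTTGACC"
cds = splice_exons(toy, [(3, 8), (17, 22)])
protein = translate(cds)
print(cds, protein, len(protein))
```

If the programs place the gene on the reverse strand, the sequence has to be reverse-complemented before splicing; that step is left out here.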

Question 6: What else can you say about the putative protein sequence? (Molecular weight, other characteristic attributes, matches in a database, structure)
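
For the molecular weight, a rough estimate can be computed directly from the translated sequence by summing average residue masses and adding one water molecule for the free termini. The sketch below is an approximation under that assumption; the mass values are commonly tabulated averages rounded to four decimals, the function names are hypothetical, and dedicated protein analysis tools will give more complete results.

```python
from collections import Counter

# Approximate average residue masses in daltons (amino acid minus water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.0153


def molecular_weight(protein):
    """Estimate the average molecular weight of a protein in daltons."""
    protein = protein.upper().rstrip("*")  # drop a trailing stop symbol if present
    unknown = set(protein) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError("Unrecognized residues: " + "".join(sorted(unknown)))
    if not protein:
        return 0.0
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER_MASS


def composition(protein):
    """Count how often each amino acid occurs in the protein."""
    return Counter(protein.upper().rstrip("*"))


# Toy peptide for illustration only (assumption).
print(round(molecular_weight("MAF"), 2), "Da")
print(composition("MAAF").most_common())
```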