Comparative Genomics – Exercise 1

Section A

What is the function of the genes?

The genes encode proteins involved in lipid metabolism.

What disorders are associated with the gene?

Hyperlipoproteinemia and dysbetalipoproteinemia.

What do you find is the reason for the low homology in most cases?

Low homology is usually found at the beginning and end of the nucleotide sequences. Hence, the initiation sequences are conserved, but the particular coding sequences are slightly different.

Section B

What kind of protein does the gene code for?

A regulatory protein involved in myogenesis.

There are two distinct functions for the MEF2 family: what are they? What are the functions of MEF2C?

MEF2C has both DNA-binding and trans-activating activities. It may also be involved in the maintenance of differentiated muscle. MEF2A is thought to be involved in muscle induction and differentiation.

Where are the areas of high homology?

We see high homology throughout, but especially immediately upstream and downstream of the gene.

Where are the areas of low homology?

Repetitive elements are among the regions with low homology.

Do you think the mouse and the human sequences are closely related?

Yes, they appear fairly closely related.

Identify conserved non-coding sequences.

(pink/red shaded areas under the curve.)
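How such shaded regions might be picked out can be sketched in a few lines of Python. This is only an illustration, not VISTA's actual implementation: the function name, the input format (one percent-identity score per window start, plus a list of annotated exon intervals) and the thresholds are all assumptions. The 70% identity over 100 bp cutoff is the commonly cited default and should be checked against the settings actually used for the plot.

```python
def conserved_noncoding(scores, exons, window=100, min_identity=70.0):
    """Return half-open (start, end) conserved intervals that overlap no exon.

    scores[i] is the percent identity of the window starting at position i
    (hypothetical input format). exons is a list of half-open (start, end)
    intervals on the same coordinates.
    """
    regions = []
    current = None
    for i, score in enumerate(scores):
        if score >= min_identity:
            if current is None:
                current = [i, i + window]   # open a new conserved region
            else:
                current[1] = i + window     # extend the open region
        elif current is not None:
            regions.append(tuple(current))  # close the region
            current = None
    if current is not None:
        regions.append(tuple(current))

    def overlaps_exon(region):
        return any(region[0] < e_end and e_start < region[1]
                   for e_start, e_end in exons)

    # Keep only conserved regions that do not touch an annotated exon.
    return [r for r in regions if not overlaps_exon(r)]
```

Under these assumptions, a conserved region that overlaps an annotated exon is discarded, and what remains are the candidate conserved non-coding sequences.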

Identify UTRs.

(sky-blue shaded areas)

Identify the exons.

(dark-blue shaded areas)

Which of these areas belong to the gene?

The exons and CNS.

Where are the areas of high homology outside of the gene?

The highest homology is just beyond the 3' end of the gene.

What could be the reason for this high homology?

Likely to be a regulatory region.

Check the alignment by clicking on interesting regions on the graph. Was your hypothesis correct?

Section C

Why are C-genes displayed separately from the alpha and the beta?

Because it is an isoform of both TNFa and TNFb.

Look at the VISTA plots. What is different about this plot when compared to the MEF2C we saw previously?

There are multiple genes, one for each TNF. The VISTA plot for MEF2C only displayed one gene.

Identify the regions that belong to genes (which genes?).

TNFa, TNFb, TNFc

Why do some genes appear to overlap?

They appear to overlap because some encode truncated versions of the proteins encoded by others. For example, TNFCb is a short sequence of TNFa.

Comparative Genomics – Part 2

Section A

Question: What do you see in the plot?

A VISTA plot that includes several genes with exons, repetitive elements, and peaks indicating regions of similarity between human and mouse DNA.

Question: What do the peaks signify?

The peaks signify sequences “conserved” between mouse and human DNA over the evolutionary time since divergence from their common ancestor.
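The curve behind those peaks can be thought of as percent identity measured in a window that slides along the alignment. The following is a minimal sketch of that idea, assuming two already-aligned sequences of equal length with "-" for gaps. The function name and the 100 bp default window are illustrative choices, not taken from the exercise or from VISTA's source.

```python
def window_identity(seq_a, seq_b, window=100, step=1):
    """Percent identity of two aligned sequences in sliding windows.

    seq_a and seq_b must be the same length (alignment columns);
    a column counts as a match only if both bases are equal and not a gap.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for start in range(0, len(seq_a) - window + 1, step):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
        scores.append(100.0 * matches / window)
    return scores


# Two non-overlapping windows of 4 columns each: one identical, one with a mismatch.
print(window_identity("ACGTACGT", "ACGTTCGT", window=4, step=4))  # [100.0, 75.0]
```

A high value in a window corresponds to a tall peak on the plot, and a low value to a trough.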

Question: Which areas are conserved exons?

Conserved exons are purple peaks.

Question: Which areas are CNSs?

CNSs are the pink/red peaks.

Question: Which areas are Introns?

Introns are the gaps between the arrows (genes) above each plot.

Question: Which areas are Gaps?

Gaps are not shown separately in this version of mVISTA.

Section B

Question: Can you see peaks that are present in all three alignments?

Very little conservation, in terms of percent similarity, is shared across all three alignments.

Question: Can you see height differences among these peaks?

One can see distinct height differences.

Question: What does height difference mean?

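Height differences suggest regions of evolutionary divergence between the species: peak height tracks percent identity, so lower peaks mark sequence that has diverged more.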

Question: Can you see peaks that are present in just one alignment?

Sections at the beginning and the end of the graph, including the Hs. 13308 and IL5 genes, are present only in the human/mouse alignment.

Question: Which pair of organisms is most similar?

From the evidence of these alignments, humans and mice are the most similar.

Question: What additional information can you gather from a three-way alignment?

Regions of conservation may become visible which were not apparent in a two-way alignment. Many additional regions of homology are visible by selecting "human/mouse" in the "reverse complement in base/second alignment" box.
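One way to make the comparison across alignments concrete is to intersect the conserved intervals found in each pairwise alignment. The sketch below assumes each alignment has already been reduced to a sorted list of non-overlapping, half-open (start, end) intervals on the base sequence. The function names and the example coordinates are hypothetical.

```python
def intersect(a, b):
    """Intersect two sorted lists of half-open (start, end) intervals."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def shared_regions(alignments):
    """Regions conserved in every alignment of the list."""
    shared = alignments[0]
    for regions in alignments[1:]:
        shared = intersect(shared, regions)
    return shared


# Hypothetical conserved intervals from three pairwise alignments.
first = [(0, 100), (200, 300)]
second = [(50, 150), (250, 400)]
third = [(60, 80), (260, 270)]
print(shared_regions([first, second, third]))  # [(60, 80), (260, 270)]
```

Intervals that survive the intersection correspond to peaks present in all alignments. Intervals present in only one list correspond to peaks seen in just one alignment.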