Title: Comparative genome sequence navigation and manipulation with the GenePalette software tool
1. Comparative genome sequence navigation and manipulation with the GenePalette software tool
Mark Rebeiz, University of Pittsburgh
2. CG13335
in situ hybridization in fly embryos
Insert into pH-Stinger to see where expression is driven
4. What does GenePalette do?
- Load genome sequences from any genome annotated in GenBank, on any computer platform (Windows, Mac, Linux)
- Design primers, search for motifs, and look at restriction sites
- Make evolutionary comparisons of DNA conservation
- Prepare to-scale diagrams of gene structure for presentations and publication
5. Enter a query to GenBank
Select genes to work with from the chromosomal region of interest
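As an illustration of what a region query to GenBank can look like, the sketch below builds a request URL for NCBI's E-utilities `efetch` service, which returns a GenBank flat file for a coordinate range of a nucleotide record. How GenePalette itself issues its queries is not described here, so this is only one plausible approach; the function name `genbank_region_url` and the accession used in the usage note are placeholders chosen for the example.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities efetch service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def genbank_region_url(accession, start, stop):
    """Build an efetch URL for a coordinate range of a nucleotide record.

    accession: a GenBank/RefSeq nucleotide accession (placeholder in the example)
    start, stop: 1-based inclusive coordinates of the region of interest
    """
    params = {
        "db": "nuccore",      # nucleotide database
        "id": accession,
        "rettype": "gb",      # GenBank flat-file format, including annotations
        "retmode": "text",
        "seq_start": start,
        "seq_stop": stop,
    }
    return EFETCH + "?" + urlencode(params)
```

For example, `genbank_region_url("NC_004354", 1, 5000)` yields a URL that can be opened in a browser or fetched with any HTTP client; the annotated genes in the returned record are what a user would then choose from.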
6. The region is loaded into a fully integrated interface, where every element is clickable/selectable
7. Search for motifs (restriction sites, primers, transcription factor binding sites) within the loaded sequence to visualize where they occur
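A motif search of this kind can be sketched in a few lines. The example below is not GenePalette's code; it assumes motifs are written with IUPAC nucleotide codes (for example `N` for any base) and reports hits on both strands. The names `find_motif` and `reverse_complement` are chosen for this illustration.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Complement table that also covers the degenerate codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a (possibly degenerate) DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif(sequence, motif):
    """Return sorted (start, strand) hits; start is 0-based on the forward strand."""
    sequence = sequence.upper()
    motif = motif.upper()
    patterns = {"+": motif}
    rc = reverse_complement(motif)
    if rc != motif:  # palindromic sites would otherwise be reported twice
        patterns["-"] = rc
    hits = []
    for strand, pattern in patterns.items():
        # The lookahead lets overlapping occurrences be found as well.
        regex = re.compile("(?=" + "".join(IUPAC[b] for b in pattern) + ")")
        hits.extend((m.start(), strand) for m in regex.finditer(sequence))
    return sorted(hits)
```

Calling `find_motif(seq, "GAATTC")` would list EcoRI sites, and `find_motif(seq, "CANNTG")` would list E-box-like binding sites; the returned positions are what a viewer would draw on the sequence map.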
8. Design primers for PCR by simply selecting a region of DNA
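To make the idea concrete, here is a minimal sketch of deriving a primer pair from a selected region. It is an assumption-laden illustration rather than GenePalette's method: it simply takes the two ends of the selection as primers and estimates melting temperature with the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a rough guide for short oligonucleotides. The function names and the default primer length are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def wallace_tm(primer):
    """Rough melting temperature in degrees C: 2 per A/T plus 4 per G/C."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def design_primers(sequence, start, end, length=20):
    """Return (forward, reverse) primers flanking sequence[start:end].

    The forward primer is the first `length` bases of the selection; the
    reverse primer is the reverse complement of the last `length` bases.
    """
    region = sequence[start:end].upper()
    if len(region) < 2 * length:
        raise ValueError("selected region is too short for two primers")
    forward = region[:length]
    reverse = reverse_complement(region[-length:])
    return forward, reverse
```

A real primer designer would also screen candidates for GC content, hairpins, and primer-dimer formation, and would typically use nearest-neighbor thermodynamics rather than the Wallace rule.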
9. Phylogenetic Footprinting
Regions that could be important for binding are often evolutionarily conserved
10. Phylogenetic footprinting is laborious by hand
- Alignments of non-coding sequences are difficult, since there are many insertions/deletions (indels)
- Often, binding sites are conserved, but not much else is
- The methods for automating this process are clumsy
11. Sequence comparisons in GenePalette
13. GenePalette in the literature
14. Potential Projects
- Update the interface and make components easier to use
- Automate the acquisition of orthologous sequences from databases
- Improve the accuracy and speed of the sequence alignment algorithm
15. Full text description
- In the post-genomic era, the analysis of genomic sequence is a constant experimental need. A particularly challenging issue is determining the function of non-coding sequences that control when and where each gene is transcribed. Currently, only a limited number of tools are available for aligning and visualizing regulatory sequence motifs in genomic DNA. The GenePalette software tool is a program written in the Rebeiz Lab at the University of Pittsburgh to address this need. Coded in Java, the program allows users to download genome sequences from a database and visualize features within the sequence using a graphical interface.
- Several independent improvements could be the focus of a capstone project:
- (1) Update the GUI to make it more user friendly. The software is used by many researchers (several thousand registered users) who are not necessarily computer savvy. Improvements that make the components more logical to use would therefore greatly increase the software's utility to researchers.
- (2) Streamline the acquisition of orthologous sequences from various databases. The software was originally designed to access GenBank, a fairly generic repository for DNA sequence data. However, several other extremely useful resources, such as ENSEMBL and UCSC, are now available. In particular, the UCSC database contains a "concordance map" that allows users to find orthologous genomic sequences. This project would involve implementing an interface within the software to use these resources.
- (3) Improve the sequence alignment algorithm. To compare evolutionary conservation, or the lack thereof, the software implements a sequence alignment algorithm that finds unique words of defined length that are identical between multiple sequences. These landmarks allow the user to assess whether individual motifs are conserved among species. The current algorithm is a slow, brute-force algorithm. This project would improve the algorithm to make it faster and more robust.
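One possible direction for project (3), offered only as a sketch: instead of comparing every word of one sequence against every position of the others, index each sequence's words in a hash table, keep the words that occur exactly once in each sequence, and intersect the resulting sets. The code below illustrates that idea under the assumption that a "landmark" is a word of length k that is unique within every sequence and identical across all of them; the function names are hypothetical and this is not the existing GenePalette implementation.

```python
from collections import Counter


def unique_kmers(seq, k):
    """Map each word of length k that occurs exactly once in seq to its position."""
    words = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(words)
    return {w: i for i, w in enumerate(words) if counts[w] == 1}


def shared_landmarks(sequences, k):
    """Return {word: [position in each sequence]} for words unique in every sequence."""
    indexes = [unique_kmers(s.upper(), k) for s in sequences]
    # Iterating a dict yields its keys, so this intersects the word sets.
    common = set(indexes[0]).intersection(*indexes[1:])
    return {w: [idx[w] for idx in indexes] for w in sorted(common)}
```

With hashing, the work grows roughly in proportion to the total sequence length (times k for slicing) rather than with the product of the sequence lengths. The sketch handles the forward strand only and requires exact identity, so tolerating mismatches or inversions would need further design.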