Title: A Gentle Introduction to UCSC Genome Browser
1A Gentle Introduction to UCSC Genome Browser
2Options
- I. Genome Browser
- II. ENCODE
- III. Blat
- IV. Table Browser
- V. Gene Sorter
- VI. In Silico PCR
- VII. Proteome Browser
- VIII. Utilities
- IX. Downloads
3I. Genome Browser
- Human (Homo sapiens) Genome Browser Gateway
- Provides access to any section of the entire human genome
- Non-Standard Join Certificates
- some sequence joins between adjacent clones in this assembly could not be computationally validated
- the sequencing center responsible for the particular chromosome provides an electronic certificate
- the certificate should state why the submitter thinks the join is valid
4Query
Clade: vertebrate, deuterostome, insect, nematode
5Chimp, Rhesus, Opossum, X. tropicalis, Tetraodon, Fugu
6Display image width
Assembly date
7- Entire chromosome
- chr7 (all of chromosome 7)
- Cytological band
- 20p13 (region for band p13 on chr 20)
- Chromosomal coordinate range
- chr3:1-1000000 (first million bases of chr 3, counting from p arm telomere)
- mRNA, EST, or STS marker
- Keywords from the GenBank description of an mRNA
(huntington)
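The chromosome and coordinate-range forms of a position query can be taken apart mechanically. The following is a minimal sketch, assuming a simple regular expression; `parse_position` is a hypothetical helper written for illustration and is not the browser's own parser. It does not handle cytological bands, accessions, or keywords.

```python
import re

# Assumed pattern for "chr7" or "chr3:1-1000000"; commas in numbers are tolerated.
_POSITION_RE = re.compile(r"^(chr[0-9A-Za-z_]+)(?::([\d,]+)-([\d,]+))?$")


def parse_position(query):
    """Split a position query into (chromosome, start, end).

    start and end are None when the query names a whole chromosome.
    """
    match = _POSITION_RE.match(query.strip())
    if match is None:
        raise ValueError(f"not a chromosome or coordinate range: {query!r}")
    chrom, start, end = match.groups()
    if start is None:
        return chrom, None, None
    start_i = int(start.replace(",", ""))
    end_i = int(end.replace(",", ""))
    if start_i > end_i:
        raise ValueError("start must not exceed end")
    return chrom, start_i, end_i


print(parse_position("chr7"))            # whole chromosome
print(parse_position("chr3:1-1000000"))  # coordinate range
```

A query such as 20p13 raises ValueError in this sketch, because resolving a band name requires a lookup against band annotations.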
8Search Result
Position zoom in/out
Restriction Enzyme
mRNA
Conservation
SNPs
9Display option
10Gen Browser Query (x)
11Gen Browser Results 1 (x)
12Gen Browser Results 2 (x)
13Gen Browser Details (x)
14Gen Browser Syntax (x)
15II. ENCODE
- Stands for Encyclopedia Of DNA Elements
- Public research consortium to carry out a project to identify all functional elements in the human genome sequence
- Launched by The National Human Genome Research Institute (NHGRI)
- Conducted in three phases
- pilot project phase (survey existing methods)
- technology development phase (develop new methods)
- planned production phase
16ENCODE Formats
- Browser Extensible Data Format (BED)
- for efficient access to genomic annotations
- General Feature Format (GFF)
- for data made up of sets of linked features
- Gene Transfer Format (GTF)
- a refinement of GFF that tightens the specification
- Multiple Alignment Format (MAF)
- a series of multiple alignments in one format
- Wiggle Format (WIG)
- for continuous-valued data in track format
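As an illustration of the simplest of these formats, the sketch below reads one BED line. It assumes only the first three required fields (chrom, chromStart, chromEnd) plus an optional name, and relies on the BED convention that the start is 0-based and the end is exclusive. `BedRecord` and `parse_bed_line` are hypothetical names, and the sample line is made up for the example.

```python
from typing import NamedTuple


class BedRecord(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    name: str


def parse_bed_line(line):
    """Parse the first three (optionally four) whitespace-separated BED fields."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("a BED line needs at least chrom, chromStart, chromEnd")
    name = fields[3] if len(fields) > 3 else ""
    return BedRecord(fields[0], int(fields[1]), int(fields[2]), name)


record = parse_bed_line("chr7\t127471196\t127472363\texample_feature")
# With a half-open interval, the feature length is simply end - start.
print(record, record.end - record.start)
```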
17ENCODE Options
- Regions (hg16)
- old database (mRNA, EST, STS markers)
- Regions (hg17)
- new database (mRNA, EST, STS markers)
- Data Status
- the current status of ENCODE datasets
- Downloads
- sequence and annotation data downloads
- Submission
- for the submission of ENCODE-related data
18ENCODE Query Results
19ENCODE Details hg16
20ENCODE Details hg17
21III. Blat
- BLAST-Like Alignment Tool, not BLAST
- Quickly finds sequences of 95% and greater similarity of length 40 bases or more
- Paste in a query sequence to find its location in the genome
- Takes up just under 1 GB of RAM
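To make the "95% and greater similarity, 40 bases or more" range concrete, here is a minimal sketch that checks an already-aligned pair of sequences against those two thresholds. This is not Blat's search algorithm; the function names are hypothetical, and the inputs are assumed to be equal-length aligned strings in which "-" marks a gap.

```python
def percent_identity(aligned_query, aligned_target):
    """Percentage of alignment columns where both sequences carry the same base."""
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for q, t in zip(aligned_query.upper(), aligned_target.upper())
        if q == t and q != "-"
    )
    return 100.0 * matches / len(aligned_query)


def in_design_range(aligned_query, aligned_target, min_identity=95.0, min_length=40):
    """True when the alignment is long enough and similar enough."""
    return (
        len(aligned_query) >= min_length
        and percent_identity(aligned_query, aligned_target) >= min_identity
    )


query = "ACGT" * 10                  # 40 bases
one_mismatch = "T" + query[1:]       # 39 of 40 columns match
three_mismatches = "TTT" + query[3:]  # 37 of 40 columns match
print(in_design_range(query, one_mismatch))
print(in_design_range(query, three_mismatches))
```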
22Blat Query
Query sequence
Upload file
23Blat Results
Browser view
Detail view
24Blat Result Browse
25Blat Result Details
26IV. Table Browser
- To get the data associated with a track in text
format, to calculate intersections between
tracks, and to retrieve DNA sequence covered by a
track
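The idea of intersecting two tracks can be sketched as an interval overlap computation. The example below is a simplified illustration, not the Table Browser's implementation: it assumes each track is a list of half-open (start, end) intervals on the same chromosome, and that intervals within one track do not overlap each other. `intersect_tracks` is a hypothetical name.

```python
def intersect_tracks(track_a, track_b):
    """Return the pieces covered by both tracks, as sorted (start, end) intervals."""
    a = sorted(track_a)
    b = sorted(track_b)
    i = j = 0
    result = []
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            result.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


genes = [(100, 200), (300, 400)]
conserved = [(150, 350)]
print(intersect_tracks(genes, conserved))
```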
27Table Browser Query
28Table Browser Results
29Table Browser Options
- Describe Table Schema
- schema for SQL table format
- Filter
- regular expression filter
- range control
- Intersection
- Correlation
- Summary Statistics
30Table Browser Schema
31Table Browser Filter
32Table Browser Intersection
33Table Browser Correlation
34Table Browser Summary Statistics
35V. Gene Sorter
- Displays a sorted table of genes that are related to one another
- Expression is color-coded
- a highly expressed gene is colored red
- a less expressed gene is shown in green
36Gene Sorter Query
37Gene Sorter Results
38Gene Sorter Details 1
39Gene Sorter Details 2
40VI. In Silico PCR
- In-Silico PCR searches a sequence database with a pair of PCR primers
- Returns a sequence output file in FASTA format containing all sequences in the database that lie between and include the primer pair
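The search described above can be sketched for a single template sequence. This is a minimal illustration under strong assumptions, not the UCSC tool: primers must match exactly, only the orientation with the forward primer on the given strand is searched, and the default `max_product_size` is an arbitrary value chosen for the example. All function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def in_silico_pcr(template, forward, reverse, max_product_size=4000):
    """Return every product that starts with the forward primer and ends
    with the reverse complement of the reverse primer."""
    template = template.upper()
    forward = forward.upper()
    reverse_site = reverse_complement(reverse.upper())
    products = []
    start = template.find(forward)
    while start != -1:
        site = template.find(reverse_site, start + len(forward))
        while site != -1:
            end = site + len(reverse_site)
            if end - start > max_product_size:
                break  # later sites only give longer products
            products.append(template[start:end])
            site = template.find(reverse_site, site + 1)
        start = template.find(forward, start + 1)
    return products


template = "TTTT" + "ACGTAC" + "GGGGCCCC" + "TTGACA" + "AAAA"
products = in_silico_pcr(template, forward="ACGTAC", reverse="TGTCAA")
for number, product in enumerate(products, start=1):
    # Emit each product as a FASTA record with a made-up header.
    print(f">product_{number} length={len(product)}")
    print(product)
```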
41PCR
PCR: polymerase chain reaction
http://members.aol.com/BearFlag45/Biology1A/LectureNotes/lec24.html
42In Silico PCR Query
Two primer sequences
Max product size
Number of match
43In Silico PCR Results
Reverse primer
Forward primer
Match in uppercase, mismatch in lowercase
Melting temperature
44VII. Proteome Browser
- UCSC Proteome Browser Gateway
- provides a wealth of protein information presented in the form of graphical images and links to external internet sites
- SwissProt information
- Proteome browser tracks
- Protein property histograms
- UCSC links / Domain information
- Comparative 3D structures
- Pathways / Fasta format
45Protein Browser Query
Swiss-Prot/TrEMBL protein ID
46Protein Browser Tracks
polarity
hydrophobicity
cysteines
glycosylation
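Two of these tracks can be approximated directly from the amino acid sequence. The sketch below lists cysteine positions and candidate N-linked glycosylation sites using the common sequon pattern (N, then any residue except P, then S or T). How the Proteome Browser itself computes its tracks is not stated on the slide, so this is only an assumed approach; the function names and the sample sequence are made up.

```python
import re


def cysteine_positions(protein):
    """1-based positions of cysteine (C) residues."""
    return [i + 1 for i, residue in enumerate(protein.upper()) if residue == "C"]


# Lookahead so that overlapping sequons are all reported.
_SEQUON = re.compile(r"(?=N[^P][ST])")


def n_glycosylation_sites(protein):
    """1-based positions of N residues that start an N-X-S/T sequon (X is not P)."""
    return [m.start() + 1 for m in _SEQUON.finditer(protein.upper())]


example = "MKCNGSAANPTCNAT"
print(cysteine_positions(example))
print(n_glycosylation_sites(example))
```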
47Protein Browser Histograms
48Protein Browser 3D structures
49VIII. Utilities
- Some tools (for preparing input)
- Batch Coordinate Conversion (liftOver)
- converts genome coordinates and genome annotation files between assemblies
- WHY?
- occasionally, a chunk of sequence may be moved to an entirely different chromosome as the map is refined
- DNA Duster
- formatting tool
- Protein Duster
- formatting tool
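The reason coordinate conversion is needed can be shown with a toy mapping. The sketch below is not the liftOver tool and does not read its chain files; it assumes a hand-written list of hypothetical blocks, each saying that an interval in the old assembly now starts at a given position in the new one, possibly on a different chromosome. All coordinates are invented.

```python
def lift_position(chrom, pos, blocks):
    """Map a 0-based position from the old assembly to the new one.

    Each block is (old_chrom, old_start, old_end, new_chrom, new_start)
    with a half-open old interval. Returns None when no block covers the position.
    """
    for old_chrom, old_start, old_end, new_chrom, new_start in blocks:
        if chrom == old_chrom and old_start <= pos < old_end:
            return new_chrom, new_start + (pos - old_start)
    return None


# Invented example: the second chunk of old chr1 was reassigned to chr5.
blocks = [
    ("chr1", 0, 1000, "chr1", 0),
    ("chr1", 1000, 2000, "chr5", 500),
]
print(lift_position("chr1", 250, blocks))
print(lift_position("chr1", 1500, blocks))
print(lift_position("chr1", 2500, blocks))
```

A position with no covering block is returned as None here, which corresponds to a coordinate that cannot be converted between the two assemblies.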
50IX. Downloads
- Offers downloads of complete genomes
- Human
- Chimpanzee
- Rhesus
- Dog
- Cow
- Mouse
- Rat
- Opossum
- Chicken