NCBI Overview - PowerPoint PPT Presentation

Slides: 61
Provided by: Robe114
Learn more at: http://fgamedia.org

1
NCBI Overview
  • Introduction to Bioinformatics
  • Foothill College

2
(No Transcript)
3
NCBI Mission
  • Established in 1988 as a national resource for
    molecular biology information, NCBI creates
    public databases, conducts research in
    computational biology, develops software tools
    for analyzing genome data, and disseminates
    biomedical information
  • All for the better understanding of molecular
    processes affecting human health and disease:
    http://www.ncbi.nlm.nih.gov

4
(No Transcript)
5
NCBI Resources
  • Each section is also a skill
  • How do you use it?
  • Why do you use it?
  • How does it relate to all of NCBI?
  • Follow the exercises from our workshop and the
    NCBI Problem Set (Field Guide)
  • Apply a problem space to your guided tour,
    e.g., your gene, protein, or disease

6
Entrez
  • Entrez is the search and retrieval tool for all
    of NCBI. The name is French for "enter".
  • http://www.ncbi.nih.gov/Entrez/
  • Entrez allows you to search all of the NCBI
    databases, including PubMed, nucleotide, protein,
    structure, etc.
  • Read the URL to see what Entrez is retrieving and
    from where, and follow the IDs.
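
Reading the URL can also be done programmatically. The following is a minimal sketch, assuming an Entrez-style URL whose query string carries a `db` parameter (the database) and a `term` parameter (the search text). The example URL and its `term` value are made up for illustration, and `describe_entrez_url` is a hypothetical helper name, not an NCBI function.

```python
from urllib.parse import urlsplit, parse_qs

def describe_entrez_url(url):
    """Return the host, script path, and query parameters of a URL."""
    parts = urlsplit(url)
    # parse_qs maps each parameter name to a list of values; keep the first.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {"host": parts.netloc, "path": parts.path, "params": params}

# Hypothetical example URL; the `term` value is invented for illustration.
example = "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Books&term=homology"
info = describe_entrez_url(example)
print(info["host"])                # which server is answering
print(info["params"].get("db"))    # which database is being searched
print(info["params"].get("term"))  # what is being searched for
```

The same breakdown works for any link you follow from an NCBI results page: the parameters usually tell you which database and which record ID you are looking at.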

7
(No Transcript)
8
PubMed
  • PubMed provides access to bibliographic
    information which includes MEDLINE, NLM's premier
    bibliographic database.
  • MEDLINE contains bibliographic citations and
    author abstracts from more than 4,600 biomedical
    journals published in the United States and 70
    other countries.
  • Full text articles are usually available through
    pay-per-view on supporting journal websites.

9
(No Transcript)
10
NCBI Gene
  • Gene as the center (locus) of NCBI databases
  • Links to each key NCBI resource
  • Start your search here, and use it as a way to
    explore all of NCBI's databases
  • PubMed
  • OMIM
  • AceView
  • GenBank
  • UniGene
  • Map Viewer
  • Variations (dbSNP)
  • HomoloGene

11
(No Transcript)
12
(No Transcript)
13
HomoloGene
  • HomoloGene is a resource of curated and
    calculated orthologs for genes as represented by
    UniGene or by annotation of genomic sequences.
    It is all about homology.
  • The calculated homologs are the result of
    nucleotide sequence comparisons between each pair
    of organisms (similarity above a threshold)
  • http://www.ncbi.nlm.nih.gov/HomoloGene/

14
(No Transcript)
15
UniGene
  • Think of a gene as an interchangeable part
  • The gene is seen as center of the universe
  • Shows organisms where your gene exists
  • Shows tissues where your gene is expressed
  • Each UniGene cluster contains sequences that
    represent a unique gene, as well as related
    information such as the tissue types in which the
    gene has been expressed and map location.
  • UniGene focuses on mRNA and EST information, and
    is often used in microarray experiments

16
(No Transcript)
17
GenBank (NCBI Data Model)
  • GenBank is the NIH genetic sequence database, an
    annotated collection of all publicly available
    DNA sequences.
  • GBFF (GenBank Flat File) is the format of
    text-based entries (also available in HTML)
  • Accession numbers are primary keys for the
    genomic and protein sequence entries.
  • All NCBI data point into GenBank entries.

18
GenBank
  • Data can be exported in many different formats
  • Flat (text) files
  • FASTA file
  • XML formats
  • Export (download) to a local file
  • The entire GenBank database can be downloaded or
    obtained on CD-ROM
  • Sequin is the submission tool for GenBank; it is
    how sequences are added to the database
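
Of the export formats listed above, FASTA is the simplest to read with a script: each record is a header line starting with `>` followed by one or more lines of sequence. The following is a minimal sketch of a reader, assuming well-formed input. The two records are made up and are not real GenBank entries, and `parse_fasta` is a hypothetical helper name.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping each header line to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]      # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {name: "".join(chunks) for name, chunks in records.items()}

# Made-up records for illustration only.
sample = """>seq1 hypothetical example
ATGGCC
TTAA
>seq2 another hypothetical example
GGGCCC
"""
fasta_records = parse_fasta(sample)
for name, sequence in fasta_records.items():
    print(name, len(sequence))
```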

19
(No Transcript)
20
(No Transcript)
21
BLAST
  • BLAST (Basic Local Alignment Search Tool) is a
    set of similarity search programs designed to
    explore all of the available sequence databases,
    regardless of whether the query is protein or DNA
    (or soon RNA).
  • Think of it as your Google into every known
    genomic or protein sequence
  • It optimizes local alignment rather than global
    fit; that's the secret to finding homology
  • http://www.ncbi.nlm.nih.gov/BLAST/
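
To make "local, not global" concrete, here is a small sketch of local alignment scoring using the Smith-Waterman dynamic programming recurrence. This is not how BLAST is implemented (BLAST uses a faster heuristic search); it only illustrates the kind of score a local alignment optimizes. The scoring values (match 2, mismatch -1, gap -2) are arbitrary assumptions.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b (Smith-Waterman, linear gaps)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The floor at 0 is what makes the alignment local: a poor
            # prefix is dropped instead of dragging the score down.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

# Two made-up sequences that share only the short region "ACGT".
print(local_alignment_score("TTTACGTTT", "GGGACGTGGG"))
```

The two sequences differ everywhere except a short shared region, yet the score reflects that region in full. A global alignment would be penalized for the unrelated flanks.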

22
(No Transcript)
23
BLAST / Translated BLAST
  • blastn - for nucleotide-nucleotide comparisons
  • blastp - for protein-protein comparisons
  • blastx - compares a nucleotide query, translated
    into hypothetical proteins in all six reading
    frames, against the nr protein database
  • tblastn - compares a protein query against the
    nr nucleotide database translated into
    hypothetical proteins in all six reading frames
  • tblastx - compares a nucleotide query translated
    in all six reading frames against the nr
    nucleotide database translated in all six
    reading frames
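
The "six reading frames" used by the translated searches are three offsets on the given strand plus three on its reverse complement. The following is a minimal sketch of that translation step, assuming the standard genetic code and an uppercase or lowercase DNA string. The function names and the frame labels (`+1` to `-3`) are illustrative choices, not BLAST's own.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete codons; codons with unknown bases become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def six_frame_translations(dna):
    """Return the translation of each of the six reading frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

# Made-up nine-base sequence; '*' marks a stop codon.
for frame, protein in six_frame_translations("ATGGCCTAA").items():
    print(frame, protein)
```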

24
(No Transcript)
25
(No Transcript)
26
(No Transcript)
27
(No Transcript)
28
Map Viewer
  • Map Viewer supports search and display of genomic
    information by chromosomal position; genomes are
    displayed visually
  • Regions of interest can be retrieved by text
    queries (e.g. gene or marker name) or by sequence
    alignment (BLAST)
  • Shows the neighborhood of your gene

29
(No Transcript)
30
(No Transcript)
31
RefSeq
  • The Reference Sequence (RefSeq) collection aims
    to provide a comprehensive, integrated,
    non-redundant set of sequences, including genomic
    DNA, transcripts, and proteins.
  • http://www.ncbi.nlm.nih.gov/RefSeq/
  • RefSeq standards serve as the basis for medical,
    functional, and diversity studies; they provide a
    stable reference for gene identification and
    characterization, mutation analysis, expression
    studies, polymorphism discovery, and comparative
    analyses.

32
(No Transcript)
33
SNPs and dbSNP
  • The NCBI SNP database is built from genomes and
    searchable that way
  • SNP variations are linked from LocusLink and show
    all the types of allelic variants
  • SNP page shows the location (intron or exon) and,
    for coding SNPs, whether the change is synonymous
    or non-synonymous
  • Many SNPs will have GenBank entries, including
    references to contig location
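
The synonymous versus non-synonymous distinction comes down to whether the changed codon still encodes the same amino acid. The following is a minimal sketch of that check, assuming the standard genetic code and a SNP that falls inside a known codon. `classify_coding_snp` is a hypothetical helper, not part of dbSNP.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def classify_coding_snp(codon, position, alt_base):
    """Classify a single-base change within a codon (position 0, 1, or 2)."""
    codon = codon.upper()
    variant = codon[:position] + alt_base.upper() + codon[position + 1:]
    ref_aa, alt_aa = CODON_TABLE[codon], CODON_TABLE[variant]
    label = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return label, ref_aa, alt_aa

# Third-position change GAA -> GAG keeps the same amino acid.
print(classify_coding_snp("GAA", 2, "G"))
# Second-position change GAG -> GTG swaps the amino acid.
print(classify_coding_snp("GAG", 1, "T"))
```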

34
(No Transcript)
35
Genome Resources
  • Whole genomes
  • Links to reference sequences
  • Hs (human), Mm (mouse), Rn (rat), Dm (fruit fly)
  • Retroviruses
  • Plant Genome Resources
  • Human Genome Project
  • Links to SNP database
  • http://www.ncbi.nlm.nih.gov/Genomes/

36
(No Transcript)
37
OMIM
  • Online Mendelian Inheritance in Man.
  • This database is a catalog of human genes and
    genetic disorders authored and edited by Dr.
    Victor A. McKusick and his colleagues at Johns
    Hopkins and elsewhere, and developed for the
    World Wide Web by NCBI.
  • Study your gene or protein from here.
  • http://www.ncbi.nlm.nih.gov/omim/

38
(No Transcript)
39
Structures
  • http://www.ncbi.nlm.nih.gov/Structure/
  • Cn3D
  • Launches from NCBI website
  • Multiple forms of rendering
  • Exports as PNG image file
  • PDB links
  • To RCSB and NCBI PDB entries
  • Links from BLAST / PDB database
  • RasMol viewer
  • Shows molecular structure and forces

40
(No Transcript)
41
(No Transcript)
42
COGs
  • http://www.ncbi.nlm.nih.gov/COG/
  • Clusters of Orthologous Groups of proteins (COGs)
    were delineated by comparing protein sequences
    encoded in 43 complete genomes, representing 30
    major phylogenetic lineages. Each COG consists of
    individual proteins or groups of paralogs from at
    least 3 lineages and thus corresponds to an
    ancient conserved domain.

43
(No Transcript)
44
(No Transcript)
45
AceView and SAGEmap
  • AceView
  • mRNA and EST data
  • Searchable and linked to Pfam / BLAST
  • SAGE
  • Serial Analysis of Gene Expression
  • Shows expression levels for ESTs
  • Gene expression results from SAGE tags mapped to
    mRNA sequences in GenBank.
  • Use both for microarray (gene expression) data

46
(No Transcript)
47
Retrovirus Resources
  • Complete genomes
  • HIV, SIV, HTLV, STLV
  • RefSeq for nucleotide / protein
  • Genome maps
  • Protein sequences
  • Valuable for phylogenetic comparisons
  • Subtyping and alignment
  • Multiple Sequence Alignments
  • http://www.ncbi.nlm.nih.gov/retroviruses/

48
(No Transcript)
49
NCBI Tools
http://www.ncbi.nlm.nih.gov/Tools/
  • Central hub for exploring NCBI tools
  • BLAST
  • COGs
  • Map Viewer
  • LocusLink
  • UniGene
  • ORF Finder
  • ePCR
  • VAST
  • Cancer Chromosome Aberration Project
  • Human-Mouse Homology Maps
  • dbMHC
  • Spidey (RNA align)

50
(No Transcript)
51
ORF Finder
  • The ORF Finder (Open Reading Frame Finder) is a
    graphical analysis tool which finds all open
    reading frames of a selectable minimum size in a
    user's sequence or in a sequence already in the
    NCBI database.
  • This tool identifies all open reading frames
    using the standard or alternative genetic codes.
    The deduced amino acid sequence can be saved in
    various formats and searched against the sequence
    database using the WWW BLAST server.
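
The core idea behind an ORF search can be sketched in a few lines: scan each of the six reading frames for a start codon followed, in frame, by a stop codon, and keep the stretches that meet a minimum length. This sketch assumes the standard genetic code (ATG start; TAA, TAG, TGA stops). It is an illustration, not the ORF Finder implementation, and `find_orfs` with its `min_length` default is a hypothetical choice.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def find_orfs(dna, min_length=30):
    """Return (strand, start, end, sequence) for each ATG-to-stop ORF.

    Lengths are in nucleotides and include the stop codon. start and end
    are 0-based, end-exclusive positions on the strand that was scanned.
    """
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", dna.translate(COMPLEMENT)[::-1])):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if i + 3 - start >= min_length:
                        orfs.append((strand, start, i + 3, seq[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs

# Made-up sequence with one short ORF on the forward strand.
print(find_orfs("CCATGAAATTTGGGTAACC", min_length=12))
```

Raising `min_length` filters out short, probably spurious ORFs, which mirrors the "selectable minimum size" mentioned above.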

52
(No Transcript)
53
VecScreen
  • VecScreen is a system for quickly identifying
    segments of a nucleic acid sequence that may be
    of vector origin. NCBI developed VecScreen to
    combat the problem of vector contamination in
    public sequence databases.
  • http://www.ncbi.nlm.nih.gov/VecScreen/VecScreen.html

54
(No Transcript)
55
Education
  • NCBI tutorials / primers
  • If you want to learn NCBI / bioinformatics, start
    here!
  • http://www.ncbi.nlm.nih.gov/Education/index.html
  • Bioinformatics science primer
  • BLAST / PSI-BLAST tutorials
  • Structure / Cn3D / VAST tutorials
  • Field Guide to NCBI and GenBank
  • SNPs, ESTs, and microarray primer

56
(No Transcript)
57
Books
  • The Bookshelf is a growing collection of
    biomedical books that can be searched directly by
    typing a concept into a searchable textbox
  • Or choose directly from a list of books
  • For science literature and references, this is
    the place to go (hint - for your research paper)
  • http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Books

58
(No Transcript)
59
Summary
  • NCBI is a one-stop shop for bioinformatics
    tools and databases
  • You can use NCBI to complete or augment a
    research project or paper
  • Or use it to learn about bioinformatics
  • It's not hard to learn, but it requires practice
  • If you are in biology in the 21st century, this
    needs to be part of your repertoire of tools!

60
NCBI Website References
  • http://www.ncbi.nlm.nih.gov/
  • http://www.ncbi.nlm.nih.gov/About/
  • http://www.ncbi.nlm.nih.gov/Genbank/
  • http://www.ncbi.nlm.nih.gov/Entrez/
  • http://www.ncbi.nlm.nih.gov/Tools/
  • http://www.ncbi.nlm.nih.gov/Literature/
  • http://www.ncbi.nlm.nih.gov/Education/
  • http://www.ncbi.nlm.nih.gov/Sitemap/