DNA alphabet - PowerPoint PPT Presentation

Slides: 8
Provided by: fymL

Transcript and Presenter's Notes

1
DNA alphabet
  • DNA is the principal constituent of the genome.
    It may be regarded as a complex set of
    instructions for creating an organism.
  • Four different bases (nucleic acid
    bases/nucleotides) appear in DNA: adenine (A),
    guanine (G), cytosine (C), and thymine (T).
  • Rule for base pairs (bp): A-T, T-A, G-C, C-G
    (four bp configurations). Each base pair
    comprises a single piece of information in the
    DNA molecule (used in the creation of amino
    acids).

2
DNA double helix
  • The DNA molecule can be reconstructed from just
    one of the 2 strands.
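
The pairing rule is what makes this reconstruction possible: knowing one strand fixes every base of the other. A minimal sketch in Python follows; the function names are illustrative assumptions, not from the slides. Reading the partner strand in reverse order follows the usual 5'-to-3' convention, which the slides do not mention.

```python
# Pairing rule from slide 1: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Return the base paired with each position of `strand`, in the same order."""
    return "".join(PAIR[base] for base in strand.upper())

def reverse_complement(strand: str) -> str:
    """Return the partner strand, read in its own 5'->3' direction."""
    return complement(strand)[::-1]

print(complement("AAAGTCTGAC"))          # TTTCAGACTG
print(reverse_complement("AAAGTCTGAC"))  # GTCAGACTTT
```

Applying `complement` twice returns the original strand, which is another way of saying that either strand carries the full information.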

3
CODONS
  • The basic unit of the genetic code is the DNA bp.
    A human gene can range in size from thousands
    to hundreds of thousands of bps.
  • Human DNA comprises approximately 3 billion
    bps (the Human Genome Project was an effort to
    decode all 3 billion nucleotide base pairs).
  • Three DNA bps combine to form a codon, which codes
    for the production of an amino acid (a low-level
    instruction). For example, AGA represents A-T,
    G-C, A-T.
  • Sequences of codons code for the assembly of
    amino acids into polypeptides and proteins;
    some genes instead yield functional RNA.
  • The products so formed mediate the growth and
    development of the organism.
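
To make the codon idea concrete, here is a small sketch that splits a sequence into triplets and looks each one up. The table is deliberately partial (the standard genetic code has 64 codons), and the function names and the example sequence are assumptions chosen for illustration.

```python
# Partial codon table, written as DNA coding-strand triplets.
# Only a handful of the 64 standard codons are listed here.
CODON_TABLE = {
    "ATG": "Met", "AGA": "Arg", "GGC": "Gly", "AAA": "Lys",
    "TTT": "Phe", "TGG": "Trp",
    "TAA": "Stop", "TAG": "Stop", "TGA": "Stop",
}

def split_codons(seq: str) -> list[str]:
    """Split a sequence into consecutive triplets, dropping an incomplete tail."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def translate(seq: str) -> list[str]:
    """Map each codon to an amino acid, stopping at the first stop codon."""
    protein = []
    for codon in split_codons(seq):
        amino = CODON_TABLE.get(codon, "?")  # "?" = codon missing from this partial table
        if amino == "Stop":
            break
        protein.append(amino)
    return protein

print(split_codons("ATGAGAGGCTAA"))  # ['ATG', 'AGA', 'GGC', 'TAA']
print(translate("ATGAGAGGCTAA"))     # ['Met', 'Arg', 'Gly']
```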

4
DNA SEQUENCE
  • A DNA sequence is a succession of letters
    representing the structure of a DNA molecule or
    strand. The possible letters are A, C, G, and T,
    representing the four nucleotide subunits of a
    DNA strand (adenine, cytosine, guanine, thymine),
    and typically these are printed abutting one
    another without gaps, as in the sequence
    AAAGTCTGAC. This coded sequence is sometimes
    referred to as genetic information. A succession
    of any number of nucleotides greater than four is
    liable to be called a sequence.
  • In genetics terminology, DNA sequencing is the
    process of determining the nucleotide order of a
    given DNA fragment.
  • The sequence of DNA encodes the necessary
    information for living things to survive and
    reproduce. Determining the sequence is therefore
    useful in 'pure' research into why and how
    organisms live, as well as in applied subjects.
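
Because a DNA sequence is just a string over the letters A, C, G, and T, basic checks are easy to write. The sketch below validates a string against that alphabet and counts each base; the helper names are hypothetical.

```python
from collections import Counter

DNA_ALPHABET = set("ACGT")

def is_dna_sequence(text: str) -> bool:
    """True if `text` is non-empty and uses only the letters A, C, G, T."""
    return bool(text) and set(text.upper()) <= DNA_ALPHABET

def base_counts(seq: str) -> dict[str, int]:
    """Count how often each of the four bases occurs in `seq`."""
    counts = Counter(seq.upper())
    return {base: counts[base] for base in "ACGT"}

print(is_dna_sequence("AAAGTCTGAC"))  # True
print(is_dna_sequence("AAAGTXTGAC"))  # False
print(base_counts("AAAGTCTGAC"))      # {'A': 4, 'C': 2, 'G': 2, 'T': 2}
```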

5
String Searching Algorithms
  • A DNA or RNA molecule can be treated as a string
    of nucleotides.
  • String searching algorithms try to find a place
    where one or several strings are found within a
    larger string.
  • Naïve string search: The simplest and least
    efficient way to see where one string occurs
    inside another is to check each place it could
    be, one by one, to see if it's there. So, first
    we see if there's a copy of the substring in the
    first few characters of the text; if not, we look
    to see if there's a copy starting at the second
    character of the text; if not, we look starting
    at the third character, and so forth.
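
The naïve search described above can be sketched as follows. This is one straightforward way to write it, with illustrative names; in the worst case it compares about len(text) × len(pattern) characters.

```python
def naive_search(text: str, pattern: str) -> list[int]:
    """Return every 0-based start position where `pattern` occurs in `text`."""
    positions = []
    n, m = len(text), len(pattern)
    if m == 0:
        return positions  # treat an empty pattern as having no matches
    # Try each possible starting place in turn: first, second, third, ...
    for start in range(n - m + 1):
        offset = 0
        # Compare character by character until a mismatch or a full match.
        while offset < m and text[start + offset] == pattern[offset]:
            offset += 1
        if offset == m:
            positions.append(start)
    return positions

print(naive_search("AAAGTCTGAC", "TG"))  # [6]
print(naive_search("AAAGTCTGAC", "AA"))  # [0, 1]
```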

6
DNA Sequence alignment
  • Sequence alignment is an arrangement of two or
    more sequences, highlighting their similarity.
    The sequences are padded with gaps (usually
    denoted by dashes) so that wherever possible,
    columns contain identical or similar characters
    from the sequences involved.
  • Example:
        tcctctgcctctgccatcat---caaccccaaagt
        tcctgtgcatctgcaatcatgggcaaccccaaagt
  • Sequence alignment is usually used to study the
    evolution of DNA sequences from a common
    ancestor. Mismatches in the alignment correspond
    to mutations, and gaps correspond to insertions
    or deletions.
  • The term sequence alignment may also refer to the
    process of constructing such an alignment or
    finding significant alignments in a database of
    potentially unrelated sequences.
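
Once two sequences are aligned, the mismatches and gaps can be counted column by column. The sketch below does this for the example alignment on this slide; the function name and the returned dictionary layout are assumptions made for illustration.

```python
def alignment_stats(row_a: str, row_b: str, gap: str = "-") -> dict[str, int]:
    """Count matching, mismatching, and gapped columns of a pairwise alignment."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    stats = {"matches": 0, "mismatches": 0, "gaps": 0}
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            stats["gaps"] += 1        # insertion or deletion
        elif a == b:
            stats["matches"] += 1
        else:
            stats["mismatches"] += 1  # candidate mutation
    return stats

row_a = "tcctctgcctctgccatcat---caaccccaaagt"
row_b = "tcctgtgcatctgcaatcatgggcaaccccaaagt"
print(alignment_stats(row_a, row_b))
# {'matches': 29, 'mismatches': 3, 'gaps': 3}
```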

7
BIOINFORMATICS
  • Bioinformatics was born of the need for
    high-powered computing ability to help organize,
    analyze, and store biological information,
    primarily DNA and protein sequence data.
  • The gene sequence database in the United States
    is called GenBank and is administered by the
    National Center for Biotechnology Information
    (NCBI).
  • Besides storing biological information, the
    database can be used to help analyze genes, their
    functions, and evolution.
  • A DNA sequence that has been cloned and sequenced
    is entered into a search program called BLAST
    to determine whether:
  • 1) it has already been cloned
  • 2) it is related to an already known gene (if it
    is a new gene sequence, its relatedness to other
    known sequences might help determine its
    biological function)
  • The BLAST program lines up the query sequence
    with each sequence in the database in an
    alignment and shows similar nucleotides by
    connecting them with a line. This gives an
    estimate of gene relatedness.
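
The "connecting line" display can be imitated with a few lines of Python. This is only a sketch of the display idea for two sequences that are already aligned; it is not BLAST's search algorithm or its exact output format, and both function names are hypothetical. Percent identity is used here as one simple stand-in for an "estimate of relatedness".

```python
def show_alignment(query: str, subject: str, gap: str = "-") -> str:
    """Render two aligned rows with '|' between identical nucleotides."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    marks = "".join(
        "|" if q == s and q != gap else " "
        for q, s in zip(query, subject)
    )
    return "\n".join([query, marks, subject])

def percent_identity(query: str, subject: str, gap: str = "-") -> float:
    """Share of alignment columns holding the same nucleotide, in percent."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    same = sum(1 for q, s in zip(query, subject) if q == s and q != gap)
    return 100.0 * same / len(query)

print(show_alignment("ACGTAC", "AC-TAG"))
print(percent_identity("ACGT", "ACGA"))  # 75.0
```

The first call prints the query, a row of `|` marks under the matching columns, and the subject sequence on three lines.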