Title: Mutation Nomenclature Preworkshop Quiz
1 Mutation Nomenclature Pre-workshop Quiz
- Shuji Ogino, M.D., Ph.D.
- Brigham and Women's Hospital
- Dana-Farber Cancer Institute
- Harvard Medical School
- We will start at 7AM sharp (on Nov 9, 2007)!
2 Factor V Leiden (so-called 1691G>A or R506Q)
Basic Structure of Standard Nomenclature
F5 NM_000130.3 c.1601G>A (p.Arg534Gln)
- F5: HGNC official gene symbol
- NM_000130.3: reference sequence (GenBank accession No. and version No.)
- c.: coding DNA sequence
- 1601G>A: guanine-to-adenine at position 1601
- p.: protein
- Arg534Gln: Arg-to-Gln at codon 534
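The parts of the description on this slide can be pulled apart mechanically. The following is a minimal Python sketch, assuming the layout shown above (gene symbol, accession and version, a c. substitution, then an optional p. change in parentheses). The pattern and the function name `parse_substitution` are illustrative assumptions and handle simple substitutions only.

```python
import re

# Layout assumed from the slide: gene symbol, accession.version,
# a c. substitution, and an optional p. change in parentheses.
SUBSTITUTION = re.compile(
    r"^(?P<gene>[A-Z0-9]+)\s+"
    r"(?P<accession>[A-Z]{2}_\d+)\.(?P<version>\d+)\s+"
    r"c\.(?P<position>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])"
    r"(?:\s+\(p\.(?P<aa_ref>[A-Z][a-z]{2})(?P<codon>\d+)"
    r"(?P<aa_alt>[A-Z][a-z]{2})\))?$"
)


def parse_substitution(description):
    """Split a simple substitution description into named parts."""
    match = SUBSTITUTION.match(description.strip())
    if match is None:
        raise ValueError(f"not a simple substitution: {description!r}")
    parts = match.groupdict()
    # Convert the numeric fields; codon is None when no p. part is given.
    for key in ("version", "position", "codon"):
        if parts[key] is not None:
            parts[key] = int(parts[key])
    return parts


parts = parse_substitution("F5 NM_000130.3 c.1601G>A (p.Arg534Gln)")
print(parts["gene"], parts["accession"], parts["version"])
print(parts["position"], parts["ref"], parts["alt"])
print(parts["aa_ref"], parts["codon"], parts["aa_alt"])
```

Deletions, duplications, and intronic or UTR positions are deliberately out of scope for this pattern.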
3 Rule
- HUGO Gene Nomenclature Committee (http://www.gene.ucl.ac.uk/nomenclature/index.html)
- HGNC-approved official gene symbols should be used
- No Greek letters or Roman numerals (as in IGF-II and NF??)
- No h or m prefix (as in hMLH1 or mTOR)
- No hyphen (as in K-ras)
- All capital letters
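The style rules on this slide can be turned into a quick screening check. The sketch below is a hypothetical helper (`symbol_style_problems` is not part of any HGNC tool) that only mirrors the rules listed above. It does not detect Roman numerals, and an empty result does not prove that a symbol is HGNC-approved; that still needs a lookup in the HGNC database.

```python
import re


def symbol_style_problems(symbol):
    """Return the slide's style rules that a gene symbol appears to break."""
    problems = []
    if any(ord(ch) > 127 for ch in symbol):
        problems.append("non-ASCII character (for example a Greek letter)")
    if "-" in symbol:
        problems.append("hyphen")
    # A lowercase h or m directly followed by a capital looks like a species prefix.
    if re.match(r"[hm][A-Z]", symbol):
        problems.append("species prefix h or m")
    if any(ch.islower() for ch in symbol):
        problems.append("lowercase letter")
    return problems


for symbol in ("KRAS", "K-ras", "hMLH1", "mTOR", "IGF-II"):
    print(symbol, symbol_style_problems(symbol))
```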
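4 Coding DNA Ref Seq Numbering
Figure labels, 5' to 3': transcription start site, 5'-UTR, exon 1 (codon 1 ATG), intron 1, exon 2 (stop codon end), 3'-UTR, poly A addition site
- 5'-UTR: -30 to -1
- Exon 1, coding: 1 to 36
- Intron 1: 36+1 to 36+..., then 37-... to 37-1
- Exon 2, coding: 37 to 96
- 3'-UTR: *1 to *170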
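The numbering scheme on this slide can be sketched as a small function. It is a minimal illustration, assuming a spliced transcript (no introns) with known 1-based positions for the A of the ATG and for the last nucleotide of the stop codon. The function name `coding_label` and the example coordinates (a 30-nucleotide 5'-UTR, a coding sequence of 96 nucleotides, and a 3'-UTR of 170 nucleotides, as in the figure) are assumptions for illustration.

```python
def coding_label(transcript_pos, cds_start, cds_end):
    """Label a 1-based transcript position in coding DNA (c.) numbering.

    cds_start: transcript position of the A of the ATG (this is c.1).
    cds_end:   transcript position of the last nucleotide of the stop codon.
    """
    if transcript_pos < cds_start:
        # 5'-UTR: negative numbers, c.-1 is immediately before the ATG (no c.0).
        return f"c.{transcript_pos - cds_start}"
    if transcript_pos > cds_end:
        # 3'-UTR: asterisk numbers, c.*1 is immediately after the stop codon.
        return f"c.*{transcript_pos - cds_end}"
    return f"c.{transcript_pos - cds_start + 1}"


# Coordinates matching the figure: 30 nt of 5'-UTR, then 96 nt of coding sequence.
CDS_START, CDS_END = 31, 126
for pos in (1, 30, 31, 126, 127, 296):
    print(pos, coding_label(pos, CDS_START, CDS_END))
```

Intronic positions need the genomic structure as well and are not covered by this function.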
5 Question 1. Deletion
CFTR NM_000492.3
1501 accattaaag aaaatatcat ctttggtgtt tcctatgatg aatatagata cagaagcgtc
1561 atcaaagcat gccaactaga agaggacatc tccaagtttg cagagaaaga caatatagtt
Codon:      506 507 508 509 510
Wild-type:  Ile Ile Phe Gly Val
            ATC ATC TTT GGT GTT
Mutant:     ATC ATT GGT GTT
            Ile Ile Gly Val
6 Rules
- c. based on coding DNA ref seq
- g. based on genomic DNA ref seq
- A range of nucleotide numbers is indicated by _
- del for deletion
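As an illustration of these rules, the sketch below locates a single deletion by comparing a wild-type and a mutant sequence, then writes it with the `_` range and `del`. It assumes the supplied wild-type string starts at position 1 of the chosen reference. When several placements give the same mutant sequence, it reports the most 3' one, which is an HGVS convention not stated on this slide. The sequences and function names are made up for the example.

```python
def find_deletion(wild_type, mutant):
    """Return (first, last) 1-based deleted positions, or None if not a single deletion."""
    deleted = len(wild_type) - len(mutant)
    if deleted <= 0:
        return None
    # Longest shared prefix: this pushes the deletion as far 3' as possible.
    prefix = 0
    while prefix < len(mutant) and wild_type[prefix] == mutant[prefix]:
        prefix += 1
    # Everything after the deleted block must match the rest of the mutant.
    if wild_type[prefix + deleted:] != mutant[prefix:]:
        return None
    return prefix + 1, prefix + deleted


def describe_deletion(first, last, level="c"):
    """Format a deletion at the coding (c.) or genomic (g.) level."""
    if level not in ("c", "g"):
        raise ValueError("level must be 'c' or 'g'")
    if first == last:
        return f"{level}.{first}del"
    return f"{level}.{first}_{last}del"


# Hypothetical sequences: TA (positions 4 and 5) is missing in the mutant.
first, last = find_deletion("GATTACAGG", "GATCAGG")
print(describe_deletion(first, last))
```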
7 Question 2. Hemoglobin βS (sickle cell anemia)
(Glu6Val, E6V)
Coding DNA reference sequence NM_000518.4
A>T (codon 7 GAG>GTG)
Start
1 atggtgcatc tgactcctga ggagaagtct gccgttactg ccctgtgggg caaggtgaac
Codon:  1   2   3   4   5   6   7
        atg gtg cat ctg act cct gag
What is wrong with Glu6Val (in terms of the standard nomenclature)?
What is the official gene symbol for hemoglobin beta?
What is the nomenclature for this common variant?
8 Question 3. Mutation in 3'-UTR
Codon 177
               Ala Asn Leu Ala Ser STOP
Wild-type: CAA GCC AAT CTT GCT AGC TAG AGT TTT GGT TCT AGT AAG GTT CT
Mutant:    C
GenBank accession number is NM_00001, version 3. Gene symbol: AAA1.
What is the nomenclature for this mutation based on the coding DNA reference sequence?
9 Rules
- c. based on coding DNA ref seq
- * for 3'-UTR numbering
- c.*1 indicates the first nucleotide after the stop codon
- A range of nucleotide numbers is indicated by _
- delins for deletion-insertion
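A deletion-insertion in the 3'-UTR combines these rules. The sketch below is a minimal formatter, assuming the asterisk notation (c.*1 for the first nucleotide after the stop codon). Positions are passed as strings so that 3'-UTR labels such as `*40` can be used directly. The function name and the example positions are hypothetical and are not taken from the quiz.

```python
def describe_delins(first, last, inserted, level="c"):
    """Format a deletion-insertion; positions are strings such as '76' or '*40'."""
    if not inserted:
        raise ValueError("a delins needs at least one inserted nucleotide")
    # A single replaced position needs no range; otherwise join with '_'.
    span = first if first == last else f"{first}_{last}"
    return f"{level}.{span}delins{inserted}"


# Three 3'-UTR nucleotides replaced by a single C (hypothetical positions).
print(describe_delins("*40", "*42", "C"))
# One coding nucleotide replaced by two nucleotides (hypothetical position).
print(describe_delins("76", "76", "AG"))
```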
10 Question 4. Intronic duplication mutation
p. (protein) amino acid numbers:  61
                                  Ala Asn Leu Ala
                         CTTAATAG GCC AAT CTT GCT
c. (coding) nt numbers:           181
Insertion of A
GenBank accession number is NM_00001, version 3. Gene symbol: AAA1.
What is the nomenclature for this mutation based on the coding DNA reference sequence?
11 Rules
- c. based on coding DNA ref seq
- An intron nucleotide can be indicated as c.1209+1 or c.1210-3 (examples)
- dup for duplication
- The dup designation is preferred to ins because it is simpler and implies the mechanism (duplication)
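Intronic labels of the kind shown in these examples can be generated from the last coding nucleotide of the upstream exon. The sketch below assumes the HGVS convention that the first half of an intron is counted forward from the preceding exon (+) and the second half backward from the following exon (-), with the central nucleotide of an odd-length intron taking a + label. The intron length of 100 and the function names are hypothetical.

```python
def intron_position(upstream_exon_end, intron_length, index):
    """Label the index-th intron nucleotide (1 = first nucleotide of the intron)."""
    if not 1 <= index <= intron_length:
        raise ValueError("index is outside the intron")
    if index <= (intron_length + 1) // 2:
        # First half: count forward from the last nucleotide of the upstream exon.
        return f"c.{upstream_exon_end}+{index}"
    # Second half: count backward from the first nucleotide of the downstream exon.
    return f"c.{upstream_exon_end + 1}-{intron_length - index + 1}"


def describe_duplication(position_label):
    """Append dup to an already formatted position label."""
    return f"{position_label}dup"


# Hypothetical 100-nt intron between c.1209 and c.1210.
print(intron_position(1209, 100, 1))
print(intron_position(1209, 100, 98))
print(describe_duplication(intron_position(1209, 100, 98)))
```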
12 Q5. SNP in MGMT 5'-UTR (non-coding exon)
Figure labels: Exon 1; Intron 1; Exon 2; ATG codon 1; 69 kb; 40 bp; C>T change
Sequence shown: Caccgtttgcgacttg gtgagtgtctgggtcgcctcgctcccggaagagtg
cDNA reference sequence: MGMT NM_002412.2 (includes exon 1 and exon 2)
Ogino et al. Carcinogenesis 2007 shows this
common SNP is strongly linked to MGMT
methylation in colorectal cancer
13 Rules
- c. based on coding DNA ref seq
- A nucleotide in the 5'-UTR can be indicated as c.-1 or c.-23 (examples)
14 Question 6. First codon mutation
p. (protein) amino acid numbers
      1   2   3   4   5   6   7   8   9   10  11
      Met Ala Asn Leu Ala Ser Pro Arg Phe Gly Ser
  CAA ATG GCC AAT CTT GCT AGC CCT AGA TTT GGT TCT
      1   4   7   10  13  16  19  22  25  28  31
c. (coding) nt numbers
Mutation: T>G
GenBank accession number is NM_00001, version 3. Gene symbol: AAA1.
What is the nomenclature for this mutation based on the coding DNA reference sequence?
15 Question 7. Inversion
1 atgcagaggt cgcctctgga aaaggccagc gttgtctcca aacttttttt cagctggacc
61 agaccaattt tgaggaaagg atacagacag cgcctggaat tgtcagacat ataccaaatc
121 ccttctgttg attctgctga caatctatct gaaaaattgg aaagagaatg ggatagagag
181 ctggcttcaa agaaaaatcc taaactcatt aatgcccttc ggcgatgttt tttctggaga
241 tttatgttct atggaatctt tttatattta ggggaagtca ccaaagcagt acagcctctc
301 ttactgggaa gaatcatagc ttcctatgac ccggataaca aggaggaacg ctctatcgcg
361 atttatctag gcataggctt atgccttctc tttattgtca ggacactgct cctacaccca
Change to actgtt
Coding DNA reference sequence for the AAA1 gene: GenBank No. NM_00001, version 3
16 Question 8. Known polymorphisms in repeat sequence
Genomic reference sequence AJ574948.1 (CFTR intron 8, 5T/7T/9T)
Exon 9 starts here (No. 1210 in coding DNA ref seq NM_000492.3)
1 tataattatg tactataaag taataatgta tacagtgtaa tggatcatgg gccatgtgct
61 tttcaaacta attgtacata aaacaagcat ctattgaaaa tatctgacaa actcatcttt
121 tatttttgat gtgtgtgtgt gtgtgtgtgt gtttttttaa cagggatttg gggaattatt
181 tgagaaagca aaacaaaaca ataacaatag aaaaacttct aatggtgatg acagcctctt
241 cttcagtaat ttctcacttc ttggtactcc tgtcctgaaa gatattaatt tcaagataga
301 aagaggacag ttgttggcgg ttgctggatc cactggagca ggcaaggtag ttcttttgtt
361 cttcactatt aagaacttaa tttggtgtcc atgtctcttt ttttttctag tttgtagtgc
421 tggaaggtat ttttggagaa attcttacat gagcattagg agaatgt
What is the collective nomenclature for the repeat polymorphisms?
What is the nomenclature for the 5T variant?
What is the nomenclature for the 9T variant?
17 Rules for repeat polymorphism
- Use the number of the first nucleotide
- Followed by the repeated unit (e.g., T or CA)
- Describe the number of repeats in [ ]
- [8]: a particular polymorphism (8 repeats)
- (5_10): uncertain, between 5 and 10 repeats
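These rules can be sketched in code: find the first nucleotide of the repeat tract, then append the repeated unit and the repeat count in square brackets, or a range in parentheses. This is a minimal illustration with a made-up sequence. The helper names are hypothetical, and positions are counted from the first nucleotide of the supplied string.

```python
import re


def find_longest_repeat(sequence, unit):
    """Return (first_position, repeat_count) for the longest run of unit, or None."""
    runs = list(re.finditer(f"(?:{re.escape(unit)})+", sequence))
    if not runs:
        return None
    longest = max(runs, key=lambda run: len(run.group()))
    return longest.start() + 1, len(longest.group()) // len(unit)


def describe_repeat(first_nt, unit, count, level="g"):
    """A specific allele: first nucleotide, repeated unit, count in brackets."""
    return f"{level}.{first_nt}{unit}[{count}]"


def describe_repeat_range(first_nt, unit, low, high, level="g"):
    """An uncertain or collective description: count range in parentheses."""
    return f"{level}.{first_nt}{unit}({low}_{high})"


# Hypothetical sequence with four CA repeats starting at position 5.
first_nt, count = find_longest_repeat("ACGTCACACACAGGT", "CA")
print(describe_repeat(first_nt, "CA", count))
print(describe_repeat_range(first_nt, "CA", 5, 10))
```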
18 Question 9. Known dinucleotide repeat polymorphisms
Genomic DNA reference sequence AJ00001.1
1 tataattatg tactataaag taataatgta tacagtgtaa tggatcatgg gccatgtgct
61 tttcaaacta attgtacata aaacaagcat ctattgaaaa tatctgacaa actcatcttt
121 tatttttgat gtgtgtgtgt gtgtgtgtgt gtttttttaa cagggatttg gggaattatt
181 tgagaaagca aaacaaaaca ataacaatag aaaaacttct aatggtgatg acagcctctt
241 cttcagtaat ttctcacttc ttggtactcc tgtcctgaaa gatattaatt tcaagataga
301 aagaggacag ttgttggcgg ttgctggatc cactggagca ggcaaggtag ttcttttgtt
361 cttcactatt aagaacttaa tttggtgtcc atgtctcttt ttttttctag tttgtagtgc
421 tggaaggtat ttttggagaa attcttacat gagcattagg agaatgt
This TG repeat polymorphism ranges from 8 to 12 repeats in the general population.
What is the collective nomenclature for the repeat polymorphisms?
What is the nomenclature for the 8-repeat allele?
What is the nomenclature for the 11-repeat allele?
19 Question 10. Prothrombin G20210A
DNA reference sequence AF478696.1
Stop
3'-UTR
21421 attgatcagt ttggagagta gggggccact catattctgg gctcctggaa ccaatcccgt
21481 gaaagaatta tttttgtgtt tctaaaacta tggttcccaa taaaagtgac tctcagcgag
G>A
What is wrong with G20210A (for the nucleotide change)?
What is the official gene symbol for the prothrombin gene?
What is the nomenclature for this common variant?
20 Q11. Deletion of non-coding exon
Stop codon ends at c.885 (NM_00001.1); gene symbol AAA1
Exon 7 ends at c.*4
Exon 8: c.*5 to c.*573, flanked by deletion breakpoints marked ? and ?
Exon 8 deletion, but the exact positions of the start and end of the deletion are unknown
What is the nomenclature for this exon 8 deletion?
21 Rules
- For uncertain positions, use ?, but describe them as specifically as possible
- This mutation starts at c.*5-? and ends at c.*573+?
- Exon/intron numbers should not be used
- Exon/intron numbering is neither uniform nor permanent
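The uncertain-position rule can be sketched as a formatter for a whole-exon deletion whose breakpoints lie somewhere in the flanking introns. The function name and the example exon coordinates below are hypothetical. Positions are passed as strings so that 3'-UTR labels with an asterisk can be used as well.

```python
def describe_exon_deletion_unknown_breakpoints(exon_first, exon_last, level="c"):
    """Format a deletion of a whole exon with unknown intronic breakpoints.

    exon_first and exon_last are the first and last nucleotide labels of the
    deleted exon, given as strings (for example '301' or '*5').
    """
    # '-?' marks an unknown start upstream of the exon,
    # '+?' an unknown end downstream of it.
    return f"{level}.{exon_first}-?_{exon_last}+?del"


# Hypothetical exon spanning c.301 to c.450.
print(describe_exon_deletion_unknown_breakpoints("301", "450"))
```

Describing the deletion by nucleotide positions rather than by exon number keeps the name stable even if exon numbering changes.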
22 Take home messages
- All DNA changes should be described at the DNA level (with amino acid changes if known)
- Use HGNC-approved official gene symbols
- Standard names can be accompanied by (colloquial names)