The PGA: PhysGen Bioinformatics Component

Slides: 31
Provided by: brf7
Transcript and Presenter's Notes

1
The PGA: PhysGen Bioinformatics Component
What are the genetic components of given phenotypic traits?
Michael A. Thomas, Ph.D.
Peter J. Tonellato, Ph.D.
Bioinformatics Research Center
Medical College of Wisconsin
2
Main components of the PhysGen Program: Linking
physiology to the genome
  • Genomics: Create biologically interesting and
    genomically engineered strains
  • Phenotypes: Collect extensive physiological
    measurements on all strains
  • Research Service: Biotechnology development in
    support of genotyping service.
  • Bioinformatics: Quality control, data
    processing, data management and release, analysis
    tools.

3
Bioinformatics Component Overview
  • Data processing and QC
  • Data warehousing
  • Online access
  • Analytical tool development

4
Bioinformatics: Database organization
  • Data from lab (quarterly release)
  • Eurus: Public
  • Tarzan: Tool & data testing
  • Dolphin: Tool development
  • New tools
  • Mirror sites
  • Data frozen
5
Bioinformatics: 1. Data processing & QC

Data Checking
1. Check phenotype name
2. Check for duplicates
3. Check data formatting: Date, Case, Number of digits
4. Check data domain:
  • Atmosphere Condition: HPX / NMX
  • Diet Condition: HS / LS
  • Protocol name: 7 protocols
  • Gender: Female / Male
5. Check for outliers. Criteria: Mean ± 3 STDV
[Flowchart labels: LAB database; Y / N]
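The checks listed above could be sketched roughly as follows. This is only an illustration: the field names (`atmosphere`, `diet`, `gender`, `animal_id`, `phenotype`) and the record layout are assumptions, not the PhysGen schema. Only the allowed values and the mean ± 3 standard deviation criterion come from the slide.

```python
from statistics import mean, stdev

# Allowed values are taken from the slide; the field names are hypothetical.
DOMAINS = {
    "atmosphere": {"HPX", "NMX"},
    "diet": {"HS", "LS"},
    "gender": {"Female", "Male"},
}


def check_domains(record):
    """Return the names of fields whose value is outside its allowed domain."""
    return [field for field, allowed in DOMAINS.items()
            if record.get(field) not in allowed]


def find_duplicates(records, key_fields=("animal_id", "phenotype")):
    """Return keys seen more than once (the key fields are an assumption)."""
    seen, dupes = set(), []
    for rec in records:
        key = tuple(rec.get(f) for f in key_fields)
        if key in seen:
            dupes.append(key)
        seen.add(key)
    return dupes


def find_outliers(values, n_sd=3.0):
    """Return values lying outside mean +/- n_sd sample standard deviations."""
    m, sd = mean(values), stdev(values)
    return [v for v in values if abs(v - m) > n_sd * sd]
```

Note that with very small groups no value can lie more than 3 standard deviations from the mean, so the outlier check only becomes informative once a group has a reasonable number of measurements.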
6
Bioinformatics: 2. Data release
  • Quarterly
  • In conjunction with major web site improvements
    and the release of analytical tools

7
Bioinformatics: 3. Online access and analysis
http://pga.mcw.edu
8
Bioinformatics: 4. Analytical tool development
  • Q: Are there differences between means for a
    given phenotype among various rat treatments?
  • Differences among inbred strains can be
    attributed to any number of genetic differences
  • Differences among consomic or congenic strains
    can be attributed to the inserted chromosome (or
    chromosomal region)

9
The PhysGen data - renal
10
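Bioinformatics: 4. Analytical tool development
\[
\begin{array}{cccc}
y_{11} & y_{12} & \cdots & y_{1g} \\
y_{21} & y_{22} & \cdots & y_{2g} \\
y_{31} & y_{32} & \cdots & y_{3g} \\
\vdots & \vdots &        & \vdots \\
s_1    & s_2    & \cdots & s_g
\end{array}
\]
$H_0: \mu_1 = \mu_2 = \mu_3$ (No differences between strains)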
11
How do we find the answers? By Analysis of
Variance (ANOVA)
  • Test if all the means are equal using ANOVA
  • If not, test by pairs (t-test or ANOVA)
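As a rough sketch of the "test by pairs" step, the snippet below computes a pooled-variance two-sample t statistic for every pair of groups. It assumes equal variances, and for two groups the square of this t equals the one-way ANOVA F. The group names and layout are placeholders, and the statistic still has to be compared against a t-table critical value (or converted to a p-value) by the reader.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample t statistic with a pooled variance (equal variances assumed)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))


def pairwise_t(groups):
    """Map each pair of group names to (t statistic, degrees of freedom)."""
    return {
        (x, y): (pooled_t(groups[x], groups[y]),
                 len(groups[x]) + len(groups[y]) - 2)
        for x, y in combinations(groups, 2)
    }


# Made-up values for three hypothetical groups, for illustration only.
example = {"A": [1, 2, 3], "B": [2, 3, 4], "C": [5, 6, 7]}
results = pairwise_t(example)
```

When many pairs are tested, some adjustment for multiple comparisons is usually advisable; that step is not shown here.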

12
If data are normal: Conventional ANOVA
  • Test of $H_0: \mu_1 = \mu_2 = \dots = \mu_g$
  • Construct a statistic $F$
  • Find the distribution of the statistic under $H_0$:
    the $F$-distribution
  • Compare the calculated $F$ with the critical value,
    $F_\alpha$.
  • If $F > F_\alpha$, then $H_0$ is rejected

Variances among and within groups are compared
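The comparison of among-group and within-group variance can be illustrated with a minimal one-way ANOVA F statistic. This is a sketch of the textbook calculation, not the PhysGen tool itself. The function name and the list-of-lists input are assumptions, and the critical value $F_\alpha$ must still be looked up in an F table for the returned degrees of freedom.

```python
from statistics import mean


def anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    g = len(groups)
    n_total = sum(len(grp) for grp in groups)
    grand_mean = mean([v for grp in groups for v in grp])
    # Among-group sum of squares: group means around the grand mean.
    ss_between = sum(len(grp) * (mean(grp) - grand_mean) ** 2 for grp in groups)
    # Within-group sum of squares: values around their own group mean.
    ss_within = sum((v - mean(grp)) ** 2 for grp in groups for v in grp)
    df_between, df_within = g - 1, n_total - g
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up data: three groups of three values each.
f_stat, df1, df2 = anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

For these made-up numbers the statistic is 13 on (2, 6) degrees of freedom, which exceeds the tabulated 5% critical value of about 5.14, so $H_0$ would be rejected.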
13
Test of equal variance: Levene's Test
  • $H_0: \sigma_1^2 = \sigma_2^2 = \dots = \sigma_g^2$
  • (The dataset is homoscedastic)
  • Calculate $W$
  • Compare with F table critical value
  • If $W > F_\alpha$, we reject $H_0$

\[
W = \frac{(N - g) \sum_{j=1}^{g} N_j \, (Z_{.j} - Z_{..})^2}
         {(g - 1) \sum_{j=1}^{g} \sum_{i=1}^{N_j} (Z_{ij} - Z_{.j})^2}
\]
  • If the data pass Levene's Test, a conventional
    ANOVA will be undertaken.
  • If the data fail Levene's Test, the
    non-parametric ANOVA will be suggested

14
What if equal variance does not hold?
  • Solution 1: Non-parametric ANOVA (not sensitive
    to unequal variances)
  • Solution 2: Dynamic ANOVA (requires normality)

15
ANOVA
  • Conventional ANOVA: Powerful. Requires normality
    and equal variances.
  • Non-parametric ANOVA: Less powerful. Normality
    not required. Much less sensitive to unequal
    variances.
  • Dynamic ANOVA: Requires normality but not
    equal variances. Requires much more computation
    time.

16
If data are not normal: Non-Parametric
ANOVA (Kruskal-Wallis Test)
  • Test for equality of group means without
    assuming normality
  • Create ranks for each value in the set
  • Calculate the $H$ statistic and compare with the
    $H$-distribution table (asymptotic to the $\chi^2$
    distribution)
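The ranking and $H$ calculation can be sketched as below, using the standard form $H = \frac{12}{N(N+1)} \sum_j \frac{R_j^2}{n_j} - 3(N+1)$, where $R_j$ is the rank sum of group $j$. Tied values receive average ranks, but the usual tie correction factor is left out to keep the sketch short. The function names are illustrative.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction applied)."""
    pooled = [v for grp in groups for v in grp]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for grp in groups:
        rank_sum = sum(ranks[start:start + len(grp)])
        total += rank_sum ** 2 / len(grp)
        start += len(grp)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


# Made-up data: three fully separated groups.
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

For larger groups, $H$ is compared with a $\chi^2$ critical value on $g - 1$ degrees of freedom. For very small groups an exact table is the safer reference.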

17
Current implementation
Data -> Levene's Test -> Equal $\sigma^2$?
  • Y: Conventional ANOVA
  • N: Non-parametric ANOVA
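The decision flow can be written as a small dispatch function. This is a hypothetical sketch: it assumes Levene's $W$ and the F-table critical value have already been obtained, and simply returns the name of the test to run.

```python
def choose_anova(levene_w, f_critical):
    """Pick the test to run from Levene's W and an F-table critical value."""
    # H0 (equal variances) is rejected only when W exceeds the critical value.
    equal_variances = levene_w <= f_critical
    return "conventional ANOVA" if equal_variances else "non-parametric ANOVA"


# Illustrative numbers only: W below the critical value passes Levene's test.
choice = choose_anova(levene_w=0.8, f_critical=7.71)
```

Keeping the decision separate from the statistics makes it easy to add a third branch later, for example for the dynamic ANOVA mentioned earlier.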
18
Q: Are there differences between means for a
given phenotype among various rat treatments?
  • The user can answer this question for the
    particular protocol and phenotypes of interest
  • The built-in tools from the PhysGen web site help
    the user analyze the data with the appropriate
    statistical tools

19
The user selects the data by phenotype between
different rat treatment categories
Categories:
  • Strain: BN, SS, FHH, ..., SS-BN-16
  • Atmosphere Condition: Hypoxia, Normoxia
  • Gender: Male, Female
  • Diet Condition: High salt, Low salt
20
The user selects the protocol
21
The user selects the data
22
Understanding PGA Data
Independent Variables
Phenotype
23
Understanding PGA Data
Group Means
Values in the group (405.2, 399.9, 402.6, ...) are
used to determine the group mean (401.0)
Q: Are there differences between means for a
given phenotype among various rat treatments?
24
I. Levene's test: Passed. Conventional ANOVA
A statistically significant difference is
observed among the groups
25
I. Continued. Comparison between pairs
No statistically significant difference is
observed between the two
26
II. Levene's test: Failed. Non-parametric ANOVA
A statistically significant difference is
observed.
27
III. Levene's test: Passed
No statistically significant difference is
observed among the groups.
28
IV. Levene's test: Passed
A significant difference is observed among the
groups.
29
Next steps
Currently, we ask:
Q: Are there differences between means for a given
phenotype among various rat treatments?
A: (via conventional or non-parametric ANOVA)
This will be improved by providing more accurate and
meaningful pair-wise comparisons using user-specified
reference strain(s).
Soon, we'll ask:
Q: For a given pair (or group) of rat treatments,
which phenotypes best explain the differences?
A: via a fuzzy/neural net approach
30
Potential projects for bioinformatics students
  • Combining physiological data with microarray data
    produced from the same experiments
  • New analytical tools and/or approaches
  • New ways to manage and present the data