Title: Quality Control of Microarrays and the Importance of Human Computer Interaction
1Quality Control of Microarrays and the Importance
of Human Computer Interaction
- Angela Burr
- Bioinformatics Capstone Project
- July 2004
2Capstone Project
- Using computing solutions to improve the
processing of data and analysis of results within
the Drosophila Genomics Resource Center (DGRC)
project of the Center for Genomics and
Bioinformatics.
- Quality control issues will be addressed
specifically.
- Focused on the aspects of human computer
interaction between the biologists and the suite
of computer programs I created to facilitate
their research.
3Background Information: Microarrays
- Microarrays are biological experiments that
measure gene expression in an organism.
- The DGRC produces Drosophila microarrays and
completes hybridization experiments.
http://www.bioteach.ubc.ca/MolecularBiology/microarray/
4Importance of Human Computer Interaction Component
- The large volume of data and the variety of data
sources within microarray experiments make
computational approaches essential.
- Continuous interaction develops between
biological experiments and computational
components.
- The computer experience of the biologists is
also a consideration.
5Microarray Experiment
Microarray Production
Annotation
DNA QC
Hybridization QC
Computational Scripts
6Objective of Quality Control in Microarray
Experiment
- Because of the expense of a microarray
experiment, quality control (QC) is very
important to ensure the integrity of the samples
and the experiment.
- DNA QC
  - Experimental Errors
  - Version of genome
- Hybridization QC
  - Experimental Errors
8Annotation Processing
- The primers for the DGRC project were developed
by Incyte Genomics and used in previous
microarray experiments. An annotation is
available for these on NCBI's GEO (accession
GPL20).
- The annotation contains information such as the
gene names, identifiers in biological databases,
genome location, and size of the DNA products.
- A script was used to convert the annotation into
the appropriate format for the layout of the DGRC
arrays.
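The conversion step can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the script used in the project: the column names (`primer_id`, `gene_name`, `product_size`), the layout fields (`block`, `row`, `col`), and the sample values are all hypothetical.

```python
import csv
import io


def annotate_layout(annotation_rows, layout_rows, id_field="primer_id"):
    """Attach annotation fields to each array spot, keeping the layout order."""
    # Index the annotation by primer identifier for direct lookup.
    by_id = {row[id_field]: row for row in annotation_rows}
    merged = []
    for spot in layout_rows:
        # Spots with no annotation entry (e.g. blanks) keep only layout fields.
        extra = by_id.get(spot[id_field], {})
        merged.append({**extra, **spot})
    return merged


# Hypothetical inputs, inlined as tab-delimited text for the example.
annotation_txt = (
    "primer_id\tgene_name\tproduct_size\n"
    "P001\tgeneA\t450\n"
    "P002\tgeneB\t620\n"
)
layout_txt = (
    "block\trow\tcol\tprimer_id\n"
    "1\t1\t1\tP002\n"
    "1\t1\t2\tP001\n"
    "1\t1\t3\tEMPTY\n"
)

annotation = list(csv.DictReader(io.StringIO(annotation_txt), delimiter="\t"))
layout = list(csv.DictReader(io.StringIO(layout_txt), delimiter="\t"))
array_annotation = annotate_layout(annotation, layout)
```

Keeping the layout as the driving table means the output follows the physical order of spots on the array, and unannotated spots remain visible instead of being dropped.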
10Processing of DNA Quality Control
- The image analysis data is in a convoluted order.
- The image analysis data is combined and
reorganized into a single file by Excel macros.
Samples within the file are ordered by sample
identification.
- Another macro uses the output of the previous
macros and the annotation output to identify
(flag) unreliable samples: no PCR product,
multiple PCR products, low mass of PCR product,
or incorrect molecular weight (size) of PCR
product.
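The four flagging criteria above can be sketched in Python. This is an illustration only, since the project used Excel macros: the function name, the flag labels, and both thresholds (`min_mass`, `size_tolerance`) are assumptions, not values from the DGRC protocol.

```python
def flag_sample(bands, expected_size, min_mass=10.0, size_tolerance=0.2):
    """Return a list of QC flags for one PCR sample.

    bands: list of (size_bp, mass_ng) tuples detected by image analysis.
    expected_size: product size in bp taken from the annotation.
    min_mass, size_tolerance: hypothetical thresholds for this sketch.
    """
    if not bands:
        return ["no_product"]
    flags = []
    if len(bands) > 1:
        flags.append("multiple_products")
    # Judge mass and size on the strongest band.
    size, mass = max(bands, key=lambda band: band[1])
    if mass < min_mass:
        flags.append("low_mass")
    if abs(size - expected_size) > size_tolerance * expected_size:
        flags.append("wrong_size")
    return flags


print(flag_sample([], 500))                          # ['no_product']
print(flag_sample([(510, 40.0)], 500))               # []
print(flag_sample([(500, 5.0), (900, 3.0)], 500))    # ['multiple_products', 'low_mass']
print(flag_sample([(800, 40.0)], 500))               # ['wrong_size']
```

A sample can carry several flags at once, which lets a reviewer see every reason it was judged unreliable.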
11DNA Quality Control Interfaces
12DNA Quality Control Interfaces
13DNA QC by Sample
14DNA QC by Plate
16Processing Hybridization Quality Control
- Documentation for easier use of Bioconductor
- Scripts for easier importation
  - Convert the necessary files into the correct
    format for Bioconductor
    - Microarray layout file
    - Microarray results file
  - Add information about the control type and
    microtiter plate number for each sample
    - Based on user-defined controls, quality
      control information, or the annotation file
- Script to create a file containing various
exploratory plots for a hybridization data set
for evaluation purposes
  - For use before and after normalization
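The step that adds control type and microtiter plate number to each sample might look like the sketch below. It is written in Python for illustration and is not the project's Bioconductor import script; the field names, the default label `"probe"`, and the example identifiers are hypothetical.

```python
def add_control_info(spots, controls, plate_by_sample):
    """Label each spot with a control type and a microtiter plate number.

    spots: list of dicts, each with a 'sample_id' key.
    controls: user-defined mapping of sample_id to control type.
    plate_by_sample: mapping of sample_id to plate number, e.g. taken
        from the quality control output or the annotation file.
    """
    labelled = []
    for spot in spots:
        sample_id = spot["sample_id"]
        info = dict(spot)  # copy so the input rows are left unchanged
        # Samples not listed as controls get a default label.
        info["control_type"] = controls.get(sample_id, "probe")
        # Plate number stays None when the sample is not in the lookup.
        info["plate"] = plate_by_sample.get(sample_id)
        labelled.append(info)
    return labelled


spots = [{"sample_id": "P001"}, {"sample_id": "BLANK1"}, {"sample_id": "P999"}]
controls = {"BLANK1": "negative"}
plates = {"P001": 3, "BLANK1": 3}
labelled_spots = add_control_info(spots, controls, plates)
```

Leaving an unknown plate as `None` rather than guessing keeps missing information visible when the data are later inspected.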
17Hybridization QC
18Usability Test
- An HCI (Human Computer Interaction) usability
test was conducted to evaluate the biologists'
interaction with some of the programs, and it led
to helpful suggestions. It is vital that
biologists can easily use the programs.
- The test identified problems within the system,
and recommendations were made based on it.
- The programs were modified according to the
recommendations from the HCI usability test.
19Usability Test Problems and Solutions
- Main Identified Problems
  - Help Files
    - Problem: Users were hesitant about using the
      help file, and the help file was lengthy.
    - Resolution: Reliance on the help file was
      reduced by adding more information to the
      interface. Context-sensitive help would be
      ideal, but may be difficult to implement. I am
      currently working on this problem.
  - Naming conventions example
    - Problem: The names of the three programs were
      confusing, including with respect to the order
      in which they must be run. Users did not seem
      to know which file to select once they got to
      the menu.
    - Resolution: The files were renamed, with
      numbering added to better distinguish the
      programs.
20 Usability Test Problems and Solutions cont.
- Main Identified Problems cont.
  - Consistency throughout the applications example
    - Problem: The plate number was in a different
      format in two of the programs.
    - Resolution: The plate number was changed to
      the same numbering system.
  - Programming error
    - Problem: A program crashed if a user browsed
      for a folder and then decided to cancel the
      operation.
    - Resolution: The error was fixed.
  - Explanation of output
    - Problem: Users were unsure of what some of the
      fields in the resulting Excel file (the final
      output) represented.
    - Resolution: A READ-ME worksheet was added to
      explain the output, specifically a key for the
      fields of the worksheets.
21Usability Test Conclusions
- In a post-test questionnaire, the evaluators were
asked to rate their interaction with the
application.
- This included questions on the usefulness, ease
of use, and willingness to use the application.
- The application received excellent reviews from
the users, which suggests that they were
comfortable with the system and found it
user-friendly.
22Conclusions
- My capstone project helped to complete DNA and
hybridization quality control for the DGRC
microarray experiments.
- Within microarray experiments, data was processed
more efficiently with the help of computational
tools.
- The interaction between the biologist and the
computational tool was key to the success of the
tool.
- Developing thorough, useful documentation is
also an important aspect of the interaction
between the biologist and computational tools.
23Future Work
- The DNA Quality Control methods offered a
temporary solution. LIMS will offer a better,
more reliable solution.
- Extension of the project to include a complete
solution for microarray data analysis. This is an
immediate need of the DGRC, which is generating
many experiments but has yet to define a good
method for analysis.
- A systematic flow of information from the raw
microarray data to the completion of data
analysis would be ideal.
24References
- Bioconductor, www.bioconductor.org
- Center for Genomics and Bioinformatics, Indiana
University, unpublished data, 2003-2004.
- DGRC, http://dgrc.cgb.indiana.edu
- Gene Expression Omnibus,
http://www.ncbi.nlm.nih.gov/geo, GEO accession
GPL20
- Recommendations for the Microarray Quality
Control System, Alla Genkina and Stacey Sutton,
2004.
- The R Project for Statistical Computing,
www.r-project.org
25Acknowledgements
- CGB genomics lab (Justen Andrews)
- Sun Kim, Bioinformatics advisor
- HCI involvement: Youn Lim, Alla Genkina, and
Stacey Sutton
- HCI usability participants: Kevin Bogart,
Elizabeth Bohuski, Karmon Jones, Jacqeline Lopez,
and Takuma Tsukahara.