What is a LIMS

Slides: 56

Provided by: CurtisJ

Transcript and Presenter's Notes

Title: What is a LIMS

1
What is a LIMS ?

LIMS: Laboratory Information Management System
Computerized system that tracks and manages
samples through a protocol
interfaces for both laboratory personnel and
instruments
helps support high throughput operations

2
Types of LIMS

Enterprise
cover all aspects of scientific research
data capture
reagent use and purchasing tracking
Protocol-specific
cover a specific protocol
data capture

3
sample management
inventory management
data collection
instrument management
data warehouse
chain of custody
resource management
data analysis
5
Microarrays

large-scale sequencing projects like the human
genome project have given us the ability to
examine the complete transcriptome (the
transcriptional response to an environmental
challenge)
new (and expensive) technology
large output of data

6
Microarray Data

produced in a tabular format (rows and columns)
users are relatively unsophisticated in
computational and informatic skills
much data ends up in spreadsheets which lack the
capability to handle rich datasets (no complex
query or visualization capabilities)

7
Microarray Databases

plethora of databases and schemas
three types of interactions
local data management
publication of data in a repository
analysis of repository data
the latter two interactions require a certain
level of sophistication to consolidate exogenous
data

8
Microarrays Concept
9
Microarrays Raw Data
10
Microarrays Data
1 AC3.5 Member of the aminopeptidase protein family 5 10337580 AC3.5 20834 2/25/00 0 849 196 6 53 650 144 506 438 97 341 199 155 161 924 1.290513 0.774885 1.913864 0.522503 0.734 0.688 0.870632 0.71 0.71 1787 51 1802 66 1 1 1 1 A 1 0 U
2 AC3.7 Member of the UDP-glucuronosyltransferase protein family 5 10344769 AC3.7 20835 2/25/00 4 23 4 186 48 188 154 34 127 104 23 187 163 79 594 1.411764 0.708333 2.093682 0.477628 1.2 0.116 0.219089 0.32 0.21 1798 953 1809 964 1 1 1 1 A 2 2 U
3 AC3.8 Member of the UDP-glucuronosyltransferase protein family 5 10347864 AC3.8 20836 2/25/00 0 3 63 198 165 348 155 193 235 105 130 254 221 121 593 0.854922 1.169696 1.267871 0.788724 1.241 1.046 0.858487 0.25 0.29 1788 71 1801 84 1 1 1 1 A 3 0 U
11
Local Databases

make data available to local researchers
may have WWW-based tools
database and compute server centralized and
closely linked

12
GeneX

National Center for Genome Resources
www.ncgr.org/research/genex
relational database with Perl, R, and Java
components

13
GeneX Features

Free
integrated and extensible toolset
multiple types of array technology in single
database
experiment-centric design
supports an XML specification to allow
interchange between databases

14
BASE

BioArray Software Environment
http://base.thep.lu.se/
Relational database (MySQL) with WWW interface
built upon C/javascript/PHP

15
BASE Features

Free
MIAME compliant
user administration
array production
sample management

16
(No Transcript)
17
Repositories

provide public access to multiple datasets
create a standard database similar to the sequence databases
automatic deposition of data upon publication

18
Stanford Microarray Database

genome-www4.stanford.edu/MicroArray
www-based database and a dataset distribution
system
relational database
perl/java toolset
supports some complex querying as well as
browsing for datasets
datasets distributed as compressed flat-files
and/or graphical images

19
GEO

Gene Expression Omnibus
www.ncbi.nlm.nih.gov/geo/
data repository and distribution system
precomputed definitions and descriptions of data
to aid in data set retrieval

20
Data Interchange

Proposed interchange standard
MIAME
Proposed OMG exchange standards
MAML
GEML
NetGenics

21
MIAME

Minimal Information About a Microarray Experiment
www.mged.org/Annotations-wg/
Goal
specify the minimum amount of information needed
to ensure interpretability
facilitate creation of repositories
encourage journals and funding agencies to
require submission of data to repositories

22
Design Considerations

reflect data accurately
efficient access to data
efficient storage of data
compatibility with other databases

23
Data Representation
[Diagram labels: External Sequence Databases; GIPO; spots; Conditions; ????; Experiment; Sample; Tissue; Species; Protocol]
24
MIAME Considerations

Experimental design: the set of hybridization
experiments as a whole
Array design: each array used and each element
(spot) on the array
Samples: samples used, extract preparation and
labeling
Hybridizations: procedures and parameters
Measurements: images, quantitation,
specifications
Normalization controls: types, values,
specifications
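
The six MIAME areas listed above can be treated as a completeness checklist. The following is a minimal sketch of that idea only: the key names are hypothetical and are not part of the MIAME specification or any exchange format.

```python
# Hypothetical keys, one per MIAME area named on the slide.
MIAME_SECTIONS = (
    "experimental_design",
    "array_design",
    "samples",
    "hybridizations",
    "measurements",
    "normalization_controls",
)

def missing_miame_sections(annotation):
    """Return the MIAME areas that are absent or empty in an annotation dict."""
    return [section for section in MIAME_SECTIONS if not annotation.get(section)]

# Illustrative annotation with made-up content.
annotation = {
    "experimental_design": "disease vs. normal, 12 hybridizations",
    "array_design": "in-house cDNA slide",
    "samples": "",  # present but empty, so it counts as missing
    "hybridizations": "protocol reference",
}
missing = missing_miame_sections(annotation)
```

A submission tool could refuse to export an experiment until `missing` is empty.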

25
(No Transcript)
26
Background

Center for Biomedical Genomics and Informatics
Engaged in a number of gene expression studies
ranging from liver disease to osteoarthritis and
cancer
Species studied: human and rat
cDNA slides printed in house (5K human chip, 40K
human chip)

27
GMU Clinical Genomics

studying the relationship between disease and
genome expression
clinical measurements
standard battery of tests
genomic measurements
gene expression levels
genetic variation
derive correlation between clinical/genomic
factors and treatment outcome

28
Gene Expression Queries
Patient Demographic Queries
Microarray Data
Clinical Data
Clinical Database
Expression Database
29
Dataflow
Clinical Tests and Samples
Clinical Database
Analysis (Genespring, etc.)
RNA Extraction Protocols
LIMS
WWW Access (GENet)
Researchers
Microarray Experiment Protocols
BASE
30
Generic difference in gene expression patterns

We do this via visual inspection following
clustering (genes and samples)
Often we will reduce the number of genes by some
criterion (e.g., cluster only on genes that show at
least a 2-fold change in at least one sample/category)
Often we will group the samples by
condition in order to compensate for the lack of
replicates
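
The gene-reduction criterion can be sketched as a simple filter. This is an illustration under assumptions, not the presenters' actual pipeline: it assumes expression values are stored as log2 ratios against a reference, and the function name and example values are made up.

```python
import math

def filter_genes_by_fold_change(log2_ratios, min_fold=2.0):
    """Keep genes that change at least `min_fold`-fold (up or down)
    in at least one sample.

    log2_ratios: dict mapping gene name -> list of log2(sample / reference).
    """
    threshold = math.log2(min_fold)  # a 2-fold change means |log2 ratio| >= 1
    return {
        gene: values
        for gene, values in log2_ratios.items()
        if any(abs(v) >= threshold for v in values)
    }

# Hypothetical values, for illustration only.
example = {
    "geneA": [0.1, -0.2, 1.3],   # more than 2-fold up in the third sample
    "geneB": [0.4, 0.5, -0.6],   # never reaches 2-fold
    "geneC": [-1.0, 0.0, 0.2],   # exactly 2-fold down in the first sample
}
kept = filter_genes_by_fold_change(example)
```

Only the genes in `kept` would then be passed on to clustering.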

31
Clustering of genes and samples
32
Disease vs. Normal
33
Clinical Data Challenges

Collection
text formats
dispersed sources
Storage
heterogeneous
incomplete
degenerate
Protection
HIPAA regulations

34
Large Clinical Databases

Nadkarni and Brandt (1998) JAMIA 5, 511
Issues involved in data mining EAV databases
Nadkarni et al. (1999) JAMIA 6, 478
Extension of EAV with classes and relationships
Chen et al. (2000) JAMIA 7, 475
Performance of EAV/CR

35
Issues with Clinical Data

Too many columns
Over 43,000 attributes
Sybase capacity
1024 columns per table
32 indexed
up to 50 tables per query
Sparse data
Multiple entries

36
Sample Clinical Table
37
Solution EAV

Entity-Attribute-Value
form of row modeling
turns columns into rows
eliminates sparse data
reduction in database size
Faster single value queries
Pushes depth rather than width
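
"Turns columns into rows" can be shown with a small conversion function. This is a sketch with hypothetical attribute names: each non-empty column of a wide record becomes one (entity, attribute, value) triple, and empty columns produce no row at all, which is how the sparse data disappears.

```python
def to_eav_rows(entity_id, wide_row):
    """Turn one wide record (column -> value) into EAV triples,
    skipping columns that hold no value."""
    return [
        (entity_id, attribute, value)
        for attribute, value in wide_row.items()
        if value is not None
    ]

# A sparse wide record: only two of the four tests were performed.
wide = {"BMI": 27.4, "glucose": None, "ALT": 41, "albumin": None}
rows = to_eav_rows(1017, wide)
```

With tens of thousands of possible attributes and only a handful recorded per visit, the table grows in depth (more rows) instead of width (more columns).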

38
EAV Clinical Table
39
Accessing Single Attributes
Traditional
SELECT patient, date, BMI FROM relTable WHERE
patient = 1017 AND BMI IS NOT NULL
EAV
SELECT patient, date, value FROM EAVTable WHERE
patient = 1017 AND test = 'BMI'
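
The two queries can be tried side by side with Python's built-in `sqlite3` module. This is a sketch on made-up data: the table and column names follow the slide, while the extra `ALT` column, the column types, and the rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE relTable (patient INTEGER, date TEXT, BMI REAL, ALT REAL);
    CREATE TABLE EAVTable (patient INTEGER, date TEXT, test TEXT, value REAL);
""")
# Wide (traditional) layout: one column per test, NULL where not measured.
conn.executemany("INSERT INTO relTable VALUES (?, ?, ?, ?)", [
    (1017, "2001-03-01", 27.4, None),
    (1017, "2001-06-01", None, 41.0),
    (2040, "2001-03-05", 31.0, 38.0),
])
# EAV layout: one row per measured value.
conn.executemany("INSERT INTO EAVTable VALUES (?, ?, ?, ?)", [
    (1017, "2001-03-01", "BMI", 27.4),
    (1017, "2001-06-01", "ALT", 41.0),
    (2040, "2001-03-05", "BMI", 31.0),
    (2040, "2001-03-05", "ALT", 38.0),
])

traditional = conn.execute(
    "SELECT patient, date, BMI FROM relTable "
    "WHERE patient = ? AND BMI IS NOT NULL", (1017,)
).fetchall()

eav = conn.execute(
    "SELECT patient, date, value FROM EAVTable "
    "WHERE patient = ? AND test = ?", (1017, "BMI")
).fetchall()
```

Both queries return the same single BMI measurement for patient 1017; the EAV form needs no NULL check because unmeasured values are simply not stored.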
40
Limitations for Data Mining

Complex boolean queries tough
no set operations
Complex SQL
nested subqueries
self-joins
Performance
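
To illustrate why boolean queries get harder, consider "patients with BMI above 30 and ALT above 40". On a wide table this is a single WHERE clause over two columns. On an EAV table each attribute lives in a different row, so the table has to be joined to itself once per attribute. A sketch using `sqlite3`, with hypothetical thresholds and data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EAVTable (patient INTEGER, test TEXT, value REAL)")
conn.executemany("INSERT INTO EAVTable VALUES (?, ?, ?)", [
    (1, "BMI", 32.0), (1, "ALT", 55.0),   # satisfies both conditions
    (2, "BMI", 33.0), (2, "ALT", 20.0),   # BMI only
    (3, "BMI", 24.0), (3, "ALT", 60.0),   # ALT only
])

# One aliased copy of the table per attribute in the condition.
sql = """
    SELECT DISTINCT a.patient
    FROM EAVTable AS a
    JOIN EAVTable AS b ON b.patient = a.patient
    WHERE a.test = 'BMI' AND a.value > 30
      AND b.test = 'ALT' AND b.value > 40
    ORDER BY a.patient
"""
matches = [row[0] for row in conn.execute(sql)]
```

Every additional attribute in the condition adds another self-join, which is where both the SQL complexity and the performance cost come from.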

41
Ad Hoc Query Interface

Presents a user interface which generates the
required complex SQL queries
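
A query interface of this kind could generate the self-joins from a list of criteria chosen by the user. The following is a minimal sketch of that idea, not the interface described in the cited papers: the function name, the criteria format, and the `EAVTable` layout (patient, test, value) are all assumptions.

```python
import sqlite3

ALLOWED_OPS = {"=", "<", ">", "<=", ">="}

def build_eav_query(criteria, table="EAVTable"):
    """Build SQL and parameters selecting patients that satisfy ALL criteria.

    criteria: list of (attribute, operator, value) tuples.
    Each criterion becomes one aliased copy of the EAV table (a self-join).
    `table` must be a trusted name; only values are passed as parameters.
    """
    if not criteria:
        raise ValueError("at least one criterion is required")
    joins, conditions, params = [], [], []
    for i, (attribute, op, value) in enumerate(criteria):
        if op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {op}")
        alias = f"t{i}"
        if i == 0:
            joins.append(f"{table} AS {alias}")
        else:
            joins.append(f"JOIN {table} AS {alias} ON {alias}.patient = t0.patient")
        conditions.append(f"{alias}.test = ? AND {alias}.value {op} ?")
        params.extend([attribute, value])
    sql = (
        "SELECT DISTINCT t0.patient FROM " + " ".join(joins)
        + " WHERE " + " AND ".join(conditions)
        + " ORDER BY t0.patient"
    )
    return sql, params

# Try the generated query on a small made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EAVTable (patient INTEGER, test TEXT, value REAL)")
conn.executemany("INSERT INTO EAVTable VALUES (?, ?, ?)", [
    (1, "BMI", 32.0), (1, "ALT", 55.0),
    (2, "BMI", 33.0), (2, "ALT", 20.0),
])
sql, params = build_eav_query([("BMI", ">", 30), ("ALT", ">", 40)])
patients = [row[0] for row in conn.execute(sql, params)]
```

The user only picks attributes, operators, and values; the nesting and aliasing stay hidden behind the interface.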

42
EAV/CR

Simulation of a complex logical schema using an
extensive yet simple physical schema
Addition of object tables to contain like
attributes
strong data typing
Creates metadata about objects to help describe
the relationships between data objects

43
(No Transcript)
44
(No Transcript)
45
Testing EAV/CR

Data sources
used microbiology data from VA patients
extracted from existing DB
loaded in EAV/CR schema
scaled by replicating data with new IDs
Benchmarking
two attribute centered queries
two entity-centered queries

46
(No Transcript)
47
(No Transcript)
48
Results

Comparable speeds for entity queries
massive hit for attribute query
up to 10-fold worse
"ancestor" improvement
represents denormalization
space for performance trade-off

49
EAV for Clinical Genomics ?

performance issues a problem
data mining on attributes
I/O issues
full EAV not feasible
partial row modeling a good option

50
Clinical Database

Used CGO database out of Univ of Arkansas as a
template
Myeloma database
Want to generalize it for any cancer

51
(No Transcript)
52
Altering CGO

remove gene chip references
affymetrix
MIAME/MAGE non-compliant
attach to GeneX
generalize clinical system
row model test results
row model questionnaires

53
Patient: id, birthdate, race, occupation, ...
LabReport: id, test_cat_id(FK), test_id(FK), patient_id(FK), test_date, result
LabTest: test_cat_id, test_id, test_cat_desc, test_desc
Questionaire: id, patient_id
Alcohol: id, study_id(FK), Q01, Q02, Q03, ...
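
The row-modeled lab test part of this schema can be sketched in SQLite. The table and column names come from the slide; the column types, key constraints, and sample rows are assumptions added for illustration, and the questionnaire tables are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE Patient (
        id INTEGER PRIMARY KEY,
        birthdate TEXT, race TEXT, occupation TEXT
    );
    CREATE TABLE LabTest (
        test_cat_id INTEGER, test_id INTEGER,
        test_cat_desc TEXT, test_desc TEXT,
        PRIMARY KEY (test_cat_id, test_id)
    );
    -- One row per test result: the tests are rows, not columns.
    CREATE TABLE LabReport (
        id INTEGER PRIMARY KEY,
        test_cat_id INTEGER, test_id INTEGER,
        patient_id INTEGER REFERENCES Patient(id),
        test_date TEXT, result TEXT,
        FOREIGN KEY (test_cat_id, test_id)
            REFERENCES LabTest(test_cat_id, test_id)
    );
""")
conn.execute("INSERT INTO Patient (id, birthdate) VALUES (1, '1950-01-01')")
conn.executemany("INSERT INTO LabTest VALUES (?, ?, ?, ?)", [
    (1, 1, "chemistry", "albumin"),
    (1, 2, "chemistry", "creatinine"),
])
conn.executemany(
    "INSERT INTO LabReport (test_cat_id, test_id, patient_id, test_date, result) "
    "VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 1, "2001-03-01", "3.9"), (1, 2, 1, "2001-03-01", "1.1")],
)

report = conn.execute("""
    SELECT t.test_desc, r.test_date, r.result
    FROM LabReport AS r
    JOIN LabTest AS t
      ON t.test_cat_id = r.test_cat_id AND t.test_id = r.test_id
    WHERE r.patient_id = ?
    ORDER BY t.test_desc
""", (1,)).fetchall()
```

Adding a new kind of lab test then means inserting a row into LabTest, with no change to the table definitions, while patient demographics stay in ordinary columns.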
54
HIPAA

Health Insurance Portability and Accountability
Act
ensure the integrity and confidentiality of
patient information, protect against reasonably
anticipated threats or hazards to the security or
integrity of the information or unauthorized uses
or disclosures of the information

55
Clinical Data Flow
clinical database
redacted database
cleansing protocol
publication services
redacted database
research protocol
