Title: Selfish Scientists, Egocentric Engineers, Truculent Techies
1Selfish Scientists, Egocentric Engineers, Truculent Techies
- Some Stories from the Trenches
- A Personal Perspective
- Carole Goble
- The University of Manchester, UK
- carole.goble_at_manchester.ac.uk
3rd Intl Conf on e-Social Science, Uni of Michigan, Ann Arbor, USA, 8th October 2007
After Dinner Keynote
4GSK
ID   MURA_BACSU STANDARD PRT 429 AA.
DE   PROBABLE UDP-N-ACETYLGLUCOSAMINE 1-CARBOXYVINYLTRANSFERASE
DE   (EC 2.5.1.7) (ENOYLPYRUVATE TRANSFERASE) (UDP-N-ACETYLGLUCOSAMINE
DE   ENOLPYRUVYL TRANSFERASE) (EPT).
GN   MURA OR MURZ.
OS   BACILLUS SUBTILIS.
OC   BACTERIA FIRMICUTES BACILLUS/CLOSTRIDIUM GROUP BACILLACEAE
OC   BACILLUS.
KW   PEPTIDOGLYCAN SYNTHESIS CELL WALL TRANSFERASE.
FT   ACT_SITE 116 116 BINDS PEP (BY SIMILARITY).
FT   CONFLICT 374 374 S -> A (IN REF. 3).
SQ   SEQUENCE 429 AA 46016 MW 02018C5C CRC32
     MEKLNIAGGD SLNGTVHISG AKNSAVALIP ATILANSEVT IEGLPEISDI ETLRDLLKEI
     GGNVHFENGE MVVDPTSMIS MPLPNGKVKK LRASYYLMGA MLGRFKQAVI GLPGGCHLGP
     RPIDQHIKGF EALGAEVTNE QGAIYLRAER LRGARIYLDV VSVGATINIM LAAVLAEGKT
     IIENAAKEPE IIDVATLLTS MGAKIKGAGT NVIRIDGVKE LHGCKHTIIP DRIEAGTFMI
5Social Ecosystem
Reference Notes Techniques
Community collective intelligence
In silico experiment
Lab Book
A Scientist
A Scientist
6e-Science is Systematic Support for Collaborative, Accelerated, Innovative Research.
enabling-Science
empowering-Scientists
e-Science is Science
7- Collaborative content
- Resources: data, workflows, ontologies, tools, protocols, techniques, research know-how
- Share, Reuse
- Collaborative development
- Systems, data sets, ontologies
- Build, Adopt, Reuse
8Structure prediction
Phylogeny
Software Engineers
Omics
Chemists
Bioinformaticians
Systems Biology
Functional genomics
BioMed
Theory
Computer Scientists
Mathematicians
Biologists
Practice
Crystallographers
Service Providers
System Administrators
Resource Providers
9Bio-Tribes, Bio-Nations, Territorial Techies
10- Collaborate and share with my colleagues and friends I trust.
- And people I don't and may never know. And rivals?
- Actually they do, because:
- They are compelled to.
- There is a culture.
- It's in their best interest.
- Good citizenship.
- Rewards.
- Open Source Community
11The Selfish (or Self-interested) Scientist
- "A biologist would rather share their toothbrush than their gene name"
Mike Ashburner and others, Professor of Genetics, University of Cambridge, UK
"Data mining: my data's mine and your data's mine"
Rich Giordiano's Counterfeit Sharing
12The Seven Deadly Sins of Bioinformatics
- Parochialism and Insularity
- Exceptionalism
- Autonomy or death!
- Vanity: Pride and Narcissism
- Monolith Megalomania
- Scientific method Sloth
- Instant Gratification
- 3115 views, so this hit a nerve.
- http://www.slideshare.net/dullhunk/the-seven-deadly-sins-of-bioinformatics/
Reinvention!
13Reuse is Really Hard. And that goes for software too.
- Hell is other people's stuff.
- Metadata is the key.
- And often an afterthought?
14Andy Law's First (Format) Law
- The first step in developing a new genetic
analysis algorithm is to decide how to make the
input data file format different from all
pre-existing analysis data file formats.
                  female  male
crimap            0       1
Keightly          1       0
Knott and Haley   1       2
http://bioinformatics.roslin.ac.uk/lawslaws.html
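A minimal sketch of what the format law costs in practice, assuming the sex-code table on this slide reads as crimap female=0/male=1, Keightly female=1/male=0, and Knott and Haley female=1/male=2. The names SEX_CODES, decode_sex and recode_sex are hypothetical illustrations and are not part of any of these packages.

```python
from enum import Enum


class Sex(Enum):
    FEMALE = "female"
    MALE = "male"


# Assumed reading of the slide's table: the same two values,
# encoded three different ways by three input formats.
SEX_CODES = {
    "crimap": {0: Sex.FEMALE, 1: Sex.MALE},
    "keightly": {1: Sex.FEMALE, 0: Sex.MALE},
    "knott_haley": {1: Sex.FEMALE, 2: Sex.MALE},
}


def decode_sex(fmt: str, code: int) -> Sex:
    """Translate a format-specific sex code into a shared value."""
    try:
        return SEX_CODES[fmt][code]
    except KeyError:
        raise ValueError(f"unknown format or code: {fmt!r}, {code!r}") from None


def recode_sex(code: int, src: str, dst: str) -> int:
    """Convert a sex code from one format's convention to another's."""
    sex = decode_sex(src, code)
    if dst not in SEX_CODES:
        raise ValueError(f"unknown format: {dst!r}")
    for dst_code, dst_sex in SEX_CODES[dst].items():
        if dst_sex is sex:
            return dst_code
    raise ValueError(f"{dst!r} has no code for {sex.value}")


if __name__ == "__main__":
    # The code "1" means male, female and female respectively.
    for fmt in SEX_CODES:
        print(fmt, decode_sex(fmt, 1).value)
```

Every pairwise move between formats needs a lookup like this, and a file read under the wrong convention silently swaps the sexes rather than failing.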
15Biologist exceptionalism
I'm different. We are all individuals.
- I know there is already a gene name for that gene, but I don't like it and it doesn't fit in with my schema.
- It would be better if I wrote the script I need so I know what it does, how it does it and how to modify it later, because I haven't specified what it was supposed to do in the first place.
- 20 formats for sequence?
- 250 pathway databases?
16Art is I. Science is we.
- Claude Bernard
- (1813 - 1878)
17Art is I. Science is I.
Science is we when it suits me.
e-Science is me-Science
18Rewards: Scientists
Advance the frontiers of Science. Get on with
some SCIENCE.
- Competitive advantage.
- Be the first with the Nature paper.
- Credit, credibility, fame, acclaim, recognition, peer respect, reputation.
- More funding.
- Get my result/approach/technique/workflow/ontology
adopted.
19Fears Scientist
- Beaten by lab X or Professor Y.
- Protecting my turf.
- Being misinterpreted or misrepresented.
- Looking stupid.
- Losing control.
- Taking a risk.
- Releasing results too early.
- Being distracted from my Science.
- Getting left behind.
- Being out of fashion.
20PhDs at Work
21The Ontologist's Tale.
- Let's build Ontologies!
- Ontologies are consensually developed and shared knowledge.
- Controlled vocabularies for linking up data and stuff.
22 Endurants, Perdurants, Being, Substance, Event
Philosophers
Spiritual guides
Aesthetics
Life Scientists Capulets
Knowledge Representation Montagues
OBO
Theoreticians
Pragmatists
A means to an end Content providers
The end Mechanism providers
Carole Goble and Chris Wroe, "The Montagues and the Capulets", Comparative and Functional Genomics, December 2004, vol. 5, no. 8, pp. 618-622
23W3C Semantic Web for Life Sciences mailing list
Why don't you biologists modularise your ontologies properly?
Er, well, like how should we do it properly and where are the tools to help us?
We don't know and we haven't got any. But here are some vague guidelines.
24There are no proper ontologies in biology! We have all this incredible stuff in our language you aren't using. Look at this example.
How do I handle my legacy? The data I need to describe isn't mine and isn't neat and tidy. The ontology is already used by thousands of people every day.
But it's only got 20 classes! I need 250,000! And it's a trivial made-up example. How would using all this help me do my job? Who will train the curators? Who will pay for the effort? Where are the tools?
Tell them to start again and do it properly this time.
If you learn some logic then you can use this OWL-RDF editor thingy (that only scales to 20 classes).
25Rewards: Computer Scientists
Advance the frontiers of Science. Get on with
some SCIENCE.
- Competitive advantage.
- Publish.
- Credit, credibility, fame, acclaim, recognition, peer respect, reputation.
- More funding.
- Get my results/approach/system/design/algorithm/foo-bar adopted.
- Showing off how clever we are.
27Rewards: Software Engineers and Service Providers
To build useful, sustainable Software and
Services. Get on with making something.
Preferably with the coolest technology.
- Competitive advantage.
- Credit, credibility, fame, acclaim, recognition, peer respect, reputation. Showing how clever I am.
- More funding. Though the rewards are poor.
- To get my system/data set/foo-bar adopted.
- Add value to my service/software/data set for low or no cost.
28Mutual Dependency and Antagonism
- How can I get Scientists to use my software, which is clearly superior to what they already have and unaccountably like?
- How do I make them work with me?
- Why don't these computer people give me something I can actually use. Preferably now?
- Make my favourite desktop application faster. Make my dataset bigger. Get me their data set. Don't let them see my results until I say so. Give me something I couldn't get before.
29Simple to use is not the same as simple.
Stuff you can't see must be easy stuff. Right?
30The Integrationist's Tale.
- Let's build systems that link all these data sets and tools together!
- myGrid Taverna Workflow system
31myGrid Taverna Workflow
http://www.mygrid.org.uk
32Trypanosomiasis in Cattle
A PhD student. Paul Fisher.
- Identified a pathway for which its correlating gene (Daxx) is believed to play a role in trypanosomiasis resistance.
- Systematic and comprehensive automation. Elimination of user bias.
Fisher P et al., "A systematic strategy for large-scale analysis of genotype-phenotype correlations: identification of candidate genes involved in African trypanosomiasis", Nucleic Acids Research, 2007, 19
33Principles of Engagement
- Content is King: data, workflows, services.
- Be Open to build Critical Mass.
- Keep adoption costs low.
- Fit into the scientific world as it is.
- Change by stealth. Track it. Predict it. React to it.
- Don't be prescriptive. Scientists control.
- Cooperate. Get Others to Add Value. Use Network effects. Think Local, Act Global.
34Just enough, Just in time
Jam today and Jam tomorrow
Very BAD
Pain
Just right
Good, but Unlikely
Gain
35Many of these are marketing points
caBIG User Advocate
36Long tail
O'Reilly Book
37A good User Experience outweighs Smart Features.
Eat Your Own Dog Food.
Innovation is not necessarily Cleverer Infrastructure.
Scientists are Naughty
38Computer Scientists
Life Scientists
39Mars vs Venus
- Not my problem: Let's solve this other problem, which isn't your problem but is fun and leads to interesting software. And papers.
- Over-complication: Let's solve this harder problem rather than take the easier route that solves your problem.
- "It's simple" myth: My granny can write workflows...
40Venus vs Mars
- Paternalism. Repeating the same old mistakes despite our experiences.
- Short-termism, instant gratification: It just about holds together to get the results for my paper. Let's hope the PhD student doesn't leave... Oh...
- Hackery. Simplifications, hackery and monoliths now store up trouble down the road. Act in haste, repent later.
- Calling me a plumber.
41Marco Roos
42Standards are boring blue collar science
By not making shareable, reusable software, we can publish every single monolithic software solution. Hurrah!
43Changing Scientific Method
- Dry people hypothesise.
- Bench scientists validate.
- "make sense of this data" to "does this result make sense?"
Fisher P et al., "A systematic strategy for large-scale analysis of genotype-phenotype correlations: identification of candidate genes involved in African trypanosomiasis", Nucleic Acids Research, 2007, 19
44Luddism? Surely not!
45Recycling, Reuse, Repurposing
- Workflows are memes.
- Scientific commodities.
- Traded know-how.
- To be exchanged and traded and vetted and mashed.
- Acceleration.
- Quality.
47The Social Networker's Tale.
- myExperiment
- Sharing content
- AND
- Sharing development!
49From me-Science to we-Science
- Tribal bonding and sharing
- Crossing Tribal Boundaries
- Across communities and disciplines (MIT)
- Intellectual Fusion, Swarming, breaking down silos
- Understanding outside my expertise. E.g. sources of error
- Metadata challenges.
- Social challenges.
50Scientists mashing it up for themselves
51This is fun!
- Incentives and inhibitors for sharing content
- Newbies.
- Projects amongst networks of friends.
- Across communities for intellectual swarming.
- Quality, Protection, Reputation. Market place.
- Incentives and inhibitors for sharing development
- Perpetual beta.
- Best friends.
- Mash ups.
52Structure prediction
Phylogeny
Software Engineers
Omics
Chemists
Bioinformaticians
Systems Biology
Functional genomics
BioMed
Theory
Computer Scientists
Mathematicians
Social Scientists
Biologists
Practice
Crystallographers
Service Providers
System Administrators
Resource Providers
53Rewards: Social Scientists
Advance the frontiers of Social Science. Get
on with some SCIENCE.
- Competitive advantage.
- Be the first with the X paper.
- Credit, credibility, fame, acclaim, recognition, peer respect, reputation.
- More funding.
- Get my result/approach/technique adopted.
54Challenge for e-Social Science for ME
- Timeliness
- Rapid Churn
- Participative activity
- Strangers' participation
- Added value to the Scientists and the Engineers.
- Ignoring the Star Trek Prime Directive
55So Just Do It.
e-Social science. For what?
Jam for All.
That's a real challenge.
56Acknowledgements
- With considerable thanks to
- David De Roure
- Robert Stevens
- The ontology, myGrid and myExperiment teams over the years.
- And all our long-suffering users.
- My Systems Administrator Mr Cottam
- http://www.myexperiment.org
- http://www.mygrid.org.uk
- Funders: EPSRC and JISC