Title: End to End science with VO from major facilities
1. End to End science with VO (from major facilities)
2. my assumptions
- The international virtual observatory is an
outstanding idea: it can make science more
efficient and more effective, save money by
removing duplication, and allow higher-quality
calibrations to deliver better data. It needs,
and deserves, long-term support.
- It could also fail horribly, by allowing bad
science. What are the risks and preventions?
3. end to end
- Data must be understood: its origins, limits,
strengths, errors, systematics, etc.
- Every experiment must be repeatable
- A reader/referee must know what happened well
enough to repeat it (in principle)
- The data provider must be able to be credited,
and to be blamed
4. end to end
- The process is in principle simple
- DATA -> VO stuff -> user -> understanding
- reliable science requires either that an
astronomer understands the data
acquisition/calibration/reduction processes
- or, more typically,
- cites and uses the results from a well-checked
public description of that process (e.g., HST
pipeline calibrations)
5. An effect of VO?
- No small research group will have the expertise
to really understand the limitations of all the
many datasets VO makes available
- And many simple queries will try to take data
outside their range of calibration
- -> even major facilities can be mis-used
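
One way to guard against queries that take data outside their range of calibration is to carry the validated range with each data set and flag values that fall outside it. The following is a minimal sketch only: the `CalibrationRange` record, the `split_by_calibration` helper, and the magnitude limits are all hypothetical and do not correspond to any existing VO service or survey.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CalibrationRange:
    """Hypothetical record of the interval over which a quantity is calibrated."""
    quantity: str
    low: float
    high: float

    def contains(self, value: float) -> bool:
        # Inclusive bounds: values exactly at a limit count as calibrated.
        return self.low <= value <= self.high


def split_by_calibration(values, cal_range):
    """Separate values inside the calibrated range from those outside it."""
    trusted, flagged = [], []
    for v in values:
        (trusted if cal_range.contains(v) else flagged).append(v)
    return trusted, flagged


# Illustrative numbers only: a survey assumed to be calibrated for 14 <= r <= 21 mag.
r_band = CalibrationRange("r magnitude", 14.0, 21.0)
trusted, flagged = split_by_calibration([12.5, 15.0, 20.9, 22.3], r_band)
print("within calibration:", trusted)
print("outside calibration:", flagged)
```

Returning the flagged values, rather than silently dropping them, keeps the user aware that part of the request fell outside the calibrated regime.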
6. The multi-wavelength challenge
- Surface brightness maps vs source lists
- Upper limits, etc
8. Matching multi-wavelength data sets is possible
only for a very expert large team, until VO. But
how reliably can it be done?
9. Multi-wavelength matching can be very
interesting, and is easy in simple cases.
Example here: the Orphan stream in the Segue
Field of Streams (Belokurov et al. 2006;
Fellhauer et al. 2006; Zucker et al. 2006)
12. Reference frame: GAIA. Optical/radio: plan now?
Limits can be intrinsic or extrinsic: different
astrophysics, different methods?
13. Thermal IR and sub-mm surveys: new tools for huge,
complex data sets and maps are already essential
14. The effect of VO
- VO will certainly democratise astronomy
- Powerful tools can dominate -> PowerPoint
- VO needs to empower, not limit
- VO will break the multi-wavelength access
barrier, and allow more complexity
- There are sure to be serious errors from this!
- Poor or inappropriately calibrated data can be
used unknowingly in a complex system.
- This will be a huge challenge for referees.
15. Archive or Albatross?
- How much extant data is worth saving?
- How do we decide what to accept?
- Must balance resources: old vs new
16. Some early lessons
- Unique science exists in the archives
- Retrieving it is hard work (now)
- Lots of defective data are in the archives
- Accessing bad data will destroy AVO
- We need data standards as well as
interoperability standards
17. An effect of VO?
- No small research group will have the expertise
to really understand the limitations of all the
many datasets VO makes available
- Will large expert data centres (CDS, IPAC, ...)
become even more necessary? How are these to be
funded, if their role is international helpline
support?
- Need they exist as entities? Linux model? Virtual
institutes?
18. Is VO a free lunch for most?
- If so, it will fail.
- VO must retain active participation by most
potential user communities if it is to be used
and developed. How?
- Does this create a monster, and inhibit future
individual creativity?
- Most great ideas, as for VO, come from a few
exceptional people (PIs); too rigid a structure
prevents this in future
- BUT software maintenance is expensive, and
requires a different structure than does
development
19. The 3 Cs: credit, careers, citations
- Careers depend on credit, citations, name
recognition
- What credit is available to reward the
provider/calibrator/corrector of high-quality
data used through VO? Citation!
- Citing what? Calibration description
- And similarly, de-merits, or health warnings, for
poor data
20. the VO tomorrow
- Generic justification for public funding
- 1) VO is essential to allow effective public
access to processed data -> longevity of research
use
- 2) it also removes the need for duplication
- Stability implies significant continuing support
and development -> VO career paths, VO management
structures
- And decision-making challenges: there is no
single PI. Institution/group: who decides?
21. end to end
- All this implies that only high-quality,
well-calibrated data sets are appropriate for an
ordinary, non-specialist user
- If a database doesn't have a published
description, and known quality, it should not be
available to the unwary.
- And it must never be utilised unknowingly
- A typical Google search generates rubbish
22. end to end science from major facilities: does VO
have a role?
- yes!
- What VO needs to deliver science of the quality
appropriate to major facilities is the standard
of quality-control which the major facilities
implement themselves
- A default elite data set, with optional add-ons
for the expert. NOT vice-versa.
23. end to end
- -> Extreme deduction 1
- private publication of data is a high-risk
potential disaster: such data should never be
accessed by the VO by default
- -> Extreme deduction 2
- The default request to VO should access ONLY data
from the major facilities and observatories
- (have a directory of approved datasets, in the
same way that papers are refereed? I trust IVOA
to do this reliably.)
- Corollary 1: experts can switch in other info
only by making extra efforts, passing barriers
- Corollary 2: the SQL (or whatever) query should
automatically be delivered to the scientist in a
format for inclusion in the paper, along with
relevant references to all accessed data sets
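
The two corollaries can be illustrated with a short Python sketch: approved data sets are returned by default, unapproved ones only after an explicit opt-in, and the query is handed back with the references of every data set accessed. Everything here is an assumption made for illustration: the `Dataset` fields, the `select_datasets` and `query_record` helpers, and the registry entries are hypothetical and are not part of any IVOA standard or service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical registry entry; field names are illustrative."""
    name: str
    approved: bool   # listed in the directory of approved data sets
    reference: str   # citation for the published calibration description


def select_datasets(registry, requested, expert_opt_in=False):
    """Return requested data sets; unapproved ones require an explicit opt-in."""
    selected = []
    for name in requested:
        ds = registry[name]
        if ds.approved or expert_opt_in:
            selected.append(ds)
    return selected


def query_record(query, datasets):
    """Format the query and data-set references for inclusion in a paper."""
    lines = ["Query:", "  " + query, "Data sets accessed:"]
    for ds in datasets:
        status = "approved" if ds.approved else "NOT approved (expert opt-in)"
        lines.append(f"  {ds.name} [{status}]: {ds.reference}")
    return "\n".join(lines)


# Placeholder entries; names and references are invented for illustration.
registry = {
    "facility_survey": Dataset("facility_survey", True, "<calibration paper>"),
    "private_table": Dataset("private_table", False, "<no published description>"),
}
query = "SELECT ra, dec, mag FROM facility_survey WHERE mag < 20"
requested = ["facility_survey", "private_table"]
default = select_datasets(registry, requested)
expert = select_datasets(registry, requested, expert_opt_in=True)
print(query_record(query, default))
```

Making the opt-in an explicit argument is one way to express "passing barriers": the default call can never reach unapproved data, and the record states plainly when an expert has chosen to include it.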
24. end to end science from major facilities: does VO
have a role?
- yes!
- and, in fact, arguably ONLY from major facilities
by default
- and someone needs to house the career VO
infrastructure system