Databases at MPA, access methods and plans

1
Databases at MPA, access methods and plans
  • With contributions from
    • JHU: Alex Szalay, Jan Vandenberg
    • MPA: Jeremy Blaizot, Jarle Brinchmann,
      Guinevere Kauffmann, Anja von der Linden,
      Ben Panter, Guo Qi, Volker Springel,
      Vivienne Wild

2
Last year, Budapest
  • Presented milli-Millennium halo merger tree
    database
  • Requests
    • More properties (lambda, ...): not yet
    • Galaxies: done
    • Correlation with environment (galaxies in
      voids): done
    • Millennium
  • Why use databases? Ask Alex.

3
Current status
  • milli-Millennium
    • Galaxies added: merger trees, links to their
      parent halos
    • Density field at various smoothings
    • Updated web site (demo)
  • Millennium subset
    • Subset (2, 10x milli-Mil) of halo and galaxy
      trees
    • z=0 density field
  • Millennium
    • Halo trees in database (proprietary)
    • SAM galaxies under way (settle on model etc.)
    • Density fields at all z will be added:
      1056964608 rows
  • Durham
    • milli-Millennium mirror (Postgres)
    • Durham halo tree and galaxy catalogues

4
Other databases
  • ROSAT source catalogues and RASS photons (100
    million)
  • SDSS peripherals
    • SDSS_MPA (Brinchmann, Kauffmann, Tremonti et al.)
    • MOPED (Ben Panter)
    • SDSS_PCA (Vivienne Wild et al.)
  • GalICS (Jeremy Blaizot)
  • HEALPix all-sky maps (Alex Szalay, Tony Banday)
    • WMAP (3-year data soon!)
    • extinction maps
    • radio maps (Bonn)
    • ROSAT background (hopefully)

5
Access
  • Public: http://www.g-vo.org/mpasims
  • Local: web apps to Millennium, BESTDR3 and
    peripherals, http://www.g-vo.org/sdssdr3/
  • Public: web browser queries limited (1 min,
    10000 rows)
  • Local databases: web apps less limited

6
Streaming
  • Query results temporarily buffered in server
    memory
  • Streaming queries are faster and less limited
    (only a timeout)
  • Access
    • IDL (with Ben Panter)
    • wget --http-user=<user> --http-password=<password>
      -O localfile.csv
      "http://www.g-vo.org/sdssdr3/DBQueryStream?SQL=select * from moped..agebin"
    • GUI asking for username/password
    • Interprets the CSV stream and turns it into IDL
      components
    • TOPCAT
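
The streaming access described on this slide can be sketched in Python as well. This is a minimal sketch, not the MPA client: the endpoint and the `SQL` parameter name are read off the (partly garbled) wget line, HTTP basic authentication is an assumption, and the column names in the offline demonstration are made up.

```python
import csv
import io
import urllib.request
from urllib.parse import quote

# Endpoint as read from the slide; it may no longer be live.
BASE_URL = "http://www.g-vo.org/sdssdr3/DBQueryStream"


def build_stream_url(sql, base_url=BASE_URL):
    """Return the query URL, with the SQL text percent-encoded."""
    return f"{base_url}?SQL={quote(sql)}"


def parse_csv_stream(lines):
    """Yield one dict per data row; the first line is taken as the header."""
    yield from csv.DictReader(lines)


def stream_query(sql, user, password):
    """Run a streaming query and yield rows as they arrive.

    Assumes HTTP basic authentication, which is a guess based on the
    --http-user/--http-password options shown on the slide.
    """
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, BASE_URL, user, password)
    opener = urllib.request.build_opener(
        urllib.request.HTTPBasicAuthHandler(manager))
    with opener.open(build_stream_url(sql)) as response:
        text = io.TextIOWrapper(response, encoding="utf-8")
        yield from parse_csv_stream(text)


if __name__ == "__main__":
    # Offline demonstration on a made-up two-column result.
    sample = io.StringIO("agebin,age\n0,0.01\n1,0.1\n")
    for row in parse_csv_stream(sample):
        print(row)
```

Because rows are yielded one at a time, the client never holds the full result in memory, which is the point of streaming over the buffered browser interface.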

7
Plans: Millennium
  • Millennium
    • Tune database
      • 750000000 halos
      • N x 1000000000 galaxies
      • 63 x 256^3 density field grid cells
    • More halo properties (shape, ?, ...)
    • More galaxy catalogues
      • different parameters
      • different algorithms (GalICS, Durham, ...)
    • Light cone mock catalogues
    • Galaxy spectra (PCA)
  • Links to SDSS mirror and peripherals
  • Proper metadata handling (à la SkyServer)
  • "SAM online"
  • Move web apps to MPA
  • Use JHU services, install CasJobs
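
As a quick consistency check, the density grid size on this slide, read as 63 grids of 256 cells per side, reproduces the row count quoted under "Current status". The reading of the garbled figure as 63 x 256^3 is an assumption that this arithmetic supports; what the factor 63 counts is not stated on the slide.

```python
# Assumed reading of the slide: 63 grids, each with 256 cells per side.
GRIDS = 63
CELLS_PER_SIDE = 256

cells_per_grid = CELLS_PER_SIDE ** 3   # 16777216 cells in one grid
total_rows = GRIDS * cells_per_grid    # one database row per grid cell

print(total_rows)  # 1056964608, the row count given under "Current status"
```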

8
Plans: SDSS mirror and peripherals
  • Make mirror web site public
  • Upgrade SDSS mirror to DR4
  • Stabilize, document, publish SDSS peripherals
  • Proper metadata handling
  • Links to Millennium
  • Personal databases: MyDB (à la SkyServer)
  • Add logos

9
Theory VO: spectra
  • Combine theory and observations
  • Example: query-by-example on theory spectra
    • Find similar spectra and, from these, the
      actual galaxy formation history
    • Chi-squared on all stored spectra? Slow, and
      requires storing all of them
    • Idea (not original, see HVO/JHU talks): use
      PCA to compress data
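
The brute-force alternative mentioned on this slide can be sketched as follows. It is a minimal illustration with made-up spectra and names, assuming all spectra share one wavelength grid; it shows why the approach scales poorly, since every stored spectrum must be read in full for each query.

```python
def chi_squared(observed, errors, model):
    """Sum of squared, error-weighted residuals between two spectra."""
    return sum(((o - m) / e) ** 2
               for o, e, m in zip(observed, errors, model))


def best_match(observed, errors, library):
    """Return the name of the library spectrum with the lowest chi-squared.

    `library` maps a (hypothetical) spectrum name to its flux values.
    Cost grows with the number of spectra times the number of pixels.
    """
    return min(library,
               key=lambda name: chi_squared(observed, errors, library[name]))


# Made-up three-pixel spectra, for illustration only.
library = {"old": [1.0, 2.0, 3.0], "young": [3.0, 2.0, 1.0]}
observed = [1.1, 1.9, 3.2]
errors = [0.1, 0.1, 0.2]
print(best_match(observed, errors, library))
```

Compressing each spectrum to a handful of PCA amplitudes replaces the per-pixel comparison with a comparison of a few numbers per stored spectrum.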

10
PCA
  • Need training sample of theory spectra to create
    eigenspectra
  • Project all spectra
  • Store PCA amplitudes in DB
  • Provide web service
    • Upload (observational) spectrum (IVOA SSA/SED)
    • Project onto theory eigenspectra
    • Use amplitudes as parameters in query for
      nearby amplitudes
    • Return corresponding theory spectra
    • Return corresponding galaxy formation
      histories, or their halos, or their environment
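
The web-service steps above (project, then query for nearby amplitudes) can be sketched in a few lines. This is an illustration under stated assumptions, not the service itself: the eigenspectra are taken to be orthonormal, the spectrum has no gaps and is already normalized, the neighbour search is a plain linear scan rather than an indexed database query, and all names and values are hypothetical.

```python
import heapq
import math


def project(flux, mean, eigenspectra):
    """PCA amplitudes of a spectrum: dot products of the mean-subtracted
    flux with each eigenspectrum (assumed orthonormal)."""
    residual = [f - m for f, m in zip(flux, mean)]
    return [sum(r * e for r, e in zip(residual, vec)) for vec in eigenspectra]


def nearest(amplitudes, catalogue, k=1):
    """Names of the k stored objects closest in amplitude space.

    `catalogue` maps an object name to its stored amplitudes. A linear
    scan stands in for the database query.
    """
    closest = heapq.nsmallest(
        k, catalogue.items(),
        key=lambda item: math.dist(amplitudes, item[1]))
    return [name for name, _ in closest]


# Made-up four-pixel example with two orthonormal eigenspectra.
mean = [1.0, 1.0, 1.0, 1.0]
eigenspectra = [[0.5, 0.5, -0.5, -0.5], [0.5, -0.5, 0.5, -0.5]]
uploaded = [1.5, 2.5, -0.5, 0.5]
catalogue = {"gal_a": [1.9, -1.1], "gal_b": [0.0, 0.0], "gal_c": [2.5, -0.5]}

amplitudes = project(uploaded, mean, eigenspectra)
print(amplitudes, nearest(amplitudes, catalogue, k=2))
```

The returned names would then be used as keys to fetch the theory spectra, formation histories, halos or environment from the other tables.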

11
Issues
  • Dealing with errors and gaps: gappy PCA (Connolly
    & Szalay)
  • Normalization
    • Incoming spectrum is in general from a very
      different dataset and needs a common
      normalization
    • Incoming set will have gaps, errors
    • Ad hoc normalization is possible (and works
      quite well)
  • Indexing of complex multi-dimensional point set
    for quick k-nearest-neighbours search (Voronoi?
    See Laszlo's work)

12
Normalized gappy PCA
  • Fit the normalization factor at the same time as
    the PCA amplitudes. Model (equation not captured
    in the transcript)
  • Minimize (over a_i and N)
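
Since the equations on this slide did not survive, the following is only one plausible form, stated as an assumption: the scaled observed flux is modelled as N * f = m + sum_i a_i * e_i, with m the mean theory spectrum and e_i the eigenspectra, and the weighted sum of squared residuals is minimized over N and the a_i. Gaps get weight zero. The problem is linear in the unknowns, so the sketch solves the normal equations directly; all function names are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: too few usable pixels?")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_normalized_gappy(flux, weights, mean, eigenspectra):
    """Fit N and a_i in the assumed model N * flux = mean + sum a_i * e_i.

    `weights` are per-pixel weights (for example inverse variances);
    a weight of zero marks a gap. Returns (N, [a_1, ..., a_k]).
    """
    # Design columns: flux for N, and -e_i for each a_i; target is the mean.
    columns = [flux] + [[-v for v in e] for e in eigenspectra]
    n = len(columns)
    normal = [[sum(w * ci * cj
                   for w, ci, cj in zip(weights, columns[i], columns[j]))
               for j in range(n)] for i in range(n)]
    target = [sum(w * ci * m for w, ci, m in zip(weights, columns[i], mean))
              for i in range(n)]
    solution = solve_linear(normal, target)
    return solution[0], solution[1:]
```

With noise-free input the fit recovers the normalization and amplitudes exactly, even when a pixel is masked, as long as enough unmasked pixels remain to constrain the k + 1 unknowns.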

13
(No Transcript)
14
(No Transcript)
15
(No Transcript)
16
So far
  • Ran PCA on BC03 stochastic bursts (Vivienne)
  • On first GalICS/milli-Millennium spectra (Jeremy)
  • Projected SDSS spectra on both
  • Defined a PCA data model/schema
  • Stored PCAs in database
  • TOPCAT

17
PCA data model (RDB schema available)
18
(No Transcript)
19
(No Transcript)
20
(No Transcript)
21
milliMil-GalICS: PC1 vs PC2, Voronoi tessellation
22
Issues for query-by-example
  • Overlap quite good, but good enough?
    • GalICS spread is less than SDSS.
    • BC03 comparable with SDSS, but different slope.
  • Systematics
    • Model
      • physics very preliminary (see Blaizot & de
        Lucia?)
      • resolution effects
    • Preprocessing SDSS galaxies
      • Rebinning: different algorithms give
        comparable results
      • (slightly) wrong redshift? Can be easily
        simulated
      • Projection algorithm: normalization does not
        affect outcome
  • Observational systematics: use virtual telescope
    (virtual spectrograph) to test on the theory
    spectra. Easier to blow up simulation than to
    shrink observation cloud

23
Comments
  • Millennium database being used for science
    projects (Guo Qi)
  • SDSS peripherals used for science projects (see
    Vivienne's talk, Ben Panter)
  • Use of MyDB for debugging and testing (Jeremy)
  • Please give comments, feedback.