
CosmoHub: Interactive exploration and distribution of astronomical data on Hadoop. (arXiv:2003.03217v2 [astro-ph.IM] UPDATED)

Pau Tallada, Jorge Carretero, Jordi Casals, Carles Acosta-Silva, Santiago Serrano, Marc Caubet, Francisco J. Castander, Eduardo César, Martín Crocce, Manuel Delfino, Martin Eriksen, Pablo Fosalba, Enrique Gaztañaga, Gonzalo Merino, Christian Neissner, Nadia Tonello

We present CosmoHub (https://cosmohub.pic.es), a web application based on Hadoop to perform interactive exploration and distribution of massive cosmological datasets. Modern cosmology seeks to unveil the nature of both dark matter and dark energy by mapping the large-scale structure of the Universe through the analysis of massive amounts of astronomical data. The volume of this data has grown steadily over recent decades, and is expected to keep growing, with the digitization and automation of experimental techniques.

CosmoHub, hosted and developed at the Port d'Informació Científica (PIC), provides support to a worldwide community of scientists, without requiring the end user to know any Structured Query Language (SQL). It serves data from several large international collaborations such as the Euclid space mission, the Dark Energy Survey (DES), the Physics of the Accelerating Universe Survey (PAUS) and the Marenostrum Institut de Ciències de l'Espai (MICE) numerical simulations. While CosmoHub was originally developed as a PostgreSQL relational database web frontend, this work describes its current version, built on top of Apache Hive, which facilitates scalable reading, writing and management of huge datasets. As CosmoHub's datasets are seldom modified, Hive is a better fit.

Over 60 TiB of catalogued information and astronomical objects can be interactively explored using an integrated visualization tool which includes 1D histogram and 2D heatmap plots. In our current implementation, online exploration of datasets of objects can be done on a timescale of tens of seconds. Users can also download customized subsets of data in standard formats, generated in a few minutes.
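
The abstract does not say how the 1D histogram and 2D heatmap plots are computed. As an illustration only, the sketch below shows the general binned-aggregation idea behind such plots: each object is assigned to an equal-width bin and only the per-bin counts are kept, so the result stays small regardless of catalogue size. The function names, parameters and sample values are hypothetical and are not taken from CosmoHub.

```python
def histogram_1d(values, lo, hi, n_bins):
    """Count values into n_bins equal-width bins over [lo, hi)."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lo <= v < hi:
            # Clamp guards against floating-point rounding at the upper edge.
            idx = min(int((v - lo) / width), n_bins - 1)
            counts[idx] += 1
    return counts


def heatmap_2d(points, x_range, y_range, nx, ny):
    """Count (x, y) points into an ny-by-nx grid; grid[iy][ix] is one cell."""
    x_lo, x_hi = x_range
    y_lo, y_hi = y_range
    x_width = (x_hi - x_lo) / nx
    y_width = (y_hi - y_lo) / ny
    grid = [[0] * nx for _ in range(ny)]
    for x, y in points:
        if x_lo <= x < x_hi and y_lo <= y < y_hi:
            ix = min(int((x - x_lo) / x_width), nx - 1)
            iy = min(int((y - y_lo) / y_width), ny - 1)
            grid[iy][ix] += 1
    return grid


if __name__ == "__main__":
    # Hypothetical sample values standing in for a catalogue column.
    print(histogram_1d([0.1, 0.5, 0.9, 1.5], 0.0, 2.0, 4))
    print(heatmap_2d([(0.1, 0.1), (0.9, 0.1), (0.9, 0.9)], (0.0, 1.0), (0.0, 1.0), 2, 2))
```

In a system backed by a query engine, the same grouping would typically be expressed as an aggregation over a computed bin index rather than a Python loop; that mapping is an assumption here, not something the abstract states.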

Publisher URL: http://arxiv.org/abs/2003.03217

DOI: arXiv:2003.03217v2
