Data Processing and Distribution

Data life cycle, from planning data collection to data reuse

Processing and distribution are the next steps in the data life cycle once datasets have been collected from ground observations or space instruments. SIDC has pioneered the use of image processing and feature recognition methods to build high-level data products from direct observations, and is now exploiting machine learning and deep learning for data generation and prediction purposes. SIDC is also a major player in building tools that allow seamless distribution of solar datasets to the community, in agreement with open data policy and FAIR principles: JHelioviewer, a data browsing and visualization tool, and the SOLARNET Virtual Observatory, which facilitates search and access across a large variety of solar event and observational datasets. SIDC also runs the World Data Center for the Sunspot Index (SILSO) and as such is involved in all aspects of the sunspot index data life cycle.

SPoCA algorithm detecting active regions on SOHO/EIT images

Historical facts

Data handling and distribution were already at the heart of the SIDC when it started in 1981 as the World Data Center for the Sunspot Index. From 1996 onwards, the EIT telescope on board SOHO provided large amounts of data, which require automated tools to extract the important information. Towards this goal, SIDC initiated the Solar Image Processing Workshop series in 2003 and has developed a series of feature recognition and tracking tools: CaCTus (coronal mass ejections), SoFast (solar flares), Solar Demon (solar flares, dimmings), SPoCA (active regions, coronal holes), and Velociraptor (motion analysis). These tools are used for space weather nowcasting and forecasting, but also to build catalogs for long-term statistical studies.
Around 2000, SIDC started developing the Solar Weather Browser (SWB), a standalone, open-source software tool designed to display solar images with context overlays. With this heritage, the SIDC was able, a few years later, to play a significant role in the development of the JHelioviewer project.

Solar image visualisation with JHelioviewer

Current activities

To facilitate research and scientific innovation, funding agencies increasingly request that science data be open, that is, that they meet the criteria of Findability, Accessibility, Interoperability and Reusability (FAIR). Various projects at SIDC, with national or European (SOLARNET, ESCAPE) funding, actively work towards making all SIDC data FAIR. The ESCAPE project also aims to adopt international standards for data access and to connect tools such as JHelioviewer and the SOLARNET VO. At SIDC, particular attention is also given to visualization: 3D visualization, essential when satellites observe the Sun from different perspectives in the heliosphere, but also timely data visualization, needed by space weather operators to assess the future impact of solar eruptions observed in real time.

Regarding data processing, ongoing projects such as DELPHFI and DeepSun use deep learning methods to exploit solar data in new ways and to improve our understanding of solar phenomena.

Perspective for the future

In the future, more projects combining information processing, deep learning and physical modelling will be carried out at SIDC. The results of each of these projects will be made accessible following the FAIR principles.

Interoperability between our various data access tools will also be extended. Through the use of international standards for data access protocols and metadata, we are building towards a future where users can discover our data and use them in combination with data and models from elsewhere.