The goal of this dataset competition is to introduce the machine learning and data mining community to an integrated dataset that can be used for predicting and understanding solar flares. Solar flares and Coronal Mass Ejections (CMEs) [1, 2, 3] are events occurring in the solar corona and heliosphere that can have a major negative impact on our technology-dependent society. Electromagnetic radiation and ionized particles from solar flares and eruptions tend to be filtered out by Earth’s atmosphere, but they can still pose a hazard to astronauts and sensitive equipment in space, and they can disrupt the high-frequency radio communications on which military and civilian users grow more reliant each year. A sufficiently strong CME can also cause fluctuations in Earth’s magnetosphere large enough to induce currents in large networks of conductive materials such as power grids. These induced currents can lead to surges that have the potential to melt the transformers of long-distance transmission lines, causing large-scale blackouts. A 2008 report by the National Research Council concluded that a solar superstorm similar to the one observed in 1859, known as the Carrington event [5], could cripple the entire US power grid for months and cause economic damage of 1 to 2 trillion dollars [6].
In response, the White House released the National Space Weather Strategy and Space Weather Action Plan [4] in 2015 as a roadmap for research focused on predicting and mitigating the effects of solar eruptive activity. One of the routes suggested for accomplishing the goals of this roadmap was to use machine learning to predict extreme space weather events, a suggestion reiterated recently by the current administration [7]. Our benchmark dataset is intended as a testbed where solar physicists and machine learning practitioners can test new ideas and methods on a cleaned, integrated, and readily available dataset composed of validated information from multiple sources. Our dataset relies mainly on Spaceweather HMI Active Region Patches (SHARPs), available from the Joint Science Operations Center (JSOC). This data product is derived from solar vector magnetograms obtained by the Helioseismic and Magnetic Imager (HMI) onboard the Solar Dynamics Observatory (SDO). The HMI instrument continuously observes the Sun and provides information about the magnetic field on the Sun’s surface. Since a solar flare is caused by the sudden release of magnetic energy in the solar corona, using the output of this instrument to model and predict imminent flares is a natural choice.
Successful flare predictions from machine learning models trained and tested on this dataset can (1) address a central problem in space weather forecasting and (2) help identify physical mechanisms that are associated with, or even give rise to, solar flares. The dataset is also intended to be a valuable resource for unbiased comparison of results from different flare prediction algorithms.
This project has been supported in part by funding from the Division of Advanced Cyberinfrastructure within the Directorate for Computer and Information Science and Engineering, the Division of Astronomical Sciences within the Directorate for Mathematical and Physical Sciences, and the Division of Atmospheric and Geospace Sciences within the Directorate for Geosciences, under NSF awards #1443061, #1812964, #1936361, and #1931555. It was also supported in part by funding from NASA, through the Heliophysics’ Living With a Star Science Program, under NASA award #NNX15AF39G, as well as through a direct contract from the Space Radiation Analysis Group (SRAG). In addition, the work was sponsored in part by state funding from Georgia State University’s Second Century Initiative and Next Generation Program. All images used in this work are courtesy of NASA/SDO and the AIA, EVE, and HMI science teams.
- Benz, A. O. Flare observations. Living Reviews in Solar Physics 5, 1 (2008). URL https://doi.org/10.12942/lrsp-2008-1.
- Howard, T. Coronal mass ejections: An introduction, vol. 376 (Springer Science & Business Media, 2011).
- Martens, P. C. & Angryk, R. A. Data Handling and Assimilation for Solar Event Prediction. In Foullon, C. & Malandraki, O. E. (eds.) Space Weather of the Heliosphere: Processes and Forecasts, vol. 335 of IAU Symposium, 344–347 (2018). URL https://arxiv.org/abs/1712.01402.
- National Science and Technology Council. National space weather action plan (2015). URL https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/final_nationalspaceweatheractionplan_20151028.pdf.
- Carrington, R. C. Description of a singular appearance seen in the Sun on September 1, 1859. Monthly Notices of the Royal Astronomical Society 20, 13–15 (1859).
- National Research Council. Severe space weather events: Understanding societal and economic impacts: A workshop report (National Academies Press, 2009).
- Hutson, M. Trump to launch artificial intelligence initiative, but many details lacking. Science (2019). URL https://doi.org/10.1126/science.aaw9677.