Solar Flare Prediction from Time Series Data 2020

A Track in the IEEE Big Data 2020 Big Data Cup Challenge

Submissions


The data competition is hosted on Kaggle, and participants' classification results are submitted through that platform. Teams are limited to two submissions per day.

https://www.kaggle.com/c/bigdata2020-flare-prediction/submissions


Evaluation of Results

The competition will use Kaggle's built-in MacroFScore to evaluate the performance of each submission. This score is calculated by computing the F1 score for each label and then taking their unweighted mean. This is not an ideal score, because it does not take label imbalance into account, which is why it will not be the final metric we ask to be reported in the submitted papers. There are both public and private leaderboards, based on different partitions of the testing dataset, so participants are encouraged to keep this in mind and not overfit their models to achieve the best results on the public leaderboard. The winning prediction method(s) will be selected using the following weighting: (1) 30% from their rank on the private leaderboard, (2) 10% from their rank on the public leaderboard, and (3) 60% from the quality of the accompanying paper describing their methods and results. As this task is intended to help identify physical mechanisms that indicate the possible occurrence of, or are the cause of, solar flares, interpretability of results will be given extra weight in the evaluation of the accompanying paper.
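For readers who want to check their scores locally, the macro-averaged F1 score described above can be sketched as follows. This is a minimal illustration, not Kaggle's implementation: the function name `macro_f1`, the choice to average over every label that appears in either the true or the predicted labels, and the convention of scoring a label as 0 when its F1 is undefined are all assumptions made here.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 scores (hypothetical helper).

    Assumption: labels are taken from the union of y_true and y_pred,
    and a label with no true or predicted instances scores 0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this label
        else:
            fp[pred] += 1    # predicted label was wrong
            fn[truth] += 1   # true label was missed

    labels = sorted(set(y_true) | set(y_pred))
    per_label = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN)
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_label.append(2 * tp[label] / denom if denom else 0.0)

    # Every label counts equally, regardless of how many samples it has.
    return sum(per_label) / len(per_label)


# Made-up example: 8 non-flaring samples (0) and 2 flaring samples (1).
y_true = [0] * 8 + [1] * 2
all_negative = [0] * 10          # never predicts a flare
one_flare_found = [0] * 8 + [1, 0]

score_all_negative = macro_f1(y_true, all_negative)
score_one_flare = macro_f1(y_true, one_flare_found)
```

In this made-up example, a model that never predicts the minority label gets a per-label F1 of 0 for that label, and that zero carries as much weight in the mean as the score for the majority label.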


Your Code

Source code must also be submitted, either through a publicly available repository on a Git-based version control hosting service such as GitHub or Bitbucket, or as code shared directly on Kaggle as a Kernel. Since our work is publicly funded, all source code is expected to be released as open-source software under a generally accepted license such as the Apache License 2.0, the GNU General Public License, the MIT License, or another license approved by the Open Source Initiative.


Your Paper

After the competition phase is completed, a link for the submission of the accompanying full-length regular academic paper will be provided to the top 10 participants as ranked by the public/private leaderboard weighting described above. The academic papers will be ranked by peer reviewers and a final decision will be made using the weighting method detailed above.

Papers are expected to conform to the IEEE 2-column format set by the conference, which can be found in the Big Data CFP. If you have questions, please feel free to contact the lead organizer of the dataset competition.

Additional information about the dataset provided for the competition can be found in the dataset paper (dataset paper link). A revision of this paper is currently under review for the Nature Scientific Data journal. If it is accepted prior to the deadline for submission of your paper, we will update this page with additional citation information. However, as this is a Computer Science dataset competition, the primary focus of your paper should be on describing the methods used to produce the results achieved for this competition, and not too heavily on how these results can be mapped to operational flare prediction applications. Citations on how the dataset was created are therefore not necessary for this work.