The Data Mining Lab at Georgia State University performs research on the storage, processing, retrieval, and analysis of massive, real-life data with highly dynamic spatial and temporal characteristics. Members of the lab work in close collaboration with experts from solar physics, astronomy, business, geosciences, statistics, and other fields. This inter-departmental, multi-institutional environment gives lab members numerous opportunities to work on problems that are not only interesting from a computer science perspective but also impact other areas of scientific understanding.






Courtesy: NASA (svs.gsfc.nasa.gov)


Some topics that the Data Mining Lab has worked on include:

  • (Un)Supervised Classification/Clustering
  • Fuzzy (overlapping) multi-class data
  • Ensemble learning (fusion of classifiers)
  • Frequent Pattern Mining (Co-location patterns)
  • Parallel/Distributed/GPU Computing
  • Information Visualization (and presentation)
  • Spatial/Temporal Databases (OpenGIS systems)
  • Time Series analysis
  • Linear Regression Models and Trends
  • Dimensionality Reduction
  • Multidimensional Database Indexing
  • Knowledge Rule Mining (empirical findings and hypothesis validation)


The Upcoming Workshop

The workshop is a series of presentations on the latest research-based applications, datasets, and data mining methodologies for scientific inquiry into solar events, drawn from the collaborative work of DMLab's computer scientists and solar physicists.

[More info]







DMLab Workshop 2019 Board



The Upcoming Data Challenge

We are now organizing a Big Data Cup Challenge on Solar Flare Prediction as part of IEEE BigData 2019. The goal of this dataset competition is to introduce the machine learning/data mining community to an integrated dataset that can be utilized for predicting and understanding solar flares.

[More info]






The winning prediction method(s) will be chosen based on a weighted evaluation:

  • 30% from their rank on the private leaderboard,
  • 10% from their rank on the public leaderboard, and
  • 60% from the quality of the accompanying paper describing their methods and results.
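
To make the weighting concrete, here is a minimal sketch of how the three components could be combined into one score. Only the 30/10/60 weights come from the list above. The linear rank-to-score mapping, the 0-to-1 paper quality scale, and the names `rank_to_score` and `combined_score` are assumptions for illustration and may differ from how the organizers actually compute the result.

```python
# Weights taken from the evaluation criteria above.
WEIGHTS = {"private_rank": 0.30, "public_rank": 0.10, "paper_quality": 0.60}


def rank_to_score(rank: int, n_teams: int) -> float:
    """Hypothetical linear mapping: rank 1 -> 1.0, last rank -> 0.0."""
    if n_teams < 1 or not 1 <= rank <= n_teams:
        raise ValueError("rank must be between 1 and n_teams")
    if n_teams == 1:
        return 1.0
    return (n_teams - rank) / (n_teams - 1)


def combined_score(private_rank: int, public_rank: int,
                   paper_quality: float, n_teams: int) -> float:
    """Weighted sum of the three components, each scaled to [0, 1].

    paper_quality is assumed to be a reviewer score already in [0, 1].
    """
    if not 0.0 <= paper_quality <= 1.0:
        raise ValueError("paper_quality must be in [0, 1]")
    return (WEIGHTS["private_rank"] * rank_to_score(private_rank, n_teams)
            + WEIGHTS["public_rank"] * rank_to_score(public_rank, n_teams)
            + WEIGHTS["paper_quality"] * paper_quality)


if __name__ == "__main__":
    # Example: 3rd of 20 on the private board, 5th on the public board,
    # and a paper rated 0.8 on the assumed quality scale.
    print(round(combined_score(3, 5, 0.8, 20), 3))
```

Under these assumptions, the paper carries more weight than both leaderboard ranks combined, so a well-written paper can outweigh a modest gap in leaderboard position.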