Big data for small earthquakes: Computational challenges in large-scale earthquake detection
Earthquake detection – the identification of weak earthquake signals in continuous waveform data recorded by sensors in a seismic network – is a fundamental task in seismology. In this talk, I will describe the data science challenges associated with earthquake detection in massive seismic data sets. I will discuss how new algorithms from machine learning and data mining are helping to advance the state of the art in earthquake monitoring.
As a case study, I will present Fingerprint and Similarity Thresholding (FAST), a novel method for large-scale earthquake detection inspired by audio recognition technology (Yoon et al., 2015). FAST uses locality-sensitive hashing, a technique for efficiently identifying similar items in large data sets, to detect similar waveforms (candidate earthquakes) in continuous seismic data. By posing earthquake detection as a data mining problem, FAST can discover new earthquake sources without training data, which is often unavailable for seismic data sets. FAST has recently been extended to long-duration, multi-sensor seismic data sets (Bergen and Beroza, 2018; Rong et al., 2018; Yoon et al., 2019) – introducing a capability for large-scale unsupervised detection that was not previously available for seismic data analysis.
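To make the idea of locality-sensitive hashing concrete, here is a minimal sketch of one generic scheme, MinHash with banding, applied to binary fingerprints. It is an illustration under stated assumptions, not the FAST implementation: each fingerprint is assumed to be a non-empty set of integer indices of its nonzero bits, and the function names and the parameters `n_bands` and `rows_per_band` are hypothetical choices made for this example.

```python
import random
from collections import defaultdict
from itertools import combinations

# Assumption: every fingerprint index is a non-negative integer below PRIME.
PRIME = 2_147_483_647


def make_hash_params(n_hashes, seed=0):
    """Draw (a, b) coefficients for hash functions h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(n_hashes)]


def minhash_signature(fingerprint, params):
    """MinHash signature: the minimum hash value over the set, per hash function."""
    return tuple(min((a * x + b) % PRIME for x in fingerprint)
                 for a, b in params)


def candidate_pairs(fingerprints, n_bands=20, rows_per_band=5, seed=0):
    """Return index pairs (i, j), i < j, whose signatures agree on at least one band.

    Fingerprints with high Jaccard similarity are likely to share a band;
    dissimilar ones rarely do, so most pairs are never compared directly.
    """
    params = make_hash_params(n_bands * rows_per_band, seed)
    buckets = defaultdict(list)
    for idx, fingerprint in enumerate(fingerprints):
        signature = minhash_signature(fingerprint, params)
        for band in range(n_bands):
            start = band * rows_per_band
            key = (band, signature[start:start + rows_per_band])
            buckets[key].append(idx)

    pairs = set()
    for members in buckets.values():
        # Indices were appended in increasing order, so each pair has i < j.
        pairs.update(combinations(members, 2))
    return pairs
```

Calling `candidate_pairs` on a list of fingerprints returns only the pairs that collide in some hash bucket, which is what lets a similarity search avoid comparing every waveform window against every other. In this sketch, more bands make it easier for moderately similar fingerprints to become candidates, while more rows per band make the match stricter.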
The latest generation of earthquake detection methods, including FAST and other new approaches based on deep neural networks, reflects a broader trend toward data-driven methods in the solid Earth geosciences (Bergen et al., 2019). I will conclude the talk with a brief discussion of opportunities for collaboration between the geoscience and data science communities that will advance the state of the art in both fields. In particular, I will highlight how research in the emerging discipline of scientific machine learning will play a critical role in driving discovery in the Earth and physical sciences.