Ocean exploration with data science: An antidote to chaos?
Data science has the potential to accelerate progress in oceanography and across the geosciences. Three studies using complex data demonstrate how novel applications create new knowledge, and each stresses the importance of rigorous statistical validation. First, global ocean dynamical regimes relevant to climate are identified from a realistic ocean model by combining theory-driven dimensionality reduction and clustering, with information theory used for validation. The geographic distribution of the dynamical regimes gives insight into potential heat transport pathways and could help model parameterization. Second, eco-provinces analogous to those on land are identified from a global ecosystem model with 51 plankton types and 4 nutrient fluxes. Using the existing Longhurst provinces as a benchmark for skill, the Systematic AGgregated Eco-province (SAGE) method combines domain knowledge, probabilistic dimensionality reduction, clustering and graph theory. The eco-provinces can be nested to suit both regional and global studies, with target applications spanning conservation, cruise planning and fisheries. Third, the global dynamical regimes are used to predict, from surface data alone, which dynamical regime is locally present. A deep neural network (gradient boost machines) is trained using sea surface height, wind stress, ocean depth and latitude, with the goal of applying similar predictive models to satellite fields. If successful, such predictive models could offer insight into ocean heat dynamics relevant to operational applications such as medium- to long-range weather forecasting. Each of these studies applies novel data science methods to make objective, data-driven discoveries that could directly aid efforts to understand our changing climate and help society prepare for future changes.
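The idea of clustering followed by information-theoretic validation, as described for the first study, can be illustrated with a minimal sketch. This is not the method used in the studies: the data are synthetic 2-D points standing in for dimensionally reduced model output, the clustering is plain k-means, and the score is a simplified BIC-style criterion that assumes spherical Gaussian clusters with a shared variance. The names `kmeans`, `bic_score` and `scores` are hypothetical.

```python
import math
import random


def assign(points, centers):
    """Label each point with the index of its nearest centre."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]


def kmeans(points, k, iters=50):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next centre: the point farthest from every centre chosen so far.
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(x) / len(members)
                                         for x in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's centre
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return assign(points, centers), centers


def bic_score(points, labels, centers):
    """Simplified BIC-style score (assumption: spherical, shared variance).

    Lower is better: the first term rewards a tight fit, the second
    penalises the number of fitted centre coordinates.
    """
    n, d, k = len(points), len(points[0]), len(centers)
    rss = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return n * math.log(rss / n) + k * d * math.log(n)


# Synthetic stand-in data: three well-separated blobs of 30 points each.
random.seed(0)
true_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(random.gauss(cx, 0.5), random.gauss(cy, 0.5))
          for cx, cy in true_centers for _ in range(30)]

# Score a range of candidate cluster counts.
scores = {k: bic_score(points, *kmeans(points, k)) for k in range(1, 6)}
for k, s in scores.items():
    print(f"k={k}: score={s:.1f}")
```

In this toy setting the score drops sharply up to the true number of blobs and changes much less beyond it. A simple criterion like this one can keep decreasing slowly for larger `k`, so the point of diminishing returns, rather than the strict minimum, is usually the more informative signal.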