Abstracting the environment: automating geoscientific simulation

Lead Research Organisation: Imperial College London
Department Name: Dept of Computing

Abstract

This project will deliver a revolutionary increase in the ability of
geoscientists to implement computer simulations, especially for
emerging parallel hardware, and to work with the results of simulations.

Computer simulation of processes in the Earth system has become one of
the key tools of science. In the atmosphere and ocean and from frozen
ice sheets to the molten rock of the Earth's mantle, simulations of
fluids and solids are ubiquitous and essential tools. These are
critical processes: many of the world's largest computers are engaged
in simulating them.

The numerical methods used to produce these models are becoming
rapidly more sophisticated. At the same time the emergence of
massively parallel computer hardware presents the opportunity for
unprecedented levels of resolution. However, the complexity of the
numerics and the new hardware is such that it is becoming very
difficult for researchers to write computer code which is correct,
sufficiently high performance and sufficiently usable. Conventional
software development essentially requires superhuman developers who
are simultaneously geoscientists, mathematicians and computer
scientists.

In essence these difficulties occur because conventional computer code
mixes the numerics and the parallel implementation. Instead, models
could be developed by specifying the numerical methods in a high-level
computer language similar to the maths, and the parallel
implementation could be generated automatically. This would enable
experts in numerical modelling to specify their algorithm, and arrive
at correct, parallel code at a tiny fraction of current development
costs.
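
As a small illustration of what "similar to the maths" can mean, consider the weak form of the Poisson equation. This equation is chosen here purely as an assumed example; the proposal does not single out any particular model. In a high-level language such as UFL the developer would write something close to `inner(grad(u), grad(v))*dx == f*v*dx`, which mirrors the statement below almost symbol for symbol, and the parallel assembly and solve code would be generated from it.

```latex
% Illustrative weak form (assumed example): find u in V such that
\[
  \int_\Omega \nabla u \cdot \nabla v \,\mathrm{d}x
  \;=\;
  \int_\Omega f \, v \,\mathrm{d}x
  \qquad \text{for all } v \in V ,
\]
% where \Omega is the domain, f is a given source term and V is the
% chosen finite element function space.
```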

This automatic generation of models is a reality today for scientists
and engineers working on smaller-scale and simplified simulation
problems. However, the curvature of the Earth, the extreme flatness of
geophysical domains and the scale of the domains involved mean that
geoscientists have additional needs which require deep changes in
simulation code generation systems. I will extend code generation
techniques to meet these special challenges, and therefore deliver
automation to geoscientific model development.

Much science does not just depend on simulating processes: it also
depends on studying the sensitivity of systems, optimising inputs and
parameters, stability analysis and error analysis. All of these
processes require an adjoint model: essentially the gradient of the
original simulation. Developing adjoint models is so complex that only
the largest national centres can typically afford to develop
them. Using code generation, I have already demonstrated that this can
be made almost automatic for some types of model. I will extend this
capability to other discretisations which are more common in the
geosciences, and thereby put the powerful tool of adjoint models into
the hands of the individual scientists and students who conduct much
of the cutting-edge geoscience.
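
To make "essentially the gradient of the original simulation" concrete, the following is a minimal hand-written sketch for a toy model, not the automated approach described above. The model (explicit Euler for du/dt = -m*u), the functional J = u_N^2 / 2 and all function names are assumptions introduced for illustration only. The adjoint sweeps backwards through the stored time steps and returns dJ/dm in a single pass; a finite-difference estimate is included as a check.

```python
def forward(m, u0=1.0, dt=0.1, steps=20):
    """Toy forward model (hypothetical): explicit Euler for du/dt = -m*u.

    Returns the full trajectory, which the adjoint sweep needs later.
    """
    us = [u0]
    for _ in range(steps):
        us.append(us[-1] * (1.0 - dt * m))
    return us


def functional(us):
    """Quantity of interest J = 0.5 * u_N**2 (an assumed example)."""
    return 0.5 * us[-1] ** 2


def adjoint_gradient(m, u0=1.0, dt=0.1, steps=20):
    """Compute dJ/dm with one backward (adjoint) sweep."""
    us = forward(m, u0, dt, steps)
    lam = us[-1]  # adjoint variable: dJ/du_N
    grad = 0.0
    for k in reversed(range(steps)):
        # Step k is u_{k+1} = u_k * (1 - dt*m).
        # Its partial derivative with respect to m is -dt * u_k.
        grad += lam * (-dt * us[k])
        # Propagate the adjoint variable back from u_{k+1} to u_k.
        lam *= 1.0 - dt * m
    return grad


def finite_difference_gradient(m, eps=1e-6, **kwargs):
    """Central finite-difference estimate of dJ/dm, used only as a check."""
    j_plus = functional(forward(m + eps, **kwargs))
    j_minus = functional(forward(m - eps, **kwargs))
    return (j_plus - j_minus) / (2.0 * eps)


if __name__ == "__main__":
    m = 0.5
    print("adjoint gradient:          ", adjoint_gradient(m))
    print("finite-difference gradient:", finite_difference_gradient(m))
```

Even in this one-parameter case the adjoint has to be derived step by step from the forward code and kept consistent with it. For a model with millions of inputs the backward sweep still costs roughly one extra model run, whereas finite differences need a run per input, which is why adjoints are so valuable and why deriving them by hand is so costly.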

The largest simulations, particularly of the climate system, produce
so much complex data that much important science occurs by studying
the output of archived simulations. For large collections of data from
many models, even the process of calculating statistics is laborious
and error-prone. It is also currently impossible to verify if
published data analyses are correctly calculated. I will extend the
automated generation of simulation software to allow for an automated
data query language. This will make this form of data science far less
labour-intensive, will allow data science with properly published methods,
will reduce sources of error and will allow scientists to work effectively
with the massive data sets of the future.

Planned Impact

Commercial and public bodies simulating geoscientific problems
=======================================

This project will deliver the ability to simulate complex geophysical
phenomena by spending limited staff time on writing short pieces of
code, while automatically utilising modern parallel hardware. The automatic
availability of adjoint models makes this capacity truly revolutionary.

As a first example, colleagues at Imperial have worked with Tidal
Generation Ltd., a UK tidal power generation company, to optimise the
placement of tidal turbines to produce the maximal electrical
power. Using the existing automatic generation capability, idealised
cases are possible. This fellowship would enable realistic scenarios to
be simulated. Making the best use of the UK's tidal resources would be
of immense economic value.

The ability to create automated and adjointed models enables the study
of the sensitivity of environmental systems to parameters. A few
examples:

* Adjoint storm surge and flooding models would enable the
Environment Agency to know which missing data would best improve
forecasts.

* An adjoint coastal flow model would enable the effect of fish-farm
proximity on productivity and disease propagation to be rigorously
analysed. This could enable the calculation of per-farm optimal
separation, rather than the current one-size-fits-all approach.

* Adjoint small-scale atmosphere models can be used by
urban planners to understand the effect of traffic management measures
on local air quality.

The Met Office weather and climate forecasting
-----------------------------------------------------

The Met Office is currently in the early stages of a total rebuild of
their climate and weather forecasting systems, a project in which I am
closely involved. The capability delivered by this fellowship will
enable rapid prototyping of Met Office capability, and advances
demonstrated here in aspects such as vertical structuring and
automation of adjoints are likely to be taken up by that project.
Adjoint simulation is utterly critical to the data assimilation process
which makes the weather forecast accurate.

Engineering simulations
==============

The ability to automate simulations on large, high aspect ratio,
curved domains is critical to the geosciences. However it will also
enable the automation of simulation in a vast array of engineering
contexts. Thin-walled structures are ubiquitous in engineering
applications. Car and aircraft bodies, machine components, glass and plastic
plates, all are shell-like structures critical to engineering
performance and safety.

The extensions to automated modelling which enable simulation
in columnar meshes on the sphere will also enable the automation of
simulations on arbitrary curved, thin domains. Combined with the
automated ability to conduct adjoint simulations, this will enable
automated design optimisation to be employed by small, dynamic
entrants into engineering fields, as well as massively reducing the
cost of doing so by larger companies.

Policy maker and public confidence in climate model analysis
======================================

The extension of the high-level simulation specification language UFL
into an automated query language for large data sets will clearly
benefit the scientists working with that data. However it will also
enable those scientists to publish their methods in a far more precise
and accurate way than is currently possible. The ability to be
absolutely precise about how data was analysed and processed is
critical if public and policy-maker confidence in climate science is
to be maintained. Indeed, the Oxburgh Report on the Climate Research
Unit at the University of East Anglia highlighted the need to devote
more attention "to archiving data and algorithms and recording exactly
what they did". The outcome of this fellowship will enable scientists
to do the last of these.

Publications


Heinis T (2015) On-the-Fly Data Synopses in ACM SIGMOD Record
Homolya M (2016) A Parallel Edge Orientation Algorithm for Quadrilateral Meshes in SIAM Journal on Scientific Computing
Luporini F (2015) Cross-Loop Optimization of Arithmetic Intensity for Finite Element Local Assembly in ACM Transactions on Architecture and Code Optimization
McRae A (2016) Automated Generation and Symbolic Manipulation of Tensor Product Finite Elements in SIAM Journal on Scientific Computing
Rathgeber F (2016) Firedrake in ACM Transactions on Mathematical Software
 
Title Firedrake 
Description Firedrake is an automated system for the portable solution of partial differential equations using the finite element method (FEM). Firedrake enables users to apply a wide range of discretisations to an infinite variety of PDEs and to use either conventional CPUs or GPUs to obtain the solution.
Type Of Technology Software 
Year Produced 2013 
Open Source License? Yes  
Impact Firedrake is a principal test platform for the development of Gung Ho, the future UK Met Office dynamical core.
URL http://www.firedrakeproject.org/