1.1 Introduction
This manual presents theoretical background, numerical examples, and guidance for
using the program library GRAV3D. This suite of algorithms, developed at the UBC Geophysical
Inversion Facility, is used to invert gravimetric responses over a three-dimensional
distribution of density contrast, or anomalous density (henceforth referred to as 'density' for
simplicity). The manual is designed so that geophysicists who are familiar with the gravity
experiment, but who are not necessarily versed in the details of inverse theory, can use the codes
and invert their data. In the following, we describe the basics of the algorithm, but readers are
referred to Li and Oldenburg (1997, 1998) for an in-depth discussion of various aspects of the
algorithm. Note that an understanding of these components is necessary for the user to have a
global view of the algorithms and to use the program library correctly.
A gravity experiment involves measuring the vertical components of the gravity field
produced by anomalous (either excess or deficient) mass beneath the surface. A distribution
of anomalous mass, characterized by anomalous density ρ(x, y, z), produces its own gravity field, g_s, which
is superimposed on the ambient gravity field. By measuring the resultant field and removing
the ambient field from the measurements through numerical processing, one obtains the field
due to the anomalous mass.
The vertical component of the gravity field produced by the density ρ(x, y, z) is given by

$$ g_z(\vec{r}_0) = \gamma \int_V \rho(\vec{r}) \, \frac{z - z_0}{|\vec{r} - \vec{r}_0|^3} \, dv, \qquad (1) $$

where \(\vec{r}_0\) denotes the observation location and \(\vec{r}\) is the source location. V represents
the volume of the anomalous mass, and γ is the gravitational constant. Here we have adopted
a Cartesian coordinate system having its origin on the earth’s surface and the z-axis pointing
vertically downward.
The data from a typical gravity survey are a set of field measurements acquired on a 2D
grid over the surface. These data are first processed to yield an estimate of the anomalous field, which is due
to the excess or deficient mass below the data area. The goal of the gravity inversion is to
obtain, from the extracted anomaly data, quantitative information about the distribution of the
anomalous density in the ground.
The GRAV3D library consists of three programs:
- GZFOR3D: calculates the surface gravity data produced by a given density model.
- GZSEN3D: calculates the sensitivity matrix for use in the inversion.
- GZINV3D: inverts the anomalous gravity field to generate a density model.
In the following, we outline the basics of the forward and inverse procedures used by these
programs.
1.2 Forward Modelling
Forward modelling of gravity data is a linear problem and can be carried out by performing
the integration in eq.(1). We divide the region of interest into a set of 3D prismatic cells by
using a 3D orthogonal mesh and assume a constant density within each cell. We discretize the
density model in this manner since it is best suited for our inversion methodology. Given such
a discretization, the gravity field at the i'th observation location can be written as

$$ d_i \equiv g_z(\vec{r}_{0i}) = \sum_{j=1}^{M} \rho_j \left[ \gamma \int_{\Delta V_j} \frac{z - z_{0i}}{|\vec{r} - \vec{r}_{0i}|^3} \, dv \right] \equiv \sum_{j=1}^{M} G_{ij} \, \rho_j. \qquad (2) $$

In eq.(2), ρ_j and ΔV_j are the density and volume of the j'th cell, d_i is introduced as a
generic symbol for the i'th datum, and G_ij, defined by the expression in brackets, quantifies
the contribution of the j'th cell to the i'th datum. The solution for the integral in eq.(2) can
be found in Nagy (1966), and we have adopted the solution by Haaz (1953) here. Expressed in
matrix notation, the gravity data consisting of N observations are given by

$$ \vec{d} = G \vec{\rho}, \qquad (3) $$

where \(\vec{d} = (d_1, \ldots, d_N)^T\) is the data vector and \(\vec{\rho} = (\rho_1, \ldots, \rho_M)^T\) is the vector containing the
density values of the M cells. The program GZFOR3D performs the forward-modelling calculation
to generate the surface gravity data produced by a given density model. Program GZSEN3D
calculates the complete matrix G to be used in the inversion, with optional wavelet compression
applied.
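As a concrete, deliberately simplified illustration of eqs.(2)–(3), the sketch below assembles a sensitivity matrix by collapsing each prismatic cell to a point mass at its centre and then forward-models a constant density contrast. GRAV3D itself evaluates the exact rectangular-prism integral (Haaz, 1953); the function name, the tiny mesh, and the one-station survey here are hypothetical.

```python
import numpy as np

GAMMA = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def point_mass_sensitivity(obs, centers, cell_volume):
    """Approximate G_ij of eq.(2): vertical gravity at observation i per unit
    density in cell j. Each prism is collapsed to a point mass at its centre;
    GRAV3D itself evaluates the exact prism integral."""
    G = np.zeros((len(obs), len(centers)))
    for i, r0 in enumerate(obs):
        dr = centers - r0                      # vectors from observation to cell centres
        dist = np.linalg.norm(dr, axis=1)
        # vertical component: gamma * V * (z - z0) / |r - r0|^3, z positive down
        G[i] = GAMMA * cell_volume * dr[:, 2] / dist**3
    return G

# hypothetical tiny mesh: 2 x 2 x 2 cells of 50 m on a side
xs = np.arange(25.0, 100.0, 50.0)
centers = np.array([[x, y, z] for x in xs for y in xs for z in xs])
obs = np.array([[50.0, 50.0, -0.5]])           # one station 0.5 m above the surface
G = point_mass_sensitivity(obs, centers, 50.0**3)
d = G @ np.full(len(centers), 1000.0)          # eq.(3): 1 g/cm^3 = 1000 kg/m^3
```

The sensitivity of a shallow cell exceeds that of the cell directly below it, which is the depth decay that the depth weighting of Section 1.4 is designed to counteract.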
To illustrate the forward modelling program, and to introduce the example used in the
inversion section of this manual, we calculate the gravity anomaly on the surface of the earth
produced by the density model shown in Fig.1. The model consists of a dipping dyke with
anomalous density of 1.0 g/cm3 situated in a uniform background. The model is represented by
4000 cells (with 20 in each horizontal direction and 10 in the vertical direction). The anomaly
is calculated at a height of 0.5 m above the surface over a 21 by 21 grid with 50 m spacing in
both directions. A contour presentation of the data is shown in Fig.2.
Figure 1: Synthetic model; dipping dyke with anomalous density of 1 g/cm3. Top: a cross-section at north 525 m. Middle: plan-section at depth 125 m. Bottom: plan-section at depth 225 m.
Figure 2: The gravity anomaly in units of mGal. Uncorrelated Gaussian noise has been added to
the data. The standard deviation of the additive noise is 0.01 mGal plus 5% of the magnitude of the datum.
1.3 Inversion Methodology
The inverse problem is formulated as an optimization problem where an objective function
of the density model is minimized subject to the constraints that the data be reproduced to within
an error tolerance. The details of the objective function are problem-dependent and can vary
according to the available a priori information, but generally the objective function should have
the flexibility of constructing a model that is close to a reference model ρ_0 and producing a
model that is smooth in three spatial directions. An objective function that accomplishes this is

$$ \phi_m(\rho) = \alpha_s \int_V w_s \{ w(z)[\rho - \rho_0] \}^2 dv + \alpha_x \int_V w_x \left\{ \frac{\partial \, w(z)[\rho - \rho_0]}{\partial x} \right\}^2 dv + \alpha_y \int_V w_y \left\{ \frac{\partial \, w(z)[\rho - \rho_0]}{\partial y} \right\}^2 dv + \alpha_z \int_V w_z \left\{ \frac{\partial \, w(z)[\rho - \rho_0]}{\partial z} \right\}^2 dv, \qquad (4) $$

where the functions w_s, w_x, w_y, and w_z are spatially dependent, whereas α_s, α_x, α_y, and α_z
are coefficients which affect the relative importance of the different components of the objective
function. The greater the ratio α_x/α_s, the smoother the recovered model is in the x-direction,
and similarly for y and z. A length scale of smoothness can be defined for each direction as
L_x = (α_x/α_s)^{1/2}, L_y = (α_y/α_s)^{1/2}, and L_z = (α_z/α_s)^{1/2}; specifying greater values for the length scales will produce
smoother models.
The objective function in eq.(4) has the flexibility to allow many different models to be
constructed. The reference model ρ_0 may be a general background model that is estimated from
previous investigations, or it could be the zero model. The relative closeness of the final model
to the reference model at any location is controlled by the function w_s. For example, if the
interpreter has high confidence in the reference model in a particular region, w_s can be specified
to have increased amplitude there compared to other regions of the model. The weighting
functions wx, wy, and wz can be designed to enhance or attenuate structures in various regions
in the model domain. If geology suggests a rapid transition zone in the model, then a decreased
weighting for flatness can be put there and the constructed model will exhibit higher gradients
provided that this feature does not contradict the data.
The function w(z) in eq. (4) is a depth weighting that depends upon the model discretization
and observation location. The weighting function is used to counteract the decay of the kernel
functions, G_ij, with depth. Because of this decay, an inversion that minimizes the model
objective function subject to fitting the data will generate a density distribution that is concentrated near the
surface. We introduce a weighting of the form w(z) = (z + z_0)^{-α/2} into the objective function
to counteract this effect. The values of α and z_0 are discussed in the next section. They are
important in defining the model objective function and thus the type of final model.
The next step in setting up the inversion is to define a measure of misfit. Here we use the
2-norm measure

$$ \phi_d = \| W_d (G\vec{\rho} - \vec{d}) \|^2. \qquad (5) $$

We assume that the noise contaminating the data is independent and Gaussian with zero mean.
Specifying W_d to be a diagonal matrix whose i'th element is 1/σ_i, where σ_i is the standard
deviation of the i'th datum, makes φ_d a chi-squared random variable with N degrees
of freedom. Accordingly, E[φ_d] = N provides a target misfit for the inversion.
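In code, the misfit of eq.(5) and its chi-squared target are straightforward to evaluate. The random matrix G, noise level, and sizes below are illustrative stand-ins, not GRAV3D quantities.

```python
import numpy as np

def data_misfit(G, rho, d_obs, sigma):
    """phi_d = || W_d (G rho - d_obs) ||^2 with W_d = diag(1 / sigma_i), eq.(5)."""
    r = (G @ rho - d_obs) / sigma
    return float(r @ r)

# hypothetical linear system standing in for the gravity problem
rng = np.random.default_rng(0)
N, M = 100, 20
G = rng.normal(size=(N, M))
rho_true = rng.normal(size=M)
sigma = 0.05 * np.ones(N)
d_obs = G @ rho_true + sigma * rng.normal(size=N)

phi_d = data_misfit(G, rho_true, d_obs, sigma)
# for the true model, phi_d is a chi-squared variable with E[phi_d] = N
```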
The inverse problem is solved by finding a density ρ(x, y, z) which minimizes φ_m and misfits the
data according to the noise level. This is accomplished by minimizing φ(ρ) = φ_d + μφ_m, where
μ is a regularization parameter that controls the relative importance of the data misfit φ_d and the
model objective function φ_m. A value of μ is sought so that the data are neither fit too well
nor too poorly. To perform a numerical solution, we first discretize the objective function in eq.(4)
according to the mesh that defines the density model. This yields

$$ \phi_m = (\vec{\rho} - \vec{\rho}_0)^T W_m^T W_m (\vec{\rho} - \vec{\rho}_0), \qquad (6) $$

where \(\vec{\rho}\) and \(\vec{\rho}_0\) are M-length vectors. The individual matrices W_s, W_x, W_y, and W_z are straightforwardly
calculated once the model mesh and the weighting functions w(z) and w_s, w_x, w_y, w_z are
defined. The cumulative matrix W_m^T W_m is then formed. The details of these quantities are
specified by the user through the inputs to programs GZSEN3D and GZINV3D.
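In one dimension the discretized operators entering eq.(6) are easy to write down explicitly. The sketch below builds a diagonal smallness matrix and a finite-difference flatness matrix for a hypothetical row of M cells of width h (no depth weighting); the cumulative matrix would then be W_m^T W_m = α_s Ws^T Ws + α_x Wx^T Wx.

```python
import numpy as np

def smallness_and_flatness_1d(M, h=50.0):
    """1-D analogues of Ws (smallness) and Wx (flatness) for M cells of width h,
    so that ||Ws m||^2 approximates the integral of m^2 and ||Wx m||^2 the
    integral of (dm/dx)^2."""
    Ws = np.sqrt(h) * np.eye(M)                # cell-size quadrature weight
    Wx = np.zeros((M - 1, M))
    for k in range(M - 1):                     # forward differences between neighbours
        Wx[k, k], Wx[k, k + 1] = -1.0, 1.0
    Wx /= np.sqrt(h)                           # ((dm)/h)^2 * h = (dm)^2 / h
    return Ws, Wx

Ws, Wx = smallness_and_flatness_1d(5)
m = np.arange(5.0)                             # a linear model
g = Wx @ m                                     # its gradient is constant everywhere
```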
Since the density contrast is limited to a small range for any practical problems, and often
there are well-defined bounds on the density contrast based on direct sampling or other geological
information, we also impose constraints to restrict the solution to lie between a lower and upper
bound. Thus the solution is obtained from the following constrained minimization problem:

$$ \min \ \phi = \phi_d + \mu \phi_m \quad \text{subject to} \quad \vec{\rho}^{\,min} \le \vec{\rho} \le \vec{\rho}^{\,max}, \qquad (7) $$

where \(\vec{\rho}^{\,min}\) and \(\vec{\rho}^{\,max}\) are vectors containing the lower and upper bounds on the model values.
When the standard deviations of the data errors are known, the acceptable misfit is given by
the expected value E[φ_d] = N, and we search for the value of μ that produces the expected misfit.
Otherwise, an estimated value of μ will be prescribed. The details of various aspects of choosing
a regularization parameter are discussed in a following section.
We use a primal logarithmic barrier method with the conjugate gradient technique as the
central solver. In the logarithmic barrier method, the bound constraints are implemented as a
logarithmic barrier term. The new objective function is given by

$$ \phi(\rho) = \phi_d + \mu \phi_m - 2\lambda \sum_{j=1}^{M} \left[ \ln(\rho_j - \rho_j^{min}) + \ln(\rho_j^{max} - \rho_j) \right], \qquad (8) $$

where λ is the barrier parameter, and the regularization parameter μ is fixed during the
minimization. As the name suggests, the logarithmic barrier term forms a barrier along the
boundary of the feasible domain and prevents the minimization from crossing over to the
infeasible region. The method solves a sequence of nonlinear minimizations with decreasing λ
and, as λ approaches zero, the sequence of solutions approaches the solution of eq.(7).
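The effect of the barrier term in eq.(8) can be seen on a one-variable toy problem: minimize (x - 2)² subject to 0 ≤ x ≤ 1. As λ shrinks, the barrier minimizer slides toward the constrained solution x = 1. A grid search is used here purely for transparency; GRAV3D uses a conjugate-gradient solver inside the barrier iterations.

```python
import numpy as np

def barrier_min_1d(lam, grid):
    """Minimizer of (x - 2)^2 - 2*lam*(ln(x - 0) + ln(1 - x)) over a grid in (0, 1)."""
    f = (grid - 2.0) ** 2 - 2.0 * lam * (np.log(grid) + np.log(1.0 - grid))
    return grid[np.argmin(f)]

grid = np.linspace(1e-6, 1.0 - 1e-6, 200001)
lams = [1.0, 0.1, 0.01, 0.001]
xs = [barrier_min_1d(lam, grid) for lam in lams]
# as lam -> 0 the barrier minimizer approaches the constrained solution x = 1
```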
The above methodology provides a basic framework for solving 3D gravity inversion with
arbitrary observation locations. The basic components are the forward modelling, a model
objective function that incorporates a "depth" weighting, a data misfit function, a regularization
parameter that ultimately determines how well the data will be fit, and the logarithmic barrier
method to obtain the solution with bound constraints. Without getting into the algorithmic details,
we discuss three of these basic components in the next sections, namely, the depth weighting,
efficient forward mapping, and the choice of the regularization parameter. An understanding of
these components is necessary for the user to have a global view of the algorithm and to use
the program library correctly.
1.4 Depth Weighting
It is well known that gravity data have no inherent depth resolution. As a result, structures
will concentrate near the surface when a simple smallest or flattest model is produced, regardless of the true depth of
the causative bodies. In terms of model construction, this is a direct manifestation of the kernels’
decay with depth. Because of their rapidly diminishing amplitude at depth, the kernels of surface
data are not sufficient to generate a function that possesses significant structure at depth. To
overcome this, the inversion needs to introduce a depth weighting to counteract this natural
decay. Intuitively, such a weighting will approximately cancel the natural decay and give cells
at different depths equal probability of entering the solution with a non-zero density.
For data acquired over a relatively flat surface, the sensitivity decays predominantly in the
depth direction. Numerical experiments indicate that a function of the form (z + z_0)^{-2} closely
approximates the kernels' decay directly under the observation point, provided that a reasonable
value is chosen for z_0. An exponent of 2 is consistent with the fact that the gravity
field decays as the inverse distance squared. The value of z_0 can be obtained by matching the
function (z + z_0)^{-2} with the field produced by a column of cells directly beneath the observation
point. In accordance with the discretization used in the inversion, we use a depth weighting
function of the form

$$ w(z_j) = \left[ \frac{1}{\Delta z_j} \int_{\Delta z_j} \frac{dz}{(z + z_0)^{\alpha}} \right]^{1/2}, \qquad \alpha = 2, \qquad (9) $$

for the inversion, where j identifies the j'th cell and Δz_j is its thickness. This weighting
function is first normalized so that the maximum value is unity. Numerical tests indicate that
when this weighting is used, the density model constructed by minimizing a model objective
function in eq.(4), subject to fitting the data, places the recovered anomaly at approximately
the correct depth.
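For α = 2 the integral in eq.(9) has a closed form, so the cell weights are simple to compute. The layer geometry and the value of z_0 below are hypothetical; GRAV3D estimates z_0 by matching the kernel decay as described above.

```python
import numpy as np

def depth_weight(z_top, z_bot, z0):
    """Cell depth weights for alpha = 2, using the closed-form integral
    (1/dz) * int (z + z0)^(-2) dz = (1/dz) * [1/(z_top + z0) - 1/(z_bot + z0)]."""
    dz = z_bot - z_top
    integral = 1.0 / (z_top + z0) - 1.0 / (z_bot + z0)
    return np.sqrt(integral / dz)

tops = np.arange(0.0, 500.0, 50.0)   # hypothetical 10 layers, each 50 m thick
w = depth_weight(tops, tops + 50.0, z0=25.0)
w /= w.max()                         # normalize so the maximum weight is unity
# weights decay with depth, mirroring the (z + z0)^(-alpha/2) kernel decay
```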
For data sets acquired over rugged terrain, the simple depth decay does not describe the
dominant variation in sensitivity. Therefore a weighting function that varies in three dimensions
is needed. We generalize the depth weighting to form a distance weighting:

$$ w_j = \left\{ \frac{1}{\Delta V_j} \sum_{i=1}^{N} \left[ \int_{\Delta V_j} \frac{dv}{(R_{ij} + R_0)^{\alpha}} \right]^2 \right\}^{1/4}, \qquad j = 1, \ldots, M, \qquad (10) $$

where ΔV_j is the volume of the j'th cell, R_ij is the distance between a point within ΔV_j and the i'th observation,
and R_0 is a small constant used to ensure that the integral is well defined (chosen to be a quarter
of the smallest cell dimension). As with the depth weighting, this function is normalized to have a
maximum value of unity. For inversion of data sets acquired in areas with high topographic
relief, it is advantageous to use this more general form of weighting function.
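A sketch of the distance weighting of eq.(10), with the cell integral crudely replaced by a one-point quadrature (centre value times cell volume). The observation grid, cell positions, and R_0 are made up for illustration; GRAV3D evaluates the integral over each cell properly.

```python
import numpy as np

def distance_weight(centers, volumes, obs, alpha=2.0, R0=12.5):
    """Distance weights in the spirit of eq.(10), with the cell integral
    replaced by a one-point quadrature: centre value times cell volume."""
    w = np.empty(len(centers))
    for j, (c, V) in enumerate(zip(centers, volumes)):
        R = np.linalg.norm(obs - c, axis=1)    # distances to every observation
        integ = V / (R + R0) ** alpha          # one-point estimate of the integral
        w[j] = (np.sum(integ ** 2) / V) ** 0.25
    return w / w.max()                         # normalize to unit maximum

# hypothetical 3 x 3 observation grid on the surface; two cells at 25 m and 475 m depth
ox, oy = np.meshgrid(np.arange(0.0, 150.0, 50.0), np.arange(0.0, 150.0, 50.0))
obs = np.column_stack([ox.ravel(), oy.ravel(), np.zeros(9)])
centers = np.array([[50.0, 50.0, 25.0], [50.0, 50.0, 475.0]])
w = distance_weight(centers, [50.0 ** 3, 50.0 ** 3], obs)
```

The shallow cell, being closer to every observation, receives the unit weight, while the deep cell is strongly down-weighted, exactly the behaviour the depth weighting produces over flat terrain.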
The weighting function is directly incorporated in the sensitivity file generated by program
GZSEN3D. This program allows the user to specify whether to use the depth weighting or the
distance weighting depending on the terrain of the observed data.
1.5 Wavelet Compression of Sensitivity Matrix
The two major obstacles to the solution of large scale gravity inversion problems are the
large amount of memory required for storing the sensitivity matrix and the CPU time required
for the application of the sensitivity matrix to model vectors. The GRAV3D program library
overcomes these difficulties by forming a sparse representation of the sensitivity matrix using a
wavelet transform based on compactly supported, orthonormal wavelets. For more details, the
users are referred to Li and Oldenburg (1997). In the following, we give a brief description of
the method necessary for the use of the GRAV3D library.
Each row of the sensitivity matrix in a 3D gravity inversion can be treated as a 3D image and
a 3D wavelet transform can be applied to it. By the properties of the wavelet transform, most
transform coefficients are nearly or identically zero. When the coefficients with small magnitude
are discarded (the process of thresholding), the remaining coefficients still contain much of the
necessary information to reconstruct the sensitivity accurately. These retained coefficients form
a sparse representation of the sensitivity in the wavelet domain. The need to store only these
large coefficients means that the memory requirement is reduced. Further, the multiplication of
the sensitivity with a vector can be carried out by a sparse multiplication in the wavelet domain.
This greatly reduces the CPU time. Since the matrix-vector multiplication constitutes the core
computation of the inversion, the CPU time for the inverse solution is reduced accordingly. The
use of this approach increases the size of solvable problems by nearly two orders of magnitude.
Let G be the sensitivity matrix, and let \(\mathcal{W}\) be the symbolic matrix representation of the 3D
wavelet transform. Applying the transform to each row of G and forming a new matrix
consisting of the rows of transformed sensitivity is equivalent to the following operation:

$$ \widetilde{G} = G \mathcal{W}^T, \qquad (11) $$

where \(\widetilde{G}\) is called the transformed matrix. The thresholding is applied to individual rows of \(\widetilde{G}\)
by the following rule to form the sparse representation \(\widetilde{G}^s\):

$$ \widetilde{g}_{ij}^{\,s} = \begin{cases} \widetilde{g}_{ij}, & |\widetilde{g}_{ij}| \ge \delta_i \\ 0, & |\widetilde{g}_{ij}| < \delta_i \end{cases}, \qquad i = 1, \ldots, N, \qquad (12) $$

where δ_i is the threshold level, and \(\widetilde{g}_{ij}\) and \(\widetilde{g}_{ij}^{\,s}\) are the elements of \(\widetilde{G}\) and \(\widetilde{G}^s\), respectively.
The threshold levels δ_i are determined according to the allowable error of the reconstructed
sensitivity, which is measured by the ratio of the norm of the error in each row to the norm of that
row, r_i(δ_i). It can be evaluated directly in the wavelet domain by the following expression:

$$ r_i(\delta_i) = \frac{ \left( \sum_{|\widetilde{g}_{ij}| < \delta_i} \widetilde{g}_{ij}^{\,2} \right)^{1/2} }{ \| \widetilde{g}_i \| }, \qquad i = 1, \ldots, N. \qquad (13) $$

Here the numerator is the norm of the discarded coefficients in the i'th row. For each row we could choose δ_i such
that r_i(δ_i) = r*, where r* is the prescribed reconstruction accuracy; however, this is a costly process. Instead, we choose a representative row, i_0, and calculate its threshold level δ_{i_0}. This
threshold is then used to define a relative threshold δ* = δ_{i_0} / max_j |\widetilde{g}_{i_0 j}|. The absolute threshold
level for each row is then obtained by

$$ \delta_i = \delta^{*} \max_j |\widetilde{g}_{ij}|, \qquad i = 1, \ldots, N. \qquad (14) $$
The program that implements this compression procedure is GZSEN3D. The user is asked
to specify the relative error r*, and the program will determine the relative threshold level δ*.
Usually a value of a few percent is appropriate for r*. For experienced users, the program also
allows the direct input of the relative threshold level.
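A miniature version of the compression in eqs.(11)–(13) can be demonstrated with a hand-rolled orthonormal Haar transform. GRAV3D uses compactly supported Daubechies-type wavelets and a full 3D transform; the 1D transform, the smooth test row, and the 1% threshold here are illustrative assumptions only.

```python
import numpy as np

def haar(v):
    """Full multilevel orthonormal Haar transform of a length-2^k vector."""
    out = np.asarray(v, dtype=float).copy()
    n = len(out)
    while n > 1:
        a = (out[:n:2] + out[1:n:2]) / np.sqrt(2.0)   # averages
        d = (out[:n:2] - out[1:n:2]) / np.sqrt(2.0)   # details
        out[:n // 2] = a
        out[n // 2:n] = d
        n //= 2
    return out

# a smooth, kernel-like row that decays with distance (illustrative only)
row = 1.0 / (np.arange(1, 257) + 10.0) ** 3
c = haar(row)

delta = 0.01 * np.abs(c).max()                  # threshold: 1% of the largest coefficient
cs = np.where(np.abs(c) >= delta, c, 0.0)       # eq.(12)-style hard thresholding
r = np.linalg.norm(c - cs) / np.linalg.norm(c)  # eq.(13)-style relative error
sparsity = float(np.mean(cs == 0.0))            # fraction of coefficients discarded
```

Because the transform is orthonormal, the norm of the coefficient vector equals the norm of the row, so the relative error of eq.(13) can be evaluated entirely in the wavelet domain; most coefficients are discarded while the reconstruction error stays small.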
1.6 Choice of Regularization Parameter
The choice of the regularization parameter μ ultimately depends upon the magnitude of the
error associated with the data. The inversion of noisier data requires heavier regularization, thus
a greater value of μ is required. In this section, we discuss the various implementations for the
choice of μ in the GRAV3D library.
If the standard deviation associated with each datum is known, then the data misfit defined
by eq.(5) has a known expected value E[φ_d], which is equal to the number of data when the errors
are assumed to be independent Gaussian noise with zero mean. The value of μ should be such
that the expected misfit is achieved. This entails a line search based on the misfit curve as a
function of μ. Because of the positivity constraint, our problem is nonlinear. Thus for each μ a
nonlinear solution using the logarithmic barrier method must be obtained. This is computationally
demanding, and we have therefore developed the following strategy to reduce the cost.
It is observed that, when plotted on a log-log scale, the misfit curves for 3D inversion with
and without bound constraints often parallel each other in the vicinity of the expected misfit. The
curve with positivity must lie above the curve without positivity. Therefore, we can first perform
a line search without positivity to find a value μ_0 that gives rise to the target misfit. This search also generates
the slope, s_0, of the misfit curve at μ_0. This process is very efficient, and the required CPU time is
much smaller than the time required for the solution with positivity. We next assume
that s_0 can be used to approximate the slope of the misfit curve when positivity is imposed.
A rigorous line search incorporating positivity starts with an initial guess of μ = 0.5 μ_0. This
usually yields a misfit that is very close to the target value. If the misfit is not sufficiently close
to the target, however, a new guess for μ is obtained which makes use of the approximate slope s_0.
The inversion with the updated μ can be solved efficiently if the logarithmic barrier algorithm is
started from an initial model close to the final solution. That model is obtained by perturbing the
solution corresponding to the previous μ away from the zero bound. The line search using this
strategy is often successful in reaching the target after testing two to four values of μ. This
strategy is implemented in GZINV3D as the first method for choosing the tradeoff parameter.
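The first stage of the line search (no bound constraints) can be sketched with an unconstrained Tikhonov solve; the misfit is a monotone function of μ, which is what makes a bisection on log(μ) work. All sizes, the random system, and the identity W_m below are hypothetical.

```python
import numpy as np

def misfit(mu, G, d, Wm):
    """phi_d(mu) for the unconstrained Tikhonov problem
    min ||G m - d||^2 + mu ||Wm m||^2 (first-stage search, no bounds)."""
    m = np.linalg.solve(G.T @ G + mu * (Wm.T @ Wm), G.T @ d)
    r = G @ m - d
    return float(r @ r)

rng = np.random.default_rng(1)
N, M = 50, 30
G = rng.normal(size=(N, M))
d = G @ rng.normal(size=M) + 0.1 * rng.normal(size=N)
Wm = np.eye(M)

mus = np.logspace(-3, 3, 13)
phis = [misfit(mu, G, d, Wm) for mu in mus]
# phi_d grows monotonically with mu, so bisection on log(mu) brackets the target
```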
In practical applications the estimate of data error is often not available. The degree of regularization, and hence the value of μ, must then be determined based on other criteria. A commonly used method in linear inverse problems is the generalized cross-validation (GCV) technique. The use of GCV in inverse problems with inequality constraints such as positivity is far more involved and numerically expensive to implement. However, applying GCV to the 3D gravity inversion without bounds still produces a reasonable estimate of the data error when the data are dominated by anomalies of the same sign. That error can serve as a starting point for further adjustment by the user based on his or her judgement. Since no other information is assumed, we have chosen to use the value of μ obtained in this manner directly in the final inversion, which has bound constraints imposed. In this case, only one logarithmic barrier solution is needed. Numerical tests have indicated that this simplistic use of GCV is in fact surprisingly effective unless the data have a large negative bias or are distributed sparsely. GZINV3D has implemented this approach as the third method for choosing the tradeoff parameter.
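A minimal sketch of the GCV criterion applied to the unconstrained problem, with a hypothetical random system standing in for G; the GCV form below is the standard one for linear Tikhonov problems, not the exact GRAV3D implementation.

```python
import numpy as np

def gcv(mu, G, d, Wm):
    """GCV(mu) = N ||(I - H) d||^2 / trace(I - H)^2 for the unconstrained problem,
    with influence matrix H = G (G^T G + mu Wm^T Wm)^(-1) G^T."""
    N = len(d)
    H = G @ np.linalg.solve(G.T @ G + mu * (Wm.T @ Wm), G.T)
    r = d - H @ d
    return N * float(r @ r) / np.trace(np.eye(N) - H) ** 2

rng = np.random.default_rng(2)
N, M = 60, 40
G = rng.normal(size=(N, M))
d = G @ rng.normal(size=M) + 0.2 * rng.normal(size=N)

mus = np.logspace(-2, 4, 25)
scores = [gcv(mu, G, d, np.eye(M)) for mu in mus]
mu_gcv = mus[int(np.argmin(scores))]           # mu minimizing the GCV score
```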
The accompanying flowchart illustrates the structure of the program GZINV3D. It has three options for determining the tradeoff parameter. The controlling parameter is mode. When mode = 1, the line search based on a known target value of the data misfit is used. Two stages, as discussed above, are employed, and several solutions for different values of μ must be tested to obtain one that produces the target misfit. When mode = 2, the user specifies a tradeoff parameter μ and a single solution is produced. When mode = 3, the program first performs the GCV analysis on the inversion without positivity and then uses the resultant value of μ in the final inversion.
1.7 Example Inversion
We now illustrate results of using the program GZINV3D to invert the surface gravity data that was generated for the example in Section 1.2 above. The model region is again discretized into 20 by 20 by 10 cells of 50 m on a side. We invert the 441 noise-contaminated data illustrated in Fig.2 to recover the density contrasts within the 4000 cells.
For the model objective function, we choose Lx = Ly = Lz = 100.0 and unit 3D weighting functions for all components and the default parameters for w(z). A zero reference model is used. We have also imposed a lower bound of 0.0 and an upper bound of 2.0.
The final model is shown in Fig. 4. The top panel is a cross-section through the centre of the model and the other two panels are horizontal slices at different depths. The tabular shape of the anomaly and its dipping structure are very clear, and the depth extent is reasonably recovered. The dip angle inferred from the recovered model is close to the true value. The amplitude of the recovered model is slightly less than the true value. Overall, however, this recovered model compares favorably with the true model shown in Fig.1.