Inversion theory:

Contents and objectives

Now it is time to gather the pieces together and carry out an inversion. Study this page carefully because it explains conceptually how the ideas of misfit and model norm are used to perform an inversion. It also highlights the effects of making wrong choices when specifying how the algorithm is supposed to perform.

Introduction

The foundations for inversion have now been laid. Referring to the flow chart, we have outlined what is required before inversion can be carried out, we have mentioned briefly how the earth is discretized, and we have discussed the two fundamental components that are used to control the inversion process: misfit and model norms. The next step is to draw these concepts together into a useful inversion scheme.

Combining misfit and model norms

We have mentioned before that inversion can be described as a process for automatically deciding which of infinitely many possible models to select, based upon measured data and prior understanding about the problem. The choice should be an optimal model, so we are trying to solve an optimization problem. There are two components to this optimization problem: (1) misfit and (2) model norms.

(1) Misfit: The ideas about the data, errors, predicted data and misfit were introduced in the previous section. For now, we will assume that the errors on the data are independent and that we have some estimate of their standard deviations. The appropriate misfit function is

$$\phi_d = \sum_{i=1}^{N} \left( \frac{d_i^{obs} - d_i^{pred}}{\sigma_i} \right)^2$$

where $d_i^{obs}$ are the observed data, $d_i^{pred}$ are the predicted data, $\sigma_i$ are the standard deviations of the data errors, and N is the number of data.
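A minimal sketch of how a misfit of this kind can be evaluated, assuming the standard form in which each residual is divided by its standard deviation, then squared and summed. The function name `data_misfit` and all numerical values are hypothetical and chosen only for illustration. The second part draws independent Gaussian noise many times to check that the average misfit is close to the number of data, N.

```python
import numpy as np

def data_misfit(d_obs, d_pred, sigma):
    """Sum of squared residuals, each normalized by its standard deviation."""
    d_obs = np.asarray(d_obs, dtype=float)
    d_pred = np.asarray(d_pred, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    r = (d_obs - d_pred) / sigma
    return float(np.sum(r ** 2))

# Hypothetical three-datum example: the normalized residuals are
# about -1, 2 and 0, so the misfit is about 1 + 4 + 0 = 5.
phi_d_example = data_misfit([1.0, 2.0, 3.0], [1.1, 1.8, 3.0], [0.1, 0.1, 0.1])

# If the errors really are independent and Gaussian with the assumed
# standard deviations, the misfit averages out to roughly N.
rng = np.random.default_rng(0)
N = 20
sigma_n = np.full(N, 0.5)          # assumed standard deviations
d_clean = np.zeros(N)              # noise-free data (arbitrary choice)
misfits = [
    data_misfit(d_clean + sigma_n * rng.standard_normal(N), d_clean, sigma_n)
    for _ in range(5000)
]
mean_misfit = float(np.mean(misfits))

print("example misfit:", phi_d_example)
print("mean misfit over noise realizations:", mean_misfit, "with N =", N)
```

Any single noise realization will give a misfit that scatters around N, which is why a target value near N, rather than exactly N, is a reasonable goal.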
Acceptable models are those for which the misfit is approximately equal to its expected value, N; we will let $\phi_d^{*}$ denote a "target" value for our misfit.

(2) Model norm: Ideas about the model norm, $\phi_m$, were also introduced in the previous section. It is usual for this function to contain elements representing closeness to a reference model, and elements representing the amount of structure in the spatial directions. Recall that we generally want to find a model that minimizes $\phi_m$. Such a solution would be close to a reference model, which represents our prior knowledge about the earth. It would also have minimum structure. Minimum-structure models are useful as starting points for interpretation because they can be expected to capture the important large-scale features of the earth, even though detail may be missed. The argument is that arbitrarily complicated models that reproduce the data can always be found, but arbitrarily simple models cannot. For this reason the general inverse problem is usually formulated in terms of minimizing a model norm.

Combining these two points allows us to state the inverse problem clearly as:

$$\text{minimize } \phi_m \quad \text{subject to} \quad \phi_d = \phi_d^{*}$$
To solve such problems, the optimization and its constraint are often recast mathematically as a single optimization. The model norm and the misfit function are combined into a single objective function and the problem is expressed as:

$$\text{minimize } \phi = \phi_d + \beta \, \phi_m$$
The quantity $\beta$ (beta) is called the regularization parameter or Tikhonov parameter. Its purpose is to control the relative importance attached to making the misfit small and to reducing the value of the model norm. Its value is not known when the inversion begins. Rather, a value of $\beta$ is sought so that when $\phi$ is minimized, the computed model has a misfit that is equal to some predetermined target value or is less than some tolerance. To help consolidate intuition about the role of $\beta$, we present a simple analogy that should be familiar to everyone.

The role of Beta

In our formulation of the inverse problem, we have two quantities that we want to make small. We want to minimize the model norm and we also want to minimize the misfit. Here is an example of another problem in which it is desirable to minimize two quantities simultaneously. Suppose a traveller is attempting to go from point A to point B on the accompanying map. The two-part optimization problem is that the traveller wants to find a speed that will minimize the time taken on the trip, and would also like that speed to result in minimum fuel consumption. Both time and fuel consumption are functions of speed, and we could express this problem in the same form as our inversion problem:

$$\text{minimize } \phi = \text{time} + \beta \cdot \text{fuel}$$
The objective is to find a speed that will minimize $\phi$. If we set $\beta = 0$, then the minimizing process will ignore fuel consumption and it will find a speed that minimizes time. On the other hand, if we set $\beta$ to be large, the minimization process will recover a speed that keeps fuel consumption to a minimum regardless of the time taken. Using optimization to manage this decision-making process can be illustrated with the "tradeoff" curve shown in the accompanying figure. The result we end up choosing depends upon the choice of $\beta$: large values of $\beta$ result in low, efficient speeds, while smaller values of $\beta$ result in high, inefficient speeds. Now, which value of $\beta$ should we choose? One good way of clarifying this travel problem is to specify that we want to ...

Specifying misfit to constrain the optimization

Let's return to our inverse problem, where we had defined a model norm $\phi_m$ and a data misfit $\phi_d$. We combine these into one objective function and minimize

$$\phi = \phi_d + \beta \, \phi_m$$
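For a linear problem, minimizing a combined objective function of this kind for a fixed $\beta$ reduces to solving one linear system. The sketch below is only an illustration under stated assumptions: the forward problem (Gaussian averaging kernels `G`), the true model, the noise level, and the use of a simple "smallest model" norm $\phi_m = \|m - m_{ref}\|^2$ are all hypothetical choices and are not taken from the Linear Inversion Applet. It solves the problem for a range of $\beta$ values and records the misfit and model norm for each.

```python
import numpy as np

# Hypothetical 1D linear problem: each datum is a Gaussian-weighted
# average of the model over a small window.
M, N = 100, 20                                  # model cells, data
x = (np.arange(M) + 0.5) / M                    # cell centres on [0, 1]
centers = np.linspace(0.0, 1.0, N)
G = np.exp(-0.5 * ((x[None, :] - centers[:, None]) / 0.03) ** 2) / M

# Hypothetical true model: a box plus a smooth bump.
m_true = np.where((x > 0.2) & (x < 0.4), 1.0, 0.0)
m_true = m_true + np.exp(-0.5 * ((x - 0.7) / 0.05) ** 2)

sigma = np.full(N, 0.005)                       # assumed standard deviations
rng = np.random.default_rng(0)
d_obs = G @ m_true + sigma * rng.standard_normal(N)
m_ref = np.zeros(M)                             # reference model

Gw = G / sigma[:, None]                         # rows scaled by 1/sigma
dw = d_obs / sigma

def minimize_objective(beta):
    """Model minimizing phi_d + beta * phi_m, with phi_m = ||m - m_ref||^2."""
    A = Gw.T @ Gw + beta * np.eye(M)
    b = Gw.T @ dw + beta * m_ref
    return np.linalg.solve(A, b)

def misfit(m):
    return float(np.sum((Gw @ m - dw) ** 2))

def model_norm(m):
    return float(np.sum((m - m_ref) ** 2))

betas = np.logspace(-3, 4, 30)
phi_d = np.empty(betas.size)
phi_m = np.empty(betas.size)
for k, beta in enumerate(betas):
    m = minimize_objective(beta)
    phi_d[k] = misfit(m)
    phi_m[k] = model_norm(m)

for beta, pd, pm in zip(betas[::5], phi_d[::5], phi_m[::5]):
    print(f"beta={beta:10.3e}  phi_d={pd:12.4f}  phi_m={pm:10.4f}")
```

Plotting `phi_d` against `phi_m` for these values traces out a tradeoff curve: as $\beta$ increases the misfit grows and the model norm shrinks.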
Carrying out the minimization for a range of $\beta$ values produces the Tikhonov curve shown in the accompanying figure. It is named after the Russian scientist who advocated its use. The question remains regarding which solution we want. In the previous section we showed that if the errors associated with the data are Gaussian and have known standard deviations, the expected value of the misfit function (equation 3.3.8) is N, the number of data points. So we can find a preferred model by performing several inversions using a range of $\beta$ values, and selecting the result which satisfies $\phi_d = N$. The preferred value of misfit, or target misfit, is specified as part of the inversion, and it will be called $\phi_d^{*}$.

What kinds of models would be obtained if the value of $\beta$ is more or less than the one which produces the preferred model? Following the tradeoff curve helps to anticipate what happens. Choosing the result obtained when $\beta$ was less than optimal means the misfit will be smaller and the model norm will be larger. Consider the meanings of this: a smaller misfit means the predictions will look more like the observations, including any noise they contain, and a larger model norm means the model will have more structure than our "optimal" model.
Conversely, larger values of $\beta$ will yield models that cause larger values of misfit, so predictions will look less like the observations. Also, model norm values will be smaller, meaning the model will be simpler than our "optimal" model. Figures at the end of this section illustrate these effects using results from the Linear Inversion Applet.

Application to the applet

Let us proceed with inversion using the UBC-GIF Linear Inversion Applet. As a reminder, all of the necessary steps are in the following list (the steps are shown in the figure below). Only the last two steps were not covered in the previous section:

The inversion result, or recovered model, will be displayed as a red line over the graph of the true (green) model. Predicted data will be displayed in the data window, and two other graphs will be plotted; these are explained later. Also, some numbers will be listed under the buttons, summarizing the values obtained for misfit, model norm, and other parameters.

Effect of specifying different Beta values

Three pairs of figures below illustrate what kinds of models are obtained when different values of target misfit are specified. This is done by changing the "chifact" value. Doing this corresponds to selecting an inversion result that uses a value of $\beta$ that is more (or less) than the one used when chifact = 1.

Inversion results

A question that often arises is, "Why isn't the inversion hardwired to find a solution that yields a misfit equal to N?" The answer has two parts. Firstly, even in synthetic examples where known Gaussian noise is added, the misfit between the true data and the observations is likely to differ somewhat from N. More importantly, the misfit measure in equation 3.3.8 assumes that the errors are Gaussian, independent, and have known standard deviations. In field examples we do not have the luxury of knowing the standard errors of the data, and hence we have to make a guess.
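The search for a $\beta$ that achieves a chosen target misfit can be sketched as follows. This is a hedged illustration, not the applet's actual algorithm: it assumes the target misfit is chifact times N, uses a hypothetical linear problem with Gaussian averaging kernels and a simple smallest-model norm with a zero reference model, and finds $\beta$ by bisection on $\log_{10}\beta$, relying on the fact that the misfit increases as $\beta$ increases. All names (`invert`, `find_beta`) are illustrative.

```python
import numpy as np

# Hypothetical 1D linear problem (not the applet's kernels).
M, N = 100, 20
x = (np.arange(M) + 0.5) / M
centers = np.linspace(0.0, 1.0, N)
G = np.exp(-0.5 * ((x[None, :] - centers[:, None]) / 0.03) ** 2) / M
m_true = np.where((x > 0.2) & (x < 0.4), 1.0, 0.0)
m_true = m_true + np.exp(-0.5 * ((x - 0.7) / 0.05) ** 2)
sigma = np.full(N, 0.005)                 # assumed standard deviations
rng = np.random.default_rng(0)
d_obs = G @ m_true + sigma * rng.standard_normal(N)

Gw = G / sigma[:, None]
dw = d_obs / sigma

def invert(beta):
    """Minimize phi_d + beta * ||m||^2 and return (m, phi_d, phi_m)."""
    m = np.linalg.solve(Gw.T @ Gw + beta * np.eye(M), Gw.T @ dw)
    phi_d = float(np.sum((Gw @ m - dw) ** 2))
    phi_m = float(np.sum(m ** 2))
    return m, phi_d, phi_m

def find_beta(chifact, lo=-4.0, hi=6.0, n_iter=60):
    """Bisection on log10(beta) so that phi_d is close to chifact * N."""
    target = chifact * N
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        _, phi_d, _ = invert(10.0 ** mid)
        if phi_d > target:
            hi = mid        # misfit too large: use a smaller beta
        else:
            lo = mid        # misfit too small: use a larger beta
    return 10.0 ** (0.5 * (lo + hi))

results = {}
for chifact in (0.1, 1.0, 10.0):
    beta = find_beta(chifact)
    _, phi_d, phi_m = invert(beta)
    results[chifact] = (beta, phi_d, phi_m)
    print(f"chifact={chifact:5.1f}  beta={beta:10.3e}  "
          f"phi_d={phi_d:8.3f}  phi_m={phi_m:8.3f}")
```

In this sketch a chifact below 1 selects a smaller $\beta$ and a model with more structure, while a chifact above 1 selects a larger $\beta$ and a simpler model that fits the data less closely.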
Being able to run an inversion with different values of $\beta$ allows one to compensate for incorrect estimates of the data errors. So how should an inversion be inspected to see if it was successful? Deciding exactly how to proceed with specific data sets takes some experience and understanding of the problem, the data set, and the methodology. We will discuss these issues in the sixth section of this chapter, and in other sections that discuss how to use or apply inversion.

Conclusion so far:

On this page and the previous one, we have introduced the primary concepts that underlie inversion. These two sections (3.3 and 3.4) are very dense with new concepts and information, and it is challenging to grasp it all quickly. In the last two sections of this chapter we show how these concepts are put into practice to solve real problems in geophysical inversion. But first, the next section gives a brief summary using interactive figures to re-emphasize these key concepts.