Uncertainties in the Integrated Assessment Process – A Tyndall Centre Project

by Peter Challenor last modified 2006-06-22 12:27

Peter Challenor introduces a project examining uncertainties in the Tyndall Centre's Integrated Assessment Model.

The Tyndall Centre for Climate Change Research has just funded a project to look at uncertainties in the Integrated Assessment Model (IAM) that the Centre is building. The investigators are Peter Challenor at the Southampton Oceanography Centre, Jim Hall and Jonathan Lawry at the University of Bristol, Michael Goldstein at the University of Durham, Tony O’Hagan at the University of Sheffield and Rachel Warren at the Tyndall Centre.

The Integrated Assessment Model is a mathematical model of the climate and socio-economic system, including feedbacks. It is a complex non-linear model, and the propagation of errors through such a system is non-trivial. Further complexity is introduced because the IAM is not a single model but is built from a series of modules. More complex or simpler modules can be substituted, giving a large number of possible model configurations.

Furthermore, we would like to include expert opinion and data in the forecasts. Although the main part of the model will be quantitative, some elements will be qualitative. For example, some expert opinion could be expressed in words such as ‘high’, ‘medium’ and ‘low’. There may even be modules that are qualitative: if we wish to know how people change their behaviour, this information may come from asking them rather than from a set of equations.

Thus there are two areas where we need to deal with uncertainty. The first is the propagation of uncertainty through non-linear equations in the form of statistical moments or probability density functions. The second is the handling of uncertainty which may be expressed in natural language rather than in traditional probability terms.

Objectives

The objective of the project is to provide users of the Tyndall integrated assessment process with tools that will allow them to estimate the uncertainty in any predictions of climate change impacts they make. This objective can be divided into two parts:

  1. to produce estimates of outputs of the IAM using the methods developed for the statistical analysis of computer code output (SACCO);
  2. to demonstrate methods of handling uncertainty in the less quantitative areas of the integrated assessment process. Here the methods may not be purely numerical, so a broader series of methods is needed, particularly to deal with linguistic information.

SACCO Methods

The methods we will be using are Bayesian (Goldstein, 1999). Thus we consider all unknown variables as random and relate their posterior distributions to prior information and the model or data through Bayes’ theorem. We will use two methods to look at uncertainty in the outputs of the IAM. The first is a full Bayes solution, in which the full probability density function of the model outputs is produced. This is demonstrated for low-dimensional problems by Kennedy and O’Hagan (2001). This method runs into problems with models of high dimension, however, in which case a linear Bayes solution is an attractive alternative.

Linear Bayes methods use a form of Bayes’ theorem relating moments rather than entire probability density functions. This means that only means and variances are propagated through the model; no assumption of Normality is necessary. These methods can deal with very high-dimensional problems and are described in Craig et al. (2001), who apply their method to a complex and demanding oil field simulator where a single run can take up to forty hours. In this problem their approach, which combines expert beliefs, historical data and simulator runs, produces reasonable forecasts, with uncertainties measured by variances, after just six simulator runs, and these forecasts compare well with data that had been held back. Forecasts can therefore be produced very efficiently, which means that we may be able to use more complex components in the IAM than is presently envisaged.
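To make the idea of updating moments rather than full distributions concrete, here is a minimal sketch of the standard linear Bayes adjustment of a quantity B by data D, which needs only prior means, variances and covariances. The function name `bayes_linear_adjust` and the toy numbers are illustrative assumptions; they are not taken from Craig et al. (2001) or from the project itself.

```python
import numpy as np

def bayes_linear_adjust(e_b, e_d, var_b, var_d, cov_bd, d_obs):
    """Adjust the mean and variance of B by observed D using second-order moments only.

    adjusted mean     = E(B) + Cov(B, D) Var(D)^-1 (d - E(D))
    adjusted variance = Var(B) - Cov(B, D) Var(D)^-1 Cov(D, B)
    """
    e_b = np.atleast_1d(np.asarray(e_b, dtype=float))
    e_d = np.atleast_1d(np.asarray(e_d, dtype=float))
    d_obs = np.atleast_1d(np.asarray(d_obs, dtype=float))
    var_b = np.atleast_2d(np.asarray(var_b, dtype=float))
    var_d = np.atleast_2d(np.asarray(var_d, dtype=float))
    cov_bd = np.atleast_2d(np.asarray(cov_bd, dtype=float))

    # A pseudo-inverse keeps the sketch usable when Var(D) is singular.
    gain = cov_bd @ np.linalg.pinv(var_d)
    adjusted_mean = e_b + gain @ (d_obs - e_d)
    adjusted_var = var_b - gain @ cov_bd.T
    return adjusted_mean, adjusted_var

# Toy example (assumed numbers): B has prior mean 10 and variance 4, and we
# observe D = B + noise, where the noise has mean 0 and variance 1.
# Then E(D) = 10, Var(D) = 4 + 1 = 5 and Cov(B, D) = 4.
adjusted_mean, adjusted_var = bayes_linear_adjust(
    e_b=10.0, e_d=10.0, var_b=4.0, var_d=5.0, cov_bd=4.0, d_obs=12.0
)
print(adjusted_mean, adjusted_var)
```

No distributional form is specified anywhere in the calculation, which mirrors the point that no assumption of Normality is needed.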

The basis of both methods is to consider the numerical model as a simple black box. We can consider the model as a function f(.) which relates the inputs (x) to the outputs (y), so we have

y=f(x)

This function is usually the solution to a system of partial or ordinary differential equations and is not available in closed form but has to be solved numerically. Thus, although in principle we can calculate its value at any point, f(x) can be considered as an unknown, or random, function. We can therefore apply statistical methods to estimate the form of this function from a small number of function evaluations. The methods we are proposing use a Gaussian process to describe this unknown function.

The model is run at a number of points according to some design. We will investigate suitable designs as part of the project, for example the sequential designs in Craig et al. (2001), but initially we could use a Latin Hypercube. Using the results at these points we then build a statistical model of f(.). This consists of mean and variance functions with an associated covariance structure. Such a model is capable of representing any ‘smooth’ function. Highly non-linear functions in the IAM that have steps or other discontinuities in them, collapse of the thermohaline circulation (THC) for example, will need special methods; we may be able to parameterise such a discontinuity.

This statistical model is known as an emulator and is used in place of the full numerical model to make statistical inferences. For high-dimensional models, our statistical modelling of the mean and variance functions will involve selecting reduced sets of important variables that can be used to explain the majority of the variability in the full model. Such emulators have other applications in addition to estimating uncertainties. For example, they provide rapid alternatives to the full numerical model for exploring parameter space, and the reduced statistical model reveals which variables are important in the full model and how they are related (empirically).
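The workflow described above (choose a design, run the model, fit a Gaussian process, then predict with uncertainty) can be sketched as follows. This is a simplified illustration under stated assumptions: `toy_simulator` is a hypothetical stand-in for an expensive model run, the prior mean is a constant, the covariance is squared-exponential, and the hyperparameters are fixed by hand rather than estimated. A real emulator for the IAM would need a richer mean function and fitted hyperparameters.

```python
import numpy as np

def latin_hypercube(n_runs, n_inputs, rng):
    """Design on [0, 1)^n_inputs with one point per equal-width stratum in each input."""
    strata = np.tile(np.arange(n_runs)[:, None], (1, n_inputs))
    design = (strata + rng.random((n_runs, n_inputs))) / n_runs
    for j in range(n_inputs):
        # Shuffle each column independently so strata are paired at random.
        design[:, j] = rng.permutation(design[:, j])
    return design

def sq_exp_cov(a, b, variance, length):
    """Squared-exponential covariance between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return variance * np.exp(-0.5 * sq_dist / length**2)

def fit_emulator(x_train, y_train, variance=1.0, length=0.3, nugget=1e-8):
    """Return a predict(x_new) function giving the emulator mean and variance."""
    prior_mean = y_train.mean()  # assumed constant mean function
    k = sq_exp_cov(x_train, x_train, variance, length)
    chol = np.linalg.cholesky(k + nugget * np.eye(len(x_train)))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train - prior_mean))

    def predict(x_new):
        k_star = sq_exp_cov(x_new, x_train, variance, length)
        mean = prior_mean + k_star @ alpha
        v = np.linalg.solve(chol, k_star.T)
        var = variance - (v**2).sum(axis=0)
        return mean, np.maximum(var, 0.0)

    return predict

def toy_simulator(x):
    """Hypothetical smooth 'model': in practice each call would be a costly run."""
    return np.sin(3.0 * x[:, 0]) + 0.5 * x[:, 1] ** 2

rng = np.random.default_rng(0)
x_train = latin_hypercube(20, 2, rng)   # 20 runs of a 2-input model
y_train = toy_simulator(x_train)
emulate = fit_emulator(x_train, y_train)

x_new = rng.random((5, 2))
mean, var = emulate(x_new)
for m, s, truth in zip(mean, np.sqrt(var), toy_simulator(x_new)):
    print(f"emulator {m:.3f} +/- {s:.3f}   simulator {truth:.3f}")
```

The emulator variance shrinks towards zero at the design points and returns to the prior variance far from them, which is what allows it to stand in for the full model while still reporting how uncertain it is about f(.).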

Generalised uncertainty in the IAM

The use of expert assessment and knowledge is widespread and fundamental to climate research. It is implicit in the choice of quantified modelling approaches (which processes to include in or exclude from a particular model) and in the evaluation of model results. Qualitative scenarios and storylines are constructed by experts and expressed in linguistic terms. In situations where there is very little quantitative evidence (modelled or measured), expert knowledge provides an essential mechanism for constraining predictions. If the process of eliciting expert judgements and including them in modelling processes is not effectively formalised, it can be prone to bias and lack transparency.

The Bayesian methods discussed in the previous sections lend themselves to inclusion of expert knowledge through the elicitation from experts of prior probability distributions. In this section of the research, a range of less standard methods for handling expert judgements and measured/modelled data will be addressed.

Methods of approximate reasoning provide a mechanism for formalising the construction of inferences from expert knowledge. In short, this involves placing a set of propositions in relation to each other and then attaching measures of belief to some of those propositions and the relationship between them. Rules may be learnt from data or elicited from experts – the latter will form the focus of the current research. This provides a visual or linguistic representation of the expert reasoning and an auditable account of it. This type of approach is widely used in Artificial Intelligence and has successfully been used in environmental simulation. Approximate reasoning methods can also be applied to situations where modelled data is available, for example in the mathematical parts of the IAM, in which case fuzzy sets and connectives will be learnt from data.

In both quantitative and qualitative modelling, in situations of information scarcity it is necessary to assemble information from a range of, usually only partially relevant, sources and in a range of formats. For example, information may appear as point measurements, interval measurements or linguistic statements. The theory of random sets provides a coherent framework for integrating probabilistic, interval-based and fuzzy information. It includes probability theory and fuzzy set theory as special cases. Random set approaches have been used in reliability analysis of flood defences given very scarce data about defence performance (Hall and Lawry, 2001). The approach is based on the concept of a label description of some domain of interest. Labels can be thought of as words, such as “large” or “very large”, that describe locations in the domain. Any point in the domain has associated measures of membership in each of the set of labels. It is therefore possible to express a data set in terms of a mass distribution on these labels and, conversely, to construct a posterior distribution on the parameter space from a mass distribution on the labels. The approach therefore provides a mechanism for combining linguistic statements with data.
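As a rough illustration of expressing a data set as a mass distribution on labels, the sketch below turns membership degrees into a nested ("consonant") mass assignment on sets of labels and averages it over the data. The article does not say which construction the project uses, so the consonant assignment, the trapezoidal membership functions and the names `appropriateness`, `consonant_mass` and `dataset_mass` are all assumptions made for the example.

```python
import numpy as np

def appropriateness(x):
    """Hypothetical membership of x (on [0, 1]) in the labels low, medium, high.

    The trapezoids overlap so that at least one label always has membership 1.
    """
    return {
        "low": float(np.interp(x, [0.3, 0.5], [1.0, 0.0])),
        "medium": float(np.interp(x, [0.1, 0.3, 0.7, 0.9], [0.0, 1.0, 1.0, 0.0])),
        "high": float(np.interp(x, [0.5, 0.7], [0.0, 1.0])),
    }

def consonant_mass(memberships):
    """Mass assignment on nested label sets built from ranked membership degrees."""
    ranked = sorted(memberships.items(), key=lambda kv: kv[1], reverse=True)
    names = [name for name, _ in ranked]
    degrees = [degree for _, degree in ranked] + [0.0]
    mass = {}
    if degrees[0] < 1.0:
        # Any shortfall from full membership goes to the empty label set.
        mass[frozenset()] = 1.0 - degrees[0]
    for i in range(len(names)):
        weight = degrees[i] - degrees[i + 1]
        if weight > 0.0:
            mass[frozenset(names[: i + 1])] = weight
    return mass

def dataset_mass(data):
    """Average the per-point mass assignments over a data set."""
    total = {}
    for x in data:
        for label_set, weight in consonant_mass(appropriateness(x)).items():
            total[label_set] = total.get(label_set, 0.0) + weight / len(data)
    return total

# Toy measurements (assumed), already scaled to [0, 1].
for label_set, weight in dataset_mass([0.0, 0.2, 0.45, 0.8]).items():
    print(sorted(label_set), round(weight, 3))
```

A point at 0.2, for instance, is fully "low" and half "medium", so half of its mass goes to {low} and half to {low, medium}; the mass on each label set always sums to one.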

A common problem in uncertainty analysis is addressing the dependency between model parameters in the absence of joint measurements that enable joint probability distributions to be constructed. In these situations all that may be available are vague expert judgements about the relationship between parameters. For example in environmental modelling experts may be able to specify which combinations of parameters are physically realisable. This information can be encoded as fuzzy rules and used to constrain joint probability distributions. A local assumption of independence within a fuzzy set on the joint space of parameters is a much weaker assumption than global independence. Once again this approach results in imprecise model predictions, which are a better reflection of the state of knowledge or ignorance about the problem domain and avoid inappropriate independence assumptions. This approach has been applied at Bristol in combined slope stability and hydrology analysis, in which expert knowledge has been used to classify the parameter space into domains of physically realisable parameter combinations and local independence assumptions have been made within these domains.
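One simple way to see how a vague realisability judgement can induce dependence between parameters is to start from independent marginals and weight each sampled combination by its membership in a fuzzy "physically realisable" set. This is only a sketch of that idea: the rule in `realisable_membership`, the uniform marginals and the weighting scheme are assumptions for illustration, and unlike the approach described above it yields a single precise distribution rather than imprecise predictions.

```python
import numpy as np

def realisable_membership(a, b):
    """Hypothetical fuzzy rule: combinations with a + b <= 1 are fully realisable,
    those with a + b >= 1.5 are not, with a linear transition in between."""
    return np.interp(a + b, [1.0, 1.5], [1.0, 0.0])

def weighted_corr(a, b, weights):
    """Correlation of a and b under normalised sample weights."""
    w = weights / weights.sum()
    mean_a, mean_b = np.sum(w * a), np.sum(w * b)
    cov = np.sum(w * (a - mean_a) * (b - mean_b))
    var_a = np.sum(w * (a - mean_a) ** 2)
    var_b = np.sum(w * (b - mean_b) ** 2)
    return cov / np.sqrt(var_a * var_b)

rng = np.random.default_rng(0)
n_samples = 20000

# Independent marginals: all that is known in the absence of joint measurements.
a = rng.random(n_samples)
b = rng.random(n_samples)

# Weight each combination by how realisable the expert rule says it is.
weights = realisable_membership(a, b)

corr_prior = weighted_corr(a, b, np.ones(n_samples))
corr_constrained = weighted_corr(a, b, weights)
print(f"correlation before constraint: {corr_prior:.3f}")
print(f"correlation after constraint:  {corr_constrained:.3f}")
```

The samples are independent before weighting, but down-weighting the combinations where both parameters are large produces a negative correlation, which is the kind of dependence a global independence assumption would miss.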

