Stat Model Predicts Flat Temperatures Through 2050

While climate skeptics have gleefully pointed to the past decade's lack of temperature rise as proof that global warming is not happening as predicted, climate change activists have claimed that this is just “cherry picking” the data. They point to their complex and error-prone general circulation models which, after significant refactoring, are now predicting a stretch of stable temperatures followed by a resurgent global warming onslaught. In a recent paper, a new type of model, based on a test for structural breaks in surface temperature time series, is used to investigate two common claims about global warming. This statistical model predicts no temperature rise until 2050, but the more interesting prediction is what happens between 2050 and 2100.

David R.B. Stockwell and Anthony Cox, in a paper submitted to the International Journal of Forecasting entitled “Structural break models of climatic regime-shifts: claims and forecasts,” have applied advanced statistical analysis to both Australian temperature and rainfall trends and global temperature records from the Hadley Center's HadCRU3GL dataset. The technique they used is called the Chow test, invented by economist Gregory Chow in the early 1960s. The Chow test is a statistical test of whether the coefficients in two linear regressions on different data sets are equal. In econometrics, the Chow test is commonly used in time series analysis to test for the presence of a structural break.
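For readers who want the mechanics, the standard textbook form of the Chow statistic for a single candidate break, with k regression parameters per segment (here k = 2, an intercept and a trend), is as follows; this is general background, not a formula taken from the paper itself:

\[
F = \frac{\bigl(\mathrm{RSS}_p - (\mathrm{RSS}_1 + \mathrm{RSS}_2)\bigr)/k}{(\mathrm{RSS}_1 + \mathrm{RSS}_2)/(n_1 + n_2 - 2k)}
\]

where RSS_p is the residual sum of squares of a single regression over the whole series, RSS_1 and RSS_2 are the residual sums of squares of separate regressions before and after the candidate break, and n_1 and n_2 are the corresponding sample sizes. Under the null hypothesis of no break, F follows an F(k, n_1 + n_2 − 2k) distribution.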

A structural break appears when an unexpected shift in a time series occurs. Such sudden jumps in a series of measurements can lead to huge forecasting errors and unreliability of a model in general. Stockwell and Cox are the first researchers I know of to apply this econometric technique to temperature and rainfall data (a description of computing the Chow test statistic is available here). They explain their approach in the paper's abstract:

A Chow test for structural breaks in the surface temperature series is used to investigate two common claims about global warming. Quirk (2009) proposed that the increase in Australian temperature from 1910 to the present was largely confined to a regime-shift in the Pacific Decadal Oscillation (PDO) between 1976 and 1979. The test finds a step change in both Australian and global temperature trends in 1978 (HadCRU3GL), and in Australian rainfall in 1982 with flat temperatures before and after. Easterling & Wehner (2009) claimed that singling out the apparent flatness in global temperature since 1997 is ‘cherry picking’ to reinforce an arbitrary point of view. On the contrary, we find evidence for a significant change in the temperature series around 1997, corroborated with evidence of a coincident oceanographic regime-shift. We use the trends between these significant change points to generate a forecast of future global temperature under specific assumptions.

The climatic effects of fluctuations in oceanic regimes are most often studied using singular spectrum analysis (SSA) or variations on principal components analysis (PCA), that is, by decomposing rainfall and temperature into periodic components. Such approaches can capture short-period phenomena like the effects of El Niño, and the potential impact of longer-term phenomena such as the Pacific Decadal Oscillation (PDO) on variations in global temperature. These phenomena take place over periods of years or decades. For finding and testing less frequent regime-shifts, different techniques are called for. According to the authors: “An F-statistic known as the Chow test (Chow, 1960) based on the reduction in the residual sum of squares through adoption of a structural break, relative to an unbroken simple linear regression, is a straightforward approach to modeling regime-shifts with structural breaks.” All the statistical details aside, the point here is that a sequence of data that contains sudden shifts or jumps is hard to model accurately using standard methods.
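To make that concrete, here is a minimal sketch of the single-break version of the test in Python (using numpy and scipy). It illustrates the general technique only; it is not the authors' code, and the function and variable names are mine:

```python
import numpy as np
from scipy import stats

def rss(y, x):
    """Residual sum of squares from an ordinary least squares fit of y on [1, x]."""
    X = np.column_stack([np.ones_like(x, dtype=float), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def chow_test(y, x, b, k=2):
    """Chow F statistic and p-value for a single candidate break after index b."""
    rss_pooled = rss(y, x)                              # one unbroken linear trend
    rss_split = rss(y[:b], x[:b]) + rss(y[b:], x[b:])   # separate trends before/after the break
    n = len(y)
    f = ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))
    p_value = stats.f.sf(f, k, n - 2 * k)               # upper tail of the F distribution
    return f, p_value
```

Applied to an annual temperature series, `chow_test(temps, years, b)` with `b` set to the index of a candidate year (1978, say) returns the F statistic for that break and its significance.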


Sea surface temperature simulation from Oak Ridge National Laboratory. Source DOE.

The paper investigates two claims made in the climate literature: first, a proposed regime-shift model of Australian temperature with a slightly increasing trend to 1976, rapidly increasing to 1979 (the shift), and slowly increasing since then; and second, a claim of lack of statistical significance regarding the declining temperature since the El Niño event in 1998. Regarding the first, the authors state: “The increase in Australian temperature of around 0.9°C from the start of the readily available records in 1910 is conventionally modeled as a linear trend and, despite the absence of clear evidence, often attributed to increasing concentrations of greenhouse gases (GHGs).” The main reason to apply econometric techniques to climate time series data is that simple linear forecasting can fail if the underlying data exhibit sudden jumps. “That is, while a forecast based on a linear model would indicate steadily changing global temperatures, forecasts based on shifts would reflect the moves to relatively static mean values,” the study states. The choice of underlying model may also impact estimates of the magnitude of climate change, which is one of the major points put forth by this work.

As for the “cherry picking” assertion, the authors claim that the flat global temperatures since 1998 are not an anomaly but are representative of the actual climate trend. That climate trend exhibits two distinct breakpoints, one in 1978 and another in 1997. The proposed new climate model is what is known as a change point model. Such models are characterized by abrupt changes in the mean value of the underlying dynamical system, rather than a smoothly increasing or decreasing trend. Confidence in the 1978 breakpoint is strengthened by the results for global temperatures since 1910. These data indicate the series can be described as gradually increasing to 1978 (0.05 ± 0.015°C per decade), with a steeper trend thereafter (0.15 ± 0.04°C per decade).

Breakpoints in sea surface monthly temperature anomalies (HadCRU3GL): (a) 1910 to present and (b) 1976 to present. Figure 2 from Stockwell and Cox.

The Chow test since 1978 finds another significant breakpoint in 1997, when an increasing trend up to 1997 (0.13 ± 0.02°C per decade) changes to a practically flat trend thereafter (−0.02 ± 0.05°C per decade). Contrary to claims that the 10-year trend since 1998 is arbitrary, structural change methods indicate that 1997 was a statistically defensible beginning of a new, and apparently stable, climate regime. Again, according to the authors: “The significance of the dates around 1978 and 1997 to climatic regime-shifts is not in dispute, as they are associated with a range of oceanic, atmospheric and climatic events, whereby thermocline depth anomalies associated with PDO phase shift and ENSO were transmitted globally via ocean currents, winds, Rossby and Kelvin waves.”

Perhaps most interesting is the application of this analysis to the prediction of future climate change, something GCM climate modelers have been attempting for the past 30 years with little success. Figure 3 from the paper illustrates the prediction for temperatures to 2100 that follows from the structural break model under the assumptions of continuous underlying warming, a regime-shift from 1978 to 1997, and no additional major regime-shift. The projection formed by the presumed global warming trend to 1978 and the trend in the current regime predicts constant temperatures for roughly fifty years, to around 2050. This is similar to the period of flat temperatures from 1930 to 1980.
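A rough back-of-the-envelope check (my arithmetic, not the paper's) shows why the crossover lands near 2050. The 1978–1997 regime warmed at roughly 0.13°C per decade for about two decades, leaving the flat post-1997 level about 0.26°C above the presumed underlying warming line, which rises at only about 0.05°C per decade:

\[
\frac{0.13\,^{\circ}\mathrm{C/decade} \times 2\ \text{decades}}{0.05\,^{\circ}\mathrm{C/decade}} \approx 5\ \text{decades}, \qquad 1997 + 50 \approx 2050 .
\]

After the crossover, another five decades at about 0.05°C per decade gives roughly 0.2–0.25°C of additional warming by 2100, consistent with the figure caption below.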

Prediction of global temperature to 2100, by projecting the trends of segments delineated by significant regime-shifts. The flat trend in the temperature of the current climate-regime (cyan) breaks upwards around 2050 on meeting the (presumed) underlying AGW warming (green), and increases slightly to about 0.2°C above present levels by 2100. The 95% CI for the trend uncertainty is dashed. Figure 3 from Stockwell and Cox.

What is even more encouraging is that, even though temperatures resume their upward climb after 2050, the predicted increase for the rest of the century is only about 0.2°C above present levels. That is around a tenth of the increase generally bandied about by the IPCC and its minions, who sometimes predict as much as a 6°C rise by 2100. It must be kept in mind that this extrapolation is based on a number of simplifying assumptions and does not include many of the complexities and natural forcing factors that are built into GCM codes. Can a relatively simple statistical model be more accurate than the climate modelers' coupled GCMs, which have been under continuous development for decades?

Mathematical models based on statistics are often the only way to successfully deal with non-linear, often chaotic systems. Scientists often find that physical reality at its most detailed level can defy their computational tools. Consider fluid flow, which can be either laminar or turbulent. Viscous fluid flow is described by the Navier-Stokes equations. For cases of inviscid (non-viscous) flow, the Bernoulli equation can be used to describe the flow. The Navier-Stokes equations are partial differential equations, while the Bernoulli equation is a simpler algebraic relationship that can be derived from the former by way of the Euler equation.


The Navier-Stokes equations.
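For reference, the incompressible, constant-viscosity form of the momentum and continuity equations referred to above is:

\[
\rho\left(\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u}\right) = -\nabla p + \mu\,\nabla^{2}\mathbf{u} + \mathbf{f}, \qquad \nabla\cdot\mathbf{u} = 0 ,
\]

while the Bernoulli relation for steady, inviscid, incompressible flow along a streamline reduces to \(p + \tfrac{1}{2}\rho v^{2} + \rho g h = \text{constant}\).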

In effect, both are ways of dealing with massive numbers of individual molecules in a flowing fluid collectively instead of individually. At the finest physical level, fluid flow is a bunch of molecules interacting with each other, but trying to model physical reality at the level of atomic interaction would be computationally prohibitive. Instead, they are dealt with en masse, using equations that are essentially statistical approximations of how the uncountable number of molecules in a flowing fluid behave. Often such mathematical approximations are accurate enough to be useful as scientific and engineering tools.


Fluid flow around a train using the Navier-Stokes equations. Source: Body & Soul.

Indeed, many of these types of equations find their way into GCMs to model parts of the system climate scientists are trying to simulate. Instead of simply looking at the statistical behavior of Earth's climate, GCMs try to model all the bits and pieces that comprise the Earth's climate system. Unfortunately, not all of the pieces of the Earth system are well understood, and many factors cannot be modeled at the coarse physical scales forced on the modelers by the lack of computational capacity. As I have discussed on this blog before, simply changing the structural components of a model, leaving all of the scientific assumptions and factors intact, can radically change the answers a model cranks out (see “Extinction, Climate Change & Modeling Mayhem”). Beyond that, there are the matters of inherent data inaccuracy and error propagation, as presented in The Resilient Earth, chapters 13 and 14.

If the new model's prediction is true, global temperatures in 2100 will not even approach the tripwire-for-Armageddon 2°C level set by the IPCC as humanity's point of no return. Can a statistical model be better at predicting future temperatures than complex yet incomplete GCMs? Given the lack of theoretical understanding, the paucity of good historical data, and the overwhelming simplifications that have to be made to get climate models to run on today's supercomputers, I would have to say that the statistical model comes off pretty well. Give me a well-known statistical technique over a fatally flawed climate model any day.

Be safe, enjoy the interglacial and stay skeptical.

Chow Method

Statistics is a way of analyzing data, not a modelling technique per se. Statistical models do not incorporate information about state transitions. For example, take a basic chemical engineering process involving heat transfer and reaction in a well-mixed tank which, during processing and as the result of controlled interventions, steps between various steady states.

The statistical technique you describe could tell you when the changes occurred, which is useful, but not when the next one will occur nor the reason for the change. A learning system would fare no better, requiring a complete set of feasible transitions to be available before it could function correctly. Yet such processes are extremely well understood, and the system model may be characterized using a lumped-parameter approach with first- and second-order non-linear, non-stiff ODEs with constant coefficients, and simulated numerically by a netbook running an Excel spreadsheet.
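[Editor's aside: a minimal sketch, in Python rather than a spreadsheet, of the kind of lumped-parameter, step-between-steady-states model described above; the tank, time constant and set points are purely hypothetical and chosen only for illustration.]

```python
import numpy as np

tau = 5.0                      # tank time constant, minutes (assumed)
dt, t_end = 0.01, 60.0         # explicit Euler step and simulation length, minutes
t = np.arange(0.0, t_end, dt)

# Controlled intervention: the inlet temperature is stepped between set points.
T_in = np.where(t < 20, 300.0, np.where(t < 40, 320.0, 310.0))   # kelvin

T = np.empty_like(t)
T[0] = 300.0
for i in range(1, len(t)):
    # First-order energy balance for a well-mixed tank: dT/dt = (T_in - T) / tau
    T[i] = T[i - 1] + dt * (T_in[i - 1] - T[i - 1]) / tau
```

The tank relaxes to a new steady state after each step, which is exactly the kind of transition a change-point test would flag after the fact but could not anticipate.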

Let us apply the same reasoning to a system model consisting of non-linear partial and ordinary DEs with non-constant coefficients, some of which are stiff (i.e. prone to fall apart under numerical analysis), and which is not in equilibrium (i.e. not only does it not move to a steady state, but worse, the nature of the equations alters over time, not just their parameters and inputs). Let us also assume that anthropogenic alterations are significant in such a system, and moreover so are geological events with geophysical consequences such as volcanoes, significant space-borne strikes, alterations in solar activity and in the solar-spatial orbit, plus other unknown disturbances. The inadequacy of statistical analysis as a predictor of the system follows.

Interesting points

You raise some interesting points; let me see if I can answer them at least in part. First, though, GCMs based on systems of differential equations are certainly a type of model that can be used to predict the behavior of physical systems. My point was that the equations in question are actually just a way to treat the interaction of the constituent molecules in the system in a collective manner. In other words, they are a representation of the statistical behavior of the particles in the atmosphere and ocean as they interact while meeting various boundary conditions.

While the Chow test is not a model per se, there are models involved—specifically, linear regression models. As described in the paper:

The approach to developing a structural break model was to calculate the F statistic (Chow test statistic) for potential breaks at all potential change points using the R package strucchange (Zeileis et al., 2002). In this test, the error sum of squares (ESS) of a linear regression model is compared with the residual sum of squares (RSS) from a model composed of linear sections before and after each potential change point.

This is not a model in the sense of modeling the physical world as coupled fundamental processes, but it is a model of the time series data, and it is fully capable of rendering future predictions based on the statistical trends detected. Given that most physical phenomena behave in ways that can be modeled by mathematics and described by statistical measures, I find no fundamental reason to discount the work described in this paper.
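As a concrete illustration of the procedure described in that passage, here is a minimal Python sketch that scans every admissible change point and reports the one giving the largest F statistic. The paper uses the R package strucchange, so this is an analogue of the idea, not the authors' code:

```python
import numpy as np

def best_break(years, temps, min_seg=10):
    """Return (break_year, F) for the candidate change point that most reduces
    the residual sum of squares relative to one unbroken linear trend."""
    def rss(y, x):
        X = np.column_stack([np.ones_like(x, dtype=float), x])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        r = y - X @ beta
        return float(r @ r)

    n = len(temps)
    rss_full = rss(temps, years)            # error sum of squares, single trend
    best = (None, -np.inf)
    for b in range(min_seg, n - min_seg):   # enforce a minimum segment length
        rss_split = rss(temps[:b], years[:b]) + rss(temps[b:], years[b:])
        f = ((rss_full - rss_split) / 2) / (rss_split / (n - 4))
        if f > best[1]:
            best = (years[b], f)
    return best
```

One caveat: the maximum of the F statistics over all candidate dates has a different null distribution than a single pre-specified Chow test, which is one reason tools such as strucchange supply the appropriate critical values.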

Second, while it is true that the statistical model doesn't allow the prediction of meteor strikes or volcanic eruptions, neither do GCMs. In current climate models an estimated average impact for such perturbations is applied over the entire span of time being simulated. In other words, they are modeled statistically.

It should also be noted that the paper's statistical model would effectively account for such disturbances in a comparable way, since it is based on historical time series data. Unless there were an inordinately large meteor strike or a widespread outbreak of vulcanism, the historical contributions of these types of events would be automatically incorporated into the model. Solar activity changes that are significantly atypical or reflect a longer-term trend would not be detected by GCMs, though the Chow test might detect the latter.

Again, I'm not saying that this statistical model is going to prove accurate; the error bounds are very wide by 2100. What I am saying is that the results are worth pondering and are at least as trustworthy as the dreck produced by the IPCC models.

Lag in Forcing

Apologies for one additional point, but the majority of models appear to show a fairly direct correlation between CO2 and temperature without any apparent time lag, in direct contrast to paleoclimatic records, which show temperature forcing CO2 with a time lag of 800 years or so (not sure what the confidence level is).

Quite apart from possibly confusing cause and effect, if we assume a forcing mechanism, there should be a process lag to account for time-distance; i.e., even if you spew a lot of CO2 into the atmosphere, it should take time for the forcing to occur, as the CO2 needs time to absorb radiation and create warming, raising the level of water vapor in the atmosphere, which leads to a further time lag between water vapor levels increasing in the troposphere and an increased greenhouse effect (this is all assuming positive feedback).

It follows that if there is zero, or even an unrealistically small, time lag in a model of what is supposedly a dynamic process (dynamic here meaning that the process has to move between steady states, as opposed to non-dynamic, where the process instantly moves between states), then the claimed forcing does not exist, i.e. the correlation is just a coincidence.

Interesting Points

Teach me to dump a whole paper's worth in two paragraphs :D. You are correct to point out that linear regression models can be used to predict time series and make interpolations with respect to trends. The problem I was trying to underscore was that to render these interpolations accurate you have to incorporate state transitions, such as phase changes, into physical (lumped-parameter) models and into statistical models; it is easier in the former than in the latter because the phase change can be implicit in the coupled-systems representation.

What you are really up against is the computational difficulty of doing so in statistically based or expert systems. In a complex non-linear system which is not in equilibrium, this becomes infeasible given the current state of technology. In fact, your statement that "most physical phenomena behave in ways that can be modeled by mathematics and described by statistical measures" falls down at this point, simply because, although in theory it is true, in practice we do not have the requisite computational power to do so.

There are of course disadvantages to using coupled systems of differential equations as well, because process "noise" is lost, yet in a non-linear system it could be significant, as small changes, even as small as 1 sigma, can have a significant effect on final outcomes. A better approach might be to marry statistical and coupled-systems approaches, utilizing non-linear structural equation models to define causal effects. Of course, several directions of cause and effect may need to be explored, including all potential feedback loops, to determine the most accurate approach.

Computer capacity

A mathematician would say that the model comprises the equation or system of equations that describe the thing being modeled. Whether or not a usable computational model can be constructed is another matter entirely. This is, in fact, where the IPCC's models and other complex GCMs "fall down." Lack of computational capacity forces them to be run at coarse spatial scales, resulting in output that would be inaccurate or erroneous even if the models were theoretically complete (which they are not). For more on this problem see "Extinction, Climate Change & Modeling Mayhem."

Warming trend from Little Ice Age

This method appears to produce a result that steers toward and then conforms to the observed warming trend extending from the Little Ice Age. The engineer in me likes what I see. Use of the Hadley Center's data causes me to wonder about the veracity of the initial conditions, but I'd be more inclined to make a wager on the results of this methodology than on anything I've seen in the IPCC reports.

Seems to me that the up trend starts way too late...

The flat line lasts too long, and so does the up trend. The more likely outcome is flat temperatures for ~23 more years, up for 23 years, flat to falling for 32 years, and up for the last 22 years.

Based on what?

The results are a matter of statistics, not opinion. Trying to fit the curve more closely is why other methods fail.

Based on past.

Which is what they used as the basis: the 1930s-1940s increase in temperatures (about 20-25 years of warming), followed by the 1940s-1970s decrease (about 25-30 years of cooling), followed by the 1970s-1998 increase (about 20 years). So you would think that a model based on that would follow some similar pattern of roughly 30 years of flat-to-cool conditions followed by 20 years of warming.

As was stated in the article.

The warming and cooling periods you have discerned with your mark one eyeball are not necessarily meaningful in terms of the longer term trend. Just as there will be some hot years and some cold years, simple statistics would expect periods of successive downward and upward temperature shifts. What you see as trends are not indicative of the underlying long-term trend—in effect, they are noise. I'm not convinced that this method is any more trustworthy than GCM, but it would be hard to be worse.