The recent focus on George Mason University’s investigation into plagiarism allegations concerning the Wegman “hockey stick” report and related scholarship has led to some interesting reactions in the blogosphere. Apparently, this involves mere trifling attribution problems in one or two paragraphs, even though the allegations now touch on no fewer than 35 pages of the Wegman report, as well as the federally funded Said et al 2008. Not to mention that subsequent editing has led to numerous errors and even distortions.
But we are also told that none of this “matters”, because the allegations and incompetence do not directly touch on the analysis nor the findings of the Wegman report. So, given David Ritson’s timely intervention and his renewed complaints about Edward Wegman’s lack of transparency, perhaps it is time to re-examine Wegman report section 4, entitled “Reconstructions and Exploration Principal Component Methodologies”. For Ritson’s critique of the central Wegman analysis itself remains as pertinent today as four years ago, when he expressed his concerns directly to the authors less than three weeks after the release of the Wegman report.
Ritson pointed out a major error in Wegman et al’s exposition of the supposed tendency of “short-centred” principal component analysis to exclusively “pick out” hockey sticks from random pseudo-proxies. Wegman et al claimed that Steve McIntyre and Ross McKitrick had used a simple auto-regressive model to generate the random pseudo-proxies, which is the same procedure used by paleoclimatologists to benchmark reconstructions. But, in fact, McIntyre and McKitrick clearly used a very different – and highly questionable – noise model, based on a “persistent” auto-correlation function derived from the original set of proxies. As a result of this gross misunderstanding, to put it charitably, the Wegman report failed utterly to analyze the actual scientific and statistical issues. And to this day, no one – not Wegman, nor any of his defenders – has addressed or even mentioned this obvious and fatal flaw at the heart of the Wegman report.
Wegman et al begin the section with an incomplete (and somewhat misleading) account of the role of principal component analysis (PCA) in Mann et al’s reconstruction methodology.
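For readers who want the mechanics, the contested step can be sketched in a few lines. This is a minimal illustration of “short-centred” PCA, assuming white-noise stand-ins rather than real proxy data; the array shapes (581 years × 70 series) merely echo the AD1400 step of MBH98, and none of this is the actual MBH98 or McIntyre code:

```python
import numpy as np

rng = np.random.default_rng(0)
n_years, n_proxies = 581, 70   # AD 1400-1980 and 70 series: shapes only, not real data
X = rng.standard_normal((n_years, n_proxies))

# Conventional PCA: centre each series on its full-period mean.
X_full = X - X.mean(axis=0)

# "Short-centred" PCA: centre each series on the mean of a short
# calibration sub-period only (the last 79 rows stand in for 1902-1980).
X_short = X - X[-79:].mean(axis=0)

def pc1(M):
    """Leading principal component (PC1) via the singular value decomposition."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, 0] * s[0]

pc1_full = pc1(X_full)
pc1_short = pc1(X_short)
```

The point of contention is that centring on a short calibration sub-period gives extra weight to any series whose calibration-period mean happens to depart from its long-term mean.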
But for now, I’ll skip ahead to the discussion of McIntyre’s results, based on the analysis in the 2005 GRL paper, “Hockey sticks, principal components, and spurious significance”.
While at first the McIntyre code was specific to the file structure of his computer, with his assistance we were able to run the code on our own machines and reproduce and extend some of his results.
The four figures which follow apparently demonstrate Wegman et al’s ability to “reproduce” McIntyre’s “results”. (After that, Wegman goes on to “extend” the results by incorporating the infamous “cartoon” from the first IPCC report, a passage with its own problems that I’ll leave for another time.)
Describing the first figure, Wegman et al say:
In Figure 4.1, the top panel displays PC1 simulated using the MBH98 methodology from stationary trendless red noise. The bottom panel displays the MBH98 Northern Hemisphere temperature index reconstruction.
Indeed, this appears to be an exact replica of McIntyre and McKitrick’s Fig. 1, down to the use of the same “sample” PC1! Presumably, then, the “reproduction” is a rerun of the very same code, save for the file paths, rather than an independent replication. Even the same random-number seed must have been used, so that the same “sample” PC1 (out of 10,000 pseudo-proxy series) could be identified and re-used. (If not, the exact replication of this one “sample” would indeed be hard to explain.)
Here is M&M 2005 Fig. 1:
Figure 1. Simulated and MBH98 Hockey Stick Shaped Series. Top: Sample PC1 from Monte Carlo simulation using the procedure described in text applying MBH98 data transformation to persistent trendless red noise; Bottom: MBH98 Northern Hemisphere temperature index re-construction.
And here is Wegman et al Fig. 4.1 (scaled here to match the aspect ratio of M&M Fig. 1).
The purpose of this low-level mechanical “recomputation” is unclear, but it certainly does not demonstrate insight into, or understanding of, McIntyre’s methodology. Note also the reference to “stationary trendless red noise”, which Wegman et al previously define in section 2.2 as AR(1) (a first-order auto-regressive process).
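For concreteness, “stationary trendless red noise” in that AR(1) sense is straightforward to simulate. Here is a minimal sketch; the parameter 0.2 is the value Wegman et al cite, the series length matches the 1400–1980 period, and nothing here is the original code:

```python
import numpy as np

def ar1(n, phi=0.2, seed=0):
    """Simulate a stationary AR(1) series: x[t] = phi * x[t-1] + e[t]."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n)
    x = np.empty(n)
    # Draw x[0] from the stationary distribution so the series has no startup trend.
    x[0] = e[0] / np.sqrt(1 - phi**2)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + e[t]
    return x

x = ar1(581)  # one pseudo-proxy spanning AD 1400-1980
```

With a parameter of 0.2, the autocorrelation decays geometrically as 0.2 to the power of the lag, and is negligible beyond a handful of years.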
Now I’ll skip ahead to Figure 4.4, the first one that is not a direct reproduction of a corresponding figure in M&M 2005.
Figure 4.4: One of the most compelling illustrations that McIntyre and McKitrick have produced is created by feeding red noise [AR(1) with parameter = 0.2] into the MBH algorithm. The AR(1) process is a stationary process meaning that it should not exhibit any long-term trend. The MBH98 algorithm found ‘hockey stick’ trend in each of the independent replications.
Presumably, we have here 12 more PC1 samples from the same 10,000 pseudo-proxy series recomputed using McIntyre’s code. But, for the first time, Wegman et al specifically refer to these as generated via a conventional AR1 red noise model. The discussion is interesting:
Discussion: Because the red noise time series have a correlation of 0.2, some of these time series will turn upwards [or downwards] during the ‘calibration’ period and the MBH98 methodology will selectively emphasize these upturning [or downturning] time series.
Here (also for the first time), Wegman et al note that “downwards” hockey sticks are also generated; indeed, fully half of the PC1s turn down rather than up. Yet, somehow, only upward-turning PC1s are shown (no fewer than 12)!
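This is no accident of sampling: a principal component is only defined up to sign, so for symmetric noise an “upturning” and a “downturning” PC1 are equally likely and carry exactly the same information. A toy demonstration with arbitrary data:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 10))
X = X - X.mean(axis=0)

U, s, Vt = np.linalg.svd(X, full_matrices=False)
pc1 = U[:, 0] * s[0]

# Flipping the sign of both the PC and its loading vector leaves the
# rank-1 reconstruction unchanged: the orientation carries no information.
recon = np.outer(pc1, Vt[0])
recon_flipped = np.outer(-pc1, -Vt[0])
assert np.allclose(recon, recon_flipped)
```

So selecting only the upward-turning samples for display, as Figure 4.4 does, is a presentational choice, not a property of the data.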
The relevant point here, though, is that the claim that AR(1) “red noise” proxies generate pronounced spurious PC1 “hockey sticks” [whether turning up or down] almost 100% of the time is completely wrong, especially given the low AR parameter of 0.2. How could Wegman et al make such a huge mistake?
Part of the problem may have been McIntyre and McKitrick’s confusing nomenclature; they refer to “persistent trendless red noise”, which might lead a careless reader to assume that they meant conventional AR1 “red noise”. McIntyre also refers to the MBH98 procedure (somewhat redundantly) as “AR1 red noise”, and perhaps Wegman et al confused the two models.
MBH98 attempted to benchmark the significance level for the RE statistic using Monte Carlo simulations based on AR1 red noise with a lag coefficient of 0.2, yielding a 99% significance level of 0.0.
McIntyre’s unconventional use of the term “red noise” certainly contributed to the confusion; this may have even been done to imply a greater correspondence between the two pseudo-proxy types than actually exists. On the other hand, M&M do refer several times to “persistent” red noise, including right in the abstract. And the described procedure does not resemble an AR1 model, although it is confusingly said to generate (unqualified) “trendless red noise”:
We downloaded and collated the NOAMER tree ring site chronologies used by MBH98 from M. Mann’s FTP site and selected the 70 sites used in the AD1400 step. We calculated autocorrelation functions for all 70 series for the 1400–1980 period. For each simulation, we applied the algorithm hosking.sim … which applied a method due to Hosking  to simulate trendless red noise based on the complete auto-correlation function.
The usual term for this more complex noise model, which generates series exhibiting very long-term dependencies, is fractional ARIMA (autoregressive fractionally integrated moving average), also referred to as ARFIMA, first described in J. R. Hosking’s 1981 Biometrika paper, “Fractional Differencing”. (ARIMA is itself an extension of the familiar ARMA models.) It would have been helpful to unsophisticated readers if M&M had actually used the recognized term at some point.
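The practical difference between the two noise models is easiest to see in their theoretical autocorrelation functions. The sketch below compares ARFIMA(0, d, 0) against AR(1) with parameter 0.2; the choice d = 0.4 is my own illustrative assumption, not a value from M&M, and the closed-form ACF is the standard one from Hosking (1981):

```python
from math import lgamma, exp

def rho_arfima(k, d=0.4):
    """Theoretical ACF of ARFIMA(0, d, 0):
    rho(k) = Gamma(1-d) * Gamma(k+d) / (Gamma(d) * Gamma(k+1-d)),
    computed via log-gamma to avoid overflow at large lags."""
    return exp(lgamma(1 - d) + lgamma(k + d) - lgamma(d) - lgamma(k + 1 - d))

def rho_ar1(k, phi=0.2):
    """Theoretical ACF of AR(1): rho(k) = phi**k, geometric decay."""
    return phi ** k

# Long memory vs short memory: the ARFIMA ACF decays hyperbolically
# (roughly as k**(2d-1)), so noticeable correlation survives even at
# lag 350, while the AR(1) ACF vanishes after a handful of lags.
for k in (1, 10, 100, 350):
    print(k, round(rho_arfima(k), 3), rho_ar1(k))
```

At lag 350 the ARFIMA correlation (with this d) is still around 0.2, whereas 0.2 raised to the 350th power is zero for all practical purposes; that is precisely the “persistence out to lag-times of 350 years” that Ritson flagged. (The actual M&M procedure, via hosking.sim, simulated from the empirical ACF of the proxies rather than from a fitted d, but the contrast in persistence is the same.)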
In his follow-up email of July 30, 2006, David Ritson elaborated on the actual procedure used by McIntyre, together with its full implications:
To facilitate a reply I attach the Auto-Correlation Function used by the M&M to generate their persistent red noise simulations for their figures shown by you in your Section 4 (this was kindly provided me by M&M on Nov 6 2004 ). The black values are the ones actually used by M&M. They derive directly from the seventy North American tree proxies, assuming the proxy values to be TREND-LESS noise.
Surely you realized that the proxies combine the signal components on which is superimposed the noise? I find it hard to believe that you would take data with obvious trends, would then directly evaluate ACFs without removing the trends, and then finally assume you had obtained results for the proxy specific noise! You will notice that the M&M inputs purport to show strong persistence out to lag-times of 350 years or beyond. Your report makes no mention of this quite improper M&M procedure used to obtain their ACFs. Neither do you provide any specification data for your own results that you contend confirm the M&M results. Relative to your Figure 4.4 you state “One of the most compelling illustrations that M&M have produced is created by feeding red noise (AR(1) with parameter = .2 into the MBH algorithm”.
In fact they used and needed the extraordinarily high persistances contained in the attatched figure to obtain their `compelling’ results. Obviously the information requested below is essential for replication and evaluation of your committee’s results. I trust you will provide it in timely fashion.
Ritson appears to suggest that Wegman et al claim to have independently generated the results in figure 4.4 using AR(1) (0.2) pseudo-proxies. But the truth is likely more mundane, if no less disquieting.
Wegman et al appear to have simply retrieved additional sample PC1s from their mechanical recomputation of M&M 2005, without a glimmer of understanding of the underlying statistical model actually used to generate the pseudo-proxies. In so doing, they missed the fundamental issue concerning McIntyre’s claim of demonstrated extreme bias via so-called “persistent red noise” proxies. After all, there’s a huge difference between pseudo-proxies with significant auto-correlation at lags of up to 350 years and an AR(1)-derived pseudo-proxy set!
The controversy concerning McIntyre’s pseudo-proxies continues to this day, as I discussed in my piece on McShane and Wyner. Yet, in all this time, Wegman et al have never provided the promised supporting material, or even acknowledged David Ritson’s questions about their flawed analysis.
It’s hard to imagine a more egregious and fatal analytical flaw than Wegman et al’s utter misunderstanding of the very procedure said to demonstrate the extreme bias of Mann et al’s PCA methodology. Wegman’s mischaracterisation of McIntyre’s pseudo-proxies as conventional “red noise”, along with the accompanying failure to actually analyze McIntyre’s methodology, surely ranks as one of the epic gaffes in statistical analysis. Indeed, no one can reasonably continue to claim that Wegman’s central analysis and findings hold up.
But I’m sure we’ll be told very soon that this, too, doesn’t matter.
[Note: Slight additions for clarity are shown in square brackets.]