Berkeley Earth, part 1: Divergences and discrepancies

[UPDATE 08/17: In comments, Berkeley Earth team member Zeke Hausfather reveals that most of the discrepancy between the Berkeley Earth 2011 and 2012 results is due to a previously unreported error in latitudinal weighting in the earlier version.

UPDATE 08/20: The 2012 GHCN-only series has been uploaded by Zeke Hausfather. Also, I have added clarifications concerning absolute temperature uncertainty and data availability. The summary has been updated accordingly.]

The recent Berkeley Earth land-surface average temperature series is based on a greatly expanded database of station temperature data, along with a completely automated statistical averaging process. In contrast, established average temperature series from NOAA, NASA and HadCrut are based primarily on the smaller Global Historical Climatology Network (GHCN) database, and use empirically derived homogenization methods to remove known biases, albeit supplemented by pure statistical methods.

Here, the post-1950 Berkeley Earth “complete” land series is compared to the preliminary Berkeley series released in 2011, as well as to GHCN-only simulated series, based on overall attributes of those unreleased series provided in the Berkeley Earth companion “methods” paper. The 2011 and 2012 “full” (ALL) Berkeley series both fall squarely in the range of the latest comparable series from the three other groups post-1950. However, the two Berkeley ALL series diverge over the 1980-2010 period, and lie completely outside each other’s 95% confidence intervals in the 2000s, when baselined to 1950-1979. This turns out to be due to a significant error in latitudinal weighting in the 2011 ALL series; the error was not publicly disclosed at the time of correction. The GHCN 2012 series falls halfway between the 2012 ALL and 2011 ALL series in the 2000s; 2012 GHCN and 2012 ALL each appear to diverge outside the other’s confidence interval in the 2000s. As well, the gap between the 2012 GHCN and ALL series widens the further one goes back before the 1950-1979 baseline period, with the ALL series about 0.3 C cooler in the early 1800s.

Other issues requiring further analysis are also identified, particularly a reported reversal in the long-term trend of narrowing diurnal temperature range starting in 1987, which contradicts previous GHCN-based analyses. Taken together, these issues cast doubt on the robustness of the present Berkeley Earth analysis, and point up the need for more open data access and improved diagnostics in order to further assess the reliability of the Berkeley Earth approach to surface temperature analysis.

1. INTRODUCTION

The Berkeley Earth Surface Temperature project has created a stir out of all proportion to its limited scientific import. In large part, this great interest stems from the very explicit conception of the Berkeley Earth project as a counter to the alleged scientific malfeasance of existing temperature group leadership, specifically at NASA-GISS and the U.K. Climate Research Unit at the University of East Anglia. (By the way, I’ll be eschewing the obvious – albeit oxymoronic – acronym for the Berkeley project).

I suppose the belated “conversion” of Richard Muller to full acceptance of long-established scientific findings concerning anthropogenic global warming is positive on balance. Nevertheless, Muller and other Berkeley team members have a history of parroting baseless accusations against mainstream climate scientists, while carefully avoiding criticism of clearly misleading and false statements from contrarians. That disturbing pattern continues, as seen in Muller’s recent doubling-down on accusations related to “climategate”. This problem has even affected Berkeley Earth discussions of the very recent temperature record, an issue I’ll explore another time.

For now though, I’ll discuss the Berkeley Earth findings, drawing on the results summary and data, as well as the team’s supporting (as yet unpublished) papers. Of primary interest are the newly released main “results” paper, A New Estimate of the Average Earth Surface Land Temperature Spanning 1753 to 2011, and the recently revised “methods” paper, Berkeley Earth Temperature Averaging Process, along with the earlier 2011 version.

So far, Berkeley Earth has confined its analysis to land-surface temperature only (the other groups produce global series incorporating both land and ocean). This fact alone may well inhibit citation of the Berkeley series, even if peer-reviewed publication can be achieved. In particular, the IPCC does not incorporate separate land-only analyses, so cited temperature series must be truly global.

The Berkeley approach to land temperature analysis differs from that of the three traditional groups (NOAA-NCDC, NASA-GISS and HadCRU), in two important respects:

  • Berkeley uses a greatly expanded database of station temperature histories drawn from a variety of sources (most held by the NOAA), whereas the others primarily base their analyses on the Global Historical Climatology Network (GHCN), managed by NOAA. The Berkeley database contains more than 38,000 station histories (albeit some very short), as opposed to about 7200 in the GHCN.
  • Berkeley uses a pure statistical approach to processing station histories, breaking the series into sub-series based on discontinuities and then weighting based on internal consistency and correlation with neighbouring series. The other groups use empirically derived homogenization methods to remove known biases (or previously homogenized data in the case of HadCrut), albeit supplemented by purely statistical methods (see for example NOAA’s USHCN/GHCN methodology).

In October 2011, Berkeley Earth released a preliminary full land temperature series, based on the complete station database. At the end of July, an updated land temperature series was presented, along with the new “results” paper. I refer to these two series as Berkeley ALL 2011 and Berkeley ALL 2012, respectively.

The Berkeley “methods” paper presents another temperature series, based on the GHCN data set. Here the Berkeley methodology has been applied to the “raw” GHCN station histories without regard to any subsequent GHCN homogenization, as a test and demonstration. This series provides a useful comparison to the Berkeley ALL series, on the one hand, and to the other groups’ analyses on the other. In fact, the NOAA temperature analysis is based on this exact set of stations, so any differences with Berkeley GHCN are strictly down to differences in methodology.

Unfortunately, at the time of writing Berkeley Earth had not seen fit to release the GHCN-based temperature series data. However, the attributes of the series post-1950 are sufficiently well described in the two versions of the “methods” paper to enable the construction of very approximate simulated GHCN series, in order to compare the broad trends visually. I refer to these two series as Berkeley GHCN-Sim 2011 and Berkeley GHCN-Sim 2012 respectively.

2. BERKELEY ALL 2011 and 2012, 1950 to PRESENT

The Berkeley ALL 2012 “complete” data set is available at the main Berkeley Earth summary page. It contains monthly data points from 1755 to 2011, along with one-year, five-year, ten-year and twenty-year moving averages and associated 95% uncertainty. The earlier preliminary Berkeley ALL 2011 data set is no longer available from Berkeley Earth, but apparently can be downloaded from third-party websites (e.g. at FindThatFile.com). That series goes up to early 2010.

The Berkeley summary page compares the Berkeley 2012 ALL series with three land temperature series: NOAA-NCDC, NASA-GISS and CRUTEM4 (also available at the HadCrut data page).

The close correspondence of all four series from 1950 on is obvious, but it is somewhat difficult to make out when seen alongside the more variable 19th century. So let’s drill down to the centred ten-year moving averages for the post-1950 period and add the Berkeley ALL 2011 series for good measure. (This and all subsequent charts use a 1950-1979 baseline).

Land surface temperature comparison 1950-2011 (10-year moving average)
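For readers who want to reproduce this kind of smoothing, here is a minimal sketch of a simple moving average over monthly values. The function name `moving_average` and the synthetic input are my own illustrative assumptions; this is not Berkeley Earth code, and the way each window is aligned to a "centre" date is a convention that must be chosen separately.

```python
import numpy as np

def moving_average(values, window):
    """Simple moving average; output[i] is the mean of values[i:i+window]."""
    values = np.asarray(values, dtype=float)
    kernel = np.ones(window) / window
    # mode="valid" keeps only windows that lie fully inside the data
    return np.convolve(values, kernel, mode="valid")

# Synthetic monthly values (not real data): 20 years of a steady ramp
monthly = np.arange(240, dtype=float)
smooth = moving_average(monthly, 120)  # 120 months = 10 years
print(len(smooth), smooth[0])
```

Each output value would then be plotted at the midpoint of its 120-month window to give a centred ten-year average.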

By the 2000s, NOAA is running a little warmer than the other three; Berkeley ALL 2012 reaches a level virtually identical to CRUTEM4 and the GISS land series. In contrast, Berkeley ALL 2011 tracks the NOAA series more closely. Nevertheless, all series are quite close to one another over this entire period.

Note that both Berkeley series are at the high end of the range in the 1970s, while Berkeley ALL 2012 is at the low end in the 2000s. This is reflected in a slightly lower linear trend slope in Berkeley ALL 2012 compared to the other series.

Although the trend is slightly lower in Berkeley ALL 2012 than in the other series (including Berkeley ALL 2011), the trend from 1970 on has remained above 0.25 C per decade in all series since 2005.
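As a rough illustration of how such a trend can be computed, the sketch below fits an ordinary least-squares line to annual anomalies from 1970 on and reports the slope in °C per decade. The function name `trend_per_decade` and the input series are assumptions made up for this example; the series is synthetic, not any of the data sets discussed here.

```python
import numpy as np

def trend_per_decade(years, anomalies, start_year=1970):
    """Ordinary least-squares slope of anomaly vs. year, in deg C per decade."""
    years = np.asarray(years, dtype=float)
    anomalies = np.asarray(anomalies, dtype=float)
    mask = years >= start_year
    slope_per_year, _intercept = np.polyfit(years[mask], anomalies[mask], 1)
    return 10.0 * slope_per_year

# Synthetic series (not real data): flat to 1970, then rising 0.027 deg C/yr
years = np.arange(1950, 2012)
anoms = np.where(years >= 1970, 0.027 * (years - 1970), 0.0)
print(round(trend_per_decade(years, anoms), 3))
```

Changing the end year of the input shows how the "trend from 1970 on" evolves as more recent years are added.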

Now let’s take a closer look at the differences between Berkeley ALL 2011 and 2012, showing their respective confidence intervals. The Berkeley methodology computes spatial and statistical uncertainty (the latter via a 7/8 “jackknife”), but does not include the bias and measurement error estimated by the other groups. Because of this narrower focus, and because of the large size of the Berkeley station database, the confidence intervals of the Berkeley series are very tight indeed, especially in recent years. To illustrate this, here are the two series from 1990 through 2011, this time displayed annually.

The confidence intervals are shown by the bounding dotted lines; circled years denote where the central value of the 2011 series is outside the confidence interval of the 2012 series. There are four such years in the 1990s and six in the 2000s, including all of the last five years common to both (2005-2009). Generally speaking, the 95% confidence intervals for annual values in this period are about +/-0.04 C, with the 2011 intervals slightly wider. Several years feature no overlap of the confidence intervals at all (e.g. 2007)!
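The two tests used here (central value outside the other series' interval, and no overlap of intervals at all) can be sketched as follows. The function names and the three pairs of values are made up for illustration; they are not taken from the Berkeley data sets.

```python
import numpy as np

def outside_interval(central_a, central_b, half_width_b):
    """True where series A's central value lies outside series B's interval."""
    a = np.asarray(central_a, dtype=float)
    b = np.asarray(central_b, dtype=float)
    return np.abs(a - b) > np.asarray(half_width_b, dtype=float)

def intervals_disjoint(central_a, half_width_a, central_b, half_width_b):
    """True where the two confidence intervals do not overlap at all."""
    gap = np.abs(np.asarray(central_a, dtype=float) - np.asarray(central_b, dtype=float))
    total = np.asarray(half_width_a, dtype=float) + np.asarray(half_width_b, dtype=float)
    return gap > total

# Made-up annual anomalies (deg C) for three hypothetical years
series_2011 = [0.90, 0.95, 1.05]
series_2012 = [0.88, 0.90, 0.95]
print(outside_interval(series_2011, series_2012, 0.04))
print(intervals_disjoint(series_2011, 0.05, series_2012, 0.04))
```

The second test is the stricter one: a year can be "circled" under the first test while the two intervals still overlap.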

The 10-year moving average shows the evolution of this divergence since 1980.

By the 2000s, the confidence intervals no longer overlap at all, and presumably would continue to widen, were both series to be extended in the future.

This divergence does not seem easy to explain. The database has presumably been stable, and the methodology does not seem to have undergone any significant changes. Yet the Berkeley ALL series has suddenly moved from the top end of the range of observed warming to the bottom post-1980, compared to the other series. The sensitivity of the Berkeley series to presumably small tweaks or corrections in the Berkeley methodology is worrisome; if this effect went unnoticed by the Berkeley team, they would be well advised to perform a sensitivity analysis for each change introduced over the last few months.

Before moving on, I should also mention some baseline curiosities in these datasets. The eagle-eyed observer may have noticed a discrepancy in this regard. Indeed, the Berkeley papers consistently cite a 1951-1980 baseline (that is, all series are given as anomalies from the average in this period). The Berkeley dataset headers, however, state that the baseline is 1950-1980, not 1951-1980. Meanwhile, the data itself has an average of exactly 0 from 1950-1979, so I have used this baseline to avoid changing the data as given (the other series have been adjusted accordingly, of course).

To add to the confusion, the Berkeley summary charts (such as the one above) do not explicitly mention any baseline, but the baseline used is clearly later than 1951-1980 (perhaps 1961-1990).
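Adjusting a series to a common baseline is just a constant shift, as in this minimal sketch. The function name `rebaseline` and the input series are illustrative assumptions (synthetic values, not real data).

```python
import numpy as np

def rebaseline(years, anomalies, start=1950, end=1979):
    """Shift anomalies so that their mean over [start, end] inclusive is zero."""
    years = np.asarray(years)
    anomalies = np.asarray(anomalies, dtype=float)
    in_base = (years >= start) & (years <= end)
    return anomalies - anomalies[in_base].mean()

# Synthetic annual anomalies on some other, unknown baseline
years = np.arange(1950, 2012)
anoms = 0.01 * (years - 1950) - 0.2
shifted = rebaseline(years, anoms)
print(shifted[(years >= 1950) & (years <= 1979)].mean())
```

Because only a constant is subtracted, trends and year-to-year differences are unchanged; only the vertical alignment between series is affected.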

[UPDATE 8/20: As Berkeley team member Zeke Hausfather notes in a comment below, the Berkeley 2011 series had a significant error due to incorrect latitudinal weighting. That problem resulted in an elevated trend compared to the corrected 2012 series, with a four-sigma difference in the 2000s decade. It also had a clearly unrealistic absolute average temperature a full 2 C lower than other Berkeley series, including the contemporaneous GHCN 2011 series. It is therefore somewhat surprising that the Berkeley Earth team did not realize that this series was problematic.

It is not yet clear when the error was discovered, but it seems it must have been earlier this year. At some point (I can’t recall exactly when), the Berkeley Earth data analysis page was updated with a corrected chart that showed a little less post-1980 warming, and the link to the underlying data was removed. That page is still available; it contains no indication or notice that the previous version had been corrected.

I’m inclined to give the Berkeley Earth team the benefit of the doubt and presume that the failure to disclose this error was not an intentional effort to mislead. However, it does point up a serious problem in Berkeley Earth’s science communication. And it is especially ironic that Berkeley Earth failed to disclose this information, as well as reach a minimum level of professional communication, given the unfounded attacks by the project leadership on established groups. ]

3. COMPARISON OF BERKELEY ALL and BERKELEY GHCN, 1950 to PRESENT

In the two versions of the “methods” paper, the Berkeley team applied their methodology to the GHCN station data. Unfortunately, these series remain unarchived. However, the main attributes of the series post-1950 are given in the respective paper versions. The corresponding ALL attributes are given in the corresponding data set header or derived from the actual data, except for the 2000s increase in 2012 ALL, which is found in the “results” paper. The following table compares the general characteristics of the two Berkeley ALL and the two Berkeley GHCN series from 1950 on.

Series Version | 1950s Absolute (°C) | 2000s-1950s Increase (°C) | Trend 1970-2011 (°C/century)
2011 ALL | 6.83 | 0.910 | 2.70
2011 GHCN | 8.849 ± 0.033 | 0.911 ± 0.042 | 2.76 ± 0.16
2012 ALL | 8.87 | 0.869 ± 0.048 | 2.53
2012 GHCN | 9.29 ± 0.032 | 0.893 ± 0.063 | 2.74 ± 0.24

The wide variance in estimated average absolute temperature in the 1950s is striking. The 2011 ALL value appears to be an error and so has been italicized. But the other values are also well apart. In particular the difference between 2012 ALL and GHCN is a whopping 0.42 °C. Since the given standard deviation for the GHCN 1950s value is only 0.016 °C, this represents a more than 25-sigma difference!
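The arithmetic behind that figure can be checked in a few lines. This sketch assumes, as the text does, that the quoted ± 0.032 is a 95% interval corresponding to roughly two standard deviations; it only reproduces the calculation and says nothing about whether that is the appropriate uncertainty to apply to an absolute temperature.

```python
# Values from the table above (deg C)
all_2012_abs = 8.87    # 2012 ALL, 1950s absolute temperature
ghcn_2012_abs = 9.29   # 2012 GHCN, 1950s absolute temperature
ghcn_ci95 = 0.032      # quoted +/- for the GHCN value

# Assumption: the quoted +/- is about two standard deviations
sigma = ghcn_ci95 / 2.0
difference = ghcn_2012_abs - all_2012_abs
n_sigma = difference / sigma

print(round(difference, 2), round(sigma, 3), round(n_sigma, 1))
```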

[UPDATE 8/20: The GHCN temperature and uncertainties given in the above table came from the “methods” paper, which reads as follows at p. 24.

Applying the methods described here, we find that the average land temperature from Jan 1950 to Dec 1959 was 9.290 ± 0.032 C, and temperature average during the most recent decade (Jan 2000 to Dec 2009) was 10.183 ± 0.047 C, an increase of 0.893 ± 0.063 C.

However, later in the paper we read:

The global land average from 1900 to 2000 is 9.35 ± 1.45 C, broadly consistent with the estimate of 8.5 C provided by Peterson et al. (2011). This large uncertainty in the normalization is not included in the shaded bands that we put on our 𝑇𝑎𝑣𝑔 plots, as it only affects the absolute scale and doesn’t affect relative comparisons.

Thus, the uncertainty in the first passage appears to refer to the uncertainty of the nominal average temperature relative to other periods, not the uncertainty of absolute temperature per se. In other words, it is equivalent to the uncertainty in the anomaly. The uncertainty in the absolute temperature is considerably larger. The first sentence is poorly worded, though, and should be clarified. ]

The trend from 1970 to present is noticeably lower in the 2012 ALL series than the other three, and barely within the GHCN 2012 confidence interval.

To show all this in more visual terms, I have anchored each GHCN series to its given decadal average values in the 1950s and the 2000s. I then extended each series from the present back to 1970 at the given trend slope, differencing that slope with that of the corresponding ALL series, to produce the “GHCN-Sim” series seen below.

I have shown the GHCN-Sim 2012 with both a common baseline to 2012 ALL (showing the extremely wide divergence in estimated absolute temperature) and a more conventional “internal” baseline. Even the latter shows evidence of divergence by 2011, with the GHCN series already at the upper bound of the 2012 confidence interval.

4. BERKELEY ALL 2012 vs BERKELEY GHCN 2012 PRE-1950

4a. 1900-1950

The “methods” paper gives few hard numbers concerning Berkeley GHCN 2012 in this period. I will therefore confine my remarks to a visual comparison of the two series, as seen in charts from the respective papers.

The following is from Fig 8 in the “methods” paper and shows the Berkeley GHCN 2012 series (in black) along with NOAA (green), GISS (red) and CRUTEM3 (red). We can see that GHCN 2012 is within the range of the other three, and tracks CRUTEM3 quite closely at the bottom of the pack up until 1915 or so. By 1950, the four series are barely distinguishable, with Berkeley GHCN 2012 squarely in the middle.

Contrast this with Berkeley ALL 2012 as seen in the “results” paper Fig. 1 (note this is a 10-year moving average, as the annual chart is very indistinct).

In the early 1900-1915 period, Berkeley 2012 ALL is way below all the others, and is still slightly below even in the 1940s.

4b. 1800-1900

In this period the divergence between Berkeley GHCN and Berkeley ALL is even wider.

First, as in the post-1950 case, we give a table of comparable attributes.

Series Version | 2000s-1800s Increase (°C) | 1950s-1800s Increase (°C) | Trend 1800-1899 (°C/century)
2012 ALL | 1.38 | 0.51 | 0.54
2012 GHCN | 1.27 ± 0.21 | 0.38 | 0.18 ± 0.45

In the “methods” paper, the GHCN 1800s trend is deemed “approximately constant” at 0.18 C/century, whereas Berkeley ALL gives a much higher trend of 0.54 C/century. In fact, that is closer to the Berkeley 2012 ALL 20th century trend of 0.77 C/century than to the GHCN 1800s trend.

The combination of steeper trend and lower temperatures relative to the 1950s implies an approximately 0.3 °C divergence at the beginning of the 19th century.

In order to confirm this divergence, I compare the 1813 troughs in the respective 10-year average data sets. The Berkeley 2012 GHCN levels can be approximately determined from Fig. 1 bottom panel in the “methods” paper, while the Berkeley 2012 ALL levels are derived from the data set as before.

In the following table, the centred 10-year averages are given for the minimum centred on 1813, relative to the 1950s decade.

Series Version | 1813 Rel. 1950s (°C) | Divergence (°C)
2012 ALL | -1.22 ± 0.50 | -0.32
2012 GHCN | -0.90 ± 0.40 | N/A

This confirms the approximate 0.3 °C divergence between the two series in the early 19th century.

As the authors note in a cursory manner, the Berkeley ALL 19th century temperature series is considerably cooler in general than paleoclimate reconstructions, including NH land reconstructions. However, the authors have not attempted to quantify these differences or assess them in any way, even though some of the 19th century reconstructions (e.g. Luterbacher) are themselves largely based on instrumental data.

To be sure, estimates of past centennial variability have tended to increase over the years (compare IPCC TAR and AR4 for example). But the Berkeley 19th century portion looks like an outlier even compared to the coolest paleoclimate reconstructions such as Moberg et al.

The 19th century series portion should also be compared to paleoclimate model “hindcasts” in this era, such as presented in IPCC AR4 Chapter 6 Fig. 6-14. These capture short-term volcanic related excursions better than reconstructions from proxies (although the two approaches align well at longer timescales).

The spread between the early 1800s trough and 1950s level in the various model runs is about 0.6-0.8 °C, or  0.4-0.6 °C less than in the Berkeley ALL 2012 series. To be sure, the two are not directly comparable as Berkeley ALL is global land, and the model reconstructions are NH over land and ocean. A more precise comparison would compare Berkeley ALL NH, to NH model and proxy land reconstructions. Nevertheless, it appears that the GHCN-only series may be more plausible and in line with other evidence. Understanding the source of the large divergence between the Berkeley ALL and GHCN series becomes all the more crucial.

5. OTHER ISSUES

Beyond presentation of the new temperature series, the Berkeley Earth “results” abstract mentions two other findings, concerning (a) diurnal temperature range and (b) the ability of volcanism and anthropogenic effects to account for temperature variations.

Diurnal variations decreased from 1900 to 1987, and then increased; this increase is significant but not understood. The period of 1753 to 1850 is marked by sudden drops in land surface temperature that are coincident with known volcanism; the response function is approximately 1.5 ± 0.5 ºC per 100 Tg of atmospheric sulfate. This volcanism, combined with a simple proxy for anthropogenic effects (logarithm of the CO2 concentration), can account for much of the variation in the land surface temperature record; the fit is not improved by the addition of a solar forcing term.

5a. Diurnal temperature range

Like the NOAA, the Berkeley Earth team has studied trends in minimum and maximum temperature, not just temperature average. The majority of sites (at least since 1950) allow the construction of paired minimum/maximum series, enabling average  diurnal temperature range to be estimated over time.

Some of the climate models predict that the diurnal temperature range, that is, the difference between Tmax and Tmin, should decrease due to greenhouse warming. The physics is that greenhouse gases have more impact at night when they absorb infrared and reduce the cooling, and that this effect is larger than the additional daytime warming. This predicted change is sometimes cited as one of the “fingerprints” that separate greenhouse warming from other effects such as solar variability. Previous studies [Karl et al., 1991; Easterling et al., 1997; Braganza et al. 1998; Jones et al. 1999] reported significant decreases in the diurnal temperature range over the period 1948 to 1994. …

The figure:

The interpretation  follows:

The behavior of the diurnal range is not simple; it drops from 1900 to 1987, and then it rises. The rise takes place during a period when, according to the IPCC report, the anthropogenic effect of global warming is evident above the background variations from natural causes. Although the post-1987 rise is not sufficient to undo the drop that took place from 1901 to 1987, the trend of 0.86 ± 0.13 C/century is distinctly upwards with a very high level of confidence. This reversal is particularly odd since it occurs during a period when the rise in Tavg was strong and showed no apparent changes in behavior

… We are not aware of any global climate models that predicted the reversal of slope that we observe.

The general descending trend in diurnal temperature range is well established, as noted by the authors. However, there are at least two major problems with the above exposition, as revealed by a quick Google search on “DTR Trend” at the NOAA site.

First, the assertion that climate models show DTR decrease to be a particular feature of greenhouse gas warming appears to be without foundation. Liming Zhou’s bulletin on DTR, Asymmetric Global Warming: Day versus Night, notes:

The greenhouse gases enhanced surface downward longwave radiation (DLW) explains most of the warming of Tmax and Tmin while decreased surface downward shortwave radiation (DSW) due to increasing aerosols and water vapor contributes most to the decreases in DTR in the models.

So there is an anthropogenic component (i.e. aerosols) to modelled DTR, but it is also associated with increasing water vapour, which is a feedback of warming irrespective of attribution. Thus there is no justification for claiming that DTR decrease is a specific “fingerprint” of greenhouse gas warming in climate models.

It is also worth noting that the observed decrease in DTR is well above  the small descending trend adduced in models, as noted in Zhou’s figure 2 (bottom panel – red line is modelled with all forcings, black is observed).

Second, the Berkeley Earth team cites older studies on DTR but fails to cite a more recent 2005 study by Vose et al that finds no trend in DTR from 1980 on (a feature also seen in the Zhou figure above, which derives observed trends from Vose et al).

Since this latter finding is based on analysis of NOAA GHCN temperature series, the natural question is whether the Berkeley Earth divergent finding of increasing DTR is attributable to the expanded data set, different processing algorithm or both. One obvious approach is to perform the same analysis on a GHCN-only TMax / TMin temperature series.
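The core of such a check is simple: form DTR as Tmax minus Tmin and compare linear trends before and after the reported turning point. The sketch below does this on synthetic series. The function name `dtr_trends`, the default break year (taken from the Berkeley paper's description) and all input values are assumptions for illustration; the actual Berkeley analysis is certainly more elaborate than two straight-line fits.

```python
import numpy as np

def dtr_trends(years, tmax, tmin, break_year=1987):
    """Return (slope_before, slope_after) of DTR = Tmax - Tmin, in deg C/century."""
    years = np.asarray(years, dtype=float)
    dtr = np.asarray(tmax, dtype=float) - np.asarray(tmin, dtype=float)
    before = years <= break_year
    after = years >= break_year
    slope_before = np.polyfit(years[before], dtr[before], 1)[0]
    slope_after = np.polyfit(years[after], dtr[after], 1)[0]
    return 100.0 * slope_before, 100.0 * slope_after

# Synthetic series (not real data): DTR falls to 1987, then rises
years = np.arange(1950, 2012)
tmin = 0.02 * (years - 1950)
true_dtr = np.where(
    years <= 1987,
    11.0 - 0.005 * (years - 1950),
    11.0 - 0.005 * 37 + 0.008 * (years - 1987),
)
tmax = tmin + true_dtr
print(dtr_trends(years, tmax, tmin))
```

Running the same function on GHCN-only and full-database Tmax/Tmin series would show directly whether the post-1987 rise depends on the expanded station set.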

It is particularly worrisome that not a single member of the Berkeley team, all of whom were either co-authors or credited with useful insights, caught these major deficiencies in this section.

5b. Volcanic and other forcings

The Berkeley Earth ALL series extends back to 1755, while the GHCN-only series goes back to 1800. (In contrast, NOAA and NASA-GISS go back to 1880, while HadCrut goes back to 1850).

The extended record has been fit to a linear combination of volcanic emissions and ln CO2 levels. In the following figure reformatted from the “results” paper Fig 5, annual and decadal record and fit are shown at top and bottom respectively.
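To make the idea of such a fit concrete, here is a minimal least-squares sketch with an intercept, a volcanic sulfate term and a ln CO2 term. The function name `fit_volcanic_co2`, the reference concentration `co2_ref` and all input values are assumptions for illustration; the data are synthetic, and the paper's actual fitting procedure may differ.

```python
import numpy as np

def fit_volcanic_co2(temperature, sulfate, co2_ppm, co2_ref=280.0):
    """Least-squares fit of T ~= a + b*sulfate + c*ln(co2/co2_ref); returns (a, b, c)."""
    temperature = np.asarray(temperature, dtype=float)
    design = np.column_stack([
        np.ones_like(temperature),                            # intercept
        np.asarray(sulfate, dtype=float),                     # volcanic term
        np.log(np.asarray(co2_ppm, dtype=float) / co2_ref),   # ln CO2 term
    ])
    coeffs, *_ = np.linalg.lstsq(design, temperature, rcond=None)
    return tuple(coeffs)

# Synthetic inputs (not real data): two eruptions and slowly rising CO2
n = 100
sulfate = np.zeros(n)
sulfate[10], sulfate[30] = 60.0, 20.0
co2 = 280.0 * np.exp(np.linspace(0.0, 0.3, n))
temp = 8.0 - 0.01 * sulfate + 3.0 * np.log(co2 / 280.0)

a, b, c = fit_volcanic_co2(temp, sulfate, co2)
print(round(a, 3), round(b, 4), round(c, 3))
```

Note that the two coefficients are fit independently of one another, so nothing in a fit of this kind forces them to imply a consistent sensitivity.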

As noted above, the 19th century portion of the series is generally cooler and has a more pronounced upward trend than the Berkeley GHCN series, or comparable model and proxy paleoclimate reconstructions. This characteristic of the Berkeley ALL 2012 series appears to be related to a more pronounced downward excursion in the early 1800s, associated with the Tambora mega-volcano in 1815 (on top of a smaller volcano in 1809).

That volcanic effects play a major role in 19th century temperature, and anthropogenic GHGs in the 20th, is scientifically uncontroversial (or should be). However, treating forcings as completely independent could lead to inconsistent results. At first glance, the large decadal downturns attributed to volcanic activity imply a higher climate sensitivity to those forcings than to the equivalent forcings from GHGs, a nonsensical result. Alternatively, it could imply that volcanic forcings have been severely underestimated in the past.

Once again, benchmarking against the GHCN-only series would supply much needed insight. It seems plausible that the GHCN-only series would be more consistent with previous work, as well as provide a more consistent fit to volcanic emissions and GHGs.

In any event, while linear fits may provide some qualitative confirmation of the correctness of the general shape of the 19th century portion of the Berkeley series, they are a poor substitute for proper attribution studies. (A more apt use of linear fits is to remove short-term effects of natural variation in order to ascertain the true underlying trend in the recent temperature record, as done in Foster and Rahmstorf 2011).

In a subsequent post, I’ll compare the Berkeley series to reconstructions from both models and proxies over NH land. It is hoped that by then Berkeley Earth will have seen fit to release the GHCN 2012 series.

6. CONCLUSION AND SUMMARY

I have compared the Berkeley Earth land temperature series to other Berkeley Earth series, and series from other groups, and found many unexplained divergences and discrepancies. These include:

  • Berkeley ALL 2012 is cooler post-1980 compared to all other Berkeley series and those from other groups. There is no overlap of confidence intervals in the 2000s. [This is now known to be due to a previously undisclosed error in latitudinal weighting].
  • The 19th century portion of the Berkeley ALL 2012 series is considerably cooler than the GHCN series, with a divergence of about -0.3 C at the circa 1813 trough. It also appears to be cooler than all proxy and model reconstructions.
  • The analysis of diurnal temperature range (DTR) is hampered by misconceptions and ignorance of previous scientific literature. The DTR findings are at odds with the most recent credible science in this area.
  • Independent fit of Berkeley ALL 2012 to volcanic emissions and CO2 atmospheric levels appears to give conflicting estimates of sensitivity to forcings, or else imply a previous underestimate of volcanic forcings.

These differences need to be reported, explained and resolved before the Berkeley Earth series can be considered a credible addition to the global surface temperature record.

Despite Berkeley Earth’s stated commitment to open access to data, much work needs to be done to improve data access. The group has released the main average, maximum and minimum temperature series and created a drill down database that can provide regional and local temperature series. However there is no data available for the minimum and maximum series which were used to analyze DTR. And no data whatsoever has been released to support the “methods” paper and the GHCN-only series presented there. Moreover, the raw data sources are poorly described and little information is available on location and number of stations in each data source.

As a first step towards more open access, the GHCN-only series should be released as soon as possible. Other subsets should also be considered in order to explore possible biases lurking in the various data sets. For all released series, intermediate detailed and summary results should also be released (as NOAA does). This is especially important given the relatively opaque methods of Berkeley Earth, which bear no obvious relationship to individual station data. Thus, for example, the merged “Best Quality Stations” should be released and should probably also include diagnostic information to ascertain the weight of individual stations in each combined series over time.

[UPDATE 8/20:  Today Zeke Hausfather posted the 2012 GHCN series, as noted below. I commend him for this swift action.

The next logical step is to release other key summary data for each of the submitted papers. This will permit better public scrutiny of Berkeley Earth results and attributes, as I argue above. But just as importantly, if this policy had been in place from the beginning, the error in the Berkeley Earth 2011 results would  have been caught much earlier, as the “methods” GHCN series would have been seen to be clearly at odds with the 2011 ALL series. ]

The Berkeley approach to temperature averaging and the creation of an expanded integrated database of raw data may turn out to be useful contributions. But in the end, it is entirely possible that a pure statistical approach cannot overcome deficiencies in the raw data; certainly, the automated “more is better” approach has yet to be validated.

16 responses to “Berkeley Earth, part 1: Divergences and discrepancies”

  1. Deep,

    The primary divergence between the 2011 and 2012 Berkeley series was due to a bug in latitudinal weighting that was fixed.

    Here is GHCN vs non-GHCN stations in Berkeley: http://rankexploits.com/musings/wp-content/uploads/2012/07/GHCN_NonGHCN_Compare.png
    Here are min and max temperatures for the globe (you can find them on any of the temperature pages): http://berkeleyearth.lbl.gov/regions/global-land

    On the Vose et al paper, I suggested including it in the draft, and we discussed it in meetings. Ditto climate models underestimating DTR. Not sure why it didn’t make it into the final version, however.

    Also, if you want to compare non-smoothed Berkeley temps to other temps, make that comparison explicitly: http://i81.photobucket.com/albums/j237/hausfath/LandTempComparison.png

    I’ll put in a request that we put up a GHCN-only series for folks to download. It will have to go in the queue behind releasing each station’s record and 1×1 gridded data, however, as we have had a lot more requests for that.

    • Zeke,
      Thanks for stopping by; I appreciate it. I do feel that you are by far the most credible public face of Berkeley Earth, although I suppose that’s faint praise in the context. I’ll answer your specific responses in comments below. I’ll be tough (but fair); normally, I’d cut a neophyte group more slack, but am less inclined to do so given the fatuous arrogance exhibited by Berkeley Earth leadership.

  2. Oops, had CRUTEM4 twice in there. Here is a version with NCDC included: http://i81.photobucket.com/albums/j237/hausfath/LandTempComparison-1.png

  3. Zeke:

    “The primary divergence between the 2011 and 2012 Berkeley series was due to a bug in latitudinal weighting that was fixed.”

    So Berkeley Earth released a result in 2011, and then corrected a major bug with the July 2012 update, but did not advise anyone of a major discrepancy. When was BE going to release that fact? Never?

    Did that bug also affect the results in the initial “methods” paper? How about the latest “methods” paper (i.e. BE GHCN 2011 and 2012)?

  4. Zeke on GHCN:

    “Here is GHCN vs non-GHCN stations in Berkeley: http://rankexploits.com/musings/wp-content/uploads/2012/07/GHCN_NonGHCN_Compare.png”

    I thought about adding that chart, but it only shows in a qualitative way what I had discovered: That BE ALL 2012 is cooler than BE GHCN 2012 both before *and* after the baseline period. Also your chart doesn’t show the period of widest divergence (1800-1850). Although I might use it in an update, the actual data for BE GHCN 2012 is much more important.

    Speaking of which, Zeke continues:

    “I’ll put in a request that we put up a GHCN-only series for folks to download. It will have to go in the queue behind releasing each station’s record and 1×1 gridded data, however, as we have had a lot more requests for that.”

    BE priorities are greatly misplaced. The GHCN-based results are the main results presented in the two versions of the “methods” papers, and yet neither series has been made available. That’s appalling, especially given BE’s supposed spirit of openness and criticism of other groups for supposed lack of same. In addition, the GHCN-based series is crucial for assessing the BE temperature analysis.

    I consider this eminently reasonable request to fix an egregious oversight of data release to have been made to Zeke on July 31. The clock is ticking.

  5. Zeke on min/max and DTR:

    “Here are min and max temperatures for the globe (you can find them on any of the temperature pages): http://berkeleyearth.lbl.gov/regions/global-land”

    Thanks for that. It looks like there is both a flattening in min *and* slight acceleration in max post 1987. I’m not convinced this is plausible, but we’ll see eventually I suppose.

    “On the Vose et al paper, I suggested including it in the draft, and we discussed it in meetings. Ditto climate models underestimating DTR. Not sure why it didn’t make it into the final version, however.”

    It’s worse than just the omissions though. For example, the paper actually mischaracterizes DTR as a GHG “fingerprint”. And the failure to check DTR in the GHCN-only data set is a major flaw. Do you know what the GHCN-only “methods” series shows for DTR?

    • I don’t know off the top of my head what GHCN-only shows for DTR. It will probably be similar to the results you get from looking at DTR changes in the NCDC record (subtract their min and max series). I’ll look into getting you min/max as well as mean.

  6. Deep,

    I really don’t see the GHCN-only result being of that much importance, but I will bug Robert to send me the anomalies since he has them handy. You could also download the data yourself and compare the two, as I did back in February: http://curryja.files.wordpress.com/2012/02/berkeley-fig-3.png

    I agree that you would expect greater divergence pre-1850, since there are far fewer stations available and the introduction of a few additional stations makes a large impact.

    • “I really don’t see the GHCN-only result being of that much importance”

      Well, it was the only result presented in an actual paper in 2011. And you should simply release it if it’s the major result of a paper, even if you think it’s not important. And if it doesn’t show DTR reversal, you can’t just wave your hands and say “The other data set has more data so it’s better”. You need to explain it and figure out why there is a divergence. If it *does* show DTR reversal, then you have to explain the discrepancy with NOAA. Or else someone will do it for you. That’s how science works.

      “You could also download the data yourself … ”

      You appear to be referring to downloading Berkeley GHCN-only processed data. What data exactly are you referring to (link please)? Or do you mean NOAA/GHCN raw data?

      “expect greater divergence pre-1850”. Not necessarily – it depends also on % of stations added, where they are and how reliable they are.

  7. I was referring to the Berkeley dataset available here: http://berkeleyearth.org/data/

    If you want to only use GHCN-M data, retain stations with a GHCN-M designation in the metadata.

  8. Pingback: Another Week of GW News – August 19, 2012 – A Few Things Ill Considered

  9. Deep,

    We will update the website with the GHCN-M only run, but in the meantime folks can find it here: http://rankexploits.com/musings/wp-content/uploads/2012/08/GHCN_average_complete.txt

  10. Zeke H:

    “I really don’t see the GHCN-only result being of that much importance …”

    This confirms divergence of Berkeley ALL and GHCN-only outside each others’ CI in the 2000s.

  11. Zeke H:

    “I really don’t see the GHCN-only result being of that much importance …”

    This shows that Berkeley ALL actually adds uncertainty relative to GHCN-only in crucial periods: 1810-1830 and 1910-1940.

    A more complete post on the GHCN-only result and its supposed lack of importance is coming soon.

  12. Pingback: Richard Muller Radio Rambles, part 1: Kochs “very deep”, “very thoughtful” and “properly skeptical” | Deep Climate
