The saga of statistician turned climate science critic Edward Wegman and his protege Yasmin Said has taken yet another strange turn. The pair’s tenure as editors-in-chief at the Wiley journal they founded three years ago quietly came to an unceremonious end recently, while release of the hard-cover encyclopedia based on the journal also appears to have been delayed. Not only that, but it now seems that Yasmin Said’s stint as research assistant professor at George Mason University ended at the same time.
I had thought the saga of climate science critic Edward Wegman and the various allegations of misconduct in his recent work could not possibly get any more bizarre, especially in the wake of manifestly contradictory findings in two recently concluded investigations at George Mason University.
But in a shocking new development, it turns out that two problematic overview articles by Wegman and his protege and congressional report co-author Yasmin Said in Wiley Interdisciplinary Reviews: Computational Statistics (WIREs CS) have been completely revised. Those revisions saw the removal or rewriting of massive swathes of copy-and-paste scholarship, as well as correction of many errors identified by myself and others. In each case, the comprehensive revisions came “at the request of the Editors-in-Chief and the Publisher”, following complaints to Wiley alleging wholesale plagiarism. But Wegman and Said also happen to be two of the three chief editors of WIREs CS, thus raising compelling concerns of conflict of interest, to say the least.
In fact, it is very clear that Wiley’s own process for handling misconduct cases was egregiously abused in favour of a face-saving “redo” manoeuvre. And this latest episode raises disturbing new questions about the role of the third WIREs CS editor-in-chief (and “hockey stick” congressional report co-author) David Scott, and indeed Wiley management itself, in enabling the serial misconduct of Wegman and Said.
[Updates, Feb. 23-24: I have added extensive discussion "below the fold", starting with the section entitled GMU Process. The summary has been updated with additional links to side-by-side comparisons to enable readers to make their own judgments.]
Dan Vergano of USA Today reports on an “all faculty” announcement from George Mason University concerning the outcome of two faculty committee investigations of plagiarism charges against GMU statistics professor Edward Wegman.
One investigation concerned a 2008 article by Wegman protege Yasmin Said, Wegman himself and two others in Computational Statistics & Data Analysis (CSDA). The committee upheld CSDA’s previous plagiarism finding; as “team leader”, Wegman was found to bear responsibility and has been asked to retract the article and apologize to CSDA’s editor. GMU has also issued an official letter of reprimand confirming that finding of research misconduct.
A separate GMU committee investigated the 2006 congressional report commonly known as the Wegman Report, a critique of the Mann-Bradley-Hughes “hockey stick” reconstruction. That investigation held that “no scientific misconduct was involved”, only “extensive paraphrasing of another work” that was “referenced repeatedly”. [That finding holds that there was no plagiarism in Wegman Report background material derived from Raymond Bradley's Paleoclimatology; readers may judge side-by-side comparisons of the passages on tree rings and ice core and coral proxies for themselves.] However, in a bizarre twist, it appears that the committee did not even consider side-by-side comparison of the Wegman Report’s long and unreferenced background section on social network analysis, part of which was reused in the later CSDA article and gave rise to the plagiarism finding in the other GMU case!
Today I present an analysis of a 2009 article by Yasmin Said and Edward Wegman of George Mason University. “Roadmap for Optimization” was published in the inaugural edition of WIREs Comp Stats, one of a new family of Wiley publications conceived as a “serial encyclopedia”. Wegman and Said, along with David Scott of Rice University, are also editors of the journal; the three are best known as co-authors of the 2006 “hockey stick” report to Congress, commissioned by Rep. Joe Barton.
As the title implies, the article was meant to provide a broad overview of mathematical optimization and set the stage for subsequent articles detailing various optimization techniques. However, my analysis, entitled Suboptimal Scholarship: Antecedents of Said and Wegman 2009, demonstrates the highly problematic scholarship of the “Roadmap” article.
- No fewer than 15 likely online antecedent sources, all unattributed, have been identified, including 13 articles from Wikipedia and two others from Prof. Tom Ferguson and Wolfram MathWorld.
- Numerous errors have been identified, apparently arising from mistranscription, faulty rewording, or omission of key information.
- The scanty list of references appears to have been “carried along” from the unattributed antecedents; thus, these references may well constitute false citations.
First, I’ll present an abridged version of the Suboptimal Scholarship summary as an overview of the analysis. Then I’ll take a look at a few examples showing the derivation of “Roadmap” from its antecedents, including some remarkable errors introduced in the process. And finally I’ll place this latest embarrassment in the context of the pattern of dubious scholarship evidenced by Wegman and Said over the last several years.
I examine the opening chapter by Edward Wegman and Jeffrey Solka in the 2005 Handbook of Statistics: Data Mining and Data Visualization (C Rao, E Wegman and J Solka, editors). Sections 3 (The Computer Science Roots of Data mining), 5 (Databases), 6.2 (Clustering) and 6.3 (Artificial Neural Networks) appear to be largely derived from unattributed antecedents; these include online tutorials and presentations on data mining, SQL and artificial neural networks, as well as Brian Everitt’s classic Cluster Analysis. All the identified passages, tables and figures were adapted from “copy-paste” material in earlier course lectures by Wegman. The introduction to Chapter 13 (on genetic algorithms) by Yasmin Said also appears to contain lightly edited material from unattributed sources, including an online FAQ on evolutionary computing and a John Holland Scientific American piece. Several errors introduced by editing and rearrangement of the material are identified, demonstrating the authors’ lack of familiarity with these particular subject areas. This extends a pattern of problematic scholarship previously noted in the work of Wegman and Said.
Early climate contrarian reactions to the retraction of Said, Wegman et al 2008 have grasped at straws, holding that this does not affect the findings of the paper and the earlier Wegman report alleging inadequate peer review in climate science.
Now USA Today’s Dan Vergano, who broke the retraction story, addresses exactly that contention in a follow-up piece. Social network analysis expert Kathleen Carley of Carnegie Mellon calls Said et al “more of an opinion piece” that would have required “major revision” to render it fit for publication in an SNA journal.
And it gets worse. Computational Statistics and Data Analysis chief editor Stanley Azen “personally reviewed” the paper and sent Wegman an acceptance notice within days of submission. Meanwhile, Virginia Tech’s Skip Garner enumerates the potential consequences of the research misconduct finding, including the possible need to investigate “ethical issues such as conflict-of-interest, haste vs. scientific rigor and bias”.
It’s been a long time coming, but there has now been an official finding in at least one of the complaints concerning the dubious scholarship of GMU professors Edward Wegman and Yasmin Said. According to Dan Vergano of USA Today, the journal Computational Statistics and Data Analysis (CSDA) has officially confirmed that Said, Wegman et al 2008, a follow-up to the infamous Wegman et al report to Congress, will finally be retracted following complaints of plagiarism and inadequate peer review.
Previous posts have examined scholarship issues in the Wegman Report and Wegman et al’s core flawed statistical analysis of the “hockey stick” graph. Now I show that a recent WIREs Computational Statistics overview article on colour theory and design by Edward Wegman and protege Yasmin Said is based mainly on unattributed “flow through” decade-old material from various websites. These have been augmented by further unattributed figures and text from current online sources, including five Wikipedia articles (see figure above right).
The first anniversary of “hockey stick” co-author Ray Bradley’s complaint against George Mason University statistics professor Edward Wegman has come and gone, but the ensuing proceeding at GMU shows no sign of resolution. Similarly absent is any indication of the release of code and data, promised by Wegman back in 2006, or any explanation for the obvious problems permeating the Wegman Report’s core statistical analysis.
But through it all there has been one obvious question: if the Wegman Report and the follow-up federally funded Said et al on co-author social networks showed clear evidence of cut-and-paste scholarship, what might a close examination of other recent (or even not so recent) scholarship from the Wegman group reveal? To be sure, there are already hints of the answer in the problems seen in PhD dissertations from Said and others at GMU, and in the insertion of a couple of paragraphs from the PhD dissertation of computer scientist David Grossman into Wegman et al’s 1996 technical report.
A recent article by Wegman and Said in WIREs Computational Statistics opens up a whole new avenue of inquiry – and reveals a remarkable pattern of “flow through” cut-and-paste that goes even beyond Said et al 2008. Colour Design and Theory (published online in February) is based largely on a 2002 course lecture by Wegman. However, this is no case of simple recycling of material, for most of the earlier lecture material came from obscure websites on colour theory and was simply copied verbatim without attribution. Now much of it has shown up, virtually unchanged, nine years later. And the old material has been augmented with figures and text from several more decidedly non-scholarly sources, including – wait for it – five different Wikipedia articles.
By Deep Climate
A year ago, I first identified scholarship issues in the 2006 Wegman report, the contrarian touchstone commissioned by Republican congressman Joe Barton as part of his concerted campaign to discredit the “hockey stick” temperature reconstruction and the scientists behind it, especially Michael Mann. (The report was produced by lead author George Mason University statistics professor Edward Wegman, along with co-authors Rice University professor David Scott and Wegman protégé Yasmin Said, although Scott seems to have had little to do with it). Eventually, I demonstrated apparent plagiarism in 10 pages of background sections in the report, as well as in an obscure (but federally funded) follow-up article by Said, Wegman and two other Wegman acolytes. At the time, it seemed a matter of interest only in the blogosphere, while the mainstream media ignored the issue in favour of the bogus “climategate” scandal.
But it turned out that my work had come to the attention of at least one important player in this drama – paleoclimatologist and “hockey stick” co-author Raymond Bradley. Back in March, Bradley quietly filed an initial complaint with GMU alleging plagiarism by Wegman et al of Bradley’s own work, attaching some of my initial analysis. Two months later, Bradley updated GMU with my further evidence of more “widespread” plagiarism, including wholesale copying of passages from two social network text books and Wikipedia, in both the Wegman report itself, and the follow-up 2008 article by Said et al. Bradley also took special care to point out the discovery of federal funding for the latter, which made the apparent breaches of misconduct policy all the more serious.
None of this was known until the ever-patient Bradley went public, notably in recent statements in online and print articles by USA Today science reporter Dan Vergano. Now, a comprehensive report by John Mashey, based on the complete communication between Bradley and GMU research vice president Roger Stough, along with an analysis of GMU’s academic misconduct policy, shows exactly why Bradley finally came forward. Strange Investigations at George Mason University [PDF 2.6 Mb] presents a shocking picture of foot-dragging and lack of transparency at GMU. Despite the copious evidence presented by Bradley, no substantive action ensued until the belated August convening of the inquiry committee. That committee was supposed to have reported within 60 days of its ostensible nomination in April, and had only to consider the limited question of whether the allegations were substantive enough to warrant a full-blown investigation. Yet even the committee’s belated start only came after the intervention of Elsevier environmental sciences publisher John Fedor. Even worse, GMU’s Stough failed to provide promised progress reports, and the inquiry committee missed Stough’s stated September 30 deadline for delivery of its report. And Stough’s last substantive response to Bradley, in October, vaguely referred to needing “a few weeks more” to wrap up the inquiry phase, while since then he has stonewalled all further requests for updates.
So more than nine long months after Bradley’s initial complaint, GMU has yet to clearly reach the end of its initial inquiry, a phase that should have been pursued rigorously and resolved easily within GMU’s own timelines. That is especially so given the compelling evidence and the impetus of the serious issue of federal funding, which normally requires resolution of a misconduct inquiry within 60 days as a matter of law. All of this calls into serious question GMU’s misconduct policy and process, and indeed the university’s very commitment to fundamental principles of academic integrity.
Today I continue my examination of the key analysis section of the Wegman report on the Mann et al “hockey stick” temperature reconstruction, which uncritically rehashed Steve McIntyre and Ross McKitrick’s purported demonstration of the extreme biasing effect of Mann et al’s “short-centered” principal component analysis.
First, I’ll fill in some much needed context as an antidote to McIntyre and McKitrick’s misleading focus on Mann et al’s use of principal components analysis (PCA) in data preprocessing of tree-ring proxy networks. Their problematic analysis was compounded by Wegman et al’s refusal to even consider all subsequent peer reviewed commentary – commentary that clearly demonstrated that correction of Mann et al’s “short-centered” PCA had minimal impact on the overall reconstruction.
Next, I’ll look at Wegman et al’s “reproduction” of McIntyre and McKitrick’s simulation of Mann et al’s PCA methodology, published in the pair’s 2005 Geophysical Research Letters article (Hockey sticks, principal components, and spurious significance). It turns out that the sample leading principal components (PC1s) shown in two key Wegman et al figures were in fact rendered directly from McIntyre and McKitrick’s original archive of simulated “hockey stick” PC1s. Even worse, though, is the astonishing fact that this special collection of “hockey sticks” is not even a random sample of the 10,000 pseudo-proxy PC1s originally produced in the GRL study. Rather it expressly contains the very top 100 – one percent – having the most pronounced upward blade. Thus, McIntyre and McKitrick’s original Fig 1-1, mechanically reproduced by Wegman et al, shows a carefully selected “sample” from the top 1% of simulated “hockey sticks”. And Wegman’s Fig 4-4, which falsely claimed to show “hockey sticks” mined from low-order, low-autocorrelation “red noise”, contains another 12 from that same 1%!
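To make the selection problem concrete, here is a minimal Python sketch of why the top 1% of a simulation run cannot stand in for a random sample. It is not McIntyre and McKitrick's code. It assumes simple AR(1) noise, works on raw series rather than PC1s, and uses a hypothetical `blade_index` helper as the measure of an "upward blade". The series count, length, blade length and AR(1) coefficient are all illustrative choices of mine.

```python
import random
import statistics


def ar1_series(n, phi, rng):
    """Generate n points of AR(1) 'red noise' with unit-variance innovations."""
    x = rng.gauss(0.0, 1.0)
    out = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def blade_index(series, blade_len):
    """Hypothetical blade measure: mean of the final blade_len points minus
    the overall mean, expressed in units of the overall standard deviation."""
    overall_mean = statistics.fmean(series)
    overall_sd = statistics.pstdev(series)
    return (statistics.fmean(series[-blade_len:]) - overall_mean) / overall_sd


# Illustrative assumptions, not values taken from the GRL study.
N_SERIES, LENGTH, BLADE, PHI = 2000, 100, 20, 0.5

rng = random.Random(0)
scores = sorted(
    (blade_index(ar1_series(LENGTH, PHI, rng), BLADE) for _ in range(N_SERIES)),
    reverse=True,
)

top_count = N_SERIES // 100              # the top one percent
top_one_percent = scores[:top_count]     # series with the most pronounced blade
random_sample = rng.sample(scores, top_count)

print("mean blade index, top 1%:       ", round(statistics.fmean(top_one_percent), 2))
print("mean blade index, random sample:", round(statistics.fmean(random_sample), 2))
```

Any figure drawn only from `top_one_percent` will show far more pronounced blades than one drawn from `random_sample`, even though both come from the same simulation run.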
Finally, I’ll return to the central claim of Wegman et al – that McIntyre and McKitrick had shown that Michael Mann’s “short-centered” principal component analysis would mine “hockey sticks”, even from low-order, low-autocorrelation “red noise” proxies. But both the source code and the hard-wired “hockey stick” figures clearly confirm what physicist David Ritson pointed out more than four years ago, namely that McIntyre and McKitrick’s “compelling” result was in fact based on a highly questionable procedure that generated null proxies with very high autocorrelation and persistence. All these facts are clear from even a cursory examination of McIntyre’s source code, demonstrating once and for all the incompetence and lack of due diligence exhibited by the Wegman report authors.
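The gap between "low-autocorrelation" and "very high autocorrelation and persistence" can be sketched in a few lines of Python. This is only an illustration under my own assumptions: it uses a plain AR(1) model, which is not necessarily the noise procedure in McIntyre's code, and the coefficients 0.2 and 0.9 are hypothetical stand-ins for "low" and "high" persistence. For AR(1) noise, the number of time steps needed per effectively independent sample is (1 + phi) / (1 - phi).

```python
import random


def ar1_series(n, phi, rng):
    """Generate n points of AR(1) 'red noise' with unit-variance innovations."""
    x = rng.gauss(0.0, 1.0)
    out = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation of a series."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in series)
    return num / den


def decorrelation_time(phi):
    """Time steps per effectively independent sample for AR(1) noise."""
    return (1 + phi) / (1 - phi)


rng = random.Random(1)
results = {}
# Hypothetical coefficients chosen only to contrast low and high persistence.
for label, phi in (("low", 0.2), ("high", 0.9)):
    r1 = lag1_autocorrelation(ar1_series(20000, phi, rng))
    results[label] = (phi, r1, decorrelation_time(phi))
    print(f"{label}: phi={phi}, sample lag-1 r={r1:.2f}, "
          f"decorrelation time={decorrelation_time(phi):.1f} steps")
```

Under these assumptions, the high-persistence noise needs more than ten times as many time steps per independent sample as the low-persistence noise, so the two are very different null models.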