[Updates, Feb. 23-24: I have added extensive discussion “below the fold”, starting with the section entitled GMU Process. The summary has been updated with additional links to side-by-side comparisons to enable readers to make their own judgments.]
Dan Vergano of USA Today reports on an “all faculty” announcement from George Mason University concerning the outcome of two faculty committee investigations of plagiarism charges against GMU statistics professor Edward Wegman.
One investigation concerned a 2008 article by Wegman protege Yasmin Said, Wegman himself and two others in Computational Statistics & Data Analysis (CSDA). The committee upheld CSDA’s previous plagiarism finding; as “team leader”, Wegman was found to bear responsibility and has been asked to retract the article and apologize to CSDA’s editor. GMU has also issued an official letter of reprimand confirming that finding of research misconduct.
A separate GMU committee investigated the 2006 congressional report commonly known as the Wegman Report, a critique of the Mann-Bradley-Hughes “hockey stick” reconstruction. That investigation held that “no scientific misconduct was involved”, only “extensive paraphrasing of another work” that was “referenced repeatedly”. [That finding holds that there was no plagiarism in Wegman Report background material derived from Raymond Bradley’s Paleoclimatology; readers may judge side-by-side comparisons of the passages on tree-rings and ice core and coral proxies for themselves]. However, in a bizarre twist, it appears that the committee did not even consider side-by-side comparison of the Wegman Report’s long and unreferenced background section on social network analysis, part of which was reused in the later CSDA article and gave rise to the plagiarism finding in the other GMU case!
Today I present an analysis of a 2009 article by Yasmin Said and Edward Wegman of George Mason University. “Roadmap for Optimization” was published in the inaugural edition of WIREs Comp Stats, one of a new family of Wiley publications conceived as a “serial encyclopedia”. Wegman and Said, along with David Scott of Rice University, are also editors of the journal; the three are best known as co-authors of the 2006 “hockey stick” report to Congress, commissioned by Rep. Joe Barton.
As the title implies, the article was meant to provide a broad overview of mathematical optimization and set the stage for subsequent articles detailing various optimization techniques. However, my analysis, entitled Suboptimal Scholarship: Antecedents of Said and Wegman 2009, demonstrates the highly problematic scholarship of the “Roadmap” article.
- No fewer than 15 likely online antecedent sources, all unattributed, have been identified, including 13 articles from Wikipedia and two others from Prof. Tom Ferguson and Wolfram MathWorld.
- Numerous errors have been identified, apparently arising from mistranscription, faulty rewording, or omission of key information.
- The scanty list of references appears to have been “carried along” from the unattributed antecedents; thus, these references may well constitute false citations.
First, I’ll present an abridged version of the Suboptimal Scholarship summary as an overview of the analysis. Then I’ll take a look at a few examples showing the derivation of “Roadmap” from its antecedents, including some remarkable errors introduced in the process. And finally I’ll place this latest embarrassment in the context of the pattern of dubious scholarship evidenced by Wegman and Said over the last several years.
I examine the opening chapter by Edward Wegman and Jeffrey Solka in the 2005 Handbook of Statistics: Data Mining and Data Visualization (C Rao, E Wegman and J Solka, editors). Sections 3 (The Computer Science Roots of Data mining), 5 (Databases), 6.2 (Clustering) and 6.3 (Artificial Neural Networks) appear to be largely derived from unattributed antecedents; these include online tutorials and presentations on data mining, SQL and artificial neural networks, as well as Brian Everitt’s classic Cluster Analysis. All the identified passages, tables and figures were adapted from “copy-paste” material in earlier course lectures by Wegman. The introduction to Chapter 13 (on genetic algorithms) by Yasmin Said also appears to contain lightly edited material from unattributed sources, including an online FAQ on evolutionary computing and a John Holland Scientific American piece. Several errors introduced by editing and rearrangement of the material are identified, demonstrating the authors’ lack of familiarity with these particular subject areas. This extends a pattern of problematic scholarship previously noted in the work of Wegman and Said.
It’s been a long time coming, but there has now been an official finding in at least one of the complaints concerning the dubious scholarship of GMU professors Edward Wegman and Yasmin Said. According to Dan Vergano of USA Today, the journal Computational Statistics and Data Analysis (CSDA) has officially confirmed that Said, Wegman et al 2008, a follow-up to the infamous Wegman et al report to Congress, will finally be retracted following complaints of plagiarism and inadequate peer review.
Previous posts have examined scholarship issues in the Wegman Report and Wegman et al’s core flawed statistical analysis of the “hockey stick” graph. Now I show that a recent WIREs Computational Statistics overview article on colour theory and design by Edward Wegman and protege Yasmin Said is based mainly on unattributed “flow through” decade-old material from various websites. These have been augmented by further unattributed figures and text from current online sources, including five Wikipedia articles (see figure above right).
The first anniversary of “hockey stick” co-author Ray Bradley’s complaint against George Mason University statistics professor Edward Wegman has come and gone, but the ensuing proceeding at GMU shows no sign of resolution. Similarly absent is any indication of the release of code and data, promised by Wegman back in 2006, nor an explanation for the obvious problems permeating the Wegman Report’s core statistical analysis.
But through it all there has been one obvious question: if the Wegman Report and the federally funded follow-up Said et al on co-author social networks showed clear evidence of cut-and-paste scholarship, what might a close examination of other recent (or even not so recent) scholarship from the Wegman group reveal? To be sure, there are already hints at the answer, seen in problems in PhD dissertations from Said and others at GMU, and in the insertion of a couple of paragraphs from the PhD dissertation of computer scientist David Grossman into Wegman et al’s 1996 technical report.
A recent article by Wegman and Said in WIREs Computational Statistics opens up a whole new avenue of inquiry – and reveals a remarkable pattern of “flow through” cut-and-paste that goes even beyond Said et al 2008. Colour Design and Theory (published online in February) is based largely on a 2002 course lecture by Wegman. However, this is no case of simple recycling of material, for most of the earlier lecture material came from obscure websites on colour theory and was simply copied verbatim without attribution. Now much of it has shown up, virtually unchanged, nine years later. And the old material has been augmented with figures and text from several more decidedly non-scholarly sources, including – wait for it – five different Wikipedia articles.
By Deep Climate
A year ago, I first identified scholarship issues in the 2006 Wegman report, the contrarian touchstone commissioned by Republican congressman Joe Barton as part of his concerted campaign to discredit the “hockey stick” temperature reconstruction and the scientists behind it, especially Michael Mann. (The report was produced by lead author George Mason University statistics professor Edward Wegman, along with co-authors Rice University professor David Scott and Wegman protégé Yasmin Said, although Scott seems to have had little to do with it). Eventually, I demonstrated apparent plagiarism in 10 pages of background sections in the report, as well as in an obscure (but federally funded) follow-up article by Said, Wegman and two other Wegman acolytes. At the time, it seemed a matter of interest only in the blogosphere, while the mainstream media ignored the issue in favour of the bogus “climategate” scandal.
But it turned out that my work had come to the attention of at least one important player in this drama – paleoclimatologist and “hockey stick” co-author Raymond Bradley. Back in March, Bradley quietly filed an initial complaint with GMU alleging plagiarism by Wegman et al of Bradley’s own work, attaching some of my initial analysis. Two months later, Bradley updated GMU with my further evidence of more “widespread” plagiarism, including wholesale copying of passages from two social network textbooks and Wikipedia, in both the Wegman report itself and the follow-up 2008 article by Said et al. Bradley also took special care to point out the discovery of federal funding for the latter, which made the apparent breaches of misconduct policy all the more serious.
None of this was known until the ever-patient Bradley went public, notably in recent statements in online and print articles by USA Today science reporter Dan Vergano. Now, a comprehensive report by John Mashey, based on the complete communication between Bradley and GMU research vice president Roger Stough, along with an analysis of GMU’s academic misconduct policy, shows exactly why Bradley finally came forward. Strange Investigations at George Mason University [PDF 2.6 Mb] presents a shocking picture of foot-dragging and lack of transparency at GMU. Despite the copious evidence presented by Bradley, no substantive action ensued until the belated August convening of the inquiry committee. That committee was supposed to have reported within 60 days of its ostensible nomination in April, and had only to consider the limited question of whether the allegations were substantive enough to warrant a full-blown investigation. Yet even the committee’s belated start only came after the intervention of Elsevier environmental sciences publisher John Fedor. Even worse, GMU’s Stough failed to provide promised progress reports, and the inquiry committee missed Stough’s stated September 30 deadline for delivery of its report. And Stough’s last substantive response to Bradley, in October, vaguely referred to needing “a few weeks more” to wrap up the inquiry phase, while since then he has stonewalled all further requests for updates.
So more than nine long months after Bradley’s initial complaint, GMU has yet to clearly reach the end of its initial inquiry, a phase that should have been pursued rigorously and resolved easily within GMU’s own timelines. That is especially so given the compelling evidence and the impetus of the serious issue of federal funding, which normally requires resolution of a misconduct inquiry within 60 days as a matter of law. All of this calls into serious question GMU’s misconduct policy and process, and indeed the university’s very commitment to fundamental principles of academic integrity.
[Update, Oct. 11: George Mason University spokesperson Doug Walsch has clarified that the complaint against Wegman has moved past the preliminary “inquiry” phase and is now under formal investigation.]
[Update, Oct. 15, 19: I have added pointers to my previous discussions and updated side-by-side comparisons relevant to allegations of plagiarism forwarded to George Mason University last March and April. The allegations concern not only the Wegman report, but also the federally-funded Said et al 2008 (published in Computational Statistics and Data Analysis, with Wegman and two other Wegman proteges as co-authors).]
George Mason University has acknowledged that statistics professor Edward Wegman is under investigation for plagiarism. As related in USA Today, the investigation followed a formal complaint by paleoclimatologist Raymond Bradley, co-author of the seminal (and controversial) 1998 and 1999 “hockey stick” temperature reconstructions.
But a letter from Roger Stough, GMU’s vice-president responsible for research, indicates that the pace of the initial inquiry has been slow. And it appears that a promised date for resolution of the inquiry phase of the proceeding has been missed.