Today I present an analysis of a 2009 article by Yasmin Said and Edward Wegman of George Mason University. “Roadmap for Optimization” was published in the inaugural edition of WIREs Comp Stats, one of a new family of Wiley publications conceived as a “serial encyclopedia”. Wegman and Said, along with David Scott of Rice University, are also editors of the journal; the three are best known as co-authors of the 2006 “hockey stick” report to Congress, commissioned by Rep. Joe Barton.
As the title implies, the article was meant to provide a broad overview of mathematical optimization and to set the stage for subsequent articles detailing various optimization techniques. However, my analysis, entitled Suboptimal Scholarship: Antecedents of Said and Wegman 2009, demonstrates the highly problematic scholarship of the “Roadmap” article.
- No fewer than 15 likely online antecedent sources, all unattributed, have been identified, including 13 articles from Wikipedia and two others from Prof. Tom Ferguson and Wolfram MathWorld.
- Numerous errors have been identified, apparently arising from mistranscription, faulty rewording, or omission of key information.
- The scanty list of references appears to have been “carried along” from the unattributed antecedents; thus, these references may well constitute false citations.
First, I’ll present an abridged version of the Suboptimal Scholarship summary as an overview of the analysis. Then I’ll take a look at a few examples showing the derivation of “Roadmap” from its antecedents, including some remarkable errors introduced in the process. And finally I’ll place this latest embarrassment in the context of the pattern of dubious scholarship evidenced by Wegman and Said over the last several years.
Overview of Suboptimal Scholarship: Antecedents of Said and Wegman 2009
Here is the full citation for “Roadmap for Optimization”:
Yasmin H. Said and Edward J. Wegman, “Roadmap for Optimization”, Wiley Interdisciplinary Reviews: Computational Statistics [WIREs Comp Stat], Volume 1, Issue 1, pages 3-11, July/August 2009. Online July 13, 2009.
The abstract reads:
This article focuses broadly on the area known as optimization. The intent is to provide in broad brush strokes a perspective on the area in order to orient the reader to more detailed treatments of specific subdisciplines of optimization throughout WIREs: Computational Statistics. In this article we provide background on mathematical programming, Lagrange multipliers, Karush-Kuhn-Tucker Conditions, numerical optimization methods, linear programming, dynamic programming, the calculus of variations, and metaheuristic algorithms.
Thus, this article was conceived as a key element in realizing the “serial encyclopedia” vision of WIREs Comp Stat.
As can be seen in the following table, apparent unattributed antecedents
have been identified in each of the sections corresponding to the aforementioned subjects. For the most part, the antecedents’ use appears
sequential, and shows little of the interspersing of sources found in the pair’s other WIREs overview article, Wegman and Said 2011 “Color Theory and Design” (previously analyzed).
| Sect. | Source (with version date) |
|-------|----------------------------|
| 1 | 1. Wikipedia – Mathematical Optimization (Jan. 2009) |
| 2 | 2. Wikipedia – Lagrange Multiplier (Jan. 2009)<br>3. Wikipedia – Karush-Kuhn-Tucker conditions (Jan. 2009) |
| 3 | 4. Wikipedia – Gradient Descent (Nov. 2008)<br>5. Wikipedia – Conjugate Gradient Method (Jan. 2009) |
| 4 | 6. Wikipedia – Linear Programming (Jan. 2009)<br>7. Tom Ferguson – Linear Programming<br>8. Wikipedia – Simplex Algorithm (Apr. 2009)<br>9. Wikipedia – Karmarkar’s Algorithm (Jan. 2009)<br>10. Wikipedia – Interior Point Methods (Dec. 2008) |
| 5 | 11. Wikipedia – Dynamic Programming (Jan. 2009) |
| 6 | 12. Wikipedia – Calculus of Variations (Dec. 2008)<br>13. Wolfram MathWorld – Calculus of Variations<br>14. Wikipedia – Fundamental Lemma of Calculus of Variations (Jan. 2009) |
| 7 | 15. Wikipedia – Simulated Annealing (Jan. 2009) |
Section 2 (p. 3-5) of Suboptimal Scholarship breaks down Said and Wegman 2009 page by page and gives identified antecedents, if any, for each sub-section, paragraph by paragraph. This section also contains a brief selection of identified errors, most of which have been introduced through various changes to the antecedents, reflecting apparent mistranscriptions or misunderstandings of subject area details. Some key omissions are also identified.
Section 3 (p. 6-30) begins with a detailed breakdown of the antecedents paragraph by paragraph. That is followed by a complete textual side-by-side comparison of Said and Wegman and its identified antecedents, with identical and trivially similar text highlighted, in the now familiar cyan (identical) and yellow (trivially different) highlighting scheme.
Finally, section 4 (p. 31-33) has a brief analysis of Said and Wegman’s references and suggested reading list. The seven references, including two for different editions of Bellman’s Dynamic Programming, can all be found in the corresponding Wikipedia antecedents. Similarly, the extensive “Further Reading” list also appears to be derived from the various antecedents.
Example 1: “Mathematical programming is the simplest case of optimization” [SoS p. 4, p. 10]
Note: in this and following examples, I’ve given the relevant page numbers from Suboptimal Scholarship (SoS), in case you want to follow along. The first page number is the page analyzing the error(s) and the second is the location of the detailed side-by-side comparison.
Almost a year ago, reader “Amoeba” tried to point out that parts of “Roadmap” had “striking similarities” to the Wikipedia article on Mathematical Optimization. I didn’t get it at the time, but in my defence the correspondence is quite a bit clearer if you go back to the version from early 2009.
In mathematics, the simplest case of optimization, or mathematical programming, refers to the study of problems in which one seeks to minimize or maximize a real function by systematically choosing the values of real or integer variables from within an allowed set. This (a scalar real valued objective function) …
Now let’s omit certain parts:
… the simplest case of optimization, or mathematical programming, … real … variables … a scalar real valued objective function
And rearrange to get the second paragraph of Said and Wegman:
The simplest case of mathematical optimization is mathematical programming. Mathematical programming considers as its objective function a scalar real-valued function of real variables.
The rest of the section features a similar rearrangement of the Wikipedia antecedent, but with so much common material as to make the source unmistakeable.
However, the real howler is that Said and Wegman have failed to understand that “mathematical programming” is used here as a synonym for mathematical optimization (or just plain optimization); it is patent nonsense to claim that mathematical programming is the “simplest case” of mathematical optimization.
If only they had waited a year, they could have avoided this particular fiasco. The Wikipedia opening now reads:
In mathematics and computational science, mathematical optimization (alternatively, optimization or mathematical programming) refers to the selection of a best element from some set of available alternatives.
In the simplest case, this means solving problems in which one seeks to maximize (or to minimize) a real function by systematically choosing the values of real or integer variables from within an allowed set.
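To make the corrected definition concrete, here is a toy sketch of my own (not drawn from either paper): optimization in the simplest case means systematically choosing values of a real variable from an allowed set so as to minimize a scalar real-valued objective function.

```python
# Toy illustration of the corrected definition: select a best element
# (here, the minimizer of f) from a set of available alternatives.
def f(x):
    return (x - 2.0) ** 2 + 1.0  # a scalar real-valued objective function

# Allowed set: a grid of real values, searched systematically.
allowed = [i * 0.01 for i in range(-500, 501)]  # -5.00 .. 5.00

best = min(allowed, key=f)
print(best, f(best))  # minimizer near x = 2, minimum value near 1
```

Grid search is the crudest possible method, of course; the point is only the shape of the problem, which is exactly what Said and Wegman's "simplest case" wording garbles.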
Example 2: 2d (or, not 2d) [SoS, p. 4, p. 22-24]
The long section on linear programming contains some of the most egregious lifting of material; check out the “sea of cyan” in the side-by-side comparison.
Here is a short excerpt from a longer passage that is strikingly similar to Tom Ferguson’s handy Linear Programming: A Concise Introduction. The highlighted portions are identical, with the slight changes shown with strikeout.
~~,~~ x ~~for~~ in the standard maximum problem or y ~~for~~ in the standard minimum problem~~,~~ is said to be feasible if it satisfies the corresponding constraints. The set of feasible vectors is called the constraint set. A linear programming problem is said to be feasible if the constraint set is not empty; otherwise it is said to be infeasible. A feasible maximum (~~resp.~~ minimum) problem is said to be unbounded if the objective function can assume arbitrarily large positive (~~resp.~~ negative) values at feasible vectors. ~~; otherwise,~~ If a problem is not unbounded, it is said to be bounded.
And so it goes, for several more sentences. But even these minimal changes have introduced confusion; the original commas (and the word “for”) are there for a reason, to emphasize that x and y are used for the standard maximum or standard minimum problem respectively, and that the rest of the statement applies to “a vector” in general. Then the removal of “resp.” (i.e. “respective”) makes the parallel construction incomprehensible.
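For readers who want the definitions rather than the plagiarism: here is a minimal sketch of my own (assuming Ferguson's standard-form conventions) of what "feasible" means for a standard maximum problem.

```python
# Sketch of feasibility for the standard maximum problem:
# maximize c.x subject to A x <= b and x >= 0.
def is_feasible(x, A, b):
    """A vector is feasible if it satisfies all the constraints."""
    if any(xi < 0 for xi in x):          # the non-negativity constraints
        return False
    return all(sum(a * xi for a, xi in zip(row, x)) <= bi
               for row, bi in zip(A, b))

A = [[1.0, 1.0]]   # a single constraint: x1 + x2 <= 4
b = [4.0]

print(is_feasible([1.0, 2.0], A, b))   # True: in the constraint set
print(is_feasible([3.0, 3.0], A, b))   # False: violates x1 + x2 <= 4
print(is_feasible([-1.0, 0.0], A, b))  # False: violates non-negativity
```

The problem is feasible when this constraint set is non-empty, and a feasible maximum problem is unbounded when the objective can be made arbitrarily large over it; that is exactly the parallel construction that the dropped "resp." renders incomprehensible.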
One of Said and Wegman’s particular enhancements to their Wikipedia antecedents is to change the notation for variable counts from n to d, perhaps to emphasize the geometric aspect of some optimization techniques. So a long passage on the simplex algorithm, apparently originating in Wikipedia – Simplex Algorithm (Apr. 2009), contains several such switches. For example, these two sentences require five such switches:
Suppose in the standard form of the problem there are ~~n~~ d variables and m constraints, not counting the n non-negativity constraints. Generally, a vertex of the simplex corresponds to making ~~n~~ d of the m + d total constraints tight, while adjacent vertices share ~~n~~ d − 1 tight constraints.
Unfortunately, the authors only made four of the changes, omitting to adjust the count of non-negativity constraints, which should of course match the number of variables. (And, no, n makes no other appearance in this section).
… the simplex method visits all 2d vertices before arriving at the optimal vertex.
The original has:
the simplex method … visits all 2^n vertices before arriving at the optimal vertex.
And all this time I thought a cube had eight vertices, not six. Who knew!
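The arithmetic behind the joke is easy to check (a sketch of my own, nothing here from either paper): the unit cube in d dimensions has 2^d vertices, each coordinate independently 0 or 1, so the simplex method's worst case is exponential in the number of variables, not the linear 2d that Said and Wegman's version implies.

```python
from itertools import product

# Vertices of the d-dimensional unit hypercube: each coordinate is 0 or 1,
# so there are 2**d of them, exponential in d rather than the linear 2*d.
def hypercube_vertices(d):
    return list(product((0, 1), repeat=d))

for d in (1, 2, 3, 4):
    print(d, len(hypercube_vertices(d)))  # 2, 4, 8, 16: for the cube, 8 not 6
```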
Example 3: Simulated annealing breaks down [SoS p. 5, p. 30]
Here is the opening from the Wikipedia 2009 article on Simulated Annealing.
Simulated annealing (SA) is a generic probabilistic meta-algorithm for the global optimization problem, namely locating a good approximation to the global minimum of a given function in a large search space.
The Said and Wegman version is less clear, but at least does not totally mangle the original.
Simulated annealing is a probabilistic metaheuristic global optimization algorithm for locating a good approximation to the global minimum of a given function in a large search space.
Here’s the Wikipedia explanation of the role of the global “temperature” parameter T, and its effect on solution search.
By analogy with this physical process, each step of the SA algorithm replaces the current solution by a random “nearby” solution, chosen with a probability that depends on the difference between the corresponding function values and on a global parameter T (called the temperature), that is gradually decreased during the process. The dependency is such that the current solution changes almost randomly when T is large, but increasingly “downhill” as T goes to zero. The allowance for “uphill” moves saves the method from becoming stuck at local minima—which are the bane of greedier methods.
Now look at the Said and Wegman version, which effectively confuses the global “temperature” parameter that modulates the solution search, and the minimum solution itself, as well as omitting key details such as selecting successive candidate solutions “nearby”.
During each step of the algorithm, the variable that will eventually represent the minimum is replaced by a random solution that is chosen according to a temperature parameter, T. As the temperature of the system decreases, the probability of higher temperature values replacing the minimum decreases, but it is always non-zero. The decrease in probability ensures a gradual decrease in the value of the minimum. However, the non-zero stipulation allows for a higher value to replace the minimum. Though this may sound like a flaw in the algorithm, it makes simulated annealing very useful because it allows for global minimums to be found rather than local ones. If during the course of the implementation of the algorithm a certain location (local minimum) has a lower temperature than its neighbors yet much higher than the overall lowest temperature (global minimum), this non-zero probability stipulation will allow for the value of the minimum to back track in a sense and become unstuck from local minima.
This is plain wrong – frighteningly so, in fact.
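For contrast, here is roughly what the Wikipedia description amounts to in code: a minimal simulated-annealing sketch of my own (the function names, step size, and geometric cooling schedule are illustrative assumptions, not anything from either paper). Note that T is a control parameter that is gradually lowered; it is not, as Said and Wegman's version implies, the minimum being sought.

```python
import math
import random

def anneal(f, x0, step=0.5, t0=5.0, t_min=1e-3, cooling=0.995, seed=0):
    """Minimize f by simulated annealing, starting from x0."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    t = t0
    while t > t_min:
        x_new = x + rng.uniform(-step, step)   # a "nearby" candidate solution
        f_new = f(x_new)
        # Always accept downhill moves; accept uphill moves with probability
        # exp(-(f_new - fx) / t), which shrinks toward zero as t does.
        if f_new < fx or rng.random() < math.exp(-(f_new - fx) / t):
            x, fx = x_new, f_new
            if fx < best_f:
                best_x, best_f = x, fx
        t *= cooling  # gradually decrease the temperature
    return best_x, best_f

def f(x):
    # A toy objective: local minimum near x = -1, global minimum near x = 2.
    return 0.2 * (x + 1.0) ** 2 if x < 0.5 else (x - 2.0) ** 2 - 1.0

x_best, f_best = anneal(f, x0=-1.0)
print(x_best, f_best)
```

The occasional acceptance of uphill moves, governed by exp(-(f_new - fx)/T), is what lets the search escape local minima; as T goes to zero the walk becomes purely "downhill". Nothing here tracks "the probability of higher temperature values replacing the minimum".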
This paper is the fifth major work by Wegman and Said that I have analyzed. From the 2006 Wegman report to Congress to this year’s “Colour Theory and Design”, much of their recent work demonstrates extreme reliance on unattributed antecedents, as well as numerous errors and incompetent analysis. Here is a list of the main works that have been found to be highly questionable (together with links to discussion and analysis for each).
- 2005: Chapter 1 (Wegman, Solka and Rao), and Chapter 13 (Said) in Handbook of Statistics: Data Mining and Data Visualization, Rao et al eds., Elsevier [Discussion]
- 2006: Wegman, Scott and Said; Ad Hoc Committee Report on the ‘Hockey Stick’ Global Climate Reconstruction [Discussion 1, 2, 3, 4 - Analysis 1, 2, 3, 4 - Statistical analysis discussion ]
- 2008: Said, Wegman et al, “Social Networks of Author–Coauthor Relationships”, Computational Statistics and Data Analysis [Discussion - Analysis - Retraction part1, part 2]
- 2009: Said and Wegman, “Roadmap for Optimization”, WIREs Comp Stat [Discussion (this piece) - Analysis]
- 2011: Wegman and Said, “Color Theory and Design”, WIREs Comp Stat [Discussion part 1, part 2 - Analysis]
It’s a dismal chronology. Yet George Mason University is stuck in a never-ending research misconduct inquiry that apparently still has not reached a recommendation to proceed to a full-blown investigation, even after 18 long months’ consideration of an overwhelming mountain of evidence. Meanwhile, Wiley is incapable of facing up to the serious problems at WIREs Comp Stat and seems prepared to let the reputation of the journal spiral downward.
Perhaps it’s time for David Scott to step up and do the right thing.