Monthly Archives: June 2011

Mining new depths in scholarship, part 1

I examine the opening chapter by Edward Wegman and Jeffrey Solka in the 2005 Handbook of Statistics: Data Mining and Data Visualization (C Rao, E Wegman and J Solka, editors).  Sections 3 (The Computer Science Roots of Data mining ), 5 (Databases), 6.2 ( Clustering) and 6.3 (Artificial Neural Networks) appear to be largely derived from unattributed antecedents; these include online tutorials and presentations on data mining, SQL and artificial neural networks, as well as Brian Everitt’s classic Cluster Analysis. All the identified passages, tables and figures were adapted from “copy-paste” material in earlier course lectures by Wegman. The introduction to Chapter 13 (on genetic algorithms) by Yasmin Said also appears to contain lightly edited material from unattributed sources, including an online FAQ on evolutionary computing and a John Holland Scientific American piece. Several errors introduced by editing and rearrangement of the  material are identified, demonstrating the authors’ lack of familiarity  with these particular subject areas. This extends a pattern of problematic scholarship previously noted in the work of Wegman and Said.

Continue reading