{"id":975,"date":"2011-01-20T21:33:04","date_gmt":"2011-01-20T19:33:04","guid":{"rendered":"http:\/\/www.theusrus.de\/blog\/?p=975"},"modified":"2011-01-20T21:33:04","modified_gmt":"2011-01-20T19:33:04","slug":"data-analysis-with-open-source-tools","status":"publish","type":"post","link":"https:\/\/www.theusrus.de\/blog\/data-analysis-with-open-source-tools\/","title":{"rendered":"Data Analysis of Yesteryear"},"content":{"rendered":"<p>It is not too often that a book is published that integrates data analytical methodology and the illustration of the appropriate use of specific tools. When Henk pointed me to the just released &#8220;Data Analysis with Open Source Tools&#8221; by Philipp Janert, the excitement was big, but it evaporated as soon as I read through the book.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" title=\"Data Analysis with Open Source Tools\" src=\"http:\/\/www.theusRus.de\/Blog-files\/DAwithOS.png\" alt=\"\" width=\"220\" height=\"300\" \/><\/p>\n<p>I did start to flip through the pages with Amazon Preview, and was positively surprised that Part I of the book was on &#8220;<strong>Graphics: Looking at Data<\/strong>&#8221; and the following sections were actually progressing in the dimensionality of the data looked at &#8211; nice concept, and well copied. 
The first figure, though, is a jittered dotplot &#8211; something we were doing in the 70s when we were still sending our plot commands to a pen plotter, and were trying to avoid ink-soaked holes in the paper &#8211; we should know better more than a quarter of a century later.<\/p>\n<p>It takes quite a few pages until the book hits the widely used boxplots in the section &#8220;<strong>Only when Appropriate: Summary Statistics and Box Plots<\/strong>&#8221;, and we read &#8220;<em>These summary statistics (mean and median, standard deviation, and percentiles) apply only under certain assumptions and are misleading, if not downright wrong, if those assumptions are not fulfilled.<\/em>&#8221; Well, how can a median be wrong?<\/p>\n<p>A surprising highlight can be found on page 68, where Janert absolutely hits the mark on the distinction between &#8220;<strong>Graphical Analysis and Presentation Graphics<\/strong>&#8221; &#8211; something he seems to have forgotten just 50 pages later.<\/p>\n<p>In the section on multivariate data analysis Janert talks about &#8220;<strong>Interactive Exploration<\/strong>&#8221; and writes &#8220;<em>Now I could imagine a tool that allows us to select a bin in <\/em>one<em> of the histograms and then highlights the contribution from the points in that bin in all the other histograms<\/em>&#8221;. His imagination could come true with a few clicks if he used the appropriate tools. On page 124, he throws <a href=\"http:\/\/www.ggobi.org\" target=\"_blank\">ggobi<\/a> and <a href=\"http:\/\/www.theusRus.de\/Mondrian\" target=\"_blank\">Mondrian<\/a> into the subtly named group of &#8220;<strong>Experimental Tools<\/strong>&#8221;. He claims &#8220;<em>I don&#8217;t think any of these novel plot types have been refined to a point where they are clearly useful<\/em>.&#8221; Certainly, if you do not use these (novel?) plots &#8211; btw. 
PCPs had their 25th anniversary last year and mosaic plots will celebrate their 30th anniversary this year &#8211; you won&#8217;t see their usefulness. That Janert most likely did not use Mondrian is rather apparent; otherwise he would not need to imagine a tool that links histograms.<br \/>\n<img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" title=\"Linked Histograms - more than imagination in Mondrian\" src=\"http:\/\/www.theusRus.de\/Blog-files\/LinkedHisto.gif\" alt=\"\" width=\"627\" height=\"535\" \/> The last lowlight to present here is the &#8220;histogram&#8221; in Figure 9.4 on page 202, which is &#8211; hey &#8211; just a scatterplot; they are not that hard to tell apart.<\/p>\n<p>I hate being so critical, but we should not let someone get away with a book on data analysis published in 2010 bashing what has been standard in modern, interactive, graphical data analysis for more than a decade. Who would consider using <a href=\"http:\/\/www.manning.com\/janert\/\" target=\"_blank\">Gnuplot<\/a> for graphical data analysis in 2011?<\/p>\n<p>If you answer the question above with &#8220;yes&#8221;, go buy the book &#8211; if not, save the money for a more up-to-date book.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It is not too often that a book is published that integrates data analytical methodology and the illustration of the appropriate use of specific tools. 
When Henk pointed me to the just released &#8220;Data Analysis with Open Source Tools&#8221; by Philipp Janert, the excitement was big, but it evaporated as soon as I read through [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7,1],"tags":[],"class_list":["post-975","post","type-post","status-publish","format-standard","hentry","category-books","category-general"],"_links":{"self":[{"href":"https:\/\/www.theusrus.de\/blog\/wp-json\/wp\/v2\/posts\/975","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.theusrus.de\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.theusrus.de\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.theusrus.de\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.theusrus.de\/blog\/wp-json\/wp\/v2\/comments?post=975"}],"version-history":[{"count":9,"href":"https:\/\/www.theusrus.de\/blog\/wp-json\/wp\/v2\/posts\/975\/revisions"}],"predecessor-version":[{"id":985,"href":"https:\/\/www.theusrus.de\/blog\/wp-json\/wp\/v2\/posts\/975\/revisions\/985"}],"wp:attachment":[{"href":"https:\/\/www.theusrus.de\/blog\/wp-json\/wp\/v2\/media?parent=975"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.theusrus.de\/blog\/wp-json\/wp\/v2\/categories?post=975"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.theusrus.de\/blog\/wp-json\/wp\/v2\/tags?post=975"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}