I received a pleasant surprise in the post today: my personal copy of Text Mining and Visualization: Case Studies Using Open-Source Tools, edited by Markus Hofmann and Andrew Chisholm. Now, I don’t normally blog about books, since as editor of Informer I used to be sent all manner of titles for inspection and review. But I’ll make an exception here, partly because Chapter 7 is my own contribution (on mining search logs), as discussed in my earlier blog posts. It is complemented by 11 other chapters, covering a variety of topics organised into four sections:
RapidMiner
- RapidMiner for Text Analytic Fundamentals (John Ryan)
- Empirical Zipf-Mandelbrot Variation for Sequential Windows within Documents (Andrew Chisholm)

KNIME
- Introduction to the KNIME Text Processing Extension (Kilian Thiel)
- Social Media Analysis — Text Mining Meets Network Mining (Kilian Thiel, Tobias Kötter, Rosaria Silipo, and Phil Winters)

Python
- Mining Unstructured User Reviews with Python (Brian Carter)
- Sentiment Classification and Visualization of Product Review Data (Alexander Piazza and Pavlina Davcheva)
- Mining Search Logs for Usage Patterns (Tony Russell-Rose and Paul Clough)
- Temporally Aware Online News Mining and Visualization with Python (Kyle Goslin)
- Text Classification Using Python (David Colton)

R
- Sentiment Analysis of Stock Market Behavior from Twitter Using the R Tool (Nuno Oliveira, Paulo Cortez, and Nelson Areal)
- Topic Modeling (Patrick Buckley)
- Empirical Analysis of the Stack Overflow Tags Network (Christos Iraklis Tsatsoulis)

But what makes this book really interesting and valuable is the way it combines theory with practice: each chapter is written by respected scholars who understand their discipline but present it through a series of use cases (i.e. practical examples) built on open-source data and tools. Moreover, all the examples are available for download from a supplementary website.
It’s a very attractive formula: the reader can dip in and out of topics, take the ideas they find interesting, then implement or extend them knowing that all the tooling, code and data are freely available. A simple idea of course, but one to which many other titles pay only lip service. From first-hand experience, I can vouch for the enthusiasm with which Markus and Andrew pursued this ideal, and I think their efforts have paid off handsomely. I for one look forward to getting the best out of this book.