Last week I had the privilege of organising the 13th meeting of the London Text Analytics group, which featured two excellent speakers: Despo Georgiou of Atos SE and Diana Maynard of Sheffield University. Despo’s talk described her internship at UXLabs, where she compared a number of tools for analysing free-text survey responses (namely TheySay, Semantria, Google Prediction API and Weka). Diana’s talk focused on sentiment analysis applied to social media, and entertained an audience of more than 70 with all manner of insights drawn from her experience of working on the topic for longer than just about anyone I know. Well done to both speakers!

Just a quick reminder that next Friday (1st August) is the deadline for submissions to EuroHCIR 2014, which I am co-organising with Max Wilson, Birger Larsen, Preben Hansen and Kristian Norling. This is the fourth year we’ve run the event, so we’re hoping for a good set of submissions to keep up momentum. As before, we’re accepting both research and practice-oriented papers, so if you have any queries (particularly about the latter) just drop me a line.

The event itself is on 13 September at BCS London, with a poster session/social scheduled for the evening before. I’ve appended a summary of the call for papers below, and further details can be found at the EuroHCIR 2014 website. Hope to see you there!

Expectation Maximization applied to a new sample of 100,000 sessions

In a previous post I discussed some initial investigations into the use of unsupervised learning techniques (i.e. clustering) to identify usage patterns in web search logs. As you may recall, we had some initial success in finding interesting patterns of user behaviour in the AOL log, but when we tried to extend this and replicate a previous study of the Excite log, things started to go somewhat awry. In this post, we investigate these issues, present the results of a revised procedure, and reflect on what they tell us about searcher behaviour.
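For readers unfamiliar with the technique, here is a minimal sketch of Expectation Maximization used as a clustering method. It is not the procedure used in the study: it fits a two-component Gaussian mixture to a single made-up feature (queries per session), whereas real sessions would be described by several features and analysed with an established toolkit. The function names, the toy data and the initialisation scheme are all assumptions chosen for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_1d(values, k=2, iterations=50):
    """Fit a k-component 1-D Gaussian mixture by EM (illustrative only).

    Returns (weights, means, variances, labels), where labels[i] is the
    index of the most probable component for values[i].
    """
    ordered = sorted(values)
    n = len(values)
    # Deterministic initialisation (an assumption): spread the starting
    # means across the sorted data rather than sampling at random.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    mean_all = sum(values) / n
    var_all = sum((x - mean_all) ** 2 for x in values) / n or 1.0
    variances = [var_all] * k
    weights = [1.0 / k] * k

    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            dens = [w * gaussian_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, values)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsed component

    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return weights, means, variances, labels


# Hypothetical data: number of queries issued in each of ten sessions.
queries_per_session = [1, 1, 2, 1, 2, 1, 9, 11, 10, 12]
weights, means, variances, labels = em_1d(queries_per_session)
print(labels)
print([round(m, 2) for m in means])
```

On this toy input the short sessions and the long sessions end up in separate components. Unlike k-means, EM gives each session a probability of belonging to each cluster, which is useful when behaviour patterns overlap.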

AIIM: Enterprise Search

I am in the process of creating a bibliography / list of key books & papers on Enterprise Search. I’m ideally looking for peer-reviewed, published works, from either a practitioner or researcher perspective, but am also interested in well-regarded online resources. Here are a few that immediately spring to mind:

  • Martin White, Enterprise Search. O’Reilly Media, 2012
  • Ricardo Baeza-Yates, Berthier Ribeiro-Neto, Modern Information Retrieval. Addison Wesley, 2010. (See Chapter 15)
  • Ron Miller, Unlock the Power of Enterprise Search. EContent Magazine, 2008.
  • David Hawking, Challenges in Enterprise Search, Proceedings of the Australasian Database Conference, 2004.
  • R. Mukherjee and J. Mao, Enterprise search: Tough stuff. Queue, 2(2), 2004.

And there are various online resources, e.g.

In case you missed it last time (since it filled up pretty quickly), there’s another chance to catch my faceted search tutorial in London on May 14. I’ll be presenting a full day course called Search Usability: Filters and Facets, which provides detailed coverage of the key topics along with a variety of new practicals and group exercises.

It’s also very competitively priced from just £180 per person – contrast that with a rate of ~£550 a day for this comparable offering!

For further details and registration, see the UKeIG website. In the meantime, I’ve appended further details below.

Hope to see you there!

EM, 7 features

As I mentioned in a previous post, I’ve recently been looking into the challenges of search log analysis, and in particular the prospects for deriving a ‘taxonomy of search sessions’. The idea is that if we can find distinct, repeatable patterns of behaviour in search logs, then we can use these to better understand user needs and therefore deliver a more effective user experience.

We’re not the first to attempt this of course – in fact the whole area of search log analysis has an academic literature which extends back at least a couple of decades. And it is quite topical right now, with both Elasticsearch and LucidWorks releasing their own logfile analysis tools (ELK and SiLK respectively). So in this post I’ll be discussing some of the challenges in our own work and sharing some of the initial findings.
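To make the idea of a search session concrete, here is a rough sketch of how raw log rows might be split into sessions and summarised as features before any pattern-finding is attempted. The row layout `(user_id, timestamp, query)`, the 30-minute inactivity threshold and the four features are assumptions for illustration, not the definitions used in our own work.

```python
from datetime import datetime, timedelta
from itertools import groupby
from operator import itemgetter

# Assumed inactivity threshold; 30 minutes is a common convention,
# not a value taken from this post.
SESSION_TIMEOUT = timedelta(minutes=30)


def split_sessions(rows, timeout=SESSION_TIMEOUT):
    """Group (user_id, timestamp, query) rows into per-user sessions.

    A new session starts whenever the gap between consecutive queries
    from the same user exceeds `timeout`.
    """
    sessions = []
    ordered = sorted(rows, key=itemgetter(0, 1))
    for _, user_rows in groupby(ordered, key=itemgetter(0)):
        current = []
        for row in user_rows:
            if current and row[1] - current[-1][1] > timeout:
                sessions.append(current)
                current = []
            current.append(row)
        sessions.append(current)
    return sessions


def session_features(session):
    """Summarise one session as a small dict of illustrative features."""
    queries = [query for _, _, query in session]
    duration = (session[-1][1] - session[0][1]).total_seconds()
    terms = sum(len(query.split()) for query in queries)
    return {
        "query_count": len(queries),
        "unique_queries": len(set(queries)),
        "mean_terms_per_query": terms / len(queries),
        "duration_seconds": duration,
    }


# Hypothetical log rows.
log = [
    ("u1", datetime(2014, 3, 1, 9, 0), "faceted search"),
    ("u1", datetime(2014, 3, 1, 9, 5), "faceted search examples"),
    ("u1", datetime(2014, 3, 1, 14, 0), "ecir 2014"),
    ("u2", datetime(2014, 3, 1, 10, 0), "enterprise search"),
]
features = [session_features(s) for s in split_sessions(log)]
for row in features:
    print(row)
```

Each session becomes one row of numbers, which is the form a clustering algorithm needs. The choice of timeout and features matters a great deal in practice, since different choices can produce quite different groupings.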

No April fools from me today but instead just a quick heads up that on Sunday April 13 I’ll be presenting a tutorial at ECIR 2014 in Amsterdam called Designing Search Usability. This is part of a programme of tutorials offered that day, so there are lots to choose from.

The course represents a wholesale revision of my original tutorial, updated to accommodate new concepts and exercises drawn from the book “Designing the Search Experience: the Information Architecture of Discovery”, published by Morgan Kaufmann in December 2012.

For further details and registration, see the ECIR 2014 website. In the meantime, I’ve appended further details below.

Hope to see you there!

