Posts Tagged ‘search logs’

Expectation Maximization applied to a new sample of 100,000 sessions

In a previous post I discussed some initial investigations into the use of unsupervised learning techniques (i.e. clustering) to identify usage patterns in web search logs. As you may recall, we had some initial success in finding interesting patterns of user behaviour in the AOL log, but when we tried to extend this and replicate a previous study of the Excite log, things started to go somewhat awry. In this post, we investigate these issues, present the results of a revised procedure, and reflect on what they tell us about searcher behaviour.

EM, 7 features

As I mentioned in a previous post, I’ve recently been looking into the challenges of search log analysis, and in particular the prospects for deriving a ‘taxonomy of search sessions’. The idea is that if we can find distinct, repeatable patterns of behaviour in search logs, then we can use these to better understand user needs and therefore deliver a more effective user experience.

We’re not the first to attempt this, of course – in fact, the whole area of search log analysis has an academic literature which extends back at least a couple of decades. And it is quite topical right now, with both Elasticsearch and LucidWorks releasing their own logfile analysis tools (ELK and SiLK respectively). So in this post I’ll be discussing some of the challenges in our own work and sharing some of the initial findings.

Over the last few months I have been working with Paul Clough and Elaine Toms of Sheffield University on a Google-funded project called ‘A Taxonomy of Search Sessions’. A session, in case you’re wondering, is defined as a period of continued usage between a user and a search application. So if you spend a while Googling for holiday destinations, that’s a session. Sessions are interesting because they form a convenient unit of interaction with which to study usage patterns, and these can provide insights that drive improved design and functionality.
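
To make the notion of a session a little more concrete, here is a minimal sketch of how a raw query log might be cut into sessions. Note that the definition above doesn't say where one session ends and the next begins, so the 30-minute inactivity threshold used here is purely an assumption (albeit a common convention in log analysis), and the function name, tuple layout and sample queries are all hypothetical rather than anything taken from the project itself.

```python
from datetime import datetime, timedelta
from itertools import groupby

# Assumed inactivity threshold: a gap longer than this starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)


def split_into_sessions(events, timeout=SESSION_TIMEOUT):
    """Group (user_id, timestamp, query) tuples into per-user sessions.

    Returns a list of sessions, each a list of events from a single user
    in which no two consecutive events are more than `timeout` apart.
    """
    sessions = []
    # Sort by user, then by time, so each user's events are contiguous.
    ordered = sorted(events, key=lambda e: (e[0], e[1]))
    for _, user_events in groupby(ordered, key=lambda e: e[0]):
        current = []
        for event in user_events:
            # Close the current session if the user was idle for too long.
            if current and event[1] - current[-1][1] > timeout:
                sessions.append(current)
                current = []
            current.append(event)
        sessions.append(current)
    return sessions


# Hypothetical log entries, for illustration only.
log = [
    ("u1", datetime(2014, 1, 1, 9, 0), "holidays greece"),
    ("u1", datetime(2014, 1, 1, 9, 5), "cheap flights athens"),
    ("u1", datetime(2014, 1, 1, 14, 0), "weather athens"),
    ("u2", datetime(2014, 1, 1, 9, 2), "python tutorial"),
]
sessions = split_into_sessions(log)
for session in sessions:
    print(session[0][0], [query for _, _, query in session])
```

With this toy log, the first user's two morning queries fall into one session and the afternoon query into another. A time-based cut-off is only one possible choice; boundaries could equally be drawn using topic shifts or explicit log-in events.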
