Posts Tagged ‘natural language processing’

I am delighted to announce a new release of 2Dsearch, with various improvements including integrations with two of the world’s biggest patent databases: Google Patents and Lens Patents. What this means is that you can now run a single visual search across 14 different sources and benefit from automated translations for many more. With these additions, you can now use 2Dsearch for patent searching and competitive intelligence, with free starter examples for each.

We’ve lots more planned for the next release, so if you’d like to help shape this or have comments or suggestions then do let us know. We’d be delighted to hear from you.

Read Full Post »

An early Xmas present maybe…? I am currently hiring for the following position. The official job ad has yet to go live, hence the TBDs below, but I am happy to share further details with interested parties. If you know of anyone suitable, please do encourage them to get in touch ASAP:

Goldsmiths, University of London has a vacancy for the following part-time (0.4 FTE) position:

Research Associate in the field of natural language processing 

This role is part of a strategic research project that aims to apply techniques from corpus linguistics and natural language processing (NLP) to capture and analyse the “public conversation” around the ongoing energy crisis. The energy crisis has become a nexus for important social debates around pivotal issues such as climate change, as well as the distribution of economic burdens, equality, risk and sustainability. A study of public debates around the energy crisis can help us understand the complex intersection of discourses around fossil fuels, climate change, sustainability and social justice.
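
To give a flavour of the kind of analysis involved, here is a minimal, purely illustrative sketch of corpus-style keyword and collocation counting. It is not part of the project or the job advert: the toy corpus, stopword list and thresholds below are all invented, and the real project would use much larger corpora and more sophisticated NLP tooling.

```python
# Illustrative corpus-linguistics sketch: keyword and bigram counts over a tiny
# invented corpus. Not taken from the project described above.
import re
from collections import Counter

corpus = [
    "Energy bills are rising and the burden falls unevenly across households",
    "The energy crisis is accelerating the debate about climate change",
    "Fossil fuels versus renewables: who pays for the transition?",
    "Rising energy bills make sustainability feel like a luxury for many",
]

stopwords = {"the", "and", "is", "are", "a", "for", "about", "across", "who", "like", "many"}

def tokenize(text):
    """Lower-case, split on non-letters, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in stopwords]

docs = [tokenize(doc) for doc in corpus]

# Simple keyword frequencies across the whole corpus.
tokens = [tok for doc in docs for tok in doc]
print(Counter(tokens).most_common(5))

# Adjacent-word pairs within each document, as a crude stand-in for collocation analysis.
bigrams = Counter(pair for doc in docs for pair in zip(doc, doc[1:]))
print(bigrams.most_common(5))
```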

(more…)

Read Full Post »

Delighted to announce a new release of 2Dsearch, with various improvements including new preference options for snap-to-grid, advanced query parsing and default databases. We’ve also integrated the latest version of PolyGlot and fixed the query parser bug that so many of you reported.

What this means is that you can now use 2Dsearch to execute a single visual search across a dozen different databases and benefit from automated translations for many more. We’ve also made further improvements to the user experience and now provide example searches for each of the databases.
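
To illustrate the general idea of “automated translations” — one query expressed once, then rendered in each database’s own syntax — here is a toy sketch. It is emphatically not 2Dsearch’s implementation: the query representation is invented, and the PubMed-style and Scopus-style field syntax shown is only approximate.

```python
# Toy sketch of "one query, many database syntaxes" -- NOT 2Dsearch's code.
# A query is either a plain string (a search term) or a tuple (operator, children).

def to_pubmed(node):
    """Render the query tree using approximate PubMed title/abstract syntax."""
    if isinstance(node, str):
        return f'"{node}"[tiab]'
    op, children = node
    return "(" + f" {op} ".join(to_pubmed(child) for child in children) + ")"

def to_scopus(node):
    """Render the same tree as an approximate Scopus TITLE-ABS-KEY expression."""
    if isinstance(node, str):
        return f'"{node}"'
    op, children = node
    return "(" + f" {op} ".join(to_scopus(child) for child in children) + ")"

query = ("AND", [("OR", ["information seeking", "search behaviour"]), "systematic review"])

print(to_pubmed(query))
print("TITLE-ABS-KEY" + to_scopus(query))
```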

We’ve lots more planned for the next release, so if you’d like to help shape this or have comments or suggestions then please let us know.

Read Full Post »

Spring is traditionally a time of new beginnings, so I am delighted this week to announce a new release of 2Dsearch. This release contains a variety of bug fixes and improvements, including support for two new databases: ACM Guide to Computing Literature and IDEAS (the largest bibliographic database dedicated to Economics available freely on the Internet). We’ve also improved our search report generation tool and now offer query statistics to help you refine those all-important search strategies.

What all this means is that you can now use 2Dsearch to search visually across 12 different databases and use automated translations for 8 more. We’ve also made further improvements to the user experience and now provide example searches for each of the 12 databases.

We’ve lots more planned for the next release, so if you’d like to help shape this and/or have comments or suggestions then do let us know. We’d be delighted to hear from you!

Read Full Post »

Earlier this week I gave a talk called “Introduction to NLP” as part of a class I am currently teaching at the University of Notre Dame. This is an update of a talk I originally gave in 2010, whilst working for Endeca. I had intended to make a wholesale update to all the slides, but noticed that one of them was worth keeping verbatim: a snapshot of the state of the art back then (see slide 38). Less than a decade has passed since then (that’s a short time to me 🙂), but there are some interesting and noticeable changes. For example, there is no mention of word2vec, GloVe or fastText, or of any of the neurally-inspired distributed representations and frameworks that are now so popular (let alone BERT, ELMo and the latest wave). There is also no mention of sentiment analysis: maybe that was an oversight on my part, but I rather think that what we now perceive as a commodity technology was just not sufficiently mainstream back then.
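
For anyone who wants a concrete sense of the “distributed representations” mentioned above, here is a minimal word2vec sketch using gensim (assuming gensim 4.x is installed). It is not taken from the talk or the slides, and the toy corpus is far too small to learn anything meaningful — it only shows the shape of the idea: each word becomes a dense vector, and similarity is computed between vectors.

```python
# Minimal word2vec sketch with gensim (assumes gensim >= 4.0). Illustrative only.
from gensim.models import Word2Vec

# Toy corpus: each "sentence" is a list of tokens. Real models need far more text.
sentences = [
    ["natural", "language", "processing", "is", "fun"],
    ["word", "embeddings", "map", "words", "to", "vectors"],
    ["similar", "words", "get", "similar", "vectors"],
    ["language", "models", "learn", "from", "text"],
]

model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=50, seed=42)

# Each word is now a dense 50-dimensional vector.
vec = model.wv["language"]
print(vec.shape)  # (50,)

# Nearest neighbours by cosine similarity (meaningless on a corpus this small).
print(model.wv.most_similar("language", topn=3))
```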

(more…)

Read Full Post »

When I started the London Text Analytics meetup group some seven years ago, ‘text analytics’ was a term used by few, and understood by even fewer. Apart from a handful of enthusiasts and academics (who preferred the label of “natural language processing” anyway), the field was either overlooked or ignored by most people. Even the advent of “big data” – of which the vast majority was unstructured – did little to change perceptions.

But now, in these days of chatbot-fuelled AI mania, it seems everyone wants to be part of the action. The commercialisation and democratisation of hitherto academic subjects such as AI and machine learning have highlighted a need for practical skills that focus explicitly on the management of unstructured data. Career opportunities have inevitably followed, with job adverts now calling directly for skills in natural language processing and text mining. So the publication of Tom Reamy’s book “Deep Text: Using Text Analytics to Conquer Information Overload, Get Real Value from Social Media, and Add Big(ger) Text to Big Data” is indeed well timed.

(more…)

Read Full Post »

After a brief hiatus, I’m pleased to say that we will shortly be relaunching the London Text Analytics meetup. As many of you know, in the recent past we have organised some relatively large and ambitious events at a variety of locations. But we have struggled to find a regular venue, and as a result have had difficulty in maintaining a scheduled programme of events.

What we really need is a venue we can use on a more regular schedule, ideally on an ex-gratia basis. It doesn’t have to be huge – in fact, a programme of smaller (but more frequent) meetups is in many ways preferable to a handful of big gatherings.

(more…)

Read Full Post »

I received a pleasant surprise in the post today: my personal copy of Text Mining and Visualization: Case Studies Using Open-Source Tools, edited by Markus Hofmann and Andrew Chisholm. Now I don’t normally blog about books, since as editor of Informer there was a time when I would be sent all manner of titles for inspection and review. But I’ll make an exception here, partly because Chapter 7 (on mining search logs) is my own contribution, as discussed in my earlier blog posts. It is complemented by 11 other chapters, covering a variety of topics organised into four sections:

(more…)

Read Full Post »

Here’s a sample of some of the things we’re working on at UXLabs this year, neatly packaged into Masters-level ‘internships’. I use quotes there because, although it’s a convenient term used by many of my academic colleagues, these opportunities are (a) unpaid and (b) remote (i.e. hosted by your own institution). So maybe ‘co-supervised MSc projects initiated by a commercial partner’ is a more accurate term… Anyway, what we offer is support, expertise, co-supervision and access to real-world data and challenges. If you are interested in working with us on the challenges below, get in touch. (more…)

Read Full Post »

A short while ago I posted the slides to Despo Georgiou’s talk at the London Text Analytics meetup on Sentiment analysis: a comparison of four tools. Despo completed an internship at UXLabs in 2013–14, and I’m pleased to say that the paper we wrote documenting that work is due to be presented and published at the Science and Information Conference 2015 in London. The paper is co-authored with my IRSG colleague Andy MacFarlane and is available as a PDF, with the abstract appended below.

As always, comments and feedback welcome 🙂

ABSTRACT

Sentiment analysis is an emerging discipline with many analytical tools available. This project aimed to examine a number of tools regarding their suitability for healthcare data. A comparison between commercial and non-commercial tools was made using responses from an online survey which evaluated design changes made to a clinical information service. The commercial tools were Semantria and TheySay and the non-commercial tools were WEKA and Google Prediction API. Different approaches were followed for each tool to determine the polarity of each response (i.e. positive, negative or neutral). Overall, the non-commercial tools outperformed their commercial counterparts. However, due to the different features offered by the tools, specific recommendations are made for each. In addition, single-sentence responses were tested in isolation to determine the extent to which they more clearly express a single polarity. Further work can be done to establish the relationship between single-sentence responses and the sentiment they express.
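
For readers who want a feel for the underlying task, here is a minimal sketch of three-class polarity classification (positive / negative / neutral) in scikit-learn. It is a stand-in for illustration only — the paper evaluated Semantria, TheySay, WEKA and the Google Prediction API, not this pipeline — and the training examples below are invented rather than taken from the survey data.

```python
# Illustrative three-class polarity classifier. A scikit-learn stand-in, not the
# tools evaluated in the paper; all example texts are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "The new layout is much clearer and easier to navigate",
    "I really like the redesigned search page",
    "The changes make it harder to find what I need",
    "The new design is confusing and slow",
    "I use the site to look up drug interactions",
    "The page loads and shows the usual information",
]
train_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

# Bag-of-words (with bigrams) features followed by a linear classifier.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression(max_iter=1000))
clf.fit(train_texts, train_labels)

print(clf.predict(["The redesign is a big improvement",
                   "I could not find the guideline I wanted"]))
```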

(more…)

Read Full Post »
