Sentiment analysis – analysis

Last night’s supper was brilliant for a host of reasons, not least the food served at the awesome Rules restaurant in London. I was there with, amongst others, Chris Condron, a wonderful man I met through Young Rewired State. Conversation ranged from the antics of Edward VII and Lillie Langtry to sentiment analysis (the former I am comfortable with; the latter fascinated me).

Anyone working in the world of digital media is used to the feeling of playing catchy uppy, adopting the look of the slightly baffled whilst trying desperately to keep up and learn. That was me last night.

Today I hounded Chris for an explanation of sentiment analysis, and he gave me the following:

Crudely, semantic analysis gives you a non-statistical (unlike search engines) sense of what something (say, an article) is about.

Sentiment analysis uses semantic analysis techniques to measure that against a set of known criteria, eg is a text pro or anti something?

It’s already being used in the financial world. It could be a really cool tool (especially when run across live data [such as Twitter] rather than flat text articles) for brand management. Marketers can use it to test in real time the public’s reaction to a product launch.

He then rapidly handed me over to his much-heralded colleague, Dr Jarred McGinnis, who added:

It’s a computer that analyses text for keywords and phrases and determines the positive or negative sentiment of the story. For example, “Paddington Bear sucks” would probably be determined to be negative, whereas the statement “Paddington Bear is a hero” would be positive.

The technology is not very accurate but still useful. One example of its use is to monitor mainstream and social media for negative or positive trends with respect to your company or one of its products.

Now, I am rapidly becoming a huge fan of championing the ability of talented people whilst genuflecting to the power of the computer. As the work in my field converges ever more on information, data and ontologies, so my respect for the statisticians and analysts grows, as does my understanding of the limits of computers and the limits of humans. I am not sure whether to proudly embrace my ever-increasing knowledge of librarian skills and my understanding of the importance of cataloguing languages (Dublin Core and the like), or to run away. What I do know is that it is increasingly important to make sure that human involvement in the digital revolution is carefully balanced with the awesome power of the computer.

So we come to sentiment analysis. From my own homework on this tonight, I understand that it is essentially an ontology of words or phrases that are each assigned a positive or negative association. Using this as a framework, you can throw a whole load of content at this wall of good and bad and have it sorted into positive and negative, using the brilliant processing power of the computer.
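For the curious, here is a minimal sketch in Python of how I imagine that wall of good and bad might work. It is purely illustrative: the tiny word list, the scores and the function names are all my own assumptions, not how Newssift or any real product does it.

```python
import re

# Hypothetical mini-lexicon (an assumption for illustration only):
# each word maps to a score, positive above zero, negative below zero.
LEXICON = {
    "hero": 1, "brilliant": 1, "awesome": 1, "wonderful": 1,
    "sucks": -1, "awful": -1, "terrible": -1,
}


def score(text):
    """Add up the lexicon scores of every word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)


def label(text):
    """Turn the numeric score into a positive/negative/neutral label."""
    total = score(text)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


def sort_by_sentiment(texts):
    """Throw a pile of texts at the lexicon and bucket them by label."""
    buckets = {"positive": [], "negative": [], "neutral": []}
    for text in texts:
        buckets[label(text)].append(text)
    return buckets


if __name__ == "__main__":
    for sentence in ("Paddington Bear sucks", "Paddington Bear is a hero"):
        print(sentence, "->", label(sentence))
```

Run on Dr J's two examples, this labels the first negative and the second positive. It also shows why the approach is crude: a simple word count like this takes no notice of negation, so "Paddington Bear is not a hero" would still come out as positive.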

To give you an example, Dr J showed me http://www.newssift.com/index.jsp. Using the search box, I can put in a topic. The resulting page gives me bucket loads of information: the graph on the top left is the sentiment analysis, some useful MIS and source material is included, and the main centre panel shows the search results being analysed. I won’t do it for you; you go and play.

However, what has kept me most intrigued is the semantic search bit (by semantic I mean refined, associated search). Once you have run your initial search, the results page lets you add search terms to refine the results, and gives you ever more detailed information.

Now, I don’t know how you would use this, and I would add a note of caution: this is just data being thrown at a pretty brutal analysis tool of positive and negative feeling (something a computer can only do by matching a catalogue of good/bad feeling words against online content). But it is the first step I have seen towards automatically gauging the mood of the nation on any given topic.

Please do let me know of any other tools that have refined this further (don’t google it, I already have!), and please do let me know your thoughts on this. I will certainly be playing about a bit more with this stuff.