The Challenge of Sentiment Detection
Last week Factiva announced a new corporate reputation analysis tool, its second effort in this market. In spring 2004, Factiva joined with IBM's WebSphere group to focus on corporate reputation management. At the time, WebSphere was attempting to merge IBM's hodgepodge of text analytics tools with its broad infrastructure, developing a massive new platform for unstructured data. IBM has since adapted its model, using WebSphere as a platform for other analytics applications (ClearForest, Cognos, Attensity and others), while this aspect of the WebSphere-Factiva partnership has gone quiet.
Factiva, however, has stepped forth with a new initiative in this market. The goal of reading blogs, websites, message boards and the like, and giving companies an early view of customer sentiment, is a good one. At the same time, the challenge will be tremendous. I have been involved in a few projects designed to identify customer sentiment from text, and my experience has been that this is not something easily delivered. Extracting facts and events from published text is difficult; distinguishing positive from negative sentiment in informal text such as chat and newsgroups will be daunting. The lack of sentence structure, the poor grammar and the misspellings are bound to create challenges for NLP applications (just ask your PC to interpret the :( symbol).
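To make the emoticon and misspelling problem concrete, here is a minimal sketch of a word-list sentiment scorer. It is purely illustrative and says nothing about how Factiva's tool works; the word list, the emoticon table and the function names are all made up for this example.

```python
import re

# Hypothetical toy lexicon; a real system would need a far larger one.
LEXICON = {"great": 1, "love": 1, "good": 1, "terrible": -1, "hate": -1, "bad": -1}

# Hypothetical emoticon table, matched as whole whitespace-delimited tokens.
EMOTICONS = {":)": 1, ":-)": 1, ":(": -1, ":-(": -1}


def naive_score(text):
    """Sum lexicon scores over alphabetic word tokens only.

    Punctuation is discarded by the tokenizer, so emoticons are never seen.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)


def emoticon_aware_score(text):
    """Add emoticon scores on top of the naive word score."""
    score = naive_score(text)
    for token in text.split():
        score += EMOTICONS.get(token, 0)
    return score


if __name__ == "__main__":
    for post in ["new phone arrived today :(", "I luv this product"]:
        print(post, naive_score(post), emoticon_aware_score(post))
```

The first post carries its entire opinion in the emoticon, so the naive scorer sees nothing. The second shows that handling emoticons is not enough: "luv" is missing from the word list, so both scorers still report a neutral post.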
Factiva has overcome its biggest limitation, access to content, by partnering with Intelliseek. While Factiva's own content is a major component, the real value of this system is its ability to assess the informal "buzz" from websites, blogs and newsgroups. That may enable the system to serve as an early warning indicator, able to sense market opinions in advance of more formal channels.
Factiva's approach, which combines automation with domain experts, promises to be the biggest push in this market to date. That said, I believe it will be important for early adopters to manage their expectations, as automated tools can still only do so much in this area. Regardless, I hope that early adopters embrace these initial capabilities so that the efforts continue.