The Ubiquity of Data-mining
I always make an effort to meet people in a variety of fields (avoiding homophily, the subject of a later post). When I was writing “Programming Collective Intelligence”, the book was usually one of the main things on my mind, so it came up in conversation with such people fairly often. Since at the time I thought of mining social data as a pretty geeky topic, I was often pleasantly surprised at how excited people got about the subject, and how many felt it related to their professional interests.
Here are a few examples:
- A political consultant explained the importance of making projections about what is likely to change people’s minds based on data about where particular types of political campaigns were run in the past and how effective they were.
- An advertiser believed that advertising would be moved forward by the ability to quickly analyze the buzz response to particular campaigns.
- An investor in a community site for medical professionals told me how they could use information automatically gleaned from behavior and postings to learn about potential drug safety issues.
- A former trader at an entertainment hedge-fund said that such funds were interested in how to more quickly gather data from people’s online behavior about which movies and shows were being watched and discussed.
- A buyer for Banana Republic told me that they are interested in using collected data from a variety of sources to make predictions about what the popular trends will be. Apparently they’ve found traditional trend-watchers to be inadequate.
This was quite heartening to me. I’ve long felt that a cursory review of a small subset of the data, a simplified narrative and some overrated gut instinct passed for analysis a little too often.
Of course, the standard “Wisdom of Crowds” saw still applies: groups of people working together are much better at solving estimation problems than they are at solving judgment or creative problems. One could not expect an original fashion concept to be developed by mining social data. However, the trend-watchers for Banana Republic aren’t tasked with creating original concepts. Their job is to watch what’s happening at the higher end and what people are buying now, and to make predictions by aggregating this information themselves.
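To make the estimation point concrete, here is a minimal sketch of why aggregation helps with that kind of problem. Everything in it is an assumption for illustration: the guesses are simulated with a seeded random generator (they are not real data), the "true value" is made up, and `crowd_vs_individual` is a hypothetical helper name. It compares the error of the averaged guess with the average error of the individual guesses.

```python
import random
import statistics


def crowd_vs_individual(true_value, guesses):
    """Return (crowd_error, mean_individual_error) for a list of numeric guesses."""
    # The "crowd" estimate here is simply the mean of all guesses.
    crowd_estimate = statistics.mean(guesses)
    crowd_error = abs(crowd_estimate - true_value)
    # How far off a single guesser is, averaged over all guessers.
    individual_errors = [abs(g - true_value) for g in guesses]
    mean_individual_error = statistics.mean(individual_errors)
    return crowd_error, mean_individual_error


# Simulated, noisy guesses around a made-up true value (illustrative only).
rng = random.Random(0)
true_value = 1000.0
guesses = [rng.gauss(true_value, 200.0) for _ in range(500)]

crowd_error, individual_error = crowd_vs_individual(true_value, guesses)
print(f"crowd error: {crowd_error:.1f}")
print(f"mean individual error: {individual_error:.1f}")
```

The averaged guess can never be further from the truth than the average individual is, and when the individual errors are independent and unbiased (as this simulation assumes) it is usually much closer. Nothing comparable exists for judgment or creative problems, where there is no single number to average.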
In the end, I believe that almost any problem of significance involves estimation, judgment and creativity. The real solutions will come from algorithms that aggregate social data and build white-box models and visualizations, which people can then use to weave a sensible narrative from a combination of correctly synthesized information and trained instinct. I also believe that the possibilities in this area are only just starting to be explored.