Tuesday, May 11, 2010

The Rise of Social Data Mining (as a Business Model)

Companies are discovering how to monetize social network data.  This is driving Big Open Science.  Is that a good thing?

A social network for sharing illness data, patientslikeme.com, has demonstrated that it can tap the information in its user network to predict the outcome of clinical drug trials.  The service, which is populated by a large number of ALS sufferers, determined that lithium use had no effect on the late-stage decline in ALS patients.  Why is this significant? Because it took 18 months before a formal study was able to confirm exactly the same thing.

While clearly not yet a replacement for the clinical trial process, the findings do reinforce the concept of Big Open Science - the use of large data sets to conduct a rougher, more rapid form of science.

The financial model is clever and solid:
"We take the information patients share about their experience with the disease, and sell it in a de-identified, aggregated and individual format to our partners (i.e., companies that are developing or selling products to patients). These products may include drugs, devices, equipment, insurance, and medical services.  We do not rent, sell or share personally identifiable information for marketing purposes or without explicit consent.  Because we believe in transparency, we tell our members exactly what we do and do not do with their data."
So long as the data remains totally secure, it sure reads like a win-win to me.  I can see many quantified health start-ups adopting or moving towards this model.

But, aside from the health data, it's not really all that new.

Focus groups and stock markets have been around for hundreds of years.  More recently, Hollywood Stock Exchange (HSX), a movie performance predictor site oft cited by collective intelligence researchers, has clearly demonstrated its ability to forecast box office revenues via crowdsourcing.  And, of course, don't forget the banks, credit card companies and info aggregators that can already "predict with 95% certainty that you will get a divorce, two years before it happens, based on your purchases", as Google's Marissa Mayer famously pointed out on Charlie Rose.

In the coming years we can expect this sort of model to proliferate.  Trends like cheaper data storage, smaller sensing devices, widening bandwidth, exponentially faster computing and emergent social behavior suggest that more companies will be able to mine more valuable data from more willing participants and sell it to more interested parties.  Barring an all-out privacy backlash, it's a relatively safe bet that the broader market will create the conditions necessary for more social data mining startups and operations of this kind.

The new opportunities are seemingly endless:
  • health & medicine sites - like patientslikeme or curetogether (shout out to Alexandra Carmichael)
  • location video - streaming and stored on YouTube
  • driving information - gathered through your car
  • smartphone apps - more complex data capture, reality mining
  • genome - companies like 23andMe
  • etcetera!
At the same time, consider how certain large companies can leverage Big Open Science (they're already doing it for market research, but could easily broaden these efforts):
  • Search: Google, Bing, Yahoo Search
  • Social: Facebook, MySpace, Twitter
  • Gaming: Sony, Microsoft (Xbox), Nintendo, Apple
  • Smartphones: Apple, Microsoft, HTC
Privacy concerns aside (for the moment), there's such an abundance of untapped informational value that it's easy to envision a world in which total productivity grows by leaps and bounds, roughly as the square of the acceleration in the technology, data and communications space.  Wired writer and Quantified Self blogger Gary Wolf echoed that sentiment at a recent Stanford MediaX seminar. (When I asked him whether he believed that quantification was directly related to Kurzweil's Law of Accelerating Returns, he thought about it for a moment, then said "yes".)

Now, will these quantification-driven economic gains trickle down to the average person?  Yes, I do think they will.  First, in the form of accelerated science.  Second, it seems likely that the increasingly abundant services and social networks in the space will be forced by the market to return more and more value to the users, or prosumers, who are contributing this data - an effect that I have playfully nicknamed The Mandate of Kevin.

But will these gains come online fast enough to offset the disruptive forces of globalization, production automation, a large-scale privacy backlash or the resulting social turmoil?  That's hard to say, because we humans have never experienced such convergence before.  However, it is becoming more and more clear that traditional economics will drive Big Open Science and that this behavior is a thread interwoven with other accelerators.  Hopefully that will turn out to be a good thing.