
Facebook’s ‘open’ move into the data mining space

It’s been interesting to read many people describe the recent Facebook announcements (including today’s) as “Facebook opening up”. That is true, and they should be congratulated for it, but there are bigger reasons for doing so than the ‘pure altruism’ some people have suggested.

It seems pretty clear to me that Facebook’s business model is shifting towards one of data mining and analytics – where they are able to leverage the collective thinking of everyone contributing their ‘stuff’ into the Facebook bucket.

Let’s take a quick look at the theme of Facebook’s recent announcements:

  • early Feb: Terms of Service changed to give FB perpetual right to keep all data you give them (later repealed due to public outcry)
  • Feb 19: Commenting on public pages with FB Connect
  • Mar 4: New Publisher (Twitter-like) and Highlighter (content ranking) functionality

Let’s take a quick look at what those announcements gave us:

First off were the ToS changes – which for me were a clear indicator that Facebook wanted to do more with the data it holds than just display it to your friends and use it to recommend other content you might be interested in. If Facebook is going to move into a data play then it needs to make sure it can retain all of that data, regardless of what the user might want to do with their view of it. It becomes tricky to have to remove arbitrary data from the cube because a user requests it, plus it devalues your model – and why would you want your model devalued?

OK, so they backed off from those sweeping changes, but only because of the fallout they created for the company. By then, they had already partially shown their hand.

In addition to the data Facebook keeps inside its database, there is also the metadata that Facebook can gather about what’s going on outside its domain – and that’s where functionality like commenting on external pages, released at the Facebook garage, comes into play. Putting JavaScript calls on foreign pages also allows Facebook to match up visitors with a Facebook cookie and track their usage of that site, even if they never interact with any Facebook-powered functionality.
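To make the cookie-matching point concrete, here is a minimal sketch of what the server behind an embedded widget could do with each script request. It is an illustration only, not Facebook’s actual implementation: the cookie name `fb_user`, the function names and the in-memory log are all made up, and the sketch assumes the browser sends both a cookie for the widget’s domain and a `Referer` header naming the embedding page.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical in-memory log: visitor cookie id -> external hosts they were seen on.
visits_by_user = defaultdict(list)


def parse_cookie_header(value):
    """Parse a Cookie header such as 'a=1; b=2' into {'a': '1', 'b': '2'}."""
    pairs = (item.split("=", 1) for item in value.split(";") if "=" in item)
    return {name.strip(): val.strip() for name, val in pairs}


def handle_widget_request(headers):
    """Simulate the widget host receiving a script request from an embedding page.

    `headers` is a plain dict of HTTP request headers. The cookie name
    `fb_user` is an assumption made for this illustration.
    """
    cookies = parse_cookie_header(headers.get("Cookie", ""))
    user_id = cookies.get("fb_user")
    referer = headers.get("Referer")
    if user_id is None or referer is None:
        return None  # no known visitor or no referring page: nothing to link
    host = urlparse(referer).netloc
    visits_by_user[user_id].append(host)  # the visit is recorded without any click
    return host
```

The point of the sketch is that the visitor never has to touch the widget: loading the page is enough for the request, and therefore the cookie, to reach the widget’s host.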

Today’s announcement of the Publisher functionality builds on the rudimentary Twitter-like functionality with status requests that we’d begun to see with the Facebook comment boxes used during the Presidential Inauguration and, more recently, the live streaming of Demo 09. Highlighter further aids the recommendation and collaborative filtering of content by peers in order to work out what is currently most interesting and most engaged with. Facebook calls the subset of your friends’ output that you can see your “social lens”. This is true, but at the macro level of the system, Facebook ends up with a complete lens of what everyone is filtering, sorting and ranking.

So where is this all going?

Facebook is moving into a new gear, encouraging a constant flow of status updates and conscious thought (Publisher, status messages), and creating deeper indicators of intent and interest (Highlighter, ‘like’ functionality, etc) as well as behavioral indicators (integration with location-based services such as Brightkite, events, etc).

What this gives Facebook is the ability to gauge what is hot, popular and current in real time. It also gives Facebook historical data to track changing interest and attention over time. There are many uses for this data, including in the financial and trading sector, brand management, competitor analysis and real-time consumer attention tracking.
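As a rough illustration of how ‘hot right now’ can be separated from ‘always popular’, the sketch below compares each topic’s activity in a recent window against everything seen before it. The function name, the one-hour window and the scoring formula are assumptions chosen for simplicity; nothing here is based on how Facebook actually ranks content.

```python
from collections import Counter


def trending_topics(events, now, window=3600, top_n=3):
    """Rank topics by how much their recent activity exceeds their baseline.

    `events` is an iterable of (timestamp_seconds, topic) pairs.
    """
    recent = Counter()   # mentions inside the last `window` seconds
    earlier = Counter()  # mentions before that window (the historical baseline)
    for ts, topic in events:
        if now - window < ts <= now:
            recent[topic] += 1
        elif ts <= now - window:
            earlier[topic] += 1
    # Add one to the baseline so brand new topics do not divide by zero.
    scores = {t: recent[t] / (earlier[t] + 1) for t in recent}
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]
```

A topic with a long history needs a much bigger burst to rank highly than one that has only just appeared, which is the sort of signal the historical data makes possible. The baseline here is a raw count rather than a rate, a deliberate simplification.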

Twitter is also doing this, but they have one dimension of data (text). Facebook has many dimensions of data that can go into their cube, and their sample size is much higher given their 175 million users vs Twitter’s 4-6 million.

I spent a lot of time working with MySpace last year, and one of the things that impressed me the most was their ability to monetize their pages with advertisements – ones that used a combination of technology (for user targeting) and business development (for high-yielding ‘takeover pages’, sponsorships, promo tie-ins, etc). They’re probably the best in the business at it.

However, advertising on its own is a Web2.0 business model, and while I don’t want to go so far as to say data mining is going to be the Web3.0 business model, I do think we’re going to see greater use of it moving forward – with the industries that can benefit from it becoming a lot more receptive to and engaged with the process, in the same way that digital agencies became popular as advertising moved into the online space.

Risks for the ecosystem

The benefit of being ‘open’ and part of the ecosystem is that everyone gets to play and share, and new 3rd party innovation and business can be created with it. While this is true, those 3rd party participants in that ecosystem need to be careful not to lose sight of their own ability to achieve commercial success. All of these announcements have included new ways to leverage the Facebook APIs to help users shovel more stuff into the Facebook bucket. Those ‘spades’ must be clear about how they will make money, given that they will not have access to the data or the ability to monetize it like Facebook will.

I’m not trying to be bearish on the Facebook API or platform – far from it. I merely wish to offer a sense of perspective and to urge developers to consider carefully the business models of everyone within the stack they are participating in. There is opportunity and success in here for everyone, but we must all be cognizant of where it lies and to what extent each level in the stack is able to capitalize upon it.

Published in News Thoughts and Rants