Tuesday, May 30, 2006

Internet mental health index

From a pure geek perspective, this is one of the coolest things I've seen on the internet lately. In the project's own words:
Since August 2005, We Feel Fine has been harvesting human feelings from a large number of weblogs. Blog data comes from a variety of online sources, including LiveJournal, MSN Spaces, MySpace, Blogger, Flickr, Technorati, Feedster, Ice Rocket, and Google. Every few minutes, the system searches the world's newly posted blog entries for occurrences of the phrases "I feel" and "I am feeling". When it finds such a phrase, it records the full sentence, up to the period, and identifies the "feeling" expressed in that sentence (e.g. sad, happy, depressed, etc.). Because blogs are structured in largely standard ways, the age, gender, and geographical location of the author can often be extracted and saved along with the sentence, as can the local weather conditions at the time the sentence was written. All of this information is saved.
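
For the curious, here is a rough Python sketch of the phrase-matching step described above. It is not the project's actual code: the tiny `FEELINGS` list, the function name `harvest_feelings`, and the naive sentence splitting are all my own assumptions, and the real system presumably works from a much longer list of feelings.

```python
import re

# Hypothetical, tiny feeling list; the real project's list is not given here.
FEELINGS = {"sad", "happy", "depressed", "fine"}

# The two trigger phrases mentioned in the description.
TRIGGER = re.compile(r"\bI (?:feel|am feeling)\b", re.IGNORECASE)


def harvest_feelings(text):
    """Return (sentence, feeling) pairs for sentences containing a trigger phrase."""
    results = []
    # Naive sentence split: cut after '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        match = TRIGGER.search(sentence)
        if not match:
            continue
        # Take the first known feeling word that appears after the trigger phrase.
        words = re.findall(r"[a-z']+", sentence[match.end():].lower())
        feeling = next((w for w in words if w in FEELINGS), None)
        results.append((sentence, feeling))
    return results
```

Sentences that contain a trigger phrase but no recognized feeling word come back with `None` as the feeling, so nothing is silently dropped.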

Pretty neat technology, eh? The We Feel Fine project is a reminder that there are likely plenty of other people around the world who feel the same way you do, something often forgotten in monocultures and in the regional and demographic sub-cultures of multi-cultural populations. If urban Americans and rural Chinese share some of the same experiences and emotions, maybe that means that on the micro-societal level the goths and the jocks really aren't that different underneath the eyeliner and jerseys? We could have cats and dogs living together before you know it as we break down the stereotypes that separate us and start building relationships on our shared human experience!

Unfortunately, before we all break out into song and belt out a verse of "I'd Like to Teach the World to Sing", there is inevitably a downside to this. I'm a bit freaked out by the privacy aspect of such a massive data collection endeavor. Again, in the project's own words:
Because a high percentage of all blogs are hosted by one of several large blogging companies (Blogger, MySpace, MSN Spaces, LiveJournal, etc), the URL format of many blog posts can be used to extract the username of the post's author. Given the author's username, we can automatically traverse the given blogging site to find that user's profile page. From the profile page, we can often extract the age, gender, country, state, and city of the blog's owner. Given the country, state, and city, we can then retrieve the local weather conditions for that city at the time the post was written. We extract and save as much of this information as we can, along with the post.
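
To show how little effort the first step takes, here is a minimal Python sketch of pulling a username out of a post URL. The host patterns in `SUBDOMAIN_HOSTS` are my assumption (blogs served as `username.host`), the function name `username_from_url` is made up for illustration, and the example URLs in the usage note are invented; the project does not say which URL formats it actually parses.

```python
from urllib.parse import urlparse

# Assumed hosts where the username is the subdomain, e.g. someuser.livejournal.com.
SUBDOMAIN_HOSTS = ("livejournal.com", "blogspot.com")


def username_from_url(url):
    """Return the username encoded in a blog URL's subdomain, or None."""
    host = urlparse(url).hostname or ""
    for base in SUBDOMAIN_HOSTS:
        suffix = "." + base
        if host.endswith(suffix):
            name = host[: -len(suffix)]
            # Skip the bare site itself and any nested subdomains.
            if name and name != "www" and "." not in name:
                return name
    return None
```

With these assumptions, `username_from_url("http://exampleuser.livejournal.com/12345.html")` returns `"exampleuser"`, while a URL on an unrecognized host returns `None`. Everything after that (profile page, location, weather) is just more lookups keyed on that one string.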

They also save any photo associated with the post. I'm suddenly having second (well, okay, third) thoughts about blogging at all. If nothing else, now I've got an F word I'll do my best to avoid using here in the future.

