If you closely follow technology news, TechMeme has got to be on your daily reading list. Indimeme is a similar effort to build a meme tracker for the Indian blogosphere. The 'hot' stories from Indian blogs are presented in a semi-clustered way. The site is a project by Raj @Teknobites and currently covers 100 Indian blogs, with plans to include more later.
The site is a good start but differs from Techmeme in several respects. The biggest difference is that Techmeme follows links and trackbacks from blogs and uses this link structure to build its clusters. Indimeme, meanwhile, seems to read the blog feeds and use text summarization to build the clusters. This method does not always produce good-quality clusters, and a lot comes down to how effectively the text is summarized.
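To make the link-based idea concrete, here is a minimal sketch in Python. It is not Techmeme's actual algorithm, which isn't public; it only illustrates the general approach of grouping posts by the stories they link to. The post names, the URLs, and the `cluster_by_links` helper are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: each post maps to the URLs it links to.
posts = {
    "blog-a/post-1": ["http://example.com/story-x"],
    "blog-b/post-7": ["http://example.com/story-x", "http://example.com/story-y"],
    "blog-c/post-3": ["http://example.com/story-y"],
    "blog-d/post-2": ["http://example.com/story-z"],
}


def cluster_by_links(posts, min_size=2):
    """Group posts under each URL they cite.

    A URL becomes a cluster only if at least `min_size` posts link to it.
    """
    citing = defaultdict(list)
    for post, links in posts.items():
        for url in set(links):  # count each URL once per post
            citing[url].append(post)

    clusters = {
        url: sorted(members)
        for url, members in citing.items()
        if len(members) >= min_size
    }
    # Most-cited stories first; ties are broken by URL for a stable order.
    return sorted(clusters.items(), key=lambda kv: (-len(kv[1]), kv[0]))


clusters = cluster_by_links(posts)
for url, members in clusters:
    print(url, members)
```

A story that only one post links to (story-z here) never forms a cluster, which is roughly why link structure tends to surface only what several blogs are talking about.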
It isn't very difficult to build a site that uses document clustering. Using open source software like Carrot2 and Nutch, you can put together a similar site as well. It's the quality of the clusters and the fine-tuning that actually matter. If document clustering sounds interesting to you, see my earlier post on document clustering algorithms and related technology.
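For a feel of what text-based clustering involves, here is a toy sketch in Python. It is not what Indimeme or Carrot2 actually do; it assumes a plain bag-of-words model, cosine similarity, and a greedy single-pass grouping, and the sample summaries and the `threshold` value are invented for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def greedy_cluster(texts, threshold=0.3):
    """Assign each text to the first cluster whose seed text is similar enough.

    Returns clusters as lists of indices into `texts`.
    """
    vectors = [vectorize(t) for t in texts]
    clusters = []
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            # Compare against the cluster's first member (its seed).
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical feed summaries.
summaries = [
    "New budget phone launched in India",
    "India gets a new budget phone launch",
    "Cricket team wins the test series",
]

print(greedy_cluster(summaries))                 # loose threshold
print(greedy_cluster(summaries, threshold=0.9))  # strict threshold
```

With the loose threshold the two phone stories land in one cluster; with the strict one every summary ends up alone. That sensitivity to a single parameter is the fine-tuning problem in miniature.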


















