Real-time information on news sites, blogs, and social networking sites changes dynamically and spreads rapidly through the web. Developing methods to interrogate information at this scale and uncover the stories within it requires that we think about how content varies over time, how it is transmitted, and how it mutates as it spreads.
NIFTY is a system that finds mutations of a single piece of information across the daily news cycle. Building on Memetracker, the system processes 3.5 million news articles and 2 million mentioned quotes each day to find the top clusters of quotes.
The tool uses a process called incremental clustering, a novel and highly scalable means of efficiently extracting and identifying variants of a single meme.
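The general idea of incremental clustering can be illustrated with a minimal sketch. This is not NIFTY's actual algorithm, which is not specified here. It assumes a simple word-overlap (Jaccard) similarity and a fixed threshold, and the names `IncrementalClusterer`, `tokens`, and `jaccard` are hypothetical. Each incoming quote is compared against the existing clusters and either joins the closest one or starts a new cluster, so earlier quotes never need to be re-clustered.

```python
def tokens(quote):
    """Lowercase a quote and split it into a set of words."""
    return set(quote.lower().split())


def jaccard(a, b):
    """Word-overlap similarity between two token sets, in [0, 1]."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class IncrementalClusterer:
    """Toy incremental clusterer (illustrative assumption, not NIFTY's method)."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        # Each cluster keeps the token set of its first quote as a
        # representative, plus every quote assigned to it so far.
        self.clusters = []

    def add(self, quote):
        """Assign one new quote to a cluster and return that cluster's index."""
        words = tokens(quote)
        best_index, best_score = None, 0.0
        for index, cluster in enumerate(self.clusters):
            score = jaccard(words, cluster["rep"])
            if score > best_score:
                best_index, best_score = index, score
        if best_index is not None and best_score >= self.threshold:
            # Close enough to an existing cluster: treat it as a variant.
            self.clusters[best_index]["quotes"].append(quote)
            return best_index
        # Otherwise the quote starts a new cluster of its own.
        self.clusters.append({"rep": words, "quotes": [quote]})
        return len(self.clusters) - 1
```

Calling `add` once per quote as articles arrive keeps the work per quote proportional to the number of existing clusters. A system at the scale described above would presumably need indexing or hashing to avoid comparing against every cluster, which this sketch omits.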
With clusters separated into daily, weekly, monthly, and quarterly views, NIFTY provides a streamlined way to identify which phrases and quotes are making the news and how interest in stories changes over time.
The project was developed as part of the Stanford summer research internship program in Computer Science (CURIS). It was supported by several organizations and designed by Caroline Suen, Sandy Huang, and Chantat Eksombatchai, who were advised by Professor Jure Leskovec and research scientist Rok Sosic.