Understanding news story chains using information retrieval and network clustering techniques.
Content analysis of news stories (whether manual or automatic) is a cornerstone of the communication studies field. However, much research is conducted at the level of individual news articles, despite the fact that news events (especially significant ones) are frequently presented as "stories" by news outlets: chains of connected articles covering the same event from different angles. These stories are theoretically highly important in terms of increasing public recall of news items and enhancing the agenda-setting power of the press. Yet thus far, the field has lacked an efficient method for detecting groups of articles which form stories in a way that enables their analysis.
In this work, we present a novel, automated method for identifying linked news stories from within a corpus of articles. This method makes use of techniques drawn from the field of information retrieval to identify textual closeness of pairs of articles, and then clustering techniques taken from the field of network analysis to group these articles into stories. We demonstrate the application of the method to a corpus of 61,864 articles, and show how it can efficiently identify valid story clusters within the corpus. We use the results to make observations about the prevalence and dynamics of stories within the UK news media, showing that more than 50% of news production takes place within stories.
Publisher URL: http://arxiv.org/abs/1801.07988
Keeping up-to-date with research can feel impossible, with papers being published faster than you'll ever be able to read them. That's where Researcher comes in: we're simplifying discovery and making important discussions happen. With over 19,000 sources, including peer-reviewed journals, preprints, blogs, universities, podcasts and Live events across 10 research areas, you'll never miss what's important to you. It's like social media, but better. Oh, and we should mention - it's free.
Researcher displays publicly available abstracts and doesn’t host any full article content. If the content is open access, we will direct clicks from the abstracts to the publisher website and display the PDF copy on our platform. Clicks to view the full text will be directed to the publisher website, where only users with subscriptions or access through their institution are able to view the full article.