Recent weeks have brought a deluge of new findings about the digital media space, crowned by the Pew Research Journalism Project’s 2014 State of the News Media report.
The American Press Institute also issued an important new report, “The Personal News Cycle,” which finds that demographics matter less in terms of news-seeking and “that some long-held beliefs about people relying on just a few primary sources for news are now obsolete.” Meanwhile, Columbia Journalism School’s Tow Center for Digital Journalism has released a new report by Nicholas Diakopoulos, “Algorithmic Accountability Reporting: On the Investigation of Black Boxes,” as well as findings from Anna Hiatt’s “Future of Digital Longform Project.” But the insights don’t end there. From gatekeeping debates to filter bubbles to viral content, answers are flowing in from many corners of the research world, as you’ll see below.
(Note: This article was first posted at Nieman Journalism Lab, as part of an ongoing collaboration.)
Drawing two weeks of data from the New York Times’ and the Guardian’s APIs, as well as the APIs of various social media platforms, Bastos set out to answer an important question: How much overlap is there between what editors choose to focus on and what social media users grab on to? This revitalizes an old debate over editorial judgment, gatekeeping and norms of “newsworthiness.” The author looks at about 16,000 articles on the news sites from late 2012; he also analyzes article links circulated on Facebook, Twitter, Pinterest, Google+, Delicious and StumbleUpon. Some raw data findings relating to links posted on social media prove interesting: Times articles earned an average of 39 retweets on Twitter and 445 shares on Facebook, while Guardian articles saw an average of 50 retweets and 190 Facebook shares.
Bastos concludes: “The results show that social media users express a preference for a subset of content and information that is at odds with the decisions of newspaper editors regarding which topic to emphasize.” Social media users tend to favor hard news over soft news, especially on Twitter. Only a quarter of the Times sports articles studied, for example, ever showed up on Twitter or Facebook. Likewise, news editors’ preferences for more articles about the economy do not track with social media users’ apparent preferences. Further, Bastos says, “although most news sections are uniformly and symmetrically distributed across newspapers and social networking sites, we found remarkable differences on the number of news items about arts, science, technology and opinion pieces, which are on average more frequent on social networking sites than on newspapers.” The variation may be partly explained by the more urban, educated and youthful characteristics of social media users, the study notes.
A noteworthy contribution to the “filter bubble” debate, the research (one of the “largest studies of online news consumption to date”) suggests that overall ideological segregation attributable to online media channels and personalization is rather limited. Flaxman, Goel and Rao analyze a dataset of the nearly complete (and anonymized) web browsing habits of 1.2 million Internet users over a three-month period, some 2.3 billion pageviews. The researchers use machine learning tools to assess the ideology of the persons studied, looking at county-level voting patterns and demographics.
For descriptive news articles accessed through social media, the level of ideological segregation is “marginally higher” than for those read by visiting a news site directly. The pattern is “more pronounced” for opinion pieces, and there is a higher degree of segregation in web search, roughly the “ideological distance between the centrist Yahoo! News and the left-leaning Huffington Post (or equivalently, CNN and the right-leaning National Review).” Flaxman, Goel and Rao conclude that a “relatively small amount of online news consumption is driven by the polarizing social and search channels, and opinion pieces which are typically the focus of laboratory studies constitute just 6% of articles relating to world or national news…. [W]e find that individuals typically consume descriptive reporting, and do so by directly visiting a handful of their preferred news outlets.” Thus, while it is true social channels and search lead to segregation and filter bubbles, people are not primarily getting their news through those channels, and the “overall impact of these factors appears to be limited at this time.”
Related: For a more precise, quantitative sense of how much Google actually filters results, see “Personalization of Web Search,” a 2013 paper by a team at Northeastern University. That study finds that on average “11.7% of search results show differences due to personalization.”
“Can Cascades Be Predicted?” From Facebook, Stanford University and Cornell University. By Justin Cheng, Lada A. Adamic, P. Alex Dow, Jon Kleinberg and Jure Leskovec.
A group of in-house data scientists at Facebook and select academic partners are increasingly sharing publicly some insights from the holy grail of network data. This paper looks at how information cascades unfold and whether they can be predicted. It analyzes about 151,000 photos uploaded to Facebook and shared 9.2 million times in June 2013. The network scientists try to figure out if it is possible to predict a viral cascade (multiple generations of peer-to-peer sharing, originating from a single “seed” or node). They work backwards, doing detective work to try to pick out viral signatures. Tentative findings include: Viral cascades typically start fast; early reproduction speed, the initial velocity across the network, seems to be a key marker. Further, as a photo spreads, who posted it originally matters less (the variable diminishes in importance), as does the actual kind of content, though captions do seem to be important. Having a broader first generation of resharers also counts. Because the researchers could see the same photos being uploaded by different persons, they also noted that a photo was more likely to go viral in its earliest uploads than in later ones.
Related: Another hugely important paper in this area is “The Structural Virality of Online Diffusion,” by Sharad Goel, Jake Hofman and Duncan Watts of Microsoft Research and Ashton Anderson of Stanford. They analyze 1 billion links (news, images, videos, petitions) shared on Twitter. One out of every 3,000 links produced a “large event,” or a sharing phenomenon that reached 100 additional persons beyond the seed node; but truly viral events (many multiple generations of sharing, several thousand adoptions at least) occurred only about once in a million instances. The researchers finally define what it is to be a viral event: There is an average of at least 10 nodes between any points on the entire network graph, suggesting the content has genuinely traveled far by virtue of grassroots peer-to-peer sharing, not just a big broadcast.
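The idea of measuring how far content travels between people, rather than how many people it reaches, can be illustrated with a small calculation. The sketch below is only an illustration, not the researchers' code: it assumes a cascade is recorded as a list of (sharer, resharer) pairs that form a single connected tree, and it computes the average shortest-path distance between every pair of accounts in that tree. The function name `average_pairwise_distance` and the two toy cascades are hypothetical.

```python
from collections import deque


def average_pairwise_distance(edges):
    """Mean shortest-path length over all pairs of nodes in a cascade tree.

    `edges` is a list of (sharer, resharer) pairs; the cascade is assumed
    to be one connected tree (an assumption of this sketch).
    """
    # Build an undirected adjacency map from the share pairs.
    neighbors = {}
    for sharer, resharer in edges:
        neighbors.setdefault(sharer, set()).add(resharer)
        neighbors.setdefault(resharer, set()).add(sharer)

    n = len(neighbors)
    if n < 2:
        raise ValueError("need at least two nodes to measure distance")

    total = 0
    for source in neighbors:
        # Breadth-first search gives hop counts from `source` to every node.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())

    # Each unordered pair was counted twice, once from each endpoint,
    # so divide by the number of ordered pairs.
    return total / (n * (n - 1))


# Toy cascade 1: one account broadcasts to four followers.
broadcast = [("seed", "a"), ("seed", "b"), ("seed", "c"), ("seed", "d")]

# Toy cascade 2: the same five accounts, passed along person to person.
chain = [("seed", "a"), ("a", "b"), ("b", "c"), ("c", "d")]

print(average_pairwise_distance(broadcast))
print(average_pairwise_distance(chain))
```

With the same five accounts, the one-to-many broadcast scores 1.6 and the person-to-person chain scores 2.0: the deeper the peer-to-peer spread, the higher the average distance, even when total reach is identical.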
A hugely insightful new report. The data that Mitchell, Jurkowitz and Olmstead analyze suggest that those accessing media through social media channels do not spend much time with news content, and news is consumed mostly “incidentally” on social platforms. Some of their salient conclusions are: “Among users coming to these news sites through a desktop or laptop computer, direct visitors spend, on average, 4 minutes and 36 seconds per visit. That is roughly three times as long as those who wind up on a news media website through a search engine (1 minute, 42 seconds) or from Facebook (1 minute, 41 seconds). Direct visitors also view roughly five times as many pages per month (24.8 on average) as those coming via Facebook referrals (4.2 pages) or through search engines (4.9 pages). And they visit a site three times as often (10.9) as Facebook and search visitors.” Pew breaks out some of the other top insights here.
Related: This all feeds into a larger recent conversation, joined by Chartbeat’s Tony Haile and by others at Upworthy, about the relative importance of social sharing and the need to measure quality engagement in new ways, perhaps through “attention minutes.” Meanwhile, a new report covering January 2014 by analytics platform Parse.ly suggests that Facebook is driving a growing share of traffic to news sites (26% in that period), while Google’s share of referrals to news sites is dropping (38%).
The study looks at the mix of sources Andy Carvin used during his social media-focused reporting for NPR. Hermida, Lewis and Zamith examine the mix of “elite” sources and “alternative voices” in a dataset of 60,000 tweets during 2010-11; they plug this data into a wider debate over how the new network ecosystem is changing the mix of media voices and sources. The researchers conclude that “nonaffiliated activists accounted for the greatest single share of tweet mentions, overall (35.3%) and for Egypt (37.5%).” However, “in the overall population of individual sources, mainstream media employees accounted for the largest group by far (26.7%).” This general mix of evidence, the study concludes, suggests a “new paradigm of sourcing at play.”
Ananny argues that the contemporary social media policies of some news organizations still fit into an age-old “defensive” and “conservative” pattern of distancing media members and institutions from their audiences and mitigating risks. The audience is seen in utilitarian terms — as a way of generating traffic or merely producing more efficient sourcing. Thus, old gatekeeping customs emerge in new clothes.
This giant scholarly roundup and synthesis — 115 studies, from around the globe, across many election cycles — finds that research typically falls into one of three categories: “the use of Twitter by politicians and campaigners, the use of Twitter by publics in election and issue campaigns and the use of Twitter by various users to comment on mediated campaign events — such as televised debates, party conventions or election day coverage.” Jungherr concludes that despite the somewhat haphazard and emerging nature of the field, there are some “stable findings” being arrived at. For example, “candidates belonging to opposition parties take more frequently to Twitter than candidates from parties in government.”
This study examines what kinds and amounts of health information people publicly disclose on Twitter, and compares this with search engine use for health-related inquiries. It turns out people share a lot of health information publicly on Twitter, although “high-stigma” conditions are not as frequently shared. Based on this evidence, the researchers hypothesize some needed digital innovations: “New kinds of health information search systems may be built that support standing queries over search and/or social media to keep users apprised of new developments related to different common health concerns, since seeking new research about conditions and diversity of health content were the goals of many respondents.”
This “Big Data” study (38 million tweets analyzed) looks at how Republicans and Democrats operated differently on Twitter and how they responded to different forms of media — both “vertical,” or traditional media, and “horizontal,” or niche forms of media that target like-minded communities. The researchers conclude that “although vertical media could best predict Obama supporters’ behaviors on Twitter, the Republican horizontal media offered the greatest predictor power in explaining Romney supporters’ network agenda.”
An ethnographic look at the 2012 DNC and media produced there, this paper provides some interesting insights into how conventions — now so ritualized and scripted that journalists find them impossible to cover — can actually empower attendees as “active spectators.” Social media at a convention now allow non-elite participants opportunities for “public critique and accountability over both political and journalistic actors.”
This five-year case study on Britain’s Northern Echo newspaper shows how technological adoption in a media organization is not “unidirectional”: rather, it is “neither smooth nor uniform and is marked by uncertainty as to which of several actions is rational. Doubt is spread throughout the hierarchy.” The observations will be familiar to many in news organizations, but for scholars it’s another good data point showing a complex transition to the web.