Enhancing algorithmic transparency to govern the power of online platforms

The algorithms that drive social media platforms are some of the most powerful on the planet, shaping what billions of people read and watch daily. Naturally, these algorithms are coming under increasing scrutiny, including from Congress, which is looking for ways to rein in that power.

On Tuesday, the Senate Judiciary Committee met to ask platform representatives pointed questions about the harmful viral content, polarization and incitement driving the algorithm-governed media world we all now inhabit.

Defending their platforms, representatives from Twitter, Facebook and YouTube coalesced around a few common lines to deflect pressure: user control and platform transparency.

YouTube’s Alexandra Veitch reminded the panel that the platform offers an on/off switch for autoplay, implying that users could simply turn off the algorithm if they thought it was causing problems. Facebook’s Monika Bickert pointed to options for switching between the “Most Recent” and “Favorites” views of the News Feed. And Twitter’s Lauren Culbertson brought up the “sparkle icon,” launched in 2018, which allows users to view an old-fashioned chronological timeline.

With platforms touting the control they offer users, you might be wondering what exactly happens when people do decide to disable the algorithms.

A new empirical study we just published provides insight into how algorithms can change what we see on social media platforms, in this case Twitter. In our study, we found that Twitter’s timeline algorithm shifted exposure to different types of content, different topics, and different accounts. In chronological timelines, 51% of tweets contained an external link, but that rate dropped to 18% in algorithmic timelines. So if you use Twitter to find news articles and other content from around the Internet, you may want to click the sparkle icon and turn off the algorithm.

Likewise, you may want to disable the algorithm if you prefer to see only tweets from accounts you actually follow. On average, we found that “suggested” tweets from accounts users did not follow made up 55% of all tweets in algorithmic timelines. This led to a significant increase in the number of distinct accounts appearing in the timeline, from an average of 663 in chronological timelines to 1,169 in algorithmic timelines. Other notable impacts of the algorithm included less exposure to tweets containing COVID-19 health information (e.g. “If you have diabetes, you’re at higher risk…”) and a slight partisan echo chamber effect: both left- and right-leaning accounts saw more tweets from like-minded accounts and fewer tweets from cross-partisan accounts.

To test Twitter’s algorithm, we had to create a group of “sock puppet” accounts: bots that mimicked the following patterns of real users and checked their timelines twice a day. Each time, they collected the top 50 tweets in their algorithmic “Top Tweets” timeline and the top 50 tweets in their chronological “Latest Tweets” timeline. We used this data to characterize how much Twitter’s algorithm changed the content users were exposed to.
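To make that characterization concrete, here is a minimal sketch, in Python, of how exposure metrics like the ones reported above (the share of tweets with an external link, the share of “suggested” tweets from non-followed accounts, and the number of distinct accounts) could be computed from a collected timeline snapshot. The Tweet record and its fields are hypothetical stand-ins for whatever the actual collection pipeline stores, not the study’s real data format.

```python
# Sketch: summarize what a single timeline snapshot exposes an account to.
# The record fields (author, external_urls, author_followed) are invented
# for illustration.
from dataclasses import dataclass, field


@dataclass
class Tweet:
    author: str                                          # handle that posted the tweet
    external_urls: list = field(default_factory=list)    # links pointing off-platform
    author_followed: bool = True                          # does the sock puppet follow this account?


def exposure_metrics(tweets):
    """Compute simple exposure metrics for one timeline snapshot."""
    n = len(tweets)
    with_link = sum(1 for t in tweets if t.external_urls)
    suggested = sum(1 for t in tweets if not t.author_followed)
    return {
        "pct_with_external_link": 100 * with_link / n if n else 0.0,
        "pct_suggested": 100 * suggested / n if n else 0.0,
        "distinct_accounts": len({t.author for t in tweets}),
    }


# Example: compare a chronological snapshot against an algorithmic one.
chronological = [Tweet("news_outlet", ["https://example.com/story"]),
                 Tweet("friend_account")]
algorithmic = [Tweet("friend_account"),
               Tweet("viral_account", author_followed=False)]

print(exposure_metrics(chronological))
print(exposure_metrics(algorithmic))
```

Running the same summary over paired “Top Tweets” and “Latest Tweets” snapshots, collected at the same moment, gives the kind of side-by-side comparison described above.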

Which brings us to the second argument platforms often fall back on: transparency. But if the platforms are so transparent, why do academic researchers like us have to set up painstaking experiments to test their algorithms from the outside?

To their credit, platforms have generally come a long way in offering transparency about their algorithms and content moderation. But there is still a huge gap: social media platforms still don’t provide meaningful transparency into how their distribution algorithms shape the texture of the content people see. In some ways, the emphasis on user control quietly shifts responsibility onto end users. As Sen. Jon Ossoff (D-Ga.) noted, the power dynamics are steeply tilted in the platforms’ favor.

Platforms know that the real lever of power is not a sparkle icon or an on/off switch, but the algorithm itself. This is why Facebook and Twitter regularly announce structural changes to their algorithms: last week, Facebook demoted hateful content in anticipation of the verdict in Derek Chauvin’s trial. Similarly, Twitter disabled many algorithmic features ahead of the U.S. presidential election.

The platforms are to be commended for their transparency in these situations, but the public still needs more information. Three easy wins would involve transparency around (1) algorithmic exposure metrics, (2) user-control metrics, and (3) on-platform optimization.

Regarding algorithmic exposure metrics, Sen. Chris Coons (D-Del.) noted that YouTube already collects metrics on how many times its algorithm recommends videos. YouTube and other platforms could share these statistics with the public (especially for videos that violate content policies). In the past, some platforms have claimed that sharing these metrics could violate user privacy, but Facebook itself has demonstrated workarounds that protect user privacy while providing transparency and supporting academic research.

Additionally, if platforms point to user controls as a meaningful way to mitigate algorithmic harms, they should share metrics about those controls. Our study provides only a narrow snapshot of what’s going on, but the platforms have extensive data with which to be transparent, in aggregate, about how their algorithms shape exposure to different types of content. How many users actually flip the on/off switch? What percentage of the time users spend on the platform is spent in algorithmic feeds? In 2018, for example, YouTube Chief Product Officer Neal Mohan said the recommendation algorithm drove 70% of the time users spent on the platform. Has the new autoplay switch changed that?
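As an illustration of what such reporting could look like, here is a hypothetical sketch of how a platform might compute two of those numbers from its own session logs. The log format (user_id, seconds, algorithmic flag) is invented for illustration and is not any platform’s actual data model.

```python
# Sketch: user-control metrics a platform could publish, computed from
# invented session logs (one row per session).
from collections import defaultdict

sessions = [
    {"user_id": "u1", "seconds": 600, "algorithmic": True},
    {"user_id": "u1", "seconds": 120, "algorithmic": False},
    {"user_id": "u2", "seconds": 300, "algorithmic": True},
]

time_by_mode = defaultdict(int)          # seconds spent in each feed mode
users_seen, chrono_users = set(), set()  # all users vs. users who switched
for s in sessions:
    time_by_mode[s["algorithmic"]] += s["seconds"]
    users_seen.add(s["user_id"])
    if not s["algorithmic"]:
        chrono_users.add(s["user_id"])

total_seconds = sum(time_by_mode.values())
print(f"{100 * time_by_mode[True] / total_seconds:.1f}% of time spent in algorithmic feeds")
print(f"{100 * len(chrono_users) / len(users_seen):.1f}% of users ever used the chronological feed")
```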

Finally, platforms could offer more transparency on on-platform optimization. There are many potential reasons why users in our study saw fewer external links. For example, Twitter may be responding to bots that share large numbers of links. Another common theory, which came up repeatedly during the Senate hearing, is that platforms elevate content that keeps users on the platform, and thereby demote content that would take them off of it. For example, we know that Facebook uses a metric called “L6/7,” the percentage of users who have logged in on six of the last seven days. What other specific targets do platforms use for measurement and optimization?
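For concreteness, here is a hypothetical sketch of how a target like L6/7 could be computed from login logs. The data and the threshold (at least six of the last seven days) are assumptions for illustration, not Facebook’s actual implementation.

```python
# Sketch: compute an L6/7-style engagement metric from invented login logs.
from datetime import date, timedelta

today = date(2021, 4, 30)
last_seven = {today - timedelta(days=i) for i in range(7)}  # the 7-day window

# Invented data: user -> set of days on which that user logged in.
login_days = {
    "u1": {today - timedelta(days=i) for i in range(6)},  # active 6 of 7 days
    "u2": {today, today - timedelta(days=3)},             # active 2 of 7 days
    "u3": {today - timedelta(days=i) for i in range(7)},  # active 7 of 7 days
}

# Count users active on at least six of the last seven days (an assumed
# reading of the threshold).
qualifying = sum(1 for days in login_days.values() if len(days & last_seven) >= 6)
print(f"L6/7 = {100 * qualifying / len(login_days):.1f}%")
```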

While more drastic policy proposals may also be on the table, we believe that transparency can be an effective lever for governing platform algorithms. Algorithmic transparency policies should be developed in collaboration with the rich community of grassroots organizations and researchers who study these platforms and their associated harms, including those who study misinformation, content moderation, and the algorithms themselves. Policies should also focus on end-user outcomes: the ways algorithms can affect us as individuals, as communities, and as a society.

Jack Bandy is a Ph.D. candidate in Northwestern University’s Technology and Social Behavior program, where he conducts research in the lab of Nicholas Diakopoulos. Follow him on Twitter @jackbandy. Nicholas Diakopoulos is an associate professor of communication studies and computer science at Northwestern University, where he directs the Computational Journalism Lab. He is the author of “Automating the News: How Algorithms Are Rewriting the Media.” Follow him on Twitter @ndiakopoulos.