Ethics statement
Throughout this study, we took care to follow relevant ethical standards. The use of sock puppet accounts is an established research technique for investigating personalization and bias on internet platforms12,24,45, provided it is strictly for noncommercial, public-interest purposes and does not compromise user privacy. Previous work and legal precedent have indicated that the Terms of Service of social media platforms, which may prohibit automated access to content, do not necessarily conflict with collecting publicly available data for research aims46,47. Given our focus on potential impacts on democratic processes and political discourse, this project falls under well-recognized academic exceptions for studying online information ecosystems. Moreover, we minimized the risk of privacy breaches or commercial harm by restricting our data collection to publicly available content. Although the experiment itself did not include any human participants, we obtained informed consent from the human annotators for our LLM classification validation tasks.
The survey component of this study was deemed exempt by the authors’ institutional review board (IRB protocol no. HRPP-2025-69).
Experimental setup
In this experiment, we aimed to measure the rate at which videos of a given political leaning appear in TikTok’s recommendations, in the context of US politics. To do so, we use bots that simulate TikTok users by first watching predefined sequences of videos of a given political leaning (the ‘conditioning’ stage) and then watching the videos recommended on the bot’s ‘For You’ page (the ‘recommendation’ stage). Each experimental run, comprising these two stages, lasts 1 week.
Over 27 weeks, 567 experiments were conducted. Specifically, each week, 21 new TikTok accounts are created by randomly combining the most common American first and last names from ref. 48 and assigning an age between 22 years and 24 years. This allowed each account to impersonate a potential voter in the US presidential election who is likely to be active on TikTok. The age range was chosen deliberately to standardize the age of the user across experimental conditions. It was also guided by data suggesting that the 18–24 years age bracket had the largest share of TikTok users in the USA as of 2022 (35%), although the older 25–34 years age bracket occupies the largest share as of late 2024 (ref. 49). Each of the 21 accounts created each week is assigned to one of seven experimental conditions, which are defined by two attributes. The first attribute is the state to which the bot is manually geo-located, which is either New York, Texas or Georgia (Georgia was widely regarded as a key swing state for the 2024 US presidential election, with a narrow 0.23% margin in favour of Joe Biden in the 2020 election). We chose New York, Texas and Georgia because, on the basis of the 2020 presidential results, they serve as clear prototypes of a reliably Democratic state, a reliably Republican state and a competitive swing state, respectively, in the 2024 electoral landscape. Practical constraints also shaped this choice, as only these three locations were simultaneously available and stable across our VPN infrastructure. Among the feasible options, they maximized geographic and partisan diversity, but we caution that our findings should not be presumed to generalize beyond these states. The second attribute is the political leaning of the videos the bot watches in the conditioning stage.
The videos watched in the conditioning stage are published by either known Democrat-supporting channels or Republican-supporting channels. Finally, each week, three bots geo-located to Georgia bypass the conditioning stage of the experiment and move directly to the recommendation stage. This is done to collect recommendations made to users who do not have a particular interest in politics. A summary of the experimental conditions is provided in Supplementary Table 2.
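The account-creation step described above can be illustrated with a minimal Python sketch. The name pools below are short placeholders (the actual lists come from ref. 48), the `Profile` type and `make_profiles` function are hypothetical names introduced here, and the age bounds are assumed to be inclusive.

```python
import random
from dataclasses import dataclass

# Placeholder name pools: the study drew common American names from ref. 48.
# These short lists are illustrative assumptions, not the actual source data.
FIRST_NAMES = ["James", "Mary", "Michael", "Patricia", "Robert", "Jennifer"]
LAST_NAMES = ["Smith", "Johnson", "Williams", "Brown", "Jones", "Garcia"]

ACCOUNTS_PER_WEEK = 21
MIN_AGE, MAX_AGE = 22, 24  # assumed inclusive on both ends


@dataclass(frozen=True)
class Profile:
    first_name: str
    last_name: str
    age: int


def make_profiles(n, rng):
    """Draw n profiles by randomly pairing first and last names and sampling an age."""
    return [
        Profile(
            first_name=rng.choice(FIRST_NAMES),
            last_name=rng.choice(LAST_NAMES),
            age=rng.randint(MIN_AGE, MAX_AGE),  # randint includes both bounds
        )
        for _ in range(n)
    ]


# A seeded generator makes one week's batch reproducible.
weekly_profiles = make_profiles(ACCOUNTS_PER_WEEK, random.Random(0))
```

Passing an explicit `random.Random` instance keeps each weekly batch reproducible without touching the global random state.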
Supplementary Fig. 4 shows a more detailed timeline of a bot during a given experimental run. Although our original design contemplated a fully crossed 3 (State: New York, Texas, Georgia) × 3 (Partisan seed: Democrat, Republican, Neutral) structure, we ultimately implemented a reduced version with seven conditions. In particular, neutral-seeded accounts were deployed only in Georgia. This decision was driven by practical constraints: expanding to nine concurrent experimental conditions would have required more devices to run simultaneously, which would have increased the risk of detection and deactivation by TikTok’s platform safeguards. We prioritized the inclusion of a neutral baseline in Georgia, a prototypical swing state, given its strategic value for interpreting partisan effects in a politically heterogeneous context. This design does not allow for full within-state comparisons between neutral and partisan accounts in New York and Texas; we note this as a limitation and interpret state–partisanship interactions with appropriate caution.
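As a quick consistency check on the reduced design, the following Python sketch enumerates the seven conditions and recovers the weekly and total account counts. It assumes an even allocation of three bots per condition; the text states this only for the neutral Georgia condition, so the even split across the six partisan cells is an assumption here (Supplementary Table 2 gives the authoritative allocation).

```python
from itertools import product

STATES = ["New York", "Texas", "Georgia"]
PARTISAN_SEEDS = ["Democrat", "Republican"]
BOTS_PER_CONDITION = 3  # assumed even split; stated explicitly only for neutral Georgia bots
WEEKS = 27

# Six state x partisan-seed cells, plus one neutral (no conditioning) cell in Georgia only.
conditions = list(product(STATES, PARTISAN_SEEDS)) + [("Georgia", "Neutral")]

bots_per_week = len(conditions) * BOTS_PER_CONDITION
total_runs = bots_per_week * WEEKS

for state, seed in conditions:
    print(f"{state:8s} | {seed}")
print(f"conditions={len(conditions)}, bots/week={bots_per_week}, total runs={total_runs}")
```

Under these assumptions the counts match the figures reported above: 7 conditions, 21 accounts per week and 567 runs over 27 weeks.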
Pre- and post-experiment protocols
TikTok infers user location through the GPS or network (IP) geo-location of the device50. This required us to dedicate an Android smartphone, a Samsung Galaxy A34 5G, to each of the 21 accounts created every week. Before each experiment, we controlled device geo-location across the three target states using a combined approach of GPS mocking and VPN tunnelling. Specifically, we used AnyTo51 for GPS coordinate spoofing, setting New York bots to ⟨40.7308, −73.9976⟩ in Manhattan, New York City; Texas bots to ⟨33.148, −96.638⟩ in Collin County; and Georgia bots to ⟨33.961, −84.537⟩ in Cobb County. These specific locations were chosen because they lie in counties that voted strongly Democrat, voted strongly Republican or were closely contested in the 2020 US presidential election, respectively. Furthermore, to align the network identity of each bot with the geo-location of the intended state, we tunnelled the traffic of each phone through one of three custom VPN servers that we hosted on third-party cloud providers, so that its public IP address matched the intended state. We avoided commercial VPN services to minimize the risk of TikTok identifying the IPs as virtual. We installed TikTok from the Google Play Store only after the GPS location and IP address of each phone had been appropriately modified. At the conclusion of a weekly experiment, we factory-reset every phone before beginning the next round of the experiment. This step ensures that any TikTok-related cache is cleared and does not influence the subsequent experiments conducted on the same phones. Finally, all phones operated on Android 13, which re-randomizes the MAC (medium access control) address every 24 h (ref. 52), reducing the possibility of device-level tracking or bot detection by TikTok throughout a weekly experiment.
Conditioning stage