Hard problems in social media archiving

In my previous post, I described my social media scrapbook – a tiny, private archive where I save conversations that I care about.

The implementation is mine, but the ideas aren’t: cultural heritage institutions have been thinking about how to preserve social media for years. There are decades of theory and practice behind digital preservation, but social media presents some unique challenges.

Institutional archiving has different constraints to individual collections – institutions serve a much wider audience, so their decisions need consistency and boundaries. My own scrapbook is tiny and personal, and comparing it with institutional efforts really highlights the differences and difficulties. It’s why I usually call it a “scrapbook”, not an “archive”: it’s informal and a bit chaotic, and that’s fine because it’s only for me.

In this post, I’ll explain what I see as the key issues facing institutional social media archiving: what can be saved, what resists preservation, and why context is so hard to keep.

What exists and what can be saved

The scale of social media is overwhelming

Social media exists at a scale that’s hard to comprehend: billions of posts, with millions more being added each day.

This makes it difficult for anyone to choose what to preserve, because any one person can only know a tiny fragment of the whole. Making a choice inevitably introduces selection bias, and I’ve spoken to many people who’d like to avoid that bias by “collecting everything” – but that’s far beyond the capacity of any institution.

Since they can’t collect everything, institutions create rules – collection policies that define what’s in-scope. These rules are meant to ensure consistency and fairness and to reduce individual bias, but they force archivists to draw boundaries in a medium that inherently resists them.