
This is what some of the world’s largest banks of malware look like stacked as hard drives

Why This Matters

This article highlights the vast scale of malware repositories maintained by cybersecurity organizations, emphasizing their critical role in detecting and understanding evolving cyber threats. Visualizing these datasets as physical stacks underscores the enormous volume of data that underpins modern cybersecurity efforts, impacting both industry practices and consumer safety.

Key Takeaways

Malware research group vx-underground, which says it has the largest collection of malware source code, said in a post on X that its archive of data amounts to about 30 terabytes.

In a reply, Bernardo Quintero, founder of VirusTotal, an online service that scans files for malware across multiple antivirus engines at once, said his service holds about 31 petabytes of malware samples contributed by users to date. (A petabyte is roughly 1,000 times the size of a terabyte.)

In both cases, that’s a lot of data. For context, cybersecurity companies, AI researchers, and threat intelligence firms treat repositories like these as critical for training detection models and understanding how attacks evolve. But this had us wondering: What would these enormous datasets actually look like stacked as hard drives one on top of the other and side-by-side? And how would they compare to, say, the Eiffel Tower?

Someone in our newsroom asked an AI chatbot this question, and it got it incredibly wrong.

Instead, we did some rough back-of-a-napkin math to figure out how tall these data banks would be. Since vx-underground and VirusTotal both have “about” that much data each, “about” is good enough for us in this case.

Let’s say we’re using internal hard drives with a capacity of 1 terabyte each, since these are generally built to the same physical size so they fit inside any computer. These standardized 3.5-inch internal hard drives are 1 inch in height, which is the measurement that matters when stacking one on top of the other.

We’re also assuming that the hard drives in this example hold exactly 1 terabyte each, even though in reality the total usable file capacity of a hard drive is generally somewhat less.

Using this online conversion tool, it looks like vx-underground’s 30 terabytes of malware data could fill 30 hard drives stacked on top of one another, reaching 30 inches, or about 2.5 feet tall.

For reference, this reporter is 6 feet tall. (See visual below, and yes, terrible opsec, I know.)

By the same logic, VirusTotal’s 31 petabytes of submitted data would fill 31,744 hard drives (31 × 1,024, counting 1,024 terabytes per petabyte), which stacked on top of one another would reach about 2,645 feet.
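For readers who want to check the napkin math, here is a minimal Python sketch of the same calculation. It relies on the article's stated assumptions (each drive holds exactly 1 terabyte and is 1 inch tall) and on the 1,024-terabytes-per-petabyte convention implied by the 31,744 figure. The function and constant names are made up for illustration.

```python
import math

# Assumptions taken from the article's rough estimate, not measured values.
DRIVE_CAPACITY_TB = 1    # each drive is treated as holding exactly 1 TB
DRIVE_HEIGHT_IN = 1.0    # a 3.5-inch internal drive is treated as 1 inch tall
TB_PER_PB = 1024         # binary convention implied by the 31,744-drive figure
INCHES_PER_FOOT = 12


def stack_height(total_tb):
    """Return (drive count, height in inches, height in feet) for a stack."""
    drives = math.ceil(total_tb / DRIVE_CAPACITY_TB)
    inches = drives * DRIVE_HEIGHT_IN
    return drives, inches, inches / INCHES_PER_FOOT


# vx-underground: about 30 terabytes.
vx_drives, vx_inches, vx_feet = stack_height(30)

# VirusTotal: about 31 petabytes, converted to terabytes first.
vt_drives, vt_inches, vt_feet = stack_height(31 * TB_PER_PB)

print(f"vx-underground: {vx_drives} drives, {vx_feet:.1f} feet")
print(f"VirusTotal: {vt_drives} drives, {vt_feet:.0f} feet")
```

Counting a petabyte as 1,000 terabytes instead would give 31,000 drives and a slightly shorter stack, so the choice of convention shifts the answer by a couple of percent, well within "about."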
