Properly seeding random number generators doesn't always get the attention it deserves. Quite often, people do a terrible job, supplying low-quality seed data (such as the system time or process id) or combining multiple poor sources in a half-baked way. C++11 provides std::seed_seq as a way to encourage the use of better seeds, but if you haven't thought about what's really going on when you use it, you may be in for a few surprises.
In contrast to C++11, some languages, such as popular scripting languages like JavaScript, Python, or Perl, take care of good seeding for you (provided you're using their built-in RNGs). Today's operating systems have a built-in source of high-quality randomness (typically derived from the sequencing of unpredictable external and internal events), and so the implementations of these languages simply lean on the operating system to produce seed data.
C++11 provides access to operating-system–provided randomness via std::random_device, but, strangely, it isn't easy to use it directly to initialize C++'s random number generators. C++'s supplied generators allow seeding only with a std::seed_seq or a single integer, nothing else. This interface is, in many respects, a mistake, because it means that we are forced to use seed_seq (the “poor seed fixer”) even when it's not actually necessary.
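To make that interface concrete, here is a minimal sketch of the route it pushes you toward: collect several words from a source such as std::random_device, then pass them through std::seed_seq. The helper name seeded_mt19937 and the choice of eight 32-bit words are assumptions made for this example, not anything the standard prescribes, and the sketch makes no claim about how well seed_seq preserves the quality of the data it is given.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <functional>
#include <random>

// Hypothetical helper (not part of the standard library): builds a
// std::mt19937 from any callable that returns 32-bit words.
template <typename WordSource>
std::mt19937 seeded_mt19937(WordSource&& next_word) {
    // Eight words (256 bits) is an arbitrary choice for illustration.
    std::array<std::uint32_t, 8> seed_data;
    // std::ref avoids copying the source (std::random_device is not copyable).
    std::generate(seed_data.begin(), seed_data.end(), std::ref(next_word));
    // The generator's constructor accepts a seed sequence, not raw words,
    // so the data has to go through std::seed_seq on the way in.
    std::seed_seq seeder(seed_data.begin(), seed_data.end());
    return std::mt19937(seeder);
}
```

With a std::random_device named rdev in scope, the call would be seeded_mt19937(rdev). Taking the word source as a parameter also lets you substitute a deterministic source when you want reproducible runs.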
In this post, we'll see two surprising things:
Low-quality seeding is harder to “fix” than you might think.
When std::seed_seq tries to “fix” high-quality seed data, it actually makes it worse.
Perils of Using a Single 32-Bit Integer
If you look online for how to initialize a C++11 random number generator, you'll see code that looks like
std::mt19937 my_rng(std::random_device{}());
or possibly the more explicit version, which likewise supplies just one 32-bit value as seed data,
std::random_device rdev;
uint32_t random_seed = rdev();
std::seed_seq seeder{random_seed};
std::mt19937 my_rng(seeder);
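Both snippets hand the generator a single 32-bit value. The sketch below (the helper name engine_from_seed is hypothetical) shows what that implies: the engine's entire starting state is a deterministic function of that one word, so no more than 2^32 distinct starting states can ever be produced this way, even though std::mt19937 carries 19937 bits of state.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper: seeds std::mt19937 from one 32-bit value via a
// one-element std::seed_seq, as in the snippet above.
std::mt19937 engine_from_seed(std::uint32_t seed) {
    std::seed_seq seeder{seed};
    return std::mt19937(seeder);
}

// Two engines built from the same 32-bit seed have identical state and
// therefore produce identical output streams.
bool same_seed_same_engine(std::uint32_t seed) {
    return engine_from_seed(seed) == engine_from_seed(seed);
}
```

Since same_seed_same_engine returns true for every input, anyone who can enumerate all 2^32 seed values can reproduce every stream this style of seeding is able to generate.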