CVE-2025-7783 is a very recent vulnerability affecting many applications in the Node.js ecosystem, including those that use axios or the deprecated request library. In all honesty, this vulnerability is an edge case that is extremely unlikely to be exploited: it depends on a number of conditions that are not normally present. One of those conditions is the attacker having access to five consecutive outputs of JavaScript's Math.random(), which allows the attacker to predict future outputs of Math.random() using the z3 solver as a predictor.
When I looked into this attack, I couldn't believe that z3 was the best one could do to "invert" (determine the internal seed of) the Math.random() generator. As a former cryptographer, I said to myself that surely 2 or 3 outputs should be enough to invert it. So I decided to prove it.
This blog is about my first step in the journey to find an improved algorithm. Math.random() uses an algorithm called Xorshift128+ under the hood, but it only outputs 52 of the 64 bits that Xorshift128+ generates. Below I will show a simple and efficient (2^26 operations) way to invert Xorshift128+ if at least two complete 64-bit outputs are given. This can be turned into an algorithm that inverts the full Math.random(), but it requires 3 outputs and is currently somewhat inefficient (2^50 operations). I expect this will be improved later by either me or somebody else, perhaps you.
Ten years ago I wrote a blog called So, You Want to Learn to Break Ciphers which has had about 30,000 views. This blog aligns nicely with that previous one: nothing below is particularly complicated. I expect that a competent computer science graduate could understand it and potentially improve on it. So to the aspiring or amateur cryptographer, I invite you to give it a crack! You can download my source code from GitHub and work to improve it.
The Xorshift128+ algorithm
The source code for the Xorshift128+ used in Math.random() (v8 engine) can be found here. Yep, it is written in C++. The code is transcribed below:
static inline void XorShift128(uint64_t* state0, uint64_t* state1) {
  uint64_t s1 = *state0;
  uint64_t s0 = *state1;
  *state0 = s0;
  s1 ^= s1 << 23;
  s1 ^= s1 >> 17;
  s1 ^= s0;
  s1 ^= s0 >> 26;
  *state1 = s1;
}
It takes in two 64-bit states, state0 and state1. At the end of the algorithm, the new state0 becomes the old state1, and the new state1 is entirely determined by exclusive-or operations on various bits of the old state0 and state1. Note that the bit shifts essentially select which bits get mixed into each position of the new state1. Math.random() performs an integer addition of the new state values and converts the sum to a double, which is the output. This is also where bits are dropped, because a 64-bit unsigned integer cannot be represented exactly as a double.
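To make this concrete, here is a minimal sketch of one draw, assuming the output is the 64-bit sum of the new state words with the low 12 bits discarded (the helper name NextOutput52 is mine, not v8's):

```cpp
#include <cassert>
#include <cstdint>

// Reference step, as in the v8 source quoted above.
static inline void XorShift128(uint64_t* state0, uint64_t* state1) {
  uint64_t s1 = *state0;
  uint64_t s0 = *state1;
  *state0 = s0;
  s1 ^= s1 << 23;
  s1 ^= s1 >> 17;
  s1 ^= s0;
  s1 ^= s0 >> 26;
  *state1 = s1;
}

// One Math.random()-style draw: advance the state, sum the new state
// words mod 2^64, and keep only the top 52 bits, since the low 12 bits
// do not survive the conversion to a double.
uint64_t NextOutput52(uint64_t* state0, uint64_t* state1) {
  XorShift128(state0, state1);
  uint64_t sum = *state0 + *state1;
  return sum >> 12;
}
```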
In our analysis, we are going to assume that we have the full 64-bit output of the new state0 + state1. We will show at the end how to cope without that full output.
We’re going to need some notation rather than saying “old state” and “new state” all the time. Also, in my first write-up of this attack, I came to believe that the whole “state0” and “state1” gets too confusing and intimidating when I bring in other notation, so I decided to rename it. Going forward, we will refer to state0 as L (for left) and state1 as R (for right).
With that, I will introduce some notation, starting with a subscript to indicate the iteration. The initial seed is at iteration 0, denoted (L_0, R_0). The function at iteration i takes (L_i, R_i) and maps it to (L_{i+1}, R_{i+1}), and the output is x_i = L_{i+1} + R_{i+1}.
Visually, XorShift128() looks like this:
Visual representation of Xorshift128+ where L is state0 and R is state1
Beating brute force is easy
Brute force involves trying all possibilities. There are two 64-bit states, so 128 bits are unknown, which means trying up to 2^128 candidates.
It is easy to see that in fact we only need to brute force R_1, because x_0 = L_1 + R_1 (this comes directly from the first Xorshift128+ output). In other words, if we know the correct R_1 then we can determine L_1, because we also know the observed output x_0. This is just subtraction.
But how can we tell whether our brute force guess for R_1 is correct? We feed the candidate values (L_1, R_1) into the real Xorshift128+ to get candidate values (L_2, R_2), and if the sum of those values matches the observed x_1 then we are fairly confident that our guess is correct. You can observe this by trying my GitHub implementation: when you feed in two outputs, most of the time there is only one solution. In the cases where there is more than one solution, you will need one more observed output to check the guess.
Hence, it is sufficient to know only one of the 64-bit states if you have two observed outputs. Our search space is now reduced to 2^64 operations. That's an improvement, but it is still far too slow.
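As a sketch (the function name is mine), the consistency check for a single guess looks like this, with L_1 forced by subtraction:

```cpp
#include <cassert>
#include <cstdint>

// Reference step, as quoted from the v8 source earlier.
static inline void XorShift128(uint64_t* state0, uint64_t* state1) {
  uint64_t s1 = *state0;
  uint64_t s0 = *state1;
  *state0 = s0;
  s1 ^= s1 << 23;
  s1 ^= s1 >> 17;
  s1 ^= s0;
  s1 ^= s0 >> 26;
  *state1 = s1;
}

// Given a guess for R1 and two observed full 64-bit outputs x0 and x1:
// force L1 = x0 - R1 (unsigned wrap-around is exactly subtraction mod 2^64),
// run the real step, and compare the resulting sum against x1.
bool ConsistentGuess(uint64_t r1_guess, uint64_t x0, uint64_t x1) {
  uint64_t s0 = x0 - r1_guess;  // candidate L1
  uint64_t s1 = r1_guess;       // candidate R1
  XorShift128(&s0, &s1);        // (s0, s1) is now the candidate (L2, R2)
  return s0 + s1 == x1;
}
```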
A 2^26 algorithm to determine the internal state given at least 2 observed outputs
Beating brute force was easy; now we have to be a little bit clever. What we show here is that if you know only the least significant 26 bits of R_1, then you can determine with certainty the remaining bits of R_1 and, from those, L_1 and R_2. This means we only need to search all 2^26 possibilities for these least significant bits, and then we can determine which one is correct by calculating the remaining unknown bits (with the algorithm shown below) and testing the candidate, similar to the brute force attack described above.
Before getting into details, it is important to understand that searching a 2^26 space is very doable. This is what my GitHub implementation does. It is not optimised for speed yet, but it still takes only a few seconds on my MacBook Pro. I'll talk about implementation enhancements later.
Now it's time to get our hands dirty. To get the 2^26 algorithm, we need two tools. One of them you already know; we just need to write it out a bit more:
x_0 = L_1 + R_1
x_1 = L_2 + R_2
These are unsigned integer additions, which is the same as saying addition mod 2^64. Now also remember how Xorshift128+ works: L_2 = R_1 (see the visual diagram above). It is more convenient to write these equations as:
L_1 = x_0 - R_1
R_2 = x_1 - R_1
In the above, if we assume we know R_1 then we know the right-hand sides of both equations, which means we can also calculate L_1 and R_2.
Next, look at the Xorshift128+ code (remembering that we are using L and R in place of state0 and state1) and think about how bit index j of R_2 is calculated from bit indices of (L_1, R_1). We need more notation here, so we will bring in brackets: R_2[j] will refer to bit index j of R_2, meaning the bit (R_2 >> j) & 1 in C code. By expanding the shifts in the code, which gets a little tricky because the ^= terms compound from one statement to the next (for example, s1 ^= s1 << 23 followed by s1 ^= s1 >> 17 also mixes in a copy of the original bits at a net shift of << 6), we end up with (the caret symbol denotes XOR):
R_2[j] = L_1[j-23] ^ L_1[j-6] ^ L_1[j] ^ L_1[j+17] ^ R_1[j] ^ R_1[j+26]

This holds for 23 <= j < 38. For smaller values of j, L_1[j-23] drops out (i.e. those bits are shifted off the end and hence do not contribute), and similarly for j < 6 the L_1[j-6] term drops out.

It is convenient to bring the R_1[j+26] to the left-hand side and the R_2[j] to the right-hand side (simple algebra: XOR the values onto each side), and we get this equation, which we shall call the inductive equation:

R_1[j+26] = L_1[j-23] ^ L_1[j-6] ^ L_1[j] ^ L_1[j+17] ^ R_1[j] ^ R_2[j]
That's the hardest part of everything we do, so if you are with me so far, the rest is a lot easier. The beauty of the above equation is that the left-hand side computes bit index j+26 of R_1 using bit indices that are all smaller than j+26, taken from L_1, R_1 and R_2. It tells us that if we know all these less significant bits, then we can calculate exactly the next most significant bit of R_1. You should notice the 26 on the left-hand side and remember that I promised a 2^26 algorithm, so things are starting to come together.
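Bit expansions like this are easy to get wrong, so it is worth checking the expansion mechanically against the reference code. Here is a quick sketch (helper names are mine) that treats out-of-range indices as 0, which is exactly how the dropped terms behave:

```cpp
#include <cassert>
#include <cstdint>

// Reference step, as quoted from the v8 source earlier.
static inline void XorShift128(uint64_t* state0, uint64_t* state1) {
  uint64_t s1 = *state0;
  uint64_t s0 = *state1;
  *state0 = s0;
  s1 ^= s1 << 23;
  s1 ^= s1 >> 17;
  s1 ^= s0;
  s1 ^= s0 >> 26;
  *state1 = s1;
}

// Bit j of x, i.e. (x >> j) & 1; out-of-range indices read as 0.
static inline uint64_t Bit(uint64_t x, int j) {
  return (j < 0 || j > 63) ? 0 : (x >> j) & 1;
}

// Verify the expansion of R2[j] in terms of L1 and R1 bits for
// 0 <= j < 38, the range the attack relies on (above that, further
// terms fall off the top end).
bool ExpansionHolds(uint64_t L1, uint64_t R1) {
  uint64_t s0 = L1, s1 = R1;
  XorShift128(&s0, &s1);  // s1 is now R2
  for (int j = 0; j < 38; ++j) {
    uint64_t predicted = Bit(L1, j - 23) ^ Bit(L1, j - 6) ^ Bit(L1, j) ^
                         Bit(L1, j + 17) ^ Bit(R1, j) ^ Bit(R1, j + 26);
    if (predicted != Bit(s1, j)) return false;
  }
  return true;
}
```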
So far it might look like I did a little sleight of hand, because the inductive equation involves not just less significant bits of R_1 but also less significant bits of L_1 and R_2. But this is where the observed outputs come into play. We can derive exactly the least significant bits of these states, because the equations above also hold mod 2^(j+26):
L_1 = x_0 - R_1 mod 2^(j+26)
R_2 = x_1 - R_1 mod 2^(j+26)
What this means is that as we calculate each new bit of R_1 using the inductive equation, we can also calculate the corresponding least significant bits of L_1 and R_2. (Truncating the subtractions is safe because borrows only propagate from less significant bits to more significant ones.)
So in summary, the pseudo code for this attack is as follows:
For each possible guess of the 26 least significant bits of R_1:
  - Use the inductive equation to derive 64-bit candidates L_1, R_1 and R_2
  - Compute Xorshift128+(L_1, R_1) and let y_1 be the output sum of the new states
  - If x_1 == y_1 then a solution is found (break)
  - Else the guess is wrong (continue)
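The derivation step inside that loop can be sketched in C++ as follows (assuming full 64-bit outputs; the function and helper names are mine). Note how each round of the loop extends the known low bits of all three values at once:

```cpp
#include <cassert>
#include <cstdint>

// Reference step, as quoted from the v8 source earlier.
static inline void XorShift128(uint64_t* state0, uint64_t* state1) {
  uint64_t s1 = *state0;
  uint64_t s0 = *state1;
  *state0 = s0;
  s1 ^= s1 << 23;
  s1 ^= s1 >> 17;
  s1 ^= s0;
  s1 ^= s0 >> 26;
  *state1 = s1;
}

// Bit j of x, i.e. (x >> j) & 1; out-of-range indices read as 0.
static inline uint64_t Bit(uint64_t x, int j) {
  return (j < 0 || j > 63) ? 0 : (x >> j) & 1;
}

// Given full 64-bit outputs x0, x1 and a guess for the low 26 bits of R1,
// use the inductive equation to fill in the remaining 38 bits of R1, then
// recover L1 and R2 by subtraction. The caller still has to test the
// candidate against the real Xorshift128+ (and x1) to reject bad guesses.
void DeriveCandidate(uint64_t x0, uint64_t x1, uint64_t r1_low26,
                     uint64_t* L1, uint64_t* R1, uint64_t* R2) {
  uint64_t r1 = r1_low26 & ((1ULL << 26) - 1);
  for (int j = 0; j < 38; ++j) {
    // The low j+26 bits of L1 and R2 follow from subtraction mod 2^(j+26).
    uint64_t mask = (1ULL << (j + 26)) - 1;
    uint64_t l1 = (x0 - r1) & mask;
    uint64_t r2 = (x1 - r1) & mask;
    // Inductive equation; out-of-range indices read as 0 (dropped terms).
    uint64_t b = Bit(l1, j - 23) ^ Bit(l1, j - 6) ^ Bit(l1, j) ^
                 Bit(l1, j + 17) ^ Bit(r1, j) ^ Bit(r2, j);
    r1 |= b << (j + 26);
  }
  *R1 = r1;
  *L1 = x0 - r1;  // now exact mod 2^64
  *R2 = x1 - r1;
}
```

When the guessed 26 bits are correct, the derived triple matches the true internal state exactly.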
We can also bring x_2 into the testing to increase certainty.
Speed optimisations
These do not lower the 2^26; instead they make each iteration faster. One of them I have implemented already; the other I have not yet, because it would make the code harder to read.
The way we described the algorithm above, we update the least significant bits of the states L_1 and R_2 in every iteration. We don't need to update those bits immediately if they are not used in the next iteration; instead we can delay the update until we need it, which means we do less work per iteration. This is implemented in my code.
Deriving consecutive bits of R_1 also involves consecutive bits of L_1 and R_2. That means that rather than computing each bit one at a time, we can do some type of time-memory tradeoff using table lookups (after an initial precomputation phase), so long as the bits do not require updating the corresponding bits of L_1 and R_2 every iteration (i.e. the enhancement above). I expect this to make the code at least a few times faster.
Translating the above attack to a full Math.random()
The whole problem with making this work for Math.random() is that the least significant bits are exactly the ones that are dropped when the sum is converted to a double. We lose the 12 least significant bits of both x_0 and x_1.
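For concreteness, here is one way to model the loss, under the assumption that the double is formed from the top 52 bits of the sum scaled into [0, 1). The exact state-to-double conversion has varied between v8 versions, so treat this as an illustrative sketch (function names are mine):

```cpp
#include <cassert>
#include <cstdint>

// Assumed mapping: keep the top 52 bits of the 64-bit sum, scale by 2^-52.
double ToDouble(uint64_t sum) {
  return static_cast<double>(sum >> 12) / 4503599627370496.0;  // 2^52
}

// Inverse of the assumed mapping: recover the 52 known bits of the sum.
// The low 12 bits come back as zero; those are the bits we must guess,
// i.e. 2 * 12 = 24 extra bits across the two outputs x0 and x1.
uint64_t KnownBits(double d) {
  return static_cast<uint64_t>(d * 4503599627370496.0) << 12;
}
```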
We can make up for that by brute forcing the 2^24 possibilities for the missing least significant bits and running essentially the same algorithm as above. The main problem is that we are going to have a lot more false positives, so we definitely need the third output x_2 to weed them out. It is possible that we need more outputs, but I doubt it. The heuristic reasoning is that we lose 24 bits of comparison over the first two iterations but gain 52 bits from the third output, which more than makes up for the loss. So if you believe two outputs are usually enough to invert Xorshift128+ (this can be observed from my code on GitHub, which returns all seeds that match two observed outputs), then I think it is also convincing that three outputs should be enough to invert Math.random().
Since we are guessing 24 more bits in addition to the 26 in the original algorithm, we are now searching a space of 2^50. There should be ways to improve this, especially if we think about how to use x_2 more intelligently in the algorithm.
One last remark: on the use of AI to break ciphers
This is a story for another day, but I actually spent more time than expected getting this working, and one reason is that I thought I could explain my ideas to ChatGPT and have it program them. I was truly amazed to see ChatGPT understand my reasoning and even come up with its own ideas to help improve the research. But the downfall was in trusting ChatGPT to output the code to prove it. It unfortunately led to one problem after another, which wasted a huge amount of my time. In the end, I decided to start from scratch and barely let it write any code, with the exception of one small snippet that it produced much quicker than I would have.
But I still got great value from explaining my ideas to the bot and getting its feedback. I strongly recommend that researchers start using tools like this. While at first they may be productivity sinks, we will eventually understand them better and know when to trust them and when to say no to them. These tools will also only improve over time. I seriously think there will be a day when such tools output research results better than humans do. I'll comment more on that in my next blog.
My complete conversation with ChatGPT for this research can be found here.