Building the Future of Robust Computing Systems

An Interview with Dr. Onur Mutlu – 2025 Harry H. Goode Memorial Award Recipient

Dr. Onur Mutlu, Professor of Computer Science at ETH Zurich, conducts research in computer architecture, computing systems, hardware security, memory and storage systems, and bioinformatics. He is an innovator whose work has improved microprocessors and memory and storage systems used by billions of people.

Your identification of the RowHammer vulnerability has had a significant impact on hardware security and memory systems. What led you to this discovery, and how has it influenced subsequent research in memory systems?

RowHammer is a fascinating phenomenon that affects all modern main memory (i.e., DRAM) chips used in almost all computers today (a market of more than 120 billion USD). Repeatedly accessing one DRAM row causes bitflips in physically nearby DRAM rows that should not change at all, leading to data corruption. We identified the phenomenon and analyzed it in detail for the first time in 2012. We have continued to study it since then, discovering new properties as well as new read disturbance mechanisms along the way, such as RowPress (in 2020-2023) and Variable Read Disturbance (in 2024-2025). We have come a long way, yet the phenomenon still fascinates me, and there is a lot more to do in fundamentally understanding and very efficiently solving it, especially as memory technology scales to much denser capacities using increasingly smaller cells to store each bit of data. And it is not only us: the phenomenon continues to fascinate researchers across the hardware security, design automation, computer architecture, dependable systems, and device analysis communities. Many works are published each year on understanding, analyzing, and modeling the phenomenon, as well as solving it using new and creative methods across the system stack (spanning both hardware and software). For example, top security and computer architecture conferences have almost regularly held dedicated RowHammer sessions (sometimes several of them) since 2020.

Our stumbling on the RowHammer problem and creating its first scientific analysis resulted from a confluence of multiple factors. First, my group has been working on DRAM technology scaling issues since late 2010. We were very interested in failure mechanisms that appear or worsen due to aggressive technology scaling. To study such issues (e.g., data retention errors), we built an FPGA-based DRAM testing infrastructure between 2011 and 2012, which we later open-sourced as SoftMC and DRAM Bender. This infrastructure serves as the basis of a large “laboratory for analyzing and understanding memory chips” in my group. Second, around the same timeframe, we were investigating similar technology scaling issues in flash memory using real NAND flash chips (e.g., our DATE 2012 and ITJ 2013 papers analyzed fascinating errors in such chips). We knew read disturbance errors were significant in real NAND flash memory chips and were very interested in how prevalent they were in real DRAM chips. Third, we were collaborating with Intel to understand and solve DRAM technology scaling problems and build our DRAM infrastructure. Three of my students and I spent the summer of 2012 at Intel to work closely with our collaborators (two are co-authors of our original RowHammer paper). During this time, we finalized the calibration and stabilization of our infrastructure and had significant technical discussions and experimentation on DRAM scaling problems.

Although there was very limited awareness of the RowHammer problem in industry in 2012 (see Footnote 1 in our original RowHammer paper), there was no comprehensive experimental analysis and detailed real-system demonstration of it. We believed it was critical to provide a rigorous scientific analysis using a wide variety of DRAM chips and scientifically establish the major characteristics and prevalence of RowHammer (and also provide solutions to it). Hence, in the summer of 2012, we set out to use our DRAM testing infrastructure to analyze RowHammer. Our initial results showed how widespread the read disturbance problem was across the (at the time) recent DRAM chips we tested, so we studied the problem comprehensively and developed many solutions to it. We submitted the resulting paper to the MICRO conference in May 2013, but it was rejected for interesting yet scientifically invalid reasons. One reviewer, for example, rejected the paper strongly, claiming that this was not an important problem. Another reviewer argued that the problem was not of interest to the computer architecture community and should not be published at MICRO since the problem was a “DRAM manufacturers’ problem” that should be solved by them, not computer architects. We strengthened the results, especially those on the mitigation mechanisms and the number of tested chips, and made the analysis more comprehensive before it was accepted to ISCA 2014 (2 of the 6 reviewers still rejected it for yet other interesting reasons, one being that industry had already solved the problem).

Our demonstration that one can easily and predictably induce bitflips in commodity DRAM chips using a real user-level program enabled a major mindset shift in hardware security. It showed that general-purpose hardware is fallible in a very widespread manner and that its problems are exploitable. Hundreds of works have built directly on ours, exploiting RowHammer bitflips to develop many attacks that compromise system integrity and confidentiality, starting from the first RowHammer exploit by Google Project Zero in 2015 to recent works in 2020-2025 (e.g., TRRespass, RAMBleed, Blacksmith, SMASH, RowPress). These attacks showed increasingly sophisticated ways by which an unprivileged attacker can exploit RowHammer bitflips to circumvent memory protection and gain complete control of a system, gain access to confidential data, or maliciously destroy the safety and accuracy of a system, e.g., an otherwise accurate machine learning inference engine. The mindset enabled by RowHammer bitflips caused a renewed interest in hardware security research, enticing many researchers to deeply understand hardware’s inner workings and find new vulnerabilities. Thus, hardware security issues have become a mainstream topic in top security and architecture venues, some of which have sessions entitled RowHammer.

Fast forward more than a decade from our original investigation: RowHammer is now a well-recognized problem, and all existing DRAM chips are fundamentally vulnerable to it, in a way that is much worse than in 2012-2014. The good news is that a lot of progress has also been made, and the DRAM industry now finally speaks openly about RowHammer and writes papers about it. SK Hynix, for example, wrote a paper in ISSCC 2023, acknowledging the problem publicly in print for the first time as a DRAM manufacturer and describing their solutions to the problem. Similarly, Samsung, Google, and Microsoft have all written papers about the RowHammer problem since 2020, developing both new attacks and solutions. Our original solutions to RowHammer, which we developed in 2012-2014, were initially adopted by industry, but they were not implemented very well. New solutions to RowHammer continue to be developed as the problem is getting worse. Industry finally standardized a solution called Per-Row Activation Counters (PRAC) in April 2024, which will be implemented in all DRAM chips going forward. This solution adds an intelligent controller inside each DRAM chip, a system-memory co-design solution that we have argued in favor of for a very long time (e.g., in our original IMW 2013 and ISCA 2014 papers).

Clearly, fascinating ideas (attacks, analyses, and solutions) continue in the RowHammer space, with real industrial impact on the 120+ billion USD DRAM market used in essentially all computers. I believe there are a lot more discoveries to be made and much better solutions to be developed. Memory robustness issues are also a moving target as new issues appear and get discovered over generations, so research in this field is critically important and highly exciting!

Interested folks (who might be excited to better analyze, exploit, and even better solve the problem) can read some of our overview papers on the problem and also watch the DRAMSec 2025 workshop, which we recently organized on the topic:
