Microsoft first announced DirectStorage for PC back in 2020, with Forspoken being the first game to officially support it in early 2023. However, it wasn’t until Ratchet & Clank: Rift Apart was released later that year that we saw the full DirectStorage suite in action. Ratchet & Clank: Rift Apart was the first title to ship with GDeflate compression and support for GPU decompression of assets – a task that had previously been the responsibility of the CPU.
In theory, this should have facilitated more seamless streaming of assets with smoother performance, as the feature aimed to reduce the CPU bottleneck associated with the decompression of assets during gameplay. In practice, the opposite happened, in particular on Nvidia GPUs.
What is DirectStorage?
DirectStorage on PC aims to bring many of the benefits of the fast storage technology used in the PS5 and Xbox Series X|S. Its purpose is to allow games to make full use of NVMe SSDs with minimal CPU overhead, allowing for reduced load times, faster asset streaming, and larger, denser game worlds. DirectStorage 1.1 also added support for GPU decompression, which shifts the burden of decompressing game assets from the CPU to the GPU. Because assets stay compressed for more of the journey, this amplifies the amount of data that can be transferred through the SSD -> RAM -> VRAM pipeline.
Unlike CPUs, GPUs have thousands of cores, and they are also very efficient at performing repeatable tasks in parallel. GDeflate is a data-parallel compression scheme that is specifically optimized for GPU decompression.
GDeflate has two levels of parallelism. First, the original data stream is split into 64 KB tiles, and each tile is compressed separately. If the CPU does the decompression, then each tile can be decompressed by a different thread. If the GPU does the decompression, then each tile can be decompressed by a single thread group. Second, the data is arranged within a tile so that many lanes within a thread group can decompress that tile in parallel. GPU decompression not only saves CPU cycles, but also saves system interconnect bandwidth and on-disk footprint since the data remains compressed until it reaches VRAM.
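To make the first level of parallelism concrete, here is a minimal Python sketch of tile-based compression. It is not GDeflate and does not use the DirectStorage API: zlib's standard DEFLATE stands in for the codec, a CPU thread pool stands in for GPU thread groups, and the names `compress_tiles` and `decompress_tiles` are hypothetical. The second level, where many lanes cooperate inside a single tile, depends on GDeflate's own bitstream layout and is not modeled here.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

TILE_SIZE = 64 * 1024  # 64 KB tiles, matching the tile size described above


def compress_tiles(data: bytes) -> list[bytes]:
    """Split data into fixed-size tiles and compress each one independently.

    zlib is only a stand-in for GDeflate (an assumption for illustration).
    """
    return [
        zlib.compress(data[offset:offset + TILE_SIZE])
        for offset in range(0, len(data), TILE_SIZE)
    ]


def decompress_tiles(tiles: list[bytes], workers: int = 4) -> bytes:
    """Decompress tiles concurrently, one task per tile.

    No tile depends on another, so any worker can take any tile.
    pool.map returns results in input order, so the tiles are
    reassembled in their original sequence.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(zlib.decompress, tiles))


if __name__ == "__main__":
    payload = bytes(range(256)) * 1024  # 256 KiB of sample data
    tiles = compress_tiles(payload)
    assert decompress_tiles(tiles) == payload
    print(f"{len(tiles)} tiles round-tripped")
```

The key design point is that each tile is a self-contained compressed stream. That independence is what lets the work be handed to separate CPU threads or, in the GPU case, to separate thread groups.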
(Image credit: Tom's Hardware)
One benefit of data moving at a faster rate through this pipeline is that, theoretically, you would need to hold less data in system memory at any point in time, which could be extremely helpful given the skyrocketing prices of DDR memory. This will be especially true if game developers start to lean even more into the high bandwidth of NVMe storage. Another benefit, which is especially evident in Ratchet & Clank, is that textures load in faster with DirectStorage enabled. As you can see above, with it disabled, you get blurry textures until the higher-resolution textures load in.
What is the problem?