A simple set of rules can generate complex, emergent behavior. This is the essence of cellular automata, formalized by John von Neumann in 1948 and popularized by John Conway’s Game of Life in 1970.
SmoothLife, a continuous cellular automaton, implemented in WebGPU
Despite being simple to implement, cellular automata demonstrate complex, almost life-like behavior. Their inherently parallel nature makes them especially well-suited to modern GPUs.
This article takes a tour through various cellular automata algorithms, each one more sophisticated than the last:
Conway’s Game of Life — By far, the most famous cellular automaton. Implementing the Game of Life in WebGPU will serve as a basis for the rest of the algorithms presented in this article.
Life-like Cellular Automata — What happens when we generalize the rules of Conway’s Game of Life? The answer is cellular automata that exhibit very interesting, self-organizing (and sometimes chaotic) behavior.
Von Neumann Neighborhood — We take a small detour and talk about how a small modification to a cell’s neighborhood (i.e., a cell’s search radius) can drastically change the resulting output.
Multiple Neighborhood Cellular Automata — Further modifications to a cell’s neighborhood can create very stable, self-organizing behavior that’s very reminiscent of biological cells.
Cyclic Cellular Automata — Allowing cells to assume more than just “alive” and “dead” states creates fascinating cyclic structures, evolving past the traditional cell metaphor.
Continuous Cellular Automata — Representing cell densities as floating-point numbers generates structures that can move very fluidly, such as the SmoothLife example shown above.
The code for this article can be found here. All examples are implemented in WebGPU. Without further ado, let’s start by looking at how to implement Conway’s Game of Life on a GPU.
Conway’s Game of Life
The most well-known cellular automaton is Conway’s Game of Life, which inspired countless variants and extensions. This classic algorithm will serve as the basis for all other cellular automata discussed in this article.
Animation of Conway’s Game of Life, starting with the R-pentomino as the first generation.
The Game of Life can be conceptualized as a game on a checkerboard, where each position on the board has two potential states: it either has a checkerboard piece, or it does not. In a metaphor to biological systems, each position on the board can be thought of as a cell. If a checkerboard piece is placed on a position, it is said to be alive; otherwise, it is dead.
Like checkers, the game proceeds in discrete turns. However, unlike checkers, the game unfolds deterministically after the pieces have been placed on the first turn. Each turn is called a generation of the algorithm.
The rules of the game are as follows:
For each position on the board, count the number of living cells that surround it. These surrounding cells are referred to as the neighbors of the current position.
If a living cell has 2 or 3 living neighbors, it remains alive in the next generation.
If a dead cell has exactly 3 living neighbors, it comes back to life in the next generation.
Under any other conditions, the cell dies or remains dead in the next generation. If the neighbor count is less than 2, the cell is said to die from underpopulation; if the neighbor count is greater than 3, it dies from overpopulation.
Each generation is completely discrete: every cell updates simultaneously based on the neighbor counts in the current generation. Each new generation depends entirely on the state of the previous generation, meaning the current generation cannot be modified until the next generation has been entirely computed.
For this reason, the simplest way to play the Game of Life is with two boards: one board representing the current generation, while the other is used to compute the next. The neighbor counts from the first board are used to modify the second board. After every position on the second board has been updated, the roles of the boards swap: the second board is now the current generation, and the first is used to compute the next.
In computer science terms, the boards are ping-ponged back and forth: one is read from, while the other is written to. This is a very useful technique in parallel programming because it makes it easy to reason about which memory every thread is reading from and writing to. This technique will be used for every algorithm shown in this article.
Basic CPU implementation
Before implementing this algorithm in a compute shader, we can implement it on the CPU first to gain some intuition about how it works. If the low-level implementation details aren’t interesting to you, feel free to skip to the next section.
First, we’ll need two 2D grids that represent our boards:
A read board — representing the current generation
A write board — representing the to-be-computed next generation
Each position on both boards will hold either a 0 or 1, representing a dead and alive cell, respectively.
const boardSize = 32;

const board0 = new Uint32Array(boardSize * boardSize);
const board1 = new Uint32Array(boardSize * boardSize);
const boards = [ board0, board1 ];

let readIndex = 0;  // boards[0] -> board0
let writeIndex = 1; // boards[1] -> board1
To initialize the first generation, some arrangement of 0’s and 1’s has to be selected. Some patterns are known to generate interesting results, but filling the board with random noise is another perfectly valid option.
for (let y = 0; y < boardSize; ++y) {
  for (let x = 0; x < boardSize; ++x) {
    board0[y * boardSize + x] = Math.random() >= 0.5 ? 1 : 0;
  }
}
For each generation, the algorithm counts the number of living neighbor cells at every position on the board. In practice, this involves iterating over each position on the board and checking the state of each adjacent cell.
// find which board should be read from, and which to write to
const readBoard = boards[readIndex];
const writeBoard = boards[writeIndex];

// iterate over every position on the board
for (let y = 0; y < boardSize; ++y) {
  for (let x = 0; x < boardSize; ++x) {
    let neighborCount = 0;

    // iterate over every neighboring position to compute a total count
    for (let yOffset = -1; yOffset <= 1; ++yOffset) {
      for (let xOffset = -1; xOffset <= 1; ++xOffset) {
        // don't include self in count
        if (yOffset == 0 && xOffset == 0) continue;

        let neighborX = x + xOffset;
        let neighborY = y + yOffset;

        // wrap around to the other side on the edges
        if (neighborX < 0) {
          neighborX += boardSize;
        } else if (neighborX >= boardSize) {
          neighborX -= boardSize;
        }
        if (neighborY < 0) {
          neighborY += boardSize;
        } else if (neighborY >= boardSize) {
          neighborY -= boardSize;
        }

        // if neighboring cell is alive, increment neighbor count
        if (readBoard[neighborY * boardSize + neighborX] == 1) {
          neighborCount += 1;
        }
      }
    }

    // ...
Once the number of alive neighbors has been determined for a given position, the output board can be modified according to the aforementioned game rules.
    // ...
    const index = y * boardSize + x;
    const isCurrentlyAlive = readBoard[index] == 1;

    let nextState = 0;
    if (neighborCount == 3) {
      // if the neighbor count is 3, the cell is alive
      nextState = 1;
    } else if (isCurrentlyAlive && neighborCount == 2) {
      // if the cell is already alive and the count is 2, it stays alive
      nextState = 1;
    } else {
      // in all other cases, the cell dies
      nextState = 0;
    }

    writeBoard[index] = nextState;
  }
}
Once every position on the read board has been processed, the write board will contain the complete state for the most recently computed generation. It will then become the read board for the next generation.
// swap read index and write index
readIndex = (readIndex + 1) % 2;  // 0 -> 1 -> 0 -> 1
writeIndex = (writeIndex + 1) % 2; // 1 -> 0 -> 1 -> 0
When rendered to the screen, it should look something like this:
Animation of Conway’s Game of Life evolving from random noise, with white representing live cells and black representing dead ones.
WebGPU Implementation
Implementing Conway’s Game of Life with a modern graphics API can be broken down into the following steps:
1. Create two buffers, each one a 2D grid representing a game board.
2. Set one grid to represent the current generation (read buffer) and the other to represent the next (write buffer).
3. Initialize the read buffer with the first generation (can be done either on the CPU or GPU).
4. Dispatch a compute shader that launches one GPU thread per grid position. Each invocation counts the number of neighboring cells and updates the next generation accordingly.
5. Render the output on a quad, where each pixel has a corresponding grid position. The pixel shader will set its resulting color based on whether the corresponding position has an alive or dead cell.
6. Swap the read and write buffers.
7. Repeat from step 4.
Just as in the CPU version, the GPU version alternates between two boards, applies the rules to advance to the next generation, and renders the results. In WebGPU terms, that means creating two buffers, one compute shader, and one render pass. This section focuses on those three pieces, leaving the standard WebGPU boilerplate aside. The complete code can be found here.
We’ll start by creating a square board of width 32. The size of the buffer in bytes will be 32×32, multiplied by the size of each element in the buffer. We’ll use unsigned 32-bit integers to represent the state of each cell, meaning the size of each element is 4 bytes.
const boardSize = 32; // square board of 32-bit integers
const bufferSize = boardSize * boardSize * 4;

// buffers[0] -> board0
// buffers[1] -> board1
const buffers = [
  device.createBuffer({
    size: bufferSize,
    usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST
  }),
  device.createBuffer({
    size: bufferSize,
    usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST
  })
];
The first generation has to be written to the read buffer. This can be done on the CPU by creating an array, randomly setting every position to 0 or 1, and uploading the results to the GPU.
// Create CPU-based array
const initialState = new Uint32Array(boardSize ** 2);

// Randomly set every position to 0 or 1
for (let y = 0; y < boardSize; ++y) {
  for (let x = 0; x < boardSize; ++x) {
    initialState[y * boardSize + x] = Math.random() >= 0.5 ? 1 : 0;
  }
}

// Copy array to board 0 (CPU -> GPU)
device.queue.writeBuffer(buffers[0], 0, initialState.buffer);
To easily ping-pong between these two buffers, we can create two bind groups and flip between them every frame.
Bind Group 0
  Binding 0 → Buffer 0
  Binding 1 → Buffer 1

Bind Group 1
  Binding 0 → Buffer 1
  Binding 1 → Buffer 0
The first frame uses Bind Group 0, the second uses Bind Group 1, the third frame flips back to 0, the fourth back to 1, and this alternating pattern continues for every frame afterward.
let bindGroupIndex = 0;

const computeBindGroups = [
  device.createBindGroup({
    layout: computePipeline.getBindGroupLayout(0),
    entries: [
      { binding: 0, resource: { buffer: buffers[0] } },
      { binding: 1, resource: { buffer: buffers[1] } },
    ]
  }),
  device.createBindGroup({
    layout: computePipeline.getBindGroupLayout(0),
    entries: [
      { binding: 0, resource: { buffer: buffers[1] } },
      { binding: 1, resource: { buffer: buffers[0] } },
    ]
  })
];
function frame() {
  // ...

  // dispatch compute shader
  const computePass = encoder.beginComputePass();
  computePass.setPipeline(computePipeline);
  computePass.setBindGroup(0, computeBindGroups[bindGroupIndex]);
  computePass.dispatchWorkgroups(...);
  computePass.end();

  // draw using render pass
  // ...

  // swap buffers - bindGroupIndex: 0 -> 1 -> 0 -> 1
  bindGroupIndex = (bindGroupIndex == 0) ? 1 : 0;

  requestAnimationFrame(frame);
}
frame();
A compute shader is dispatched on every frame. It reads from the previous generation, applies the rules of the Game of Life, and updates the next generation accordingly.
// bind group definition in shader
@group(0) @binding(0) var<storage, read> readBuffer : array<u32>;
@group(0) @binding(1) var<storage, read_write> writeBuffer : array<u32>;

const boardSize = 32;

// Optimization: compute shader dispatches over 8x8 tiles
// Board size must be a multiple of 8
@compute @workgroup_size(8, 8, 1)
fn cs_main(@builtin(global_invocation_id) id : vec3<u32>) {
  let idx = id.y * u32(boardSize) + id.x;

  var neighborCount = 0u;

  // iterate over every neighboring position
  for (var yOffset = -1; yOffset <= 1; yOffset += 1) {
    for (var xOffset = -1; xOffset <= 1; xOffset += 1) {
      // don't include self in count
      if (xOffset == 0 && yOffset == 0) { continue; }

      var x = i32(id.x) + xOffset;
      var y = i32(id.y) + yOffset;

      // wrap around to the other side on the edges
      if (y < 0) { y += boardSize; }
      if (x < 0) { x += boardSize; }
      if (y >= boardSize) { y -= boardSize; }
      if (x >= boardSize) { x -= boardSize; }

      // if neighboring cell is alive, increment neighbor count
      if (readBuffer[y * boardSize + x] == 0x1) {
        neighborCount += 1;
      }
    }
  }

  var alive = readBuffer[idx];
  if (neighborCount == 3u) {
    // if the neighbor count is 3, the cell is alive
    alive = 1u;
  } else if (alive == 0x1u && neighborCount == 2u) {
    // if the cell is already alive and the count is 2, it stays alive
    alive = 1u;
  } else {
    // in all other cases, the cell dies
    alive = 0u;
  }

  writeBuffer[idx] = alive;
}
In parallel, a render pass is used to output black and white pixels, representing the state of each cell on the board. The fragment shader (i.e., pixel shader) can also be used for more advanced post-processing.
struct VSOut {
  @builtin(position) pos: vec4f,
  @location(0) fragCoord: vec2f
}

// Used to render quad:
//  -1.0, -1.0
//   1.0, -1.0
//  -1.0,  1.0
//   1.0,  1.0
@vertex
fn vs_main(@location(0) inPos : vec2f) -> VSOut {
  var out : VSOut;

  // directly output quad to screen
  out.pos = vec4f(inPos, 0.0, 1.0);

  // fragment shader coordinates
  out.fragCoord = inPos * 0.5 + 0.5;

  return out;
}

@fragment
fn fs_main(@location(0) pos : vec2f) -> @location(0) vec4f {
  let index = vec2u(pos * f32(boardSize));
  let value = readBuffer[index.y * boardSize + index.x];

  if (value == 1u) {
    // White: Cell at this location is alive
    return vec4f(1.0, 1.0, 1.0, 1.0);
  } else {
    // Black: Cell at this location is dead
    return vec4f(0.0, 0.0, 0.0, 1.0);
  }
}
The result is the same as Conway’s Game of Life algorithm, but each position on the board is computed completely in parallel.
Life-like Cellular Automata
Conway’s Game of Life forms part of a larger class of cellular automata called Life-like cellular automata. The only difference between the Game of Life and other Life-like cellular automata is the set of neighbor counts that cause a cell to be born or to survive.
The Anneal cellular automaton is a Life-like CA that uses the following ruleset:
If a dead cell has 4, 6, 7, or 8 neighbors, it becomes alive. If an alive cell has 3, 5, 6, 7, or 8 neighbors, it remains alive. Otherwise, the cell is dead.
This ruleset creates very interesting, flow-like patterns.
Animation of the Anneal Cellular Automaton. Living cells are dark green, while dead cells are light green.
The only difference from the Game of Life is the conditional block after the neighbor-count loop, where the ruleset of a Life-like cellular automaton is encoded.
let alive = readBuffer[idx];
var output = 0u;

if (alive == 0x0) {
  // if a dead cell has 4, 6, 7, or 8 neighbors, it becomes alive
  // otherwise, the cell remains dead
  switch (neighborCount) {
    case 4u, 6u, 7u, 8u: {
      output = 1u;
      break;
    }
    default: {
      output = 0u;
    }
  }
} else {
  // if an alive cell has 3, 5, 6, 7, or 8 neighbors, it remains alive
  // otherwise, the alive cell dies
  switch (neighborCount) {
    case 3u, 5u, 6u, 7u, 8u: {
      output = 1u;
      break;
    }
    default: {
      output = 0u;
    }
  }
}

writeBuffer[idx] = output;
It’s common to compactly encode these rulesets into a rulestring. The rulestring for the Anneal Cellular Automaton is B4678/S35678. B stands for birth: the set of neighbor counts that cause a cell to be born. S stands for survival: the set of neighbor counts that allows a living cell to remain alive. The rulestring for Conway’s Game of Life is B3/S23.
There are 2⁹ possible sets of birth conditions for any Life-like CA and 2⁹ possible sets of survival conditions, leaving us with a total of 2⁹×2⁹ (262,144) possible Life-like CA rulesets, with many distinct and interesting patterns to discover.
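Although not part of the implementations shown in this article, a small helper for decoding rulestrings makes it easier to experiment with this space. The sketch below (plain JavaScript, with hypothetical names) parses a B/S rulestring into lookup tables indexed by neighbor count, which could then drive the CPU implementation from earlier or be uploaded to the GPU.

// Hypothetical helper: parse a Life-like rulestring such as "B3/S23" or
// "B4678/S35678" into boolean lookup tables indexed by neighbor count (0-8).
function parseRulestring(rulestring) {
  const birth = new Array(9).fill(false);
  const survival = new Array(9).fill(false);

  for (const part of rulestring.split('/')) {
    const target = part[0] === 'B' ? birth :
                   part[0] === 'S' ? survival : null;
    if (!target) continue;

    for (const digit of part.slice(1)) {
      target[Number(digit)] = true;
    }
  }
  return { birth, survival };
}

// Example: the Anneal ruleset
const anneal = parseRulestring('B4678/S35678');

// Applying the rule to one cell, given its state and live-neighbor count
function nextState(isAlive, neighborCount, rule) {
  if (isAlive) {
    return rule.survival[neighborCount] ? 1 : 0;
  }
  return rule.birth[neighborCount] ? 1 : 0;
}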
One of the more interesting rulesets is B3/S12345, known as the Maze automaton for its infinite labyrinth of corridors and dead ends.
The Maze Cellular Automaton expands infinitely outward, leaving behind a maze-like structure as it spreads.
There are countless strange and beautiful rulesets out there, waiting to be discovered. However, the ruleset is not the only thing that can be modified: increasing the search radius (i.e., neighborhood size) leads to a new class of cellular automata called Larger-than-Life.
Larger-than-Life CA
Larger-than-Life cellular automata extend the Life-like CA ruleset by introducing the following:
Expanded Search — The neighborhood size (i.e., search radius) is adjustable and can include cells that are not immediately adjacent to the center one. In addition, the center cell can be included as part of the count.
Increased Number of States — Cells can enter a cooldown state, where they don’t participate in the neighborhood counts of other cells and can’t be brought back to life until they exit this state.
More Complex Birth and Survival Conditions — Because of the increased neighborhood size, birth and survival rules are typically expressed as ranges rather than single numbers. For example, in a 7×7 square neighborhood, a birth condition could be expressed as any range of numbers from 0 to 49.
Bosco’s Rule is an example of a Larger-than-Life cellular automaton. The ruleset is as follows:
Just as in Life-like cellular automata, Bosco’s Rule only uses two states: dead and alive. It does not take advantage of cooldown states. Every cell searches within an 11×11 square neighborhood (not including itself), resulting in neighbor counts from 0 to 120. Any dead cell with a neighbor count from 34 to 45 comes to life. Dead cells with a neighbor count outside this range remain dead in the next generation. Any living cell with a neighbor count from 33 to 57 stays alive. Living cells with a neighbor count outside this range die in the next generation.
Applying these rules to a grid of random noise causes chaotic chain reactions and creates self-propelling spaceships that move across the board.
Bosco’s Rule starting from random noise.
The rulestrings for Larger-than-Life cellular automata also reflect this increased complexity. The rulestring for Bosco’s Rule is R5,C2,S33-57,B34-45; each component is broken down below, followed by a small parsing sketch.
R5 → Searches 5 grid units in each direction, creating an 11×11 square neighborhood.
C2 → Uses 2 states.
S33-57 → A cell stays alive when it has 33 to 57 neighbors. Otherwise, it dies.
B34-45 → A cell is brought to life when it has 34 to 45 neighbors. Otherwise, it remains dead.
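As a companion to the breakdown above, here is a small JavaScript sketch (a hypothetical helper, not taken from the article’s source code) that splits an extended rulestring of this form into its search range, state count, and survival/birth intervals.

// Hypothetical helper: parse an extended rulestring such as
// "R5,C2,S33-57,B34-45" into its individual parameters.
function parseExtendedRulestring(rulestring) {
  const rule = { range: 1, states: 2, survival: [0, 0], birth: [0, 0] };

  for (const token of rulestring.split(',')) {
    const key = token[0];
    const value = token.slice(1);

    if (key === 'R') rule.range = Number(value);        // search radius
    else if (key === 'C') rule.states = Number(value);  // number of states
    else if (key === 'S') rule.survival = value.split('-').map(Number);
    else if (key === 'B') rule.birth = value.split('-').map(Number);
  }
  return rule;
}

// Example: Bosco's Rule
const bosco = parseExtendedRulestring('R5,C2,S33-57,B34-45');
// -> { range: 5, states: 2, survival: [33, 57], birth: [34, 45] }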
The wider neighborhood can be implemented by adding an adjustable for loop parameter to extend the search range.
const searchRange = 5;

var sum = 0u;
for (var dy = -searchRange; dy <= searchRange; dy += 1) {
  for (var dx = -searchRange; dx <= searchRange; dx += 1) {
    if (dx == 0 && dy == 0) { continue; }

    let nx = wrapCoord(x + dx); // wrapCoord keeps index in-bounds
    let ny = wrapCoord(y + dy);

    if (readBuffer[ny * boardSize + nx] == 0x1) {
      sum += 1u;
    }
  }
}
The conditional block after the neighbor-count loop also needs to reflect the more complicated ruleset.
var nextState = 0u;

if (currentState == 0u) {
  // if a cell is dead and it has between 34 and 45 alive neighbors,
  // it is brought back to life
  if (sum >= 34 && sum <= 45) {
    nextState = 1u;
  } else {
    nextState = 0u;
  }
} else if (currentState == 1u) {
  // if a cell is alive and it has between 33 and 57 neighbors,
  // it stays alive
  if (sum >= 33 && sum <= 57) {
    nextState = 1u;
  } else {
    nextState = 0u;
  }
}
Though rarely used in practice, Larger-than-Life cellular automata can have more than 2 states, which typically results in quite chaotic outputs. The Modern Art Cellular Automaton uses 255 states and creates weird circuit-like patterns.
ModernArt CA (R10,C255,S122-211,B123-170), starting from a small square of randomly chosen states in the center.
When using only two states, Larger-than-Life CA operate very similarly to Life-like CA: 0 represents a dead cell and 1 represents a living cell. When using more than two states, a living cell doesn’t immediately die when its survival conditions are not met. Instead, it switches to a cooldown state.
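To make the cooldown behavior concrete, here is a rough CPU-style JavaScript sketch of one common convention (the linked shader may differ in its details): state 0 is dead, state 1 is alive and is the only state counted as a neighbor, and states 2 through C − 1 step through the cooldown phase until the cell finally dies.

// A sketch of one common cooldown convention, using a rule object shaped
// like the parsing sketch above: { states, birth: [lo, hi], survival: [lo, hi] }
//   0          -> dead
//   1          -> alive (the only state counted as a neighbor)
//   2 .. C - 1 -> cooldown, advancing by one each generation until death
function nextStateWithCooldown(state, neighborCount, rule) {
  const { states, birth, survival } = rule;

  if (state === 0) {
    // dead cells can only be born
    const born = neighborCount >= birth[0] && neighborCount <= birth[1];
    return born ? 1 : 0;
  }

  if (state === 1) {
    // living cells either survive or enter the cooldown phase
    const survives = neighborCount >= survival[0] && neighborCount <= survival[1];
    if (survives) return 1;
    return states > 2 ? 2 : 0; // with only 2 states there is no cooldown
  }

  // cooldown cells ignore their neighbors and advance toward death
  return (state + 1) % states;
}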
The complete compute shader implementation for this algorithm can be found here.
The most significant distinction between Larger-than-Life and Life-like cellular automata is the size of the neighborhood. As we’ll see in the next section, adjusting the shape of the neighborhood can yield even more interesting results.
Von Neumann Neighborhood
The two most common neighborhoods in cellular automata are the Moore and Von Neumann neighborhoods. Until now, we’ve only used the Moore neighborhood, in which each cell searches a square region around itself.
// 5x5 Moore Neighborhood (radius = 2)
for (var yOffset = -2; yOffset <= 2; yOffset += 1) {
  for (var xOffset = -2; xOffset <= 2; xOffset += 1) {
    if (xOffset == 0 && yOffset == 0) { continue; }

    let nx = wrapCoord(x + xOffset);
    let ny = wrapCoord(y + yOffset);

    if (readBuffer[ny * boardSize + nx] == 0x1) {
      neighborCount += 1;
    }
  }
}
A 5x5 Moore neighborhood
Instead of searching in a square region, the Von Neumann neighborhood searches all cells within a certain Manhattan distance. A Von Neumann neighborhood of size 1 searches all cells in the 4 immediate cardinal directions.
let x = i32(id.x);
let y = i32(id.y);

var count = 0u;

// 3x3 Von Neumann neighborhood (radius = 1)
if (readBuffer[(y + 1) * boardSize + (x + 0)] == 0x1) { count += 1; }
if (readBuffer[(y - 1) * boardSize + (x + 0)] == 0x1) { count += 1; }
if (readBuffer[(y + 0) * boardSize + (x + 1)] == 0x1) { count += 1; }
if (readBuffer[(y + 0) * boardSize + (x - 1)] == 0x1) { count += 1; }
A 3x3 Von Neumann neighborhood
To produce a Von Neumann neighborhood of any size, we have to iterate over all positions within a given Manhattan distance.
A 7x7 Von Neumann neighborhood
We’ll use a nested for loop. The outer loop runs over each y-offset, and for every y-offset, we then determine the valid range of x-offsets.
From the definition of Manhattan distance, we get the following condition:
\(|x| + |y| \leq \text{size} \;\;\Longrightarrow\;\; |x| \leq \text{size} - |y|\)
For each y-offset, we can then iterate the x-offset from -(size - |yOffset|) to +(size - |yOffset|), covering all valid positions within the neighborhood.
for (var yOffset = -searchRange; yOffset <= searchRange; yOffset += 1) {
  let x_search = searchRange - abs(yOffset);

  for (var xOffset = -x_search; xOffset <= x_search; xOffset += 1) {
    if (xOffset == 0 && yOffset == 0) { continue; }

    let nx = wrapCoord(x + xOffset);
    let ny = wrapCoord(y + yOffset);

    if (readBuffer[ny * boardSize + nx] == 0x1) {
      neighborCount += 1;
    }
  }
}
While most Life-like and Larger-than-life cellular automata use the Moore neighborhood, applying the Von Neumann neighborhood often changes the system’s behavior in surprising ways.
The Game of Life ( B3/S23 ) using a Von Neumann neighborhood, restarted multiple times.
Neighborhood definitions are not limited to Moore and Von Neumann. In fact, combining or extending them can produce entirely new classes of cellular automata.
Multiple Neighborhood Cellular Automata
In addition to changing the shape and size of a neighborhood, it’s possible to combine multiple neighborhoods, each with its own distinct ruleset.
A cellular automaton using an inner circle neighborhood and an outer square neighborhood
Combining more than one neighborhood creates what are known as Multiple Neighborhood Cellular Automata (MNCA). These automata often create very organized, biological-looking structures.
Multiple Neighborhood CA implementation from Slackermanz. https://www.shadertoy.com/view/Nts3RM
The simplest case is to use two neighborhoods: an inner one and an outer one. Each neighborhood maintains its own neighbor count. A rule can be expressed in terms of both counts, such as “if a cell has a high inner count and a low outer count, it stays alive”.
Example of a standard Moore neighborhood of range 1 and a circular neighborhood of range 6, with a minimum distance of 4.
This additional degree of freedom vastly enlarges the space of possible rulesets, and the interaction between neighborhoods produces highly complex behavior.
In WGSL, multi-neighborhood search can be easily expressed with a few helper functions that take in a minimum distance, a maximum distance, and the current board position:
fn searchSquare(minDist: i32, search: i32, id: vec3<u32>) -> f32 { /* ... */ }
fn searchNeumann(minDist: i32, search: i32, id: vec3<u32>) -> f32 { /* ... */ }
fn searchCircle(minDist: i32, search: i32, id: vec3<u32>) -> f32 { /* ... */ }
For large neighborhoods, neighbor counts become difficult to reason about. It’s easier to think about the percentage of living cells in that neighborhood, rather than the absolute count.
fn searchCircle(minDist: i32, search: i32, id: vec3<u32>) -> f32 {
  var aliveCount = 0u;
  var totalCount = 0u;

  let rMax = i32(search);
  let rMin2 = i32(minDist) * i32(minDist);
  let rMax2 = rMax * rMax;

  for (var yOffset: i32 = -rMax; yOffset <= rMax; yOffset += 1) {
    let y2 = yOffset * yOffset;

    for (var xOffset: i32 = -rMax; xOffset <= rMax; xOffset += 1) {
      let d2 = xOffset * xOffset + y2;
      if (d2 < rMin2 || d2 > rMax2) {
        // skip cells outside the min/max distance range
        continue;
      }

      var x = wrapCoord(i32(id.x) + xOffset);
      var y = wrapCoord(i32(id.y) + yOffset);

      if (readBuffer[y * boardSize + x] == 1u) {
        // count alive cells
        aliveCount += 1u;
      }

      // count total amount of cells in neighborhood
      totalCount += 1u;
    }
  }

  // return percentage of cells in neighborhood that are alive
  return f32(aliveCount) / f32(totalCount);
}
We can look at an example that shows some interesting dynamics between two neighborhoods. First, we’ll start with a simple circular neighborhood:
@compute @workgroup_size(8, 8, 1)
fn cs_main(@builtin(global_invocation_id) id : vec3<u32>) {
  const boardSize = 256;
  let idx = id.y * boardSize + id.x;

  let current_state = readBuffer[idx];
  let percentAlive = searchCircle(1, 5, id);

  var out = current_state;
  if (inRange(percentAlive, 0.114, 0.200)) { out = 0u; }
  if (inRange(percentAlive, 0.281, 0.301)) { out = 1u; }
  if (inRange(percentAlive, 0.550, 0.750)) { out = 0u; }

  writeBuffer[idx] = out;
}
Beyond expanding outward very quickly, the code above produces a cellular automaton that’s relatively uninteresting.
Now, suppose we wanted to keep this explosive outward expansion but add more complex behavior. We can set a minimum distance for the circular neighborhood, forming an outer ring.
A circular neighborhood with a minimum distance
Inside this ring, we can use a smaller neighborhood with a different set of rules.
An outer circular neighborhood, combined with a smaller Moore neighborhood
Combining two neighborhoods with distinct rulesets often produces very complex, unpredictable behavior.
let innerPercent = searchSquare(1, 1, id);
let outerPercent = searchCircle(3, 5, id);

if (inRange(innerPercent, 0.000, 0.125)) { out = 0u; }
if (inRange(innerPercent, 0.500, 1.000)) { out = 1u; }

if (inRange(outerPercent, 0.114, 0.200)) { out = 0u; }
if (inRange(outerPercent, 0.281, 0.301)) { out = 1u; }
if (inRange(outerPercent, 0.550, 0.750)) { out = 0u; }
Modifying the shape and size of a neighborhood, combining multiple neighborhoods, and tuning birth and death conditions can all produce vastly different results with Multiple Neighborhood Cellular Automata.
Yet there are other modifications to cellular automata to explore. In the next section, we’ll look at Cyclic Cellular Automata, which create organized, repeating patterns with the use of multiple states.
Cyclic Cellular Automata
With Cyclic Cellular Automata (CCA, for short), the alive and dead metaphor no longer works. Instead, each cell can have an arbitrary number of states. The number of states is a fixed integer chosen ahead of time.
Even when starting from pure noise, CCA create remarkably stable, repeating structures.
A 7-state cyclic cellular automaton starting from pure noise
As with other automata, each cell has a defined neighborhood, with Moore and Von Neumann being the most popular. Instead of counting alive or dead neighbors, each cell looks for neighbors whose state is exactly one greater than its own. If the cell is already at the maximum state, it searches for state 0 — hence the “cyclic” in Cyclic Cellular Automata.
const states = 7u;

let current_state = readBuffer[idx];
let next_state = (current_state + 1u) % states;
A cell advances to the next state if its neighbor count is greater than or equal to a predetermined threshold. This threshold can be adjusted according to the desired results of the algorithm.
var neighbor_count = 0u;

const search = 4; // Example: standard Moore search of radius 4
for (var yOffset = -search; yOffset <= search; yOffset += 1) {
  for (var xOffset = -search; xOffset <= search; xOffset += 1) {
    // ...
    if (readBuffer[y * boardSize + x] == next_state) {
      // count neighbors whose state is equal to next_state
      neighbor_count += 1u;
    }
  }
}

// Example threshold of 6
const threshold = 6u;

var out = current_state;
if (neighbor_count >= threshold) {
  out = next_state;
}

writeBuffer[idx] = out;
The fun part about CCA is that we can choose different colors to represent each state and give the cyclic structure a cohesive color palette.
// one color defined for each state
// this example uses 7 states
const color_palette = array<vec3<f32>, 7>(
  vec3(0.07, 0.07, 0.07),
  vec3(0.33, 0.33, 0.33),
  vec3(0.90, 0.90, 0.90),
  vec3(1.0, 0.0, 0.20),
  vec3(0.55, 0.55, 0.55),
  vec3(0.15, 0.15, 0.15),
  vec3(0.75, 0.0, 0.15)
);

@fragment
fn fs_main(@location(0) pos : vec2f) -> @location(0) vec4f {
  const board_size = 512u;

  let index = vec2u(pos * f32(board_size));
  let value = readBuffer[index.y * board_size + index.x];

  return vec4f(color_palette[value], 1.0);
}
An automaton using the color palette from the code example above
The three main adjustable parameters of Cyclic Cellular Automata are:
The number of states
The size and shape of the neighborhood
The threshold at which cells advance to the next state
Changing any one of these can lead to vastly different results.
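As a sketch of how these three parameters interact, the following JavaScript mirrors the shader logic above on the CPU, with the state count, neighborhood radius, and threshold exposed as arguments (the function and parameter names are illustrative, not from the original code).

// Compute one generation of a cyclic CA with a Moore neighborhood.
// readBoard and writeBoard are Uint32Array grids of boardSize * boardSize.
function stepCyclicCA(readBoard, writeBoard, boardSize, states, radius, threshold) {
  const wrap = (v) => (v + boardSize) % boardSize;

  for (let y = 0; y < boardSize; ++y) {
    for (let x = 0; x < boardSize; ++x) {
      const current = readBoard[y * boardSize + x];
      const next = (current + 1) % states;

      // count neighbors that are already in the next state
      let count = 0;
      for (let dy = -radius; dy <= radius; ++dy) {
        for (let dx = -radius; dx <= radius; ++dx) {
          if (dx === 0 && dy === 0) continue;
          if (readBoard[wrap(y + dy) * boardSize + wrap(x + dx)] === next) {
            count += 1;
          }
        }
      }

      // the cell advances only if enough neighbors are one state ahead
      writeBoard[y * boardSize + x] = count >= threshold ? next : current;
    }
  }
}

// The 7-state example from above: radius 4, threshold 6
// stepCyclicCA(board0, board1, 512, 7, 4, 6);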
Cyclic Cellular Automata are bound to a small number of fixed integer states. However, more recent developments have generalized cellular automata algorithms to work with any number of states, even allowing us to use the full range of floating-point numbers between 0.0 and 1.0. These automata are called Continuous Cellular Automata.
Continuous Cellular Automata
Representing each cell as a floating-point number allows smooth, continuous transitions between states. The first well-known example of a cellular automaton to use floating-point states is SmoothLife.
SmoothLife generates continuous, self-organizing patterns reminiscent of living forms.
SmoothLife is a generalization of Conway’s Game of Life where each board position can be thought of as a density of cells, ranging from 0.0 to 1.0. While historically important, SmoothLife is rather complex to implement.
In this article, we’ll explore a simpler alternative: Primordia, a continuous but more tractable cellular automaton inspired by SmoothLife.
To implement these two cellular automata in compute shaders, the read and write buffers must use floating-point numbers instead of integers.
@group(0) @binding(0) var<storage, read> readBuffer : array<f32>;
@group(0) @binding(1) var<storage, read_write> writeBuffer : array<f32>;
const initialState = new Float32Array(boardSize ** 2);

for (let y = 0; y < boardSize; ++y) {
  for (let x = 0; x < boardSize; ++x) {
    // initialize buffer0 with floating-point number in the range [0, 1)
    initialState[y * boardSize + x] = Math.random();
  }
}

device.queue.writeBuffer(buffers[0], 0, initialState.buffer);
Primordia’s ruleset is defined by two numeric intervals: one for growth and one for decay. The key difference is that Primordia doesn’t count discrete neighbors; instead, it measures the local density of living cells.
To measure the local cell density, we can take the average cell density of all surrounding positions in a Moore neighborhood.
var sum = 0.0;

for (var dy = -1; dy <= 1; dy += 1) {
  for (var dx = -1; dx <= 1; dx += 1) {
    if (dx == 0 && dy == 0) { continue; }

    let nx = wrapCoord(x + dx);
    let ny = wrapCoord(y + dy);

    let index = ny * board_size + nx;
    sum += readBuffer[index];
  }
}

let avg_sum = sum / 8.0;
Because Primordia is continuous, board positions aren’t immediately set to 0 or 1. Instead, each position stores a floating-point cell density that gradually changes over time.
We define two pairs of thresholds that control growth and decay:
If the surrounding density of living cells is in the growth range, the cell density increases slightly.
If it falls outside the survival range, the cell density decreases slightly.
// Adjustable growth/decay parameters
const growth_min = 0.2;
const growth_max = 0.25;
const survival_min = 0.19;
const survival_max = 0.334;

// Adjust how fast the algorithm progresses
const growth_delta = 1.0 / 32.0;

var growth = 0.0;
if (avg_sum >= growth_min && avg_sum <= growth_max) {
  // increase cell density if surrounding cell density is in growth range
  growth = growth_delta;
}
if (avg_sum <= survival_min || avg_sum >= survival_max) {
  // decrease cell density if surrounding cell density is too high or too low
  growth = -growth_delta;
}

let current_cell_density = readBuffer[idx];
let new_cell_density = saturate(current_cell_density + growth);

writeBuffer[idx] = new_cell_density;
The result is a cellular automaton that smoothly transitions from one state to another.
Primordia, of course, is only the beginning. Many techniques from previous sections of this article can be applied to continuous cellular automata as well.
A Cyclic Cellular Automaton, modified for continuous states
Research in continuous cellular automata is an ongoing venture, with developments like Lenia being discovered as recently as 2019.
In practice, the addition of floating-point states is just another layer that can be combined with any other technique shown in this article, often producing interesting (and sometimes chaotic) behavior.
Conclusion
In this article, we implemented six cellular automata algorithms in WebGPU compute shaders:
Conway’s Game of Life
Life-like CA
Larger-than-Life CA
Multi-neighborhood CA
Cyclic CA
Continuous CA
Each one presents a different way to generate emergent complexity from simple, local rules. With modern GPUs, these emergent systems can be explored in real-time and offer a glimpse into the fundamental relation between computation, biological processes, and emergent behavior.