Maze Generation w/ Disjoint-Sets & Union-Find Observing Properties of Mazes Cells are "matched" with a select few adjacent ones. Cells that have been matched do not have a wall between them. All cell pairs that are not "matched" have a wall separating them. Mazes can be represented as graphs. Depending on the properties of the maze, it can be a minimum spanning tree. We can use typical graph algorithms to find solutions to mazes. Popular choices include DFS (Depth-First Search) and BFS (Breadth-First Search). We can use them to find a solution from any S to any T, easily. The Disjoint-Set Data Structure Union: Join two disjoint sets together. Join two disjoint sets together. Find: Get the vertex that is at the "root" of a disjoint set. This is the "ID" of that set. Union Operation i i j i j i j i 0 0 1 1 n - 1 n - 1 1 2 0 0 1 n - 1 The maze generation algorithm we just used is known as Randomised Kruskal's Algorithm . . There are no cycles in this graph. There is exactly one path from every S to every T . to every . If drawn as a graph, it is a minimal spanning tree. It tends to generate mazes with patterns that are easy to solve. A simple square maze is boring. We can do better. Find Operation 0 0 1 1 0 0 Improving "find(i)" Union by rank - When merging two sets, attach the shorter tree to the taller one. This forces minimal (or no) growth of the tree. In fact, at most, the tree can only grow in height by 1 from this. - When merging two sets, attach the shorter tree to the taller one. This forces minimal (or no) growth of the tree. In fact, at most, the tree can only grow in height by 1 from this. Path compression - Make every node point straight to the root node. Maze Generation is an element in games that can give players a unique experience every time they play. Doing them efficiently can be accomplished via Graph Theory. If you attended my previous talk , you will know how powerful the Disjoint-Set data structure is for object detection. In this talk, however, we're going to use it to generate mazes... the right way.Let's see what we got here... There are a few things about mazes that we should pay attention to prior to making one ourselves:Now then, let's talk about Disjoint-Sets...Initially, treat this data structure as if we havedisjoint sets (hence the name), not connected to each other at all. When thinking about a maze, treat this as a maze wherewalls are up, and you can't go from any cell to any other cell. Then, we can use operations to manipulate this data structure and connect cells together.We have two operations we can use:andLet's discuss them... For now, let's only talk abouthas a neat optimisation that'll come in handy later.This one is trivial. Join two disjoint sets together. For this part, I'm going to notate it aswhere S= S⋃ Swe merge the two sets Sand Sinto S. Then, Sis obliterated.Let's show a visual example of this... It might make some things click more easily.Assume a graphwhere(there are 16 vertices) and(there are 0 edges). Each separate vertex is part of its own set S( v∈ S, v∈ S, ..., v∈ S, ). Shown as a 4 × 4 grid, we have the following:Now let's say we performed. The figures below show the selection of 1 and 2 (left) as well as the result of the union operation (right), visually:Post-union(), S= { 1, 2 }. Sis empty and obliterated. How about we try anow?To properly generate a maze, we can just keep repeating this procedure until there is only one set left.At this point, the only remaining set is S= { v, v, ..., v}. We are unable to merge anymore sets (and break anymore walls) because they all belong in the same set already. Any other walls being broken down will force ato appear in the graph. Let's break down the kind of graph we have just created:Though, this wouldn't be a talk by me though if I didn't say we can do better, now would it? Let's expand on this concept:We can connect 2 mazes together by breaking down a wall between them. We can even add a "hallway" between them if we wanted. This is only possible because there exists a path from everyto everyThus, if we broke down a wall on the outer end of the maze and merged two together, there will always exist a path from one maze to the other, as there will always be a path to the cell with the outer wall we broke. Here's what I mean:This kind of "horizontal expansion" is possible with mazes. We can also do this vertically. Notice, though, that there is a valid path from anywhere in the left maze to anywhere in the right maze. To take this to the extreme, we cando better, and expand the maze by an(or a few, if we really wanted to). Let's give it a second floor...We can go on, but I think this gets the point across. We can also combine the horizontal/vertical expansion with this "elevator" to make some pretty unique (but also tedious) puzzles!The "find" operation is used to find the ID of the set a vertex belongs in. I'll denote it as. If S= { v, v} and I call, it'll return 0, because v∈ S. By the time the maze generation algorithm above is done, callingon any vertex will return 0, as they are all in SSince theof two sets, in a graph theory sense, is simply connecting an edge between two points, theoperation is simply going up to theof the tree and returning that value. Let's go though the previous maze generation example once more. This time, let's see how ais built from all of this.Now that we've constructed the graph, let's order it to where the coloured node (the root) is at the top. It'll look like this:There's something bad about this... Take a look at the deepest node in that tree. Sincejust goes up to the top and returns the root node, it has to go throughvertices before it returns. That's an O(n) operation right there.Now, I'm not going to make a huge deal out of a linear-time lookup. A maze of size 2048 × 2048 would speak for itself. But, like I said, we can do better...There are two techniques we can apply to our operations to make find() perform much better:and...The visuals of these two would get messy quite quickly, so I decided to not draw them out. But I think those explanations make it obvious how these improve on what we had before.Now then... with these optimisations in place, our O(n) lookup time suddenly becomes lgn (iterated logarithm base 2). In the world of Computer Science, this is essentially. For the record, if x = 2, then lg(x) = 5. Here's a table of values just to show how slow the equation grows...For the record,a is not a typo. It's known as tetration , and is a step up from exponents. If I said2, that's the same as 22 = 2. You get the idea.