Art

Wave Function Collapse

The WaveFunctionCollapse algorithm[1] works off an inventory of possible patterns in a grid. Initially each grid point is initialized to "all possible patterns". As the algorithm iterates, a grid point is selected, a valid tile is selected for that grid point, and knowledge about how that constrains possible adjacent patterns is propagated to the neighbouring points in the grid. It is configurable how far this wave of propagation continues. The algorithm terminates when either everything has been "collapsed" to a single known state (a chosen pattern) or there are no more valid continuations. If the adjacency rules are not too stringent, the algorithm generally converges fairly well.

There are two main modes in which the algorithm operates:

  1. Simple tiled mode
  2. Overlapping mode

For simple tiles, a set of base tiles is defined to provide the patterns along with some adjacency rules. The grid represents the grid of tiles, producing a final image that is grid size X tile size.

For the overlapping mode, patterns are sampled from a source image. Adjacency rules are determined by considering the patterns that are the same except for a leading or trailing row or column. In this case each grid point represents a single pixel inferred from the patterns for a final image that is just the size of the grid, regardless of tile size. Increasing tilesize limits the number of allowable overlaps. Since the algorithm performance depends largely on number of points in the grid (so increases with the square of the length/width of the grid), producing large images with the overlapping mode can get very expensive. However, the result can be generated setting the periodic output flag, which means the top and bottom of the result match as do the left and right, so multiple instances of the same tile can be laid out repeatedly in a grid to produce a single, larger, seamless image.

Selection of the next grid point is based on the minimum entropy. Initially this is (log(Σwi) - Σlog(wi))/Σwi (wi is the weight of the ith tile). As the algorithm proceeds and allowable tiles are eliminated at a grid point, eliminated tiles' contributions to this value are removed. A small random factor is added in at selection time to reduce ties and add more variability to the outputs.

Once a grid point is selected, a tile is picked randomly from the remaining allowable tiles, with the selection weighted in accordance with relative tile weights.

The following series of diagrams shows the progression of the algorithm (simple tiled mode), showing a rendering of the state for various intermediate iterations. You can see growing regions of collapse where a definitive tile has been chosen for each grid point and shrinking regions of complete lack of knowledge, showing the superposition of all possible tiles. Around the boundaries of the islands of collapse are neighbourhoods of partial knowledge: the grid still represents a superposition of possible tiles, but it is a subset of the complete set. The extent of these neighbourhoods is another algorithm parameter: with larger neighbourhoods each iteration is more costly to compute, but fewer iterations are required, and the overall constraints on the pattern are stricter. Larger neighbourhoods also raises the chance of the collapse failing.

(Notice how here the output is periodic, so that the result joins up with itself.)

Simple Tiled Mode

The simple tiled mode is an easy and generally fairly quick way of defining a constrained tiling pattern. The original implementation supports tiles defined as small PNG files: I added support for raw P3/P6 images, adhoc pattern matrixes, and adhoc drawn shapes (which I can then render out to SVG). The series showing the progress of the algorithm was made with adhoc drawn tiles, in this case squares filled with different mutations of a base random gradient (linear, radial, reversed radial, and slanted). The adhoc drawing mode is the mode I use most: it gives me the most flexibility.

On the left: Several runs of the algorithm using adhoc line designs, layered over one another with different colouring and opacities. On the right: layered maze designs, with a warping function applied to the results.

Symmetries

Tiles are defined with a symmetry code that defines how the shape operates under rotation and reflection (through the vertical center line). The code letters are supposed to be evocative, in that the symmetry behaves like the letter. For example, with "T" symmetry, each of the four rotations is distinct, the reflection of the unrotated and twice-rotated forms is unchanged, and the reflection of the single- and triple-rotated forms is each other.

Adjacency rules are defined against a sequence of 8 possibilities. The first four define the (counter-clockwise) rotation of the pattern 0, 1, 2, and 3 times. The second four define the horizontal reflections of those same variations.

That is, numbering from 0 to 7:

  • 0 = tile
  • 1 = rotate left to bottom
  • 2 = rotate left to right
  • 3 = rotate left to top
  • 4 = reflect 0
  • 5 = reflect 1
  • 6 = reflect 2
  • 7 = reflect 3

The symmetry variations define the relationships between the rotations and reflections for a given pattern:

  • X: 1 distinct rotation; everything same across rotation and reflection

    0 0 0 0 0 0 0 0

  • I: 2 distinct rotations; reflect(0)=0, reflect(1)=1, reflect(2)=0, reflect(3)=1

    0 1 0 1 0 1 0 1

  • \: 2 distinct rotations; reflect(0)=1, reflect(1)=0, reflect(2)=1, reflect(3)=0

    0 1 0 1 1 0 1 0

  • L: 4 distinct rotations; reflect(0)=1, reflect(1)=0, reflect(2)=3, reflect(3)=2

    0 1 2 3 1 0 3 2

  • T: 4 distinct rotations; reflect(0)=0, reflect(1)=3, reflect(2)=2, reflect(3)=1

    0 1 2 3 0 3 2 1

  • [: 4 distinct rotations; reflect(0)=2, reflect(1)=1, reflect(2)=0, reflect(3)=3

    0 1 2 3 2 1 0 3

  • %: 4 distinct rotations; reflect(0)=3, reflect(1)=2, reflect(2)=1, reflect(3)=0

    0 1 2 3 3 2 1 0

  • F: everything distinct; each quad creates own rotation group

    1 2 3 0 5 6 7 4

I added the % and [ codes to the original set. The F variant is not available in all ports of the algorithm.

Symmetry arrays are calculated for each rotational variant.

Adjacency rules are given by specifying a tile and rotational variant and the tile and rotational variant that can be to its right. Based on the symmetries, adjacency in other directions and for other variants can be inferred.

If L[n] represents slot n in the symmetry arrays for the left tile, and R[n] for the right. Suppose we have an explicit adjacency rule that says a T tile rotation 0 (T0) can be left of an L tile rotation 2 (L2). Then we also know that L2 can be to the left of T0 and that R[2] can be above L[2], that is, that L3 can be above T1.

The full set of symmetries is defined as follows:

L=L[1]
D=L[2]
R=R[1]
U=R[2]

R=L+R L[7]+R[7] R[5]+L[5] R[3]+L[3]
U=D+U U[7]+D[7] D[5]+U[5] U[3]+D[3]
L=R+L R[7]+L[7] L[5]+R[5] L[3]+R[3]
D=U+D D[7]+U[7] U[5]+D[5] D[3]+U[3]
        

The idea behind the symmetry codes is to ensure tile patterns fit smoothly together, but they can also be used to constrain the set of allowable patterns whether they represent the actual symmetry of the pattern or not. The table below shows the effects of changing the symmetry code. The patterns below consist of a two-coloured bar along the left edge of the tile and a circle centered in the tile. The "real" symmetry codes should be F and X, respectively. Here I run the model them with different symmetry codes for the bar while keeping the rule sets the same. Each symmetry gives a slightly different vibe to the overall arrangement. The 4-fold symmetry variants are the most similar to each other, because they differ only in which reflection variants are considered the same, so they mainly impact frequencies of certain orientations. The other symmetry variants impose stricter constraints on the patterns, either because of limitations of the rules (F) or limitations of the symmetries (X, I, \).

1 distinct rotation
X
 
2 distinct rotations
I\
 
4 distinct rotations
TL[%
 
8 distinct rotations
F

Overlapping Mode

The overlapping mode starts with a single source image and constructs small base tiles by taking subtiles from the image. Duplicate tiles are consolidated and adjacency rules inferred. In this case "adjacency" means that the tiles are the same, except for the row or column on the edge. That is, for one NxN tile to be allowed to the right of another, the last N-1 columns of the left tile must match the first N-1 columns of the right tile. (Hence the name: overlapping.) The algorithm then proceeds as the simple tiled version does, except when it comes to final rendering: instead of rendering out a whole tile at each grid point, we render out just the corner pixel at that point. This is why the overlapping variant produces much smaller tilings for the same effort: the final size is reduced by a factor of the tilesize.

The technique for reducing duplicates is quite clever. The original set of colours is mapped to an integer index. Each matrix of integers is then converted into a number base K, where K is the number of distinct colours. Comparing tiles is just a matter of comparing these numbers. The technique works best if the number of distinct tiles and colours is relatively small: so fairly small and simple source images with a small number of distinct colours.

When the overlapping mode is used, you get designs in the style of the original image, with similar constraints. Since the tiles are sampled, there are no explicit symmetry rules as there are for explicit tiles. Instead, an overall parameter defines the allowable number of distinct symmetry variants of the pattern: typically 8 (the default), 4, or 2. As the number of symmetries decreases, the pattern becomes more constrained:

Base tile:

Symmetry 8Symmetry 4Symmetry 2

The neighbourhood parameter is also the tilesize parameter in the overlapping mode. N=1 amounts to randomly picking pixels from the sampled set. N=2 is a little better but doesn't really honor the constraints and style of the original very well. For this sample tile N more than 3 consistently fails but for other patterns it can give you output patterns with more global consistency with the source pattern:.

N=1N=2N=3

Inferred Tiles

Inferred tiles combines the operation of the simple tiling mode with the sampling of overlap mode: tiles are sampled from the base tile like in the overlap mode, but the adjacent rules are based only on the edge columns, not the overlap. Keeping the full set of tiles and adjacency rules can lead to very expensive and poorly constrained results, especially for high symmetry values. In addition to using a smaller symmetry value, the inference also separates tilesize from neighbourhood size (as does simple tiling), and provides for only keeping a certain fraction of the rules. Selection of rules to keep is random.

Base tile:

With symmetry=2 and a tilesize of 10 (vs sample image size 70):

keep=25%keep=50%keep=75%

Using 100% of the rules with the same base pattern as the two series in the previous section (with symmetry 8, neighbourhood 3) gives:

Other variations give spottier patterns.

Clearly explicit rules or the overlapping algorithm produces cleaner results. I am experimenting with other ways of controlling the rule sets.

References

  1. Source code for original implementation of Wave Function Collapse algorithm (C#), pointers to other ports: https://github.com/mxgmn/WaveFunctionCollapse>

Updated: 20220720, fixing mistranscribed symmetry rules.