Previous | Contents | Next

Chapter 6: How to write a new puzzle

This chapter gives a guide to how to actually write a new puzzle: where to start, what to do first, how to solve common problems.

The previous chapters have been largely composed of facts. This one is mostly advice.

6.1 Choosing a puzzle

Before you start writing a puzzle, you have to choose one. Your taste in puzzle games is up to you, of course; and, in fact, you're probably reading this guide because you've already thought of a game you want to write. But if you want to get it accepted into the official Puzzles distribution, then there's a criterion it has to meet.

The current Puzzles editorial policy is that all games should be fair. A fair game is one which a player can only fail to complete through demonstrable lack of skill – that is, such that a better player in the same situation would have known to do something different.

For a start, that means every game presented to the user must have at least one solution. Giving the unsuspecting user a puzzle which is actually impossible is not acceptable. (There is an exception: if the user has selected some non-default option which is clearly labelled as potentially unfair, then you're allowed to generate possibly insoluble puzzles, because the user isn't unsuspecting any more. Same Game and Mines both have options of this type.)

Also, this actually rules out games such as Klondike, or the normal form of Mahjong Solitaire. Those games have the property that even if there is a solution (i.e. some sequence of moves which will get from the start state to the solved state), the player doesn't necessarily have enough information to find that solution. In both games, it is possible to reach a dead end because you had an arbitrary choice to make and made it the wrong way. This violates the fairness criterion, because a better player couldn't have known they needed to make the other choice.

(GNOME has a variant on Mahjong Solitaire which makes it fair: there is a Shuffle operation which randomly permutes all the remaining tiles without changing their positions, which allows you to get out of a sticky situation. Using this operation adds a 60-second penalty to your solution time, so it's to the player's advantage to try to minimise the chance of having to use it. It's still possible to render the game uncompletable if you end up with only two tiles vertically stacked, but that's easy to foresee and avoid using a shuffle operation. This form of the game is fair. Implementing it in Puzzles would require an infrastructure change so that the back end could communicate time penalties to the mid-end, but that would be easy enough.)

Providing a unique solution is a little more negotiable; it depends on the puzzle. Solo would have been of unacceptably low quality if it didn't always have a unique solution, whereas Twiddle inherently has multiple solutions by its very nature and it would have been meaningless to even suggest making it uniquely soluble. Somewhere in between, Flip could reasonably be made to have unique solutions (by enforcing a zero-dimension kernel in every generated matrix) but it doesn't seem like a serious quality problem that it doesn't.

Of course, you don't have to care about all this. There's nothing stopping you implementing any puzzle you want to if you're happy to maintain your puzzle yourself, distribute it from your own web site, fork the Puzzles code completely, or anything like that. It's free software; you can do what you like with it. But any game that you want to be accepted into my Puzzles code base has to satisfy the fairness criterion, which means all randomly generated puzzles must have a solution (unless the user has deliberately chosen otherwise) and it must be possible in theory to find that solution without having to guess.

6.2 Getting started

The simplest way to start writing a new puzzle is to copy nullgame.c. This is a template puzzle source file which does almost nothing, but which contains all the back end function prototypes and declares the back end data structure correctly. It is built every time the rest of Puzzles is built, to ensure that it doesn't get out of sync with the code and remains buildable.

So start by copying nullgame.c into your new source file. Then you'll gradually add functionality until the very boring Null Game turns into your real game.

Next you'll need to add your puzzle to the Makefiles, in order to compile it conveniently. Do not edit the Makefiles: they are created automatically by the script mkfiles.pl, from the file called Recipe. Edit Recipe, and then re-run mkfiles.pl.

Also, don't forget to add your puzzle to list.c: if you don't, then it will still run fine on platforms which build each puzzle separately, but Mac OS X and other monolithic platforms will not include your new puzzle in their single binary.

Once your source file is building, you can move on to the fun bit.

6.2.1 Puzzle generation

Randomly generating instances of your puzzle is almost certain to be the most difficult part of the code, and also the task with the highest chance of turning out to be completely infeasible. Therefore I strongly recommend doing it first, so that if it all goes horribly wrong you haven't wasted any more time than you absolutely had to. What I usually do is to take an unmodified nullgame.c, and start adding code to new_game_desc() which tries to generate a puzzle instance and print it out using printf(). Once that's working, then I start connecting it up to the return value of new_game_desc(), populating other structures like game_params, and generally writing the rest of the source file.

There are many ways to generate a puzzle which is known to be soluble. In this section I list all the methods I currently know of, in case any of them can be applied to your puzzle. (Not all of these methods will work, or in some cases even make sense, for all puzzles.)

Some puzzles are mathematically tractable, meaning you can work out in advance which instances are soluble. Sixteen, for example, has a parity constraint in some settings which renders exactly half the game space unreachable, but it can be mathematically proved that any position not in that half is reachable. Therefore, Sixteen's grid generation simply consists of selecting at random from a well defined subset of the game space. Cube in its default state is even easier: every possible arrangement of the blue squares and the cube's starting position is soluble!

Another option is to redefine what you mean by ‘soluble’. Black Box takes this approach. There are layouts of balls in the box which are completely indistinguishable from one another no matter how many beams you fire into the box from which angles, which would normally be grounds for declaring those layouts unfair; but fortunately, detecting that indistinguishability is computationally easy. So Black Box doesn't demand that your ball placements match its own; it merely demands that your ball placements be indistinguishable from the ones it was thinking of. If you have an ambiguous puzzle, then any of the possible answers is considered to be a solution. Having redefined the rules in that way, any puzzle is soluble again.

Those are the simple techniques. If they don't work, you have to get cleverer.

One way to generate a soluble puzzle is to start from the solved state and make inverse moves until you reach a starting state. Then you know there's a solution, because you can just list the inverse moves you made and make them in the opposite order to return to the solved state.

This method can be simple and effective for puzzles where you get to decide what's a starting state and what's not. In Pegs, for example, the generator begins with one peg in the centre of the board and makes inverse moves until it gets bored; in this puzzle, valid inverse moves are easy to detect, and any state that's reachable from the solved state by inverse moves is a reasonable starting position. So Pegs just continues making inverse moves until the board satisfies some criteria about extent and density, and then stops and declares itself done.

For other puzzles, it can be a lot more difficult. Same Game uses this strategy too, and it's lucky to get away with it at all: valid inverse moves aren't easy to find (because although it's easy to insert additional squares in a Same Game position, it's difficult to arrange that after the insertion they aren't adjacent to any other squares of the same colour), so you're constantly at risk of running out of options and having to backtrack or start again. Also, Same Game grids never start off half-empty, which means you can't just stop when you run out of moves – you have to find a way to fill the grid up completely.

The other way to generate a puzzle that's soluble is to start from the other end, and actually write a solver. This tends to ensure that a puzzle has a unique solution over and above having a solution at all, so it's a good technique to apply to puzzles for which that's important.

One theoretical drawback of generating soluble puzzles by using a solver is that your puzzles are restricted in difficulty to those which the solver can handle. (Most solvers are not fully general: many sets of puzzle rules are NP-complete or otherwise nasty, so most solvers can only handle a subset of the theoretically soluble puzzles.) It's been my experience in practice, however, that this usually isn't a problem; computers are good at very different things from humans, and what the computer thinks is nice and easy might still be pleasantly challenging for a human. For example, when solving Dominosa puzzles I frequently find myself using a variety of reasoning techniques that my solver doesn't know about; in principle, therefore, I should be able to solve the puzzle using only those techniques it does know about, but this would involve repeatedly searching the entire grid for the one simple deduction I can make. Computers are good at this sort of exhaustive search, but it's been my experience that human solvers prefer to do more complex deductions than to spend ages searching for simple ones. So in many cases I don't find my own playing experience to be limited by the restrictions on the solver.

(This isn't always the case. Solo is a counter-example; generating Solo puzzles using a simple solver does lead to qualitatively easier puzzles. Therefore I had to make the Solo solver rather more advanced than most of them.)

There are several different ways to apply a solver to the problem of generating a soluble puzzle. I list a few of them below.

The simplest approach is brute force: randomly generate a puzzle, use the solver to see if it's soluble, and if not, throw it away and try again until you get lucky. This is often a viable technique if all else fails, but it tends not to scale well: for many puzzle types, the probability of finding a uniquely soluble instance decreases sharply as puzzle size goes up, so this technique might work reasonably fast for small puzzles but take (almost) forever at larger sizes. Still, if there's no other alternative it can be usable: Pattern and Dominosa both use this technique. (However, Dominosa has a means of tweaking the randomly generated grids to increase the probability of them being soluble, by ruling out one of the most common ambiguous cases. This improved generation speed by over a factor of 10 on the highest preset!)

An approach which can be more scalable involves generating a grid and then tweaking it to make it soluble. This is the technique used by Mines and also by Net: first a random puzzle is generated, and then the solver is run to see how far it gets. Sometimes the solver will get stuck; when that happens, examine the area it's having trouble with, and make a small random change in that area to allow it to make more progress. Continue solving (possibly even without restarting the solver), tweaking as necessary, until the solver finishes. Then restart the solver from the beginning to ensure that the tweaks haven't caused new problems in the process of solving old ones (which can sometimes happen).

This strategy works well in situations where the usual solver failure mode is to get stuck in an easily localised spot. Thus it works well for Net and Mines, whose most common failure mode tends to be that most of the grid is fine but there are a few widely separated ambiguous sections; but it would work less well for Dominosa, in which the way you get stuck is to have scoured the whole grid and not found anything you can deduce anywhere. Also, it relies on there being a low probability that tweaking the grid introduces a new problem at the same time as solving the old one; Mines and Net also have the property that most of their deductions are local, so that it's very unlikely for a tweak to affect something half way across the grid from the location where it was applied. In Dominosa, by contrast, a lot of deductions use information about half the grid (‘out of all the sixes, only one is next to a three’, which can depend on the values of up to 32 of the 56 squares in the default setting!), so this tweaking strategy would be rather less likely to work well.

A more specialised strategy is that used in Solo and Slant. These puzzles have the property that they derive their difficulty from not presenting all the available clues. (In Solo's case, if all the possible clues were provided then the puzzle would already be solved; in Slant it would still require user action to fill in the lines, but it would present no challenge at all). Therefore, a simple generation technique is to leave the decision of which clues to provide until the last minute. In other words, first generate a random filled grid with all possible clues present, and then gradually remove clues for as long as the solver reports that it's still soluble. Unlike the methods described above, this technique cannot fail – once you've got a filled grid, nothing can stop you from being able to convert it into a viable puzzle. However, it wouldn't even be meaningful to apply this technique to (say) Pattern, in which clues can never be left out, so the only way to affect the set of clues is by altering the solution.

(Unfortunately, Solo is complicated by the need to provide puzzles at varying difficulty levels. It's easy enough to generate a puzzle of at most a given level of difficulty; you just have a solver with configurable intelligence, and you set it to a given level and apply the above technique, thus guaranteeing that the resulting grid is solvable by someone with at most that much intelligence. However, generating a puzzle of at least a given level of difficulty is rather harder; if you go for at most Intermediate level, you're likely to find that you've accidentally generated a Trivial grid a lot of the time, because removing just one number is sufficient to take the puzzle from Trivial straight to Ambiguous. In that situation Solo has no remaining options but to throw the puzzle away and start again.)

A final strategy is to use the solver during puzzle construction: lay out a bit of the grid, run the solver to see what it allows you to deduce, and then lay out a bit more to allow the solver to make more progress. There are articles on the web that recommend constructing Sudoku puzzles by this method (which is completely the opposite way round to how Solo does it); for Sudoku it has the advantage that you get to specify your clue squares in advance (so you can have them make pretty patterns).

Rectangles uses a strategy along these lines. First it generates a grid by placing the actual rectangles; then it has to decide where in each rectangle to place a number. It uses a solver to help it place the numbers in such a way as to ensure a unique solution. It does this by means of running a test solver, but it runs the solver before it's placed any of the numbers – which means the solver must be capable of coping with uncertainty about exactly where the numbers are! It runs the solver as far as it can until it gets stuck; then it narrows down the possible positions of a number in order to allow the solver to make more progress, and so on. Most of the time this process terminates with the grid fully solved, at which point any remaining number-placement decisions can be made at random from the options not so far ruled out. Note that unlike the Net/Mines tweaking strategy described above, this algorithm does not require a checking run after it completes: if it finishes successfully at all, then it has definitely produced a uniquely soluble puzzle.

Most of the strategies described above are not 100% reliable. Each one has a failure rate: every so often it has to throw out the whole grid and generate a fresh one from scratch. (Solo's strategy would be the exception, if it weren't for the need to provide configurable difficulty levels.) Occasional failures are not a fundamental problem in this sort of work, however: it's just a question of dividing the grid generation time by the success rate (if it takes 10ms to generate a candidate grid and 1/5 of them work, then it will take 50ms on average to generate a viable one), and seeing whether the expected time taken to successfully generate a puzzle is unacceptably slow. Dominosa's generator has a very low success rate (about 1 out of 20 candidate grids turn out to be usable, and if you think that's bad then go and look at the source code and find the comment showing what the figures were before the generation-time tweaks!), but the generator itself is very fast so this doesn't matter. Rectangles has a slower generator, but fails well under 50% of the time.

So don't be discouraged if you have an algorithm that doesn't always work: if it nearly always works, that's probably good enough. The one place where reliability is important is that your algorithm must never produce false positives: it must not claim a puzzle is soluble when it isn't. It can produce false negatives (failing to notice that a puzzle is soluble), and it can fail to generate a puzzle at all, provided it doesn't do either so often as to become slow.

One last piece of advice: for grid-based puzzles, when writing and testing your generation algorithm, it's almost always a good idea not to test it initially on a grid that's square (i.e. w==h), because if the grid is square then you won't notice if you mistakenly write h instead of w (or vice versa) somewhere in the code. Use a rectangular grid for testing, and any size of grid will be likely to work after that.

6.2.2 Designing textual description formats

Another aspect of writing a puzzle which is worth putting some thought into is the design of the various text description formats: the format of the game parameter encoding, the game description encoding, and the move encoding.

The first two of these should be reasonably intuitive for a user to type in; so provide some flexibility where possible. Suppose, for example, your parameter format consists of two numbers separated by an x to specify the grid dimensions (10x10 or 20x15), and then has some suffixes to specify other aspects of the game type. It's almost always a good idea in this situation to arrange that decode_params() can handle the suffixes appearing in any order, even if encode_params() only ever generates them in one order.

These formats will also be expected to be reasonably stable: users will expect to be able to exchange game IDs with other users who aren't running exactly the same version of your game. So make them robust and stable: don't build too many assumptions into the game ID format which will have to be changed every time something subtle changes in the puzzle code.

6.3 Common how-to questions

This section lists some common things people want to do when writing a puzzle, and describes how to achieve them within the Puzzles framework.

6.3.1 Drawing objects at only one position

A common phenomenon is to have an object described in the game_state or the game_ui which can only be at one position. A cursor – probably specified in the game_ui – is a good example.

In the game_ui, it would obviously be silly to have an array covering the whole game grid with a boolean flag stating whether the cursor was at each position. Doing that would waste space, would make it difficult to find the cursor in order to do anything with it, and would introduce the potential for synchronisation bugs in which you ended up with two cursors or none. The obviously sensible way to store a cursor in the game_ui is to have fields directly encoding the cursor's coordinates.

However, it is a mistake to assume that the same logic applies to the game_drawstate. If you replicate the cursor position fields in the draw state, the redraw code will get very complicated. In the draw state, in fact, it is probably the right thing to have a cursor flag for every position in the grid. You probably have an array for the whole grid in the drawstate already (stating what is currently displayed in the window at each position); the sensible approach is to add a ‘cursor’ flag to each element of that array. Then the main redraw loop will look something like this (pseudo-code):

for (y = 0; y < h; y++) {
    for (x = 0; x < w; x++) {
        int value = state->symbol_at_position[y][x];
        if (x == ui->cursor_x && y == ui->cursor_y)
            value |= CURSOR;
        if (ds->symbol_at_position[y][x] != value) {
            symbol_drawing_subroutine(dr, ds, x, y, value);
            ds->symbol_at_position[y][x] = value;
        }
    }
}

This loop is very simple, pretty hard to get wrong, and automatically deals both with erasing the previous cursor and drawing the new one, with no special case code required.

This type of loop is generally a sensible way to write a redraw function, in fact. The best thing is to ensure that the information stored in the draw state for each position tells you everything about what was drawn there. A good way to ensure that is to pass precisely the same information, and only that information, to a subroutine that does the actual drawing; then you know there's no additional information which affects the drawing but which you don't notice changes in.

6.3.2 Implementing a keyboard-controlled cursor

It is often useful to provide a keyboard control method in a basically mouse-controlled game. A keyboard-controlled cursor is best implemented by storing its location in the game_ui (since if it were in the game_state then the user would have to separately undo every cursor move operation). So the procedure would be:

6.3.3 Implementing draggable sprites

Some games have a user interface which involves dragging some sort of game element around using the mouse. If you need to show a graphic moving smoothly over the top of other graphics, use a blitter (see section 3.1.13 for the blitter API) to save the background underneath it. The typical scenario goes:

This way, you will be able to write the rest of the redraw function completely ignoring the dragged object, as if it were floating above your bitmap and being completely separate.

6.3.4 Sharing large invariant data between all game states

In some puzzles, there is a large amount of data which never changes between game states. The array of numbers in Dominosa is a good example.

You could dynamically allocate a copy of that array in every game_state, and have dup_game() make a fresh copy of it for every new game_state; but it would waste memory and time. A more efficient way is to use a reference-counted structure.

This way, the invariant data will persist for only as long as it's genuinely needed; as soon as the last game state for a particular puzzle instance is freed, the invariant data for that puzzle will vanish as well. Reference counting is a very efficient form of garbage collection, when it works at all. (Which it does in this instance, of course, because there's no possibility of circular references.)

6.3.5 Implementing multiple types of flash

In some games you need to flash in more than one different way. Mines, for example, flashes white when you win, and flashes red when you tread on a mine and die.

The simple way to do this is:

redraw() will never be called with flash_time non-zero unless flash_length() was first called to tell the mid-end that a flash was required; so whenever redraw() notices that flash_time is non-zero, you can be sure that the field in game_ui is correctly set.

6.3.6 Animating game moves

A number of puzzle types benefit from a quick animation of each move you make.

For some games, such as Fifteen, this is particularly easy. Whenever redraw() is called with oldstate non-NULL, Fifteen simply compares the position of each tile in the two game states, and if the tile is not in the same place then it draws it some fraction of the way from its old position to its new position. This method copes automatically with undo.

Other games are less obvious. In Sixteen, for example, you can't just draw each tile a fraction of the way from its old to its new position: if you did that, the end tile would zip very rapidly past all the others to get to the other end and that would look silly. (Worse, it would look inconsistent if the end tile was drawn on top going one way and on the bottom going the other way.)

A useful trick here is to define a field or two in the game state that indicates what the last move was.

Note also that Sixteen needs to store the direction of the move, because you can't quite determine it by examining the row or column in question. You can in almost all cases, but when the row is precisely two squares long it doesn't work since a move in either direction looks the same. (You could argue that since moving a 2-element row left and right has the same effect, it doesn't matter which one you animate; but in fact it's very disorienting to click the arrow left and find the row moving right, and almost as bad to undo a move to the right and find the game animating another move to the right.)

6.3.7 Animating drag operations

In Untangle, moves are made by dragging a node from an old position to a new position. Therefore, at the time when the move is initially made, it should not be animated, because the node has already been dragged to the right place and doesn't need moving there. However, it's nice to animate the same move if it's later undone or redone. This requires a bit of fiddling.

The obvious approach is to have a flag in the game_ui which inhibits move animation, and to set that flag in interpret_move(). The question is, when would the flag be reset again? The obvious place to do so is changed_state(), which will be called once per move. But it will be called before anim_length(), so if it resets the flag then anim_length() will never see the flag set at all.

The solution is to have two flags in a queue.

6.3.8 Inhibiting the victory flash when Solve is used

Many games flash when you complete them, as a visual congratulation for having got to the end of the puzzle. It often seems like a good idea to disable that flash when the puzzle is brought to a solved state by means of the Solve operation.

This is easily done:

6.4 Things to test once your puzzle is written

Puzzle implementations written in this framework are self-testing as far as I could make them.

Textual game and move descriptions, for example, are generated and parsed as part of the normal process of play. Therefore, if you can make moves in the game at all you can be reasonably confident that the mid-end serialisation interface will function correctly and you will be able to save your game. (By contrast, if I'd stuck with a single make_move() function performing the jobs of both interpret_move() and execute_move(), and had separate functions to encode and decode a game state in string form, then those functions would not be used during normal play; so they could have been completely broken, and you'd never know it until you tried to save the game – which would have meant you'd have to test game saving extensively and make sure to test every possible type of game state. As an added bonus, doing it the way I did leads to smaller save files.)

There is one exception to this, which is the string encoding of the game_ui. Most games do not store anything permanent in the game_ui, and hence do not need to put anything in its encode and decode functions; but if there is anything in there, you do need to test game loading and saving to ensure those functions work properly.

It's also worth testing undo and redo of all operations, to ensure that the redraw and the animations (if any) work properly. Failing to animate undo properly seems to be a common error.

Other than that, just use your common sense.