15
\$\begingroup\$

I've written a Sudoku puzzle generator. It currently runs through each line of the 9x9 grid and places numbers randomly if they're valid. It loops over all the numbers from 1-9 and then if it finds itself trapped in a corner with no valid answers it throws out the whole board and restarts.

I tested it by making 1000 puzzles and it took just under 100 seconds, so a single puzzle takes 0.1. In practical terms that's almost negligible but still seems wasteful to ditch the processing up to that point (as it takes, on average, hundreds of attempts to find a valid puzzle). I'm maybe just being impractical to want a more intelligent solution so I thought I'd ask how it looks to people on here and if anyone has suggestions on improving it.

import random
numbers = [1,2,3,4,5,6,7,8,9]

def makeBoard():
    board = None
    while board is None:
        board = attemptBoard()
    return board

def attemptBoard():
    board = [[None for _ in range(9)] for _ in range(9)]
    for i in range(9):
        for j in range(9):
            checking = numbers[:]
            random.shuffle(checking)
            x = -1
            loopStart = 0
            while board[i][j] is None:
                x += 1
                if x == 9:
                    #No number is valid in this cell, start over
                    return None
                checkMe = [checking[x],True]
                if checkMe in board[i]:
                    #If it's already in this row
                    continue
                checkis = False
                for checkRow in board:
                    if checkRow[j] == checkMe:
                        #If it's already in this column
                        checkis = True
                if checkis: continue
                #Check if the number is elsewhere in this 3x3 grid based on where this is in the 3x3 grid
                if i % 3 == 1:
                    if   j % 3 == 0 and checkMe in (board[i-1][j+1],board[i-1][j+2]): continue
                    elif j % 3 == 1 and checkMe in (board[i-1][j-1],board[i-1][j+1]): continue
                    elif j % 3 == 2 and checkMe in (board[i-1][j-1],board[i-1][j-2]): continue
                elif i % 3 == 2:
                    if   j % 3 == 0 and checkMe in (board[i-1][j+1],board[i-1][j+2],board[i-2][j+1],board[i-2][j+2]): continue
                    elif j % 3 == 1 and checkMe in (board[i-1][j-1],board[i-1][j+1],board[i-2][j-1],board[i-2][j+1]): continue
                    elif j % 3 == 2 and checkMe in (board[i-1][j-1],board[i-1][j-2],board[i-2][j-1],board[i-2][j-2]): continue
                #If we've reached here, the number is valid.
                board[i][j] = checkMe
    return board


def printBoard(board):
    spacer = "++---+---+---++---+---+---++---+---+---++"
    print (spacer.replace('-','='))
    for i,line in enumerate(board):
        print ("|| {} | {} | {} || {} | {} | {} || {} | {} | {} ||".format(
                    line[0][0] if line[0][1] else ' ',
                    line[1][0] if line[1][1] else ' ',
                    line[2][0] if line[2][1] else ' ',
                    line[3][0] if line[3][1] else ' ',
                    line[4][0] if line[4][1] else ' ',
                    line[5][0] if line[5][1] else ' ',
                    line[6][0] if line[6][1] else ' ',
                    line[7][0] if line[7][1] else ' ',
                    line[8][0] if line[8][1] else ' ',))
        if (i+1) % 3 == 0: print(spacer.replace('-','='))
        else: print(spacer)
\$\endgroup\$
1
  • \$\begingroup\$ I like that your source code to generate Sudoku puzzle is concise to check the validity of a Sudoku puzzle! Would you like to go further to check whether your generated Sudoku has one and only one solution? \$\endgroup\$ Commented Mar 4, 2017 at 19:40

1 Answer 1

16
\$\begingroup\$

1. Review

  1. There are no docstrings. What do these functions do?

  2. The Python style guide says, "limit all lines to a maximum of 79 characters." If the code followed this recommendation, then we wouldn't have to scroll it horizontally to read it here.

  3. The board is not represented consistently. Looking at printBoard, it seems that each cell is represented by a list [a, b] where b is False if the cell is empty, and True if it contains the number a. But the initialization of the board in attemptBoard looks like this:

    board = [[None for _ in range(9)] for _ in range(9)]
    

    which represents empty cells as None, so that if I try to print this board, I get:

    TypeError: 'NoneType' object is not subscriptable
    

    I would recommend using a consistent board representation. In this case I think it makes more sense to use None for an empty cell and a number for a full cell (rather than a list). That's because (i) None and small numbers don't need any memory allocation, whereas a list needs to be allocated; (ii) testing a None or a number is quicker than testing a list.

  4. In printBoard you have very repetitive code:

    print ("|| {} | {} | {} || {} | {} | {} || {} | {} | {} ||".format(
                line[0][0] if line[0][1] else ' ',
                line[1][0] if line[1][1] else ' ',
                line[2][0] if line[2][1] else ' ',
                line[3][0] if line[3][1] else ' ',
                line[4][0] if line[4][1] else ' ',
                line[5][0] if line[5][1] else ' ',
                line[6][0] if line[6][1] else ' ',
                line[7][0] if line[7][1] else ' ',
                line[8][0] if line[8][1] else ' ',))
    

    This can be rewritten using a loop:

    print("|| {} | {} | {} || {} | {} | {} || {} | {} | {} ||"
          .format(*(number if full else ' ' for number, full in line)))
    

    or, after simplifying the board representation as recommended above:

    print("|| {} | {} | {} || {} | {} | {} || {} | {} | {} ||"
          .format(*(cell or ' ' for cell in line)))
    
  5. The nested loops:

    for i in range(9):
        for j in range(9):
    

    can be combined into one using itertools.product:

    for i, j in itertools.product(range(9), repeat=2):
    
  6. The variable loopStart is never used.

  7. Instead of this complex while loop:

    x = -1
    while board[i][j] is None:
        x += 1
        if x == 9:
            #No number is valid in this cell, start over
            return None
        checkMe = [checking[x],True]
    
        # ... loop body here ...
    
        #If we've reached here, the number is valid.
        board[i][j] = checkMe
    

    write a for loop with an else:

    for x in checking:
        # ... loop body here ...
    
        # If we've reached here, the number is valid.
        board[i][j] = x
        break
    else:
        # No number is valid in this cell, start over.
        return None
    
  8. The column check:

    checkis = False
    for checkRow in board:
        if checkRow[j] == checkMe:
            #If it's already in this column
            checkis = True
    if checkis: continue
    

    can be simplified using the built-in function any:

    if any(row[j] == checkMe for row in board): continue
    
  9. The code for checking against other cells in the 3×3 block is very repetitive:

    if i % 3 == 1:
        if   j % 3 == 0 and checkMe in (board[i-1][j+1],board[i-1][j+2]): continue
        elif j % 3 == 1 and checkMe in (board[i-1][j-1],board[i-1][j+1]): continue
        elif j % 3 == 2 and checkMe in (board[i-1][j-1],board[i-1][j-2]): continue
    elif i % 3 == 2:
        if   j % 3 == 0 and checkMe in (board[i-1][j+1],board[i-1][j+2],board[i-2][j+1],board[i-2][j+2]): continue
        elif j % 3 == 1 and checkMe in (board[i-1][j-1],board[i-1][j+1],board[i-2][j-1],board[i-2][j+1]): continue
        elif j % 3 == 2 and checkMe in (board[i-1][j-1],board[i-1][j-2],board[i-2][j-1],board[i-2][j-2]): continue
    

    The reason you go to this trouble is to avoid testing against board[i-1][j] and board[i-2][j], which you know would be useless, because you already tested these cells when you checked the column. But in fact that's a false economy. You avoid an unnecessary test, but at the cost of a lot of extra code. It turns out to be just as fast, but a lot simpler, to test all the entries in previous rows of the block, like this:

    i0, j0 = i - i % 3, j - j % 3 # origin of 3x3 block
    if any(x in row[j0:j0+3] for row in board[i0:i]):
        continue
    
  10. The code only works for 9×9 Sudoku grids made up of 3×3 blocks. But there's nothing special about the numbers 3 and 9 here: the algorithm would be essentially the same for 2 and 4, or 4 and 16. So why not make the code general?

2. Revised code

This isn't any faster than the original code, but it's a lot shorter and simpler, which makes it a better place to start when speeding it up:

import itertools
import random

def attempt_board(m=3):
    """Make one attempt to generate a filled m**2 x m**2 Sudoku board,
    returning the board if successful, or None if not.

    """
    n = m**2
    numbers = list(range(1, n + 1))
    board = [[None for _ in range(n)] for _ in range(n)]
    for i, j in itertools.product(range(n), repeat=2):
        i0, j0 = i - i % m, j - j % m # origin of mxm block
        random.shuffle(numbers)
        for x in numbers:
            if (x not in board[i]                     # row
                and all(row[j] != x for row in board) # column
                and all(x not in row[j0:j0+m]         # block
                        for row in board[i0:i])):
                board[i][j] = x
                break
        else:
            # No number is valid in this cell.
            return None
    return board

3. Backtracking

If attempt_board finds that there are no valid numbers for some cell, then it throws away all its work and starts all over again from the beginning. But all that work is not necessarily invalid: most likely the mistake was made only in the last few steps, and so if the algorithm were to go back a little bit and try some different choices, then it would find a solution. This approach is known as backtracking.

Backtracking is easily implemented by using recursion:

def make_board(m=3):
    """Return a random filled m**2 x m**2 Sudoku board."""
    n = m**2
    board = [[None for _ in range(n)] for _ in range(n)]

    def search(c=0):
        "Recursively search for a solution starting at position c."
        i, j = divmod(c, n)
        i0, j0 = i - i % m, j - j % m # Origin of mxm block
        numbers = list(range(1, n + 1))
        random.shuffle(numbers)
        for x in numbers:
            if (x not in board[i]                     # row
                and all(row[j] != x for row in board) # column
                and all(x not in row[j0:j0+m]         # block
                        for row in board[i0:i])): 
                board[i][j] = x
                if c + 1 >= n**2 or search(c + 1):
                    return board
        else:
            # No number is valid in this cell: backtrack and try again.
            board[i][j] = None
            return None

    return search()

I find that this is about 60 times faster than the original code.

4. Algorithm X

There's a reformulation of Sudoku in terms of the "exact cover" problem, and this can be solved using Donald Knuth's "Algorithm X". See this blog post of mine for a detailed explaination of the how this algorithm can be used to solve Sudoku, and see this post on Code Review for an implementation in Python.

\$\endgroup\$
1
  • \$\begingroup\$ This was a very thorough answer, thank you! Just to clarify about the cells being [int, boolean], the boolean is actually for the printing. I used it to determine if the number should be visible to the user, but in retrospect it'd probably be better to just have 2 lists, one being the completed full of numbers that I can test against and the other in progress with int/None as you suggested. \$\endgroup\$ Commented May 6, 2015 at 9:04

Not the answer you're looking for? Browse other questions tagged or ask your own question.