36
\$\begingroup\$

Winner found

It seems as if we have a winner! Unless anyone plans on contesting the world's current fastest Sudoku solver, user 53x15 wins with the staggeringly fast solver Tdoku. For anyone still working on their solvers, I'll still benchmark new submissions when I have time.

The challenge

The goal of a game of Sudoku is to fill the board with the numbers 1-9, one in each cell, in such a way that each row, column and box only contains each number once. A very important aspect of a Sudoku puzzle is that there should only be one valid solution.

The goal of this challenge is simple, you should solve a set of Sudoku puzzles as fast as possible. However, you won't just be solving any old Sudoku, you'll be solving the very hardest Sudoku puzzles in existence, the 17-clue Sudokus. Here's an example:

Hard Sudoku

Rules

Language

You're free to use any language. If I don't have a compiler installed for your language, you should be able to provide a set of command line instructions needed to install an environment where your script can be run on Linux.

Benchmark machine

The benchmark will be run on a Dell XPS 9560, 2.8GHz Intel Core i7-7700HQ (3.8GHz boost) 4 cores, 8 threads, 16GB RAM. GTX 1050 4GB. The machine runs Ubuntu 19.04. Here's the uname output, for anyone interested.

Linux 5.0.0-25-generic #26-Ubuntu SMP Thu Aug 1 12:04:58 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Input

The input will be given as a file. It can be found here. The file contains 49151 Sudoku puzzles. The first line of the file is the number of puzzles, and every line after that is 81 characters long and represents a puzzle. The unknown cells are 0, and the known cells are 1-9.

Your program should be able to take the filename as an argument, or have the file input from STDIN, to facilitate manual checking of your solution. Please include an instruction for how your program takes input.

Timing / scoring

From discussions in the comments, and some reflection, the scoring criteria has been changed to be the time of your entire program. Your program should produce the output file with the correct hash even during official scoring. This doesn't interfere with any existing solution, and doesn't change the rankings as they stand now. Any thoughts on the scoring system are appreciated.

If two solutions have similar scores for individual runs, I will run multiple benchmarks, and the average time will be the final score. If the average scores differ by less than 2%, I will consider it a draw.

If your solution takes longer than an hour to run, it will not be officially scored. In those cases, you are responsible for reporting the machine on which it ran, and your score. For an optimized solver, this should not be an issue.

EDIT: It was brought to my attention that while difficult, the problem set at hand is not the most difficult there is. If time is available, I'll try to benchmark the solutions presented here against the harder puzzle set, and add the score to each submission. However, this will not be an official scoring, and is just for fun.

Verification

Your solution will be verified by a MD5/SHA256 checksum. Your script should be able to generate a file containing all puzzles and their solutions. However, the file will also be manually inspected, so don't try to get a hash collision. Your output file should match:

MD5: 41704fd7d8fd0723a45ffbb2dbbfa488
SHA256: 0bc8dda364db7b99f389b42383e37b411d9fa022204d124cb3c8959eba252f05

The file will be on the format:

<num_puzzles>
<unsolved_puzzle#1>,<solved_puzzle#1>
<unsolved_puzzle#2>,<solved_puzzle#2>
...
<unsolved_puzzle#n>,<solved_puzzle#n>

with a single trailing newline.

What's not allowed

You are in no way allowed to hard-code solutions. Your algorithm should be applicable on any set of Sudoku puzzles, both easy and harder Sudokus. However, it is entirely fine if your solution is slow for easier puzzles.

You are not allowed to have a non-deterministic program. You are allowed to use a random number generator, but the seed of the generator should be fixed. This rule is to ensure that measurements are more precise, and have less variance. (Thanks to Peter Taylor for the tip)

You are not allowed to use any external resources or web requests during the runtime of your program. Everything should be self-contained. This does not apply to installed libraries and packages, which are allowed.

Other info

If you want another test set to check your solution, here are 10000 easier Sudokus. Here are their solutions.

MD5: 3cb465ef6077c4fcab5bd6ae3bc50d62
SHA256: 0bc8dda364db7b99f389b42383e37b411d9fa022204d124cb3c8959eba252f05

If you have any questions, feel free to ask, and I'll try to clarify any misunderstandings.

\$\endgroup\$
10
  • \$\begingroup\$ I have an APL+WIN solver but unless you have a copy of the interpreter on your machine you will have to count me out. For info your hard example took 30ms and the first easy example 16ms. \$\endgroup\$
    – Graham
    Commented Aug 23, 2019 at 17:24
  • \$\begingroup\$ @Graham it took 30ms for all 49151 sudokus, or 30ms on average? \$\endgroup\$
    – maxb
    Commented Aug 23, 2019 at 17:55
  • 1
    \$\begingroup\$ Should the entries also be code golfed? Or... Can they be golfed, for the fun it? ;-) \$\endgroup\$
    – The Matt
    Commented Aug 24, 2019 at 3:28
  • 2
    \$\begingroup\$ @TheMatt I'd prefer non-golfed, just so I can verify that nothing fishy is going on \$\endgroup\$
    – maxb
    Commented Aug 24, 2019 at 6:26
  • 1
    \$\begingroup\$ In case you need the world's hardest sudoku: abcnews.go.com/blogs/headlines/2012/06/… \$\endgroup\$ Commented Mar 15, 2023 at 1:01

12 Answers 12

15
\$\begingroup\$

C++ - 0.201s official score

Using Tdoku (code; design; benchmarks) gives these results:

~/tdoku$ lscpu | grep Model.name
Model name:            Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz

~/tdoku$ # build:
~/tdoku$ CC=clang-8 CXX=clang++-8 ./BUILD.sh
~/tdoku$ clang -o solve example/solve.c build/libtdoku.a 

~/tdoku$ # adjust input format:
~/tdoku$ sed -e "s/0/./g" all_17_clue_sudokus.txt > all_17_clue_sudokus.txt.in

~/tdoku$ # solve:
~/tdoku$ time ./solve 1 < all_17_clue_sudokus.txt.in > out.txt
real    0m0.241s
user    0m0.229s
sys     0m0.012s

~/tdoku$ # adjust output format and sha256sum:
~/tdoku$ grep -v "^:0:$" out.txt | sed -e "s/:1:/,/" | tr . 0 | sha256sum
0bc8dda364db7b99f389b42383e37b411d9fa022204d124cb3c8959eba252f05  -

Tdoku has been optimized for hard Sudoku instances. But note, contrary to the problem statement, that 17 clue puzzles are far from the hardest Sudoku. Actually they're among the easiest, with the majority requiring no backtracking at all. See some of the other benchmark datasets in the Tdoku project for puzzles that are actually hard.

Also note that while Tdoku is the fastest solver I'm aware of for hard puzzles, it's not the fastest for 17 clue puzzles. For these I think the fastest is this rust project, a derivative of JCZSolve, which was optimized for 17 clue puzzles during development. Depending on the platform it might be 5-25% faster than Tdoku for these puzzles.

\$\endgroup\$
4
  • \$\begingroup\$ Wow, that was an interesting read about the implementation and theory behind it. Before I started this challenge, I wanted to find state of the art solvers and datasets. I guess I didn't look hard enough. From popular "scientific" articles, the 17 clue puzzles were all that's ever talked about, so it was my assumption that those were the hardest. I'll try to run all submissions against the data sets presented in your article, and I'll benchmark your submission later today. Fantastic job! \$\endgroup\$
    – maxb
    Commented Aug 30, 2019 at 4:36
  • \$\begingroup\$ Thanks! You see from the article that finding state-of-the-art solving took me on a long journey. :-) I get why people focus on 17 clue puzzles: the dataset is well known, well defined, complete or nearly so, moderately large, and hard for naive solvers. While it's interesting to study harder puzzles, hardness is tricky to formalize. e.g., do we mean subjectively or empirically hard for humans based on the techniques required? do we mean slow on average for a given solver under random permutations? Do we mean minimum backdoor size under a formula with chosen pigeonhole inferences? etc. \$\endgroup\$
    – 53x15
    Commented Aug 30, 2019 at 21:14
  • \$\begingroup\$ 0.201s - is that the time taken to solve all ~49k puzzles, roughly 4usec per puzzle? \$\endgroup\$ Commented Aug 27, 2020 at 22:10
  • \$\begingroup\$ Yes, that's right. \$\endgroup\$
    – 53x15
    Commented Aug 28, 2020 at 15:57
9
\$\begingroup\$

C - 0.045s unofficial score

I got this time on my i7-9750H with 6 cores 12 threads @ 4Ghz. I'm aware that my cpu is faster than the i7-7700HQ so I think (hope) it would be closer to 0.080s if it were officially scored.

Compile with: gcc file.c -O3 -march=native -fopenmp

The program takes the input file name as argument. It generates an output.txt file containing all sudokus with their solution. It can be significantly slower when the output file already exists.

This won't run on CPUs that don't support the SSE4.1 and AVX2 instructions set extensions.

#include <stdio.h>
#include <omp.h>
#include <stdbool.h>
#include <intrin.h>

// lookup tables that may or may not speed things up by avoiding division
static const unsigned char box_index[81] = {
    0, 0, 0, 1, 1, 1, 2, 2, 2,
    0, 0, 0, 1, 1, 1, 2, 2, 2,
    0, 0, 0, 1, 1, 1, 2, 2, 2,
    3, 3, 3, 4, 4, 4, 5, 5, 5,
    3, 3, 3, 4, 4, 4, 5, 5, 5,
    3, 3, 3, 4, 4, 4, 5, 5, 5,
    6, 6, 6, 7, 7, 7, 8, 8, 8,
    6, 6, 6, 7, 7, 7, 8, 8, 8,
    6, 6, 6, 7, 7, 7, 8, 8, 8
};

static const unsigned char column_index[81] = {
    0, 1, 2, 3, 4, 5, 6, 7, 8,
    0, 1, 2, 3, 4, 5, 6, 7, 8,
    0, 1, 2, 3, 4, 5, 6, 7, 8,
    0, 1, 2, 3, 4, 5, 6, 7, 8,
    0, 1, 2, 3, 4, 5, 6, 7, 8,
    0, 1, 2, 3, 4, 5, 6, 7, 8,
    0, 1, 2, 3, 4, 5, 6, 7, 8,
    0, 1, 2, 3, 4, 5, 6, 7, 8,
    0, 1, 2, 3, 4, 5, 6, 7, 8,
};

static const unsigned char row_index[81] = {
    0, 0, 0, 0, 0, 0, 0, 0, 0,
    1, 1, 1, 1, 1, 1, 1, 1, 1,
    2, 2, 2, 2, 2, 2, 2, 2, 2,
    3, 3, 3, 3, 3, 3, 3, 3, 3,
    4, 4, 4, 4, 4, 4, 4, 4, 4,
    5, 5, 5, 5, 5, 5, 5, 5, 5,
    6, 6, 6, 6, 6, 6, 6, 6, 6,
    7, 7, 7, 7, 7, 7, 7, 7, 7,
    8, 8, 8, 8, 8, 8, 8, 8, 8,
};

static const unsigned char box_start[81] = {
    0, 0, 0, 3, 3, 3, 6, 6, 6,
    0, 0, 0, 3, 3, 3, 6, 6, 6,
    0, 0, 0, 3, 3, 3, 6, 6, 6,
    27, 27, 27, 30, 30, 30, 33, 33, 33,
    27, 27, 27, 30, 30, 30, 33, 33, 33,
    27, 27, 27, 30, 30, 30, 33, 33, 33,
    54, 54, 54, 57, 57, 57, 60, 60, 60,
    54, 54, 54, 57, 57, 57, 60, 60, 60,
    54, 54, 54, 57, 57, 57, 60, 60, 60
};

static void add_column_indices(unsigned long long indices[2], unsigned char i) {
    indices[0] |= 0x8040201008040201ULL << column_index[i];
    indices[1] |= 0x8040201008040201ULL >> (10-column_index[i]);
}

static void add_row_indices(unsigned long long indices[2], unsigned char i) {
    switch (row_index[i]) {
        case 7:
            indices[0] |= 0x8000000000000000ULL;
            indices[1] |= 0xffULL;
            break;
        case 8:
            indices[1] |= 0x01ff00ULL;
            break;
        default:
            indices[0] |= 0x01ffULL << 9*row_index[i];
    }
}

static void add_box_indices(unsigned long long indices[2], unsigned char i) {
    indices[0] |= 0x1c0e07ULL << box_start[i];
    indices[1] |= 0x0381c0e0ULL >> (60-box_start[i]);
}

struct GridState {
    struct GridState* prev;                // last grid state before a guess was made, used for backtracking
    unsigned long long unlocked[2];        // for keeping track of which cells don't need to be looked at anymore. Set bits correspond to cells that still have multiple possibilities
    unsigned long long updated[2];        // for keeping track of which cell's candidates may have been changed since last time we looked for naked sets. Set bits correspond to changed candidates in these cells
    unsigned short candidates[81];        // which digits can go in this cell? Set bits correspond to possible digits
};

static void enter_digit(struct GridState* grid_state, signed char digit, unsigned char i) {
    // lock this cell and and remove this digit from the candidates in this row, column and box
    
    unsigned short* candidates = grid_state->candidates;
    unsigned long long* unlocked = grid_state->unlocked;
    unsigned long long to_update[2] = {0};
    
    if (i < 64) {
        unlocked[0] &= ~(1ULL << i);
    } else {
        unlocked[1] &= ~(1ULL << (i-64));
    }
    
    candidates[i] = 1 << digit;
    
    add_box_indices(to_update, i);
    add_column_indices(to_update, i);
    add_row_indices(to_update, i);
    
    to_update[0] &= unlocked[0];
    to_update[1] &= unlocked[1];
    
    grid_state->updated[0] |= to_update[0];
    grid_state->updated[1] |= to_update[1];
    
    const __m256i bit_mask = _mm256_setr_epi16(1<<0, 1<<1, 1<<2, 1<<3, 1<<4, 1<<5, 1<<6, 1<<7, 1<<8, 1<<9, 1<<10, 1<<11, 1<<12, 1<<13, 1<<14, 1<<15);
    const __m256i mask = _mm256_set1_epi16(~candidates[i]);
    for (unsigned char j = 0; j < 80; j += 16) {
        unsigned short m;
        if (j < 64) {
            m = (unsigned short) ((to_update[0] >> j) & 0xffff);
        } else {
            m = (unsigned short) (to_update[1] & 0xffff);
        }    
        __m256i c = _mm256_loadu_si256((__m256i*) &candidates[j]);
        __m256i u = _mm256_cmpeq_epi16(_mm256_and_si256(bit_mask, _mm256_set1_epi16(m)), _mm256_setzero_si256());
        c = _mm256_and_si256(c, _mm256_or_si256(mask, u));
        _mm256_storeu_si256((__m256i*) &candidates[j], c);
    }
    if ((to_update[1] & (1ULL << (80-64))) != 0) {
        candidates[80] &= ~candidates[i];
    }
}

static long long guesses = 0;

static struct GridState* make_guess(struct GridState* grid_state) {
    // Make a guess for the cell with the least candidates. The guess will be the lowest
    // possible digit for that cell. If multiple cells have the same number of candidates, the 
    // cell with lowest index will be chosen. Also save the current grid state for tracking back
    // in case the guess is wrong. No cell has less than two candidates.
    
    unsigned long long* unlocked = grid_state->unlocked;
    unsigned short* candidates = grid_state->candidates;
    
    // Find the cell with fewest possible candidates
    unsigned long long to_visit;
    unsigned long guess_index = 0;
    unsigned long i_rel;
    unsigned short cnt;
    unsigned short best_cnt = 16;
    
    to_visit = unlocked[0];
    while (_BitScanForward64(&i_rel, to_visit) != 0) {
        to_visit &= ~(1ULL << i_rel);
        cnt = __popcnt16(candidates[i_rel]);
        if (cnt < best_cnt) {
            best_cnt = cnt;
            guess_index = i_rel;
        }
    }
    to_visit = unlocked[1];
    while (_BitScanForward64(&i_rel, to_visit) != 0) {
        to_visit &= ~(1ULL << i_rel);
        cnt = __popcnt16(candidates[i_rel + 64]);
        if (cnt < best_cnt) {
            best_cnt = cnt;
            guess_index = i_rel + 64;
        }
    }
    
    // Find the first candidate in this cell
    unsigned long digit;
    _BitScanReverse(&digit, candidates[guess_index]);
    
    // Create a copy of the state of the grid to make back tracking possible
    struct GridState* new_grid_state = (struct GridState*) malloc(sizeof(struct GridState));
    memcpy(new_grid_state, grid_state, sizeof(struct GridState));
    new_grid_state->prev = grid_state;
    
    // Remove the guessed candidate from the old grid because if we need to get back to the old grid
    // we know the guess was wrong
    grid_state->candidates[guess_index] &= ~(1 << digit);
    if (guess_index < 64) {
        grid_state->updated[0] |= 1ULL << guess_index;
    } else {
        grid_state->updated[1] |= 1ULL << (guess_index-64);
    }
    
    // Update candidates
    enter_digit(new_grid_state, (signed char) digit, (unsigned char) guess_index);
    
    guesses++;
    
    return new_grid_state;
}


static struct GridState* track_back(struct GridState* grid_state) {
    // Go back to the state when the last guess was made
    // This state had the guess removed as candidate from it's cell
    
    struct GridState* old_grid_state = grid_state->prev;
    free(grid_state);
    
    return old_grid_state;
}


static bool solve(signed char grid[81]) {

    struct GridState* grid_state = (struct GridState*) malloc(sizeof(struct GridState));
    grid_state->prev = 0;
    unsigned long long* unlocked = grid_state->unlocked;
    unsigned long long* updated = grid_state->updated;
    unsigned short* candidates = grid_state->candidates;
    unlocked[0] = 0xffffffffffffffffULL;
    unlocked[1] = 0x1ffffULL;
    updated[0] = unlocked[0];
    updated[1] = unlocked[1];
    
    {
        signed char digit;
        unsigned short columns[9] = {0};
        unsigned short rows[9] = {0};
        unsigned short boxes[9] = {0};
        
        for (unsigned char i = 0; i < 64; ++i) {
            digit = grid[i];
            if (digit >= 49) {
                columns[column_index[i]] |= 1 << (digit-49);
                rows[row_index[i]] |= 1 << (digit-49);
                boxes[box_index[i]] |= 1 << (digit-49);
                unlocked[0] &= ~(1ULL << i);
            }
        }
        for (unsigned char i = 64; i < 81; ++i) {
            digit = grid[i];
            if (digit >= 49) {
                columns[column_index[i]] |= 1 << (digit-49);
                rows[row_index[i]] |= 1 << (digit-49);
                boxes[box_index[i]] |= 1 << (digit-49);
                unlocked[1] &= ~(1ULL << (i-64));
            }
        }
        
        for (unsigned char i = 0; i < 81; ++i) {
            if (grid[i] < 49) {
                candidates[i] = 0x01ff ^ (rows[row_index[i]] | columns[column_index[i]] | boxes[box_index[i]]);
            } else {
                candidates[i] = 1 << (grid[i]-49);
            }
        }
    }
    
    start:
    
    unlocked = grid_state->unlocked;
    candidates = grid_state->candidates;
            
    // Find naked singles
    {    
        bool found;
        const __m256i ones = _mm256_set1_epi16(1);
        const __m256i bit_mask = _mm256_setr_epi16(1<<0, 1<<1, 1<<2, 1<<3, 1<<4, 1<<5, 1<<6, 1<<7, 1<<8, 1<<9, 1<<10, 1<<11, 1<<12, 1<<13, 1<<14, 1<<15);
        do {
            found = false;
            for (unsigned char i = 0; i < 80; i += 16) {
                __m256i c = _mm256_loadu_si256((__m256i*) &candidates[i]);
                // Check if any cell has zero candidates
                if (_mm256_movemask_epi8(_mm256_cmpeq_epi16(c, _mm256_setzero_si256()))) {
                    // Back track, no solutions along this path
                    grid_state = track_back(grid_state);
                    goto start;
                } else {
                    unsigned short m;
                    if (i < 64) {
                        m = (unsigned short) ((unlocked[0] >> i) & 0xffff);
                    } else {
                        m = (unsigned short) (unlocked[1] & 0xffff);
                    }                        
                    
                    __m256i a = _mm256_cmpeq_epi16(_mm256_and_si256(c, _mm256_sub_epi16(c, ones)), _mm256_setzero_si256());
                    __m256i u = _mm256_cmpeq_epi16(_mm256_and_si256(bit_mask, _mm256_set1_epi16(m)), bit_mask);
                    int mask = _mm256_movemask_epi8(_mm256_and_si256(a, u));
                    
                    if (mask) {
                        unsigned long index, digit;
                        _BitScanForward(&index, mask);
                        index = (index >> 1) + i;                            
                        _BitScanReverse(&digit, candidates[index]);
                        enter_digit(grid_state, (signed char) digit, index);
                        found = true;
                    }
                }
            }
            if (unlocked[1] & (1ULL << (80-64))) {
                if (candidates[80] == 0) {
                    // no solutions go back
                    grid_state = track_back(grid_state);
                    goto start;
                } else if (__popcnt16(candidates[80]) == 1) {
                    // Enter the digit and update candidates
                    unsigned long digit;
                    _BitScanReverse(&digit, candidates[80]);
                    enter_digit(grid_state, (signed char) digit, 80);
                    found = true;
                }
            }
        } while (found);
    }
    
    // Check if it's solved, if it ever gets solved it will be solved after looking for naked singles
    if ((unlocked[0] | unlocked[1]) == 0) {
        // Solved it
        // Free memory
        while (grid_state) {
            struct GridState* prev_grid_state = grid_state->prev;
            free(grid_state);
            grid_state = prev_grid_state;
        }
        
        // Enter found digits into grid
        for (unsigned char j = 0; j < 81; ++j) {
            unsigned long index;
            _BitScanReverse(&index, candidates[j]);
            grid[j] = (signed char) index + 49;
        }
        
        return true;
    }
    
    // Find hidden singles
    // Don't check the last column because it doesn't fit in the SSE register so it's not really worth checking
    {
        const __m128i ones = _mm_set1_epi16(1);
        const __m128i bit_mask = _mm_setr_epi16(1<<0, 1<<1, 1<<2, 1<<3, 1<<4, 1<<5, 1<<6, 1<<7);
        const __m128i shuffle_mask = _mm_setr_epi8(2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 0, 1);
        for (unsigned char i = 0; i < 81; i += 9) {
            
            // rows
            __m128i row_mask = _mm_set1_epi16(0x01ff ^ candidates[i+8]);
            __m128i c = _mm_loadu_si128((__m128i*) &candidates[i]);
            for (unsigned char j = 0; j < 7; ++j) {
                // rotate shift (1 2 3 4) -> (4 1 2 3)
                c = _mm_shuffle_epi8(c, shuffle_mask);
                row_mask = _mm_andnot_si128(c, row_mask);
            }
            
            // columns
            __m128i column_mask = _mm_set1_epi16(0x01ff);
            for (unsigned char j = 0; j < 81; j += 9) {
                if (j != i) {
                    column_mask = _mm_andnot_si128(_mm_loadu_si128((__m128i*) &candidates[j]), column_mask);
                }
            }
            
            // boxes aren't worth it
            
            __m128i or_mask = _mm_or_si128(row_mask, column_mask);                
            
            if (_mm_test_all_zeros(or_mask, _mm_sub_epi16(or_mask, ones))) {
                
                unsigned short m;
                if (i < 64) {
                    m = (unsigned short) ((unlocked[0] >> i) & 0xff);
                } else {
                    m = (unsigned short) ((unlocked[1] >> (i-64)) & 0xff);
                }
                
                c = _mm_loadu_si128((__m128i*) &candidates[i]);
                
                __m128i a = _mm_cmpgt_epi16(_mm_and_si128(c, or_mask), _mm_setzero_si128());
                __m128i u = _mm_cmpeq_epi16(_mm_and_si128(bit_mask, _mm_set1_epi16(m)), bit_mask);
                int mask = _mm_movemask_epi8(_mm_and_si128(a, u));
                
                if (mask) {
                    unsigned long index, digit;
                    _BitScanForward(&index, mask);
                    
                    index = index/2;
                    
                    int can = ((unsigned short*) &or_mask)[index];
                    _BitScanForward(&digit, can);
                    
                    index = index + i;
                    
                    enter_digit(grid_state, (signed char) digit, index);
                    goto start;
                }
                
            } else {
                // no solutions go back
                grid_state = track_back(grid_state);
                goto start;
            }
        }
    }
    
    // Find naked sets, up to 5
    {
        bool found = false;
        // because this is kind of an expensive task, we are not going to visit all cells but only those that were changed
        unsigned long long *to_visit_n = grid_state->updated;
        to_visit_n[0] &= unlocked[0];
        to_visit_n[1] &= unlocked[1];
        
        for (unsigned char n = 0; n < 2; ++n) {
            while (to_visit_n[n]) {
                unsigned long i_rel;
                _BitScanForward64(&i_rel, to_visit_n[n]);
                
                to_visit_n[n] ^= 1ULL << i_rel;
                unsigned char i = (unsigned char) i_rel + 64*n;
                
                unsigned short cnt = __popcnt16(candidates[i]);
                
                if (cnt <= 5) {
                    // check column
                    unsigned long long to_change[2] = {0};
                    unsigned char s;
                    __m128i a_i = _mm_set1_epi16(candidates[i]);
                    __m128i a_j = _mm_set_epi16(candidates[column_index[i]+63], candidates[column_index[i]+54], candidates[column_index[i]+45], candidates[column_index[i]+36], candidates[column_index[i]+27], candidates[column_index[i]+18], candidates[column_index[i]+9], candidates[column_index[i]]);
                    __m128i res = _mm_cmpeq_epi16(a_i, _mm_or_si128(a_i, a_j));
                    s = __popcnt16(_mm_movemask_epi8(res)) >> 1;
                    s += candidates[i] == (candidates[i] | candidates[column_index[i]+72]);
                    if (s > cnt) {
                        grid_state = track_back(grid_state);
                        goto start;
                    } else if (s == cnt) {
                        add_column_indices(to_change, i);
                    }

                    // check row
                    a_j = _mm_load_si128((__m128i*) &candidates[9*row_index[i]]);
                    res = _mm_cmpeq_epi16(a_i, _mm_or_si128(a_i, a_j));
                    s = __popcnt16(_mm_movemask_epi8(res)) >> 1;
                    s += candidates[i] == (candidates[i] | candidates[9*row_index[i]+8]);
                    if (s > cnt) {
                        grid_state = track_back(grid_state);
                        goto start;
                    } else if (s == cnt) {
                        add_row_indices(to_change, i);
                    }
                    
                    // check box
                    unsigned short b = box_start[i];
                    a_j = _mm_set_epi16(candidates[b], candidates[b+1], candidates[b+2], candidates[b+9], candidates[b+10], candidates[b+11], candidates[b+18], candidates[b+19]);
                    res = _mm_cmpeq_epi16(a_i, _mm_or_si128(a_i, a_j));
                    s = __popcnt16(_mm_movemask_epi8(res)) >> 1;
                    s += candidates[i] == (candidates[i] | candidates[b+20]);
                    if (s > cnt) {
                        grid_state = track_back(grid_state);
                        goto start;
                    } else if (s == cnt) {
                        add_box_indices(to_change, i);
                    }
                                    
                    to_change[0] &= unlocked[0];
                    to_change[1] &= unlocked[1];
                    
                    // update candidates
                    for (unsigned char n = 0; n < 2; ++n) {
                        while (to_change[n]) {
                            unsigned long j_rel;
                            _BitScanForward64(&j_rel, to_change[n]);
                            to_change[n] &= ~(1ULL << j_rel);
                            unsigned char j = (unsigned char) j_rel + 64*n;
                            
                            if ((candidates[j] | candidates[i]) != candidates[i]) {
                                if (candidates[j] & candidates[i]) {
                                    candidates[j] &= ~candidates[i];
                                    to_visit_n[n] |= 1ULL << j_rel;
                                    found = true;
                                }
                            }
                        }
                    }
                    
                    // If any cell's candidates got updated, go back and try all that other stuff again
                    if (found) {
                        goto start;
                    }
                    
                }
            }
        }
    }
    
    // More techniques could be added here but they're not really worth checking for on the 17 clue sudoku set
    
    // Make a guess if all that didn't work
    grid_state = make_guess(grid_state);
    goto start;
    
}


int main(int argc, char *argv[]) {
    
    if (argc != 2) {
        printf("Error! Pass the file name as argument\n");
        return -1;
    }
    
    FILE *f = fopen(argv[1], "rb");
    
    // test for files not existing
    if (f == 0) {
        printf("Error! Could not open file %s\n", argv[1]);
        return -1;
    }
    
    // find length of file
    fseek(f, 0, SEEK_END);
    long fsize = ftell(f);
    fseek(f, 0, SEEK_SET);
    
    // deal with the part that says how many sudokus there are
    signed char *string = malloc(fsize + 1);
    fread(string, 1, fsize, f);
    fclose(f);

    size_t p = 0;
    while (string[p] != 10) {
        ++p;
    }
    ++p;
    
    signed char *output = malloc(fsize*2 - p + 2);
    memcpy(output, string, p);
    
    // solve all sudokus and prepare output file
    int i;
    #pragma omp parallel for shared(string, output, fsize, p) schedule(dynamic)
    for (i = fsize - p - 81; i >= 0; i-=82) {
        // copy unsolved grid
        memcpy(&output[p+i*2], &string[p+i], 81);
        memcpy(&output[p+i*2+82], &string[p+i], 81);
        // add comma and newline in right place
        output[p+i*2 + 81] = ',';
        output[p+i*2 + 163] = 10;
        // solve the grid in place
        solve(&output[p+i*2+82]);
    }
    
    // create output file
    f = fopen("output.txt", "wb");
    fwrite(output, 1, fsize*2 - p + 2, f);
    fclose(f);    

    free(string);
    free(output);

    return 0;
}

The algorithm uses normal techniques humans use. Specifically looking for naked singles, hidden singles and naked subsets. If that doesn't work it uses the infamous Nishio technique (just backtracking). I added more techniques initially like looking for locked candidates, but they turned out to not really be worth it for these 17-clue sudokus.

\$\endgroup\$
7
  • \$\begingroup\$ very cool!! I recently did some benchmarks and it loooks like std::copy usually is a fraction faster than memcpy \$\endgroup\$
    – jdt
    Commented Sep 22, 2021 at 22:27
  • \$\begingroup\$ @JohanduToit The memcpy part is not really critical since the algorithm makes on average only 1 guess per sudoku (it doesn't get called often). When looking for locked candidates this goes down to only 0.3 or 0.2 guesses but my method for finding them was kinda bad so it didn't make much of a difference in time. I think if you want to make it significantly faster that would be something to look into though. \$\endgroup\$
    – Mirage
    Commented Sep 22, 2021 at 23:02
  • 1
    \$\begingroup\$ Wow, this is really impressive! I noticed that you're using omp.h to solve all of the puzzles in parallel. It was a long time since I hosted this challenge, so I can't remember the exact verdict on parallelization, but your solution would be either the fastest or second fastest running single-threaded. I'll try to get this benchmarked to get you an official score! \$\endgroup\$
    – maxb
    Commented Sep 23, 2021 at 15:05
  • \$\begingroup\$ Very nice Mirage! I ran this through tdoku's benchmarks, and it is actually the fastest solver I've seen whose primary representation is cell candidate sets. It's right in there with other top solvers in single-threaded performance, and it does the least guessing of any solver except tdoku. Here is the comparison on various data sets: pastebin.com/ysXKWXJQ I haven't benchmarked with openmp since all of these solvers will benefit in roughly the same way. Is this on github somewhere? It is an interesting specimen ... may I add it to tdoku's "other solver" benchmarks? \$\endgroup\$
    – 53x15
    Commented Oct 2, 2021 at 19:20
  • \$\begingroup\$ Sorry, looks like I didn't set up the comparison correctly in the benchmarks above. Your solver returns when it finds the first solution and doesn't confirm that the solution is unique (something all the other solvers were doing). When you run them all in "first solution" mode then the results are here: pastebin.com/JTmkp8Y7 Still a solid performer, and maybe with headroom, but not the outlier it seemed to be in either speed and guesses. Instead it's comparable to other fast solvers with a cell-based representation. A little faster on some datasets, a little slower on others. \$\endgroup\$
    – 53x15
    Commented Oct 2, 2021 at 23:11
8
\$\begingroup\$

Node.js, 8.231s 6.735s official score

Takes the file name as argument. The input file may already contain the solutions in the format described in the challenge, in which case the program will compare them with its own solutions.

The results are saved in 'sudoku.log'.

Code

'use strict';

const fs = require('fs');

const BLOCK     = [];
const BLOCK_NDX = [];
const N_BIT     = [];
const ZERO      = [];
const BIT       = [];

console.time('Processing time');

init();

let filename = process.argv[2],
    puzzle = fs.readFileSync(filename).toString().split('\n'),
    len = puzzle.shift(),
    output = len + '\n';

console.log("File '" + filename + "': " + len + " puzzles");

// solve all puzzles
puzzle.forEach((p, i) => {
  let sol, res;

  [ p, sol ] = p.split(',');

  if(p.length == 81) {
    if(!(++i % 2000)) {
      console.log((i * 100 / len).toFixed(1) + '%');
    }
    if(!(res = solve(p))) {
      throw "Failed on puzzle " + i;
    }
    if(sol && res != sol) {
      throw "Invalid solution for puzzle " + i;
    }
    output += p + ',' + res + '\n';
  }
});

// results
console.timeEnd('Processing time');
fs.writeFileSync('sudoku.log', output);
console.log("MD5 = " + require('crypto').createHash('md5').update(output).digest("hex"));

// initialization of lookup tables
function init() {
  let ptr, x, y;

  for(x = 0; x < 0x200; x++) {
    N_BIT[x] = [0, 1, 2, 3, 4, 5, 6, 7, 8].reduce((s, n) => s + (x >> n & 1), 0);
    ZERO[x] = ~x & -~x;
  }

  for(x = 0; x < 9; x++) {
    BIT[1 << x] = x;
  }

  for(ptr = y = 0; y < 9; y++) {
    for(x = 0; x < 9; x++, ptr++) {
      BLOCK[ptr] = (y / 3 | 0) * 3 + (x / 3 | 0);
      BLOCK_NDX[ptr] = (y % 3) * 3 + x % 3;
    }
  }
}

// solver
function solve(p) {
  let ptr, x, y, v,
      count = 81,
      m = Array(81).fill(-1),
      row = Array(9).fill(0),
      col = Array(9).fill(0),
      blk = Array(9).fill(0);

  // helper function to check and play a move
  function play(stack, x, y, n) {
    let p = y * 9 + x;

    if(~m[p]) {
      if(m[p] == n) {
        return true;
      }
      undo(stack);
      return false;
    }

    let msk, b;

    msk = 1 << n;
    b = BLOCK[p];

    if((col[x] | row[y] | blk[b]) & msk) {
      undo(stack);
      return false;
    }
    count--;
    col[x] ^= msk;
    row[y] ^= msk;
    blk[b] ^= msk;
    m[p] = n;
    stack.push(x << 8 | y << 4 | n);

    return true;
  }

  // helper function to undo all moves on the stack
  function undo(stack) {
    stack.forEach(v => {
      let x = v >> 8,
          y = v >> 4 & 15,
          p = y * 9 + x,
          b = BLOCK[p];

      v = 1 << (v & 15);

      count++;
      col[x] ^= v;
      row[y] ^= v;
      blk[b] ^= v;
      m[p] = -1;
    });
  }

  // convert the puzzle into our own format
  for(ptr = y = 0; y < 9; y++) {
    for(x = 0; x < 9; x++, ptr++) {
      if(~(v = p[ptr] - 1)) {
        col[x] |= 1 << v;
        row[y] |= 1 << v;
        blk[BLOCK[ptr]] |= 1 << v;
        count--;
        m[ptr] = v;
      }
    }
  }

  // main recursive search function
  let res = (function search() {
    // success?
    if(!count) {
      return true;
    }

    let ptr, x, y, v, n, max, best,
        k, i, stack = [],
        dCol = Array(81).fill(0),
        dRow = Array(81).fill(0),
        dBlk = Array(81).fill(0),
        b, v0;

    // scan the grid:
    // - keeping track of where each digit can go on a given column, row or block
    // - looking for a cell with the fewest number of legal moves
    for(max = ptr = y = 0; y < 9; y++) {
      for(x = 0; x < 9; x++, ptr++) {
        if(m[ptr] == -1) {
          v = col[x] | row[y] | blk[BLOCK[ptr]];
          n = N_BIT[v];

          // abort if there's no legal move on this cell
          if(n == 9) {
            return false;
          }

          // update dCol[], dRow[] and dBlk[]
          for(v0 = v ^ 0x1FF; v0;) {
            b = v0 & -v0;
            dCol[x * 9 + BIT[b]] |= 1 << y;
            dRow[y * 9 + BIT[b]] |= 1 << x;
            dBlk[BLOCK[ptr] * 9 + BIT[b]] |= 1 << BLOCK_NDX[ptr];
            v0 ^= b;
          }

          // update the cell with the fewest number of moves
          if(n > max) {
            best = {
              x  : x,
              y  : y,
              ptr: ptr,
              msk: v
            };
            max = n;
          }
        }
      }
    }

    // play all forced moves (unique candidates on a given column, row or block)
    // and make sure that it doesn't lead to any inconsistency
    for(k = 0; k < 9; k++) {
      for(n = 0; n < 9; n++) {
        if(N_BIT[dCol[k * 9 + n]] == 1) {
          i = BIT[dCol[k * 9 + n]];

          if(!play(stack, k, i, n)) {
            return false;
          }
        }

        if(N_BIT[dRow[k * 9 + n]] == 1) {
          i = BIT[dRow[k * 9 + n]];

          if(!play(stack, i, k, n)) {
            return false;
          }
        }

        if(N_BIT[dBlk[k * 9 + n]] == 1) {
          i = BIT[dBlk[k * 9 + n]];

          if(!play(stack, (k % 3) * 3 + i % 3, (k / 3 | 0) * 3 + (i / 3 | 0), n)) {
            return false;
          }
        }
      }
    }

    // if we've played at least one forced move, do a recursive call right away
    if(stack.length) {
      if(search()) {
        return true;
      }
      undo(stack);
      return false;
    }

    // otherwise, try all moves on the cell with the fewest number of moves
    while((v = ZERO[best.msk]) < 0x200) {
      col[best.x] ^= v;
      row[best.y] ^= v;
      blk[BLOCK[best.ptr]] ^= v;
      m[best.ptr] = BIT[v];
      count--;

      if(search()) {
        return true;
      }

      count++;
      m[best.ptr] = -1;
      col[best.x] ^= v;
      row[best.y] ^= v;
      blk[BLOCK[best.ptr]] ^= v;

      best.msk ^= v;
    }

    return false;
  })();

  return res ? m.map(n => n + 1).join('') : false;
}

// debugging
function dump(m) {
  let x, y, c = 81, s = '';

  for(y = 0; y < 9; y++) {
    for(x = 0; x < 9; x++) {
      s += (~m[y * 9 + x] ? (c--, m[y * 9 + x] + 1) : '-') + (x % 3 < 2 || x == 8 ? ' ' : ' | ');
    }
    s += y % 3 < 2 || y == 8 ? '\n' : '\n------+-------+------\n';
  }
  console.log(c);
  console.log(s);
}

Example output

Tested on an Intel Core i7 7500U @ 2.70 GHz.

output

\$\endgroup\$
11
  • 1
    \$\begingroup\$ @maxb I see. I did not understand that your concern was about multi-threading. \$\endgroup\$
    – Arnauld
    Commented Aug 24, 2019 at 21:22
  • 1
    \$\begingroup\$ it should be faster if you delete the line if(!stack.some(... and its corresponding } (but preserve what's between them), and at the beginning of play() insert if(m[y][x]>=0)return v==1<<m[y][x] \$\endgroup\$
    – ngn
    Commented Aug 28, 2019 at 15:44
  • 1
    \$\begingroup\$ Why is your solution so much faster than the other ones? \$\endgroup\$
    – user9207
    Commented Aug 28, 2019 at 18:54
  • 2
    \$\begingroup\$ @Anush What I've called 'forced moves' in the code is what makes it significantly faster and is better known as hidden singles. (We could also look for hidden twins, triples, quads, etc. but I'm not sure it's really worth it, at least in Node.) \$\endgroup\$
    – Arnauld
    Commented Aug 28, 2019 at 19:58
  • 6
    \$\begingroup\$ "I started looking at naked singles" careful with the phrasing :) \$\endgroup\$
    – ngn
    Commented Aug 29, 2019 at 8:20
4
\$\begingroup\$

Python 3 (with dlx) 4min 46.870s official score

(single core i7-3610QM here)

Obviously beatable with a compiled language like C, and making use of threading, but it's a start...

sudoku is a module I've placed on github (copied at the footer of this post) which uses dlx under the hood.

#!/usr/bin/python
import argparse
import gc
import sys
from timeit import timeit

from sudoku import Solver

def getSolvers(filePath):
    solvers = []
    with open(filePath, 'r') as inFile:
        for line in inFile:
            content = line.rstrip()
            if len(content) == 81 and content.isdigit():
                solvers.append(Solver(content))
    return solvers

def solve(solvers):
    for solver in solvers:
        yield next(solver.genSolutions())

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Time or print solving of some sudoku.')
    parser.add_argument('filePath',
                        help='Path to the file containing proper sudoku on their own lines as 81 digits in row-major order with 0s as blanks')
    parser.add_argument('-p', '--print', dest='printEm', action='store_true',
                        default=False,
                        help='print solutions in the same fashion as the input')
    parser.add_argument('-P', '--pretty', dest='prettyPrintEm', action='store_true',
                        default=False,
                        help='print inputs and solutions formatted for human consumption')
    args = parser.parse_args()

    if args.printEm or args.prettyPrintEm:
        solvers = getSolvers(args.filePath)
        print(len(solvers))
        for solver, solution in zip(solvers, solve(solvers)):
            if args.prettyPrintEm:
                print(solver)
                print(solution)
            else:
                print('{},{}'.format(solver.representation(noneCharacter='0'), solution.representation()))
    else:
        setup = '''\
from __main__ import getSolvers, solve, args, gc
gc.disable()
solvers = getSolvers(args.filePath)'''
        print(timeit("for solution in solve(solvers): pass", setup=setup, number=1))

Usage

  • Install Python 3
  • Save sudoku.py somewhere on your path (from the git hub link or copy it from below)
  • Save the above code as testSolver.py somewhere on your path
  • Install dlx:
python -m pip install dlx
  • Run it (by the way it consumes memory like it's going out of fashion)
usage: testSolver.py [-h] [-p] [-P] filePath

Time or print solving of some sudoku.

positional arguments:
  filePath      Path to the file containing proper sudoku on their own lines
                as 81 digits in row-major order with 0s as blanks

optional arguments:
  -h, --help    show this help message and exit
  -p, --print   print solutions in the same fashion as the input
  -P, --pretty  print inputs and solutions formatted for human consumption

Pipe output as required in the challenge spec to a file if need be:

python testSolver.py -p input_file_path > output_file_path

sudoku.py (yes there are extra features here other than solving)

import dlx
from itertools import permutations, takewhile
from random import choice, shuffle

'''
A 9 by 9 sudoku solver.
'''
_N = 3
_NSQ = _N**2
_NQU = _N**4
_VALID_VALUE_INTS = list(range(1, _NSQ + 1))
_VALID_VALUE_STRS = [str(v) for v in _VALID_VALUE_INTS]
_EMPTY_CELL_CHAR = '·'

# The following are mutually related by their ordering, and define ordering throughout the rest of the code. Here be dragons.
#
_CANDIDATES = [(r, c, v) for r in range(_NSQ) for c in range(_NSQ) for v in range(1, _NSQ + 1)]
_CONSTRAINT_INDEXES_FROM_CANDIDATE = lambda r, c, v: [ _NSQ * r + c, _NQU + _NSQ * r + v - 1, _NQU * 2 + _NSQ * c + v - 1, _NQU * 3 + _NSQ * (_N * (r // _N) + c // _N) + v - 1]
_CONSTRAINT_FORMATTERS =                             [ "R{0}C{1}"  , "R{0}#{1}"                , "C{0}#{1}"                   , "B{0}#{1}"]
_CONSTRAINT_NAMES = [(s.format(a, b + (e and 1)), dlx.DLX.PRIMARY) for e, s in enumerate(_CONSTRAINT_FORMATTERS) for a in range(_NSQ) for b in range(_NSQ)]
_EMPTY_GRID_CONSTRAINT_INDEXES = [_CONSTRAINT_INDEXES_FROM_CANDIDATE(r, c, v) for (r, c, v) in _CANDIDATES]
#
# The above are mutually related by their ordering, and define ordering throughout the rest of the code. Here be dragons.


class Solver:
    def __init__(self, representation=''):
        if not representation or len(representation) != _NQU:
            self._complete = False
            self._NClues = 0
            self._repr = [None]*_NQU # blank grid, no clues - maybe to extend to a generator by overriding the DLX column selection to be stochastic.
        else:
            nClues = 0
            repr = []
            for value in representation:
                if not value:
                    repr.append(None)
                elif isinstance(value, int) and 1 <= value <= _NSQ:
                    nClues += 1
                    repr.append(value)
                elif value in _VALID_VALUE_STRS:
                    nClues += 1
                    repr.append(int(value))
                else:
                    repr.append(None)
            self._complete = nClues == _NQU
            self._NClues = nClues
            self._repr = repr

    def genSolutions(self, genSudoku=True, genNone=False, dlxColumnSelctor=None):
        '''
        if genSudoku=False, generates each solution as a list of cell values (left-right, top-bottom)
        '''
        if self._complete:
            yield self
        else:
            self._initDlx()
            dlxColumnSelctor = dlxColumnSelctor or dlx.DLX.smallestColumnSelector
            if genSudoku:
                for solution in self._dlx.solve(dlxColumnSelctor):
                    yield Solver([v for (r, c, v) in sorted([self._dlx.N[i] for i in solution])])
            elif genNone:
                for solution in self._dlx.solve(dlxColumnSelctor):
                    yield
            else:
                for solution in self._dlx.solve(dlxColumnSelctor):
                    yield [v for (r, c, v) in sorted([self._dlx.N[i] for i in solution])]

    def uniqueness(self, returnSolutionIfProper=False):
        '''
        Returns: 0 if unsolvable;
                 1 (or the unique solution if returnSolutionIfProper=True) if uniquely solvable; or
                 2 if multiple possible solutions exist
        - a 'proper' sudoku is uniquely solvable.
        '''
        slns = list(takewhile(lambda t: t[0] < 2, ((i, sln) for i, sln in enumerate(self.genSolutions(genSudoku=returnSolutionIfProper, genNone=not returnSolutionIfProper)))))
        uniqueness = len(slns)
        if returnSolutionIfProper and uniqueness == 1:
            return slns[0][1]
        else:
            return uniqueness

    def representation(self, asString=True, noneCharacter='.'):
        if asString:
            return ''.join([v and str(_VALID_VALUE_STRS[v - 1]) or noneCharacter for v in self._repr])
        return self._repr[:]

    def __repr__(self):
        return display(self._repr)

    def _initDlx(self):
        self._dlx = dlx.DLX(_CONSTRAINT_NAMES)
        rowIndexes = self._dlx.appendRows(_EMPTY_GRID_CONSTRAINT_INDEXES, _CANDIDATES)
        for r in range(_NSQ):
            for c in range(_NSQ):
                v = self._repr[_NSQ * r + c]
                if v is not None:
                    self._dlx.useRow(rowIndexes[_NQU * r + _NSQ * c + v - 1])


_ROW_SEPARATOR_COMPACT = '+'.join(['-' * (2 * _N + 1) for b in range(_N)])[1:-1] + '\n'
_ROW_SEPARATOR = ' ·-' + _ROW_SEPARATOR_COMPACT[:-1] + '-·\n'
_TOP_AND_BOTTOM = _ROW_SEPARATOR.replace('+', '·')

_ROW_LABELS = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J']
_COL_LABELS = ['1', '2', '3', '4', '5', '6', '7', '8', '9']
_COLS_LABEL = ' ' + ' '.join([i % _N == 0 and '  ' + l or l for i, l in enumerate(_COL_LABELS)]) + '\n'


def display(representation, conversion=None, labelled=True):
    result = ''
    raw = [conversion[n or 0] for n in representation] if conversion else representation
    if labelled:
        result += _COLS_LABEL + _TOP_AND_BOTTOM
        rSep = _ROW_SEPARATOR
    else:
        rSep = _ROW_SEPARATOR_COMPACT
    for r in range(_NSQ):
        if r > 0 and r % _N == 0:
            result += rSep
        for c in range(_NSQ):
            if c % _N == 0:
                if c == 0:
                    if labelled:
                        result += _ROW_LABELS[r] + '| '
                else:
                    result += '| '
            result += str(raw[_NSQ * r + c] or _EMPTY_CELL_CHAR) + ' '
        if labelled:
            result += '|'
        result += '\n'
    if labelled:
        result += _TOP_AND_BOTTOM
    else:
        result = result[:-1]
    return result

def permute(representation):
    '''
    returns a random representation from the given representation's equivalence class
    '''
    rows = [list(representation[i:i+_NSQ]) for i in range(0, _NQU, _NSQ)]
    rows = permuteRowsAndBands(rows)
    rows = [[r[i] for r in rows] for i in range(_NSQ)]
    rows = permuteRowsAndBands(rows)
    pNumbers = [str(i) for i in range(1, _NSQ + 1)]
    shuffle(pNumbers)
    return ''.join(''.join([pNumbers[int(v) - 1] if v.isdigit() and v != '0' else v for v in r]) for r in rows)

def permuteRowsAndBands(rows):
    bandP = choice([x for x in permutations(range(_N))])
    rows = [rows[_N * b + r] for b in bandP for r in range(_N)]
    for band in range(0, _NSQ, _N):
        rowP = choice([x for x in permutations([band + i for i in range(_N)])])
        rows = [rows[rowP[i % _N]] if i // _N == band else rows[i] for i in range(_NSQ)]
    return rows


def getRandomSolvedStateRepresentation():
    return permute('126459783453786129789123456897231564231564897564897231312645978645978312978312645')


def getRandomSudoku():
    r = getRandomSolvedStateRepresentation()
    s = Solver(r)
    indices = list(range(len(r)))
    shuffle(indices)
    for i in indices:
        ns = Solver(s._repr[:i] + [None] + s._repr[i+1:])
        if ns.uniqueness() == 1:
            s = ns
    return s


if __name__ == '__main__':
    print('Some example useage:')
    inputRepresentation = '..3......4......2..8.12...6.........2...6...7...8.7.31.1.64.9..6.5..8...9.83...4.'
    print('>>> s = Solver({})'.format(inputRepresentation))
    s = Solver(inputRepresentation)
    print('>>> s')
    print(s)
    print('>>> print(s.representation())')
    print(s.representation())
    print('>>> print(display(s.representation(), labelled=False))')
    print(display(s.representation(), labelled=False))
    print('>>> for solution in s.genSolutions(): solution')
    for solution in s.genSolutions(): print(solution)
    inputRepresentation2 = inputRepresentation[:2] + '.' + inputRepresentation[3:]
    print('>>> s.uniqueness()')
    print(s.uniqueness())
    print('>>> s2 = Solver({})  # removed a clue; this has six solutions rather than one'.format(inputRepresentation2))
    s2 = Solver(inputRepresentation2)
    print('>>> s2.uniqueness()')
    print(s2.uniqueness())
    print('>>> for solution in s2.genSolutions(): solution')
    for solution in s2.genSolutions(): print(solution)
    print('>>> s3 = getRandomSudoku()')
    s3 = getRandomSudoku()
    print('>>> s3')
    print(s3)
    print('>>> for solution in s3.genSolutions(): solution')
    for solution in s3.genSolutions(): print(solution)
\$\endgroup\$
3
  • \$\begingroup\$ Impressive for a Python solution, I'll try to benchmark it later today. \$\endgroup\$
    – maxb
    Commented Aug 24, 2019 at 8:31
  • 1
    \$\begingroup\$ Thanks; and wow, so much faster there maxb! \$\endgroup\$ Commented Aug 24, 2019 at 13:42
  • 1
    \$\begingroup\$ +1 for using dancing links \$\endgroup\$
    – user9207
    Commented Aug 25, 2019 at 4:32
4
\$\begingroup\$

Python 3 + Z3 - 10min 45.657s official score

about 1000s on my laptop.

import time
start = time.time()

import z3.z3 as z3
import itertools
import datetime
import sys

solver = z3.Solver()
ceils = [[None] * 9 for i in range(9)]

for row in range(9):
    for col in range(9):
        name = 'c' + str(row * 9 + col)
        ceil = z3.BitVec(name, 9)
        solver.add(z3.Or(
            ceil == 0b000000001,
            ceil == 0b000000010,
            ceil == 0b000000100,
            ceil == 0b000001000,
            ceil == 0b000010000,
            ceil == 0b000100000,
            ceil == 0b001000000,
            ceil == 0b010000000,
            ceil == 0b100000000
        ))
        solver.add(ceil != 0)
        ceils[row][col] = ceil
for i in range(9):
    for j in range(9):
        for k in range(9):
            if j == k: continue
            solver.add(ceils[i][j] & ceils[i][k] == 0)
            solver.add(ceils[j][i] & ceils[k][i] == 0)
            row, col = i // 3 * 3, i % 3 * 3
            solver.add(ceils[row + j // 3][col + j % 3] & ceils[row + k // 3][col + k % 3] == 0)

row_col = list(itertools.product(range(9), range(9)))
lookup = { 1 << i: str(i + 1) for i in range(9) }

def solve(line):
    global solver, output, row_col, ceils, lookup
    solver.push()
    for value, (row, col) in zip(line, row_col):
        val = ord(value) - 48
        if val == 0: continue
        solver.add(ceils[row][col] == 1 << (val - 1))

    output = []
    if solver.check() == z3.sat:
        model = solver.model()
        for row in range(9):
            for col in range(9):
                val = model[ceils[row][col]].as_long()
                output.append(lookup[val])
    solver.pop()

    return ''.join(output)

count = int(input())
print(count)
for i in range(count):
    if i % 1000 == 0:
        sys.stderr.write(str(i) + '\n')
    line = input()
    print(line + "," + solve(line))
end = time.time()

sys.stderr.write(str(end - start))

Install dependency

pip install z3-solver

Run

python3 solve.py < in.txt > out.txt

I'm not sure how to improve its performance, since it just solved magically...

\$\endgroup\$
2
  • \$\begingroup\$ Quite impressive for a general constraint solver. My first implementation was way slower than this. Running a benchmark right now, I'll update the post once it's done. \$\endgroup\$
    – maxb
    Commented Aug 26, 2019 at 5:53
  • \$\begingroup\$ @maxb just done some general clean up, and i believe there is no need to update the benchmark... \$\endgroup\$
    – tsh
    Commented Aug 26, 2019 at 7:03
4
\$\begingroup\$

C - 2.228s 1.690s official score

based on @Arnauld's

#include<fcntl.h>
#define O const
#define R return
#define S static
#define  $(x,y...)if(x){y;}
#define  W(x,y...)while(x){y;}
#define fi(x,y...)for(I i=0,_n=(x);i<_n;i++){y;}
#define fj(x,y...)for(I j=0,_n=(x);j<_n;j++){y;}
#define fp81(x...)for(I p=0;p<81;p++){x;}
#define  fq3(x...)for(I q=0;q<3;q++){x;}
#define fij9(x...){fi(9,fj(9,x))}
#define m0(x)m0_((V*)(x),sizeof(x));
#define popc(x)__builtin_popcount(x)
#define ctz(x)__builtin_ctz(x)
#include<sys/syscall.h>
#define sc(f,x...)({L u;asm volatile("syscall":"=a"(u):"0"(SYS_##f)x:"cc","rcx","r11","memory");u;})
#define sc1(f,x)    sc(f,,"D"(x))
#define sc2(f,x,y)  sc(f,,"D"(x),"S"(y))
#define sc3(f,x,y,z)sc(f,,"D"(x),"S"(y),"d"(z))
#define wr(a...)sc3(write,a)
#define op(a...)sc3( open,a)
#define cl(a...)sc1(close,a)
#define fs(a...)sc2(fstat,a)
#define ex(a...)sc1( exit,a)
#define mm(x,y,z,t,u,v)({register L r10 asm("r10")=t,r8 asm("r8")=u,r9 asm("r9")=v;\
                         (V*)sc(mmap,,"D"(x),"S"(y),"d"(z),"r"(r10),"r"(r8),"r"(r9));})
typedef void V;typedef char C;typedef short H;typedef int I;typedef long long L;
S C BL[81],KL[81],IJK[81][3],m[81],t_[81-17],*t;S H rcb[3][9],cnt;
S V*mc(V*x,O V*y,L n){C*p=x;O C*q=y;fi(n,*p++=*q++)R x;}S V m0_(C*p,L n){fi(n,*p++=0);}
S I undo(C*t0){cnt+=t-t0;W(t>t0,C p=*--t;H v=1<<m[p];fq3(rcb[q][IJK[p][q]]^=v)m[p]=-1)R 0;}
S I play(C p,H v){$(m[p]>=0,R 1<<m[p]==v)I w=0;fq3(w|=rcb[q][IJK[p][q]])$(w&v,R 0)cnt--;
                  fq3(rcb[q][IJK[p][q]]^=v);m[p]=ctz(v);*t++=p;R 1;}
S I f(){$(!cnt,R 1)C*t0=t;H max=0,bp,bv,d[9][9][4];m0(d);
 fij9(I p=i*9+j;$(m[p]<0,
  I v=0;fq3(v|=rcb[q][IJK[p][q]])I w=v^511;$(!w,R 0)H g[]={1<<j,1<<i,1<<BL[p]};
  do{I z=ctz(w);w&=w-1;fq3(d[IJK[p][q]][z][q]|=g[q]);}while(w);
  I n=popc(v);$(max<n,max=n;bp=p;bv=v)))
 fij9(I u=d[i][j][0];$(popc(u)==1,I l=ctz(u);$(!play(   i*9+l ,1<<j),R undo(t0)))
        u=d[i][j][1];$(popc(u)==1,I l=ctz(u);$(!play(   l*9+i ,1<<j),R undo(t0)))
        u=d[i][j][2];$(popc(u)==1,I l=ctz(u);$(!play(KL[i*9+l],1<<j),R undo(t0))))
 $(t-t0,R f()||undo(t0))
 W(1,I v=1<<ctz(~bv);$(v>511,R 0)fq3(rcb[q][IJK[bp][q]]^=v)m[bp]=ctz(v);cnt--;$(f(),R 1)
     cnt++;m[bp]=-1;fq3(rcb[q][IJK[bp][q]]^=v)bv^=v)
 R 0;}
asm(".globl _start\n_start:pop %rdi\nmov %rsp,%rsi\njmp main");
V main(I ac,C**av){$(ac!=2,ex(2))
 fij9(I p=i*9+j;BL[p]=i%3*3+j%3;KL[p]=(i/3*3+j/3)*9+BL[p];IJK[p][0]=i;IJK[p][1]=j;IJK[p][2]=i/3*3+j/3)
 I d=op(av[1],0,0);struct stat h;fs(d,&h);C*s0=mm(0,h.st_size,1,0x8002,d,0),*s=s0;cl(d); //in
 C*r0=mm(0,2*h.st_size,3,0x22,-1,0),*r=r0; //out
 I n=0;W(*s!='\n',n*=10;n+=*s++-'0')s++;mc(r,s0,s-s0);r+=s-s0;
 fi(n,m0(rcb);cnt=81;t=t_;$(s[81]&&s[81]!='\n',ex(3))mc(r,s,81);r+=81;*r++=',';
      fp81(I v=m[p]=*s++-'1';$(v>=0,v=1<<v;fq3(rcb[q][IJK[p][q]]|=v)cnt--))
      s++;$(!f(),ex(4))fp81(r[p]=m[p]+'1')r+=81;*r++='\n')
 wr(1,r0,r-r0);ex(0);}

compile and run:

gcc -O3 -march=native -nostdlib -ffreestanding
time ./a.out all_17_clue_sudokus.txt | md5sum
\$\endgroup\$
8
  • \$\begingroup\$ Congratulations, you (and Arnauld) are in the lead by a lot right now. \$\endgroup\$
    – maxb
    Commented Aug 26, 2019 at 12:32
  • \$\begingroup\$ @maxb i tried using more efficient i/o (direct syscalls without libc) but the effect wasn't as great as i hoped. i also tidied up the rest of the code. this should take away ~0.2s. do you mind re-scoring? \$\endgroup\$
    – ngn
    Commented Aug 27, 2019 at 20:36
  • \$\begingroup\$ Of course, I'll try to get it done sometime today \$\endgroup\$
    – maxb
    Commented Aug 28, 2019 at 5:06
  • \$\begingroup\$ I was also thinking about trying a RAMdisk for all I/O, just to see if that makes a difference. I doubt it'll make a huge difference, since reads and writes are sequential, and my SSD has a large enough cache to fit everything. \$\endgroup\$
    – maxb
    Commented Aug 28, 2019 at 7:24
  • \$\begingroup\$ @maxb there probably won't be any difference at all. the second time you run the program, the input file will already be in ram anyway - in linux's filesystem cache. \$\endgroup\$
    – ngn
    Commented Aug 28, 2019 at 7:44
4
\$\begingroup\$

C++ with Minisat(2.2.1-5) - 11.735s official score

This is nowhere near as fast as a specialized algorithm, but it's a different approach, an interesting point of reference, and easy to understand.

$ clang++ -o solve -lminisat solver_minisat.cc

#include <minisat/core/Solver.h>

namespace {

using Minisat::Lit;
using Minisat::mkLit;
using namespace std;

struct SolverMiniSat {
    Minisat::Solver solver;

    SolverMiniSat() {
        InitializeVariables();
        InitializeTriadDefinitions();
        InitializeTriadOnnes();
        InitializeCellOnnes();
    }

    // normal cell literals, of which we have 9*9*9
    static Lit Literal(int row, int column, int value) {
        return mkLit(value + 9 * (column + 9 * row), true);
    }

    // horizontal triad literals, of which we have 9*3*9, starting after the cell literals
    static Lit HTriadLiteral(int row, int column, int value) {
        int base = 81 * 9;
        return mkLit(base + value + 9 * (column + 3 * row));
    }

    // vertical triad literals, of which we have 3*9*9, starting after the h_triad literals
    static Lit VTriadLiteral(int row, int column, int value) {
        int base = (81 + 27) * 9;
        return mkLit(base + value + 9 * (row + 3 * column));
    }

    void InitializeVariables() {
        for (int i = 0; i < 15 * 9 * 9; i++) {
            solver.newVar();
        }
    }

    // create an exactly-one constraint over a set of literals
    void CreateOnne(const Minisat::vec<Minisat::Lit> &literals) {
        solver.addClause(literals);
        for (int i = 0; i < literals.size() - 1; i++) {
            for (int j = i + 1; j < literals.size(); j++) {
                solver.addClause(~literals[i], ~literals[j]);
            }
        }
    }

    void InitializeTriadDefinitions() {
        for (int i = 0; i < 9; i++) {
            for (int j = 0; j < 3; j++) {
                for (int value = 0; value < 9; value++) {
                    Lit h_triad = HTriadLiteral(i, j, value);
                    Lit v_triad = VTriadLiteral(j, i, value);
                    int j0 = j * 3 + 0, j1 = j * 3 + 1, j2 = j * 3 + 2;

                    Minisat::vec<Minisat::Lit> h_triad_def;
                    h_triad_def.push(Literal(i, j0, value));
                    h_triad_def.push(Literal(i, j1, value));
                    h_triad_def.push(Literal(i, j2, value));
                    h_triad_def.push(~h_triad);
                    CreateOnne(h_triad_def);

                    Minisat::vec<Minisat::Lit> v_triad_def;
                    v_triad_def.push(Literal(j0, i, value));
                    v_triad_def.push(Literal(j1, i, value));
                    v_triad_def.push(Literal(j2, i, value));
                    v_triad_def.push(~v_triad);
                    CreateOnne(v_triad_def);
                }
            }
        }
    }

    void InitializeTriadOnnes() {
        for (int i = 0; i < 9; i++) {
            for (int value = 0; value < 9; value++) {
                Minisat::vec<Minisat::Lit> row;
                row.push(HTriadLiteral(i, 0, value));
                row.push(HTriadLiteral(i, 1, value));
                row.push(HTriadLiteral(i, 2, value));
                CreateOnne(row);

                Minisat::vec<Minisat::Lit> column;
                column.push(VTriadLiteral(0, i, value));
                column.push(VTriadLiteral(1, i, value));
                column.push(VTriadLiteral(2, i, value));
                CreateOnne(column);

                Minisat::vec<Minisat::Lit> hbox;
                hbox.push(HTriadLiteral(3 * (i / 3) + 0, i % 3, value));
                hbox.push(HTriadLiteral(3 * (i / 3) + 1, i % 3, value));
                hbox.push(HTriadLiteral(3 * (i / 3) + 2, i % 3, value));
                CreateOnne(hbox);

                Minisat::vec<Minisat::Lit> vbox;
                vbox.push(VTriadLiteral(i % 3, 3 * (i / 3) + 0, value));
                vbox.push(VTriadLiteral(i % 3, 3 * (i / 3) + 1, value));
                vbox.push(VTriadLiteral(i % 3, 3 * (i / 3) + 2, value));
                CreateOnne(vbox);
            }
        }
    }

    void InitializeCellOnnes() {
        for (int row = 0; row < 9; row++) {
            for (int column = 0; column < 9; column++) {
                Minisat::vec<Minisat::Lit> literals;
                for (int value = 0; value < 9; value++) {
                    literals.push(Literal(row, column, value));
                }
                CreateOnne(literals);
            }
        }
    }

    bool SolveSudoku(const char *input, char *solution, size_t *num_guesses) {
        Minisat::vec<Minisat::Lit> assumptions;
        for (int row = 0; row < 9; row++) {
            for (int column = 0; column < 9; column++) {
                char digit = input[row * 9 + column];
                if (digit != '.') {
                    assumptions.push(Literal(row, column, digit - '1'));
                }
            }
        }
        solver.decisions = 0;
        bool satisfied = solver.solve(assumptions);
        if (satisfied) {
            for (int row = 0; row < 9; row++) {
                for (int column = 0; column < 9; column++) {
                    for (int value = 0; value < 9; value++) {
                        if (solver.model[value + 9 * (column + 9 * row)] ==
                            Minisat::lbool((uint8_t) 1)) {
                            solution[row * 9 + column] = value + '1';
                        }
                    }
                }
            }
        }
        *num_guesses = solver.decisions - 1;
        return satisfied;
    }
};

} //end anonymous namespace

int main(int argc, const char **argv) {
    char *puzzle = NULL;
    char solution[81];
    size_t size, guesses;

    SolverMiniSat solver;

    while (getline(&puzzle, &size, stdin) != -1) {
        int count = solver.SolveSudoku(puzzle, solution, &guesses);
        printf("%.81s:%d:%.81s\n", puzzle, count, solution);
    }
}
\$\endgroup\$
3
\$\begingroup\$

Java - 4.056s official score

The main idea of this is to never allocate memory when it is not needed. The only exception are primitives, which should be optimized by the compiler anyway. Everything else is stored as masks and arrays of operations done in each step, which can be undone when the recursion step is completed.

About half of all sudokus are solved completely without backtracking, but if I push that number higher the overall time seems to be slower. I'm planning om rewriting this in C++ and optimize even further, but this solver is becoming a behemoth.

I wanted to implement as much caching as possible, which lead to some issues. For example, if there are two cells on the same row which can only have the number 6, then we have reached an impossible case, and should return to the backtracking. But since I calculated all options in one sweep, and then placed numbers in cells with only one possibility, I didn't double check that I had placed a number in the same row just before. This lead to impossible solutions.

With everything being contained in the arrays defined at the top, the memory usage of the actual solver is about 216kB. The main part of the memory usage comes from the array containing all the puzzles, and the I/O handlers in Java.

EDIT: I have a version which is translated to C++ now, but it isn't vastly faster. The official time is around 3.5 seconds, which isn't a huge improvement. I think the main issue with my implementation is that I keep my masks as arrays rather than bitmasks. I'll try to analyze Arnauld's solution to see what can be done to improve it.

import java.util.HashMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.File;
import java.io.PrintWriter;

public class Sudoku {   

    final private int[] unsolvedBoard;
    final private int[] solvedBoard; 
    final private int[][] neighbors;
    final private int[][] cells;

    private static int[] clues;
    final private int[][] mask;
    final private int[] formattedMask;
    final private int[][] placedMask;
    final private boolean[][][] lineMask;
    final private int[] lineCounters;
    final private int[][] sectionCounters;
    final private int[][] sectionMask;

    private int easySolved;
    private boolean isEasy;
    private int totEasy;
    private int placedNumbers;
    public long totTime = 0;
    private boolean solutionFound;
    public long lastPrint;
    private boolean shouldPrint;
    private boolean isImpossible = false;

    public Sudoku() {
        mask = new int[81][9];
        formattedMask = new int[81];
        placedMask = new int[64][64];
        lineMask = new boolean[64][81][9];
        sectionCounters = new int[9][27];
        sectionMask = new int[9][27];
        lineCounters = new int[64];
        neighbors = new int[81][20];
        unsolvedBoard = new int[81];
        solvedBoard = new int[81];
        cells = new int[][] {{0 ,1 ,2 ,9 ,10,11,18,19,20},
                             {3 ,4 ,5 ,12,13,14,21,22,23},
                             {6 ,7 ,8 ,15,16,17,24,25,26},
                             {27,28,29,36,37,38,45,46,47},
                             {30,31,32,39,40,41,48,49,50},
                             {33,34,35,42,43,44,51,52,53},
                             {54,55,56,63,64,65,72,73,74},
                             {57,58,59,66,67,68,75,76,77},
                             {60,61,62,69,70,71,78,79,80}};
    }

    final public long solveSudoku(int[] board, int clue) {

        long t1 = 0,t2 = 0;
        t1 = System.nanoTime();
        System.arraycopy(board, 0, unsolvedBoard, 0, 81);
        System.arraycopy(board, 0, solvedBoard, 0, 81);

        placedNumbers = 0;
        solutionFound = false;
        isEasy = true;
        isImpossible = false;

        for (int[] i : mask) {
            Arrays.fill(i, 0);
        }

        for (boolean[][] i : lineMask) {
            for (boolean[] j : i) {
                Arrays.fill(j, false);
            }
        }

        for (int i = 0; i < 81; i++) {
            if (solvedBoard[i] != -1) {
                put(i, solvedBoard[i]);
                placedNumbers++;
            }
        }

        solve(0, 0);
        t2 = System.nanoTime();
        easySolved += isEasy ? 1 : 0;

        if (solutionFound && placedNumbers == 81) {
            totTime += t2-t1;
            if (shouldPrint || t2-t1 > 5*1_000_000_000L) {
                System.out.print(String.format(
                    "Solution from %2d clues found in %7s", 
                    clue, 
                    printTime(t1, t2)
                ));
                shouldPrint = false;
                if (t2-t1 > 1*1000_000_000L) {
                    System.out.println();
                    display2(board, solvedBoard);
                }
            }
        } else {
            System.out.println("No solution");
            display2(unsolvedBoard, solvedBoard);
            return -1;
        }
        return t2 - t1;
    }

    final private void solve(int v, int vIndex) {

        lineCounters[vIndex] = 0;
        int easyIndex = placeEasy(vIndex);

        if (isImpossible) {
            resetEasy(vIndex, easyIndex);
            resetLineMask(vIndex);
            return;
        }

        if (placedNumbers == 81) {
            solutionFound = true;
            return;
        }
        // if (true) {
            // return;
        // }

        // either get the next empty cell
        // while (v < 81 && solvedBoard[v] >= 0) {
            // v++;
        // }
        // or get the cell with the fewest options
        generateFormattedMasks();
        int minOptions = 9;
        for (int i = 0; i < 81; i++) {
            int options = formattedMask[i] & 0xffff;
            if (options > 0 && options < minOptions) {
                minOptions = options;
                v = i;
            }
            if (options == 0 && solvedBoard[i] == -1) {
                isImpossible = true;
            }
        }
        if (!isImpossible) {
            for (int c = 0; c < 9; c++) {
                if (isPossible(v, c)) {
                    isEasy = false;
                    put(v, c);
                    placedNumbers++;
                    solve(v + 1, vIndex + 1); 
                    if (solutionFound) {
                        return;
                    }
                    unput(v, c);
                    placedNumbers--;
                }
            }
        }
        resetEasy(vIndex, easyIndex);
        resetLineMask(vIndex);
    }

    final private void resetEasy(int vIndex, int easyIndex) {
        for (int i = 0; i < easyIndex; i++) {
            int tempv2 = placedMask[vIndex][i];
            int c2 = solvedBoard[tempv2];
            unput(tempv2, c2);
            placedNumbers--;
        }
    }

    final private void resetLineMask(int vIndex) {
        if (lineCounters[vIndex] > 0) {
            for (int i = 0; i < 81; i++) {
                for (int c = 0; c < 9; c++) {
                    if (lineMask[vIndex][i][c]) {
                        enable(i, c);
                        lineMask[vIndex][i][c] = false;
                    }
                }
            }
        }       
        isImpossible = false;
    }

    final private int placeEasy(int vIndex) {
        int easyIndex = 0;
        int lastPlaced = 0, tempPlaced = 0, easyplaced = 0;
        int iter = 0;
        while (placedNumbers > lastPlaced+1) {
            lastPlaced = placedNumbers;
            tempPlaced = 0;
            while (placedNumbers > tempPlaced + 5) {
                tempPlaced = placedNumbers;
                easyIndex = placeNakedSingles(vIndex, easyIndex);
                if (isImpossible) {
                    return easyIndex;
                }
            }

            tempPlaced = 0;
            while (placedNumbers < 55*1 && placedNumbers > tempPlaced + 2) {
                tempPlaced = placedNumbers;
                easyIndex = placeHiddenSingles(vIndex, easyIndex);
                if (isImpossible) {
                    return easyIndex;
                }
            }

            tempPlaced = 0;
            while (placedNumbers < 65*1 && placedNumbers > tempPlaced + 1) {
                tempPlaced = placedNumbers;
                easyIndex = placeNakedSingles(vIndex, easyIndex);
                if (isImpossible) {
                    return easyIndex;
                }
            }

            if (iter < 2 && placedNumbers < 55*1) {
                checkNakedTriples(vIndex);
            }
            if (placedNumbers < 45*1) {
                checkNakedDoubles(vIndex);
                identifyLines(vIndex);
            }
            iter++;
        }
        return easyIndex;
    }

    final private int placeNakedSingles(int vIndex, int easyIndex) {
        generateFormattedMasks();
        for (int tempv = 0; tempv < 81; tempv++) {
            int possibilities = formattedMask[tempv];
            if ((possibilities & 0xffff) == 1) {
                possibilities >>= 16;
                int c = 0;
                while ((possibilities & 1) == 0) {
                    possibilities >>= 1;
                    c++;
                }
                if (isPossible(tempv, c)) {
                    put(tempv, c);
                    placedMask[vIndex][easyIndex++] = tempv;
                    placedNumbers++;
                } else {
                    isImpossible = true;
                    return easyIndex;
                }
            } else if (possibilities == 0 && solvedBoard[tempv] == -1) {
                isImpossible = true;
                return easyIndex;
            }
        }
        return easyIndex;
    }


    final private int placeHiddenSingles(int vIndex, int easyIndex) {
        for (int[] i : sectionCounters) {
            Arrays.fill(i, 0);
        }

        for (int c = 0; c < 9; c++) {
            for (int v = 0; v < 81; v++) {
                if (isPossible(v, c)) {
                    int cell = 3 * (v / 27) + ((v / 3) % 3);
                    sectionCounters[c][v / 9]++;
                    sectionCounters[c][9 + (v % 9)]++;
                    sectionCounters[c][18 + cell]++;
                    sectionMask[c][v / 9] = v;
                    sectionMask[c][9 + (v % 9)] = v;
                    sectionMask[c][18 + cell] = v;
                }
            }

            int v;

            for (int i = 0; i < 9; i++) {
                if (sectionCounters[c][i] == 1) {
                    v = sectionMask[c][i];
                    if (isPossible(v, c)) {
                        put(v, c);
                        placedMask[vIndex][easyIndex++] = v;
                        placedNumbers++;
                        int cell = 3 * (v / 27) + ((v / 3) % 3);
                        sectionCounters[c][9 + (v%9)] = 9;
                        sectionCounters[c][18 + cell] = 9;
                    } else {
                        isImpossible = true;
                        return easyIndex;
                    }
                }
            }

            for (int i = 9; i < 18; i++) {
                if (sectionCounters[c][i] == 1) {
                    v = sectionMask[c][i];
                    if (isPossible(v, c)) {
                        put(v, c);
                        placedMask[vIndex][easyIndex++] = v;
                        int cell = 3 * (v / 27) + ((v / 3) % 3);
                        placedNumbers++;
                        sectionCounters[c][18 + cell]++;
                    } else {
                        isImpossible = true;
                        return easyIndex;
                    }
                }
            }


            for (int i = 18; i < 27; i++) {
                if (sectionCounters[c][i] == 1) {
                    v = sectionMask[c][i];
                    if (isPossible(v, c)) {
                        put(v, c);
                        placedMask[vIndex][easyIndex++] = v;
                        placedNumbers++;
                    } else {
                        isImpossible = true;
                        return easyIndex;
                    }
                }
            }

        }
        return easyIndex;
    }

    final private int getFormattedMask(int v) {
        if (solvedBoard[v] >= 0) {
            return 0;
        }
        int x = 0;
        int y = 0;
        for (int c = 8; c >= 0; c--) {
            x <<= 1;
            x += mask[v][c] == 0 ? 1 : 0;
            y += mask[v][c] == 0 ? 1 : 0;
        }
        x <<= 16;
        return x + y;
    }

    final private int getCachedMask(int v) {
        return formattedMask[v];
    }

    final private void generateFormattedMasks() {
        for (int i = 0; i < 81; i++) {
            formattedMask[i] = getFormattedMask(i);
        }
    }

    final private void generateFormattedMasks(int[] idxs) {
        for (int i : idxs) {
            formattedMask[i] = getFormattedMask(i);
        }
    }


    final private void checkNakedDoubles(int vIndex) {
        generateFormattedMasks();
        for (int i = 0; i < 81; i++) {
            int bitmask = formattedMask[i];
            if ((bitmask & 0xffff) == 2) {
                for (int j = i+1; j < (i/9+1)*9; j++) {
                    int bitmask_j = formattedMask[j];
                    if (bitmask == bitmask_j) {
                        bitmask >>= 16;
                        int c0, c1, k = 0;
                        while ((bitmask & 1) == 0) {
                            k++;
                            bitmask >>= 1;
                        }
                        c0 = k;
                        bitmask >>= 1;
                        k++;
                        while ((bitmask & 1) == 0) {
                            k++;
                            bitmask >>= 1;
                        }
                        c1 = k;
                        for (int cell = (i/9)*9; cell < (i/9+1)*9; cell++) {
                            if (cell != i && cell != j) {
                                if (!lineMask[vIndex][cell][c0]) {
                                    disable(cell, c0);
                                    lineMask[vIndex][cell][c0] = true;
                                    lineCounters[vIndex]++;
                                }
                                if (!lineMask[vIndex][cell][c1]) {
                                    disable(cell, c1);
                                    lineMask[vIndex][cell][c1] = true;
                                    lineCounters[vIndex]++;
                                }
                            }
                        }
                    }
                }
            }
        }

        for (int idx = 0; idx < 81; idx++) {
            int i = (idx%9)*9 + idx/9;
            int bitmask = formattedMask[i];
            if ((bitmask & 0xffff) == 2) {
                for (int j = i+9; j < 81; j += 9) {
                    int bitmask_j = formattedMask[j];
                    if (bitmask == bitmask_j) {
                        bitmask >>= 16;
                        int c0, c1, k = 0;
                        while ((bitmask & 1) == 0) {
                            k++;
                            bitmask >>= 1;
                        }
                        c0 = k;
                        bitmask >>= 1;
                        k++;
                        while ((bitmask & 1) == 0) {
                            k++;
                            bitmask >>= 1;
                        }
                        c1 = k;
                        for (int cell = i % 9; cell < 81; cell += 9) {
                            if (cell != i && cell != j) {
                                if (!lineMask[vIndex][cell][c0]) {
                                    disable(cell, c0);
                                    lineMask[vIndex][cell][c0] = true;
                                    lineCounters[vIndex]++;
                                }
                                if (!lineMask[vIndex][cell][c1]) {
                                    disable(cell, c1);
                                    lineMask[vIndex][cell][c1] = true;
                                    lineCounters[vIndex]++;
                                }
                            }
                        }
                    }
                }
            }
        }

        for (int idx = 0; idx < 9; idx++) {
            for (int i = 0; i < 9; i++) {
                int bitmask = formattedMask[cells[idx][i]];
                if ((bitmask & 0xffff) == 2) {
                    for (int j = i+1; j < 9; j++) {
                        int bitmask_j = formattedMask[cells[idx][j]];
                        if (bitmask == bitmask_j) {
                            bitmask >>= 16;
                            int c0, c1, k = 0;
                            while ((bitmask & 1) == 0) {
                                k++;
                                bitmask >>= 1;
                            }
                            c0 = k;
                            bitmask >>= 1;
                            k++;
                            while ((bitmask & 1) == 0) {
                                k++;
                                bitmask >>= 1;
                            }
                            c1 = k;
                            for (int cellIdx = 0; cellIdx < 9; cellIdx++) {
                                if (cellIdx != i && cellIdx != j) {
                                    int cell = cells[idx][cellIdx];
                                    if (!lineMask[vIndex][cell][c0]) {
                                        disable(cell, c0);
                                        lineMask[vIndex][cell][c0] = true;
                                        lineCounters[vIndex]++;
                                    }
                                    if (!lineMask[vIndex][cell][c1]) {
                                        disable(cell, c1);
                                        lineMask[vIndex][cell][c1] = true;
                                        lineCounters[vIndex]++;
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }

    final private void checkNakedTriples(int vIndex) {

        generateFormattedMasks();

        for (int i = 0; i < 81; i++) {
            int bitmask = formattedMask[i];
            if ((bitmask & 0xffff) == 3) {
                for (int j = i+1; j < (i/9+1)*9; j++) {
                    int bitmask_j = formattedMask[j];
                    if (bitmask_j > 0 && bitmask == (bitmask | bitmask_j)) {
                        for (int k = j+1; k < (i/9+1)*9; k++) {
                            int bitmask_k = formattedMask[k];
                            if (bitmask_k > 0 && bitmask == (bitmask | bitmask_k)) {

                                int bitmask_shifted = bitmask >> 16;
                                int c0, c1, c2, l = 0;
                                while ((bitmask_shifted & 1) == 0) {
                                    l++;
                                    bitmask_shifted >>= 1;
                                }
                                c0 = l;
                                bitmask_shifted >>= 1;
                                l++;
                                while ((bitmask_shifted & 1) == 0) {
                                    l++;
                                    bitmask_shifted >>= 1;
                                }
                                c1 = l;
                                bitmask_shifted >>= 1;
                                l++;
                                while ((bitmask_shifted & 1) == 0) {
                                    l++;
                                    bitmask_shifted >>= 1;
                                }
                                c2 = l;
                                for (int cell = (i/9)*9; cell < (i/9+1)*9; cell++) {
                                    if (cell != i && cell != j && cell != k) {
                                        if (!lineMask[vIndex][cell][c0]) {
                                            disable(cell, c0);
                                            lineMask[vIndex][cell][c0] = true;
                                            lineCounters[vIndex]++;
                                        }
                                        if (!lineMask[vIndex][cell][c1]) {
                                            disable(cell, c1);
                                            lineMask[vIndex][cell][c1] = true;
                                            lineCounters[vIndex]++;
                                        }
                                        if (!lineMask[vIndex][cell][c2]) {
                                            disable(cell, c2);
                                            lineMask[vIndex][cell][c2] = true;
                                            lineCounters[vIndex]++;
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }

        for (int idx = 0; idx < 81; idx++) {
            int i = (idx%9)*9 + idx/9;
            int bitmask = formattedMask[i];
            if ((bitmask & 0xffff) == 3) {
                for (int j = i+9; j < 81; j += 9) {
                    int bitmask_j = formattedMask[j];
                    if (bitmask_j > 0 && bitmask == (bitmask | bitmask_j)) {
                        for (int k = j+9; k < 81; k += 9) {
                            int bitmask_k = formattedMask[k];
                            if (bitmask_k > 0 && bitmask == (bitmask | bitmask_k)) {

                                int bitmask_shifted = bitmask >> 16;
                                int c0, c1, c2, l = 0;
                                while ((bitmask_shifted & 1) == 0) {
                                    l++;
                                    bitmask_shifted >>= 1;
                                }
                                c0 = l;
                                bitmask_shifted >>= 1;
                                l++;
                                while ((bitmask_shifted & 1) == 0) {
                                    l++;
                                    bitmask_shifted >>= 1;
                                }
                                c1 = l;
                                bitmask_shifted >>= 1;
                                l++;
                                while ((bitmask_shifted & 1) == 0) {
                                    l++;
                                    bitmask_shifted >>= 1;
                                }
                                c2 = l;
                                for (int cell = i%9; cell < 81; cell += 9) {
                                    if (cell != i && cell != j && cell != k) {
                                        if (!lineMask[vIndex][cell][c0]) {
                                            disable(cell, c0);
                                            lineMask[vIndex][cell][c0] = true;
                                            lineCounters[vIndex]++;
                                        }
                                        if (!lineMask[vIndex][cell][c1]) {
                                            disable(cell, c1);
                                            lineMask[vIndex][cell][c1] = true;
                                            lineCounters[vIndex]++;
                                        }
                                        if (!lineMask[vIndex][cell][c2]) {
                                            disable(cell, c2);
                                            lineMask[vIndex][cell][c2] = true;
                                            lineCounters[vIndex]++;
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }

        for (int idx = 0; idx < 9; idx++) {
            for (int i = 0; i < 9; i++) {
                int bitmask = formattedMask[cells[idx][i]];
                if ((bitmask & 0xffff) == 3) {
                    for (int j = i+1; j < 9; j++) {
                        int bitmask_j = formattedMask[cells[idx][j]];
                        if (bitmask_j > 0 && bitmask == (bitmask | bitmask_j)) {
                            for (int k = j+1; k < 9; k++) {
                                int bitmask_k = formattedMask[cells[idx][k]];
                                if (bitmask_k > 0 && bitmask == (bitmask | bitmask_k)) {

                                    int bitmask_shifted = bitmask >> 16;
                                    int c0, c1, c2, l = 0;
                                    while ((bitmask_shifted & 1) == 0) {
                                        l++;
                                        bitmask_shifted >>= 1;
                                    }
                                    c0 = l;
                                    bitmask_shifted >>= 1;
                                    l++;
                                    while ((bitmask_shifted & 1) == 0) {
                                        l++;
                                        bitmask_shifted >>= 1;
                                    }
                                    c1 = l;
                                    bitmask_shifted >>= 1;
                                    l++;
                                    while ((bitmask_shifted & 1) == 0) {
                                        l++;
                                        bitmask_shifted >>= 1;
                                    }
                                    c2 = l;
                                    for (int cellIdx = 0; cellIdx < 9; cellIdx++) {
                                        if (cellIdx != i && cellIdx != j && cellIdx != k) {
                                            int cell = cells[idx][cellIdx];
                                            if (!lineMask[vIndex][cell][c0]) {
                                                disable(cell, c0);
                                                lineMask[vIndex][cell][c0] = true;
                                                lineCounters[vIndex]++;
                                            }
                                            if (!lineMask[vIndex][cell][c1]) {
                                                disable(cell, c1);
                                                lineMask[vIndex][cell][c1] = true;
                                                lineCounters[vIndex]++;
                                            }
                                            if (!lineMask[vIndex][cell][c2]) {
                                                disable(cell, c2);
                                                lineMask[vIndex][cell][c2] = true;
                                                lineCounters[vIndex]++;
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }

    }

    final private void identifyLines(int vIndex) {

        int disabledLines = 0;
        int[][] tempRowMask = new int[3][9];
        int[][] tempColMask = new int[3][9];
        for (int i = 0; i < 9; i++) {
            for (int c = 0; c < 9; c++) {
                for (int j = 0; j < 3; j++) {
                    tempRowMask[j][c] = 0;
                    tempColMask[j][c] = 0;
                }
                for (int j = 0; j < 9; j++) {
                    if (mask[cells[i][j]][c] == 0) {
                        tempRowMask[j/3][c]++;
                        tempColMask[j%3][c]++;
                    }
                }

                int rowCount = 0;
                int colCount = 0;
                int rowIdx = -1, colIdx = -1;
                for (int j = 0; j < 3; j++) {
                    if (tempRowMask[j][c] > 0) {
                        rowCount++;
                        rowIdx = j;
                    }
                    if (tempColMask[j][c] > 0) {
                        colCount++;
                        colIdx = j;
                    }
                }
                if (rowCount == 1) {
                    for (int j = (i/3)*3; j < (i/3 + 1)*3; j++) {
                        if (j != i) {
                            for (int k = rowIdx*3; k < (rowIdx+1)*3; k++) {
                                int cell = cells[j][k];
                                if (!lineMask[vIndex][cell][c]) {
                                    disable(cell, c);
                                    lineMask[vIndex][cell][c] = true;
                                    lineCounters[vIndex]++;
                                }
                            }
                        }
                    }

                }
                if (colCount == 1) {
                    for (int j = i % 3; j < 9; j += 3) {
                        if (j != i) {
                            for (int k = colIdx; k < 9; k += 3) {
                                int cell = cells[j][k];
                                if (!lineMask[vIndex][cell][c]) {
                                    disable(cell, c);
                                    lineMask[vIndex][cell][c] = true;
                                    lineCounters[vIndex]++;
                                }
                            }
                        }
                    }
                }
            }
        }
    }

    final private boolean isPossible(int v, int c) {
        return mask[v][c] == 0;
    }

    final private int checkMask(int[][] neighbors, int v, int c) {
        int tempValue = 0;
        for (int n : neighbors[v]) {
            if (mask[n][c] > 0) {
                tempValue++;
            }
        }
        return tempValue;
    }

    final private void put(int v, int c) {
        solvedBoard[v] = c;
        for (int i : neighbors[v]) {
            mask[i][c]++;
        }
        for (int i = 0; i < 9; i++) {
            mask[v][i]++;
        }
    }

    final private void disable(int v, int c) {
        mask[v][c]++;
    }

    final private void unput(int v, int c) {
        solvedBoard[v] = -1;
        for (int i : neighbors[v]) {
            mask[i][c]--;
        }
        for (int i = 0; i < 9; i++) {
            mask[v][i]--;
        }       
    }

    final private void enable(int v, int c) {
        // enables++;
        mask[v][c]--;
    }

    public String getString(int[] board) {
        StringBuilder s = new StringBuilder();
        for (int i : board) {
            s.append(i+1);
        }
        return s.toString();
    }

    public long getTime() {
        return totTime;
    }

    public static String printTime(long t1, long t2) {
        String unit = " ns";
        if (t2-t1 > 10000) {
            unit = " us";
            t1 /= 1000; t2 /= 1000;
        }
        if (t2-t1 > 10000) {
            unit = " ms";
            t1 /= 1000; t2 /= 1000;
        }
        if (t2-t1 > 10000) {
            unit = " seconds";
            t1 /= 1000; t2 /= 1000;
        }
        return (t2-t1) + unit;
    }

    public void display(int[] board) {

        for (int i = 0; i < 9; i++) {
            if (i % 3 == 0) {
                System.out.println("+-----+-----+-----+");
            }
            for (int j = 0; j < 9; j++) {
                if (j % 3 == 0) {
                    System.out.print("|");
                } else {
                    System.out.print(" ");
                }
                if (board[i*9+j] != -1) {
                    System.out.print(board[i*9+j]+1);
                } else {
                    System.out.print(" ");
                }
            }
            System.out.println("|");
        }
        System.out.println("+-----+-----+-----+");
    }

    public void display2(int[] board, int[] solved) {

        for (int i = 0; i < 9; i++) {
            if (i % 3 == 0) {
                System.out.println("+-----+-----+-----+  +-----+-----+-----+");
            }
            for (int j = 0; j < 9; j++) {
                if (j % 3 == 0) {
                    System.out.print("|");
                } else {
                    System.out.print(" ");
                }
                if (board[i*9+j] != -1) {
                    System.out.print(board[i*9+j]+1);
                } else {
                    System.out.print(" ");
                }
            }

            System.out.print("|  ");

            for (int j = 0; j < 9; j++) {
                if (j % 3 == 0) {
                    System.out.print("|");
                } else {
                    System.out.print(" ");
                }
                if (solved[i*9+j] != -1) {
                    System.out.print(solved[i*9+j]+1);
                } else {
                    System.out.print(" ");
                }
            }

            System.out.println("|");
        }
        System.out.println("+-----+-----+-----+  +-----+-----+-----+");
    }

    private boolean contains(int[] a, int v) {
        for (int i : a) {
            if (i == v) {
                return true;
            }
        }
        return false;
    }

    public void connect() {
        for (int i = 0; i < 81; i++) {
            for (int j = 0; j < 20; j++) {
                neighbors[i][j] = -1;
            }
        }
        int[] n_count = new int[81];

        HashMap<Integer,ArrayList<Integer>> map 
            = new HashMap<Integer,ArrayList<Integer>>();

        for (int[] c: cells) {
            ArrayList<Integer> temp = new ArrayList<Integer>();
            for (int v : c) {
                temp.add(v);
            }
            for (int v : c) {
                map.put(v,temp);
            }
        }

        for (int i = 0; i < 81; i++) {
            for (int j = (i/9)*9; j < (i/9)*9 + 9; j++) {
                if (i != j) {
                    neighbors[i][n_count[i]++] = j;
                }
            }
            for (int j = i%9; j < 81; j += 9) {
                if (i != j) {
                    neighbors[i][n_count[i]++] = j;
                }
            }
            for (int j : map.get(i)) {
                if (i != j) {
                    if (!contains(neighbors[i], j)) {
                        neighbors[i][n_count[i]++] = j;
                    }
                }
            }
        }
    }

    public static int[][] getInput(String filename) {
        int[][] boards;
        try (BufferedInputStream in = new BufferedInputStream(
            new FileInputStream(filename))) {

            BufferedReader r = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8));
            int n = Integer.valueOf(r.readLine());
            boards = new int[n][81];
            clues = new int[n];
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < 81; j++) {
                    int x = r.read();
                    boards[i][j] = x - 49;
                    clues[i] += x > 48 ? 1 : 0;
                }
                r.read();
            }
            r.close();
        } catch (IOException ex) {
            throw new RuntimeException(ex);
        }
        return boards;
    }

    private int getTotEasy() {
        return totEasy;
    }

    public String getSolution() {
        StringBuilder s = new StringBuilder(256);
        for (int i : unsolvedBoard) {
            s.append(i+1);
        }
        s.append(",");
        for (int i : solvedBoard) {
            s.append(i+1);
        }
        return s.toString();
    }

    public static void main (String[] args) {
        long t0 = System.nanoTime();
        Sudoku gc = new Sudoku();
        File f;
        PrintWriter p;
        try {
            f = new File("sudoku_output.txt");
            p = new PrintWriter(f);
        } catch (Exception e) {
            return;
        }
        if (args.length != 1) {
            System.out.println("Usage: java Sudoku <input_file>");
            return;
        }
        int[][] boards = gc.getInput(args[0]);
        long tinp = System.nanoTime();
        gc.connect();
        long t1 = System.nanoTime();
        p.println(boards.length);

        long maxSolveTime = 0;
        int maxSolveIndex = 0;
        long[] solveTimes = new long[boards.length];
        for (int i = 0; i < boards.length; i++) {
            long tempTime = System.nanoTime();
            if (tempTime - gc.lastPrint > 200_000_000 
                || i == boards.length - 1) {

                gc.shouldPrint = true;
                gc.lastPrint = tempTime;
                System.out.print(String.format(
                    "\r(%7d/%7d) ", i+1, boards.length));
            }
            long elapsed = gc.solveSudoku(boards[i], gc.clues[i]);
            if (elapsed == -1) {
                System.out.println("Impossible: " + i);
            }
            if (elapsed > maxSolveTime) {
                maxSolveTime = elapsed;
                maxSolveIndex = i;
            }
            solveTimes[i] = elapsed;
            p.println(gc.getSolution());
            // break;
        }

        p.close();
        long t2 = System.nanoTime();
        Arrays.sort(solveTimes);
        System.out.println();
        System.out.println("Median solve time: " 
            + gc.printTime(0, solveTimes[boards.length/2]));
        System.out.println("Longest solve time: " 
            + gc.printTime(0, maxSolveTime) + " for board " + maxSolveIndex);
        gc.display(boards[maxSolveIndex]);
        System.out.println();

        System.out.println("Total time (including prints): " 
            + gc.printTime(t0,t2));
        System.out.println("Sudoku solving time: " 
            + gc.printTime(0,gc.getTime()));
        System.out.println("Average time per board: " 
            + gc.printTime(0,gc.getTime()/boards.length));
        System.out.println("Number of one-choice digits per board: " 
            + String.format("%.2f", gc.getTotEasy()/(double)boards.length));  
        System.out.println("Easily solvable boards: " + gc.easySolved);
        System.out.println("\nInput time: " + gc.printTime(t0,tinp));
        System.out.println("Connect time: " + gc.printTime(tinp,t1));
        try {
            Thread.sleep(10000);
        } catch (InterruptedException e) {

        }
    }
}
\$\endgroup\$
1
  • \$\begingroup\$ I am am not mistaken you should save some time translating those jagged arrays into 2D arrays. \$\endgroup\$
    – CSharpie
    Commented Dec 21, 2019 at 13:14
3
\$\begingroup\$

Python3 - 2min 1.007s official score

It takes around 100 seconds on my i5-9400F

import copy

SUDOKU_VALUES = [1, 2, 4, 8, 16, 32, 64, 128, 256]
SUDOKU_MAX = 511
OPTION_COUNT_CACHE = [
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, 1, 2, 2, 3, 2, 3, 3, 4, 2,
    3, 3, 4, 3, 4, 4, 5, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3,
    3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3,
    4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4,
    3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5,
    6, 6, 7, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4,
    4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5,
    6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4, 4, 5,
    3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3,
    4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6,
    6, 7, 6, 7, 7, 8, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3,
    4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5,
    4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3,
    4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6,
    6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6,
    7, 5, 6, 6, 7, 6, 7, 7, 8, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4,
    5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 3, 4,
    4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6,
    7, 6, 7, 7, 8, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, 5, 6, 6, 7,
    6, 7, 7, 8, 6, 7, 7, 8, 7, 8, 8, 9
    ] # Basically just .count_ones()


class SudokuEmpty:
    def __init__(self):
        self.data = list(range(81))
        self.pos = 81

    def remove(self, index):
        self.pos -= 1
        data = self.data
        data[index], data[self.pos] = data[self.pos], data[index]


class Solver:
    def __init__(self, sudoku):
        self.to_explore = SudokuEmpty()
        self.options = [SUDOKU_MAX for _ in range(81)]
        for (i, item) in enumerate(sudoku):
            if item != 0:
                self.options[i] = SUDOKU_VALUES[item - 1]
                self.apply_number(i)

    def hidden_singles(self, square):
        options = self.options
        value = options[square]
        options[square] = 0
        row_start = square // 9 * 9
        column_start = square % 9
        box_start = square // 3 % 3 * 3 + square // 27 * 27
        needed = (SUDOKU_MAX
                  - ((options[row_start + 8]
                      | options[row_start + 7]
                      | options[row_start + 6]
                      | options[row_start + 5]
                      | options[row_start + 4]
                      | options[row_start + 3]
                      | options[row_start + 2]
                      | options[row_start + 1]
                      | options[row_start])
                     & (options[column_start + 72]
                        | options[column_start + 63]
                        | options[column_start + 54]
                        | options[column_start + 45]
                        | options[column_start + 36]
                        | options[column_start + 27]
                        | options[column_start + 18]
                        | options[column_start + 9]
                        | options[column_start])
                     & (options[box_start + 20]
                        | options[box_start + 19]
                        | options[box_start + 18]
                        | options[box_start + 11]
                        | options[box_start + 10]
                        | options[box_start + 9]
                        | options[box_start + 2]
                        | options[box_start + 1]
                        | options[box_start])))
        option_count = OPTION_COUNT_CACHE[needed]
        if option_count == 0:
            self.options[square] = value
            return True
        elif option_count == 1:
            if value & needed != 0:
                self.options[square] = value & needed
                return True
            else:
                return False
        else:
            return False

    def apply_number(self, square):
        options = self.options
        value = options[square]
        not_value = SUDOKU_MAX - value
        column_start = square % 9
        row_start = square - column_start
        box_start = square // 3 % 3 * 3 + square // 27 * 27
        options[row_start + 8] &= not_value
        options[row_start + 7] &= not_value
        options[row_start + 6] &= not_value
        options[row_start + 5] &= not_value
        options[row_start + 4] &= not_value
        options[row_start + 3] &= not_value
        options[row_start + 2] &= not_value
        options[row_start + 1] &= not_value
        options[row_start] &= not_value

        options[column_start + 72] &= not_value
        options[column_start + 63] &= not_value
        options[column_start + 54] &= not_value
        options[column_start + 45] &= not_value
        options[column_start + 36] &= not_value
        options[column_start + 27] &= not_value
        options[column_start + 18] &= not_value
        options[column_start + 9] &= not_value
        options[column_start] &= not_value

        options[box_start + 20] &= not_value
        options[box_start + 19] &= not_value
        options[box_start + 18] &= not_value
        options[box_start + 11] &= not_value
        options[box_start + 10] &= not_value
        options[box_start + 9] &= not_value
        options[box_start + 2] &= not_value
        options[box_start + 1] &= not_value
        options[box_start] &= not_value
        options[square] = value

    def process(self, routes):
        values = []
        while 1:
            min_length = 20
            min_pos = 0
            min_pos_x = 0
            x = 0
            while x < self.to_explore.pos:
                pos = self.to_explore.data[x]
                if not self.hidden_singles(pos):
                    return False
                option = self.options[pos]
                length = OPTION_COUNT_CACHE[option]
                if length < min_length:
                    if length == 0:
                        return False
                    elif length == 1:
                        for (i, item) in enumerate(SUDOKU_VALUES):
                            if option == item:
                                self.apply_number(pos)
                                self.to_explore.remove(x)
                                break
                    else:
                        min_length = length
                        min_pos = pos
                        min_pos_x = x
                        x += 1

                else:
                    x += 1

            if min_length != 20:
                values.clear()
                options = self.options[min_pos]
                for (i, item) in enumerate(SUDOKU_VALUES):
                    if options & item != 0:
                        values.append(i + 1)
                if not values:
                    return False

                self.to_explore.remove(min_pos_x)
                item = values.pop()
                for value in values:
                    clone = copy.deepcopy(self)
                    clone.options[min_pos] = SUDOKU_VALUES[value - 1]
                    clone.apply_number(min_pos)
                    routes.append(clone)
                self.options[min_pos] = SUDOKU_VALUES[item - 1]
                self.apply_number(min_pos)
            else:
                return True

    def get_result(self):
        solution = [0 for _ in range(81)]
        for (i, option) in enumerate(self.options):
            for (x, value) in enumerate(SUDOKU_VALUES):
                if option == value:
                    solution[i] = x + 1
                    break
        return solution


def solve(sudoku):
    routes = [Solver(sudoku)]
    while routes:
        route = routes.pop()
        result = route.process(routes)
        if result:
            return route.get_result()
    raise Exception("Empty routes, but still unsolved")

if __name__ == '__main__':
    with open('all_17_clue_sudokus.txt') as file:
        sudokus = file.read().splitlines()
        print(sudokus[0])
    for sudoku in sudokus[1:]:
        solution = ''.join(map(str, solve(list(map(int, sudoku)))))
        print('%s,%s' % (sudoku, solution))

The path to the sudokus is hardcoded, it has to be all_17_clue_sudokus.txt

To run

time python3 lib.py > output
sha256sum output
\$\endgroup\$
1
  • \$\begingroup\$ It was a while since I ran these benchmarks, but if I have some time this weekend I'll try to run yours and give it a score \$\endgroup\$
    – maxb
    Commented Jun 12, 2020 at 7:50
2
\$\begingroup\$

C - 12min 28.374s official score

runs for about 30m 15m on my i5-7200U and produces the correct md5 hash

#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include<sys/time.h>
#define B break
#define O const
#define P printf
#define R return
#define S static
#define $(x,y...)  if(x){y;}
#define E(x...)    else{x;}
#define W(x,y...)  while(x){y;}
#define fi(x,y...) for(I i=0,_n=(x);i<_n;i++){y;}
#define fj(x,y...) for(I j=0,_n=(x);j<_n;j++){y;}
typedef void V;typedef char C;typedef short H;typedef int I;typedef long long L;
S C h[81][20]; //h[i][0],h[i][1],..,h[i][19] are the squares that clash with square i
S H a[81]      //a[i]: bitmask of possible choices; initially one of 1<<0, 1<<1 .. 1<<8, or 511 (i.e. nine bits set)
   ,b[81];     //b[i]: negated bitmask of impossible chioces; once we know square i has value v, b[i] becomes ~(1<<v)
S I f(){ //f:recursive solver
 I p=-1; //keep track of the popcount (number of 1 bits) in a
 W(1,I q=0;                                         //simple non-recursive deductions:
     fi(81,fj(20,a[i]&=b[h[i][j]])                  // a[i] must not share bits with its clashing squares
           $(!(a[i]&a[i]-1),$(!a[i],R 0)b[i]=~a[i]) // if a[i] has one bit left, update b[i].  if a[i]=0, we have a contradiction
           q+=__builtin_popcount(a[i]))             // compute new popcount
     $(p==q,B)p=q;)                                 // if the popcount of a[] changed, try to do more deductions
 I k=-1,mc=10;fi(81,$(b[i]==-1,I c=__builtin_popcount(a[i]);$(c<mc,k=i;mc=c;$(c==2,B)))) //find square with fewest options left
 $(k==-1,R 1) //if there isn't any such, we're done - success! otherwise k is that square
 fi(9,$(a[k]&1<<i,H a0[81],b0[81];                                        //try different values for square k
                  memcpy(a0,a,81*sizeof(*a));memcpy(b0,b,81*sizeof(*b));  // save a and b
                  a[k]=1<<i;b[k]=~a[k];$(f(),R 1)                         // set square k and make a recursive call
                  memcpy(a,a0,81*sizeof(*a));memcpy(b,b0,81*sizeof(*b)))) // restore a and b
 R 0;}
S L tm(){struct timeval t;gettimeofday(&t,0);R t.tv_sec*1000000+t.tv_usec;} //current time in microseconds
I main(){L t=0;I n;scanf("%d",&n);P("%d\n",n);
 fi(81,L l=0;fj(81,$(i!=j&&(i%9==j%9||i/9==j/9||(i/27==j/27&&i%9/3==j%9/3)),h[i][l++]=j))) //precompute h
 fi(n,S C s[82];scanf("%s",s);printf("%s,",s);                        //i/o and loop over puzzles
      fj(81,a[j]=s[j]=='0'?511:1<<(s[j]-'1');b[j]=s[j]=='0'?-1:~a[j]) //represent '1' .. '9' as 1<<0 .. 1<<8, and 0 as 511
      t-=tm();I r=f();t+=tm();                                        //measure time only for the solving function
      $(!r,P("can't solve\n");exit(1))                                //shouldn't happen
      fj(81,s[j]=a[j]&a[j]-1?'0':'1'+__builtin_ctz(a[j]))             //1<<0 .. 1<<8 to '1' .. '9'
      P("%s\n",s))                                                    //output
 fflush(stdout);dprintf(2,"time:%lld microseconds\n",t);R 0;}         //print self-measured time to stderr so it doesn't affect stdout's md5

compile (preferably with clang v6) and run:

clang -O3 -march=native a.c
time ./a.out <all_17_clue_sudokus.txt | tee o.txt | nl
md5sum o.txt
\$\endgroup\$
6
  • 4
    \$\begingroup\$ Why so ugly? This isn't code-golf! \$\endgroup\$ Commented Aug 23, 2019 at 22:00
  • 4
    \$\begingroup\$ @JonathanAllan that's how i usually code (unless i'm in a team who prefer to do otherwise). it's beautiful :) \$\endgroup\$
    – ngn
    Commented Aug 23, 2019 at 22:01
  • 1
    \$\begingroup\$ Haha, "beautiful", and easy to come back to in 6 months :p \$\endgroup\$ Commented Aug 23, 2019 at 22:06
  • 1
    \$\begingroup\$ yes, actually. i've been doing this for a couple of years and i find it more efficient. in the apl world it's known as incunabulum style. with bloated code you move your eyes mostly vertically (unnatural and unfit for our landscape monitors) and scroll a lot. with tight code you can see all of it at once, so it's easier to find your way around it and to judge its complexity at a glance. \$\endgroup\$
    – ngn
    Commented Aug 23, 2019 at 22:13
  • \$\begingroup\$ Is it a backtracking solution? I see two memcpy in there and some recursion going on. I'll try to verify it today. \$\endgroup\$
    – maxb
    Commented Aug 24, 2019 at 6:34
2
\$\begingroup\$

Typescript(Deno) - 22min 12s unofficial score

on an Lenovo G50-45 Laptop with 4cores and 4threads @1.5GHz

reads from all_17_clue_sudokus.txt or first argument

outputs to output.txt or second argument

download and run with deno run --allow-all https://raw.githubusercontent.com/RubenArtmann/sudoku-solver/master/mod.ts with Deno installed

github page

solver.ts

const parse = (string: string)=>{
    let sudoku = new Uint16Array(9*9);
    for(let i=0; i<string.length; i++) {
        let n = string.charCodeAt(i)-"1".charCodeAt(0);
        sudoku[i] = n>=0&&n<9?1<<n:511;
    }
    return sudoku;
};
const bitCount = (n:number)=>{
  n = n - ((n >> 1) & 0x55555555)
  n = (n & 0x33333333) + ((n >> 2) & 0x33333333)
  return ((n + (n >> 4) & 0xF0F0F0F) * 0x1010101) >> 24
};

const forced = (sudoku:Uint16Array)=>{
    for(let i=0; i<sudoku.length; i++) {
        let row = i%9;
        let column = Math.floor(i/9);

        // skip if already spread
        if(sudoku[i]&(1<<9)) continue;

        if(bitCount(sudoku[i]&511)===1) {
            for(let j=0; j<9; j++) {
                if(j!==column) sudoku[row+j*9] &= ~(sudoku[i]&511);
                if(j!==row) sudoku[j+column*9] &= ~(sudoku[i]&511);

                let xoff = row-row%3;
                let yoff = column-column%3;
                let x = xoff+j%3;
                let y = yoff+Math.floor(j/3);
                if(x!==row && y!==column) sudoku[x+y*9] &= ~(sudoku[i]&511);

                // set already spread bit
                sudoku[i] |= 1<<9;
            }
        }
    }
};
const dump = (sudoku:Uint16Array)=>{
    return [...sudoku].map(e=>{
        let char = "";
        for(let j=0; j<9; j++) {
            if(e&(1<<j)) char += j+1;
        }
        if(char.length!==1) {
            log(sudoku)
            throw new Error("");
        }
        return char;
    }).join("");
};
const log = (sudoku:Uint16Array)=>{
    for(let column=0; column<9; column++) {
        console.log([...sudoku.slice(column*9,column*9+9)].map((n,row)=>{
            let string = "";
            for(let j=0; j<9; j++) {
                string += sudoku[column*9+row]&(1<<j)?j+1:".";
            }
            return string;
        }).join(" "));
    }
    console.log("")
};

let count = 0;

const solve = (sudoku:Uint16Array)=>{
    let sudokuBuf = new Uint16Array(9*9);
    let stack: Uint16Array[] = [sudoku];
    let stack_length = 1;
    while(stack_length>0) {
        count++;
        // console.log(stack.length)
        stack_length--;
        let sudoku: Uint16Array = stack[stack_length] as Uint16Array;
        stack[stack_length] = sudokuBuf;
        sudokuBuf = sudoku;

        // do forced moves
        forced(sudoku);

        let solved = true;
        let invalid = false;
        for(let i=0; i<sudoku.length; i++) {
            let c = bitCount(sudoku[i]&511);
            if(c!==1) solved = false;
            if(c===0) {
                invalid = true;
                break;
            }
        }
        if(invalid) continue;
        if(solved) {
            // need to apply forced one more time and check for changes / empty cells
            forced(sudoku);
            let solved = true;
            for(let i=0; i<sudoku.length; i++) {
                let c = bitCount(sudoku[i]&511);
                if(c===0) solved = false;
            }
            if(solved) return sudoku;
            continue;
        }

        // choose cell to split on
        let splitIndex = -1;
        for(let i=2; i<=9; i++) {
            for(let j=0; j<sudoku.length; j++) {
                if(bitCount(sudoku[j]&511)===i) {
                    splitIndex = j;
                    i = 1000;
                    break;
                }
            }
        }
        if(splitIndex<0) throw new Error("");

        // branch/split
        for(let i=0; i<9; i++) {
            if(sudoku[splitIndex]&(1<<i)) {
                if(stack[stack_length]===undefined) stack[stack_length] = new Uint16Array(9*9);
                let newsudoku = stack[stack_length];
                newsudoku.set(sudoku,0);
                newsudoku[splitIndex] = 1<<i;
                stack_length++;
            }
        }
        // log(sudoku);
    }
    throw new Error("Out of Options!")
    return new Uint16Array(9*9);
};



const context: Worker = self as any;

context.onmessage = (event)=>{
    let result = dump(solve(parse(event.data.data)));
    // console.log(count)
    context.postMessage({
        data: result,
        callback: event.data.callback
    })
};

mod.ts

let callbacks = new Map();
let workers: any[] = [];
let tasks: {data:any,callback:number}[] = [];

for(let i=0; i<4; i++) {
    let worker = new Worker(new URL("solver.ts", import.meta.url).toString(),{ type: "module" }) as any;
    worker.tasks = 0;
    worker.onmessage = (event:{data:any})=>{
        worker.tasks--;
        callbacks.get(event.data.callback)(event.data.data);
        callbacks.delete(event.data.callback);
    };
    workers.push(worker);
}

const sleep = (time:number)=>{
    return new Promise((resolve)=>{
        setTimeout(resolve,time);
    });
};

const loadBalancer = async()=>{
    while(true) {
        // console.log(workers.map(w=>w.tasks),tasks.length);
        let dispatches = 0;
        for(let i=0; i<workers.length; i++) {
            let worker = workers[i];
            if(worker.tasks>=5||tasks.length<=0) continue;
            if(worker.taks<=0) console.log("running low!")

            let task = tasks.shift();
            if(task === undefined) continue;
            worker.postMessage(task);
            worker.tasks++;
            dispatches++;
        }
        if(dispatches<=0) await sleep(100);
    }
};
loadBalancer();

const dispatch = async(data:any)=>{
    let random = Math.random();

    tasks.push({
        data: data,
        callback: random
    });
    return new Promise((resolve)=>{
        callbacks.set(random,resolve);
    });
};

// dispatch("030000000000001008700580000000024050040873900003600000900000002005002091200000704").then(console.log)
// console.log(workers)

let count = 0;

let file = Deno.readTextFileSync(Deno.args[0]||"./all_17_clue_sudokus.txt");
// let file = Deno.readTextFileSync("./hard_sudokus.txt");
let sudokus = file.split("\n").slice(1);
let promises = sudokus.map(async(e,i)=>{
    let result = await dispatch(e);
    count++;
    console.log(count,"/",sudokus.length);
    console.log(e);
    console.log(result);
    return result;
});

let solutions = await Promise.all(promises);
console.log("done!");
workers.map(w=>w.terminate());

let output = "";
output += `${sudokus.length}\n`;
for(let i=0; i<sudokus.length; i++) {
    output += `${sudokus[i]},${solutions[i]}\n`;
}

Deno.writeTextFileSync(Deno.args[1]||"./output.txt",output);

Deno.exit(0)
\$\endgroup\$
3
  • \$\begingroup\$ Welcome to Code Golf! Nice first answer! \$\endgroup\$ Commented Apr 6, 2021 at 12:29
  • \$\begingroup\$ Welcome to Code Golf! I'm not super familiar with typescript, but I'll try to boot up my old laptop and see if I can get it running! \$\endgroup\$
    – maxb
    Commented Apr 9, 2021 at 8:00
  • 1
    \$\begingroup\$ @maxb it should be as simple as curl -fsSL https://deno.land/x/install/install.sh | sh and deno run --allow-all https://raw.githubusercontent.com/RubenArtmann/sudoku-solver/master/mod.ts \$\endgroup\$ Commented Apr 10, 2021 at 11:58
2
\$\begingroup\$

Rust - 0.150s unofficial score

https://github.com/mstdokumaci/sudokumaci/tree/main/rust

Here is my attempt in Rust. For the list of sudoku puzzles above (all_17_clue), I get ~150ms on a MacBook Pro (16-inch, 2019, 2.3 GHz i9-9880H).

I build the binary by this command: cargo rustc --release -- -C target-cpu=native

I am not an expert on writing optimized code and I am coding in Rust for the first time. I believe the algorithm is at the sweet spot of human solutions and brute force, and it can be further optimized. I would appreciate it if anybody would like to suggest any improvements.

\$\endgroup\$
2
  • \$\begingroup\$ Welcome to Code Golf and Coding Challenges Stack Exchange!, But you are only allowed to post a language per answer, You can split by answers. \$\endgroup\$
    – Fmbalbuena
    Commented Feb 28, 2022 at 13:12
  • \$\begingroup\$ You can include the score, This will help to compare answers. \$\endgroup\$
    – Fmbalbuena
    Commented Feb 28, 2022 at 13:23

Not the answer you're looking for? Browse other questions tagged or ask your own question.