# Desystemize #9

### What do revolutionary new Sudoku techniques teach us about real-world problem solving?

On the heels of Representation and Uncertainty telling us to think more critically about our ontologies, I’d like to share a success story of **ontological remodeling** – that is, using new language to describe an existing thing. The existing thing in this case is the puzzle game Sudoku, which might seem a bit too straightforward to be described in a new way. You’re placing the digits 1 to 9 in a 9x9 grid without repeating a digit in a row, column, or box. There doesn’t seem to be space for unrepresented details in the way we’re interested in.

But here’s a quote from a January 2022 video of a man named Simon solving a Sudoku:

I now understand how to explain what’s going on in the puzzle. What I can tell you, what we’re going to end up with here, is that this digit, this digit, and this digit are a 1, 2, 3 triple. And isn’t it amazing? This, to me, is quite beautiful. I can tell you without fear of contradiction, if I had tried this puzzle two years ago…well, unless there is something else here that I’m not spotting, I might never have spotted that. Yet nowadays, because set equivalence theory has become so ingrained and practiced, it’s a very quick spot. And that being a 1, 2, 3 triple is just remarkably gorgeous. So how do we prove this is a 1, 2, 3 triple…? I mean, this puzzle could appear in the World Puzzle Championship now. There’s no way this could have appeared a few years ago, no one would have been able to do it! But now, that break-in is so well-signaled it’s almost fair!

We haven’t made any breaking new discoveries about the digits 1 to 9. But this set equivalence theory, whatever it is, has changed how Simon views puzzles so much that it’s not even comparable to two years ago. It’s all still just digits in a grid, but describing them differently leads you to a new conclusion. And precisely because Sudoku is so simply defined, the case study of set equivalence theory is worth studying. If we can find hidden secrets *here*, in a 9x9 grid of single digits, we can find them anywhere.

Let’s start by explicating the existing, “normal” ontology for a Sudoku. The rules specify that you have exactly one of each digit from one to nine in every *row, column, *and* box* – so it’s no surprise that these are the usual terms used to describe Sudoku puzzles. For example, you might say “I know that *row 2* needs a 3, and it can only go in to this square,” or *“Box 5 *already has a 9, so the 9 can’t be in any of the other cells in *box 5*.” Because the explicit impact of the rules is based on these sorts of divisions, it’s a very useful ontology to have.

But that’s not the only way you could describe a Sudoku. You could partition the grid into an orange* ostrich head *and a fat green* frog*:

There are things you can say about these shapes. No digit can appear in the *ostrich head* more than three times, because its only in three *boxes* total. A digit that appears on one of the *frog legs* won’t appear on the other, because they’re in the same *row*. While these claims are true, they’re ultimately just pointing back to the words we were comfortable with already. Drawing the *ostrich* and the *frog* didn’t really accomplish anything.

Ontologies can’t exactly be *wrong*, since they’re just methods of description. But it’s pretty obvious that this one isn’t useful. It doesn’t let you state anything new that you couldn’t state using your previous vocabulary. It’s just an unnecessary level of abstraction that has to be peeled away to get to the useful ontology.

Here’s another way you could partition the sudoku: one blue *ring* in the center, and four orange *anchors* in the corners.

Is this any better than our frog and ostrich? Does it let us make any statements that we couldn’t easily make with *row, column, *and* box? *As it happens, yes. You can truthfully make the following statement: “The digits in the *ring* are exactly the same as the digits in the *four anchors*.”

Wait, what?

This surprising fact is called “Phistomofel’s Theorem”, named for the setter who initially popularized it. It’s a specific example of set equivalence theory that helped many people (myself included) understand it. Imagine that the letters 1 to 9 were all on tiles, like in a game of Scrabble. Each row, column, or box has the tiles 1 to 9 once each. Suppose you took the tiles in these two rows and these two boxes and put them in a blue sack:

Row 4, Row 7, Box 4, and Box 6 (image mislabeled) each have a run of the digits 1-9, so in total the blue sack has 36 tiles, four copies of each digit. Now, take a clone of your puzzle. This time, put these four columns in an orange sack:

Since both sacks have 4 runs of the digits 1 to 9, they have the same composition of digits inside them. But now, let’s superimpose the contents of the two sacks on to one grid, and look at the cells with both colors:

The cells with both colors are in both sacks. If you take the exact same cell out of both sacks, then clearly the two identical sacks will remain identical – you took the same thing out of each of them. You can do this for every cell with two colors. So, removing them all from our superimposed grid, we’re left with this:

That’s how we know that the digit composition of these regions – the *ring* and the *anchors - *are identical. And that means that this is a *useful* upgrade to our ontology, because now we can make statements like “There are two sixes in the *ring*, and only two spots in the *anchors* where a six can go, so there must be a six in each of them.” Those statements would be extremely tedious and unintuitive to make solely in terms of *rows, columns, *and *boxes.*Once you’ve gotten the hang of Phistomofel’s Theorem, it’s a relatively short jump to using set equivalence theory in general. When we were picking runs of the digits 1-9, we made those particular choices because they happened to have a lot of overlap to them. But the core logic works for

*any regions*that you make out of complete sets of the digits 1 to 9 – you can always eliminate the overlap and say that what’s left of the two regions are equal.

On its own, this is a neat bit of Sudoku trivia that might happen to pop up now and then in a puzzle. But what’s really interesting is when a puzzle is *designed* around this phenomenon. Setters (the term of art for people who design hand-crafted Sudoku puzzles for solvers to work through) can make a puzzle that’s virtually impossible using just *rows*, *columns*, and *boxes*, and can only be solved by upgrading your ontology. The Youtube channel *Cracking the Cryptic* has been an excellent source documenting this evolution.

Now I can reveal where that quote up top came from: a video from *Cracking the Cryptic* called *How to Make an Impossible Sudoku Easy*. (I’ll be spoiling the break-in to the puzzle, so if you’re interested in trying it on your own first, do it now before reading on or watching the video.) Technically speaking, this isn’t quite a normal Sudoku, but a variant with some special rules:

The variant itself adds some words to our ontology. “This *two-cell cage* sums to 12, so it can’t have a 1 or a 2 in it.” “This *line* has a 5 on one end, so it must have a 5 on the other end.” “This digit is a 6, so the one on the other side of the* dot* is a 5 or a 7.” But such straightforward interpretations of the variant rules aren’t what interest us here. If Simon had seen these rules two years ago, he wouldn’t have struggled to notice what they mean. Set equivalence theory is special in that it’s *not* a straightforward application of set rules. It’s a *new way of seeing*, one that you need to know to invoke instead of seeing an obvious hint for.

In this case, what’s interesting is the interaction of set equivalence theory with the palindromic lines. Start with a column set that includes parts of all 6 lines: columns 1, 4, and 7.

Next, get a row set that includes parts of all 6 lines - rows 3 and 6:

This is a little different from how we did Phistomofel’s Theorem. The orange and blue sets aren’t equal *exactly* – we needed three lots of 1 to 9 for blue, while two were enough for orange. Instead, the claim here is that we know *how much greater* blue is than orange – specifically, it has exactly one more run of 1 to 9 than orange.

We start by removing overlapping cells:

But we can go much further than that. Since we made sure to include the palindromic lines, we know that the blue segments of each line cancel out with the orange segments of the line. This leaves us with no orange digits at all, and 9 blue ones:

This means we know the composition of the blue digits. They’re exactly one set of the digits 1 to 9. The sum of the digits 1 to 9 is 451, and the top three cages sum to 13, 12, and 14. Since 13 + 12 + 14 = 39, the remaining digits must sum to 6. We know our only options are the digits 1 to 9 once each, which means that there’s only one way to do it – putting 1, 2, and 3 in the bottom three cells, in some order.

After finding this break-in, Simon remarked:

It’s not tricky – well, actually, I’m going to qualify that, because that’s a silly thing for me to have said. It’s absolutely impossible if you’ve not seen it before. Well, not impossible – virtually impossible. Certainly, probably impossible to do it at speed. But if you’ve seen it a few times before, it’s second-nature.

This is the hallmark of problems solved by ontological remodeling. You don’t want to say they’re *tricky*, exactly, because the new framework makes them feel pretty approachable. But without the new framework, they’re basically impossible. Trying to describe the difficulty of these problems is something of a trap, because so much of the difficulty* depends on the description. *Instead, you need to play around with new forms of expression and see which patterns are easy to describe with those forms.

The sheer simplicity of Sudoku makes it an excellent example for how powerful ontological remodeling can be even when it seems obvious which form to represent things in. Even if your generating rules are all related to *rows*, *columns*, and *boxes*, they can interact in such a way that you need new language to describe the interaction. In fact, this has become something of a sport on the channel, with puzzles like this one that start by showing a computer solver unable to solve a classic Sudoku logically (only getting the solution through brute forcing all possibilities), then setting a human loose on the puzzle to discover the hidden trick.

That’s as pure and straightforward a demonstration of ontological remodeling as you’re ever likely to get. Sudoku is a game whose rules only invoke *rows*, *columns* and *boxes*, but a computerized solver can’t describe the solve path with *rows*, *columns*, and *boxes*. The interplay of those rules leads to higher order patterns, and those higher order patterns need higher order language. But precisely because this example is so straightforward, it can give us a deceptive idea of how easy it is to notice that your ontology is lacking. While the solvers are in an ostensible “conflict” with the setters, it’s a thin bit of fun narrative over what is fundamentally a collaborative process. The setter wants people to notice the clever logical leap they demanded of their solvers, and that means they need people to *solve* *it. *Getting the warning that the computer can’t solve it gives you the enormous clue that *you’ve got to try something new. *You may not know what immediately, but you’re looking for the right sort of thing.

In fact, while there are plenty of puzzles that are totally incomprehensible without set equivalence theory, Phistomofel’s theorem itself has more tentative origins. Phistomofel created a puzzle that was best solved using the ring, but only published it when he was sure it could be solved “normally”. The ring was not a finding that solvers worldwide spontaneously developed when faced with a single, staggeringly beautiful puzzle. Instead, the origin story is far more pedestrian: Phistomofel talked about the ring in a forum post. Discussion followed that took the specific *ring* and *anchors* and generalized them into more generic statements about 1-9 sets. Only after the concept had reached a critical mass of socialization were most authors comfortable making puzzles where set equivalence theory was truly *required*.

So as great as puzzles are for practicing these techniques, you should never mistake a setters kindness for the indifferent illegibility of real-world problems. There are uncountably many things we’re thinking about the wrong way, but nature has no Phistomofel on deck nervously ensuring that it’s at least kind of possible to do it through brute force or starting a discussion with a helpful example. All we get are a bunch of problems that seem impossibly hard. Nothing about the world gives us an easy rule for which things are actually irreducibly tricky and which things just need better forms of description. That’s why it’s so important to cultivate an intuitive appreciation for the power of ontological remodeling. The reason this technique is so under-discussed and under-used is precisely because there’s no well-defined way to tell that it’s needed for a given problem. Individuals just have to take that leap of faith themselves, come up with a new way of seeing, and then demonstrate that problems that were once intractable are suddenly workable. We call those people “visionaries”, and we sure could use some more of them these days. *Thanks to Simon, Philip Newman, and Sam Cappleman-Lynes for reviewing a draft of this article.*

This is a secret that Simon only tells his closest friends.

I think your description of the first blue sack is wrong. Shouldn't it be Row 3 + Row 7 + Box 4 + Box 6?

This is such a cool article!!!

It got me wondering, as a mathematician, where the line is drawn between seeing a problem from a new angle and formal ontological remodeling. For example, the duality between stochastic differential equations (SDE) describing random walks and partial differential equations (PDE) describing the distribution of possible places a random walker could be & how that distribution changes over time. Would solving an SDE by way of "remodeling" it as a PDE be an example of ontological remodeling, similar to solving Sudoku with rings & anchors instead of rows, columns & boxes?

Thanks again for writing this very clear & beautiful article!