{"id":45,"date":"2013-05-20T19:32:02","date_gmt":"2013-05-21T02:32:02","guid":{"rendered":"https:\/\/www.alexkorn.com\/blog\/?p=45"},"modified":"2013-05-21T12:53:48","modified_gmt":"2013-05-21T19:53:48","slug":"graph-search-algorithmic-optimization-and-word-games-how-to-program-ruzzle","status":"publish","type":"post","link":"https:\/\/www.alexkorn.com\/blog\/2013\/05\/graph-search-algorithmic-optimization-and-word-games-how-to-program-ruzzle\/","title":{"rendered":"Graph search, algorithmic optimization, and word games: How to program Ruzzle"},"content":{"rendered":"<p><i>(Disclaimer: I have no affiliation with Ruzzle.)<\/i><\/p>\n<h3>Introduction<\/h3>\n<p>\nI was recently introduced to the game <a href=\"http:\/\/www.ruzzle-game.com\/\">Ruzzle<\/a>, a word finding game with some Scrabble-like scoring components. If you&#8217;re not familiar with it, check out their website and watch the quick introductory video.\n<\/p>\n<div id=\"attachment_47\" style=\"width: 330px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/www.alexkorn.com\/blog\/wp-content\/uploads\/2013\/05\/photo-1.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-47\" src=\"https:\/\/www.alexkorn.com\/blog\/wp-content\/uploads\/2013\/05\/photo-1.png\" alt=\"Ruzzle screenshot\" width=\"320\" height=\"480\" class=\"size-full wp-image-47\" srcset=\"https:\/\/www.alexkorn.com\/blog\/wp-content\/uploads\/2013\/05\/photo-1.png 640w, https:\/\/www.alexkorn.com\/blog\/wp-content\/uploads\/2013\/05\/photo-1-200x300.png 200w\" sizes=\"auto, (max-width: 320px) 100vw, 320px\" \/><\/a><p id=\"caption-attachment-47\" class=\"wp-caption-text\">Ruzzle screenshot<\/p><\/div>\n<p>\nI was fascinated by how the game calculated all the possible words on each board. So I decided to program it. <a href=\"http:\/\/ruzzle.alexkorn.com\/\">Check out the Ruzzle demo page<\/a>, which creates a randomized board and finds all possible words.\n<\/p>\n<h3>Overview<\/h3>\n<p>\nThe way to do this search for all possible words is by viewing the letters as a <a href=\"http:\/\/en.wikipedia.org\/wiki\/Directed_graph\">directed graph<\/a> where the letters are nodes and edges are connections between adjacent letters.\n<\/p>\n<div id=\"attachment_48\" style=\"width: 350px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/www.alexkorn.com\/blog\/wp-content\/uploads\/2013\/05\/Ruzzle-graph.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-48\" src=\"https:\/\/www.alexkorn.com\/blog\/wp-content\/uploads\/2013\/05\/Ruzzle-graph.png\" alt=\"Ruzzle graph\" width=\"340\" height=\"329\" class=\"size-full wp-image-48\" srcset=\"https:\/\/www.alexkorn.com\/blog\/wp-content\/uploads\/2013\/05\/Ruzzle-graph.png 340w, https:\/\/www.alexkorn.com\/blog\/wp-content\/uploads\/2013\/05\/Ruzzle-graph-300x290.png 300w\" sizes=\"auto, (max-width: 340px) 100vw, 340px\" \/><\/a><p id=\"caption-attachment-48\" class=\"wp-caption-text\">Ruzzle graph. This was more of a pain to make than you might expect.<\/p><\/div>\n<p>\nWith this graph, we can do a <a href=\"http:\/\/en.wikipedia.org\/wiki\/Depth-first_search\">Depth-first search (DFS)<\/a> for words starting at each node.\n<\/p>\n<h3>Creating the board<\/h3>\n<p>First, we need to create a 4&#215;4 board of letters. In order to create reasonable boards for English, we weight the letter distribution to <a href=\"http:\/\/en.wikipedia.org\/wiki\/Letter_frequency#Relative_frequencies_of_letters_in_the_English_language\">that of their distribution in English<\/a>.<\/p>\n<p>\nThe following code creates the graph\/board, assigns letters to each node\/tile, and outputs the board. The basic graph objects are shown at the end of this post.\n<\/p>\n<pre lang=\"php\">\r\n\/**\r\n * Create graph\/board\r\n *\/\r\n\r\n$timeStart = microtime(true);\r\n\r\n$colCount = 4;\r\n$rowCount = 4;\r\n\r\n$lettersGraph = new Graph_Directed();\r\n\/**\r\n * @var array Map<colId, Map<rowId, char letter>>\r\n *\/\r\n$lettersGraphNode = [];\r\nfor ($colIter = 0; $colIter < $colCount; ++$colIter)\r\n{\r\n    for ($rowIter = 0; $rowIter < $rowCount; ++$rowIter)\r\n    {\r\n        $node = $lettersGraphNode[$colIter][$rowIter] = new Graph_Node($lettersGraph);\r\n    }\r\n}\r\n\r\n\/**\r\n * Attach each node to all adjacent nodes\r\n *\/\r\nfor ($colIter = 0; $colIter < $colCount; ++$colIter)\r\n{\r\n    for ($rowIter = 0; $rowIter < $rowCount; ++$rowIter)\r\n    {\r\n        $node = $lettersGraphNode[$colIter][$rowIter];\r\n        for ($adjacentColIter = max(0, $colIter - 1);\r\n            $adjacentColIter < min($colCount, $colIter + 1 + 1);\r\n            ++$adjacentColIter)\r\n        {\r\n            for ($adjacentRowIter = max(0, $rowIter - 1);\r\n                $adjacentRowIter < min($rowCount, $rowIter + 1 + 1);\r\n                ++$adjacentRowIter)\r\n            {\r\n                \/\/ don't connect letters to themselves\r\n                if ($colIter == $adjacentColIter &#038;&#038;\r\n                        $rowIter == $adjacentRowIter)\r\n                {\r\n                    continue;\r\n                }\r\n                $node->connectFlowFrom(\r\n                        $lettersGraphNode[$adjacentColIter][$adjacentRowIter]);\r\n            }\r\n        }\r\n    }\r\n}\r\n\r\n\/**\r\n * @see http:\/\/en.wikipedia.org\/wiki\/Letter_frequency#Relative_frequencies_of_letters_in_the_English_language\r\n *\/\r\n$letterFrequency = [\r\n    'A' => 0.08167,\r\n    'B' => 0.01492,\r\n    'C' => 0.02782,\r\n    'D' => 0.04253,\r\n    'E' => 0.12702,\r\n    'F' => 0.02228,\r\n    'G' => 0.02015,\r\n    'H' => 0.06094,\r\n    'I' => 0.06966,\r\n    'J' => 0.00153,\r\n    'K' => 0.00772,\r\n    'L' => 0.04025,\r\n    'M' => 0.02406,\r\n    'N' => 0.06749,\r\n    'O' => 0.07507,\r\n    'P' => 0.01929,\r\n    'Q' => 0.00095,\r\n    'R' => 0.05987,\r\n    'S' => 0.06327,\r\n    'T' => 0.09056,\r\n    'U' => 0.02758,\r\n    'V' => 0.00978,\r\n    'W' => 0.02360,\r\n    'X' => 0.00150,\r\n    'Y' => 0.01974,\r\n    'Z' => 0.00074 + 0.00001, \/\/ there's some rounding in Wikipedia's list\r\n];\r\n$letterToMaxCumulativeFrequency = [];\r\n$cumulativeFrequency = 0;\r\nforeach ($letterFrequency as $letter => $frequency)\r\n{\r\n    $cumulativeFrequency += $frequency;\r\n    $letterToMaxCumulativeFrequency[$letter] = $cumulativeFrequency;\r\n}\r\n\r\nforeach ($lettersGraph->getNodes() as $node)\r\n{\r\n    $val = mt_rand(0, mt_getrandmax()) \/ mt_getrandmax();\r\n    foreach ($letterToMaxCumulativeFrequency as $letter => $maxFrequency)\r\n    {\r\n        if ($val < $maxFrequency)\r\n        {\r\n            $node->letter = $letter;\r\n            break;\r\n        }\r\n    }\r\n}\r\n\r\n\/**\r\n * @param array $lettersGraphNode\r\n * @return string\/HC\r\n *\/\r\nfunction printTileLetters(array $lettersGraphNode)\r\n{\r\n    $result = '<table><tbody>';\r\n    foreach ($lettersGraphNode as $colData)\r\n    {\r\n        $result .= '<tr>';\r\n        foreach ($colData as $rowData)\r\n        {\r\n            $result .= '<td>' . $rowData->letter . '<\/td>';\r\n        }\r\n        $result .= '<\/tr>';\r\n    }\r\n    $result .= '<\/tbody><\/table>';\r\n    return $result;\r\n}\r\n\r\necho printTileLetters($lettersGraphNode);\r\n<\/pre>\n<h3>Graph search, algorithmic complexity, and optimization<\/h3>\n<p>\nFirst, we need to load and parse the dictionary. I&#8217;ve found <a href=\"https:\/\/code.google.com\/p\/dotnetperls-controls\/downloads\/detail?name=enable1.txt&#038;can=2&#038;q=\">this dictionary<\/a> works well.\n<\/p>\n<pre lang=\"php\">\r\n\/**\r\n * Load\/parse dictionary\r\n *\/\r\n\r\n$dictionaryList = file(dirname(dirname(__FILE__)) . '\/data\/dictionary.txt');\r\n$dictionaryMap = [];\r\nforeach ($dictionaryList as $word)\r\n{\r\n    $dictionaryMap[strtoupper(trim($word))] = true;\r\n}\r\n\r\n\/**\r\n * Set up first letters array, which helps us optimize the otherwise-slow\r\n * DFS for words.\r\n *\/\r\n\r\n$dictionaryFirstNLetters = [];\r\ndefine('MIN_FIRST_LETTERS_LENGTH', 4);\r\ndefine('MAX_FIRST_LETTERS_LENGTH', 10);\r\n\r\nforeach ($dictionaryList as $word)\r\n{\r\n    $word = strtoupper(trim($word));\r\n    for ($length = MIN_FIRST_LETTERS_LENGTH; $length < MAX_FIRST_LETTERS_LENGTH; ++$length)\r\n    {\r\n        $firstNLetters = substr($word, 0, $length);\r\n        if ($firstNLetters != $word)\r\n        {\r\n            $dictionaryFirstNLetters[$firstNLetters] = true;\r\n        }\r\n        else\r\n        {\r\n            break;\r\n        }\r\n    }\r\n}\r\n\r\necho \"Dictionary loaded\/parsed in \" . number_format(microtime(true) - $timeStart, 3) . \"s<br><br>\\n\";\r\n<\/pre>\n<p>\nNote the variable <code>$dictionaryFirstNLetters<\/code>. We use this to optimize our DFS by allowing us to &#8220;bail out&#8221; early in our search if we see that there are no words that start with our <code>$currentResult<\/code>. If you try taking out this code, you&#8217;ll see that this algorithm can take an incredibly long time to run &#8211; even with a <code>$maxDepth<\/code> of 6 characters (only find words with up to 6 characters), the runtime can easily reach a minute. It increases quickly as <code>$maxDepth<\/code> is increased.\n<\/p>\n<p>\nBelow is our optimized depth-first search.\n<\/p>\n<pre lang=\"php\">\r\n\/**\r\n * @param Zeal_Graph_Node $source\r\n * @param int $maxDepth\r\n * @param array $dictionaryMap\r\n * @param array $dictionaryFirstNLetters\r\n * @param array $previouslyVisitedNodeIds\r\n * @param string $currentResult\r\n * @param array $results (By reference)\r\n * @return array\r\n *\/\r\nfunction depthFirstSearch(Zeal_Graph_Node $source, $maxDepth,\r\n        array $dictionaryMap, array $dictionaryFirstNLetters,\r\n        $previouslyVisitedNodeIds = [], $currentResult = '', array &$results = [])\r\n{\r\n    if (strlen($currentResult) >= $maxDepth)\r\n    {\r\n        return [];\r\n    }\r\n    $currentResult .= $source->letter;\r\n    if (isset($dictionaryMap[$currentResult]))\r\n    {\r\n        $results[] = $currentResult;\r\n    }\r\n    \/\/ bail out if there are no words that start with these letters\r\n    if (strlen($currentResult) >= MIN_FIRST_LETTERS_LENGTH &&\r\n        strlen($currentResult) <= MAX_FIRST_LETTERS_LENGTH &#038;&#038;\r\n            !isset($dictionaryFirstNLetters[$currentResult]))\r\n    {\r\n        return [];\r\n    }\r\n    $previouslyVisitedNodeIds[] = $source->id();\r\n    foreach ($source->outgoingEdges as $edge)\r\n    {\r\n        $destination = $edge->getNodeTo();\r\n        \/\/ can't repeat letters\r\n        if (in_array($destination->id(), $previouslyVisitedNodeIds))\r\n        {\r\n            continue;\r\n        }\r\n        $results += depthFirstSearch($destination, $maxDepth,\r\n                $dictionaryMap, $dictionaryFirstNLetters, $previouslyVisitedNodeIds, $currentResult, $results);\r\n    }\r\n    return $results;\r\n}\r\n\r\n\/**\r\n * Iterate over each tile and find all words starting with that letter\r\n *\/\r\n$results = [];\r\nfor ($colIter = 0; $colIter < $colCount; ++$colIter)\r\n{\r\n    for ($rowIter = 0; $rowIter < $rowCount; ++$rowIter)\r\n    {\r\n        $results = array_merge(\r\n            $results,\r\n            depthFirstSearch(\r\n                $lettersGraphNode[$colIter][$rowIter], $colCount * $rowCount,\r\n                $dictionaryMap, $dictionaryFirstNLetters)\r\n        );\r\n    }\r\n}\r\n\r\necho \"<br>Stats\\n<ul><li>Before uniquing: \" . number_format(count($results)) . \"<\/li>\\n\";\r\n$resultsUnique = array_unique($results);\r\necho \"<li>After uniquing: \" . number_format(count($resultsUnique)) . \"<\/li><\/ul>\\n\";\r\necho 'Words found<ul><li>' . implode(\"<\/li><li>\\n\", $resultsUnique) . '<\/li><\/ul>';\r\n\r\necho \"\\n\" . number_format(microtime(true) - $timeStart, 2) . \"s\\n\";\r\n<\/pre>\n<h3>What&#8217;s next<\/h3>\n<p>\nThis crash-course in Ruzzle programming doesn&#8217;t address the scoring, graphics, or actual gameplay. But these would all be fairly easy to add to the basic chasis that we&#8217;ve created here. I&#8217;ll leave that as an exercise to the reader \ud83d\ude42\n<\/p>\n<h3>Reference<\/h3>\n<p>\nThe graph code I use is straightforward, and given below.\n<\/p>\n<h4>Directed graph<\/h4>\n<pre lang=\"php\">\r\n\/**\r\n * A directed graph.\r\n * - http:\/\/en.wikipedia.org\/wiki\/Directed_graph\r\n *\/\r\nclass Graph_Directed\r\n{\r\n    \/**************************************************************************\r\n     * @section ID generators\r\n     *\/\r\n\r\n    \/**\r\n     * @var int\r\n     *\/\r\n    protected $_maxNodeId = 0;\r\n    \/**\r\n     * @param Graph_Node $node\r\n     * @return int\r\n     *\/\r\n    public final function getNextNodeId(Graph_Node $node)\r\n    {\r\n        ++$this->_maxNodeId;\r\n        $this->_nodes[$this->_maxNodeId] = $node;\r\n        return $this->_maxNodeId;\r\n    }\r\n\r\n    \/**\r\n     * @var int\r\n     *\/\r\n    protected $_maxEdgeId = 0;\r\n    \/**\r\n     * @param Graph_Edge $edge\r\n     * @return int\r\n     *\/\r\n    public final function getNextEdgeId(Graph_Edge $edge)\r\n    {\r\n        ++$this->_maxEdgeId;\r\n        if (!($edge instanceof ipGraph_EdgeResidual))\r\n        {\r\n            $this->_edges[$this->_maxEdgeId] = $edge;\r\n        }\r\n        return $this->_maxEdgeId;\r\n    }\r\n\r\n    \/**************************************************************************\r\n     * @section Edges, Nodes getters\r\n     *\/\r\n\r\n    \/**\r\n     * @var array Map<int id, Zeal_Graph_Node> $_nodes\r\n     *\/\r\n    protected $_nodes = [];\r\n    \/**\r\n     * Returns nodes\r\n     * @param none\r\n     * @return Map<int id, Zeal_Graph_Node>\r\n     *\/\r\n    public final function getNodes()\r\n    {\r\n        return $this->_nodes;\r\n    }\r\n    \/**\r\n     * @param none\r\n     * @return int \r\n     *\/\r\n    public final function getNodeCount()\r\n    {\r\n        return count($this->_nodes);\r\n    }\r\n\r\n    \/**\r\n     * @var array Map<int id, Graph_Edge $_edges\r\n     *\/\r\n    protected $_edges = [];\r\n    \/**\r\n     * @param none\r\n     * @return array Map<edgeId, Graph_Edge> \r\n     *\/\r\n    public final function getEdges()\r\n    {\r\n        return $this->_edges;\r\n    }\r\n    \/**\r\n     * @param none\r\n     * @return int\r\n     *\/\r\n    public final function getEdgeCount()\r\n    {\r\n        return count($this->_edges);\r\n    }\r\n\r\n}\r\n<\/pre>\n<h4>Node<\/h4>\n<pre lang=\"php\">\r\n\/**\r\n * A node\/vertex in a graph.\r\n * http:\/\/en.wikipedia.org\/wiki\/Graph_(mathematics)\r\n *\/\r\nclass Graph_Node\r\n{\r\n    \/**\r\n     * @var Graph_Directed The graph to which this node belongs\r\n     *\/\r\n    protected $_graph;\r\n    \r\n    \/**\r\n     * @param Graph_Directed $graph \r\n     *\/\r\n    public function __construct(Graph_Directed $graph)\r\n    {\r\n        $this->_id = $graph->getNextNodeId($this);\r\n        $this->_graph = $graph;\r\n    }\r\n\r\n    \/**\r\n     * @var int Unique across the parent Graph_Directed\r\n     *\/\r\n    protected $_id;\r\n    \/**\r\n     * @param none\r\n     * @return int\r\n     *\/\r\n    public function id()\r\n    {\r\n        return $this->_id;\r\n    }\r\n\r\n    \/**\r\n     * @param Graph_Node $node\r\n     * @return Graph_Edge\r\n     *\/\r\n    public function connectFlowFrom(Graph_Node $node)\r\n    {\r\n        $edge = new Graph_Edge($this->_graph, $node, $this);\r\n        return $edge;\r\n    }\r\n\r\n}\r\n<\/pre>\n<h4>Edge<\/h4>\n<pre lang=\"php\">\r\n\/**\r\n * A directed edge in a directed graph.\r\n *\/\r\nclass Graph_Edge\r\n{\r\n    \/**\r\n     * @var int Unique across the graph\r\n     *\/\r\n    protected $_id;\r\n\r\n    \/**\r\n     * @var Graph_Node $_nodeFrom\r\n     *\/\r\n    protected $_nodeFrom;\r\n    \/**\r\n     * @var Graph_Node $_nodeTo\r\n     *\/\r\n    protected $_nodeTo;\r\n\r\n    \/**\r\n     * @var Graph_Directed\r\n     *\/\r\n    protected $_graph;\r\n\r\n    \/**\r\n     * @param Graph_Directed $graph\r\n     * @param Graph_Node $nodeFrom\r\n     * @param Graph_Node $nodeTo\r\n     *\/\r\n    public function __construct(Graph_Directed $graph,\r\n            Graph_Node $nodeFrom, Graph_Node $nodeTo)\r\n    {\r\n        $this->_id = $graph->getNextEdgeId($this);\r\n        $this->_graph = $graph;\r\n        $this->_nodeFrom = $nodeFrom;\r\n        $nodeFrom->outgoingEdges[] = $this;\r\n        $this->_nodeTo = $nodeTo;\r\n    }\r\n\r\n    \/**\r\n     * @param none\r\n     * @return int\r\n     *\/\r\n    public function id()\r\n    {\r\n        return $this->_id;\r\n    }\r\n\r\n    \/**\r\n     * @param none\r\n     * @return Graph_Node\r\n     *\/\r\n    public function getNodeTo()\r\n    {\r\n        return $this->_nodeTo;\r\n    }\r\n\r\n    \/**\r\n     * @param none\r\n     * @return Graph_Node\r\n     *\/\r\n    public function getNodeFrom()\r\n    {\r\n        return $this->_nodeFrom;\r\n    }\r\n\r\n    \/**\r\n     * @param none\r\n     * @return Graph_Directed\r\n     *\/\r\n    public function getGraph()\r\n    {\r\n        return $this->_graph;\r\n    }\r\n\r\n}\r\n<\/pre>\n<p>\nFeel free to ask questions or give feedback <a href=\"http:\/\/twitter.com\/alexkorn\">via Twitter<\/a>.<\/p>\n<!-- AddThis Advanced Settings generic via filter on the_content --><!-- AddThis Share Buttons generic via filter on the_content -->","protected":false},"excerpt":{"rendered":"<p>(Disclaimer: I have no affiliation with Ruzzle.) Introduction I was recently introduced to the game Ruzzle, a word finding game with some Scrabble-like scoring components. If you&#8217;re not familiar with it, check out their website and watch the quick introductory video. I was fascinated by how the game calculated all the possible words on each [&hellip;]<!-- AddThis Advanced Settings generic via filter on get_the_excerpt --><!-- AddThis Share Buttons generic via filter on get_the_excerpt --><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[16],"tags":[27,29,28,6],"class_list":["post-45","post","type-post","status-publish","format-standard","hentry","category-tutorials","tag-algorithms","tag-depth-first-search","tag-graph-theory","tag-php"],"_links":{"self":[{"href":"https:\/\/www.alexkorn.com\/blog\/wp-json\/wp\/v2\/posts\/45","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.alexkorn.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.alexkorn.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.alexkorn.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.alexkorn.com\/blog\/wp-json\/wp\/v2\/comments?post=45"}],"version-history":[{"count":0,"href":"https:\/\/www.alexkorn.com\/blog\/wp-json\/wp\/v2\/posts\/45\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.alexkorn.com\/blog\/wp-json\/wp\/v2\/media?parent=45"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.alexkorn.com\/blog\/wp-json\/wp\/v2\/categories?post=45"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.alexkorn.com\/blog\/wp-json\/wp\/v2\/tags?post=45"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}