Graph search, algorithmic optimization, and word games: How to program Ruzzle

Posted: May 20th, 2013 | Author: | Filed under: Tutorials | Tags: , , , | No Comments »

(Disclaimer: I have no affiliation with Ruzzle.)

Introduction

I was recently introduced to the game Ruzzle, a word finding game with some Scrabble-like scoring components. If you’re not familiar with it, check out their website and watch the quick introductory video.

Ruzzle screenshot

Ruzzle screenshot

I was fascinated by how the game calculated all the possible words on each board. So I decided to program it. Check out the Ruzzle demo page, which creates a randomized board and finds all possible words.

Overview

The way to do this search for all possible words is by viewing the letters as a directed graph where the letters are nodes and edges are connections between adjacent letters.

Ruzzle graph

Ruzzle graph. This was more of a pain to make than you might expect.

With this graph, we can do a Depth-first search (DFS) for words starting at each node.

Creating the board

First, we need to create a 4×4 board of letters. In order to create reasonable boards for English, we weight the letter distribution to that of their distribution in English.

The following code creates the graph/board, assigns letters to each node/tile, and outputs the board. The basic graph objects are shown at the end of this post.

/**
 * Create graph/board
 */
 
$timeStart = microtime(true);
 
$colCount = 4;
$rowCount = 4;
 
$lettersGraph = new Graph_Directed();
/**
 * @var array Map<colId, Map<rowId, char letter>>
 */
$lettersGraphNode = [];
for ($colIter = 0; $colIter < $colCount; ++$colIter)
{
    for ($rowIter = 0; $rowIter < $rowCount; ++$rowIter)
    {
        $node = $lettersGraphNode[$colIter][$rowIter] = new Graph_Node($lettersGraph);
    }
}
 
/**
 * Attach each node to all adjacent nodes
 */
for ($colIter = 0; $colIter < $colCount; ++$colIter)
{
    for ($rowIter = 0; $rowIter < $rowCount; ++$rowIter)
    {
        $node = $lettersGraphNode[$colIter][$rowIter];
        for ($adjacentColIter = max(0, $colIter - 1);
            $adjacentColIter < min($colCount, $colIter + 1 + 1);
            ++$adjacentColIter)
        {
            for ($adjacentRowIter = max(0, $rowIter - 1);
                $adjacentRowIter < min($rowCount, $rowIter + 1 + 1);
                ++$adjacentRowIter)
            {
                // don't connect letters to themselves
                if ($colIter == $adjacentColIter &&
                        $rowIter == $adjacentRowIter)
                {
                    continue;
                }
                $node->connectFlowFrom(
                        $lettersGraphNode[$adjacentColIter][$adjacentRowIter]);
            }
        }
    }
}
 
/**
 * @see http://en.wikipedia.org/wiki/Letter_frequency#Relative_frequencies_of_letters_in_the_English_language
 */
$letterFrequency = [
    'A' => 0.08167,
    'B' => 0.01492,
    'C' => 0.02782,
    'D' => 0.04253,
    'E' => 0.12702,
    'F' => 0.02228,
    'G' => 0.02015,
    'H' => 0.06094,
    'I' => 0.06966,
    'J' => 0.00153,
    'K' => 0.00772,
    'L' => 0.04025,
    'M' => 0.02406,
    'N' => 0.06749,
    'O' => 0.07507,
    'P' => 0.01929,
    'Q' => 0.00095,
    'R' => 0.05987,
    'S' => 0.06327,
    'T' => 0.09056,
    'U' => 0.02758,
    'V' => 0.00978,
    'W' => 0.02360,
    'X' => 0.00150,
    'Y' => 0.01974,
    'Z' => 0.00074 + 0.00001, // there's some rounding in Wikipedia's list
];
$letterToMaxCumulativeFrequency = [];
$cumulativeFrequency = 0;
foreach ($letterFrequency as $letter => $frequency)
{
    $cumulativeFrequency += $frequency;
    $letterToMaxCumulativeFrequency[$letter] = $cumulativeFrequency;
}
 
foreach ($lettersGraph->getNodes() as $node)
{
    $val = mt_rand(0, mt_getrandmax()) / mt_getrandmax();
    foreach ($letterToMaxCumulativeFrequency as $letter => $maxFrequency)
    {
        if ($val < $maxFrequency)
        {
            $node->letter = $letter;
            break;
        }
    }
}
 
/**
 * @param array $lettersGraphNode
 * @return string/HC
 */
function printTileLetters(array $lettersGraphNode)
{
    $result = '<table><tbody>';
    foreach ($lettersGraphNode as $colData)
    {
        $result .= '<tr>';
        foreach ($colData as $rowData)
        {
            $result .= '<td>' . $rowData->letter . '</td>';
        }
        $result .= '</tr>';
    }
    $result .= '</tbody></table>';
    return $result;
}
 
echo printTileLetters($lettersGraphNode);

Graph search, algorithmic complexity, and optimization

First, we need to load and parse the dictionary. I’ve found this dictionary works well.

/**
 * Load/parse dictionary
 */
 
$dictionaryList = file(dirname(dirname(__FILE__)) . '/data/dictionary.txt');
$dictionaryMap = [];
foreach ($dictionaryList as $word)
{
    $dictionaryMap[strtoupper(trim($word))] = true;
}
 
/**
 * Set up first letters array, which helps us optimize the otherwise-slow
 * DFS for words.
 */
 
$dictionaryFirstNLetters = [];
define('MIN_FIRST_LETTERS_LENGTH', 4);
define('MAX_FIRST_LETTERS_LENGTH', 10);
 
foreach ($dictionaryList as $word)
{
    $word = strtoupper(trim($word));
    for ($length = MIN_FIRST_LETTERS_LENGTH; $length < MAX_FIRST_LETTERS_LENGTH; ++$length)
    {
        $firstNLetters = substr($word, 0, $length);
        if ($firstNLetters != $word)
        {
            $dictionaryFirstNLetters[$firstNLetters] = true;
        }
        else
        {
            break;
        }
    }
}
 
echo "Dictionary loaded/parsed in " . number_format(microtime(true) - $timeStart, 3) . "s<br><br>\n";

Note the variable $dictionaryFirstNLetters. We use this to optimize our DFS by allowing us to “bail out” early in our search if we see that there are no words that start with our $currentResult. If you try taking out this code, you’ll see that this algorithm can take an incredibly long time to run – even with a $maxDepth of 6 characters (only find words with up to 6 characters), the runtime can easily reach a minute. It increases quickly as $maxDepth is increased.

Below is our optimized depth-first search.

/**
 * @param Zeal_Graph_Node $source
 * @param int $maxDepth
 * @param array $dictionaryMap
 * @param array $dictionaryFirstNLetters
 * @param array $previouslyVisitedNodeIds
 * @param string $currentResult
 * @param array $results (By reference)
 * @return array
 */
function depthFirstSearch(Zeal_Graph_Node $source, $maxDepth,
        array $dictionaryMap, array $dictionaryFirstNLetters,
        $previouslyVisitedNodeIds = [], $currentResult = '', array &$results = [])
{
    if (strlen($currentResult) >= $maxDepth)
    {
        return [];
    }
    $currentResult .= $source->letter;
    if (isset($dictionaryMap[$currentResult]))
    {
        $results[] = $currentResult;
    }
    // bail out if there are no words that start with these letters
    if (strlen($currentResult) >= MIN_FIRST_LETTERS_LENGTH &&
        strlen($currentResult) <= MAX_FIRST_LETTERS_LENGTH &&
            !isset($dictionaryFirstNLetters[$currentResult]))
    {
        return [];
    }
    $previouslyVisitedNodeIds[] = $source->id();
    foreach ($source->outgoingEdges as $edge)
    {
        $destination = $edge->getNodeTo();
        // can't repeat letters
        if (in_array($destination->id(), $previouslyVisitedNodeIds))
        {
            continue;
        }
        $results += depthFirstSearch($destination, $maxDepth,
                $dictionaryMap, $dictionaryFirstNLetters, $previouslyVisitedNodeIds, $currentResult, $results);
    }
    return $results;
}
 
/**
 * Iterate over each tile and find all words starting with that letter
 */
$results = [];
for ($colIter = 0; $colIter < $colCount; ++$colIter)
{
    for ($rowIter = 0; $rowIter < $rowCount; ++$rowIter)
    {
        $results = array_merge(
            $results,
            depthFirstSearch(
                $lettersGraphNode[$colIter][$rowIter], $colCount * $rowCount,
                $dictionaryMap, $dictionaryFirstNLetters)
        );
    }
}
 
echo "<br>Stats\n<ul><li>Before uniquing: " . number_format(count($results)) . "</li>\n";
$resultsUnique = array_unique($results);
echo "<li>After uniquing: " . number_format(count($resultsUnique)) . "</li></ul>\n";
echo 'Words found<ul><li>' . implode("</li><li>\n", $resultsUnique) . '</li></ul>';
 
echo "\n" . number_format(microtime(true) - $timeStart, 2) . "s\n";

What’s next

This crash-course in Ruzzle programming doesn’t address the scoring, graphics, or actual gameplay. But these would all be fairly easy to add to the basic chasis that we’ve created here. I’ll leave that as an exercise to the reader :)

Reference

The graph code I use is straightforward, and given below.

Directed graph

/**
 * A directed graph.
 * - http://en.wikipedia.org/wiki/Directed_graph
 */
class Graph_Directed
{
    /**************************************************************************
     * @section ID generators
     */
 
    /**
     * @var int
     */
    protected $_maxNodeId = 0;
    /**
     * @param Graph_Node $node
     * @return int
     */
    public final function getNextNodeId(Graph_Node $node)
    {
        ++$this->_maxNodeId;
        $this->_nodes[$this->_maxNodeId] = $node;
        return $this->_maxNodeId;
    }
 
    /**
     * @var int
     */
    protected $_maxEdgeId = 0;
    /**
     * @param Graph_Edge $edge
     * @return int
     */
    public final function getNextEdgeId(Graph_Edge $edge)
    {
        ++$this->_maxEdgeId;
        if (!($edge instanceof ipGraph_EdgeResidual))
        {
            $this->_edges[$this->_maxEdgeId] = $edge;
        }
        return $this->_maxEdgeId;
    }
 
    /**************************************************************************
     * @section Edges, Nodes getters
     */
 
    /**
     * @var array Map<int id, Zeal_Graph_Node> $_nodes
     */
    protected $_nodes = [];
    /**
     * Returns nodes
     * @param none
     * @return Map<int id, Zeal_Graph_Node>
     */
    public final function getNodes()
    {
        return $this->_nodes;
    }
    /**
     * @param none
     * @return int 
     */
    public final function getNodeCount()
    {
        return count($this->_nodes);
    }
 
    /**
     * @var array Map<int id, Graph_Edge $_edges
     */
    protected $_edges = [];
    /**
     * @param none
     * @return array Map<edgeId, Graph_Edge> 
     */
    public final function getEdges()
    {
        return $this->_edges;
    }
    /**
     * @param none
     * @return int
     */
    public final function getEdgeCount()
    {
        return count($this->_edges);
    }
 
}

Node

/**
 * A node/vertex in a graph.
 * http://en.wikipedia.org/wiki/Graph_(mathematics)
 */
class Graph_Node
{
    /**
     * @var Graph_Directed The graph to which this node belongs
     */
    protected $_graph;
 
    /**
     * @param Graph_Directed $graph 
     */
    public function __construct(Graph_Directed $graph)
    {
        $this->_id = $graph->getNextNodeId($this);
        $this->_graph = $graph;
    }
 
    /**
     * @var int Unique across the parent Graph_Directed
     */
    protected $_id;
    /**
     * @param none
     * @return int
     */
    public function id()
    {
        return $this->_id;
    }
 
    /**
     * @param Graph_Node $node
     * @return Graph_Edge
     */
    public function connectFlowFrom(Graph_Node $node)
    {
        $edge = new Graph_Edge($this->_graph, $node, $this);
        return $edge;
    }
 
}

Edge

/**
 * A directed edge in a directed graph.
 */
class Graph_Edge
{
    /**
     * @var int Unique across the graph
     */
    protected $_id;
 
    /**
     * @var Graph_Node $_nodeFrom
     */
    protected $_nodeFrom;
    /**
     * @var Graph_Node $_nodeTo
     */
    protected $_nodeTo;
 
    /**
     * @var Graph_Directed
     */
    protected $_graph;
 
    /**
     * @param Graph_Directed $graph
     * @param Graph_Node $nodeFrom
     * @param Graph_Node $nodeTo
     */
    public function __construct(Graph_Directed $graph,
            Graph_Node $nodeFrom, Graph_Node $nodeTo)
    {
        $this->_id = $graph->getNextEdgeId($this);
        $this->_graph = $graph;
        $this->_nodeFrom = $nodeFrom;
        $nodeFrom->outgoingEdges[] = $this;
        $this->_nodeTo = $nodeTo;
    }
 
    /**
     * @param none
     * @return int
     */
    public function id()
    {
        return $this->_id;
    }
 
    /**
     * @param none
     * @return Graph_Node
     */
    public function getNodeTo()
    {
        return $this->_nodeTo;
    }
 
    /**
     * @param none
     * @return Graph_Node
     */
    public function getNodeFrom()
    {
        return $this->_nodeFrom;
    }
 
    /**
     * @param none
     * @return Graph_Directed
     */
    public function getGraph()
    {
        return $this->_graph;
    }
 
}

Feel free to ask questions or give feedback via Twitter.


Turning off text anti-aliasing with NetBeans 7.3, 7.2, 7.1 on OS X

Posted: March 5th, 2012 | Author: | Filed under: Uncategorized | Tags: , , | 1 Comment »

Update 2013/02/10: This also works with NetBeans 7.3.

Update 2012/07/24: This post was originally written about 7.1. Everything here also applies to Netbeans 7.2, so I’ve update this to refer to 7.2.

Another version of NetBeans, another hack to get it to turn off text anti-aliasing. Previously, I showed how with NetBeans 7.0. NetBeans 7.3.x is nearly the same, but a bit simpler.

Just make the following changes to the definition of netbeans_default_options in /Applications/NetBeans/NetBeans\ 7.3.app/Contents/Resources/NetBeans/etc/netbeans.conf (or something similar if you didn’t install it in the default location):

Add:

-J-Dswing.aatext=false

Change:

-J-Dapple.awt.graphics.UseQuartz=false

Restart NetBeans and you should have non-anti-aliased fonts.

Notes

Note that, unlike NetBeans 7.0, you no longer need to add -J-Dawt.useSystemAAFontSettings=off.

See some notes about text anti-aliasing on the NetBeans wiki.


This is how quickly we give up on resolutions

Posted: January 1st, 2012 | Author: | Filed under: Uncategorized | Tags: | No Comments »

On Facebook today, I noticed one of my friends posted a picture with the label “1/365″. Like other people, she’s attempting to post one picture every day of the year. An admirable goal!

This got me thinking: I wonder how long it takes most people to give up on resolutions? Well, the best (read: easiest and cheapest) way to figure this out is to see how many Google results there are for “1/365″, “2/365″, etc.

This is how quickly we give up on our resolutions*:

It appears most people give up after the first week, and the vast majority give up after the first month or two. But if they stick with it for the first 10 months or so, they’re talking about it a lot until the end of the year.

The technical details

First off, Google’s estimates of how many results a query has are grossly exaggerated (thanks, James). The chart above uses Andrew Zich’s Google Search Matrix.

I had to remove some outliers, which I set to be the average of up to 3 days on either side of them.

  • “7/365″: 719k (mostly in the form of “24/7/365″)
  • “364/365″: 1.8m (apparently a lot of sites just have numbers in sequence)

Below is the PHP script I used to generate the numbers initially. Note that the Google homepage (and thus the text around the number of results) looks different when you aren’t logged in.

echo '<html><table>';
 
$matches;
for ($i = 1; $i <= 365; ++$i)
{
    $url = 'http://www.google.com/search?q=' . urlencode('"' . $i . '/365"');
    $result = file_get_contents($url);
 
    preg_match('/of about <b>(.*)<\/b> for/', $result, $matches);
 
    echo '<tr><td>' . $i . '</td><td>' . str_replace(',', '', $matches[1]) . '</td></tr>';
}
echo '</table></html>';

Here is the chart created with the numbers that Google estimates on the first page of results:

The general shape of the data is the same, but the actual numbers of results are very different between the two.

The disclaimer

Oh man, this is so unscientific it almost hurts. Just because someone stops posting about their resolutions doesn’t mean they’ve given up. Not everyone who’s come up with a resolution posts about it online. Etc etc etc. I sacrificed a lot of scientific rigor to come up with a catchy title.


How to Enable Key Repeat in OS X Lion (10.7)

Posted: July 22nd, 2011 | Author: | Filed under: Uncategorized | Tags: , , | No Comments »

To enable key repeat in Mac OS 10.7 Lion, open up your Console, enter the following line, then restart your computer.

defaults write -g ApplePressAndHoldEnabled -bool false

The default behavior in 10.7 is to have a special keys selector pop up when holding down a key, similar to the behavior in iOS. This can be handy for typing accented letters, but makes things like navigating in Vi much more difficult. As of now, there is no way to change this setting in the user interface.


Turning off text anti-aliasing with NetBeans 7.0 on OS X

Posted: May 2nd, 2011 | Author: | Filed under: Uncategorized | Tags: , , | 3 Comments »

See the updated post for NetBeans 7.1 and 7.2.

The quick version

For those who want to turn off text anti-aliasing in NetBeans 7.0.x on Mac OS X without reading my tirades, just make the following changes to the definition of netbeans_default_options in /Applications/NetBeans/NetBeans\ 7.0.app/Contents/Resources/NetBeans/etc/netbeans.conf (or something similar if you didn’t install it in the default location):

Add:

-J-Dswing.aatext=false -J-Dawt.useSystemAAFontSettings=off 

Change:

-J-Dapple.awt.graphics.UseQuartz=false

Restart NetBeans and you should have non-anti-aliased fonts.

The full story

The perfect programming font: many have searched; few have found.

It’s Monaco, 10pt, no anti-aliasing.

But I digress. My goal here is not to discuss fonts (but seriously, I love discussing fonts), but to write the most keyword-heavy post for those trying to figure out how to turn off text anti-aliasing in NetBeans 7 on Mac OS X. By default, NetBeans uses anti-aliasing for its editor, which, for most fonts, is great, but not for Monaco 10pt.

When I used NetBeans 6.x, I found this page to be quite useful. However, when I tried turning off anti-aliasing with NetBeans 7.0, it did not work.

After a bit of exploring, I found the key to be setting -J-Dapple.awt.graphics.UseQuartz=false. I believe Quartz refers to the 2D graphics rendering API for OS X, but for the sake of your non-anti-aliased font, you certainly don’t need it.

In-Application Setting

NetBeans has been well aware of the issue for quite some time, so don’t expect the anti-aliasing setting to be modifiable inside the application any time soon.


Sending email from AWS in PHP with AmazonSES

Posted: April 28th, 2011 | Author: | Filed under: AWS, Hosting, Tutorials | Tags: | 2 Comments »

Update 2013/06/27: I’ve updated some of the code to match new SES interfaces.

For those who have sent emails in PHP, the following code should look quite familiar:

mail($to, $subject, $message);

Unfortunately, this will silently fail on Amazon Web Services. But you do need to send those emails, right? Well, it’s pretty easy to get emails sent from AWS with Amazon Simple Email Service (Amazon SES). However, in my research on this, I found several false leads and no good tutorials, so here goes: step by step, how to send emails with AWS.

First step: sign up.

AWS SDK for PHP

Next, install the source code code for AWS SDK for PHP. Run the following in the directory where you store external libraries (using sudo when needed):

yum install git
git clone git://github.com/amazonwebservices/aws-sdk-for-php.git AWSSDKforPHP

Now that you have the the AWS SDK on your server, include AWSSDKforPHP/sdk.class.php into your application. You now have access to the AmazonSES class.

The following is a basic function to emulate PHP’s mail().

function amazonSesEmail($to, $subject, $message)
{
    $amazonSes = new AmazonSES(array(
        'key' => AWS_KEY,
        'secret' => AWS_SECRET_KEY
    ));
 
    $response = $amazonSes->send_email(AWS_SES_FROM_EMAIL,
        array('ToAddresses' => array($to)),
        array(
            'Subject.Data' => $subject,
            'Body.Text.Data' => $message,
        )
    );
    if (!$response->isOK())
    {
        // handle error
    }
}

Note that you need to define the AWS_KEY and AWS_SECRET_KEY for your application. You can find these by going to your AWS Security Credentials page and looking under “Access Keys”.

Validating the sending email address

Next, define AWS_SES_FROM_EMAIL to be the email address from which you want to send emails, e.g. no-reply@example.com. If you try to send an email at this point, you’ll get a MessageRejected error with the message “Email address is not verified.” Several pages I visited recommended I download the Amazon SES Scripts (Perl), but don’t do it! You’ll need to subject yourself to a world of pain in the form of installing Perl modules* (and probably Perl too, given that you’re reading a post on PHP!).

Instead, you can actually use AmazonSES to verify email addresses in PHP.

$amazonSes = new AmazonSES(array(
    'key' => AWS_KEY,
    'secret' => AWS_SECRET_KEY
));
$amazonSes->verify_email_address($fromEmailAddress);

Much easier than the Perl script shenanigans. This will send an email to your sending email address with a link to verify that you own that email address. Do the same for your recipient addresses (since you’re in sandbox mode, which has restrictions including the requirement to validate your recipient address). Verify each address and you should be good to go to send emails.

Setting up production access

At this point you’ll be able to send a couple test emails, but the volume is fairly limited. Request production access and you should be granted the ability to send many more emails and not need to verify recipient email addresses.


Reference

Visit the AWS SDK for PHP reference page for much more information on how AWS SDK for PHP works.

Amazon SES Scripts (Perl)

If you do want to install the Amazon SES Perl scripts and are getting an error along the lines of “Can’t locate XML/LibXML.pm in @INC…”, you can use yum to install all of the necessary packages to get them to run.

yum -y install perl-Digest-SHA perl-URI perl-libwww-perl perl-MIME-tools perl-Crypt-SSLeay perl-XML-LibXML

Thanks to this post for that hint.

(^ Back up)


Ridiculously Simple Observer Pattern with PHP 5.3 and Closures

Posted: March 25th, 2011 | Author: | Filed under: Tutorials | Tags: , | No Comments »

The observer pattern is a handy design pattern often used in UI-focused languages like JavaScript, but not often used in PHP. Prior to PHP 5.3, the observer pattern was cumbersome to use and required a lot of objects – one for every possible observer. However, anonymous functions (Closures) in PHP 5.3 can replace the previously used Observer objects. This makes the observer pattern much more practical by reducing the number of classes that you need to make, often for one-time use.

Let’s write up our basic Observable object. You may want to make this class abstract since it’s probably not useful on its own.

class Observable
{
    /**
     * @var array Map<string eventName, List<Closure observer>> $_observers
     */
    protected $_observers = array();
 
    /**
     * @param string $eventName
     * @param array $data
     */
    protected final function _fireEvent($eventName, array $data = null)
    {
        if (isset($this->_observers[$eventName]))
        {
            foreach ($this->_observers[$eventName] as $observer)
            {
                $observer($data);
            }
        }
    }
 
    /**
     * @param string $eventName
     * @param Closure $observer With parameter (array $data)
     */
    public final function addObserver($eventName, Closure $observer)
    {
        if (!isset($this->_observers[$eventName]))
        {
            $this->_observers[$eventName] = array();
        }
        $this->_observers[$eventName][] = $observer;
    }
 
    /**
     * @param string $eventName
     * @param Closure $observer The observer to remove
     */
    public final function removeObserver($eventName, Closure $observer)
    {
        if (isset($this->_observers[$eventName]))
        {
            foreach ($this->_observers[$eventName] as $key => $existingObserver)
            {
                if ($existingObserver === $observer)
                {
                    unset($this->_observers[$eventName][$key]);
                }
            }
        }
    }
 
}

Now that we have the basic observable class, let’s look at a simple example:

class Person extends Observable
{
    protected $_name;
    protected $_friends = array();
 
    public function __construct($name)
    {
        $this->_name = $name;
    }
 
    public function getName()
    {
        return $this->_name;
    }
 
    public function getIntroducedTo(Person $person)
    {
        $this->_friends[] = $person;
        $this->_fireEvent('introduced', array('other' => $person, 'me' => $this));
    }
 
}
 
$sally = new Person('Sally');
$sally->addObserver('introduced', function(array $data)
    {
        echo 'Hi, ' . $data['other']->getName() . ', my name is ' .
            $data['me']->getName() . '.';
    });
 
$sally->getIntroducedTo(new Person('Harry'));

The output will be "Hi, Harry, my name is Sally.". Note that you can easy modify the greeting Sally uses from outside that Person object without changing the functionality of adding the friend.

And there you have it – the observer pattern and an example, in less than 75 lines of code!


Getting PHP and MySQL running on Amazon EC2

Posted: March 17th, 2011 | Author: | Filed under: AWS, Hosting, Tutorials | Tags: , , , , , | 29 Comments »

Do you know nothing about Amazon Web Services (AWS) or Linux server administration, but want to get a PHP/MySQL server set up on AWS? I was once like you, relying upon my web host to have PHP and MySQL installed and configured, so it was a bit daunting initially to work with AWS, but it’s actually rather simple. Read on and I’ll show you how to set up PHP and MySQL on one of Amazon’s free servers step by step. You can have a functioning site up and running within half an hour.

Amazon Web Services

First things first: Amazon Web Services has a ton of different products, but the one you want is Amazon Elastic Compute Cloud (EC2). Go there, and click “Sign Up for Amazon EC2″.

Once you’ve gotten that set up, go to the AWS Management Console, and click on “Instances” in the Navigation panel. An Instance is just a virtual server – so let’s create one! Click “Launch Instance” under “My Instances”, and select “Basic 64-bit Amazon Linux AMI”. On the Instance Details phase, select “Micro” (which is Free tier eligible). Continue until you need to enter the “Name” – if you don’t know what else to call it, just use “Web/DB server”.

Next you create a Key Pair – this will be the credentials you’ll use to SSH into the box. The instructions here should be fairly straightforward. Next is the Security Group, which will be used to specify the firewall used for your instance. Feel free to use the default Group for now. Continue to the Review phase and launch it!

You should now be able to SSH into your instance using your .pem file with ssh -i [FILE NAME].pem ec2-user@ec2-[IP ADDRESS].compute-1.amazonaws.com. Alright, we’ve got a server up and running! However, you may notice that this server has very little installed on it. which php? Nothing. which mysql? The same. Let’s install some software.

Configuring the Linux Server

Below I’ll show you how to set up PHP and MySQL on the server. I’ve separated PHP and MySQL so that it’s easier to adapt this to having two instances.

PHP

First, the basics for PHP:

sudo yum install php-mysql php php-xml php-mcrypt php-mbstring php-cli mysql httpd

Press ‘y’ for each of the prompts that shows up. Note that you’re logged in as ec2-user, so you need to sudo all of these commands.

You should now be able to create and run a PHP test file. Next, let’s get MySQL up and running.

MySQL server

First, install and begin running the server:

sudo yum install mysql-server
sudo /etc/init.d/mysqld start

Next, set the root password. I’ve found this password generator to be just dandy.

mysqladmin -u root password '[PASSWORD]'

Now we set up two users for MySQL: the administrator, which you’ll use to create and modify tables; and the app user, which the app will use to query the DB (with more limited privileges). Log into MySQL as root (mysql -u root -p) and enter each of the following lines:

CREATE DATABASE [DB NAME];
CREATE USER '[DB NAME]_admin'@'localhost' IDENTIFIED BY '[ADMIN PASSWORD]';
GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, ALTER ON [DB NAME].* TO '[DB NAME]_admin'@'localhost';
CREATE USER '[DB NAME]_user'@'localhost' IDENTIFIED BY '[USER PASSWORD]';
GRANT SELECT, INSERT, UPDATE, DELETE ON [DB NAME].* TO '[DB NAME]_user'@'localhost';

You may want to fine-tune your database settings further than this, but this is a good start.

Make it web-accessible

We now have PHP and MySQL running on the box, but cannot access it through a browser. We need to configure the web server and set up an Elastic IP.

Web Server

First, let’s create a test PHP file that will be accessed by the browser. Create directories so that you can put your file in /opt/app/current1. Make an index.php file that contains whatever you want.

If you want to FTP transfer files to your server, you’ll want to give the ec2-user permissions to modify files in your web directory:

sudo chown ec2-user /opt/app/current

To set up the web server, httpd, we need to first modify its configuration file, located at /etc/httpd/conf/httpd.conf. Open it up with vim, emacs, or your favorite text editor, and go to the bottom of the file. Here you’ll see a small section on the VirtualHost (between <VirtualHost *:80> and </VirtualHost>). Uncomment it out and set DocumentRoot to /opt/app/current. Restart (or start) httpd:

sudo /etc/init.d/httpd restart

Elastic IP and Security Groups

In the AWS Management Console, click on “Elastic IPs”, then “Allocate New Address” under “Addresses”. Once the address is created, click on it, then “Associate Address”. Select the instance and associate it.

Now click on “Security Groups” in the Navigation panel. Select the Security Group that you used for the instance (probably the default one). Under the “Inbound” tab, add an HTTP rule (port 80). Click “Apply Rule Changes”, and you should now be able to access your website! In your browser, go to http://ec2-[IP ADDRESS].compute-1.amazonaws.com/, where the IP address is the Elastic IP you made with periods replaced with hyphens.

Hello World! or Putting it all together

We now have all the pieces we need to access MySQL from PHP and serve that to the browser-accessible website. So let’s log into mysql and create a sample table:

mysql -u [DB NAME]_admin -p
[type password]
mysql> use [DB NAME];
mysql> CREATE TABLE test (message VARCHAR(255));
mysql> INSERT INTO test (message) VALUES ('Hello world!');

Now modify your index.php file (/opt/app/current/index.php) to be the following:

<?php
$conn =  new mysqli('localhost', '[DB NAME]_user', '[USER PASSWORD]', '[DB NAME]');
$result = $conn->query("SELECT message FROM test;");
$row = $result->fetch_assoc();
echo $row['message'];
?>

We now have a fully functioning PHP and MySQL server!

Taking it further

That’s it for the basics, but there’s so much more that you can do now.

PHPUnit

sudo pear upgrade
sudo yum install php-pear
sudo pear channel-discover pear.phpunit.de
sudo pear channel-discover pear.symfony-project.com
sudo pear channel-discover components.ez.no
sudo pear install phpunit/PHPUnit

phpMyAdmin

I’ve found it handy to set up an administration area for my sites using a different port on the same URL. Note that port 80 is the default for web traffic, but 8080 is also commonly used.

Create /opt/app/admin. Then, in httpd.conf, add the line Listen 8080 after Listen 80 and add another VirtualHost entry, using <VirtualHost *:8080> and pointing to the /opt/app/admin directory. Update your Security Group to allow traffic over port 8080. Make sure to restart Apache and you should now be able to access your admin folder through your browser at yourdomain.com:8080.

You can then download phpMyAdmin into /opt/app/admin/pma and unzip it. Using the [DB NAME]_admin user, you can now manage your databases there through your browser.

Using two Instances

It can be very beneficial to performance to separate the web server and the DB server. To do this, you’ll need to set up two Instances, one of which has the web server httpd running and an Elastic IP, and the other of which has the MySQL server mysqld running.

They can use the same Security Group, but you’ll have to add the MySQL rule (port 3066) for Inbound traffic to allow the servers to talk to each other.

On the web box, instead of using “localhost” as the MySQL host, use the Internal IP address of the MySQL box. On the DB box, set up your grant permissions to allow from anywhere in '%.ec2.internal' (or just from your IPs).

Notes

  1. /opt/app/current is a Rails convention that I enjoy. What you should do is put your releases in /opt/app/releases/[release #], then have /opt/app/current be a symlink to the current release.

    Another (much more) common standard is to put web-accessible code in /var/www/html. Feel free to put your HTML code wherever you want; just make sure to update httpd.conf appropriately (and restart Apache).

    ^ Back up


Thanks to Ryan Ausanka-Crues at Palomino Labs for help with this.


How to fully maximize Chrome windows in OS X

Posted: July 1st, 2010 | Author: | Filed under: Browsers | Tags: , , | No Comments »

(Edit 2013/06/28) Quick version: press Shift when you click the green “Maximize” button.

(Full version) I, along with many of my friends and coworkers, have run into issues with maximizing Google Chrome windows on Mac OS X. It seems that, when clicking on the green button in the title bar, that the window resizes vertically, but not horizontally. And when you’re actually trying to get the window to fill up the full screen, this is highly unfortunate.

What’s going on

The green button is not the Maximize button in OS X; it’s the Zoom button. As described in Chrome’s issues tracker, Chrome’s intended behavior for this button is to “size-to-best-fit, not fill-screen”. Therefore, it will only make the window wider if there is some visible content on the page that’s wider than the window is currently, i.e., if there’s a horizontal scrollbar. This page shows it quite wonderfully: there’s a 65k-pixel-wide div at the bottom that will make the window expand to the full screen size when you Zoom. Hide the div, and the window will only expand vertically.

How to make Zoom fill the full screen

Despite this somewhat unexpected default behavior, there are a couple ways to fill the screen anyway when Zooming:

  • Press Shift when you click on the Zoom button. This, as implemented by Chrome, will make the window fill the full screen.
  • This one has some subtlety: Under System Preferences->Keyboard->Keyboard Shortcuts->Application Shortcuts, add a shortcut for Chrome. Make the Menu Title be “Zoom” (it’s under the Window menu). Since Command+M is the keyboard shortcut for Minimize, let’s make the Keyboard Shortcut for Zoom be Command+Shift+M. Go back to Chrome and try it out your keyboard shortcut. Voila! But wait–if you just click on Zoom in the menu, it will continue to resize only vertically. What’s going on here? Because you’re pressing Shift while activating Zoom with the keyboard shortcut, it works just like pressing Shift while clicking the green button. No joke.

Versions and such:

  • Chrome version: 5.0.375.8628.0.1500.52 (/always)
  • OS X version: 10.6.4 (/all)

How to recover deleted JavaScript files using the cache in Chrome or Firefox

Posted: May 19th, 2010 | Author: | Filed under: JavaScript | Tags: , , , , | 1 Comment »

So you’ve been writing some JavaScript files (or HTML files or images) and testing them in Google Chrome or Mozilla Firefox. Unfortunately, you’ve deleted the JavaScript files, forgot to check them into the repository, didn’t make a backup, and they’re gone. Don’t fear! There is still hope!

Your missing file is stored in your browser’s cache, which can be accessed by typing about:cache into the URL box in Chrome or about:cache?device=disk in Firefox. This will list links to every file in your cache (and thus may take a moment to open). Individual files can be accessed by going to chrome://net-internals/view-cache/[[URL]] in Chrome or about:cache-entry?client=HTTP&sb=1&key=[[URL]] in Firefox.

The cached version will look like the following in Chrome:

[[URL]]

HTTP/1.1 200 OK Date: Wed, 19 May 2010 15:17:36 GMT Server: Apache Last-Modified: Wed, 19 May 2010 15:13:53 GMT ETag: "18aa1e6-1761-486f3e7ebae40" Accept-Ranges: bytes Content-Length: 5985 Content-Type: application/javascript
00000000: fb 00 00 00 01 00 00 00 34 62 bc 3d 8a e5 2d 00 ........4b.=..-. 00000010: ef 1a be 3d 8a e5 2d 00 e3 00 00 00 48 54 54 50 ...=..-.....HTTP 00000020: 2f 31 2e 31 20 32 30 30 20 4f 4b 00 44 61 74 65 /1.1 200 OK.Date 00000030: 3a 20 57 65 64 2c 20 31 39 20 4d 61 79 20 32 30 : Wed, 19 May 20 00000040: 31 30 20 31 35 3a 31 37 3a 33 36 20 47 4d 54 00 10 15:17:36 GMT. 00000050: 53 65 72 76 65 72 3a 20 41 70 61 63 68 65 00 4c Server: Apache.L 00000060: 61 73 74 2d 4d 6f 64 69 66 69 65 64 3a 20 57 65 ast-Modified: We 00000070: 64 2c 20 31 39 20 4d 61 79 20 32 30 31 30 20 31 d, 19 May 2010 1 00000080: 35 3a 31 33 3a 35 33 20 47 4d 54 00 45 54 61 67 5:13:53 GMT.ETag 00000090: 3a 20 22 31 38 61 61 31 65 36 2d 31 37 36 31 2d : "18aa1e6-1761- 000000a0: 34 38 36 66 33 65 37 65 62 61 65 34 30 22 00 41 486f3e7ebae40".A 000000b0: 63 63 65 70 74 2d 52 61 6e 67 65 73 3a 20 62 79 ccept-Ranges: by 000000c0: 74 65 73 00 43 6f 6e 74 65 6e 74 2d 4c 65 6e 67 tes.Content-Leng 000000d0: 74 68 3a 20 35 39 38 35 00 43 6f 6e 74 65 6e 74 th: 5985.Content 000000e0: 2d 54 79 70 65 3a 20 61 70 70 6c 69 63 61 74 69 -Type: applicati 000000f0: 6f 6e 2f 6a 61 76 61 73 63 72 69 70 74 00 00 on/javascript..
00000000: 2f 2a 2a 0a 20 2a 20 40 61 75 74 68 6f 72 20 41 /**. * @author A 00000010: 6c 65 78 20 4b 6f 72 6e 20 3c 61 6c 65 78 6b 6f lex Korn <alexko
...

The last section (in both browsers) displays the hexadecimal representation and the text of your file. The last column of this section displays newline characters as periods in many text editors, so we can’t use it to recover the file. We’ll have to use the hexadecimal. I wrote the following PHP script, which will extract and convert the hexadecimal representation into the text version:

<?php
$cacheString = ''; // COPY THE LAST SECTION IN HERE (Use /'/\\\'/g to escape quotation marks)

$matches = array();
preg_match_all('/\s[0-9a-f]{2}\s/', $cacheString, $matches);
foreach ($matches[0] as $match)
{
    echo chr(hexdec($match));
}
?>

Voilà! Your once-lost code is now found!


A couple notes:

  • If your server has compression enabled (such as gzip), you’ll have to go through the extra step of unzipping the content
  • Chrome version: 5.0.375.38 beta
  • Firefox version: 3.6.3

Update 2011/05/14: I changed the \b’s in the RegEx to \s’s as recommended in this post on recovering gzipped files.