Jan 20, 2015

Building a simple LAMP Application using Docker

As part of our team's effort to familiarize ourselves with Docker we wanted to start with a simple and well understood use case. To do so, we decided to build a small LAMP application on Fedora 21 Server (not F21 Workstation and not F21 Cloud). This helped us get to grips with the platform and in the event that it might help others, I thought it was worth sharing. Thanks to Erin Boyd, Jeff Vance, Scott Creeley and Brad Childs for their help in putting this together.

The application can be built in 4 relatively simple steps:

1) Set up your Docker Host environment
2) Launch a PHP Container
3) Launch a MySQL Container and create a Database
4) Write a simple PHP app that queries the database.

1. Setting up your environment

This tutorial offers a straightforward guide to setting up Fedora 21 (F21) as your Docker Host.

2. Configuring and launching the PHP Container

In order to serve the PHP page, we'll need a Docker image that provides a PHP-enabled web server. We can get this image from the Docker Hub by simply typing "docker pull php". You can then type "docker images" and you should see the php image in your list of local images.

One of the first things you need to understand about Docker is the relationship between containers and your data. The best practice is to store the data outside of your container and then mount the external store onto a directory within the container. This model allows you to have the PHP runtime inside a container but the actual PHP pages being served from an external location, such as a directory on the Host (F21) filesystem. 

To do this, you must select an EXISTING directory on F21 to host the PHP pages. In our case, we selected /opt/php/. Then you can simply launch a PHP container from the image and mount /opt/php onto /var/www/html using the -v parameter as follows: 

docker run -d --name RandomAppName -v /opt/php:/var/www/html/ php:5.6-apache

The next step is to create a simple PHP file in /opt/php and verify that the page is being served by the container. When you launched the container, Docker returned a ContainerID. To obtain the IP address of your PHP container, run:

docker inspect [ContainerID] | grep "IPAddress"

Next, launch a browser on F21 and use the IP of the PHP container in the URL.

You should now be serving web pages! However, there's one more wrinkle: the PHP image we're using does not contain any MySQL client libraries. To resolve this, we'll create a layered image that extends the PHP image with the MySQL client libraries.

1. Stop the old container: 
docker stop [ContainerID]
2. Create a new directory, such as /opt/mynewphpimage
3. Create a new file in the directory called "Dockerfile" and add the contents below:
FROM php:5.6-apache
RUN docker-php-ext-install mysqli
4. Generate the image called "php-mysql" using the following command:
docker build -t php-mysql .

Finally, launch a new container from your new image, mounted to the same directory:
docker run -it --rm --name my-apache-php-app -v /opt/php:/var/www/html/ php-mysql

3. Configuring and launching the MySQL Container

Now that we have our PHP container set up, the next step is to set up the MySQL container, which means we'll need a MySQL Docker image. I used the tutum/mysql image, which you can install by running "docker pull tutum/mysql". You can then run "docker images" on your F21 Host and you should see the tutum/mysql image in your list of local images.

MySQL will be set up the same way as the PHP Container was in that the actual database files will be persisted to /opt/mysql on the Fedora 21 Host and the container will only contain the MySQL runtime. This blog post describes how to do that, as well as create a small database using the container. You can verify the database files are in /opt/mysql after it has been created.

Lastly, you'll need to create a user in MySQL that has access to the appropriate database and tables, so that the PHP MySQLi client can use it to connect to them. I did this using the mysql command within the MySQL container as demonstrated below:

1. docker exec -it [ContainerIDOfMySQLContainer] /bin/bash
2. mysql
3. mysql> CREATE USER 'myUserName'@'%' IDENTIFIED BY 'myPassword';
4. mysql> GRANT ALL PRIVILEGES ON us_states.* TO 'myUserName'@'%';

4. Write a simple PHP App that queries the database

Now that the runtimes are ready, we're going to write our application. To do so:

1. Create a "/opt/php/db.php" file on your F21 Host and place the following contents within it.

<html><head><title>PHP Test</title></head><body>
<?php
 $mysqli = new mysqli("[MySQLContainerIP]", "myUserName", "myPassword", "us_states");
 if ($mysqli->connect_errno) {
    echo "Failed to connect to MySQL: (" . $mysqli->connect_errno . ") " . $mysqli->connect_error;
 }

 if ($result = $mysqli->query("SELECT * FROM states")) {
    while ($row = $result->fetch_assoc()) {
       echo " State: " . $row['state'];
       echo " Population: " . $row['population'];
    }
 } else {
    echo "No result";
 }
?>
</body></html>
2. Launch a browser on your F21 Host and point the URL at your db.php. The page should now display the rows from the states table.

Jan 19, 2015

Creating and Persisting Databases outside your container using the Tutum/MySQL Docker Image

I'm presently hacking on Docker and I ran into a few issues getting the Tutum/MySQL Docker Image working on Fedora 21. This overview assumes that you've already gone through the basics to get Docker running on F21.

The issue I'm having is that I want to launch the MySQL container but persist the actual data (i.e. the System and Application Database) elsewhere. To do this we have to pass in a -v argument to the Docker run statement, which mounts your alternative host path to the /var/lib/mysql location in the container.

To do this, run the following command, which mounts /opt/mysql on the F21 host to /var/lib/mysql (where MySQL stores its databases) in the container:

[root@localhost mysql]# docker run -d -v /opt/mysql:/var/lib/mysql tutum/mysql /bin/bash -c "/usr/bin/mysql_install_db"

However, if you check the logs, you'll notice that the attempt failed:

[root@localhost mysql]# docker logs 3224ede26db55241b320655da14bbf2737212fb3889b871ff5aa03307edf9ac8
chown: changing ownership of '/var/lib/mysql': Permission denied
Cannot change ownership of the database directories to the 'mysql'
user.  Check that you have the necessary permissions and try again.

To resolve this issue, you need to attach a --privileged directive to the docker run command as follows:

[root@localhost opt]# docker run --privileged -d -v /opt/mysql:/var/lib/mysql tutum/mysql /bin/bash -c "/usr/bin/mysql_install_db"

This will run the appropriate script in the container to create the system database, store it on your host machine, and then exit and release the container. The next step is to relaunch the image mounted to the same directory:

[root@localhost opt]# docker run --privileged -d -p 3306:3306 -v /opt/mysql:/var/lib/mysql tutum/mysql

This should work. You can shell into the container and run the necessary SQL scripts or MySQL admin commands to create any additional databases you wish. All databases and their modifications will be persisted outside of the container, in the mounted directory that was specified. For example:

[root@localhost opt]# docker exec -it aa9b0ddf6049 bash
root@aa9b0ddf6049:/# mysql

mysql> CREATE DATABASE us_states;
mysql> USE us_states;
mysql> CREATE TABLE states (id INT AUTO_INCREMENT PRIMARY KEY, state VARCHAR(64), population VARCHAR(16));
mysql> INSERT INTO states (id, state, population) VALUES (NULL, 'Alabama', '4822023');

root@aa9b0ddf6049:/# exit

[root@localhost opt]# docker stop [containerID]

In some cases I have gotten an error when trying to start the MySQL Docker image that states:

Cannot start container [container id]: (exit status 1)  

I find that I can resolve this by simply restarting the Docker service and trying again:

systemctl restart docker

Jan 28, 2013

Leveraging CapMetro Rail for SXSW Hotels and Commutes

It's inevitable that each year, all the hotels sell out within a certain radius of the SXSW location (the Austin Convention Center). If you run into this problem, here's a little hack that might save the day.

Austin recently added a nice two-car light rail line known as the Capital Metro Rail. It just so happens to stop right outside the Austin Convention Center and also has a greatly extended timetable during SXSW. There are nice (Westin) and decent (La Quinta) hotels within walking distance of several of the stops on the line.

This is a list of the stops on the line. Punch the address for a station into Google Maps and search nearby for hotels. For example, the Kramer station stops right next to The Domain complex, which has several hotels within walking distance.


Aug 14, 2012

Building an Iron Man Suit - Part 2

Given that I've had a ton of hits on the rather insubstantial original post, I thought I'd post an update on what I've learned so far, to save other folks the time I've spent figuring out how to approach this. Note: This post provides all the resources you need to build a full replica fiberglass Mark III Iron Man suit.

As I mentioned in the previous post, there is an existing Iron Man suit builders community called Stark Industries Weapons, Data and Armor Technology (SIWDAT). The site is largely driven by a gentleman named TMP (Timeless Movie Prop), who is a professional sculptor and prop builder. It's a great site with a good forum where one can see some of the existing suits folks have built, but unfortunately it lost a lot of its content in a forum crash in 2011, and users are slowly re-adding it. I personally didn't find it super useful for learning how to build suits, other than discovering what people use for full-size blueprints of the various Iron Man suits (War Machine, Mark I, Mark III, etc.): a tool called Pepakura.

Pepakura is essentially a 3D model broken into a number of printable parts: you print out the parts, and it tells you how to attach the tabs of the various parts together to recreate the 3D model. The cliff-notes version is that you print the templates on cardstock paper, cut the templates out with an X-Acto knife, glue the tabs on the templates together to build the model, resin the model, glue on fiberglass cloth, resin the cloth, apply a finishing product, sand it and paint it, and you end up with an amazing replica Iron Man suit.

Pepakura is actually pretty simple. It uses free software, and you just need to find the templates, which are PDO files, and figure out what to do with them. If you watch all seven of the Pepakura tutorials below, you'll learn everything you need to see how Pepakura is used to build an Iron Man suit from start to finish. The link to the next tutorial appears at the end of each video. Plus, Stealth, their creator, is pretty funny.

Another gentleman, who goes by the handle Dancin_Fools, has one of the most popular and detailed Mark III Pepakura suit designs, built using 3D Studio Max. This is the link to the thread where you can see some of the models and the outcome. You can download the template (PDO) files directly here.

I'm actually attempting to build an aluminum suit since I'm having some fun exploring how far my son and I can get trying to build a real suit and not a costume. Aluminum is a much harder medium to work with than paper, so at this point, I've decided to first go/think through the process with paper to make sure I have the scale correctly identified. I'm currently working on the chest piece from the PDO files provided above.

Do not underestimate the huge amount of time it takes to cut the Pepakura templates out using the X-Acto knife; it is taking me days (in my spare time) to cut out just the chest pieces. Someone should start a business selling pre-cut templates at a scale specified by the customer.

I was pretty encouraged to find that other people have actually built aluminum suits. I have decided that welding is going to really complicate things, so my current plan has changed to using steel rivets to join the pieces. Check out the really cool fully aluminum Mark VI suit below. The builder's thread for it is available here. Granted, there's not a whole lot out there that I've found (yet) on how to build aluminum suits. I'll post more as I find it.

UPDATE: Samurai169 over at RPF is building an awesome welded 20-gauge steel suit.


The next step is to actually build your Arc Reactor replica. I've at least got this bit accomplished already. To begin with, it helps to have a refresher of what an actual Mark I Arc Reactor looks like.

Now, unless you've got a 3-inch hole in your solar plexus, or you've already made the suit and it sits an inch or two off of your chest, you're not going to be able to wear a true replica. Instructables has a great section on how to build your own Iron Man Arc Reactor. I got all that I needed from RadioShack, Hobby Lobby and Lowes. The images below are some examples of Arc Reactors that can be built.


Lastly, if you want to start wiring the suit for functionality such as automating the face plate opening and closing, then the XRobots web site has a lot of tutorials and likely what you're looking for.

Also, don't forget about the Arduino platform, as it provides a basis to start building out the suit's motorized functions.

Another option is to use a 3D printer, as it is possible to print some of the components in their entirety. For example, MakerBot has a 3D printer that can print components the size of a loaf of bread. Some folks are using it to print entire helmets.

Happy Making!

Jul 12, 2012

Building an Iron Man Suit

This is the first post. My 7-year-old son and I are going to try and see how far we get. I'm hoping the project will be fun, that it will be some good bonding time, and that we'll learn a little bit about mechanics and science along the way. I plan to blog our progress as it happens in case anyone else is interested in attempting the same thing.

Our approach (As of today) 

I can program, so I'm not too worried about the sensors and display. I am going to take some old iPhones and hack them to add some functionality to the suit. The hard part is going to be building the actual suit. I want to do it out of sheet metal; I'm not sure if this will be too heavy. This also involves learning how to weld. I've been reading this blog post on Instructables on how to MIG weld. For the record, conducting current to produce extreme heat through the use of a gas, based on a tutorial I read off the internet, scares the hell out of me.

So the base design will be welded sheet-metal cut-outs to produce the core suit (we haven't figured out how we'll pull off the helmet), with carefully placed halogen LEDs to produce the Arc Reactor. The plan is to then add to it once (and if) we pull off the core suit. I'm gonna build the suit for my son and not me, which is tricky because he's growing like a weed. However, I'm sure it will look better than a suit with me in it, aka an Iron Man suit with a rather pronounced mid-section.

Update: There is an Iron Man Suit builders community!

We're using foamboard and the model below to build our prototype. The foamboard is key to help us get the scale right before we move to sheet metal.

May 3, 2012

Art, Engineering and the Digital Afterlife

I love storytellers. Their ability to envision the future is amazing. While being artists, in a lot of ways they are visionary scientists and engineers as well. How often do we see some concept imagined in a comic, book or movie by an artist, only to see it eventually made real shortly thereafter by a scientist or engineer. These artists inspire the makers. For instance, take the concepts envisioned within the Iron Man franchise, which inspired one guy to make his own Iron Man Suit (low-tech but freaking awesome) and the US Military to build the real thing (which could definitely use a coat of hot-rod red).


With that said, I've been noodling a little on the thoughts behind the Battlestar Galactica prequel "Caprica". The premise is that one could create a digital afterlife where a soul could be reanimated provided enough data about the original human is preserved. In the show, the premise is powerful enough to disrupt contemporary religion.

The basic engineering concepts to make this a reality, are divided into two areas:
  1. A device that humans wear on their head which allows them to enter incredibly realistic digital three-dimensional worlds which they navigate as avatars. These worlds are limited by the fact that each avatar is directed by a real human in a real world outside the digital one. Think of it as an uber-realistic Second Life where you control your avatar with your mind.

  2. Technology that extends the previous concept to allow one to create autonomous avatars and inject them into these worlds. The avatar's behaviour originates from rules divined from data about the original human that it reflects. In other words, if it's your data, this avatar is you... except that it's an autonomous copy.

So the 2nd invention is the bit that captivated my imagination. Presuming that, at the point of someone's death (with something like a Zoe chip from The Final Cut), one could access data such as their entire purchase history, every word they ever spoke or wrote, a three-dimensional rendering of them and every action they ever undertook, could we create an avatar that would behave the same way they did and had the same memories? I think we could.


Dec 21, 2011

Part 2: Classifying and Quantifying Historical Private Equity Investments

This is part 2 of a series of posts. My previous post describes how to obtain the data you need for what is described in this section.

Identifying and Extracting the entities within the semi-structured data

At this point, you have all the data you need, but it is in semi-structured (HTML) format and still not queryable. We need to extract the entities we require for our queries and store them in structured form so that we can start analyzing the data. So, what is the quickest way to do this?

The first step is to find consistent markers in the HTML you can use for pattern matching to identify each entity. We need to know where each entity begins and ends so we can extract the information. Most websites these days use a Content Management System (CMS) to deliver their content in HTML, which means they use templates and therefore have consistent markers in the HTML that don't change from page to page. If you are a Mozilla Firefox fan, you can use Firebug's inspect option to click on the text (such as the company name) in the browser, and it will take you directly to the corresponding markup in the HTML source. Google's Chrome browser has a similar function, accessible by right-clicking on the page and selecting "Inspect element". This is WAY faster than searching through the source. Keep in mind, though, that the HTML pulled down in the crawl might actually be different from what is displayed through an inspect function. This can happen when the developer has chosen to dynamically manipulate the browser's Document Object Model (DOM), often on page load; i.e. sometimes the entity markers are in the script.

Identifying the Markers for the Company Name

You might have guessed it by now: this is effectively screen scraping. Before we start doing this at scale using Map/Reduce, you should first write a little Java POJO that can handle the extraction for a single Crunchbase Company page. I like to do this by dropping the entire page source into a constructor that privately calls a set of methods to extract the data, and then makes the normalized structured entities (Company, Street Address, ZipCode, City, State, etc.) available via getter methods. You then write a static main method where you can pass in URLs of different Company pages to test how well your extraction techniques are working. Once you've run this through a decent set of sample pages and you are comfortable your extractors are working consistently, you can move on to doing this with Map/Reduce.
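The shape of such a POJO can be sketched as follows. This is a hypothetical reconstruction, not my original class: the marker strings and field set are placeholders, and the real markers must be read out of the actual Crunchbase page source with an inspector.

```java
// Hypothetical sketch of the extraction POJO described above. The marker
// strings (e.g. "<h1 class=\"entity-name\">") are illustrative placeholders.
public class CompanyPage {

    private final String source;
    private final String companyName;
    private final String city;

    // The whole page source goes into the constructor, which privately
    // calls one extractor per entity.
    public CompanyPage(String source) {
        this.source = source;
        this.companyName = extractBetween("<h1 class=\"entity-name\">", "</h1>");
        this.city = extractBetween("<span class=\"city\">", "</span>");
    }

    // Generic marker-based extractor: return the text between a begin and
    // an end marker, or null if either marker is missing.
    private String extractBetween(String begin, String end) {
        int start = source.indexOf(begin);
        if (start < 0) return null;
        start += begin.length();
        int stop = source.indexOf(end, start);
        if (stop < 0) return null;
        return source.substring(start, stop).trim();
    }

    public String getCompanyName() { return companyName; }
    public String getCity() { return city; }

    // Static main used to spot-check the extractors against sample pages.
    public static void main(String[] args) {
        CompanyPage page = new CompanyPage(
            "<h1 class=\"entity-name\">Acme Corp</h1><span class=\"city\">Austin</span>");
        System.out.println(page.getCompanyName() + " / " + page.getCity()); // Acme Corp / Austin
    }
}
```

The null returns make it easy to spot, while testing against sample pages, which markers are wrong or have changed.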

M/R Job Reading from Nutch Segments
If you're new to Hadoop and Map/Reduce, I suggest you take a Cloudera or Hortonworks training class or read Tom White's outstanding "Hadoop: The Definitive Guide". The short version is that Map/Reduce is a two-phase framework (with the Reduce phase being optional) that processes records out of a block of data, one record at a time. Each record is passed as a key and a value to the Mapper to be processed. The framework is designed to handle data of arbitrary format, so for each Hadoop Job you need to specify a Reader that knows how to parse out the records contained within the block of data.

I created a very simple Map/Reduce Job for this. The key configuration property of the job is the InputFormatter (Record Reader), which tells the job to use a class that knows how to read Sequence Files. Nutch stores all the web pages in a given crawl depth/segment as a sequence of Content objects (one Content object per web page) inside a Sequence File. The record reader passes a Content object to the Mapper as the value for each record (we ignore the key). Inside the Map, we are then free to do whatever we want in processing the web page. In this example I drop it into my Crunchbase Company POJO's constructor and then write out the name of the company and the sector the company belongs to.
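A reconstructed sketch of that job wiring is below. This is not my exact code: CompanyMapper and the CrunchbaseCompany POJO are illustrative names, and the block assumes the Hadoop and Nutch libraries are on the classpath.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.nutch.protocol.Content;

public class CompanyExtractJob {

    // Each record's value is a Nutch Content object holding one crawled page.
    public static class CompanyMapper extends Mapper<Text, Content, Text, Text> {
        @Override
        protected void map(Text url, Content page, Context ctx)
                throws IOException, InterruptedException {
            String html = new String(page.getContent(), StandardCharsets.UTF_8);
            // Drop the page into the extraction POJO described earlier
            // (hypothetical class name) and write out name and sector.
            CrunchbaseCompany c = new CrunchbaseCompany(html);
            ctx.write(new Text(c.getName()), new Text(c.getSector()));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "crunchbase-extract");
        job.setJarByClass(CompanyExtractJob.class);

        // The key configuration property: a reader that knows how to parse
        // the Nutch segment's Sequence File into (key, Content) records.
        job.setInputFormatClass(SequenceFileInputFormat.class);

        job.setMapperClass(CompanyMapper.class);
        job.setNumReduceTasks(0); // Map-only job
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));  // segment content dir
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```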

In the full example I don't just write out those two properties for each Company, but rather a tab delimited record that looks like the following:
Company Address City State ZipCode Sector Investor FundingRound Amount Month Day Year
As you can imagine, most companies have multiple rounds of funding, and therefore each company has a unique record for each round. Having the data broken out like this allows one to Group By a variety of factors and SUM(Amount). This is all we need to quantify the disbursement of funds for a given factor and analyze it over a given time dimension. Once the Hadoop Job is complete and all the data has been extracted and normalized for each Company, we are ready to start answering some of the questions we have around the Tech Bubble. I'll cover this in my next post.
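Once the job has emitted those tab-delimited records, the Group By and SUM(Amount) step is straightforward. A minimal sketch in plain Java, outside Hadoop, with the column positions taken from the record layout above and made-up sample figures:

```java
import java.util.HashMap;
import java.util.Map;

public class FundingRollup {
    // Columns of the tab-delimited record emitted by the Hadoop job:
    // Company Address City State ZipCode Sector Investor FundingRound Amount Month Day Year
    static final int SECTOR = 5, AMOUNT = 8;

    // Group records by Sector and SUM(Amount).
    public static Map<String, Long> sumBySector(String[] records) {
        Map<String, Long> totals = new HashMap<>();
        for (String record : records) {
            String[] f = record.split("\t");
            totals.merge(f[SECTOR], Long.parseLong(f[AMOUNT]), Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Two funding rounds for one hypothetical company.
        String[] sample = {
            "Acme\t123 Main\tAustin\tTX\t78701\tweb\tBigVC\ta\t5000000\t1\t15\t2011",
            "Acme\t123 Main\tAustin\tTX\t78701\tweb\tBigVC\tb\t12000000\t6\t1\t2011"
        };
        System.out.println(sumBySector(sample)); // {web=17000000}
    }
}
```

Swapping SECTOR for any other column index (State, Year, Investor) gives the same rollup along a different dimension.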

I chose to go into detail showing how one can extract data out of HTML for analysis, since so few locations on the web offer a structured data equivalent of what is represented in HTML. Crunchbase, however, actually does: each Company HTML page contains a link to a JSON representation of the Company data. The fastest way to get at the JSON data is to write a Map-only job that reads each Company page and writes out just the URL of the page containing the JSON data. Once this is complete, you have a new seed list that you can crawl to a depth of 1. This will create a new Nutch segment that you can run a new Map job over, this time having a much easier time extracting the pertinent data (using a library like org.json) and writing out the same schema in the same fashion described earlier.
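The link-extraction helper such a Map job might use can be sketched like this. The "api.crunchbase.com" substring and the class name are assumptions for illustration, not Crunchbase's actual markup, which must be checked with an inspector as before.

```java
// Hypothetical helper for the Map-only pass that writes out each Company
// page's JSON link.
public class JsonLinkExtractor {

    // Find the href whose URL contains the assumed JSON-API marker.
    public static String extractJsonLink(String html) {
        String attr = "href=\"";
        int hit = html.indexOf("api.crunchbase.com");
        if (hit < 0) return null;
        // Back up to the enclosing href attribute, then read to the closing quote.
        int start = html.lastIndexOf(attr, hit);
        if (start < 0) return null;
        start += attr.length();
        int end = html.indexOf('"', start);
        return end < 0 ? null : html.substring(start, end);
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://api.crunchbase.com/v/1/company/acme.js\">JSON</a>";
        System.out.println(extractJsonLink(html));
    }
}
```

Emitting one such URL per page from the Mapper yields exactly the depth-1 seed list described above.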