Infrastructure as a Service, like OpenStack and AWS, have made it possible to consume infrastructure on demand. It’s important to understand the ways in which both humans and machines interact with IaaS offerings in order to design optimal systems that leverage all possible automation opportunities. I drew the diagram below to help illustrate. Everything is an API At the heart of IaaS are REST APIs that provide granular access to every resource type, such as compute, storage and network. These APIs provide clarity about which resources are being managed and accommodate the type......
Continue Reading
Someone asked me today whether he should use HEAT or Ansible to automate his OpenStack deployment. My answer is that he should use both! It’s helpful to understand the original design decisions for each tool in order to use each effectively. OpenStack HEAT and Ansible were designed to do different things, although in the opensource tradition, they have been extended to accommodate some overlapping functionalities. Cloud Native In my post on What is Cloud Native, I show the five elements of application life cycle that can be automated in the cloud (image shown......
Continue Reading

I’ve recently had some people ask how I deploy MongoDB. For a while I used their excellent online tool to deploy and monitor my clusters. Unfortunately they changed direction and I couldn’t afford their new tools, so I turned to Ansible. In order more easily share the process, I posted a simple example that you can run locally using Vagrant to deploy MongoDB using Ansible. https://github.com/dwatrous/ansible-mongodb As soon as you finish running the Ansible script, you can immediately connect to MongoDB and start working with Data. If you’re looking to learn more about......
Continue Reading
Today I read an article on the Wall Street Journal about the benefits of taking handwritten notes over taking notes electronically. It’s a great article and I have seen parallels in my own professional work. For years I have trained computer programmers and others in various technology roles. I have observed a similar effect which has led me to make a case for starting with what I call a “blank canvas”. I try to explain that one of the reasons it is ineffective to start with technology, rather than a blank sheet of......
Continue Reading

I hear a lot of people talking about cloud native applications these days. This includes technologists and business managers. I have found that there really is a spectrum of meaning for the term cloud native and that two people rarely mean the same thing when they say cloud native. At one end of the spectrum would be running a traditional workload on a virtual machine. In this scenario the virtual host may have been manually provisioned, manually configured, manually deployed, etc. It’s cloudiness comes from the fact that it’s a virtual machine running......
Continue Reading
All around me architects and business managers are beginning to mandate that internal software applications be built (and sometimes rebuilt) as microservices so that we can reuse them and compose applications more quickly. I do admit that the idea of a curated catalog of well designed microservices is attractive. Contrary to all the buzz that this is a new approach and will produce huge efficiencies, there is a lot of history that points to serious barriers to a microservice economy. Microservices are Not New What is being referred to as microservices today is......
Continue Reading
In a previous post I demonstrated a method to deploy a multi-node Hadoop cluster using Vagrant and Ansible. This post builds on that and shows how to deploy a Hadoop cluster with an arbitrary number of slave nodes in minutes on OpenStack. This process makes use of the OpenStack orchestration layer HEAT to provision the resources, after which Ansible use used to configure those resources. All the scripts to do this yourself is available on github to clone and fork: https://github.com/dwatrous/hadoop-multi-server-ansible I have recorded a video demonstrating the entire process, including scaling the......
Continue Reading
I’ve recently been involved with several groups interested in using Hadoop to process large sets of data, including use of higher level abstractions on top of Hadoop like Pig and Hive. What has surprised me most is that no one is automating their installation of Hadoop. In each case that I’ve observed they start by manually provisioning some servers and then follow a series of tutorials to manually install and configure a cluster. The typical experience seems to take about a week to setup a cluster. There is often a lot of wasted......
Continue Reading
Hopefully it’s obvious that separating configuration from application code is always a good idea. One simple and effective way I’ve found to do this in my python (think bottle, flask, etc.) apps is with a simple JSON configuration file. Choosing JSON makes sense for a few reasons: Easy to read (for humans) Easy to consume (by your application Can be version alongside application code Can be turned into a configuration REST service Here’s a short example of how to do this for a simple python application that uses MongoDB. First the configuration file.......
Continue Reading
The more I automate, the more I have to answer the question of how to manage my secrets. Secrets that frequently come up include: SSH key pairs SSL private keys Credentials for external resources, such as databases and SaaS integrations Before cloud, when server resources were not ephemeral, these could be managed manually when the server was created. In cloud environments, servers are created and destroyed automatically and from minute to minute, which leaves the question about how to manage secrets. The OpenStack community is working on one solution called Barbican. I’ve been......
Continue Reading