Build a Multi-server LEMP stack using Ansible

My objective in this post is to explore the use of Ansible to configure a multi-server LEMP stack. This builds on the preliminary work I did demonstrating how to use Vagrant to create an environment to run Ansible. You can follow this entire example on any Windows (or Linux) host. Ansible only runs on Linux […]

Read more

Using Vagrant to Explore Ansible

Last week I wrote about Vagrant, a fantastic tool to spin up virtual development environments. Today I’m exploring Ansible. Ansible is an open source tool which streamlines certain system administration activities. Unlike Vagrant, which provisions new machines, Ansible takes an already provisioned machine and configures it. This can include installing and configuring software, managing services, […]

Read more

Using Vagrant to build a LEMP stack

I may have just fallen in love with the tool Vagrant. Vagrant makes it possible to quickly create a virtual environment for development. It is different than cloning or snapshots in that it uses minimal base OSes and provides a provisioning mechanism to setup and configure the environment exactly the way you want for development. […]

Read more

Craft vs. Delivery in Software Development

As a software engineer I love well crafted software. Carefully chosen abstractions, effective use of patterns, thorough test coverage, all increase business value. Craft takes time and requires skill and proper tools and resources. Unfortunately, I frequently find myself frustrated that business partners see value differently and care only about delivering a fixed set of […]

Read more

Load Testing with Locust.io

I’ve recently done some load testing using Locust.io. The setup was more complicated than other tools and I didn’t feel like it was well documented on their site. Here’s how I got Locust.io running on two different Linux platforms. Locust.io on RedHal Enterprise Linux (RHEL) Naturally, these instructions will work on CentOS too. sudo yum […]

Read more

Load Testing Alternatives for Large Scale Web Applications

Load testing web applications is a big deal in a day of web scale traffic. There are countless ways to get traffic to a website, and when one of them goes right (like a slashdot or viral content), it can produce an enormous load in a very short time. Building and testing large scale software […]

Read more

Detecting Credit Card Fraud – Frequency Algorithm

About 13 years ago I created my first integration with Authorize.net for a client who wanted to accept credit card payments directly on his website. The internet has changed a lot since then and the frequency of fraud attempts has increased. One credit card fraud signature I identified while reviewing my server logs for one […]

Read more

Analyze Tomcat Logs using PIG (hadoop)

In a previous post I illustrated the use of Hadoop to analyze Apache Tomcat log files (catalina.out). Below I perform the same Tomcat log analysis using PIG. The motivation behind PIG is the ability us a descriptive language to analyze large sets of data rather than writing code to process it, using Java or Python […]

Read more

Hadoop Scripts in Python

I read that Hadoop supports scripts written in various languages other than Java, such as Python. Since I’m a fan of python, I wanted to prove this out. It was my good fortune to find an excellent post by Michael Noll that walked me through the entire process of scripting in Python for Hadoop. It’s […]

Read more

Hadoop Takes Compressed Files (gzip, bzip2) as Direct Input

The log rotation mechanism on my servers automatically compresses (gzip) the rotated log file to save on disk space. I discovered that Hadoop is already designed to deal with compressed files using gzip, bzip2 and LZO out of the box. This means that no additional work is required in the Mapper class to decompress. Here’s […]

Read more