I’ve recently been involved with several groups interested in using Hadoop to process large sets of data, including the use of higher-level abstractions on top of Hadoop like Pig and Hive. What has surprised me most is that no one is automating their installation of Hadoop. In each case that I’ve observed, they start by manually provisioning some servers and then follow a series of tutorials to manually install and configure a cluster. It typically seems to take about a week to set up a cluster. There is often a lot of wasted......
Continue Reading
Hopefully it’s obvious that separating configuration from application code is always a good idea. One simple and effective way I’ve found to do this in my Python (think bottle, flask, etc.) apps is with a JSON configuration file. Choosing JSON makes sense for a few reasons: it is easy to read (for humans), easy to consume (by your application), can be versioned alongside application code, and can be turned into a configuration REST service. Here’s a short example of how to do this for a simple Python application that uses MongoDB. First the configuration file.......
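As a minimal sketch of the idea (not the code from the original post), the snippet below reads a JSON configuration file and builds a MongoDB connection URI from it. The `mongodb` section and its `host`, `port` and `database` keys are hypothetical names chosen for illustration, and only the standard library is used.

```python
import json
import os
import tempfile


def load_config(path):
    """Read a JSON configuration file and return its contents as a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def mongo_uri(config):
    """Build a MongoDB URI from the hypothetical 'mongodb' section."""
    db = config["mongodb"]
    return "mongodb://{host}:{port}/{database}".format(**db)


if __name__ == "__main__":
    # Write a sample configuration file so the sketch runs on its own.
    sample = {"mongodb": {"host": "localhost", "port": 27017, "database": "appdb"}}
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
        json.dump(sample, tmp)
    try:
        print(mongo_uri(load_config(tmp.name)))
    finally:
        os.remove(tmp.name)
```

Because the application only ever sees a plain dict, the same `load_config` call could later be pointed at a different file per environment without touching application code.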
Continue Reading
The more I automate, the more I have to answer the question of how to manage my secrets. Secrets that frequently come up include SSH key pairs, SSL private keys, and credentials for external resources, such as databases and SaaS integrations. Before the cloud, when server resources were not ephemeral, these could be managed manually when the server was created. In cloud environments, servers are created and destroyed automatically, from minute to minute, which raises the question of how to manage secrets. The OpenStack community is working on one solution called Barbican. I’ve been......
Continue Reading
I previously explained how to get one-minute resolution in Munin. Getting sub-minute resolution in Munin is trickier. The main reason is that cron only runs once per minute, which means data must be generated and cached between cron runs so that it can be collected when cron runs. In the case where a single datapoint is collected each time cron runs, the time at which cron runs is sufficient to store the data in rrd. With multiple datapoints being collected on a single cron run, it’s necessary to embed......
Continue Reading
I’ve recently been conducting some performance testing of a PaaS solution. In an effort to capture specific details resulting from these performance tests and how the test systems hold up under load, I looked into a few monitoring tools. Munin caught my eye as being easy to set up and having a large set of data available out of the box. The plugin architecture also made it attractive. One major drawback to Munin was the five-minute resolution. During performance tests, it’s necessary to capture data much more frequently. With the latest Munin, it’s......
Continue Reading
Munin is a monitoring tool which captures and graphs system data, such as CPU utilization, load and I/O. Munin is designed so that all data is collected by plugins. This means that every built-in graph is a plugin that was included with the Munin distribution. Each plugin adheres to the interface (not a literal OO interface), as shown below. Munin uses Round Robin Database files (.rrd) to store captured data. The default time configuration in Munin collects data in five-minute increments. Some important details: Plugins can be written in any language,......
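To make the plugin contract concrete, here is a minimal sketch of a plugin written in Python; it is an illustration under stated assumptions, not the example from the original post. It relies on the usual Munin convention that a plugin prints graph metadata when called with the `config` argument and prints `field.value` lines otherwise. The field name `load` and the graph labels are hypothetical, and `os.getloadavg()` assumes a Unix-like host.

```python
#!/usr/bin/env python3
import os
import sys


def config_lines():
    """Graph metadata printed when the plugin is called with 'config'."""
    return [
        "graph_title Load average (example)",
        "graph_vlabel load",
        "graph_category system",
        "load.label 1 minute load",
    ]


def value_lines(load1):
    """Current reading for the hypothetical 'load' field."""
    return ["load.value {:.2f}".format(load1)]


def run(argv, read_load=None):
    """Dispatch on the first argument, as a plugin is expected to do."""
    if len(argv) > 1 and argv[1] == "config":
        return config_lines()
    if read_load is None:
        # Default reader: 1 minute load average (Unix-like systems only).
        read_load = lambda: os.getloadavg()[0]
    return value_lines(read_load())


if __name__ == "__main__":
    print("\n".join(run(sys.argv)))
```

The `read_load` parameter exists only so the reading can be swapped out when exercising the script by hand; a real plugin would simply read its data source directly.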
Continue Reading
This post is an extension of Managed services in CloudFoundry and follows the discussion of external services integration. The echo service and Python service broker implementation are deployed using the documented procedure for Stackato. If necessary, you can get a Stackato instance up and running quickly. The result should be two apps deployed as shown below.
Optional validation
It is possible to validate that the deployed apps work as expected. The curl commands below validate that the echo service is working properly.
watrous@watrous-helion:~$ curl -X PUT http://echo-service.stackato.danielwatrous.com/echo/51897770-560b-11e4-b75a-9ad2017223a9 -w "\n"
{"instance_id": "51897770-560b-11e4-b75a-9ad2017223a9", "dashboard_url": "http://localhost:8090/dashboard/51897770-560b-11e4-b75a-9ad2017223a9", "state":......
Continue Reading
This post builds on the discussion of Managed Services in CloudFoundry and covers the first of the two methods for using unmanaged services in CloudFoundry. It makes use of the Python echo service and Python service broker API implementation used previously.
Manually provision the service
This method assumes that an existing service has been provisioned outside of CloudFoundry. An instance of the echo service can be manually provisioned with cURL using the following commands. In this example, a new instance is provisioned with the id user-service-instance and bound to an app with id......
Continue Reading
CloudFoundry defines a Service Broker API which can be implemented and added to a CloudFoundry installation to provide managed services for apps. To better understand the way managed services are created and integrated with CloudFoundry (and derivative technologies like Stackato and HP Helion Development Platform), I created an example service and implemented the Service Broker API to manage it. The code for both implementations is on GitHub.
https://github.com/dwatrous/cf-echo-service
https://github.com/dwatrous/cf-service-broker-python
Deploy the services
For this exercise I deployed bosh-lite on my Windows 7 laptop. I then followed the documented procedure to push......
Continue Reading
CloudFoundry, Stackato and Helion Development Platform accommodate (and encourage) external services for persistent application needs. The types of services include relational databases, like MySQL or PostgreSQL, NoSQL datastores, like MongoDB, messaging services, like RabbitMQ, and even cache technologies, like Redis and Memcached. In each case, connection details, such as a URL, port and credentials, are maintained by the cloud controller and injected into the environment of new application instances.
Injection
It’s important to understand that regardless of how the cloud controller receives details about the service, the process of getting those details to......
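From the application's side, reading the injected details can be sketched as follows. This assumes the standard `VCAP_SERVICES` environment variable, which CloudFoundry populates with a JSON document keyed by service label; the service name `my-echo` and the credential keys in the usage example are hypothetical.

```python
import json
import os


def service_credentials(service_name, environ=os.environ):
    """Return the credentials dict for a bound service, or None if not bound.

    VCAP_SERVICES maps each service label to a list of bound instances,
    each carrying a 'name' and a 'credentials' object.
    """
    services = json.loads(environ.get("VCAP_SERVICES", "{}"))
    for instances in services.values():
        for instance in instances:
            if instance.get("name") == service_name:
                return instance.get("credentials", {})
    return None


if __name__ == "__main__":
    # Hypothetical environment, shaped like what a cloud controller injects.
    fake_env = {
        "VCAP_SERVICES": json.dumps({
            "user-provided": [
                {"name": "my-echo",
                 "credentials": {"uri": "http://echo.example/echo/abc"}}
            ]
        })
    }
    print(service_credentials("my-echo", environ=fake_env))
```

Looking the service up by name rather than by label keeps the application code the same whether the service was bound as a managed or an unmanaged (user-provided) service.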
Continue Reading