Daniel Watrous on Software Engineering

A Collection of Software Problems and Solutions

Posts tagged configuration

Software Engineering

Configuration of python web applications

Hopefully it’s obvious that separating configuration from application code is always a good idea. One simple and effective way I’ve found to do this in my python (think bottle, flask, etc.) apps is with a simple JSON configuration file. Choosing JSON makes sense for a few reasons:

  • Easy to read (for humans)
  • Easy to consume (by your application)
  • Can be versioned alongside application code
  • Can be turned into a configuration REST service

Here’s a short example of how to do this for a simple python application that uses MongoDB. First the configuration file.

{
    "use_ssl": false,
    "authentication": {
        "enabled": false,
        "username": "myuser",
        "password": "secret"
    },
    "mongodb_port": 27017,
    "mongodb_host": "localhost",
    "mongodb_dbname": "appname"
}

Now in the python application you would load the simple JSON file above in a single line:

# load in configuration
configuration = json.load(open('softwarelicense.conf.json', 'r'))

Now in your connection code, you would reference the configuration object.

# establish connection pool and a handle to the database
mongo_client = MongoClient(configuration['mongodb_host'], configuration['mongodb_port'], ssl=configuration['use_ssl'])
db = mongo_client[configuration['mongodb_dbname']]
if configuration['authentication']['enabled']:
    db.authenticate(configuration['authentication']['username'], configuration['authentication']['password'])
 
# get handle to collections in MongoDB
mycollection = db.mycollection

This keeps the magic out of your application and makes it easy to change your configuration without direct changes to your application code.
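One small extension of this idea (a sketch, not from the original code): choose the configuration file based on an environment variable, so development and production run the same code with different settings. The APP_ENV variable name, the conf/ directory convention and the fallback defaults here are all assumptions.

```python
import json
import os

# Fallback values used when no configuration file is present (illustrative)
DEFAULTS = {"mongodb_host": "localhost", "mongodb_port": 27017, "use_ssl": False}

# Hypothetical convention: conf/<environment>.conf.json, selected by APP_ENV
environment = os.environ.get("APP_ENV", "development")
config_path = os.path.join("conf", "%s.conf.json" % environment)

# Start from the defaults and let the file, if present, override them
configuration = dict(DEFAULTS)
if os.path.exists(config_path):
    with open(config_path, "r") as config_file:
        configuration.update(json.load(config_file))
```

Starting from defaults also means the application can still come up when the file is missing, which matters more once configuration moves to a service (see below).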

Configuration as a web service

Building a web service to return configuration data is more complicated for a few reasons. One is security: most configuration data includes credentials, secrets and deployment details (e.g. paths), so a hacker who obtained it would significantly increase your attack surface. Another is that the service becomes an additional point of failure: if it were unavailable for any reason, your app would still need to know how to start up and run.

If you did create a secure, highly available configuration service, the only change to your application code would be to replace the open call with a call to urllib.urlopen or something similar.
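To make that concrete, here is a self-contained Python 3 sketch (the original code above is Python 2, which would use urllib.urlopen). A throwaway local HTTP server stands in for the real configuration service; the URL, payload and handler are all illustrative.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# Hypothetical payload a configuration service might return
CONFIG = {"mongodb_host": "localhost", "mongodb_port": 27017, "use_ssl": False}

class ConfigHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps(CONFIG).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging

# Stand-in for the real service, so the example is runnable as-is
server = HTTPServer(("127.0.0.1", 0), ConfigHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The only change to the application: fetch over HTTP instead of open()
url = "http://127.0.0.1:%d/myapp.conf.json" % server.server_port
configuration = json.load(urlopen(url))

server.shutdown()
```

In production the service would of course sit behind TLS and authentication, for the reasons given above.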

Update, October 31, 2016

The implementation below extends the ideas above.

{
   "common": {
      "authentication": {
         "jwtsecret": "badsecret",
         "jwtexpireoffset": 1800,
         "jwtalgorithm": "HS256",
         "authorize": {
            "role1": "query or group to confirm role1",
            "role2": "query or group to confirm role2"
         },
         "message":{
            "unauthorized":"Access to this system is granted by request. Contact PersonName to get access."
         }
      }
   }
}

The following class encapsulates the management of the configuration file.

import os
import json
 
class authentication_config:
    common = {}
 
    def __init__(self, configuration):
        self.common = configuration['common']['authentication']
 
    def get_config(self):
        return self.common
 
class configuration:
    authentication = None
 
    def __init__(self, config_directory, config_file):
        # load in configuration (directory is assumed relative to this file)
        config_full_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), config_directory, config_file)
        with open(config_full_path, 'r') as myfile:
            configuration_raw = json.load(myfile)
        # prepare authentication
        self.authentication = authentication_config(configuration_raw)
 
 
def main():
    # get path to configuration
    config_directory = 'conf'
    config_file = 'myapp.conf.json'
 
    config = configuration(config_directory, config_file)
    print config.authentication.get_config()['jwtsecret']
 
 
if __name__ == '__main__':
    main()
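To illustrate how values like jwtsecret, jwtalgorithm and jwtexpireoffset would actually be consumed, here is a standard-library-only sketch that builds an HS256 JWT from that configuration. This is not from the original post, and a real application would more likely use a JWT library; the claims are illustrative.

```python
import base64
import hashlib
import hmac
import json
import time

# Values as they would come from the authentication section of the config file
config = {"jwtsecret": "badsecret", "jwtexpireoffset": 1800, "jwtalgorithm": "HS256"}

def b64url(data):
    # JWT uses unpadded URL-safe base64
    return base64.urlsafe_b64encode(data).rstrip(b"=")

def make_jwt(payload, secret):
    # header.payload.signature, signed with HMAC-SHA256 (the "HS256" in the config)
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    signing_input = header + b"." + body
    signature = b64url(hmac.new(secret.encode(), signing_input, hashlib.sha256).digest())
    return (signing_input + b"." + signature).decode()

# jwtexpireoffset gives the token a 30 minute lifetime
claims = {"sub": "myuser", "exp": int(time.time()) + config["jwtexpireoffset"]}
token = make_jwt(claims, config["jwtsecret"])
```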

A Review of Docker

The most strikingly different characteristic of Docker, when compared to other deployment platforms, is its single-responsibility-per-container design (although some see it differently). One reason this looks so different is that many application developers view the complete software stack on which they deploy as a collection of components on a single logical server. For developers of larger applications, who already have experience deploying distributed stacks, the security and configuration complexity of Docker may feel more familiar. Docker brings a fresh approach to distributed stacks, but one that may seem overly complex to developers of smaller applications who enjoy the convenience of deploying their full stack to a single logical server.

Linking to create applications

Docker does mitigate some of the complexity of a distributed stack by way of Linking. Linking is a way to connect multiple containers so that they have access to each other’s resources. Communication between linked containers happens over a private network between the two containers, and each container has a unique IP address on that private network. We’ll see later on that shared volumes are a special case of Linking containers.

Statelessness and Persistence

One core concept behind Docker containers is that they are transient. They are fast and easy to start, stop and destroy. Once stopped, any resources associated with the running container are immediately returned to the system. This stateless approach can be a good fit for modern web applications, where statelessness simplifies scaling and concurrency. However, the question remains about what to do with truly persistent data, like records in a database.

Docker answers this question of persistence with Volumes. At first glance this appears to only provide persistence between running containers, but it can be configured to share data between host and container that will survive after the container exits.

It’s important to note that storing data outside the container breaches one isolation barrier and could be an attack vector back into any container that uses that data. It’s also important to understand that any data stored outside the container may require infrastructure to manage, backup, sync, etc., since Docker only manages containers.

Infrastructure Revision Control

Docker elevates infrastructure dependencies one level above system administration by encapsulating application dependencies inside a single container. This encapsulation makes it possible to maintain versioned deployment artifacts, either as a Dockerfile or in binary image form. This enables some interesting possibilities, such as testing a new server configuration or redeploying an old server configuration in minutes.

Two Ways to Build Docker Container Images

Docker provides two ways to create a new container image.

  1. Dockerfile
  2. Modify an existing container

Dockerfile

A Dockerfile is similar to a Vagrantfile. It references a base image (a starting point) and a number of instructions to execute against that base image to arrive at a desired end state. For example, one could start with the Ubuntu base image and run a series of apt-get commands to install the Nginx web server and copy in a default configuration. After the image is created, it can be used to create new containers that have those dependencies ready to go.
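Sketched as a build file for the Nginx example just described (the base image tag and file names here are illustrative, not from the original post):

```dockerfile
# Start from an Ubuntu base image
FROM ubuntu:14.04

# Install the Nginx web server
RUN apt-get update && apt-get install -y nginx

# Copy a default site configuration into the image (hypothetical file)
COPY nginx.conf /etc/nginx/nginx.conf

# Serve HTTP in the foreground so the container stays alive
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
```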

Images are containers in embryo, similar to how a class is an object in embryo.

A Dockerfile can also be added to a git repository, and builds can be triggered automatically whenever a change is committed against the Dockerfile.

Modify an existing container

While a Dockerfile scripts the build in a text file, it is also possible to start a container from an existing image and run ‘/bin/bash’. From the bash prompt any desired changes can be made. Those changes can then be committed as a new image and pushed to the Docker Hub registry or stored elsewhere for later use.

In either case, the result is a binary image that can be used to create a container providing a specific dependency profile.

Scaling Docker

Docker alone doesn’t answer the question about how to scale out containers, although there are a lot of projects trying to answer that question. It’s important to know that containerizing an application doesn’t automatically make it easier to scale. It is necessary to create logic to build, monitor, link, distribute, secure, update and otherwise manage containers.

Not Small VMs

It should be obvious by this point that Docker containers are not intended to be small Virtual Machines. They are isolated, single function containers that should be responsible for a single task and linked together to provide a complete software stack. This is similar to the Single Responsibility Principle. Each container should have a single responsibility, which increases the likelihood of reuse and decreases the complexity of ongoing management.

Application Considerations

I would characterize most of the discussion above as infrastructure considerations. There are several application specific considerations to review.

PaaS Infection of Application Code

Many PaaS solutions infect application code. This may be in the form of requiring use of certain libraries, specific language versions or adhering to specific resource structures. The trade-off promise is that in exchange for the rigid application requirements, the developer enjoys greater ease and reliability when deploying and scaling an application and can largely ignore system level concerns.

The container approach that Docker takes doesn’t infect application code, but it drastically changes deployment. Docker is so flexible in fact, that it becomes possible to run different application components with different dependencies, such as differing versions of the same programming language. Application developers are free to use any dependencies that suit their needs and develop in any environment that they like, including a full stack on a single logical server. No special libraries are required.

While this sounds great, it also increases application complexity in several ways, some of which are unexpected. One is that the traditional role of system administrator must change to be more involved in application development. The management of security, patching, etc. need to happen across an undefined number of containers rather than a fixed number of servers. A related complexity is that application developers need to be more aware of system level software, security, conflicts management, etc.

While it is true that Docker containers don’t infect application code, they drastically change the application development process and blur traditional lines between application development and system administration.

Security is Complicated

Security considerations for application developers must expand to include understanding of how containers are managed and what level of system access they have. This includes understanding how Linking containers works so that communication between containers and from container to host or from container to internet can be properly secured. Management of persistent data that must survive beyond the container life cycle needs to enforce the same isolation and security that the container promises. This can become tricky in a shared environment.

Configuration is complicated

Application configuration is also complicated, especially communication between containers that are not running on a single logical server, but instead are distributed among multiple servers or even multiple datacenters. Connectivity to shared resources, such as a database or set of files becomes tricky if those are also running in containers. In order to accommodate dynamic life cycle management of containers across server and datacenter boundaries, some configuration will need to be handled outside the container. This too will require careful attention to ensure isolation and protection.

Conclusion

Docker and related containerization tools appear to be a fantastic step in the direction of providing greater developer flexibility and increased hardware utilization. The ability to version infrastructure and deploy variants in minutes is a big positive.

While the impacts on application development don’t directly impact the lines of code written, they challenge conventional roles, such as developer and system administrator. Increased complexity is introduced by creating a linked software stack where connectivity and security between containers need to be addressed, even for small applications.


Access Apache LDAP authentication details in PHP

Today I was working on a small web application that will run on a corporate intranet. There was an existing LDAP server and many existing web apps use the authentication details cached in the browser (Basic Authentication) to identify a user and determine access levels.

My application is written in PHP and I wanted to leverage this same mechanism to determine the current user and customize my application. Since my searches on Google didn’t pull up anything similar, I want to document what I did.

I did explore the possibility of using PHP’s LDAP library to perform the authentication but decided instead to use the basic authentication provided by Apache. I have two reasons for this. First is that most users are accustomed to this already and developers of other internal applications are familiar with this approach. Second is that authentication details are more easily cached for a long duration, which cuts down on reauthentication. In a trusted intranet environment this is very desirable.

Apache Configuration

To begin with, I installed and configured Apache on Ubuntu Linux. One of the modules that isn’t enabled by default is authnz_ldap. Since it was available, I ran these commands to enable the module and restart Apache:


$ sudo a2enmod authnz_ldap
$ sudo /etc/init.d/apache2 restart

With this module installed I then needed to give it some details about my LDAP server and what paths I wanted protected by LDAP authentication. I accomplished this by adding a <Location> directive to the httpd.conf file. This is what my Location directive looked like:


<Location "/">
  AuthBasicProvider ldap
  AuthType Basic
  AuthzLDAPAuthoritative off
  AuthName "My Application"
  AuthLDAPURL "ldap://directory.example.com:389/DC=example,DC=com?sAMAccountName?sub?(objectClass=*)"
  AuthLDAPBindDN "CN=apache,CN=Users,DC=example,DC=com"
  AuthLDAPBindPassword hackme
  Require valid-user
</Location>

After making this change another restart was necessary. At this point I reloaded a page that was protected and was able to authenticate as expected.

PHP Access to authentication

This is surprisingly easy (and surprisingly undocumented anywhere that I could find). PHP will automatically populate several $_SERVER superglobal keys with the authentication values cached in Apache. The key values are:


$_SERVER['PHP_AUTH_USER']
$_SERVER['PHP_AUTH_PW']

Later I found that you can also add additional values to AuthLDAPURL that will populate additional keys in your $_SERVER superglobal.
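For example (an illustrative variant of the directive above; consult the mod_authnz_ldap documentation for your Apache version), listing additional attributes in the attribute field of AuthLDAPURL causes them to be exported as AUTHENTICATE_-prefixed variables:

```
AuthLDAPURL "ldap://directory.example.com:389/DC=example,DC=com?sAMAccountName,mail,displayName?sub?(objectClass=*)"
```

These would then surface in PHP as entries such as $_SERVER['AUTHENTICATE_MAIL']; the exact variable names follow the attribute names, so check what your server actually populates.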

At this point you might choose to perform additional operations against the LDAP server using PHP’s library or simply use the available values to customize your intranet web application.