Daniel Watrous on Software Engineering

A Collection of Software Problems and Solutions

Software Engineering

Software Development and batch size

I’ve been having a lot of conversations lately about batch size and how the choice of batch size impacts software development and release processes. Today I went looking for some other perspectives and I found this post about optimization on the IBM website. The author provides a good summary of the benefits available by decreasing batch sizes in software development.

I have been using agile methodologies for quite a long time, and they are much better than the traditional waterfall model. In the latter, all development is done at once, before testing occurs. Defaults are detected way after they were put in the code, which results in lengthy, ineffective, debugging sessions. In agile methodologies, development is cut in small pieces, and each piece is tested before moving to the next one. [Faults] are detected when the new code is fresh in the memory of developers, leading to short and effective debugging. This seems in line with the idea of small batch sizes advocated by Reinertsen.

This is true in my experience. Small batches increase the quality of each software deliverable while decreasing the frequency of bad releases or production issues. Later in the article he points out one instance where large batch size appears to yield a better outcome: “in a manufacturing environment, it is better to assign work in progress to machines globally instead of dealing with them one at a time…”

He seems to be asking why optimizing an entire population as a batch (large batch size) works better than optimizing smaller populations (small batch size) in certain circumstances. In other words, why does software development favor small batches when manufacturing appears to favor large batches?

Input and Output Variance

In a factory setting where every machine accepts identical raw input and produces identical completed output, it may work better to optimize for large batches. The variance of the input and output is what makes software development different, but there are also manufacturing examples where small batches were made more effective. One notable example was when Toyota changed how they manufactured automobile parts that required custom dies. Reducing the changeover time from three days to three minutes allowed them to decrease their batch size and increase overall efficiency.
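To make the changeover argument concrete, here is a rough sketch using the classic economic order quantity formula, sqrt(2 * D * S / H). The formula is a textbook model that I am bringing in for illustration, not something from the IBM post, and the demand and cost figures are hypothetical. The only grounded number is the ratio between three days and three minutes of changeover, which is 1440 to 1.

```python
import math


def economic_batch_size(demand_per_period, changeover_cost, holding_cost_per_unit):
    """Classic economic order quantity: sqrt(2 * D * S / H)."""
    return math.sqrt(2 * demand_per_period * changeover_cost / holding_cost_per_unit)


# Hypothetical figures. Changeover cost is assumed to scale with changeover
# time, so three days versus three minutes becomes 1440 versus 1.
slow_changeover = economic_batch_size(10_000, changeover_cost=1440.0, holding_cost_per_unit=2.0)
fast_changeover = economic_batch_size(10_000, changeover_cost=1.0, holding_cost_per_unit=2.0)

print(round(slow_changeover), round(fast_changeover))
```

Under this model the economical batch size shrinks with the square root of the changeover cost, which is one way to see why cheaper changeovers and smaller batches go together.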

Software development is all about specialization and context switching

Nearly all software development takes a unique input and produces a unique output (high variance). I am aware of very few instances where a programmer routinely takes in a similar input, does some processing on it, and returns a normalized output. Given human limitations in context switching (multitasking) and the ramp up time to become proficient with programming languages and platforms, smaller batches tend to work best for software development. Some organizations do focus on trying to make their developers better at context switching and proficient with more technologies, but in my experience, these initiatives often fail to deliver long term benefit to the organization.

Batch size and application integration

One argument I often hear about decreasing the batch size is that integration testing isn’t possible. I acknowledge that many systems are made up of tightly coupled (and often hard coded) components. Loose coupling has long been an effective pattern in software development and facilitates smaller batches while increasing overall system resilience. Loose coupling also increases reuse and improves flexibility in composing larger systems. Integration can happen most effectively when system components are loosely coupled.

Additional References

https://martinfowler.com/bliki/ActivityOriented.html: silos within organizations resist decreasing batch sizes and favor work that benefits the silo over work that benefits the business.

Organizing by activity gets in the way of lowering batch size of work that is handed-off between teams. A separate team of testers won’t accept one story at a time from a team of developers. They’d rather test a release worth of stories or at least all stories in a feature at a time. This lengthens the feedback loop, increases end-to-end cycle time and hurts overall responsiveness.

https://hbr.org/2012/05/six-myths-of-product-development: Fallacy number 2 addresses the idea that large batches improve the economics of the development process. In fact the reverse is true in manufacturing and software development.

By shrinking batch sizes, one company improved the efficiency of its product testing by 220% and decreased defects by 33%.

http://www.informit.com/articles/article.aspx?p=1833567&seqNum=3: In this post, the author of Continuous Delivery explains why smaller batches result in overall decreased risk to the organization.

When we reduce batch size we can deploy more frequently, because reducing batch size drives down cycle time.


Generate TLS Secret for kubernetes

Often in development or when working on proofs of concept (PoC), I need working SSL to protect an endpoint. If I controlled the domain, I would use Let’s Encrypt to generate a certificate. When I don’t control the domain, I often use self-signed certificates. Below is how I create them and then use them to create a Secret in kubernetes.

Choosing a domain (common name)

When I don’t control the domain, that usually means I can’t set up a subdomain with appropriate name resolution for my project. In this case I use a wildcard DNS provider, like nip.io. In my kubernetes clusters, there is usually a gateway that facilitates ingress to pods running on the cluster. The example below assumes that my gateway is running at

Create the key and certificate

I use openssl to create the key and certificate in one command

$ openssl req -newkey rsa:2048 -nodes -keyout onboard. -x509 -days 365 -out onboard.
Generating a 2048 bit RSA private key
writing new private key to 'onboard.'
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:Texas
Locality Name (eg, city) []:Houston
Organization Name (eg, company) [Internet Widgits Pty Ltd]:My Company
Organizational Unit Name (eg, section) []:Engineering
Common Name (e.g. server FQDN or YOUR name) []:onboard.
Email Address []:
$ ll onboard.*
-rw-r--r--  1 dwatrous  2104653700   1.5K Oct 30 08:27 onboard.
-rw-r--r--  1 dwatrous  2104653700   1.6K Oct 30 08:27 onboard.

Create the kubernetes resource

Now that I have the key and crt file, I’m ready to create a kubernetes Secret using these files. Kubernetes stores these files as a base64 string, so the first step is to encode them.

$ cat onboard. | base64
$ cat onboard. | base64

At this point I can easily create a kubernetes resource definition (YAML) that will create the Secret resource.

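apiVersion: v1
kind: Secret
metadata:
  name: onboard.
  namespace: default
type: kubernetes.io/tls
data:
  # paste the base64 output from the two commands above
  tls.crt: <base64 encoded certificate>
  tls.key: <base64 encoded key>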
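The encode-and-paste steps can also be scripted. The sketch below is one possible way to do it in Python, not part of my original workflow: the function name build_tls_secret, the Secret name example-tls and the placeholder bytes are all hypothetical. It relies on the fact that kubectl accepts JSON manifests as well as YAML, and that a Secret of type kubernetes.io/tls stores its data under the keys tls.crt and tls.key.

```python
import base64
import json


def build_tls_secret(name, cert_pem, key_pem, namespace="default"):
    """Return a kubernetes.io/tls Secret manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "kubernetes.io/tls",
        "data": {
            # b64encode produces a single line with no embedded newlines
            "tls.crt": base64.b64encode(cert_pem).decode("ascii"),
            "tls.key": base64.b64encode(key_pem).decode("ascii"),
        },
    }


# Placeholder bytes; in practice, read the certificate and key files in binary mode.
manifest = build_tls_secret("example-tls", b"CERT PEM BYTES", b"KEY PEM BYTES")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied the same way as a YAML definition.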

The last thing to do is use kubectl (or the API) to create the actual resource. Assuming the above YAML was in a file onboard., the following command would create the resource.

kubectl apply -f /path/to/onboard.

Use the Secret in a Pod

The Deployment definition below shows how to use the Secret above within a Pod.

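# extensions/v1beta1 was removed in Kubernetes 1.16; newer clusters need
# apps/v1 plus a spec.selector that matches the pod labels
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: onboarding
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: onboarding
    spec:
      containers:
      - image: username/onboarding
        name: onboarding
        volumeMounts:
        - name: tls
          mountPath: /usr/src/app/tls
      volumes:
      - name: tls
        secret:
          secretName: onboard.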
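
Inside the container, a Secret of type kubernetes.io/tls shows up as two files in the mounted directory, named tls.crt and tls.key. As a rough sketch of how an application might pick them up, assuming a Python application and the mount path used in this post (the helper names are hypothetical):

```python
import posixpath
import ssl


def tls_file_paths(mount_path="/usr/src/app/tls"):
    """Return the certificate and key paths inside the mounted Secret volume."""
    return posixpath.join(mount_path, "tls.crt"), posixpath.join(mount_path, "tls.key")


def build_ssl_context(mount_path="/usr/src/app/tls"):
    """Build a server-side TLS context from the mounted certificate and key."""
    certfile, keyfile = tls_file_paths(mount_path)
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return context
```

The resulting context can then be handed to whatever server the application uses. build_ssl_context only works where the files actually exist, so it is not called here.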

kubernetes overview

Kubernetes is getting a lot of attention recently, and there is good reason for that. Docker containers alone are little more than a developer convenience. Orchestration moves containers from laptop into the datacenter. Kubernetes does that in a way that simplifies development and operations. Unfortunately I struggled to find easy to understand high level descriptions of how kubernetes worked, so I made the diagram below.


While I don’t show the operator specifically (usually someone in IT, or a managed offering like GKE), everything in the yellow box would be managed by the operator. The nodes host pods. The master nodes facilitate orchestration. The gateway facilitates incoming traffic. A bastion host is often used to manage members of the cluster, but isn’t shown above.

Persistent storage, in the form of block storage, databases, etc. is also not directly part of the cluster. There are kubernetes resources, like Persistent Volumes, that can be used to associate external persistent storage with pods, but the management of the storage solution is outside the scope of managing kubernetes.


The Developer performs two primary activities: Create docker images and tell kubernetes to deploy them. Kubernetes is not opinionated when it comes to container image design. This means a developer can choose to manage ports, volumes, libraries, runtimes, and so on in any way that suits him. Kubernetes also has no opinion on container registry, as long as it is compliant with the docker registry v2 spec.

There are a few ways to deploy a container image to kubernetes. The diagram above shows two, one based on an application definition (usually in YAML format) and the other using shortcut commands. Both are typically triggered using the kubectl CLI. In either case, the developer gives kubernetes a desired state that includes details such as which container image to run, how many replicas to keep and which ports to expose. Kubernetes then assumes the job of ensuring that the desired state is the actual state. When nodes in a cluster change, or containers fail, kubernetes acts in real time to do what is necessary to get back to the desired state.
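
The idea of driving actual state toward desired state can be sketched in a few lines. This is a toy model for illustration only, not how kubernetes is implemented: the pod dictionaries and the reconcile function are hypothetical.

```python
def reconcile(desired_replicas, running_pods):
    """Return the actions needed to move the actual state to the desired state."""
    healthy = [pod for pod in running_pods if pod["healthy"]]
    # Failed pods are always removed.
    actions = [("delete", pod["name"]) for pod in running_pods if not pod["healthy"]]
    missing = desired_replicas - len(healthy)
    if missing > 0:
        # Too few healthy pods: create replacements.
        actions += [("create", None)] * missing
    elif missing < 0:
        # Too many healthy pods: remove the surplus.
        actions += [("delete", pod["name"]) for pod in healthy[:-missing]]
    return actions


pods = [{"name": "web-1", "healthy": True}, {"name": "web-2", "healthy": False}]
print(reconcile(3, pods))
```

An orchestrator runs a comparison like this continuously, so the developer only ever states the target and never the individual steps.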


The consumer doesn’t need to know anything about the kubernetes cluster, its members or even that kubernetes is being used. The gateway shown might be an nginx reverse proxy or HAProxy. The important point is that the gateway needs to be able to route to the pods, which are generally managed on a flannel or calico type network. It is possible to create redundant gateways and place them behind a load balancer.

Services are used to expose pods (and deployments). Usually the service type of LoadBalancer will trigger an automatic reconfiguration of the gateway to route traffic. Since the gateway is a single host, each port can be used only once. To get around this limitation, it is possible to use an ingress controller to provide name based routing.
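
Name based routing means the gateway chooses a backend from the requested host name rather than from the port. A minimal sketch of that lookup, with hypothetical host names and service addresses:

```python
def route(host_header, routes, default=None):
    """Pick a backend from the Host header, ignoring any port and letter case."""
    host = host_header.split(":", 1)[0].strip().lower()
    return routes.get(host, default)


# Hypothetical routing table: many applications share one gateway port.
routes = {
    "app-a.example.com": "service-a:8080",
    "app-b.example.com": "service-b:8080",
}
print(route("app-a.example.com:443", routes))
```

Because the host name is the key, any number of applications can share port 443 on a single gateway.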


Kubernetes definitely has its share of complexity. Depending on your role, it can be very approachable. Cluster installation is by far the most difficult part, but after that, the learning curve is quite small.


IT General Controls: Infrastructure vs Routing

IT general controls are important for various reasons, such as business continuity and regulatory compliance. Traditionally, controls have focused on the infrastructure itself. In the context of long running servers in fixed locations, this was often an effective approach. As virtualization and container technologies become more prevalent, especially in public cloud, infrastructure focused IT controls can start to get in the way of realizing the following benefits:

  • Just in time provisioning
  • Workload migration
  • Network isolation
  • Tight capacity management
  • DevOps
  • Automated deployments
  • Automated remediation

One way to maintain strong IT controls and still get the above benefits is to shift the focus of those controls away from the infrastructure and instead focus on routing (traffic management).

As shown above, a focus on routing ensures that IT can control where production traffic is routed, including production data. Engineering teams are free to deploy as needed and automation can be used freely. Since infrastructure is replaced with each deployment, rather than updated, there is no need to maintain rigid controls around any specific server, VM or container.

In the diagram shown, a gateway is used to facilitate routing. Other mechanisms, like segregated container image repositories and deployment environments may also be appropriate.


Focus on Activities and Outcomes, not Procedures and Tools

An unfortunate digression in many technology selection efforts amounts to philosophical debates around tools and procedures. These debates revolve around tool selection, access controls, SLAs, ownership, responsibility, etc. What’s so bad about this? These debates too often fail to identify and measure against the outcomes that will benefit the business and the activities that will produce those outcomes.

What is an Outcome

An outcome is any state or deliverable.

Presumably the state or deliverable is defined by the business and, if achieved, would contribute to achieving the strategic objectives for the company. Outcomes should often be tied to specific goals. By its very nature, an outcome should be agnostic of any tool or procedure. Here are a couple of examples to illustrate.

Poorly defined Outcome

System A should have records that capture data B and processes to ensure that data B is replicated to downstream systems C and D.

As an outcome, this places focus on Systems rather than business activities (we’ll define activities better later). While it’s true that this is a state or deliverable, it doesn’t provide any clarity around what the business is actually trying to do. It should be obvious that the outcome above was prompted in some way by a business need for data B, but as it is written, the outcome is focused on Systems A, C and D, which may or may not satisfy business objectives.

Well defined Outcome

Department A needs data B available for all records related to entity C. Data B may also be useful to other departments and so should be available by request. Data B should be considered internal and confidential.

This outcome clearly focuses on data B and provides information about which department(s) need it and how it relates to other data. It also provides some guidance about data sensitivity and scope.

What is an Activity

An activity is any unit of contribution that leads to achieving an outcome.

When designing systems and formulating outcomes, I prefer to think of all activities as exclusively executed by humans. In other words, ignoring any technology or tools, how would a human produce the desired outcome. For example, rather than say “an onboarding representative should email, call or fax the client to request specific information“, simply say “an onboarding rep should request specific information“. Leaving out the medium by which the request is made invites creativity and clarifies business outcomes. This is similar to why I suggest always designing software on paper first.

Just imagine an activity description that includes email as the channel by which something is communicated. This is more likely to result in an outcome description that includes an email system and potentially obscures the actual purpose of the communication in the first place. Remember, the business doesn’t want an email system, they want to communicate something. The business activity is communication, not email.

Activities lead to Outcomes

Very few businesses actually deal in systems and find direct benefit from system (or vendor) selection. Conversely, most businesses differentiate based on specialized information, expertise and access to limited resources.

When activities and outcomes amplify and streamline a company’s differentiating factors, a business will generally thrive.

Procedures and Tools are Secondary

The unfortunate digression I mentioned earlier has to do with what motivates discussion of activities and outcomes when compared to what motivates discussion of procedures and tools. Proper discussion of activities and outcomes is motivated by a clear understanding of the differentiating factors unique to a business. Discussion of procedures and tools is motivated by business continuity and budget. Business continuity and budget are important considerations, but they are secondary.

Procedures should be measured against Activities

Imagine a really solid, well implemented procedure that ensures business continuity and even promotes efficiency and transparency. That sounds great, doesn’t it? It is good, when the procedure is based on activities that lead to outcomes which align with business objectives. Otherwise it doesn’t matter how great the procedures are.

Discussions about procedures often focus on pain points within the business, such as events that prevented a desired outcome or tarnished the brand image of a company. Procedural discussions also occur when coordinating access to resources, scheduling, and so on. These conversations are necessary, and can be helpful, but only in the context of the activities they support. These discussions begin to run counter to the benefit they intend to produce when compared exclusively to past procedures. Comparing new procedures to old procedures may feel like a step forward, but without measuring the new procedure against the activities it is meant to support, there is great risk of drift.

Tools should be selected based on alignment with outcomes and activities

In a similar way, tools discussions often focus on vectors such as cost, implementation, maintenance, accommodation of existing procedures, etc. Tool vendors expend a great deal of effort showing how they stack up against competing tools along these same vectors. Who wouldn’t feel great about getting a superior tool, with more features and at a lower cost point? The problem is that these vectors alone fail to account for business differentiators. This is why tools selection must measure against well defined activities and outcomes before any of the typical vectors are considered.

A premature focus on tools has the result of shaping business outcomes that are tied to a specific tool. Starting with a clear focus on activities and outcomes actually increases the likelihood of innovation over starting with a tool focus. Referring back to my start on paper article, the predefined capabilities of a tool actually project activities and outcomes on to the business. It should be the business activities and outcomes that project on to the tool in order to qualify it as a fit. Rather than ask “how can our business use this?” we want to say “how can the tool accommodate this activity?”.

Inspiration and SaaS

Some might argue that exploration of the features and structure of tools, vendors, etc. can inspire new visions of activities and outcomes in a business. I don’t disagree, and there may be some industries or business functions where that is more true than others. In cases where a business function has been normalized by social convention or outside regulation, there may not exist much flexibility to define activities or outcomes in any way other than that convention or regulation. In those cases, the typical vectors for tool selection and procedural discussions about business continuity and budget may be sufficient. That is one of the reasons that off the shelf software, and later SaaS, has been so successful.

When it comes to specialized systems, such as market differentiating proprietary software, machine learning neural networks for market insight, management of expertise and limited resources, it’s best to start from a human perspective and get a clear picture of the activities and outcomes that will optimize business success.


Developer Productivity and Vertical vs Horizontal Deployments

I’ve recently had many conversations related to developer productivity. In order for a developer to be productive, he must have control over enough of the application lifecycle to complete his work. When a developer gets stuck at any point in the application lifecycle, his productivity drops, which can often reduce morale too.

One question I’ve been asking is: how much of the application lifecycle needs to fall under the scope of the developer? In other words, how broad is the scope of the application lifecycle that needs to be available to a developer in order to keep him productive. Does the developer need to be able to create and configure his own server? If there is an application stack, should he also be empowered to deploy other applications and services in the stack on which his component depends? As development efforts increase, should capacity be increased to accommodate the individual development environments for each developer?

Vertical vs Horizontal deployments

As I was working through these questions with some colleagues, I began to make a distinction between a vertical and a horizontal deployment.

A vertical deployment is one that requires deploying all tiers and components in order to test any one of them. While this can create a less volatile development environment, it also increases the complexity and resource footprint required to develop an application. It also complicates integration, since any work done on other components or tiers in the stack is not available until the vertical development deployment is refreshed.

A horizontal deployment is one that focuses only on one component or tier. It is assumed that other application dependencies are provided elsewhere. This decreases development overhead and resource needs. It also speeds up integration, since changes made to other horizontal components become available more quickly. This can also increase developer productivity, since a developer is only required to understand his application, not the full stack.

In the above diagram I illustrate that many applications now have dependencies on other applications. This is especially true for microservices. However, it should not be necessary to deploy all related applications in order to develop one of them. I propose instead a horizontal deployment where all related applications are moving toward an integration deployment and that all development, QA and other validation work operate against the integration layer. For a team following the github flow, the initial branch, the pull request and finally the merge should represent stages in the horizontal progress toward production ready code. This also has the advantage of catching most integration problems in the development and QA stages, because production ready code can make it more quickly into the integration tier and is immediately available to any integrating applications.
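
One way to picture a horizontal deployment is as endpoint configuration: every dependency points at the shared integration tier, and only the component under development is overridden. The sketch below uses hypothetical service names and URLs.

```python
# Hypothetical endpoints for the shared integration tier.
INTEGRATION_ENDPOINTS = {
    "orders": "https://orders.integration.example.com",
    "billing": "https://billing.integration.example.com",
    "catalog": "https://catalog.integration.example.com",
}


def resolve_endpoints(component_under_development, local_url, shared=INTEGRATION_ENDPOINTS):
    """Use shared integration endpoints for everything except the one component being developed."""
    endpoints = dict(shared)  # copy so the shared table is never modified
    endpoints[component_under_development] = local_url
    return endpoints


print(resolve_endpoints("billing", "http://localhost:8080"))
```

The developer runs one component locally and borrows everything else, which is where the capacity savings come from.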

Capacity benefits

One of the most obvious benefits to the horizontal approach is a reduced strain on compute and storage capacity. Sharing more of the vertical stack leaves available infrastructure resources free for other application teams. Naturally containers would accentuate this benefit even more.

When to go vertical

There are times when a developer will need to deploy other elements in the vertical stack. These may include database changes that would interfere with other development teams or coordinated modifications to interdependent applications. Even in these scenarios, it may be beneficial to develop against another team’s development deployment rather than their integration deployment.


Infrastructure as Code

One of the most significant enablers of IT and software automation has been the shift away from fixed infrastructure to flexible infrastructure. Virtualization, process isolation, resource sharing and other forms of flexible infrastructure have been in use for many decades in IT systems. It can be seen in early Unix systems, Java application servers and even in common tools such as Apache and IIS in the form of virtual hosts. If flexible infrastructure has been a part of technology practice for so long, why is it getting so much buzz now?

Infrastructure as Code

In the last decade, virtualization has become more accessible and transparent, in part due to text based abstractions that describe infrastructure systems. There are many such abstractions that span IaaS, PaaS, CaaS (containers) and other platforms, but I see four major categories of tool that have emerged.

  • Infrastructure Definition. This is closest to defining actual server, network and storage.
  • Runtime or system configuration. This operates on compute resources to overlay system libraries, policies, access control, etc.
  • Image definition. This produces an image or template of a system or application that can then be instantiated.
  • Application description. This is often a composite representation of infrastructure resources and relationships that together deliver a functional system.

Right tool for the right job

I have observed a trend among these toolsets to expand their scope beyond one of these categories to encompass all of them. For example, rather than use a chain of tools such as Packer to define an image, HEAT to define the infrastructure and Ansible to configure the resources and deploy the application, someone will try to use Ansible to do all three. Why is that bad?

A tool like HEAT is directly tied to the OpenStack charter. It endeavors to adhere to the native APIs as they evolve. The tool is accessible, reportable and integrated into the OpenStack environment where the managed resources are also visible. This can simplify troubleshooting and decrease development time. In my experience, a tool like Ansible generally lags behind in features and API support, and lacks the native interface integration. Some argue that using a tool like Ansible makes the automation more portable between cloud providers. Given the different interfaces and underlying APIs, I haven’t seen this actually work. There is always a frustrating translation when changing providers, and in many cases there is additional frustration due to idiosyncrasies of the tool, which could have been avoided if using more native interfaces.

The point I’m driving at is that when a native, supported and integrated tool exists for a given stage of automation, it’s worth exploring, even if it represents another skill set for those who develop the automation. The insight gained can often lead to a more robust and appropriate implementation. In the end, a tool can call a combination of HEAT and Ansible as easily as just Ansible.
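
As a rough illustration of chaining the native tools, the sketch below builds the three command lines that a wrapper script might run in order. The file names and the stack name are hypothetical, and the sketch only prints the commands rather than executing them.

```python
import shlex


def build_pipeline(image_template, heat_template, stack_name, inventory, playbook):
    """Return one command per automation stage, each handled by its native tool."""
    return [
        ["packer", "build", image_template],  # image definition
        ["openstack", "stack", "create", "-t", heat_template, stack_name],  # infrastructure (HEAT)
        ["ansible-playbook", "-i", inventory, playbook],  # configuration and deployment
    ]


# Hypothetical file and stack names.
pipeline = build_pipeline("image.json", "stack.yaml", "demo-stack", "inventory.ini", "site.yml")
for command in pipeline:
    print(shlex.join(command))
```

Each list could be passed to subprocess.run(command, check=True) so that a failure in one stage stops the chain.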

Containers vs. Platforms

Another lively discussion over the past few years revolves around where automation efforts should focus. AWS made popular the idea that automation at the IaaS layer was the way to go. A lot of companies have benefitted from that, but many more have found the learning curve too steep and the cost of fixed resources too high. Along came Heroku and promised to abstract away all the complexity of IaaS but still deliver all the benefits. The cost of that benefit came in either reduced flexibility or a steep learning curve to create new deployment contexts (called buildpacks). When Docker came along and provided a very easy way to produce a single function image that could be quickly instantiated, this spawned discussion related to how the container lifecycle should be orchestrated.

Containers moved the concept of image creation away from general purpose compute, which had been the focus of IaaS, and toward specialized compute, such as a single application executable. Start time and resource efficiency made containers more appealing than virtual servers, but questions about how to handle networking and storage remained. The docker best practice of single function containers drove up the number of instances when compared to more complex virtual servers that filled multiple roles and had longer life cycles. Orchestration became the key to reliable container based deployments.

The descriptive approaches that evolved to accommodate containers, such as kubernetes, provide more ease and speed than IaaS, while providing more transparency and control than PaaS. Containers make it possible for developers to define an application deployment scenario, including images, networking, storage, configuration, routing, etc., in plain text and trust the Container as a Service (CaaS) to orchestrate it all.


Up to this point, infrastructure as code has evolved from shell and bash scripts, to infrastructure definitions for IaaS tools, to configuration and image creation tools for what those environments look like to full application deployment descriptions. What remains to mature are the configuration, secret management and regional distribution of compute locality for performance and edge data processing.


Kubernetes vs. Docker Datacenter

I found this article on serverwatch today: http://www.serverwatch.com/server-trends/why-kubernetes-is-all-conquering.html

It’s not technically deep, but it does highlight the groundswell of interest for and adoption of kubernetes. It’s also worth noting that GCE and Azure will now both have a native, fully managed kubernetes offering. I haven’t found a fully managed docker datacenter offering, but I’m sure there is one. It would be interesting to compare the two from a public cloud offering perspective.

I’ve worked a lot with OpenStack for on premises clouds. This naturally leads to the idea of using OpenStack as a platform for container orchestration platforms (yes, I just layered platforms). As of today, the process of standing up Docker Datacenter or kubernetes still needs to mature. Last month eBay mentioned that it created its own kubernetes deployment tool on top of openstack: http://www.zdnet.com/article/ebay-builds-its-own-tool-to-integrate-kubernetes-and-openstack/. While it does plan to open source the new tool, it’s not available today.

One OpenStack Vendor, Mirantis, provides support for kubernetes through Murano as their preferred container solution: https://www.mirantis.com/solutions/container-technologies/. I’m not sure how reliable Murano is for long term management of kubernetes. For organizations that have an OpenStack vendor, support like this could streamline the evaluation and adoption of containers in the enterprise.

I did find a number of demo, PoC, kick the tires examples of Docker datacenter on OpenStack, but not much automation or production support. I still love the idea of using the Docker trusted registry. I know that kubernetes provides a private registry component (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/registry), but it’s not as sophisticated as Docker Trusted Registry in terms of signing, scanning, etc. However, this functionality is quickly making its way into kubernetes, with some functionality already available in alpha: https://github.com/kubernetes/kubernetes/issues/22888

On the whole, I’m more drawn to kubernetes from a holistic point of view, but Docker is effectively keying into some real enterprise concerns. Given the open source community and vendor investment in kubernetes, I expect the enterprise gap (like a trusted registry for kubernetes) will close this year.


JWT based authentication in Python bottle

Many applications require authentication to secure protected resources. While standards like OAuth accommodate sharing resources between applications, more variance exists in implementations of securing the app in the first place. A recent standard, JWT, provides a mechanism for creating tokens with embedded data, signing these tokens and even encrypting them when warranted.

This post explores how individual resource functions can be protected using JWT. The solution involves first creating a function decorator to perform the authentication step. Each protected resource call is then decorated with the authentication function and subsequent authorization can be performed against the data in the JWT. Let’s first look at the decorator.

import functools
import bottle
import jwt
# config is the application's configuration object (not shown here)
jwtsecret = config.authentication.get('jwtsecret')
# recent PyJWT releases require the accepted algorithms to be named when decoding
jwtalgorithm = config.authentication.get('jwtalgorithm')
class AuthorizationError(Exception):
    """Raised when the Authorization header is missing or malformed."""
def jwt_token_from_header():
    """Return the bearer token from the Authorization header of the current request."""
    auth = bottle.request.headers.get('Authorization', None)
    if not auth:
        raise AuthorizationError({'code': 'authorization_header_missing', 'description': 'Authorization header is expected'})
    parts = auth.split()
    if not parts or parts[0].lower() != 'bearer':
        raise AuthorizationError({'code': 'invalid_header', 'description': 'Authorization header must start with Bearer'})
    elif len(parts) == 1:
        raise AuthorizationError({'code': 'invalid_header', 'description': 'Token not found'})
    elif len(parts) > 2:
        raise AuthorizationError({'code': 'invalid_header', 'description': 'Authorization header must be Bearer + space + token'})
    return parts[1]
def requires_auth(f):
    """Provides JWT based authentication for any decorated function assuming credentials available in an "Authorization" header"""
    @functools.wraps(f)
    def decorated(*args, **kwargs):
        try:
            token = jwt_token_from_header()
        except AuthorizationError as reason:
            bottle.abort(400, reason.args[0])
        try:
            jwt.decode(token, jwtsecret, algorithms=[jwtalgorithm])    # validate only; throw away value
        except jwt.ExpiredSignatureError:
            bottle.abort(401, {'code': 'token_expired', 'description': 'token is expired'})
        except jwt.DecodeError as message:
            bottle.abort(401, {'code': 'token_invalid', 'description': str(message)})
        return f(*args, **kwargs)
    return decorated

In the above code the requires_auth(f) function makes use of a helper function to verify that there is an Authorization header and that it appears to contain the expected token. A custom exception is used to indicate a failure to identify a token in the header.

The requires_auth function then uses the Python JWT library to decode the token using the secret value jwtsecret. The secret is obtained from a config object. Assuming the JWT decodes and is not expired, the decorated function will then be called.


The following function can be used to generate a new JWT.

import logging
import time
jwtexpireoffset = config.authentication.get('jwtexpireoffset')
jwtalgorithm = config.authentication.get('jwtalgorithm')
def build_profile(credentials):
    """Build the JWT payload; exp is a Unix timestamp jwtexpireoffset seconds from now."""
    return {'user': credentials['user'],
            'role1': credentials['role1'],
            'role2': credentials['role2'],
            'exp': int(time.time()) + jwtexpireoffset}
def authenticate():
    # extract credentials from the request
    credentials = bottle.request.json
    if not credentials or 'user' not in credentials or 'password' not in credentials:
        bottle.abort(400, 'Missing or bad credentials')
    # authenticate against some identity source, such as LDAP or a database
    try:
        # query database for username and confirm password
        # or send a query to LDAP or OAuth
        verify_credentials(credentials['user'], credentials['password'])  # hypothetical placeholder; raises on failure
    except Exception as error_message:
        logging.exception("Authentication failure")
        bottle.abort(403, 'Authentication failed for %s: %s' % (credentials['user'], error_message))
    credentials['role1'] = is_authorized_role1(credentials['user'])
    credentials['role2'] = is_authorized_role2(credentials['user'])
    token = jwt.encode(build_profile(credentials), jwtsecret, algorithm=jwtalgorithm)
    logging.info('Authentication successful for %s', credentials['user'])
    return {'token': token}

Notice that two additional values are stored in the global configuration, jwtalgorithm and jwtexpireoffset. These are used along with jwtsecret to encode the JWT. The actual verification of user credentials can happen in many ways, including direct access to a datastore, LDAP, OAuth, etc. After authenticating credentials, it’s easy to authorize a user based on roles. These checks could be implemented as separate functions and could confirm role-based access using LDAP group membership, database records, OAuth scopes, etc. While the role-level access shown above looks binary, it could easily be more granular. Since a JWT is based on JSON, the JWT payload is represented as a JSON-serializable Python dictionary. Finally the token is returned.
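As one way to picture more granular access, the payload could carry a map of resources to allowed actions instead of two yes/no role flags. The sketch below is an assumption on my part, not part of the implementation above: the names build_granular_profile and is_allowed, the "permissions" key and the 900 second default lifetime are all hypothetical.

```python
import time


def build_granular_profile(user, permissions, lifetime_seconds=900):
    """Build a JSON-serializable payload with per-resource permissions."""
    return {
        "user": user,
        # hypothetical shape: resource name -> sorted list of allowed actions
        "permissions": {name: sorted(actions) for name, actions in permissions.items()},
        "exp": int(time.time()) + lifetime_seconds,
    }


def is_allowed(payload, resource, action):
    """Check a decoded payload for a specific action on a specific resource."""
    return action in payload.get("permissions", {}).get(resource, [])


payload = build_granular_profile(
    "alice", {"reports": {"read", "write"}, "billing": {"read"}}
)
```

A resource function would then call is_allowed on the decoded payload before doing any work. Keep in mind that everything placed in the payload travels with every request, so large permission maps make for large tokens.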

Protected resources

At this point, any protected resource can be decorated, as shown below.

def get_jwt_credentials():
    # get and decode the current token
    token = jwt_token_from_header()
    credentials = jwt.decode(token, jwtsecret, algorithms=[jwtalgorithm])
    return credentials
@requires_auth
def get_protected_resource():
    # get user details from JWT
    authenticated_user = get_jwt_credentials()
    # get protected resource ('user' is the key set by build_profile)
    try:
        return {'resource': somedao.find_protected_resource_by_username(authenticated_user['user'])}
    except Exception:
        logging.exception("Resource not found")
        bottle.abort(404, 'No resource for username %s was found.' % authenticated_user['user'])

The function get_protected_resource will only be executed if requires_auth successfully validates a JWT in the header of the request. The function get_jwt_credentials will actually retrieve the JWT payload to be used in the function. While I don’t show an implementation of somedao, it is simply an encapsulation point to facilitate access to resources.

Since the JWT expires (expiration is optional, but a good idea), it’s necessary to build in some way to extend the ‘session’. For this a refresh endpoint can be provided as follows.

@requires_auth    # assumed here so that a missing, invalid or expired token is rejected before refreshing
def refresh_token():
    """refresh the current JWT"""
    # get and decode the current token
    token = jwt_token_from_header()
    payload = jwt.decode(token, jwtsecret, algorithms=[jwtalgorithm])
    # create a new token with a new exp time
    token = jwt.encode(build_profile(payload), jwtsecret, algorithm=jwtalgorithm)
    return {'token': token}

This simply repackages the same payload with a new expiration time.


The need to explicitly refresh the JWT increases (possibly doubles) the number of requests made to an API only for the purpose of extending session life. This is inefficient and can lead to awkward UI design. If possible, it would be convenient to refactor requires_auth to perform the JWT refresh and add the new JWT to a header of the response to the request that is about to be processed. The UI could then grab the updated JWT that is produced with each request to use for the subsequent request.
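A rough, framework-neutral sketch of that refactoring follows. It assumes the decorator is handed four callables standing in for the framework and JWT library calls; the names requires_auth_with_refresh, read_token, decode, encode and set_response_header are hypothetical, and the json based wiring at the bottom exists only so the sketch runs on its own (json is not a signed token format).

```python
import functools
import json
import time


def requires_auth_with_refresh(read_token, decode, encode, set_response_header, lifetime=900):
    """Decorator factory: validate the incoming token, then issue a fresh one.

    All four callables are stand-ins for framework and JWT library calls.
    """
    def decorator(f):
        @functools.wraps(f)
        def decorated(*args, **kwargs):
            payload = decode(read_token())  # expected to raise if invalid or expired
            refreshed = dict(payload, exp=int(time.time()) + lifetime)
            set_response_header("Authorization", "Bearer " + encode(refreshed))
            return f(*args, **kwargs)
        return decorated
    return decorator


# Stand-in wiring for demonstration only.
response_headers = {}
incoming = json.dumps({"user": "alice", "exp": int(time.time()) + 5})


@requires_auth_with_refresh(
    read_token=lambda: incoming,
    decode=json.loads,
    encode=json.dumps,
    set_response_header=response_headers.__setitem__,
)
def demo_resource():
    return {"resource": "ok"}
```

In the bottle code from this post, the stand-ins would presumably map to jwt_token_from_header, jwt.decode, jwt.encode and bottle.response.set_header. One trade-off to weigh: a token that is refreshed on every request never expires for an active client, so a stolen token stays useful for as long as it keeps being used.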

The design above will actually decode the JWT twice for any resource function that requires access to the JWT payload. If possible, it would be better to find some way to inject the JWT payload into the decorated function. Ideally this would be done in a way that functions which don’t need the JWT payload aren’t required to add it to their contract.
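One possible way to avoid the second decode without changing any function signatures is to have the decorator store the decoded payload somewhere request-scoped and let interested functions fetch it. The sketch below uses the standard library's contextvars for that. It is an assumption about how this could be done, not the implementation above; requires_auth_with_payload, current_jwt_payload and the injected decode and read_token callables are hypothetical names.

```python
import contextvars
import functools

# Holds the payload decoded for the call currently being handled.
_jwt_payload = contextvars.ContextVar("jwt_payload")


def requires_auth_with_payload(decode, read_token):
    """Decorator factory that decodes the token once and exposes the payload."""
    def decorator(f):
        @functools.wraps(f)
        def decorated(*args, **kwargs):
            reset_token = _jwt_payload.set(decode(read_token()))  # single decode
            try:
                return f(*args, **kwargs)
            finally:
                _jwt_payload.reset(reset_token)  # do not leak into later calls
        return decorated
    return decorator


def current_jwt_payload():
    """Return the payload stored by the decorator; raises LookupError outside a decorated call."""
    return _jwt_payload.get()
```

A resource function that needs the user calls current_jwt_payload(), while one that does not simply ignores it, so neither has to add a parameter to its contract.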

The authenticate function could be modified to return the JWT as a header, rather than in the body of the response. This may decrease the chances of a JWT being cached or logged. It would also simplify the UI if the authenticate and refresh functions both return the JWT in the same manner.


This same implementation could be reproduced in any language or framework. The basic steps are

  1. Perform (role based) authentication and authorization against some identity resource
  2. Generate a token (like JWT) indicating success and optionally containing information about the authenticated user
  3. Transmit the token and refreshed tokens in HTTP Authorization headers, both for authenticate and resource requests
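From the client's side, the third step amounts to attaching the token to each request. Here is a small standard-library sketch that builds such a request without sending it; the URL and the function name build_authenticated_request are placeholders of mine, not part of any API described in this post.

```python
import json
import urllib.request


def build_authenticated_request(url, token, body=None):
    """Create (but do not send) a request carrying the JWT as a Bearer token."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data)
    request.add_header("Authorization", "Bearer " + token)
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


# hypothetical endpoint and token value
request = build_authenticated_request("http://localhost:8080/resource", "abc.def.ghi")
```

Passing the result to urllib.request.urlopen would send it. A client following the refresh idea discussed earlier would read the new token from the response and use it for the next call.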

Security and Risks

At the heart of JWT security is the secret used to sign the JWT. If this secret is too simple, or if it is leaked, it would be possible for a third party to craft a JWT with any desired payload, and trick an application into delivering protected resources to an attacker. It is important to choose strong secrets and to rotate them frequently. It would also be wise to perform additional validity steps. These might include tracking how many sessions a user has, where those sessions have originated and the nature and frequency/speed of requests. These additional measures could prevent attacks in the event that a JWT secret was discovered and may indicate a need to rotate a secret.
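Rotation is easier to live with if tokens signed with the previous secret stay valid for a short overlap. The following is a minimal sketch of that idea using plain HMAC signatures from the standard library; the RotatingSigner class and its keep parameter are hypothetical, and in a real JWT setup the key identifier would typically travel in the token's kid header field.

```python
import hashlib
import hmac
import secrets


class RotatingSigner:
    """Sign with the newest secret, verify against any secret still retained."""

    def __init__(self, keep=2):
        self.keep = keep      # how many generations of secrets remain valid
        self.secrets = {}     # key id -> secret bytes, oldest first
        self.rotate()

    def rotate(self):
        """Generate a new random secret and retire the oldest beyond the window."""
        key_id = secrets.token_hex(4)
        self.secrets[key_id] = secrets.token_bytes(32)  # 256 random bits
        while len(self.secrets) > self.keep:
            self.secrets.pop(next(iter(self.secrets)))  # drop the oldest
        self.current = key_id

    def sign(self, message: bytes):
        digest = hmac.new(self.secrets[self.current], message, hashlib.sha256).hexdigest()
        return self.current, digest

    def verify(self, key_id, message: bytes, digest: str) -> bool:
        secret = self.secrets.get(key_id)
        if secret is None:
            return False      # key retired or never issued
        expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, digest)
```

With keep=2, a signature survives one rotation and is rejected after the second, which bounds how long a leaked secret remains useful.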


In microservice environments, it is appealing to authenticate once and access multiple microservice endpoints using the same JWT. Since a JWT is stateless, each microservice only needs the JWT secret in order to validate the signature. This potentially increases the attack surface for a hacker who wants the JWT secret.