Alpha Site Needed

This work is based on extensive experience at several large organizations, but it has not yet been used in a real company, and real-world use is an indispensable part of developing a really useful tool.

What is needed most is for a real-world project to take this pre-alpha version and adapt it to its needs, so that the tool sees some action solving real problems, where it can be stretched and its weaknesses exposed, and where users can give feedback and contribute changes.

You can find the code here: https://github.com/peoplemerge/deploymentobjects.  At the moment, it has a couple of hardcoded values that will prevent it from running in your environment, but those will be cleaned up in the next few days. The instructions you need to install and run it will be provided, but they haven’t been written yet!

The author, Dave Thomas, will bend over backwards to make changes to DeploymentObjects in response to the needs of early adopters.

If you’re hiring, you might offer him a job to make this framework work for you!  His CV can be found here.

Problem Statement

You’re a developer just starting at a new company. You write some code, maybe run a build, and copy it to a test server. On the test server, you restart a service and either tail a logfile until it says it’s ready, or wait the usual amount of time until you think it’s ready to go. Then you open up a browser, enter the URL, and look to see if the page loads.

Some weeks go by, QA says everything’s peachy, and you’re ready to go to production. Unfortunately, nobody ran the updated Chef config in QA, and your updates take down prod, so you have to back out your changes. You start to wonder why it’s not totally automated. At your last company, you helped automate the process using Puppet, so you would need to rewrite all those messy bash installer scripts to make them work here.

Just as administering most production environments is a labor-intensive manual process, there is no such thing as an installer for distributed systems. Deploying distributed software systems today involves a sequence of steps across multiple systems which are individually automated but rarely automated as a group. In instances when deployment is automated, it is accomplished by one-off scripts too tightly coupled to external communication to be reusable. So many orthogonal concerns exist that such efforts produce quickly growing balls of mud.

Lots of people run a Continuous Integration (CI) server on a local machine, but what about distributed servers?  Running CI against distributed servers requires extensive setup and configuration at each company that wishes to run load tests, so small shops don’t follow the practice. Production environments are deployed using a different process from the one the CI server uses. Certain steps must be done manually, leaving the possibility for user error. The result is that the dev environment does not equal the prod environment, resulting in a higher frequency of errors when software is released, causing various forms of pain to the organization.

Architecture

Here I introduce a novel tool, DeploymentObjects, that attempts to reduce the pains of deploying and maintaining applications by making tasks that were previously impractical to automate practical.  The DeploymentObjects project provides a user-extensible external Domain-Specific Language (DSL), built with the ANTLR meta-compiler, that describes distributed automation tasks. It also lets the user work with its internal DSL, whose reference implementation is written in Java (implementations in Ruby, Perl, and Python will follow).  It introduces the Deployment Object design pattern and provides a Domain-Driven Design (DDD) [8,9] for achieving the intended goals, promoting code reuse and extension while being covered by extensive unit and integration tests.  It does not reinvent wheels such as perfectly good configuration management tools but provides sensible hooks with which to trigger them.  It employs multiple methods for dispatching actions: SSH for bootstrapping and simple tasks, and ZooKeeper for coordinating distributed tasks.  Users can specify their current environments in a simple YAML state repository, or load them into ZooKeeper, where they can serve as a blackboard watched by many nodes.  Finally, to deal with events such as errors appearing in a logfile, developers can employ grammars that recognize those events and trigger action elsewhere.

Initial Use Cases

We built three use cases to prove the concept and flesh out the object model.

Use Case #1: Create an environment

In this use case, we create CentOS 6 virtual machines using kickstart and add them to a naming service, initially POSIX hosts files, to create an environment to which one could deploy applications.

Although more time-consuming to build and provision, automated bare-metal system installation is preferable to cloning existing VMs, since the process is transparent and can be updated at a later date.  VMs cloned from a gold image suffer when changes must be made to them, such as updating the operating system to a newer version: the user must trust that a documented process was followed and that the documentation is up to date.

A reasonable middle ground does both: one uses a source-controlled, automated process to create a gold image; one clones the gold image when new VMs are needed; and should a change to the gold image be required, one changes the source and builds a new gold image that can be cloned.  Doing both capitalizes on the speed of copying VMs while providing a repeatable means of creating the gold image.

This is accomplished by writing kickstart files for each node, connecting via SSH to a hypervisor capable of creating virtual machines, and executing a script that initiates the creation of a node.

For this implementation, several assumptions are made.

It is assumed the user does not control the DNS server they use.  This is typical for developers who submit code releases to systems administrators in an operations group that controls the production servers.  Many operations groups already have the process of adding hosts to DNS servers automated, so we don’t reinvent the wheel. If such a group were to use the system, they would just need to implement an interface that calls their automation framework.  It is assumed that name resolution through hosts files is sufficient.
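
The interface mentioned above might look something like the following minimal Java sketch. The names NamingService, HostsFileNamingService, register, and render are hypothetical and are not taken from the DeploymentObjects code base; the sketch only renders hosts-file lines and leaves writing the file to the caller.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical interface: an operations group could implement it to call
// their own DNS automation instead of relying on hosts files.
interface NamingService {
    void register(String hostname, String ipAddress);
}

// Illustrative hosts-file implementation. It keeps entries in memory and
// renders them as text; writing /etc/hosts is left to the caller.
class HostsFileNamingService implements NamingService {
    private final Map<String, String> ipByHostname = new LinkedHashMap<>();

    @Override
    public void register(String hostname, String ipAddress) {
        // Registering a hostname again replaces its IP address.
        ipByHostname.put(hostname, ipAddress);
    }

    // Renders one "IP<TAB>hostname" line per host, in first-registration order.
    String render() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> entry : ipByHostname.entrySet()) {
            out.append(entry.getValue())
               .append('\t')
               .append(entry.getKey())
               .append('\n');
        }
        return out.toString();
    }
}
```

A group with automated DNS would supply its own NamingService implementation and leave the rest of the environment creation process unchanged.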

It is assumed that the operating system is GNU/Linux. Other operating systems will be supported in time.

This is a useful use case because creating a group of VMs for the purpose of deploying and testing code from a Continuous Integration (CI) server currently requires some special tooling to overcome several problems.  First, using a bare-metal installation process, once VMs finish their installation, they shut down and must be restarted.  Next, their hostnames must be set and their IPs must be obtained programmatically and saved to some type of naming service.  Once all nodes are provisioned, they must notify the process on the CI server that provisioning is complete so it can proceed with deploying and testing the code on those nodes.  Finally, it is also beneficial because one can employ the same code base to deploy the application to a production environment.
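
The "notify when provisioning is complete" step can be illustrated with a small in-process stand-in. This is a sketch under stated assumptions, not how DeploymentObjects does it: a real setup would coordinate across machines (the architecture above names ZooKeeper for that), and ProvisioningTracker, nodeReady, awaitAll, and addresses are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the coordination step: each node reports its
// hostname and IP, and the CI-side process blocks until all expected
// nodes have reported or a timeout elapses.
class ProvisioningTracker {
    private final CountDownLatch remaining;
    private final Map<String, String> ipByHostname = new ConcurrentHashMap<>();

    ProvisioningTracker(int expectedNodes) {
        this.remaining = new CountDownLatch(expectedNodes);
    }

    // Called when a node has restarted after installation and knows its IP.
    void nodeReady(String hostname, String ipAddress) {
        // Count each hostname only once, even if it reports repeatedly.
        if (ipByHostname.putIfAbsent(hostname, ipAddress) == null) {
            remaining.countDown();
        }
    }

    // Returns true if every expected node reported before the timeout.
    boolean awaitAll(long timeout, TimeUnit unit) {
        try {
            return remaining.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return false;
        }
    }

    // Snapshot of the addresses collected so far, ready for a naming service.
    Map<String, String> addresses() {
        return Map.copyOf(ipByHostname);
    }
}
```

The CI job would call awaitAll with a generous timeout and proceed to deployment and testing only when it returns true.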

Use Case #2: Deploy code to the environment
[todo]

Use Case #3: When a significant event is reported in a log file, trigger an action to occur in that environment
[todo]

Domain Modeling and the Ubiquitous Language

DDD places a high priority on Ubiquitous Language, which strives to make the terms used by domain experts the same as those used by software developers.  In a DDD implementation, names in the code, such as class, variable, and method names, closely resemble the technical jargon used by the experts.  An attempt at defining a ubiquitous language for DeploymentObjects produced the following language, with key modeling concepts in CAPS:

ENVIRONMENTS include HOSTS optionally organized by ROLES.  ENVIRONMENTS have identity and they are the aggregate root for HOSTS and ROLES.

A DISTRIBUTION is a mapping of APPLICATION DATA and ARTIFACTS to HOSTS by their ROLE.  A USER can DEFINE a DISTRIBUTION.

A DEPLOYMENT is built from a DISTRIBUTION applied to an ENVIRONMENT.
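
To show how these terms might carry over into code, here is a minimal Java sketch of ENVIRONMENT, HOST, ROLE, DISTRIBUTION, and DEPLOYMENT. All class and method names are illustrative assumptions, not the actual DeploymentObjects API. For brevity it assumes every HOST has a ROLE and represents ARTIFACTS by file name only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

record Role(String name) {}

record Host(String hostname, Role role) {}

// DEPLOYMENT: the result of applying a DISTRIBUTION to an ENVIRONMENT.
record Deployment(String environmentName, Map<String, List<String>> artifactsByHost) {}

// ENVIRONMENT is the aggregate root: HOSTS and ROLES are reached through it.
class Environment {
    private final String name; // identity of the ENVIRONMENT
    private final List<Host> hosts = new ArrayList<>();

    Environment(String name) { this.name = name; }

    String name() { return name; }

    void addHost(String hostname, String roleName) {
        hosts.add(new Host(hostname, new Role(roleName)));
    }

    List<Host> hostsWithRole(Role role) {
        List<Host> matches = new ArrayList<>();
        for (Host host : hosts) {
            if (role.equals(host.role())) matches.add(host);
        }
        return matches;
    }
}

// DISTRIBUTION: maps ARTIFACTS (here, plain file names) to ROLES.
class Distribution {
    private final Map<Role, List<String>> artifactsByRole = new LinkedHashMap<>();

    void assign(String roleName, String artifact) {
        artifactsByRole.computeIfAbsent(new Role(roleName), r -> new ArrayList<>())
                       .add(artifact);
    }

    // Resolves each ROLE to the HOSTS that hold it in the given ENVIRONMENT.
    Deployment applyTo(Environment environment) {
        Map<String, List<String>> artifactsByHost = new LinkedHashMap<>();
        for (Map.Entry<Role, List<String>> entry : artifactsByRole.entrySet()) {
            for (Host host : environment.hostsWithRole(entry.getKey())) {
                artifactsByHost.computeIfAbsent(host.hostname(), h -> new ArrayList<>())
                               .addAll(entry.getValue());
            }
        }
        return new Deployment(environment.name(), artifactsByHost);
    }
}
```

Because the DISTRIBUTION refers only to ROLES, the same DISTRIBUTION can be applied to a QA ENVIRONMENT and a production ENVIRONMENT with different HOSTS.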

HOSTS have access to STORAGE, which may be LOCAL storage on that HOST, an NFS server, checkpointed DISK IMAGES available to a HYPERVISOR, or ONLINE STORAGE such as Amazon’s S3. APPLICATION DATA resides on STORAGE.

A HYPERVISOR is a HOST with the ability to CONTROL the POOL of HOSTS running inside it, performing actions like creating HOSTS using unallocated disk and CPU resources.  Amazon’s EC2 also provides the ability to CONTROL a POOL of HOSTS, but unlike a HYPERVISOR, it is not a HOST.

A PROCEDURE is composed of a series of STEPS that can be run locally or DISPATCHED to another HOST.  A PROCEDURE can be used to CREATE an ENVIRONMENT, or to change the STATE of an existing ENVIRONMENT.  A USER can DEFINE a PROCEDURE.

A JOB is what is run at the time a USER requests that a PROCEDURE be applied to an ENVIRONMENT.  The system can be organized such that when an error takes place, it triggers some action by RUNNING a JOB.  A running JOB updates the state of the ENVIRONMENT through EVENTS.
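
The relationship between PROCEDURE, STEP, JOB, and EVENT could be sketched in Java as follows. The names and signatures are assumptions made for illustration; in particular, this sketch runs every STEP locally and models an EVENT as a plain string, whereas the real system would also DISPATCH STEPS to other HOSTS.

```java
import java.util.ArrayList;
import java.util.List;

// A STEP does one unit of work against an ENVIRONMENT and returns an
// EVENT describing the outcome (modeled here as a plain string).
interface Step {
    String run(String environmentName);
}

// A PROCEDURE is an ordered series of STEPS that a USER can DEFINE.
class Procedure {
    private final List<Step> steps = new ArrayList<>();

    Procedure then(Step step) {
        steps.add(step);
        return this; // allow chaining
    }

    List<Step> steps() {
        return List.copyOf(steps);
    }
}

// A JOB is one run of a PROCEDURE against an ENVIRONMENT. It collects
// the EVENTS emitted by its STEPS in the order they ran.
class Job {
    private final Procedure procedure;
    private final String environmentName;
    private final List<String> events = new ArrayList<>();

    Job(Procedure procedure, String environmentName) {
        this.procedure = procedure;
        this.environmentName = environmentName;
    }

    List<String> run() {
        for (Step step : procedure.steps()) {
            events.add(step.run(environmentName));
        }
        return List.copyOf(events);
    }
}
```

Since Step has a single method, a USER could define a PROCEDURE with lambdas, for example new Procedure().then(env -> "created " + env), and run it with new Job(procedure, "qa").run().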

A PACKAGE is an ARTIFACT with an externally-controlled resource STATE, whereas a BUNDLE is an ARTIFACT without externally-controlled resource STATE.

RPM is a type of PACKAGE, and TAR and JAR are types of BUNDLE.

An ARTIFACT can even be a collection of uncommitted source files in a directory on the user’s HOST, in which case it is neither a PACKAGE nor a BUNDLE.
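
One way to express this ARTIFACT taxonomy in Java is with marker interfaces, as in the hedged sketch below. The type names are hypothetical (PackageArtifact is used rather than Package to avoid confusion with java.lang.Package), and the sketch models only the classification, not installation.

```java
// Any deployable item.
interface Artifact {
    String name();
}

// PACKAGE: an ARTIFACT whose resource STATE is controlled externally.
interface PackageArtifact extends Artifact {}

// BUNDLE: an ARTIFACT without externally-controlled resource STATE.
interface BundleArtifact extends Artifact {}

record Rpm(String name) implements PackageArtifact {}

record Tar(String name) implements BundleArtifact {}

record Jar(String name) implements BundleArtifact {}

// Uncommitted source files in a directory: an ARTIFACT that is neither
// a PACKAGE nor a BUNDLE.
record SourceDirectory(String name) implements Artifact {}

final class Artifacts {
    private Artifacts() {}

    // Classifies an ARTIFACT using the terms of the ubiquitous language.
    static String kindOf(Artifact artifact) {
        if (artifact instanceof PackageArtifact) return "PACKAGE";
        if (artifact instanceof BundleArtifact) return "BUNDLE";
        return "ARTIFACT";
    }
}
```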

YUM is an ARTIFACT REPOSITORY that makes RPM PACKAGES available.  A MAVEN REPOSITORY is an ARTIFACT REPOSITORY that makes JAR BUNDLES available.  A VERSIONED DIRECTORY

A USER specifies a PROCEDURE contained in the MODEL to run in a specific ENVIRONMENT.

Some common PROCEDURES include:

  • an ENVIRONMENT CREATION PROCEDURE
  • a BUILD PROCEDURE, which gets source code from a SOURCE REPOSITORY to create a PACKAGE or BUNDLE and loads it into an ARTIFACT REPOSITORY
  • and an INSTALLATION PROCEDURE, which loads an ARTIFACT from a REPOSITORY onto a HOST by its ROLE.

An ENVIRONMENT built from SNAPSHOT STORAGE can be reverted to its original STATE.