Hi there. This isn’t so much a technical blog post as a post about my experiences trying to build a Docker-based CI/CD environment and discussing the matter with a customer.
The initial situation is as follows:
A couple of years ago we migrated a medium-sized, „post startup phase“ website to AWS-based cloud hosting, and I am responsible for the hosting, monitoring and alerting. The whole DevOps shenanigans, basically. We don’t provide any web development, but we offered the customer access to our preferred CI solution (Codeship), and I designed a pretty straightforward „Git push to production“ pipeline based on Codeship and AWS CodeDeploy. The customer often pushes to prod multiple times a day, which is basically the definition of continuous deployment.
The infrastructure itself is nothing special. You have your PHP app, served by Apache, running in a self-healing Auto Scaling group behind a load balancer. Add some MySQL, Redis, Memcached and CloudFront (all services by AWS) and you have a pretty solid website with 99.99% uptime during the last year.
Additionally, the customer wasn’t willing to pay more for infrastructure design than absolutely necessary, so I didn’t really touch a lot of stuff. I implemented some tweaks, added a little Lambda here and there and took care of the monthly updates (Packer FTW!), and that’s about it.
So now we have this site running for some time and the customer gets eager.
„What is up with all these new developments? I read something about Docker and Kubernetes and Microservices and stuff. We want that, too! We want to build and deploy Microservices with Docker and everything!“
So they asked me how I would build or plan something like that. At first they were pretty stoked about Kubernetes (which I really think is super dope and will probably be the standard in the near future).
But here is where the „We want that!“, the „Do we really need that?“ and the „And how much would that cost?“ worlds collided.
When they asked me if I would be willing to build an initial Kubernetes cluster, I started by asking questions back. Like „You want high availability, right? So you would need a three-server Kubernetes master cluster and a highly available etcd environment to avoid race conditions before you could even think about deploying the first service. Do you have any experience with Kubernetes configuration management? High availability? etcd? Service definitions? Pods? Storage management? Load balancing?“ and so on.
I would have had no problem providing a „Proof of Concept“ Kubernetes infrastructure, but there is always the risk that this proof of concept/test environment slowly turns into the new „prod“. And being responsible and „on call“ for a system with so many moving parts and points of failure wasn’t a risk I was willing to take.
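The number three is not arbitrary: etcd only accepts a write when a majority of its members agree on it. Here is a quick sketch of that arithmetic. It is plain majority math, not etcd code, and the function names are just my own.

```python
def quorum(members):
    """Smallest majority of a cluster with `members` voting nodes."""
    return members // 2 + 1


def tolerated_failures(members):
    """How many nodes may fail while a majority can still be formed."""
    return members - quorum(members)


for n in (1, 2, 3, 5):
    print(f"{n} member(s): quorum {quorum(n)}, survives {tolerated_failures(n)} failure(s)")
```

Two servers tolerate no more failures than one, so three is the smallest cluster that survives the loss of a node.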
But that is somewhat the point of my blogpost/title.
A lot of people/businesses have this blurry feeling that they have to do „something“ to keep up. They read everywhere about Docker and Kubernetes and Microservices and all that fancy bleeding edge stuff and they want all this cool „Facebook/Google/Netflix“ tech, too.
But they often lack the technical knowledge of how these things work on the infrastructure level. They don’t realize that there is no „turnkey“ solution for an HA Kubernetes cluster.
At least not one that I alone can provide in a couple of days. There are just so many moving parts that interlock: server provisioning and bootstrapping, builds, testing, deployment, versioning, health checks, autoscaling, storage persistence, databases, logging, networking, load balancing, disaster recovery and so on.
Of course you can hire yourself a team of consultants to plan and build you a Cluster, but you have to be able (or willing) to afford that.
All these parts are vital for a reliable infrastructure. Don’t get me wrong, it is surely doable. I really enjoy reading case studies of fancy infrastructure and experimenting with this technology. And I know this feeling of „Oh boy, I want to build something like that, too!“
But I, as a One Man DevOps „team“, am simply not able to build this kind of thing, which takes GitHub, Netflix or Google huge teams of highly skilled professionals and a couple of months to pull off.
As a „lone DevOps“, like probably a lot of other DevOps people, coders, system architects and sysadmins, I just can’t and won’t compete with a team of Google engineers.
But back to the customer. After we pretty much ruled out Kubernetes for now, I argued in favor of Amazon’s own EC2 Container Service. You get scheduling out of the box, you pay only for the running instances in the cluster, you don’t need a large overhead of pure management/master servers and it integrates seamlessly with the AWS toolset.
So we settled on that and I built a prototype CI/CD pipeline based on their reference architecture, purely with AWS tools. But when I tried to explain the pipeline, it became pretty obvious that there were quite a few misconceptions about how microservices are supposed to work. When I tried to explain how the built Docker images would be pushed to Docker Hub or ECR, the reaction was: „No, no, we don’t have them in a hub or something, we build them from public Docker images locally on our development machines.“
I had to explain that in order to use a Docker image anywhere other than your local machine, it has to be stored somewhere both sides can reach, such as a registry. You can’t just build it in location X and expect it to „just run“ in location Y.
I think this is the traditional monolith-thinking.
„I take my code, I build it on this machine and then it runs on this machine.“
But with a pipeline like that, one part of the pipeline just grabs the code, another part builds it and yet another part runs it. Of course that seems self-explanatory to someone who works with Docker and this kind of pipeline. But for others it isn’t. There was (and will be) quite some explaining to do.
But now I have a pipeline running and we are on the path to deploying the first microservice. My only worry right now is that they still don’t really get the „micro“ part of it. I had a look at some of their Dockerfiles and they basically build a „fat“ version of their app for local development with docker-compose right now. Which is totally ok, but I fear that they are just going to take a huge chunk of it, put it in a container, run it as a whole and call it a „microservice“.
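To make that hand-off a bit more tangible, here is a minimal Python sketch. It only composes the Docker CLI commands each stage would run and executes nothing. The registry host, the service name and the commit id are made-up placeholders, and the two stage functions are my own illustration, not the AWS reference architecture.

```python
# Sketch only: composes the docker commands per stage, runs nothing.
# REGISTRY, the repo name and the commit id are hypothetical placeholders.

REGISTRY = "123456789012.dkr.ecr.eu-west-1.amazonaws.com"  # assumed ECR-style host


def image_ref(repo, commit):
    """Full image name, tagged with the commit that produced it."""
    return f"{REGISTRY}/{repo}:{commit}"


def build_stage(repo, commit):
    """Runs on the build host: build the image, then push it to the registry."""
    ref = image_ref(repo, commit)
    commands = [
        ["docker", "build", "-t", ref, "."],
        ["docker", "push", ref],
    ]
    return ref, commands


def deploy_stage(ref):
    """Runs on a different host: it only gets the reference, so it has to pull."""
    return [
        ["docker", "pull", ref],
        ["docker", "run", "-d", ref],
    ]


ref, build_cmds = build_stage("my-service", "abc1234")
deploy_cmds = deploy_stage(ref)
for cmd in build_cmds + deploy_cmds:
    print(" ".join(cmd))
```

The only thing the stages share is the image reference. Without the push there is nothing for the deploy side to pull, which is exactly the part that got lost with „we build them locally“.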
I am also worried about the Docker build times.
As they have no prebuilt base containers and build everything from scratch, the build times on CodeBuild might get problematic, as it doesn’t support layer caches.
So I am pretty sure that, if they are not willing to provide a preconfigured build container, the build times for this microservice are going to exceed the current build time for their „classic“ infrastructure via Codeship/CodeDeploy by far.
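A toy model shows why this matters. It assumes, in simplified form, that a layer can only be reused if the same step was already built on top of the same parent layer. The Dockerfile steps are invented for illustration, they are not the customer’s real ones, and real Docker also checksums the files behind a COPY.

```python
import hashlib


def layer_ids(steps, parent=""):
    """Chain-hash the steps: each layer id depends on everything below it."""
    ids = []
    for step in steps:
        parent = hashlib.sha256(f"{parent}\n{step}".encode()).hexdigest()
        ids.append(parent)
    return ids


def steps_to_run(steps, cache, parent=""):
    """Count the steps whose layer is not in the cache and must be executed."""
    return sum(1 for layer in layer_ids(steps, parent) if layer not in cache)


# Hypothetical steps, split into the slow "base" part and the app part.
base_steps = [
    "FROM php:7.1-apache",
    "RUN apt-get update && apt-get install -y libpng-dev",
    "RUN docker-php-ext-install gd",
]
app_steps = ["COPY . /var/www/html", "RUN composer install"]

# Fresh build host without any layer cache: every step runs.
cold = steps_to_run(base_steps + app_steps, cache=set())

# Prebuilt base image: the app Dockerfile would start FROM that image,
# so only the app steps run on top of it, even without a cache.
base_image = layer_ids(base_steps)[-1]
with_base = steps_to_run(app_steps, cache=set(), parent=base_image)

# Same host with a warm cache and nothing changed: no step runs.
warm_cache = set(layer_ids(base_steps + app_steps))
warm = steps_to_run(base_steps + app_steps, cache=warm_cache)

print(cold, with_base, warm)  # 5 2 0
```

Pulling the base image is not free either, but it is usually a lot cheaper than running the package installs on every single build.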
I suggested having a look at Codeship Pro or CircleCI (which supports layer caching afaik), but right now they don’t want to spend more than absolutely necessary.
So we will see. The pipeline will probably go live soon, and I am curious about the next misconceptions, misunderstandings and complaints I will hear when things don’t quite work the way they expected.