Notes on CodeBuild and CodePipeline

Ashley Frieze | May 20th, 2021 | Last Updated: May 17th, 2021


Shall we use Jenkins or CodeBuild? The eternal question.

On the one hand, Jenkins allows you to define builds and pipelines inside your repository. On the other hand, a Jenkins server is something of a beast to maintain.

CodeBuild and CodePipeline are essentially serverless Jenkins… but with some key stuff missing.

What’s Missing?

Some Bad Things Are Gone

Server patching
Management of the number of build agents
Right-sizing the Jenkins server for the load
Paying for server time while it’s idle
Random variations between Jenkins host configurations

Some Good Things Are Missing

A clean pipeline scripting language
Auto-discovery of builds in an organization
Holistic dashboards to navigate your organization
A single picture of a build/pipeline

CodeBuild and Pipeline FTW

Despite the potential short-term shortcomings of losing Jenkins, if you’re already using AWS, the CodeBuild approach is just too easy to resist. There are two possible tipping points. If you already have a huge estate in Jenkins, it may not be worth migrating. And if you have a complex set of builds that call each other, then Jenkins may help.

Finally, if some admins somewhere are happy to run a Jenkins farm for you, then maybe it’s 50/50 which way to go.

Making Up The Gap

In my current assignment, I’ve tried to solve three problems:

Scanning the repos of our GitHub organization to automatically produce builds and pipelines
Standardising how builds and pipelines work, with the right amount of copy and paste
Consequently, using declarative pipeline scripting for deployment

I’ve solved this with my own tooling, and I’ll share approximately how I did that.

The Right Amount of Copy and Paste

We tended to find our Jenkinsfile was the source of a lot of copy and paste, and maybe this is OK. A basic buildspec.yaml which spits out builds for every branch doesn’t have that much variance, but it is pretty much impossible to pull in by reference.

If key bits of boilerplate across repo builds are shared assets, you get weird cross-project dependencies that aren’t really that beneficial.

What Is The Standard Build Process?

There are many ways to cook CodeBuild and CodePipeline. Our process is:

CodeBuild is triggered by every push to GitHub to do a build
If the push is to main, then the CodeBuild spec packages up the assets of the build and drops them into S3 for CodePipeline
CodePipeline watches S3 and runs when there’s a new package

Though CodePipeline can be triggered by GitHub, we don’t want a deployment for every branch. We DO want a build for every branch.
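The rule above (build every branch, package only main) boils down to one decision per push. Here is a minimal sketch of that decision, assuming the branch arrives as a Git ref such as `refs/heads/main`. The names `BuildAction` and `actionForPush` are made up for illustration and are not part of CodeBuild.

```typescript
// Hypothetical names: this only models the decision, not any AWS API.
type BuildAction = "build-only" | "build-and-package";

// Assumes the push is identified by a Git ref ("refs/heads/main")
// or a bare branch name ("main").
function actionForPush(ref: string, deployBranch: string = "main"): BuildAction {
  const branch = ref.replace(/^refs\/heads\//, "");
  // Only the deploy branch produces a package in S3 for CodePipeline to pick up.
  return branch === deployBranch ? "build-and-package" : "build-only";
}
```

In practice this check would more likely live in the buildspec itself as a shell condition on the branch; the function just makes the rule explicit.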

How To Produce Builds and Pipelines

CodeBuild and CodePipeline can be created from CloudFormation templates. If we had some sort of tool that could generate a CloudFormation template to describe all the builds and pipelines we need, then we could deploy that template and the builds would exist.

So how about we create a scanning tool as a scheduled lambda, which scans GitHub and outputs the template? Then, let’s have a CodePipeline job which consumes that output template and deploys it to create the various builds. Oh, and let’s define the build and pipeline for the scheduled lambda, as well as the pipeline for the generated template, within this uber-template too.

This means the first time you run the lambda locally, it spits out the template to bootstrap having the whole architecture in AWS.

That’s quite cool.

It also means that if the scan notices a repo that no longer exists, or no longer needs a build or pipeline, the produced template will omit those resources. Deploying the template therefore creates, updates and deletes builds and pipelines as required.
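To make the idea concrete, here is a rough sketch of turning a scan result into a template object. It is not the actual tool: `RepoInfo`, `logicalId` and `generateTemplate` are hypothetical names, the repo list is assumed to come from a GitHub scan, and the CodeBuild resource is deliberately incomplete.

```typescript
// Hypothetical shape of one scan result; the real scan would come from GitHub.
interface RepoInfo {
  name: string;
  hasBuildspec: boolean;
}

interface Template {
  AWSTemplateFormatVersion: string;
  Resources: Record<string, { Type: string; Properties: Record<string, unknown> }>;
}

// CloudFormation logical IDs must be alphanumeric, so strip everything else.
// Note: two repo names differing only in punctuation would collide here.
function logicalId(repoName: string): string {
  return repoName.replace(/[^A-Za-z0-9]/g, "") + "Build";
}

function generateTemplate(org: string, repos: RepoInfo[]): Template {
  const template: Template = { AWSTemplateFormatVersion: "2010-09-09", Resources: {} };
  for (const repo of repos) {
    // Repos without a buildspec simply never make it into the template.
    if (!repo.hasBuildspec) continue;
    template.Resources[logicalId(repo.name)] = {
      Type: "AWS::CodeBuild::Project",
      Properties: {
        Name: repo.name,
        Source: { Type: "GITHUB", Location: `https://github.com/${org}/${repo.name}.git` },
        // A deployable project also needs Artifacts, Environment and ServiceRole.
      },
    };
  }
  return template;
}
```

Because the template is rebuilt from scratch on every scan, a repo that disappears just stops being emitted, and CloudFormation removes the matching resource when the new template is deployed.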

So How Do I Create Such A Thing?

How precisely you choose to do this is a question of what you can make and support. Here’s the tech stack I used:

TypeScript for the Lambda
Octokit as a GitHub client
yaml as a JavaScript/TypeScript yaml reader/writer
jszip to produce zip files for pushing to S3

My solution compares the template it wants to store in S3 with the one already there, and doesn’t republish if there’s no difference.
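The comparison itself can be tiny. As a sketch, assuming both templates are available as strings (fetching the stored copy from S3 is left out), something like this would do. `needsPublish` and `normalise` are hypothetical names, and ignoring line-ending and trailing-whitespace differences is my assumption rather than a detail of the original tool.

```typescript
// Assumption: cosmetic differences (line endings, trailing whitespace)
// should not count as a change.
function normalise(template: string): string {
  return template.replace(/\r\n/g, "\n").replace(/\s+$/, "");
}

// `existing` is undefined when nothing has been published yet.
function needsPublish(existing: string | undefined, candidate: string): boolean {
  if (existing === undefined) return true;
  return normalise(existing) !== normalise(candidate);
}
```

Skipping the upload when nothing changed means the pipeline watching S3 is not triggered by every scheduled scan.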

The Weird Quirk with `!Ref`

The final detail is that CloudFormation templates in .yml like to use !Ref in place of certain values to refer to other parts of the template. It turns out that getting this to work with a generic .yml reader is pretty hard going. The same would be true for the other short-form functions (not used in my solution).

I switched to using the alternative variation:

SomeParameter: !Ref TheReference
 
# becomes
 
SomeParameter:
  Ref: TheReference

That’s not really that hard.
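On the generating side, the long form is just a plain object with a single `Ref` key, which any generic YAML writer can serialise without custom tags. A minimal sketch, where the helper name `ref` is made up for illustration:

```typescript
// Long-form intrinsic function: a plain object with one "Ref" key.
function ref(logicalName: string): { Ref: string } {
  return { Ref: logicalName };
}

// Hypothetical fragment of a resource's properties.
const resourceProperties = {
  SomeParameter: ref("TheReference"),
};
```

Written out as YAML, `resourceProperties` takes the nested `SomeParameter:` / `Ref: TheReference` shape shown above.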

Conclusion

It took quite some doing to get this going, but in the first few hours of using it, it pretty much spewed out all my builds and pipelines and they worked. I’ve allowed each repo to have some control over the fine details of the builds and pipelines if they want to, but there are few places where I’ve needed it so far.

This solution seems to use the available tools quite nicely. I’m sure it’ll have some growing pains, but equally, it seems a good balance of standards and copy/paste boilerplate.

To be clear, to get a build, I just need to write a buildspec.yaml file, and a build will show up within a few minutes (unless I get impatient and trigger the lambda by hand).

To get a pipeline, I just write a small custom .yml file containing a few parameters that describe which features of our standard pipeline I want to use.

Customisations are achieved by allowing a mix-in approach, where we can add some override values to the standard section of the CloudFormation template, so there’s as little new syntax to explain as possible.
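One way to picture the mix-in is as a recursive merge of the override values onto the standard section. This is a sketch of that technique rather than the actual implementation: `mixIn` is a hypothetical name, and replacing arrays wholesale instead of merging them is an assumption.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function isObject(value: Json | undefined): value is { [key: string]: Json } {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Objects merge key by key; anything else (scalars, arrays) is replaced
// by the override. Neither input is modified.
function mixIn(standard: Json, overrides: Json): Json {
  if (!isObject(standard) || !isObject(overrides)) return overrides;
  const result: { [key: string]: Json } = { ...standard };
  for (const key of Object.keys(overrides)) {
    result[key] = key in standard ? mixIn(standard[key], overrides[key]) : overrides[key];
  }
  return result;
}

// Illustrative values only: a repo bumps the compute size and keeps the rest.
const standardBuild: Json = {
  Environment: { ComputeType: "BUILD_GENERAL1_SMALL", Image: "aws/codebuild/standard:5.0" },
  TimeoutInMinutes: 30,
};
const customised = mixIn(standardBuild, { Environment: { ComputeType: "BUILD_GENERAL1_LARGE" } });
```

Because the overrides use the same property names as the CloudFormation resource they modify, there is nothing new to learn beyond the template syntax itself.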

I may report back in future to say that this ultimately made life hard, but we’re really enjoying the honeymoon period with this approach at the moment.

Published on Java Code Geeks with permission by Ashley Frieze, partner at our JCG program. See the original article here: Notes on CodeBuild and CodePipeline

Opinions expressed by Java Code Geeks contributors are their own.