Having reliable pipelines to build, test, compile and ship your products is, in my opinion, the backbone of a software organisation. Having good developers is one thing, but if you don't have the pipe that gets the product "out there" for customers to see as soon as humanly possible, then you have yourselves a bottleneck.
Delivering a product quickly and efficiently is the first of the "Three ways of DevOps": increasing the "flow" from left to right. 'Left' is the action of committing your code. 'Right' is the moment a customer can use the product, whether that's live on a production site or as an artefact available to download. This also ensures that we don't pass defects downstream, that the system is understood by the entire team, and that the emphasis is on the performance of the system overall.
One of the exciting challenges we have here at Alfresco is the opportunity to improve on our current pipelines and think outside the box. Innovation is key. Just because we have historically used a central system doesn't mean we are tied to using only that system. Teams within Alfresco are now being encouraged to design, implement, manage and maintain their own build pipelines. Whether the pipelines remain on Bamboo or move to a different platform entirely is a decision each team makes together.
The DevOps engineers have recently been reorganised to embed in the teams that require this support and to spread our culture. This change to our structure and daily work has given us the chance and freedom to investigate and work on technologies we may not previously have had the opportunity to explore. We all still meet weekly to discuss our work, to make sure we aren't reinventing the wheel or duplicating effort.
My recent challenge was to evaluate a pipeline for the team I work in. I needed to look at the throughput and the duration of the builds. When the builds were triggered, they consumed most of the central build resources, which left other teams' jobs sitting in a queue and prevented them from building any of their products. And when these jobs were building, they could sometimes take up to 9 hours to complete!
This is far too long and contradicts the second of the "Three ways of DevOps": amplifying the "feedback" loop, right to left. We need to be able to improve how the system works and make corrections continually. The goal is that all customers, internal and external, are understood and responded to. These actions also further embed knowledge of the working systems in the teams working on them.
As part of my evaluation I discovered some interesting things. The most interesting was that our use of "elastic agents" was flawed. Elastic agents are build agents that are created dynamically on AWS, based on demand from our central build server. Each elastic agent is stateless, and they all spin up with the same AMI. That's how it should be. The catch is that most, if not all, of the jobs we run need to download libraries and dependencies before they can run their main tasks, and a stateless agent starts with none of them. So we could have nearly 20 minutes of downloads happening to run 2 minutes' worth of unit tests!
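To put the figures above into proportion, here is a minimal sketch of the arithmetic. The function name is my own and purely illustrative, and the inputs are the rough numbers quoted above (about 20 minutes of downloads for 2 minutes of tests), not measured values.

```python
def setup_overhead(download_minutes: float, task_minutes: float) -> float:
    """Return the fraction of total job time spent downloading dependencies."""
    total = download_minutes + task_minutes
    if total <= 0:
        raise ValueError("total job time must be positive")
    return download_minutes / total


# Rough figures from the text: ~20 minutes of downloads, 2 minutes of unit tests.
overhead = setup_overhead(20, 2)
print(f"{overhead:.0%} of the job is spent downloading")  # prints "91% of the job is spent downloading"
```

In other words, on those numbers roughly nine-tenths of the agent's time goes on work that a warm cache or a pre-provisioned image would avoid.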
To fix this we attempted to use EFS. This allowed all the elastic agents to launch as normal but then mount the EFS volume themselves, using this volume as a single repository for commonly downloaded libraries. This approach seemed to work at first, but we then hit artefact and snapshot conflicts, so we reverted. It is still a valid approach, but we have yet to determine how to manage and implement the correct levels of segregation for the shared resources and job outputs.
Another approach I took, which is one of my previously mentioned POCs, was to use a pre-provisioned Docker image that ALREADY contains the common libraries and dependencies in the correct directories. This means that when an instance of the image (a container) is running, all the build agent has to do is check out the source code and run the build commands. Sounds great! This method could be used both on our current build system and in conjunction with my second POC. We'll get to that one in a bit.
The HUGE benefit of having Docker containers build your code is that the teams can manage their own build images. All we need to do is configure the jobs to use the images we produce. Sounds so simple. But with this implementation comes the need to skill up devs who may not have even installed software on the command line, let alone used Docker to manage it. Still, that's another challenge for us to face.
Moving on to my second POC now. This was to "prove" that we could use AWS and its services to build, test and deploy our code in a very aggressive manner. We're talking multiple deployments per day. It is very possible to achieve this, but the work, time and cost involved are the greatest of all the approaches taken so far. It also requires skills in multiple areas (AWS, Docker, Lambda coding) that the teams may not have the time to invest in. We all have deadlines to meet. With this, I was able to create an AWS Pipeline that glued together a mirrored repo on AWS CodeCommit and a build project using AWS CodeBuild. Once we have a strategy for deploying test/staging systems, we can start to add AWS CodeDeploy to the pipeline. Getting this up and running took a matter of an hour or two. It was so simple. I was also able to produce an architectural diagram per branch to see the workflows and resources created.
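Separately from the diagram, here is a rough sketch of how the CodeCommit-to-CodeBuild wiring described above could be expressed, assuming the pipeline is an AWS CodePipeline with one source stage and one build stage. This is not the configuration we actually used: the function name, pipeline name, role ARN, bucket, repository, branch and project names are all hypothetical placeholders. The code only builds the definition as plain data; I believe a dictionary of this shape is what boto3's CodePipeline `create_pipeline` call accepts as its `pipeline` argument, but treat that as an assumption and check it against the AWS documentation.

```python
import json


def build_pipeline_definition(name, role_arn, artifact_bucket,
                              repo_name, branch, build_project):
    """Build a two-stage (Source -> Build) pipeline definition as plain data."""
    source_action = {
        "name": "Source",
        "actionTypeId": {"category": "Source", "owner": "AWS",
                         "provider": "CodeCommit", "version": "1"},
        # The mirrored repository and the branch this pipeline follows.
        "configuration": {"RepositoryName": repo_name, "BranchName": branch},
        "outputArtifacts": [{"name": "SourceOutput"}],
    }
    build_action = {
        "name": "Build",
        "actionTypeId": {"category": "Build", "owner": "AWS",
                         "provider": "CodeBuild", "version": "1"},
        "configuration": {"ProjectName": build_project},
        # The build stage consumes what the source stage produced.
        "inputArtifacts": [{"name": "SourceOutput"}],
        "outputArtifacts": [{"name": "BuildOutput"}],
    }
    return {
        "name": name,
        "roleArn": role_arn,
        "artifactStore": {"type": "S3", "location": artifact_bucket},
        "stages": [
            {"name": "Source", "actions": [source_action]},
            {"name": "Build", "actions": [build_action]},
        ],
    }


# All values below are hypothetical placeholders, not real resources.
definition = build_pipeline_definition(
    name="example-pipeline-feature-x",
    role_arn="arn:aws:iam::123456789012:role/example-pipeline-role",
    artifact_bucket="example-pipeline-artifacts",
    repo_name="example-mirrored-repo",
    branch="feature-x",
    build_project="example-build-project",
)
print(json.dumps(definition, indent=2))
```

Generating one definition per branch like this is one way to get a pipeline per branch. A CodeDeploy stage could later be appended to the `stages` list once the deployment strategy is settled.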
This is still an ongoing piece of work, and the decision on whether to use this idea is still up for discussion. Having the freedom to experiment and break from the status quo is a perfect segue to the third of the "Three ways of DevOps": continual experimentation and learning. Within this team, and indeed Alfresco, I was able to explore technical alternatives, experiment with new systems, learn new products and fail rapidly. There's no better way to learn than from your own mistakes. We need to be able to take risks as often as possible. If we didn't, we wouldn't be the business we are today.
Take risks in your day-to-day duties! Add comments and thoughts to this post to let the community know what you have overcome!