Continuous Integration and Continuous Delivery at Scale

Posted on March 1, 2021 by Shweta Oak

Shweta Oak Senior Manager, Engineering

Introduction 

Continuous Integration and Continuous Delivery (CI-CD) processes can benefit organizations through accelerated time to market, reliable releases, and overall improvements in software quality. At PubMatic, we deploy cutting-edge products to help stay on top of industry changes and evolving trends. Throughout our history, as our business grew, our release cycles shortened. Our CI-CD infrastructure and processes played an important role in delivering releases faster at the highest quality.

What we set out to achieve 

A solid CI-CD infrastructure cannot be built overnight; it is a journey of change that takes time and a committed team. In setting our plans for a growing global organization, we developed a strategic vision and defined measurable goals for our CI-CD approach:

  • Every change to an application should result in a stable releasable version. 
  • Early on, we had one build and deployment job per application, maintained by the developers. The objective was to move to one custom job per technology stack (Java, Go, Angular, Spark, and ML applications).
  • Code quality checks must be enforced in an automated manner. 
  • Rely less on maintaining and prepping local environments; host integration environments for developers and QA engineers.
  • Developers should be empowered to focus more on code and not on operational tasks that may slow them down. 

There were bound to be challenges on this journey. PubMatic has a diverse technology stack and an expanding product line. We needed to build a pipeline that could deploy code easily and quickly across multiple data centers distributed around the globe. This pipeline also needed to be highly available to meet the demands of a geographically diverse engineering team. As expected with any new technology or process, we faced questions about adoption and the learning curve in the early stages.

How we took it on 

We set up a task force with a focused group of engineers, data center operations group members, and architecture group members for this initiative. The core CI team took a minimalistic approach to solving the problem by starting with simple goals and solutions and then improving on them. This team researched and chose what it believed were the best-in-class tools to meet our needs. We started by building a programmable and customizable pipeline iteratively in small chunks.

One key to our success was not giving people a way to bypass the CI-CD pipeline. In fact, we incentivized teams by linking CI-CD compliance to agile team maturity metrics and published achievements and progress reports on a quarterly basis. The core group conducted company-wide and team-specific knowledge sharing and training sessions to increase familiarity with the tools and custom frameworks. We established modes of communication, such as dedicated Slack channels, for quick resolution of issues faced by the engineering teams. Most importantly, we highlighted success stories within the company to motivate everyone to get on board.

The nuts and bolts 

A simplified representative view of a CI-CD pipeline:

 

The first objective was to standardize the release workflow. We standardized the deployment stages as development, integration, pre-production, and production. We also standardized code-branching practices across teams and established code quality and coverage SLAs.

Automation and infrastructure as code are some of the keys to the success of CI-CD. 

Automation, automation, and automation! 

Automated Build

We developed scripts for building, testing, delivering, and configuring code. We built a generic build pipeline library that follows the pre-defined workflow stages, with hooks for customization based on the technology stack. We started by focusing on one technology stack (Java-based microservices) and incorporated other tech stacks as adoption increased. The pipeline library, which is deployed on Jenkins, was created using the Blue Ocean plugin, which helps with visualization and monitoring of the pipeline. It is tightly integrated with Git and scans for any applications that are "CI-CD" enabled. The build pipeline is automatically triggered on commits or code merge events. It includes hooks for, but not limited to, the following:

  • Reference to the codebase git location 
  • Build environment 
  • Release packaging preferences 
  • Health check service 
  • Data centers for production deployment 
  • Enabling add-on stages such as Swagger integration, CDN deployment, and database change deployment
  • Email list for approvals and notifications 

This made CI-CD integration for new applications extremely easy. Each new application simply needed to conform to the following prerequisites: 

  • Containerize the application with environment-specific configuration files 
  • Define a CI-CD configuration file in the application Git repository and override the common pipeline defaults using the hooks provided above 
  • Optionally define environment-specific resource requirements as configuration (RAM, CPU, auto-scaling needs, etc.). Defaults are used as a fallback 
  • The pipeline takes care of deploying the application with the relevant configuration as per the deploy stage. 

Examples of configuration files: 
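As a purely hypothetical illustration, and not PubMatic's actual file format, the Python sketch below models how an application's CI-CD configuration might override common pipeline defaults. Every key name, value, and the `resolve_config` helper is an assumption made for the example.

```python
from copy import deepcopy

# Hypothetical pipeline-wide defaults; every key name here is an assumption.
PIPELINE_DEFAULTS = {
    "build_environment": "java",
    "packaging": "docker",
    "health_check_path": "/health",
    "data_centers": [],
    "addons": {"swagger": False, "cdn": False, "database_changes": False},
    "notify": [],
    "resources": {"cpu": "500m", "memory": "512Mi", "autoscale": False},
}


def resolve_config(app_overrides, defaults=PIPELINE_DEFAULTS):
    """Return a copy of the defaults with the application's overrides applied.

    Nested dictionaries are merged key by key; any other value replaces
    the default outright. The defaults themselves are never mutated.
    """
    resolved = deepcopy(defaults)
    for key, value in app_overrides.items():
        if isinstance(value, dict) and isinstance(resolved.get(key), dict):
            resolved[key] = resolve_config(value, resolved[key])
        else:
            resolved[key] = deepcopy(value)
    return resolved


# A hypothetical application that only states what differs from the defaults.
app_config = resolve_config({
    "build_environment": "go",
    "data_centers": ["dc-east", "dc-west"],
    "addons": {"swagger": True},
    "resources": {"memory": "1Gi"},
})
```

With this shape, an application states only what differs from the defaults, and anything it leaves out (here the health check path and CPU request) falls back to the shared values.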

Automated integration tests

The main purpose of CI-CD infrastructure is to identify bugs early in the release cycle. Keeping this in mind, we added an integration environment as a deploy stage. Developers merge code into the integration branch as early and as frequently as possible, which triggers a build and, once the code quality checks pass, a deployment to the integration environment. The integration environment runs automated test suites and reports results.
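The fail-fast behaviour of such a flow can be sketched in a few lines of Python. This is a minimal model rather than the actual Jenkins pipeline: the stage names follow one reading of the order described above, and the lambdas are placeholders standing in for real build, deploy, and test steps.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a step that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure.

    Returns (succeeded, names of the stages that were attempted).
    """
    attempted = []
    for name, step in stages:
        attempted.append(name)
        if not step():
            return False, attempted
    return True, attempted


# Placeholder steps; the last one simulates a failing test suite.
integration_stages = [
    ("build", lambda: True),
    ("code quality checks", lambda: True),
    ("deploy to integration", lambda: True),
    ("automated test suites", lambda: False),
]

ok, attempted = run_pipeline(integration_stages)
```

Because the runner stops at the first failing stage, a broken quality check never reaches the integration environment, and a failing test suite is reported before any later stage runs.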

Automated deployment

The pipeline can deploy any version of the software to any environment, such as development, integration, pre-production, and production, on demand at the push of a button. These deployments to various environments are secured via role-based approvals. The approver list is configurable at a per-application level. We deploy our applications to a private cloud orchestrated with Kubernetes. The pipeline can enable autoscaling for every application.
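A per-application approver list could be modelled roughly as follows. This is a sketch under stated assumptions: the application name, e-mail addresses, and the `can_deploy` helper are all hypothetical, and the deny-by-default behaviour is a design choice of the example, not something the pipeline is known to do.

```python
# Hypothetical approver lists, keyed by application and then by environment.
APPROVERS = {
    "billing-service": {
        "pre-production": {"qa-lead@example.com"},
        "production": {"release-manager@example.com", "eng-manager@example.com"},
    },
}


def can_deploy(app, environment, approved_by):
    """Return True only if `approved_by` is a listed approver.

    Unknown applications or environments have an empty approver set,
    so the check denies by default.
    """
    allowed = APPROVERS.get(app, {}).get(environment, set())
    return approved_by in allowed
```

For example, `can_deploy("billing-service", "production", "qa-lead@example.com")` returns `False`, because that approver is listed only for pre-production.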

Database as code

Database changes are kept in GitHub source control so that they can be version controlled and deployment jobs can update databases to the required versions without manual intervention. We use Liquibase, an open-source database change management tool that manages DB changes through SQL changesets. Anyone across engineering can easily propagate database changes to different environments, such as development, test, or demo, instead of directly sourcing or altering the database manually. We can track which database changes were executed, by whom, and when.
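The idea behind changeset tracking can be shown with a small Python stand-in. This is not Liquibase and does not use its file format or API; it is a simplified sketch using the standard `sqlite3` module, with a hypothetical `change_log` table and made-up changesets, to show how recording each applied change makes deployments repeatable and auditable.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical changesets: (id, author, SQL statement).
CHANGESETS = [
    ("1", "alice", "CREATE TABLE campaign (id INTEGER PRIMARY KEY, name TEXT)"),
    ("2", "bob", "ALTER TABLE campaign ADD COLUMN budget REAL"),
]


def apply_changesets(conn, changesets):
    """Apply changesets not yet recorded in change_log; return the ids applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS change_log "
        "(id TEXT PRIMARY KEY, author TEXT, executed_at TEXT)"
    )
    already_applied = {row[0] for row in conn.execute("SELECT id FROM change_log")}
    applied_now = []
    for changeset_id, author, sql in changesets:
        if changeset_id in already_applied:
            continue  # skip changes a previous deployment already ran
        conn.execute(sql)
        conn.execute(
            "INSERT INTO change_log VALUES (?, ?, ?)",
            (changeset_id, author, datetime.now(timezone.utc).isoformat()),
        )
        applied_now.append(changeset_id)
    conn.commit()
    return applied_now


conn = sqlite3.connect(":memory:")
first_run = apply_changesets(conn, CHANGESETS)   # applies both changesets
second_run = apply_changesets(conn, CHANGESETS)  # nothing left to apply
```

Running the same changesets twice is safe because the log table, not the person deploying, decides what still needs to run. The same table answers who executed which change and when.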

Injecting compliance into the CI-CD pipeline

We integrated code analysis and code coverage tools such as SonarQube into the pipeline with a strict quality gate that fails non-compliant builds. We have made peer code reviews and code sign-offs mandatory via the pipeline. We have also added security compliance tools to the pipeline for checking and reporting Docker vulnerabilities.
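A quality gate of this kind boils down to comparing a build's measurements against agreed thresholds. The sketch below is a generic illustration and does not call SonarQube or any scanner; the `QualityReport` fields and the 80% coverage threshold are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class QualityReport:
    """Hypothetical summary of one build's analysis results."""
    coverage_percent: float
    blocker_issues: int
    critical_vulnerabilities: int


# Assumed threshold for illustration only.
MIN_COVERAGE = 80.0


def gate_failures(report, min_coverage=MIN_COVERAGE):
    """Return a list of human-readable reasons the build violates the gate."""
    failures = []
    if report.coverage_percent < min_coverage:
        failures.append(
            f"coverage {report.coverage_percent:.1f}% is below {min_coverage:.1f}%"
        )
    if report.blocker_issues > 0:
        failures.append(f"{report.blocker_issues} blocker issue(s) found")
    if report.critical_vulnerabilities > 0:
        failures.append(
            f"{report.critical_vulnerabilities} critical image vulnerability(ies) found"
        )
    return failures


def build_passes(report):
    """A build passes only when no gate condition is violated."""
    return not gate_failures(report)
```

Returning the reasons rather than a bare pass or fail lets the pipeline tell a developer exactly which condition blocked the build.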

Monitoring

The success of CI-CD infrastructure depends on the feedback loop. If there is a failure at any step of the pipeline, the team is alerted. We have integrated third-party tools such as Slack for alerting, and Nagios and Grafana for monitoring all steps of the pipeline and all environments in the CI-CD infrastructure.
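A failure alert can be as simple as a short message posted to a chat webhook. The following sketch assumes a webhook that accepts a JSON body with a `text` field; the function names, message wording, and example values are hypothetical, and `send_alert` is defined but deliberately not called here.

```python
import json
import urllib.request


def build_alert(app, stage, environment, detail):
    """Build a JSON-serialisable message describing a failed pipeline stage."""
    text = f"[{environment}] {app}: stage '{stage}' failed. {detail}"
    return {"text": text}


def send_alert(webhook_url, payload):
    """POST the payload as JSON to a chat webhook and return the HTTP status.

    The webhook URL is supplied by the caller; none is assumed here.
    """
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


payload = build_alert(
    "billing-service", "automated test suites", "integration", "3 tests failed"
)
```

Keeping message construction separate from delivery makes the alert text easy to test without touching the network.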

Taking it to the next level 

We have come a long way in enhancing the CI-CD infrastructure. The CI-CD infrastructure itself is deployed on the private cloud and is autoscalable. We have several teams and hundreds of applications using our CI-CD pipeline across multiple deployment stages or environments. The pipeline supports hundreds of concurrent builds and deployments across multiple data centers. The average build time for any application is under 5 minutes.

  • We can add support for a custom build environment for a new technology stack (Java, Go, JavaScript, Spark, etc.) by just creating a new Docker image for the build environment and plugging it into the pipeline library. This also enables quick version upgrades for Java, Go, or npm. 
  • We have added support for publishing API documentation (Swagger) to our CDN server, with mandatory reviews through the release pipeline. 
  • A new deploy stage can be easily plugged into the pipeline and utilized by applications with minimal additional configuration. 
  • The uptime of the CI-CD infrastructure has a direct impact on our daily work and developer productivity. The upkeep of the tools and frameworks used in the CI-CD stack is equally important. We have a parallel CI-CD setup that is in sync with the master setup at all times, which can be used for failover and can double as a place to apply and test upgrades to any CI-CD component, such as a code quality tool or the pipeline itself. 

Conclusion 

Adopting CI-CD is not just about building something using technology to accelerate time to market, releasing quality products with confidence, and reducing the overall cost of manual efforts. It can elevate the entire software development process. It makes enforcement of SLAs such as unit test coverage, functional test coverage, code quality metrics, and security checks easier. There are many opportunities for optimization and improvement.   

We have learned on this journey that CI-CD is not solely the responsibility of an infrastructure team or an operations team. We are now looking at developing a highly cross-trained engineering team where anyone can contribute to the CI-CD infrastructure.