Managing 1000+ Git repositories in Jenkins with ease

This is a pattern that I find simple and quick to set up, while still keeping you in control of your build flow. It should be no problem applying it in an organization with a huge number of repositories!

I am exclusively involved in Java projects, so this post will sometimes assume Maven is being used, but it can probably inspire a solution in other projects too. I will not supply complete running code; I'm just going to include small snippets and focus on explaining the general idea.

Problems

Splitting a big application into several smaller artifacts is a good thing! To me that is obvious, but I still find myself talking to people who don't agree. Here are some of the arguments I hear for why not to split applications.

"I want to build whatever I have checked out, locally, on my filesystem. We will need to spend many frustrating hours stepping dependencies between artifacts."

"It is hard to detect when artifacts no longer fit together. We will find severe problems late because we don't continuously integrate all the artifacts."

"We will need a huge amount of jobs in Jenkins (releasing, testing, integrating, deploying, snapshots...). We will need to spend much time managing them."

Ok! All valid points! And all of them are more or less show stoppers if you don't do continuous integration right!

Solution

In short, I propose a solution where you:

  • Define a clear branching strategy.
  • Define a translation strategy between branch and Maven artifact version.
  • Define how any given repo should be built.
  • Define, for each repo, which repos depend on it.
  • Add a Jenkinsfile to each repo.
  • Create a shared pipeline library.
  • Automate the creation of the jobs.

You might consider Pipeline Multibranch Plugin. I use Job DSL and have it create ordinary pipeline-jobs.

  • It gives me something static, the name of each job, to use when chaining jobs that depend on each other.
  • It also allows several jobs to work with the same branch. I can easily create a release-job and a snapshot-job that both work with the develop branch.

You could have a static release-job and use multibranch to dynamically create every other job. But still, I feel I have more control with Job DSL, and I feel it makes Jenkins look more organized.

Branching strategy

You must know the meaning of the branches, in any given repo, in order to automate things. A defined branching strategy enables you to:

  • Clone any given repo.
  • Detect what branches exist.
  • Be sure which branch to use for snapshots or releases.

If your strategy is GitFlow, then:

  • The snapshot-job will
    • Build snapshots from develop.
    • Step dependencies in develop.
  • The release-job will
    • Build releases from hotfix if it exists, or else from release.
  • The feature-job will build any feature/X-branches.

Each repository has one release-cycle. Several artifacts, in the same repo, with different release-cycles are not allowed.

Branch to version translation

The integration between the Git service and Jenkins is setup so that when a commit is pushed to a feature-branch:

  • A job is triggered.
  • The branch name is identified.
  • A version is derived from the branch name.
  • Check to see if there is a global bom with that version.
    • If there is no bom, fall back to some default, fail, or automate creation of that bom-version.
  • The artifacts are built, with a bom with the version, and uploaded to a Maven repository.
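The translation itself can be a tiny shared-library helper. Here is a sketch in Groovy, assuming a feature/X naming convention; the helper name and the exact mapping are my own illustration, not a fixed rule:

```groovy
// Hypothetical helper: derive a Maven version from a branch name.
// The exact mapping is whatever convention you define.
def versionFromBranch(String branch) {
  if (branch == 'develop') {
    return 'develop-SNAPSHOT'
  }
  if (branch.startsWith('feature/')) {
    // feature/JIRA-123 -> JIRA-123-SNAPSHOT
    return branch.substring('feature/'.length()) + '-SNAPSHOT'
  }
  throw new IllegalArgumentException("No version mapping for branch: ${branch}")
}
```

The important property is that the translation is deterministic, so any job can derive the version from the branch name alone.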

The bom-repo may function in the same way. Developers branch out of develop, getting all default versions. They set specific versions for some artifacts and commit/push. Or you automate that same procedure whenever a bom-version is missing; then the artifact that originally triggered the creation of that bom can find its version and set it on itself.

Then it will be possible to automatically create deploy-jobs for any "deployable" repository, where a dropdown list can automatically be populated with features. Features are found by listing feature-branches and translating them to versions.

Developers won't have to fiddle with versions locally; they just use whatever versions are in the develop-branches. They can clone a bunch of repos, build them locally, and have everything fit together.

Building

You must know how to build any given repo. With Maven, the agreement might be as simple as:

  • The project is built from the root of the repo.
  • The version, of the repository, is specified in the root of the repo. When using Maven, it is the version of the pom.xml.
  • Different Maven profiles are allowed. Any profiles that produce artifacts should be specified as metadata about the repo in the Jenkinsfile.

The important thing is to define these rules. Do not start treating specific repos differently in the global build scripts. Instead specify global rules that all repos should follow.

Depending repositories

To be able to automatically chain jobs and have them trigger each other, I need to know the depending repos of each repo: the opposite direction of what you have in pom.xml.

One way of doing that is with a job that:

  • Regularly finds all repos, perhaps via the Git service REST API.
  • Parses the pom.xml-files.
    • Finds out what artifacts are contained in what repos.
    • Finds out what artifacts are used in what repos.
  • Creates a structure with the depending repos per repo.
  • Optimizes that structure so that transitive dependencies are removed from the list of direct dependencies.
  • Stores that structure as a JSON text file in a repo, making it available for snapshot/release-jobs to clone and include.

Having this information pre-calculated saves a lot of time when it is needed by some job.

Perhaps the depending repo structure can look something like this:

{
 ...
 "PROJECT-A/example-repo-d": [
  "PROJECT-B/example-repo-b",
  "PROJECT-C/example-repo-d"
 ],
 "PROJECT-C/example-repo-d": [
  "PROJECT-E/example-repo-b"
 ]
 ...
}
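Producing that structure is essentially an inversion of the parsed pom.xml data. A minimal Groovy sketch, where the two input maps are assumed to come from the parsing steps above:

```groovy
// artifactsPerRepo:     repo -> set of artifacts it contains
// usedArtifactsPerRepo: repo -> set of artifacts it depends on
// Returns: repo -> set of repos that depend on it.
def dependingRepos(Map artifactsPerRepo, Map usedArtifactsPerRepo) {
  def result = [:].withDefault { [] as Set }
  usedArtifactsPerRepo.each { repo, usedArtifacts ->
    usedArtifacts.each { artifact ->
      // Find the repo that contains the used artifact ...
      def owner = artifactsPerRepo.find { it.value.contains(artifact) }?.key
      // ... and register the current repo as depending on it.
      if (owner && owner != repo) {
        result[owner] << repo
      }
    }
  }
  return result
}
```

The transitive-reduction step (removing depending repos that are reachable through other depending repos) is left out of the sketch.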

Jenkinsfile

It is very small and contains only metadata about the repo. This is just like how Jenkins Infra handles their 1000+ plugins.

When using Maven, you might want to specify profiles to be built.

A repo that needs to be built, nothing else, may look like this:

buildRepo()

It may specify profiles:

buildRepo(
 profiles: [
  'profile1'
 ]
)

And if profiles are needed as well as a run with no profile, it may look like:

buildRepo(
 profiles: [
  '',
  'profile1'
 ]
)

This is all there is in the repositories. Only one Jenkinsfile, and this is the only information it contains. I'm not saying this is all you need; I'm just recommending to keep it light! Perhaps you invent things like deployable: true or autoDeployEnv: 'TEST-XY'...

Shared library

A shared pipeline library allows you to, only once, define how to do releases, snapshots and all other tasks.

With the above Jenkinsfile there should be a /vars/buildRepo.groovy containing something like:

...
def call(Map params = [:]) {
...
 // JOB_BASE_NAME tells us which kind of job invoked the library
 if (JOB_BASE_NAME == "snapshot") {
 ...
 } else if (JOB_BASE_NAME == "release") {
 ...
 }
...
}
...

Automate creation of jobs

Most Git services (GitHub, GitLab, Bitbucket...) provide REST APIs. You can use them to automate creation/deletion/adjustment of jobs and always be in sync with the repos you have in your Git service. The Job DSL would loop through all repositories.

...
 folder("gen") {
  displayName("Generated jobs")
  description("""
   These are generated by ${JOB_URL}
  """)
 }

 getJson(server + "/rest/request/to/repos...")
  .values
  .each { repo ->
  folder("gen/" + repo.name) {
   displayName(repo.name)
   description("""
    Generated by ${JOB_URL}
   """)
  }

  pipelineJob("gen/" + repo.name + "/snapshot") {
...

Templates

I use the Job DSL plugin. Perhaps you want these jobs for every repository:

  • snapshot
  • release
  • feature
  • pull-request

Also a global job, release-orchestration.

All of these templates are pipelines. Their logic is implemented in the shared library. The shared library will find the Git repo to use from scm.getUserRemoteConfigs().get(0).getUrl() and the kind of job to build from JOB_BASE_NAME.

Snapshot

This job will:

  • Make sure develop is using latest dependencies (found in Maven repository). If there are newer versions:
    • Step dependencies to latest version.
    • Commit changes.
    • Push changes.
    • Re-trigger self, to help Jenkins understand that this new commit does not need to be built again. Done.
  • Build a snapshot version.
  • Upload snapshot-version to Maven repository.
  • Trigger dependingRepos configured in Jenkinsfile.
  • Done.

When using Maven, you can do this with Versions Maven Plugin.
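A sketch of the dependency-stepping part, using goals that exist in the Versions Maven Plugin; the com.example groupId filter is just an illustration of restricting the stepping to your own artifacts:

```groovy
// Step dependencies in develop to the latest available versions,
// including snapshots. Restricting with -Dincludes to your own
// groupId keeps third-party dependencies out of the automation.
sh 'mvn -B versions:use-latest-versions -DallowSnapshots=true -Dincludes=com.example:* versions:commit'
sh 'git commit -am "Stepped dependencies" && git push'
```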

Release

This job will:

  • Start from commit C1.
  • Set dependencies to latest released versions.
  • Step version to next release-version.
  • Make a commit C2.
  • Set dependencies to latest snapshots.
  • Step version to next snapshot-version.
  • Make a commit C3.
  • Try to push changes. If not successful:
    • Hard reset to C1.
    • Pull.
    • Start again with creating C2.
    • Repeat this loop, perhaps 5 times, before giving up and failing.

This allows developers to work in the branches during the release-process.

Now that we know we are in sync with remote Git repo on where to perform the release, we can continue doing so.

  • Tag C2 with the release-version.
  • Perform the build commands, mvn package, and loop over any profiles needed.
  • Deploy in Maven repository, mvn deploy.

When using Maven, you can do this with Versions Maven Plugin.
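The C2/C3 version stepping can be done with the plugin's versions:set goal; the version variables here are illustrative:

```groovy
// C2: set the release version (with latest released dependencies).
sh "mvn -B versions:set -DnewVersion=${releaseVersion} versions:commit"
sh "git commit -am 'Release ${releaseVersion}'"

// C3: step to the next snapshot version for continued development.
sh "mvn -B versions:set -DnewVersion=${nextSnapshotVersion} versions:commit"
sh "git commit -am 'Prepare next development iteration'"
```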

Release orchestration

This job:

  • Orchestrates a release.
  • Is parameterized with each repo.
  • When triggered:
    • Calculates the order in which to release the selected repos, using the information found in their Jenkinsfiles.
    • Invokes the release-job of each selected repo.
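The order calculation is a topological sort over the depending-repos structure. A sketch, assuming the JSON structure shown earlier is available as a map and that the dependency graph has no cycles:

```groovy
// dependingRepos: repo -> repos that depend on it.
// Returns the selected repos ordered so that a repo is always
// released before any selected repo that depends on it.
def releaseOrder(Map dependingRepos, List selected) {
  def ordered = []
  def visit
  visit = { repo ->
    if (!(repo in ordered)) {
      // Repos that list this repo among their dependers must go first.
      dependingRepos
        .findAll { it.value.contains(repo) }
        .keySet()
        .each { visit(it) }
      ordered << repo
    }
  }
  selected.each { visit(it) }
  return ordered.findAll { it in selected }
}
```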

Features

Here is what features this setup can provide and how I intend them to be used.

Release

A release of a single repo can be performed from its release-job.

This will look at each repo and release from the branch that is first found in this order:

  1. hotfix
  2. release
  3. develop
  4. master

So if you want to release from a specific commit, not the latest develop, just push a release- or hotfix-branch that points to that commit.
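Picking the release branch is then trivial, given the list of branches that exist in the repo:

```groovy
// Pick the first existing branch, in priority order.
def releaseBranch(List existingBranches) {
  ['hotfix', 'release', 'develop', 'master']
    .find { it in existingBranches }
}
```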

Orchestrating a release

A release of one, or more, repos can be performed from the global release-orchestration-job. This will:

  • Ensure the release of each repo:
    • Is done in the right order.
    • Releases their dependencies first, if selected.
    • Uses the latest release of their dependencies.
  • After the release, trigger the snapshot-job of the first repo that was released, so that all the snapshot-jobs will run and step snapshot-versions.

It will invoke the release-jobs of each repo. This means you can have a look there for more details on that specific release.

Hotfix

Having the priority among branches mentioned above will enable you to push a hotfix-branch from any commit and have the release performed from that commit. If your master points to the latest installed version:

  • git checkout master
  • git checkout -b hotfix
  • git push -u origin hotfix

Then just trigger the release.

Advantages

Your entire Jenkins configuration is put under version control. Well... you need to create one Job DSL job manually that polls, or is triggered by changes in, the Git service. But that job can have its DSL in a Git repo. This has a bunch of advantages.

  • No more browsing around in Jenkins and fiddling with settings.
  • You can track changes in the jobs. Just use git blame, it is all code now!
  • All your jobs are backed up with Git.
  • You can easily setup a development instance of Jenkins that behaves very much like your production instance.
  • You can generate release-jobs in one Jenkins and snapshot jobs in another. Letting only a few people use the release-jenkins and anyone use the other instance.