Monorepo architecture, CI/CD and Build pipeline

Monorepo architecture can have advantages over polyrepo (or multi-repo) architecture in certain cases. However, implementing a successful monorepo is not hassle free, especially when it comes to automation, CI/CD, and the build pipeline. Examples of such problems are long-running tests and unnecessary releases of unchanged packages.

In this post I will share some solutions, based on my experience, to make this automation more efficient.

Monorepo for Microservices

Obviously, when we talk about monorepo architecture we are NOT talking about monolithic architecture.

Monorepo architecture vs Monolithic architecture

Monolithic architecture (or, in short, a monolith) is when you have a single solution that may contain one or several projects. These projects are build dependencies of a single main project: when you build your solution, you build those dependencies first and use them to build your main project. This can be any kind of solution, even a full-stack project (e.g. an MVC application). A monolithic architecture can, in certain cases, become too large to handle, from several perspectives.

Monolithic architecture

Monorepo architecture is when multiple solutions sit in the same repository. Think of them as small monoliths that together serve some purpose. These can be the services of a microservices architecture, or they can have some other relation to each other, e.g. infrastructure as code and the application it provisions.

Microservices architecture on a Monorepo

Although some people are religious about separating repositories, there are cases where putting all (or some) of the services and code in the same repo can be useful.

Simplified microservice

The original, by-the-book assumption of microservices architecture was that multiple teams in a huge organization, with different domain knowledge and technology expertise, work on different microservices. These people don't necessarily care about each other's services, as long as each service respects the agreed-upon contract (e.g. an API schema). Well, this assumption is not always true in the real world. Teams adopt microservices for a variety of reasons, such as scaling, using serverless for some services, or simply because a monolith sounds like yesterday's news!

When to use a monorepo

A monorepo is not a 'no-no'; many teams choose this approach, one example being GitHub's own repository (which is private). A monorepo is really useful when one team works with multiple services, especially when the services have very similar code bases. Sometimes there are shared in-house libraries (like a client for your message queue) and you don't want to take the extra step of creating a private package (like a NuGet or npm package). Some teams also prefer to keep IaC and/or database scripts close to the code.

Moreover, a monorepo is useful when you want to run processes in a more structured and unified way: the CI/CD process, the process of creating packages, testing, and releasing to production servers, as well as unified rules, for example for formatting and code coverage.

Another advantage of a monorepo over a polyrepo is that it encourages teams to cooperate and engage with each other's source code. This promotes innersourcing, from engaging in each other's pull request reviews and feature requests to contributing code to each other's code bases.

There is a case study from Google where you can read more: Advantages and Disadvantages of a Monolithic Repository.

What to be careful about in a monorepo?

As I mentioned, the first thing is dependencies. Dependency is not a bad thing; tight coupling, though, is a wicked thing! Think of it like this: if you change something outside the common area, it should not affect any other project. In other words, make sure services do not depend on each other.

You should also decide what happens when a common library is updated: who is supposed to update the dependent projects, and what is the process? Do all dependent projects get updated automatically? How do you handle it if some services want to stay on older versions of your library?

The next thing to consider is security! More people having access to everything means more opportunities for an attacker to change something sensitive and get it deployed. These days, with everything automated, we should be extra careful.

While we are talking about security, it is worth mentioning that we should see to it that no secret is ever committed to the repository (things like connection strings and API keys). Even if you delete them later, attackers can scan the git history and extract them from previous commits.
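As a quick illustration of why deleting a secret is not enough, here is a throwaway sketch (the repository, file name, and fake key are all made up for the demo) showing that a value committed once remains recoverable from the history even after the file is removed:

```shell
#!/bin/sh
# Demo (all values hypothetical): a secret committed once stays in git history
# even after the file is deleted in a later commit.
set -e
REPO=$(mktemp -d)
cd "$REPO"
git init -q

# Commit a file containing a fake API key, then "fix" the mistake by deleting it.
echo 'AWS_KEY=AKIAEXAMPLEKEY123456' > config.env
git add config.env
git -c user.email=demo@example.com -c user.name=demo commit -q -m "oops: committed a key"
git rm -q config.env
git -c user.email=demo@example.com -c user.name=demo commit -q -m "remove key"

# The file is gone from the working tree, but the key is still in the history:
HITS=$(git log -p --all | grep -c 'AKIA' || true)
echo "occurrences of the key in history: $HITS"
```

Rewriting history (e.g. with git filter-repo) and rotating the leaked credential are the only real fixes at that point.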

Long-running tests are another problem. If every time you run a test you need to test all the other solutions/services, as well as run all the integration tests, you will probably wait longer and longer as the repo grows. This should also be thought through when you set up your monorepo.

Another thing that can go wrong is when the repository itself develops a problem, like when git status shows files as changed by mistake, or some workflows stop doing their job properly. This interrupts all the teams working on all the projects and can be really critical. You need a team that is responsible for fixing repo problems with very high priority.

CI/CD process in Monorepo architecture

If you decide to go for a monorepo, you need to optimize your CI/CD process. Basically, we want to test and build only the projects that have changed, not everything in the repo.

Decide about structure

The first step is to have a sensible folder structure. You need to decide in advance where things like IaC and documentation will end up (next to their projects, or completely separate in root folders) and come up with an agreed-upon structure. Here is an example:

```
Root
├── .github
├── tools
├── common
├── starter-templates
├── back-end
│   ├── dot-net
│   │   ├── project-abc
│   │   └── project-xyz
│   └── node-js
│       └── ...
├── front-end
│   └── node-js
│       ├── project-abc
│       └── project-xyz
├── iac
└── docs
```

Use Starter Templates

If you choose a monorepo, it makes a lot of sense for all projects of the same type to have the same structure. For example, make one template for all DDD projects in .NET, another one for CRUD projects in .NET, and so on.

Make sure all the boilerplate building blocks (like logging, diagnostics, message queues, dependency injection, and so on) are similar (as much as it makes sense). This way you get three benefits:

  • when starting a new project, you skip putting many hours into boilerplate
  • programmers can jump to other projects and start contributing without needing to learn another architecture and environment
  • it is much easier and safer to share CI/CD processes.

Trigger pipelines for what has changed

Although everything is in the same repo, most of the time you should treat each solution as a single entity. This means only running testing, building, versioning, and releasing for the solutions that have changed.

To achieve this you have two alternatives: first, add path triggers manually to your pipelines; second, build some kind of automatic process that runs pipelines based on changed folders.

Separate workflow for each path

In GitHub workflows, for example, there is a paths option. If you use both the branches filter and the paths filter, the workflow will only run when both filters are satisfied. So if your folder changes, the workflow starts:

```yaml
on:
  pull_request:
    branches:
      - main
    paths:
      - 'dot-net/project-abc/**'
```

You can then make workflows specific to each project folder. This functionality does not exist in Azure DevOps, but GitLab has rules:changes.

For the path you can also use wildcards to share the workflow between many projects. This is easier when you have a good predefined structure. For example, with the folder structure a few lines above, a path like '**/dot-net/**' triggers the workflow whenever a .NET project changes.
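Putting that together, a shared trigger for all .NET projects under the example structure might look like this (a sketch; adjust the glob to your actual layout):

```yaml
on:
  pull_request:
    branches:
      - main
    paths:
      - '**/dot-net/**'
```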

Automatically detect changes in the repo and run CI/CD pipelines specific to what has changed

Another option is to detect the changes and automatically run pipelines based on what has changed. To do that you can use git itself. For such a shared CI/CD setup to work correctly, you need to:

  • have a strict folder structure (as mentioned before)
  • start projects from templates so they are, to some degree, similar to each other.

Test whether the contents of a directory have changed

You can detect incoming changes at pull request time; the pull request ref points to the state of the merge of the source branch into the target branch. Thus you can compare the current state of HEAD to its first parent using the git diff command:

```shell
git diff --quiet HEAD^ HEAD -- "./a-folder-directory" || echo true
```

HEAD^ refers to the first parent, which is always the left-hand side of the merge; when triggered by a pull request, that means the branch we are trying to merge into (e.g. main). The -- just tells git to expect a path after it. So the line above compares the current head against the first parent for the given directory. If the folder has changed, we echo the word "true".
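To see this check in action outside of CI, here is a self-contained sketch (the repository layout and file names are made up) that builds a tiny repo with two commits and tests which directory changed between HEAD^ and HEAD:

```shell
#!/bin/sh
# Demo (hypothetical layout): only dot-net/project-abc changes in the last
# commit, so the directory check echoes "true" for abc and nothing for xyz.
set -e
REPO=$(mktemp -d)
cd "$REPO"
git init -q

mkdir -p dot-net/project-abc dot-net/project-xyz
echo v1 > dot-net/project-abc/Program.cs
echo v1 > dot-net/project-xyz/Program.cs
git add .
git -c user.email=demo@example.com -c user.name=demo commit -q -m "first commit"

echo v2 > dot-net/project-abc/Program.cs
git add .
git -c user.email=demo@example.com -c user.name=demo commit -q -m "touch project-abc only"

ABC_CHANGED=$(git diff --quiet HEAD^ HEAD -- "./dot-net/project-abc" || echo true)
XYZ_CHANGED=$(git diff --quiet HEAD^ HEAD -- "./dot-net/project-xyz" || echo true)
echo "project-abc changed: ${ABC_CHANGED:-false}"
echo "project-xyz changed: ${XYZ_CHANGED:-false}"
```

Note that git diff --quiet exits non-zero when there is a difference, which is why the "changed" signal comes from the || echo true fallback.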

Loop through the whole monorepo

When you do this type of automation, you need to get the folder paths, and you have to get a little creative. One way is to scan for files like *.sln, *.csproj, pom.xml, build.gradle, package.json, Dockerfile, and so on. Here is a loop looking for .NET solution files:

```shell
find . -name "*.sln" -print0 | while IFS= read -r -d '' f; do
  DIR=$(dirname "${f}")
  DIR_CHANGED=$(git diff --quiet HEAD^ HEAD -- "./$DIR" || echo true)
  if [ "$DIR_CHANGED" != true ]; then
    continue
  fi
  # Do your stuff here
done
```

Let us dissect the code above, as bash scripts can be relatively hard to read.

```shell
find . -name "*.sln"
```

This part is fairly straightforward: it gets a list of all files with the .sln extension in the current directory (.) and all subdirectories.

The -print0 separates the file names with a null character, which makes the output ready for the next step: the loop. The rest of line 1 loops over the file names, putting each one in the variable f, e.g. ./dot-net/project-abc/my-project.sln

Line 2: now we need the path to the folder that contains the solution file. dirname extracts that from the file name, e.g. ./dot-net/project-abc

Line 3: as described above, this tests whether anything in the folder has changed. Note that || echo true is how "true" gets assigned to DIR_CHANGED; you cannot skip the echo.

Lines 4-6: if the folder has not changed, we skip to the next iteration of the loop (note that, if you want, you can also do more between these lines).

Line 7: now that you know the contents of the current directory have changed ($DIR points to it), you can run your whole CI/CD process here (for example: test, build, publish, create a docker image, and so on). Feel free to use shell functions to keep the code clean and readable.

Note that in the case of GitHub you need to do a full checkout to fetch the whole history for all branches (otherwise you get a fatal: bad revision 'HEAD^' error):

```yaml
- uses: actions/checkout@v4   # pin to the version you use
  with:
    fetch-depth: 0
```

Release strategies

The thing with a monorepo is that lots of people (or teams) are probably working in the same repo at the same time, and they might want to release several times a day. This should be addressed when you have a hyperactive repo. One thing you can do is use Trunk Based Development: developers merge all code changes, including features and bug fixes, into a single branch, test that trunk branch, and deploy it.

Here is an example: Team1 starts by creating a branch called feature/abc-256/some-new-thing and, at release time, merges it into a branch trunk/date-time that contains other features and bug fixes from other teams, instead of merging directly to main. Then trunk/date-time is tested and, if everything is good, it is released and merged to main.
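The flow above can be sketched with plain git commands in a throwaway repository (branch names and the date are illustrative; in practice your pipeline runs the tests and the deployment between the two merges):

```shell
#!/bin/sh
# Sketch of the trunk-based flow described above, in a temporary repo.
set -e
REPO=$(mktemp -d)
cd "$REPO"
git init -q
G() { git -c user.email=demo@example.com -c user.name=demo "$@"; }

echo base > app.txt
git add .
G commit -q -m "initial"
git branch -M main                # make sure the default branch is called 'main'

# Team1 works on a feature branch
git checkout -q -b feature/abc-256/some-new-thing
echo "some new thing" >> app.txt
git add .
G commit -q -m "abc-256: some new thing"

# At release time the feature goes into a dated trunk branch, not into main
git checkout -q main
git checkout -q -b trunk/2024-05-01
G merge -q --no-ff feature/abc-256/some-new-thing -m "collect abc-256 into trunk"

# ... the trunk branch is tested and deployed here ...

# Only after a successful release does trunk get merged to main
git checkout -q main
G merge -q --no-ff trunk/2024-05-01 -m "release 2024-05-01"
```

The --no-ff merges keep an explicit merge commit per feature and per release, which makes it easy to see on main exactly which trunk was deployed.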

You should not start with this approach from day one, though. If you have a monorepo where only once in a while some team makes a change, then it is probably more complication than problem solving.

Comments


Hello. Thanks for this amazing article. I have always had a big question about monorepos and their CI/CD workflow.

The direct question is: should I build and test all the projects in the monorepo every time a merge to main/master/trunk occurs? Even if a project hasn't changed, I would build and test it anyway; is that a good approach?

Why did this idea come to me?
What happens if I merge into main a super feature that works and runs perfectly in the PR/MR, but when it runs on main there is a small error in a YAML file or some misconfigured file? The code is already merged, but none of the changes were deployed. So I fix the YAML, but none of the paths defined in the change rules (depending on the CI tool) were touched, so nothing gets deployed... We can't just re-run the last failed pipeline, because it always checks out the old/failed commit.

There is no way to re-run the same projects that failed unless I touch some files (with empty spaces or newlines or whatever). IMO this is ugly.

So an idea came up: what if I always build and test all projects on main, and somehow find a way to deploy only those that changed? Maybe by creating a docker tag based on the content digest, to avoid a rolling update.

    What do you think about this?

    Thank you very much

    Daniel Abrahamberg:

      Hello David,
      Thanks for the comment. Well, it really depends on when you merge to main. Think about this principle: "you should always be able to deploy main". If something wrong ends up in your main, it is already too late. So there is a strategy aligned with how you were thinking: releasing the feature (or trunk) branch before merging it to main. To achieve that you need a rather high degree of automation, though (e.g. not forgetting to merge your branches, and making sure you don't end up with merge conflicts after a branch is released and accepted). There is no one-size-fits-all solution here, but the main idea is to use canary or blue/green deployment to release your branch (obviously running all the tests before the release), and as soon as the new deployment is fully functional, you merge it to main.
      If you do trunk-based development, I would run all the tests on the trunk before release, but not on every feature-to-trunk merge.
