Who does CI?
Do we want to do CI?
One of the tenets of CI is that main is always releasable.
“Your main branch must always be in a releasable state” can be heard in (at least) 2 ways.
Way 1: only merge in a complete feature when it’s fully baked & ready to go. Way 2: every (small) commit (to main) is non-breaking.
I think a lot of dysfunction comes from thinking that only Way 1 applies.
It’s okay to have code in main that isn’t exercised in production. You probably have loads of old code that isn’t used.
It’s okay to have a partially baked solution in main, in production, that users can’t get to.
This is what abstractions are for. Feature toggles too.
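As a minimal sketch of the feature-toggle idea, assuming a hypothetical `FEATURE_NEW_CHECKOUT` environment variable and made-up `legacy_checkout` / `new_checkout` functions (none of these names come from a real codebase), half-finished code can sit in main and in production while staying unreachable:

```python
import os


def is_enabled(flag_name, env=None):
    """Return True when the hypothetical FEATURE_<NAME> variable is set to "1"."""
    env = os.environ if env is None else env
    return env.get(f"FEATURE_{flag_name.upper()}", "0") == "1"


def legacy_checkout(cart):
    # The code path users hit today.
    return {"total": sum(cart), "flow": "legacy"}


def new_checkout(cart):
    # Partially baked: merged to main and deployed, but unreachable
    # while the toggle is off.
    return {"total": sum(cart), "flow": "new"}


def checkout(cart, env=None):
    # Callers only know about checkout(); the toggle picks the path.
    if is_enabled("new_checkout", env):
        return new_checkout(cart)
    return legacy_checkout(cart)
```

With the toggle off (the default here), every small commit to `new_checkout` can land in main without changing what users see. Turning it on is a configuration change, not a merge.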
When Way 1 is the only workflow that’s “allowed”, the thought is that everyone writes code on their own safe, isolated branch. “Safety built in”, so to speak.
Then direct commits to main are turned off. “More safety”.
Maybe it’s just the places I’ve ended up, but it’s the same story at every one of them and I’m tired of having the same discussions. The last 4 places I’ve worked at worked in a similar way to what follows, which I posit all stems from Way 1 thinking.
Way 1 Thinking
#fleshout
- Sue, Joey, and Greg are all working on different features.
- No one wants to “ship shit” so everyone creates branches so as to not break the main branch.
- Someone merges a branch into main and deploys.
- One time, it breaks.
But we’re smart engineers. We’ll add one small step to the process so that this never happens again.
The DevOps team creates a Staging server that Sue can deploy her branch to so that Product can look at it.
Product testing code before it gets merged into main becomes part of the process, because we don’t want it released to users before this. (Note: we assume that “before release” implies “before deployment” and “before merge”.)
Product now wants to test Greg’s branch. He deploys his changes to Staging so Product can see them before they’re merged. Because it’s the process.
Greg had no way of knowing that other Product people were testing Sue’s branch, so he unintentionally overwrote Sue’s work, making everyone angry.
But we’re smart engineers. We’ll add one more step. Before you deploy your code to Staging, you must use the shared calendar to claim blocks of time when your code will be on there. This becomes part of the process.
Later, Greg goes to put another branch on Staging. He checks the calendar. Joey had a block earlier in the day, but it ended an hour ago. He creates a block on the calendar for the next hour and deploys his code.
But unbeknownst to Greg, people were still looking at Staging.
But we’re smart engineers. We’ll add one more step. Now you must add a Slack message to the Staging-Deployment channel when you deploy to Staging and add the :loading: emoji to your message to let others know you’re actively using it. When you’re done, remove it.
Later, Product needs some of Greg’s work out for Very Important Business Reasons. Greg checks the Calendar. He checks Slack. Joey had a 3-hour block scheduled at the beginning of the day, and the associated Slack message still has the loading icon. But the Calendar block ended at 10am and it’s 3pm now. Greg Slacks him to see if he’s still using it. This being an async-first culture, Joey doesn’t answer. Ever the man of action, and seeing that the block ended hours ago, Greg deploys to Staging. Sure enough, Joey wasn’t finished.
But we’re smart engineers. We’ll add one more step.
… more here So this becomes part of the process.
Which of course Greg did, but Joey didn’t answer. Product needed this out for Very Important Business Reasons so Greg, being a man of action, did what he needed to do to get results. So now the Product people testing Sue’s branch are mad at Greg, as are Joey and Sue.
Retro comes and goes and the proposed solution is to give every developer their own “QA” environment that the DevOps team will manage via entries in the configuration files. Our DevOps team also knows that database migrations can be tricky to manage, so they decide that all the QA environments will share a database so they can ensure that db migrations, and the code in the related branches, will work together.
Developers can deploy to their QA environment anytime they need to or whenever Product asks for a specific branch to be put up. Of course, since the team works in an async way, each member usually has 2-5 tickets going at once, each with its own branch (or [series of branches]). It’s very likely that Product would like to see multiple branches from one person at the same time. So now either the developers share their QA environments with each other or they create intermediate temporary branches that are a combination of their other branches.
[series of branches] Of course, the async nature of the status quo workflow is to create branches off of branches because we want to continue to work off of the work we’ve already done, even if someone hasn’t reviewed the foundational PRs yet. We hope that no major changes will be proposed during the PR process that fundamentally change our approach. If they are, we’ll have to redo the entire ticket. Since we’re not getting feedback in a timely way, there’s no other option than to forge ahead. If someone has critical feedback during the PR review, they have to decide whether to raise the objection, knowing that it will delay the work getting out and force the author to do the work over, or to slap an “LGTM” on it and just hope that it works out okay. [/series of branches]
As expected, if one branch makes any type of destructive migration and it’s pushed to a QA environment, it modifies the shared database. This breaks the other QA environments that don’t account for these changes, which is of course all of them.
But we’re smart engineers. There’s a more technical, if slightly more complicated, way to fix this.
Configuration and migration conflicts: now you need individual dbs for each.
Since every feature is on its own branch, you need to manage which branches are allowed to be merged. This often leads to complex branching strategies where in addition to main, you have production, staging, possibly other release branches. It ends up looking like this :insert complicated branch image:
(add in all the PR work needed too) (add in needing QA
Every step had good intentions. There was a rational decision at each step that’s hard to argue against.
Does this sound like “continuous integration”? What is continuous about this? Why is the integration at the last possible moment? It’s supposed to be continuous!
- Product teams should not be dictating when or how to merge code into main.
- None of the branches in this story will ever be deployed as is.
- Any time you couple changes together, you increase risk.
- Branch off of branch, big releases with multiple big-bang features merged in.
I think developers are the only group of people who want to practice solo and perform solo, yet consider themselves a team, while at the same time never wanting to practice, perform, or improve together because it’ll hurt their individual PR | ticket | lines-written count.