
The System Of Continuous Migration


Introduction

We live in a world where a commercial organization has to be in a state of constant flux. That is  – if it wants to survive and prosper.

This statement is even more accurate for IT companies. (And  – as the popular saying goes – every company is an IT company today)

One could of course argue that I’m suffering from a consultant worldview bias. After all – consultants are mostly brought in to help with organizational and technological changes. In the last couple of years we at Otomato have been involved in dozens of projects that all had ‘migration’ or ‘transformation’ in their title.  So yes, definitely – change is all we see.

But I’ve spent more than 15 years in IT companies small and large prior to becoming a consultant – and it’s always been like this. With ever accelerating speed. We’ve been changing languages, frameworks, architectural patterns and of course tools. Always migrating, rewriting, adapting and rethinking. Because that’s the business we’re in – the business of innovation. Because the value we provide is the promise of brighter future. And that means we can never stand still – as yesterday’s future is tomorrow’s past.

The practical side of this exciting (and somewhat frightening) reality is that we are always on the lookout for new tools and technologies. Moreover – at any given moment we have at least one migration project planned, executed or failing. And it is stressful. Because these migrations and POCs are always full of uncertainty and risk. And because our performance is often measured by migration success. We are expected to have a grand triumph or to fail fast – to minimize the cost of failure. And the larger the migration project – the harder this becomes. The benefits of the new approach aren’t always immediately measurable. The true costs of migration only become visible after we’re neck deep. And we can’t really stop the daily grind to think it all through till the last bit.

So migrations are inevitable but stressful. And how do we make something less stressful? We practice it daily, we learn all the pitfalls and then develop a system to mitigate failures and risks. In other words – we do it continuously! And it certainly feels like we as an industry can benefit from a systemic definition of continuous migration. So let us look at various existing approaches, try to understand what works best and attempt to define a system.

The Two Approaches

In general we can say there are two leading approaches to migration. We can even label them as ‘the old way’ and ‘the new way’. The old way is the grand cutover approach and the new way is the start small approach. Yes, I know – this old vs. new dichotomy is over-simplistic. Each approach has its own history, its own benefits and disadvantages. Moreover, different systems require different approaches. Still there are certain trends in the industry that we can’t ignore. Sometimes these trends influence our decisions. And our goal here is to provide a system to base our decisions upon. A system that cuts through the mist of personal preferences and industry trends and provides a clearer view of the subject at hand.

But before that – let’s review the two approaches and what each one of them entails. To make this more interesting we’ll start with what we previously labeled ‘new’ and then look at the ‘old’.

I must admit – I have my own biases that I’ve developed over the years. I’ll do my best to keep them out of the text when describing the existing approaches. Still – if I were perfectly sure that one of the approaches is superior – I wouldn’t be writing this. What we’re trying to do here is to develop a superset of concepts and criteria. Something that will allow us to enjoy the best of both worlds while escaping most pitfalls on the way.

At this stage an attentive reader might object that I’m not discussing anything new here. This is just the old, beaten dichotomy of product development  – waterfall vs. agile, planned ahead vs. iterative. I do realize there are similarities. But migration projects aren’t the same as application development projects. One could argue that in migration there is no such thing as MVP. Showing that migration is viable isn’t enough to prove that it’s cost-effective. Moreover many existing business-wide systems don’t lend themselves easily to iterative migration. In a way they can be seen as life-critical systems which require meticulous testing and extensive proof of meeting the requirements prior to going live. The kind of proof that is very hard to obtain in a playground environment.

[Diagram: The Grand Cutover]

[Diagram: Start Small]

So let us start:

Start Small (the iterative approach)

This approach stems from the idea that it’s either impossible or too expensive to create a real staging environment for verifying the changes. As a matter of fact it’s not only about creating an environment. It is mainly about generating sufficient load of real-life use cases in order to verify system readiness. The investment in such testing is seen as too high, especially if we think of migration as a one-time process. Migrate and forget. Which – as we already said in the introduction – is not the case in our modern world.

So if preparing everything on the side in one stride doesn’t look feasible, what do we do? We start small. We take a greenfield project, a small service on the side, a specific system module. Or a separate team. The innovators. The test pilots. The Kamikazes. The Shaheeds. We migrate (or start from scratch, if it’s a new project) that part of our system to the new framework. This is an experiment, an evaluation. No obligations, no commitments. Only good intentions and some bravery. In fact I think we need a new word for such migration projects – migrevaluations.

As a side note – from a small survey we’ve done – most engineers and managers today prefer to start small. With many of them not even seeing any other option. That’s why I called this ‘the new way’ – this is how many of us today feel things should be done. And it’s quite understandable. Psychologically it’s much easier and less intimidating to start something small than to try and think through all the implications of a months-long system-wide change. Additionally  –  most of us have had our brains so cleanly washed with Agile soap that we don’t see any alternatives. Scrum, Kanban et al. offer some great project management techniques – but they’re not necessarily the best framework for reasoning about a problem.

But the big question with the ‘start small’ approach is always: how (and when) do we verify that the migration is worthwhile? “Define KPIs!” – the smarter folks will say. E.g., the migration to the new tool should shorten the build time by 30%. Or: the migration to the new orchestration framework will allow us to release twice as often with 25% less bugs. I certainly believe that defining these goals is important and even vital when starting a new migrevaluation. So let’s say – we’ve determined the KPI. And our small kamikaze project has consistently achieved it across a defined state matrix. Now – how do we know if this achievement will scale all across our system? After all – it’s evident that large systems require different approaches. You can’t manage a large company the same way you manage a startup. The performance and stability of a large multi-component system is based on the interactions between the multitude of its components. Testing in isolation doesn’t really prove anything.
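To make the KPI gate concrete – here’s a toy sketch in Groovy (all the numbers are made up for illustration) of how such a check could look once we collect build durations before and after the migration:

// a toy KPI gate - the durations and the 30% threshold are made-up examples
def baseline = [52.0, 49.0, 55.0, 51.0]   // build durations (minutes) before migration
def migrated = [34.0, 36.0, 33.0, 35.0]   // build durations (minutes) after migration

def avg = { list -> list.sum() / list.size() }
def improvement = 1 - avg(migrated) / avg(baseline)

printf('Build time improvement: %.1f%%%n', improvement * 100)
// fail loudly if the KPI is not consistently met
assert improvement >= 0.30 : 'migration KPI of 30% faster builds not met'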

The preachers of iteration will say: “OK then. If the sample is too small – we’ll add another component, team, service. And we’ll continue adding more – until we prove our point. Or find that the solution doesn’t scale well.” Which is a perfectly valid approach. In the world of science and experimentation. But not in the world of business and heartless financial calculations. Because if we prove ourselves wrong – we’ve already spent a lot of time and money.

In many cases what happens in such a situation is that the migration is driven to completion anyway. With some KPI mangling to make it look more like a success than a wasted effort. This happens because we’re all human and we all have loss aversion hardcoded into our system. It’s much harder for us to admit failure after we’ve already envisioned success.

As we’ve seen – the ‘start small’ approach definitely has some very attractive sides, but isn’t without pitfalls. Let’s see what the alternative is.

The Grand Cutover

This approach entails an exhaustive preparation stage. First – all the migration costs are carefully evaluated. The KPIs are defined.  Then – a testing or staging environment is prepared. And only after all the tests have proven that the new platform is fully functional – we perform the grand migration!

We’ve already seen the main issues with this approach. It has high upfront costs, is perceived as hard to pull off and still – gives no promise that the migration will provide the expected benefits. The demon of loss aversion is raising its head in our psyche.

But I would argue that there are situations where investing in preparation is actually much more cost-effective than starting small and planning as we roll.

First there’s the case of life-critical systems – those systems where the cost of disruption is too high.

And second – it’s important to remember that not all migrations we perform are migrevaluations. Some of them aren’t done to improve any business metrics. Instead they are required because:

  • the old system isn’t supported anymore
  • there’s been a company-wide decision we have no influence upon
  • the migration is required by another change in a related system
  • add your own reason here.

When this is the case – there’s no real reason to start small. Instead we want the transition to be as fast and painless as possible. With minimal downtime and no hidden hope for a rollback. And that means – we need to do everything in our power to get properly prepared for the shift. With steps being:

  • Define all the players and stakeholders affected by the change
  • Gather their inputs and expectations from the new framework
  • Based on the gathered inputs – define the functional requirements that the new framework must implement
  • Define the test data set
  • Define and allocate the necessary resources (human, compute, storage and network)
  • Plan and implement company-wide training
  • Define the maximum allowed time for restoring full system functionality
  • Rehearse the migration until the defined KPIs are consistently achieved.
  • Set the date for migration.
  • Cut over!

This is easier said than done, of course. Anyone who’s been through such a project should realize how much detail is hidden behind each of these steps. How much virtual blood, sweat and tears have to be shed in order to bring this to completion.

But on the brighter side – this is a much better planned-out process. With defined start and end criteria, with a decisive direction. As long as we’re on track – we don’t need to re-evaluate as we go. And even if obstacles prevent us from delivering on time – we can always move the dates without compromising the content of the original plan.

Note that with all the grandeur of the task at hand – this planned-out, monolithic (I know, I know – this is a curse word) process involves much less heroics (and consequently – less burnout) than the guerilla mode of iterative innovation.

With all that said – we all realize why this approach is out of favour nowadays. Exactly for the same reason we need to be continuously migrating. The technological world is changing fast, the deadlines are pressing. Companies usually go into across-the-board migrations only when they find themselves in a near-death condition. The infamous Project Inversion at LinkedIn required the infrastructure team to freeze all changes in existing systems for a few months. Only then were they able to focus on rebuilding everything for the move to microservices they had planned. And it’s not easy to convince ourselves that we need to put everything on hold for the promise of a brighter future. It requires either trust or desperation.

Let’s Try To Define a System

So, with all that said – how do we define a global system for continuous migration?

  1. Embrace Continuous Migration

    • The first thing to do here is to accept the fact that migration is a continuous process. No matter if we start small or go all in – this is work that’s never done. We’ll always have more stuff to migrate even before the current migration is over.
  2. Define Migration Strategy

    • Be very clear about why you’re entering a migration project, what type of system you are migrating, what will be the success and failure criteria and if failure is even an option.
    • Some questions to ask at that stage:
      • Is this a ‘life-critical’ system?
      • What can be considered a representative sample?
      • Is this a migrevaluation or a migration?
      • Are there alternative frameworks you’ll want to evaluate before deciding?
  3. Involve the Stakeholders

    We’ve outlined this when describing the end-to-end migration steps. But – we do believe this to be a very important stage also when starting small. A lot of migrevaluations or side-project migrations either fail or become too costly because this stage is skipped. Take for example an infrastructure team tasked with evaluating a migration for a codebase that they have no deep understanding of. We always see much better results when developers and testers are involved from the very beginning. They have intimate knowledge of the code, its quirks and caveats, and of all the reasons for the ugly hacks that are hidden all across the system. So please make sure you:

    • Define all the players and stakeholders affected by the change
    • Gather their inputs and expectations from the new framework
  4. Define the KPIs and Exit Criteria

    • The intensiveness of this stage very much depends on the type of migration we’ve defined this to be in step 2. Still – no matter if we start small or go all-in – we need to have a defined concept of where we want to arrive. Or at least what’s the next milestone we want to reach. And how we decide if this is a go or a no-go.
  5. Define the Verification Strategy

    • How do we measure the KPIs and criteria we’ve defined? Options include:
        • Defining a testing data set
        • Using A/B testing
        • Using dark launching (see the toy sketch at the end of this section)
        • Manual verification in a sandbox environment.
        • Any combination of the above.
  6. Allocate resources

    • Who is tasked with the migration? Do we assign a special team? (Generally an anti-pattern, in our experience.) Or do we reserve some capacity of the existing teams for continuous migration activity? (The recommended approach.) What non-human resources are needed for the migration effort? How scalable do we want these resources to be?
  7. Define the Knowledge Accumulation and Distribution Patterns

    This definitely depends on the migration strategy we’ve chosen. For all-in, grand cutover migrations – we want our teams to be ready when the big day arrives. Therefore this is the time to organize training, assign change agents and start preparing a corporate knowledge base for the new framework.

    If we’re starting small, evaluating and learning as we go – this is where we define best practices for progress documentation and create a migration project Wiki. Needless to say – in evaluation projects the accumulation of knowledge should be our foremost goal.

  8. Start the process.

    We’re done with all the thinking – time to start doing. It’s important to note that our migration strategy shouldn’t directly impact our project management methods. We can perfectly well manage grand cutover projects using Kanban for splitting the work into manageable tasks, limiting WIP and verifying our progress all along the road.

  9. Plan for the next migration

We’ve already embraced the fact that this is a continuous process, haven’t we?
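And as promised in step 5 – here’s a toy sketch of dark-launch verification in Groovy. Both service calls are fake stand-ins; the point is only to show how a fraction of real traffic can be mirrored to the migrated system and compared offline:

// a toy dark-launch sketch - 'legacy' and 'candidate' are fake stand-ins for real service calls
def legacy    = { req -> "result-for-${req}" }   // the system we're migrating away from
def candidate = { req -> "result-for-${req}" }   // the migrated system under evaluation

def sampleRate = 0.1                             // mirror 10% of the traffic
def mismatches = 0
def rnd = new Random(42)

(1..1000).each { req ->
    def userResponse = legacy(req)               // users are always served by the legacy system
    if (rnd.nextDouble() < sampleRate) {
        // shadow call: the candidate's answer is compared, never returned to the user
        if (candidate(req) != userResponse) { mismatches++ }
    }
}
println "Dark-launch mismatches: ${mismatches} out of ~${(1000 * sampleRate) as int} mirrored requests"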

 

Conclusion:

Migrations are an everyday part of our tech life. The stacks will continue to change and we’ll never want to be left behind. Migrations are inevitable but not easy. Different strategies and approaches can be applied. In this post we’ve presented an attempt at creating a sequence of steps to base our continuous migration effort upon. This sequence is a result of our combined four decades of industry experience. Things we’ve seen working better and worse. Following these steps won’t guarantee a successful migration (as there are a lot of other factors involved) but can definitely make your effort less stressful and more effective.

 

Would you like some help with DevOps transformation or software delivery optimization at your company? Drop us a note – we’ll be happy to help!

 



Dynamically spinning up Jenkins slaves on Docker clusters

Introduction:

Being able to dynamically spin up slave containers is great. But if we want to support significant build volumes we need more than a few Docker hosts. Defining a separate Docker cloud instance for each new host is definitely not something we want to do – especially as we’d need to redefine the slave templates for each new host. A much nicer solution is combining our Docker hosts into a cluster managed by a so-called container orchestrator (or scheduler) and then define that whole cluster as one cloud instance in Jenkins.
This way we can easily expand the cluster by adding new nodes into it without needing to update anything in Jenkins configuration.

There are four leading container orchestration platforms today:

  • Kubernetes (open-sourced and maintained by Google)
  • Docker Swarm (from Docker Inc. – the company behind Docker)
  • Marathon (a part of the Mesos project)
  • Nomad (from HashiCorp)

A container orchestrator (or scheduler) is a software tool for the deployment and management of OS containers across a cluster of computers (physical or virtual). Besides running and auditing the containers, orchestrators provide such features as software-defined network routing, service discovery and load-balancing, secret management and more.

There are dedicated Jenkins plugins for Kubernetes and Nomad using the Cloud extension point. Which means they both provide the same ability to spin up slaves on demand. But instead of doing it on a single Docker host they talk to the Kubernetes or Nomad master API respectively in order to provision slave containers somewhere in the cluster.

Nomad

The Nomad plugin was originally developed by Ivo Verberk and further enhanced by yours truly while doing an exploratory project for Taboola. A detailed post describing our experience will be up on the Taboola engineering blog sometime next month.
Describing Nomad usage is out of the scope of this book, but in general – exactly like the YAD plugin – it allows one to define a Nomad cloud and a number of slave templates. You can also define the resource requirements for each template so Nomad will only send your slaves to nodes that can provide the necessary amount of resources.
Currently there are no dedicated Pipeline support features in the Nomad plugin.
[Screenshot: Nomad slave template configuration]
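Since there’s no dedicated Pipeline support, slaves provisioned by the plugin are consumed the usual way – by label. A minimal sketch (‘nomad-java’ is a made-up example of a label you’d assign to a slave template):

// minimal usage sketch - 'nomad-java' is a hypothetical label assigned to a Nomad slave template
node('nomad-java') {
    stage('Checkout') {
        git 'https://github.com/jenkinsci/nomad-plugin.git'
    }
    stage('Build') {
        sh 'mvn -B clean verify'
    }
}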

Kubernetes

The Kubernetes Plugin was developed and is still being maintained by Carlos Sanchez. The special thing about Kubernetes is that its basic deployment unit is a Kubernetes pod which can consist of one or more containers. So here you get to define pod templates. Each pod template can hold multiple container templates. This is definitely great when we want predefined testing resources to be provisioned in a Kubernetes cluster as a part of the build.

The Kubernetes plugin has strong support for Jenkins pipelines with things like this available:

// define a pod with two build containers - one for Maven, one for Go
podTemplate(label: 'mypod', containers: [
    containerTemplate(name: 'maven', image: 'maven:3.3.9-jdk-8-alpine', ttyEnabled: true, command: 'cat'),
    containerTemplate(name: 'golang', image: 'golang:1.8.0', ttyEnabled: true, command: 'cat')
]) {
    node('mypod') {
        stage('Checkout JAVA') {
            git 'https://github.com/jenkinsci/kubernetes-plugin.git'
            container('maven') {
                stage('Build Maven') {
                    sh 'mvn -B clean install'
                }
            }
        }
        stage('Checkout Go') {
            git url: 'https://github.com/hashicorp/terraform.git'
            container('golang') {
                stage('Build Go') {
                    sh """
                    mkdir -p /go/src/github.com/hashicorp
                    ln -s `pwd` /go/src/github.com/hashicorp/terraform
                    cd /go/src/github.com/hashicorp/terraform && make core-dev
                    """
                }
            }
        }
    }
}

There’s detailed documentation in the plugin’s README.md on GitHub.

Marathon

There is a Jenkins Marathon plugin, but instead of spinning up build slaves it simply provides support for deploying applications to a Marathon-managed cluster.
It requires a Marathon .json file to be present in the project workspace.
There’s also support for Pipeline code. Here’s an example of its usage:

 marathon(
   url: 'http://otomato-marathon',
   id: 'otoid',
   docker: 'otomato/oto-trigger')

 

Docker Swarm

I used to think there was no dedicated plugin for Swarm but then I found this. As declared in the README.md, this plugin doesn’t use the Jenkins cloud API, even though it does hook into it for label-based slave startup. This non-standard approach is probably the reason why the plugin isn’t hosted in the official Jenkins plugin repository.
The last commit on the GitHub repository dates back nine months, so the plugin may also be outdated – as Docker and Swarm are changing all the time and so does the API.

Learn More:

This post is a chapter from the ebook ‘Docker, Jenkins, Docker’ which we recently released at Jenkins User Conference TLV 2017.  Follow this link to download the full ebook: http://otomato.link/go/docker-jenkins-docker-download-the-ebook-2/



Continuous Lifecycle London 2017

Last week I had the honour to speak about ChatOps at Continuous Lifecycle conference in London. The conference is organised by The Register and heise Developer and is dedicated to all things DevOps and Continuous Software Delivery. There were 2 days of talks and one day of workshops. Regretfully I couldn’t attend the last day, but I heard some of the workshops were really great.

The Venue


The venue was great! Situated right in the historical centre of London, a few steps away from Big Ben, the QEII Centre has a breathtaking view and a lot of space. The talks took place in three rooms: one large auditorium and two smaller ones. It is quite hard to predict which talks will attract the biggest audience and it was hit and miss this time around too. Some talks were over-crowded while others felt a bit empty.

Between the talks everybody gathered in the recreation area to collect merchandise from the sponsors’ stands and enjoy coffee and refreshments.

The Audience

The participants were mostly engineers, architects and engineering managers. As happens too often in DevOps gatherings — the business folks were relatively few. Which is a pity — because DevOps and CI/CD are a clear business advantage based on better tech and process optimization. The sad part is the techies understand it, but the business people still too often fail to see the connection.

The Talks

Besides the keynotes (I only attended the first one) there were 3 tracks running in parallel. I had a chance to attend a few selected ones in between networking, mentally preparing for my talk and relaxing afterwards.

The keynote

The opening keynote was delivered by Dave Farley — the author of the canonical ‘Continuous Delivery’ book. Dave is a great speaker. He was talking about the necessity of iterative and incremental approaches to software engineering while bringing some exciting examples from space exploration history. Still to me it felt a bit like he was recycling his own ideas. The book was published 7 years ago. At that time it was a very important work. It laid out all the concepts and practices many of us were applying (or at least trying to promote) in a clear and concise way. I myself have used many of the examples from the book to explain CI/CD to my managers and employees numerous times over the years. But time has passed and I feel we need to find new ways of bringing the message. I do realise many IT organisations are still far from real continuous delivery. Some still don’t feel the danger of not doing it, others are afraid of sacrificing quality for speed. But more or less everybody already knows the theory behind it. Small batches, process automation, trunk-based development, integrated systems etc. It’s the implementation of these ideas that organisations are struggling with. The reasons for that are manifold — politics, low trust, inertia, stress, burnout and lack of motivation. And of course the ever-growing tool sprawl. What people really want to hear today is how to navigate this new reality. Practical advice on where to start, what to measure and how to communicate about it. Not the beaten story of agile software delivery and how it’s better than other methodologies.

The War Stories

Thankfully there was no lack of both success and failure stories and practical tips. There were some great talks on how to do deployments correctly, stories of successful container adoption and also Sarah Wells’ excellent presentation of the methodologies for influencing and coordinating the behaviours of distributed autonomous teams.

Focus on Security

As I already said — quite naturally not all the talks got the same level of interest. Still I think I noticed a certain trend — the talks dedicated to security attracted the largest crowd. Which is in itself very interesting. Security wasn’t traditionally on the priority list of DevOps-oriented organisations. Agility, quality, reliability — yes. Security — maybe later.

The disconnect was so obvious that some folks even called for adding the InfoSec professionals into the loop while inventing such clumsy terms as DevOpSec or DevSecOps.

But now it looks like there’s a change in focus. New deployment and orchestration technologies are bringing new challenges and we suddenly see the DevOps enablers looking for answers to some hard questions that InfoSec is asking. No wonder all the talks on security I attended got a lot of attention. Lianping Chen’s presentation was focused on securing our CI/CD pipeline, while Dr. Phil Winder provided a great overview of container security best practices with a live demo and quite a few laughs. And there was also Jordan Taylor’s courageous live demo of using HashiCorp Vault for secret storage.

As a side note — if you’re serious about your web application and API security — you should definitely look at beame.io — they have some great tech for easy provisioning of SSL certificates in large volumes.

And for InfoSec professionals looking to get a grip on container technologies here’s a seminar we’ve recently developed : http://otomato.link/otomato/training/docker-for-information-security-professionals/

ChatOps

My talk was dedicated to the subject that I’ve been passionate about for the last couple of years — ChatOps. The slides are already online, but they are just illustrating the ideas I was describing so it’s better to wait until the video gets edited and uploaded (yes, I’m impatient too). In fact — while preparing for the talk I’ve laid out most of my thoughts in writing and I’m now thinking of converting that into a blog post. Hope to find some time for editing in the upcoming days. And if you’d like some help or advice enabling ChatOps at your company – drop us a line at contact@otomato.link

There was another talk somehow related to the topic at the conference. Job van der Voort — GitLab’s product marketing manager — described what he calls ‘Conversational Development’ — “a natural evolution of software development that carries a conversation across functional groups throughout the development process.” GitLab is a 100% remote working company and according to Job, this mode of operation allows them to be effective and ensure good communication across all teams.

GitLab Dinner

At the end of the first day all the speakers got an invitation to a dinner organised by GitLab. There were no sales pitches — only good food and a great opportunity to talk to colleagues from all across Europe. Many thanks go to Richard and Job from GitLab for hosting the event. BTW — I just discovered that Job is coming to Israel and will be speaking at a meetup organised by our friends and partners — the great ALMToolBox. If you’re in Israel — it’s a great chance to learn more about GitLab and enjoy some pizza and beer on the 34th floor of Electra Tower. I’ll be there.



DevOps is a Myth

(Practitioner’s Reflections on The DevOps Handbook)

The Holy Wars of DevOps

Yet another argument explodes online around the ‘true nature of DevOps’, around ‘what DevOps really means’ or around ‘what DevOps is not’. At each conference I attend we talk about DevOps culture, DevOps mindset and DevOps ways. All confirming one single truth – DevOps is a myth.

Now don’t get me wrong – in no way is this a negation of its validity or importance. As Y. N. Harari shows so eloquently in his book ‘Sapiens’ – myths were the forming power in the development of humankind. It is in fact our ability to collectively believe in these non-objective, imagined realities that allows us to collaborate at large scale, to coordinate our actions, to build pyramids, temples, cities and roads.

There’s a Handbook!

I am writing this while finishing the exceptionally well written “DevOps Handbook”. If you really want to know what stands behind the all-too-often misinterpreted buzzword – you better read this cover-to-cover. It presents an almost-no-bullshit deep dive into why, how and what in DevOps. And it comes from the folks who invented the term and have been busy developing its main concepts over the last 7 years.


Now notice – I’m only saying you should read the “DevOps Handbook” if you want to understand what DevOps is about. After finishing it I’m pretty sure you won’t have any interest in participating in petty arguments along the lines of ‘is DevOps about automation or not?’. But I’m not saying you should read the handbook if you want to know how to improve and speed up your software manufacturing and delivery processes. And neither if you want to optimize your IT organization for innovation and continuous improvement.

Because the main realization that you, as a smart reader, will arrive at – is just that there is no such thing as DevOps. DevOps is a myth.

So What’s The Story?

It all basically comes down to this: some IT companies achieve better results than others. Better revenues, higher customer and employee satisfaction, faster value delivery, higher quality. There’s no one-size-fits-all formula, there is no magic bullet – but we can learn from these high performers and try to apply certain tools and practices in order to improve the way we work and achieve similar or better results. These tools and processes come from a myriad of management theories and practices. Moreover – they are constantly evolving, so we need to always be learning. But at least we have the promise of better life. That is if we get it all right: the people, the architecture, the processes, the mindset, the org structure, etc.

So it’s not about certain tools, because the tools will change. And it’s not about certain practices – because we’re creative and frameworks come and go. I don’t see too many folks using Kanban boards 10 years from now. (In the same way only the laggards use Gantt charts today.) And then the speakers at the next fancy conference will tell you it’s mainly about culture. And you know what culture is? It’s just a story, or rather a collection of stories that a group of people share. Stories that tell us something about the world and about ourselves. Stories that have only a very relative connection to the material world. Stories that can easily be proven as myths by another group of folks who believe them to be wrong.

But Isn’t It True?

Anybody who’s studied management theories knows how the approaches have changed since the beginning of the last century. From Taylor’s scientific management down to McGregor’s X&Y theory, they’ve all had their followers. Managers who’ve applied them and swore they were getting great results thanks to them. And yet most of these theories have been proven wrong by their successors.

In the same way we see this happening with DevOps and Agile. Agile has been all the buzz since the Manifesto back in 2001. Teams were moving to Scrum, then Kanban, now SAFe and LeSS. But Agile didn’t deliver on its promise of better life. Or rather – it became so commonplace that it lost its edge. Without the hype, we now realize it has its downsides. And we now hope that maybe this new DevOps thing will make us happy.

You may say that the world is changing fast – that’s why we now need new approaches! And I agree – the technology, the globalization, the flow of information – they all change the stories we live in. But this also means that whatever is working for someone else today won’t probably work for you tomorrow – because the world will change yet again.

Which means that the DevOps Handbook – while a great overview and historical document and a source of inspiration – should not be taken as a guide to action. It’s just another step towards establishing the DevOps myth.

And that takes us back to where we started – myths and stories aren’t bad in themselves. They help us collaborate by providing a common semantic system and shared goals. But they only work while we believe in them and until a new myth comes around – one powerful enough to grab our attention.

Your Own DevOps Story

So if we agree that DevOps is just another myth, what are we left with? What do we at Otomato and other DevOps consultants and vendors have to sell? Well, it’s the same thing we’ve been building even before the DevOps buzz: effective software delivery and IT management. Based on tools and processes, automation and effective communication. Relying on common sense and on being experts in whatever myth is currently believed to be true.

As I keep saying – culture is a story you tell. And we make sure to be experts in both the storytelling and the actual tooling and architecture. If you’re currently looking at creating a DevOps transformation or simply want to optimize your software delivery – give us a call. We’ll help to build your authentic DevOps story, to train your staff and to architect your pipeline based on practice, skills and your organization’s actual needs. Not based on myths that other people tell.



Impressions from DevOpsDays Moscow 2017

Taking the Stage at DevOpsDays Moscow

The Russian Link

I’m on my way back from the first ever DevOpsDays event in Russia where I had the privilege to share the stage with shockingly gifted and knowledgeable speakers. This may sound like just another DevOpsDays to you, but for me it was a big deal. As some of you may know I was born in Russia and lived in St. Petersburg until the age of 15. I am a native Russian speaker, and though I’ve spent almost two thirds of my life in Israel – the link to Russian culture has never been broken. So when I saw the event planned on the official DevOpsDays community site – I applied to give a talk and was happy to get accepted.

 

During my 17 years in the industry I never had a chance to work with Russian software engineers (not counting the Israeli Russians – these I’ve seen plenty). So I had no idea how big and developed the Russian IT industry is. Well, we all recently heard that the mighty Russian hackers helped Trump to become president, so I was quite curious to experience all that brainpower firsthand.

I must say – I was not disappointed.

Applauding the Organisers!

First of all – the conference was very well organised. The venue was great, video and sound worked fine and the conference networking app supplied by http://meyou.ru was probably the best I’ve ever seen.

The event was conceived and brought to life by two local IT consulting companies – Logrocon and Express42. Special thanks go to Boris Zlobin, Evgeny Ognevoy, Alexander Titov and Mikhail Krivilev for conducting and keeping up the good vibes all through the day.

But the best thing about the conference was its participants. Both the speakers and the listeners. The Russian DevOps crowd struck me as very thoughtful, knowledgeable and passionate about their work. All that with a healthy dose of hackerly cynicism.

The Highlights

The keynote was delivered by Jan De Vries who told us about applying Nassim Taleb’s concept of anti-fragility to software architecture and delivery. The talk was good, but no eye-opener for me personally as this same idea was proposed by Asher Sterkin in his presentation at DevOpsDays TLV in 2013.

Interestingly some of the best talks came from big Russian institutions and banks. The dev manager and the ops manager from Raiffeisen Bank gave a super-impressive pair talk describing their long and winding road to DevOps. Another great pair talk came from Alpha-Bank – full of humorous but very practical advice on how to implement ‘DevOps without bullshit’.

Leon Fayer came all the way from Baltimore, US to give a fiery ignite about what DevOps is not and also a powerful introduction to taking DevOps all the way to BizOps.

In general – it was good to see that we’re all playing the same game. People were eagerly discussing problems of cooperation and burnout along with how to run containers and if configuration management tools are any good.

Konstantin Nazarov from Tarantool presented a simple, no frills Docker orchestration solution based on Consul and a self-written Python wrapper.

Our gifted compatriot and my friend Ilya Sher expressed the growing frustration with existing CM tools and suggested using his New Generation Shell for building simple and manageable custom solutions.

For those among us who still feel CM tools can provide certain value and not only pain (I do believe there are such scenarios) – Ansible was definitely the star of the show. It got featured in a number of talks and there was a workshop on writing custom Ansible modules.

 

Marcin Wielgus – the lead Kubernetes developer from Poland whom we had the pleasure to host at DevConTLV X – now brought his message of smart container orchestration to Russia. Kubernetes is certainly a hot topic all around the globe. So much so that it became the subject of one of the open space sessions at the end of the day.

 

Open Spaces

I personally chose to visit another open space that was dedicated to developing the DevOps community in Russia. The session touched me deeply because folks were talking about the need to support each other, to fight snobbism and ban flame wars. Everybody agreed that generation of quality content is the key to community-building. Human societies are centered around shared stories. Konstantin Nazarov noted that technical people are rarely good at writing and speaking, as this is something that needs to be learned. He then offered mentorship to whoever wants to share their knowledge but isn’t sure where and how to start.

 

This was very inspiring. We can certainly use this type of mentorship back in Israel. Being the Jenkins Area Meetup organiser for the last year I’ve come to realise that it’s not easy to find community speakers. Consultants and evangelists (like myself) can also give great, valuable talks, but it’s much harder to learn about the actual experiences from the trenches of corporate wars.

So if you’re reading this and thinking that you’d like to tell your own story – feel free to drop me a line and I’ll be happy to share some speaking/writing tips.

Conclusion

Moscow is beautiful, Russian hackers are great guys and we can all learn from each other. Containers and their orchestrators are still all the buzz, there’s some talk of serverless, and Ansible is responsible for picking up what’s left for configuration management. And the largest challenge is the same as everywhere – how to get humans to work together effectively at scale. It certainly seems Russians can teach us a few lessons here – they know about scale. I hope we have some Russian speakers at our next DevOpsDays TLV.



Thank you Intel Sports!


Mission completed! We’ve done a full month of getting the #Intel Sports developers up to speed with git. It’s always fun to train bright folks – and the engineers at Intel are certainly among the brightest we’ve had the privilege to preach git to.

While providing the training we’ve also developed a few ideas regarding git subtree and the plan is to share these ideas in a follow-up post to this one (which compares submodules to repo).

Have a great weekend!

 



CI/CD for Microservices – Challenges and Best Practices


Introduction:

Microservice architecture is a software system architecture pattern in which an application or a system is composed of a number of smaller interconnected services. This is in contrast to the previously popular monolith architectures in which, even when the application has a logically modular, component-based structure, it is packaged and deployed as a monolith.

The Microservice architectural pattern while having many benefits (which we’ll briefly outline in the following paragraph) also presents new challenges all along our software delivery pipeline. This whitepaper strives to map out these challenges and define the best practices for tackling them to ensure a streamlined and quality-oriented delivery process.

Microservice Architecture Benefits:

  • Smaller Application Footprint (per Service)

The ‘Micro’ notion of the concept has been getting some justified criticism – as it’s not really about the size of the codebase, but more about correct logical separation of concerns. Still, once we do split our existing or potential monolith into a number of services – each separate service will inevitably have a smaller footprint, which brings us the following benefits:

  • Comprehensibility

Smaller applications are generally easier to understand, debug and change than complex, large systems.

  • Shorter Development Time

Smaller applications’ start-up time is shorter, making developers more efficient in their dev-test cycles and allowing for agile development.

  • Improved Continuous Delivery/Deployment Support

Updating a large complex system takes a lot of time while the implication of each change is hard to identify. On the other hand – when working with microservices – each service is (or should be) deployable independently, which makes it much easier to update just a part of the system and rollback only the breaking change if anything goes wrong.

  • Scalability Improvement

When each service is scalable independently it becomes easier to provision the resources adapted for its special needs. Additionally we don’t need to scale everything. We can for example run a few instances of front-end logic but only one instance of business logic.

  • Easier to adopt new frameworks/technologies.

Unlike a monolith we don’t have to write everything in Java or Python (for example). As each service is installed and built independently – it can be written in a different language and based on a different framework. As long as it supports the well-defined API/message protocol to communicate with other services.

  • Allows for a more efficient organizational structure.

It’s been noted by a number of studies that individual performance is reduced when the overall team size grows beyond a certain point (namely 5-7 engineers). (See also the 2-Pizza-Team concept.) This has several causes, such as coordination/communication complexity and the ‘social loafing’ and ‘free rider’ phenomena. Microservice architecture allows a small team to maintain the development of each service – thus allowing for more efficient development iterations and improved communication.

And now that we’ve outlined the benefits, let’s look at the challenges of this architectural pattern as manifested in CI/CD concerns.

The Challenges of Microservice Architecture (from CI/CD perspective)

All challenges of microservices are caused by the complexity which originates from the fact that we are dealing with a distributed system. Distributed systems are inherently more difficult to plan, build and reason about. Here’s a list of specific challenges we will encounter:

  • Dependency Management

One of the biggest enemies of distributed architectures is dependencies. In a microservice-based architecture, services are modeled as isolated units that manage a reduced set of problems. However, fully functional systems rely on the cooperation and integration of their parts, and microservice architectures are no exception. The question then becomes: how do we manage dependencies between multiple fast-moving, independently evolving components?

  • Versioning and Backward Compatibility

Microservices are developed in isolation with each service having its own distinct lifecycle. This requires us to define very specific versioning rules and requirements. Each service has to make absolutely clear which versions of dependent services it relies upon.

  • Data partitioning and sharing

Best practices of microservice development propose having a separate database for each service. However this isn’t always feasible and surely is never easy when you have transactions spanning multiple services. In CI/CD context this may mean we have to deal with multiple inter-related schema updates.

  • Testing

While being able to operate in isolation – a microservice isn’t worth much without its counterparts. On the other hand – bringing up the full system topology for testing just one service cancels out the benefits of modularity and encapsulation that microservices are supposed to bring. The challenge here is to define the minimum viable system testing subset and/or provide good enough mockups/stubs for testing in the absence of real services. Additional challenges lie in service communication patterns which mostly occur over the network and therefore must take into account possible network hiccups and outages.

  • Resource Provisioning and Deployment

In a microservice architecture each service should be independently deployable. On the other hand we need a way to know where and how to find this service. We need a way to tell our services where the shared resources (like databases, data collectors and message queues) reside and how to access them. This brings about the need for service discovery, configuration separation from business functionality and failure resilience in case certain service is temporarily missing/unavailable.

  • Variety/Polyglossia

Microservices allow us to develop each service in a different language and using a different framework. This lets us use the right tool for the job in each individual case but it’s a mixed blessing from the delivery viewpoint. Because our delivery pipeline is most efficient when it defines a clear unified framework with distinct building blocks and a simple API. This may become challenging when having to support a variety of technologies, build tools and frameworks.

Tackling the Microservice Architecture Challenges in the CI/CD Pipeline (and beyond)

Now that we’ve outlined the challenges accompanying the delivery of microservice-based systems, let’s define the best practices for dealing with them when building a modern CI/CD pipeline.

Dependency Management:

Even before looking at build-time dependency management we need to look at the wider concepts of service inter-dependency. With microservices each service is meant to be able to operate on its own. Therefore in an ideal setting no direct build-time dependencies should be needed at all. At the maximum a dependency on a common communication protocol and API version can be in place, with version changes taken care of by backward compatibility and consumer-driven contracts.

In order to achieve this the architectural concepts of loose coupling and problem locality should be applied when splitting up our system into separate services.

  • Loose coupling: microservices should be able to be modified without requiring changes in other microservices.
  • Problem locality: related problems should be grouped together (in other words, if a change requires an update in another part of the system, those parts should be close to each other).

 

      • If two or more services are too tightly coupled – i.e. have too many correlated changes which require careful coordination – consider unifying them into one service.
      • If we’re not in the ideal setting of loose coupling and concern separation, and re-architecting the system is currently impossible (for lack of resources or business reasons), then strict semantic versioning should be applied to all interdependent services. This is to make sure we are building and deploying against correct versions of service counterparts.

Versioning:

As stated in the previous paragraph – semantic versioning is a good way of signifying when a breaking change occurs in the service semantics or data schema. In practice this means that any given service should be able to talk to another service as long as the contract between them is sealed. With the MAJOR field of the semantic version being the guarantee of that seal. For experimental or feature branches – the name of the branch can be added as metadata to the version name as suggested here: http://semver.org/#spec-item-10
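For illustration – a small Groovy sketch of stamping a build with such a version (the version numbers and the branch name are made up):

// a toy version-stamping sketch - numbers and branch name are made-up examples
def (major, minor, patch) = [2, 3, 1]
def branch = 'feature/new-login'    // would normally come from the CI environment
def version = "${major}.${minor}.${patch}"
if (branch != 'master') {
    // per semver spec item 10, build metadata is appended after a plus sign
    version += '+' + branch.replaceAll(/[^0-9A-Za-z-]/, '-')
}
println version    // prints: 2.3.1+feature-new-login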

Data Partitioning:

  • If each service is based on its own database then database schema changes and deployment become the responsibility of that service’s installation scripts. For the CI/CD pipeline it means we need to be able to support multiple database deployments in our test and staging environment provisioning cycles.
  • If services share databases it means we need to identify all the data dependencies and retest all the dependent services whenever a shared data schema is changed.  

Testing

  • For a much deeper look at microservice testing patterns look here: http://martinfowler.com/articles/microservice-testing
  • Deployment of individual services should be a part of the end-to-end test to verify successful upgrade and rollback procedures as part of the full system test.
  • End-to-End Tests should be only executed after unit and integration tests have completed successfully and test coverage thresholds have been met. This is because the setup and execution of the e2e environment tends to be difficult and error-prone and we should introduce sufficient gating to ensure its stability.
  • Due to interservice communication reliance on the network and overall system complexity, integration tests can be expected to fail with higher frequency due to non-related infrastructure or version dependency errors.
  • In such a case it may be a good idea to separate integration tests in the CI pipeline so that external outages don’t block development.
  • Integration tests: As stated earlier – the minimum viable subset of interdependent services should be identified wherever possible to simplify testing environment provisioning and deployment.
  • Automated Deployment becomes absolutely necessary with each service deployable by itself and a deployment orchestration solution (e.g Ansible playbook) describing the various topologies and event sequences.
  • Test doubles (Mocks, Stubs, etc.) should be encouraged as a tool for testing service functionality in isolation (see the sketch after this list).
  • Coverage thresholds are a good strategy for ensuring we’re writing tests for all the new functionality.
  • Unit tests become especially important in microservice environment as they allow for faster feedback without the need to instantiate the collaborating services. Test-Driven Development should be encouraged and unit test coverage should be measured.
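To illustrate the test-doubles point above – a minimal Groovy sketch; OrderService and InventoryClient are hypothetical names, not part of any real system:

// hypothetical example: testing a service's logic in isolation, with no network calls involved
interface InventoryClient {
    boolean inStock(String sku)
}

class OrderService {
    InventoryClient inventory
    String placeOrder(String sku) {
        inventory.inStock(sku) ? 'ACCEPTED' : 'REJECTED'
    }
}

// in Groovy a map can stand in for an interface - this is our test double
def inventoryStub = [inStock: { String sku -> sku == 'sku-42' }] as InventoryClient
def service = new OrderService(inventory: inventoryStub)

assert service.placeOrder('sku-42') == 'ACCEPTED'
assert service.placeOrder('sku-13') == 'REJECTED'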

Resource Provisioning and Deployment

  • Infrastructure-as-Code approach should be used for versioned and testable provisioning and deployment processes.
  • Microservices should enable horizontal scaling across a compute resource cluster. This calls for using:
    • A central configuration mechanism in the form of a distributed key-value store (such as Consul or etcd). Our CI/CD pipeline should support separate deployment of configuration to that store (a toy read sketch follows this list).
    • A cluster task scheduler (e.g Docker Swarm, Mesos, Kubernetes or Nomad). The CD process needs to interface with whichever system we choose for implementing scratch rollouts, rolling updates and blue/green deployments.
    • Microservice architectures are often enabled by OS container technologies like Docker. Containers as a packaging and delivery mechanism should definitely be evaluated.
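As a small illustration of the central configuration idea – a Groovy sketch reading one value over Consul’s HTTP KV API (the key name is made up and a local Consul agent is assumed):

import groovy.json.JsonSlurper

// minimal read sketch - assumes a Consul agent on localhost; the key name is a made-up example
def key = 'myservice/config/db_url'
def raw = new URL("http://localhost:8500/v1/kv/${key}").text
def entry = new JsonSlurper().parseText(raw)[0]
// Consul returns KV values base64-encoded
def dbUrl = new String(entry.Value.decodeBase64(), 'UTF-8')
println "myservice will connect to: ${dbUrl}"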

Variety/Polyglossia

It is very desirable to base the CI/CD flow for each service on the same template which includes the same building blocks and a well-defined interface. I.e. each service should provide similar ‘build’, ‘test’, ‘deploy’ and ‘promote’ endpoints for integration into the CI system. Additionally the interface for querying service interdependency should be clearly defined. This will allow for almost instant CI/CD flow creation for each new service and will reduce the load on the CI/CD team. Ultimately this should allow developers to plug-n-play new services into the CI/CD system in a fully self-service mode.
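As a sketch of what such a unified interface could look like – a minimal declarative Jenkins pipeline where every service repository is expected to answer the same four make targets (the target names and stages here are an assumed convention, not a prescribed standard):

// a minimal per-service pipeline template sketch - the 'make' targets are an assumed convention
pipeline {
    agent any
    stages {
        stage('Build')   { steps { sh 'make build' } }
        stage('Test')    { steps { sh 'make test' } }
        stage('Deploy')  { steps { sh 'make deploy ENV=staging' } }
        stage('Promote') {
            steps {
                input message: 'Promote to production?'
                sh 'make promote'
            }
        }
    }
}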

© Otomato Software Ltd. 2016. All Rights Reserved.



DevOps Flow Metrics – http://devopsflowmetrics.org


DevOps transformation goals can be defined as:

  • Heightened Release Agility
  • Improved Software Quality

Or simply:

Delivering Better Software Faster

Therefore measurable DevOps success criteria would be:

  • Being able to release versions faster and more often.
  • Having less defects and failures.

Measurement is one of the cornerstones of DevOps. But how do we measure flow?

In order to track the flow (the amount of change getting pushed through our pipeline in a given unit of time) we’ve developed the 12 DevOps Flow Metrics.

They are based on our industry experience and ideas from other DevOps practitioners and are a result of 10 years of implementing DevOps and CI/CD in large organisations.

The metrics were initially publicly presented by Anton Weiss at a DevOpsDays TLV 2016 ignite talk. The talk got a lot of people interested and that’s why we decided to share the metrics with the community.

We’ve created a GitHub Pages based minisite where everyone can learn about the metrics, download the presentation and submit comments and pull requests.

Looking forward to your feedback!

Get the metrics here : http://devopsflowmetrics.org

 



Jenkins and the Future of Software Delivery

Are you optimistic when you look into the future and try to see what it brings? Do you believe in robot apocalypse or the utopia of singularity? Do you think the world will change to the better or to the worse? Or are you just too busy fixing bugs in production and making sure all your systems are running smoothly?

Ant Weiss of Otomato describing the bright future of software delivery at Jenkins User Conference Israel 2016.