So I recently talked to a release engineering team leader at a very well-known American software and hardware company located in California.
They contacted me looking for top-notch build infrastructure engineers and we spent a very interesting hour discussing their technology stack and people processes. On the surface – they are moving in the right direction – automating all the things, using Chef for config management, codifying the infrastructure, using Artifactory for binaries… But one thing struck me in our conversation. The team leader (clearly a very smart and talented guy) said: “I’ve been trying to hire for two years. Good professionals are so hard to find…”
I agreed at first – we all know the market is starving for technical talent.
But reflecting on our conversation afterwards I realized that he had in fact shown me the number one symptom of how far their team is from ‘doing devops’.
Yes – hiring is challenging. I’ve discussed this before. It may take some time to find the person whose mindset, values and overall vibe suit your expectations. Even more so if you’re low on budget, as the market has become highly competitive. But for this specific team leader, budget was not an issue. He was failing to hire because of his organisation’s unwillingness to invest in mentoring and development.
He was looking for a ready-made fullstack ninja with a very specific skillset. There is no other way to explain his two-year-long quest.
Can you see how this totally opposes the ever-important Devops value of Sharing? So you’ve built a few things to be proud of, you’re technologically savvy, you’re on the bleeding edge. Now is the time to share your knowledge! Now is the time to go hire bright novices. They are out there, they are hungry to learn from your experience, and by sharing with them you will build real devops on your team.
So if you’ve been failing to hire for more than a couple of months – don’t blame the market. Don’t complain about lousy candidates. Go revise your hiring process and, more importantly, the way your team works. You may have all the technology, but it looks like your culture is broken. And with a broken culture – even if you eventually succeed in hiring, you’ll have a hard time reaping the true benefits of the Devops way: agility, quality, motivation and trust.
And may the devops be with you!
Hi! I’m Ilya Sher.
This guest post will describe the deploy process at Utab, one of my clients.
Utab uses a services architecture. Some services are written in Java, others in NodeJS. Each server runs either one Java application or one NodeJS application. The production environment uses (with the exception of the load balancing configuration) the immutable server approach.
The client specified the following requirements:
Utab’s deploy and other scripts were made specifically for the client. Such custom scripts are usually simpler than any ready-made solution, which means they break less and are easier to modify and operate. No workarounds are required in cases when a chosen framework is not flexible enough.
Also note that a custom solution solves the whole problem, not just parts of it, as is often the case with ready-made solutions. For example, the scripts also handle the following aspects: firewall rules configuration, EC2 volumes attachment, EC2 instances and images tagging, and running tests before adding a server to the load balancer. When all of this is solved by a single set of custom scripts, and not by several ready-made solutions plus your own glue scripts, the result looks much better and is more consistent.
We estimate that such a custom solution has a lower TCO despite the development cost. Also note that in the case of Utab, the development effort for the framework scripts was not big: about ten working days upfront and another few days over three years. I attribute this at least partly to sound and simple software and system architecture.
What you see above is common practice and you can find a lot about these steps on the internet except for number 4, which was our customization.
Note that while the repository is Java-oriented, we use it for both Java and NodeJS artifacts for uniformity. We use mvn deploy:deploy-file to upload the NodeJS artifacts.
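A minimal sketch of such an upload could look like the following. Note that all coordinates, the repository id and the URL here are hypothetical placeholders, not Utab’s real values; the echo keeps this a dry run – drop it to actually perform the upload.

```shell
# hypothetical wrapper around mvn deploy:deploy-file for uploading a NodeJS
# tarball to a Maven-style repository; coordinates and URL are made up
upload_node_artifact() {
    local file=$1 version=$2
    echo mvn deploy:deploy-file \
        -Dfile="$file" \
        -DgroupId=com.example.utab -DartifactId=web-frontend \
        -Dversion="$version" -Dpackaging=tar.gz \
        -DrepositoryId=internal-releases \
        -Durl=https://repo.example.com/releases
}

upload_node_artifact web-frontend-42.tar.gz 42
```

The same command shape works for any non-Java artifact, which is what keeps the repository usage uniform across the two stacks.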
For NodeJS artifacts, static files that go to a CDN are included in the artifact. Java services are not exposed directly to the browser, so Java artifacts do not contain any static files for the CDN.
Scripts that use the AWS API are written in Python; the rest are in bash. I would be happy to use one language, but bash is not convenient for using the AWS API, and Python is not as good as bash for system tasks such as running programs or manipulating files.
I do hope to resolve this situation by developing a new language (and a shell), NGS, which should be convenient for system tasks – which today include talking to APIs, not just working with files and running processes.
Gets destination IP or hostname and the role to deploy there.
The Python scripts, when successful, notify the developers via a dedicated Twitter account. With the python-twitter library this is as simple as:

import twitter
api = twitter.Api(c['consumer_key'], c['consumer_secret'], c['access_token_key'], c['access_token_secret'])
api.PostUpdate(message)  # e.g. a short "deploy of role X finished OK" status
I would like to elaborate on pulling the required artifacts from the repository. The artifacts are pulled from the repository to the machine where the deploy script runs – the management machine (yet another server in the cloud, near the repository and the rest of the servers). I have not seen this implemented anywhere else.
Note that pulling artifacts to the machine where the script runs works fine when a management machine is used. You should not run the deployment script from your office, as downloading and especially uploading artifacts would take some time. This limitation is the result of a trade-off: the gained conceptual simplicity outweighs the minuses. When the setup scripts are uploaded to the destination server, the artifacts are uploaded with them, so the destination application servers never need to talk to the repository – and hence, for example, no security issues need to be handled there.
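The flow can be sketched with local directories standing in for the three machines. All paths and file names below are made up for illustration; the real scripts of course pull over the network and upload with scp/rsync rather than cp.

```shell
set -e
# toy model: local directories stand in for the repository, the
# management machine and the destination application server
repo=/tmp/deploy-demo/repo
mgmt=/tmp/deploy-demo/mgmt
target=/tmp/deploy-demo/target
rm -rf /tmp/deploy-demo
mkdir -p "$repo" "$mgmt" "$target"
echo 'build 42' > "$repo/service.jar"   # artifact already in the repository

# step 1: pull the artifact from the repository to the management machine
cp "$repo/service.jar" "$mgmt/"

# step 2: upload the setup scripts together with the artifact to the
# destination server -- it never needs to talk to the repository itself
printf '#!/bin/sh\necho configuring\n' > "$mgmt/setup.sh"
cp "$mgmt/service.jar" "$mgmt/setup.sh" "$target/"
ls "$target"
```

The point of the sketch is the direction of the arrows: everything flows through the management machine, and the application server only ever receives files.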
It’s the developers’ responsibility to make sure that the build being packaged at this step is one of the builds that were tested in the staging environment.
This can be done in one of the following ways:
Run the deploy.py script giving an older version as an argument
When old servers are removed from load balancing, they are not immediately terminated. Their termination is manual, after the “no rollback” decision. The servers rotated out of load balancing are tagged with the “env” tag value “removed-from:prod”.
We never finished it, as rollbacks were a very rare occurrence and the other two methods work fine.
Instances naming: role/build number
For instances with several artifacts that have build numbers: role/B1-B2-B3-…, where B1 is the primary artifact’s build number and the others follow in order of decreasing importance.
The status tag is updated during script runs, so when deploying to production it can, for example, look like “Configuring”, “Self test attempt 3/18”, “Self tests failed” (a rare occasion!) or “Configuring load balancing”.
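Conceptually, each such status update is a single tag write against the instance. Below is a sketch of it expressed as the equivalent AWS CLI call – the real scripts do this from Python via the AWS API, the instance id is made up, and the echo keeps this a dry run.

```shell
# hypothetical helper: write the current deploy status into the
# "status" tag of an EC2 instance (dry run via echo)
update_status_tag() {
    local instance_id=$1 status=$2
    echo aws ec2 create-tags \
        --resources "$instance_id" \
        --tags "Key=status,Value=$status"
}

update_status_tag i-0123abcd "Self test attempt 3/18"
update_status_tag i-0123abcd "Configuring load balancing"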
This post described one possible solution. I think the solution described above is one of the best possible solutions for Utab. This does not mean it’s a good solution for any other particular company. Use your best judgement when adopting a similar solution or any part of it.
You should always assume that a solution you are looking at is not a good fit for your problem or situation, and then go on and prove that it is, considering possible alternatives. Such an approach helps avoid confirmation bias.
There are a lot of articles on the internet bashing each of the tools, but in our opinion most of that criticism comes from misunderstanding a tool’s design or from trying to apply it in an inappropriate context.
This post summarizes the general rules of thumb we at Otomato follow when choosing a solution for this admittedly non-trivial situation.
First of all – whenever possible – we recommend integrating your components at the binary package level rather than compiling everything from source each time. I.e.: packaging components as jars, npms, eggs, rpms or docker images, uploading them to a binary repo and pulling them in as versioned dependencies during the build.
Still – sometimes this is not an optimal solution, especially if you do a lot of feature branch development (which in itself is an anti-pattern in the classical Continuous Delivery approach – see here for example).
For these cases we stick to the following guidelines.
Git Submodules:
In general: whenever we want to integrate separate, decoupled components with distinct lifecycles – we recommend submodules over repo, but their introduction must come with proper education regarding the special workflow they require. In the long run it pays off – integration points can be managed in a deterministic manner, and with knowledge comes certainty in the tool.
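The deterministic integration point is the commit the submodule is pinned to. The basic workflow can be demonstrated with two throwaway local repositories – the repo names and paths below are made up for the demo:

```shell
set -e
# two throwaway local repos: "deploy-scripts" is the shared component,
# "platform" consumes it as a submodule pinned to an exact commit
base=/tmp/submodule-demo
rm -rf "$base" && mkdir -p "$base" && cd "$base"

git init -q deploy-scripts
git -C deploy-scripts -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "initial version"

git init -q platform
cd platform
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "initial commit"

# newer git versions require file-protocol submodules to be allowed explicitly
git -c protocol.file.allow=always \
    submodule --quiet add "$base/deploy-scripts" vendor/deploy-scripts
git -c user.email=demo@example.com -c user.name=demo \
    commit -q -m "pin deploy-scripts at its current commit"

# teammates clone with --recursive (or run: git submodule update --init),
# and bumping the pin is an explicit, reviewable commit in "platform"
git submodule status
```

This is exactly the “special workflow” that needs teaching: the consuming repo records a commit id, not a branch, so integration moves only when someone deliberately moves it.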
If you find your components are too tightly coupled, or you’re in need of continuous, intensive development occurring concurrently in multiple repos, you should probably use git subtrees – or just spare yourself the headache and drop everything into one big monorepo. (This depends, of course, on how big your codebase is.)
To read more about git subtree – see here.
The important thing to understand is that software integration is never totally painless and there is no perfect cure for the pain. Choose the solution that makes your life easier and assume the responsibility of learning the accompanying workflow. As they say: “it’s not the tool – it’s how you use it.”
I’ll be happy to hear what you think, as this is a controversial issue with many different opinions flying around on the internet.
And – if you’re looking for some git consulting – drop us a note and we’ll be happy to help.
We really hope you’ve already registered for the Jenkins User Conference Israel, which SOLD OUT yesterday.
But even if you haven’t – there’s still a chance to hang out with Kohsuke Kawaguchi – the father of Jenkins himself – and other Jenkins fans at a meetup the good people at JFrog are organising the day after the conference.
Here is the link: http://meetu.ps/e/BHkfn/jvVGM/d
Right now there are still 22 slots available.
Go grab yours.
We are happy to announce that @antweiss will be speaking about the future of software delivery at JUC Israel 2016
We have a couple of 100 NIS discount tickets to the conference and ticket sale is closing tomorrow!
The first two folks to comment on this post will get the discount code.
Type them comments!!!
Openstack is one of the largest OSS projects today with hundreds of commits flowing in daily. This high rate of change requires an advanced CI infrastructure. The purpose of the talk is to provide an overview of this infrastructure, explaining the role of each tool and the pipelines along which changes have to travel before they find their way into the approved Openstack codebase.
Talk delivered by Anton Weiss at Openstack Day Israel 2016 : http://www.openstack-israel.org/#!agenda/cjg9
One cannot talk about modern software delivery without mentioning Infrastructure as Code (IAC). It’s one of the cornerstones of DevOps: it turns ops into part-time coders and devs into part-time ops. IAC is undoubtedly a powerful concept – it has enabled the shift to giant-scale data centers and clouds, and has made a lot of lives easier. Numerous tools (generally referred to as DevOps tools) have appeared in the last decade to enable codified infrastructures. And even tools that originally relied on a user-friendly GUI (and probably owe much of their success to that GUI) are now putting more emphasis on codified flows (I am talking about Jenkins 2.0 of course, with its enhanced support for delivery-pipeline-as-code).
IAC is easy to explain and has its clear benefits:
But why am I writing about this now? What made me revisit this already widely known and accepted pattern? (Even many enterprise organizations are now ready to codify their infrastructure.)
The reason is that I recently got acquainted with an interesting man who has a mind-stimulating agenda. Peter Muryshkin of SIGSPL.org has started a somewhat controversial discussion regarding the future of devops in the light of overall business digitalisation. One thing he rightfully notices is that software engineering has been learning a lot from industrial engineering – automating production lines, copying Toyotism and the theory of constraints, containerising goods and services, etc. The observation isn’t new – to quote Fred Brooks as quoted by Peter:
“Techniques proven and routine in other engineering disciplines are considered radical innovations in software engineering.”
This is certainly true for labour automation, which existed long before IAC brought its benefits to software delivery. It’s also true for monitoring and control systems, which have been used in factories since the dawn of the 20th century and which computers started being used for in the 1960s.
But the progress of software delivery disciplines wasn’t incremental and linear. The cloud and virtualization virtually exploded on us. We didn’t have time to keep slowly adapting the known engineering patterns when the number of our servers suddenly rocketed from dozens to thousands.
In a way – that’s what brought IAC on. There were (and still are) numerous non-IAC visual infrastructure automation tools in the industry. But their vendors couldn’t quite predict the scale and speed of operation demanded by the black hole of data gravity. So the quick and smart solution of infrastructure-as-code was born.
And that brings us to what I’ve recently been thinking about quite a lot – the missing visualization. Visibility and transparency (or measurement and sharing) are written all over the DevOps banners. The classic view of IAC actually insists that “tools that utilize IaC bring visibility to the state and configuration of servers and ultimately provide the visibility to users within the enterprise”. In theory – that is correct. Everybody has access to the configuration files, and developers can use their existing SCM skills to make some sense of system changes over time… But that’s only in theory.
The practice is that, with time, the IAC codebase grows in volume and complexity. As with any programming project – ugly hacks get introduced, and logic bugs get buried under the pile of object and component interactions. (And just think of the power of a bug that’s been replicated to a thousand servers.) Pretty soon only the people who maintain the infra code are able to really understand why and what configuration gets applied to which machine.
Over the past years I’ve talked to enough desperate engineers trying to decipher Puppet manifests or Chef cookbooks written by their ops colleagues. Which makes me ask if maybe IAC sometimes hinders devops instead of enabling it…
Even the YAML-based configurations like those provided by Ansible or SaltStack become very hard to read and analyze beyond simple models.
As is always the problem with code – it’s written by humans for machines, not for other humans.
But on the other hand, machines are becoming ever better at visualizing code so that humans can understand it. So is that happening in the IAC realm?
In my overview of Weapons of Mass Configuration I specifically looked at the GUI options for each of the 4 ninja turtles of IAC, and sadly found out that not one of them got any serious praise from users for its visualization features. Moreover, the GUI solutions were disregarded as “just something the OSS vendors are trying to make a buck from”.
I certainly see this as a sign of infrastructure automation still being in its artisanal state – made by and for the skilled craft workers who prefer to write infra code by hand. But exactly as the artisans had to make way for the factories and labour automation of the industrial revolution – a change is imminent in the current state of IAC. It’s just that usable visualization is still lacking, the tools still require too many special skills, and the artisans of IAC want to stay in control.
Don’t get me wrong – I’m not saying our infra shouldn’t be codified. Creating repeatable, automated infrastructure has been my bread and butter for quite some time, and tools like Puppet and Ansible have made this a much easier and cleaner task. I just feel we still have a long way to go. (Immutable infrastructure and containerisation, while being a great pattern with benefits of its own, also rely heavily on manual codification of both the image definitions and the management layers.)
Infrastructure management and automation are still too much of an issue and still require too much special knowledge to be effectively handled with the existing tools. Ansible is a step in the direction of simplicity, but it’s a baby step. Composing infrastructure shouldn’t be more complicated than assembling an Ikea bookshelf – and for that, new, simpler, ergonomic UIs need to be created.
Large-scale problems need industrial level solutions. Let’s just wait and see which software vendor provides such a solution first – one which will make even the artisans say ‘Wow!’ and let go of their chisel.
And with that said – I’ll go back to play with my newly installed Jenkins 2.0 instance for some artisanal pipeline-as-code goodness. 🙂
Otomato is happy to announce we’re collaborating with Codefresh on their great continuous delivery for docker containers solution. (If you’re using docker in your development or production, or are only thinking of doing it – go check out http://codefresh.io.) Our collaboration is mostly focused on industry analysis and docker development ecosystem research, with the goal of identifying pain points and providing effective solutions. With quite a bit of evangelism on top. The linked post is the first fruit of that.
We love productivity tools of all kinds and believe that software should be making our lives easier. For some reason, many software products we have to work with don’t really feel like their creators agree. They may have great architecture, reliability and functionality, but it looks like usability was tacked on as an afterthought.
One example of this is apt – the Advanced Packaging Tool used on Debian/Ubuntu Linux. While a great tool in itself, for some reason it provides a number of different user interfaces for different purposes.
To install/remove/upgrade/download a package there's the apt-get tool:

apt-get is a simple command line interface for downloading and installing packages. The most frequently used commands are update and install.

Commands:
  update - Retrieve new lists of packages
  upgrade - Perform an upgrade
  install - Install new packages (pkg is libc6 not libc6.deb)
  remove - Remove packages
  autoremove - Remove automatically all unused packages
  purge - Remove packages and config files
  source - Download source archives
  build-dep - Configure build-dependencies for source packages
  dist-upgrade - Distribution upgrade, see apt-get(8)
  dselect-upgrade - Follow dselect selections
  clean - Erase downloaded archive files
  autoclean - Erase old downloaded archive files
  check - Verify that there are no broken dependencies
  changelog - Download and display the changelog for the given package
  download - Download the binary package into the current directory
But then, if you want to search or query for packages, you have to remember that a different utility is used – apt-cache:
apt-cache is a low-level tool used to query information from APT's binary cache files.

Commands:
  gencaches - Build both the package and source cache
  showpkg - Show some general information for a single package
  showsrc - Show source records
  stats - Show some basic statistics
  dump - Show the entire file in a terse form
  dumpavail - Print an available file to stdout
  unmet - Show unmet dependencies
  search - Search the package list for a regex pattern
  show - Show a readable record for the package
  depends - Show raw dependency information for a package
  rdepends - Show reverse dependency information for a package
  pkgnames - List the names of all packages in the system
  dotty - Generate package graphs for GraphViz
  xvcg - Generate package graphs for xvcg
  policy - Show policy settings
And what if we want to list all the files installed by a specific package? Well, we’re supposed to remember that there’s a special utility for that – apt-file. And it’s not even installed by default…
It figures we weren’t the only ones puzzled by this multitude of interfaces. Wajig targets exactly that problem – providing a unified interface to all package-related functionality on Debian/Ubuntu. Or, as stated on the site:
“Wajig is a simplified and more unified command-line interface for package management. It adds a more intuitive quality to the user interface. Wajig commands are entered as the first argument to wajig. For example: “wajig install gnome”. Written in Python, Wajig uses traditional Debian administration and user tools including apt-get, dpkg, apt-cache, wget, and others. It is intended to unify and simplify common administrative tasks. … You will also love the fact that it logs what you do so you have a trail of bread crumbs to back track with if you install something that breaks things.”
Another nice thing – wajig knows when root privileges are needed and takes care of that, so we don’t need to rerun a command each time we forget to prepend it with sudo.
We still need to play with wajig a bit more to decide whether it delivers on all its promises, but it is definitely going in the right direction.
Wishing you all user-friendly software and a good week.