Daniel Berman is the Product Marketing Manager at Logz.io, an Israeli and US-based organisation that combines cloud services with machine learning to visualise data for platforms and apps. The firm relies heavily on working in a DevOps environment, so Berman has shared his DevOps predictions for the coming year because, as he says, “it’s important, as [DevOps] changes almost every single day!”
The tendency to automate tasks where possible and practical is a consistent trend throughout DevOps. The concept of automated pipelines for software has become ubiquitous. For example, one can see the number of continuous integration and continuous delivery (CI/CD) tools continue to grow since GitHub introduced GitHub Actions.
Hand-in-hand with the popularity of automation comes the continuing rise of “infrastructure as code” tooling. Tools such as Terraform, AWS CloudFormation, Azure Resource Manager, and GCP’s Deployment Manager allow environments to be spun up and down at will as part of the development process, in CI pipelines, or even in delivery and production. These tools are continuing to mature.
It feels like Kubernetes was everywhere in 2019. Since its 1.0 release in 2015, this immensely popular container orchestrator has had the most mindshare in the DevOps community, despite competition from products like Mesos and Docker’s Swarm. Major software vendors like Red Hat and VMware are fully committed to supporting Kubernetes. An increasing number of software vendors are also delivering their applications by default on Kubernetes.
Kubernetes adoption is still growing. While the platform has yet to prove itself for all classes of workloads, the momentum behind it seems to be strong enough to carry it through for a good while.
Conversations about implementing Kubernetes increasingly go hand-in-hand with conversations about service meshes. “Service mesh” is a loose term that covers any software that handles service-to-service communication within a platform.
Service meshes can take care of a number of standard application tasks that application teams have traditionally had to solve in their own code and setups such as load balancing, encryption, authentication, authorisation, and proxying. Making these features configurable and part of the application platform frees up development teams to work on improvements to their code rather than standard patterns of service management in a distributed application environment.
Another trend in DevOps is to talk about observability in applications. Observability is often confused with monitoring, but they are two distinct concepts. A good way to understand the difference is to think of monitoring as an activity and observability as an attribute of a system. Observability is a concept that comes from real-world engineering and control theory. A system is said to possess observability when its internal state can be easily inferred from its outputs. What this means in practice is that it should be easy to infer from an application’s representation of its internal state what is going on at any given time. As applications get more distributed in nature, determining why parts of them are failing (and therefore affecting the system as a whole) becomes more difficult.
This is where the associated concept of cardinality, which refers to the number of discrete items of time-series data a system stores, comes in. As a rule, the higher the cardinality, the more likely a system is to be observable, since you have more pieces of data to look over when trying to troubleshoot it. Of course, the data gathered still needs to be pertinent to the system’s potential points of failure, and a mental map is also still required to effectively troubleshoot.
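To make the idea of cardinality a little more concrete, here is a minimal Python sketch that counts the distinct time series behind each metric. It assumes, purely for illustration, that each sample is a metric name plus a dictionary of labels, and that every unique label combination counts as one series. The metric names, labels, and the `series_cardinality` helper are hypothetical and not taken from any particular monitoring product.

```python
from collections import defaultdict


def series_cardinality(samples):
    """Count the distinct label combinations (time series) per metric name."""
    series = defaultdict(set)
    for metric, labels in samples:
        # A frozenset of label pairs identifies one unique series.
        series[metric].add(frozenset(labels.items()))
    return {metric: len(label_sets) for metric, label_sets in series.items()}


# Hypothetical samples: (metric name, labels attached to the sample).
samples = [
    ("http_requests", {"service": "cart", "status": "200"}),
    ("http_requests", {"service": "cart", "status": "500"}),
    ("http_requests", {"service": "cart", "status": "200"}),  # repeat of an existing series
    ("http_requests", {"service": "search", "status": "200"}),
    ("queue_depth", {"service": "cart"}),
]

cardinality = series_cardinality(samples)
print(cardinality)
```

Adding a label such as a user ID or request ID would multiply the number of series quickly, which is why higher cardinality gives you more to look over when troubleshooting but also more data to store.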
While the DevOps portmanteau has been a standard part of IT discussions for some time, other neologisms are coming to the fore. DevSecOps is one of these. This concept is gaining traction as teams aim to get security “baked in” to their pipelines from the outset rather than trying to bolt it on after development is complete. Thus security increasingly becomes a responsibility of DevOps, SRE, and development teams; consequently tools are springing up to help them with that.
“Compliance as code” tools like InSpec have become popular as automated continuous security becomes a priority for organisations buckling under the weight of the numerous applications, servers, and environments they track simultaneously.
Automated scanning of container images and other artefacts is also becoming the norm as applications proliferate. Products like Aqua and Sysdig are fighting for market share in the continuous security space.
You may also hear DevSecNetQAGovOps mentioned as more and more pieces of the application lifecycle seek to make themselves part of automated pipelines. However, DevSecOps is still the most common extension of the by-now somewhat-classic DevOps pairing.
The Rise of SRE
Site Reliability Engineering is an engineering discipline that originated in 2003 at Google (before the word DevOps was even coined!), described at length in their eponymous book Site Reliability Engineering. Eschewing traditional approaches to the support and maintenance of running applications, Google elevated operations staff to a level considered equivalent to their engineering function. Within this paradigm, site reliability engineers are tasked with ensuring that live issues are monitored and fixed, sometimes by writing fresh software to aid reliability. In addition, their feedback on architecture and rework pertaining to reliability and stability is taken on by the development team.
SRE works at the scale of Google’s operations, where a division between development and operations (normally an anti-pattern for DevOps) is arguably required because of the infrastructure’s size. Having a team responsible for an entire application from development to production (a more traditional DevOps approach) is difficult to achieve when the platform is large and standardised across hundreds of data centres.
DevOps companies more frequently advertised for “SRE Engineers” than “DevOps Engineers” in 2019. This may be in recognition of SRE’s specific engineering focus, as opposed to DevOps’ company-wide one.
There is increasing speculation about the role artificial intelligence (and, specifically, machine learning) can play in aiding or augmenting DevOps practices. Products such as ScienceLogic’s SL1 are starting to trickle into the market and gain traction, although they are still in the early stages of adoption. These products use machine learning to detect anomalous behaviours in applications based on previously observed or normative behaviours.
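The general idea of flagging behaviour that departs from a previously observed baseline can be sketched in a few lines of Python. This is a deliberately simple statistical stand-in (a z-score against historical values), not the machine learning models that commercial products use. The latency figures, the threshold of 3.0, and the `find_anomalies` helper are all assumptions made for illustration.

```python
from statistics import mean, stdev


def find_anomalies(baseline, observed, threshold=3.0):
    """Return (index, value) pairs from `observed` that deviate from the
    baseline mean by more than `threshold` sample standard deviations."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value is anomalous.
        return [(i, v) for i, v in enumerate(observed) if v != mu]
    return [
        (i, v) for i, v in enumerate(observed)
        if abs(v - mu) / sigma > threshold
    ]


# Hypothetical response times in milliseconds.
baseline_latency_ms = [100, 102, 98, 101, 99, 103, 97, 100]
observed_latency_ms = [101, 99, 250, 102]

anomalies = find_anomalies(baseline_latency_ms, observed_latency_ms)
print(anomalies)
```

A real product would typically learn seasonal patterns and correlate many signals at once, but the shape of the problem is the same: learn what normal looks like, then flag departures from it.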
In addition to traditional monitoring activities, AI can be used to optimise test cases, determining which to run and not run on each build. This can reduce the length of time it takes to get an application into production without taking unnecessary risks with the stability of the system.
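As a rough illustration of test selection, the Python sketch below picks only the tests that touch files changed in a build. It is rule-based rather than AI-driven: it assumes a hand-written map from tests to the source files they exercise, whereas an AI-assisted tool would aim to learn or predict that relationship from build history. The file names, test names, and the `select_tests` helper are hypothetical.

```python
def select_tests(coverage_map, changed_files):
    """Return the tests whose covered files overlap the changed files."""
    changed = set(changed_files)
    return sorted(
        test for test, files in coverage_map.items()
        if changed & set(files)
    )


# Hypothetical mapping of each test to the source files it exercises.
coverage_map = {
    "test_cart": ["cart.py", "pricing.py"],
    "test_search": ["search.py"],
    "test_checkout": ["cart.py", "payment.py"],
}

selected = select_tests(coverage_map, ["cart.py"])
print(selected)
```

Here a change to `cart.py` skips `test_search` entirely, which is where the time saving comes from. The risk lies in an incomplete or inaccurate map, so a periodic full test run is still a sensible safety net.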
On the more theoretical side, Google has published information about their use of machine learning algorithms to predict hardware failures before they occur. As machine learning becomes more mainstream, expect more products like these to arrive in the DevOps space.
Serverless has been a buzzword since AWS introduced AWS Lambda in 2014. Things have been heating up since then, as other providers and products have been getting in on the act.
The term “serverless computing” can be confusing—in part because servers still have to be involved at some level. Essentially, it describes a situation where the deployer of the application need not be concerned with where the code runs. It’s “serverless” in the sense that providing the servers is not something the developer needs to deal with. Typically, serverless applications are tightly coupled with their underlying computing platforms, so you need to be sure that you’re comfortable with that level of lock-in.
“Shifting Left and Right” in CI/CD
The concepts of “shifting left” and, to a lesser extent, “shifting right” in CI/CD gained visibility this year. As release cycles get shorter and shorter, “shifting left” means making efficiency improvements by failing builds earlier in the release cycle. This applies not just to standard application testing, but also to code linting, QA and security checks, and any other checks that can alert developers to issues with their code as early in the process as possible.
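The fail-early principle behind “shifting left” can be sketched as a pipeline that runs its cheapest checks first and stops at the first failure. The Python below is a toy model, not the behaviour of any particular CI product. The check names and their pass/fail results are made up, and the `run_checks` helper is hypothetical.

```python
def run_checks(checks):
    """Run checks in order and stop at the first failure.

    Returns (passed, names_of_checks_that_ran).
    """
    ran = []
    for name, check in checks:
        ran.append(name)
        if not check():
            # Fail fast: later, more expensive checks never start.
            return False, ran
    return True, ran


# Hypothetical checks, ordered from cheapest to most expensive.
checks = [
    ("lint", lambda: True),
    ("unit_tests", lambda: False),  # simulated failure
    ("security_scan", lambda: True),
    ("integration_tests", lambda: True),
]

passed, ran = run_checks(checks)
print(passed, ran)
```

Because the simulated unit test failure stops the run, the developer gets feedback without waiting for the security scan or integration tests to finish.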
“Shift-right” testing takes place in production (or production-like) environments. It is intended to bring problems to the surface in production before monitoring alerts fire or users report issues.
These are just some of the more noteworthy trends we’ve been watching amidst the maelstrom of activity in the world of DevOps in 2019. The acronym “CALMS” (Culture, Automation, Lean, Measurement, Sharing) is a helpful way to structure thinking about DevOps tools and techniques and, going from 2019 to 2020, the 10 DevOps trends in this article certainly exemplify these principles!