Every CIO I know is balancing two humongous beach balls. One ball
represents driving innovation, improving experiences, and delivering data,
analytics, and AI capabilities. The CIO wants to fill this ball with helium
and make it easy for agile, self-organizing teams to collaborate,
experiment, and release changes quickly.
The other ball represents the reliability, stability, performance, and
security of mission-critical business services – from the ones that run core
operations down to the pilots and POCs the dev and data science teams are
DevOps + SRE Improve Reliability but Can’t Eliminate P1s
Wait! Wasn’t DevOps supposed to fix the paradox of deploying fast while
assuring high service levels? Shouldn’t all the investment in CI/CD
automations, continuous testing, shifting-left security practices, migrating
to the cloud, configuring infrastructure as code … among other practices …
eliminate major P1s, along with the need for IT Ops, IT service management, and
practices that reduce the mean time to recover from them?
Of course, DevOps and SRE practices have made huge impacts on dev and ops
functions, but what CIOs know is that there will always be unknowns,
mistakes, and issues outside of IT’s control that create instability. IT
will always need people who respond to these issues and who are under business
pressure to resolve them faster, address root causes, and handle growing
I know this because of the
AIOps benchmark report
StarCIO recently completed, where respondents told us that MTTR/uptime are
top KPIs and that 70 percent require 3 hours or longer to resolve major P1s.
I recently shared these findings in
five reasons major P1 Incidents have terribly long resolution times
and also wrote about
three meaningful KPIs to focus agile development, DevOps, and IT Ops to
deliver business outcomes.
Why CIOs Must Add AIOps to their DevOps Programs
DevOps always included monitoring as a primary practice, and many have
followed through by implementing monitoring tools, increasing their
applications’ observability, developing service level objectives, and
measuring error budgets.
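For readers less familiar with error budgets, the arithmetic behind them can be sketched in a few lines of Python. This is a minimal illustration, assuming a request-based SLO; the function name, parameters, and sample numbers are hypothetical and not taken from the StarCIO research or any specific monitoring tool.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Summarize a request-based error budget for one measurement window.

    slo_target is the fraction of requests that must succeed, e.g. 0.999.
    All names and inputs here are illustrative assumptions.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1, exclusive")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")

    # The budget is the number of failures the SLO tolerates in this window.
    allowed_failures = (1.0 - slo_target) * total_requests
    # Fraction of the budget still unspent; negative means the SLO was missed.
    remaining_fraction = 1.0 - failed_requests / allowed_failures

    return {
        "allowed_failures": allowed_failures,
        "failed_requests": failed_requests,
        "remaining_fraction": remaining_fraction,
    }


if __name__ == "__main__":
    # Hypothetical month: 1,000,000 requests against a 99.9 percent SLO.
    report = error_budget(slo_target=0.999, total_requests=1_000_000, failed_requests=250)
    print(f"Budget remaining: {report['remaining_fraction']:.0%}")
```

A team that has spent most of its budget might slow releases and prioritize reliability work, while a team with budget to spare has room to experiment.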
But the challenge is that there’s too much operational data for NOCs, IT
Ops, and SREs to parse through when incidents span apps, microservices,
planetary databases, and SaaS integrations across data centers, multiclouds,
and edge networks. That’s the challenge AIOps aims to address by
centralizing operational data and applying
open box machine learning algorithms to correlate incidents.
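To make the idea of correlating incidents concrete, here is a deliberately simple sketch. It uses a plain time-window heuristic rather than machine learning, and it is not how BigPanda or any other AIOps platform works; the Alert class, field names, and five-minute window are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Alert:
    """A hypothetical, simplified alert record."""
    timestamp: float  # seconds since some reference time
    source: str       # e.g. a monitoring tool or service name
    message: str


def correlate_by_time(alerts: List[Alert], window_seconds: float = 300.0) -> List[List[Alert]]:
    """Group alerts so that each alert is within window_seconds of the previous one.

    This is a naive heuristic: alerts that arrive close together are assumed
    to belong to the same incident. Real AIOps tools use much richer signals
    such as topology, change data, and learned patterns.
    """
    groups: List[List[Alert]] = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        if groups and alert.timestamp - groups[-1][-1].timestamp <= window_seconds:
            groups[-1].append(alert)  # close enough to the last alert: same group
        else:
            groups.append([alert])    # gap too large: start a new candidate incident
    return groups


if __name__ == "__main__":
    sample = [
        Alert(1000.0, "db-monitor", "replication lag high"),
        Alert(0.0, "api-gateway", "5xx rate elevated"),
        Alert(100.0, "checkout-service", "latency above threshold"),
        Alert(1200.0, "db-monitor", "connection pool exhausted"),
    ]
    for i, group in enumerate(correlate_by_time(sample), start=1):
        print(f"Candidate incident {i}: {[a.source for a in group]}")
```

Even this crude grouping collapses four alerts into two candidate incidents, which hints at why centralizing operational data matters before any algorithm can help.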
So it’s no surprise to see
BigPanda’s news of raising $190 million
while growing 2021 sales by 155 percent year over year. “The
need among IT Operations teams for AI-powered insights and automation has
exploded in recent years,” said Assaf Resnick, co-founder and CEO of
And StarCIO’s research showed that
93 percent of organizations are investing in AIOps or plan to soon, and the early adopters (16 percent) report that AIOps is already making a
significant operational impact.
So for CIOs trying to juggle innovation and reliability, AIOps provides the
intelligence and automation to help IT keep both balls in the air.