Narrator: You're listening to the Humans of DevOps podcast, a podcast focused on advancing the humans of DevOps through skills, knowledge, ideas and learning, or the SKIL framework.
Richard Whitehead: Right, so the good news is people's jobs are most definitely not at risk. The problem we're trying to solve, I think, is increasing at a greater rate than we can solve it. So the expanding nature of the problem, I think, secures people's employment for a very, very long time. I think in every aspect of any form of digital transformation, when you look at any aspect of the business, it doesn't matter how much effort, how much code rewriting, how much automation we do. The opportunity, and I want to refer to this as an opportunity, not a problem, is increasing so fast that I don't think anybody's going to be out of a job anytime soon.
Jason Baum: Hey, everyone, welcome back. It's Jason Baum, Director of Member Experience at DevOps Institute, and this is the Humans of DevOps podcast. I hope you had a great week. We're glad you came back to join us. So let me take you back to the '80s: the machines are coming to get us. Scenes and lines from The Terminator and its many sequels are forever etched in our brains. It hauntingly depicts a world hell-bent on technological growth that led to the rise of advanced machine learning techniques and artificial intelligence, which would ultimately lead to the world's demise. Well, we're on our way there, folks. But instead of the doomsday scenarios we're all picturing, AIOps has emerged as an essential step forward for enterprises in a variety of industries. Gartner defines AIOps as artificial intelligence for IT operations, which combines big data and machine learning to automate IT operations processes. At its core, it's all about how IT teams and organizations can use AI to manage data in their environments. Through this approach, teams can employ large-scale datasets, machine learning and automation to make IT ops faster, simpler and more efficient. Many believe AI will not only impact organizations, but will become a major facet of our everyday lives through the emergence of new applications. I mean, we already see ML and AI every day with things like Face ID, smart replies, product recommendations, chatbots, you name it. Today, we're going to talk to Richard Whitehead, Chief Evangelist at Moogsoft, about what AIOps really means for humans, as opposed to being stuck in the '80s version of this story. Sorry about that. Richard, are you ready to get human?
Richard Whitehead: Absolutely.
Jason Baum: First of all, thanks so much for coming on. I appreciate having you here on the podcast. And my apologies for taking it back to the '80s, but every time I talk about AI, my mind immediately goes to "the machines are coming for us."
Richard Whitehead: Certainly, when I started working in this space, I think every single PowerPoint presentation I saw, both internally and externally, had at least some reference to Skynet and a Terminator. So yes, it's not that far back.
Jason Baum: Yeah, no, it's not that far back. And I think people are still scared by it. I have a funny story that I've perhaps told in the past, I'm trying to remember, maybe not, about my mother. She actually keeps her Amazon device, whose name cannot be said because it's actually in the room with me and will start speaking, but we know what we're talking about, put away. She actually keeps it in a cabinet, and when she wants to use it, takes it out, plugs it in, and then says the magic words to turn it on, because she's afraid of it listening to her.
Richard Whitehead: That might be a legitimate fear. It is always listening; it has to be. But yeah, I think, you know, I'm a pragmatist. I don't fear artificial intelligence, but I do occasionally have a sense of disappointment. You know, when my door camera notifies me that there's a person outside, and it's clearly a dog. And also some shopping recommendations I get. You know, if I were to buy an item such as a dishwasher, I don't feel the need to have somebody suggest that I buy a dishwasher for the next six months. So I think for people, you know, it's less fear, it's more disappointment in what AI can bring to you, based on some of the more obvious commercial variants that are out there.
Jason Baum: Yeah, so let's talk about what AI is. I feel like that's a good place to start. You heard the Gartner definition; I wanted to include that because I always feel like having an academic definition of sorts is important to hear. I've also heard someone very succinctly put it that artificial intelligence is simply the problem, and then once it's solved, it's no longer AI, which I think is a fascinating way to look at it. I'm curious how you would define it.
Richard Whitehead: So, my definition's a little broader. To go back to the specific definition, the Gartner definition, which has actually evolved into that form because when you put the letters AI together, people automatically assume it means artificial intelligence: AIOps is the application of artificial intelligence techniques to IT operations. So that's really it. And of course, that makes it a very broad definition, which means there are a lot of technologies and techniques and solutions out there that all fit under this umbrella. So when people talk about artificial intelligence, there's a general sense that what you're talking about is technology, or computers, that are in some way attempting to replicate humans. You know, that's where the inevitable screenshot of the Terminator robot appears, and so forth. And I think that's generally true. So, you know, you think of chess-playing computers and things like that, and that sense has been largely reinforced by some of the early adoptions of AI in things like service desks. So when you call into a service desk, your first interaction is likely to be with some form of AI capability that attempts to give you an answer very rapidly, without any form of real human interaction. And in most cases, they're attempting to sort of pass the Turing test. In other words, you're talking to a computer, and it's trying to make that interaction as human-like as possible. So while that's true for things like service desks, when you dig deeper into some technology and start talking about, you know, the concepts of monitoring, observability, remediation, and things like that, it becomes less about attempting to replicate a human and more about attempting to replicate what a human would try to do if they were involved. So it's not a human interaction, it's an application of human-based intelligence, but in an automated fashion. So with that broad definition, you're incorporating not just that chess-playing type of thing, which is actually the least of the components, but you're talking about things like machine learning, about sophisticated algorithms that maybe do linear regression, and about some techniques on the periphery of artificial intelligence, such as natural language processing. I've already mentioned two that I'm personally relatively familiar with, which are machine learning and natural language processing. And these are things that you don't necessarily think of when you're talking about AI, but they're absolutely relevant and very pertinent to solving specific problems.
Jason Baum: And that's a good lead into this next question: what are some examples of problems that we're looking to solve with AIOps and machine learning specifically? And also, when you hear, okay, we're looking to solve problems and automate and speed up some processes, I think one of the misconceptions then is, well, now I'm going to lose my job, AI is going to replace me. So perhaps you could address both in your answer: what's the problem? And then, as we solve it with AIOps, are people's jobs at risk?
Richard Whitehead: Right, so the good news is people's jobs are most definitely not at risk. The problem we're trying to solve, I think, is increasing at a greater rate than we can solve it. So the expanding nature of the problem, I think, secures people's employment for a very, very long time. I think in every aspect of any form of digital transformation, when you look at any aspect of the business, it doesn't matter how much effort, how much code rewriting, how much automation we do, the opportunity, and I want to refer to it as an opportunity, not a problem, is increasing so fast that I don't think anybody's going to be out of a job anytime soon. In fact, I think if you look at some of the newer roles that are emerging as a result of digital transformation, such as site reliability engineers, developers in general, and operations folks emerging in sort of a DevOps-type capacity, that's an expanding market opportunity, not a shrinking one. So individual teams might be smaller as a result of this technology, but the market opportunity in general is such that I think it's going to be a long time before the demand for people in these roles cools off. So yeah, no problem there; we're dealing with an exploding market opportunity. So basically, it sort of comes down to, I think I mentioned it earlier, the notion of automation. When we talk about AI replicating human activity, I tend to think of it in this sense: when you're looking at something a human would do in their day-to-day line of work, solving a problem, addressing an incident, debugging something, the question we always ask ourselves is, what are the most common and repetitive tasks that a human performs? Those are the things that I think are easy targets for AI to replicate, because they tend to be the mundane tasks; you know, we tend to refer to them as toil. The stuff that you do every single time that doesn't necessarily add value, but is just a task that has to be performed in order to move on to the next job of actually resolving the issue or finding the error. And that's something that I think is sort of overlooked. People tend to think of AI as being an end goal: we're going to completely replace a human, and you throw data at it and get a solution at the other end. I tend to think of AI, and we sometimes refer to it at Moogsoft as applied AI, as basically a very small tool you can use to achieve a very specific task. It could be something as simple as doing a bit of triage, augmenting some information, so that's one less task you have to do. That's one less system you have to log into to get some additional information. If that information can be gathered for you and presented to you, that's one less mundane task you have to perform in order to get to the really important stuff, which is using your human brain to resolve the issue. So yeah, applied AI is a good way of looking at it.
Jason Baum: That's what it's all about, right? Getting rid of the mundane. I think that's the goal, right?
Richard Whitehead: It's certainly a DevOps ideal, right? Yeah, improving the daily work.
Jason Baum: Yeah, efficiency. So, once it's set up right, we've got it, the mundane is gone, it's working, everything is being automated, is it simply plug and play? And now we just let it go? You know, can we trust the machines to continue in perpetuity, I guess forever? The machine is learning, and everything is all set?
Richard Whitehead: Well, there are a couple of angles to that. The first one is, can you just unleash the power of AI and have it do its job? And the second aspect is, you know, is it a one-time deal? Do you just set it up once and let it run? So, to address the "can you unleash it": maybe one of the challenges I've had dealing with very conservative-minded IT operations folks, when trying to bring something as fuzzy as AI into a previously incredibly deterministic world, where everything is well understood and every action has a very well-understood reaction, is: how trustworthy is it? Is it going to get the same results each time? And the answer is, well, not always, because if the input data is different, then it might respond differently. So from a trustworthiness standpoint, you have to take a step back and think, well, there are many different types of AI technology. Even within something like machine learning, there's the concept of supervised and unsupervised machine learning. And so if you just want to throw some data at the system and have it do its thing, you're probably describing unsupervised machine learning. There are certain techniques, or certain areas, where that's very applicable and particularly trustworthy: areas where there's no real learning that needs to be done. I think a lot of the concern that people have over machine learning and AI is where training has to occur, and how accurate the training is. But there are certain techniques that just work. You don't need to build a model; you just react to the data that's coming in. An example would be: you have a flow of data coming into a system, and you're looking at that data in real time, trying to identify patterns. You're not necessarily comparing it to a historical model; you're just looking at the data as it is, in real time, trying to determine patterns. That's a good example of an unsupervised approach, because there's no training model: you're looking at the data in real time and coming up with an answer. So that's a good example of something where you can just turn it on and let it do its magic. There are other areas where training becomes more of an important component. And I think, from our standpoint, when applying those techniques to an operations-type environment, that's where the human becomes important, because in the supervised model, at that point, the training is done by a human. So the system would say to you, this is something I determined from the input data, what do you think? And the human has the opportunity to train it. In practical terms, that might be the ability to tag data, or press a button to give it a thumbs up or a thumbs down. And that sort of human-guided supervised learning, again, becomes trustworthy because the human has provided the input. It's not something the system has determined on its own; you're actually giving it some sort of positive affirmation. So if the model is good, it's because a human has trained it to be good, based on their current knowledge.
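The unsupervised, real-time pattern detection Richard describes can be sketched as a rolling-window anomaly detector: no pre-trained model, just a reaction to the data as it streams in. This is an illustrative sketch only, not Moogsoft's actual technique; the class name, window size, and threshold are all assumptions.

```python
from collections import deque
from statistics import mean, stdev

class StreamingAnomalyDetector:
    """Flags values that deviate sharply from the recent window.

    There is no historical model: the detector reacts only to the
    data it has already seen, as in unsupervised, real-time analysis.
    """

    def __init__(self, window=30, threshold=3.0):
        self.values = deque(maxlen=window)  # rolling window of recent data
        self.threshold = threshold          # how many std devs count as anomalous

    def observe(self, value):
        """Return True if `value` is anomalous relative to the window."""
        anomalous = False
        if len(self.values) >= 5:  # wait for a few points before judging
            mu = mean(self.values)
            sigma = stdev(self.values) or 1e-9  # avoid division by zero
            anomalous = abs(value - mu) / sigma > self.threshold
        self.values.append(value)
        return anomalous

# Steady latency-like readings, then a sudden spike
detector = StreamingAnomalyDetector()
readings = [100, 101, 99, 102, 100, 98, 101, 100, 99, 500]
flags = [detector.observe(r) for r in readings]
# Only the final spike is flagged; everything earlier looks normal.
```

The design choice mirrors the point made above: because nothing is learned offline, there is no training set to be wrong about, which is what makes this class of technique comparatively trustworthy.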
Unknown: The tools we use as a team have a direct influence on how we work together and the success we create. We built Range with that in mind. By balancing asynchronous check-ins and real-time collaboration, Range helps remote and hybrid dev teams build alignment and win back time on the calendar. Range connects dozens of apps, like Jira and GitHub, in one place, so everyone can share progress and updates on work, making standups more focused and engaging for everyone. Visit us at range.com/devops to learn more and try Range for free.
Jason Baum: Interesting. So as a follow-up to that: does the risk of getting it wrong play into the decision of whether the machine is just let go, like what you're saying, just unleash it, as opposed to a human being kind of on the other end, sort of helping it? Does risk play into that?
Richard Whitehead: Well, the good news is, in most IT operations environments, the relative level of risk is fairly low. But not in every case, obviously, and that's where a lot of the concern, I think, comes from. I have no idea who coined the phrase, but I like it: "To err is human; to really mess it up, you need a computer." And that's one of the challenges with automation: you can really make a problem worse by fully automating some kind of reaction to it. Risk is certainly an issue. When you look at some of the stories in the press about artificial intelligence, nobody ever really publishes the good stories; that just happens, that's life, we're all used to that, we take it for granted. It's the negative sides of AI that get a lot of the publicity. And, you know, there's a lot of concern about bias in learning models and some of those sorts of issues. That's really a big data problem, where you're dealing with large amounts of data from questionable sources that have been used to train models. From my standpoint, the way you mitigate that risk is you move away from third-party data and try to focus solely on your environment. So don't use external training data. And you can do that in an IT operations environment; it's much easier to do when you're not dealing with, say, medical data from the last 10 years that may or may not be tainted by some poor-quality data that was introduced that you have no control over. In an IT operations environment, you're dealing with infrastructure and technology that's in your control. So you can build models and do training from high-quality data that has good provenance: you know where it came from. So a lot of those concerns, like I say, are based on poor-quality models and poor-quality data from questionable sources. The good news is IT operations has less of a concern there, because we know where the data comes from.
Jason Baum: So with all of that, it sounds like there's a lot of management that has to go on behind the scenes. Who's doing that? Who's going to manage the solution? How has the tech team changed? How is the work being distributed? Where does AIOps play into this now? Do you need a data scientist? Do existing team members take on new roles? How are you structuring it?
Richard Whitehead: Right. So yes, we obviously have first-hand experience with that as a technology provider in that space. And the answer to "do we need a data scientist" is: if you're going to build a solution yourself, if you're going to roll your own as it were, then yes, you're going to need a data scientist. We have data scientists on board as part of our team. They're slightly outside the engineering team, and just like every other organization, they have different skills, they come from different backgrounds. They are scientists, and they are engineers; the programming languages tend to be more Python-focused and so forth. So, different people, absolutely. If, however, you're in IT operations, you probably shouldn't necessarily be looking at getting a data scientist on board, because there are technologies out there, commercial technologies, open source technologies, where that work has been done for you. And I think when people ask me, you know, am I going to have to retrain my staff, I chuckle and say no. The impact of AI on operations is minor; it's almost trivial compared to some of the seismic shifts we've already seen in the last five to 10 years as operations people, as we shift to this "everything is code" type of environment. We now have operators who look just like software engineers: they're conversant in one, two, maybe even three programming languages, and they're fully conversant with code repositories. And that shift is far bigger than anything the introduction of AI is ever going to change. So no, you're not going to have to become a data scientist just to operate this. The technology is going to be in a form that's easily consumable. It's going to look like software, it's going to act like software, and you'll treat it like software. You're not going to be building models yourself; the technology is going to be doing that for you. So no, I don't think you'll need a data scientist. But absolutely, you're going to need people who are conversant with software and infrastructure as code and that sort of thing.
Jason Baum: So where does AIOps, MLOps, where does that fit within DevOps culture?
Richard Whitehead: At the end of the day, it's just technology. It's a tool, okay? So it's neither a good fit nor a bad fit; it's just technology. Good AIOps technology will fit very well, because it just looks like software, it reacts like software, and you can configure it as code. The changes you make are going to be very easy to work with. The technology will offer both a strong UI and strong APIs, so it can fit into and be integrated into a DevOps toolchain. It's just part of it, part of the value stream. It shouldn't stick out as being a standalone industry or a standalone job title; you shouldn't have to hire an AIOps engineer. It's just technology.
Jason Baum: So what are you excited about with the future of AIOps? What's coming down the pipeline that should get us all excited?
Richard Whitehead: I think, you know, for me, as somebody who was involved in the very early stages, the first thing is the adoption of it. It's the fact that we've made that shift from "this is scary, I don't know if I can trust it" to "gosh, I can't imagine life without it; do you remember what it was like 10 years ago, when we had to do this stuff ourselves? How dull and boring was that?" AIOps also brings some stability, and there's a certain irony to that, because when we talk about things like the fuzzy logic of AI, people think of it as being kind of non-deterministic and scary. The reality is, it makes systems much more robust. The ability for a system to adapt means that when you get certain changes, the AI adapts along with them and becomes very flexible, and that means the total cost of ownership, the maintenance of an AI system, drops significantly, because it's adaptive. And that's, I think, really significant. That's another thing that just improves your daily life: knowing that when you plug something in, yes, you're going to have to maintain it, but it's not going to be a full-time job. It's not something you're going to have to touch and tweak every single day. And I think people forget that when they talk about automation. You hear that term "NoOps" floating around, as in, we just fully automate everything, and that's it, the humans can go on vacation and never touch it again. Well, life's not like that. One of the benefits, one of the goals even, of digital transformation is the ability for things to change at a blistering pace. You want things to be incredibly reactive and very dynamic. And you throw into that the natural entropy of any system, and change is absolutely guaranteed, and the rate of change is accelerating. So nothing's ever going to be installed and forgotten about. This isn't telecom; most of us are not dealing with a telecommunications environment, where you install a switch and then you love it and take care of it for 25 years. Everything changes dramatically. So having a system that's at least a little bit adaptive and doesn't require constant attention, you know, that's something that makes people very, very happy, and I think that's something I'm looking forward to people benefiting from. Also, just generally, looking at new opportunities. As I mentioned, as we start to deploy AIOps in production environments, it's the little things that are the game changers, the little benefits that are multiplied over, you know, hundreds of times a week, that make everybody go, yeah, okay, this is really cool, I'm glad we installed that, it made a big difference. Expanding that to some other intriguing use cases, finding new use cases, is something I'm really excited about.
Jason Baum: It sounds like you know it's working when you've kind of forgotten about it. Right? Yeah. That's the end goal. So look, we're coming up to the end. I could talk about this subject forever. I think it's fascinating; I love hearing you speak about it. Gosh, I can't believe we're here, right? This point where some of these mundane tasks are just no longer going to be a thing, or are already not a thing. So I do like to ask, and this isn't a gotcha question, though sometimes it is, today's is not, I like to ask a thinker: what's one question you wish I'd asked you? And how would you have answered it?
Richard Whitehead: Um, just from a personal point of view, as a tinkerer and an experimenter, you know, I wish we had more time to talk about natural language processing. I've been doing this for a very long time. Somebody asked me, "How long have you been writing regex, Richard?" And it's measured in decades; I think it might be three decades now. And for me, you know, I joke that I've only been writing regex for 30 years, so I'm a relative noob, I'm still learning. And then along comes natural language processing. By using NLP, you can do things in a couple of seconds that would take maybe, I don't know, 30 minutes to express as a regular expression. And, you know, there are certain things I enjoy doing from years ago: I still write code using vi, and I still spend a lot of time on the command line on Linux systems. But if I never have to write another regex again, I'd be a happy person. So the power of things like natural language processing just impresses me, and it also improves my daily life. So there you go; that's how I'd have answered that question.
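As a concrete illustration of the hand-crafted pattern-writing Richard is happy to leave behind, here is a regex that extracts fields from one narrow syslog-style log format. The pattern and sample line are hypothetical, made up for illustration; the contrast he draws is that an NLP approach, such as a pre-trained entity extractor, can often pull similar structure out of free-form text without any pattern being written at all.

```python
import re

# A hand-written pattern for one narrow syslog-style format -- the kind
# of brittle expression that can take half an hour to get right.
LOG_PATTERN = re.compile(
    r"(?P<timestamp>\w{3} +\d+ [\d:]+) "   # e.g. "Mar  4 12:01:33"
    r"(?P<host>\S+) "                      # e.g. "web01"
    r"(?P<process>[\w\-/]+)"               # e.g. "nginx"
    r"(?:\[(?P<pid>\d+)\])?: "             # optional "[212]"
    r"(?P<message>.*)"                     # free-text remainder
)

line = "Mar  4 12:01:33 web01 nginx[212]: upstream timed out"
fields = LOG_PATTERN.match(line).groupdict()
# fields now maps names to substrings, e.g. fields["host"] is "web01"
# and fields["message"] is "upstream timed out".
```

Note how tightly the pattern is coupled to one format: change the timestamp style or drop the pid brackets in a different way, and the expression has to be reworked, which is exactly the toil being described.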
Jason Baum: Great, awesome, I love it. You should have been interviewing yourself, and you would also have gotten through that line better than I just did. Well, I really appreciate your time, Richard, and you educating us on AIOps and MLOps, and, you know, how it fits into DevOps as a tool and just in general makes our lives easier, and isn't coming to cause doomsday. So I really appreciate you coming on.
Richard Whitehead: It's all good. It's not Skynet.
Jason Baum: Thank goodness. If anyone names their company Skynet, I think we'd have to question that. Or maybe it's just funny, I don't know. Well, thank you so much, Richard, I really appreciate your time. And thank you for listening to this episode of the Humans of DevOps podcast. I'm going to end this episode the way I always do, encouraging you to become a member of DevOps Institute to get access to even more great resources just like this one. Until next time, stay safe, stay healthy, and most of all, stay human. Live long and prosper.
Narrator: Thanks for listening to this episode of the Humans of DevOps podcast. Don't forget to join our global community to get access to even more great resources like this. Until next time, remember: you are part of something bigger than yourself. You belong.