Narrator: You're listening to the Humans of DevOps podcast, a podcast focused on advancing the humans of DevOps through skills, knowledge, ideas and learning, or the SKIL framework.
Richard Whitehead: Right, so the good news is people's jobs are most definitely not at risk. The problem we're trying to solve, I think, is increasing at a greater rate than we can solve it. So the expanding nature of the problem, I think, secures people's employment for a very, very long time. I think in every aspect of any form of digital transformation, when you look at any aspect of the business, it doesn't matter how much effort, how much code rewriting, how much automation we do. The opportunity, and I want to refer to this as an opportunity, not a problem, is increasing so fast that I don't think anybody's going to be out of a job anytime soon.
Jason Baum: Hey, everyone, welcome back. It's Jason Baum, Director of Member Experience at DevOps Institute, and this is the Humans of DevOps podcast. I hope you had a great week. We're glad you came back to join us. So let me take you back to the '80s: the machines are coming to get us. Scenes and lines from The Terminator and its many sequels are forever etched in our brains. It hauntingly depicts a world hell-bent on technological growth that led to the rise of advanced machine learning techniques and artificial intelligence, which would ultimately lead to the world's demise. Well, we're on our way there, folks. But instead of the doomsday scenarios we're all picturing, AIOps has emerged as an essential step forward for enterprises in a variety of industries. Gartner defines AIOps as artificial intelligence for IT operations, which combines big data and machine learning to automate IT operations processes. At its core, it's all about how IT teams and organizations can use AI to manage data in their environments. Through this approach, teams can employ large-scale datasets, machine learning and automation to make IT ops faster, simpler and more efficient. Many believe AI will not only impact organizations, but will become a major facet of our everyday lives through the emergence of new applications. I mean, we already see ML and AI every day with things like Face ID, smart replies, product recommendations, chatbots, you name it. Today, we're going to talk to Richard Whitehead, Chief Evangelist at Moogsoft, about what AIOps really means for humans, as opposed to being stuck in the '80s version of this story. Sorry about that. Richard, are you ready to get human?
Richard Whitehead: Absolutely.
Jason Baum: First of all, thanks so much for coming on. I appreciate having you here on the podcast. And my apologies for taking it back to the '80s, but every time I talk about AI, my mind immediately goes to "the machines are coming for us."
Richard Whitehead: Certainly, when I started working in this space, I think every single PowerPoint presentation I saw, both internally and externally, had at least some reference to Skynet and a Terminator. So yes, it's not that far back.
Jason Baum: Yeah, no, it's not that far back. And I think people are still scared by it. I have a funny story that I've perhaps told in the past, I'm trying to remember, maybe not, about my mother. She actually keeps her Amazon device, whose name cannot be said because it's actually in the room with me and will start speaking, but we know what we're talking about, put away. She actually keeps it in a cabinet, and when she wants to use it, takes it out, plugs it in, and then says the magic words to turn it on, because she's afraid of it listening to her.
Richard Whitehead: That might be a legitimate fear. It is always listening; it has to be. But yeah, I think, you know, I'm a pragmatist. I don't fear artificial intelligence, but I do occasionally have a sense of disappointment. You know, when my door camera notifies me that there's a person outside, and it's clearly a dog. And also some shopping recommendations I get. You know, if I were to buy an item such as a dishwasher, I don't feel the need to have somebody suggest that I buy a dishwasher for the next six months. So I think for people, you know, it's less fear, it's more disappointment in what AI can bring to you, based on some of the more obvious commercial variants that are out there.
Jason Baum: Yeah, so let's talk about what AI is. I feel like that's a good place to start. You heard the Gartner definition; I wanted to include that because I always feel like having an academic definition of sorts is important to hear. I've also heard someone very succinctly put it that artificial intelligence is simply the problem, and then once it's solved, it's no longer AI, which I think is a fascinating way to look at it. I'm curious how you would define it.
Richard Whitehead: So, my definition's a little broader. To go back to the specific definition, the Gartner definition, which has actually evolved into that form because when you put the letters AI together, people automatically assume it means artificial intelligence: AIOps is the application of artificial intelligence techniques to IT operations. So that's really it. And of course, that makes it a very broad definition, which means there are a lot of technologies and techniques and solutions out there that all fit under this umbrella. So when people talk about artificial intelligence, there's a general sense that what you're talking about is technology, or computers, that are in some way attempting to replicate humans. You know, that's where the inevitable screenshot of the Terminator robot appears, and so forth. And I think that's generally true. So, you know, you think of chess-playing computers and things like that, and that sense has been largely reinforced by some of the early adoptions of AI in things like service desks. So when you call into a service desk, your first interaction is likely to be with some form of AI capability that attempts to give you an answer very rapidly, without any form of real human interaction. And in most cases, they're attempting to sort of pass the Turing test. In other words, you're talking to a computer, and it's trying to make that interaction as human-like as possible. So while that's true for things like service desks, when you dig deeper into some technology and start talking about, you know, the concepts of monitoring, observability, remediation, and things like that, it becomes less about attempting to replicate a human and more about attempting to replicate what a human would try to do if they were involved. So it's not a human interaction, it's an application of human-based intelligence, but in an automated fashion. So with that broad definition, you're incorporating not just that chess-playing type of thing, which is actually the least of the components, but you're talking about things like machine learning, about sophisticated algorithms that maybe do linear regression, and about some techniques on the periphery of artificial intelligence, such as natural language processing. I've already mentioned two that I'm personally relatively familiar with, which are machine learning and natural language processing. And these are things that you don't necessarily think of when you're talking about AI, but they're absolutely relevant and very pertinent to solving specific problems.
Jason Baum: And that's a good lead into this next question: what are some examples of problems that we're looking to solve with AIOps and machine learning specifically? And also, when you hear, okay, we're looking to solve problems and automate and speed up some processes, I think one of the misconceptions then is, well, now I'm going to lose my job, AI is going to replace me. So perhaps you could address both in your answer: what's the problem? And then, as we solve it with AIOps, are people's jobs at risk?
Richard Whitehead: Right, so the good news is people's jobs are most definitely not at risk. The problem we're trying to solve, I think, is increasing at a greater rate than we can solve it. So the expanding nature of the problem, I think, secures people's employment for a very, very long time. I think in every aspect of any form of digital transformation, when you look at any aspect of the business, it doesn't matter how much effort, how much code rewriting, how much automation we do, the opportunity, and I want to refer to it as an opportunity, not a problem, is increasing so fast that I don't think anybody's going to be out of a job anytime soon. In fact, I think if you look at some of the newer roles that are emerging as a result of digital transformation, such as site reliability engineers, developers in general, and operations folks emerging in sort of a DevOps-type capacity, that's an expanding market opportunity, not a shrinking one. So individual teams might be smaller as a result of this technology, but the market opportunity in general is such that I think it's going to be a long time before the demand for people in these roles cools off. So yeah, no problem there; we're dealing with an exploding market opportunity. So basically, it sort of comes down to, I think I mentioned it earlier, the notion of automation. When we talk about AI replicating human activity, I tend to think of it in this sense: when you're looking at something a human would do in their day-to-day line of work, solving a problem, addressing an incident, debugging something, the question we always ask ourselves is, what are the most common and repetitive tasks that a human performs? Those are the things that I think are easy targets for AI to replicate, because they tend to be the mundane tasks; you know, we tend to refer to them as toil. The stuff that you do every single time that doesn't necessarily add value, but is just a task that has to be performed in order to move on to the next job of actually resolving the issue or finding the error. And that's something that I think is sort of overlooked. People tend to think of AI as being an end goal: we're going to completely replace a human, and you throw data at it and get a solution at the other end. I tend to think of AI, and we sometimes refer to it at Moogsoft as applied AI, as basically a very small tool you can use to achieve a very specific task. It could be something as simple as doing a bit of triage, augmenting some information, so that's one less task you have to do. That's one less system you have to log into to get some additional information. If that information can be gathered for you and presented to you, that's one less mundane task you have to perform in order to get to the really important stuff, which is using your human brain to resolve the issue. So yeah, applied AI is a good way of looking at it.
Jason Baum: That's what it's all about, right? Getting rid of the mundane. I think that's the goal, right?
Richard Whitehead: It's certainly a DevOps ideal, right? Yeah, improving the daily work.
Jason Baum: Yeah, efficiency. So, once it's set up right, we've got it, the mundane is gone, it's working, everything is being automated, is it simply plug and play? And now we just let it go? You know, can we trust the machines to continue in perpetuity, I guess forever? The machine is learning, and everything is all set?
Richard Whitehead: Well, there are a couple of angles to that. The first one is, can you just unleash the power of AI and have it do its job? And the second aspect is, you know, is it a one-time deal? Do you just set it up once and let it run? So, to address the "can you unleash it": maybe one of the challenges I've had dealing with very conservative-minded IT operations folks, when trying to bring something as fuzzy as AI into a previously incredibly deterministic world, where everything is well understood and every action has a very well-understood reaction, is: how trustworthy is it? Is it going to get the same results each time? And the answer is, well, not always, because if the input data is different, then it might respond differently. So from a trustworthiness standpoint, you have to take a step back and think, well, there are many different types of AI technology. Even within something like machine learning, there's the concept of supervised and unsupervised machine learning. And so if you just want to throw some data at the system and have it do its thing, you're probably describing unsupervised machine learning. There are certain techniques, or certain areas, where that's very applicable and particularly trustworthy: areas where there's no real learning that needs to be done. I think a lot of the concern that people have over machine learning and AI is where training has to occur, and how accurate the training is. But there are certain techniques that just work. You don't need to build a model; you just react to the data that's coming in. An example would be: you have a flow of data coming into a system, and you're looking at that data in real time, trying to identify patterns. You're not necessarily comparing it to a historical model; you're just looking at the data as it is, in real time, trying to determine patterns. That's a good example of an unsupervised approach, because there's no training model: you're looking at the data in real time and coming up with an answer. So that's a good example of something where you can just turn it on and let it do its magic. There are other areas where training becomes more of an important component. And I think, from our standpoint, when applying those techniques to an operations-type environment, that's where the human becomes important, because in the supervised model, at that point, the training is done by a human. So the system would say to you, this is something I determined from the input data, what do you think? And the human has the opportunity to train it. In practical terms, that might be the ability to tag data, or press a button to give it a thumbs up or a thumbs down. And that sort of human-guided supervised learning, again, becomes trustworthy because the human has provided the input. It's not something the system has determined on its own; you're actually giving it some sort of positive affirmation. So if the model is good, it's because a human has trained it to be good, based on their current knowledge.
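The unsupervised, real-time pattern detection Richard describes can be sketched as a rolling-window anomaly detector: no pre-trained model, just a reaction to the data as it streams in. This is an illustrative sketch only, not Moogsoft's actual technique; the class name, window size, and threshold are all assumptions.

```python
from collections import deque
from statistics import mean, stdev

class StreamingAnomalyDetector:
    """Flags values that deviate sharply from the recent window.

    There is no historical model: the detector reacts only to the
    data it has already seen, as in unsupervised, real-time analysis.
    """

    def __init__(self, window=30, threshold=3.0):
        self.values = deque(maxlen=window)  # rolling window of recent data
        self.threshold = threshold          # how many std devs count as anomalous

    def observe(self, value):
        """Return True if `value` is anomalous relative to the window."""
        anomalous = False
        if len(self.values) >= 5:  # wait for a few points before judging
            mu = mean(self.values)
            sigma = stdev(self.values) or 1e-9  # avoid division by zero
            anomalous = abs(value - mu) / sigma > self.threshold
        self.values.append(value)
        return anomalous

# Steady latency-like readings, then a sudden spike
detector = StreamingAnomalyDetector()
readings = [100, 101, 99, 102, 100, 98, 101, 100, 99, 500]
flags = [detector.observe(r) for r in readings]
# Only the final spike is flagged; everything earlier looks normal.
```

The design choice mirrors the point made above: because nothing is learned offline, there is no training set to be wrong about, which is what makes this class of technique comparatively trustworthy.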
Unknown: The tools we use as a team have a direct influence on how we work together and the success we create. We built Range with that in mind. By balancing asynchronous check-ins and real-time collaboration, Range helps remote and hybrid dev teams build alignment and win back time on the calendar. Range connects dozens of apps, like Jira and GitHub, in one place, so everyone can share progress and updates on work, making standups more focused and engaging for everyone. Visit us at range.com/devops to learn more and try Range for free.
Jason Baum: Interesting. So as a follow-up to that: does the risk of getting it wrong play into the decision of whether the machine is just let go, like what you're saying, just unleash it, as opposed to a human being kind of on the other end, sort of helping it? Does risk play into that?
Richard Whitehead: Well, the good news is, in most IT operations environments, the relative level of risk is fairly low. But not in every case, obviously, and that's where a lot of the concern, I think, comes from. I have no idea who coined the phrase, but I like it: "To err is human; to really mess it up, you need a computer." And that's one of the challenges with automation: you can really make a problem worse by fully automating some kind of reaction to it. Risk is certainly an issue. When you look at some of the stories in the press about artificial intelligence, nobody ever really publishes the good stories; that just happens, that's life, we're all used to that, we take it for granted. It's the negative sides of AI that get a lot of the publicity. And, you know, there's a lot of concern about bias in learning models and some of those sorts of issues. That's really a big data problem, where you're dealing with large amounts of data from questionable sources that have been used to train models. From my standpoint, the way you mitigate that risk is you move away from third-party data and try to focus solely on your environment. So don't use external training data. And you can do that in an IT operations environment; it's much easier to do when you're not dealing with, say, medical data from the last 10 years that may or may not be tainted by some poor-quality data that was introduced that you have no control over. In an IT operations environment, you're dealing with infrastructure and technology that's in your control. So you can build models and do training from high-quality data that has good provenance: you know where it came from. So a lot of those concerns, like I say, are based on poor-quality models and poor-quality data from questionable sources. The good news is IT operations has less of a concern there, because we know where the data comes from.
Jason Baum: So with all of that, it sounds like there's a lot of management that has to go on behind the scenes. Who's doing that? Who's going to manage the solution? How has the tech team changed? How is the work being distributed? Where does AIOps play into this now? Do you need a data scientist? Do existing team members take on new roles? How are you structuring it?
Richard Whitehead: Right. So yes, we obviously have first-hand experience with that as a technology provider in that space. And the answer to "do we need a data scientist" is: if you're going to build a solution yourself, if you're going to roll your own as it were, then yes, you're going to need a data scientist. We have data scientists on board as part of our team. They're slightly outside the engineering team, and just like every other organization, they have different skills, they come from different backgrounds. They are scientists, and they are engineers; the programming languages tend to be more Python-focused and so forth. So, different people, absolutely. If, however, you're in IT operations, you probably shouldn't necessarily be looking at getting a data scientist on board, because there are technologies out there, commercial technologies, open source technologies, where that work has been done for you. And I think when people ask me, you know, am I going to have to retrain my staff, I chuckle and say no. The impact of AI on operations is minor; it's almost trivial compared to some of the seismic shifts we've already seen in the last five to 10 years as operations people, as we shift to this "everything is code" type of environment. We now have operators who look just like software engineers: they're conversant in one, two, maybe even three programming languages, and they're fully conversant with code repositories. And that shift is far bigger than anything the introduction of AI is ever going to change. So no, you're not going to have to become a data scientist just to operate this. The technology is going to be in a form that's easily consumable. It's going to look like software, it's going to act like software, and you'll treat it like software. You're not going to be building models yourself; the technology is going to be doing that for you. So no, I don't think you'll need a data scientist. But absolutely, you're going to need people who are conversant with software and infrastructure as code and that sort of thing.
Jason Baum: So where does AIOps, MLOps, where does that fit within DevOps culture?
Richard Whitehead: At the end of the day, it's just technology. It's a tool, okay? So it's neither a good fit nor a bad fit; it's just technology. Good AIOps technology will fit very well, because it just looks like software, it reacts like software, and you can configure it as code. The changes you make are going to be very easy to work with. The technology will offer both a strong UI and strong APIs, so it can fit into and be integrated into a DevOps toolchain. It's just part of it, part of the value stream. It shouldn't stick out as being a standalone industry or a standalone job title; you shouldn't have to hire an AIOps engineer. It's just technology.
Jason Baum: So what are you excited about with the future of AIOps? What's coming down the pipeline that should get us all excited?
Richard Whitehead: I think, you know, for me, as somebody who was involved in the very early stages, the first thing is the adoption of it. It's the fact that we've made that shift from "this is scary, I don't know if I can trust it" to "gosh, I can't imagine life without it; do you remember what it was like 10 years ago, when we had to do this stuff ourselves? How dull and boring was that?" AIOps also brings some stability, and there's a certain irony to that, because when we talk about things like the fuzzy logic of AI, people think of it as being kind of non-deterministic and scary. The reality is, it makes systems much more robust. The ability for a system to adapt means that when you get certain changes, the AI adapts along with them and becomes very flexible, and that means the total cost of ownership, the maintenance of an AI system, drops significantly, because it's adaptive. And that's, I think, really significant. That's another thing that just improves your daily life: knowing that when you plug something in, yes, you're going to have to maintain it, but it's not going to be a full-time job. It's not something you're going to have to touch and tweak every single day. And I think people forget that when they talk about automation. You hear that term "NoOps" floating around, as in, we just fully automate everything, and that's it, the humans can go on vacation and never touch it again. Well, life's not like that. One of the benefits, one of the goals even, of digital transformation is the ability for things to change at a blistering pace. You want things to be incredibly reactive and very dynamic. And you throw into that the natural entropy of any system, and change is absolutely guaranteed, and the rate of change is accelerating. So nothing's ever going to be installed and forgotten about. This isn't telecom; most of us are not dealing with a telecommunications environment, where you install a switch and then you love it and take care of it for 25 years. Everything changes dramatically. So having a system that's at least a little bit adaptive and doesn't require constant attention, you know, that's something that makes people very, very happy, and I think that's something I'm looking forward to people benefiting from. Also, just generally, looking at new opportunities. As I mentioned, as we start to deploy AIOps in production environments, it's the little things that are the game changers, the little benefits that are multiplied over, you know, hundreds of times a week, that make everybody go, yeah, okay, this is really cool, I'm glad we installed that, it made a big difference. Expanding that to some other intriguing use cases, finding new use cases, is something I'm really excited about.
Jason Baum: It sounds like you know it's working when you've kind of forgotten about it. Right? Yeah. That's the end goal. So look, we're coming up to the end. I could talk about this subject forever. I think it's fascinating; I love hearing you speak about it. Gosh, I can't believe we're here, right? This point where some of these mundane tasks are just no longer going to be a thing, or are already not a thing. So I do like to ask, and this isn't a gotcha question, though sometimes it is, today's is not, I like to ask a thinker: what's one question you wish I'd asked you? And how would you have answered it?
Richard Whitehead: Um, just from a personal point of view, as a tinkerer and an experimenter, you know, I wish we had more time to talk about natural language processing. I've been doing this for a very long time. Somebody asked me, "How long have you been writing regex, Richard?" And it's measured in decades; I think it might be three decades now. And for me, you know, I joke that I've only been writing regex for 30 years, so I'm a relative noob, I'm still learning. And then along comes natural language processing. By using NLP, you can do things in a couple of seconds that would take maybe, I don't know, 30 minutes to express as a regular expression. And, you know, there are certain things I enjoy doing from years ago: I still write code using vi, and I still spend a lot of time on the command line on Linux systems. But if I never have to write another regex again, I'd be a happy person. So the power of things like natural language processing just impresses me, and it also improves my daily life. So there you go; that's how I'd have answered that question.
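As a concrete illustration of the hand-crafted pattern-writing Richard is happy to leave behind, here is a regex that extracts fields from one narrow syslog-style log format. The pattern and sample line are hypothetical, made up for illustration; the contrast he draws is that an NLP approach, such as a pre-trained entity extractor, can often pull similar structure out of free-form text without any pattern being written at all.

```python
import re

# A hand-written pattern for one narrow syslog-style format -- the kind
# of brittle expression that can take half an hour to get right.
LOG_PATTERN = re.compile(
    r"(?P<timestamp>\w{3} +\d+ [\d:]+) "   # e.g. "Mar  4 12:01:33"
    r"(?P<host>\S+) "                      # e.g. "web01"
    r"(?P<process>[\w\-/]+)"               # e.g. "nginx"
    r"(?:\[(?P<pid>\d+)\])?: "             # optional "[212]"
    r"(?P<message>.*)"                     # free-text remainder
)

line = "Mar  4 12:01:33 web01 nginx[212]: upstream timed out"
fields = LOG_PATTERN.match(line).groupdict()
# fields now maps names to substrings, e.g. fields["host"] is "web01"
# and fields["message"] is "upstream timed out".
```

Note how tightly the pattern is coupled to one format: change the timestamp style or drop the pid brackets in a different way, and the expression has to be reworked, which is exactly the toil being described.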
Jason Baum: Great, awesome, I love it. You should have been interviewing yourself, and you would also have gotten through that line better than I just did. Well, I really appreciate your time, Richard, and you educating us on AIOps and MLOps, and, you know, how it fits into DevOps as a tool and just in general makes our lives easier, and isn't coming to cause doomsday. So I really appreciate you coming on.
Richard Whitehead: It's all good. It's not Skynet.
Jason Baum: Thank goodness. If anyone names their company Skynet, I think we'd have to question that. Or maybe it's just funny, I don't know. Well, thank you so much, Richard, I really appreciate your time. And thank you for listening to this episode of the Humans of DevOps podcast. I'm going to end this episode the way I always do, encouraging you to become a member of DevOps Institute to get access to even more great resources just like this one. Until next time, stay safe, stay healthy, and most of all, stay human. Live long and prosper.
Narrator: Thanks for listening to this episode of the Humans of DevOps podcast. Don't forget to join our global community to get access to even more great resources like this. Until next time, remember: you are part of something bigger than yourself. You belong.