The Checklist Your Deck Is Missing

Josh Tyson 00:00

All right, Jeff, well, maybe we can start here. You have a lot of experience building out canonical knowledge and a source of truth for large companies. And as we've talked about in the past on this podcast together, it feels like that is the critical first step, right? If you're going to have agents do things, you need to teach them. You need to give them knowledge to work from. And then as we were preparing for this call, we had this realization that the same is true for a lot of executives right now: they want to pick up tools and buy stuff, but what they really probably need first is knowledge. So it all kind of comes back to knowledge in a way.

Jeff McMillan 00:42

Yeah, and first of all, guys, thanks for having me back. What I find interesting is that my first project on Wall Street was to build out a customer information database. I was in my early 30s and knew nothing about data. And that problem was critical over 25 years ago and is maybe even more critical now. So some things never change. But to your point, everyone that I talk to wants to talk about AI applications and models. And let's be clear, that is the visible layer, and obviously that's where the hype is. But to your point, Josh, nothing happens in the world of AI without good information. And not just good information, but information that is accessible, of high quality, and structured in a way that AI can truly benefit from those assets. When you talk about AI architecture, it all starts on a foundational layer of high-quality, accessible information. So that's layer one. Number two is what I'm going to describe as the semantic layer, or the model, or whatever you want to call it. That's where you apply knowledge graphs and RAG infrastructures. Then you have what I call your control layer, which is where you apply the business rules on what you want these systems to do and not do. And then you have the models. And then you have an orchestration layer, and then you have applications. And like I said, everyone just kind of wants to jump to the top layer. And as we know, the heavy lifting and the non-sexy stuff is in those bottom two layers. And I think, maybe for good now, people are starting to realize that, starting to get their act together, and focusing appropriately on those issues.

Mm-hmm.

Robb Wilson 02:42

But companies don't like knowledge management. It's boring. They have to fund something that they don't understand. It's hard to do an ROI analysis on it. Is there a way around it?

Well, I think yes, but I think there's a tipping point, right? If you're building five or 10 or 15, sure, you can brute force it, you can fake it. But what happens when you've got 150 agents or 1,500 agents or 15,000 agents? At some point, you're not going to get around this. And what's so fascinating is, I've been out and about now talking to people and helping them. The number one problem is education and awareness of senior executives. And the second problem is the lack of a consistent data platform. And people are like, well, how do I fix that? I'm like, not in five weeks. Right. And, you know, the other thing that's really funny, guys, is that when I talk to technologists and I talk to business people, there's this sort of battle going on over who's going to own the app layer.

Right.

Like that's where everything's happening. And I'm like, you're fighting over crumbs.

Yeah. Like that's where everything is happening, right? Yeah, it's like, is gardening about mowing the lawn, or about understanding how to grow plants and botany, right? You're like, screw the botany. I know how to mow a lawn. I know how to push a lawn mower. I'm a gardener.

That's exactly right.

That's right. You and I are fighting about who gets to push the lawnmower. But the point is that true transformation, in a world in which you have 15,000 bots (and I think that's what you have to prepare for), and in which those things are going to go off the rails and do stupid things and you have to monitor them, is not going to happen with a 90% accessible dataset.

Mm-hmm. Right.

It needs 100% accessibility. And by the way, you need to be in the 99-point-something levels of quality, and very few firms are anywhere close to that.

Yeah. If you think about humans, right, say you have 10,000 employees in your company. It's the same problem. Look at all these people: what are they doing? How are they making decisions and choices? How are they executing? There is a knowledge apparatus, but it's so institutionalized and tribal that it's not something upper management really has to spend a lot of time thinking about. At NASA, you have to spend a lot of time thinking about your knowledge management system, because the difference between a crash and no crash could be just the communication of a measurement from metric to imperial. But in this scenario, so much of a company's knowledge is just passed on from person to person. And it's something that the top of the org hasn't really had to think about, right? They just take it for granted that people will train the next guy and share that information. But in this particular world, it's different, because now you're talking about a system that will just do what you ask it to do, irrespective of whether it's a smart idea or not.

100%. And to your point, if you don't explicitly train your agent to do something, it has the capacity to do very stupid things. As humans, we kind of say, oh, that looks dumb. Now, how do we know that it's dumb? Because we've been doing this for 20 years and we've got institutional knowledge and we have judgment. And by the way, I'm not saying you can't teach AI to do that as well.

Yeah Mm-hmm.

But if you don't, and to be clear, you have to be explicit. And here's the other thing that people just don't do: if you want something to operate at extraordinarily high degrees of quality, one, you have to be incredibly explicit, but you might also have to evaluate and iterate on those prompts 150 times before it gets the point.

But you have to be explicit, like you said.

This piece about evaluation is such a missing component for a lot of organizations. It's like, oh, I can vibe code now. Well, sure I can. I mean, I could also theoretically drive a Porsche 911 at 200 miles an hour on the racetrack, but it probably wouldn't be a very good thing. And I think there's the illusion of skill. By the way, use vibe coding, it's very powerful, but.

⁓ yeah. Mm-hmm.

If you want to do that, it comes with a responsibility of actually testing this thing. And because of the non-deterministic nature of this technology, it will do weird things on you. You have to run it through enough golden source testing, cosine similarities, user testing. And by the way, most organizations do not have their experts waiting around

Yeah.

for Jeff McMillan to come along and say, I need you to spend four hours this week testing. And they're like, what do you mean, testing? I've got clients to serve. So there's this organizational problem where the business is like, well, I want these tools. I can do this thing. And one, they don't really know how to do it. They can learn, but they don't know how to do it. And two, they don't have the capacity.
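One of the checks named above, cosine similarity, can be sketched in a few lines. This is only an illustration under assumptions: the three-number vectors stand in for embeddings that would normally come from an embedding model, and the 0.95 threshold is a hypothetical value, not a figure from the conversation.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings of a golden answer and two outputs.
golden = [0.9, 0.1, 0.3]
close = [0.8, 0.2, 0.3]
off_topic = [0.1, 0.9, 0.0]

THRESHOLD = 0.95  # hypothetical pass mark; tune it against human judgments

for name, vec in [("close", close), ("off_topic", off_topic)]:
    score = cosine_similarity(golden, vec)
    verdict = "pass" if score >= THRESHOLD else "needs human review"
    print(f"{name}: {score:.3f} -> {verdict}")
```

A similarity score only says two answers point the same way; it does not replace the expert and user testing described here.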

Right. Yeah, Josh and I were talking the other day about this, and one of the things that came up in our discussion was, if Josh and I are going to handle a book together, we'll be like, okay, you grab chapter five, I'll grab chapter six, right? Now, what I don't have to worry about is coming back two weeks later and being like, hey, so sorry, we haven't had the chance to check in, where are you at? And he's like, I've written 2,000 pages. Right? That just will not happen. But with an LLM, it might be at 100,000 pages because, shoot, I forgot. It's not going to stop. And what I'm trying to say is that in a lot of ways, it's helpful to anthropomorphize this, to understand how to think about agents. We need to understand people because, in some ways, if I ask you to invent a different color from the colors you know, that's pretty much impossible for you to do unless you've seen it. Josh's joke is that as a kid, he kept asking, what is the color clear? And I think it's kind of like that.

And the idea behind this is to say, it's great to think of these things as intelligent, and we do compare them to our intelligence because we have nothing else to compare to. But at some point we forget that it's not, and that it does things we don't do. It will write and write and write forever until you stop it. In a lot of ways, people have this optimal stopping built in. You know, they're lazy and they conserve calories and they have other things to do with their time in life. And so you don't have to worry that, after asking someone to start writing, you're going to have to come in and tell them when to stop.

Well, three points on that. Number one, it goes back to the data layer, because if you're going to write something, you want to make sure it's based on good information and knowledge, right? And that you're not creating your entire thesis off of Reddit, right? So that's point number one. Point number two, you have to build embedded controls so that these systems will behave

Mm-hmm.

And by the way, not perfectly, nor do we, right? I mean, we all tell employees what they should or shouldn't do and they don't always do it. Machines sometimes make the same mistakes. And by the way, there was a thing the other day where I was talking about this concept of, how do you embed your ethics into AI? Well, you can. You can prompt your standards,

I mean, everybody does it, and Anthropic has their constitution in there, absolutely.

Right? And by the way, organizations need to think about what the ethics of their AI are. I mean, that's right. Robb and Josh and Jeff, what are our values, and how do we want to imbue those values? And what's great about AI is you can do that. And then lastly, and I talked about this a little bit in the beginning, there's the governance layer, but then there's the monitoring layer. Right? So you want to have somebody that says, I don't want people doing stuff.

Yeah, their own constitution. I agree with that. Hmm

And then you may want to have a completely different model that sits outside, that's looking across that stack. And it's almost independently asking, does something smell right? Right? So it's almost like, Robb, you're going to do the work, and then Josh, you're going to look at Robb's work. Then Josh, you're going to do the work, and Robb, you're going to look at Josh's. It's this idea that you create independence. And by the way, you might even have two different models, or three, or even four

Right.

Mm-hmm.

that are asking the same question. And by the way, you may get different answers, but I think people need to acknowledge, to your point, Robb, that these things are imperfect. And by the way, they're imperfect in different ways than we are imperfect, right?
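The idea of several independent models asking the same question can be sketched roughly as follows. This is a minimal illustration, not anyone's production design: the reviewers are stand-in functions rather than real model calls, and `cross_check` and `min_agreement` are hypothetical names.

```python
from collections import Counter
from typing import Callable, Dict


def cross_check(
    question: str,
    reviewers: Dict[str, Callable[[str], str]],
    min_agreement: float = 1.0,
) -> dict:
    """Ask every reviewer the same question; flag disagreement for a human."""
    if not reviewers:
        raise ValueError("at least one reviewer is required")
    # Normalize answers so trivial formatting differences do not count.
    answers = {name: ask(question).strip().lower() for name, ask in reviewers.items()}
    top_answer, top_votes = Counter(answers.values()).most_common(1)[0]
    agreement = top_votes / len(answers)
    return {
        "answers": answers,
        "majority": top_answer,
        "agreement": agreement,
        "escalate": agreement < min_agreement,
    }


# Stand-in reviewers; in practice each would wrap a different model.
reviewers = {
    "model_a": lambda q: "approve",
    "model_b": lambda q: "Approve",
    "model_c": lambda q: "reject",
}

result = cross_check("Does this output look right?", reviewers)
print(result["majority"], round(result["agreement"], 2), result["escalate"])
```

With the default `min_agreement` of 1.0, any disagreement escalates, which matches the point that different answers are expected and should be surfaced rather than hidden.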

Yeah. Exactly. That's the key, right? It's a different kind of intelligence. It's great up to a point to think of them in an anthropomorphic way, because it helps us understand them. But then you go too far and you start to realize, wait, they're different. They don't have self-preservation. They want to burn

Mm-hmm.

as much electricity as they possibly can. They have no sense of stopping. With humans, managers are always talking about how to get them going, how to get them productive. With these things, it's like, how do you stop them? When do you stop them? And that comes to the point I really wanted to dive into with you. I talk to these CTOs that are like, my God, I'm so busy. I'm finally able to catch up on my backlog of all the features of all the software that I've been behind on. Software, quite honestly, that no one's ever going to use. But now they're cranking through their backlog, right? They're like, agents, I don't have time for agents. I'm getting through my backlog. Agents are building the code for all the things I was supposed to build. And it occurred to me at that moment, especially after talking to Joshua Gans about the microeconomics of AI, that there's really bad, then there's bad, and then there's good, right? And really bad isn't not using AI. Really bad is burning a shit ton of tokens on things that never see the light of day or produce a dime of revenue. Better that you don't use any AI than that everyone grinds on tokens within your company and none of those tokens actually see the light of day when it comes to revenue production. And think how easy that is to do, because these systems are so dopaminergic. They're so addictive that you've got these developers who are cranking and producing, but they're producing stuff that's not production ready. So what happened? The company's labor bill, essentially, with tokens, just increased. Productivity increased, but revenue didn't move, which is scary, right? That's probably what happens when you don't know what you're doing, but you're trying to lean into AI because you feel like you're falling behind. And

So here's what I would say to that. I think it is very hard for organizations who have never experimented with AI to sit down at a board meeting and say, here's our priorities. Very, very difficult. And they don't know. So I think there is something to be said for burning some tokens in

Mhm.

in a controlled environment where people have access to an appropriate set of tools with a set of training. And I think most organizations have to go through a little bit of that, right, for them to really understand the art of the possible. So I think there is an element of what I'm going to call wasted tokens, in the sense that there is some learning, right? So I would not say that that is fundamentally bad, but I would agree with your premise, right? Maybe they get through that after six or nine months. What they're not doing to a large extent, and I don't want to say all, because there are some great organizations out there (I like to think I worked for one of them), is basically saying, what are my strategic priorities for my business? And if I were to employ five to 10 things at a strategic level, what would they be? And how would AI enable those things? That would be the first question I would ask. The second question that we need to be asking is, if my business is destroyed in 10 years by AI,

who destroyed it and how did they destroy it? And therefore, what moats do I need to establish to prevent that from happening? How does AI play a role in that? And then the final question, which nobody's asking: AI actually reduces the friction of new products, new services, new markets, new clients. And given my unique...

Mm-hmm.

my company's unique strategic position, maybe it's its infrastructure, maybe it's scale, maybe it's their knowledge base, right? Whatever it is, is there an opportunity to expand what I do today in ways that I've never even thought about? So, you know, who knows, in 10 years wealth management firms are going to be selling legal and accounting services, and legal and accounting services firms are going to sell wealth management services, right? Because they're going to be able to create new products. And I think the missing piece is that conversation. And I'm not even here to say what the result is, but my strong advice, if you're a CEO or senior leader, is that you need to be having those conversations and let them drive the technology. Because to your point, and I'm guilty of it too, I'm sure you guys are, you know, it's two, three in the morning and you're still playing around with vibe coding.

Yeah. Yes. Yeah, and you're adding new features because it's more fun, instead of just finishing the ones you already have in there. Because that's painful. And it has no problem offering you the next feature that you should work on.

Well, I mean, it's crazy. I was doing something the other day. I bought a farm and I built a tool; I'm trying to basically use AI to manage the whole thing, right, start to finish. And I was four hours in and I was pretty proud of what I had gotten to. And I said, what ideas do you have for me? And it gave me like 37 more ideas. And I said, oh, you're going to Mumento Hall. Just do them.

Mm-hmm. Yes, exactly.

But then guess what? Guess what? It all started breaking on me. And then I was like,

Yes.

then I spent another 14 hours of my life debugging and telling the thing, no, I don't want it up here. I want it down there. I mean, I think that's exactly right. And I think we're all guilty.

It's addictive. I don't think people understand it's dopamine. That chase is built in, you know, whether you're chasing new speeds on TikTok or whether you're grinding on new features that you think are going to be the next big thing. Getting them out the door is painful, and it's easier to just jump on the next thing. But I was also thinking about this from an accounting lens. We always look at labor costs. Let's look at tokens. If in your accounting you have one line item, which is tokens, you're in trouble, right? You need to understand what those tokens were spent on specifically. And those have to roll up to this knowledge that we were talking about. If you have a knowledge system and that knowledge system is done well, then it understands the priorities of the business and understands what is being done. And then if you have awareness of what everybody's burning tokens on, and it's being bumped up against that knowledge, you have a stronger sense that you're not going to have 10 versions of the same piece of software being produced by 10 different people.
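A rough sketch of that accounting idea: instead of one "tokens" line item, roll usage up to business priorities and surface duplicated effort. Everything here is hypothetical for illustration, including the record fields, the goal names, the priority labels, and the token counts.

```python
from collections import defaultdict
from typing import Dict, List


def tokens_by_priority(usage: List[dict], goal_priority: Dict[str, str]) -> Dict[str, int]:
    """Roll token spend up to the business priority each goal serves."""
    totals: Dict[str, int] = defaultdict(int)
    for record in usage:
        # Spend that maps to no stated priority is made visible, not hidden.
        priority = goal_priority.get(record["goal"], "unmapped")
        totals[priority] += record["tokens"]
    return dict(totals)


def duplicated_goals(usage: List[dict]) -> Dict[str, List[str]]:
    """Goals that more than one person is spending tokens on."""
    people = defaultdict(set)
    for record in usage:
        people[record["goal"]].add(record["person"])
    return {goal: sorted(names) for goal, names in people.items() if len(names) > 1}


# Hypothetical usage log and priority map.
usage = [
    {"person": "ana", "goal": "client-onboarding-bot", "tokens": 120_000},
    {"person": "ben", "goal": "client-onboarding-bot", "tokens": 95_000},
    {"person": "cy", "goal": "fee-report-generator", "tokens": 40_000},
    {"person": "dee", "goal": "weekend-experiment", "tokens": 300_000},
]
goal_priority = {
    "client-onboarding-bot": "grow new accounts",
    "fee-report-generator": "reduce operations cost",
}

print(tokens_by_priority(usage, goal_priority))
print(duplicated_goals(usage))
```

In this toy data the largest spend lands in "unmapped", which is the kind of signal a single line item would never show.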

Yeah, but I'll tell you, and you know this, token consumption is easy to measure. But there are two problems. Organizations have a very difficult time measuring the impact. Let's forget revenue and risk for a second and just talk capacity. So the first problem is knowing that you've created 30% capacity, because that implicitly means that you have a baseline, that you know what people are doing today, which in many organizations they don't. But there's an even more complicated problem.

Right.

Let's make believe that we have perfect instrumentation on what the process was before. Efficiency only comes from whether or not the capacity that you created is applied to something value added. So you and I, all of us can create 30 % of our day.

Mm-hmm.

And if we go play golf with that capacity, there is no value that has been created for the organization. So you not only have to know where the capacity came from, you have to know where that capacity was applied and whether or not that application of that capacity was actually driving greater revenue or greater service or whatever. The first part is hard enough, let alone the second. Very few organizations have the sophistication to be able to really understand where that

excess capacity is applied and whether it's actually generating value.
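The two measurement problems described here can be put into a small worked example. The hours and the value-per-hour figures are invented for illustration, and `capacity_report` is a hypothetical helper, not a standard metric.

```python
from typing import Dict, Tuple


def capacity_report(
    baseline_hours: float,
    current_hours: float,
    redeployment: Dict[str, Tuple[float, float]],
) -> dict:
    """Compare capacity freed with where that capacity actually went.

    redeployment maps an activity to (hours spent, value created per hour).
    """
    if baseline_hours <= 0:
        raise ValueError("a baseline is required before capacity can be measured")
    freed = baseline_hours - current_hours
    allocated = sum(hours for hours, _ in redeployment.values())
    if allocated > freed:
        raise ValueError("redeployed hours exceed the capacity that was freed")
    value = sum(hours * rate for hours, rate in redeployment.values())
    return {
        "freed_hours": freed,
        "freed_fraction": freed / baseline_hours,
        "untracked_hours": freed - allocated,
        "value_created": value,
    }


# Hypothetical: a 40-hour process now takes 28 hours, so 30% capacity is freed.
report = capacity_report(
    baseline_hours=40,
    current_hours=28,
    redeployment={
        "client meetings": (8, 150.0),  # assumed value per hour
        "golf": (4, 0.0),               # freed time that creates no value
    },
)
print(report)
```

The same 30% of freed capacity would report zero value if all twelve hours went to the second row, which is the distinction being drawn between creating capacity and applying it.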

Does knowledge management kind of get you there in a way though, if you do it the right way? Because LLMs might have set knowledge management back a bit, because they make it so easy to just look at explicit data. You can throw it somewhere and summarize it and think that you've done some degree of knowledge management, but if you're doing that, it might be tempting to ignore the more difficult work, which I think is mapping processes

Yeah.

and seeking out implicit knowledge, right? Some of it might even require getting up from your desk and going somewhere and talking to some other person. You have people scattered everywhere, using more and more of these tools and maybe not having as many conversations. There's this knowledge management challenge of finding the explicit knowledge. But once you unearth that, you're also unearthing information about processes and how they run from end to end and who touches them. And then you are finding kind of what you're talking about, right? The value points where you can actually properly apply automations.

Right. I mean, well, you're using knowledge management in the broadest sense of that term, which, you know, I'm a strong advocate of. I guess specifically to this, though, I think what we're talking about is having a deep understanding of your core processes and understanding how work moves from right to left. You understand the value added, the checks that are in place. You understand your failure and defect rates,

Mm-hmm.

your rework costs. In high-end knowledge businesses, we don't think that way. We hire really smart kids from Columbia and Stanford and Harvard, and we put them in a seat and we say, work and watch me and you will learn, and you will work 80 hours a week and you'll do that for four years and you will learn that tribal knowledge, right? But when I come to you and I say, you, senior person, how do you run your XYZ process? Most high-end, sophisticated businesses have a very difficult time articulating that at the degree of specificity that you would get from a consulting firm that mapped the Visios, right? And I guess part of what I'm getting across is that it is very hard for you to apply artificial intelligence in a

Mm-hmm.

in a precise, high-quality-output way unless you know those core processes and unless you know what your standard KPIs are, because you need those to be able to measure whether you're actually producing something that's better or worse than you had before. And most highly paid people in this world don't think that

Yeah, it's just like LLMs have evals, right? How does anyone know that GPT-4 is better than GPT-3 without evals, right? Someone says, it's smarter. And you always talk to people who say, I think Claude's smarter than whatever, but at the end of the day, these are just perceptions, right? And there have to be these evals so that we know we're making progress versus just training new models and spinning.

That's right. I mean, not to bore people, but if you want to do this right: one, if you're going to create, let's say, a bot, let's just say a Q&A bot, simple example, you probably want a thousand 100% accurate inputs and outputs. And you want to run those inputs and outputs through your model every single time you do an upgrade. And maybe the first time you run it through the model, it's 80% accurate. And then it gets to 85 and 90, 93, 94.

Right.

And then what happens sometimes is you fix something in the model, you think you fixed a prompt, or maybe you adjust some data sets, or you put some tagging in, and then all of a sudden it goes from 89% to 74%. And you're like, oops, what did I do? And then you've got to go back again and wire it. But the point is, you can't just, to your point, be like, oh, I love the new Anthropic model, because I asked three questions of it and it gave me good answers. And then you used it and you asked about your

Right.

you know, your dinner for Friday night, and you didn't like the extra garlic in your dish, right? These are very personal things, and they're not statistically significant, right? And by the way, you have to do it.
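The golden-set routine described above (rerun the same verified question and answer pairs on every upgrade, and catch the drop when a fix breaks something else) can be sketched like this. It is a simplified illustration: the four questions and answers are invented, the "models" are lookup tables standing in for a real system, and exact string matching stands in for whatever grading method a team actually uses.

```python
from typing import Callable, Dict, List, Tuple

GoldenSet = List[Tuple[str, str]]  # (question, verified answer)


def accuracy(model: Callable[[str], str], golden_set: GoldenSet) -> float:
    """Share of golden questions the model answers exactly right."""
    if not golden_set:
        raise ValueError("golden set is empty")
    correct = sum(
        1
        for question, expected in golden_set
        if model(question).strip().lower() == expected.strip().lower()
    )
    return correct / len(golden_set)


def regression_gate(model, golden_set: GoldenSet, baseline: float, tolerance: float = 0.0) -> dict:
    """Fail the upgrade if accuracy drops below the previous baseline."""
    score = accuracy(model, golden_set)
    return {"accuracy": score, "baseline": baseline, "passed": score >= baseline - tolerance}


def make_model(answers: Dict[str, str]) -> Callable[[str], str]:
    """Stand-in for a real bot: looks the answer up in a table."""
    return lambda question: answers.get(question, "")


# Invented golden set; a real one would hold far more verified pairs.
golden_set = [
    ("What is the wire cutoff time?", "4 pm"),
    ("Who approves new accounts?", "compliance"),
    ("What is the minimum balance?", "$500"),
    ("Which form opens an IRA?", "form 12"),
]

v1_answers = dict(golden_set)
v1_answers["Which form opens an IRA?"] = "form 21"      # version 1 misses one
v2_answers = dict(v1_answers)
v2_answers["What is the wire cutoff time?"] = "5 pm"    # a "fix" breaks another

baseline = accuracy(make_model(v1_answers), golden_set)
report = regression_gate(make_model(v2_answers), golden_set, baseline)
print(baseline, report)
```

Keeping the golden set fixed between runs is what makes the before and after numbers comparable.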

No, especially because my kids are using my LLM and polluting my memory, you know? So they're like...

Well, that's a different, I'll leave that to you at home, Robb. But the point I was going to make is, what I just described is step one in the process. Then step two is you put it into the wild and you say, you know what, I need you to participate in a training program here, a testing program. Every week I need you to ask 20 questions of the system. And I need 100 people to do that every week on Monday, and I need the results. And when you get something that's wrong, I want you to give me a thumbs down, and I want you to describe to me in words why I got a thumbs down. Because what will happen is your golden source will get into the high 90s, and then as soon as you give it to people, guess what, it's going to drop down. And then you, Robb, tell me, oh, you don't like this answer, and I fix it, and it screws up something that Josh wants to do, right? But you have to be very methodical about this. You can't just say, oh, I gave it to 10 people, they said it feels good. It's good, Jeff.

Yeah. Yeah. Yeah. Great!

I'm like, yeah, let's go. That is not a robust evaluation. And by the way, people don't like doing that. They like to build stuff. They don't like to test it.

Mm-hmm.

I know, that's pain, that's not dopamine. But it's so crucial to have that. And what you're talking about, I just think of as custom evals, right? Because you need them contextualized. General evals aren't going to work. You need custom evals that are about your organization. You need to build a corpus that's organization-centric. That's OAGI, right? Not AGI, organizational AGI.

Yeah.

And then your point is right. You've got to keep it static so that you have a baseline to play off of as changes happen: are you improving, or is it getting worse, or are you just burning tokens for no reason? And this kind of goes across. It's a meta problem, right? It's there in anything else. What if you rev an agent in some way, or add a new agent to replace an old one? How do you know that it's better?

Well, two things. Number one, to your point, and this is what I recommend to people all the time: these models, they're upgrading them every six months, sometimes every three months, right? So how do you know? I mean, they're like, it's better. Well, is it better on my corpus? Or are you actually degrading, right? Because of the way I structured my RAG process, it's sort of, in a weird way, actually acting strangely when I'm improving the model.

Right.

So your improvement is not my improvement. But the other thing, which I think is super important, is that we never get in trouble with the core use cases. We know what they are, we train them. It's the two, three, four standard deviations. It's those edge cases. And again, there might be in some cases five people in your entire thousand-person organization who are actually capable of finding those use cases. Right? I always say, if you want good AI, you need good data and really smart people. Because bad data with dumb people makes really dumb AI. Right? And that is where, I mean, sometimes you can build stuff in days and, as you know, it could take you months

Yeah.

of testing to get to the point where you are really confident. And when we start talking about agents, I mean, I'm just talking about systems that give you an answer, let alone systems now that are going to be acting on your behalf, passing on information to something else. Everyone is very focused on agents, which they, by the way, should be, but they're missing out on this conversation we're having right now, because this is where the value really lies.

Mm-hmm. Yes. So I always call it use case zero, because I always feel like, if you're going to build a Maximus robot, use case zero is that Maximus can build robots, not fold laundry. And there's no question in my mind, that's what they're working on. In the knowledge management space, it's knowledge that can train and learn itself, right? It's not about, you know, a pipeline: we assembled 30 people, we went through, accumulated all the knowledge, dumped it into the system, and then we keep updating it. Use case zero is, how do we create a system that accumulates knowledge on its own and maintains its own knowledge? Because it can.

Yeah.

Right? It's just, we're not used to thinking in that way.

I mean, yeah, but I think we have to think about these things as a human-AI collaboration. I think we are very far away from hit the button, redo my strategy, build a new product, and send it out to all the clients. We should be thinking about this agentic environment as a world that is leveraging machines to do what machines do extraordinarily well, that is collaborating with and being supervised by people, who are in turn collaborating with and supervised by machines, right? We have to think about this as a world where we're handing things back and forth to each other. We're engaging, we're talking, and we're using humans where humans are best, and we're using machines where machines are best. And I think part of the challenge is we're thinking about agents as purely agentic machinery, right? And the reality is, in organizations, it's going to be an interface. I'll just use a simple example. You have an account opening process that is agentic. And then at the end of that process, you may have a human being come on top and validate everything. And then, even before they submit, you may have a different model that comes in and checks what the human does. And only after all those things have happened together does it go through. Or someone may do something that feels outside of your ethical policy, and that either gets stopped or gets escalated to a human who then looks at it and makes an evaluation. So it's going to be this interplay. In all honesty, most organizations are not really built for any of that yet.

Yeah. Yeah. I think you make a good point. This can't just be autonomous. It's funny, people will say human in the loop, but I almost think of it in the reverse. It's agent in the loop, because at some point it's the agent saying, hey, you know, I was asked this question. I didn't have an answer. I sought the answer from wherever, a data source, or ultimately it came from a person. No matter what, it always comes from a person. An LLM's knowledge always came from a person. There is no such thing as information that didn't come from a person. It's just about whether it recently came from a person and which person it came from. But there's nothing in the system that didn't come from a person at some point, and therefore it can

There's that other, go ahead. Go ahead.

It can turn around and say, hey, I was asked this question. I don't have this information. I reached out, got this information from this person. Now I need somebody to validate that before I institutionalize that into my memory. Which makes sense. That's absolutely true, because some person has to be responsible for that information. Because if it's wrong, there has to be a human that has something to lose, right? That someone can go to and say, hey, how did this information get propagated?

Well, let me give you a thought experiment. I was at a conference the other day and I raised this question. So let's make believe that you have a terrible cough and you go to your doctor, who accesses your agentic medical profile, who then uses her agent to diagnose your cough, who then uses another agent to prescribe a medicine to you, which then is received by your agentic pharmacy, maybe with no humans involved now, by the way, which dumps out some pills into a little plastic container, which an agentic drone then picks up from your pharmacy and flies to you and drops on your front door. And you take that medicine and it kills you. Whose fault is it? We are living in this world now, and we are not prepared for that problem, right? That assumes that there is accountability and transparency in every one of those agents, that you know what information was passed along. I mean, maybe your records were screwed up. Maybe you lied about what your conditions were, right? That issue is very, very complicated, which is one of the reasons I don't believe that agents, from an external perspective, are going to move nearly as fast as we think they are, because of these types of challenges. And we have MCP, but the reality is these issues are unknowable right now.

Yeah, MCP has nothing. Yeah, it doesn't solve the problem that agents don't have anything to lose. You have to have an entity that has something to lose in that chain. In our case, we put people with paychecks there, right? And we pay them lots of money and we say, "Okay, if this happens, you'll be held responsible and you'll lose your paycheck."

That's right.

There has to be something to lose. Agents don't have anything to lose in this whole equation. And our society depends on a human being accountable. Otherwise, to your whole point, if it's the government that gave you the pills and you can't sue the government, then you're screwed.

But I'm... But that's why we're gonna... No, but this is why I think our kids will be flexing in a bar in 10 years, saying that they own and are responsible for the top agents at their company. Right? That's going to be... you're going to manage people

Exactly.

and you're going to manage agents. And to your point, Rob, if you're running the agent that does prescription management at the pharmacy and it comes out that your agent sent the wrong drugs out to Josh, you're going to lose your job.

Yeah. Yeah. And somewhere in that chain is going to be you swiping right and not reading it. The agent's going to be like, are you going to let me, I'm ready to deliver. Do you approve? And you're like, yep. You know what? I'm too busy to read it. Yep. And then, and then boom, you're going to be like, shit, next time I'm going to read that thing.

Well, that's a whole different question. I wrote something about this recently: it's not going to be one moment in time. We're going to start to cede our judgment over time, and it's going to happen in very small increments. Right. And it will not be a problem until it is a problem. And I will predict that we are not far away from some very bad thing happening in our world.

Right.

and the CEO is going to get on CNBC and say, "Well, we had an agentic problem." And the reality is, you know what, your agentic problem is your problem, right? It's no different than if I hired a bunch of summer interns and gave them vodka on a Friday afternoon and they have at it, right? Like, it's my job. We can't... and I think there is a

Right. Exactly. And as that manager, it's why you don't get paid seven bucks an hour, because if you have nothing to lose, you will give them vodka. Like, whatever, lose my job, I can get a better paying job at McDonald's.

Well, that's right. That's right. I mean, the argument, and by the way, I don't know if I subscribe to this, is that we are going to create so much complexity in these agents that we're going to need better, higher quality people supervising all this, because the job is going to be really, really hard. It's going to be a big... right? Yeah.

Hmm.

Mm-hmm. Yeah, it's like lawyers that read the fine print, you know, like you've got to

Well there's...

vigilance, right? You've got to make a lot of decisions really fast. We're almost all going to become like Obama and his gray suit, blue suit thing. It's just one less decision he has to make that day on something unimportant, because he's got so many decisions to make in that day that matter.

Well, and the problem is the risk, and I think that's a good example, right? Before, there was only so much damage that the three of us could do in a day, right? When you're managing an agent that's going to send out all the tax notifications to your 4 million client base, or is going to process those prescription drugs, your ability to do harm at scale is so much bigger, which again demonstrates why you need... I mean, it's very likely that firms are going to have five or even six

Hmm.

independent monitoring controls, to include humans. Like, you may have five agentic monitors and two people overseeing this, because if you don't give them the right dialysis approach, people are going to die. Right. And really putting in that level of controls, which gets back to the conversation we started with, like

Yeah. Yeah.

good data foundations, good semantic layers, good controls. And again, we're good for 15 agents. We're not good for 15,000.
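The layered controls described in this exchange, several independent automated monitors plus more than one human sign-off before an agent acts at scale, can be sketched as a simple release gate. This is a rough illustration under stated assumptions: the function `release_allowed`, the two example monitors, the `recipient_count` field, and the thresholds are all hypothetical, and a real control framework would be far more involved.

```python
from typing import Callable, Iterable, Sequence

# A monitor is any independent check that approves or rejects a proposed action.
Monitor = Callable[[dict], bool]


def release_allowed(
    action: dict,
    monitors: Sequence[Monitor],
    human_approvals: Iterable[str],
    required_humans: int = 2,
) -> bool:
    """Allow the action only if every monitor passes and enough
    distinct people have approved it."""
    if not all(monitor(action) for monitor in monitors):
        return False
    # Count distinct approvers so one person cannot approve twice.
    return len(set(human_approvals)) >= required_humans


# Hypothetical monitors for a bulk notification job.
def has_recipients(action: dict) -> bool:
    return action.get("recipient_count", 0) > 0


def within_batch_limit(action: dict) -> bool:
    return action.get("recipient_count", 0) <= 5_000_000


monitors = [has_recipients, within_batch_limit]
action = {"kind": "tax_notification", "recipient_count": 4_000_000}

print(release_allowed(action, monitors, ["ana"]))         # False: one approver
print(release_allowed(action, monitors, ["ana", "ben"]))  # True
```

The gate is deliberately strict: any single failing monitor blocks the release, and approvals are counted per person rather than per click.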

Yep. Yeah.

Yeah.

And that context, to kind of bring it full circle: we realize this knowledge management isn't just for AI agents, it's for the people that have to make these decisions too. They need the same knowledge to know that the decisions they're making make sense. Because the consequences now are so great, that context matters more.

Well, and the good news is you can, if you properly structure your data, if you have the right sources and they're of high quality, you can help your employee base make those better decisions, right? But again, it comes down to all the stuff we've been talking about.

Yeah. Well, that ties back to another aspect of use case zero that we've talked about. We trained an agent on all the knowledge in our book. And that was exciting. And then we're just staring at another empty interface that says, "Ask me anything." That's interesting, but it's not terribly helpful. What that interface should really be doing is acting on behalf of the person staring at it, like, "Who are you? Have you read the book? Did you just buy the book or did you read it all? Let me see how much you know. And now let me teach you everything I know about agentic AI, and let me teach it to you in a way that you're going to understand, and that you're going to grow and be engaged and interested, and you'll learn it all, but we'll take you through it the way that you're going to like it or that you're going to learn the most from." I think, you know, that also opens an opportunity where if

Pushing. Yeah.

If agents inside of an organization are actively teaching people, not just about how agentic AI works, but also the corpus of knowledge about that organization, and bringing them up to speed, there's an opportunity there too for a feedback loop into the system. Because people are going to spot things that are wrong. People are going to educate the system as well. They're not just going to be taking the education.

Yeah.

Well, you make the point, which I've said. I teach at Columbia, and I was talking to students about this the other day. Someone sort of said, "Well, AI can make you really dumb." And I said, "Yes, AI can make you really dumb if you want to be dumb. It can also make you incredibly smart if you want to be smart." Both of those things are factually true. And I think, to your point, AI can teach

Mm-hmm.

and inform and challenge and elevate your knowledge base in a way that is just extraordinary. On the other hand, if you want to be lazy and act smart, AI can do that too. And I think one of the, and I know we're getting philosophical here, but one of the questions is how do we build environments and cultures where people are doing the former rather than the latter? And how can AI play a constructive,

Yeah. Yeah.

a constructive role in that knowledge expansion, right? So I know more stuff about the world, and I'm more thoughtful, and I'm asking better questions, as opposed to just, to your point, Rob, "done, done, done, done, done," and I'm on my iPhone looking at ESPN.com.

Exactly.

Yeah, I think, you know, as I look through this, there's one lens where humans are going to be the bottleneck in all of this, not machines. But it's a needed, necessary bottleneck. And if you want to make them not just more productive, but more productive at making decisions, then you have to make information more accessible and available to them at the time they're making decisions. And this kind of ends up at decision science. You know, use case zero isn't just the system learning on its own. It's the system teaching what it learned back to humans so that they can make better and faster decisions, which is an interesting loop, right? Human in the loop is like, I'm going to teach the system, and now the system is going to teach me.

But there's, I mean, again, not to go back to the human thing, but that's when we're at our best, right? We're at our best when the three of us are doing things, listening, being humble, being open, and then saying, "No, I think you're wrong. No, I think you're wrong. I think you're right," right? That is the essence of learning, right? And challenging. And that's why, you know, I personally...

Mm-hmm.

I don't know if you guys saw, I built a bot over the weekend, talk about wasting tokens. I created a board of advisors, and it's on MacmillanAI.com. And what I did is I took my favorite 20 leaders in the world, like from Teddy Roosevelt to Richard Feynman, Steve Jobs, Sun Tzu, and Nelson Mandela, and what you can do, you can ask...

Mm-hmm.

Yeah, mixture of experts on the roofline.

It's about a penny a query, so people can go crazy. I've got plenty of tokens. And what you can do is you can ask questions. And by the way, have a look at it. It's called the Board of Advisors on my site. And what it will do is it will give you different perspectives. And by the way, some of them are actually different. Like, Steve Jobs will say something that's different than FDR. And I would argue that that process is really helpful. It's like, I can't call you guys and ask you your opinion all day long, but I can ask AI a lot. I can say, Josh, be Rob, be somebody else, and give me your point of view, and that elevates my game.

Yes. Yes. Yeah, we had this good discussion. Josh, remind me... the Genome Project.

Dr. Lee Hood?

Yes, Dr. Lee Hood. So we had this discussion with Dr. Lee Hood, and we were talking about the fact that the medical system isn't going to change at its core based on AI. What's going to happen is there's going to be an AI abstraction layer that helps you manage the bureaucratic healthcare system that doesn't change. And that's going to happen faster. It's just going to be a layer that will make appointments for you in the antiquated old system that they didn't get around to changing. And it makes a lot of sense. But what we discussed is, yeah, it could be like GPS, right? Some people will be lazy and they'll just let the thing make decisions. But more than likely, we're going to have different perspectives. And once you add that, once you add a couple of agents, if you want to call them that, arguing with each other about whether you should eat the donut or not eat the donut, now you're forced to kind of read the arguments. And what's happening at that point is you're actually going the opposite way. Now you're learning, because the best way to learn is to hear different perspectives. Right. So now you're like, should I eat the donut? Well, your psychologist says eat the donut, but your trainer says don't eat the donut. Now you get to decide whether you'll eat the donut or not. Well, you just learned something...

We're getting somewhat philosophical here, but I remember, you know, 15 years ago, when I was working on machine learning stuff, I used to make the joke that smart people with better information make better decisions than smart people without it, right? And I think what you're highlighting is the fact that generative AI does this on steroids, right? Now again, it's a question of what you do with that information. Are you just doing whatever the AI tells you and saying, "Okay, well, now tell me what the right answer is," or are you applying critical thinking? And the power is that if you wish to apply critical thinking, you will produce better decisions without question. But if you do not, if you're lazy and do not apply critical thinking, you will make dumb decisions, right?

Right.

This is where we as humans have to be very thoughtful about this issue. It's a very existential question, and it's not letting the AI decide and saying, "OK, well, it told me to eat the donut. I'm just going to eat the donut."

Yeah. And that's just up to us, right? It can't force us to make choices, right? I think this is another anthropomorphic concept that we apply, and it gets dangerous, because we think of things in the singular. We have one thread in our consciousness, and we talk, and we have an opinion, right? And so we think there's a right and wrong answer to a lot of things, because, well, "run from the lion or don't run from the lion" really matters to us. But AIs are multi-threaded. There's 20,000 right answers to that question. Should I eat the donut? There's not one right answer and one wrong answer. There's a hundred. So now you have to pick the one right answer out of a hundred right answers. There's no right answer. And that's where knowledge gets complicated, because you're like, well, there is no answer. There's just the answer you want to pick.

Well, the other thing is, I mean, you could imagine a world in many years where you have a choice, there is a probabilistic curve of outcomes, you then experience that outcome, and then it goes back into your algorithm, which adjusts. So the next time, maybe you don't eat the donut, right? Or maybe the next time you do, because the factors have changed. But the idea... I mean, what makes great smart people is they learn from experience, right? There's a reason 18-year-olds do stupid things sometimes, right? Because they don't have that knowledge. But I think one of the things that's so powerful about AI is it really can capture those learnings in a way that we as humans can't. You know, we forget stuff. But being able to sort of build... And I ultimately think we will have these algorithms, and they'll have the

The Rob algorithm, the Josh and the Jeff algorithm, right? And these will be telling us all day long, you know what, maybe you can eat the donut today, but maybe tomorrow you shouldn't, for the following million different reasons.

Yeah. And what's the likely thing? The likely thing is it says, you know what, just give me the data. The data's gonna say, you know what, eat the donut, you'll feel good for 20 minutes, and then you'll feel like shit in an hour. Your decision.

That's right. That's right. And we'll probably still eat the donut, but that's a different problem.

Yeah.

Yep. Well, cool. This was a great conversation, as usual. Never enough time. We barely cracked it open. But thanks for joining us.

Indeed.

No, my pleasure guys, always a pleasure.
