Joel Hellermark:
You have lawful and lawless agents, rogue agents that don't know the permissions, don't know the underlying rails, and you can give them exactly what they need, rather than each user having to sort of figure that out. You can then build these more proactive experiences.
Michelle Dawkins:
Hello, my name is Michelle Dawkins and on today's Future of Work podcast, I was joined by Joel Hellermark, Workday's Chief AI Officer. In our conversation today, we explored Workday's recent research paper, which talks about the copy paste economy. This is digging into the fact that employees are spending a lot of time replicating data between disparate systems, which really increases the productivity gap. Joel and I talked about how do we solve for this? We really looked into why context is important when it comes to AI, why it's important to have AI running on guardrails and within systems and processes rather than separate to those systems and processes. And we also talked about the importance of human in the loop when we're moving towards more automation. I really hope you enjoyed the conversation. Joel, thank you so much for joining us. It's lovely to see you in London. Absolutely amazing.
Hellermark:
It's amazing to be here.
Dawkins:
Looking forward to Elevate tomorrow? Are you a fan of the Rolling Stones? Have you noticed we're in the Rolling Stones room in the London office?
Hellermark:
Yeah, yeah, I am indeed. Not a massive Rolling Stones fan, unfortunately, but excited to be in the room.
Dawkins:
Okay, all right. So we are going to talk about some incredible new research that was just published about AI adoption, the usage of AI in organizations, and I'm really excited to ask you some questions about this, get your perspective. So we will jump right in if that's okay.
Hellermark:
Of course.
Dawkins:
If we talk about the difference between adoption and transformation, organizations can often make mistakes around what is the difference between adoption and transformation. We know that just putting an AI assistant at every desk actually isn't going to make the changes that companies are expecting. What is the change that needs to happen within organizations?
Hellermark:
I think the first problem that organizations need to solve is genuinely AI adoption. When you get AI adoption, you ride the model improvements. So every time the models get better, if AI is universally embedded in all of your teams, your organization will instantly get that sort of intelligence increase if you like. But these systems reset the mechanics of how you use them sort of every 12 months or so now.
So we started in the assistant era, and the assistant era was heavily prompt-based. So we just put copilots at every desk, and people were copy-pasting material from any system of record and so on, trying to input it to this copilot to do work for them. What's shifting now, and why this sort of pattern is resetting, is that we're getting proactive agents.
We're getting proactive agents that have context from these apps. So this means we're shifting from copilots that we're constantly instructing and prompting to get any value out of them, to agents that proactively serve us actions and insights before we even ask for them. And as they do that, when they become increasingly embedded into the systems that we use every single day, with that context from, for example, Workday, they can do that much more intelligently. So shifting from reactive copilots to proactive agents.
Dawkins:
And that's what's really driving transformation rather than adoption.
Hellermark:
Exactly. So the pattern that we're seeing is that, from a copilot-esque interface, you could see small, sort of ten, twenty percent productivity increases. But as you're moving towards agents that can do end-to-end work proactively, you can save an order of magnitude more, and that's truly the shift that we're seeing now.
Dawkins:
It's interesting because the research identified, you talked about it, this copy paste into different systems, so the copy paste economy, the research identified that people are losing seven or more hours a week on performing these tasks. That's incredible. It's an incredible loss of productivity. Why isn't just adding more AI the answer here?
Hellermark:
I think in those cases you need agents that are fundamentally context-aware and, to my previous point, proactive. So basically what we've seen historically is that across each one of these systems, the human was sort of doing the context engineering, if you like: trying, at each moment, for every single task, to provide the right context to the model. And as the systems become embedded and proactive, they're assembling all of the right context just in time to solve the task. But they're also moving from doing single-step tasks to multi-step tasks. So historically, you might have done this to generate a single email or a single document and so on. Now the systems are also becoming better at solving long-horizon tasks. So what we're seeing now is that the length of the tasks that AI can solve is doubling roughly every seven months currently. So that means we're moving from systems that, seven months ago, could draft a document at best, to systems that are now taking on an end-to-end process. And so gathering the right context and executing multi-step workflows will mean you'll get significantly more productivity out of these systems than when you had to context engineer for each subtask.
Dawkins:
The context point is really interesting. I use a running app at the moment to improve my pace. I'm very slow with my running. And when I first started using it, it was giving me a pace that was completely unreasonable. And I kept arguing with the AI, I can't run at that pace. And then over time, it's completely shifted as it's taken in my runs, it's taken in my feedback on what my effort was. And now it's actually giving me relevant pace based on my data. That context piece is really important.
Hellermark:
Exactly.
Dawkins:
I did argue with it a lot in the beginning. So you've talked about having context, having end-to-end processes. In the research, it identifies this right and wrong approach to AI. So having AI completely separate or having it embedded into your systems. What is the crucial difference there in having the layer or having it embedded into your processes?
Hellermark:
So there's a few things. The first thing is just running on the right rails. And so you have lawful and lawless agents. The lawless agents are largely the paradigm that folks are running today. It's sort of rogue agents that don't know the permissions, don't know the underlying rails and so on, and really struggle to execute tasks that are policy- and context-aware. When you run it inside of the systems and with that context, you can make sure they're lawful. They're executing the tasks that they should be executing. Second thing is that you can bootstrap a lot of this context and you can give them exactly what they need to solve the tasks. And so rather than each user continuously having to sort of figure that out, you can then build these more proactive experiences. And so rather than having a reactive experience where you're trying to go in, assemble the right context, and put that into the agent to solve the task, the agent can serve you the tasks that it could support you with. And so you're moving from these reactive approaches to an agent that is running 24/7 effectively. I think about it as you're basically getting an infinite set of, you know, 150-plus IQ, universally aware coworkers in your pocket: you can review their work, you can review their suggestions, and they're constantly delivering new insights or new actions or new work for you.
Dawkins:
Yeah, that's incredible. Building AI that knows the company, knows the context, knows the processes, sounds simple in theory. What actually makes it hard and what makes it hard about having it sit within the data itself?
Hellermark:
The first part is building out the rails for these agents to run accurately. And this is a very new pattern compared to the previous UI patterns. Historically, we were largely building software for humans. Now we're building software for agents. And so you need to deal with the latency requirements of agents, you need to deal with how you gather the right context, how you create long- and short-term memory, and so on. So we're basically creating a new pattern of software that is built for agents to use instead of humans to use. And so we're building the right policy engines for the agents, we're building the right knowledge engines for the agents, we're building the right memory systems for the agents so that they can solve these tasks effectively. So in that sense it's sort of a re-engineering moment of how do you make these systems super intuitive for agents rather than just super intuitive for humans.
Dawkins:
And so if an organization is trying to do that outside of a system where the data sits, where the process sits, where the logic sits, the risk, the compliance, the cost, that all becomes a massive issue.
Hellermark:
Exactly. And those systems might not have the deep context of the tasks that you're trying to solve. So I think domain knowledge becomes super important here. We sit on the deepest domain knowledge of our tasks and our workflows, and we can use that as we create the rails for these agents. But if you're just executing this on an abstract level without having any insight into the specific tasks, that gets really difficult.
Dawkins:
Yeah. You mentioned something earlier, human in the loop. I think it's something that we've talked about for a long time that is something that is still critically important. We are moving more towards autonomous agents that are making decisions, they are taking action, they are running processes, they're supporting and working side by side with humans. How do we keep that human in the loop concept still in place? How do we make sure that we're retaining that?
Hellermark:
So I think we're basically seeing five levels of autonomy. So if you take the self-driving cars analogy, we're sort of following that. So first it was these sort of highly reactive systems, maybe like simple autocomplete, those sorts of tasks, and then they became more prompt-based, and then they're becoming more proactive. And I think as you go towards the higher levels of autonomy, they're going to be more and more policy-bound. So we're still at the stage now where humans are heavily involved in the final approvals. But instead of approving a task that would have taken you one minute to create, you can approve a whole body of work. But once you've been embedded in that loop long enough, the system should get an intuition for when it actually needs to loop you in. So if you think about self-driving cars, there was quite a long era where humans were still sitting in the car, correcting it when it did something incorrectly. The autonomous driving system was handing over to the human when it was unsure and so on.
And that's largely where we're at today. We have a car that drives pretty well but still makes quite a lot of mistakes, hasn't seen all of the edge cases and so on. And now we are basically sitting in these autonomous cars, correcting them, and then at some point it will have gathered a lot of the policies, a lot of the knowledge from us sitting and correcting it in the car. And at that point it will get increasingly autonomous. So I think we're at like L3, approaching L4 level autonomy for knowledge work. And the final sort of L4 is policy-bound. And then if you think about L5, which is really the end state, you have autonomous companies that are sort of self-improving. And so you have end-to-end agents running end-to-end cycles and then reviewing and improving the processes as they do that. But I think it's going to be a long journey of sort of L3, L4, getting that right, before people will trust autonomous enterprises.
Dawkins:
And the trust word is really interesting, actually. So what I was thinking then is I recently went in an autonomous car in San Francisco and you still have that desire as a human to take control and drive it yourself. And you're worrying, is it doing the right thing? We see that. I mean, the research sees that that trust piece is really important. Do you see a change in the way people are interacting with AI now where we will become more trustful and not recheck the work over and over and over?
Hellermark:
I think at some point it will be quite the opposite, where people will lack trust for the human-driven systems. That's certainly how I feel. I feel more safe going into an autonomous car in San Francisco than jumping in with a cab driver in London, despite those cab drivers being, you know, incredible drivers. If you look at the error rates of the autonomous cars compared to the error rates of the cab drivers, they're significantly lower. And that will apply to all work, right? And I think we're just at the phase where we're starting to make this shift. But we're certainly seeing that humans more and more want to double-check everything with an AI system.
So you get something from your doctor, you want to send that to an AI system to make sure it was correct. But you wouldn't purely trust the AI system in that case either. So I think that's where you want both. You want the judgment and the intuition of humans combined with the AI systems, which are incredibly unconstrained, right? They can have all of the world's knowledge, all of your company's knowledge, all of your context. They can run millions of hours of thinking in parallel to solve your task. So naturally they will start making a lot fewer errors than humans would.
Dawkins:
Yeah, yeah, I get it. You do see the transition coming and the trust, but as you said, the data, the accuracy, the guardrails, all of that needs to be in place to make sure that people do move in that direction. Yeah, more confident in the outcomes and then increase productivity in that way. So you are Workday's Chief AI Officer. As part of that role, you work on bridging the gap between our potential and the return on investment that our customers see. How are you thinking about that? How are you thinking about the value that we drive with AI?
Hellermark:
I think we're moving into an entirely new sort of category of software. Historically we were largely selling these sort of software solutions that you would put in the hands of employees, and you would see some productivity gains as a function of that. And now we're building the systems that are basically enabling you to create an endless number of agents that can help you do your work and augment your teams. And as we make that transition and define a new category of software, we're working incredibly closely with our customers to define those new patterns. And the results we're seeing are truly staggering. As you move from reactive copilots to proactive agents, people are moving from saving a few hours a week to doing work that would require months of effort. And so we're incredibly excited about the ROI that we're seeing through these partnerships. So we love to bring together, you know, the best AI researchers on the planet and the best designers on the planet with our customers that are pioneering the application of this, to define those user experiences that augment their teams.
Dawkins:
So you're really looking at an outcomes-based approach when it comes to defining the value that the AI we're building is delivering. Do you look at it, is it productivity? Is it access to data? Are there specific metrics that you use to define what success looks like?
Hellermark:
I think there's a tendency to try to over-measure. And I think given the current trend lines, the least of our issues will be the ROI of AI. The issues will be: do we re-engineer our processes around it? Do we adapt quickly enough to adopt it? And so on. And why this shift is very different compared to previous shifts is that historically, we used to rely on a couple of years where we could adopt these technologies, and then, you know, a decade of sort of harvesting the value of adopting it. Now this adoption-to-harvesting cycle is basically yearly.
We spent last year trying to just adopt these copilots, and then this year we're trying to adapt to agents and so on. So we're moving from an organization that does these transformations, you know, every decade or so, to having to fundamentally re-engineer how we work every single year. And so what we're really focused on is how do we, together with our customers, really define new ways of working that can re-engineer how they work every single year. But if you're stuck defining your pilot and your ROI criteria, you'll be stuck doing that for the old paradigm.
So I believe a lot in sort of AI maximalism, if you like. Try everything, run it in parallel, and, you know, overinvest in adoption. I think leaders today would much rather have over-rotated than under-rotated. And I think the risk with sort of AI minimalism, in defining a few sets of projects, running a pilot, defining the ROI, and based on that, deciding whether you invest more, is that you'll be massively under-rotating. And so what we see as most successful is companies that are exploring, that are testing a lot of different approaches and then doubling down on the projects that work.
Dawkins:
It's interesting, a lot of the customers I talk to, it's building creativity inside their own companies as well. So giving people the opportunity to explore, come up with their ideas, and then see what would be adopted at a global level or a company level. Is that something you're seeing as well, organizations really going down that route?
Hellermark:
Exactly. So Ethan Mollick, who's you know an exceptional researcher in this field, he talks about sort of the lab, the crowd, and the leadership. And your AI implementation really has to hit all three. So first starting with the leadership, you know, every all hands, every performance review, it's sort of universally embedded in how you work and the leader of the company needs to be very AGI pilled, basically. If the leader is not, it's not really going to trickle down from there. If the leader is sort of an AI minimalist that's taking a few sets of projects and wants to measure the ROI of every single initiative and so on, it's going to hinder a lot of that.
So first the leader needs to be very sort of AGI-pilled, if you like. The second is then the lab. So that's sort of the central AI team that sets the principles, evaluates the tools, just makes AI adoption internally very easy. So they can set up a lot of that internal AI tooling that a lot of other teams can rely on. And then finally the crowd. And what works really well when you invest in the crowd is to find the folks in each function that have the domain knowledge and are really sort of curious and explore the new AI models a lot. So if you combine that, you basically have ambassadors in every single team who know the craft, know the models, and are applying it there. You have the lab that is overseeing, sort of making sure all of the teams have the right tooling and so on, and then you have the leaders that are constantly reinforcing this in every aspect of the processes. And if you get all of those three right, those companies tend to be most successful in their AI implementations.
Dawkins:
Joel, we were talking about Swedish Midsummer, a great time of year, and that quite often there's a long holiday during that time. You mentioned that you tend to use that to do a bit more research or learn some new things. What are you planning for this summer?
Hellermark:
I think this is a very fun era of home robotics. So this summer, what I've been working on a bit is creating a sort of home security system, where you can have these sort of always-on cloud bots overseeing things so no bad guys enter the building. And so I think something like that would be a fun project for this summer.
Dawkins:
This goes way beyond a robot vacuum. I like it. I know you've got a lot going on over the next few days, but I really enjoyed the conversation. Thank you for joining the podcast.
Hellermark:
Thank you so much for having me.