Will you be left behind if you don’t understand AI agents?
Logo and header design by Sara Remi Fields

A lightning-fast take on AI agents, with special guest Deepak Agarwal. I had the pleasure of working with Deepak at LinkedIn, where he was responsible for the massive value created every day for LinkedIn members through AI-based products. Currently VP of Core Engineering at Pinterest, Deepak oversees the engineering solutions that power all of Pinterest’s consumer experiences, Gen AI, and innovation; he was previously at Yahoo, where he led the first personalization of Yahoo’s media properties. Deepak brings over 20 years of experience and excellence in data and AI, regularly serving in leadership roles for the world’s top-tier conferences in data, statistics, and AI and contributing to critical academic research.

Let’s do this ⚡ These views are my own, not those of where I work. 

The take

If there is one key message for product builders from my conversation with Deepak, it is that business leaders who are not thinking about transformation around AI agents will be left behind. AI agents combine foundation models like LLMs with code to become software entities capable of performing tasks on behalf of users, serving as intermediaries in various processes. While AI agents have existed for some time, from self-driving cars to real-time bidding for ads, recent technological advances mean that an agent can now be created by anyone. Not just developers, not just those with AI expertise. Anyone with any kind of expertise or idea can encode it into an agent that can be interacted with. Imagine that at every level of the stack, agents will be created to act on specific tasks, affecting both the user-facing products and the internal operations of every business.

To speculate: every person will have an AI agent that they have trained with their specific wants and desires (discussed earlier with Dmitry Shapiro in a past newsletter). That agent will be able to interact with other people’s agents, as well as with systems of agents designed to accomplish specific tasks (e.g. booking a flight, working with your healthcare provider, teaching you something, who knows…). There will also be an endless number of use cases for agents that consumers never see, but that work internally within systems to help accomplish tasks.
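
To make the “LLM plus code in a loop” idea concrete, here is a minimal sketch of what such an agent looks like. Everything in it is hypothetical: call_llm stands in for whichever foundation model you use (it returns canned decisions here so the loop runs end to end), and the single tool is a placeholder, not any particular vendor’s API.

```python
# A minimal sketch of the "LLM + code in a loop" pattern behind an agent.
# call_llm() is a stand-in for a real model call; here it returns canned decisions
# so the loop can actually run. search_flights() is an equally hypothetical tool.

def call_llm(transcript: str) -> dict:
    """Stand-in for a real model call. A real implementation would send the
    transcript to an LLM and parse its reply into an action."""
    if "Tool search_flights returned" in transcript:
        return {"action": "finish", "answer": "Here are two options that fit your constraints."}
    return {"action": "search_flights", "input": "SFO to DEL, December, max 1 layover"}

def search_flights(query: str) -> str:
    """Hypothetical tool the agent can invoke."""
    return f"(pretend flight results for: {query})"

TOOLS = {"search_flights": search_flights}

def run_agent(task: str, max_steps: int = 5) -> str:
    """The loop: ask the model what to do, run the chosen tool, feed the result back."""
    transcript = f"Task: {task}"
    for _ in range(max_steps):
        decision = call_llm(transcript)
        if decision["action"] == "finish":
            return decision["answer"]
        observation = TOOLS[decision["action"]](decision["input"])
        transcript += f"\nTool {decision['action']} returned: {observation}"
    return "Stopped after too many steps."

print(run_agent("Book me a flight to India in December with at most one layover."))
```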

Deepak and I also went deeper on how tech systems are going to change if we have layers and teams of agents operating toward specific outcomes. We started by thinking about how what is already built could be re-imagined to help solve meaty agent problems. It is possible we will need to develop some kind of common standard, similar to TCP/IP, for agents to interact with each other and to facilitate orchestration. It is also possible that such a system would need to encode dynamics similar to traditional recommender systems in order to choose the right, trustworthy agent for any given task. Just as classical recommender systems learn via federated computation, with techniques like candidate generation and explore/exploit, we may need to reinvent those methods to handle selecting the right agent for the right outcome at scale.
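
As a thought experiment, here is what a shared message format between agents might look like if a TCP/IP-style standard ever emerged. No such standard exists today; every field name below is invented for illustration.

```python
# A thought experiment only: what a shared "agent envelope" might look like if a
# TCP/IP-style interoperability standard for agents emerged. No such standard exists
# today; every field name below is invented.

import uuid
from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class AgentMessage:
    sender: str                    # identifier of the requesting agent
    recipient: str                 # identifier of the agent being asked to act
    intent: str                    # e.g. "book_flight", "summarize_history"
    payload: Dict[str, Any]        # task-specific parameters
    constraints: Dict[str, Any] = field(default_factory=dict)  # budget, privacy, deadlines
    message_id: str = field(default_factory=lambda: str(uuid.uuid4()))

# A personal-assistant agent asking a (hypothetical) travel agent to book a flight.
request = AgentMessage(
    sender="personal-assistant",
    recipient="travel-agent",
    intent="book_flight",
    payload={"origin": "SFO", "destination": "DEL", "depart": "2024-12-15"},
    constraints={"max_layovers": 1, "share_personal_data": False},
)
print(request)
```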

But this assumption equates systems of agents to recommender systems for content or search, where there is a known library of items that you then rank. In the world of generative AI, why couldn’t an agent be spun up in the moment to solve a highly specific request, in the same way content can now be generated on request, in the style of your favorite music or on a desired topic? This brings a whole host of adversarial problems, alongside data privacy issues and the environmental impact of nearline inference.

As always, we’re going to have to build responsibly, given the environmental impact of computing costs, and find real, meaty problems that are worth solving. Deepak also reminds us that nothing is changing about the goal of a business to help its customers; it’s that this new technology allows those same problems to be solved in novel ways.

I left the conversation excited about the possibilities here, and feeling a strong sense of urgency to start re-imagining every product on the internet in a world of agent-focused orchestration. Let me just clear my schedule… 

The Interview 

Keren: I'm so excited to be talking to you about this topic. I have been very interested in the ways that consumer behavior is going to change with generative AI. Both of us work on platforms that work with content and creation. But I think the topic I want to cover is way broader and bigger than that, right? It's the fact that the way all user interfaces of the internet have been constructed might not exist in the same way, let's say one, two, three, or four years from now.

The way these products have worked, whether searching the web or sending a message via your product, has all been about consumers interfacing directly. And now we have the capability, especially when you combine an LLM with code directly into loops, to have agents performing a given task on someone's behalf, acting as an intermediary. There can also be multiple agents coordinated in a single system of agents or a team of agents. So, first, why don’t you tell us a little bit about this concept and what excites you about it, and then we can go from there.

Deepak: At a high level, the idea of an AI agent is something that we have all been thinking about for a long time: the ability for anyone, without a lot of deep expertise or even much coding expertise, to create something around their specialty. If you have knowledge in some area, you can create software that encodes that knowledge and then have it manifest in the world in a meaningful way. That's really the whole idea of having an agent, and that's what has been enabled by all the innovation that has happened. So it's not just GPT; it's the fact that you have a model like GPT and anyone can take it and build on top of it by infusing some of the knowledge they have. If you are a fashion designer, you can encode whatever knowledge you have, make the model learn about it, and then start talking and interacting with it. So that's really the key, and the fact that you can do it so easily now, with all the innovation that has happened, is what makes it very exciting. You don't have to know the inner workings, but you can still reason about it, build AI agents, start interacting with them in the real world, and add value to the entire ecosystem.

And it's not like AI agents did not already exist. If you look at self-driving cars, that's an example of an AI agent, right? You sit in a car, you tell the car, "I want to go from point A to point B," and it orchestrates the whole thing. But it's not easy to build; it's a very nuanced piece of technical work. Similarly, if you look at advertising, the idea of an agent has existed there for a long time. That's what they call real-time bidding: you are looking for an audience, and in real time you can bid on it, and the agent decides what the right bid is to make. Again, these are all digital systems that are agents, but they required a lot of effort to build. What has happened now is that it has become democratized, and anyone can build an agent to encapsulate the knowledge they have and then start participating in this AI ecosystem. So that's how I think about the world of agents, and it's very new. This is going to evolve in many different ways. It will start impacting the world in very different ways. That all remains to be seen, and we are all going to speculate on what it's going to look like, but it's very exciting given the evolution that we've seen in the technical landscape.

Keren: Just because it is fun, I think we should speculate a little bit. I think even in what you just described, you covered multiple different types of agents, just in the examples you used.

Deepak: Another example is healthcare. A health coach is also an agent, and many health companies have built that before. If you are a diabetic patient, this is how I'm going to help you with your nutrition and your lifestyle. But again, as I was saying, it took a lot of effort to build these agents in the past; now it has become much easier, and the process has been and is being democratized. That's the key.

Keren: Let’s dig into a use case. Let's talk about healthcare: I'm a patient, and I can interact with an AI doctor, and maybe that AI doctor also has access to another AI agent that has my past history, what I've been eating or what I've been doing. What was the step that was just taken that has made this world much easier to imagine? And then what are the next two or three steps that need to be taken to get us into a world where there can be the level of orchestration we just described?

Deepak: Yeah, so the biggest step that has made this so exciting is the fact that anyone can sit down for a few hours and create an agent. I can encapsulate all the knowledge I have in a particular area, and then it starts interacting with the world at large. In the old days, it was much harder even to create a website quickly, right? It's like when creating a website became really easy. For example, I have all the knowledge about how to deploy recommender systems in the industry. I don't have to write up tutorials anymore. I can actually build an agent, and people can then ask questions, get all the answers, and use that knowledge to do what they want to do in real life. You don't have to know how to write code or have very deep knowledge of AI or any such thing. That, I think, is what has been transformative. That power, and the ability to do these things so easily at scale, is what is going to transform the future we are going to be in. So that's the first question.

The second one, this is again going to be very speculative. I think what will change, in particular, is the way humans and machines interact with each other. I think that will change in a very big way, right? It is going to become a lot more multimodal and heterogeneous than it is today. For instance, it will provide all of us with more options in terms of how we get work done in real life. And what mode we use may depend on the goals and the tasks that need to get done. For instance, I'm about to go to India in December. I spent so much time researching and getting my tickets to India done, because I had a lot of constraints: I need to go on this day; my daughter is in college, so I have to align with her schedule; then I have to come back on this day; and I need a flight that cannot have too many layovers. So I ended up doing so much research. And then you want to look at the price point, you want to know the right time to buy and which airlines, you have to read the reviews of their flights. In the new world, all of that should not take me that much effort. I should be delegating these things to an agent who is an expert at booking flights, right? It's almost like a virtual travel agent who should just take my requirements and, in a few days, come up with a great set of options, and I should then click on one of them. And another agent will go and just get the booking done for me with all the information. This is how the world should become, and would become in the future, for some of these things. The amount of time we as human beings spend on such mundane tasks every single day is just insane. If you count the number of hours we spend doing these things, all of that would get automated and become so much better and frictionless.

These are all things that would get automated through agents, and everyone, I believe, would start using agents for them, and that's going to really change how we all live, how we do business, and how we go about our lives. But again, there will be some things that may not change very much, like this whole idea of browsing: quietly sitting down and thinking about discovering new ideas. You really don't know what you're looking for, but you just want to surf the web. You might not want to delegate that to an agent necessarily. You just want your quiet time where you can go through things, waiting to be surprised. So the discovery aspect of human-computer interaction may not change much, or it may change in a very different way. That's why I think it will become very multimodal, and it will become very task-based.

The main thing for the product thinkers is, given this enormously powerful technology that we have at our disposal, how do we design systems that use this capability to really add the maximum value? Don't start using agents for use cases where there isn't a good product-market fit. Try to use it for cases where there is a clear product-market fit, like some of the examples we talked about that are screaming for agents. How do we take this whole capability, harness this power, and really start using it in places that can truly transform our lives?

Another thing I'm worried about is that although the technology is there, we need to consider the cost. I'm not even talking about the cost of training these large models; I'm talking about the cost of doing inference at runtime, right? As you start building these capabilities, the cost of doing inference is so high, and not only that, the impact of these things on the environment is also very significant. It can actually make the environment significantly worse. I was reading an article today in MIT Technology Review, and it seems like every time you use a foundation model to generate an image, it's almost like recharging your phone. That's the kind of energy consumption that is happening today. So we don't want to be using this technology in areas where it is not adding a lot of ROI and is hurting the environment. Obviously, we have to and will work on technical solutions to lower the energy consumption, but we still have to remain mindful of the cost and the impact on the environment as we start using these things.

Keren: That's an excellent point, and I want to dig into that further at some point, to understand the secondary and tertiary impacts of using this technology and how we can build for them. As I've started building with this technology, I've recognized that thinking in a more nuanced and system-focused way becomes much more necessary. If you want to include some kind of data, you have to understand what that data looks like, and there's a lot more to consider than when you're building a UI product.

Deepak: Speaking of data, right? Security, data privacy. These things also will become much more important because at a high level, we're building a massively federated computational environment, right? But then in a federated environment, it's very important to do the computation in a secure, privacy-preserving way. There will be security loopholes that will come in as we start doing these things, and people can break into these systems. So those things also have to be technologically solved, and these are some hard technical problems that will start coming around as this technology starts getting used more and more across the board. But again, I'm confident we will solve those technical problems as we uncover them. It's a good problem to have as new technology comes along and we use it in ways that add disproportionate value.

Keren: Yeah, another similar problem I've been wondering about is coordination between multiple agents. I think that's a huge problem. And I wanted to see if this feels similar to TCP/IP: is there a world in which, in order to build in a coordinated way, we'd need a shared language for how these agents interact with each other? Is that something that needs to be built in a way that's adopted more widely, or open sourced? Does it feel like a key problem that needs to be solved?

Deepak: Yeah, you’re talking about how the TCP/IP protocol works in networking, right? How your packets get routed from A to B through routers. I think it is very similar to that. But one thing that is a little different here is that the quality and the trust of these agents have to be estimated from data. If you have a particular task in hand, say you want to book an airline ticket and you have three agents available, how do you know which one to assign your task to? That means you have to have some idea, from the past, of which system is more trustworthy or which system is going to accomplish your task in the best possible way. This whole idea becomes similar to recommenders. You have to explore/exploit, you have to learn from past data, and then, based on a particular task, you have to choose the agent that is going to accomplish it in the best possible way. So it is similar to the IP routing you mention, but there are additional nuances since the objectives are different. What makes it very challenging as well is that you have to do it all in a privacy-preserving way. You cannot be sharing very granular data across agents as you are learning these utilities and objectives. That's really the technical challenge that I think will be fascinating to solve once this whole ecosystem becomes very big and we are all using it in our day-to-day lives, just as we use our mobile phones today.
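
To ground the explore/exploit idea, here is a minimal epsilon-greedy sketch of selecting among competing agents using only aggregate success counts. The agent names and the success signal are hypothetical, and a real system would also need the privacy-preserving aggregation Deepak mentions.

```python
# Minimal epsilon-greedy sketch of the explore/exploit idea: pick among competing
# agents for a task using only aggregate success counts. Agent names and the
# success signal are hypothetical placeholders.

import random

class AgentSelector:
    def __init__(self, agent_ids, epsilon=0.1):
        self.epsilon = epsilon                        # fraction of traffic used to explore
        self.trials = {a: 0 for a in agent_ids}
        self.successes = {a: 0 for a in agent_ids}

    def _success_rate(self, agent_id):
        # Untried agents get an optimistic rate so they eventually get tried.
        n = self.trials[agent_id]
        return self.successes[agent_id] / n if n else 1.0

    def choose(self) -> str:
        if random.random() < self.epsilon:            # explore: try a random agent
            return random.choice(list(self.trials))
        return max(self.trials, key=self._success_rate)   # exploit: best so far

    def record(self, agent_id: str, succeeded: bool) -> None:
        self.trials[agent_id] += 1
        self.successes[agent_id] += int(succeeded)

selector = AgentSelector(["flight-agent-a", "flight-agent-b", "flight-agent-c"])
chosen = selector.choose()
# ...dispatch the booking task to `chosen`, observe whether it completed well...
selector.record(chosen, succeeded=True)
```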

Keren: As you're talking, I'm realizing that this problem may be a little closer to search. Let's say you have systems of agents calling other systems of agents. Each of those must need to create some kind of ranking, have some understanding of what's available and signals about which agent could possibly be used for a particular scenario, and all of this could be happening at multiple layers, right? And I don't want to go too deep into the computation cost you described earlier, but it feels like we're actually very far away when you think about how much would have to be understood for each ranking, and the memory issues that would come up, in order to make a system that complex possible.

Deepak: Yeah, I mean, even with large language models, a lot of people are doing this today, right? If you have a particular query, you can get an answer to it from multiple models: an open-source one, GPT-3.5, GPT-4, Llama. And if you look at the cost and quality of these, they're very different. For some queries you get the same quality from an open-source model, and there is no reason to call GPT-4, which is several times more costly. So even in that small ecosystem, this problem is already emerging, and people are actually solving it. This is the same problem, but at a much larger scale and with a lot more complexity. Even today in recommender systems, although we don't have agents internally, recommender systems do work that way. Let's say at Pinterest we have 300 billion content items in the corpus. You cannot score all of them to figure out which among those 300 billion are the best. So you have what are called candidate generators in recommendation systems. A candidate generator is like an agent in this context, right? Each candidate generator gives you a set of candidates following a particular heuristic, and then you merge. So this idea of doing federated computation, merging things, and then figuring out the best is already happening in many places. With agents, as you're saying, it also needs to happen, but there will be a lot more constraints than in the computation we do today.
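
For readers unfamiliar with the candidate-generator pattern Deepak describes, here is a toy sketch of the idea. The generators and the scorer are invented placeholders and do not reflect Pinterest's actual system.

```python
# Toy version of the candidate-generator pattern: several cheap generators each
# propose a small set of candidates from their own heuristic, the union is merged,
# and only that merged set is scored by a (placeholder) ranker, instead of scoring
# the full corpus. None of this reflects any production system.

def recent_saves(user, k):        return [f"recent-{i}" for i in range(k)]
def similar_to_saved(user, k):    return [f"similar-{i}" for i in range(k)]
def trending_now(user, k):        return [f"trending-{i}" for i in range(k)]

CANDIDATE_GENERATORS = [recent_saves, similar_to_saved, trending_now]

def score(user, item) -> float:
    """Placeholder for a learned ranking model."""
    return (hash((user, item)) % 1000) / 1000.0

def recommend(user, k_per_generator=50, k_final=10):
    merged = set()
    for generator in CANDIDATE_GENERATORS:   # each generator acts like a small "agent"
        merged.update(generator(user, k_per_generator))
    ranked = sorted(merged, key=lambda item: score(user, item), reverse=True)
    return ranked[:k_final]

print(recommend("user-123"))
```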

Keren: It will get even more complex. Think about the concept of auto-agents: agents spun up in the moment. That doesn't happen with content. Well, I guess that doesn't happen with content yet; obviously, it could. But that brings up a whole new layer where you're not actually ranking anymore. You could be creating the agent that's best suited for the moment, and honestly, maybe that's actually a solution to the problem. Maybe we don't need to do a ranking of existing entities. Rather, we could use the information from the initial agent that initiates the request to generate whatever agent is right for that particular task.

Deepak: That's true, and you can do that, but then how do you maintain these things over time? And anytime you add something, you add risk to the system from an adversarial perspective. But yeah, that's a good point.

Keren: Absolutely. Yes. I know we don't have too much time left, so I wanted to switch gears slightly. We've been talking a lot about how agents are going to vastly impact every product on the internet today. Given your role, if you were advising companies, let's say medium to large companies, on where they should invest in research versus where they should invest in product development, given the consolidation around these few large language models that are costly to train, how would you advise them to place their bets?

Deepak: Yes, I'll go even beyond that. I think this is more of a cultural shift in thinking that's going to happen. At a high level, what foundation models like LLMs and AI agents are doing is providing a very simplified way for businesses and enterprises to combine world knowledge with their enterprise knowledge and create much better solutions for their customers and their users, which is extremely powerful. It's not like businesses and enterprises were not trying to do that before. If you look at the LinkedIn Knowledge Graph or standardization team, that's what they were trying to do, but it's difficult, right? Because you're trying to extract all this knowledge from outside using a combination of technology and human curation tied to fixed ontologies built by experts in sub-domains. All of that has now become so much easier. So businesses who fail to take advantage of this paradigm shift in the technology landscape, and who are late to this game, will be left behind very soon, within a few years or a few months (given the pace of this field, who knows whether it's months or years). This is not optional. This is a must-have for every enterprise. They have to think about seizing the opportunities, and about creating a culture in their teams and organizations where everyone starts thinking about the problem they're solving in a very different way, given the powerful tools and the world we are in.

It's kind of similar to the data culture, right? A decade or so ago, data became the new oxygen for enterprises, and every organization had to transform itself to become a data-first company; then came the ML-first culture; mobile was a similar transition. Like those, this is not optional. Business leaders have to really put in the effort to invest in this area, create teams to build the tools, and then infuse that into the culture of how they develop products in the organization. And it's not just the user experience; internal productivity is changing, and will continue to change, with AI co-pilots. You can save a lot of money through internal productivity boosts by embracing these tools and then invest those savings to enhance the user experience; that's not a bad strategy. So it's not even something where you have to add additional funding if you go about it in a smart way. You can save in one place and reinvest it, helping create new user experiences that will transform your business in a strategic sense. Maybe initially you will not see the benefits right away, just like what happened with the data transformation, but after a while the payoffs are going to be significant. That's how I think business leaders should think about this problem. You either disrupt yourself or be ready to be disrupted.

Keren: I think I was trying to be gentle in my question, but that is exactly what I was alluding to. I completely agree with you. I think it's an imperative for people to spend time visualizing a world that looks more like this.

Deepak: Yeah, I mean, the technology isn’t perfect, right? People talk about hallucination, and there are some areas where you have to be more careful. Even on this topic of hallucination, we often hear that models have to get better. But at the end of the day, it also depends on the use case. For some applications, if models hallucinate, it might actually not hurt much because the cost of sporadic mistakes is not high. But if you are a medical professional and your model hallucinates, that could become a disaster, right? So the cost of a wrong suggestion also depends on the end use case. I'm not suggesting everyone can just rush in and embrace it very quickly. In cases where wrong decisions can have significant consequences, you have to be more careful, and those areas will take more time. In other areas, you can adopt it much faster and get the payoff much faster. But nonetheless, everyone has to think about it, because the pace of innovation in this area is such that whatever problems are emerging will get solved very quickly. And now, with such a vibrant open-source community working on it, I think this will only get better, even faster than what we have seen in the past few years.

Keren: Thank you so much. I really appreciate it. It's been an awesome discussion. 

Reader comments

Your discussion reminds me of the Semantic Web, or Web 3.0, proposed some 20 years ago. The key idea was to populate the Web with machine-processable information and to have intelligent agents execute tasks on behalf of humans. The proposed technologies were explicit metadata, ontologies, logic and inference, and a number of standards were proposed to support interoperability and communication between agents (RDF, DAML+OIL, OWL, etc.). I wonder if those ideas and technologies are reusable with the new generation of agents that we are discussing today.

Jonathan Rochelle


Great discussion and post! The "standard for agents to be able to interact with each other" really stands out as something both SO important and, in many cases, associated with the API tooling work that so many products have sprung up to support, particularly in the last decade of "No Code" tools. Example: my co-founder and I initiated the work on Google Apps to develop Apps Script (2007-ish). It was a pet project, but the goal was to make the features of all the Google apps accessible to code, much like VBA did in the MS-Office world, but with the advantage of web-based access to everything from Sheets (!) to Maps, to Finance, to GMail, etc. Apps Script became the foundation for later No Code tools and loads of 3rd party opportunity to tie Google Apps to other web-based products, THOUSANDS of them. That's where Zapier (and others) thrived and continue to thrive... AND (finally getting to my point ;) that's how Zapier (and competitors) will be able to quickly make the huge leap (from a value perspective) to Agent Creation and even simpler workflow automation. The "standard for agents to interact with each other" can somewhat be leveraged via APIs in the near term, and hopefully become more sophisticated and less fragile.

Shelby Layne


"Yeah, so the biggest step that really has made it very exciting is the fact that anyone can sit for a few hours and create an agent." Still trying to process the reality of this and how we each can and must adapt to this truth.


Such a cool conversation on AI Agents. Thank you Keren Baruch for your awesome podcast, and for interviewing Deepak Agarwal on this topic. Some of my takeaways:

1. We still need a robust definition for an AI Agent. Sometimes we think about avatars who can work on our behalf, other times about co-pilots who help us be more productive, other times about coaches who help us learn and grow, and other times about specialized agents that perform specific tasks like automatic bidding in online auctions, or driving our car (these are Deepak's examples in this piece).
2. The analogy with the TCP/IP and Web stacks is exciting. It conveys the scale at which AI Agents are going to impact everything we do, and highlights some of the challenges like orchestration, interoperability, security, privacy and all that.
3. The human in the loop question is fascinating: what are the touch points where I make decisions vs. these are automatically made for me by the orchestrator and the agents?

Candace McGeer


I am truly excited to see how AI in general is being integrated into so many platform solutions to make life simpler; it's really going to be a great investment for any company in the long run.
