Tomasz Tunguz’s Post

Tomasz Tunguz

If I asked you, “When someone turns in a work assignment, how accurate is it? 80%, 90%, 95%, or perhaps 100%?” We don’t think this way about coworkers’ spreadsheets. But we will probably think this way about AI, & this will very likely change the way product managers on-board users.

When was the last time you signed up for a SaaS & wondered: Would the data be accurate? Would the database corrupt my data? Would the report be correct? But today, with every AI product tucking a disclaimer at the bottom of the page, we will be wondering. “Gemini may display inaccurate info, including about people, so double-check its responses” & “ChatGPT/Claude can make mistakes. Check important info” are two examples.

In the early days of this epoch, mistakes will be common. Over time, less so, as accuracies improve. The more important the work, the greater people’s need to be confident the AI is correct. We will demand much better than human error rates.

Self-driving cars provide an extreme example of this trust fall. Waymo & Cruise have published data arguing self-driving cars are 65-94% safer. Yet 2/3 of Americans surveyed by the AAA fear them.

We suffer from a cognitive bias: work performed by a human seems more trustworthy because we understand the biases & the limitations. AIs are a Schrodinger’s cat stuffed in a black box. We don’t comprehend how the box works (yet), nor can we believe our eyes when we see whether the feline is dead or alive.

New product on-boarding will need to mitigate this bias. One path may be starting with low-value tasks where the software-maker has exhaustively tested the potential inputs & outputs. Another tactic may be to provide a human-in-the-loop to check the AI’s work. Citations, references, & other forms of fact-checking will be a core part of the product experience. Independent testing might be another path.

As with any new colleague, first impressions & a series of small wins will determine how much trust the AI earns. Severe errors in the future will erode confidence that must then be rebuilt - likely with the help of human support teams who will explain, develop tests for the future, & reassure users.

I recently asked a financial LLM to analyze NVIDIA’s annual report. A question about the company’s increase in dividend amount vaporized its credibility, raising the question: is it less work to do the analysis myself than to check the AI’s work?

That will be the trust fall for AI. Will the software catch us if we trust it?

Khyati Sundaram

CEO | Applied | 2x VC backed founder

2mo

The more risk in the use case, the higher the accuracy needs to be. If it’s healthcare, it’s 99.99%. If it’s deciding who gets the job, it should still be 99.9%. Hard to build unless systemised and trained for the sole purpose of those use cases. And we all know algorithmic aversion is real. Really curious how this lands with users - ultimately, will users care, and by how much?

Henrik Fabrin

Head of AI | Tech & AI Entrepreneur & Leader | 1 exit, 2 bootstrapped & profitable, 1 failure | Son of a Ninja | Always curious

2mo

These two paragraphs are super important for understanding where we are today: 1) "In the early days of this epoch, mistakes will be common. Over time, less so, as accuracies improve." 2) "We suffer from a cognitive bias: work performed by a human is likely more trustworthy because we understand the biases & the limitations." P.S. I didn't know Schrodinger’s cat was so adorable.

Mostafa Megahid

Senior Product Manager @ Instabug | SaaS, B2B, Developer Tools

2mo

Agreed 100%. It's going to be a long road before we trust AI to be the main driver and decision maker rather than a handy co-pilot.

Rafael Vasconcelos

Building the experience for entrepreneurs and investors in a new capital market

2mo

I have also raised this question. Great discussions generate improvements in these systems!

Luke Shalom

CEO @ Grow Solo | I help the C-suite turn industry expertise into predictable pipeline with Content-led Outbound

2mo

We need good strategies to make sure AI works right and users trust it.


Love the parallel with humans - a great way to be more tolerant towards new AI technologies. Like any person starting a new job, one should expect errors, misses, and under-deliveries in the first months; then a learning curve occurs and the results can go up to 100% and beyond. Obviously the ramp-up phase is quicker if the person has prior experience. And as with a new colleague who is different from others, we all need to learn to work together in a productive way (balancing the gain from time saved against the time lost to checking and the cost of the AI).

Adam Smith

Co-founder @ Workbounce

2mo

Which financial analysis AI did you use? I built a version for myself that has an accuracy linter and it works well.

James Meaden

Psychometrics + Artificial Intelligence = Better Behavioral Science

2mo

"is it less work to do the analysis myself than to check the AI’s work?" - great point

Having the ability to 'double check' with a human when less sure vs always being confident will be a big milestone for AI systems building trust. People trust other people to admit when they don't know something important and to learn from their failures. We have spent hundreds of years building guard-rails preventing human caused catastrophic failure and have some well defined ways to verify a person being 'credible'. AI can fail at the scale of traditional computer systems but without the determinism of processes and data that make guard rails easy to implement. None of our human credibility signals work with an AI bot. And right now our AI systems don't even know how confident to be in their answers, so how could a human?

Vasu Prathipati

Chief Quality Officer - raising the quality bar

2mo

