The Business of Health with Chip Kahn

AI: As Much Peril As Promise?

May 19, 2026

About this Episode


Episode 4, AI Series: What does AI mean for patients in bed and doctors at the bedside? Host Chip Kahn and guest Dr. Robert Wachter, Chair of the Department of Medicine at the University of California, San Francisco, discuss whether AI will produce a different kind of doctor in the future — a “clinician curator rather than a clinician-diagnostician.” The answer could define the future of medicine and the doctor-patient relationship.

The Host


Sr. Visiting Fellow

Charles N. Kahn III is a senior visiting fellow at KFF. He is also a visiting senior fellow at the American Enterprise Institute and a nonresident senior scholar at the University of Southern California’s Schaeffer Center for Health Policy & Economics. He serves as co-chair of the international Future of Health collaborative.

Guest


Professor and Chair of the Department of Medicine at the University of California, San Francisco (UCSF)

Robert Wachter, MD is Professor and Chair of the Department of Medicine at UCSF. He is past-president of the Society of Hospital Medicine, past-chair of the American Board of Internal Medicine, and an elected member of the National Academy of Medicine. In 2004, he received the John M. Eisenberg Award, the nation’s top honor in patient safety. Modern Healthcare magazine has ranked him as one of the 50 most influential physician-executives in the U.S. more than a dozen times; he was #1 on the list in 2015. He is the author of the books “The Digital Doctor,” a New York Times bestseller, and “A Giant Leap: How AI is Transforming Healthcare and What That Means for Our Future.”

Transcript


AI Usage Disclosure: This transcript was created with assistance from AI tools. It was reviewed and edited by KFF Staff.

Chip Kahn: In our first three episodes, we covered the strategic landscape, the question of whether AI represents a true paradigm shift in healthcare and a real application at the frontline with Aidoc. This conversation steps back from technology to ask what all of it means for the patient in the bed, and the physician at the bedside. Our guest is Bob Wachter. He has spent 30 years thinking about what happens at the point of care. He chairs the Department of Medicine at UCSF, coined the term “hospitalist,” and is considered the founder of the fastest growing specialty in modern medicine. His 2015 book, “The Digital Doctor” was the definitive account of medicine’s first digital wave. A story of hope, hype, and harm that resonates directly with the AI moment we’re in today. His new book, “A Giant Leap,” built on more than 100 interviews, tackles what he calls the central question in health care today. Will AI be another digital disappointment or a genuine transformation? His argument is that AI does not need to be perfect. It only needs to be better than a system already failing patients. But the book also confronts the risks that don’t make the headlines. Not just the hallucinations and bias, but the problem of deploying a technology whose fundamental weakness is broad judgment in a profession whose fundamental requirement is broad judgment. At the end of the day, this is all about the patient. But the question that will run underneath the entire conversation is whether AI is leading us towards a different kind of doctor altogether—a clinician curator rather than a clinician diagnostician. The answer could be defining for the future of medicine and the physician-patient relationship. Much will be gained and much could be lost. Bob Wachter, welcome to KFF’s Business of Health.

Bob Wachter: Thank you, Chip. It’s great to see you. Great to be here.

Chip Kahn: So great to have you here. Let’s get started.

In our conversation about AI and health care, you as well as being at UCSF and having a role there as a teacher, but you are a practitioner, too. How is it different now with the advent of AI, when you walk into a patient’s room at the hospital or when you walk into an examining room for a patient visit?

Bob Wachter: Yeah, well, it’s not true everywhere. I’ll tell you at UCSF when I’m on the wards, because I’m a hospitalist, I will now, if the patient has an extensive history, I’ll now pull out my phone and with the patient’s permission, use an AI scribe, a tool that didn’t exist three years ago, and it will document my note. And I will be looking the patient in the eye and paying full attention to them and the patients notice that. I will click a little button on my EHR and ask it to do a summarization of the patient’s past record. That’s relevant because one out of five patients has a past record longer than Moby Dick. And the idea that I’m going to be able to get through that in two minutes is a joke. It’s impossible. I may ask it to draft the discharge summary, which is very useful if the patient’s been in the hospital for a month. And in many cases, I will pull out my phone and paying attention to HIPAA, I will say to mostly a tool called Open Evidence, which, is sort of GPT for doctors, but sometimes just a GPT or Gemini. I’m an 82-year-old patient with CLL who comes in with a fever and a white count and shortness of breath and has a creatinine of 1.7. What do you think is going on? Something you could not do with any tool that you had until a few years ago. And in some ways I think about it as you know, the term we use sometimes is a curbside consult, that I’ve got a question that I could use a specialist, but I don’t need a full specialist consult where I used to hope that I’d run into my friendly infectious disease doctor in the hallway. And now I will use AI for that purpose. And I don’t think I’m that atypical. I think, you know, UCSF may be a little ahead of the curve, but because we’re in San Francisco, but I think these are kind of relatively typical uses, which is remarkable for a field that tends to be pretty sluggish in adopting these kinds of technologies. 
All you have to do is look at how long it took for us to adopt electronic health records to get a sense of that.

Chip Kahn: You also run a Department of Medicine with hundreds of physicians and trainees. How are you preparing the next generation to practice alongside AI? And what and how are they learning differently than you learned back when you were in medical school?

Bob Wachter: Yeah, I start out with one sort of uber point, which is I don’t think any organization, as great as UCSF is in teaching and research and clinical care, or any individual practitioner will be great in five years if they’re not great at implementing AI effectively. That doesn’t mean doing it blindly, that doesn’t mean stupidly, doesn’t mean being agnostic or ignorant of potential negative consequences. But I think these tools are so potentially game changing that I start out with the point that in some ways I heard from Gianrico Farrugia, the CEO of the Mayo Clinic, when I interviewed him for the book, and I said, you have the best brand in healthcare, you must be worried about making a misstep with this. And he said to me, I think the risks of going too slow are far greater than the risk of going too fast. That’s my belief, and I’m trying to kind of inculcate that culture. Practically what that means is two or three years ago I launched a division in my department of clinical informatics and digital transformation, we call it DoC-IT, to focus on the research and education. Our health system has a Chief Health AI Officer, a job we didn’t know we needed two years ago. In my department I have a head of AI for medical education who’s helping to educate people about how these tools work. We basically say we’re looking at every process that we have and asking how can AI make it better or safer or more productive? And in terms of medical education, I think in some ways that’s the trickiest question. We’re certainly training our students and residents on what tools to use, how to spot a hallucination, how to be a copilot with these tools. I think the hardest question is, are there things that we used to teach that we can take off the curriculum? And by that we’re usually talking about the knowledge of medicine.
All of the time I spent learning the differential diagnosis of 100 different syndromes, or the interpretation of certain lab tests or whatever, do we take those off the curriculum? Because the AI can now do those things. And I think the answer is no for the time being because I think when an expert uses these tools, they use them in a certain way. They know the right questions to ask, they know the follow-up questions. They know when the AI gives them an answer and they say really? Are you sure about that? And then the AI says oh, you’re right, that’s the wrong answer. They also know when the AI gives them an output and says, here’s the differential diagnosis, here are the possible diagnoses, I can look at it and say number one and two, that’s pretty smart. I hadn’t thought of that. Number three. No, that’s crazy. I’m going to ignore that. If we stop training young physicians in physicianship and learning how to make a diagnosis and have judgment and diagnostic reasoning, essentially we’ll turn them back into laypeople and I think that these tools are not ready for lay people to use effectively. So that’s the hardest question. I think we’ve gained consensus on one thing to take off the curriculum. It’s something called the Krebs cycle. We all learn quite painfully in med school. This organic chemistry pathway that we never learned, we never use again. But beyond that, I think the risk of what’s called never skilling, not deskilling, but never skilling, is too high. So, we’re being very careful about taking off sort of foundational medical knowledge out of the curriculum.

Chip Kahn: I’ll come back to medical education in a bit, but before we do that, let’s go to the period pre AI. You had another book, “The Digital Doctor.” You discussed there that the issues around electronic health record dissemination, that on the one hand improved safety, but on the other hand, from everything I heard from those I worked for over the many years, caused tremendous workflow problems. And really you put a system on top of a very fragmented clinical set of encounters and then it had great expectations for it. What were the kinds of problems there? And does AI fix any of those problems?

Bob Wachter: I think it does. The book I wrote 10 years ago, “The Digital Doctor,” was really about health care going from paper to digital. And the main character in the book is the Electronic Health Record. I had high hopes for it. I’d been studying patient safety for a decade, and it just felt like if we could just computerize, get rid of doctors’ handwriting and get decision support that helps suggest the right diagnosis or the right treatment. And then the EHR came in and obviously we were late to the dance. Every other industry had computerized a decade earlier or two decades earlier. And we only did it after being essentially bribed by the federal government with the HITECH Act to pay us money to implement an EHR, which no other industry needed. But I still thought this is going to be great and make care better and safer and improve efficiency and convenience. And it did some of those things. But also a massive number of unanticipated consequences. Patients noticed their doctors weren’t looking them in the eye anymore because they were so busy filling out forms and checklists. We opened up a patient portal. The patients had access to all their information but gave them absolutely no help in interpreting any of it. So the patient would see in their portal that their magnesium is low and their EKG is abnormal and they’d say, what does that mean? And they’d have absolutely no help. And the only help we gave them was a little button at the bottom of the screen that said, click here if you want to send a message to your doctor. So patients being normal human beings, click there. And all of a sudden the doctor had 100 emails to answer after a long day in clinic. So, lots of stuff that we didn’t anticipate. I think some of the lessons are that these tools change the nature of the work and the workflow.
And there’s a long history in technology of what’s called the productivity paradox, where we think the technology is going to magically make things better, and it doesn’t unless you actually change the system around it. And I think, particularly for a system like an EHR, which is so ubiquitous, changes every workflow, every arrangement. We also didn’t recognize that the tools weren’t very good, and that was part of the problem. But I think in some ways they’re unfairly tagged with being the entire problem. The bigger part of the problem was all of a sudden, now there was a mechanism by which the insurance companies, the quality measurers, the malpractice attorneys, could now make the doctor do something because they could look over your shoulder in real time. Doctors A, weren’t used to that, but B, what that led to is a huge amount of additional paperwork, and box checking and all that kind of stuff. At the end of “The Digital Doctor,” I have a chapter, I think it was 27, about where this goes in the future. It’s actually a very optimistic chapter embedded in a very grumpy book. And I had people say to me, like, who was your ghostwriter? And I said, no. I could see how this could work out. My mistake was believing that the EHR was the solution. And what I came to learn was that the EHR was the foundation. That we needed to digitize our information. We needed to get interoperability at least partly right so the information can move around. But it wasn’t the answer. The answer is now what are the tools and changes in process and maybe training and people. But what are the things that take all that digital information and turn it into a system that works better and is more convenient and safer and maybe lower cost? And, you know, that’s sort of the history of every other industry. 
You had to digitize the information before you could have Airbnb or before you could have Waymo or before you could have Netflix or any of those things didn’t flow directly, or Uber didn’t flow directly from digitization. They were things that were built on top of a digital system. So that was my hope. And really the first time, and I think it was not, there was no payoff on that hope for the first five to seven years of the EHR. Cause the EHR provided remarkably little help and decision support. The first time I used ChatGPT on November 30, 2022, a little light bulb went off and I said, this is it. This is the tool, the type of tool that if we get it right, will not only solve the problems from the EHR, but allow us to do things that we couldn’t do, allow us to scale the knowledge of specialists, allow us to look the patient in the eye again when we’re talking to them, because the AI can take our conversation and turn it into a properly formatted note, allow us to keep up with the literature, whereas I can’t possibly keep up with the latest literature because there are a thousand new articles a month. All of those things, I think AI has the capacity to do that. Whether it does it effectively is partly dependent on how good the tools are. But a lot of it now depends on us and how good we are in change management and changing our processes and our training. And in some ways the book is less about the technology and more about our system and how what happens when this technology enters our system and how do we take advantage of it.

Chip Kahn: You know, over that period when EHRs were introduced in hospitals and HITECH was implemented, it was a period at the same time that there were issues around Medicare payment and just the whole bureaucracy of medical practice that I think was causing tremendous dissonance among physicians. And they looked at the EHR that you’re describing and the implementation of it as just another burden, not a workflow assistant. So you think from at least your immediate experience, that AI is going to be more of a collaborator than another burden.

Bob Wachter: I mean, it always can go off the rails and there’s a long history in medicine of getting this stuff wrong. But yeah, I think so. I think that the capabilities of these tools that we never had before, to be able to read an unstructured note, to be able to provide, you know, subspecialty level knowledge and insight and do it in plain English. I can ask a question of it in a way that I couldn’t ask any prior digital tool. Eventually those tools won’t be something I’m using on my phone, but will be embedded in my workflow in the electronic health record, will be in the back office world, will be able to sort of figure out what does the insurance company need for the payment, knows the rules of this patient’s insurance versus that patient’s insurance so that we don’t have to have 1,000 people in the billing department, can help coordinate the care of a patient with cancer, can sort of anticipate things that you’re at risk for and maybe provide guidance directly to patients so that they can do prevention better. Maybe take all the information from cameras in your home or stuff coming off your wristwatch or your ring and make some sense of it and manage it in a way that a human system can’t possibly do. I think all those things are possible. Whether we get it right enough to actually deliver on that, whether it lowers healthcare costs, which may be the dominant issue in our healthcare system, maybe. I think the early evidence on that one is actually not very positive because every side is using it to sort of create a better bill and get a better payment. And, you know, ultimately that may depend on how the decision support works, which is a really complex question. For a lot of medicine, there’s no right answer. You know, here’s two therapies for a patient with cancer. One costs 20,000 bucks. One costs 200,000 bucks. The one for 200,000 bucks improves life expectancy by six months. But one out of 100 people is cured. Does the system recommend it or not?
That’s not a technical question. That is an ethics, value, incentive question. But to the degree that the AI is going to be providing decision support that’s more robust and to some extent more determinative of what I do, a lot of the action here is going to be, who’s the wizard of Oz behind the curtain, figuring out what dial we turn to give an answer to that question. I think those are really complex questions and I think they can go in a lot of different directions depending on culture, history, payments, incentives, battles between providers and insurance companies. All those sort of things, I think, are going to play out in new ways, as is the relationship between patients and providers now that patients have tools that really to some extent dissolve some of the asymmetry of knowledge that they typically had between them and providers. So a lot of things can go wrong, but I think the capacity for a lot of things to go better than they do now is there in this technology and just did not exist before we had this technology at our fingertips.

Chip Kahn: You know, in reading your new book, “A Giant Leap,” I had this feeling that even though we have a general public that’s very risk averse or is not risk tolerant in terms of new technology, that you make an argument that AI doesn’t need to be perfect. It just needs to be better than a system that already fails or doesn’t work for patients. And even though we can say that, it is new. So the question is, what’s the tolerance level here? And what’s your view when you say not perfect? How far can we go and have it become the new reality and the new presence?

Bob Wachter: One of the things that, when I started writing the book, you know, my editor and my wife, who’s a very accomplished author, said, you know, you’re going to have to try to figure out, how do you write something that’s not out of date five minutes after it comes out in a field that’s moving this fast? And really pushed me to think, like, what are the big picture issues that we’re going to confront here? I think you’ve captured one of the biggest. I use Biden’s old line in the book: Don’t compare me to the Almighty. Compare me to the alternative. It may not be perfect, but it still might be a lot better than status quo. The status quo is like, try to find a primary care doctor in San Francisco. It’s nearly impossible. Try to find a mental health professional in San Francisco. And if you do, try to find one who’s less than 300 bucks an hour, nearly impossible. That’s the status quo. You have a new diagnosis of cancer, and you’re just overwhelmed by this system that between oncologists and the infusion center and the insurance company and all, like, how do I make sense of it? That’s the status quo. I think we should try to be comparing it to that as opposed to some mythical state of perfection. But that said, it’s natural to hold technology up to a higher standard. I use the example of Waymo a lot in the book because A, I live in San Francisco, so I take a Waymo about once a week. If you told me 10 years ago that I’d be comfortable getting in the backseat of a driverless car and taking a nap, I would have said, are you crazy? And yet there is incontrovertible data now that it’s safer than a car with a driver. And yet. And there’s now been over 100 million miles of Waymo without a fatality. It’s staggering. And yet, three or four months ago, you probably know this, a Waymo ran over a little cat in San Francisco. It made front page news. You know, how many Ubers have driven over cats? How many regular drivers have driven over cats? You know, probably thousands.
So it’s a natural tendency. What it says to me is asking people in society to make apples to apples comparisons is a big ask. It’s hard to do, and maybe the wrong ask because it’s a natural tendency to be a little more concerned, partly because the technology can scale errors very effectively. What it means, I think, is we need to start out with use cases where we get quick wins and build trust. And I think that’s happening. For example, I think it was important that UCSF start with using AI to draft a note from my chart rather than start out recommending what treatment I give for a patient with cancer. Because if the latter is wrong, we can kill somebody. And if we kill somebody, that’s going to be a front-page story and that’s the end of AI. Whereas if we start out with chart summarization, drafting a note, writing a prior auth, maybe suggesting diagnoses, but not embedding it in the electronic health record yet, but kind of doing it offline almost the way I’d use a textbook, I think that’s smart because I think you’re building up a reservoir of trust because inevitably at some point it’s going to kill somebody, it’s going to get something wrong, it’s going to kill somebody. That has to happen. And I think if we reach a point where there’s so much trust built up and we’ve made the convincing case that we’re monitoring these systems, and yes, there was a fatality, but in exactly the same situation in the old system, there would have been 10 fatalities. I think that is what you need to resist the inevitable pushback. I guess the final thing I’d say is if doctors or nurses thought their jobs would be threatened, then some of the pushback will be framed in the language of patient safety, but actually be about “I’m worried about my job.” I think one of the happy coincidences for healthcare is, I think for the foreseeable future, I don’t think there are any nurses or doctors losing their jobs.
I think that the unmet needs are so vast that even if this massively improves productivity, I don’t think it’s going to reach the stage where you can just let it run by itself and it’s taking care of patients, maybe treating your cholesterol, possibly, maybe, you know, vaccinations, possibly. But, most of medicine, I think, is still going to need a lot of doctors and a lot of nurses. Will there be job costs? Yeah, I think in the billing department, I think in the call center. But I think in terms of the clinicians who would be the ones to push back and make you scared that this might kill you. The example of radiology is the most salient here. You know, we can’t hire enough radiologists at UCSF in the center of AI, our radiologists are begging for AI to help them because their volume of scans is undoable without it. So even in the field that I think is most vulnerable to job replacement among physician fields, you know, most of our radiologists, pathologists, are saying we need the help because otherwise our job’s not doable. And certainly people in primary care are saying that. So I think there are a lot of kind of happy coincidences, but you’re absolutely right, we have to sort of create enough reservoir of trust that when something goes wrong, the answer is, yes, I know, but it’s still substantially better than the existing system.

Chip Kahn: Sort of to follow along there, your book covers drafting notes, fielding patient questions, recommending treatments, interpreting images and guiding surgeries.

If you had to rank those, where is AI sort of most mature? Where is it hyped? And is there a gap in terms of the reality of clinical care in any of those areas?

Bob Wachter: Well, the reality is on the what I call singles versus home runs, the reality is today at UCSF and probably in hundreds of healthcare organizations around the country, it’s already drafting notes, it’s already creating the bill to send to the insurance company, it’s already contacting patients in a better way than what we had before that. It’s time for your mammogram. In many institutions, it’s doing the first read of your mammogram. Demonstrably better than systems that rely purely on radiologists, which is sort of data that’s come out in the last year or two. So I think there, not hype at all. And I think the real issue is diffusion. I think it really is ready for prime time. As you move toward, you know, surgery in more procedural fields. I think it’s helping kind of at the margin to, you know, think about AI-enabled colonoscopy better than just plain old colonoscopy in identifying precancerous lesions. Surgery, I think, is still pretty early. Some of these tools in robotic surgery can point to, you should cut here and don’t cut there. That kind of guidance, I think is potentially effective. Certainly, we’re nowhere near AI autonomous. You know, the Waymo of surgery. You know, we are many, many years away from that. Where I think there’s probably some overhype, I don’t study this, personally, but what I hear is in the drug development world, you know, AI is going to figure out the cure for cancer and the cure for Alzheimer’s. I think that’s mostly hype today. Will it sort of guide you to potentially effective compounds sooner? Maybe. But the process of drug development, testing of drugs, clinical trials, regulatory process is such that if it shaves some time off that you still have not created a cure for cancer anytime soon. I’d say the areas I worry about the most are in direct consumer-facing AI, where I don’t think it’s hyped because I think the tools are capable of things that are really pretty magical.
But the studies that are coming out showing what happens when a patient uses GPT or Gemini for medical advice, it gets it wrong a lot. And it’s not really the fault of the tools because if I was using it, it would get it right a lot. It is a kind of failure to recognize that for a layperson who does not have expert knowledge to use these tools, they don’t know the right information to put in. They don’t know how to interpret the results. And that’s not the fault of the patients, obviously. They know what they know. But it does say that the tools that we’re going to build to be patient-facing AI tools have to be different than the generic chatbot that you use today. They have to act much more doctorish. First of all, they have to know enough about your past information. And so either they get embedded in the electronic health record or now you can load in a lot of your information into GPT or Claude. You’re going to have to decide if you trust those companies because they don’t operate under HIPAA. So you’d have to trust them with your data. But let’s say you do. So that’s part of the problem. They need to know that. But I think the bigger problem is patients don’t know what the right information to put in is. If I wake up with a headache, how do I frame that? And then if you said to a doctor, I have a headache, the doctor’s going to say, tell me more about it. What part of your head? Are you a headache person? Does your neck hurt? Does the bright light hurt your eyes? But the patient-facing AI in the future has to act much more like that to ask those questions and not give an answer until you’ve had all those iterations. That’s what would happen when you saw your doctor. But some of the AI tools now will just say, okay, it sounds like you have a headache and take some Tylenol and turn out the lights. I think the tools built for patient-facing medical information are going to have to be a next generation. Does that mean you shouldn’t use them?
I think probably they’re better than Google. They’re better than calling your cousin the veterinarian. But, you know, I probably use two of them. I probably put my information into GPT and also Gemini and see if it gives me the same answer. Sort of an AI second opinion. But I think the next generation of AI tools for patients, I think it’s being a little overhyped now because I think they’re not giving the right answer often enough to be completely trustworthy. And some patients are trusting them completely and not going to see the doctor when they really should.

Chip Kahn: And you’re describing the mitigation to some extent. But I understand that there are indications now from some research and discussions in social media that these machines are more empathetic than physicians. Clearly on the mental health side, on the behavioral health side, all kinds of stories about people telling the AI things about their life they would never tell the psychiatrist or the psychologist they were seeking therapy from.

Do you sense that? And I guess you would almost label that as a problem right now, as much as something that AI can cope with?

Bob Wachter: Yeah. I mean, I think there’s two different issues there. One is empathy. One is sort of the degree to which you trust the tool to handle information that you might be reluctant to tell another human.

In my book, I try to take a sort of neutral attitude about the doctor-patient relationship, which is hard for me because I feel like I learned a lot in medical school and residency. I’ve been practicing for 35 years. I feel like there’s something I’ve learned that has utility. And yet whenever I hear somebody say, well, the doctor-patient relationship is sacred, it’s like, I don’t think so. I think it adds value. I think that I don’t want a bot to tell me I have cancer. Or to tell me I need chemotherapy or need surgery. I think that there probably is a lot of utility to it. But sort of saying it’s sacred is a conversation ender that’s designed to say, we don’t need any empirical data about this. It’s carved into a stone somewhere that there must be a doctor who is the source of your medical information. It’s hard for me to not accept that, you know, it hurts my ego. I have a daughter and son-in-law who are doctors. There are a lot of reasons why I think I have a bias that the human adds real value. I think it probably is true in a lot of circumstances. But I think we have to test that empirically. And even if it’s equal, you know, patients, particularly younger patients, may prefer getting their care in a more transactional way. You know, do they want to go to the office and sit there and wait for half an hour to see the doctor for 15 minutes and have to pay a big copay? They may prefer some of the care that they can get from AI. So I think we have to approach this as an open question, one of many, many open questions here about when do patients really need to see a doctor, when do they really not see a doctor? I think what that’s going to cause us to do is dissect out not can an AI replace a primary care doctor, but what are the things that a primary care doctor does? Almost task by task, and what things can be done by this tool safely, more conveniently, probably less expensively, and what things are not like that.
Now we’ve asked versions of that question before. I’m old enough to remember the day where it’s like a nurse practitioner doing this thing, are you kidding me? Or a PA? How could that work? And then we said, well, there aren’t enough primary care docs. We need some other person who’s lesser trained and probably couldn’t handle some complex problems, but is less expensive, more available. And now most of us are fine with that. It’s not without some tension there. But I think this is a version of that same kind of question. Where I think that lands is for mental health care, I think tens of millions of people are finding they are getting value from chatting with a chatbot at a cost of on average $20 a month. Try to find a psychologist or psychiatrist in San Francisco and if you do, it’s $300 an hour. And then every now and then they go off the rails and tell a kid to kill themselves. And that’s awful. And that can never happen. The regulations need to happen there, probably lawsuits need to happen there. So getting the balance right is important. But I just think going in and saying this thing is sacred or has to be a doctor, you know, as I say, I get in the back of a Waymo and I prefer it over a car with a driver. I generally do my, you know, my travel, I do using digital tools. And yet every now and then, when I was going to Vietnam a year or two ago, I needed a travel agent. My tax needs are complex enough and I can afford to see an accountant. But if I couldn’t, I’d be comfortable using a digital tool. I think there are a lot of things that we used to think that’s fundamentally a human task where the technology now has asked us, has challenged us on this. I think medicine’s going to provide that in spades. And I do think there are going to be, yes, the AI can fake empathy really well. It has no empathy, obviously. 
So on the differentiator of “I want to be dealing with an empathic thing that gives me an answer that feels like it knows me and isn’t making judgments,” I don’t think the humans have a slam dunk advantage over the AI the way I would have thought three years ago, because back then they did. There was no AI tool that could do that, that could mimic empathy. But today I think we have to have an open mind about what is the right role of each of these things. But at the end of the day, I still want to see a doctor for complex chronic issues, for things that have a high emotional valence, but that may be because I’m an old guy.

Chip Kahn: So clearly from our discussion, you’re on the augmentation side rather than the replacement side, at least from the get-go. And then you’ll wait and see, depending on what it is.

Bob Wachter: But also, I think, task by task. So, as you know, an AI company was just given permission a month or two ago in Utah to refill meds on its own without a physician looking over its shoulder. It’s a discrete set of less dangerous meds. There’s an escalation pathway. If the patient says, I’ve had a reaction to the medicine, it boots you off to a doctor. But I think that’s great. I think we gotta test that. No patient wants to see a doctor for a refill. No doctor wants to see a patient generally for a refill. So I think that it’s almost a case-by-case thing. And I think we’ve gotta be careful here. But yeah, I think to make the job of primary care doable, I think we’re going to have to say there are certain things that you can see an AI for your cholesterol management or to decide whether you need to be on Wegovy or Zepbound. I don’t see that as being something that you absolutely need a physician to do.

Chip Kahn: So, moving to the risk area, there’s a lot of discussion about hallucinations. And frankly, at least from my view, that’s a technical issue. And over time, they’ll be reduced.

Bob Wachter: It already is substantially better than it was three years ago.

Chip Kahn: But what are other risks that may be less obvious and really serious? And, in a sense, you know, what keeps you up at night in terms of those kinds of risks?

Bob Wachter: Deep fakes and security risks are probably the main things that keep me up at night. So, you know, I started the book sitting with the CEO of the Mayo Clinic, who showed me a Mayo deep fake of a Mayo physician talking to a patient beautifully, empathically. And then behind this doctor walked in the real doctor, who waved to the camera awkwardly. And so the idea of using this technology to scale the expertise of a UCSF doctor, a Mayo doctor, is thrilling. You know, think about rural areas that have no access to that kind of thing. On the other hand, with the exact same technology, you could take what I’m saying to you and have me say, you shouldn’t get vaccinated against anything because they’ll kill you. So that scares the hell out of me. How we deal with misinformation and disinformation in an era where now the technology can make anybody, even the most trustworthy person, look like they’re saying anything. And it’s completely undetectable. I don’t know how we get ourselves out of that box. The worry grows the more AI is doing not just decision support but, particularly, acting autonomously, built into your pacemaker, your defibrillator, or your insulin delivery system. The idea that someone can hack into that and change the algorithm is potentially fatal. I mean, there are lots of things that, yeah, it gets more powerful. You know, we’re seeing versions of this in warfare. As it gets more powerful, it’s exciting in many ways and really scary in other ways. Those are the things that scare me the most. Hallucinations, less so, because I think it’s just gotten better and better. The tools are better and more trustworthy. I worried a lot about explainability in the early days. I think it turns out to be, to a large extent, a nothing burger, as I tell people. I can’t explain to you how Tylenol works or anesthesia works, but I know they work, and I use them.
I think for physicians, explainability turns out to be less important than empirical evidence of trustworthiness. Where I do worry about explainability is research: researchers trying to get to the root cause of why this cancer is growing on this chemotherapy do need to understand mechanisms. So I think we may need different AI for different purposes. Bias. Yeah, I worry about it a little bit, but not really any more than I worry about bias in our current system. In some ways, the bias of AI is just parroting the bias of humans. And I think probably the AI is easier to fix than the humans are. So I’d say really, security and deep fakes and misinformation are the two that keep me up the most at night.

Chip Kahn: So along those lines, even a vigilant clinician really can’t be responsible for the reliability of the AI. There’s got to be some other safety model that assures that because the physician’s not a technician, the physician’s using a tool. And if the tool as you’re describing turns out to be not trustworthy, what do we need to have in place so the physician can be confident and obviously the patient can be confident?

Bob Wachter: Yeah, I mean, we have a chicken and hen house problem, which is probably the only way we are going to do that is with AI monitoring AI. We’re going to have to have systems that not only have mechanisms to decide is this thing trustworthy enough to bring into a health system, where I’m reasonably confident that I look at a system like mine, we’ve got a very robust governance process before we turn on an AI tool. We have a lot of incentives to get it right. We don’t want to get sued. Our brand is important. We have a moral and ethical obligation to do the right thing by our patients. And that means that we’re relatively conservative about, you know, we’re only going to bring a tool in if we’re sure it works. And yet there have been a lot of cases where the AI seemed to work on day one. But over time, maybe the patient population changed, maybe the literature changed about what the right thing to do. So you have to figure out a mechanism to monitor how it’s working over time. That’s got to involve AI. And the reason I say that is, you know, I chaired our patient safety committee for a long time. We would implement all these fixes after there was a bad error. You know, how often did we go back and look a year later to see whether the fix was still working? The answer is not very often. And I think if you rely on the human system, you know, we’re going to have hundreds of AI tools doing stuff. And to say, you know, we’re going to have to have the human, the quality department is enough to go out and look at 100 charts of patients, measure how it’s doing. I think that’s not going to be feasible. So I think you’re going to have AI monitoring AI and, you know, hopefully we’ll set it up right, to be sure that it’s useful. But, you know, and this gets to the issue of how we regulate it. 
I think there are enough built-in guardrails, you know, an AI embedded in a currently regulated machine, a defibrillator or respirator, clearly has to be approved by a respected system, whether it’s the FDA or somebody else, that is designed to say this thing is safe and effective. And also probably some new standards for health systems to say, what are you doing as a system to be sure all of your tools are safe and effective? Because there’s just no way the FDA has the capacity to look at a thousand different tools that you might be using. So some of it is going to be, what are the standards for the scrutiny of a system, to adopt best practices to do that. I think the thing I worry about more than the tools that UCSF is deciding to bring in to use, because I think also, you know, physicians or nurses will also have some sense that the thing’s not operating correctly. Not perfect sense, and I wouldn’t completely rely on it. But the thing I worry about more is patient-facing use, where a patient has no real ability to tell whether the thing is giving them correct information or crazy information. And there I think we’re going to need some regulatory framework that’s better than the patient just kind of hoping that this tool works correctly. There’s a lot of mischief that can happen there. It’s not just that the tools may be wrong, it may be that there are conflicts of interest embedded in the tool. The tool’s giving you an answer because the drug company or the device company paid some money to somebody. That’s a risk in health care organizations too, that we’re going to have to figure out how to monitor. So there are a lot of things that can go wrong here. I’m more worried on the patient-facing side than I am on the health care organization-facing side.

Chip Kahn: We also, when we look at AI, see systems that can do straightforward things or respond to clear questions or take data and do amazing magic. But at the same time, at least right now, before we get to some later generations, they don’t have judgment or necessarily context if they’re not given the context in the prompt. How do we deal with that? What criteria do we use in terms of using these tools when they lack that, in terms of protecting the patients in ways you just described?

Bob Wachter: Well, first of all, some of them, I think it’s probably directionally correct that we’re going to have to monitor and maybe assess and regulate these tools sort of the way we do physicians. You know, can it pass the appropriate test to demonstrate that you have both the skills and judgment to make the right calls? So, the problem is we’re not all that great at doing that for physicians either. It’s not like we have a perfect, and I say this as a former chair of the American Board of Internal Medicine, it’s not like we have a perfect system for assessing is your doctor any good. But in some ways, I think the system has to resemble that more than just a pure technological assessment of its performance in a laboratory. It’s got to be sort of more in the real world, kinds of cases and circumstances. I think the, you know, I have a chapter in the book on regulation, thought really hard about this. You know, how do we get the balance right? And I came out with the really brilliant conclusion that this is a hard problem, we’re going to have to be very creative. Meaning, I have no idea. I mean, I think this is really, this is really a tough one. And the problem on the regulation side and the assessment side is I don’t think we even have the right models to think about this. In other words, to say, oh, we have an organization that regulates stuff that we do in medicine that could hurt people, for safety and effectiveness, called the FDA. I would be fine with the FDA regulating a new tool embedded in a CAT scanner or a ventilator, but to regulate decision support tools or predictive analytics that tell me this patient has a risk of a fall or being hospitalized or developing Alzheimer’s in 10 years, I think the FDA is a square peg and a round hole problem. I don’t think it’s the right even way to begin thinking about it. 
And if we were a more mature society, we’d be having a really hard conversation, not just in health care, but in everything about how do we get the Goldilocks problem right of, you know, we don’t want to slow this down because it’s potentially incredibly useful and because of geopolitical considerations. On the other hand, we don’t want to go too fast and get it wrong and hurt people. And we would be thinking, almost starting with a blank piece of paper, you know, what does the regulatory structure look like? How much of it happens at the level of how we regulate the doctor who’s using it, how we regulate the health system that’s bringing it in, how we regulate the AI company that’s selling something directly to patients. There are so many parties here, the drug companies, all that kind of stuff. I don’t think we’re even beginning to have that conversation, which is upsetting because you know we’re going to end up in the wrong place.

Chip Kahn: Let’s turn to education. And you made the argument that there are some things you can drop off, organic chemistry or whatever it was, but that basically, the students at the medical undergraduate level still have to have the basics of understanding the science. But what happens in residency when trainees are reasoning clinically alongside an AI that reasons faster than they do? How do you make sure they actually learn to think and they learn how to collaborate with this thing?

Bob Wachter: Yeah, in point of fact, I think much of that will happen kind of organically as they use these tools and play with them. I think our job as educators is partly structural and partly, almost moral. I mean, we really are trying to inculcate the message that if you become overreliant on these tools, you will get stupider. They may make the argument that the tools are smarter than I am and therefore I’m learning from them. There’s probably some of that going on. And in the same way, it’s analogous to a lot of other fields of endeavor. It’s one thing to write a draft of your Substack on your own and then put it into Claude and say, can you help me make it better or critique this? It’s another to say, Claude, can you write this for me? And I think one probably makes you better, one probably makes you worse. And so part of this is inculcating with our trainees: you come up with the diagnosis first, say what you think is going on, and then put it into OpenEvidence and say, what do you think? Did I miss anything?

00:45:00

Bob Wachter: Is there anything that might kill the patient that I didn’t think of? I think that’s very healthy. So that’s sort of almost moral and appealing to their own sense of trying to be as good as they can be. The second though I think is probably structural. We have now on our committees, that make a judgment about bringing AI in for various use cases, we now have someone to represent the education world. Because I think that prior to about a year ago, the group of people who did that were clinical leaders who said, well, the scribe or the AI chart summarizer, it’s good enough, we’re going to turn the switch on. And there was no one there to look at that issue through an education lens and say, yes, that’s probably fine, but for medical students, we don’t want that turned on yet. We want them writing their own note for a year or until they pass some level of competency assessment and then they can turn it on. It’s a little tricky to withhold from our trainees what we think is the state-of-the-art tools to make care better. But I think as we balance these things, you know, because there we’re not talking about deskilling, we’re talking about never skilling. And you know, to do that, some of it is going to be sort of moral imperative, but some of it is going to be sort of how we think about the tools. And it may be that we use them differentially in our trainees. And then on the other hand, there’s a lot of stuff that the tools can give to trainees that we couldn’t do before. So how can I tell if my trainee does a good job talking to a patient during a patient interview? Do I sit there and sit and watch them for 20 minutes? I don’t have the time to do that. I often would judge how they worked with the patient by how they presented the case to me. It’s like asking someone to do a piano concerto and come out and tell me how it sounded. Whereas now I can have AI listen to their conversation with a patient and critique them and give that data to me as well. 
So there’s the capacity to give coaching and feedback in new ways that actually will improve their performance. There’s the capacity to look at how their caseload is and say, you know, I’ve seen a ton of patients with heart failure, but you haven’t seen a patient with lupus. We need to put you through some simulated case with lupus to be sure you’re good at that. There’s all sorts of wonderful things that can happen in education, but I do worry a lot about never skilling and deskilling. And I think we have to be super intentional about that.

Chip Kahn: So along these lines, what will the physician of 2035 or 2040 look like? Will it be a clinician curator rather than a clinician diagnostician?

Bob Wachter: Well, I think the correct answer to anybody asking about 2040 is I have no idea. We hope we’ll be there. That’s the first point. Right, exactly. But, you know, and I have a daughter and son-in-law who are pulmonary fellows. So I’m, you know, I’m thinking a lot about what does this look like 20 years from now. I think there are so many moving pieces here, it’s almost impossible to project. I think the direction of trade is relatively clear that I think there will still be physicians, I think they will still have important roles. I think the level of complexity that they will be operating in will be substantially higher. Meaning the easier stuff will be taken off their plate and will be done primarily by AI. And therefore the stuff that they’re doing are the cases that are particularly complex and certainly with the assistance of AI. And what that means is they are more orchestra conductor, they’re more sort of pulling things together, pulling together the right team of people to make the right decisions and do the right things in complex patients. But you know what this looks like that far out. You know, I think your X-rays and your pathology slides will be read by AI, maybe with one person overseeing the whole process. I think there will be still humans in the loop for high-stakes decisions. Starting chemo, you need to go to the OR, you need to go to the ICU, but the AI probably being able to express its degree of certainty. So in some ways signaling to you that here’s what I recommend, and here’s what the AI is saying. Here’s what I recommend in green because it’s sure that in the last hundred patients, this was the path where they did the best. And otherwise signaling to you, here’s what I recommend, but I’m less sure. And then it comes to you in yellow to signal to you that as a human, pay a lot of attention to this one, because I’m not sure. But you know, beyond that, I have no idea.
I think it’s just going to be so hard just thinking about what the last three years have looked like. ChatGPT got rolled out on November 30, 2022. And my mind was blown that day. It’s what I call, or what has sometimes been called, AI vertigo. I’ve had moments like that about every six months where, oh my God, I can’t believe it. You know, in the early days, okay, it can pass a test, but it can’t reason. And then there were reasoning models that could explain how it came to an answer. Okay, but you know, it can’t solve a complicated case. Okay, it can. Okay, you know, at least we have empathy. Well, it’s actually at least as good, maybe better than we are in empathy. And that’s over three years. So I think it’s hazardous to try to predict anything beyond maybe five to seven years.

Chip Kahn: So considering all that, you frame the book as a story about human choices. Who is making those choices right now? And do we have the right people at the table to make the kind of choices that are implied by the discussion we’ve had over the last many minutes?

Bob Wachter: I’m worried about that because I do think there’s a lot of value judgments built into some of that, particularly as decision support gets more robust. And probably not for a while will it become agentic, meaning it’s operating on its own. Except for the very simple things of vaccine, yes or no, statin, yes or no. But as the decision support becomes more robust and we trust it more, there’s just a lot of values underlying what those recommendations are. So you’re going to want to have experienced clinicians in the room as you decide how to tweak the dials, to decide what the answers it suggests are. You may want to have some ethicists in the room as you sort of grapple with that. I’m a little worried that a lot of this is being determined by the technology companies themselves and particularly as they go direct to patients. I think democratization of care is a good thing generally. But you now have in the hands of people that are primarily running for-profit businesses, decisions about tools that are, you know, they will say are not playing doctor, but really to some extent are, to the degree the patients are trusting them with their medical data and trusting them to help make decisions for them. So, you know, that said, I was on Google Health’s advisory board in 2007, and this is Google, you know, they can do anything. They have unlimited money and unlimited expertise. I remember Eric Schmidt coming into the room and dissolving our board. He said, this is too hard a problem for us. But there’s a long history of tech companies coming into health care saying, we know how to fix you because we fix financial technology or entertainment or travel. I think they’re smarter now. I think there are very few of the companies trying to do things in medicine that don’t have a lot of medical advisors or nurse advisors sitting at the table making those decisions.
I think health care delivery organizations are less naive and less likely to fall for the hype and the PowerPoint slides. So I think in many cases we do have the right people in the room. But this is also happening in an era where the financialization of health care and the role of private equity has become bigger. I worry that these tools are going to be driven to achieve the best financial outcomes, not necessarily the best health care outcomes. So all of that is partly who’s in the room and partly what the incentive systems are and how these things get paid for. There’s a huge amount of complexity embedded in the system, and these tools fix some of it, but they don’t fix some of the fundamental problems of how we pay for health care. And are you paying for doing more stuff or billing better, or are you actually paying for better care and better health outcomes? This is not a magic bullet for those things, and in some ways it could make them worse if it gets embedded in the system in ways that have the wrong values.

Chip Kahn: You discuss those issues in the book: cost, payment, dysfunction, inequality, which we talked a bit about, and the treatment-over-prevention bias in American healthcare generally. Do you think AI can really help us find a pathway on these issues? Or is it a tool that will help clinical care, increase understanding of disease, and such, but at the end of the day, these problems are still going to be around and be bigger than any kind of technology?

Bob Wachter: I think I’d lean toward the latter. I mean, I think that to the degree that the system gets help by sort of getting rid of a lot of the friction, people spending hours and hours writing prior auths and sticking them in fax machines just like wild silliness. The time I spend documenting the note, as opposed to actually looking the patient in the eye and paying attention, keeping up with the literature and suggesting the best treatment for a given disease or the best workup for a given disease and the most cost effective. I think those are things that are all tractable problems that AI can deal with. I also think it would be really hazardous to bet against each of the stakeholders in the system using these tools to their maximum economic advantage. And I will put physicians on that list. I will put academic health systems like my own on that list. Insurance companies, pharma companies, the government. You know, it would violate the rules of both politics and human nature to say powerful tools that can be used to deliver an ROI to our organization, if that ROI does not make healthcare better, safer, less expensive, and more convenient, but simply makes my organization more profitable. I think it’s hazardous to bet against the organizations using AI for that purpose. It feels like that would violate everything I know about how humans and how organizations operate. So if you’re going to fix that, you’ve got to fix the payment system and the way organizations are incented, and I don’t see anything about AI that just automatically does that. So that’s upsetting. And I think we’re seeing a little bit of evidence for that now that as organizations adopt AI, it does not look like the cost of care is going down. It looks like the cost of care is going up because we’re all using it to create a better bill. And to the extent that we’re being paid more for a better bill, we’re going to be paid more for a better bill.
And you sort of can’t blame the organizations for doing that. That’s the incentive system they’re operating under. So, and this may be the thing that causes us to scrap the whole thing, but we’ve both been around long enough to remember where health care costs are unsustainable because they’re 13% of GDP. And obviously that was wrong because they’re 20 now. So it’s a bad idea to bet against this thing just going on and on and on, kind of on the same path. And I don’t think there’s anything about AI that naturally just makes that better.

Chip Kahn: Actually, the unsustainable word was used prior to me coming to Washington in 1979. The Nixon administration said it was unsustainable.

Bob Wachter: Obviously that was wrong because we more than sustained it.

Chip Kahn: So just to close out, you’ve predicted AI would usher in something of a golden age in health care, and maybe we’ve covered this. But to sort of conclude, what has to be true for that prediction to come to fruition?

Bob Wachter: I think the current conditions are good enough for that. I think the tools in some way, so obviously can help with some of the bureaucratic burden of just the paperwork and all that can clearly help by providing me as a generalist with subspecialty level knowledge, with a tool that’s at this moment, free. You know, it’s an advertising model. And is that a golden age? That might be a slight overstatement, but I think it clearly, without a whole lot of tweaking of the system writ large today, makes my job easier, better, makes me a little bit smarter than I was, should make care more convenient for patients, more accessible, easier to navigate the system, allow them to get answers to questions that they couldn’t get answers to before. So as long as we don’t completely screw it up, I think that improving the system for the clinicians and for patients, it’s not quite inevitable, nothing’s exactly like that, but feels like it’s almost a slam dunk. These larger system problems, the things that could go off the rails, include if it does start doing a lot of job replacement, which you could argue needs to happen if we’re going to lower health care costs, there’ll be massive pushback. And I suspect in the next few years, most of the labor actions in the United States will be about AI and jobs. If we hit 15% unemployment in the U.S. there’ll be a revolution. So there’s a lot of sort of moving parts here that you might say to make care better and safer, the AI coming in to replace certain human activities, if that starts hitting jobs, particularly in the most politically powerful parts of the health care economy, there’ll be enough pushback to slow that down in ways that might not be healthy for patients or healthy for the system, but are kind of inevitable in the political environment. So a lot of things that can go wrong here. 
But I think in the next few years, what I’ve seen in the trajectory of the last three years, and I look at what I can do and what patients can do, the tools they have themselves, clearly in my mind is better than what they had three or four years ago. And I don’t see any good reason that that’ll slow down over the next several years.

Chip Kahn: Bob, this was a great discussion today, and I just want to express my appreciation. I think our audience will have a better sense of what’s happening now and what the future has in store, or at least what we need to think about, when we look at AI and health care.

Bob Wachter: Great. Thank you, Chip. It was a great pleasure talking with you.


SERIES

This weekly podcast features insightful conversations between host Chip Kahn and his guests, who discuss the business of health care, connecting the dots between the health care business, policy, and patients.

The podcast’s first series on AI in health care illuminates how AI is changing health care, and features guests who are deploying this technology, managing its consequences, and designing policy around it.