The case against ambient AI scribes at the bedside

Ambient AI scribes are coming for the patient encounter, and the conversation about them has been almost entirely one-sided.

The pitch is seductive. A microphone on the wall captures the visit. A model produces a draft H&P, problem list, billing codes, and patient summary before the doctor leaves the room. Documentation time drops 30 to 60 percent in published pilots. Burnout improves. Throughput goes up. Major systems — Permanente, MGB, HCA, UNC — have already rolled out scribes to thousands of clinicians. It is not hypothetical anymore.

I am not going to use one. Not at the bedside. Not yet, and probably not in the form they are currently sold. Here is why.

The patient did not consent to a transcript. When I take a history, the patient is telling me things they have not told their spouse. They are telling me about a substance use history they will deny on a form. They are telling me they think their daughter is being abused. They are telling me they want to stop chemotherapy without telling their family yet. The professional contract is that what they say to me, in that room, becomes a clinical note — not a verbatim audio file processed by a third-party language model and stored on a server I have never audited. The consent form they signed at registration does not, in any meaningful sense, capture what is happening when an LLM is in the room.

The note is not the artifact. The encounter is the artifact. A good progress note is a synthesis. It reflects what mattered out of forty minutes of conversation, weighted by judgment that took a decade to build. An ambient scribe produces a transcript-flavored document — comprehensive, plausible, frequently wrong about which detail mattered. The published examples and product demonstrations I have reviewed show a consistent pattern: the chief complaint is technically accurate, the assessment is hedged into uselessness, and the plan reads like the bullet-point version of every plan that has ever existed. Editing such a draft into a real note takes nearly as long as writing one from scratch — and now I have also surrendered the cognitive act of writing, which is half of how I think through the case.

The accountability vanishes. When I write a note and sign it, I am the author. When I sign an AI-drafted note, the legal accountability is mine but the cognitive authorship is not. This is a category we have not had in medicine before, and it is being normalized at scale before any of us have figured out how it should be governed. If a model hallucinated a diagnosis I did not consider, and I signed it, and the patient was harmed — what is my defense? “The system drafted it” is not a defense. It is an admission that I outsourced the most important sentence in the chart.

Patients can tell. The data on this is preliminary, but the qualitative reports are consistent: patients notice when their doctor is no longer mentally writing the note in their head while listening. They notice when the doctor relaxes into the conversation in a way that feels like talking to a friend instead of a physician. Some of them like it. Many of them, including the older patients I see most days, find it slightly disorienting — and a small number find it actively alarming once they understand what is happening.

To be clear: I think the underlying technology is real, and I think a future version of this — one with patient-controlled consent, on-device processing, transparent retention, validated accuracy on the populations who actually use it, and audit trails I can inspect — could be a meaningful improvement. The right answer is not to refuse it forever. The right answer is to refuse it now, on these terms, and to be specific about what would change my mind.

It is also worth being precise about what “ambient AI” actually is, because the branding is doing a lot of work. The speech-to-text layer underneath most of these scribe products is OpenAI’s Whisper — an automatic-speech-recognition model OpenAI open-sourced in September 2022, years before this became a category that vendors could charge per-seat for. Whisper is good. It has been good for a while. The novel work in an ambient scribe is not the microphone; it is the large language model that turns a transcript into a SOAP note, the consent flow that wraps the recording, the storage policy on the audio, and the workflow that decides who reviews the draft. When the marketing asks “is AI ready for medicine?” it is collapsing four very different questions into one slogan. Whisper-grade transcription has been ready since 2022. The summarization model trained on someone else’s notes, the audit trail, and the per-patient consent — those are the questions that have not been answered, and those are the ones I am refusing on.

What would change my mind:

— Per-encounter, opt-in patient consent that is more than a checkbox at registration.

— Local or on-prem processing with no third-party retention by default.

— Published validation of note accuracy on the real distribution of patients I see, including elderly, multilingual, and chronically ill populations who are systematically underrepresented in pilot data.

— A clear governance policy at my institution about who reviews drafted notes, who is responsible for errors, and how disagreements between the model and the clinician are tracked.

Until those exist, the microphone stays off when I’m in the room.

— Jeremy Tabernero, MD · More notes · RSS · Get in touch