ScribeMD - Case Study

Quality Assessment

Comparable to Human Scribes

Notes generated by the AI were evaluated against those from three experienced human scribes using the standardized PDQI-9 (Physician Documentation Quality Instrument) criteria.

Highest Overall Score.
The AI achieved the highest average score across all nine PDQI-9 criteria: 3.90, versus 3.48, 3.77, and 3.69 for the three human scribes.
Superior in Key Areas.
The AI scored higher in the Up-to-date, Succinct, and Synthesized categories, and often captured more detail from encounters.
Figure: Bar chart comparing PDQI-9 scores for the AI and three human scribes across nine quality dimensions.
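
To put the reported averages in context, the short sketch below computes how far the AI's score sits above the human scribes. It uses only the four averages quoted above; the variable names are illustrative, and it says nothing about statistical significance, which the figures here do not let us assess.

```python
from statistics import mean

# Average PDQI-9 scores as reported in this case study.
ai_score = 3.90
human_scores = [3.48, 3.77, 3.69]

human_mean = mean(human_scores)   # average of the three human scribes
best_human = max(human_scores)    # strongest individual human scribe

gap_vs_mean = ai_score - human_mean
gap_vs_best = ai_score - best_human

print(f"Human scribe mean:        {human_mean:.2f}")
print(f"AI lead over human mean:  {gap_vs_mean:.2f}")
print(f"AI lead over best human:  {gap_vs_best:.2f}")
```

On these numbers the AI leads the human average by roughly a quarter of a point and the best human scribe by about 0.13.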

Speed & Efficiency

A Clear Win for AI

The AI dramatically reduced the initial time required to generate a structured medical note compared with human scribes.

18.7s

AI Scribe

vs

437.1s

Human Scribe

Dramatic Time Reduction.
The percentage of physicians spending 5+ minutes on notes dropped from 71% to 40%.
Figure: Bar chart showing a significant shift in time spent on documentation before and after the AI scribe pilot.
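
The timing and percentage figures above can be turned into a speedup factor and relative reductions with a few lines of arithmetic. This is a minimal sketch using only the numbers quoted in this section (18.7 s, 437.1 s, 71%, 40%); the variable names are illustrative.

```python
# Initial note-generation times reported in this case study (seconds).
ai_seconds = 18.7
human_seconds = 437.1

# How many times faster the AI was, and the relative time saved.
speedup = human_seconds / ai_seconds
time_reduction_pct = (1 - ai_seconds / human_seconds) * 100

# Share of physicians spending 5+ minutes on notes, before and after the pilot.
before_pct = 71
after_pct = 40
drop_points = before_pct - after_pct               # percentage-point drop
relative_drop_pct = drop_points / before_pct * 100  # drop relative to baseline

print(f"Speedup:            {speedup:.1f}x")
print(f"Time reduction:     {time_reduction_pct:.1f}%")
print(f"Drop in 5+ min:     {drop_points} percentage points")
print(f"Relative drop:      {relative_drop_pct:.1f}%")
```

On these figures the AI produced the initial note about 23 times faster, and the share of physicians spending 5+ minutes on notes fell by 31 percentage points.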

User Experience

Clinician Satisfaction

Physicians reported a notable reduction in documentation-related fatigue and a positive impact on patient interactions.

Improved Patient Interaction.
All but one physician felt the AI scribe improved their patient interactions by letting them focus more on the patient.
Reduced Fatigue.
The percentage of clinicians reporting they "rarely" experienced documentation fatigue doubled from 8.3% to 16.7%.
Figure: Bar chart showing a decrease in reported fatigue after the pilot.