Amazon

Published: January 21, 2026
Video Description
Want to unlock the power of healthcare data while staying HIPAA compliant? Learn how to deploy a production-grade de-identification pipeline that processes half a million medical records in just over two hours with 99.1% accuracy! This tutorial walks you through setting up John Snow Labs' pre-trained models on Amazon SageMaker to automatically detect and mask 18+ protected health identifiers in clinical notes, radiology reports, and other medical documents. We'll show you how to deploy from AWS Marketplace, configure masking policies, and process both text and scanned documents, all within your secure AWS environment. No model training is required: just plug in and start protecting your data right away!
Learn more about using John Snow Labs' models on SageMaker here: https://go.aws/4r0U8pY
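For readers who want a feel for the "deploy, then send records" flow described above, here is a minimal Python sketch of calling an already-deployed SageMaker endpoint with boto3. It is not taken from the video: the endpoint name and the request fields "text" and "masking_policy" are hypothetical placeholders, and the real request and response schema is whatever the model's AWS Marketplace listing documents. Only the boto3 call itself (invoke_endpoint on the "sagemaker-runtime" client) is a known API.

```python
import json


def build_request(notes, masking_policy="masked"):
    """Serialize clinical notes into a JSON request body.

    ASSUMPTION: the field names "text" and "masking_policy" are
    illustrative only; check the Marketplace listing for the real schema.
    """
    if not notes:
        raise ValueError("notes must contain at least one document")
    return json.dumps({"text": list(notes), "masking_policy": masking_policy})


def deidentify(endpoint_name, notes, masking_policy="masked", region_name=None):
    """Send notes to a deployed SageMaker endpoint and return the parsed JSON reply."""
    # Imported lazily so build_request can be used without the AWS SDK installed.
    import boto3

    runtime = boto3.client("sagemaker-runtime", region_name=region_name)
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=build_request(notes, masking_policy),
    )
    # The response body is a stream; read it fully before decoding.
    return json.loads(response["Body"].read())
```

With AWS credentials configured and an endpoint running, a call might look like deidentify("my-deid-endpoint", ["Patient seen on 01/02/2024."]), where "my-deid-endpoint" stands in for whatever name you gave the endpoint at deployment. Keeping the payload builder separate from the network call makes it easy to adjust the schema once you have confirmed it against the listing.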
Follow AWS Developers!
📺 Instagram: https://go.aws/49r7LZC
🆇 X: https://go.aws/3Ya728V
💼 LinkedIn: https://go.aws/4sdbXnj
00:00 - Introduction
00:37 - Understanding Privacy and De-Identification
01:54 - Using LLMs for de-identification
02:30 - Building a de-identification pipeline
04:55 - Testing the outputs
05:42 - De-identifying scanned documents and images
06:53 - Spinning up the model in production
07:36 - Top 3 Developer Tips
08:07 - Conclusion
#HIPAACompliance #MedicalAI #JohnSnowLabs