CHoRUS Data Generation Project
AI/ML for Clinical Care Grand Challenge
About the Project
The Patient-Focused Collaborative Hospital Repository Uniting Standards (CHoRUS) for Equitable AI project is creating a diverse, ethically sourced, AI-ready dataset to advance recovery from acute illness. This flagship effort aims to capture the complexity of real-world clinical care through multimodal data—including EHRs, imaging, waveforms, and clinical text—harmonized using unified standards.
A collaboration among 20 academic institutions (with 14 serving as Data Acquisition Centers), CHoRUS emphasizes a patient-focused approach, addressing privacy, bias, and Social Determinants of Health. A custom environment will support the annotation of clinically meaningful outcomes, enabling predictive modeling and responsible AI development.
The project also includes dedicated training efforts to build a skilled, diverse AI/ML workforce. Through federated access and balanced sampling, CHoRUS will contribute to a strong foundation for future biomedical AI research.

The CHoRUS Dataset
Controlled Access
The CHoRUS project is developing a flagship dataset to support AI/ML research focused on team-based clinical care. The dataset is designed to support the development of responsible, real-world AI tools that enhance critical care delivery. It is available only under controlled access, following a registration review. The current dataset includes:
More than 50,000 ICU admissions from 14 hospitals across the United States, covering patients with AKI, shock, sepsis, trauma, and more
1.6 billion rows of OMOP-standardized EHR data
7,642 patients with radiology data
23 TB of waveform data
The dataset is well suited to training AI models for critical care.