Useful Resources
1. Background: OMOP and Vocabulary Mapping
What is OMOP?
OMOP stands for Observational Medical Outcomes Partnership, a project that led to the creation of the Common Data Model (CDM) used by the Observational Health Data Sciences and Informatics (OHDSI) community.
The OMOP CDM standardizes:
- Clinical data structure
- Vocabularies (like ICD, SNOMED, RxNorm)
- Research methodology and tools
Purpose of OMOP CDM
- Facilitate interoperability across health systems
- Enable observational research and real-world evidence generation
- Use shared tools like Atlas, Achilles, and Usagi
Medical Vocabulary Mapping
OMOP uses standard vocabularies to unify diverse coding systems. This means mapping from source vocabularies to OMOP’s standard concepts.
Examples:
- ICD-10 to SNOMED
- NDC (drug codes) to RxNorm
Tools used:
- Usagi: Semi-automated tool for mapping source terms to standard concepts
- Athena: Platform to download vocabularies (https://athena.ohdsi.org)
Data Domains in OMOP CDM
Key tables include:
PERSON
: DemographicsOBSERVATION_PERIOD
: Time span of observationVISIT_OCCURRENCE
: EncountersCONDITION_OCCURRENCE
: DiagnosesDRUG_EXPOSURE
: MedicationsPROCEDURE_OCCURRENCE
: ProceduresMEASUREMENT
: Labs and vitals
2. Core Concepts
Database
- The top-level container for data.
- Stores schemas, tables, views, stored procedures, users, roles, etc.
- Each database is independent and has its own files on disk.
- You can back up, restore, or move a database easily.
Schema
- A namespace within a database.
- Groups related objects (e.g., tables, views, stored procedures).
- Helps organize and manage permissions and object naming.
- Examples:
dbo
,sales
,hr
,cdm_531
,cdm_60
Table
- Stores data in rows and columns.
- Belongs to a schema.
- Contains fields (columns) with defined data types.
3. Visual Structure
SQL Server Instance
└── Database: OMOP_TestEnv
├── Schema: cdm_531
│ └── Tables: PERSON, CONDITION_OCCURRENCE, DRUG_EXPOSURE...
├── Schema: cdm_60
│ └── Tables: PERSON, CONDITION_OCCURRENCE, DRUG_EXPOSURE...
└── Schema: cdm_54
└── Tables: PERSON, CONDITION_OCCURRENCE, DRUG_EXPOSURE...
4. Why Use Schemas?
Reason | Explanation |
---|---|
Organization | Logical grouping of tables for different OMOP versions |
Avoid Naming Conflicts | Each schema can have its own PERSON table |
Permissions Management | Control access per schema rather than per table |
Easier Testing | Compare CDM versions side-by-side in the same database |
Lightweight Alternative | More efficient than creating multiple databases |
5. Recommended Setup for OMOP Testing
- Create one SQL Server database (e.g.,
OMOP_TestEnv
) - Inside it, create multiple schemas:
CREATE SCHEMA cdm_531; CREATE SCHEMA cdm_54; CREATE SCHEMA cdm_60;
- Run OMOP DDL scripts for each version with the correct schema prefix:
CREATE TABLE cdm_531.PERSON ( person_id INT PRIMARY KEY, gender_concept_id INT, -- etc... );
6. How to Connect & Manage
- Use SQL Server Management Studio (SSMS) to connect to the server
- Use one database and navigate between schemas via
schema_name.table_name
- Start with the script:
OMOP CDM sql server ddl.txt
(to create tables) - Then:
OMOP CDM sql server indexes.txt
andconstraints.txt
7. Quick Reference
Term | Description |
Instance | The installed SQL Server engine |
Database | Full container of schemas and objects |
Schema | Namespace for organizing related objects |
Table | Structure storing data in rows/columns |
8. Notes
- The
dbo
schema is the default in SQL Server. - You can reference objects fully:
Database.Schema.Table
- OMOP DDL scripts can be modified with
schema_name.
prefix. - Schemas are ideal for prototyping, versioning, and teaching.