Guide to SQL Server: Databases, Schemas, and Tables (with OMOP Context)

Useful Resources

The Book of OHDSI

1. Background: OMOP and Vocabulary Mapping

What is OMOP?

OMOP stands for Observational Medical Outcomes Partnership, a project that led to the creation of the Common Data Model (CDM) used by the Observational Health Data Sciences and Informatics (OHDSI) community.

The OMOP CDM standardizes:

  • Clinical data structure
  • Vocabularies (like ICD, SNOMED, RxNorm)
  • Research methodology and tools

Purpose of OMOP CDM

  • Facilitate interoperability across health systems
  • Enable observational research and real-world evidence generation
  • Use shared tools like Atlas, Achilles, and Usagi

Medical Vocabulary Mapping

OMOP uses standard vocabularies to unify diverse coding systems. This means mapping from source vocabularies to OMOP’s standard concepts.

Examples:

  • ICD-10 to SNOMED
  • NDC (drug codes) to RxNorm

Tools used:

  • Usagi: Semi-automated tool for mapping source terms to standard concepts
  • Athena: Platform to download vocabularies (https://athena.ohdsi.org)

Data Domains in OMOP CDM

Key tables include:

  • PERSON: Demographics
  • OBSERVATION_PERIOD: Time span of observation
  • VISIT_OCCURRENCE: Encounters
  • CONDITION_OCCURRENCE: Diagnoses
  • DRUG_EXPOSURE: Medications
  • PROCEDURE_OCCURRENCE: Procedures
  • MEASUREMENT: Labs and vitals

2. Core Concepts

Database

  • The top-level container for data.
  • Stores schemas, tables, views, stored procedures, users, roles, etc.
  • Each database is independent and has its own files on disk.
  • You can back up, restore, or move a database easily.

Schema

  • A namespace within a database.
  • Groups related objects (e.g., tables, views, stored procedures).
  • Helps organize and manage permissions and object naming.
  • Examples: dbo, sales, hr, cdm_531, cdm_60

Table

  • Stores data in rows and columns.
  • Belongs to a schema.
  • Contains fields (columns) with defined data types.

3. Visual Structure

SQL Server Instance
└── Database: OMOP_TestEnv
    ├── Schema: cdm_531
    │   └── Tables: PERSON, CONDITION_OCCURRENCE, DRUG_EXPOSURE...
    ├── Schema: cdm_60
    │   └── Tables: PERSON, CONDITION_OCCURRENCE, DRUG_EXPOSURE...
    └── Schema: cdm_54
        └── Tables: PERSON, CONDITION_OCCURRENCE, DRUG_EXPOSURE...

4. Why Use Schemas?

ReasonExplanation
OrganizationLogical grouping of tables for different OMOP versions
Avoid Naming ConflictsEach schema can have its own PERSON table
Permissions ManagementControl access per schema rather than per table
Easier TestingCompare CDM versions side-by-side in the same database
Lightweight AlternativeMore efficient than creating multiple databases

5. Recommended Setup for OMOP Testing

  • Create one SQL Server database (e.g., OMOP_TestEnv)
  • Inside it, create multiple schemas:CREATE SCHEMA cdm_531; CREATE SCHEMA cdm_54; CREATE SCHEMA cdm_60;
  • Run OMOP DDL scripts for each version with the correct schema prefix:CREATE TABLE cdm_531.PERSON ( person_id INT PRIMARY KEY, gender_concept_id INT, -- etc... );

6. How to Connect & Manage

  • Use SQL Server Management Studio (SSMS) to connect to the server
  • Use one database and navigate between schemas via schema_name.table_name
  • Start with the script: OMOP CDM sql server ddl.txt (to create tables)
  • Then: OMOP CDM sql server indexes.txt and constraints.txt

7. Quick Reference

TermDescription
InstanceThe installed SQL Server engine
DatabaseFull container of schemas and objects
SchemaNamespace for organizing related objects
TableStructure storing data in rows/columns

8. Notes

  • The dbo schema is the default in SQL Server.
  • You can reference objects fully: Database.Schema.Table
  • OMOP DDL scripts can be modified with schema_name. prefix.
  • Schemas are ideal for prototyping, versioning, and teaching.

Leave a Reply

Your email address will not be published. Required fields are marked *