# D2L Brightspace DataHub SQL Schema
This repository contains SQL DDL (Data Definition Language) scripts to create local database tables that mirror the **D2L Brightspace DataHub (BDS)** Data Sets. These scripts are designed to facilitate ETL processes, allowing institutions to store, query, and analyze Brightspace data in a relational environment.
## 📌 Overview
D2L DataHub provides bulk data exports (CSVs) for various LMS activities. This project provides the structured schema required to host that data in a local SQL database (PostgreSQL, MySQL, or SQL Server), ensuring data integrity and optimized query performance.
## 📂 Repository Structure
The scripts are organized by functional domain to match the Brightspace Data Sets:
* **`users/`**: Schema for user profiles, logins, and roles.
* **`org_structure/`**: Tables for Org Units, Course Offerings, Templates, and Semesters.
* **`enrollments/`**: Tracking user enrollments across course sections.
* **`grades/`**: Grade objects, results, schemes, and category details.
* **`content/`**: Course content structure and user progress tracking.
* **`quizzes/`**: Comprehensive quiz data, including attempts and question-level responses.
* **`assignments/`**: Assignment folders, submissions, and feedback.
## 🚀 Getting Started
### Prerequisites
- A running SQL database instance (MySQL 8.0+, PostgreSQL 13+, or SQL Server 2019+).
- Access to D2L Brightspace DataHub to download the `.csv` (or `.zip`) data sets.
### Installation
1. **Clone the repository:**
```bash
git clone https://git.jebbarger.com/d2l/d2l-datahub-sql.git
cd d2l-datahub-sql
```
2. **Execute the Schema Scripts:**
You can run the full setup or individual files depending on your needs.
```sql
-- Example for PostgreSQL
\i core/create_all_tables.sql
```
## 🛠 ETL Workflow Recommendation
To keep your local data warehouse synchronized with Brightspace:
1. **Extract:** Download the latest Differential or Full Data Sets from DataHub.
2. **Stage:** Load the raw CSV data into a staging schema (e.g., `stg_bds`).
3. **Transform/Load:** Use UPSERT logic to merge data from staging into the tables created by this repository, honoring `IsDeleted` flags and last-modified timestamps.
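As a sketch of the Transform/Load step, the following PostgreSQL upsert merges a hypothetical staging table into a target table. The table and column names (`stg_bds.users`, `users`, `last_modified`, etc.) are illustrative assumptions, not definitions from this repository:

```sql
-- Hypothetical upsert from staging into the warehouse table (PostgreSQL).
-- Table and column names are illustrative only.
INSERT INTO users (user_id, user_name, is_deleted, last_modified)
SELECT "UserId", "UserName", "IsDeleted", "LastModified"
FROM stg_bds.users
ON CONFLICT (user_id) DO UPDATE
SET user_name     = EXCLUDED.user_name,
    is_deleted    = EXCLUDED.is_deleted,
    last_modified = EXCLUDED.last_modified
-- Only overwrite rows when the staged record is newer.
WHERE users.last_modified < EXCLUDED.last_modified;
```

MySQL (`INSERT ... ON DUPLICATE KEY UPDATE`) and SQL Server (`MERGE`) offer equivalent constructs; the key idea is that differential exports update existing rows rather than duplicating them.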
## 📊 Schema Standards
- **Primary Keys:** Defined based on D2L's unique identifiers (e.g., `UserId`, `OrgUnitId`).
- **Data Types:** Mapped to handle large-scale data (e.g., `BIGINT` for IDs, `TEXT` for long strings, and `TIMESTAMP` for dates).
- **Indexing:** Essential foreign keys are indexed by default to optimize joins across the LMS schema.
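A minimal table illustrating these conventions might look like the sketch below. This is an invented example, not an actual script from the repository; consult the domain folders for the real definitions:

```sql
-- Illustrative only: shows the BIGINT identifiers, TEXT/TIMESTAMP types,
-- composite primary key, and indexed foreign keys described above.
CREATE TABLE grade_results (
    grade_object_id  BIGINT    NOT NULL,  -- D2L unique identifier
    user_id          BIGINT    NOT NULL,
    org_unit_id      BIGINT    NOT NULL,
    comments         TEXT,
    last_modified    TIMESTAMP,
    is_deleted       BOOLEAN   DEFAULT FALSE,
    PRIMARY KEY (grade_object_id, user_id)
);

CREATE INDEX ix_grade_results_org_unit ON grade_results (org_unit_id);
CREATE INDEX ix_grade_results_user     ON grade_results (user_id);
```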
## 🤝 Contributing
If you encounter missing fields from newer DataHub versions or want to contribute optimizations for a specific SQL dialect:
1. Fork the Project.
2. Create your Feature Branch (`git checkout -b feature/NewDataSet`).
3. Commit your Changes (`git commit -m 'Add UserAttribute data set'`).
4. Push to the Branch (`git push origin feature/NewDataSet`).
5. Open a Pull Request.
---
*Disclaimer: This project is not officially affiliated with D2L Corporation. Please refer to the [Brightspace DataHub Documentation](https://community.d2l.com/brightspace/kb/articles/4518-about-brightspace-data-sets) for the most up-to-date field specifications.*