M320: Chapter 1: Introduction

MongoDB Data Modeling

Introduction to Data Modeling

  • Good performance
  • Maximizing the productivity of your developers
  • Minimizing the overall costs of your solution

  • Gather the requirements to create a “Data Model”

  • Turn those requirements into a basic “Model”

  • Apply powerful transformation patterns to optimize your “Data Model”

  • Understand how to evolve your “Data Model” over time

Course Prerequisites

Here are some of the terms and references for your benefit:

MongoDB Concepts and Vocabulary

Relational Database Concepts and Vocabulary

General Database Concepts and Definitions

MongoDB Compass and Atlas

Data Modeling in MongoDB

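MongoDB is often described as “schemaless”, but this is misleading. A schema is a structure, and MongoDB data has one: the schema is flexible rather than absent.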

ERD and UML tooling.

  • Usage pattern
  • How you access your data
  • Which queries are critical to your application
  • Ratios between reads and writes

Document validation (enforces rules on document structure)
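
As a minimal sketch of what such rules can look like, the snippet below builds a `$jsonSchema` validator as a plain Python dictionary. The collection name `sensors` and every field name are assumptions made up for illustration. The helper `missing_required_fields` is hypothetical and only mimics the `required` check locally; the server evaluates the full schema once the validator is attached to a collection.

```python
# Hypothetical validator for a "sensors" collection.
# All field names here are illustrative assumptions.
sensor_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["sensor_id", "location"],
        "properties": {
            "sensor_id": {"bsonType": "string"},
            "location": {"bsonType": "string"},
            "temperature_c": {
                "bsonType": "double",
                "minimum": -90,
                "maximum": 60,
            },
        },
    }
}


def missing_required_fields(document, validator):
    """Return the required fields that are absent from `document`.

    Local illustration only: this checks the `required` list and nothing
    else (no type or range checks).
    """
    required = validator["$jsonSchema"]["required"]
    return [name for name in required if name not in document]


good = {"sensor_id": "s-001", "location": "Lisbon", "temperature_c": 21.5}
bad = {"sensor_id": "s-002"}

print(missing_required_fields(good, sensor_validator))
print(missing_required_fields(bad, sensor_validator))
```

With PyMongo, a dictionary like this can be passed when creating the collection, for example `db.create_collection("sensors", validator=sensor_validator)`, assuming `db` is a connected database handle.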

To join data across collections in MongoDB, use the $lookup aggregation stage.
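
The sketch below shows the shape of a `$lookup` stage as a Python dictionary, plus a small pure-Python emulation of the resulting left outer join. The collection names (`readings`, `sensors`) and field names are assumptions for illustration, and `emulate_lookup` is a hypothetical helper covering only simple equality matches on scalar fields, not MongoDB's actual implementation.

```python
# Hypothetical collections: "readings" references "sensors" by sensor_id.
lookup_pipeline = [
    {
        "$lookup": {
            "from": "sensors",          # collection to join with
            "localField": "sensor_id",  # field in the input documents
            "foreignField": "_id",      # field in the "sensors" documents
            "as": "sensor",             # name of the output array field
        }
    }
]


def emulate_lookup(local_docs, foreign_docs, local_field, foreign_field, as_field):
    """Emulate an equality-match $lookup on in-memory lists of dicts.

    Every input document is kept (left outer join); matches are stored
    as a list under `as_field`, which is empty when nothing matches.
    """
    joined = []
    for doc in local_docs:
        matches = [
            other
            for other in foreign_docs
            if other.get(foreign_field) == doc.get(local_field)
        ]
        joined.append({**doc, as_field: matches})
    return joined


sensors = [{"_id": 1, "city": "Lisbon"}, {"_id": 2, "city": "Oslo"}]
readings = [{"sensor_id": 1, "temp_c": 21.5}, {"sensor_id": 3, "temp_c": 4.0}]

stage = lookup_pipeline[0]["$lookup"]
result = emulate_lookup(
    readings, sensors, stage["localField"], stage["foreignField"], stage["as"]
)
for doc in result:
    print(doc)
```

With PyMongo, the same pipeline would be run on the server with `db.readings.aggregate(lookup_pipeline)`, assuming those collections exist.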

The Document Model in MongoDB

BSON is a binary representation of JSON documents, which is used to store data in MongoDB.

  • MongoDB stores data as Documents
  • Document fields can be values, embedded documents, or arrays of values and documents
  • MongoDB is a Flexible Schema database
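
To make the points above concrete, here is a small sketch of two documents written as Python dictionaries. All field names and values are invented for illustration. The first document combines plain values, an embedded document, an array of values, and an array of documents; the second shows that another document in the same collection may carry a different set of fields.

```python
# Hypothetical sensor document; every field name is an assumption.
sensor_doc = {
    "_id": "s-001",
    "model": "WX-200",                                 # plain value
    "location": {"city": "Lisbon", "country": "PT"},   # embedded document
    "tags": ["outdoor", "rooftop"],                    # array of values
    "readings": [                                      # array of documents
        {"hour": 0, "temp_c": 18.2},
        {"hour": 1, "temp_c": 17.9},
    ],
}

# Flexible schema: a second document in the same collection does not
# need to have the same fields as the first one.
other_doc = {"_id": "s-002", "model": "WX-300", "battery_pct": 87}

# Fields present in one document but not the other.
only_in_first = sorted(set(sensor_doc) - set(other_doc))
only_in_second = sorted(set(other_doc) - set(sensor_doc))
print(only_in_first)
print(only_in_second)
```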

Document Structure in MongoDB

Supported Datatypes in MongoDB

Constraints in Data Modeling

MongoDB does support transactions, including multi-document ACID transactions since version 4.0.

  • Keep the frequently used Documents in RAM
  • Keep the Indexes in RAM
  • Prefer Solid State Drives to Hard Disk Drives
  • Infrequently accessed data can use Hard Disk Drives

Recap:
1. The nature of your dataset and hardware define the need to model your data
2. It is important to identify those exact constraints and their impact to create a better model
3. As your software and the technological landscape change, your model should be re-evaluated and updated accordingly

When working with MongoDB, security features, network performance, disk drive speed, and amount of RAM are all aspects you need to keep in mind. As for the operating system your deployment will be running on, MongoDB and other systems usually hide the differences from you.

The Data Modeling Methodology

Model for Simplicity or Performance

Modeling for Simplicity Diagram

https://university-courses.s3.amazonaws.com/M320/modeling_for_simplicity.png

Modeling for Performance Diagram

https://university-courses.s3.amazonaws.com/M320/modeling_for_performance.png

Modeling for a Mix of Simplicity and Performance Diagram

https://university-courses.s3.amazonaws.com/M320/modeling_for_a_mix.png

Summary of Modeling Approaches

https://university-courses.s3.amazonaws.com/M320/flexible_methodology.png

Identifying the Workload

Case Study: IoT

  • The organization has 100 million weather sensors
  • Need to:
    • collect the data from all devices
    • analyze the data trends with a team of 10 data scientists

  • Quantify and qualify the queries as much as you can
  • A few CRUD operations will drive the design
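
As a rough illustration of quantifying the write workload for the case study, the sketch below does the back-of-the-envelope arithmetic. Only the sensor count comes from the notes above. The reporting interval (one reading per sensor per minute) and the reading size (1,000 bytes) are assumptions chosen purely to show the calculation; substitute the real figures from your requirements.

```python
SENSORS = 100_000_000            # from the case study
SECONDS_BETWEEN_READINGS = 60    # assumption: one reading per minute
BYTES_PER_READING = 1_000        # assumption: average document size
SECONDS_PER_DAY = 86_400

# Sustained write rate if the readings arrive evenly spread over time.
writes_per_second = SENSORS / SECONDS_BETWEEN_READINGS

# Raw data volume per day, before compression and indexes.
readings_per_sensor_per_day = SECONDS_PER_DAY // SECONDS_BETWEEN_READINGS
bytes_per_day = SENSORS * readings_per_sensor_per_day * BYTES_PER_READING

print(f"writes per second: {writes_per_second:,.0f}")
print(f"raw data per day:  {bytes_per_day / 1e12:,.0f} TB")
```

Under these assumptions the write operation dominates the workload, while the reads come from a team of only 10 data scientists. This is the kind of observation that lets a few operations drive the design.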