Introduction to MongoDB

What is MongoDB?

MongoDB is a NoSQL document database that stores data in JSON-like documents with flexible schemas. It is designed for scalability, performance, and developer productivity.

Key Features:

  • Document-Oriented: Data stored as BSON (Binary JSON) documents
  • Schema-Less: Flexible schema allows evolving data structures
  • Distributed: Built-in sharding and replication
  • Aggregation Pipeline: Powerful data processing framework
  • ACID Transactions: Multi-document ACID transactions (v4.0+)
  • Full-Text Search: Native text search capabilities

MongoDB Shell Basics

Essential commands to get started with MongoDB shell:

// Start MongoDB Shell 
mongosh 
// Check current database 
db 
// List all databases 
show dbs 
// Show the currently authenticated user(s) 
db.runCommand({ connectionStatus: 1 }).authInfo.authenticatedUsers

Connection Methods

mongosh:

// Connect to local MongoDB 
mongosh 
// Connect to remote server 
mongosh "mongodb://username:password@host:port/database" 
// Connect to MongoDB Atlas 
mongosh "mongodb+srv://username:password@cluster.mongodb.net/database"

Compass:

// MongoDB Compass Connection String 
mongodb://localhost:27017 
// Connection with Authentication 
mongodb://username:password@localhost:27017/database 
// Atlas Connection 
mongodb+srv://username:password@cluster.mongodb.net/database

Node.js:

// Using MongoDB Native Driver 
const {
    MongoClient
} = require('mongodb');
const client = new MongoClient('mongodb://localhost:27017');
await client.connect();
const db = client.db('myapp');
const collection = db.collection('users');
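
Connection strings of the form shown above break when the username or password contains characters such as @, : or /, because those are URI delimiters and must be percent-encoded. The following is a minimal sketch only; buildMongoUri is a hypothetical helper written for illustration, not part of the MongoDB driver, and the sample credentials are made up.

```javascript
// Hypothetical helper (an assumption for illustration, not a driver API):
// builds a mongodb:// connection string from its parts and percent-encodes
// the credentials so that '@', ':' or '/' are not read as URI delimiters.
function buildMongoUri({ user, password, host, port = 27017, database = '' }) {
    const credentials = user
        ? `${encodeURIComponent(user)}:${encodeURIComponent(password)}@`
        : '';
    return `mongodb://${credentials}${host}:${port}/${database}`;
}

// Without authentication
console.log(buildMongoUri({ host: 'localhost', database: 'myapp' }));

// With a password that contains reserved characters
console.log(buildMongoUri({
    user: 'app_user',
    password: 'p@ss:word/1',
    host: 'localhost',
    database: 'myapp'
}));
```

The resulting string can be passed to mongosh, Compass, or new MongoClient(...) in the same way as the literal strings in this section.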

Database & Collection Basics

Database Operations

// Use or create a database 
use myapp 
// Get current database 
db 
// Drop database 
db.dropDatabase() 
// Get database statistics 
db.stats() 
// List collections 
show collections 
// Get collection statistics 
db.collection_name.stats()

Tip: MongoDB creates a database automatically the first time you write data to it. Running use database_name switches to that database even if it does not exist yet.

Collection Operations

// Create collection 
db.createCollection('users')
// Create collection with validation 
db.createCollection('users', {
    validator: {
        $jsonSchema: {
            bsonType: 'object',
            required: ['name', 'email'],
            properties: {
                name: {
                    bsonType: 'string'
                },
                email: {
                    bsonType: 'string'
                }
            }
        }
    }
})
// Drop collection 
db.users.drop()
// Rename collection 
db.users.renameCollection('customers')
// Check if collection exists 
db.getCollectionNames().includes('users')

CRUD Operations

Create Operations

insertOne() - Insert Single Document
db.users.insertOne({
    _id: ObjectId(),
    name: "John Doe",
    email: "john@example.com",
    age: 30,
    active: true
})
// Returns: { acknowledged: true, insertedId: ObjectId(...) }
insertMany() - Insert Multiple Documents
db.users.insertMany([{
    name: "Alice",
    email: "alice@example.com"
}, {
    name: "Bob",
    email: "bob@example.com"
}, {
    name: "Charlie",
    email: "charlie@example.com"
}], {
    ordered: false
})
// ordered: true (default) - stop at the first error 
// ordered: false - continue with the remaining documents after an error

Read Operations

find() - Find Multiple Documents
// Find all documents 
db.users.find()
// Find with filter 
db.users.find({
    age: {
        $gt: 25
    }
})
// Convert to array 
db.users.find({
    active: true
}).toArray()
// Count matches 
db.users.countDocuments({
    age: {
        $gt: 25
    }
})
findOne() - Find Single Document
// Find first match 
db.users.findOne({
    email: "john@example.com"
})
// Returns: { _id: ObjectId(...), name: "John", email: "..." } 
// Returns null if not found 
db.users.findOne({
    email: "nonexistent@example.com"
})

Update Operations

updateOne() - Update Single Document
db.users.updateOne({
    _id: ObjectId("...")
}, {
    $set: {
        name: "John Smith",
        updatedAt: new Date()
    }
})
// Returns: { acknowledged: true, matchedCount: 1, modifiedCount: 1, ... }
updateMany() - Update Multiple Documents
// Update all inactive users 
db.users.updateMany({
    active: false
}, {
    $set: {
        status: "inactive"
    },
    $inc: {
        inactiveCount: 1
    }
})
// Returns: { acknowledged: true, matchedCount: n, modifiedCount: n, ... }
Update Operators
// $set - Set field value 
{
    $set: {
        status: "active"
    }
}
// $unset - Remove field 
{
    $unset: {
        tempField: ""
    }
}
// $inc - Increment numeric field 
{
    $inc: {
        age: 1,
        score: 10
    }
}
// $push - Add to array 
{
    $push: {
        tags: "new-tag"
    }
}
// $addToSet - Add to array (no duplicates) 
{
    $addToSet: {
        hobbies: "reading"
    }
}
// $pull - Remove from array 
{
    $pull: {
        tags: "old-tag"
    }
}
// $pop - Remove last element from array (1) 
{
    $pop: {
        items: 1
    }
}
// $pop - Remove first element from array (-1) 
{
    $pop: {
        items: -1
    }
}
// $rename - Rename field 
{
    $rename: {
        oldName: "newName"
    }
}

Delete Operations

deleteOne() - Delete Single Document
db.users.deleteOne({
    _id: ObjectId("...")
})
// Returns: { deletedCount: 1 }
deleteMany() - Delete Multiple Documents
// Delete all inactive users 
db.users.deleteMany({
    active: false
})
// Delete all documents 
db.users.deleteMany({})
// Returns: { deletedCount: n }

Query Operators

Comparison Operators

// $eq - Equal 
db.users.find({
    age: {
        $eq: 25
    }
})
// $ne - Not equal 
db.users.find({
    status: {
        $ne: "inactive"
    }
})
// $gt - Greater than 
db.users.find({
    age: {
        $gt: 18
    }
})
// $gte - Greater than or equal 
db.users.find({
    score: {
        $gte: 80
    }
})
// $lt - Less than 
db.users.find({
    age: {
        $lt: 65
    }
})
// $lte - Less than or equal 
db.users.find({
    price: {
        $lte: 100
    }
})
// $in - Match any value in array 
db.users.find({
    status: {
        $in: ["active", "pending"]
    }
})
// $nin - Not in array 
db.users.find({
    country: {
        $nin: ["US", "CA"]
    }
})
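
To make the semantics of these operators concrete, the sketch below evaluates a filter against plain JavaScript objects. It is a toy model under stated assumptions: matches and comparators are hypothetical names, only scalar fields are handled, and arrays, nested paths and BSON type ordering (all of which the real server supports) are ignored.

```javascript
// Toy in-memory model of the comparison operators (illustration only).
const comparators = {
    $eq: (value, arg) => value === arg,
    $ne: (value, arg) => value !== arg,
    $gt: (value, arg) => value > arg,
    $gte: (value, arg) => value >= arg,
    $lt: (value, arg) => value < arg,
    $lte: (value, arg) => value <= arg,
    $in: (value, arg) => arg.includes(value),
    $nin: (value, arg) => !arg.includes(value)
};

function matches(doc, filter) {
    // Every field listed in the filter must match: the implicit AND.
    return Object.entries(filter).every(([field, condition]) => {
        const value = doc[field];
        const isOperatorObject = condition !== null &&
            typeof condition === 'object' && !Array.isArray(condition);
        if (!isOperatorObject) {
            return value === condition; // shorthand for { $eq: condition }
        }
        // Several operators on one field must all hold, e.g. a range.
        return Object.entries(condition)
            .every(([operator, arg]) => comparators[operator](value, arg));
    });
}

const sampleUsers = [
    { name: 'Alice', age: 30, status: 'active' },
    { name: 'Bob', age: 17, status: 'pending' },
    { name: 'Carol', age: 70, status: 'inactive' }
];

// Same shape as db.users.find({ age: { $gte: 18, $lt: 65 } })
const adults = sampleUsers.filter((user) => matches(user, { age: { $gte: 18, $lt: 65 } }));
console.log(adults.map((user) => user.name)); // [ 'Alice' ]
```

Combining two operators on one field, as in the last call, is how range queries are usually written.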

Logical Operators

// $and - All conditions must match (implicit by default) 
db.users.find({
    age: {
        $gt: 18
    },
    status: "active"
})
// Equivalent explicit form: 
db.users.find({
    $and: [{
        age: {
            $gt: 18
        }
    }, {
        status: "active"
    }]
})
// $or - At least one condition matches 
db.users.find({
    $or: [{
        age: {
            $lt: 18
        }
    }, {
        age: {
            $gt: 65
        }
    }]
})
// $not - Negate condition 
db.users.find({
    age: {
        $not: {
            $lt: 18
        }
    }
})
// $nor - None of conditions match 
db.users.find({
    $nor: [{
        status: "deleted"
    }, {
        banned: true
    }]
})

Element Operators

// $exists - Field exists or not 
db.users.find({
    phone: {
        $exists: true
    }
})
db.users.find({
    nickname: {
        $exists: false
    }
})
// $type - Match by data type 
db.users.find({
    age: {
        $type: "int"
    }
})
db.users.find({
    createdAt: {
        $type: "date"
    }
})
// Type aliases: "double", "string", "object", "array", "binData", 
// "objectId", "bool", "date", "null", "int", "long", "decimal"

Pattern Matching & Regex

// $regex - Regular expression matching 
db.users.find({
    email: {
        $regex: "@example\\.com$"
    }
})
// Case insensitive 
db.users.find({
    email: {
        $regex: "^john",
        $options: "i"
    }
})
// $text - Text search 
db.articles.find({
    $text: {
        $search: "mongodb tutorial"
    }
})
// $where - JavaScript expression (slower, avoid in production) 
db.users.find({
    $where: "this.age > 25"
})

Array Operators

// $all - Array contains all elements 
db.posts.find({
    tags: {
        $all: ["mongodb", "database"]
    }
})
// $elemMatch - Array element matches condition 
db.users.find({
    scores: {
        $elemMatch: {
            $gte: 80,
            $lt: 90
        }
    }
})
// $size - Array has specific length 
db.posts.find({
    tags: {
        $size: 5
    }
})
// Direct array match 
db.users.find({
    hobbies: "reading"
})
// Check if value is in any array element 
db.users.find({
    tags: {
        $in: ["important", "urgent"]
    }
})

Projection & Sorting

Projection - Select Fields

Include Specific Fields
// Include fields (1 = include, 0 = exclude) 
db.users.find({
    age: {
        $gt: 25
    }
}, {
    name: 1,
    email: 1,
    _id: 0
})
// Always includes _id unless explicitly excluded 
// Result: { name: "John", email: "john@example.com" }
Exclude Fields
// Exclude sensitive fields 
db.users.find({
    active: true
}, {
    password: 0,
    ssn: 0,
    apiKey: 0
})
// Returns all fields except specified ones
Advanced Projections
// Array slicing - get first 5 items 
db.posts.find({}, {
    comments: {
        $slice: 5
    }
})
// Array slicing - skip first 10, get next 20 
db.posts.find({}, {
    comments: {
        $slice: [10, 20]
    }
})
// Computed field (aggregation expressions in find() projections require MongoDB 4.4+) 
db.users.find({}, {
    name: 1,
    fullName: {
        $concat: ["$firstName", " ", "$lastName"]
    }
})

Sorting & Pagination

sort() - Sort Results
// Sort ascending (1) 
db.users.find().sort({
    name: 1
})
// Sort descending (-1) 
db.posts.find().sort({
    createdAt: -1
})
// Multiple field sort 
db.users.find().sort({
    country: 1,
    name: 1
})
// Case-insensitive sort 
db.users.find().collation({
    locale: "en",
    strength: 2
}).sort({
    name: 1
})
Pagination Pattern
const pageSize = 10;
const pageNumber = 2;
// Page 2 
const results = db.users.find({
    active: true
}).sort({
    createdAt: -1
}).skip((pageNumber - 1) * pageSize).limit(pageSize).toArray()
// Get total count for pagination info 
const total = db.users.countDocuments({
    active: true
})
const totalPages = Math.ceil(total / pageSize)
limit() & skip()
// limit() - Get first 5 results 
db.users.find().limit(5)
// skip() - Skip first 10, get next 5 
db.users.find().skip(10).limit(5)
// Paginate with skip() and limit(); MongoDB cursors have no offset() method 
// The server always applies sort(), then skip(), then limit(), regardless of the order in which they are chained
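
The arithmetic behind skip/limit pagination can be isolated in a small function. This is a sketch under assumptions: pageWindow is a hypothetical helper name, pages are numbered from 1, and the total would come from countDocuments() in practice.

```javascript
// Hypothetical helper: derives the skip/limit values and page count
// for 1-based page numbers.
function pageWindow(pageNumber, pageSize, totalDocuments) {
    const totalPages = Math.ceil(totalDocuments / pageSize);
    return {
        skip: (pageNumber - 1) * pageSize,
        limit: pageSize,
        totalPages,
        hasNextPage: pageNumber < totalPages
    };
}

// 35 matching documents, 10 per page, requesting page 2
console.log(pageWindow(2, 10, 35));
// { skip: 10, limit: 10, totalPages: 4, hasNextPage: true }
```

The skip and limit values map directly onto .skip(...).limit(...) in the pagination pattern. Large skip values still make the server walk past the skipped documents, so deep pages become progressively slower.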

Indexes

Index Basics

Indexes improve query performance by creating efficient data structures for lookups. Every collection has a default _id index.

Benefits of Indexes:
  • Faster query execution
  • Reduced CPU usage
  • Efficient sorting
  • Enable unique constraints
  • Support TTL (Time-To-Live) features

Creating Indexes

Simple Index
// Create single field index 
db.users.createIndex({
    email: 1
})
// Ascending (1) or Descending (-1) 
db.posts.createIndex({
    createdAt: -1
})
// Unique index 
db.users.createIndex({
    email: 1
}, {
    unique: true
})
// Sparse index (ignore documents without field) 
db.users.createIndex({
    phone: 1
}, {
    sparse: true
})
Compound Index
// Multi-field index 
db.users.createIndex({
    country: 1,
    city: 1,
    name: 1
})
// This index supports queries on its prefixes, such as: 
// { country: "USA", city: "NY" } 
// { country: "USA" } 
// But NOT: { city: "NY" } alone (index prefix rule)
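
The prefix rule can be expressed as a short check: walk the index fields in order and stop at the first one the filter does not mention. This is a deliberately simplified model (usablePrefix is a hypothetical helper); the real query planner also considers sorts, ranges and alternative plans.

```javascript
// Hypothetical helper: returns the leading fields of a compound index
// that a filter can use under the prefix rule.
function usablePrefix(indexKeys, filter) {
    const prefix = [];
    for (const field of Object.keys(indexKeys)) {
        if (!(field in filter)) break; // the prefix ends at the first gap
        prefix.push(field);
    }
    return prefix;
}

const compoundIndex = { country: 1, city: 1, name: 1 };

console.log(usablePrefix(compoundIndex, { country: 'USA', city: 'NY' })); // [ 'country', 'city' ]
console.log(usablePrefix(compoundIndex, { country: 'USA' }));             // [ 'country' ]
console.log(usablePrefix(compoundIndex, { city: 'NY' }));                 // []
```

An empty result means the filter skips the leading field, so this index does not help that query and a separate index on city would be needed.
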
Special Index Types
// Text index (full-text search) 
db.articles.createIndex({
    title: "text",
    content: "text"
})
db.articles.find({
    $text: {
        $search: "mongodb"
    }
})
// Geospatial index (location-based queries) 
db.places.createIndex({
    location: "2dsphere"
})
db.places.find({
    location: {
        $near: {
            $geometry: {
                type: "Point",
                coordinates: [-73.99, 40.71]
            }
        }
    }
})
// TTL index (auto-delete after expiration) 
db.sessions.createIndex({
    createdAt: 1
}, {
    expireAfterSeconds: 3600
})

Index Management

// List all indexes 
db.users.getIndexes()
// Get index info 
db.users.getIndexSpecs()
// Drop specific index 
db.users.dropIndex('email_1')
// ...or by its key pattern 
db.users.dropIndex({
    email: 1
})
// Drop all indexes except _id 
db.users.dropIndexes()
// Rename index (there is no rename command: drop it, then recreate it under the new name) 
db.users.dropIndex('old_name')
db.users.createIndex({
    field: 1
}, {
    name: 'new_name'
})
Analyze Query Performance
// explain() with executionStats 
db.users.find({
    email: "john@example.com"
}).explain("executionStats")
// Look for: 
// stage: "COLLSCAN" (bad - full collection scan) 
// stage: "IXSCAN" (good - index scan) 
// Stages appear under queryPlanner.winningPlan and executionStats.executionStages; 
// child stages are nested under inputStage (e.g. FETCH -> IXSCAN)

Aggregation Pipeline

Aggregation Basics

The aggregation pipeline processes documents through stages, transforming and combining data. Think of it as a series of filters and transformations.

Basic Structure:
db.collection.aggregate([{
    $stage1: {
        /* stage parameters */
    }
}, {
    $stage2: {
        /* stage parameters */
    }
}, {
    $stage3: {
        /* stage parameters */
    }
}])
// Returns a cursor over the resulting documents 
// Call .toArray() to collect the results into an array

Core Aggregation Stages

$match - Filter documents (like find())
// Filter orders for 2024 
db.orders.aggregate([{
    $match: {
        year: 2024,
        status: "completed"
    }
}])
// $match should be early in pipeline for performance
$project - Select/transform fields
db.users.aggregate([{
    $project: {
        _id: 0,
        name: 1,
        email: 1,
        displayName: {
            $concat: ["$firstName", " ", "$lastName"]
        },
        ageGroup: {
            $cond: {
                if: {
                    $gte: ["$age", 18]
                },
                then: "Adult",
                else: "Minor"
            }
        }
    }
}])
$group - Group and aggregate
// Group by category and calculate totals 
db.sales.aggregate([{
    $group: {
        _id: "$category",
        totalSales: {
            $sum: "$amount"
        },
        avgPrice: {
            $avg: "$price"
        },
        count: {
            $sum: 1
        },
        maxPrice: {
            $max: "$price"
        },
        minPrice: {
            $min: "$price"
        }
    }
}])
// $group accumulators: $sum, $avg, $min, $max, $first, $last, 
// $push, $addToSet, $count (MongoDB 5.0+)
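
As a mental model, the $group stage above behaves like a keyed reduce over the input documents. The sketch below reproduces the same accumulators in plain JavaScript. It is an in-memory analogy with made-up sample data, not a description of how the server executes the stage, and groupByCategory is a hypothetical name.

```javascript
// In-memory analogy of { $group: { _id: "$category", ... } } (illustration only).
function groupByCategory(sales) {
    const groups = new Map();
    for (const { category, amount, price } of sales) {
        const group = groups.get(category) ?? {
            _id: category,
            totalSales: 0,
            priceSum: 0, // helper value, used only to derive avgPrice
            count: 0,
            maxPrice: -Infinity,
            minPrice: Infinity
        };
        group.totalSales += amount;                       // $sum: "$amount"
        group.priceSum += price;
        group.count += 1;                                 // $sum: 1
        group.maxPrice = Math.max(group.maxPrice, price); // $max: "$price"
        group.minPrice = Math.min(group.minPrice, price); // $min: "$price"
        groups.set(category, group);
    }
    // Replace the running price sum with the average ($avg: "$price").
    return [...groups.values()].map(({ priceSum, ...group }) => ({
        ...group,
        avgPrice: priceSum / group.count
    }));
}

const sampleSales = [
    { category: 'books', amount: 20, price: 10 },
    { category: 'books', amount: 30, price: 30 },
    { category: 'toys', amount: 5, price: 5 }
];
console.log(groupByCategory(sampleSales));
```

Each distinct category value produces one output document, which is why _id in a $group stage acts as the grouping key.
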
$sort - Sort results
db.products.aggregate([{
    $match: {
        active: true
    }
}, {
    $sort: {
        price: -1 // Descending price
    }
}, {
    $limit: 10
}])
$limit & $skip - Pagination
db.users.aggregate([{
    $match: {
        status: "active"
    }
}, {
    $sort: {
        createdAt: -1
    }
}, {
    $skip: 20 // Skip first 20
}, {
    $limit: 10 // Get next 10 (page 3)
}])

Advanced Aggregation

$lookup - JOIN collections (LEFT OUTER JOIN)
db.orders.aggregate([{
    $lookup: {
        from: "users", // Join with users collection
        localField: "userId", // Field in orders
        foreignField: "_id", // Field in users
        as: "user" // Result array name
    }
}, {
    $unwind: "$user" // Convert array to object
}, {
    $project: {
        orderId: 1,
        amount: 1,
        "user.name": 1,
        "user.email": 1
    }
}])
$unwind - Flatten arrays
// Input: { _id: 1, tags: ["a", "b", "c"] } 
db.posts.aggregate([{
    $unwind: "$tags"
}])
// Output: 
// { _id: 1, tags: "a" } 
// { _id: 1, tags: "b" } 
// { _id: 1, tags: "c" } 
// Unwind with empty handling 
db.posts.aggregate([{
    $unwind: {
        path: "$tags",
        preserveNullAndEmptyArrays: true
    }
}])
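
The effect of $unwind, including the preserveNullAndEmptyArrays option, can be mimicked with Array.prototype.flatMap. The sketch below is an in-memory analogy for top-level fields only; unwind is a hypothetical function, not a driver API.

```javascript
// In-memory analogy of $unwind for a top-level field (illustration only).
function unwind(docs, field, { preserveNullAndEmptyArrays = false } = {}) {
    return docs.flatMap((doc) => {
        const value = doc[field];
        const isEmpty = value === undefined || value === null ||
            (Array.isArray(value) && value.length === 0);
        if (isEmpty) {
            if (!preserveNullAndEmptyArrays) return []; // document is dropped
            // Kept: an empty array is omitted from the output, null stays null.
            const { [field]: _omitted, ...rest } = doc;
            return [value === null ? doc : rest];
        }
        // A non-array value is treated like a single-element array.
        const items = Array.isArray(value) ? value : [value];
        return items.map((item) => ({ ...doc, [field]: item }));
    });
}

const samplePosts = [
    { _id: 1, tags: ['a', 'b', 'c'] },
    { _id: 2, tags: [] }
];

console.log(unwind(samplePosts, 'tags'));
// three documents for _id 1; _id 2 is dropped
console.log(unwind(samplePosts, 'tags', { preserveNullAndEmptyArrays: true }));
// three documents for _id 1, plus { _id: 2 }
```

Without the option, documents whose array is empty, null or missing silently disappear from the pipeline, which matters when $unwind follows a $lookup that found no match.
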
Real-world: Sales Report by Month
db.sales.aggregate([{
    $match: {
        year: 2024
    }
}, {
    $group: {
        _id: {
            $dateToString: {
                format: "%Y-%m",
                date: "$date"
            }
        },
        totalSales: {
            $sum: "$amount"
        },
        totalOrders: {
            $sum: 1
        },
        avgOrderValue: {
            $avg: "$amount"
        }
    }
}, {
    $sort: {
        _id: 1
    }
}, {
    $project: {
        _id: 0,
        month: "$_id",
        totalSales: 1,
        totalOrders: 1,
        avgOrderValue: {
            $round: ["$avgOrderValue", 2]
        }
    }
}])

More Aggregation Operators

// $addFields - Add new fields 
{
    $addFields: {
        category: "electronics"
    }
}
// $count - Count documents 
{
    $count: "totalDocuments"
}
// $facet - Multiple aggregations in one 
{
    $facet: {
        byCategory: [{
            $group: ...
        }],
        byPrice: [{
            $group: ...
        }]
    }
}
// $bucket - Group by ranges 
{
    $bucket: {
        groupBy: "$price",
        boundaries: [0, 50, 100, 200],
        default: "other",
        output: {
            count: {
                $sum: 1
            }
        }
    }
}
// $replaceRoot - Change root document 
{
    $replaceRoot: {
        newRoot: "$user"
    }
}
// $merge - Write results to collection 
{
    $merge: {
        into: "results_collection"
    }
}
// $out - Write results to a collection, replacing its existing contents 
{
    $out: "results_collection"
}

Data Modeling

Embedding vs Referencing

Embedding - Denormalization (Nested Data)
// User with embedded address (embedding) 
db.users.insertOne({
    _id: ObjectId(),
    name: "John Doe",
    email: "john@example.com",
    address: {
        street: "123 Main St",
        city: "New York",
        state: "NY",
        zipCode: "10001",
        country: "USA"
    },
    phone: [{
        type: "mobile",
        number: "555-1234"
    }, {
        type: "home",
        number: "555-5678"
    }]
})
// Pros: Single query, faster reads, ACID within document 
// Cons: Data duplication, larger documents, harder updates
Referencing - Normalization (Using IDs)
// User with reference to address 
db.users.insertOne({
    _id: ObjectId(),
    name: "John Doe",
    email: "john@example.com",
    addressId: ObjectId("..."),
    phoneIds: [ObjectId("..."), ObjectId("...")]
})
db.addresses.insertOne({
    _id: ObjectId("..."),
    street: "123 Main St",
    city: "New York"
})
// Query with lookup 
db.users.aggregate([{
    $lookup: {
        from: "addresses",
        localField: "addressId",
        foreignField: "_id",
        as: "address"
    }
}])
// Pros: Reduced duplication, flexible updates 
// Cons: Multiple queries, $lookup overhead

Data Modeling Patterns

One-to-One Relationship
// Example: User with passport 
// Option 1: Embed (if accessed together) 
db.users.insertOne({
    _id: ObjectId(),
    name: "John",
    passport: {
        number: "AB123456",
        issueDate: new Date(),
        expiryDate: new Date()
    }
})
// Option 2: Reference (if independent access) 
db.users.insertOne({
    _id: ObjectId(),
    name: "John",
    passportId: ObjectId()
})
One-to-Many Relationship
// User with multiple posts 
// Option 1: Embed if few items (< 10K) 
db.users.insertOne({
    _id: ObjectId(),
    name: "John",
    posts: [{
        title: "Post 1",
        content: "..."
    }, {
        title: "Post 2",
        content: "..."
    }]
})
// Option 2: Reference if many items 
db.users.insertOne({
    _id: ObjectId(),
    name: "John"
})
db.posts.insertMany([{
    _id: ObjectId(),
    userId: ObjectId(),
    title: "Post 1"
}, {
    _id: ObjectId(),
    userId: ObjectId(),
    title: "Post 2"
}])
Many-to-Many Relationship
// Students enrolled in multiple courses 
db.students.insertOne({
    _id: ObjectId(),
    name: "Alice",
    courseIds: [ObjectId(), ObjectId(), ObjectId()]
})
db.courses.insertOne({
    _id: ObjectId(),
    name: "MongoDB 101",
    studentIds: [ObjectId(), ObjectId()]
})

Schema Design Principles

Do:

  • Embed data that's accessed together frequently
  • Use references for data accessed independently or that grows unbounded
  • Keep documents well below the 16 MB BSON document size limit
  • Design queries-first, then model your schema
  • Denormalize for read performance

Don't:

  • Create unbounded arrays (arrays that grow infinitely)
  • Duplicate data excessively without a purpose
  • Use schema design patterns from relational databases
  • Embed very large documents (> 5MB)
  • Forget to plan for scalability

Transactions & ACID

Multi-Document Transactions

MongoDB 4.0+ supports multi-document ACID transactions on replica sets (4.2+ on sharded clusters). Transactions ensure atomicity (all-or-nothing) across multiple documents.

// Transfer money between accounts 
// fromId and toId are placeholders for the _id values of the two accounts 
const session = db.getMongo().startSession()
// In mongosh, collections obtained through the session take part in its transaction 
const accounts = session.getDatabase('myapp').accounts
session.startTransaction()
try {
    accounts.updateOne({
        _id: fromId
    }, {
        $inc: {
            balance: -100
        }
    })
    accounts.updateOne({
        _id: toId
    }, {
        $inc: {
            balance: 100
        }
    })
    session.commitTransaction()
} catch (error) {
    session.abortTransaction()
    throw error
} finally {
    session.endSession()
}

Note: Transactions have a 60-second timeout by default. Use transactionLifetimeLimitSeconds to configure.

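ACID Properties

  • Atomicity: All operations in a transaction are applied, or none are
  • Consistency: A transaction moves the data from one valid state to another
  • Isolation: Concurrent transactions do not see each other's uncommitted changes
  • Durability: Committed changes survive crashes and restarts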

Performance & Optimization

Query Optimization

// Use explain() to analyze query 
db.users.find({
    email: "john@example.com"
}).explain("executionStats")
// Look for: 
// - stage: "IXSCAN" (good) vs "COLLSCAN" (bad) 
// - executionStats.totalKeysExamined: index keys examined 
// - executionStats.totalDocsExamined: documents examined 
// - executionStats.nReturned: documents returned 
// The ratio of documents examined to documents returned should be close to 1:1 
// Bad query performance pattern: 
// { stage: "COLLSCAN", totalDocsExamined: 1000000, nReturned: 10 } 
// Good query performance pattern: 
// { stage: "IXSCAN", totalKeysExamined: 10, totalDocsExamined: 10, nReturned: 10 }
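
The examined-to-returned ratio can be computed from the executionStats section of an explain() result. The sketch below is illustrative: examinedPerReturned is a hypothetical helper, and the two input objects are hand-written stand-ins containing only the fields it reads, not real explain output.

```javascript
// Hypothetical helper: how many documents the server examined
// for each document it returned. Values near 1 indicate a selective plan.
function examinedPerReturned({ nReturned, totalDocsExamined }) {
    // Guard against division by zero when nothing matched.
    return totalDocsExamined / Math.max(nReturned, 1);
}

// Hand-written stand-ins for explain("executionStats").executionStats
const collectionScanStats = { nReturned: 10, totalKeysExamined: 0, totalDocsExamined: 1000000 };
const indexScanStats = { nReturned: 10, totalKeysExamined: 10, totalDocsExamined: 10 };

console.log(examinedPerReturned(collectionScanStats)); // 100000
console.log(examinedPerReturned(indexScanStats));      // 1
```

In mongosh the input would typically be db.users.find({ ... }).explain("executionStats").executionStats.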
Optimization Tips:
  • Create indexes on frequently queried fields - Dramatically improves performance
  • Use compound indexes wisely - Order matters (ESR: Equality, Sort, Range)
  • Filter early in aggregation pipelines - Use $match early to reduce documents
  • Avoid regex patterns that don't start with ^ - Cannot use index efficiently
  • Use projection to limit fields - Reduces network transfer
  • Batch large operations - Use insertMany, updateMany for bulk ops
  • Use appropriate data types - Improves storage and query performance

Indexing Best Practices

ESR Rule - Index Design Pattern
// E = Equality, S = Sort, R = Range 
// Query: find users by status, sorted by created date, age > 25 
// ✓ CORRECT index order: 
db.users.createIndex({
    status: 1,
    createdAt: 1,
    age: 1
})
// ✗ WRONG index order (inefficient): 
db.users.createIndex({
    age: 1,
    status: 1,
    createdAt: 1
})
// Query to optimize: 
db.users.find({
    status: "active",
    age: {
        $gt: 25
    }
}).sort({
    createdAt: -1
})
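
The ESR ordering can be applied mechanically once the fields of a query are classified. The sketch below assumes a hypothetical esrIndex helper that takes the equality fields, the sort specification and the range fields, and returns a key pattern suitable for createIndex(). It encodes only the ordering rule, not any cost analysis.

```javascript
// Hypothetical helper: orders index keys as Equality, then Sort, then Range.
function esrIndex({ equality = [], sort = {}, range = [] }) {
    const keys = {};
    for (const field of equality) keys[field] = 1;
    for (const [field, direction] of Object.entries(sort)) keys[field] = direction;
    for (const field of range) keys[field] = 1;
    return keys;
}

// find({ status: "active", age: { $gt: 25 } }).sort({ createdAt: -1 })
const keyPattern = esrIndex({
    equality: ['status'],
    sort: { createdAt: -1 },
    range: ['age']
});
console.log(keyPattern); // { status: 1, createdAt: -1, age: 1 }
```

Here the sort direction is copied from the query. An index on createdAt: 1 in the same position serves the same sort, because an index can be traversed in reverse when all sort directions are inverted together.
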
Avoid Index Bloat
// Monitor unused indexes 
db.users.aggregate([{
    $indexStats: {}
}]).pretty()
// Remove indexes with zero accesses 
db.users.dropIndex("unused_index_name")
// Rebuild indexes (reIndex is deprecated in recent MongoDB releases and restricted to standalone instances) 
db.users.reIndex()

Aggregation Optimization

// ✓ GOOD: Filter first 
db.orders.aggregate([
    { $match: { year: 2024 } }, // Filter early 
    { $group: { _id: "$category", ... } },
    { $sort: { total: -1 } }
])
// ✗ BAD: Filter after grouping 
db.orders.aggregate([
    { $group: { _id: "$category", ... } },
    { $match: { year: 2024 } } // Too late! 
])
// Aggregation Optimization: 
// - $match as early as possible 
// - $project to limit fields 
// - Avoid $lookup if possible (consider denormalization) 
// - Use $limit after $sort for top-n queries 
// - Combine $group and $project efficiently

Security Best Practices

Authentication & Authorization

Create Users
// Create admin user 
db.createUser({
    user: "admin",
    pwd: "strong_password_here",
    roles: [{
        role: "root",
        db: "admin"
    }]
})
// Create database user with limited access 
db.createUser({
    user: "app_user",
    pwd: "app_password",
    roles: [{
        role: "readWrite",
        db: "myapp"
    }]
})
// List all users 
db.getUsers()
// Drop user 
db.dropUser("app_user")
Role-Based Access Control (RBAC)
// Built-in roles: 
// - read: Read access to collections 
// - readWrite: Read and write access 
// - dbAdmin: Database administration 
// - userAdmin: Manage users and roles 
// - dbOwner: Database owner (read, write, admin) 
// - root: Super admin 
// Custom role 
db.createRole({
    role: "customRole",
    privileges: [{
        resource: {
            db: "myapp",
            collection: "users"
        },
        actions: ["find", "insert", "update"]
    }],
    roles: []
})

Connection String Security

// ✓ SECURE: Use environment variables 
const connectionString = process.env.MONGODB_URI
// ✓ SECURE: Connection string with authentication, supplied at runtime rather than in code 
// mongodb+srv://username:password@cluster.mongodb.net/database 
// ✗ INSECURE: Hardcoded credentials 
const insecureConnectionString = "mongodb+srv://admin:password@cluster.mongodb.net"
// ✓ SECURE: Use .env file 
// .env file (never commit to Git): 
// MONGODB_URI=mongodb+srv://user:pass@cluster.mongodb.net 
// ✗ INSECURE: Storing in code 
const config = {
    dbUrl: "mongodb://user:pass@localhost"
}
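
Two small habits follow from the points above: fail fast when the environment variable is missing, and never write the password to logs. The sketch below shows one possible way to do both. getMongoUri and redactUri are hypothetical helpers, and the regular expression assumes that any @ inside the password has been percent-encoded.

```javascript
// Hypothetical helper: reads the connection string from the environment
// and fails immediately if it is missing.
function getMongoUri(env = process.env) {
    const uri = env.MONGODB_URI;
    if (!uri) {
        throw new Error('MONGODB_URI is not set');
    }
    return uri;
}

// Hypothetical helper: masks the password part of a connection string
// so the value can be logged safely.
function redactUri(uri) {
    return uri.replace(/(:\/\/[^:/@]+:)[^@]*@/, '$1****@');
}

const exampleUri = getMongoUri({ MONGODB_URI: 'mongodb+srv://user:pass@cluster.mongodb.net/myapp' });
console.log(redactUri(exampleUri)); // mongodb+srv://user:****@cluster.mongodb.net/myapp
```

In an application, getMongoUri() would be called without arguments so that it reads process.env, and only the redacted form would ever be logged.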

Security Checklist

  • Enable Authentication: Always require login credentials
  • Use Strong Passwords: Enforce complexity requirements
  • Enable Encryption: Use TLS/SSL for connections, encryption at rest
  • Limit Network Access: Use IP whitelisting, VPC restrictions
  • Audit Logging: Enable audit logs for compliance
  • Regular Backups: Implement backup and restore procedures
  • Updates: Keep MongoDB updated with security patches
  • Data Validation: Validate all inputs to prevent injection attacks
  • Principle of Least Privilege: Grant minimum necessary permissions

Backup & Restore

mongodump & mongorestore

Full Database Backup
# Backup entire database 
mongodump --db myapp --out ./backups
# Backup with authentication 
mongodump --db myapp --username admin --password --authenticationDatabase admin
# Backup using a connection URI 
mongodump --uri "mongodb+srv://user:pass@cluster.mongodb.net/myapp"
# Backup specific collection 
mongodump --db myapp --collection users
Restore Backup
# Restore entire backup 
mongorestore ./backups
# Restore to specific database 
mongorestore --db myapp_restored ./backups/myapp
# Restore specific collection 
mongorestore --db myapp --collection users ./backups/myapp/users.bson

Export/Import JSON & CSV

# Export to JSON 
mongoexport --db myapp --collection users --out users.json
# Export to CSV 
mongoexport --db myapp --collection users --type csv --fields name,email,age --out users.csv
# Import from JSON 
mongoimport --db myapp --collection users --file users.json
# Import from CSV (--headerline uses the first row as field names) 
mongoimport --db myapp --collection users --type csv --headerline --file users.csv

MongoDB Atlas Backup

MongoDB Atlas provides automated backups and point-in-time recovery.

  • Continuous Backup: Automatic hourly snapshots
  • Point-in-Time Recovery: Restore to any moment within retention window
  • Snapshot Export: Export to S3, export to new Atlas cluster
  • Retention Policies: Configurable retention windows

MongoDB with Node.js

MongoDB Native Driver

Connection & Setup
const {
    MongoClient
} = require('mongodb');
const client = new MongoClient(process.env.MONGODB_URI);
async function connectDB() {
    try {
        await client.connect();
        console.log('Connected to MongoDB');
        const db = client.db('myapp');
        return db;
    } catch (error) {
        console.error('Connection failed:', error);
        process.exit(1);
    }
}
// Close connection 
await client.close();
CRUD with Native Driver
const db = await connectDB();
const users = db.collection('users');
// Create 
const result = await users.insertOne({
    name: 'John',
    email: 'john@example.com'
});
// Read 
const user = await users.findOne({
    _id: result.insertedId
});
// Update 
await users.updateOne({
    _id: result.insertedId
}, {
    $set: {
        name: 'Jane'
    }
});
// Delete
await users.deleteOne({
    _id: result.insertedId
});

Mongoose ODM

Setup & Schema
const mongoose = require('mongoose');
// Connect 
mongoose.connect(process.env.MONGODB_URI);
// Define schema 
const userSchema = new mongoose.Schema({
    name: {
        type: String,
        required: true
    },
    email: {
        type: String,
        unique: true,
        required: true
    },
    age: {
        type: Number,
        min: 0,
        max: 150
    },
    active: {
        type: Boolean,
        default: true
    },
    createdAt: {
        type: Date,
        default: Date.now
    }
});
// Create model 
const User = mongoose.model('User', userSchema);
module.exports = User;
Mongoose CRUD Operations
// Create 
const created = await User.create({
    name: 'John',
    email: 'john@example.com'
});
// Read 
const activeUsers = await User.find({
    active: true
});
const userById = await User.findById(userId);
const userByEmail = await User.findOne({
    email: 'john@example.com'
});
// Update 
await User.updateOne({
    _id: userId
}, {
    $set: {
        name: 'Jane'
    }
});
// Delete 
await User.deleteOne({
    _id: userId
});
// Populate references 
const posts = await Post.find().populate('userId'); // Get full user object
Mongoose Validation & Middleware
const bcrypt = require('bcrypt'); // assumes the bcrypt package is installed
const userSchema = new mongoose.Schema({
    email: {
        type: String,
        required: [true, 'Email is required'],
        match: [/.+@.+\..+/, 'Invalid email format'],
        unique: true
    },
    // The hook below hashes this field, so it must be part of the schema 
    password: {
        type: String,
        required: true
    },
    age: {
        type: Number,
        min: [0, 'Age cannot be negative'],
        max: [150, 'Age too high']
    }
});
// Pre-save middleware 
userSchema.pre('save', async function() {
    // Hash password before saving; an async hook does not need the next() callback 
    if (this.isModified('password')) {
        this.password = await bcrypt.hash(this.password, 10);
    }
});
// Post-save middleware 
userSchema.post('save', function(doc) {
    console.log('User saved:', doc);
});

Advanced Topics

Change Streams

Change streams allow you to monitor real-time changes to MongoDB data.

// Watch collection for changes 
const changeStream = db.collection('users').watch();
changeStream.on('change', (change) => {
    console.log('Change detected:', change);
    // change.operationType: 'insert', 'update', 'delete', etc. 
    // change.fullDocument: full document after change (for updates, only with the fullDocument: 'updateLookup' option) 
    // change.updateDescription: what was updated 
});
// Watch with pipeline 
const pipeline = [{
    $match: {
        operationType: 'update'
    }
}];
const updateStream = db.collection('users').watch(pipeline);

Sharding Basics

Sharding distributes data across multiple MongoDB instances for horizontal scalability.

Key Concepts:

  • Shard Key: Field used to distribute data (e.g., userId)
  • Config Servers: Store cluster metadata
  • Mongos: Router that directs queries to correct shard
  • Chunks: Range of shard key values

MongoDB Atlas Features

  • Multi-Cloud Deployment: Deploy on AWS, GCP, or Azure
  • Automated Backups: Continuous backups with point-in-time recovery
  • Live Replication: Data mirroring across multiple regions
  • Charts & Dashboards: Visual data analytics
  • Realm: Backend-as-a-Service with MongoDB
  • Search: Full-text search capabilities
  • Mobile SDKs: Sync data to mobile apps

Time Series Collections

Optimized for storing time-stamped data like metrics and analytics.

// Create time series collection 
db.createCollection('metrics', {
    timeseries: {
        timeField: 'timestamp',
        metaField: 'metadata',
        granularity: 'seconds' // or 'minutes', 'hours'
    }
})
// Insert time series data 
db.metrics.insertOne({
    timestamp: new Date(),
    metadata: {
        region: 'us-west',
        host: 'server-1'
    },
    cpu: 42.5,
    memory: 78.2
})
// Delete old data automatically: time series collections take expireAfterSeconds 
// as a collection option instead of a plain TTL index 
db.runCommand({
    collMod: 'metrics',
    expireAfterSeconds: 2592000 // 30 days
})

Common Interview Questions (2025)

Interview Questions

1. What is MongoDB and how is it different from SQL databases?

Answer: MongoDB is a NoSQL document database that stores data as JSON-like BSON documents with flexible schemas, unlike SQL databases with fixed table structures.

Aspect        MongoDB                 SQL
Schema        Flexible                Fixed
Scalability   Horizontal (sharding)   Vertical
Transactions  Multi-doc (v4.0+)       Native
Joins         $lookup (expensive)     Native (efficient)
Structure     Documents (nested)      Tables (normalized)

Tips & Tricks

Pro Tips

Tip 1: Always index your query filters

If you frequently query by email, create an index. It's one of the most impactful optimizations.

Tip 2: Use projection to limit fields

find({}, { name: 1, email: 1 }) is more efficient than returning all fields, especially for large documents.

Tip 3: Design queries first, then schema

Know how your application will query data before designing the schema. This prevents performance problems later.

Tip 4: Use updateOne/updateMany over find + update

updateMany() is atomic and more efficient than finding documents then updating them individually.

Tip 5: Batch operations for better performance

insertMany() is much faster than individual insertOne() calls.

// Slower 
for (let doc of docs) {
    db.collection.insertOne(doc)
}
// Faster 
db.collection.insertMany(docs, {
    ordered: false
})
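
When the source array is very large, one option is to send it in fixed-size batches so that memory use and request size stay bounded. The sketch below shows only the splitting step. chunk is a hypothetical helper and the batch size of 1000 in the usage comment is an arbitrary assumption, not a recommended value.

```javascript
// Hypothetical helper: splits an array into consecutive batches of at most `size` items.
function chunk(items, size) {
    const batches = [];
    for (let start = 0; start < items.length; start += size) {
        batches.push(items.slice(start, start + size));
    }
    return batches;
}

console.log(chunk([1, 2, 3, 4, 5], 2)); // [ [ 1, 2 ], [ 3, 4 ], [ 5 ] ]

// Possible use with the Node.js driver (not executed here):
// for (const batch of chunk(docs, 1000)) {
//     await collection.insertMany(batch, { ordered: false });
// }
```

Each batch is still a single insertMany() call, so the advantage over individual insertOne() calls is kept.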

Tip 6: Monitor slow queries

Enable profiling to find slow queries, then optimize them with indexes and better design.

Tip 7: Use connection pooling

Reuse a single client and its connection pool across requests instead of opening a new connection for each one. This avoids repeated connection setup and noticeably improves application performance.

Tip 8: Denormalize cautiously

Denormalization improves read performance but complicates updates. Use only when read performance is critical.

Common Gotchas

Gotcha 1: ObjectId comparison

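A string and an ObjectId holding the same hex value are NOT equal, so querying _id with a plain string matches nothing. Convert the string first: ObjectId("string"). In JavaScript, compare two ObjectId values with .equals() rather than ===.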

Gotcha 2: Regex without ^

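/mongodb/ cannot use an index efficiently because every index key must be examined. Use a case-sensitive anchored pattern such as /^mongodb/ so that only a range of the index is scanned.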

Gotcha 3: Unbounded arrays grow forever

If an array grows without limit, the document will eventually exceed the 16 MB size limit. Store the items in a separate collection instead.

Gotcha 4: $push on non-existent field creates array

{ $push: { newArray: value } } creates the array even if it doesn't exist.

Gotcha 5: Implicit $and vs explicit $or

Multiple conditions in filter are implicit AND. Must use $or explicitly for OR logic.

Gotcha 6: $unset with empty string value

{ $unset: { field: "" } } removes the field. The value doesn't matter.

Useful Shortcuts & Commands

// Clear entire collection quickly 
db.users.deleteMany({})
// Remove the collection entirely (also drops its indexes; it is recreated on the next insert) 
db.users.drop()
// Get database size 
db.stats()
// Get collection size 
db.users.stats()
// Count documents 
db.users.countDocuments()
// Get distinct values 
db.users.distinct('country')
// Check if collection exists 
db.getCollectionNames().includes('users')
// Rename field in all documents 
db.users.updateMany({}, {
    $rename: {
        oldName: 'newName'
    }
})
// Add default value to all documents 
db.users.updateMany({}, {
    $set: {
        status: 'active'
    }
})
// Remove field from all documents 
db.users.updateMany({}, {
    $unset: {
        tempField: ''
    }
})