MongoDB¶
Nilus supports MongoDB as a batch destination, enabling users to load structured or semi-structured data into MongoDB collections. MongoDB is a distributed NoSQL document database known for horizontal scalability, flexible schema design, and high availability.
Currently, MongoDB destinations in Nilus operate exclusively in full-replace mode, meaning the target collection is cleared and then fully rewritten during every run.
Info
MongoDB as a batch destination cannot be used in CDC pipelines, as incremental sync modes (merge, upsert, append) are not supported.
Prerequisites¶
Before configuring MongoDB as a sink, ensure the following:
- MongoDB Server Access: The MongoDB host must be reachable from the Nilus environment (self-hosted instance or MongoDB Atlas).
- Authentication: If authentication is enabled, a valid username and password are required.
- Permissions: The MongoDB user must have:
  - Read/write privileges on the target database.
  - Permission to create databases/collections, if they do not already exist.
- Pre-created MongoDB Depot: A Depot must exist in DataOS with read-write access. Depots configured with multiple nodes are supported, so a depot created for a MongoDB CDC source can be reused as a destination.
MongoDB Depot Manifest
name: mongomultiport
version: v1
type: depot
tags:
  - MongoDb
  - Sanity
  - dataos:type:resource
  - dataos:type:cluster-resource
  - dataos:resource:depot
  - dataos:layer:user
layer: user
depot:
  type: mongodb
  description: "MongoDb depot with multiple nodes"
  compute: niluscompute
  spec:
    subprotocol: mongodb
    nodes:
      - <host>.dataos.info:27017
      - <host>.info:27018
      - <host>.dataos.info:27019
    params:
      authSource: admin
      replicaSet: rs0
  external: "true"
  secrets:
    - name: ${{instance-secret-name}}-r
      allkeys: ${{true}}
    - name: ${{instance-secret-name}}-rw
      allkeys: ${{true}}
Info
Specify the target database and collection using dest-table in the format database.collection.
Sink Configuration¶
name: mongo-batch
version: v1
type: workflow
tags:
  - workflow
  - nilus-batch
description: Nilus Batch Service Sample
workspace: public
workflow:
  dag:
    - name: mongo-lakehouse
      spec:
        stack: nilus:1.0
        compute: runnable-default
        resources:
          requests:
            cpu: 100m
            memory: 256Mi
        logLevel: DEBUG
        stackSpec:
          source:
            address: dataos://postgresdepot
            options:
              source-table: "test_db.employee"
          sink:
            address: dataos://mongomultiport
            options:
              dest-table: "nilus_testing.employee"
              incremental-strategy: replace
URI & Parameter Details¶
| Parameter | Required | Description | Callouts |
|---|---|---|---|
| `username` | Yes (if auth enabled) | Username for authentication | Ensure the user is allowed to write to the target DB |
| `password` | Yes (if auth enabled) | Password associated with the user | - |
| `host` | Yes | MongoDB server hostname | Must be reachable from Nilus |
| `port` | No | Server port (default: 27017) | - |
| Query Params | No | Connection options (`tls=true`, etc.) | Useful for Atlas or SSL environments |
Info
Do not append the database name to the URI. Specify the database and collection with dest-table, in the format database.collection.
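The parameters in the table above correspond to the parts of a standard MongoDB connection string. As a minimal sketch, assuming the usual `mongodb://user:password@host:port/?options` layout, the helper below assembles such a URI and percent-encodes the credentials. The function name `build_mongo_uri` and the host `db.example.com` are hypothetical, and the source does not state how Nilus or the Depot builds the URI internally.

```python
from urllib.parse import quote_plus, urlencode


def build_mongo_uri(host, port=27017, username=None, password=None, params=None):
    """Assemble a MongoDB URI without a database name (hypothetical helper)."""
    auth = ""
    if username is not None:
        # Credentials are percent-encoded so characters like '@' or '/' stay unambiguous.
        auth = quote_plus(username)
        if password is not None:
            auth += ":" + quote_plus(password)
        auth += "@"
    uri = f"mongodb://{auth}{host}:{port}"
    if params:
        # Query parameters such as tls=true follow "/?"; no database name is added.
        uri += "/?" + urlencode(params)
    return uri


print(build_mongo_uri("db.example.com", username="nilus",
                      password="p@ss/word", params={"tls": "true"}))
```

The database and collection are deliberately left out of the URI, since they are supplied separately through dest-table.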
Sink Attribute Details¶
Nilus supports the following MongoDB sink options:
| Option | Required | Description | Callouts |
|---|---|---|---|
| `dest-table` | Yes | Specifies `<database>.<collection>` | Creates DB/collection if they do not exist |
| `incremental-strategy` | Yes | Must be `replace` | Collection is cleared before each load |
Destination Table Format
`<database>.<collection>`
Example:
`dest-table: "nilus_testing.employee"`
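To make the `database.collection` format concrete, here is a small sketch that splits a dest-table value into its two parts. It splits on the first dot only, because MongoDB database names cannot contain a dot while collection names can. The function name `split_dest_table` is hypothetical, and whether Nilus parses the value this way is an assumption.

```python
def split_dest_table(dest_table: str) -> tuple[str, str]:
    """Split 'database.collection' into (database, collection)."""
    # partition() splits on the first dot; any later dots stay in the collection name.
    database, sep, collection = dest_table.partition(".")
    if not sep or not database or not collection:
        raise ValueError(f"expected '<database>.<collection>', got {dest_table!r}")
    return database, collection


print(split_dest_table("nilus_testing.employee"))
```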
Supported Incremental Strategies¶
| Strategy | Supported | Behavior |
|---|---|---|
| `replace` | ✅ Yes | Drops/clears the collection and rewrites all data |
| `append` | ❌ No | Not supported |
| `merge` | ❌ No | Not supported |
| `upsert` | ❌ No | Not supported |
Info
MongoDB destinations always run in full-replace mode.
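Since only one strategy is accepted, a pipeline definition can be checked before it is submitted. The following is an illustrative pre-flight check over a sink `options` mapping, based only on the option and strategy names listed on this page. The function `check_sink_options` is hypothetical and is not part of Nilus.

```python
SUPPORTED_STRATEGIES = {"replace"}
UNSUPPORTED_STRATEGIES = {"append", "merge", "upsert"}


def check_sink_options(options: dict) -> list[str]:
    """Return a list of problems found in a MongoDB sink options mapping."""
    problems = []
    if "dest-table" not in options:
        problems.append("dest-table is required")
    strategy = options.get("incremental-strategy")
    if strategy is None:
        problems.append("incremental-strategy is required")
    elif strategy in UNSUPPORTED_STRATEGIES:
        problems.append(f"incremental-strategy '{strategy}' is not supported; use 'replace'")
    elif strategy not in SUPPORTED_STRATEGIES:
        problems.append(f"unknown incremental-strategy '{strategy}'")
    return problems


print(check_sink_options({"dest-table": "nilus_testing.employee",
                          "incremental-strategy": "merge"}))
```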
Behavior & Capabilities¶
- Data is written in batches via MongoDB bulk operations.
- If the collection exists:
  - It is fully cleared under the `replace` strategy.
- If the collection does not exist, Nilus automatically creates it.
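The behavior listed above can be pictured with an in-memory stand-in. In this sketch a plain dict plays the role of the database and a list plays the role of a collection. It is not the Nilus implementation and does not talk to MongoDB. The names `batched` and `full_replace` and the default batch size are assumptions chosen for illustration.

```python
from itertools import islice


def batched(records, size):
    """Yield successive lists of at most `size` records."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def full_replace(collections, name, records, batch_size=1000):
    """Mimic a full-replace load into collections[name]; returns the batch count."""
    target = collections.setdefault(name, [])  # collection is created when missing
    target.clear()                             # existing documents are removed first
    written_batches = 0
    for batch in batched(records, batch_size):
        target.extend(batch)                   # stand-in for one bulk write
        written_batches += 1
    return written_batches


db = {"employee": [{"_id": 0, "stale": True}]}
print(full_replace(db, "employee", [{"_id": i} for i in range(5)], batch_size=2))
print(db["employee"])
```

After the call, nothing from the previous contents of the collection remains, which is the property that rules this mode out for incremental workloads.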
Document ID Behavior¶
- If incoming data contains `_id`, it becomes the MongoDB document `_id`.
- If `_id` is not present, MongoDB auto-generates one.
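Because a supplied `_id` becomes the document key, and MongoDB requires `_id` to be unique within a collection, repeated values in the incoming data will collide on write. A quick upstream check, sketched here with a hypothetical helper `duplicate_ids`, can surface them before a run. It assumes the `_id` values are hashable (strings, numbers, and similar).

```python
from collections import Counter


def duplicate_ids(documents):
    """Return the _id values that occur more than once, in first-seen order."""
    # Documents without _id are skipped; MongoDB would generate an _id for those.
    counts = Counter(doc["_id"] for doc in documents if "_id" in doc)
    return [value for value, n in counts.items() if n > 1]


rows = [{"_id": 1}, {"_id": 2}, {"_id": 1}, {"name": "no id"}]
print(duplicate_ids(rows))
```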
Type Mapping¶
Nilus maps common types automatically:
| Incoming Type | MongoDB Representation |
|---|---|
| Numbers | int / long / double |
| Timestamps | ISODate |
| Booleans | boolean |
| Strings | string |
| Nested Objects | Embedded documents |
| Lists | Arrays |
Unsupported or complex types may be stringified.
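As a rough illustration of the table above, the sketch below classifies Python values into the listed MongoDB representations. It only approximates the mapping. The 32-bit and 64-bit integer boundaries come from BSON's integer types, while the fallback label and the function name `mongo_type_name` are assumptions, not documented Nilus behavior.

```python
from datetime import datetime

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def mongo_type_name(value):
    """Name the MongoDB representation a Python value would roughly map to."""
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        if INT32_MIN <= value <= INT32_MAX:
            return "int"
        if INT64_MIN <= value <= INT64_MAX:
            return "long"
        return "string (stringified)"  # assumption: too large for a BSON integer
    if isinstance(value, float):
        return "double"
    if isinstance(value, datetime):
        return "ISODate"
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        return "embedded document"
    if isinstance(value, (list, tuple)):
        return "array"
    return "string (stringified)"  # assumption: fallback for unsupported types


for sample in (True, 7, 2**40, 1.5, datetime(2024, 1, 1), "x", {"a": 1}, [1, 2]):
    print(type(sample).__name__, "->", mongo_type_name(sample))
```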
Performance Considerations¶
- Since replace mode rewrites the entire collection, it is best suited for:
  - Small to medium datasets
  - Non-real-time workloads
- Not ideal for:
  - Collections with millions of documents
  - Use cases requiring incremental updates or CDC
Recommendations for Large Data Volumes¶
- Use a Lakehouse (Delta/BigQuery/S3) as the primary destination.
- Sync to MongoDB downstream using a tool that supports incremental merges.
Troubleshooting¶
| Issue | Possible Cause | Resolution |
|---|---|---|
| Authentication failure | Wrong credentials or missing roles | Validate username/password and roles. Verify auth database (default is admin unless specified) |
| Database not found | User lacks permission to create databases | Grant appropriate privileges |
| Duplicate key `_id` errors | Incoming data contains duplicate `_id` values | Remove or regenerate `_id` upstream |
| TLS/SSL errors | Invalid certificates or missing TLS params | Add URI params such as `tls=true`. Use `tlsAllowInvalidCertificates=true` only for testing, since it disables certificate validation |