Scanner for MariaDB¶
You can scan metadata details such as schemas, tables, view details etc. from MariaDB with depot/non-depot Scanner workflows. In this document, find requirements and YAML configurations to connect to MariaDB for extracting entity metadata.
Requirements¶
- Credentials to connect to MariaDB database.
- To extract metadata, the MariaDB user needs to have access to the
INFORMATION_SCHEMA
. By default a user can see only the rows in theINFORMATION_SCHEMA
that correspond to objects for which the user has the proper access privileges.
Depot Scan Workflow¶
DataOS allows you to connect to a database with JDBC driver to read data from tables using Depot. The Depot enables access to all schemas visible to the specified user in the configured database, MariaDB. You can scan metadata from an JDBC-type depot with Scanner workflows.
Depot Scan Workflow YAML
You can apply database, schema, and table filter patterns while scanning metadata.
version: v1
name: mariadb-scanner2
type: workflow
tags:
- mariadb-scanner2.0
description: The job scans schema tables and register metadata
workflow:
dag:
- name: scanner2-mariadb
description: The job scans schema from mariadb depot tables and register metadata to metis2
spec:
stack: scanner:2.0
compute: runnable-default
runAsUser: metis
stackSpec:
depot: mariadb01
# sourceConfig:
# config:
# schemaFilterPattern:
# includes:
# - mariadb
#
Non-Depot Scan Workflow¶
You need to provide source connection details and configuration settings, such as metadata type and filter patterns, to include/exclude assets for metadata scanning.
Scanner Configuration Properties¶
- Type: This is the source to be scanned;
mariadb
- Source: Provide source name where the scanned metadata is saved within Metastore. Under the given source name, you can see the information about all the entities scanned for your data source;
MariaDBSource
Source Connection Properties¶
- Type: Specify source type;
MariaDB
- Username: Specify the User to connect to MariaDB. It should have enough privileges to read all the metadata.
- Password: Password to connect to MariaDB.
- Host and Port: Enter the fully qualified hostname and port number for your MariaDB deployment in the Host and Port field.
- databaseSchema: databaseSchema of the data source. This is optional parameter, if you would like to restrict the metadata reading to a single databaseSchema.
- Connection Options (Optional): Enter the details for any additional connection options that can be sent to MariaDB during the connection. These details must be added as Key-Value pairs.
- Connection Arguments (Optional): Enter the details for any additional connection arguments such as security or protocol configs that can be sent to MariaDB during the connection. These details must be added as Key-Value pairs.
- In case you are using Single-Sign-On (SSO) for authentication, add the
authenticator
details in the Connection Arguments as a Key-Value pair as follows:"authenticator" : "sso_login_url"
- In case you authenticate with SSO using an external browser popup, then add the
authenticator
details in the Connection Arguments as a Key-Value pair as follows:"authenticator" : "externalbrowser"
- In case you are using Single-Sign-On (SSO) for authentication, add the
Non-Depot Scan Workflow YAML
In this example, sample source connection and configuration settings are provided.
version: v1
name: scanner2-mariadb # Scanner workflow name
type: workflow
tags:
- scanner
- mariadb
description: The job scans schema tables and register metadata
workflow:
dag:
- name: scanner2-mariadb
description: The job scans schema from mariadb tables and registers their metadata to metis2
spec:
tags:
- scanner2
stack: scanner:2.0
compute: runnable-default
runAsUser: metis
stackSpec:
type: mariadb
source: MariaDBSource
sourceConnection:
config:
type: MariaDB
username: xxxxxx
password: xxxxxxxxxxxxxxxxxxxxx
hostPort: mariadbdev.cb98qlrtcmrz.us-east-1.rds.amazonaws.com:3306
# databaseSchema: schema
sourceConfig:
config:
markDeletedTables: false
includeTables: true
includeViews: true
databaseFilterPattern:
includes:
- database1
- database2
excludes:
- database3
- database4
schemaFilterPattern:
includes:
- schema1
- schema2
excludes:
- schema3
- schema4
tableFilterPattern:
includes:
- table1
- table2
excludes:
- table3
- table4
After the successful workflow run, you can check the metadata of scanned Tables on Metis UI.