
Steps to create an Amazon S3 Depot

Pre-requisites specific to Depot creation

  • Tags: A developer must possess the following tags, which can be obtained from a DataOS operator.

            NAME     │    ID    │  TYPE  │      EMAIL       │        TAGS
        ─────────────┼──────────┼────────┼──────────────────┼────────────────────
          Iamgroot   │ iamgroot │ person │ iamgroot@tmdc.io │ roles:id:data-dev,
                     │          │        │                  │ roles:id:user,
                     │          │        │                  │ users:id:iamgroot
    
  • Use cases: As an alternative to tags, a developer can create a Depot if an operator grants them the "Manage All Instance-level Resources of DataOS in the user layer" use case through Bifrost Governance.

    Bifrost Governance

Pre-requisites specific to the S3 Depot

To create an S3 Depot, you must have the following details:

  • AWS access key ID: The Access Key ID used to authenticate and authorize API requests to your AWS account. This can be obtained from the AWS IAM (Identity and Access Management) Console under your user’s security credentials or requested from your AWS administrator.

  • AWS bucket name: The name of the Amazon S3 bucket where the data resides. You can find this in the AWS S3 Console under the list of buckets or request it from the administrator managing the storage.

  • Secret access key: The Secret Access Key associated with your AWS Access Key ID, required for secure API requests. This is available in the AWS IAM Console under your user’s security credentials. Ensure that it is securely stored and shared only with authorized personnel.

  • Scheme: The scheme specifies the protocol to be used for the connection, such as s3 or https. This information depends on your system’s configuration and can be confirmed with the team managing the connection setup or workflow.

  • Relative Path: The path within the S3 bucket that points to the specific data or folder you want to access. This path is typically structured according to how your data is organized and can be obtained from the team managing the data or the AWS S3 Console.

  • Format: The file format of the data stored in the S3 bucket, such as CSV, Parquet, or JSON. This information is determined by the structure of your data and can be confirmed with the team managing the data storage.
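As a rough illustration of how the scheme, bucket name, and relative path fit together, the sketch below assembles them into a single storage URI. It assumes the usual `scheme://bucket/path` layout for S3 locations; the values `s3a`, `my-bucket`, and `/sales/2024` are hypothetical placeholders, not values from your environment.

```shell
# Hypothetical placeholder values; substitute the details gathered above.
SCHEME="s3a"
BUCKET="my-bucket"
RELATIVE_PATH="/sales/2024"

# Drop one leading slash from the relative path so the URI has a single separator.
DATA_URI="${SCHEME}://${BUCKET}/${RELATIVE_PATH#/}"
echo "${DATA_URI}"   # prints s3a://my-bucket/sales/2024

# Optional sanity check with the AWS CLI, if it is installed. It expects the
# s3:// scheme and reads AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from the
# environment:
#   aws s3 ls "s3://${BUCKET}/${RELATIVE_PATH#/}/"
```

Listing the path with the AWS CLI before creating the Depot can help confirm that the access key, bucket name, and relative path are consistent with each other.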

Create an S3 Depot

Amazon Simple Storage Service (Amazon S3) is an object storage system. Object stores are distributed storage systems designed to store and manage large amounts of unstructured data.

DataOS enables the creation of a Depot of type 'S3' to facilitate the reading of data stored in an Amazon S3 bucket. Each Depot points to a single bucket, set by the bucket attribute in the manifest, and can be narrowed to a location within that bucket using the relative path. To create a Depot of type 'S3', follow the steps below:

Step 1: Create an Instance Secret for securing S3 credentials

Begin by creating an Instance Secret Resource by following the Instance Secret document.

Step 2: Create an S3 Depot manifest file

Begin by creating a manifest file to hold the configuration details for your S3 Depot.

name: ${{depot-name}}
version: v2alpha
type: depot
tags:
    - ${{tag1}}
owner: ${{owner-name}}
layer: user
description: ${{description}}
depot:
    type: S3
    external: ${{true}}
    s3:
        scheme: ${{s3a}}
        bucket: ${{project-name}}
        relativePath: ${{relative-path}}
        format: ${{format}}
    secrets:
        - name: ${{s3-instance-secret-name}}-r
          allkeys: true

        - name: ${{s3-instance-secret-name}}-rw
          allkeys: true

To get the details of each attribute, please refer to this link.
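For illustration, the template above might be filled in and saved to a file as sketched below. Every value here (the Depot name `s3depot`, the tag, the owner, the bucket, the relative path, the format, and the Instance Secret names `s3depot-r` and `s3depot-rw`) is a hypothetical placeholder, and the file name `s3-depot.yaml` is likewise an assumption; replace them with the details gathered in the pre-requisites and the secrets created in Step 1.

```shell
# Write a filled-in copy of the Depot template to s3-depot.yaml.
# All values below are hypothetical examples, not defaults.
# The quoted 'EOF' keeps the shell from expanding anything inside the manifest.
cat > s3-depot.yaml <<'EOF'
name: s3depot
version: v2alpha
type: depot
tags:
    - s3
owner: iamgroot
layer: user
description: Depot for sales data stored in Amazon S3
depot:
    type: S3
    external: true
    s3:
        scheme: s3a
        bucket: my-bucket
        relativePath: /sales/2024
        format: parquet
    secrets:
        - name: s3depot-r
          allkeys: true
        - name: s3depot-rw
          allkeys: true
EOF
```

The two entries under `secrets` follow the `-r` and `-rw` naming used in the template, so they are expected to match the names of the read and read-write Instance Secrets.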

Step 3: Apply the Depot manifest file

Once the manifest file is ready, copy its path and apply it through the DataOS CLI, replacing the placeholder in the command below with that path:

dataos-ctl resource apply -f ${{yamlfilepath}}

Alternative command:

dataos-ctl apply -f ${{yamlfilepath}}

Verify the Depot creation

To ensure that your Depot has been successfully created, you can verify it in two ways:

  • Check the name of the newly created Depot in the list of Depots where you are named as the owner:

    dataos-ctl get -t depot
    
  • Additionally, retrieve the list of all Depots created in your organization:

    dataos-ctl get -t depot -a
    

You can also access the details of any created Depot through the DataOS GUI in the Operations App and Metis UI.
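The CLI check described above can also be scripted. The helper below is a minimal sketch, not part of the DataOS CLI: `depot_listed` is a hypothetical function name, and it only assumes that the Depot name appears somewhere in the text printed by `dataos-ctl get -t depot`.

```shell
# depot_listed NAME: succeeds if NAME appears as a whole word in the text on stdin.
# Hypothetical helper; it makes no assumption about the column layout of the output.
depot_listed() {
    grep -qw -- "$1"
}

# Usage (requires dataos-ctl; "s3depot" is a hypothetical Depot name):
#   dataos-ctl get -t depot | depot_listed s3depot && echo "Depot found"
```

Matching on whole words avoids a false positive when one Depot name is a prefix of another.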

Delete a Depot