Harvesting from Oracle Object Storage

Harvesting is a process that extracts technical metadata from your data assets into your data catalog. A Data Asset represents a data source, such as a database, an object store, a file or document store, a message queue, or an application.

In this tutorial, you:

  1. Allow Data Catalog to access any object in your Oracle Object Storage, in any bucket, in any compartment within the tenancy where the policy is created.
  2. Create an Oracle Object Storage data asset.
  3. Add one default connection for the data asset.
  4. Harvest the data asset by running the harvest job immediately.

For additional information, see harvesting technical metadata.

1. Creating an Access Policy

You create a policy to allow Data Catalog to access your Object Storage resources.

At a minimum, you must have READ permission for the Object Storage aggregate resource type object-family, or for all the individual resource types objectstorage-namespaces, buckets, and objects.

To create an access policy to grant READ permission to the Object Storage aggregate resource type object-family, perform the following steps:

  1. Open the Console navigation menu and then select Policies under Identity.
  2. Click Create Policy.
  3. In the Create Policy panel, enter a name for the policy, for example, data-catalog-policy. The name must be unique across all policies in your tenancy, and you cannot change it later.
    Next, enter a description, such as "Grant access to all resources in any compartment in the tenancy", and then select Keep Policy Current.
  4. In the Policy Statements field, enter the following policy statement, and then click Create.
    allow service datacatalog to read object-family in tenancy
    Note

    This policy allows access to any object, in any bucket, in any compartment within the tenancy where the policy is created. For more examples, see policy examples.
You have successfully created the policy to allow Data Catalog to access all your Oracle Object Storage resources.
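
The two permission options described in this section (the aggregate resource type object-family, or the three individual resource types) can be sketched as policy statement text. The following Python helper is only an illustration: the function name build_policy_statements and its parameters are hypothetical, and narrowing the scope to a single compartment is an assumption that goes beyond the tenancy-wide statement used in this tutorial. The helper produces strings and does not call any Oracle Cloud Infrastructure API.

```python
# Hypothetical helper: builds policy statement text only, no API calls.

# Individual Object Storage resource types named in this section.
INDIVIDUAL_TYPES = ["objectstorage-namespaces", "buckets", "objects"]


def build_policy_statements(aggregate=True, compartment=None):
    """Return statements granting Data Catalog READ on Object Storage.

    aggregate=True uses the object-family aggregate resource type;
    aggregate=False emits one statement per individual resource type.
    compartment=None keeps the tenancy-wide scope used in this tutorial;
    passing a compartment name is an assumption made for illustration.
    """
    scope = "tenancy" if compartment is None else f"compartment {compartment}"
    types = ["object-family"] if aggregate else INDIVIDUAL_TYPES
    return [f"allow service datacatalog to read {t} in {scope}" for t in types]


# Default call reproduces the statement entered in step 4.
for statement in build_policy_statements():
    print(statement)
```

Calling build_policy_statements(aggregate=False) returns three statements, one per individual resource type, which can be pasted into the Policy Statements field in place of the single aggregate statement.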

2. Creating a Data Asset

You are now ready to register your Oracle Object Storage data sources with Data Catalog as a data asset.

To create an Oracle Object Storage data asset, perform the following steps:

  1. In the Console, open the navigation menu, and then under Data and AI, click Data Catalog.
  2. Click the data catalog instance where you want to create your data asset.
  3. On your data catalog instance Home page, click Create Data Asset from the Quick Actions tile.
  4. In the Create Data Asset panel, enter a name to uniquely identify your data asset. Optionally, enter a description. Then, from the Type drop-down list, select Oracle Object Storage. Additional fields display.
  5. In the URL field, enter the Swift URL for your Oracle Cloud Infrastructure Object Storage resource. For example: https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/.
  6. In the Namespace field, enter the object storage namespace for the specified Oracle Cloud Infrastructure Object Storage resource and then click Save.
    Note

    To view your Object Storage namespace string in the Console, from the Profile menu click Tenancy: <your_tenancy_name>. The namespace is listed under Object Storage Settings.
You have successfully created your Oracle Object Storage data asset.
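
In the example URL from step 5, the region identifier sits between swiftobjectstorage and oraclecloud.com. The sketch below assembles the two values that the Create Data Asset panel asks for. The function name, the dictionary keys, and the namespace value are hypothetical, and the URL pattern is inferred from that single example, so treat it as an assumption rather than a rule that holds for every region.

```python
from urllib.parse import urlparse


def data_asset_fields(region, namespace):
    """Assemble the URL and Namespace values for the Create Data Asset panel.

    Assumption: the Swift endpoint follows the pattern of the tutorial's
    example, https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/.
    """
    if not region or not namespace:
        raise ValueError("region and namespace are both required")
    url = f"https://swiftobjectstorage.{region}.oraclecloud.com/"
    # Sanity check: the result must parse as an HTTPS URL with a host name.
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.hostname:
        raise ValueError(f"could not build a valid URL from region {region!r}")
    return {"url": url, "namespace": namespace}


# "examplenamespace" is a placeholder, not a real namespace string.
fields = data_asset_fields("us-phoenix-1", "examplenamespace")
print(fields["url"])
```

Keeping the region and namespace as separate inputs mirrors the panel, where the URL and the Namespace are entered in separate fields.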

3. Adding a Connection

After creating the Oracle Object Storage data asset, you create a connection for the data asset.

To create a connection for your Oracle Object Storage data asset, perform the following steps:

  1. On the Home page, click Data Assets to access the Data Assets page.
  2. In the Data Assets list, select the data asset you created previously.
  3. In the Summary tab on the data asset details page, under Connection Information, click Add Connection.
  4. In the Add Connection panel, enter a unique name for your connection. Optionally, enter a short description. Ensure that S2S Principal is selected for Type.
  5. In the OCI Region field, enter the region identifier for your Object Storage resource.
    Note

    To view the region identifier for your region in the Console, from the Profile menu click Tenancy: <your_tenancy_name>. The region name and identifier are displayed.
  6. In the Compartment OCID field, enter the compartment OCID for your Object Storage resource.
    Note

    To view the compartment OCID in the Console, navigate to Identity → Compartments. Click the compartment link for your Object Storage resource. From the Compartment Details page, copy the OCID under Compartment Information.
  7. Select Make this connection the default connection for the data asset.
  8. Click Test Connection. A notification indicates whether the connection test succeeded or failed. Next, click Add.
You have successfully created a connection for your Oracle Object Storage data asset.
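
Before clicking Test Connection, it can help to check that the OCI Region field holds a region identifier (such as us-phoenix-1) rather than a region display name, and that the Compartment OCID field holds a compartment OCID. The following sketch is a loose shape check only. The function name and the regular expression are assumptions made for illustration, the OCID shown is a placeholder, and passing these checks does not mean the connection test will succeed.

```python
import re

# Loose shape of a region identifier such as us-phoenix-1:
# lowercase words joined by hyphens, ending in a number. This is an assumption.
REGION_PATTERN = re.compile(r"^[a-z]+(-[a-z]+)+-\d+$")


def check_connection_fields(region, compartment_ocid):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not REGION_PATTERN.match(region):
        problems.append(
            f"region {region!r} does not look like an identifier such as us-phoenix-1"
        )
    # Only the prefix is checked; the rest of the OCID is not validated.
    if not compartment_ocid.startswith("ocid1.compartment."):
        problems.append("compartment OCID should start with 'ocid1.compartment.'")
    return problems


# Placeholder values, not real identifiers.
print(check_connection_fields("us-phoenix-1", "ocid1.compartment.oc1..exampleuniqueid"))
print(check_connection_fields("US West (Phoenix)", "not-an-ocid"))
```

Returning a list of problems instead of raising on the first one lets both fields be corrected in a single pass.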

4. Harvesting the Data Asset

You are now ready to harvest your Oracle Object Storage data asset.

To harvest your Oracle Object Storage data asset, perform the following steps:

  1. On the data asset details page, click Harvest.
  2. The Select Connection page displays and the default connection is selected. Click Next.
  3. The Select Data Entities page displays. In the Available Folders / Data Entities section, view and add the data entities you want to harvest. You can do any of the following:
    1. Click the add icon for each data entity you want to include in the harvest job.
    2. Click Add All to select all the entities for harvesting.
    3. Use the Filter folders / data entities box to find a data entity among the available data entities.
    4. Use the page navigation icons to browse all the data entities.
    5. Click the remove icon for any selected data entity that you want to remove from the harvest job.
    6. Click Remove All to clear the selection and start over.
    After you have reviewed the data entities in the Selected Folders / Data Entities section, click Next.
  4. The Create Job page displays. In the Job Name field, enter a unique name to identify the harvest job. Optionally, enter a Description. Next, select Run job now and then click Create Job.
  5. The job to harvest your Oracle Object Storage data asset is created successfully. Click the job name.
Your data asset is harvested successfully and you can review the harvest job details.
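
The selection actions on the Select Data Entities page (add, Add All, filter, remove, Remove All) can be pictured as operations on two lists, available and selected. The class below is a toy model written for this explanation, not Data Catalog code: the class name, method names, entity names, and the case-insensitive substring filter are all assumptions.

```python
class HarvestSelection:
    """Toy model of the Available and Selected lists on the page."""

    def __init__(self, available):
        self.available = list(available)
        self.selected = []

    def add(self, name):
        # Mirrors clicking the add icon for one data entity.
        if name not in self.available:
            raise KeyError(name)
        if name not in self.selected:
            self.selected.append(name)

    def add_all(self):
        # Mirrors Add All.
        self.selected = list(self.available)

    def remove(self, name):
        # Mirrors clicking the remove icon for one selected data entity.
        if name in self.selected:
            self.selected.remove(name)

    def remove_all(self):
        # Mirrors Remove All.
        self.selected = []

    def filter(self, text):
        # Assumed behavior: case-insensitive substring match on the name.
        return [n for n in self.available if text.lower() in n.lower()]


# Entity names are invented for the example.
selection = HarvestSelection(["sales/orders.csv", "sales/customers.json", "logs/app.log"])
for name in selection.filter("sales"):
    selection.add(name)
selection.remove("sales/customers.json")
print(selection.selected)
```

Only the entities left in the selected list when you click Next are included in the harvest job.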