Harvesting from Autonomous Databases with Private Access

Harvesting is a process that extracts technical metadata from your data source into your data catalog. This tutorial provides the steps to harvest from a data source that is only accessible privately.

In this tutorial, you:

  1. Create the policies needed to harvest an autonomous database with a private URL.
  2. Obtain the autonomous database access details.
  3. Create a private endpoint in Data Catalog.
  4. Attach the private endpoint to your data catalog.
  5. Create a data asset.
  6. Add a connection for the data asset.
  7. Harvest the data asset.

For additional information, see Configuring a Private Network.

Before You Begin

To successfully perform this tutorial, you must have the following:

If you already have the autonomous database you want to harvest from, you can use the details for that database to complete this tutorial. If you don't have an existing autonomous database with private access and want to try this tutorial, you can follow the instructions below to set up the resources needed to perform this tutorial.

Setting up the Resources Needed for this Tutorial

1. Creating Access Policies for Network Resources
2. Creating a Virtual Cloud Network
3. Creating a Private Subnet
4. Creating a Network Security Group
5. Creating an Autonomous Database with Private Access
6. Creating Security Rules

1. Creating Access Policies

To configure Data Catalog to access the private network of a data source, you need access to Networking and Data Catalog resources.

If you already have access to perform all Data Catalog and Networking operations in your required compartments, you may skip this step.

To create the policy needed to configure a private network in Data Catalog, perform the following steps:

  1. Open the Console navigation menu and then select Policies under Identity.
  2. Click Create Policy.
  3. In the Create Policy panel, enter a unique name for the policy. The name must be unique across all policies in your tenancy. You cannot change this later. For example, data-catalog-private-endpoint-policy.
  4. Next, enter a description such as Grant permissions to create private endpoint and then select Keep Policy Current.
  5. In the Policy Statements field, enter the following policy rule.
    allow group data-catalog-users to manage data-catalog-private-endpoints in tenancy
    Note

    This policy allows users in the data-catalog-users group to perform all data catalog private endpoint operations in any compartment in the tenancy.
  6. Click + Another Statement.
  7. Enter the following policy rule.
    allow group data-catalog-users to manage virtual-network-family in tenancy
    Note

    This policy allows users in the data-catalog-users group to perform all network-related operations in any compartment in the tenancy.
  8. Click Create.
You have successfully created the policies to access the required resources for configuring a private network in Data Catalog.
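
If you maintain these policy statements for more than one group, it can help to generate them rather than retype them. The following is a minimal Python sketch, not part of the Console workflow. The function name policy_statements and the optional compartment parameter are assumptions made for illustration; the tutorial itself grants access at the tenancy level.

```python
from typing import List, Optional


def policy_statements(group: str, compartment: Optional[str] = None) -> List[str]:
    """Build the two policy statements used in this step for one group.

    With compartment=None the statements apply to the whole tenancy, as in
    the tutorial. Passing a compartment name narrows the scope; that variant
    is an assumption and is not covered by the steps above.
    """
    scope = "tenancy" if compartment is None else f"compartment {compartment}"
    resources = ["data-catalog-private-endpoints", "virtual-network-family"]
    return [f"allow group {group} to manage {resource} in {scope}" for resource in resources]


# Print the statements exactly as they are entered in the Policy Statements field.
for statement in policy_statements("data-catalog-users"):
    print(statement)
```

You can paste the printed statements into the Create Policy panel one at a time.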

2. Obtaining Data Source Details

You need the private network and database connection information for the autonomous database you want to harvest.

Obtain the following details for the autonomous database:

For configuring the private network, you need the VCN and subnet names and the private URL for the database:

  1. From the navigation menu in the Console, click Autonomous Data Warehouse.
  2. View the details for the database you want to harvest.
  3. From the Network section, note the VCN, Subnet, and Private Endpoint URL for the database.

Tip

If you have more databases in this network (same VCN and subnet) that you want to harvest, note the Private Endpoint URL for those databases too.

For creating the data asset, you need the database name:

  From the autonomous database details page, note the database name in the General Information section.

For adding a connection, you need the database wallet and login credentials:

  1. From the autonomous database details page, click DB Connection.
  2. Click Download Wallet.
  3. Enter a password for this wallet. You will not use this password in this tutorial.
  4. Click Download.
  5. Save the wallet file on your local machine.

You also need the credentials (username and password) for the database. You specified these when you created the autonomous database. If you did not create the autonomous database, obtain the credentials from your administrator. While harvesting, you can view only the database entities you have access to.

3. Creating a Private Endpoint

You create a Data Catalog private endpoint to configure the network access details for the autonomous database data sources you want to harvest.

To create a private endpoint in Data Catalog, perform the following steps:

  1. From the navigation menu in the Console, under Data and AI, click Data Catalog.
  2. Click Private Endpoints. The Private Endpoints page displays.
  3. Click Create Private Endpoint. A Create Private Endpoint panel displays.
  4. Ensure you have permission to work in the selected compartment, and enter a name for the private endpoint. For example, XYZ Private Endpoint.
  5. Select the VCN and subnet where the autonomous database is hosted.
  6. Enter the DNS zone (Private Endpoint URL) for the autonomous database. To enter more than one data source private URL, separate the URLs with commas.
    Note

    To view the autonomous database Private Endpoint URL in the Console, from the navigation menu click Autonomous Database and view the details for your autonomous database. The Private Endpoint URL is displayed under Network.
  7. Click Create.
    Image shows the Create Private Endpoint panel in the Console
Creating the private endpoint can take a couple of minutes. When the private endpoint is created successfully, its status is ACTIVE.

If the private endpoint status changes to FAILED, ensure you have created the access policies and set up your private network correctly.
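
When you harvest several databases in the same VCN and subnet, the DNS zone field takes their private URLs as one comma-separated value. The following is a minimal Python sketch for preparing that value from the URLs you noted earlier. The function name dns_zones_field and the host names are hypothetical placeholders, and lowercasing relies only on DNS names being case-insensitive.

```python
from typing import Iterable, List


def dns_zones_field(private_urls: Iterable[str]) -> str:
    """Join private endpoint URLs into one comma-separated value.

    Surrounding whitespace is trimmed, blank entries are skipped, and
    duplicates are dropped while the original order is kept.
    """
    cleaned: List[str] = []
    for url in private_urls:
        candidate = url.strip().lower()
        if candidate and candidate not in cleaned:
            cleaned.append(candidate)
    return ",".join(cleaned)


# Hypothetical private endpoint URLs copied from two database details pages.
noted_urls = [
    " sales.adb.example.com",
    "finance.adb.example.com ",
    "SALES.adb.example.com",
]
print(dns_zones_field(noted_urls))
```

The printed value is what you would paste into the DNS zone field in step 6.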

4. Attaching a Private Endpoint

You attach a private endpoint to a data catalog to allow data assets to be created for data sources available in the private network.

To attach a private endpoint to a data catalog, perform the following steps:

  1. Click Data Catalogs.
  2. Click the Actions icon (three dots) for the data catalog where you want to attach the private endpoint and select Attach Private Endpoint.
  3. Select the private endpoint you created in the previous step and click Attach.
    Image shows the Attach Private Endpoint dialog in the Console
The data catalog status changes to Updating while the private endpoint is being attached. After the private endpoint is attached successfully, the status of the data catalog changes to Active.
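
If you script this workflow instead of watching the Console, the status changes described above (Updating to Active for the data catalog, or ACTIVE versus FAILED for the private endpoint) map to a simple polling loop. The following is a minimal Python sketch under stated assumptions: get_state is a hypothetical callable that you supply to return the current status string, and the timeout and interval defaults are illustrative, not documented values.

```python
import time
from typing import Callable, Iterable


def wait_for_state(
    get_state: Callable[[], str],
    target: str,
    failed: Iterable[str] = ("FAILED",),
    timeout_s: float = 600,
    interval_s: float = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until it returns target (case-insensitive).

    Raises RuntimeError if a failed state is reported and TimeoutError if
    the target is not reached within timeout_s seconds of waiting.
    """
    failed_states = {state.upper() for state in failed}
    waited = 0.0
    while True:
        state = get_state().upper()
        if state == target.upper():
            return state
        if state in failed_states:
            raise RuntimeError(f"resource entered state {state}")
        if waited >= timeout_s:
            raise TimeoutError(f"still {state} after {waited:.0f} seconds")
        sleep(interval_s)
        waited += interval_s


# Simulated status sequence standing in for real status lookups.
simulated = iter(["Updating", "Updating", "Active"])
print(wait_for_state(lambda: next(simulated), "ACTIVE", sleep=lambda _: None))
```

Passing sleep as a parameter keeps the loop easy to exercise without real delays; in real use you would leave the default and replace the simulated sequence with your own status lookup.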

5. Creating a Data Asset

You are now ready to register your private IP autonomous database with Data Catalog as a data asset. In this tutorial, you create an Autonomous Data Warehouse data asset.

To create an autonomous database data asset, perform the following steps:

  1. Click the data catalog instance where you attached the private endpoint in the previous step.
  2. From your data catalog Home tab, click Create Data Asset from the Quick Actions tile.
  3. In the Create Data Asset panel, enter a Name to uniquely identify your data asset. Optionally, enter a description.
  4. From the Type drop-down list, select Autonomous Data Warehouse.
  5. In the Database Name field, enter the database name you specified when you created the autonomous database.
    Note

    To view the autonomous database name in the Console, from the navigation menu click Autonomous Database and view the details for your autonomous database. The database name is displayed under General Information.
  6. Select the Use private endpoint checkbox.
  7. Click Create.
    Image shows the Create Data Asset panel in the data catalog
You have successfully created your autonomous database data asset.

6. Adding a Connection

After creating the autonomous database data asset, you add a connection for the data asset.

To add a connection for your autonomous database data asset, perform the following steps:

  1. From the data catalog Home tab, click Data Assets to access the Data Assets page.
  2. In the Data Assets list, select the autonomous database data asset you created previously.
  3. In the Summary tab on the data asset details page, under Connection Information, click Add Connection.
  4. In the Add Connection panel, enter a unique name for your connection. Optionally, enter a short description.
  5. Select Generic for Type.
  6. In the Wallet field, upload the wallet credentials file for your autonomous database.
    Note

    To download the wallet for your autonomous database, from the navigation menu in the Console, click Autonomous Database. From the details page for your autonomous database, click DB Connection. Retain the Instance Wallet selection for Wallet Type and click Download Wallet. Enter and confirm the password and save the wallet file to your local machine.
  7. Select a TNS Alias.
  8. Enter the Username and Password to access the autonomous database.
  9. Click Test Connection. A notification displays indicating if the test connection was successful or failed.
  10. Click Add.
    Image shows the Add Connection panel in the data catalog
You have successfully created a connection for your autonomous database data asset.
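
The TNS aliases offered in step 7 come from the tnsnames.ora file inside the wallet you upload. If you want to check which aliases a wallet defines before uploading it, the following minimal Python sketch lists them using only the standard library. The function name tns_aliases, the alias names, and the host in the demo are hypothetical, and the pattern assumes each alias starts its own line, as in the wallets downloaded for autonomous databases.

```python
import io
import re
import zipfile
from typing import List

# An alias definition starts a line and is followed by "= (", for example:
#   mydb_high = (description= ...)
ALIAS_PATTERN = re.compile(r"^[ \t]*([\w.-]+)[ \t]*=[ \t]*\(", re.MULTILINE)


def tns_aliases(wallet) -> List[str]:
    """Return the alias names defined in tnsnames.ora inside a wallet zip.

    wallet may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(wallet) as archive:
        text = archive.read("tnsnames.ora").decode("utf-8")
    return ALIAS_PATTERN.findall(text)


# Demo with an in-memory stand-in for a downloaded wallet (hypothetical content).
demo_wallet = io.BytesIO()
with zipfile.ZipFile(demo_wallet, "w") as archive:
    archive.writestr(
        "tnsnames.ora",
        "mydb_high = (description=(address=(protocol=tcps)(port=1522)(host=db.example.com)))\n"
        "mydb_low = (description=(address=(protocol=tcps)(port=1522)(host=db.example.com)))\n",
    )
demo_wallet.seek(0)
print(tns_aliases(demo_wallet))
```

For a real wallet, pass the path of the downloaded zip file instead of the in-memory demo object.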

7. Harvesting the Data Asset

You are now ready to harvest your autonomous database data asset. Your autonomous database must contain the data whose technical metadata you want to harvest. If you used the setup instructions in this tutorial, you can harvest metadata from the default data that is available in your autonomous database.

To harvest your autonomous database data asset, perform the following steps:

  1. On the data asset details page, click Harvest.
  2. The Select Connection page displays and the default connection is selected. Click Next.
  3. The Select Data Entities page displays. View and add all the data entities you want to harvest from the Available ADW Schema section.
    1. Click the add icon for each data entity you want to include in the harvest job.
    2. Click Add All to select all the entities for harvesting.
    3. Use the Filter ADW Schema box to find a data entity from the available data entities.
    4. Use the page navigation icons to browse all the data entities.
    5. Click the remove icon for any selected data entity that you want to remove from the harvest job.
    6. If you need to start over, click Remove All and then start over.
    After you have reviewed the data entities you want to harvest from the Selected ADW Schema / Data Entities section, click Next.
    Image shows the Select Data Entities page for the harvest wizard in data catalog
  4. The Create Job page displays. In the Job Name field, enter a unique name to identify the harvest job.
  5. Optionally, enter a Description.
  6. Select Run job now and then click Create Job.
    Image shows the Create Job page for the harvest wizard in data catalog
  7. The job to harvest your autonomous database data asset is created successfully and the Jobs tab displays. Click the job name to view job details.
Your data asset is harvested successfully and you can review the harvest job details.