Name the new compartment data-science-work, and enter a
description.
Confirm that the compartment appears in the compartments list.
3. (Optional) Creating a VCN and Subnet
This step is optional. When you create a notebook session in Step 6,
Creating a Notebook Session, you can choose to create a default
network with the proper setup for notebook sessions.
Important
You can skip creating a network and setting up subnets and gateways if you
select default networking when creating a notebook. If the default networking is
configured in a notebook, you can't change it when reactivating the
notebook.
This section shows users who require access to their VCNs how to create a VCN and,
later, how to select the recommended subnet for notebook sessions. For example, if
you're performing the Scheduling Data Science Job Runs tutorial, you create this
network and use it both for the notebook session in Data Science and for the workspace
in the Data Integration service.
Select the data-science-work compartment. This compartment hosts the VCN
that you create in this section. It takes time for this new compartment to
appear in the compartment list, so refresh the page until it appears.
For Configure VCN and Subnets, keep the defaults:
VCN CIDR Block: 10.0.0.0/16
Public Subnet CIDR Block: 10.0.0.0/24
Private Subnet CIDR Block: 10.0.1.0/24
Use DNS hostnames in this VCN: selected
You use this VCN and its private subnet, Private
Subnet-datascience-vcn, when you create a notebook
session.
Select View Virtual Cloud Network to review the VCN and subnets.
Note
For egress access to the public internet, we recommend that you use a private
subnet with a route to a NAT gateway. A NAT gateway gives instances in a private
subnet access to the internet. The VCN that you create in this step includes a
private subnet with egress access to the internet through the VCN's NAT
gateway.
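The default CIDR layout listed in this step can be sanity-checked with a few lines of Python. This is a minimal sketch that uses only the standard library ipaddress module. The helper name check_layout is hypothetical and is not part of any OCI tooling. It only confirms that each subnet sits inside the VCN block and that the subnets don't overlap.

```python
import ipaddress

# Default CIDR blocks listed in this step.
vcn = ipaddress.ip_network("10.0.0.0/16")
public_subnet = ipaddress.ip_network("10.0.0.0/24")
private_subnet = ipaddress.ip_network("10.0.1.0/24")


def check_layout(vcn_block, subnets):
    """Return True if every subnet is inside the VCN block and no two subnets overlap.

    Hypothetical helper for illustration only.
    """
    inside = all(s.subnet_of(vcn_block) for s in subnets)
    disjoint = all(
        not a.overlaps(b)
        for i, a in enumerate(subnets)
        for b in subnets[i + 1:]
    )
    return inside and disjoint


layout_ok = check_layout(vcn, [public_subnet, private_subnet])
print(layout_ok)
```

If you change the defaults, running the same check on your own blocks before creating the VCN can catch an overlapping or out-of-range subnet early.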
4. Creating Policies
Before users start their notebook sessions, you must configure the Data Science policies.
Open the navigation menu and click Identity &
Security. Under Identity, click
Policies.
Click Create Policy.
Enter data-science-policy for the Name.
Enter Policy for data science users and service as the
Description.
Select the data-science-work compartment.
Click Show manual editor.
Enter the following five policy statements into the Policy
Builder field:
allow service datascience to use virtual-network-family in compartment data-science-work
allow group data-scientists to manage data-science-family in compartment data-science-work
allow group data-scientists to use virtual-network-family in compartment data-science-work
allow group data-scientists to manage buckets in compartment data-science-work
allow group data-scientists to manage objects in compartment data-science-work
Click Create to create your policy.
Explanation for the policies:
To allow the Data Science service to attach your VCN to your notebook session
and route egress traffic from the notebook environment, add:
allow service datascience to use virtual-network-family in compartment data-science-work
To allow the data-scientists group to perform operations on
all Data Science resources in the data-science-work
compartment (projects, notebook sessions, models, model deployments, work
requests, jobs, and job runs), add:
allow group data-scientists to manage data-science-family in compartment data-science-work
To allow those data scientists to use the VCN that you created and attach it to
their notebook sessions, add:
allow group data-scientists to use virtual-network-family in compartment data-science-work
To allow those data scientists to create and manage buckets and the objects in them,
such as artifacts and conda environments that they add to buckets, add:
allow group data-scientists to manage buckets in compartment data-science-work
allow group data-scientists to manage objects in compartment data-science-work
Tip
To give data scientists administrative rights to their compartment, so that
they can manage all the resources of OCI services in it instead of only
specific resources such as buckets, objects, or the virtual network family,
replace the preceding five policies with the following two policies:
allow group data-scientists to manage all-resources in compartment data-science-work
allow service datascience to use virtual-network-family in compartment data-science-work
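If you reuse this setup with a different group or compartment name, the five statements from this step can be generated from a template. The following is a minimal sketch in plain Python. The function name user_group_policies is hypothetical, and the code only builds text to paste into the manual editor. It doesn't call any OCI API.

```python
def user_group_policies(group, compartment):
    """Build the five policy statements from this step as plain strings.

    Hypothetical helper: 'group' and 'compartment' are the names used in the statements.
    """
    return [
        f"allow service datascience to use virtual-network-family in compartment {compartment}",
        f"allow group {group} to manage data-science-family in compartment {compartment}",
        f"allow group {group} to use virtual-network-family in compartment {compartment}",
        f"allow group {group} to manage buckets in compartment {compartment}",
        f"allow group {group} to manage objects in compartment {compartment}",
    ]


statements = user_group_policies("data-scientists", "data-science-work")
print("\n".join(statements))
```

Keeping the statements in one place like this makes it harder to mistype a compartment name in only one of the five lines.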
5. Creating a Dynamic Group with Policies
Create a dynamic group for Data Science resources and allow this dynamic group to
access other OCI resources, such as Object
Storage and Logging.
To give OCI resources permission to access other OCI resources, first
add the resources to a dynamic group instead of a user group. Then write
policies that allow the dynamic group to access the specified resources. Here, your
dynamic group has three types of Data Science resources: notebook sessions, model
deployments, and job runs.
Open the navigation menu and click Identity &
Security. Under Identity, click
Compartments.
Click the data-science-work compartment.
For the OCID attribute, click
Copy to save the entire OCID to your notepad.
In the trail that displays the current page, click
Compartments to return to the list of
compartments.
In the Matching Rules section, click Match
any rules defined below.
Enter the following three matching rules. Replace
<compartment-ocid> with the compartment OCID
that you copied.
Rule 1:
ALL {resource.type='datasciencenotebooksession', resource.compartment.id='<compartment-ocid>'}
The preceding matching rule means that all notebook sessions created in
your compartment are members of the data-science-dynamic-group.
Click Additional Rule and add the following rule:
Rule 2:
ALL {resource.type='datasciencemodeldeployment', resource.compartment.id='<compartment-ocid>'}
The preceding matching rule means that all model deployments created in
your compartment are members of the data-science-dynamic-group.
Click Additional Rule and add the following rule:
Rule 3:
ALL {resource.type='datasciencejobrun', resource.compartment.id='<compartment-ocid>'}
The preceding matching rule means that all job runs created in your
compartment are members of the data-science-dynamic-group.
Select Create.
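The three matching rules differ only in the resource type, so they can be produced from a single template once you have the compartment OCID. The following is a minimal sketch in plain Python. The function name matching_rules is hypothetical, and the OCID shown is a made-up placeholder that you replace with the value you copied.

```python
# Resource types used in the three matching rules of this step.
RESOURCE_TYPES = [
    "datasciencenotebooksession",
    "datasciencemodeldeployment",
    "datasciencejobrun",
]


def matching_rules(compartment_ocid):
    """Return one matching rule string per Data Science resource type.

    Hypothetical helper: it only formats text for the Matching Rules fields.
    """
    return [
        f"ALL {{resource.type='{rt}', resource.compartment.id='{compartment_ocid}'}}"
        for rt in RESOURCE_TYPES
    ]


# Placeholder OCID for illustration only.
rules = matching_rules("ocid1.compartment.oc1..exampleuniqueID")
for rule in rules:
    print(rule)
```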
Next, write policies to allow resources of this dynamic group to access other
OCI services.
In the trail that displays the current page, click
Identity.
Click Policies.
Click Create Policy.
Enter the following:
Name: data-science-dynamic-group-policy
Description: Policy for the Data Science dynamic group
Instead of the data-science-work compartment, select the
top-most compartment, which is your tenancy.
Important
The policy can't be created unless you select the
tenancy as its compartment.
Click Show manual editor.
Enter the following policy statements into the Policy
Builder field:
allow dynamic-group data-science-dynamic-group to manage data-science-family in compartment data-science-work
allow dynamic-group data-science-dynamic-group to manage dataflow-family in compartment data-science-work
allow dynamic-group data-science-dynamic-group to read compartments in tenancy
allow dynamic-group data-science-dynamic-group to read users in tenancy
allow dynamic-group data-science-dynamic-group to use log-content in compartment data-science-work
allow dynamic-group data-science-dynamic-group to use log-groups in compartment data-science-work
allow dynamic-group data-science-dynamic-group to manage object-family in compartment data-science-work
Click Create to create the policy.
You can use this dynamic group to give notebook sessions, model deployments, and job
runs that are in the data-science-work compartment access to other OCI
resources in the tenancy.
Explanation for the policies:
To allow notebook sessions to perform CRUD operations on entries in the model
catalog, projects, and notebook session resources, add:
allow dynamic-group data-science-dynamic-group to manage data-science-family in compartment data-science-work
To allow notebook sessions to perform CRUD operations on Data Flow
applications and runs, add:
allow dynamic-group data-science-dynamic-group to manage dataflow-family in compartment data-science-work
To allow notebook sessions to list and read compartments and user names that
are in the tenancy, add:
allow dynamic-group data-science-dynamic-group to read compartments in tenancy
allow dynamic-group data-science-dynamic-group to read users in tenancy
To allow model deployments to emit logs to the Logging service, add:
allow dynamic-group data-science-dynamic-group to use log-content in compartment data-science-work
To allow job runs to create logs and record job run details in the Logging
service, add:
allow dynamic-group data-science-dynamic-group to use log-groups in compartment data-science-work
To allow notebook sessions and model deployments to read and write files in
Object Storage buckets in the data-science-work
compartment, add:
allow dynamic-group data-science-dynamic-group to manage object-family in compartment data-science-work
Tip
The preceding policy allows model deployments to access any bucket in
the data-science-work compartment.
To give model deployments read access to specific buckets
outside the data-science-work compartment, specify the bucket
names and their compartments in your policy.
Example: To allow model deployments to access published conda environments
from the published-conda-envs bucket and model artifacts
from the model-artifacts bucket,
add:
allow dynamic-group data-science-dynamic-group to read objects in compartment <another-compartment> where ANY {target.bucket.name='published-conda-envs', target.bucket.name='model-artifacts'}
If your policy statements mention tenancy or include compartments
outside the data-science-work compartment, then in the
Create Policy dialog, for the
Compartment option, select
<your-tenancy> (root). This way, in
addition to your compartment, the policy can include rules for other
compartments in the tenancy.
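The bucket-scoped statement in this tip follows a regular pattern: one target.bucket.name condition per bucket, joined inside where ANY {...}. The following is a minimal sketch in plain Python that assembles such a statement. The function name bucket_read_policy is hypothetical, and the code only formats text for the manual editor.

```python
def bucket_read_policy(dynamic_group, compartment, buckets):
    """Build a read-objects statement limited to the named buckets.

    Hypothetical helper: 'buckets' is a list of bucket names in 'compartment'.
    """
    conditions = ", ".join(f"target.bucket.name='{name}'" for name in buckets)
    return (
        f"allow dynamic-group {dynamic_group} to read objects "
        f"in compartment {compartment} where ANY {{{conditions}}}"
    )


statement = bucket_read_policy(
    "data-science-dynamic-group",
    "<another-compartment>",  # placeholder, as in the example in this tip
    ["published-conda-envs", "model-artifacts"],
)
print(statement)
```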
6. Creating a Notebook Session
Lastly, create a notebook session and test its access to the public internet.
Open the navigation menu and click Analytics & AI.
Under Machine Learning, click Data
Science.
Click Create Project.
Select the data-science-work compartment.
(Optional)
Enter Initial Project for the Name.
(Optional)
Enter my first project for the Description.
Click Create.
Click Create notebook session.
For Compartment, select data-science-work.
(Optional)
Enter my-first-notebook-session for the
Name.
For Compute shape, click
Select.
Choose the following options:
Instance Type: Virtual machine
Shape Series: Intel
Shape Name: VM.Standard3.Flex
For VM.Standard3.Flex, keep the default
allocations:
Number of OCPUs: 1
Amount of memory (GB): 16
Click Select shape.
For Block storage size, enter 100 GB to attach
to your virtual machine.
Click Custom networking, and select the
datascience-vcn VCN and Private Subnet-datascience-vcn subnet
to route egress traffic from your notebook session.
Instead of Custom networking, you can choose the
Default networking option, which creates the
networking for you. With Default networking, you can skip
Step 3, (Optional) Creating a VCN and Subnet, of this
tutorial. This tutorial shows custom networking so that users with custom
settings can see the steps.
Click View detail page on clicking create.
Click Create to create your first notebook session.
Creating the notebook session takes a few minutes. When the notebook session
status turns to Active, you can open the notebook
session.
Click Open.
Enter your Oracle Cloud Infrastructure credentials to access
the JupyterLab UI.
If you don't have a tab called Launcher, click File, and
then New Launcher.
In the Launcher, under Other, click the
Terminal icon to start a new terminal session.
To perform a simple test, check that you can access the public internet from
your notebook session by running this command:
(base) bash-4.2$ wget --spider https://www.oracle.com
Spider mode enabled. Check if remote file exists.
--<date>-- https://www.oracle.com/
Resolving www.oracle.com (www.oracle.com)...
Connecting to www.oracle.com (www.oracle.com)... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.
The line HTTP request sent, awaiting response... 200 OK indicates
that the test succeeded and that your notebook session has public internet
access.
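A similar check can be run from a notebook cell instead of the terminal. The following is a rough Python counterpart to the wget test, using only the standard library. The function name has_internet_access and its opener parameter are hypothetical conveniences added here for testing. A site might respond to a Python client differently than to wget, so treat a False result as a prompt to rerun the wget command rather than as proof of a networking problem.

```python
import urllib.error
import urllib.request


def has_internet_access(url="https://www.oracle.com", timeout=10,
                        opener=urllib.request.urlopen):
    """Return True if a HEAD request to 'url' gets an HTTP 200 response.

    Hypothetical helper: 'opener' defaults to urllib.request.urlopen and
    can be swapped for a stub when testing without network access.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # DNS failures, refused connections, timeouts, and HTTP errors land here.
        return False
```

Calling has_internet_access() with no arguments targets the same URL as the wget command in this step.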
What's Next
You have successfully set up a Data Science tenancy and created a Data Science project that
includes a notebook session. You can now proceed to the following tasks: