Data Science Resource Principals and other Improvements to the Notebook Session Environment are now available

Support for Resource Principals in Notebook Sessions

Oracle Cloud Infrastructure Data Science now lets you authenticate with your notebook session's resource principal to access other Oracle Cloud Infrastructure resources. Resource principals are a more secure and easier-to-use way to authenticate to those resources. See Authenticating to the Oracle Cloud Infrastructure APIs from a Notebook Session.
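As a minimal sketch of what this looks like from inside a notebook session, the following uses the resource principals signer from the `oci` Python SDK to call Object Storage. The helper name `object_storage_namespace` is illustrative, and the example assumes the session's resource principal has been granted the necessary policies.

```python
def object_storage_namespace():
    """Return the Object Storage namespace, authenticating as the resource principal."""
    # Imported inside the function so this module still loads
    # on machines where the OCI SDK is not installed.
    import oci

    # Build a signer from the resource principal of the environment
    # this code runs in (here, the notebook session).
    signer = oci.auth.signers.get_resource_principals_signer()

    # No config file or API key is needed: the signer carries the identity.
    client = oci.object_storage.ObjectStorageClient(config={}, signer=signer)
    return client.get_namespace().data
```

Calling `object_storage_namespace()` only succeeds inside an environment that provides a resource principal, such as a notebook session; elsewhere the signer cannot be created.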

Library Upgrades

  • `oci` version 2.18.1
  • `git` version 2.27.0

ADS Updates

New model explanation diagnostic is available in MLX: accumulated local effects (ALEs)

ALEs evaluate the relationship between feature values and target variables. Partial dependence plots (PDPs) do the same, but when features are highly correlated, PDP may include unlikely combinations of feature values in the average prediction calculation, because it manipulates feature values independently across the marginal distribution. This lowers trust in the PDP explanation when features are highly correlated. Unlike PDP, ALE handles feature correlations by averaging and accumulating the difference in predictions across the conditional distribution, which isolates the effect of the specific feature.
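To make the "averaging and accumulating" step concrete, here is a small first-order ALE computation in plain Python. It is an illustrative sketch, not the MLX implementation: the function `ale_1d`, the quantile binning, the centering rule, and the toy model are all assumptions made for this example.

```python
import random
from bisect import bisect_left


def ale_1d(predict, rows, feature, n_bins=5):
    """Return (edges, ale): centered first-order ALE values at each bin edge."""
    values = sorted(row[feature] for row in rows)
    n = len(values)
    # Quantile-based edges, so every interval contains observed data.
    edges = sorted({values[round(i * (n - 1) / n_bins)] for i in range(n_bins + 1)})
    n_intervals = len(edges) - 1
    if n_intervals < 1:
        raise ValueError("feature must take at least two distinct values")

    sums = [0.0] * n_intervals
    counts = [0] * n_intervals
    for row in rows:
        # Interval k covers (edges[k], edges[k + 1]]; the minimum falls in interval 0.
        k = max(bisect_left(edges, row[feature]) - 1, 0)
        low, high = list(row), list(row)
        low[feature], high[feature] = edges[k], edges[k + 1]
        # Local effect: move only this feature across its own interval,
        # keeping the other (possibly correlated) values as observed.
        sums[k] += predict(high) - predict(low)
        counts[k] += 1

    # Accumulate the average local effects from left to right.
    ale = [0.0]
    for total, count in zip(sums, counts):
        ale.append(ale[-1] + total / count)

    # Center so that the count-weighted average effect is zero.
    mean = sum(c * (ale[k] + ale[k + 1]) / 2 for k, c in enumerate(counts)) / n
    return edges, [a - mean for a in ale]


# Toy data: the second feature is strongly correlated with the first.
random.seed(0)
rows = []
for _ in range(200):
    x1 = random.random()
    rows.append([x1, x1 + random.gauss(0, 0.05)])


def toy_model(row):
    # Stand-in for a trained model's prediction function.
    return 2 * row[0] + 3 * row[1]


edges, ale = ale_1d(toy_model, rows, feature=0, n_bins=4)
```

Because each observation is only moved across the interval it already sits in, the second feature keeps its observed value, so the step in `ale` between adjacent edges reflects the first feature's own coefficient (2) times the edge spacing rather than the combined effect of both correlated features.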

Two new notebook examples showcase this model interpretation diagnostic: mlx_ale.ipynb and mlx_ale_vs_pdp.ipynb.

Support for "What-if" scenarios in MLX

The WhatIf Explainer provides a user interface that allows the data scientist to manipulate one of many values in a single observation and measure the impact on the model predictions. The new mlx_whatif.ipynb notebook example showcases this new feature.
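The idea behind a what-if analysis can be sketched in a few lines of Python. This is not the WhatIf Explainer API; the helper `what_if` and the toy model below are hypothetical and only illustrate changing values in one observation and measuring the change in the prediction.

```python
def what_if(predict, observation, changes):
    """Return (baseline, modified, delta) predictions for a single observation."""
    unknown = set(changes) - set(observation)
    if unknown:
        raise KeyError(f"unknown features: {sorted(unknown)}")
    # Copy the observation and overwrite only the requested values.
    modified_observation = {**observation, **changes}
    baseline = predict(observation)
    modified = predict(modified_observation)
    return baseline, modified, modified - baseline


def toy_model(row):
    # Stand-in for a trained model's prediction function.
    return 0.5 * row["age"] + 2.0 * row["rooms"]


observation = {"age": 40, "rooms": 3}
baseline, modified, delta = what_if(toy_model, observation, {"rooms": 4})
print(baseline, modified, delta)  # 26.0 28.0 2.0
```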

Improvements made to the correlation map calculation in show_in_notebook()

We have improved the feature correlation calculation, and the correlation map now shows the full correlation scale, [-1, 1]. You can display the correlation between different features by calling `ds.show_corr()`.
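For reference, the [-1, 1] scale is the range of a correlation coefficient such as Pearson's. The sketch below computes a Pearson correlation matrix in plain Python; it is an illustration of the scale only, not the calculation ADS performs, and the helper names and sample columns are made up for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Map each pair of column names to its Pearson correlation."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


columns = {
    "x": [1, 2, 3, 4],
    "doubled": [2, 4, 6, 8],   # moves with x: correlation near +1
    "reversed": [4, 3, 2, 1],  # moves against x: correlation near -1
}
matrix = correlation_matrix(columns)
```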

Improvements made to the model artifact

The files necessary to deploy the model to Oracle Functions are now generated by default. There is no longer a fn-model/ directory; all files are in the top-level directory of your model artifact. The new artifact format has the following files:

  • func.py
  • func.yaml
  • requirements.txt
  • runtime.yaml
  • score.py

(Plus any additional files you have.)

Saving a model with either the generic approach or the ADSModel.from_estimator() approach yields the exact same set of files. The runtime.yaml file now captures a comprehensive list of attributes of the notebook session in which the model was trained.
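One way to confirm that an artifact directory follows the new flat layout is a quick check like the one below. The helper `missing_artifact_files` is a hypothetical convenience written for this note and is not part of ADS; the file names are the ones listed above.

```python
from pathlib import Path

# Files listed above for the new artifact format.
EXPECTED_FILES = {"func.py", "func.yaml", "requirements.txt", "runtime.yaml", "score.py"}


def missing_artifact_files(artifact_dir):
    """Return the expected files that are absent from the top level of artifact_dir."""
    present = {path.name for path in Path(artifact_dir).iterdir() if path.is_file()}
    # Additional files are allowed, so only report what is missing.
    return sorted(EXPECTED_FILES - present)
```

An empty list means all five files are present in the top-level directory.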

Bug Fixes

Multiple bugs were fixed in the dataflow module:

  • Applications inherit the compartment assignment of the client. Runs inherit from applications by default. Compartment OCIDs can also be specified independently at the client, application, and run levels.
  • The log link for logs pulled from an application loaded into the notebook session is fixed.
  • The run directory of a loaded application is now found under the application's directory.
  • An index out-of-range error for loaded applications is fixed.
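The compartment inheritance described in the first item can be pictured with the following sketch. The classes are hypothetical stand-ins, not the ADS dataflow classes, and the OCID strings are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Client:
    compartment_id: str


@dataclass
class Application:
    client: Client
    compartment_id: Optional[str] = None  # None means "inherit from the client"

    def resolved_compartment(self) -> str:
        return self.compartment_id or self.client.compartment_id


@dataclass
class Run:
    application: Application
    compartment_id: Optional[str] = None  # None means "inherit from the application"

    def resolved_compartment(self) -> str:
        return self.compartment_id or self.application.resolved_compartment()


client = Client(compartment_id="ocid1.compartment.oc1..client-placeholder")
app = Application(client=client)
inherited_run = Run(application=app)
explicit_run = Run(application=app, compartment_id="ocid1.compartment.oc1..run-placeholder")
```

Here `inherited_run` resolves to the client's compartment through the application, while `explicit_run` uses the compartment set at the run level.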

Some progress bars that previously did not complete now complete fully. This was encountered in ADSModel.prepare() and prepare_generic_model().

 

See the Oracle Accelerated Data Science Library release notes for more information.