Oracle AutoML automates the machine learning experience. It replaces the laborious and time-consuming tasks of the data scientist, whose workflow is as follows:
Select a model from a large number of viable candidate models.
For each model, tune the hyperparameters.
Select only predictive features to speed up the pipeline and reduce over-fitting.
Ensure the model performs well on unseen data (also called generalization).
Oracle AutoML automates this workflow and provides you with an optimal model given a time budget. In addition to incorporating these typical machine learning workflow steps, Oracle AutoML is also optimized to produce a high quality model very efficiently. This is achieved by the following:
Scalable design: All stages in the Oracle AutoML Pipeline exploit both inter-node and intra-node parallelism, improving scalability and reducing runtime.
Intelligent choices reduce trials in each stage: Algorithms and parameters are chosen based on dataset characteristics, which ensures that the selected model is both accurate and found efficiently. This is achieved with the use of meta-learning throughout the pipeline. Meta-learning is used in:
Algorithm selection to select an optimal model class.
Adaptive sampling to identify the optimal set of samples.
Feature selection to determine the ideal feature subset.
Hyperparameter optimization to tune the hyperparameter values.
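As a minimal sketch of driving this pipeline through ADS, assuming the ADS SDK is installed and train is an ADS dataset with its target column already set:

    import logging
    from ads.automl.driver import AutoML
    from ads.automl.provider import OracleAutoMLProvider

    # n_jobs=-1 uses all available cores (intra-node parallelism)
    ml_engine = OracleAutoMLProvider(n_jobs=-1, loglevel=logging.ERROR)
    oracle_automl = AutoML(train, provider=ml_engine)

    # Runs algorithm selection, adaptive sampling, feature selection, and
    # hyperparameter optimization, returning the tuned model along with a
    # baseline model for comparison.
    model, baseline = oracle_automl.train()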
The following topics describe the Oracle AutoML Pipeline and individual stages of the pipeline in more detail.
- The Oracle AutoML Pipeline
- Building a Classifier using OracleAutoMLProvider
Familiarize yourself with Keras by reviewing About Keras.
By default, Keras uses TensorFlow as the backend.
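As a quick, optional check (not part of the example that follows), you can confirm which backend Keras is configured to use:

    import keras

    print(keras.backend.backend())  # prints 'tensorflow' by default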
These examples examine a binary classification problem that predicts churn. This is a common type of problem that can be solved using Keras. You need to load the dataset: pull the data from GitHub, and cache it for faster use after the first time:
    from os import path
    import numpy as np
    import pandas as pd
    import requests
    import logging

    logging.basicConfig(format='%(levelname)s:%(message)s', level=logging.ERROR)

    churn_data_file = '/tmp/churn.csv'

    if not path.exists(churn_data_file):
        # fetch and save some data
        print('fetching data from web...', end=" ")
        r = requests.get('https://github.com/darenr/public_datasets/raw/master/churn_dataset.csv')
        with open(churn_data_file, 'wb') as fd:
            fd.write(r.content)
        print("Done")

    df = pd.read_csv(churn_data_file)
This example also uses scikit-learn to generate metrics. Most of these tasks can be done using ADS; for example, ADS can open datasets and split them into test and training sets. This example demonstrates how to do these tasks with external libraries:
    from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import confusion_matrix, roc_auc_score
    from keras.models import Sequential
    from keras.layers import Dense
The first step is data preparation. From the pandas.DataFrame, you extract the X-values and y-values as numpy arrays. The feature selection is performed manually. This example is designed to show how ADS is used with external libraries, but this whole section could be replaced with ADS AutoML. AutoML does not design network architectures; instead, it creates models using packages such as scikit-learn and XGBoost.
The next step is feature encoding, using LabelEncoder to convert categorical variables into ordinal integers. The encoder assigns codes in sorted order of the values, so 'red', 'green', 'blue' become 2, 1, 0.
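A minimal illustration of this behavior (separate from the example code below):

    from sklearn.preprocessing import LabelEncoder

    le = LabelEncoder()
    print(le.fit_transform(['red', 'green', 'blue']))  # [2 1 0]
    print(le.classes_)  # ['blue' 'green' 'red'], so blue -> 0, green -> 1, red -> 2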
Once the variables have been encoded in a form compatible with Keras, the data is split in an 80/20 ratio. Training is performed on the 80% and testing on the 20% to see how well the model generalizes to unseen examples.
    feature_name = ['CreditScore', 'Geography', 'Gender', 'Age', 'Tenure', 'Balance',
                    'NumOfProducts', 'HasCrCard', 'IsActiveMember', 'EstimatedSalary']
    response_name = ['Exited']
    data = df[[val for sublist in [feature_name, response_name] for val in sublist]].copy()

    # Encode the category columns
    for col in ['Geography', 'Gender']:
        data.loc[:, col] = LabelEncoder().fit_transform(data.loc[:, col])

    # Do an 80/20 split for the training and test data
    train, test = train_test_split(data, test_size=0.2, random_state=42)

    # Scale the features and split the features away from the response
    sc = StandardScaler()  # feature scaling
    X_train = sc.fit_transform(train.drop('Exited', axis=1).to_numpy())
    X_test = sc.transform(test.drop('Exited', axis=1).to_numpy())
    y_train = train.loc[:, 'Exited'].to_numpy()
    y_test = test.loc[:, 'Exited'].to_numpy()
Next, you design and code the neural network architecture. It is a sequential model with an input layer that accepts 10 features. It then has two hidden layers, each with 255 densely connected nodes and the ReLU activation function. The output layer has a single node with a sigmoid activation function because the model performs binary classification. The optimizer is Adam and the loss function is binary cross-entropy. The model is optimized on the accuracy metric. This takes several minutes to run.
    keras_classifier = Sequential()

    # Input layer expecting 10 features, followed by two hidden layers of 255 nodes each
    keras_classifier.add(Dense(units=255, kernel_initializer='uniform',
                               activation='relu', input_dim=10))
    keras_classifier.add(Dense(units=255, kernel_initializer='uniform', activation='relu'))

    # Single sigmoid output node for binary classification
    keras_classifier.add(Dense(units=1, kernel_initializer='uniform', activation='sigmoid'))

    keras_classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    keras_classifier.fit(X_train, y_train, batch_size=10, epochs=25)
To evaluate this model, you could use sklearn or ADS. This example uses sklearn:
    y_pred = keras_classifier.predict(X_test)
    y_pred = (y_pred > 0.5)

    cm = confusion_matrix(y_test, y_pred)
    auc = roc_auc_score(y_test, y_pred)

    print("confusion_matrix:\n", cm)
    print("roc_auc_score", auc)
This example uses the ADS evaluator package:
    from ads.common.model import ADSModel
    from ads.evaluations.evaluator import ADSEvaluator
    from ads.common.data import MLData

    # Wrap the train and test splits as MLData objects; eval_train and
    # eval_test are reused by the XGBoost and model comparison examples below.
    eval_train = MLData.build(X=pd.DataFrame(X_train, columns=feature_name),
                              y=pd.Series(y_train), name='training data')
    eval_test = MLData.build(X=pd.DataFrame(X_test, columns=feature_name),
                             y=pd.Series(y_test), name='test data')

    clf = ADSModel.from_estimator(keras_classifier)
    evaluator = ADSEvaluator(eval_test, models=[clf], training_data=eval_train)
An sklearn pipeline can be used to build a model on the same churn dataset that was used in the Keras section. A pipeline allows the model to contain multiple stages and transformations. In a more sophisticated example, there would be pipeline stages for feature encoding, scaling, and so on; a sketch of such a pipeline follows the code below. In this pipeline example, a LogisticRegression estimator is used:
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline

    pipeline_classifier = Pipeline(steps=[
        ('clf', LogisticRegression())
    ])
    pipeline_classifier.fit(X_train, y_train)
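As an illustrative sketch (not part of the original example), a multi-stage pipeline could fold the scaling and category encoding into the estimator itself, using the train split from earlier (whose category columns are already label encoded; OneHotEncoder also accepts integer categories):

    from sklearn.compose import ColumnTransformer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import OneHotEncoder, StandardScaler

    numeric_cols = ['CreditScore', 'Age', 'Tenure', 'Balance',
                    'NumOfProducts', 'HasCrCard', 'IsActiveMember', 'EstimatedSalary']
    category_cols = ['Geography', 'Gender']

    # Scale the numeric columns and one-hot encode the categorical ones,
    # then feed the result into a LogisticRegression stage.
    preprocess = ColumnTransformer(transformers=[
        ('num', StandardScaler(), numeric_cols),
        ('cat', OneHotEncoder(handle_unknown='ignore'), category_cols),
    ])

    multi_stage_classifier = Pipeline(steps=[
        ('prep', preprocess),
        ('clf', LogisticRegression()),
    ])
    multi_stage_classifier.fit(train.drop('Exited', axis=1), train['Exited'])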
You can evaluate this model using sklearn or ADS.
Familiarize yourself with XGBoost by reviewing XGBoost Documentation.
Import XGBoost, then create and train a classifier:
    from xgboost import XGBClassifier

    xgb_classifier = XGBClassifier(nthread=1)
    xgb_classifier.fit(eval_train.X, eval_train.y)
From three estimators, you create three ADSModel objects: a Keras classifier, an sklearn pipeline with a single LogisticRegression stage, and an XGBoost classifier:
    from ads.common.model import ADSModel
    from ads.evaluations.evaluator import ADSEvaluator
    from ads.common.data import MLData

    keras_model = ADSModel.from_estimator(keras_classifier)
    lr_model = ADSModel.from_estimator(pipeline_classifier)
    xgb_model = ADSModel.from_estimator(xgb_classifier)

    evaluator = ADSEvaluator(eval_test, models=[keras_model, lr_model, xgb_model],
                             training_data=eval_train)
    evaluator.show_in_notebook()