DP-100: Designing and Implementing a Data Science Solution on Azure - Exam Prep
A collection of resources and learning material to help prepare for and pass Exam DP-100: Designing and Implementing a Data Science Solution on Azure. Passing this exam earns the Microsoft Certified: Azure Data Scientist Associate certification. The exam focuses on how to implement and run machine learning workloads on Azure, in particular using the Azure Machine Learning service.
Suggested Approach
Skills Measured - To aid study, a copy of the skills measured, sourced from the official exam page, is included below with key phrases highlighted. Note: always refer to the latest skills outline available from the official exam page, as the content changes from time to time.
Microsoft Learn - A collection of free, on-demand content aligned to the Azure Data Scientist role. Working through this content will establish a solid foundation to build upon.
Study Notes - Finally, refer either to your own set of notes or to those compiled below. After your initial round of learning, spend additional time on any areas where you do not yet feel confident before taking the exam.
Resources
Resource | Link |
---|---|
Certification | Microsoft Certified: Azure Data Scientist Associate |
Exam | Exam DP-100: Designing and Implementing a Data Science Solution on Azure |
Microsoft Learn | Azure Data Scientist Learning Paths |
Skills Outline | DP-100 Exam Skills Outline |
Suggested Learning Paths
Skills Measured
Create models by using Azure Machine Learning Designer
- create a training pipeline by using Azure Machine Learning designer [1]
- ingest data in a designer pipeline [1]
- use designer modules to define a pipeline data flow [1]
- use custom code modules in designer [1]
Run training scripts in an Azure Machine Learning workspace
- create and run an experiment by using the Azure Machine Learning SDK [1]
- consume data from a data store in an experiment by using the Azure Machine Learning SDK [1]
- consume data from a dataset in an experiment by using the Azure Machine Learning SDK [1]
- choose an estimator for a training experiment [1]
Generate metrics from an experiment run
- log metrics from an experiment run [1]
- retrieve and view experiment outputs [1]
- use logs to troubleshoot experiment run errors [1]
Automate the model training process
Use Automated ML to create optimal models
- use the Automated ML interface in Azure Machine Learning studio [1]
- use Automated ML from the Azure Machine Learning SDK [1]
- select scaling functions and pre-processing options [1] [2]
- determine algorithms to be searched
- define a primary metric [1]
- get data for an Automated ML run [1]
- retrieve the best model [1]
Use Hyperdrive to tune hyperparameters
- select a sampling method [1] [2]
- define the search space [1] [2]
- define the primary metric [1]
- define early termination options [1] [2]
- find the model that has optimal hyperparameter values [1]
Use model explainers to interpret models
Manage models
Create production compute targets
Deploy a model as a service
- configure deployment settings [1]
- consume a deployed service [1]
- troubleshoot deployment container issues [1]
Create a pipeline for batch inferencing
- publish a batch inferencing pipeline [1] [2]
- run a batch inferencing pipeline and obtain outputs [1]
Publish a designer pipeline as a web service
Study Notes
# Create a new Azure Machine Learning workspace
from azureml.core import Workspace
ws = Workspace.create(
name='aml-workspace',
subscription_id='123456-abc-123...',
resource_group='aml-resources',
create_resource_group=True,
location='eastus',
sku='enterprise'
)
Workspace Configuration File (config.json)
{
"subscription_id": "<subscription-id>",
"resource_group": "<resource-group>",
"workspace_name": "<workspace-name>"
}
Connect to Workspace using a Configuration File
By default, the from_config method looks for a file named config.json in the folder containing the Python code file, but you can specify another path if necessary.
from azureml.core import Workspace
ws = Workspace.from_config()
# Connect to an existing workspace by name
from azureml.core import Workspace
ws = Workspace.get(
name='aml-workspace',
subscription_id='1234567-abcde-890-fgh...',
resource_group='aml-resources')
from azureml.core import Workspace, Datastore
ws = Workspace.from_config()
# Register a new datastore
blob_ds = Datastore.register_azure_blob_container(workspace=ws,
datastore_name='blob_data',
container_name='data_container',
account_name='az_store_acct',
account_key='123456abcde789…')
# Create and register a tabular dataset from delimited files in a datastore
from azureml.core import Dataset
blob_ds = ws.get_default_datastore()
csv_paths = [(blob_ds, 'data/files/current_data.csv'),
(blob_ds, 'data/files/archive/*.csv')]
tab_ds = Dataset.Tabular.from_delimited_files(path=csv_paths)
tab_ds = tab_ds.register(workspace=ws, name='csv_table')
# Create and register a file dataset
from azureml.core import Dataset
blob_ds = ws.get_default_datastore()
file_ds = Dataset.File.from_files(path=(blob_ds, 'data/files/images/*.jpg'))
file_ds = file_ds.register(workspace=ws, name='img_files')
# Create a compute instance if one does not already exist
from azureml.core.compute import ComputeTarget, ComputeInstance
from azureml.core.compute_target import ComputeTargetException
compute_name = "compute-instance"
try:
instance = ComputeInstance(workspace=ws, name=compute_name)
except ComputeTargetException:
compute_config = ComputeInstance.provisioning_configuration(
vm_size='STANDARD_D3_V2',
ssh_public_access=False)
instance = ComputeInstance.create(ws, compute_name, compute_config)
instance.wait_for_completion(show_output=True)
from azureml.core import Workspace
from azureml.core.compute import ComputeTarget, AmlCompute
# Load the workspace from the saved config file
ws = Workspace.from_config()
# Specify a name for the compute (unique within the workspace)
compute_name = 'aml-cluster'
# Define compute configuration
compute_config = AmlCompute.provisioning_configuration(
vm_size='STANDARD_DS12_V2',
min_nodes=0,
max_nodes=4,
vm_priority='dedicated')
# Create the compute
aml_cluster = ComputeTarget.create(ws, compute_name, compute_config)
aml_cluster.wait_for_completion(show_output=True)
# Create an experiment
from azureml.core import Experiment
experiment = Experiment(workspace=ws, name="my-experiment")
# Get a registered datastore by name
from azureml.core import Datastore
my_datastore = Datastore.get(ws, 'my_datastore')
from azureml.core import Dataset
dataset_name = 'my-dataset'
# Get a dataset by name
ds = Dataset.get_by_name(workspace=ws, name=dataset_name)
# Load dataset into pandas DataFrame
df = ds.to_pandas_dataframe()
Generic Estimator
This class is designed for use with machine learning frameworks that do not already have an Azure Machine Learning pre-configured estimator. Pre-configured estimators exist for Chainer, PyTorch, TensorFlow, and SKLearn. An estimator encapsulates a run configuration and a script run configuration in a single object. Running the estimator produces a model in the output directory specified in your training script.
from azureml.train.estimator import Estimator
from azureml.core import Experiment
# Create an estimator
estimator = Estimator(source_directory='experiment_folder',
entry_script='training_script.py',
compute_target='local',
conda_packages=['scikit-learn']
)
# Create and run an experiment
experiment = Experiment(workspace = ws, name = 'training_experiment')
run = experiment.submit(config=estimator)
from azureml.train.sklearn import SKLearn
from azureml.core import Experiment
script_params = {
'--kernel': 'linear',
'--penalty': 1.0
}
estimator = SKLearn(
source_directory=project_folder,
script_params=script_params,
compute_target=compute_target,
entry_script='train_iris.py',
pip_packages=['joblib==0.13.2']
)
# Create and run an experiment
experiment = Experiment(workspace = ws, name = 'training_experiment')
run = experiment.submit(config=estimator)
Type | Function | Example | Note |
---|---|---|---|
Scalar values | run.log(name, value, description='') | run.log("accuracy", 0.95) | Log a numerical or string value. |
Lists | run.log_list(name, value, description='') | run.log_list("accuracies", [0.6, 0.7, 0.87]) | Log a list of values. |
Row | run.log_row(name, description=None, **kwargs) | run.log_row("Y over X", x=1, y=0.4) | Creates a metric with multiple columns as described in kwargs. |
Table | run.log_table(name, value, description='') | run.log_table("Y over X", {"x":[1, 2, 3], "y":[0.6, 0.7, 0.89]}) | Log a dictionary object. |
Images | run.log_image(name, path=None, plot=None) | run.log_image("ROC", plot=plt) | Log an image. |
Tag a run | run.tag(key, value=None) | run.tag("selected", "yes") | Tag the run with a string key and optional string value. |
Upload file or directory | run.upload_file(name, path_or_stream) | run.upload_file("best_model.pkl", "./model.pkl") | Upload a file. |
Option | Description |
---|---|
Run.start_logging | Add logging functions to your training script and start an interactive logging session in the specified experiment. start_logging creates an interactive run for use in scenarios such as notebooks. Any metrics that are logged during the session are added to the run record in the experiment. |
ScriptRunConfig | Add logging functions to your training script and load the entire script folder with the run. ScriptRunConfig is a class for setting up configurations for script runs. With this option, you can add monitoring code to be notified of completion or to get a visual widget to monitor. |
Designer logging | Add logging functions to a drag-&-drop designer pipeline by using the Execute Python Script module. Add Python code to log designer experiments. |
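For logging from within a submitted training script, the current run can be retrieved with Run.get_context() and metrics logged against it. A minimal sketch:
from azureml.core import Run
# Inside a training script: get the current run context and log a metric against it
run = Run.get_context()
run.log('accuracy', 0.95)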
# Get an experiment object from Azure Machine Learning
experiment = Experiment(workspace=ws, name="train-within-notebook")
# Create a run object in the experiment
run = experiment.start_logging()
# Log the algorithm parameter alpha to the run
run.log('alpha', 0.03)
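The ScriptRunConfig option above is not shown in code; a minimal sketch, assuming a training script named training_script.py in an experiment_folder directory (hypothetical names):
from azureml.core import Experiment, ScriptRunConfig
# Package a script folder and entry script as a run configuration
script_config = ScriptRunConfig(source_directory='experiment_folder',
                                script='training_script.py')
# Submit the configuration as an experiment run
experiment = Experiment(workspace=ws, name='training_experiment')
run = experiment.submit(config=script_config)
run.wait_for_completion(show_output=True)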
from azureml.pipeline.steps import PythonScriptStep, EstimatorStep
from azureml.pipeline.core import Pipeline
from azureml.core import Experiment
# Step to run a Python script
step1 = PythonScriptStep(name = 'prepare data',
source_directory = 'scripts',
script_name = 'data_prep.py',
compute_target = 'aml-cluster',
runconfig = run_config)
# Step to run an estimator
step2 = EstimatorStep(name = 'train model',
estimator = sk_estimator,
compute_target = 'aml-cluster')
# Construct the pipeline
train_pipeline = Pipeline(workspace = ws, steps = [step1,step2])
# Create an experiment and run the pipeline
experiment = Experiment(workspace = ws, name = 'training-pipeline')
pipeline_run = experiment.submit(train_pipeline)
from azureml.pipeline.core import PipelineData
from azureml.pipeline.steps import PythonScriptStep, EstimatorStep
# Get a dataset for the initial data
raw_ds = Dataset.get_by_name(ws, 'raw_dataset')
# Define a PipelineData object to pass data between steps
data_store = ws.get_default_datastore()
prepped_data = PipelineData('prepped', datastore=data_store)
# Step to run a Python script
step1 = PythonScriptStep(name = 'prepare data',
source_directory = 'scripts',
script_name = 'data_prep.py',
compute_target = 'aml-cluster',
runconfig = run_config,
# Specify dataset as initial input
inputs=[raw_ds.as_named_input('raw_data')],
# Specify PipelineData as output
outputs=[prepped_data],
# Also pass as data reference to script
arguments = ['--folder', prepped_data])
# Step to run an estimator
step2 = EstimatorStep(name = 'train model',
estimator = sk_estimator,
compute_target = 'aml-cluster',
# Specify PipelineData as input
inputs=[prepped_data],
# Pass as data reference to estimator script
estimator_entry_script_arguments=['--folder', prepped_data])
from azureml.train.automl import AutoMLConfig
from azureml.core.experiment import Experiment
automl_config=AutoMLConfig(
task='classification',
primary_metric='AUC_weighted',
experiment_timeout_minutes=30,
blacklist_models=['XGBoostClassifier'],
training_data=train_data,
label_column_name=label,
n_cross_validations=2)
ws = Workspace.from_config()
# Choose a name for the experiment and specify the project folder.
experiment_name = 'automl-classification'
project_folder = './sample_projects/automl-classification'
experiment = Experiment(ws, experiment_name)
run = experiment.submit(automl_config, show_output=True)
AutoMLConfig
Use the whitelist_models and blacklist_models parameters of the AutoMLConfig class to include or exclude models from the search.
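A minimal sketch restricting the search to specific algorithms (the model names used here are assumed valid for a classification task):
# Restrict the Automated ML search to specific algorithms via whitelist_models
automl_config = AutoMLConfig(task='classification',
                             primary_metric='AUC_weighted',
                             training_data=train_data,
                             label_column_name=label,
                             whitelist_models=['LogisticRegression', 'RandomForest'],
                             n_cross_validations=2)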
# Retrieve the best run and its fitted model
best_run, fitted_model = run.get_output()
print(best_run)
print(fitted_model)
Method | Description |
---|---|
Grid | Grid sampling can only be employed when all hyperparameters are discrete, and is used to try every possible combination of parameters in the search space. |
Random | Random sampling is used to randomly select a value for each hyperparameter, which can be a mix of discrete and continuous values. |
Bayesian | Bayesian sampling chooses hyperparameter values based on the Bayesian optimization algorithm, which tries to select parameter combinations that will result in improved performance from the previous selection. |
Grid Sampling
from azureml.train.hyperdrive import GridParameterSampling, choice
param_space = {
'--batch_size': choice(16, 32, 64),
'--learning_rate': choice(0.01, 0.1, 1.0)
}
param_sampling = GridParameterSampling(param_space)
Random Sampling
from azureml.train.hyperdrive import RandomParameterSampling, choice, normal
param_space = {
'--batch_size': choice(16, 32, 64),
'--learning_rate': normal(10, 3)
}
param_sampling = RandomParameterSampling(param_space)
Bayesian Sampling
from azureml.train.hyperdrive import BayesianParameterSampling, choice, uniform
param_space = {
'--batch_size': choice(16, 32, 64),
'--learning_rate': uniform(0.05, 0.1)
}
param_sampling = BayesianParameterSampling(param_space)
from azureml.train.hyperdrive import HyperDriveConfig, PrimaryMetricGoal
hyperdrive_run_config = HyperDriveConfig(estimator=estimator,
                          hyperparameter_sampling=param_sampling,
                          policy=early_termination_policy,
                          # Optional warm-start parameters to resume from previous runs
                          resume_from=warmstart_parents_to_resume_from,
                          resume_child_runs=child_runs_to_resume,
                          primary_metric_name="accuracy",
                          primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                          max_total_runs=100,
                          max_concurrent_runs=4)
Policy | Description |
---|---|
Bandit | You can use a bandit policy to stop a run if the target performance metric underperforms the best run so far by a specified margin. |
Median Stopping | A median stopping policy abandons runs where the target performance metric is worse than the median of the running averages for all runs. |
Truncation Selection | A truncation selection policy cancels the lowest performing X% of runs at each evaluation interval based on the truncation_percentage value you specify for X. |
Bandit Policy
from azureml.train.hyperdrive import BanditPolicy
early_termination_policy = BanditPolicy(slack_amount = 0.2,
evaluation_interval=1,
delay_evaluation=5)
Median Stopping Policy
from azureml.train.hyperdrive import MedianStoppingPolicy
early_termination_policy = MedianStoppingPolicy(evaluation_interval=1,
delay_evaluation=5)
Truncation Selection Policy
from azureml.train.hyperdrive import TruncationSelectionPolicy
early_termination_policy = TruncationSelectionPolicy(truncation_percentage=10,
evaluation_interval=1,
delay_evaluation=5)
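The HyperDriveConfig defined earlier is then submitted as an experiment run; a minimal sketch:
from azureml.core import Experiment
# Submit the HyperDrive configuration and wait for all child runs to finish
experiment = Experiment(workspace=ws, name='hyperdrive_experiment')
hyperdrive_run = experiment.submit(config=hyperdrive_run_config)
hyperdrive_run.wait_for_completion(show_output=True)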
best_run = hyperdrive_run.get_best_run_by_primary_metric()
best_run_metrics = best_run.get_metrics()
# The run arguments alternate between parameter names and values
parameter_values = best_run.get_details()['runDefinition']['Arguments']
print('Best Run Id: ', best_run.id)
print('\n Accuracy:', best_run_metrics['accuracy'])
print('\n learning rate:',parameter_values[3])
print('\n keep probability:',parameter_values[5])
print('\n batch size:',parameter_values[7])
Explainer | Description |
---|---|
MimicExplainer | An explainer that creates a global surrogate model that approximates your trained model and can be used to generate explanations. This explainable model must have the same kind of architecture as your trained model (for example, linear or tree-based). |
TabularExplainer | An explainer that acts as a wrapper around various SHAP explainer algorithms, automatically choosing the one that is most appropriate for your model architecture. |
PFIExplainer | A Permutation Feature Importance explainer that analyzes feature importance by shuffling feature values and measuring the impact on prediction performance. |
MimicExplainer
from interpret.ext.blackbox import MimicExplainer
from interpret.ext.glassbox import DecisionTreeExplainableModel
mim_explainer = MimicExplainer(model=loan_model,
initialization_examples=X_test,
explainable_model = DecisionTreeExplainableModel,
features=['loan_amount','income','age','marital_status'],
classes=['reject', 'approve'])
TabularExplainer
from interpret.ext.blackbox import TabularExplainer
tab_explainer = TabularExplainer(model=loan_model,
initialization_examples=X_test,
features=['loan_amount','income','age','marital_status'],
classes=['reject', 'approve'])
PFIExplainer
from interpret.ext.blackbox import PFIExplainer
pfi_explainer = PFIExplainer(model = loan_model,
features=['loan_amount','income','age','marital_status'],
classes=['reject', 'approve'])
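Once constructed, an explainer can produce global feature importance; a minimal sketch using the TabularExplainer above (X_test is assumed to hold the feature data):
# Generate a global explanation and inspect ranked feature importance
global_explanation = tab_explainer.explain_global(X_test)
global_importance = global_explanation.get_feature_importance_dict()
print(global_importance)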
from azureml.contrib.interpret.explanation.explanation_client import ExplanationClient
client = ExplanationClient.from_run(run)
# get model explanation data
explanation = client.download_model_explanation()
# or only get the top k (e.g., 4) most important features with their importance values
explanation = client.download_model_explanation(top_k=4)
global_importance_values = explanation.get_ranked_global_values()
global_importance_names = explanation.get_ranked_global_names()
print('global importance values: {}'.format(global_importance_values))
print('global importance names: {}'.format(global_importance_names))
from azureml.core import Model
classification_model = Model.register(workspace=ws,
model_name='classification_model',
model_path='model.pkl', # local path
description='A classification model')
from azureml.datadrift import DataDriftDetector
monitor = DataDriftDetector.create_from_datasets(
workspace=ws,
name='dataset-drift-detector',
baseline_data_set=train_ds,
target_data_set=new_data_ds,
compute_target='aml-cluster',
frequency='Week',
feature_list=['age','height', 'bmi'],
latency=24)
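A registered monitor can also be run on demand over a historical date range; a minimal sketch using the monitor created above:
import datetime as dt
# Backfill the drift monitor over the previous six weeks
backfill_run = monitor.backfill(dt.datetime.now() - dt.timedelta(weeks=6), dt.datetime.now())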
# Configure an ACI deployment with TLS/SSL enabled
from azureml.core.webservice import AciWebservice
aci_config = AciWebservice.deploy_configuration(
ssl_enabled=True, ssl_cert_pem_file="cert.pem", ssl_key_pem_file="key.pem", ssl_cname="www.contoso.com")
Compute Target | Used For | GPU Support | FPGA Support |
---|---|---|---|
Local | Testing | | |
Compute Instance | Testing | | |
Compute Cluster | Batch Inference | Y | |
Azure Kubernetes Service (AKS) | Real-time Inference | Y | Y |
Compute Target | Deployment Configuration Example |
---|---|
Local | deployment_config = LocalWebservice.deploy_configuration(port=8890) |
Azure Container Instance (ACI) | deployment_config = AciWebservice.deploy_configuration(cpu_cores = 1, memory_gb = 1) |
Azure Kubernetes Service (AKS) | deployment_config = AksWebservice.deploy_configuration(cpu_cores = 1, memory_gb = 1) |
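Putting a deployment configuration together with an inference configuration, a minimal deployment sketch (score.py and env.yml are hypothetical names; the model name matches the registration example above):
from azureml.core import Environment, Model
from azureml.core.model import InferenceConfig
from azureml.core.webservice import AciWebservice
# Define the execution environment and entry script used at inference time (assumed files)
env = Environment.from_conda_specification(name='deploy-env', file_path='env.yml')
inference_config = InferenceConfig(entry_script='score.py', environment=env)
# Choose the compute resources for the service
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
# Deploy the registered model as a real-time web service
model = ws.models['classification_model']
service = Model.deploy(workspace=ws,
                       name='classification-service',
                       models=[model],
                       inference_config=inference_config,
                       deployment_config=deployment_config)
service.wait_for_deployment(show_output=True)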
# Call the scoring endpoint of a deployed web service via REST
import requests
import json
headers = {'Content-Type': 'application/json'}
if service.auth_enabled:
headers['Authorization'] = 'Bearer '+service.get_keys()[0]
elif service.token_auth_enabled:
headers['Authorization'] = 'Bearer '+service.get_token()[0]
print(headers)
test_sample = json.dumps({'data': [
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
]})
response = requests.post(service.scoring_uri, data=test_sample, headers=headers)
print(response.status_code)
print(response.elapsed)
print(response.json())
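The same service can also be invoked through the SDK, which is convenient for testing, and its container logs retrieved for troubleshooting; a minimal sketch:
import json
# Call the deployed service through the SDK instead of raw REST
input_json = json.dumps({'data': [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]})
predictions = service.run(input_data=input_json)
print(predictions)
# Retrieve container logs to troubleshoot deployment issues
print(service.get_logs())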
# Publish a batch inferencing pipeline as a REST endpoint
published_pipeline = pipeline_run.publish_pipeline(name='Batch_Prediction_Pipeline', description='Batch pipeline', version='1.0')
rest_endpoint = published_pipeline.endpoint
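The auth_header used below must contain a valid Authorization token; a minimal sketch obtaining one through interactive login:
from azureml.core.authentication import InteractiveLoginAuthentication
# Obtain an Authorization header for calling the published pipeline endpoint
interactive_auth = InteractiveLoginAuthentication()
auth_header = interactive_auth.get_authentication_header()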
# Trigger the published pipeline via its REST endpoint
import requests
response = requests.post(rest_endpoint, headers=auth_header, json={"ExperimentName": "Batch_Prediction"})
run_id = response.json()["Id"]
from azureml.core.compute import AksCompute, ComputeTarget
# Use the default configuration (you can also provide parameters to customize this).
# For example, to create a dev/test cluster, use:
# prov_config = AksCompute.provisioning_configuration(cluster_purpose = AksCompute.ClusterPurpose.DEV_TEST)
prov_config = AksCompute.provisioning_configuration()
aks_name = 'myaks'
# Create the cluster
aks_target = ComputeTarget.create(workspace = ws,
name = aks_name,
provisioning_configuration = prov_config)
# Wait for the create process to complete
aks_target.wait_for_completion(show_output = True)
Configure a Real-Time Inference Pipeline
When you select Create inference pipeline, several things happen:
- The trained model is stored as a Dataset module in the module palette. You can find it under My Datasets.
- Training modules such as Train Model and Split Data are removed.
- The saved trained model is added back into the pipeline.
- Web Service Input and Web Service Output modules are added. These modules show where user data enters the pipeline and where data is returned.