Hidden tricks for running AutoML experiment from Azure Machine Learning SDK (2023)

Automated machine learning is a fast-growing field in the machine learning community that enables users to try multiple algorithms and preprocessing transformations with their data. Combined with scalable cloud-based compute, it makes it possible to find the best-performing model for the data without the huge amount of time-consuming manual trial and error that would otherwise be required.

This post provides a brief overview of how to run an AutoML experiment from the Azure Machine Learning SDK.

Azure Machine Learning supports automated machine learning, known as AutoML, as one of the Azure cloud offerings. You can use it through the visual interface in Azure Machine Learning studio, or submit an experiment using the SDK. The SDK gives data scientists greater control over the settings for the automated machine learning experiment, whereas the visual interface is easier for users with little or no coding experience.

Azure Machine Learning trains models for the following types of machine learning tasks:

  • Classification
  • Regression
  • Time Series Forecasting

In addition, Azure AutoML includes support for numerous commonly used algorithms for these tasks, including:

Classification Algorithms

  • Logistic Regression
  • Light Gradient Boosting Machine (GBM)
  • Decision Tree
  • Random Forest
  • Naive Bayes
  • Linear Support Vector Machine (SVM)
  • XGBoost
  • Deep Neural Network (DNN) Classifier
  • Others…

Regression Algorithms

  • Linear Regression
  • Light Gradient Boosting Machine (GBM)
  • Decision Tree
  • Random Forest
  • Elastic Net
  • LARS Lasso
  • XGBoost
  • Others…

Forecasting Algorithms

  • Linear Regression
  • Light Gradient Boosting Machine (GBM)
  • Decision Tree
  • Random Forest
  • Elastic Net
  • LARS Lasso
  • XGBoost
  • Others…

For a full list of supported algorithms, see How to define a machine learning task in the documentation.

While the user interface provides an intuitive way to select options for your automated machine learning experiment, the SDK gives users greater flexibility to set up the experiments and monitor the runs. Here, I have listed seven steps that guide users through running AutoML via the SDK.

In Azure Machine Learning, Compute Targets are physical or virtual computers on which experiments are run.

The ability to assign experiment runs to specific compute targets helps you implement a flexible data science ecosystem in the following ways:

  • Code can be developed and tested on local or low-cost compute, and then moved to more scalable compute for production workloads.
  • You can run each individual process on the compute target that best fits its needs. For example, you can use GPU-based compute to train deep learning models, and switch to lower-cost CPU-only compute to test and register the trained model.

One of the core benefits of cloud computing is the ability to manage costs by paying only for what you use. In Azure Machine Learning, you can take advantage of this principle by defining compute targets that:

  • Start on-demand and stop automatically when no longer required.
  • Scale automatically based on workload processing needs.
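As a rough sketch of such a target, the snippet below collects settings for an autoscaling cluster and passes them to `AmlCompute.provisioning_configuration`. The cluster name, VM size, node counts, and idle timeout are illustrative assumptions rather than values from this article, and `ws` is assumed to be a connected `Workspace`.

```python
def autoscale_settings(max_nodes=4, idle_seconds=1800):
    """Return keyword arguments for an on-demand, autoscaling cluster.

    The VM size and limits are placeholder assumptions; adjust them to
    your own quota and workload.
    """
    return {
        "vm_size": "STANDARD_DS3_V2",   # assumed CPU-only size
        "min_nodes": 0,                 # scale down to zero nodes when idle
        "max_nodes": max_nodes,         # upper bound for scale-out
        "idle_seconds_before_scaledown": idle_seconds,
    }


def get_or_create_cluster(ws, name="cpu-cluster"):
    """Reuse the named cluster if it exists, otherwise provision it."""
    # Imported here so the settings helper above works without the SDK.
    from azureml.core.compute import AmlCompute, ComputeTarget
    from azureml.core.compute_target import ComputeTargetException

    try:
        return ComputeTarget(workspace=ws, name=name)
    except ComputeTargetException:
        config = AmlCompute.provisioning_configuration(**autoscale_settings())
        cluster = ComputeTarget.create(ws, name, config)
        cluster.wait_for_completion(show_output=True)
        return cluster
```

Setting `min_nodes` to 0 is what lets the cluster stop when it is no longer required, while `max_nodes` caps how far it scales out under load.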

For complete documentation on compute targets, see the Azure Machine Learning documentation.

Azure Machine Learning includes the ability to create Compute Instances in a workspace to provide a development environment (Jupyter Notebook, JupyterLab, RStudio and SSH) that is managed with all of the other assets in the workspace.

pip install azureml-sdk

The SDK includes optional extras that aren’t required for core operations, but can be useful in some scenarios. For example, the notebooks extra includes widgets for displaying detailed output in Jupyter Notebooks, the automl extra includes packages for automated machine learning training, and the explain extra includes packages for generating model explanations. To install extras, specify them in brackets as shown here:

pip install "azureml-sdk[notebooks,automl,explain]"

More Information: For more information about installing the Azure Machine Learning SDK for Python, see the SDK documentation. Also, be aware that the SDK is updated on a regular basis, so review the release notes for the latest release.

Automated machine learning is designed to enable you to simply bring your data, and have Azure Machine Learning figure out how best to train a model from it.

When using the Automated Machine Learning user interface in Azure Machine Learning studio, you can create or select an Azure Machine Learning dataset to be used as the input for your automated machine learning experiment.

When using the SDK to run an automated machine learning experiment, you can submit the data in the following ways:

  • Specify a dataset or dataframe of training data that includes features and the label to be predicted.
  • Optionally, specify a second dataset or dataframe of validation data that will be used to validate the trained model. If this is not provided, Azure Machine Learning will apply cross-validation using the training data.

Alternatively:

  • Specify a dataset, dataframe, or numpy array of X values containing the training features, with a corresponding y array of label values.
  • Optionally, specify X_valid and y_valid datasets, dataframes, or numpy arrays to be used for validation.
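The first option can be sketched as a small helper that builds the data-related keyword arguments for `AutoMLConfig`. The helper name and the default of five folds are assumptions made for illustration; `training_data`, `validation_data`, `label_column_name`, and `n_cross_validations` are the `AutoMLConfig` parameters the helper fills in.

```python
def automl_data_args(training_data, label_column_name,
                     validation_data=None, n_cross_validations=5):
    """Build the data-related keyword arguments for AutoMLConfig.

    If no validation data is supplied, fall back to cross-validation on
    the training data with an explicit (assumed) number of folds.
    """
    args = {
        "training_data": training_data,
        "label_column_name": label_column_name,
    }
    if validation_data is not None:
        args["validation_data"] = validation_data
    else:
        args["n_cross_validations"] = n_cross_validations
    return args
```

The result can be unpacked into the configuration together with the other settings, for example `AutoMLConfig(task='classification', **automl_data_args(train_ds, 'label'))`, where `train_ds` and `'label'` stand in for your own dataset and label column.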

Hint 1: Azure Machine Learning has a built-in data profiling feature that lets users explore their registered datasets.


If you want the same feature in your SDK experiment, you can use the pandas_profiling Python package directly. After installing the package, generate the profile report by running:

from pandas_profiling import ProfileReport

profile = ProfileReport(df, title="Pandas Profiling Report")

To explore the report interactively in a Jupyter notebook, display it as a set of widgets:

profile.to_widgets()

The HTML report can be included in a Jupyter notebook:


Run the following code:

profile.to_notebook_iframe()

Saving the report

If you want to generate an HTML report file, save the ProfileReport to an object and use the to_file() function:

profile.to_file("your_report.html")

Alternatively, you can obtain the data as JSON:

# As a string
json_data = profile.to_json()
# As a file
profile.to_file("your_report.json")

After installing the SDK package in your Python environment, you can write code to connect to your workspace and perform machine learning operations. The easiest way to connect to a workspace is to use a workspace configuration file, which includes the Azure subscription, resource group, and workspace details as shown here:

{ 
"subscription_id": "<subscription-id>",
"resource_group": "<resource-group>",
"workspace_name": "<workspace-name>"
}

If you don't have a configuration file yet, you can create the Workspace object from the workspace details and save them with write_config. In later sessions, you can then connect using the from_config method of the Workspace class in the SDK. The first step is shown here:

from azureml.core import Workspace

subscription_id = '<subscription-id>'
resource_group = '<resource-group>'
workspace_name = '<workspace-name>'

try:
    ws = Workspace(subscription_id=subscription_id, resource_group=resource_group, workspace_name=workspace_name)
    # Save the details to a configuration file for later use with Workspace.from_config()
    ws.write_config()
    print('Library configuration succeeded')
except Exception:
    print('Workspace not found')
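For completeness, here is a minimal sketch of the configuration-file route itself. It writes a file in the JSON layout shown earlier and then loads it with `Workspace.from_config`. The helper names and the choice of a `.azureml` subfolder are assumptions made for this example.

```python
import json
from pathlib import Path


def write_workspace_config(folder, subscription_id, resource_group, workspace_name):
    """Write <folder>/.azureml/config.json with the three workspace details."""
    config_dir = Path(folder) / ".azureml"
    config_dir.mkdir(parents=True, exist_ok=True)
    config_path = config_dir / "config.json"
    config_path.write_text(json.dumps({
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }, indent=2))
    return config_path


def connect(folder="."):
    """Load the workspace described by the saved configuration file."""
    from azureml.core import Workspace  # requires the azureml-sdk package
    return Workspace.from_config(path=str(folder))
```

Once the file exists, `ws = connect()` should be enough in later notebooks or scripts started from the same folder, so the subscription details don't have to be repeated in code.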

The user interface provides an intuitive way to select options for your automated machine learning experiment. When using the SDK, you have greater flexibility, and you can set experiment options using the AutoMLConfig class, as shown in the following example:

import logging

from azureml.train.automl import AutoMLConfig

automl_settings = {
    "n_cross_validations": 3,
    "primary_metric": 'average_precision_score_weighted',
    "enable_early_stopping": True,
    "max_concurrent_iterations": 2,  # Limit for testing purposes; increase it to match the cluster size
    "experiment_timeout_hours": 0.25,  # Time limit for testing purposes; remove it for real use cases, because it drastically limits the ability to find the best possible model
    "verbosity": logging.INFO,
}

automl_config = AutoMLConfig(
    task='classification',
    debug_log='automl_errors.log',
    compute_target=compute_target,
    training_data=training_data,
    label_column_name=label_column_name,
    **automl_settings,
)

Like any scientific discipline, data science involves running experiments; typically to explore data or to build and evaluate predictive models. In Azure Machine Learning, an experiment is a named process, usually the running of a script or a pipeline, that can generate metrics and outputs and be tracked in the Azure Machine Learning workspace.

An experiment can be run multiple times, with different data, code, or settings; and Azure Machine Learning tracks each run, enabling you to view run history and compare results for each run.

You can submit an automated machine learning experiment like any other SDK-based experiment:

from azureml.core.experiment import Experiment

# Get or create the experiment by name, then submit the AutoML configuration
automl_experiment = Experiment(ws, 'automl_experiment')
automl_run = automl_experiment.submit(automl_config)
automl_run.wait_for_completion(show_output=True)

You can easily identify the best run in Azure Machine Learning studio, and download or deploy the model it generated. To accomplish this programmatically with the SDK, you can use code like the following example:

best_run, fitted_model = automl_run.get_output()
print(best_run)
print(fitted_model)

Beyond retrieving the best model, you can track your own metrics. To do so, use a run context to start and end an experiment run that is tracked in Azure Machine Learning, as shown in the following code sample:

run = experiment.start_logging()  # start a tracked run in an interactive session
# Inside a submitted script, use Run.get_context() instead (allow_offline=True by default, so it can be run locally as well)
...
run.log("Accuracy", 0.98)
run.log_row("Performance", epoch=e, error=err)
run.complete()  # end the tracked run

Every experiment generates log files that include the messages that would be written to the terminal during interactive execution. This enables you to use simple print statements to write messages to the log. However, if you want to record named metrics for comparison across runs, you can do so by using the Run object; which provides a range of logging functions specifically for this purpose. These include:

  • log: Record a single named value.
  • log_list: Record a named list of values.
  • log_row: Record a row with multiple columns.
  • log_table: Record a dictionary as a table.
  • log_image: Record an image file or a plot.
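The following sketch calls four of these logging functions on a Run-like object. The function name, metric names, and arguments are made up for illustration; `log_image` is left out because it needs an image file or a plot to record.

```python
def log_training_summary(run, accuracy, losses, labels, counts):
    """Record one example of each metric type on a Run-like object."""
    # Single named value
    run.log("Accuracy", accuracy)
    # Named list of values
    run.log_list("Loss per epoch", losses)
    # One row with multiple columns per call; repeated calls build up a table
    for epoch, loss in enumerate(losses):
        run.log_row("Performance", epoch=epoch, error=loss)
    # Dictionary of columns recorded as a table
    run.log_table("Label counts", {"label": labels, "count": counts})
```

In a notebook, `run` could be the object returned by `start_logging()`; inside a submitted script, it could be the result of `Run.get_context()`.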

More Information: For more information about logging metrics during experiment runs, see Monitor Azure ML experiment runs and metrics in the Azure Machine Learning documentation.

You can view the metrics logged by an experiment run in Azure Machine Learning studio or by using the RunDetails widget in a notebook, as shown here:

from azureml.widgets import RunDetails
RunDetails(automl_run).show()

You can also retrieve the metrics using the Run object’s get_metrics method, which returns a dictionary of the metrics, as shown here:

best_run_metrics = best_run.get_metrics()  # or any other run retrieved by its run ID
for metric_name in best_run_metrics:
    metric = best_run_metrics[metric_name]
    print(metric_name, metric)
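The same method can be used to compare several runs. The helper below is a sketch that ranks any iterable of Run-like objects by one numeric metric; its name and the `higher_is_better` flag are assumptions made for this example.

```python
def rank_runs_by_metric(runs, metric_name, higher_is_better=True):
    """Return (run_id, value) pairs sorted from best to worst.

    Runs that did not log the metric as a single number are skipped.
    """
    scored = []
    for run in runs:
        value = run.get_metrics().get(metric_name)
        if isinstance(value, (int, float)):
            scored.append((run.id, value))
    return sorted(scored, key=lambda pair: pair[1], reverse=higher_is_better)
```

For an AutoML run, a call such as `rank_runs_by_metric(automl_run.get_children(), 'average_precision_score_weighted')` would list the child runs from strongest to weakest on the primary metric.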

Another useful Run method is get_properties, which fetches the latest properties of the run from the service and returns a dict that you can query for particular properties such as the iteration, algorithm name, class name, and many other useful details.

The get_status method returns the latest status of the run; common values include “Running”, “Completed”, and “Failed”.

import time

while automl_run.get_status() not in ['Completed', 'Failed']:
    print('Run {} not in terminal state'.format(automl_run.id))
    time.sleep(10)
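A slightly more defensive variant of this polling pattern might also treat “Canceled” as a final state and give up after a timeout. The function name, the default intervals, and the timeout behavior below are assumptions for illustration.

```python
import time

# "Canceled" is also a final state; a loop that only checks for
# "Completed" and "Failed" would keep polling a canceled run.
TERMINAL_STATES = {"Completed", "Failed", "Canceled"}


def wait_for_terminal_state(run, poll_seconds=10, timeout_seconds=3600):
    """Poll run.get_status() until a terminal state or the timeout is reached."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = run.get_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Run {run.id} still '{status}' after {timeout_seconds}s"
            )
        time.sleep(poll_seconds)
```

Calling `wait_for_terminal_state(automl_run)` returns the final status string, which the caller can check before asking for the best model.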

The following code example shows some uses of the Run.list method.

from azureml.core import Run
from azureml.core.script_run import ScriptRun

favorite_completed_runs = Run.list(experiment, status='Completed', tags='favorite')
all_distinct_runs = Run.list(experiment)
and_their_children = Run.list(experiment, include_children=True)
only_script_runs = Run.list(experiment, type=ScriptRun.RUN_TYPE)

For the complete list of methods see the Azure ML API documentation.


Author: Stevie Stamm

Last Updated: 11/10/2023
