automated machine learning (AutoML)

What is automated machine learning (AutoML)?

Automated machine learning (AutoML) is the process of applying machine learning models to real-world problems using automation. More specifically, it automates the selection, composition and parameterization of ML models. Automating the machine learning process makes it more user-friendly and can provide faster, more accurate outputs than hand-coded algorithms.

AutoML software platforms give organizations without a specialized data scientist or ML expert access to machine learning. These platforms can be built in house, acquired from a third-party vendor or obtained from open source repositories such as GitHub.

How does the AutoML process work?

AutoML is typically a platform or open source library that simplifies each step in the machine learning process, from handling a raw data set to deploying a practical ML model. In traditional machine learning, models are developed by hand, and each step in the process must be handled separately.

A diagram depicting the AutoML process.

AutoML automatically searches for and applies the best-performing type of machine learning algorithm it can find for a given task. Two concepts help achieve this:

  • Neural architecture search. This automates the design of neural networks. It helps AutoML models discover new architectures for problems that require them.
  • Transfer learning. Pretrained models apply what they've learned to new data sets. This lets AutoML reuse existing architectures for new problems instead of designing them from scratch.

Users with minimal machine learning and deep learning knowledge can then interface with the models through a programming language such as Python.
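
As a rough illustration of the transfer learning idea, the following minimal Python sketch reuses the parameters of a model trained on one task as the starting point for a related task. It is a toy under stated assumptions, not how any particular AutoML product works: the one-variable linear model, the synthetic data and the names train_linear and mse are all made up for this example, and real systems typically reuse the weights of pretrained neural networks.

```python
def train_linear(xs, ys, w=0.0, b=0.0, lr=0.05, steps=20):
    """Fit y ~ w*x + b by gradient descent on mean squared error.

    The starting values of w and b can come from an earlier task.
    """
    n = len(xs)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mse(xs, ys, w, b):
    """Mean squared error of the line w*x + b on the given points."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Source task (synthetic): plenty of data and a long training run.
source_x = [i / 10 for i in range(-20, 21)]
source_y = [3.0 * x + 1.0 for x in source_x]
pre_w, pre_b = train_linear(source_x, source_y, steps=500)

# Target task (synthetic): related to the source, but with only four
# samples and a budget of five training steps.
target_x = [-1.0, 0.0, 1.0, 2.0]
target_y = [3.0 * x + 1.5 for x in target_x]

# Train from scratch versus starting from the pretrained parameters.
scratch_w, scratch_b = train_linear(target_x, target_y, steps=5)
transfer_w, transfer_b = train_linear(
    target_x, target_y, w=pre_w, b=pre_b, steps=5
)

scratch_error = mse(target_x, target_y, scratch_w, scratch_b)
transfer_error = mse(target_x, target_y, transfer_w, transfer_b)
print(f"from scratch: {scratch_error:.4f}  transferred: {transfer_error:.4f}")
```

With the same small training budget, the model that starts from the pretrained parameters ends with a lower error on the target task, because the two tasks share most of their structure.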

More specifically, here are some steps in the machine learning process that AutoML can automate, roughly in the order they typically occur:

  • Raw data processing.
  • Feature engineering and feature selection.
  • Model selection.
  • Hyperparameter optimization and parameter optimization.
  • Deployment with consideration for business and technology constraints.
  • Evaluation metric selection.
  • Monitoring and problem checking.
  • Analysis of results.
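
A few of the steps above (model selection, hyperparameter optimization and scoring against a fixed evaluation metric) can be sketched in plain Python. This is a minimal sketch under stated assumptions rather than the search procedure of any real AutoML tool: the two toy classifiers, the synthetic data set, the accuracy metric and the names auto_select and cross_val_accuracy are all hypothetical, and production systems search far larger spaces with smarter strategies than exhaustive evaluation.

```python
import random
from collections import Counter
from functools import partial


def knn_predict(train, point, k):
    """Classify `point` by majority vote among its k nearest training rows."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], point))
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def centroid_predict(train, point):
    """Classify `point` by the closest class mean."""
    sums, counts = {}, Counter()
    for features, label in train:
        counts[label] += 1
        totals = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            totals[i] += value

    def squared_distance(label):
        return sum((t / counts[label] - p) ** 2 for t, p in zip(sums[label], point))

    return min(counts, key=squared_distance)


def cross_val_accuracy(predict, data, folds=5):
    """Evaluation metric: accuracy averaged over k-fold cross-validation."""
    correct = 0
    for fold in range(folds):
        test = data[fold::folds]
        train = [row for i, row in enumerate(data) if i % folds != fold]
        correct += sum(predict(train, x) == y for x, y in test)
    return correct / len(data)


def auto_select(data, search_space):
    """Score every candidate configuration and return the best one."""
    scores = {name: cross_val_accuracy(fn, data) for name, fn in search_space.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Synthetic two-class data set (an assumption made for this example).
rng = random.Random(0)
data = [((rng.gauss(0, 1), rng.gauss(0, 1)), "a") for _ in range(30)]
data += [((rng.gauss(3, 1), rng.gauss(3, 1)), "b") for _ in range(30)]
rng.shuffle(data)

# Search space: one model family without hyperparameters, plus one family
# (k-nearest neighbors) tried with several values of its hyperparameter k.
search_space = {"nearest_centroid": centroid_predict}
search_space.update({f"knn_k={k}": partial(knn_predict, k=k) for k in (1, 3, 5)})

best_name, scores = auto_select(data, search_space)
print("selected:", best_name)
```

The user supplies only the data and the search space. The loop handles the repetitive work of training, scoring and comparing each candidate, which is the part of the process AutoML tools take over.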

Why is AutoML important?

AutoML is important because it represents a milestone in machine learning and artificial intelligence. AI and ML have long been subject to the "black box" criticism, meaning machine learning algorithms can be difficult to reverse engineer. Although they produce results efficiently, it can be difficult to trace how an algorithm arrived at a given output. This also makes it challenging to choose the correct model for a given problem, because the behavior of a black-box model is hard to predict.

AutoML helps to make machine learning less of a black box by making it more accessible. It automates the parts of the ML process that apply the algorithm to real-world scenarios. A human performing this task would need an understanding of the algorithm's internal logic and how it relates to those scenarios. AutoML instead learns and makes these choices itself, which would be too time-consuming or resource-intensive for humans to do efficiently at scale.

AutoML has also made it possible to fine-tune the end-to-end machine learning process, or machine learning pipeline, through meta-learning.
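
One way to picture meta-learning in this setting is as reusing what earlier runs found: describe each data set with a few summary statistics, then start the search for a new data set from the configurations that worked best on the most similar past data sets. The Python sketch below illustrates only that idea. The meta-features, the stored runs, the configuration names and the function warm_start_configs are all hypothetical and are not taken from any specific AutoML tool.

```python
import math

# Hypothetical log of earlier AutoML runs: summary statistics of each
# data set ("meta-features") and the best configuration found for it.
past_runs = [
    {"meta": {"log_rows": 3.0, "features": 10, "minority_share": 0.50},
     "best_config": {"model": "knn", "k": 5}},
    {"meta": {"log_rows": 5.0, "features": 200, "minority_share": 0.45},
     "best_config": {"model": "linear", "l2": 0.1}},
    {"meta": {"log_rows": 4.0, "features": 30, "minority_share": 0.02},
     "best_config": {"model": "tree_ensemble", "trees": 300}},
]


def meta_distance(a, b, scales):
    """Euclidean distance between meta-feature dicts, scaled per feature."""
    return math.sqrt(sum(((a[key] - b[key]) / scales[key]) ** 2 for key in scales))


def warm_start_configs(new_meta, runs, top_n=2):
    """Return the best configurations of the most similar past data sets."""
    # Scale each meta-feature by its observed range so none dominates.
    scales = {}
    for key in new_meta:
        values = [run["meta"][key] for run in runs]
        scales[key] = (max(values) - min(values)) or 1.0
    ranked = sorted(runs, key=lambda run: meta_distance(new_meta, run["meta"], scales))
    return [run["best_config"] for run in ranked[:top_n]]


# A new, heavily imbalanced data set of moderate size.
new_dataset_meta = {"log_rows": 4.1, "features": 25, "minority_share": 0.03}
suggestions = warm_start_configs(new_dataset_meta, past_runs)
print(suggestions)
```

The suggestions are only starting points. A search over the pipeline would still evaluate them on the new data and continue from there.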

On a wider scale, AutoML also represents a step toward artificial general intelligence.

Pros and cons of AutoML

The main benefits of AutoML are as follows:

  • Efficiency. AutoML speeds up and simplifies the machine learning process and reduces the training time of ML models.
  • Cost savings. Having a faster, more efficient machine learning process means a company can save money by devoting less of its budget to maintaining that process.
  • Accessibility. Having a simpler process allows companies to save money on training staff or hiring experts. It also makes machine learning a viable possibility for a wider range of companies.
  • Performance. AutoML algorithms tend to be more efficient than hand-coded models.

The main challenge of AutoML is the temptation to view it as a replacement for human knowledge.

Like most automation, AutoML is designed to perform rote tasks efficiently with accuracy and precision, freeing up employees to focus on more complex or novel tasks. Things that AutoML automates -- such as monitoring, analysis and problem detection -- are rote tasks that are faster if automated. A human should still be involved to assess and supervise the model. AutoML should help, not replace, data scientists and other employees, especially those with expert knowledge.

Another challenge is that AutoML is a relatively new field, and some of the most popular tools are not yet fully developed.

Different ways to use AutoML

AutoML shares common use cases with traditional machine learning. Some of these include the following:

  • Fraud detection in finance, where it improves the accuracy and precision of fraud detection models.
  • Research and development in healthcare, where it can analyze large data sets and draw insights.
  • Image recognition, which is useful for facial recognition.
  • Risk assessment and management in banking, finance and insurance.
  • Cybersecurity, where it can be used for risk assessment, monitoring and testing.
  • Customer support, where it can be used for sentiment analysis in chatbots as well as increasing efficiency in customer support teams.
  • Malware and spam detection, where it can be used to identify adaptive cyberthreats.
  • Agriculture, where it can be used to expedite the quality testing process.
  • Marketing, where it can be used for predictive analytics, improving engagement rates and making behavioral marketing campaigns on social media more efficient.
  • Entertainment, where it can be used for content selection or as a recommendation engine.
  • Retail, where it can be used to improve profits and reduce waste and inventory carryover.

AutoML tool features

The following are some popular AutoML platforms:

  • Google AutoML, Google's proprietary, cloud-based automated machine learning platform.
  • Azure Automated Machine Learning, a proprietary, cloud-based platform.
  • AutoKeras, an open source software library developed by the Data Lab at Texas A&M University.
  • Auto-sklearn, an open source AutoML toolkit built on top of Scikit-learn, an open source, commercially usable collection of simple machine learning tools in Python.
  • H2O AutoML, a tool on H2O's open source platform that automates the process of tuning and training models.
  • TransmogrifAI, an open source AutoML library for structured data that runs over Apache Spark.
  • IBM AutoAI, a cloud-based AutoML tool included in IBM's Watson Studio.

Tools such as Auto-sklearn and AutoKeras are open source and can be run on local infrastructure, meaning users can avoid the costs of proprietary cloud services. They rely strongly on known architectures and data they've already seen, and support classification and regression techniques, among other tasks.

Tools such as Google AutoML and Azure Automated Machine Learning, by contrast, are proprietary cloud platforms that offer scale, but also incur costs associated with cloud services. They use recurrent neural networks, convolutional neural networks, long short-term memory networks and other ML models.

This was last updated in June 2024
