AutoML
Automated machine learning (AutoML) is the process of automating the application of machine learning to real-world problems. AutoML covers the complete pipeline from the raw dataset to the deployable machine learning model. AutoML was proposed as an artificial intelligence-based solution to the ever-growing challenge of applying machine learning.
Resources
- https://en.wikipedia.org/wiki/Automated_machine_learning
- Automated machine learning can target various stages of the machine learning process. Steps that can be automated include (a sketch combining several of them follows this list):
- Data preparation and ingestion (from raw data and miscellaneous formats)
- Column type detection; e.g., boolean, discrete numerical, continuous numerical, or text
- Column intent detection; e.g., target/label, stratification field, numerical feature, categorical text feature, or free text feature
- Task detection; e.g., binary classification, regression, clustering, or ranking
- Feature engineering
- Feature selection
- Feature extraction
- Meta learning and transfer learning
- Detection and handling of skewed data and/or missing values
- Model selection. See AI/Supervised Learning/Model selection and tuning
- Hyperparameter optimization of the learning algorithm and featurization
- Pipeline selection under time, memory, and complexity constraints
- Selection of evaluation metrics and validation procedures
- Problem checking
- Leakage detection
- Misconfiguration detection
- Analysis of results obtained
- User interfaces and visualizations for automated machine learning
- AutoML conference: https://www.automl.org/
- https://www.rapidcanvas.ai/blogs/benchmarking-automl-exploring-the-top-automl-libraries
- http://www.kdnuggets.com/2017/01/current-state-automated-machine-learning.html
- https://medium.com/airbnb-engineering/automated-machine-learning-a-paradigm-shift-that-accelerates-data-scientist-productivity-airbnb-f1f8a10d61f8
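As a rough illustration of how several of the steps above compose, here is a minimal sketch in plain scikit-learn (not a real AutoML system; the column handling and search budgets are illustrative): column type detection from dtypes, missing-value imputation, categorical encoding, and joint model selection with hyperparameter optimization.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

def automl_fit(df: pd.DataFrame, target: str) -> RandomizedSearchCV:
    X, y = df.drop(columns=[target]), df[target]
    # Column type detection: infer numerical vs categorical from dtypes
    num_cols = X.select_dtypes(include=np.number).columns.tolist()
    cat_cols = X.select_dtypes(exclude=np.number).columns.tolist()
    # Data preparation: imputation, scaling, and one-hot encoding per column type
    preprocess = ColumnTransformer([
        ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                          ("scale", StandardScaler())]), num_cols),
        ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                          ("onehot", OneHotEncoder(handle_unknown="ignore"))]), cat_cols),
    ])
    pipe = Pipeline([("prep", preprocess), ("model", LogisticRegression(max_iter=1000))])
    # Model selection + hyperparameter optimization over one joint search space
    search_space = [
        {"model": [LogisticRegression(max_iter=1000)], "model__C": [0.01, 0.1, 1.0, 10.0]},
        {"model": [RandomForestClassifier()], "model__n_estimators": [100, 300],
         "model__max_depth": [None, 5, 10]},
    ]
    search = RandomizedSearchCV(pipe, search_space, n_iter=8, cv=5,
                                scoring="accuracy", random_state=0)
    return search.fit(X, y)

# Usage (hypothetical DataFrame df with a "label" column):
# best = automl_fit(df, "label"); print(best.best_estimator_, best.best_score_)
```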
Neural architecture search (NAS)
- NAS is closely related to hyperparameter optimization and is a subfield of automated machine learning (AutoML).
- https://en.wikipedia.org/wiki/Neural_architecture_search
- Neural architecture search (NAS) is a technique for automating the design of artificial neural networks (ANN), a widely used model in the field of machine learning. NAS has been used to design networks that are on par with or outperform hand-designed architectures. Methods for NAS can be categorized according to the search space, search strategy, and performance estimation strategy used (a toy loop illustrating all three follows this list):
- The search space defines the type(s) of ANN that can be designed and optimized.
- The search strategy defines the approach used to explore the search space.
- The performance estimation strategy evaluates the performance of a possible ANN from its design (without constructing and training it).
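A toy illustration of the three components, under stated assumptions: scikit-learn's MLPClassifier as the model family, uniform random search as the search strategy, and a cheap truncated training run as the performance estimation proxy (real systems use cheaper proxies to avoid full training).

```python
import random
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Search space: depth and width of a fully connected network
SEARCH_SPACE = {"n_layers": [1, 2, 3], "width": [16, 32, 64, 128]}

def sample_architecture():
    # Search strategy: uniform random sampling from the space
    n_layers = random.choice(SEARCH_SPACE["n_layers"])
    return tuple(random.choice(SEARCH_SPACE["width"]) for _ in range(n_layers))

def estimate_performance(arch, X_tr, y_tr, X_val, y_val):
    # Performance estimation: a short, truncated training run as a proxy score
    model = MLPClassifier(hidden_layer_sizes=arch, max_iter=50, random_state=0)
    model.fit(X_tr, y_tr)
    return model.score(X_val, y_val)

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)
candidates = [sample_architecture() for _ in range(10)]
best = max(candidates, key=lambda a: estimate_performance(a, X_tr, y_tr, X_val, y_val))
print("best architecture (hidden layer sizes):", best)
```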
Books
Code
- #CODE FLAML - Fast and Lightweight AutoML
- FLAML is powered by a new, cost-effective hyperparameter optimization and learner selection method invented by Microsoft Research
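- A minimal usage sketch of FLAML's high-level API (the time budget is illustrative; check the FLAML docs for current options):

```python
from flaml import AutoML
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
automl = AutoML()
# Searches over learners and their hyperparameters within the time budget (seconds)
automl.fit(X_train=X, y_train=y, task="classification", time_budget=60)
print(automl.best_estimator)  # name of the selected learner
print(automl.best_config)     # its tuned hyperparameters
```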
- #CODE EvalML - AutoML library written in Python
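	- A minimal sketch, assuming EvalML's AutoMLSearch entry point as described in its docs:

```python
from evalml.automl import AutoMLSearch
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
automl = AutoMLSearch(X_train=X_train, y_train=y_train, problem_type="binary")
automl.search()                # evaluates batches of candidate pipelines
print(automl.rankings.head())  # leaderboard of evaluated pipelines
best = automl.best_pipeline
preds = best.predict(X_test)
```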
- #CODE Model Search
- #CODE Auto-sklearn
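	- A minimal auto-sklearn sketch; it searches preprocessors, models, and hyperparameters under the given time budgets (values illustrative):

```python
import autosklearn.classification
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=300,  # total search budget in seconds
    per_run_time_limit=30,        # cap per candidate pipeline
)
automl.fit(X_train, y_train)
print(automl.sprint_statistics())    # summary of the search
print(automl.score(X_test, y_test))
```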
- #CODE TPOT
- http://rhiever.github.io/tpot/
- Consider TPOT your Data Science Assistant. TPOT is a Python tool that automatically creates and optimizes ML pipelines using genetic programming
- https://blog.alookanalytics.com/2017/05/25/automate-your-machine-learning/
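	- A minimal TPOT sketch: genetic programming evolves scikit-learn pipelines, and the best one can be exported as standalone code (budgets illustrative):

```python
from tpot import TPOTClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
tpot = TPOTClassifier(generations=5, population_size=20, verbosity=2, random_state=42)
tpot.fit(X_train, y_train)
print(tpot.score(X_test, y_test))
tpot.export("tpot_best_pipeline.py")  # writes the winning pipeline as plain scikit-learn code
```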
- #CODE AutoKeras
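	- A minimal AutoKeras sketch for tabular data; max_trials bounds the architecture search budget (values illustrative):

```python
import autokeras as ak
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = ak.StructuredDataClassifier(max_trials=3)  # number of candidate models to try
clf.fit(X_train, y_train, epochs=10)
print(clf.evaluate(X_test, y_test))
model = clf.export_model()  # best model as a plain Keras model
```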
- #CODE H2O AutoML
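	- A minimal H2O AutoML sketch (the CSV path is hypothetical; h2o.init() starts a local JVM backend):

```python
import h2o
from h2o.automl import H2OAutoML

h2o.init()
train = h2o.import_file("train.csv")  # hypothetical dataset path
x = train.columns[:-1]                # predictor columns
y = train.columns[-1]                 # response column
train[y] = train[y].asfactor()        # mark the target as categorical for classification
aml = H2OAutoML(max_models=20, seed=1)
aml.train(x=x, y=y, training_frame=train)
print(aml.leaderboard.head())         # models ranked by cross-validated metric
```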
- #CODE Adanet - Fast and flexible AutoML with learning guarantees
- https://adanet.readthedocs.io
- AdaNet is a lightweight TensorFlow-based framework for automatically learning high-quality models with minimal expert intervention
- #CODE FEDOT
- Automated modeling and machine learning framework
- https://fedot.readthedocs.io/en/latest/
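	- A minimal sketch, assuming FEDOT's high-level Fedot API from the docs above (timeout is the search budget in minutes):

```python
from fedot.api.main import Fedot
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = Fedot(problem="classification", timeout=5)  # evolves a composite pipeline
model.fit(features=X_train, target=y_train)
preds = model.predict(features=X_test)
print(preds[:10])
```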
References
- #PAPER #REVIEW An improved hyperparameter optimization framework for AutoML systems using evolutionary algorithms (Vincent 2023)
NAS
- Literature on NAS
- #PAPER AdaNet: Adaptive Structural Learning of Artificial Neural Networks (Cortes 2017)
- #PAPER Improving Neural Architecture Search Image Classifiers via Ensemble Learning, AdaNas (Macko 2019)
- #PAPER Up to two billion times acceleration of scientific simulations with deep neural architecture search (Kasim 2020)
- #PAPER Neural Architecture Search without Training (Mellor 2020)
- #PAPER Automated Evolutionary Approach for the Design of Composite Machine Learning Pipelines (Nikitin 2021)