Zürcher Fachhochschule

Automated Machine Learning in Practice: State of the Art and Recent Results

6th Swiss Conference on Data Science, 14.6.2019

Lukas Tuggener¹,⁴
Mohammadreza Amirian¹,²
Katharina Rombach¹
Stefan Lörwald³
Anastasia Varlet³
Christian Westermann³
Thilo Stadelmann¹

¹ ZHAW, ² Ulm University, ³ PricewaterhouseCoopers AG, ⁴ USI

Contents

• Automated Machine Learning
  • What is it?
  • Why is it?
• Current state of the art
• Benchmark results
• Conclusions and closing remarks
• Q & A


Automated Machine Learning - What is it?

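[Figure: A data generating process (CRM system, stock markets, surveys, sensors such as a camera or thermometer) produces training data and live data. Training data and labels are fed to a training algorithm, which yields a trained model. The trained model turns live data into predictions.]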


• Model selection
• Model hyperparameters, e.g. number of layers, splitting criterion …
• Training algorithm selection
• Training hyperparameters, e.g. learning rate, batch size …
• Regularization
• Data handling, e.g. transformations, outlier handling …
• …


Do this automatically
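The choices listed above can be viewed as one joint search space in which some hyperparameters only exist once a particular model has been chosen. A minimal Python sketch of such a space and of drawing a configuration from it; all model names, hyperparameter names, and value ranges are hypothetical and chosen purely for illustration, they do not come from the talk:

```python
import random

# Hypothetical search space: every name and range below is an illustrative assumption.
SEARCH_SPACE = {
    "decision_tree": {
        "splitting_criterion": ["gini", "entropy"],
        "max_depth": [2, 4, 8, 16],
    },
    "neural_net": {
        "num_layers": [1, 2, 3],
        "learning_rate": [1e-3, 1e-2, 1e-1],
        "batch_size": [16, 32, 64],
    },
}
DATA_HANDLING = {
    "transformation": ["none", "standardize", "log"],
    "outlier_handling": ["keep", "clip"],
}


def sample_config(rng):
    """Draw one full configuration: model choice, its hyperparameters, data handling."""
    model = rng.choice(sorted(SEARCH_SPACE))
    config = {"model": model}
    # Hyperparameters are conditional on the chosen model.
    for name, values in SEARCH_SPACE[model].items():
        config[name] = rng.choice(values)
    # Data handling options apply regardless of the model.
    for name, values in DATA_HANDLING.items():
        config[name] = rng.choice(values)
    return config


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(sample_config(rng))
```

An AutoML system would evaluate each sampled configuration and use the results to decide where to look next; the sketch only covers the sampling step.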


More formally:

Combined Algorithm Selection and Hyperparameter optimization (CASH)¹:

[Equation: the CASH objective, annotated on the slide with: model space, hyperparameter space, validation loss, i-th cross-validation train / valid set]

¹ C. Thornton, F. Hutter, H. H. Hoos, and K. Leyton-Brown, “Auto-WEKA: Combined selection and hyperparameter optimization of classification algorithms”.


Notably absent:

• Data preprocessing
• Training algorithm configuration
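For reference, the CASH objective can be written as below. The notation follows the cited Auto-WEKA paper and is an assumption about what the slide showed: $\mathcal{A} = \{A^{(1)}, \dots, A^{(R)}\}$ is the model space, $\Lambda^{(j)}$ the hyperparameter space of algorithm $A^{(j)}$, $\mathcal{L}$ the validation loss, and $D^{(i)}_{\text{train}}$, $D^{(i)}_{\text{valid}}$ the i-th of $k$ cross-validation train / valid splits.

```latex
A^{*}_{\lambda^{*}} \in
  \operatorname*{argmin}_{A^{(j)} \in \mathcal{A},\; \lambda \in \Lambda^{(j)}}
  \frac{1}{k} \sum_{i=1}^{k}
  \mathcal{L}\!\left(A^{(j)}_{\lambda},\, D^{(i)}_{\text{train}},\, D^{(i)}_{\text{valid}}\right)
```

In this form only the algorithm and its hyperparameters appear as optimization variables, which matches the remark that data preprocessing and training algorithm configuration are absent.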


Automated Machine Learning - Why is it?

Make data analytics talent more efficient on hard tasks and obsolete on simple ones.


Does anyone need convincing?


Current state of the art - concepts

Optimization: CASH is an optimization problem.

• Bayesian Optimization
• Evolutionary Strategies
• Tree Search
• Handcrafted Heuristics
• Random Search

Meta-learning: What can we learn about datasets and learning algorithms that is “generally true”?

• Dataset Clustering
• Dataset Landmarks
• Learning Curve estimation
• Training Meta-Models (BO etc. …)
• Selecting Model candidates
• Shipping pretrained Models (think: MAML)

Resource allocation: How do we spend the resources at our disposal?

• Early stopping
• Model compression
• Restarts of promising candidates

Explore vs. Exploit
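To make "CASH is an optimization problem" concrete, here is a minimal sketch of the simplest optimizer on the list, random search, minimizing a cross-validated loss over algorithm and hyperparameter jointly. Everything in it is an illustrative assumption: the two tiny pure-Python models (`fit_knn`, `fit_ridge`), their hyperparameter ranges, the toy data, and the fold scheme are stand-ins, not the setup used in the talk.

```python
import random
from statistics import mean


def fit_knn(train, k):
    """Return a k-nearest-neighbour regressor for 1-D inputs."""
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        return mean(y for _, y in nearest)
    return predict


def fit_ridge(train, alpha):
    """Return a 1-D linear model with an L2 penalty `alpha` on the slope."""
    mx = mean(x for x, _ in train)
    my = mean(y for _, y in train)
    sxy = sum((x - mx) * (y - my) for x, y in train)
    sxx = sum((x - mx) ** 2 for x, _ in train)
    slope = sxy / (sxx + alpha)
    return lambda x: my + slope * (x - mx)


# Model space, with one hyperparameter sampler per algorithm (illustrative choices).
MODEL_SPACE = {
    "knn": (fit_knn, lambda rng: rng.randint(1, 15)),
    "ridge": (fit_ridge, lambda rng: 10 ** rng.uniform(-3, 2)),
}


def cv_loss(fit, hyper, data, k_folds=5):
    """Mean validation MSE over k cross-validation splits."""
    losses = []
    for i in range(k_folds):
        valid = data[i::k_folds]
        train = [p for j, p in enumerate(data) if j % k_folds != i]
        model = fit(train, hyper)
        losses.append(mean((model(x) - y) ** 2 for x, y in valid))
    return mean(losses)


def random_search(data, budget, rng):
    """Sample (algorithm, hyperparameter) pairs and keep the lowest CV loss."""
    best = None
    for _ in range(budget):
        name = rng.choice(sorted(MODEL_SPACE))
        fit, sample_hyper = MODEL_SPACE[name]
        hyper = sample_hyper(rng)
        loss = cv_loss(fit, hyper, data)
        if best is None or loss < best[0]:
            best = (loss, name, hyper)
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data: a noisy quadratic, so the nonlinear model should tend to win.
    data = [(x / 10, (x / 10) ** 2 + rng.gauss(0, 0.1)) for x in range(-30, 31)]
    rng.shuffle(data)
    print(random_search(data, budget=30, rng=rng))
```

The other optimizers on the list differ mainly in how the next candidate is proposed; the cross-validated objective they minimize stays the same.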


Current state of the art - implementations

• Data Science Machine (DSM): random search (our configuration), “fully trained”
• Auto-sklearn¹: Bayesian optimization, ensemble building, meta-trained BO model
• TPOT²: genetic programming
• Portfolio Hyperband: portfolio of promising model candidates³, in our case sourced from openml.org, combined with Hyperband⁴

¹ M. Feurer, A. Klein, K. Eggensperger, J. Springenberg, M. Blum, and F. Hutter, “Efficient and robust automated machine learning”
² R. S. Olson, R. J. Urbanowicz, P. C. Andrews, N. A. Lavender, J. H. Moore, et al., “Automating biomedical data science through tree-based pipeline optimization”
³ M. Feurer, K. Eggensperger, S. Falkner, M. Lindauer, and F. Hutter, “Practical automated machine learning for the automl challenge 2018”
⁴ L. Li, K. Jamieson, G. DeSalvo, A. Rostamizadeh, and A. Talwalkar, “Hyperband: A novel bandit-based approach to hyperparameter optimization”


Portfolio Hyperband

Config: full model + training configuration

[Figure: Portfolio Hyperband procedure]

• Portfolio configs (P1, P2, P3, P4, P5, P6, …): sampled from successful and diverse runs on meta-datasets
• Random configs (Pa, Pb, Pc, Pd, Pe, Pf, …)
• Choose N configs
• Train each config for M CPU-seconds, then choose the best 1/2 of the population; the figure shows four training rounds with a halving after each of the first three
• Restart with a different M/N ratio
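The procedure in the figure can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: `toy_score` is a hypothetical stand-in for real training (a saturating learning curve), training time is an abstract number rather than measured CPU-seconds, and filling half of each bracket from the portfolio and the rest at random is an arbitrary choice made for the example.

```python
import random


def toy_score(config, seconds):
    """Stand-in for real training: validation score after `seconds` of training.

    Hypothetical learning curve that saturates at config["quality"].
    """
    return config["quality"] * (1.0 - 0.5 ** (seconds / config["half_time"]))


def successive_halving(configs, m_seconds, rounds=4):
    """Train all configs for M seconds, keep the best half, and repeat."""
    population = list(configs)
    trained = 0.0
    for r in range(rounds):
        trained += m_seconds  # survivors continue training from where they stopped
        ranked = sorted(population, key=lambda c: toy_score(c, trained), reverse=True)
        if r < rounds - 1:
            population = ranked[: max(1, len(ranked) // 2)]  # choose best 1/2
        else:
            population = ranked
    return population[0], toy_score(population[0], trained)


def random_config(rng):
    """Draw a random toy config (hypothetical ranges)."""
    return {"quality": rng.uniform(0.3, 0.9), "half_time": rng.uniform(0.5, 8.0)}


def portfolio_hyperband(portfolio, brackets, rng, rounds=4):
    """Run one successive-halving bracket per (N, M) pair and return the best result."""
    best_config, best_score = None, float("-inf")
    for n_configs, m_seconds in brackets:
        # Assumption: half of each bracket comes from the portfolio, the rest is random.
        n_portfolio = min(len(portfolio), n_configs // 2)
        configs = rng.sample(portfolio, n_portfolio)
        configs += [random_config(rng) for _ in range(n_configs - n_portfolio)]
        config, score = successive_halving(configs, m_seconds, rounds)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical portfolio: configs that did well on earlier (meta-)datasets.
    portfolio = [
        {"quality": rng.uniform(0.7, 0.95), "half_time": rng.uniform(0.5, 4.0)}
        for _ in range(6)
    ]
    brackets = [(16, 1.0), (8, 2.0), (4, 4.0)]  # restarts with different M/N ratios
    print(portfolio_hyperband(portfolio, brackets, rng))
```

Brackets with many configs and a small M discard slow learners early, while brackets with few configs and a large M give each candidate more time; running several ratios hedges between the two.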


Benchmark results


Conclusions and closing remarks

• Design space of AutoML systems is vast.

• No clearly superior paradigm – but different characteristics

• Random search is boring but a crucial part of any AutoML system


[Figure: techniques arranged between the poles “speed” and “accuracy”: good priors, aggressive early stopping, local random search, dataset characterization, response surface modelling, more random search]


Any constraints possible?


Spend a lot of time on:

• defining your meta-dataset

• pre-training meta-models

• pre-training models


Producing general improvements is extremely difficult


Any questions?

About me:

• Doctoral Student ZHAW / USI
• [email protected]
• 058 934 47 33
• https://tuggeluk.github.io/

Happy to answer questions & requests.

Thanks for your attention!


APPENDIX

