The rapid growth in the amount of data collected is shifting the bottleneck in making informed decisions from a lack of data to a lack of data scientists who can analyze it. Moreover, new potential solutions and approaches for data analysis are now published faster than a human data scientist can follow. At the same time, we observe that many tasks a data scientist performs during analysis could be automated. Automated machine learning (AutoML) research and solutions attempt to automate portions of, or even the entire, data analysis process.

We address two challenges in AutoML research: first, how to represent ML programs suitably for metalearning; and second, how to improve evaluations of AutoML systems to be able to compare approaches, not just predictions.

To this end, we have designed and implemented a framework for ML programs that provides all the components needed to describe ML programs in a standard way. The framework is extensible and its components are decoupled from each other; for example, the framework can be used to describe ML programs that use neural networks. We provide reference tooling for executing programs described in the framework. We have also designed and implemented a service, a metalearning database, that stores information about executed ML programs generated by different AutoML systems.

We evaluate our framework by measuring the computational overhead of using it, compared to executing ML programs that call the underlying libraries directly. We observe that executing ML programs through the framework is an order of magnitude slower and uses twice the memory of ML programs that do not use the framework.
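An overhead comparison of this kind can be sketched with the Python standard library alone. The following is a minimal illustration, not the measurement harness used in this work: `run_direct` and `run_via_framework` are hypothetical stand-ins for an ML program that calls libraries directly and the same program executed through a framework, and the ratios they produce say nothing about the actual results reported above.

```python
import time
import tracemalloc
from typing import Callable, Tuple


def measure(fn: Callable[[], object]) -> Tuple[float, int]:
    """Return (elapsed seconds, peak traced bytes) for one call of fn."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        fn()
    finally:
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return elapsed, peak


def overhead(baseline: Callable[[], object],
             candidate: Callable[[], object]) -> Tuple[float, float]:
    """Return (time ratio, memory ratio) of candidate relative to baseline."""
    base_time, base_mem = measure(baseline)
    cand_time, cand_mem = measure(candidate)
    # Guard against a zero denominator for trivially small baselines.
    return cand_time / max(base_time, 1e-9), cand_mem / max(base_mem, 1)


def run_direct() -> int:
    # Hypothetical stand-in for a program calling libraries directly.
    return sum(x * x for x in range(100_000))


def run_via_framework() -> int:
    # Hypothetical stand-in for the same program with per-step bookkeeping,
    # which is one plausible source of framework overhead.
    steps = [{"step": i, "value": i * i} for i in range(100_000)]
    return sum(s["value"] for s in steps)


if __name__ == "__main__":
    time_ratio, mem_ratio = overhead(run_direct, run_via_framework)
    print(f"time ratio: {time_ratio:.2f}x, memory ratio: {mem_ratio:.2f}x")
```

Note that `tracemalloc` only tracks allocations made through Python's allocator and itself slows execution, so a real evaluation would likely measure time and memory in separate runs and repeat each measurement several times.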

We demonstrate our framework’s ability to evaluate AutoML systems by comparing 10 different AutoML systems that use our framework. The results show that the framework can be used both to describe a diverse set of ML programs and to determine unambiguously which AutoML system produced the best ML programs. In many cases, the produced ML programs outperformed ML programs made by human experts.



