Deep Learning: Triangle Machine Learning Day - What You Didn’t Learn About Machine Learning in School, Wayne Thompson
September 20, 2019
Abstract
A machine learning pipeline is composed of data wrangling, feature engineering and extraction, and model formulae, and it may also be layered with rules or policies. Each model therefore carries a lot of data preparation logic: to score new data, you must aggregate many data sources, apply the model formulae, and then layer the rules or policies on top. Most organizations don’t have enough rigor and metadata to re-create the data wrangling phase for scoring. As a result, many of the backward data source dependencies needed to derive the new scoring tables get lost. This is the biggest reason why most organizations take too long to put a model to work.
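As a rough illustration of the point about metadata, the Python sketch below packages wrangling, feature engineering, a model formula, and a rule into one object that records which columns each step reads and writes. It is not from the talk: the `Step` and `ScoringPipeline` classes, the column names, the coefficients, and the approval threshold are all hypothetical, chosen only to show how recorded inputs and outputs let you trace a scored value back to its raw data sources.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass
class Step:
    """One pipeline stage plus the metadata needed to replay it (hypothetical)."""
    name: str
    inputs: List[str]          # columns this step reads
    outputs: List[str]         # columns this step writes
    fn: Callable[[Row], Row]   # returns only the newly derived columns


@dataclass
class ScoringPipeline:
    steps: List[Step] = field(default_factory=list)

    def score(self, row: Row) -> Row:
        """Run every step in order, failing loudly if an input is missing."""
        out = dict(row)
        for step in self.steps:
            missing = [c for c in step.inputs if c not in out]
            if missing:
                raise KeyError(f"{step.name}: missing inputs {missing}")
            out.update(step.fn(out))
        return out

    def raw_dependencies(self, column: str) -> List[str]:
        """Walk backward from a column to the raw source columns it needs."""
        producers = {o: s for s in self.steps for o in s.outputs}
        raw, seen, stack = set(), set(), [column]
        while stack:
            col = stack.pop()
            if col in seen:
                continue
            seen.add(col)
            step = producers.get(col)
            if step is None:
                raw.add(col)               # no step produces it: a raw source
            else:
                stack.extend(step.inputs)  # keep tracing upstream
        return sorted(raw)


# Illustrative steps only; every name and number here is an assumption.
pipeline = ScoringPipeline([
    Step("wrangle", ["orders_total", "orders_count"], ["avg_order"],
         lambda r: {"avg_order": r["orders_total"] / max(r["orders_count"], 1)}),
    Step("features", ["avg_order"], ["high_value"],
         lambda r: {"high_value": int(r["avg_order"] > 40.0)}),
    Step("model", ["high_value", "tenure_months"], ["score"],
         lambda r: {"score": 0.5 * r["high_value"] + 0.05 * r["tenure_months"]}),
    Step("rules", ["score", "is_minor"], ["decision"],
         lambda r: {"decision": "review"
                    if r["is_minor"] or r["score"] < 1.0 else "approve"}),
])

result = pipeline.score({"orders_total": 200.0, "orders_count": 4,
                         "tenure_months": 12, "is_minor": 0})
print(result["decision"])
print(pipeline.raw_dependencies("decision"))
```

Because each step declares its inputs and outputs, `raw_dependencies("decision")` can list the raw columns a scoring table must supply, which is the kind of backward dependency the abstract says tends to get lost.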