Suvarna Garge (Editor)

Deep feature synthesis

Deep Feature Synthesis is an algorithm developed by James Max Kanter and Kalyan Veeramachaneni and described in their paper "Deep Feature Synthesis: Towards Automating Data Science Endeavors".

Definition

Quoting the above paper: "Deep Feature Synthesis is an algorithm that automatically generates features for relational datasets. In essence, the algorithm follows relationships in the data to a base field, and then sequentially applies mathematical functions along that path to create the final feature."
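As a rough illustration of the quoted definition, the sketch below follows two relationships (customers to orders to items) down to a base field and then applies one function per step on the way back up. The tables, column names, and choice of SUM and MEAN are hypothetical and are not taken from the paper; this is a hand-written pandas example, not the Data Science Machine itself.

```python
import pandas as pd

# Hypothetical toy tables: customers 1-to-many orders 1-to-many items.
customers = pd.DataFrame({"customer_id": [1, 2]})
orders = pd.DataFrame({"order_id": [10, 11, 12], "customer_id": [1, 1, 2]})
items = pd.DataFrame({
    "order_id": [10, 10, 11, 12],
    "price": [5.0, 15.0, 30.0, 8.0],
})

# Step 1: aggregate the base field (items.price) up to its parent table (orders).
order_totals = (
    items.groupby("order_id")["price"]
    .sum()
    .rename("SUM(items.price)")
    .reset_index()
)
orders = orders.merge(order_totals, on="order_id", how="left")

# Step 2: aggregate the step-1 feature up to the target table (customers).
customer_feature = (
    orders.groupby("customer_id")["SUM(items.price)"]
    .mean()
    .rename("MEAN(orders.SUM(items.price))")
    .reset_index()
)
features = customers.merge(customer_feature, on="customer_id", how="left")
```

The resulting column, MEAN(orders.SUM(items.price)), is "deep" in the sense that it stacks two functions along a two-step relationship path; an automated implementation would generate many such columns rather than one chosen by hand.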

Practical Results

Kanter and Veeramachaneni implemented the Deep Feature Synthesis algorithm in their Data Science Machine and entered its automatically generated results in three competitions:

Their results competed against human teams to find predictive patterns in unfamiliar data sets. Of the 906 teams participating in the three competitions, the researchers' "Data Science Machine" finished ahead of 615. In two of the three competitions, the predictions made by the Data Science Machine were 94 percent and 96 percent as accurate as the winning submissions. In the third, the figure was a more modest 87 percent. But where the teams of humans typically labored over their prediction algorithms for months, the Data Science Machine took somewhere between two and 12 hours to produce each of its entries.

Characteristics

Little to no human intervention.

Produces results in hours, not weeks.

Relies on SQL schema and normalized table relationships.
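One way to see why the characteristics above go together: once table relationships are declared, candidate features can be enumerated mechanically. The following is a minimal sketch under assumed names (PARENT_OF, NUMERIC_COLUMNS, AGGREGATIONS, and enumerate_features are all hypothetical); it only lists feature names and does not reproduce the authors' implementation.

```python
from typing import List

# Hypothetical schema: child table -> parent table, plus numeric columns per table.
PARENT_OF = {"items": "orders", "orders": "customers"}
NUMERIC_COLUMNS = {"items": ["price"], "orders": [], "customers": []}
AGGREGATIONS = ["SUM", "MEAN", "MAX"]


def enumerate_features(target: str, max_depth: int) -> List[str]:
    """List feature names for `target`, stacking aggregations up to max_depth."""
    features = list(NUMERIC_COLUMNS[target])  # depth 0: the table's own columns
    if max_depth == 0:
        return features
    for child, parent in PARENT_OF.items():
        if parent != target:
            continue
        # Every feature of the child table can be aggregated up to the parent.
        for child_feature in enumerate_features(child, max_depth - 1):
            for agg in AGGREGATIONS:
                features.append(f"{agg}({child}.{child_feature})")
    return features


candidates = enumerate_features("customers", max_depth=2)
```

With this toy schema, a depth of 2 yields nine candidate names for the customers table (three aggregations stacked on three), which hints at how quickly the feature space grows with depth and why a normalized schema is a prerequisite.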

Applications

Quickly create feature sets of predictive value.

Critique

The process of feature synthesis from relational data is known as propositionalization, a technique known since at least 1991. The algorithm employed in Deep Feature Synthesis was first described by Knobbe in 2001 and is known as RollUp. RollUp was later enhanced in PRORED. A commercial version of RollUp is sold under the name Safarii.

References

Deep feature synthesis Wikipedia