Feature Engineering for Machine Learning: Principles and Techniques for Data Scientists by Alice Zheng, Amanda Casari

Publisher: O'Reilly Media, Incorporated
ISBN: 9781491953242
Pages: 214
Format: pdf


Understand machine learning principles (training, validation, etc.). Basic knowledge of machine learning techniques (e.g. classification, regression, and clustering). Knowledge of data query and data processing tools.

In my mind, feature engineering encompasses several different data preparation techniques. But before we get into it, we must define what a feature actually is. Some may mistake feature engineering for feature selection or, worse, for adding new data sources.

Machine learning algorithms can figure out how to perform important tasks by generalizing from examples. Feature engineering is more difficult because it is domain-specific, while learners can be largely general-purpose.

Previous articles have discussed the merits and advantages of each of these techniques. But from a data science standpoint, if these techniques are going to yield significantly improved results, then it is incumbent on us as practitioners to find approaches that allow us to better understand these solutions. For example, the practitioner can use techniques such as factor analysis, decision trees, and correlations as mathematical routines to aid in the feature engineering process. These techniques are particularly useful when data is very scarce.

Normalization transformation: one of the implicit assumptions often made in machine learning algorithms (and somewhat explicitly in Gaussian Naive Bayes) is that the features follow a normal distribution.
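As a rough illustration of the normalization transformation mentioned above, the sketch below applies a log(1 + x) transform to a right-skewed feature and compares skewness before and after. The log transform is only one common choice and is an assumption here, not a method the text prescribes; the synthetic data and the helper names `skewness` and `log_transform` are hypothetical.

```python
import math
import random
import statistics


def skewness(values):
    """Population skewness (third standardized moment); near 0 for symmetric data."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)


def log_transform(values):
    """Apply log(1 + x) to compress the long right tail of non-negative data."""
    return [math.log1p(v) for v in values]


# Hypothetical right-skewed feature (think counts or prices), drawn from a
# log-normal distribution with a fixed seed so the run is repeatable.
rng = random.Random(0)
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]
transformed = log_transform(raw)

print(f"skewness before: {skewness(raw):.2f}")
print(f"skewness after:  {skewness(transformed):.2f}")
```

The transformed feature should show noticeably lower skewness than the raw one, though it will not be exactly normal. Whether such a transform helps depends on the learner and on the data, so it is worth checking the distribution before and after rather than applying it blindly.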