Automating Predictive Modeling at Zynga with PySpark and Pandas UDFs
Ben Weber, Zynga

Date of publication:
07.05.2019 1:06
Duration:
0:38:41
Source Video:
https://youtube.com/watch?v=StNRA02Ny7Y

Description:

Building propensity models at Zynga used to be a time-intensive task that required custom data science and engineering work for every new model. We've built an automated model pipeline that uses PySpark and feature generation to automate this process. The challenge we faced was that the Featuretools library we wanted to use for automated feature engineering works only on Pandas data frames, which limited the size of the data sets we could handle.

Our solution to this problem is to use Pandas UDFs to scale the feature engineering process to our entire player base. We start with our full set of players, partition the data into smaller chunks that can be loaded into memory, apply the feature engineering step to each of these subsets, and then combine the results back into one large data set.

This presentation will outline how we use Pandas UDFs in production to automate propensity modeling at Zynga. The outcome of this approach is that we now have hundreds of propensity models in production that teams can use to personalize game experiences. Instead of spending time on feature engineering and model fitting, our data scientists now spend more of their time engaging with game teams to help build new features.
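The partition, apply, and combine flow described above can be illustrated with a minimal sketch. This is not Zynga's pipeline: the column names (`player_id`, `amount`), the partition count, and the helper names are hypothetical, and plain Pandas aggregates stand in for the Featuretools step. The Spark wrapper assumes Spark 3.0 or later, where `applyInPandas` is available; at the time of the talk the equivalent would have been a grouped-map `pandas_udf`.

```python
import pandas as pd

# Hypothetical number of chunks; each chunk must fit in one worker's memory.
N_PARTITIONS = 4

# Output schema of build_features, needed by Spark's applyInPandas.
FEATURE_SCHEMA = (
    "player_id long, event_count long, total_amount double, max_amount double"
)


def build_features(events: pd.DataFrame) -> pd.DataFrame:
    """Compute per-player features for one in-memory chunk.

    Stand-in for the automated feature engineering step: simple
    aggregates are used here instead of Featuretools.
    """
    return (
        events.groupby("player_id")["amount"]
        .agg(event_count="count", total_amount="sum", max_amount="max")
        .reset_index()
    )


def run_locally(events: pd.DataFrame) -> pd.DataFrame:
    """Pure-Pandas version of the flow: partition, apply, combine."""
    partition_id = events["player_id"] % N_PARTITIONS
    chunks = [build_features(chunk) for _, chunk in events.groupby(partition_id)]
    return pd.concat(chunks, ignore_index=True)


def run_on_spark(spark_events):
    """Same flow on a Spark DataFrame with player_id and amount columns.

    Spark sends each partition_id group to a worker as a Pandas DataFrame,
    calls build_features on it, and unions the results.
    """
    from pyspark.sql import functions as F  # imported lazily; requires pyspark

    return (
        spark_events
        .withColumn("partition_id", F.col("player_id") % N_PARTITIONS)
        .groupBy("partition_id")
        .applyInPandas(build_features, schema=FEATURE_SCHEMA)
    )
```

Partitioning on a function of `player_id` keeps all of a player's rows in the same chunk, so per-player features computed inside one chunk are complete. `run_locally` is useful for checking the feature function on a small sample before handing the same function to Spark.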