530. Playground Series Season 3, Episode 3 | playground-series-s3e3
I hadn't even selected any submissions, because after watching my public leaderboard score slide I had no expectation of placing well. 😄
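Many thanks to Khawaja Abaid, whose notebook Starting Strong - XGBoost + LightGBM + CatBoost was the basis for my own notebook. If you haven't upvoted Khawaja's notebook yet, please go do so. My only big change was adding some feature engineering before training the same models. I discussed this earlier in Adding Risk Factors, but here is the final feature engineering code from the winning version: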
# Income relative to age
df['MonthlyIncome/Age'] = df['MonthlyIncome'] / df['Age']
# Binary risk flags (1 = condition met)
df["Age_risk"] = (df["Age"] < 34).astype(int)
df["HourlyRate_risk"] = (df["HourlyRate"] < 60).astype(int)
df["Distance_risk"] = (df["DistanceFromHome"] >= 20).astype(int)
df["YearsAtCo_risk"] = (df["YearsAtCompany"] < 4).astype(int)
# Replace 0 with 1 so the division below never divides by zero
df['NumCompaniesWorked'] = df['NumCompaniesWorked'].replace(0, 1)
df['AverageTenure'] = df["TotalWorkingYears"] / df["NumCompaniesWorked"]
# df['YearsAboveAvgTenure'] = df['YearsAtCompany'] - df['AverageTenure']
# More than 2 companies with an average tenure under 2 years
df['JobHopper'] = ((df["NumCompaniesWorked"] > 2) & (df["AverageTenure"] < 2.0)).astype(int)
# Total risk score: the sum of the five binary flags (0 to 5)
df["AttritionRisk"] = df["Age_risk"] + df["HourlyRate_risk"] + df["Distance_risk"] + df["YearsAtCo_risk"] + df['JobHopper']
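For anyone who wants to try these features, here is a minimal, self-contained sketch that wraps the same steps in a function so they can be applied identically to the train and test frames. The function name `add_risk_features` and the two-row toy DataFrame are my own illustrative assumptions, not part of the original notebook; the thresholds and column names are copied from the code above.

```python
import pandas as pd


def add_risk_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the risk features added (hypothetical helper)."""
    out = df.copy()  # leave the caller's frame untouched
    out["MonthlyIncome/Age"] = out["MonthlyIncome"] / out["Age"]
    out["Age_risk"] = (out["Age"] < 34).astype(int)
    out["HourlyRate_risk"] = (out["HourlyRate"] < 60).astype(int)
    out["Distance_risk"] = (out["DistanceFromHome"] >= 20).astype(int)
    out["YearsAtCo_risk"] = (out["YearsAtCompany"] < 4).astype(int)
    # Replace 0 with 1 so the tenure division never divides by zero
    out["NumCompaniesWorked"] = out["NumCompaniesWorked"].replace(0, 1)
    out["AverageTenure"] = out["TotalWorkingYears"] / out["NumCompaniesWorked"]
    out["JobHopper"] = (
        (out["NumCompaniesWorked"] > 2) & (out["AverageTenure"] < 2.0)
    ).astype(int)
    # Sum of the five binary flags, so the score ranges from 0 to 5
    out["AttritionRisk"] = (
        out["Age_risk"]
        + out["HourlyRate_risk"]
        + out["Distance_risk"]
        + out["YearsAtCo_risk"]
        + out["JobHopper"]
    )
    return out


# Toy data (made up for illustration): one high-risk and one low-risk row
toy = pd.DataFrame(
    {
        "Age": [28, 45],
        "MonthlyIncome": [2800, 9000],
        "HourlyRate": [50, 80],
        "DistanceFromHome": [25, 3],
        "YearsAtCompany": [1, 10],
        "NumCompaniesWorked": [4, 0],
        "TotalWorkingYears": [6, 20],
    }
)
features = add_risk_features(toy)
print(features[["AverageTenure", "JobHopper", "AttritionRisk"]])
```

On this toy data the first row trips all five flags (AttritionRisk of 5) and the second trips none (AttritionRisk of 0). Working on a copy is a design choice in this sketch; the original code modifies `df` in place, including overwriting `NumCompaniesWorked`.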