#19 Solution | AutoGluon + TF + LightGBM + XGBoost + Calibration

622. Playground Series - Season 4, Episode 8 | playground-series-s4e8

Start: 2024-08-01 End: 2024-08-31 Clinical decision support data algorithm competition

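Author: Oscar Aguilar (Grandmaster)

Published: 2024-09-01

Competition rank: 19

This was my first time using AutoGluon, and I have to admit I enjoyed its simplicity and strong performance. In this post, I explain my approach.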

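Data Preprocessing

For data processing, I considered three cases.

The preprocessing proposed in AmbrosM's notebook.
The preprocessing proposed in Stephen Murphy's notebook.
No preprocessing.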

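Modeling

On the modeling side, I built several models. The table below summarizes my results.

Model	Number of models	Worst CV score	Best CV score	Ensemble CV score
`AutoGluon`	20	0.98512	0.98525	-
`LightGBM`	9	0.98423	0.98477	0.98484
`XGBoost`	7	0.98417	0.98485	0.98488
`TensorFlow`	6	0.98383	0.98401	0.98452

Note that the results above are based on a 10-fold cross-validation strategy.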
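
As an illustration of how a 10-fold CV score like those in the table could be computed from out-of-fold (OOF) predictions, here is a minimal sketch. It is not the author's code: the synthetic data, the `LogisticRegression` stand-in model, the stratified shuffled folds, the 0.5 cut-off, and the use of the Matthews correlation coefficient as the metric are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import matthews_corrcoef
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in data (assumption; not the competition data).
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

# Stratified 10-fold split; stratification, shuffling and the seed are assumptions.
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
oof = np.zeros(len(y))

for train_idx, valid_idx in skf.split(X, y):
    # Stand-in for any of the models listed in the table.
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])
    # Each row is predicted exactly once, by a model that never saw it.
    oof[valid_idx] = model.predict_proba(X[valid_idx])[:, 1]

# Score the full OOF vector once, using a 0.5 cut-off to obtain labels.
cv_score = matthews_corrcoef(y, (oof > 0.5).astype(int))
print(f"10-fold OOF MCC: {cv_score:.5f}")
```

Keeping the OOF vector for every model is what makes the later ensembling and calibration steps possible without leaking validation labels.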

Ensembling

For the ensemble, I did the following:

0.62 x AutoGluon (best model) + 0.38 x AutoGluon of (LightGBM ensemble, XGBoost ensemble, TensorFlow ensemble)

This ensemble achieved a CV score of 0.98527 under 10-fold cross-validation.
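
The weighted blend above can be sketched as a plain weighted average of two probability vectors. The weights 0.62 and 0.38 come from the post; the function name `blend`, the variable names, and the example probabilities are hypothetical placeholders, and the second vector simply stands in for the output of the AutoGluon model fitted on the three ensembles.

```python
import numpy as np

def blend(p_best, p_stacked, w_best=0.62, w_stacked=0.38):
    """Weighted average of two probability vectors (default weights are those quoted in the post)."""
    p_best = np.asarray(p_best, dtype=float)
    p_stacked = np.asarray(p_stacked, dtype=float)
    return w_best * p_best + w_stacked * p_stacked

# Hypothetical predicted probabilities for four rows (placeholders, not real results).
p_autogluon_best = np.array([0.91, 0.08, 0.55, 0.97])  # best single AutoGluon model
p_stacked = np.array([0.88, 0.12, 0.40, 0.99])         # AutoGluon over the LightGBM, XGBoost, TensorFlow ensembles

p_final = blend(p_autogluon_best, p_stacked)
print(p_final)
```

Because the weights sum to 1, the blended values stay in [0, 1] whenever both inputs do.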

Calibration

Finally, I calibrated the predictions with IsotonicRegression, which improved the 10-fold CV score by 0.00003.
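
A minimal sketch of how isotonic calibration might be applied to OOF predictions follows. The post does not say how the calibrator was fitted, so the cross-fitting scheme here (fit on nine folds, apply to the tenth) is an assumption, as are the synthetic data and the deliberately distorted probabilities. The Brier score is used only to show the calibration effect in this toy setup; it is not the competition metric.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
n = 2000

# Synthetic setup (assumption): true probabilities, labels drawn from them,
# and a deliberately distorted version standing in for uncalibrated model output.
p_true = rng.uniform(size=n)
y = (rng.uniform(size=n) < p_true).astype(int)
raw = p_true ** 3

calibrated = np.zeros(n)
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)

for fit_idx, apply_idx in skf.split(raw.reshape(-1, 1), y):
    # Fit the monotone mapping on nine folds, apply it to the held-out fold.
    iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
    iso.fit(raw[fit_idx], y[fit_idx])
    calibrated[apply_idx] = iso.predict(raw[apply_idx])

brier_raw = np.mean((y - raw) ** 2)
brier_cal = np.mean((y - calibrated) ** 2)
print(f"Brier before: {brier_raw:.4f}, after: {brier_cal:.4f}")
```

Cross-fitting keeps the calibrated values out-of-fold, so a CV score computed on them is not inflated by the calibrator having seen the labels it is scored on.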

What Didn't Work

I ran some experiments to optimize the threshold used to assign labels, but I did not find consistent results.
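
One way to check whether a tuned threshold is consistent is to search for the best cut-off separately on each fold and look at the spread. The sketch below does this on synthetic probabilities. The search grid, the helper name `best_threshold`, the data, and the use of the Matthews correlation coefficient as the metric are assumptions, not details taken from the post.

```python
import numpy as np
from sklearn.metrics import matthews_corrcoef
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
n = 3000

# Hypothetical labels and OOF probabilities (placeholders, not real results).
y = rng.integers(0, 2, size=n)
probs = np.clip(0.5 + 0.25 * (2 * y - 1) + rng.normal(0, 0.2, size=n), 0, 1)

# Candidate cut-offs; the grid is an arbitrary choice for the example.
thresholds = np.linspace(0.30, 0.70, 41)

def best_threshold(y_true, p):
    """Return the grid threshold with the highest MCC on this subset."""
    scores = [matthews_corrcoef(y_true, (p >= t).astype(int)) for t in thresholds]
    return float(thresholds[int(np.argmax(scores))])

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
per_fold = [
    best_threshold(y[idx], probs[idx])
    for _, idx in skf.split(probs.reshape(-1, 1), y)
]

print("best threshold per fold:", np.round(per_fold, 2))
print("spread across folds:", round(max(per_fold) - min(per_fold), 2))
```

If the per-fold optima are spread widely around 0.5, a single tuned threshold is unlikely to transfer reliably to unseen data, which is one plausible reading of the inconsistent results described above.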

Other Solutions from the Same Competition

1st Place Solution: 72 OOFs, a whole lotta Autogluon, and 31 scores of 0.98512 or above (on the private LB)

[1st Place Solution AutoML Grand Prix] AutoML Grandmasters: AutoGluon Distributed + Post-Hoc Ensembling

#6 place - A quick reflection

8th Place Solution with Autogluon🤔

10th place solution (and a potential 5th)