2nd place Solution


Author: Andrey Zotov
Published: 2019-12-18

Many thanks to the organizers of the competition. Thanks to Vinay Uday Prabhu for the dataset and for the interesting article published on arXiv.org.

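I used the CNN architecture from FWiktor's kernel (plus RMSProp and ReduceLROnPlateau); many thanks to him. That kernel is no longer public. Here is the structure of the network:

[Figure: network structure]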

Data augmentation used the following parameters:

ImageDataGenerator(rotation_range=10,
                   width_shift_range=0.25,
                   height_shift_range=0.25,
                   shear_range=0.1,
                   zoom_range=0.25,
                   horizontal_flip=False)

First, three sets of weights (w1, w2, w3) were generated:

1.
- 45 epochs, test_size = 0.05 -> w1
- 60 epochs, test_size = 0.001 -> w2
- 80 epochs, test_size = 0.0005 -> w3
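To give a sense of how small these hold-out sets are, the sketch below computes the number of validation samples for each split. It assumes that `test_size` is the fraction passed to scikit-learn's `train_test_split` (which rounds the hold-out count up) and that the training set has 60,000 images; neither detail is stated in the post, and `holdout_size` is a hypothetical helper.

```python
import math

def holdout_size(n_samples, test_size):
    """Number of held-out samples for a fractional test_size (rounded up)."""
    return math.ceil(n_samples * test_size)

N_TRAIN = 60_000  # assumed training-set size, not stated in the post

# Fractions taken from the three training runs listed above.
sizes = {name: holdout_size(N_TRAIN, frac)
         for name, frac in [("w1", 0.05), ("w2", 0.001), ("w3", 0.0005)]}
print(sizes)
```

Under these assumptions the three runs hold out 3000, 60, and 30 images, so w2 and w3 are trained on almost the entire training set.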

Next, two more sets of weights were generated (w-pseudo, w-pseudo-corr):

2a.
- Ensemble of 3 models: w1+w2+w3
- Pseudo-labeling (many thanks to Nandor Balogh), threshold = 0.95*3
- 100 epochs, test_size = 0.0005 -> w-pseudo
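The pseudo-labeling step in 2a can be sketched roughly as follows. This is a minimal illustration, assuming the ensemble is the plain sum of the three models' softmax outputs (which would explain why the threshold is written as 0.95*3) and that a test sample is kept when its largest summed score reaches the threshold. The function name `pseudo_label` and the toy arrays are hypothetical, not taken from the post.

```python
import numpy as np

def pseudo_label(prob_list, threshold):
    """Sum per-model class probabilities and keep the confident samples.

    Returns the indices of the selected samples and their predicted labels.
    """
    summed = np.sum(prob_list, axis=0)        # shape: (n_samples, n_classes)
    confidence = summed.max(axis=1)           # best summed score per sample
    labels = summed.argmax(axis=1)            # predicted class per sample
    keep = np.flatnonzero(confidence >= threshold)
    return keep, labels[keep]

# Toy predictions from three hypothetical models: 3 test samples, 3 classes.
p1 = np.array([[0.98, 0.01, 0.01], [0.50, 0.40, 0.10], [0.02, 0.97, 0.01]])
p2 = np.array([[0.97, 0.02, 0.01], [0.30, 0.60, 0.10], [0.01, 0.98, 0.01]])
p3 = np.array([[0.99, 0.00, 0.01], [0.45, 0.45, 0.10], [0.03, 0.96, 0.01]])

idx, labels = pseudo_label([p1, p2, p3], threshold=0.95 * 3)
print(idx, labels)
```

In this toy case the first and third samples pass the threshold and would be added to the training set with their predicted labels, while the ambiguous second sample is left out.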

2b.
- Ensemble of 3 models: w1+w2+w3
- A posteriori probability correction:

results[:, 4] = results[:, 4] - 0.7*3
results[:, 9] = results[:, 9] - 0.7*3
results[:, 1] = results[:, 1] - 0.7*3
results[:, 2] = results[:, 2] - 0.7*3

- Pseudo-labeling, threshold = 0.95*3
- 100 epochs, test_size = 0.0005 -> w-pseudo-corr
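To see what the correction in 2b does to pseudo-label selection, here is a small sketch. It assumes `results` holds the sum of three softmax outputs, so each entry is at most 3; the post does not confirm this. Under that assumption a penalized class can reach at most 3 - 2.1 = 0.9 and therefore never passes the 2.85 threshold. The helper `correct_and_select` and the toy data are hypothetical, and the author may have combined the steps differently.

```python
import numpy as np

N_MODELS = 3
PENALIZED = [4, 9, 1, 2]         # classes listed in the post
PENALTY = 0.7 * N_MODELS         # 2.1, subtracted from the summed scores
THRESHOLD = 0.95 * N_MODELS      # 2.85, pseudo-label threshold

def correct_and_select(summed):
    """Subtract the penalty from the listed classes, then apply the threshold."""
    corrected = summed.copy()
    corrected[:, PENALIZED] -= PENALTY
    labels = corrected.argmax(axis=1)
    keep = np.flatnonzero(corrected.max(axis=1) >= THRESHOLD)
    return keep, labels[keep]

# Toy summed scores for 10 classes; each row sums to 3 (assumed ensemble sum).
summed = np.zeros((2, 10))
summed[0, 0], summed[0, 4] = 2.9, 0.1    # confident class 0
summed[1, 4], summed[1, 0] = 2.9, 0.1    # confident class 4 (penalized)

keep, labels = correct_and_select(summed)
print(keep, labels)
```

Only the first sample is selected here: the second one is a confident class 4 prediction, but after the penalty its best score is 0.8, well below the threshold.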

Finally, two submissions were computed:

3a. Without correction
- Ensemble of 2 models: w3 + 3*w-pseudo -> submission -> 0.9918/0.9928
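The expression w3 + 3*w-pseudo reads like a weighted sum of the two models' predicted probabilities, with the pseudo-labeled model counted three times. The sketch below illustrates that reading; it is an assumption about what the notation means, and `weighted_ensemble` plus the toy arrays are hypothetical.

```python
import numpy as np

def weighted_ensemble(probs, weights):
    """Weighted sum of class-probability arrays, then argmax per sample."""
    combined = sum(w * p for w, p in zip(weights, probs))
    return combined.argmax(axis=1)

# Toy two-class predictions for two samples from each model.
p_w3 = np.array([[0.7, 0.3], [0.2, 0.8]])
p_pseudo = np.array([[0.4, 0.6], [0.1, 0.9]])

equal = weighted_ensemble([p_w3, p_pseudo], weights=[1, 1])
weighted = weighted_ensemble([p_w3, p_pseudo], weights=[1, 3])
print(equal, weighted)
```

With equal weights the first sample goes to class 0; tripling the weight of the second model flips it to class 1, which is the kind of shift the weighting is meant to produce.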

3b. With correction
- Ensemble of 2 models: w-pseudo + w-pseudo-corr
- A posteriori probability correction:

results[:, 1] = results[:, 1] * 0.3
results[:, 2] = results[:, 2] * 0.2
results[:, 9] = results[:, 9] * 0.05

- Submission -> 0.9936/0.9952
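The multiplicative correction in 3b scales down the scores of classes 1, 2, and 9 before the final argmax. A minimal sketch of its effect, assuming `results` is the sum of the two models' class probabilities (the helper `apply_factors` and the toy scores are hypothetical):

```python
import numpy as np

FACTORS = {1: 0.3, 2: 0.2, 9: 0.05}   # per-class multipliers from the post

def apply_factors(results, factors=FACTORS):
    """Return a copy of the scores with the listed classes scaled down."""
    corrected = results.copy()
    for cls, factor in factors.items():
        corrected[:, cls] *= factor
    return corrected

# Toy summed scores for 10 classes from two models (each row sums to 2).
results = np.zeros((2, 10))
results[0, 9], results[0, 3] = 1.5, 0.5   # class 9 leads before correction
results[1, 7], results[1, 1] = 1.8, 0.2   # class 7 leads either way

before = results.argmax(axis=1)
after = apply_factors(results).argmax(axis=1)
print(before, after)
```

In the toy data the first sample moves from class 9 to class 3 after scaling, while the second prediction is unchanged, so the correction only affects samples where a down-weighted class was winning by a limited margin.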

Later, I found several articles online about applying a posteriori probability corrections. Here is one of them: A Posteriori Corrections to Classification Methods
