1st Place Solution

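First of all, congratulations to the winners, and thanks to Kaggle and the host team for organizing this tough competition! I joined the competition very early and spent a lot of time on it before the data update. I could not keep up my motivation for hand labeling, so I left a month ago. Fortunately I finished in 1st place, so I would like to share my approach.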

Summary of the approach

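Single-model U-Net with se_resnext101_32x4d (4-fold CV)
Tricks borrowed from previous segmentation competitions (mainly the cloud segmentation competition)
Balanced tile sampling during training (edit: balanced by mask area)
Pseudo-labeling on the public test data and external data
A trick to avoid edge effects
My pipeline was developed on the old dataset (i.e., before the data update).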

1. Data preparation

Create 1024x1024 tiles plus shifted 1024x1024 tiles (I shifted the tile grid by (512, 512)).
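The tiling step can be sketched roughly as follows. This is a minimal illustration, not the author's code: the function names `tile_origins` and `extract_tile`, the toy image size, and the zero-padding of tiles that run past the image border are all assumptions, since the post does not say how borders were handled.

```python
import numpy as np

def tile_origins(height, width, tile=1024, shift=(0, 0)):
    """Return (y, x) top-left corners of a regular tile grid offset by `shift`."""
    ys = range(shift[0], height, tile)
    xs = range(shift[1], width, tile)
    return [(y, x) for y in ys for x in xs]

def extract_tile(image, y, x, tile=1024):
    """Cut one tile; zero-pad where it runs past the border (an assumption)."""
    patch = image[y:y + tile, x:x + tile]
    out = np.zeros((tile, tile) + image.shape[2:], dtype=image.dtype)
    out[:patch.shape[0], :patch.shape[1]] = patch
    return out

# Toy image standing in for a large scan (hypothetical size).
image = np.arange(2048 * 3072, dtype=np.uint32).reshape(2048, 3072)

regular = tile_origins(*image.shape[:2])                     # grid starting at (0, 0)
shifted = tile_origins(*image.shape[:2], shift=(512, 512))   # grid offset by half a tile
tiles = [extract_tile(image, y, x) for y, x in regular + shifted]
```

Because the second grid is offset by half a tile, a structure that is cut by a tile boundary in the regular grid falls in the interior of a shifted tile.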

2. Validation

I chose the validation data so that all tiles with the same patient number fall in the same fold.

val_patient_numbers_list = [
    [63921], # fold0
    [68250], # fold1
    [65631], # fold2
    [67177], # fold3
]
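One way to turn this list into train/validation splits is sketched below. The table `tiles_df` and its columns `tile_id` and `patient_number` are hypothetical stand-ins for whatever tile metadata is available; only `val_patient_numbers_list` comes from the post.

```python
import pandas as pd

val_patient_numbers_list = [
    [63921],  # fold0
    [68250],  # fold1
    [65631],  # fold2
    [67177],  # fold3
]

# Hypothetical tile table: one row per tile with the patient it came from.
tiles_df = pd.DataFrame({
    "tile_id": range(6),
    "patient_number": [63921, 63921, 68250, 65631, 67177, 67177],
})

def split_fold(df, fold):
    """Return (train, valid); every tile of a validation patient goes to valid."""
    is_val = df["patient_number"].isin(val_patient_numbers_list[fold])
    return df[~is_val].reset_index(drop=True), df[is_val].reset_index(drop=True)

trn_df, val_df = split_fold(tiles_df, fold=0)
```

Splitting on the patient number rather than on individual tiles keeps tiles from one patient out of both sides of a fold, so the validation score is not inflated by near-duplicate tissue.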

3. Balanced tile sampling during training

First, I binned the tiles by mask area (number of bins for masked tiles = 4). Then I applied the following procedure for balanced sampling.

import pandas as pd

# Equalize masked and unmasked tiles: draw the minority-class count from each.
n_sample = trn_df['is_masked'].value_counts().min()
trn_df_0 = trn_df[trn_df['is_masked']==False].sample(n_sample, replace=True)
trn_df_1 = trn_df[trn_df['is_masked']==True].sample(n_sample, replace=True)
# Within the masked tiles, resample every mask-area bin to the mean bin count.
n_bin = int(trn_df_1['binned'].value_counts().mean())
trn_df_list = []
for bin_label in trn_df_1['binned'].unique():
    trn_df_list.append(trn_df_1[trn_df_1['binned']==bin_label].sample(n_bin, replace=True))
trn_df_1 = pd.concat(trn_df_list, axis=0)
# Combine the balanced masked tiles with the unmasked tiles.
trn_df_balanced = pd.concat([trn_df_1, trn_df_0], axis=0).reset_index(drop=True)
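The sampling code expects `is_masked` and `binned` columns, which the post does not show being built. A minimal sketch of one way to produce them is given here. The `mask_area` column, the synthetic data, the use of quantile bins via `pd.qcut`, and the `-1` label for unmasked tiles are all assumptions; the post only states that masked tiles were split into 4 bins by mask area.

```python
import numpy as np
import pandas as pd

# Hypothetical per-tile mask areas in pixels; 0 means the tile has no mask.
rng = np.random.default_rng(0)
mask_area = np.concatenate([np.zeros(50, dtype=int),
                            rng.integers(1, 100_000, size=40)])
trn_df = pd.DataFrame({"mask_area": mask_area})

trn_df["is_masked"] = trn_df["mask_area"] > 0

# Bin only the masked tiles into 4 quantile bins (assumption: quantile bins);
# unmasked tiles keep the placeholder label -1.
trn_df["binned"] = -1
masked = trn_df["is_masked"]
trn_df.loc[masked, "binned"] = pd.qcut(
    trn_df.loc[masked, "mask_area"], q=4, labels=False, duplicates="drop"
).astype(int)
```

Equal-width bins via `pd.cut` would be an equally plausible reading of the post; quantile bins are used here only because they give bins of similar size on skewed area distributions.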

4. Model

U-Net SeResNext101 + CBAM + Hypercolumns + Deep Supervision
In my case, larger models gave better CV and LB scores.
I resized the 1024x1024 tiles to 320x320 as model input.
Code snippet:

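import torch.nn as nn
import torch.nn.functional as F

# conv3x3, conv1x1, init_weight and CBAM are helpers not shown in this excerpt.
class CenterBlock(nn.Module):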
    def __init__(self, in_channel, out_channel):
        super().__init__()
        self.conv = conv3x3(in_channel, out_channel).apply(init_weight)
        
    def forward(self, inputs):
        x = self.conv(inputs)
        return x

class DecodeBlock(nn.Module):
    def __init__(self, in_channel, out_channel, upsample):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(in_channel).apply(init_weight)
        self.upsample = nn.Sequential()
        if upsample:
            self.upsample.add_module('upsample',nn.Upsample(scale_factor=2, mode='nearest'))
        self.conv3x3_1 = conv3x3(in_channel, in_channel).apply(init_weight)
        self.bn2 = nn.BatchNorm2d(in_channel).apply(init_weight)
        self.conv3x3_2 = conv3x3(in_channel, out_channel).apply(init_weight)
        self.cbam = CBAM(out_channel, reduction=16)
        self.conv1x1   = conv1x1(in_channel, out_channel).apply(init_weight)
        
    def forward(self, inputs):
        x  = F.relu(self.bn1(inputs))
        x  = self.upsample(x)
        x  = self.conv3x3_1(x)
        x  = self.conv3x3_2(F.relu(self.bn2(x)))
        x  = self.cbam(x)
        x += self.conv1x1(self.upsample(inputs)) #shortcut
        return x
        
class UNET_SERESNEXT101(nn.Module):
    def __init__(self, resolution, deepsupervision, clfhead, load_weights=True):
        super().__init__()
        h,w = resolution
        self.deepsupervision = deepsupervision
        self.clfhead = clfhead
        
        #encoder
        model_name =