9th Place Solution - UBC-OCEAN Competition

Author: fate (Kaggle Master)
Competition rank: 9th
Published: January 5, 2024

Only the competition data was used; no external data.

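Distinguishing WSI and TMA images:

WSI images contain black pixels (all three channels equal to zero), whereas TMA images do not. So an image whose width and height are both below 6000 is still classified as WSI if black pixels cover more than 5% of its area; otherwise it is classified as TMA. (In the training data, black pixels cover more than 10% of every WSI image.)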
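The rule above could be sketched roughly as follows. This is only an illustration, not the author's code: the function names are hypothetical, and treating any image with a side of 6000 pixels or more as a WSI is an assumption that the text implies but does not state.

```python
import numpy as np


def black_fraction(img: np.ndarray) -> float:
    """Fraction of pixels whose three channels are all zero."""
    black = np.all(img == 0, axis=2)
    return float(black.mean())


def is_wsi(img: np.ndarray, max_tma_side: int = 6000,
           black_thresh: float = 0.05) -> bool:
    """Return True for WSI, False for TMA (hypothetical helper)."""
    h, w = img.shape[:2]
    # Assumption: an image with any side >= 6000 px is treated as a WSI.
    if h >= max_tma_side or w >= max_tma_side:
        return True
    # Small image: call it a WSI only if enough of it is pure black.
    return black_fraction(img) > black_thresh
```

The 5% threshold leaves a margin below the 10% minimum observed on the training WSIs.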

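Tiling:

WSI images are first downscaled to 0.33 of their original size and then split into 512×512-pixel tiles. Low-quality (background) pixels are counted with "np.sum(np.ptp(tile, axis=2) < 20)", and the tiles are sorted into three quality tiers by that count.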
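As a small illustration of the background measure, the sketch below computes the background ratio of one tile and maps it to a tier. The helper names are hypothetical; the 0.5 / 0.65 / 0.75 boundaries are the ones used in the tile-generation code in this post.

```python
import numpy as np


def background_ratio(tile: np.ndarray, ptp_thresh: int = 20) -> float:
    """Share of pixels whose channel range (max - min) is below ptp_thresh.

    Near-gray pixels (white background, black padding) have a small range.
    """
    return float(np.mean(np.ptp(tile, axis=2) < ptp_thresh))


def quality_tier(ratio: float):
    """Map a background ratio to tier 1 (best), 2, 3, or None (discard)."""
    if ratio <= 0.5:
        return 1
    if ratio <= 0.65:
        return 2
    if ratio <= 0.75:
        return 3
    return None
```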

Tile generation code at inference time:

import os
import random

import cv2
import numpy as np

TILE = 512

# One entry per quality tier:
# (background ratio must be > lo, background ratio must be <= hi,
#  run the tier only if count is below this, stop the tier at this count)
TIERS = [
    (None, 0.50, 20, 60),
    (0.50, 0.65, 20, 40),
    (0.65, 0.75, 10, 10),
]

def resize_image_and_make_tile(name, out_path, scale):
    # `inference` and `pred_mask_512_folder` are not defined in this excerpt;
    # they are assumed to be globals set elsewhere in the notebook.
    path = f"/kaggle/input/UBC-OCEAN/{inference}_images/{name}.png"
    p_mask = f"{pred_mask_512_folder}/{name}.npy"
    image = cv2.imread(path)
    image = cv2.resize(image, (0, 0), fx=scale, fy=scale, interpolation=cv2.INTER_AREA)
    mask = np.load(p_mask)  # loaded here but not used in this excerpt
    os.makedirs(f"{out_path}/{name}", exist_ok=True)
    count = 0  # tiles written so far, across all tiers

    for lo, hi, run_below, stop_at in TIERS:
        if count >= run_below:
            continue  # enough tiles already; skip this lower-quality tier
        # Visit every full 512x512 grid cell in random order.
        idxs = [(y, x) for y in range(0, image.shape[0] // TILE)
                       for x in range(0, image.shape[1] // TILE)]
        random.shuffle(idxs)
        for y, x in idxs:
            tile = image[y*TILE:(y+1)*TILE, x*TILE:(x+1)*TILE, :]
            # A pixel counts as background when its channel range is below 20.
            bg_ratio = np.sum(np.ptp(tile, axis=2) < 20) / (TILE * TILE)

            if (lo is None or bg_ratio > lo) and bg_ratio <= hi:
                cv2.imwrite(f"{out_path}/{name}/{x}_{y}.png", tile)
                count += 1

            if count >= stop_at:
                break

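Tile selection strategy for training:

Step 1: Use all tiles with bg_count/area < 0.5
Step 2: If a WSI has fewer than 50 tiles, add tiles with bg_count/area between 0.5 and 0.65 until it has 50
Step 3: If a WSI has fewer than 20 tiles, add tiles with bg_count/area between 0.65 and 0.75 until it has 20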
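The three steps could look something like the sketch below, which works on a list of precomputed background ratios for one WSI. It is not the author's code: the function name, the half-open interval boundaries, and the random choice among candidate tiles are all assumptions, since the post does not specify them.

```python
import random


def select_training_tiles(ratios, seed=0):
    """Return indices of the tiles to keep for one WSI (hypothetical helper).

    ratios: list of bg_count/area values, one per tile.
    """
    rng = random.Random(seed)
    # Step 1: keep every tile that is less than half background.
    selected = [i for i, r in enumerate(ratios) if r < 0.5]
    # Steps 2 and 3: top up from lower-quality tiers up to a target count.
    for low, high, target in [(0.5, 0.65, 50), (0.65, 0.75, 20)]:
        if len(selected) >= target:
            continue
        pool = [i for i, r in enumerate(ratios) if low <= r < high]
        rng.shuffle(pool)  # assumption: candidates are picked at random
        selected += pool[: target - len(selected)]
    return selected
```

Note that step 3 only fires for slides that are still below 20 tiles after step 2, so it acts as a fallback for slides with very little tissue.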

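Model training:

Only WSI tiles are used. For each batch, 6 tiles are randomly selected from each image.
Loss: binary cross-entropy (BCE)

Step 1: Regular training
Step 2: Generate auxiliary labels from the step 1 results: if the predicted value for the true label is > 0.3, the auxiliary label is 1, otherwise 0. Retrain the model without using the step 1 weights. Loss: label loss (BCE) + 0.3 * auxiliary label loss (BCE); learning rate: 2e-4
Step 3: Fine-tune from the step 2 weights. Loss: label loss (BCE) + 0.15 * auxiliary label loss (BCE); learning rate: 5e-5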
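To make the auxiliary label and the combined loss concrete, here is a minimal NumPy sketch. It assumes per-tile step 1 predictions, one-hot label targets, and a single sigmoid output for the auxiliary head; none of these details are given in the post, and the function names are hypothetical.

```python
import numpy as np


def make_aux_labels(probs, labels, thresh=0.3):
    """Auxiliary label = 1 where the step 1 probability of the true class
    exceeds thresh, else 0.

    probs: (N, C) array of step 1 predictions; labels: (N,) integer classes.
    """
    probs = np.asarray(probs)
    labels = np.asarray(labels)
    true_prob = probs[np.arange(len(labels)), labels]
    return (true_prob > thresh).astype(np.float32)


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy over all elements."""
    pred = np.clip(np.asarray(pred, dtype=np.float64), eps, 1 - eps)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean(-(target * np.log(pred)
                           + (1 - target) * np.log(1 - pred))))


def total_loss(label_pred, label_target, aux_pred, aux_target, aux_weight=0.3):
    """Label BCE plus weighted auxiliary BCE (0.3 in step 2, 0.15 in step 3)."""
    return bce(label_pred, label_target) + aux_weight * bce(aux_pred, aux_target)
```

Under this reading, the auxiliary target flags tiles that the first model already recognises as their slide's class, which is what later allows a low auxiliary score to signal an outlier.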

Models with different backbones:

efficientnetb4, efficientnet_v2s, maxvit_tiny (settings differ slightly between backbones)

WSI prediction:

The model predicts each tile.

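WSI tile ensemble:

# Per-tile confidence (highest class probability) and predicted class.
tile_df["prob"] = np.max(tile_df[["pred_0", "pred_1", "pred_2", "pred_3", "pred_4"]], axis=1)
tile_df["pred"] = np.argmax(tile_df[["pred_0", "pred_1", "pred_2", "pred_3", "pred_4"]].values, axis=1)
# Average confidence and aux score over the tiles of each (image, predicted class) pair.
tile_df = tile_df[["image_id", "pred", "prob", "aux"]].groupby(["image_id", "pred"])[["prob", "aux"]].mean().reset_index()
# For each image, keep the class with the highest mean confidence.
idx = tile_df.groupby(["image_id"])["prob"].idxmax()
wsi_df = tile_df.loc[idx].reset_index(drop=True)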

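WSI outlier handling:

A WSI is predicted as "Other" when its mean predicted aux_label is < 0.5 (compared with not predicting "Other", the score is almost the same, possibly +0.01)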
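The aggregation and the outlier rule can be combined into one function, sketched below. Several things here are assumptions rather than the author's code: the function name, the order of the class names for pred_0 to pred_4, and the choice to apply the 0.5 threshold to the aux mean of the winning class group (the "aux" column that the ensemble code produces).

```python
import numpy as np
import pandas as pd

PRED_COLS = [f"pred_{i}" for i in range(5)]
# Assumption: this class order is illustrative; the post does not give it.
CLASS_NAMES = ["CC", "EC", "HGSC", "LGSC", "MC"]


def aggregate_wsi(tile_df: pd.DataFrame, aux_thresh: float = 0.5) -> pd.DataFrame:
    """Collapse tile predictions to one label per image, with 'Other' outliers."""
    df = tile_df.copy()
    df["prob"] = df[PRED_COLS].max(axis=1)
    df["pred"] = df[PRED_COLS].values.argmax(axis=1)
    # Mean confidence and aux score per (image, predicted class).
    grouped = (df.groupby(["image_id", "pred"])[["prob", "aux"]]
                 .mean().reset_index())
    # Keep, for each image, the class with the highest mean confidence.
    best = grouped.loc[grouped.groupby("image_id")["prob"].idxmax()]
    best = best.reset_index(drop=True)
    best["label"] = [CLASS_NAMES[int(p)] for p in best["pred"]]
    # Outlier rule: low mean aux score means the image is called "Other".
    best.loc[best["aux"] < aux_thresh, "label"] = "Other"
    return best[["image_id", "label", "prob", "aux"]]
```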

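TMA processing:

Step 1. Crop the TMA image

import cv2
import numpy as np

def crop_tma(img):
    # Kernel size scales with the image, capped at 20 pixels.
    ks = min(min(img.shape[0], img.shape[1]) // 150, 20)

    # Foreground = pixels whose channel range exceeds 20 (i.e. not near-gray).
    mask = (img.max(axis=2) - img.min(axis=2)) > 20
    kernel = np.ones((ks, ks), np.uint8)
    # Erode to remove small specks before computing the bounding box.
    mask = cv2.erode(mask.astype(np.uint8), kernel)
    nonzero_pixels = np.column_stack(np.where(mask > 0))

    # If too little foreground survives erosion, return the image uncropped.
    if nonzero_pixels.size < (img.size // 60):
        return img
    else:
        min_y, min_x = np.min(nonzero_pixels, axis=0)
        max_y, max_x = np.max(nonzero_pixels, axis=0)
        # Crop to the foreground bounding box, padded by ks pixels on each side.
        return img[max(0, min_y-ks):max_y+ks+1, max(0, min_x-ks):max_x+ks+1, :]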

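Step 2. Resize to 512×512 (TMA image size * 0.33 * 0.5 ≈ 512, so the crop can be resized directly to 512 for prediction)
Step 3. Predict with the models trained on WSI tiles

TMA outlier handling:

A TMA is predicted as "Other" when its predicted aux_label is < 0.5 (compared with not predicting "Other" for TMAs: public leaderboard +0.03, private leaderboard +0.06)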

Multi-model ensemble:

Voting (compared with a single model, possibly only +0.01)
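The post only says "voting", so the sketch below assumes plain majority (hard) voting over each model's final label for an image. The function name and the tie-break rule are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Pick the most common label among the models' predictions for one image.

    predictions: list of labels, one per model.
    Tie-break (assumption): the tied label that appears earliest in the list,
    so listing the strongest model first makes it win ties.
    """
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label
```

For example, majority_vote(["HGSC", "HGSC", "EC"]) returns "HGSC".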

Things that may not work:

Image segmentation

UBC-OCEAN competition page: https://www.kaggle.com/competitions/UBC-OCEAN
Author's Kaggle profile: https://www.kaggle.com/chihantsai
