31st solution with custom loss

361. TensorFlow 2.0 Question Answering | tensorflow2-question-answering

Start: 2019-10-28 End: 2020-01-22 Natural Language Processing, Data/Algorithm Competition

Author: higepon (Master) | Competition rank: 31st

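Thanks to Kaggle and the Kaggle community for hosting this great competition; I learned a lot. For personal reasons I did almost nothing during the last two weeks, but it was really fun.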

My model

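  • Public score 0.68, Private score 0.65.
  • A single PyTorch BERT model.
  • Fine-tuned bert-large-uncased-whole-word-masking-finetuned-squad for 1 epoch.
    • Training for 2 epochs gave a better Private score of 0.68 (Public 0.65), but I did not select it :(
  • Used a learning rate of 3e-5 instead of 5e-5.
  • Downsampled the null (no-answer) instances in the training data.
  • Weighted training instances whose stride (window) contains the answer more heavily in the loss function.
  • Simply removed the HTML tags.
  • Ran a parameter search using the short- and long-answer scores.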
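
The post does not show how the HTML tags were removed. As a rough illustration only, here is a minimal sketch that assumes the document text is whitespace-tokenized and that a tag is any standalone token of the form `<...>` or `</...>`. The function name `remove_html_tags` and the sample text are hypothetical, not taken from the author's code.

```python
import re

# Assumption: a "tag" is a whitespace-delimited token such as <P> or </Table>.
TAG_PATTERN = re.compile(r"^</?[^<>\s]+>$")


def remove_html_tags(document_text: str) -> str:
    """Drop tokens that look like HTML tags and re-join the rest with single spaces."""
    tokens = document_text.split()
    return " ".join(t for t in tokens if not TAG_PATTERN.match(t))


# Hypothetical sample in the style of a whitespace-tokenized Wikipedia page.
sample = "<P> Tokyo is the capital of Japan . </P> <Table> <Tr> <Td> Area </Td> </Tr> </Table>"
print(remove_html_tags(sample))  # Tokyo is the capital of Japan . Area
```

If answer positions are stored as token offsets into the original text, dropping tokens this way shifts those offsets, so a real pipeline would presumably need to remap them.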

Downsampling

import itertools
import logging
import random

# examples is a nested list (a list of lists of instances); flatten it first.
flattened_examples = list(itertools.chain.from_iterable(examples))
null_instances = []
annotated_instances = []
for e in flattened_examples:
    # Instances labeled 'unknown' are treated as null (no-answer) instances.
    if e.class_label == 'unknown':
        null_instances.append(e)
    else:
        annotated_instances.append(e)
len_null = len(null_instances)
# Keep 1/50 of the null instances (none if there are 50 or fewer).
len_downsampled = len_null // 50 if len_null > 50 else 0
downsampled = random.sample(null_instances, len_downsampled)
logging.info('    down sampling nonnull(%d) null(%d) to null(%d)', len(annotated_instances), len_null, len(downsampled))
self.examples = downsampled + annotated_instances
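
The snippet above is a fragment of a larger class. To try the same idea in isolation, the following is a self-contained sketch. The `Example` dataclass, the `downsample_null` function, its `keep_ratio` and `seed` parameters, and the `'short'`/`'long'` labels are all hypothetical stand-ins; only the `'unknown'` label and the 1/50 ratio come from the post.

```python
import random
from dataclasses import dataclass


@dataclass
class Example:
    """Hypothetical stand-in for one training instance."""
    class_label: str


def downsample_null(examples, keep_ratio=50, seed=None):
    """Keep every annotated example and about 1/keep_ratio of the 'unknown' ones."""
    null = [e for e in examples if e.class_label == "unknown"]
    annotated = [e for e in examples if e.class_label != "unknown"]
    # Mirrors the original rule: drop all null instances if there are too few.
    n_keep = len(null) // keep_ratio if len(null) > keep_ratio else 0
    rng = random.Random(seed)  # a local RNG keeps the sampling reproducible
    return rng.sample(null, n_keep) + annotated


# 1000 null instances and 50 annotated ones (labels other than 'unknown' are made up).
examples = [Example("unknown")] * 1000 + [Example("short")] * 30 + [Example("long")] * 20
kept = downsample_null(examples, seed=0)
print(len(kept))  # 70: 20 sampled null instances + 50 annotated instances
```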

Loss function

import torch
from torch import nn


def loss_fn(preds, labels, no_answers):
    """Start/end/class cross-entropy, with answer-bearing instances weighted 2x."""
    start_preds, end_preds, class_preds = preds
    start_labels, end_labels, class_labels = labels

    # Boolean masks over the batch: which instances have no answer / have an answer.
    no_answers = torch.as_tensor(no_answers, dtype=torch.bool, device=start_preds.device)
    has_answers = ~no_answers

    # Start/end targets equal to -1 are ignored.
    span_loss = nn.CrossEntropyLoss(ignore_index=-1)
    class_loss = nn.CrossEntropyLoss()

    def group_loss(mask):
        # Loss over the instances selected by mask; 0 if the group is empty.
        if not mask.any():
            return 0
        return (span_loss(start_preds[mask], start_labels[mask])
                + span_loss(end_preds[mask], end_labels[mask])
                + class_loss(class_preds[mask], class_labels[mask]))

    # Instances that contain an answer count twice as much as no-answer instances.
    return group_loss(has_answers) * 2 + group_loss(no_answers)
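
Written out as a formula, the loss computed by `loss_fn` can be summarized as follows. The symbols are introduced here for illustration only: $\mathcal{A}$ is the set of instances in the batch that have an answer, $\mathcal{N}$ the set that do not, and $\mathrm{CE}_g$ is the mean cross-entropy over group $g$ (the default reduction of `nn.CrossEntropyLoss`), with start/end targets equal to $-1$ skipped.

```latex
\mathcal{L} = 2\,\mathcal{L}_{\mathcal{A}} + \mathcal{L}_{\mathcal{N}},
\qquad
\mathcal{L}_{g} = \mathrm{CE}_{g}^{\text{start}} + \mathrm{CE}_{g}^{\text{end}} + \mathrm{CE}_{g}^{\text{class}},
\quad g \in \{\mathcal{A}, \mathcal{N}\}
```

A group with no instances in the batch contributes $0$. Because each group is averaged separately before weighting, the factor of 2 applies to the answer-bearing group as a whole, independent of how many such instances the batch contains.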

What I didn't try

  • Annotating p/table tags
  • TPU
  • More post-processing