
3rd Place Solution - Span MLM + T5 Augmentations

Feedback Prize - Predicting Effective Arguments | feedback-prize-effectiveness

Start: 2022-05-24  End: 2022-08-23

Authors: Raja Biswas, Trushant Kalyanpur, Harshit Mehta
Competition: Feedback Prize - Predicting Effective Arguments

Many thanks to Kaggle and the competition hosts for introducing the Efficiency Track; it is a great addition that will surely inspire more creative solutions. This is the detailed version of our solution to the Feedback Prize - Predicting Effective Arguments competition.

Model Architecture

Overall, we took a span classification approach, with the following model architecture:

Model Architecture

Preprocessing

We preprocessed each essay by inserting newly added span-start and span-end tokens for each discourse type. To give the model more context, we prepended a topic segment in the format [TOPIC] ... [TOPIC_END]. As a minor detail, we also inserted [SOE] and [EOE] tokens to mark the start and end of the essay. Here is an example:

[TOPIC] Should computers read the emotional expressions of students in a classroom? [TOPIC_END] [SOE] Lead [LEAD] Imagine being told about a new new technology that reads emotions. Now imagine this, imagine being able to tell how someone truly feels about their emotional expressions because they don't really know how they really feel. [LEAD_END]  In this essay I am going to tell you about my opinions on why  Position [POSITION] I believe that the value of using technology to read student's emotional expressions is over the top. [POSITION_END] 

...

Concluding Statement [CONCLUDING_STATEMENT] In conclusion I think this is not a very good idea, but not in the way students should be watched. The way teachers are able to read students' facial expressions tells them how they feel. I don't believe it's important to look at students faces when they can fake their emotions. But, if the teacher is watching you then they're gonna get angry. This is how I feel on this topic. [CONCLUDING_STATEMENT_END] [EOE]
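The preprocessing above can be sketched as a small helper that wraps each discourse span in type-specific tokens, prepends the topic segment, and brackets the essay with [SOE]/[EOE]. This is our own minimal illustration (the function name `build_input` and its signature are assumptions, not the authors' code):

```python
def build_input(topic, spans):
    """Assemble one model input string.

    topic: the essay prompt/topic text.
    spans: list of (discourse_type, text) pairs in essay order.
    """
    parts = [f"[TOPIC] {topic} [TOPIC_END]", "[SOE]"]
    for discourse_type, text in spans:
        # e.g. "Concluding Statement" -> "CONCLUDING_STATEMENT"
        tag = discourse_type.upper().replace(" ", "_")
        parts.append(f"[{tag}] {text} [{tag}_END]")
    parts.append("[EOE]")
    return " ".join(parts)

example = build_input(
    "Should computers read the emotional expressions of students in a classroom?",
    [("Lead", "Imagine being told about a new technology that reads emotions."),
     ("Position", "I believe the value of this technology is over the top.")],
)
```

In practice the bracketed tags would also be registered as special tokens with the tokenizer (e.g. via `tokenizer.add_tokens`) so they are never split into subwords.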

Span MLM (Span Masked Language Modeling)

To train the newly added tokens (e.g. [LEAD], [POSITION]) and adapt to the task domain (essays by students in grades 6-12), we continued pre-training each backbone (e.g. deberta-v3-large) with a masked language modeling (MLM) objective. While standard MLM worked reasonably well, we found large gains from the following modifications:

  • Changed the masking probability to 40-50%, rather than the commonly used 15%. For a detailed analysis of masking rates, see: Should You Mask 15% in Masked Language Modeling? https://arxiv.org/abs/2202.08005
  • Masked contiguous spans of 3-15 tokens, rather than the usual random token masking. Our motivation came from this paper: SpanBERT: Improving Pre-training by Representing and Predicting Spans
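The two modifications above can be sketched together: repeatedly pick a random span of 3-15 tokens and mask it until roughly 40-50% of the sequence is covered. This is a simplified illustration under our own assumptions (plain integer token ids, a given `mask_id`, and `-100` as the ignore index for the MLM loss, as in HuggingFace Transformers), not the authors' implementation:

```python
import random

def span_mask(token_ids, mask_id, mask_prob=0.45, min_span=3, max_span=15, seed=0):
    """Mask contiguous spans until ~mask_prob of tokens are masked.

    Returns (masked_ids, labels) where labels hold the original id at
    masked positions and -100 (ignored by the loss) elsewhere.
    """
    rng = random.Random(seed)
    ids = list(token_ids)
    budget = int(len(ids) * mask_prob)  # total tokens to mask (40-50%)
    masked = set()
    while len(masked) < budget:
        span_len = rng.randint(min_span, max_span)
        start = rng.randrange(0, max(1, len(ids) - span_len))
        for i in range(start, min(start + span_len, len(ids))):
            masked.add(i)
    labels = [tid if i in masked else -100 for i, tid in enumerate(ids)]
    for i in masked:
        ids[i] = mask_id
    return ids, labels
```

A production version would also respect word boundaries and skip special tokens, but the core idea, span-level rather than independent per-token masking at a much higher rate, is captured here.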