504. Feedback Prize - Predicting Effective Arguments | feedback-prize-effectiveness
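Many thanks to Kaggle and the competition hosts for introducing the Efficiency Track. It is a great addition and will surely lead to more creative solutions. This is the detailed version of our solution to the Feedback Prize - Predicting Effective Arguments competition.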
Overall, we followed a span classification approach, with the model architecture as follows:

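We pre-processed each essay to insert newly added span-start and span-end tokens for each discourse type. To give the models more context, we added a topic segment wrapped in [TOPIC] and [TOPIC_END] tokens. As a minor detail, we also inserted [SOE] and [EOE] tokens to mark the start and end of the essay. Here is an example: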
[TOPIC] Should computers read the emotional expressions of students in a classroom? [TOPIC_END] [SOE] Lead [LEAD] Imagine being told about a new new technology that reads emotions. Now imagine this, imagine being able to tell how someone truly feels about their emotional expressions because they don't really know how they really feel. [LEAD_END] In this essay I am going to tell you about my opinions on why Position [POSITION] I believe that the value of using technology to read student's emotional expressions is over the top. [POSITION_END]
...
Concluding Statement [CONCLUDING_STATEMENT] In conclusion I think this is not a very good idea, but not in the way students should be watched. The way teachers are able to read students' facial expressions tells them how they feel. I don't believe it's important to look at students faces when they can fake their emotions. But, if the teacher is watching you then they're gonna get angry. This is how I feel on this topic. [CONCLUDING_STATEMENT_END] [EOE]
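The marker insertion shown in the example can be sketched in a few lines of Python. This is only an illustration, not our actual pre-processing code: the function names `marker` and `build_marked_essay` are hypothetical, and the sketch assumes each discourse element is available as a non-overlapping (start, end, discourse_type) character span, which is an assumption about the input format.

```python
from typing import List, Tuple


def marker(discourse_type: str) -> str:
    """Turn a discourse type such as 'Concluding Statement' into 'CONCLUDING_STATEMENT'."""
    return discourse_type.upper().replace(" ", "_")


def build_marked_essay(topic: str, essay: str, spans: List[Tuple[int, int, str]]) -> str:
    """Wrap each discourse span in its start/end tokens and add topic and essay markers.

    Assumption: `spans` holds non-overlapping (start, end, discourse_type)
    character offsets into `essay`. Text outside any span is kept unmarked.
    """
    parts = [f"[TOPIC] {topic} [TOPIC_END]", "[SOE]"]
    cursor = 0
    for start, end, dtype in sorted(spans):
        # Keep any unannotated text between the previous span and this one.
        gap = essay[cursor:start].strip()
        if gap:
            parts.append(gap)
        tag = marker(dtype)
        # The type name precedes the start token, as in the example above.
        parts.append(f"{dtype} [{tag}] {essay[start:end].strip()} [{tag}_END]")
        cursor = end
    tail = essay[cursor:].strip()
    if tail:
        parts.append(tail)
    parts.append("[EOE]")
    return " ".join(parts)


if __name__ == "__main__":
    essay = "Hello there. I like cats. Bye."
    spans = [(0, 12, "Lead"), (13, 25, "Position")]
    print(build_marked_essay("Are cats good pets?", essay, spans))
```

Text that falls between annotated spans is passed through unchanged, which matches the unmarked sentence between the Lead and Position spans in the example.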
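To train the newly added tokens (e.g. [LEAD], [POSITION]) and to adapt to the task domain (essays written by students in grades 6-12), we continued pre-training each backbone (e.g. deberta-v3-large) with the masked language modeling (MLM) objective. Standard MLM worked reasonably well, but we found large gains from the following modifications: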