18th Solution (public 11th)

674. Jigsaw - Agile Community Rules Classification | jigsaw-agile-community-rules

Start: 2025-07-23 | End: 2025-10-23 | Content safety | Data and algorithm competition

Author: Ebi (ebinan92)
Published: 2025-10-24

🧩 Solution Summary

All steps below were executed inside the Kaggle Notebook during the test run.

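1. Paraphrasing

  • Used Qwen3-32B to generate paraphrased additional training data.
  • Created one paraphrased sample for each training body to increase linguistic diversity.
  • Used a custom prompt for each rule.
  • Example configuration of the rule-specific prompts: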
RULE_SPECIFIC_INSTRUCTIONS = {
    "No financial advice": {
        "violation": "Rewrite this financial advice comment using different words while keeping its advisory nature about investments, taxes, or crypto.",
        "compliant": "Rewrite this comment using different words while keeping it free from specific financial advice.",
    },
    "No medical advice": {
        "violation": "Rewrite this medical advice comment using different words while keeping its diagnostic or treatment recommendation nature.",
        "compliant": "Rewrite this comment using different words while keeping it free from specific medical advice.",
    },
    "No promotion of illegal activity": {
        "violation": "Rewrite this comment using different words while keeping its promotion or encouragement of illegal activities.",
        "compliant": "Rewrite this comment using different words while keeping it legal and compliant.",
    },
    "No spoilers": {
        "violation": "Rewrite this spoiler comment using different words while keeping the reveal of important plot details.",
        "compliant": "Rewrite this comment using different words while keeping it spoiler-free.",
    },
    "No Advertising": {
        "violation": "Rewrite this promotional or advertising text using different words while keeping its spammy, promotional nature with links or product promotion.",
        "compliant": "Rewrite this comment using different words while keeping it free from advertising or promotional content.",
    },
    "No legal advice": {
        "violation": "Rewrite this legal advice comment using different words while keeping its advisory nature about legal matters.",
        "compliant": "Rewrite this comment using different words while keeping it free from specific legal advice.",
    },
    "Default": {
        "violation": "Rewrite this rule-violating comment using different words while preserving its problematic nature and rule violation.",
        "compliant": "Rewrite this compliant comment using different words while keeping it appropriate and rule-compliant.",
    },
}
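
The write-up does not show how an instruction is selected and combined with a comment, so the following is only a minimal sketch of one way it could be done. The helper names (`pick_instruction`, `build_paraphrase_prompt`), the case-insensitive substring match on the rule text, and the prompt layout are all assumptions, not the author's code. Only a subset of the table above is repeated so the block runs on its own.

```python
# Subset of the table above, repeated so this sketch is self-contained.
RULE_SPECIFIC_INSTRUCTIONS = {
    "No legal advice": {
        "violation": "Rewrite this legal advice comment using different words while keeping its advisory nature about legal matters.",
        "compliant": "Rewrite this comment using different words while keeping it free from specific legal advice.",
    },
    "No Advertising": {
        "violation": "Rewrite this promotional or advertising text using different words while keeping its spammy, promotional nature with links or product promotion.",
        "compliant": "Rewrite this comment using different words while keeping it free from advertising or promotional content.",
    },
    "Default": {
        "violation": "Rewrite this rule-violating comment using different words while preserving its problematic nature and rule violation.",
        "compliant": "Rewrite this compliant comment using different words while keeping it appropriate and rule-compliant.",
    },
}


def pick_instruction(rule: str, is_violation: bool) -> str:
    """Return the instruction for a rule, falling back to "Default".

    ASSUMPTION: a rule is matched when a table key occurs in the rule
    text (case-insensitive). The original matching logic is not given.
    """
    kind = "violation" if is_violation else "compliant"
    rule_lower = rule.lower()
    for key, instructions in RULE_SPECIFIC_INSTRUCTIONS.items():
        if key != "Default" and key.lower() in rule_lower:
            return instructions[kind]
    return RULE_SPECIFIC_INSTRUCTIONS["Default"][kind]


def build_paraphrase_prompt(body: str, rule: str, is_violation: bool) -> str:
    """Assemble a paraphrasing prompt (hypothetical layout)."""
    instruction = pick_instruction(rule, is_violation)
    return (
        f"{instruction}\n\n"
        f"Rule: {rule}\n"
        f"Comment: {body}\n\n"
        "Rewritten comment:"
    )


if __name__ == "__main__":
    print(build_paraphrase_prompt(
        body="You should sue your landlord right away.",
        rule="No legal advice: Do not offer or request legal advice.",
        is_violation=True,
    ))
```

Keeping the label in the instruction ("violation" vs. "compliant") is what lets the paraphrased sample inherit the rule_violation label of the original body.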

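2. Models

  • Trained three models with QLoRA: Phi-4, Qwen-2.5-14B-Instruct, and Qwen3-14B.
  • Each model was trained to take body + rule as input and to output Yes / No according to rule_violation.
  • Training used Unsloth with DDP (Distributed Data Parallel) for efficient multi-GPU training.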
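
The exact prompt template and scoring code are not given in the write-up, so the block below is a rough sketch of the input/output format described above. The template wording, the helper names `format_example` and `yes_probability`, and the idea of scoring by a softmax over the "Yes" and "No" logits are assumptions for illustration only.

```python
import math

# ASSUMPTION: hypothetical template; the author's actual prompt is not shown.
PROMPT_TEMPLATE = (
    "Rule: {rule}\n"
    "Comment: {body}\n"
    "Does the comment violate the rule? Answer Yes or No.\n"
    "Answer:"
)


def format_example(body, rule, rule_violation=None):
    """Build a prompt from body + rule; add a Yes/No target when labeled."""
    prompt = PROMPT_TEMPLATE.format(rule=rule.strip(), body=body.strip())
    if rule_violation is None:
        # Unlabeled (test-time) row: prompt only.
        return {"prompt": prompt}
    completion = " Yes" if int(rule_violation) == 1 else " No"
    return {"prompt": prompt, "completion": completion}


def yes_probability(yes_logit, no_logit):
    """Softmax over the two answer logits, returned as P(Yes).

    ASSUMPTION: one common way to turn a Yes/No generation into a
    continuous score; the write-up does not say how scores were derived.
    """
    m = max(yes_logit, no_logit)  # subtract the max for numerical stability
    e_yes = math.exp(yes_logit - m)
    e_no = math.exp(no_logit - m)
    return e_yes / (e_yes + e_no)


if __name__ == "__main__":
    example = format_example(
        body="Use my referral code for 20% off!",
        rule="No Advertising",
        rule_violation=1,
    )
    print(example["prompt"] + example["completion"])
    print(yes_probability(2.0, -1.0))
```

Restricting the target to a single Yes or No answer keeps the training objective short and makes the model's preference between the two answers easy to read off at inference time.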

What didn't work

  • Pseudo-labeling
  • Muon optimizer
  • New training data generated with few-shot prompting