0.66+ solution - mean prompt + SFT + text_type

603. LLM Prompt Recovery | llm-prompt-recovery

Start: 2024-02-27 End: 2024-04-16 AIGC and multimodal data algorithm competition

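First of all, I would like to thank the organizers and congratulate all the winners!

Special thanks to:

@richolson for sharing great ideas
@seifachour12 for the mean prompt "Please improve this text using the writing style with maintaining the original meaning but altering the tone", which scores 0.63. Despite many attempts, I could not improve on it :)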

Datasets

I built two datasets similar to the small one provided by the organizers:

original_text, rewrite_prompt, rewritten_text
original_text, text_type (text, poem, story, memo, email, etc.)

Models

SFT of Mistral-7B-Instruct-v0.2 to predict rewrite_prompt:

# `sample` is one dataset row with the keys original_text and rewritten_text
instruction = "I will provide you two texts - original text and rewritten text"
text = f"""[INST] {instruction} [/INST] Sure. Write the original text [INST] {sample['original_text']} [/INST] Write the rewritten text [INST]{sample['rewritten_text']}[/INST] The following prompt could be used to transform the original text into the rewritten text: Rewrite the original text """
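To make the template above easier to reuse, here is a minimal sketch that wraps it in a function. The function name `build_rewrite_sft_text` and the optional `completion` argument are assumptions for illustration; the post does not say how the target text was appended during training, so the sketch simply concatenates it.

```python
from typing import Optional

INSTRUCTION = "I will provide you two texts - original text and rewritten text"


def build_rewrite_sft_text(sample: dict, completion: Optional[str] = None) -> str:
    """Build the prompt from the post for one dataset row.

    `completion` is a hypothetical argument: when given, it is appended after
    the prompt, as one would do for a training example. At inference time it
    is left out and the model continues the text itself.
    """
    text = (
        f"[INST] {INSTRUCTION} [/INST] Sure. Write the original text "
        f"[INST] {sample['original_text']} [/INST] Write the rewritten text "
        f"[INST]{sample['rewritten_text']}[/INST] "
        "The following prompt could be used to transform the original text "
        "into the rewritten text: Rewrite the original text "
    )
    if completion is not None:
        text += completion
    return text


example = {
    "original_text": "The meeting is at noon.",
    "rewritten_text": "Arr, we gather when the sun be highest!",
}
prompt_only = build_rewrite_sft_text(example)
train_text = build_rewrite_sft_text(example, completion="as a pirate would say it")
```

Because the prompt already ends with "Rewrite the original text ", the model only has to generate the continuation of that sentence.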

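SFT of Mistral-7B-Instruct-v0.2 to predict the type of the original text (a separate fine-tuned model is overkill for this, but the training pipeline was already in place):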

# The instruction string is kept exactly as it was used for fine-tuning
instruction = "Provide a text and I'll tell you it's type. I'll output only answer without explanation."
text = f"""[INST] {instruction} [/INST] Sure. Write the text [INST] {sample['original_text']} [/INST] The text is"""
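As a rough sketch of how this classifier prompt might be used end to end, the snippet below builds the prompt and cleans up a raw completion into a bare type label. The helper names `build_text_type_prompt` and `parse_text_type`, the article stripping, and the fallback label "text" are all assumptions, not something described in the post.

```python
TYPE_INSTRUCTION = (
    "Provide a text and I'll tell you it's type. "
    "I'll output only answer without explanation."
)


def build_text_type_prompt(original_text: str) -> str:
    """Return the text-type prompt from the post for one original text."""
    return (
        f"[INST] {TYPE_INSTRUCTION} [/INST] Sure. Write the text "
        f"[INST] {original_text} [/INST] The text is"
    )


def parse_text_type(completion: str, default: str = "text") -> str:
    """Turn a raw completion such as ' a poem.' into the label 'poem'.

    Hypothetical post-processing: keep the first line, drop surrounding
    punctuation and a leading article, and fall back to `default` if
    nothing usable remains.
    """
    stripped = completion.strip()
    if not stripped:
        return default
    cleaned = stripped.splitlines()[0].strip(" .,:;!?\"'").lower()
    for article in ("a ", "an ", "the "):
        if cleaned.startswith(article):
            cleaned = cleaned[len(article):]
            break
    return cleaned or default


prompt = build_text_type_prompt("Roses are red, violets are blue.")
label = parse_text_type(" a poem.")
```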

Inference

REWRITE_PROMPT = "Please improve this {text_type} using the writing style {rewrite_prompt} with maintaining the original meaning but altering the tone."
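The template above can be filled with the two model outputs roughly as follows. This is a sketch under assumptions: the function name `build_final_prompt`, the whitespace normalization, and the fallback to "text" with an empty style are mine; the post does not describe how missing or empty predictions were handled.

```python
from typing import Optional

REWRITE_PROMPT = (
    "Please improve this {text_type} using the writing style {rewrite_prompt} "
    "with maintaining the original meaning but altering the tone."
)


def build_final_prompt(text_type: Optional[str], rewrite_prompt: Optional[str]) -> str:
    """Insert the predicted text type and rewrite prompt into the template.

    Assumed fallback: an empty text type becomes 'text' and an empty rewrite
    prompt is dropped, which reduces the template to the plain mean prompt.
    """
    text_type = (text_type or "").strip() or "text"
    rewrite_prompt = " ".join((rewrite_prompt or "").split())
    filled = REWRITE_PROMPT.format(text_type=text_type, rewrite_prompt=rewrite_prompt)
    # Collapse the double space left behind when rewrite_prompt is empty.
    return " ".join(filled.split())


submission_prompt = build_final_prompt("email", "as a pirate would say it")
fallback_prompt = build_final_prompt(None, None)
```

With both slots empty, the result is the mean prompt itself, so in this sketch a failed prediction cannot produce anything worse than the baseline prompt.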

At the end of the competition, I tried ensembling the predictions of three different models, which gave a small improvement.
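The post does not say how the three predictions were combined. Purely as an illustration of one simple option, the sketch below picks the prediction that agrees most with the others, using word-level Jaccard overlap as a stand-in similarity. Both the selection rule and the helper names `jaccard` and `pick_consensus` are assumptions, not the author's method.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two strings (case-insensitive)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def pick_consensus(predictions: List[str]) -> str:
    """Return the prediction with the highest mean similarity to the others."""
    if not predictions:
        raise ValueError("predictions must not be empty")
    if len(predictions) == 1:
        return predictions[0]

    def mean_similarity(i: int) -> float:
        others = [p for j, p in enumerate(predictions) if j != i]
        return sum(jaccard(predictions[i], p) for p in others) / len(others)

    best_index = max(range(len(predictions)), key=mean_similarity)
    return predictions[best_index]


model_outputs = [
    "as a formal business email",
    "as a formal email",
    "as a pirate song",
]
chosen = pick_consensus(model_outputs)
```

A similarity based on sentence embeddings could be swapped in for `jaccard` without changing the selection logic.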

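What did not work

Extending text_type (adding slogan, haiku, tongue twister, etc.) and generating the prompt Convert this {text_type_1} to {text_type_2} (if text_type_1 ≠ text_type_2)
Using a larger model (e.g. Mixtral-8x7B-Instruct-v0.1), which actually performed worse
Trying PPO (reinforcement learning)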
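For reference, the type-conversion prompt from the first item in this list can be sketched as below. The function name `build_convert_prompt` and the case-insensitive comparison are assumptions; note that, according to the post, this variant did not improve the score.

```python
from typing import Optional


def build_convert_prompt(text_type_1: str, text_type_2: str) -> Optional[str]:
    """Return 'Convert this X to Y' when the two types differ, else None."""
    source, target = text_type_1.strip(), text_type_2.strip()
    if source.lower() == target.lower():
        # Same type on both sides: no conversion prompt applies.
        return None
    return f"Convert this {source} to {target}"


convert_prompt = build_convert_prompt("email", "haiku")
no_prompt = build_convert_prompt("poem", "Poem")
```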

Author: Ruslan Guseynov | Published: 2024-04-17
