microsoft/deberta-large-mnli
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
DeBERTa improves the BERT and RoBERTa models using disentangled attention and an enhanced mask decoder. It outperforms BERT and RoBERTa on a majority of NLU tasks with 80GB of training data.
Please check the official repository for more details and updates.
This is the DeBERTa large model fine-tuned on the MNLI task.
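Because this checkpoint is already fine-tuned on MNLI, it can score a premise/hypothesis pair directly. The following is a minimal sketch, assuming `torch` and `transformers` are installed; the helper names `softmax`, `rank_labels`, and `predict_nli` are illustrative and not part of any library. Label names are read from the checkpoint's own config rather than assumed.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(logits, id2label):
    """Pair each label with its probability, most likely first."""
    probs = softmax(logits)
    pairs = [(id2label[i], p) for i, p in enumerate(probs)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def predict_nli(premise, hypothesis, model_name="microsoft/deberta-large-mnli"):
    """Score a premise/hypothesis pair; downloads the checkpoint on first use."""
    # Imported lazily so the helpers above work without torch/transformers.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    # Passing two strings encodes them as a single sentence pair.
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # Label names come from the checkpoint's config, not hard-coded here.
    return rank_labels(logits, model.config.id2label)
```

Calling `predict_nli("A man is eating.", "Someone is having a meal.")` would return a list of `(label, probability)` tuples sorted by probability; the example sentences are made up for illustration.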
Fine-tuning on NLU tasks
We present the dev results on SQuAD 1.1/2.0 and several GLUE benchmark tasks.
| Model | SQuAD 1.1 | SQuAD 2.0 | MNLI-m/mm | SST-2 | QNLI | CoLA | RTE | MRPC | QQP | STS-B |
|---|---|---|---|---|---|---|---|---|---|---|
| | F1/EM | F1/EM | Acc | Acc | Acc | MCC | Acc | Acc/F1 | Acc/F1 | P/S |
| BERT-Large | 90.9/84.1 | 81.8/79.0 | 86.6/- | 93.2 | 92.3 | 60.6 | 70.4 | 88.0/- | 91.3/- | 90.0/- |
| RoBERTa-Large | 94.6/88.9 | 89.4/86.5 | 90.2/- | 96.4 | 93.9 | 68.0 | 86.6 | 90.9/- | 92.2/- | 92.4/- |
| XLNet-Large | 95.1/89.7 | 90.6/87.9 | 90.8/- | 97.0 | 94.9 | 69.0 | 85.9 | 90.8/- | 92.3/- | 92.5/- |
| DeBERTa-Large<sup>1</sup> | 95.5/90.1 | 90.7/88.0 | 91.3/91.1 | 96.5 | 95.3 | 69.5 | 91.0 | 92.6/94.6 | 92.3/- | 92.8/92.5 |
| DeBERTa-XLarge<sup>1</sup> | -/- | -/- | 91.5/91.2 | 97.0 | - | - | 93.1 | 92.1/94.3 | - | 92.9/92.7 |
| DeBERTa-V2-XLarge<sup>1</sup> | 95.8/90.8 | 91.4/88.9 | 91.7/91.6 | 97.5 | 95.8 | 71.1 | 93.9 | 92.0/94.2 | 92.3/89.8 | 92.9/92.9 |
| DeBERTa-V2-XXLarge<sup>1,2</sup> | 96.1/91.4 | 92.2/89.7 | 91.7/91.9 | 97.2 | 96.0 | 72.0 | 93.5 | 93.1/94.9 | 92.7/90.3 | 93.2/93.1 |
Notes.
- <sup>1</sup> Following RoBERTa, for RTE, MRPC, and STS-B we fine-tune from the MNLI checkpoints DeBERTa-Large-MNLI, DeBERTa-XLarge-MNLI, DeBERTa-V2-XLarge-MNLI, and DeBERTa-V2-XXLarge-MNLI. Results on SST-2, QQP, QNLI, and SQuAD v2 also improve slightly when starting from MNLI fine-tuned models; however, for those four tasks we report only the numbers obtained by fine-tuning from the pretrained base models.
- <sup>2</sup> To try the XXLarge model with HF Transformers, you need to specify --sharded_ddp:
cd transformers/examples/text-classification/
export TASK_NAME=mrpc
python -m torch.distributed.launch --nproc_per_node=8 run_glue.py --model_name_or_path microsoft/deberta-v2-xxlarge \
--task_name $TASK_NAME --do_train --do_eval --max_seq_length 128 --per_device_train_batch_size 4 \
--learning_rate 3e-6 --num_train_epochs 3 --output_dir /tmp/$TASK_NAME/ --overwrite_output_dir --sharded_ddp --fp16
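The first note describes fine-tuning RTE and MRPC from an MNLI checkpoint instead of the pretrained base model. A minimal sketch of that starting point with HF Transformers could look like the following, assuming `transformers` is installed. The helper names `label_maps` and `load_from_mnli_checkpoint` are hypothetical, and the sketch covers classification tasks only, so a regression task such as STS-B would need a different head configuration.

```python
def label_maps(label_names):
    """Build the id2label / label2id dictionaries that a model config expects."""
    id2label = dict(enumerate(label_names))
    label2id = {name: idx for idx, name in id2label.items()}
    return id2label, label2id


def load_from_mnli_checkpoint(label_names, checkpoint="microsoft/deberta-large-mnli"):
    """Load the MNLI-tuned encoder with a freshly initialized classifier head."""
    # Imported lazily so label_maps works without transformers installed.
    from transformers import AutoModelForSequenceClassification

    id2label, label2id = label_maps(label_names)
    return AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(label_names),
        id2label=id2label,
        label2id=label2id,
        # The MNLI head has 3 outputs; let the loader replace it instead of failing.
        ignore_mismatched_sizes=True,
    )
```

For a two-class task such as MRPC, `load_from_mnli_checkpoint(["not_equivalent", "equivalent"])` would keep the encoder weights and reinitialize only the classification layer, which then has to be trained on the target task. The label names here are illustrative.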
Citation
If you find DeBERTa useful for your work, please cite the following paper:
@inproceedings{
he2021deberta,
title={DEBERTA: DECODING-ENHANCED BERT WITH DISENTANGLED ATTENTION},
author={Pengcheng He and Xiaodong Liu and Jianfeng Gao and Weizhu Chen},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=XPZIaotutsD}
}

