microsoft/deberta-xlarge-mnli
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
DeBERTa improves the BERT and RoBERTa models using disentangled attention and an enhanced mask decoder. It outperforms BERT and RoBERTa on the majority of NLU tasks with 80GB of training data.
Please check the official repository for more details and updates.
This is the DeBERTa XLarge model (750M parameters) fine-tuned on the MNLI task.
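Because this checkpoint is fine-tuned on MNLI, it can score a premise/hypothesis pair directly. The following is a minimal sketch, assuming the `torch` and `transformers` packages are installed. The helper names (`softmax`, `pick_label`, `classify`) are illustrative and not part of the model card, and the label names are read from the checkpoint's configuration rather than assumed.

```python
import math

MODEL_NAME = "microsoft/deberta-xlarge-mnli"


def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify(premise, hypothesis, model_name=MODEL_NAME):
    """Score one premise/hypothesis pair; downloads the checkpoint on first use."""
    # Imported lazily so the helpers above work without torch/transformers.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    # Passing two strings encodes them as a single sentence pair.
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    # Label names come from the checkpoint config instead of being hard-coded.
    return pick_label(logits, model.config.id2label)
```

A call such as `classify("A man is playing a guitar on stage.", "A person is making music.")` returns a `(label, probability)` tuple. The first call downloads the full 750M-parameter checkpoint, so expect it to be slow.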
Fine-tuning on NLU tasks
We present the dev results on SQuAD 1.1/2.0 and several GLUE benchmark tasks.
| Model | SQuAD 1.1 | SQuAD 2.0 | MNLI-m/mm | SST-2 | QNLI | CoLA | RTE | MRPC | QQP | STS-B |
|---|---|---|---|---|---|---|---|---|---|---|
| | F1/EM | F1/EM | Acc | Acc | Acc | MCC | Acc | Acc/F1 | Acc/F1 | P/S |
| BERT-Large | 90.9/84.1 | 81.8/79.0 | 86.6/- | 93.2 | 92.3 | 60.6 | 70.4 | 88.0/- | 91.3/- | 90.0/- |
| RoBERTa-Large | 94.6/88.9 | 89.4/86.5 | 90.2/- | 96.4 | 93.9 | 68.0 | 86.6 | 90.9/- | 92.2/- | 92.4/- |
| XLNet-Large | 95.1/89.7 | 90.6/87.9 | 90.8/- | 97.0 | 94.9 | 69.0 | 85.9 | 90.8/- | 92.3/- | 92.5/- |
| DeBERTa-Large<sup>1</sup> | 95.5/90.1 | 90.7/88.0 | 91.3/91.1 | 96.5 | 95.3 | 69.5 | 91.0 | 92.6/94.6 | 92.3/- | 92.8/92.5 |
| DeBERTa-XLarge<sup>1</sup> | -/- | -/- | 91.5/91.2 | 97.0 | - | - | 93.1 | 92.1/94.3 | - | 92.9/92.7 |
| DeBERTa-V2-XLarge<sup>1</sup> | 95.8/90.8 | 91.4/88.9 | 91.7/91.6 | 97.5 | 95.8 | 71.1 | 93.9 | 92.0/94.2 | 92.3/89.8 | 92.9/92.9 |
| DeBERTa-V2-XXLarge<sup>1,2</sup> | 96.1/91.4 | 92.2/89.7 | 91.7/91.9 | 97.2 | 96.0 | 72.0 | 93.5 | 93.1/94.9 | 92.7/90.3 | 93.2/93.1 |
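As a reading aid for the table, the sketch below computes per-task gains of DeBERTa-Large over the baseline rows. The numbers are copied from the table, taking the first value of each cell (F1 for SQuAD, accuracy for MNLI-m, MRPC and QQP, Pearson for STS-B). The helper names `gains` and `wins` are illustrative only.

```python
# First-listed metric per task, copied from the table above.
TASKS = ["SQuAD 1.1", "SQuAD 2.0", "MNLI-m", "SST-2", "QNLI",
         "CoLA", "RTE", "MRPC", "QQP", "STS-B"]
SCORES = {
    "BERT-Large":    [90.9, 81.8, 86.6, 93.2, 92.3, 60.6, 70.4, 88.0, 91.3, 90.0],
    "RoBERTa-Large": [94.6, 89.4, 90.2, 96.4, 93.9, 68.0, 86.6, 90.9, 92.2, 92.4],
    "XLNet-Large":   [95.1, 90.6, 90.8, 97.0, 94.9, 69.0, 85.9, 90.8, 92.3, 92.5],
    "DeBERTa-Large": [95.5, 90.7, 91.3, 96.5, 95.3, 69.5, 91.0, 92.6, 92.3, 92.8],
}


def gains(model, baseline):
    """Per-task score difference (model minus baseline), rounded to 0.1."""
    return {
        task: round(m - b, 1)
        for task, m, b in zip(TASKS, SCORES[model], SCORES[baseline])
    }


def wins(model, baseline):
    """Number of tasks on which the model scores strictly higher."""
    return sum(1 for delta in gains(model, baseline).values() if delta > 0)


for baseline in ("BERT-Large", "RoBERTa-Large", "XLNet-Large"):
    print(baseline, wins("DeBERTa-Large", baseline), "of", len(TASKS))
```

On these figures, DeBERTa-Large is ahead of BERT-Large and RoBERTa-Large on all ten listed tasks. It is ahead of XLNet-Large on eight, tying on QQP and trailing on SST-2.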
Notes.
- <sup>1</sup> Following RoBERTa, for RTE, MRPC, and STS-B we fine-tune starting from the MNLI fine-tuned checkpoints: DeBERTa-Large-MNLI, DeBERTa-XLarge-MNLI, DeBERTa-V2-XLarge-MNLI, and DeBERTa-V2-XXLarge-MNLI. Results on SST-2/QQP/QNLI/SQuADv2 also improve slightly when starting from the MNLI fine-tuned models; however, for those 4 tasks we only report numbers fine-tuned from the pretrained base models.
- <sup>2</sup> To try the XXLarge model with HF Transformers, you need to specify `--sharded_ddp`:
```bash
# Fine-tune DeBERTa-V2-XXLarge on MRPC across 8 GPUs with sharded DDP and fp16
cd transformers/examples/text-classification/
export TASK_NAME=mrpc
python -m torch.distributed.launch --nproc_per_node=8 run_glue.py --model_name_or_path microsoft/deberta-v2-xxlarge \
  --task_name $TASK_NAME --do_train --do_eval --max_seq_length 128 --per_device_train_batch_size 4 \
  --learning_rate 3e-6 --num_train_epochs 3 --output_dir /tmp/$TASK_NAME/ --overwrite_output_dir --sharded_ddp --fp16
```
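Note 1 describes starting RTE, MRPC, and STS-B fine-tuning from an MNLI checkpoint. The sketch below shows one way to set that up in Python, assuming the `transformers` package is installed. Replacing the 3-way MNLI head with a freshly initialised head is an assumption here; the notes do not say how the authors handled the head. The names `NUM_LABELS`, `head_kwargs`, and `load_for_task` are hypothetical.

```python
BASE_CHECKPOINT = "microsoft/deberta-xlarge-mnli"

# Output sizes for the tasks named in note 1: RTE and MRPC are binary
# classification, STS-B is regression (a single output).
NUM_LABELS = {"rte": 2, "mrpc": 2, "stsb": 1}


def head_kwargs(task):
    """Keyword arguments for from_pretrained when swapping the MNLI head."""
    return {
        "num_labels": NUM_LABELS[task],
        # The MNLI head has 3 outputs; permit a new head of a different size
        # (assumption: the head is re-initialised, the encoder is reused).
        "ignore_mismatched_sizes": True,
    }


def load_for_task(task, checkpoint=BASE_CHECKPOINT):
    """Load encoder weights from the MNLI checkpoint with a new task head."""
    # Imported lazily so head_kwargs can be used without transformers installed.
    from transformers import AutoModelForSequenceClassification

    return AutoModelForSequenceClassification.from_pretrained(
        checkpoint, **head_kwargs(task)
    )
```

Calling `load_for_task("mrpc")` returns a model whose encoder weights come from the MNLI checkpoint and whose classification head still needs to be trained on the target task.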
Citation
If you find DeBERTa useful for your work, please cite the following paper:
@inproceedings{
he2021deberta,
title={DEBERTA: DECODING-ENHANCED BERT WITH DISENTANGLED ATTENTION},
author={Pengcheng He and Xiaodong Liu and Jianfeng Gao and Weizhu Chen},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=XPZIaotutsD}
}

