roberta-large-mnli

Model Details
How to Get Started with the Model
Uses
Risks, Limitations and Biases
Training
Evaluation
Environmental Impact
Technical Specifications
Citation Information
Model Card Authors

Model Details

Model Description: roberta-large-mnli is the RoBERTa large model fine-tuned on the Multi-Genre Natural Language Inference (MNLI) corpus. The underlying RoBERTa model was pretrained on English-language text using a masked language modeling (MLM) objective.

Developed by: See GitHub Repo for model developers
Model Type: Transformer-based language model
Language(s): English
License: MIT
Parent Model: This model is a fine-tuned version of the RoBERTa large model. Users should see the RoBERTa large model card for relevant information.
Resources for more information:
- Research Paper
- GitHub Repo

How to Get Started with the Model

Use the code below to get started with the model. The model can be loaded with the zero-shot-classification pipeline like so:

from transformers import pipeline
classifier = pipeline('zero-shot-classification', model='roberta-large-mnli')

You can then use this pipeline to classify sequences into any of the class names you specify. For example:

sequence_to_classify = "one day I will see the world"
candidate_labels = ['travel', 'cooking', 'dancing']
classifier(sequence_to_classify, candidate_labels)

Uses

Direct Use

This fine-tuned model can be used for zero-shot classification tasks, including zero-shot sentence-pair classification (see the GitHub repo for examples) and zero-shot sequence classification.
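As a rough illustration of how an NLI model can act as a zero-shot classifier, the sketch below pairs the input text (the premise) with one hypothesis per candidate label and then normalizes the per-label entailment scores. The template string matches what is understood to be the default of the transformers zero-shot pipeline in single-label mode; the helper names `build_nli_pairs` and `rank_labels` are hypothetical, and the logits are placeholder numbers rather than model output.

```python
import math

# Assumption: this mirrors the default hypothesis template of the
# transformers zero-shot-classification pipeline.
DEFAULT_TEMPLATE = "This example is {}."


def build_nli_pairs(sequence, candidate_labels, hypothesis_template=DEFAULT_TEMPLATE):
    """Pair the input (premise) with one hypothesis per candidate label."""
    return [(sequence, hypothesis_template.format(label)) for label in candidate_labels]


def rank_labels(candidate_labels, entailment_logits):
    """Softmax the per-label entailment logits and sort labels by score."""
    peak = max(entailment_logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in entailment_logits]
    total = sum(exps)
    scored = zip(candidate_labels, (e / total for e in exps))
    return sorted(scored, key=lambda item: item[1], reverse=True)


labels = ["travel", "cooking", "dancing"]
pairs = build_nli_pairs("one day I will see the world", labels)
# Placeholder logits for illustration only; a real run would read the
# entailment logit for each (premise, hypothesis) pair from the model.
ranking = rank_labels(labels, [4.0, -2.0, -1.0])
```

Each pair in `pairs` is what would be fed to the NLI model; `ranking` lists the labels from most to least entailed under the placeholder scores.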

Misuse and Out-of-scope Use

The model should not be used to intentionally create hostile or alienating environments for people. In addition, the model was not trained to produce factual or true representations of people or events, so using it to generate such content is out of scope for the abilities of this model.

Risks, Limitations and Biases

CONTENT WARNING: Readers should be aware this section contains content that is disturbing, offensive, and can propagate historical and current stereotypes.

Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). The RoBERTa large model card notes that: “The training data used for this model contains a lot of unfiltered content from the internet, which is far from neutral.”

Predictions generated by the model can include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups. For example:

sequence_to_classify = "The CEO had a strong handshake."
candidate_labels = ['male', 'female']
hypothesis_template = "This text speaks about a {} profession."
classifier(sequence_to_classify, candidate_labels, hypothesis_template=hypothesis_template)

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

Training

Training Data

This model was fine-tuned on the Multi-Genre Natural Language Inference (MNLI) corpus. Also see the MNLI data card for more information.

As described in the RoBERTa large model card:

The RoBERTa model was pretrained on the union of five datasets:

- BookCorpus, a dataset consisting of 11,038 unpublished books;
- English Wikipedia (excluding lists, tables and headers);
- CC-News, a dataset containing 63 million English news articles crawled between September 2016 and February 2019;
- OpenWebText, an open-source recreation of the WebText dataset used to train GPT-2;
- Stories, a dataset containing a subset of CommonCrawl data filtered to match the story-like style of Winograd schemas.

Together these datasets weigh 160GB of text.

Also see the bookcorpus data card and the wikipedia data card for additional information.

Training Procedure

Preprocessing

As described in the RoBERTa large model card:

The texts are tokenized using a byte version of Byte-Pair Encoding (BPE) and a vocabulary size of 50,000. The inputs of
the model take pieces of 512 contiguous tokens that may span over documents. The beginning of a new document is marked
with <s> and the end of one by </s>.
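A minimal sketch of the packing idea described above, assuming documents are already tokenized: each document is wrapped in `<s>` and `</s>`, everything is concatenated into one stream, and the stream is cut into fixed-size blocks that may cross document boundaries. The function name `pack_documents` and the toy tokens are hypothetical, and the actual pretraining pipeline may treat the final partial block differently.

```python
BOS, EOS = "<s>", "</s>"


def pack_documents(tokenized_docs, block_size=512):
    """Concatenate documents (wrapped in <s> ... </s>) and cut fixed-size blocks."""
    stream = []
    for tokens in tokenized_docs:
        stream.append(BOS)
        stream.extend(tokens)
        stream.append(EOS)
    # A trailing partial block is kept here; dropping or padding it is a
    # design choice this sketch does not take a position on.
    return [stream[i:i + block_size] for i in range(0, len(stream), block_size)]


# Toy tokens and a tiny block size so the boundary crossing is easy to see.
docs = [["a", "b", "c"], ["d", "e"], ["f"]]
blocks = pack_documents(docs, block_size=4)
```

With the toy input, the second block starts with the `</s>` of the first document and continues into the second document, which is the "may span over documents" behaviour.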

The details of the masking procedure for each sentence are the following:

15% of the tokens are masked.

In 80% of the cases, the masked tokens are replaced by <mask>.

In 10% of the cases, the masked tokens are replaced by a random token (different from the one they replace).

In the 10% remaining cases, the masked tokens are left as is.

Contrary to BERT, the masking is done dynamically during pretraining (i.e., it changes at each epoch and is not fixed).
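The masking rules above can be sketched as follows. This is an illustrative approximation, not the pretraining code: it selects each token independently with probability 0.15 (real implementations may instead select a fixed fraction per sequence), and the names `mask_tokens`, `vocab` and `rng` are hypothetical. Calling the function again for each epoch is what makes the masking dynamic.

```python
import random

MASK = "<mask>"


def mask_tokens(tokens, vocab, rng, mask_prob=0.15):
    """Return (corrupted_tokens, target_positions) for one pass over the data."""
    corrupted = list(tokens)
    targets = []
    for i, token in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # token not selected for prediction
        targets.append(i)
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK  # 80%: replace with <mask>
        elif roll < 0.9:
            # 10%: replace with a random token different from the original
            corrupted[i] = rng.choice([t for t in vocab if t != token])
        # remaining 10%: leave the token as is
    return corrupted, targets


# Toy vocabulary and text; a fresh call per epoch yields a fresh mask.
vocab = ["the", "cat", "sat", "on", "mat"]
tokens = ["the", "cat", "sat", "on", "the", "mat"] * 50
rng = random.Random(0)
epoch_1, targets_1 = mask_tokens(tokens, vocab, rng)
epoch_2, targets_2 = mask_tokens(tokens, vocab, rng)
```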

Pretraining

Also as described in the RoBERTa large model card:

The model was trained on 1024 V100 GPUs for 500K steps with a batch size of 8K and a sequence length of 512. The
optimizer used is Adam with a learning rate of 4e-4, $\beta_{1} = 0.9$, $\beta_{2} = 0.98$ and
$\epsilon = 10^{-6}$, a weight decay of 0.01, learning rate warmup for 30,000 steps and linear decay of the learning
rate afterwards.
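The warmup-then-decay schedule described above can be sketched as a function of the training step. The peak rate, warmup length and total step count come from the text; that the warmup is linear from zero and that the decay reaches exactly zero at step 500K are assumptions, and `learning_rate` is a hypothetical helper name.

```python
def learning_rate(step, peak_lr=4e-4, warmup_steps=30_000, total_steps=500_000):
    """Linear warmup to peak_lr, then linear decay.

    Assumptions: warmup starts from 0 and the decay reaches 0 at total_steps.
    """
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(total_steps - step, 0)
    return peak_lr * remaining / (total_steps - warmup_steps)


# Sample the schedule at a few points of the 500K-step run.
schedule = {step: learning_rate(step) for step in (0, 15_000, 30_000, 265_000, 500_000)}
```

Under these assumptions the rate is half the peak midway through warmup (step 15,000) and again midway through the decay (step 265,000).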

Evaluation

The following evaluation information is extracted from the associated GitHub repo for RoBERTa.

Testing Data, Factors and Metrics

The model developers report that the model was evaluated on the following tasks and datasets using the listed metrics:

Dataset: Part of GLUE (Wang et al., 2019), the General Language Understanding Evaluation benchmark, a collection of 9 datasets for evaluating natural language understanding systems. Specifically, the model was evaluated on the Multi-Genre Natural Language Inference (MNLI) corpus. See the GLUE data card or Wang et al. (2019) for further information.
- Tasks: NLI. Wang et al. (2019) describe the inference task for MNLI as:

  The Multi-Genre Natural Language Inference Corpus (Williams et al., 2018) is a crowd-sourced collection of sentence pairs with textual entailment annotations. Given a premise sentence and a hypothesis sentence, the task is to predict whether the premise entails the hypothesis (entailment), contradicts the hypothesis (contradiction), or neither (neutral). The premise sentences are gathered from ten different sources, including transcribed speech, fiction, and government reports. We use the standard test set, for which we obtained private labels from the authors, and evaluate on both the matched (in-domain) and mismatched (cross-domain) sections. We also use and recommend the SNLI corpus (Bowman et al., 2015) as 550k examples of auxiliary training data.
- Metrics: Accuracy
Dataset: XNLI (Conneau et al., 2018), the extension of the Multi-Genre Natural Language Inference (MNLI) corpus to 15 languages: English, French, Spanish, German, Greek, Bulgarian, Russian, Turkish, Arabic, Vietnamese, Thai, Chinese, Hindi, Swahili and Urdu. See the XNLI data card or Conneau et al. (2018) for further information.
- Tasks: Translate-test (i.e., input sentences in other languages are translated into the training language before the model is applied)
- Metrics: Accuracy
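Since both datasets are scored by accuracy over the three NLI labels, a minimal sketch of that metric is shown here. The `accuracy` helper and the toy label lists are hypothetical and do not reproduce any reported number.

```python
NLI_LABELS = ("entailment", "neutral", "contradiction")


def accuracy(predictions, gold):
    """Percentage of examples whose predicted label equals the gold label."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return 100.0 * correct / len(gold)


# Made-up labels for illustration; real evaluation would use the MNLI
# matched and mismatched sections (or the XNLI test sets) separately.
gold = ["entailment", "neutral", "contradiction", "entailment"]
predicted = ["entailment", "neutral", "neutral", "entailment"]
toy_accuracy = accuracy(predicted, gold)
```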

Results

GLUE test results (dev set, single model, single-task fine-tuning): 90.2 on MNLI

XNLI test results:

Metric	en	fr	es	de	el	bg	ru	tr	ar	vi	th	zh	hi	sw	ur
Accuracy	91.3	82.91	84.27	81.24	81.74	83.13	78.28	76.79	76.64	74.17	74.05	77.5	70.9	66.65	66.81

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). We present the hardware type and hours used based on the associated paper.

Hardware Type: 1024 V100 GPUs
Hours used: 24 hours (one day)
Cloud Provider: Unknown
Compute Region: Unknown
Carbon Emitted: Unknown
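The figures above fix the total compute at 1024 GPUs for 24 hours. The sketch below turns that into GPU-hours and shows the general shape of an emissions estimate (energy drawn times the carbon intensity of the grid). This is a simplified approximation rather than the calculator's exact method, the function and parameter names are hypothetical, and because the cloud provider and region are unknown no emissions value is computed here.

```python
def gpu_hours(num_gpus, wall_clock_hours):
    """Total GPU-hours for a run."""
    return num_gpus * wall_clock_hours


def estimate_emissions_kg(gpu_hours_used, gpu_power_kw, carbon_intensity_kg_per_kwh, pue=1.0):
    """Rough CO2-equivalent estimate in kilograms.

    energy (kWh) = GPU-hours x average power per GPU (kW) x data-center PUE
    emissions    = energy x grid carbon intensity (kg CO2eq per kWh)
    All three inputs after gpu_hours_used must be supplied by the reader;
    none of them is given in this document.
    """
    energy_kwh = gpu_hours_used * gpu_power_kw * pue
    return energy_kwh * carbon_intensity_kg_per_kwh


# Hardware and duration as listed above: 1024 V100 GPUs for 24 hours.
total_gpu_hours = gpu_hours(1024, 24)
```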

Technical Specifications

See the associated paper for details on the modeling architecture, objective, compute infrastructure, and training details.

Citation Information

@article{liu2019roberta,
    title = {RoBERTa: A Robustly Optimized BERT Pretraining Approach},
    author = {Yinhan Liu and Myle Ott and Naman Goyal and Jingfei Du and
              Mandar Joshi and Danqi Chen and Omer Levy and Mike Lewis and
              Luke Zettlemoyer and Veselin Stoyanov},
    journal={arXiv preprint arXiv:1907.11692},
    year = {2019},
}
