dpr-question_encoder-single-nq-base
Table of Contents
- Model Details
- How To Get Started With the Model
- Uses
- Risks, Limitations and Biases
- Training
- Evaluation
- Environmental Impact
- Technical Specifications
- Citation Information
- Model Card Authors
Model Details
Model Description: Dense Passage Retrieval (DPR) is a set of tools and models for state-of-the-art open-domain Q&A research. dpr-question_encoder-single-nq-base is the question encoder trained using the Natural Questions (NQ) dataset (Lee et al., 2019; Kwiatkowski et al., 2019).
- Developed by: See GitHub repo for model developers
- Model Type: BERT-based encoder
- Language(s): English
- License: CC-BY-NC-4.0, also see Code of Conduct
- Related Models:
  - dpr-ctx_encoder-single-nq-base
  - dpr-reader-single-nq-base
  - dpr-ctx_encoder-multiset-base
  - dpr-question_encoder-multiset-base
  - dpr-reader-multiset-base
- Resources for more information:
  - Research Paper
  - GitHub Repo
  - Hugging Face DPR docs
  - BERT Base Uncased Model Card
How to Get Started with the Model
Use the code below to get started with the model.
from transformers import DPRQuestionEncoder, DPRQuestionEncoderTokenizer
# Load the tokenizer and the question encoder from the Hugging Face Hub
tokenizer = DPRQuestionEncoderTokenizer.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
model = DPRQuestionEncoder.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
# Tokenize one question and return PyTorch tensors
input_ids = tokenizer("Hello, is my dog cute ?", return_tensors="pt")["input_ids"]
# pooler_output holds the dense question embedding
embeddings = model(input_ids).pooler_output
Uses
Direct Use
dpr-question_encoder-single-nq-base, dpr-ctx_encoder-single-nq-base, and dpr-reader-single-nq-base can be used for the task of open-domain question answering.
Misuse and Out-of-scope Use
The model should not be used to intentionally create hostile or alienating environments for people. In addition, the set of DPR models was not trained to be factual or true representations of people or events, and therefore using the models to generate such content is out-of-scope for the abilities of this model.
Risks, Limitations and Biases
CONTENT WARNING: Readers should be aware this section may contain content that is disturbing, offensive, and can propagate historical and current stereotypes.
Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al., 2021 and Bender et al., 2021). Predictions generated by the model can include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.
Training
Training Data
This model was trained using the Natural Questions (NQ) dataset (Lee et al., 2019; Kwiatkowski et al., 2019). The model authors write that:
[The dataset] was designed for end-to-end question answering. The questions were mined from real Google search queries and the answers were spans in Wikipedia articles identified by annotators.
Training Procedure
The training procedure is described in the associated paper:
Given a collection of M text passages, the goal of our dense passage retriever (DPR) is to index all the passages in a low-dimensional and continuous space, such that it can retrieve efficiently the top k passages relevant to the input question for the reader at run-time.
Our dense passage retriever (DPR) uses a dense encoder E_P(·) which maps any text passage to a d-dimensional real-valued vectors and builds an index for all the M passages that we will use for retrieval. At run-time, DPR applies a different encoder E_Q(·) that maps the input question to a d-dimensional vector, and retrieves k passages of which vectors are the closest to the question vector.
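The retrieval step quoted above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: it assumes the inner product as the similarity measure, uses brute-force search instead of a FAISS index, and uses small hand-written vectors in place of real E_P and E_Q outputs. The names `dot` and `retrieve_top_k` are hypothetical.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve_top_k(
    question_vec: Sequence[float],
    passage_vecs: List[Sequence[float]],
    k: int,
) -> List[Tuple[int, float]]:
    """Return (passage_index, score) for the k highest-scoring passages."""
    scored = [(i, dot(question_vec, p)) for i, p in enumerate(passage_vecs)]
    # Sort by score, highest first; the sort is stable, so ties keep index order.
    scored.sort(key=lambda pair: -pair[1])
    return scored[:k]


# Toy 3-dimensional vectors standing in for encoded passages and one question.
passage_vecs = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.5],
    [-1.0, 0.0, 0.2],
]
question_vec = [1.0, 0.2, 0.0]

top2 = retrieve_top_k(question_vec, passage_vecs, k=2)
print([index for index, _ in top2])
```

With real embeddings, `passage_vecs` would be computed once offline and `question_vec` at query time; an approximate-nearest-neighbour index replaces the full scan when M is large.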
The authors report that for encoders, they used two independent BERT (Devlin et al., 2019) networks (base, uncased) and used FAISS (Johnson et al., 2017) at inference time to encode and index passages. See the paper for further details on training, including encoders, inference, positive and negative passages, and in-batch negatives.
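As a rough illustration of the in-batch negatives idea mentioned above, the sketch below computes a softmax negative log-likelihood over inner-product scores, treating passage i as the positive for question i and every other passage in the batch as a negative. This is an assumed, simplified form written in plain Python with toy vectors; the function name `in_batch_negatives_loss` is hypothetical, and the paper remains the reference for the actual objective.

```python
import math
from typing import List, Sequence


def in_batch_negatives_loss(
    question_vecs: List[Sequence[float]],
    passage_vecs: List[Sequence[float]],
) -> float:
    """Mean negative log-likelihood; passage i is the positive for question i."""
    if len(question_vecs) != len(passage_vecs):
        raise ValueError("expected one positive passage per question")
    total = 0.0
    for i, q in enumerate(question_vecs):
        # Score this question against every passage in the batch.
        scores = [sum(a * b for a, b in zip(q, p)) for p in passage_vecs]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[i]
    return total / len(question_vecs)


questions = [[4.0, 0.0], [0.0, 4.0]]
aligned = in_batch_negatives_loss(questions, [[1.0, 0.0], [0.0, 1.0]])
swapped = in_batch_negatives_loss(questions, [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

The loss is small when each question scores its own passage highest and grows when another passage in the batch scores higher, which is what lets a batch of B question-passage pairs supply B - 1 negatives per question at no extra encoding cost.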
Evaluation
The following evaluation information is extracted from the associated paper.
Testing Data, Factors and Metrics
The model developers report the performance of the model on five QA datasets, using the top-k accuracy (k ∈ {20, 100}). The datasets were NQ, TriviaQA, WebQuestions (WQ), CuratedTREC (TREC), and SQuAD v1.1.
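For readers unfamiliar with the metric, the sketch below shows one common way to compute top-k retrieval accuracy: the fraction of questions for which at least one of the k highest-ranked passages contains an answer. It assumes answer matching has already been done and is given as booleans; the function name `top_k_accuracy` and the toy data are hypothetical and unrelated to the reported results.

```python
from typing import List


def top_k_accuracy(hits_per_question: List[List[bool]], k: int) -> float:
    """Fraction of questions with an answer-bearing passage in the top k.

    hits_per_question[i][j] is True when the j-th ranked passage for
    question i (best first) contains an answer.
    """
    if not hits_per_question:
        raise ValueError("need at least one question")
    found = sum(1 for hits in hits_per_question if any(hits[:k]))
    return found / len(hits_per_question)


ranked_hits = [
    [False, True, False],   # answer at rank 2
    [False, False, True],   # answer at rank 3
    [False, False, False],  # no answer retrieved
    [True, False, False],   # answer at rank 1
]
acc_at_1 = top_k_accuracy(ranked_hits, k=1)
acc_at_3 = top_k_accuracy(ranked_hits, k=3)
print(acc_at_1, acc_at_3)
```

Accuracy can only stay the same or rise as k grows, which is consistent with the Top 100 figures being higher than the Top 20 figures in the results.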
Results
| Top-k | NQ | TriviaQA | WQ | TREC | SQuAD |
|---|---|---|---|---|---|
| Top 20 | 78.4 | 79.4 | 73.2 | 79.8 | 63.2 |
| Top 100 | 85.4 | 85.0 | 81.4 | 89.1 | 77.2 |
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). We present the hardware type based on the associated paper.
- Hardware Type: 8 × 32GB GPUs
- Hours used: Unknown
- Cloud Provider: Unknown
- Compute Region: Unknown
- Carbon Emitted: Unknown
Technical Specifications
See the associated paper for details on the modeling architecture, objective, compute infrastructure, and training details.
Citation Information
@inproceedings{karpukhin-etal-2020-dense,
title = "Dense Passage Retrieval for Open-Domain Question Answering",
author = "Karpukhin, Vladimir and Oguz, Barlas and Min, Sewon and Lewis, Patrick and Wu, Ledell and Edunov, Sergey and Chen, Danqi and Yih, Wen-tau",
booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
month = nov,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2020.emnlp-main.550",
doi = "10.18653/v1/2020.emnlp-main.550",
pages = "6769--6781",
}
Model Card Authors
This model card was written by the team at Hugging Face.

