https://dblp.org/rdf/schema#authoredBy
|
https://dblp.org/pid/352/2462
, https://dblp.org/pid/33/7132
, https://dblp.org/pid/69/157
, https://dblp.org/pid/150/4295
, https://dblp.org/pid/282/6213
, https://dblp.org/pid/365/3419
, https://dblp.org/pid/71/3692
, https://dblp.org/pid/348/5674
, https://dblp.org/pid/158/6747
, https://dblp.org/pid/167/1754
, https://dblp.org/pid/314/6067
, https://dblp.org/pid/74/6847
, https://dblp.org/pid/121/6722
, https://dblp.org/pid/197/0141
, https://dblp.org/pid/333/4268
, https://dblp.org/pid/47/2026-19
, https://dblp.org/pid/49/67
, https://dblp.org/pid/76/1774
, https://dblp.org/pid/192/1241-1
, https://dblp.org/pid/63/2841
, https://dblp.org/pid/27/5100
, https://dblp.org/pid/135/6973
, https://dblp.org/pid/52/323-1
, https://dblp.org/pid/69/1395
, https://dblp.org/pid/05/6735-1
, https://dblp.org/pid/150/8447
, https://dblp.org/pid/24/5818
|
https://dblp.org/rdf/schema#bibtexType
|
http://purl.org/net/nknouf/ns/bibtex#Article
|
https://dblp.org/rdf/schema#documentPage
|
https://doi.org/10.48550/ARXIV.2401.06080
|
https://dblp.org/rdf/schema#doi
|
https://doi.org/10.48550/ARXIV.2401.06080
, http://dx.doi.org/10.48550/ARXIV.2401.06080
|
https://dblp.org/rdf/schema#listedOnTocPage
|
https://dblp.org/db/journals/corr/corr2401
|
https://dblp.org/rdf/schema#numberOfCreators
|
27
|
https://dblp.org/rdf/schema#primaryDocumentPage
|
https://doi.org/10.48550/ARXIV.2401.06080
|
https://dblp.org/rdf/schema#publishedIn
|
CoRR
|
https://dblp.org/rdf/schema#publishedInJournal
|
CoRR
|
https://dblp.org/rdf/schema#publishedInJournalVolume
|
abs/2401.06080
|
https://dblp.org/rdf/schema#title
|
Secrets of RLHF in Large Language Models Part II: Reward Modeling.
|
https://dblp.org/rdf/schema#yearOfPublication
|
2024
|
owl:sameAs
|
https://doi.org/10.48550/ARXIV.2401.06080
, http://dx.doi.org/10.48550/ARXIV.2401.06080
|
rdf:type
|
https://dblp.org/rdf/schema#Publication
, https://dblp.org/rdf/schema#Informal
|
rdfs:label
|
Binghai Wang et al.: Secrets of RLHF in Large Language Models Part II: Reward Modeling. (2024)
|