## Uses

### Direct Use

The model is a classifier that can be used to detect text generated by GPT-2 models.

### Downstream Use

The model's developers state that they developed and released the model to help with research related to synthetic text generation, so it could potentially be used for downstream tasks related to synthetic text generation.

### Misuse and Out-of-scope Use

The model should not be used to intentionally create hostile or alienating environments for people. In addition, the model developers discuss the risk of adversaries using the model to better evade detection in their associated paper, suggesting that using the model to evade detection, or to support efforts to evade detection, would be a misuse of the model.

## Risks and Limitations

The model has a detection rate of ~95% for text generated by the 1.5B-parameter GPT-2 model. We believe this is not high enough accuracy for standalone detection, and that the model needs to be paired with metadata-based approaches, human judgment, and public education to be more effective. Classifying content from larger models is more difficult, suggesting that detection with automated tools like this model will become increasingly difficult as model sizes increase. The authors find that training detector models on the outputs of larger models can improve accuracy and robustness.

### Bias

Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). Predictions generated by RoBERTa large and GPT-2 1.5B (which this model is built on and fine-tuned from) can include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups (see the RoBERTa large and GPT-2 XL model cards for more information). The developers of this model discuss these issues further in their paper.

## Evaluation

The following evaluation information is extracted from the associated paper.

### Testing Data, Factors and Metrics

The model is intended to detect text generated by GPT-2 models, so the developers test it on text datasets, measuring accuracy on 510-token test examples comprising 5,000 samples from the WebText dataset and 5,000 samples generated by a GPT-2 model, none of which were used during training.

## Model Details

**Model Description:** RoBERTa large OpenAI Detector is the GPT-2 output detector model, obtained by fine-tuning a RoBERTa large model on the outputs of the 1.5B-parameter GPT-2 model. The model can be used to predict whether text was generated by a GPT-2 model. OpenAI released this model at the same time as it released the weights of the largest GPT-2 model, the 1.5B-parameter version.

- **Developed by:** OpenAI
- **Model Type:** Fine-tuned transformer-based language model
- **Language(s):** English
- **License:** MIT
- **Related Models:** RoBERTa large, GPT-2 XL (the 1.5B-parameter version), GPT-2 Large (the 774M-parameter version), GPT-2 Medium (the 355M-parameter version), and GPT-2 (the 124M-parameter version)
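
## How to Get Started with the Model

As a minimal sketch of direct use, the detector can be loaded as an ordinary sequence-classification model with the `transformers` library. The model identifier and the label names shown in the comment are assumptions, not confirmed by this card; check the model repository's config for the exact hub ID and label mapping.

```python
# Minimal sketch: run the detector through the transformers text-classification
# pipeline. The model ID "openai-community/roberta-large-openai-detector" and the
# example label names ("Real"/"Fake") are assumptions; verify them against the
# model repository before relying on them.
from transformers import pipeline

detector = pipeline(
    "text-classification",
    model="openai-community/roberta-large-openai-detector",
)

text = "The quick brown fox jumps over the lazy dog."

# Truncation keeps inputs within the model's maximum sequence length.
result = detector(text, truncation=True)
print(result)  # e.g. [{"label": "Real", "score": 0.97}]
```

Note that, as discussed under Risks and Limitations, a single score from this model should not be treated as a standalone verdict; it is best combined with metadata-based approaches and human judgment.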