logo

OpenAI GPT4

Translation
Question answering
Text generation
Zero-shot classification
Summarization
Conversational
Text classification
Named entity recognition

ChatGPT is an advanced conversational artificial intelligence language model developed by OpenAI. This It is based on the GPT-4 architecture and has been trained on a diverse range of internet text to generate human-like responses in natural language conversations. This model is latest version.


Intended Use

GPT stands for Generative Pre-trained Transformer (GPT), a type of language model that uses deep learning to generate human-like, conversational text. As a multimodal model, GPT-4 is able to accept both text and image outputs. 

However, OpenAI has not yet made the GPT-4 model's visual input capabilities available through any platform. Currently the only way to access the text-input capability through OpenAI is with a subscription to ChatGPT Plus.

The GPT-4 model is optimized for conversational interfaces and can be used to generate text summaries, reports, and responses. Currently, only text modality is supported.


Performance

GPT-4 is a highly advanced model that can accept both image and text inputs, making it more versatile than its predecessor, GPT-3. However, it is important to use the appropriate techniques to get the best results, as the model behaves differently than older GPT models.

OpenAI published results for the GPT-4 model comparing it to other state-of-the-art models (SOTA) including its previous GPT-3.5 model.

Benchmark

GPT-4

Evaluated few-shot

GPT-3.5

Evaluated few-shot

LM SOTA

Best external LM evaluated few-shot

SOTA

Best external model (includes benchmark-specific training)

MMLU

Multiple-choice questions in 57 subjects (professional & academic)

86.4%

5-shot

70.0%

5-shot

70.7%

5-shot U-PaLM

75.2%

5-shot Flan-PaLM

HellaSwag

Commonsense reasoning around everyday events

95.3% 10-shot

85.5%

10-shot

84.2%

LLAMA (validation set)

85.6%

ALUM

AI2 

Reasoning Challenge (ARC)

Grade-school multiple choice science questions. Challenge-set.

96.3%

25-shot

85.2%

25-shot

84.2%

8-shot PaLM

85.6%

ST-MOE

WinoGrande

Commonsense reasoning around pronoun resolution

87.5%

5-shot

81.6%

5-shot

84.2%

5-shot PALM

85.6%

5-shot PALM

HumanEval

Python coding tasks

67.0%

0-shot

48.1%

0-shot

26.2%

0-shot PaLM

65.8%

CodeT + GPT-3.5

DROP 

(f1 score)

Reading comprehension & arithmetic.

80.9

3-shot

64.1

3-shot

70.8

1-shot PaLM

88.4

QDGAT


Limitations

The underlying format of the GPT-4 model is more likely to change over time, and it may provide less useful responses if interacted with in the same way as older models. The GPT-4 model has similar limitations to previous GPT models, such as being prone to LLM hallucination and reasoning errors. OpenAI claims that GPT-4 hallucinates less often than other models, regardless.

Limitations

https://openai.com/research/gpt-4

Privacy policy

OpenAI Policy