Version 2.0

ChatGPT Evaluation Dataset v.2.0

Kocoń, Jan; Kazienko, Przemysław

Description

We tested ChatGPT on 25 tasks that cover common NLP problems and require analytical reasoning. The tasks include: (1) relatively simple binary text classification, such as detecting spam, humor, sarcasm, or aggression, or judging the grammatical correctness of a text; (2) more complex multiclass and multi-label text classification, such as sentiment analysis and emotion recognition; (3) reasoning with personal context, i.e., personalized versions of the problems that use additional information about how a given user perceives texts (the user’s examples are provided to ChatGPT); (4) semantic annotation and acceptance of the text, moving towards natural language understanding (NLU), such as word sense disambiguation (WSD); and (5) answering questions based on the input text. More information is available in the paper: https://www.sciencedirect.com/science/article/pii/S156625352300177X

Metrics

Dataset Index

2.2

FAIR Score

77%

Citations

1

Mentions

0

Publication Details

DOI

Publisher

Zenodo

Assigned Domain

Subfield

Artificial Intelligence

Field

Computer Science

Domain

Physical Sciences

Confidence Score

52%

Source

Scholar Data Model

Keywords

Artificial intelligence, Machine learning, Natural language processing, Large language models, ChatGPT

Normalization Factors

FT

13.46

CTw

1.00

MTw

1.00