Automated Author Profile

Martianus Henry, Matthew

Bina Nusantara University

Current S-Index

3.2

Sum of Dataset Indices for all datasets

Average Dataset Index per Dataset

1.6

Average Dataset Index per dataset

Total Datasets

2

Total datasets for this author

Average FAIR Score

65.4%

Average FAIR Score per dataset

Total Citations

0

Total citations to the author's datasets

Total Mentions

0

Total mentions of the author's datasets
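
The S-Index is described above as the sum of the Dataset Indices for all of the author's datasets. Below is a minimal sketch of that arithmetic in Python, using the two Dataset Index values listed on this page. The variable names are illustrative, and how each Dataset Index is itself computed is not covered here.

```python
# Dataset Index values as listed for the two dataset records on this profile.
dataset_indices = [1.6, 1.6]

# S-Index: sum of Dataset Indices across all of the author's datasets.
s_index = sum(dataset_indices)

# Average Dataset Index per dataset: S-Index divided by the dataset count.
average_index = s_index / len(dataset_indices)

print(f"S-Index: {s_index:.1f}")
print(f"Average Dataset Index: {average_index:.1f}")
```

With two records at 1.6 each, this matches the reported S-Index of 3.2 and average of 1.6.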

S-Index Interpretation

[Charts not reproduced: S-Index Over Time; Cumulative Citations Over Time; Cumulative Mentions Over Time]

Datasets

EmoTweetID: Indonesian Emotion Tweet Dataset

The EmoTweetID dataset is a publicly available resource of Indonesian tweets collected from X (formerly Twitter) using emotion-related keywords. The dataset consists of three main components:

  1. EmoTweetID-Corpus.csv: 3,126,987 unlabeled tweets for unsupervised tasks such as word embedding construction.
  2. EmoTweetID-Lexicon.csv: 2,243 tweets automatically annotated using the Indonesian NRC EmoLex.
  3. EmoTweetID-Human.csv: 2,243 tweets manually annotated by three psychology students, with inter-annotator agreement measured using Cohen’s and Fleiss’ Kappa.

Both annotated files (EmoTweetID-Lexicon.csv and EmoTweetID-Human.csv) provide labels following Ekman’s six basic emotions: anger, disgust, fear, joy, sadness, and surprise.

Additionally, two pre-trained word embedding models (Word2Vec and FastText) trained on the corpus, TweetID-Word2Vec.zip and TweetID-FastText.zip, are provided for downstream NLP tasks.

This dataset offers a benchmark for affective computing and natural language processing in Indonesian, supporting research in emotion recognition, social media analysis, and the development of empathetic AI systems.
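
The description above notes that agreement among the human annotators was measured with Cohen’s and Fleiss’ Kappa. As a rough illustration of the pairwise case only, here is a minimal pure-Python sketch of Cohen’s kappa for two annotators. The function name and the toy labels are hypothetical and are not taken from EmoTweetID-Human.csv; the dataset authors' actual procedure is not shown on this page.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy labels drawn from Ekman's six emotions (illustration only).
annotator_1 = ["joy", "anger", "fear", "joy", "sadness", "surprise"]
annotator_2 = ["joy", "anger", "joy", "joy", "sadness", "disgust"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

With three annotators, Cohen’s kappa would be computed per annotator pair, while Fleiss’ kappa gives a single agreement figure across all three.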

Authors

  • Setyo Nugroho, Kuncahyo
  • Abdurrachman Bachtiar, Fitra
  • Firdaus Mahmudy, Wayan
  • Martianus Henry, Matthew
  • Isnan, Mahmud
  • Pangestu, Gusti
  • Pardamean, Bens
0 Citations · 0 Mentions · 65% FAIR · 1.6 Dataset Index
10.17632/jzgnjsff9f
2025

EmoTweetID: Indonesian Emotion Tweet Dataset

The EmoTweetID dataset is a publicly available resource of Indonesian tweets collected from X (formerly Twitter) using emotion-related keywords. The dataset consists of three main components:

  1. EmoTweetID-Corpus.csv: 3,126,987 unlabeled tweets for unsupervised tasks such as word embedding construction.
  2. EmoTweetID-Lexicon.csv: 2,243 tweets automatically annotated using the Indonesian NRC EmoLex.
  3. EmoTweetID-Human.csv: 2,243 tweets manually annotated by three psychology students, with inter-annotator agreement measured using Cohen’s and Fleiss’ Kappa.

Both annotated files (EmoTweetID-Lexicon.csv and EmoTweetID-Human.csv) provide labels following Ekman’s six basic emotions: anger, disgust, fear, joy, sadness, and surprise.

Additionally, two pre-trained word embedding models (Word2Vec and FastText) trained on the corpus, TweetID-Word2Vec.zip and TweetID-FastText.zip, are provided for downstream NLP tasks.

This dataset offers a benchmark for affective computing and natural language processing in Indonesian, supporting research in emotion recognition, social media analysis, and the development of empathetic AI systems.

Authors

  • Setyo Nugroho, Kuncahyo
  • Abdurrachman Bachtiar, Fitra
  • Firdaus Mahmudy, Wayan
  • Martianus Henry, Matthew
  • Isnan, Mahmud
  • Pangestu, Gusti
  • Pardamean, Bens
0 Citations · 0 Mentions · 65% FAIR · 1.6 Dataset Index
10.17632/jzgnjsff9f.1
2025