Large Language Model-Driven Narrative Generation Study Data: ChatGPT-Generated Narratives, Real Tweets, and Source Code
Description
In the interest of advancing the use of Large Language Models (LLMs) in engineering, science, medicine, and other fields, we provide the data sets and code associated with the Structured Narrative Prompt for LLMs Study. Data for this study were generated using an Agent-Based Model (ABM) and the LLM ChatGPT, together with a set of tweets previously collected from Twitter. To facilitate reproducibility, transparency, and reuse of our work, this repository includes:

(1) Simulation-related code and data for generating simulated agents' life events
    (a) output from the Java ABM simulation, including the ABM-generated narratives and associated life-event information
(2) ChatGPT-related code and data
    (a) the Python script that generates structured prompts for ChatGPT from the ABM-generated life events
    (b) the set of generated structured prompts (inputs) for ChatGPT (used to generate the LLM narratives)
    (c) the Python script that submits the structured prompts to ChatGPT via the API
    (d) the set of ChatGPT-generated narratives
    (e) the Python script that combines ChatGPT (output) narratives with the ABM simulation narratives, in preparation for PANAS sentiment analysis
(3) Analysis-related code and data
    (a) the PANAS sentiment analysis R scripts
    (b) the statistical significance test R scripts (Chi-squared test and Fisher's exact test), used for finding significant differences in sentiment scoring among ABM-generated narratives, LLM-generated narratives, and the real tweets
    (c) the PANAS lexicon used for the sentiment analysis
    (d) the set of utilized tweets with PII removed
    (e) the approved IRB documentation for collecting those tweets

Folder Names/Breakdown for Data File section:
1. LLM-related Scripts and Data: LLM_Phase_Scripts_and_Data.zip
2. Analysis-related Scripts and Data: Analysis_Phase_Scripts_and_Data.zip
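The description mentions a Chi-squared test for comparing sentiment scoring across the three narrative sources. As a rough illustration of what that statistic measures, the sketch below computes the Pearson chi-squared statistic for a small contingency table in plain Python. It is not the study's R code: the function name, the table layout (one row per source, one column per sentiment category), and the counts are all hypothetical and chosen only so the arithmetic is easy to follow.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic for an r x c table of observed counts.

    Assumes every row and column total is non-zero, so that no expected
    count is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of source and sentiment.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical toy counts, NOT taken from the dataset.
# Rows: ABM narratives, LLM narratives, tweets (assumed grouping).
# Columns: positive, negative (assumed sentiment categories).
toy_counts = [
    [30, 20],
    [20, 30],
    [25, 25],
]

stat = chi_squared_statistic(toy_counts)
dof = (len(toy_counts) - 1) * (len(toy_counts[0]) - 1)
print(f"chi-squared = {stat}, degrees of freedom = {dof}")
```

Turning the statistic into a p-value requires the chi-squared distribution, which a statistics package would normally supply. Fisher's exact test, also named in the description, is the usual alternative when expected counts are small.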
Publication Details
Subfield: Rheumatology
Field: Medicine
Domain: Health Sciences
Confidence Score: 57%
Source: OpenAlex