Published on 01 January 2018

Image and sound data from film Fantasia produced by Walt Disney

Martín-Gómez, Lucía; Pérez-Marcos, Javier

Description

This repository contains the data used in the article Convolutional neural networks and transfer learning applied to automatic composition of descriptive music published in the 15th International Conference on Distributed Computing and Artificial Intelligence (DCAI). Data structure is explained in detail in the article. This proposal is the continuation of an earlier work whose data are available in a GitHub repository.
Abstract
Visual and musical arts have been strongly interconnected throughout history. The aim of this work is to compose music on the basis of the visual characteristics of a video. For this purpose, descriptive music is used as a link between image and sound, and a video fragment of the film Fantasia is analyzed in depth. Specifically, convolutional neural networks in combination with transfer learning are applied to extract image descriptors. In order to establish a relationship between the visual and musical information, Naive Bayes, Support Vector Machine and Random Forest classifiers are applied. The resulting model is subsequently employed to compose descriptive music from a new video. The results of this proposal are compared with those of an earlier work in order to evaluate the performance of the classifiers and the quality of the descriptive musical composition.

DATA

train_data.arff: Image descriptors and the most important sound of each frame from the fragment "The Nutcracker Suite" in the film Fantasia, obtained by means of CNNs. Data stored in ARFF format.

test_data.arff: Image descriptors of each frame from the fragment "The Firebird" in the film Fantasia 2000, obtained by means of CNNs. Data stored in ARFF format.

midi.csv: Frame number of the fragment "The Firebird" in the film Fantasia 2000 and the sound predicted by the system, encoded in MIDI. Data stored in CSV format.

firebird_prediction.mp3: Audio file synthesized from the prediction data for the fragment "The Firebird" of the film Fantasia 2000.

LICENSE

Data is available under the MIT License. To make use of the data, the article must be cited.
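The ARFF files listed under DATA could be read along the following lines. This is only a minimal sketch using the Python standard library: it assumes a simple dense ARFF layout (no sparse rows, no quoted attribute names) and assumes that the sound label is the last attribute, neither of which is confirmed by this page. The function name `read_arff` and the attribute names in the sample are hypothetical; the actual data structure is described in the article.

```python
import csv
import io


def read_arff(stream):
    """Parse a simple dense ARFF stream into (attribute_names, rows).

    Assumption: no sparse rows and no quoted attribute names.
    """
    names, rows, in_data = [], [], False
    for raw in stream:
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        if not in_data:
            lower = line.lower()
            if lower.startswith("@attribute"):
                # "@attribute <name> <type>": keep only the name
                names.append(line.split(None, 2)[1])
            elif lower.startswith("@data"):
                in_data = True
            continue
        # Data rows are comma-separated values
        rows.append([value.strip() for value in next(csv.reader([line]))])
    return names, rows


# Hypothetical stand-in for train_data.arff (attribute names are invented).
SAMPLE = """\
@relation demo
@attribute f0 numeric
@attribute f1 numeric
@attribute sound {60,62}
@data
0.1,0.5,60
0.3,0.2,62
"""

names, rows = read_arff(io.StringIO(SAMPLE))
# Assumption: the last column holds the sound label, the rest are descriptors.
features = [[float(v) for v in row[:-1]] for row in rows]
labels = [row[-1] for row in rows]
```

To try this on the real file, the `io.StringIO(SAMPLE)` argument would be replaced by an open file handle for train_data.arff. If SciPy is available, `scipy.io.arff.loadarff` may be a more robust alternative to this hand-written parser.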

Metrics

Dataset Index

0.7

FAIR Score

13%

Citations

1

Mentions

0

Publication Details

DOI

Publisher

figshare

Assigned Domain

Subfield

Gender Studies

Field

Social Sciences

Domain

Social Sciences

Confidence Score

93%

Source

Open Alex

Keywords

80199 Artificial Intelligence and Image Processing not elsewhere classified; FOS: Computer and information sciences; Artificial Intelligence and Image Processing; 80106 Image Processing; 80704 Information Retrieval and Web Search; FOS: Media and communications

Normalization Factors

FT

13.46

CTw

1.00

MTw

1.00