Published on 01 January 2026

Enhanced GoCJ: Google Cloud Jobs Dataset

Nawaz, Mohsin;Hussain, Altaf

Description

The GoCJ dataset consists of multiple files, each containing the sizes of a specified number of jobs expressed in Million Instructions (MI) and derived from workload behaviors observed in Google cluster traces. The name of each file indicates the number of jobs it contains; for example, GoCJ_Dataset_1000 includes 1000 jobs along with their associated SLA classes and arrival times.

In this study, a modified version of the GoCJ dataset is employed. Each dataset file consists of three columns: (i) job length in Million Instructions (MI), (ii) Service Level Agreement class (SLA: {1, 2, 3}), representing different priority levels, and (iii) job arrival time, which captures realistic workload submission behavior.

The experimental evaluation is conducted using the following dataset files: GoCJ_Dataset_1000.csv, GoCJ_Dataset_2000.csv, GoCJ_Dataset_3000.csv, GoCJ_Dataset_4000.csv, GoCJ_Dataset_5000.csv, and GoCJ_Dataset_6000.csv, enabling performance analysis under increasing workload scales.

The file Original_Enhanced_Dataset.txt contains the 50 seed job sizes required as input for both the Java-based generator (EnhancedGoCJGenerator.java) and the Excel-based generator (GoCJ_Enhanced_Generator.xlsx) to reproduce datasets of any desired size while preserving the original workload distribution properties.
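As an illustration of the three-column layout described above, the following Python sketch parses rows of job length, SLA class, and arrival time and summarizes them per SLA class. It is not part of the published dataset tooling. It assumes comma-separated values in the column order given in the description and no header row (pass `has_header=True` otherwise); the names `Job`, `read_jobs`, and `SAMPLE`, as well as the sample values, are hypothetical and chosen only for the example.

```python
import csv
import io
from collections import Counter
from typing import NamedTuple


class Job(NamedTuple):
    length_mi: float  # job length in Million Instructions (MI)
    sla: int          # SLA class, expected to be 1, 2, or 3
    arrival: float    # arrival time (unit is not stated in the description)


def read_jobs(handle, has_header=False):
    """Parse (length, SLA, arrival time) rows from an open text handle.

    Assumes comma-separated columns in the order given in the description.
    """
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # skip a header row if the file has one
    jobs = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        length, sla, arrival = row[:3]
        sla = int(float(sla))
        if sla not in (1, 2, 3):
            raise ValueError(f"unexpected SLA class: {sla}")
        jobs.append(Job(float(length), sla, float(arrival)))
    return jobs


# Made-up rows for illustration only; these values are not from the dataset.
SAMPLE = "15000,1,0.0\n59000,3,1.5\n101000,2,2.25\n"

jobs = read_jobs(io.StringIO(SAMPLE))
per_class = Counter(job.sla for job in jobs)      # number of jobs per SLA class
total_mi = sum(job.length_mi for job in jobs)     # total workload in MI
```

To apply the same function to one of the dataset files, open it with `open("GoCJ_Dataset_1000.csv", newline="")` and pass the handle to `read_jobs`, adjusting `has_header` to match the actual file.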

Publication Details

DOI

Publisher

Mendeley Data

Keywords

Cloud Computing; Cloud Computing Environment