Published on 01 January 2026
Central Kurdish Linguistic Text Cryptography Dataset (CKLTCD): A Dataset for Statistical Feature-Based and AI-Driven Cryptographic Key Generation Using SHA-FFNN with AES-GCM-256 Encryption and Comparative Evaluation with PBKDF2, Argon2, and HKDF (3000 Texts)
Description
The Central Kurdish Linguistic Text Cryptography Dataset (CKLTCD) consists of 3000 Central Kurdish texts designed for AI-driven cryptographic research. Statistical linguistic features are extracted from each text, transformed via SHA-256, and processed using a Feedforward Neural Network (FFNN) to generate nonlinear cryptographic keys within an AES-GCM-256 framework. The dataset supports ciphertext-based evaluation with a randomized salt per text and a fixed global FFNN configuration. It also enables comparative analysis with standard key derivation functions (PBKDF2, Argon2, and HKDF). Security is assessed using NIST SP800-22 tests on concatenated ciphertext streams, and key sensitivity is evaluated at the ciphertext level under one-bit key variations.

Important points:
1- All NIST SP800-22 results in this workbook were computed on concatenated ciphertext bitstreams for each method, not on generated keys.
2- Random salt values were generated for every text in all four methods.
3- A single global random FFNN (W1, b1, W2, b2) was generated and fixed for the whole experiment; all parameter values are stored in sheet 03_FFNN_Random_Params.
4- Key sensitivity was computed at the ciphertext level, using the ciphertext change after a 1-bit key modification, instead of measuring the bit change inside the key itself.
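The two ciphertext-level measurements described above can be illustrated with a minimal Python sketch. It is not the code used to build the workbook. The helper names (`flip_bit`, `bit_change_rate`, `monobit_p_value`) are hypothetical, and only the frequency (monobit) test of NIST SP800-22 is shown, as one representative of the suite. Encryption is outside the sketch: ciphertexts are assumed to be available as byte strings produced elsewhere with AES-GCM-256.

```python
import math


def flip_bit(key: bytes, index: int) -> bytes:
    """Return a copy of `key` with the bit at position `index` inverted."""
    out = bytearray(key)
    out[index // 8] ^= 1 << (index % 8)
    return bytes(out)


def bit_change_rate(a: bytes, b: bytes) -> float:
    """Fraction of bit positions that differ between two equal-length byte strings.

    For ciphertext-level key sensitivity, `a` and `b` would be the ciphertexts
    of the same text under the original key and under the 1-bit-modified key.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return differing / (8 * len(a))


def monobit_p_value(stream: bytes) -> float:
    """P-value of the NIST SP800-22 frequency (monobit) test for a bitstream.

    `stream` is assumed to be the concatenation of all ciphertexts of one
    method, for example b"".join(ciphertexts).
    """
    n = 8 * len(stream)
    if n == 0:
        raise ValueError("stream must not be empty")
    ones = sum(bin(x).count("1") for x in stream)
    s_n = 2 * ones - n                 # sum of bits mapped to +1 / -1
    s_obs = abs(s_n) / math.sqrt(n)    # normalized test statistic
    return math.erfc(s_obs / math.sqrt(2))
```

Under these assumptions, a bit change rate near 0.5 between the two ciphertexts would indicate strong sensitivity to a one-bit key change, and a monobit P-value of at least 0.01 is the usual SP800-22 pass threshold. The standard recommends at least 100 bits of input for this test, which concatenated ciphertext streams easily exceed.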