Published on 16 October 2025
KinFragLib: Combinatorial library
Description
KinFragLib: Exploring the Kinase Inhibitor Space Using Subpocket-Focused Fragmentation and Recombination.

Project description.

Protein kinases play a crucial role in many cell signaling processes, making them one of the most important families of drug targets. In this context, fragment-based drug design strategies have been successfully applied to develop novel kinase inhibitors, usually following a knowledge-driven approach to optimize a focused set of fragments into a potent kinase inhibitor.

Alternatively, KinFragLib is a new method for exploring and extending the chemical space of kinase inhibitors using data-driven fragmentation and recombination. It is built on available structural kinome data from the KLIFS database for over 3,200 kinase DFG-in complexes. The computational fragmentation method splits the co-crystallized non-covalent kinase inhibitors into fragments according to their 3D proximity to six predefined, functionally relevant subpocket centers. The resulting fragment library consists of six subpocket pools with over 9,000 fragments, available at https://github.com/volkamerlab/KinFragLib.

KinFragLib offers two main applications: (i) in-depth analyses of the chemical space of known kinase inhibitors, subpocket characteristics and connections, and (ii) subpocket-informed recombination of fragments to generate potential novel inhibitors. For the latter, recombining only a subset of 722 representative fragments generated a combinatorial library of 11.3 million molecules. Besides some known kinase inhibitors, more than 99% of this library is novel chemical matter compared to ChEMBL, and 56% of its molecules comply with Lipinski's rule of five.

Combinatorial library dataset.

The dataset offered here is part of the KinFragLib GitHub repository (https://github.com/volkamerlab/KinFragLib) and contains the metadata and properties of the KinFragLib combinatorial library.

1. Raw data

combinatorial_library.json: Full combinatorial library. For detailed information about this data format, refer to notebooks/4_1_combinatorial_library_data_preparation.ipynb at https://github.com/volkamerlab/KinFragLib.
combinatorial_library_deduplicated.json: Deduplicated combinatorial library (based on InChIs).
chembl_standardized_inchi.csv: Standardized ChEMBL 36 molecules in the form of InChI strings.
klifs_download_summary.csv: PDB codes of all KLIFS structures used to generate the KinFragLib fragmentation library.

2. Processed data

Data extracted from combinatorial_library_deduplicated.json, as performed in notebooks/4_1_combinatorial_library_data_preparation.ipynb at https://github.com/volkamerlab/KinFragLib.

n_atoms.csv: Number of atoms for each recombined ligand.
ro5.csv: Number of ligands that fulfill Lipinski's rule of five (Ro5) and each of its individual criteria; total number of ligands.
subpockets.csv: Number of ligands per subpocket combination.
original_exact.json: Ligands with exact matches among the original ligands, i.e. the KLIFS ligands that were used for the fragmentation.
original_substructure.json: Ligands with substructure matches among the original ligands, i.e. the KLIFS ligands that were used for the fragmentation.
chembl_exact.json: Ligands with exact matches in ChEMBL.
chembl_most_similar.json: Most similar ligand in ChEMBL for each recombined ligand.
chembl_highly_similar.json: Most similar ligand in ChEMBL for each recombined ligand, restricted to similarities greater than 0.9.

Usage.

This dataset can be used to run the notebooks available at https://github.com/volkamerlab/KinFragLib.

Clone the KinFragLib repository.
Download the tar.bz2 file provided here.
Extract the archive content to the combinatorial library folder in your local KinFragLib folder and run the notebooks.

    tar -xvf combinatorial_library.tar.bz2 -C /path_to_kinfraglib/data/combinatorial_library/

Citation.

This dataset is part of the KinFragLib publication:

Sydow, D., Schmiel, P., Mortier, J., and Volkamer, A. KinFragLib: Exploring the Kinase Inhibitor Space Using Subpocket-Focused Fragmentation and Recombination. J. Chem. Inf. Model. 2020. https://pubs.acs.org/doi/abs/10.1021/acs.jcim.0c00839
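For readers who prefer to script the extraction step from the Usage section, the following is a minimal Python sketch using only the standard library. The function names `extract_library` and `missing_files` are hypothetical helpers, not part of KinFragLib. The file names are taken from the raw and processed data lists above; whether the archive places them at its top level or inside a subfolder is not stated here, so the check searches recursively.

```python
import tarfile
from pathlib import Path

# File names as listed in the dataset description (raw and processed data).
EXPECTED_FILES = [
    "combinatorial_library.json",
    "combinatorial_library_deduplicated.json",
    "chembl_standardized_inchi.csv",
    "klifs_download_summary.csv",
    "n_atoms.csv",
    "ro5.csv",
    "subpockets.csv",
    "original_exact.json",
    "original_substructure.json",
    "chembl_exact.json",
    "chembl_most_similar.json",
    "chembl_highly_similar.json",
]


def extract_library(archive, target_dir):
    """Extract a bzip2-compressed tar archive into target_dir.

    Hypothetical helper mirroring `tar -xvf ... -C ...`; only use it on
    archives from a source you trust.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, mode="r:bz2") as tar:
        tar.extractall(path=target)
    return target


def missing_files(target_dir, expected=EXPECTED_FILES):
    """Return the expected file names not found anywhere below target_dir.

    The archive layout is an assumption, so the search is recursive.
    """
    present = {p.name for p in Path(target_dir).rglob("*") if p.is_file()}
    return [name for name in expected if name not in present]
```

A call such as `missing_files("/path_to_kinfraglib/data/combinatorial_library/")` returning an empty list would suggest that the archive was unpacked into the expected folder before the notebooks are run.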
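The ro5.csv file described above counts ligands that fulfill Lipinski's rule of five and its individual criteria. The sketch below illustrates what such a check can look like on precomputed descriptors. It uses the commonly cited Lipinski thresholds (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors). The `Descriptors` class, the function names, and the rule that a ligand counts as compliant when at least three of the four criteria hold are assumptions made for illustration; this page does not state the exact counting rule used for the dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed descriptors for one ligand (hypothetical container)."""
    mol_weight: float  # molecular weight in Da
    logp: float        # octanol-water partition coefficient
    hbd: int           # number of hydrogen bond donors
    hba: int           # number of hydrogen bond acceptors


def ro5_criteria(d):
    """Evaluate each Lipinski criterion separately."""
    return {
        "mw": d.mol_weight <= 500,
        "logp": d.logp <= 5,
        "hbd": d.hbd <= 5,
        "hba": d.hba <= 10,
    }


def fulfills_ro5(d, min_fulfilled=3):
    """Return True if at least min_fulfilled criteria hold.

    min_fulfilled=3 is an assumption; set it to 4 for a strict check.
    """
    return sum(ro5_criteria(d).values()) >= min_fulfilled


if __name__ == "__main__":
    ligand = Descriptors(mol_weight=650.0, logp=3.0, hbd=2, hba=5)
    print(ro5_criteria(ligand))
    print(fulfills_ro5(ligand), fulfills_ro5(ligand, min_fulfilled=4))
```

Keeping the per-criterion result separate from the overall verdict mirrors the layout described for ro5.csv, where both the individual criteria and the overall Ro5 count are reported.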
Publication Details
Subfield: Computational Theory and Mathematics
Field: Computer Science
Domain: Physical Sciences
Confidence Score: 79%
Source: OpenAlex