Scholar Data

Enhancing automated drug substance impurity structure elucidation from tandem mass spectra through transfer learning and domain knowledge.

Liquid chromatography-tandem mass spectrometry (LC-MS/MS) is an essential analytical technique in the pharmaceutical industry, used particularly for elucidating the structure of unknown impurities in the synthesis of active pharmaceutical ingredients. However, the interpretation of mass spectra is challenging and time-consuming, requiring significant expertise. While recent computational tools aimed at automating this process have been developed, their accuracy in determining the chemical structure is limited. In this paper, we introduce a new method called SEISMiQ for elucidating unknown impurities from their MS/MS spectra. We are able to significantly improve elucidation accuracy by integrating domain experts’ knowledge, specifically the impurity sum formula and known substructure, into the model's training and inference process. Further performance improvements can be achieved through transfer learning from simulated MS/MS spectra of impurities from an in-house database. Finally, the need for any experimental data collection for finetuning can be circumvented by simulating the entire drug substance synthesis process in silico via reaction templates.

Authors

Dorigatti, Emilio ;
Groß, Jonathan ;
Kühlborn, Jonas ;
Möckel, Robert ;
Maier, Frank ;
Keupp, Julian

0 Citations | 0 Mentions | 77% FAIR | 1.9 Dataset Index

10.5281/zenodo.15790300 (2025)

Enhancing automated drug substance impurity structure elucidation from tandem mass spectra through transfer learning and domain knowledge.

Authors

Dorigatti, Emilio ;
Groß, Jonathan ;
Kühlborn, Jonas ;
Möckel, Robert ;
Maier, Frank ;
Keupp, Julian

0 Citations | 0 Mentions | 73% FAIR | 1.8 Dataset Index

10.5281/zenodo.16438770 (2025)

Enhancing automated drug substance impurity structure elucidation from tandem mass spectra through transfer learning and domain knowledge.

Authors

Dorigatti, Emilio ;
Groß, Jonathan ;
Kühlborn, Jonas ;
Möckel, Robert ;
Maier, Frank ;
Keupp, Julian

0 Citations | 0 Mentions | 77% FAIR | 1.9 Dataset Index

10.5281/zenodo.15790301 (2025)

Automated Author Profile
Dorigatti, Emilio
Boehringer Ingelheim (Germany)
0000-0002-6829-7766

