Automated Author Profile

Dorigatti, Emilio

Boehringer Ingelheim (Germany)
0000-0002-6829-7766

Current S-Index

5.6

Sum of Dataset Indices for all datasets

Average Dataset Index per Dataset

1.9

Average Dataset Index per dataset

Total Datasets

3

Total datasets for this author

Average FAIR Score

75.6%

Average FAIR Score per dataset

Total Citations

0

Total citations to the author's datasets

Total Mentions

0

Total mentions of the author's datasets

[Panels not shown: S-Index Interpretation; S-Index Over Time; Cumulative Citations Over Time; Cumulative Mentions Over Time]

Datasets

Enhancing automated drug substance impurity structure elucidation from tandem mass spectra through transfer learning and domain knowledge.

Liquid chromatography-tandem mass spectrometry (LC-MS/MS) is an essential analytical technique in the pharmaceutical industry, used particularly for elucidating the structure of unknown impurities in the synthesis of active pharmaceutical ingredients. However, the interpretation of mass spectra is challenging and time-consuming, requiring significant expertise. While recent computational tools aimed at automating this process have been developed, their accuracy in determining the chemical structure is limited. In this paper, we introduce a new method called SEISMiQ for elucidating unknown impurities from their MS/MS spectra. We are able to significantly improve elucidation accuracy by integrating domain experts’ knowledge, specifically the impurity sum formula and known substructure, into the model's training and inference process. Further performance improvements can be achieved through transfer learning from simulated MS/MS spectra of impurities from an in-house database. Finally, the need for any experimental data collection for finetuning can be circumvented by simulating the entire drug substance synthesis process in silico via reaction templates.

Authors

  • Dorigatti, Emilio
  • Groß, Jonathan
  • Kühlborn, Jonas
  • Möckel, Robert
  • Maier, Frank
  • Keupp, Julian

0 Citations | 0 Mentions | 77% FAIR | 1.9 Dataset Index
DOI: 10.5281/zenodo.15790300 (2025)

Enhancing automated drug substance impurity structure elucidation from tandem mass spectra through transfer learning and domain knowledge.

Liquid chromatography-tandem mass spectrometry (LC-MS/MS) is an essential analytical technique in the pharmaceutical industry, used particularly for elucidating the structure of unknown impurities in the synthesis of active pharmaceutical ingredients. However, the interpretation of mass spectra is challenging and time-consuming, requiring significant expertise. While recent computational tools aimed at automating this process have been developed, their accuracy in determining the chemical structure is limited. In this paper, we introduce a new method called SEISMiQ for elucidating unknown impurities from their MS/MS spectra. We are able to significantly improve elucidation accuracy by integrating domain experts’ knowledge, specifically the impurity sum formula and known substructure, into the model's training and inference process. Further performance improvements can be achieved through transfer learning from simulated MS/MS spectra of impurities from an in-house database. Finally, the need for any experimental data collection for finetuning can be circumvented by simulating the entire drug substance synthesis process in silico via reaction templates.

Authors

  • Dorigatti, Emilio
  • Groß, Jonathan
  • Kühlborn, Jonas
  • Möckel, Robert
  • Maier, Frank
  • Keupp, Julian

0 Citations | 0 Mentions | 73% FAIR | 1.8 Dataset Index
DOI: 10.5281/zenodo.16438770 (2025)

Enhancing automated drug substance impurity structure elucidation from tandem mass spectra through transfer learning and domain knowledge.

Liquid chromatography-tandem mass spectrometry (LC-MS/MS) is an essential analytical technique in the pharmaceutical industry, used particularly for elucidating the structure of unknown impurities in the synthesis of active pharmaceutical ingredients. However, the interpretation of mass spectra is challenging and time-consuming, requiring significant expertise. While recent computational tools aimed at automating this process have been developed, their accuracy in determining the chemical structure is limited. In this paper, we introduce a new method called SEISMiQ for elucidating unknown impurities from their MS/MS spectra. We are able to significantly improve elucidation accuracy by integrating domain experts’ knowledge, specifically the impurity sum formula and known substructure, into the model's training and inference process. Further performance improvements can be achieved through transfer learning from simulated MS/MS spectra of impurities from an in-house database. Finally, the need for any experimental data collection for finetuning can be circumvented by simulating the entire drug substance synthesis process in silico via reaction templates.

Authors

  • Dorigatti, Emilio
  • Groß, Jonathan
  • Kühlborn, Jonas
  • Möckel, Robert
  • Maier, Frank
  • Keupp, Julian

0 Citations | 0 Mentions | 77% FAIR | 1.9 Dataset Index
DOI: 10.5281/zenodo.15790301 (2025)
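
The summary figures at the top of the profile are described as a sum and two averages over the datasets, so they can be cross-checked against the three entries listed above. The following is a minimal sketch that assumes the per-dataset values shown on the page (which are already rounded) are the inputs; the variable names are illustrative and not part of the profile.

```python
# Per-dataset values as displayed in the listing above (rounded on the page).
dataset_indices = [1.9, 1.8, 1.9]   # Dataset Index of each dataset
fair_scores = [77, 73, 77]          # FAIR score of each dataset, in percent

# S-Index: sum of Dataset Indices for all datasets.
s_index = sum(dataset_indices)

# Average Dataset Index per dataset.
avg_index = s_index / len(dataset_indices)

# Average FAIR Score per dataset.
avg_fair = sum(fair_scores) / len(fair_scores)

print(f"S-Index: {s_index:.1f}")                  # 5.6
print(f"Average Dataset Index: {avg_index:.1f}")  # 1.9
print(f"Average FAIR Score: {avg_fair:.1f}%")     # 75.7%
```

The S-Index (5.6) and the Average Dataset Index (1.9) match the profile. The average FAIR score computed from the rounded values is 75.7%, slightly above the 75.6% shown; the likely explanation is that the page averages unrounded per-dataset scores, but that is an assumption.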