
Chemical shift prediction in 13C NMR spectroscopy using ensembles of message passing neural networks (MPNNs)


Title: Chemical shift prediction in 13C NMR spectroscopy using ensembles of message passing neural networks (MPNNs)
Authors: D. Williamson, S. Ponte, I. Iglesias, N. Tonge, C. Cobas, E.K. Kemsley
Date: November 2024
Reference: Journal of Magnetic Resonance, Volume 368, November 2024, 107795
DOI: 10.1016/j.jmr.2024.107795
Download link: https://www.sciencedirect.com/science/article/pii/S1090780724001794?via%3Dihub

ABSTRACT

This study reports a deep learning approach that utilises message passing neural networks (MPNNs) for predicting chemical shifts in 13C NMR spectra of small molecules. MPNNs were trained on two distinct datasets: one with approximately 4000 labelled structures and another with over 40,000. To reduce stochastic variation, an ensemble framework was implemented, which is simple to deploy on multiple nodes of a High-Performance Computing facility.
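As a rough illustration of the ensemble idea, the sketch below averages per-atom predictions from several independently trained models and reports their spread. It is a minimal sketch, not the authors' implementation: the function name `ensemble_predict`, the use of a plain mean as the combination rule, and the example shift values are all assumptions made for illustration.

```python
from statistics import mean, stdev


def ensemble_predict(member_predictions):
    """Combine per-atom shift predictions (ppm) from several ensemble members.

    member_predictions: list of lists, one inner list per member, with all
    members reporting the same atoms in the same order.
    Returns (means, spreads): the per-atom mean and sample standard deviation.
    Averaging is an assumed combination rule, chosen here for simplicity.
    """
    n_atoms = len(member_predictions[0])
    if any(len(p) != n_atoms for p in member_predictions):
        raise ValueError("all members must predict the same number of atoms")
    # Transpose so each tuple holds every member's prediction for one atom.
    per_atom = list(zip(*member_predictions))
    means = [mean(values) for values in per_atom]
    spreads = [stdev(values) if len(values) > 1 else 0.0 for values in per_atom]
    return means, spreads


# Hypothetical predictions from three members for a three-carbon molecule.
members = [
    [14.1, 22.9, 128.4],
    [13.9, 23.1, 128.9],
    [14.0, 23.0, 128.5],
]
means, spreads = ensemble_predict(members)
```

Because each member only needs its own training run and a final averaging step, the members can be trained independently, which is consistent with the abstract's remark that the framework is simple to deploy across multiple compute nodes.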
The results emphasise the critical role of training set size and diversity. While prediction performance was comparable on test sets drawn from each dataset, the ensemble trained on the larger dataset retained its accuracy when these sets were crossed over, and when applied to a further collection of approximately 12,000 previously unseen structures introduced after all development work had been completed. In contrast, the ensemble trained on the smaller dataset showed a notable decline in generalisation ability. This difference is attributed to the greater diversity of atomic environments captured in the larger dataset.
The larger dataset also enabled more robust modelling of various error properties, providing a quantitative foundation for spectral assignment and verification. This was achieved in two ways. First, a clear relationship was observed between prediction errors and the frequency of different node feature vectors in the training data, allowing error estimates to be associated with individual nodes based on their type. These estimates can be used as weights in a modified cityblock distance metric when assigning observed to predicted shifts. Second, the mean absolute prediction error calculated at the structure level is well-fitted by a Gaussian kernel cumulative distribution. This enabled a probabilistic assessment of whether the predicted shifts and assigned observations are consistent with originating from the same molecular structure.
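The two uses of the error model described above can be sketched as follows. This is an illustration under stated assumptions rather than the published method: it assumes the weights are the reciprocals of per-node error estimates, that assignment means choosing the pairing with the smallest weighted distance, and that the "Gaussian kernel cumulative distribution" is an average of normal CDFs centred on reference structure-level errors. All function names, the bandwidth, and the numerical values are hypothetical.

```python
from itertools import permutations
from math import erf, sqrt


def weighted_cityblock(observed, predicted, sigmas):
    """Cityblock distance with each term scaled by 1/sigma (assumed weighting)."""
    return sum(abs(o - p) / s for o, p, s in zip(observed, predicted, sigmas))


def assign_shifts(observed, predicted, sigmas):
    """Reorder observed shifts to best match the predicted ones.

    Brute force over all permutations, so only viable for a handful of atoms;
    a linear assignment solver would be the usual choice for real molecules.
    """
    if not (len(observed) == len(predicted) == len(sigmas)):
        raise ValueError("inputs must have equal length")
    best = min(
        permutations(observed),
        key=lambda perm: weighted_cityblock(perm, predicted, sigmas),
    )
    return list(best)


def kernel_cdf(x, samples, bandwidth):
    """Gaussian kernel CDF: the mean of normal CDFs centred on each sample."""
    total = sum(
        0.5 * (1.0 + erf((x - s) / (bandwidth * sqrt(2.0)))) for s in samples
    )
    return total / len(samples)


# Hypothetical predicted shifts (ppm) and per-node error estimates (ppm).
predicted = [14.0, 23.0, 128.6]
sigmas = [0.5, 0.5, 2.0]
observed = [127.9, 14.3, 22.6]  # unordered peak list

assigned = assign_shifts(observed, predicted, sigmas)
mae = sum(abs(o - p) for o, p in zip(assigned, predicted)) / len(predicted)

# Hypothetical structure-level MAEs for correctly matched reference structures.
reference_maes = [0.8, 1.1, 1.3, 1.6, 2.0, 2.7]
# Estimated fraction of matching structures with an MAE at least this large.
tail_probability = 1.0 - kernel_cdf(mae, reference_maes, bandwidth=0.3)
```

A small `tail_probability` would suggest that the observed error is larger than is typical for a correct structure. Scaling each term by its node's error estimate means that atoms in rarely seen environments, which are predicted less reliably, contribute less to the assignment cost.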

 


The post Chemical shift prediction in 13C NMR spectroscopy using ensembles of message passing neural networks (MPNNs) first appeared on Mestrelab Research Analytical Chemistry Software.

