Conformal prediction of biological activity of chemical compounds

  • Open access
  • Published: 16 June 2017
  • Volume 81, pages 105–123 (2017)

  • Paolo Toccaceli¹ (ORCID: orcid.org/0000-0002-9911-7182),
  • Ilia Nouretdinov¹ &
  • Alexander Gammerman¹

Abstract

This paper applies Conformal Predictors to a chemoinformatics problem: predicting the biological activity of chemical compounds. It addresses several challenges specific to this domain: a large number of compounds (training examples), a high-dimensional feature space, sparseness, and strong class imbalance. A variant of conformal prediction, the Inductive Mondrian Conformal Predictor, is applied to deal with these challenges. Results are presented for several non-conformity measures extracted from underlying algorithms and for different kernels. A number of performance measures are used to demonstrate the flexibility of Inductive Mondrian Conformal Predictors in dealing with such a complex data set. This approach allowed us to identify the compounds most likely to be active for a given biological target and to present them in ranked order.
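
As a rough illustration of the idea behind an Inductive Mondrian Conformal Predictor, the sketch below computes label-conditional p-values from a held-out calibration set. It is not the authors' implementation: the function name mondrian_p_values, the toy scores, and the 0/1 label encoding (1 = active) are assumptions made for this example, and the non-conformity scores are taken as given rather than derived from an SVM or any other underlying algorithm.

```python
import numpy as np


def mondrian_p_values(cal_scores, cal_labels, test_scores, labels=(0, 1)):
    """Label-conditional (Mondrian) inductive conformal p-values.

    cal_scores[i]     : non-conformity score of calibration example i
                        under its true label cal_labels[i].
    test_scores[j, k] : non-conformity score of test object j under the
                        hypothetical label labels[k].
    Returns an array p with p[j, k] = p-value of labels[k] for object j.
    """
    cal_scores = np.asarray(cal_scores, dtype=float)
    cal_labels = np.asarray(cal_labels)
    test_scores = np.asarray(test_scores, dtype=float)

    p = np.empty_like(test_scores)
    for k, label in enumerate(labels):
        # Mondrian step: compare only against calibration examples
        # that carry the same label as the hypothesis being tested.
        same_class = cal_scores[cal_labels == label]
        # For each test object, count calibration scores that are at
        # least as non-conforming as the test score.
        n_ge = (same_class[None, :] >= test_scores[:, k, None]).sum(axis=1)
        p[:, k] = (n_ge + 1) / (same_class.size + 1)
    return p


# Toy, hand-made scores (hypothetical): four inactive (0) and two
# active (1) calibration examples, mimicking class imbalance.
cal_scores = [0.1, 0.2, 0.3, 0.9, 0.5, 0.7]
cal_labels = [0, 0, 0, 0, 1, 1]

# Two test compounds, each scored under both hypothetical labels.
test_scores = [[0.25, 0.8],
               [0.95, 0.4]]

p = mondrian_p_values(cal_scores, cal_labels, test_scores)

# Rank test compounds by the p-value of the "active" hypothesis,
# most plausible active compound first.
ranking = np.argsort(-p[:, 1])

# Prediction sets at significance level 0.25: keep labels with p > 0.25.
prediction_sets = p > 0.25
```

Because each p-value is calibrated only against calibration examples of the same class, the scarce active class is not swamped by the much larger inactive class. Sorting by the active-class p-value is one simple way, under these assumptions, to present candidate compounds in ranked order.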

Acknowledgments

This project (ExCAPE) has received funding from the European Union’s Horizon 2020 Research and Innovation programme under Grant Agreement no. 671555. We are grateful to the Ministry of Education, Youth and Sports (Czech Republic), which supports the Large Infrastructures for Research, Experimental Development and Innovations project “IT4Innovations National Supercomputing Center – LM2015070”, for help in conducting the experiments. This work was also supported by EPSRC grant EP/K033344/1 (“Mining the Network Behaviour of Bots”) and by the Technology Integrated Health Management (TIHM) project awarded to the School of Mathematics and Information Security at Royal Holloway as part of an initiative by NHS England supported by InnovateUK. We are indebted to Lars Carlsson of AstraZeneca for providing the data and for useful discussions. We also thank Zhiyuan Luo and Vladimir Vovk for many valuable comments and discussions.

Author information

Authors and Affiliations

  1. Royal Holloway, University of London, Egham, UK

    Paolo Toccaceli, Ilia Nouretdinov & Alexander Gammerman

Corresponding author

Correspondence to Paolo Toccaceli.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Toccaceli, P., Nouretdinov, I. & Gammerman, A. Conformal prediction of biological activity of chemical compounds. Ann Math Artif Intell 81, 105–123 (2017). https://doi.org/10.1007/s10472-017-9556-8

  • Published: 16 June 2017

  • Issue date: October 2017

  • DOI: https://doi.org/10.1007/s10472-017-9556-8

Keywords

  • Conformal prediction
  • Confidence estimation
  • Chemoinformatics
  • Non-conformity measure

Mathematics Subject Classification (2010)

  • 68T05
