Frequently Asked Questions

What is SuperPred3?

SuperPred3 offers a knowledge-based method, built on machine learning models, for predicting the ATC codes and targets of your compounds.

What is the technical background for database and website?

The data is stored in a relational MySQL database hosted on the Charité IT system. Chemical information in the database was handled with the Python package RDKit and ChemAxon software. The website back-end is a lab-based LAMP (Linux/Apache/MySQL/PHP) server, with PHP serving as the back-end language. The database connection is established through the MySQL interface, and data is delivered to the front-end through a mixture of HTML from submission responses and AJAX requests. Website functionality is implemented in JavaScript and its library jQuery. Additionally, the CSS framework Bootstrap 4 is used. Tables on the website were created with the jQuery plugin DataTables and its absolute sorting extension. For the chemistry interface, the JavaScript library ChemDoodle Web Components was used. A JavaScript-capable browser is required; the server was tested on the most recent versions of Google Chrome and Mozilla Firefox.

What is the ATC code?

The Anatomical Therapeutic Chemical (ATC) classification system is used for the classification of drugs. It is published by the World Health Organization (WHO). The classification into groups is based on the therapeutic and chemical characteristics of the drugs. Each ATC code is divided into 5 levels:

1st level: Anatomical main group
2nd level: Therapeutic main group
3rd level: Therapeutic/pharmacological subgroup
4th level: Chemical/therapeutic/pharmacological subgroup
5th level: Chemical substance

Substances or combinations of substances at the 5th level refer to a single indication. Drugs with more than one indication are assigned more than one ATC code. Aspirin, for example, has 3 ATC codes assigned.
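
The five-level structure can be illustrated with a short Python sketch. It is not part of SuperPred3; the function name split_atc_code and the level labels are chosen here for illustration. It assumes the standard 7-character layout of a 5th-level code (one letter, two digits, one letter, one letter, two digits) and checks only the format, not whether the code actually exists in the WHO index.

```python
import re
from typing import Dict

# Assumed layout of a 5th-level ATC code:
# letter, two digits, letter, letter, two digits (e.g. N02BA01).
ATC_PATTERN = re.compile(r"^([A-Z])(\d{2})([A-Z])([A-Z])(\d{2})$")

# Labels taken from the level descriptions in the FAQ text.
LEVEL_NAMES = [
    "1st level (anatomical main group)",
    "2nd level (therapeutic main group)",
    "3rd level (therapeutic/pharmacological subgroup)",
    "4th level (chemical/therapeutic/pharmacological subgroup)",
    "5th level (chemical substance)",
]


def split_atc_code(code: str) -> Dict[str, str]:
    """Return the cumulative code prefix for each of the five ATC levels.

    Hypothetical helper: it validates the format only and does not
    look the code up in any ATC index.
    """
    match = ATC_PATTERN.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a 5th-level ATC code: {code!r}")

    levels: Dict[str, str] = {}
    prefix = ""
    for name, part in zip(LEVEL_NAMES, match.groups()):
        prefix += part  # each level extends the previous level's prefix
        levels[name] = prefix
    return levels
```

For example, split_atc_code("N02BA01") yields the prefixes N, N02, N02B, N02BA and N02BA01, one per level.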

How can I input a molecular structure?

To predict ATC class and targets, a molecular structure has to be loaded into the ChemDoodle web interface. Structures can be obtained by entering a PubChem name or a SMILES string, loading a structure file, or drawing with the provided tools (see below). Once a structure is loaded, it can still be modified. When satisfied with the result, use the "Start Calculation" button to start the predictions.

Which dataset was used for the ATC prediction?

ATC codes were obtained from the WHO and filtered as described in the statistics. For a more detailed look at the dataset and the machine learning accuracy, you can download the CSV file, which contains the substance name as well as the expected and predicted ATC code of each training sample.

Which models are used for the prediction and how well do they perform?

Predictions are made by logistic regression machine learning models, based on Morgan fingerprints of length 2048. Training data was filtered in multiple steps (for details see statistics), and the model performance was evaluated using 10-fold cross-validation for the target prediction and leave-one-out cross-validation for the ATC code prediction.
For a detailed look at the performance values for each class (including sensitivity and precision), see the corresponding CSV file.
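
The difference between the two evaluation schemes can be sketched in plain Python. This is an illustration of the general technique only, not the SuperPred3 training code: the function names are hypothetical, the samples are identified by index, and no shuffling is applied (in practice the samples are usually shuffled first). Leave-one-out is simply k-fold cross-validation with as many folds as there are samples.

```python
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int]]  # (training indices, test indices)


def k_fold_indices(n_samples: int, k: int) -> Iterator[Split]:
    """Yield k train/test index splits; every sample is tested exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")

    indices = list(range(n_samples))
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]

    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def leave_one_out_indices(n_samples: int) -> Iterator[Split]:
    """Leave-one-out: each sample forms its own single-element test fold."""
    return k_fold_indices(n_samples, n_samples)
```

In each split, a model would be fitted on the training indices and scored on the test indices; the reported performance is then aggregated over all folds.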

What is the meaning of the different scores in the target prediction?

When predicting targets, two different scores are reported: "probability" and "model accuracy". The first score is the probability that the input structure binds to the specific target, as determined by the machine learning model for that target. Since model performance varies between targets, the 10-fold cross-validation score of the respective logistic regression model is displayed as well.

How to cite SuperPred?

Please cite as:

Nickel J, Gohlke BO, Erehman J, Banerjee P, Rong WW, Goede A, Dunkel M, Preissner R. SuperPred: update on drug classification and target prediction. Nucleic Acids Res. 2014 Jul;42(Web Server issue):W26-31. doi: 10.1093/nar/gku477. Epub 2014 May 30. PMID: 24878925; PMCID: PMC4086135.