How to Use

BitterSweet Overview

Perception of taste is a complex sensation that evolved in humans primarily to respond to naturally occurring food-derived chemicals. Among all the taste perceptions, the dichotomy of sweet and bitter tastes is of special importance to human gustatory mechanisms. Sweetness is innately attractive, whereas bitterness evokes an aversive response. Yet, notwithstanding the stark contrast in their sensory perceptions, both tastes are linked to regulatory health effects by virtue of the receptors involved in their sensation. Thus, a better understanding of the molecular correlates responsible for the gradient of bitter-sweet taste, together with the development of computational models, is of key value for identifying natural as well as synthetic compounds of desirable taste on this axis.


Towards these goals, the BitterSweet web server implements state-of-the-art machine learning models for categorical prediction of bitter and sweet tastes and linked bitter receptors of small molecules. The web server facilitates taste predictions through PubChem/ZINC15 IDs, SMILES, Structure-Data File (SDF) uploads, and by drawing a molecular structure. The web server is further enriched with the functionality to browse 3086 molecules with curated bitter-sweet taste information and 394152 molecules (from external datasets such as FlavorDB, FooDB, Super Natural II, DSSTox, and DrugBank) with predicted taste information.


The BitterSweet database enables insights into the perception of bitter and sweet tastes as well as informed leads for engineering new molecules of a desired taste gradient. Along with the power of computational strategies for spanning the chemical space of known and unknown realizable organic compounds, this resource provides a potent tool for generating compounds of desirable bitter-sweet potency. Amid an epidemic of nutrition-related diseases such as obesity and diabetes, BitterSweet facilitates the search for molecules that can increase the appeal of food and beverages to address the problem of overnutrition.

BitterSweet Predict

The BitterSweet web server implements state-of-the-art machine learning models for categorical prediction of bitter and sweet tastes as well as linked bitter receptors of small molecules. The models were developed using a comprehensive compilation of bitter, sweet, and tasteless molecules and molecular descriptors from ChemoPy, a freely available Python package for calculating commonly used structural and physicochemical properties.

Draw Structure

Here, the user can provide either the structure of the molecule (using the JSME molecular editor) or its SMILES identifier (by clicking on the yellow smiling face icon) to generate a prediction.

Mini-Batch Prediction

Mini-batch prediction allows the user to specify the SMILES/PubChem IDs/ZINC15 IDs of up to 10 molecules and retrieve predictions.
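
As an illustration of the limit described above, the following is a minimal sketch of how a client might prepare a mini-batch before pasting it into the form. The function names and the identifier-type heuristic are assumptions made for this example; they are not the web server's actual validation logic.

```python
MAX_MINI_BATCH = 10  # limit stated for Mini-Batch Prediction


def parse_mini_batch(text):
    """Split pasted text into identifiers (one per line) and enforce the limit."""
    identifiers = [line.strip() for line in text.splitlines() if line.strip()]
    if len(identifiers) > MAX_MINI_BATCH:
        raise ValueError(
            f"mini-batch accepts at most {MAX_MINI_BATCH} molecules, "
            f"got {len(identifiers)}"
        )
    return identifiers


def guess_identifier_type(identifier):
    """Heuristic guess of the identifier type (an assumption, not server logic)."""
    if identifier.upper().startswith("ZINC"):
        return "zinc15"
    if identifier.isdigit():
        return "pubchem"
    return "smiles"


batch = parse_mini_batch("CCCC(=O)C\n\nCC(=O)C(=O)C\n")
types = [guess_identifier_type(item) for item in batch]
```

Lists longer than 10 entries would need to go through Batch Prediction instead.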

Batch Prediction

Batch prediction caters to users requiring predictions for a large number of molecules (up to 200). The user can upload either a consolidated SMILES file or a Structure-Data File (SDF).

Job Status

The status of jobs submitted through the Batch Prediction utility can be monitored at the following URL: https://cosylab.iiitd.edu.in/bittersweet/get_job/{JOB_Key}. In addition, the user is notified via email (if specified) on submission and completion of jobs.

A submitted job results in one of the following statuses:

  • Queued - Server resources are busy at the moment and the submitted job is queued for processing.
  • Running - Server has started processing the job.
  • Failed - Server was not able to process the job successfully. Please check if the input was specified correctly. Also see sample files for reference.
  • Finished - The submitted job was completed successfully and results should be viewable on the same page.
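
The job-status URL pattern and the four statuses can be captured in a small helper, sketched below. Only the URL template and the status names come from this page; the helper names are hypothetical, and the sketch deliberately does not fetch or parse the status page because its response format is not documented here.

```python
from enum import Enum
from urllib.parse import quote

# URL template as given above; {job_key} stands for the key issued on submission.
JOB_STATUS_URL = "https://cosylab.iiitd.edu.in/bittersweet/get_job/{job_key}"


class JobStatus(Enum):
    QUEUED = "Queued"
    RUNNING = "Running"
    FAILED = "Failed"
    FINISHED = "Finished"


def job_status_url(job_key):
    """Build the status page URL for a submitted batch job."""
    return JOB_STATUS_URL.format(job_key=quote(job_key, safe=""))


def is_terminal(status):
    """A job stops changing once it has either failed or finished."""
    return status in (JobStatus.FAILED, JobStatus.FINISHED)


url = job_status_url("abc123")  # "abc123" is a made-up job key
```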

Batch/Mini-Batch Prediction Results

On submitting a mini-batch or a batch of molecules for prediction, the results are returned in tabular form with the predicted bitter-sweet taste and linked bitter receptors (if a molecule is classified as bitter). The “Details” button redirects the user to the BitterSweet Prediction page, where probabilities, UMAP visualization, molecular properties, 3D structure visualization, link-outs to other databases, and further download options are provided.

BitterSweet Prediction

The BitterSweet prediction page presents the predicted bitter-sweet taste and receptors (if the molecule is classified as bitter) along with the corresponding probabilities. In addition, similar to the BitterSweet profile page, it displays common molecular properties, 3D structure visualization, UMAP visualization, and link-outs to similar molecules within BitterSweet as well as to external databases such as ZINC and FlavorDB. Furthermore, the option to download BitterSweet prediction details, molecular properties, the SDF file, and a 2D image is also provided.

BitterSweet Profile

On clicking the "Details" tab, the user is directed to the BitterSweet profile of a molecule. In addition to different chemical identifiers, UMAP visualizations, molecular properties, and references for taste information, this page contains link-outs to similar molecules within BitterSweet as well as to external databases such as ZINC and FlavorDB. Furthermore, the functionality to visualize the 3D structure of the queried molecule is provided, along with the option to download BitterSweet details, molecular properties, the SDF file, and a 2D image.

Receptor Profile

On clicking the "Details" tab in the receptor search results table, the user is directed to the Receptor profile page, where the name and short names of the receptor are provided along with a link-out to the UniProt database. The linked molecules table on this page displays all the bitter molecules present in BitterSweet that were predicted to activate the queried receptor.

UMAP View

Uniform Manifold Approximation and Projection (UMAP) is a general technique for non-linear dimensionality reduction. Using ChemoPy molecular descriptors and the UMAP algorithm, each molecule is mapped to a point in a three-dimensional space while preserving the local neighbor relations of molecules in the ChemoPy descriptor space. To learn more about UMAP, please click here.

The UMAP view is provided to give the user an idea of the applicability domain of the models implemented in BitterSweet. If a query molecule (in red) is far from the clusters of curated bitter, sweet, and tasteless molecules, the predictions provided by the BitterSweet web server may not be accurate.
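
The visual check described above can also be expressed numerically. The sketch below measures how far a query point lies from its nearest curated molecule in the three-dimensional embedding. The function names and the distance cutoff are assumptions for illustration; the web server does not document a numeric applicability criterion.

```python
import math


def nearest_reference_distance(query, reference_points):
    """Euclidean distance from the query to its closest curated molecule."""
    if not reference_points:
        raise ValueError("at least one reference point is required")
    return min(math.dist(query, point) for point in reference_points)


def looks_out_of_domain(query, reference_points, cutoff):
    """Flag a query whose nearest curated neighbor is farther than `cutoff`.

    `cutoff` is a user-chosen value (hypothetical), not a BitterSweet setting.
    """
    return nearest_reference_distance(query, reference_points) > cutoff


# Made-up 3D coordinates standing in for UMAP embeddings.
curated = [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0)]
distance = nearest_reference_distance((0.0, 0.0, 0.0), curated)
```

A large distance only suggests caution; it does not by itself show that a prediction is wrong.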

BitterSweet Search

The BitterSweet web server implements state-of-the-art machine learning models for categorical prediction of bitter and sweet tastes and linked bitter receptors of small molecules. The models were developed using a comprehensive compilation of bitter, sweet, and tasteless molecules and molecular descriptors from ChemoPy, a freely available Python package for calculating commonly used structural and physicochemical properties.

Molecular Search

This is the primary search tool for querying and browsing the molecules present in the web server along with their curated/predicted annotation of bitter, sweet, and tasteless taste, as well as linked bitter receptors (of bitter molecules). In addition to Common Name, IUPAC Name, PubChem ID, and Functional Group, the search can be performed using taste- and source-related parameters. Combining query parameters allows a refined search over multiple criteria.

Note: An empty query lists all the molecules present in the database.

Advanced Molecular Search

Beyond the basic search, Advanced Search enables queries via other descriptors such as Molecular Weight, ALogP, Hydrogen Bond Acceptors/Donors, Number of Rings, Number of Rotatable Bonds, Number of Aromatic Bonds, etc. Similar to the basic molecular search, Advanced Molecular Search allows filtering by a conjunction of parameters, including a graphical query via the chemical structure.
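
Filtering by a conjunction of parameters means that a molecule is returned only if it satisfies every criterion at once. The sketch below illustrates this with range filters over a list of records. The field names, the records, and their values are made up for the example and do not reflect the database schema.

```python
def matches_all(molecule, ranges):
    """Return True only if every named property lies within its (low, high) range."""
    return all(
        key in molecule and low <= molecule[key] <= high
        for key, (low, high) in ranges.items()
    )


# Hypothetical records with illustrative values, not taken from BitterSweet.
molecules = [
    {"name": "example A", "molecular_weight": 150.2, "h_bond_donors": 1},
    {"name": "example B", "molecular_weight": 342.3, "h_bond_donors": 8},
]

# Both conditions must hold (logical AND).
query = {"molecular_weight": (100, 200), "h_bond_donors": (0, 2)}
hits = [m["name"] for m in molecules if matches_all(m, query)]
```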

Bitter Receptor Search

This functionality enables the user to explore the molecular associations of the bitter receptors present in the web server. The receptors can be queried using their name, short name, or UniProt ID.

Note: An empty query lists all the bitter receptors present in the database.

Molecular Search Results

In response to the query, a list of all matching molecules is displayed. When the JSME tool is used to search by structure, an additional column indicating the similarity of each molecule to the query is also displayed.

Receptor Search Results

In response to the query, a list of all matching receptors is displayed along with the number of linked molecules (predicted and curated) present in BitterSweet.

Predict

We provide a tool for predicting the taste of a molecule on the BitterSweet axis (Bitter, Bitter-Sweet, Neither Bitter nor Sweet, or Sweet) by implementing state-of-the-art machine learning classifiers. These models were trained on molecular properties obtained from ChemoPy, a freely available Python package for calculating commonly used structural and physicochemical properties.

A. Prediction based on structure

Here, the user can provide either the structure of the molecule (using the JSME molecular editor) or its SMILES identifier (by clicking on the yellow smiling face icon) to generate a prediction.


B. Prediction for a mini-batch of molecules

The user can also provide a mini-batch of molecules (limited to 10) to generate predictions for all of them. The molecules can be identified by SMILES, PubChem IDs, or ZINC15 IDs. We have also provided a short list of molecules (SMILES) for trying out this functionality.


C. Prediction for a batch of molecules

To process a large number of molecules at once, a list of molecules can also be provided through an SDF file. There is no limit on the number of molecules that can be processed through this mechanism; however, because processing a large number of molecules may take time, the results in this case are emailed to the user as a CSV file.

Prediction Tool



In addition to a comprehensive curation of bitter-sweet molecules, the resource provides a tool for predicting the taste of a molecule on the BitterSweet axis (Bitter, Bitter-Sweet, Neither Bitter nor Sweet, or Sweet) by implementing state-of-the-art machine learning classifiers. These models were trained on molecular properties obtained from ChemoPy, a freely available Python package for calculating commonly used structural and physicochemical properties.




The following is a list of 10 sweet molecules of Mango from FlavorDB that you can try out yourself:

SMILES                                        Name
CC1CCC(C(C1)O)C(C)C                           Neomenthol
CC(=O)C(=O)C                                  2,3-Butanedione
CCC1C(=O)C(=C(O1)C)O                          2-Ethyl-4-Hydroxy-5-Methyl-3(2H)-Furanone
CCCCCC(=O)C                                   2-Heptanone
CCC(C)COC(=O)C                                2-Methylbutyl Acetate
CCCC(=O)C                                     2-Pentanone
CCCC(=O)CC                                    3-Hexanone
C1=CC=C(C=C1)CCC(=O)O                         3-Phenylpropanoic Acid
C(C1C(C(C(C(O1)OC2C(OC(C(C2O)O)O)CO)O)O)O)O   Alpha-Maltose
C1=CC=C(C=C1)CO                               Benzyl Alcohol

Reliability of Prediction Models

An ROC curve is the most commonly used way to visualize the performance of a binary classifier, and the area under the curve (AUC) is arguably the best way to summarize its performance in a single number. An area of 1 represents a perfect test; an area of 0.5 represents a worthless test. A rough guide for classifying the accuracy of a diagnostic test is the traditional academic point system:
  • 0.90-1.00 = excellent
  • 0.80-0.90 = good
  • 0.70-0.80 = fair
  • 0.50-0.70 = poor
  • less than 0.50 = fail
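
For readers who want to see how such a number arises, the sketch below computes AUC with the pairwise-ranking formulation (the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half) and maps it onto the scale above. The function names are chosen for this example, and treating each band's lower bound as inclusive is an assumption, since the listed ranges share their endpoints.

```python
def auc_roc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


def grade(auc):
    """Map an AUC value onto the rough scale listed above."""
    if auc >= 0.90:
        return "excellent"
    if auc >= 0.80:
        return "good"
    if auc >= 0.70:
        return "fair"
    if auc >= 0.50:
        return "poor"
    return "fail"


# Toy labels and scores, made up for illustration.
example_auc = auc_roc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
```

The pairwise loop is quadratic in the number of examples, which is fine for a small illustration but not for large datasets.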

In addition to standard techniques for avoiding overfitting (such as regularization and k-fold cross-validation), we created an external validation set, known as the Gold Standard, from molecules curated from three sources: Phyto Dictionary, BitterNew, and UNIMI. A high AUC-ROC value of 0.87 on these molecules supports the robustness of our classification models. For more details on model training and a comparative analysis against the state of the art, please refer to the data statistics dashboard.
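
As a reminder of what k-fold cross-validation involves, the sketch below splits sample indices into k folds, each used once as the held-out set. It is a generic illustration with a hypothetical function name, not the training code used for BitterSweet; in practice the samples would normally be shuffled (and often stratified by class) first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(5, 2))
```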


Processing a Batch of SMILES



To further enhance the user experience and improve the utility of BitterSweet Predict, a Batch Prediction facility is provided for predicting bitter-sweet taste from SMILES. One may simply list several SMILES to fetch predictions. To avoid inordinate delays, Batch Predict currently allows up to 10 molecules per query.

Note: Batch Predictions may take a while to process since the properties of the query molecules are generated dynamically in the backend and further processed by the machine learning classifier. So please be patient while using this functionality.


Results


The results are displayed in a tabular form with the predicted taste, bitter and sweet confidence levels (probabilities) for every query molecule. The “Details” button enables exploring detailed molecular properties.



After a compound identifier is entered and the “Predict” button is clicked, BitterSweet computes the chemical properties of the molecule and runs the classifiers at the back end. This process may take a while (up to a minute, depending on the speed of your connection). Please be patient. Once these steps complete successfully, you will be redirected to the page detailing the predicted taste of the molecule.


Bitter and Sweet Probability

This is the probability, as predicted by the classifier, that the molecule is bitter (or sweet). The value indicates the confidence level of the classifier; in simple terms, it answers the question, “How sure is the model that a given molecule is bitter or sweet?” By default, as is common for binary classifiers, the threshold probability for assigning a class is set at 0.5: if the classifier predicts the probability of a class to be higher than 50%, the molecule is said to belong to that class. To demonstrate the robustness of our classifiers, we provide the false positive rate of our models against a range of threshold values.
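
The thresholding rule can be sketched as follows. How the bitter and sweet probabilities are combined into the four categories on the BitterSweet axis is an assumption made for this illustration (each class is thresholded independently); the function name is hypothetical.

```python
def taste_label(p_bitter, p_sweet, threshold=0.5):
    """Combine two class probabilities into one of four taste categories.

    Assumption: each class is assigned when its probability exceeds the
    threshold, and the two decisions are made independently.
    """
    bitter = p_bitter > threshold
    sweet = p_sweet > threshold
    if bitter and sweet:
        return "Bitter-Sweet"
    if bitter:
        return "Bitter"
    if sweet:
        return "Sweet"
    return "Neither Bitter nor Sweet"


label = taste_label(0.82, 0.10)
```

Raising the threshold makes the classifier more conservative: fewer molecules are assigned to a class, and fewer of those assignments are false positives.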


False Positive Rate (Bitter)   Threshold Value (Bitter)   False Positive Rate (Sweet)   Threshold Value (Sweet)
0                              0.98                       0                             1
0                              0.87                       0                             0.98
0                              0.74                       0.01                          0.97
0.01                           0.67                       0.02                          0.94
0.06                           0.5                        0.03                          0.86
0.09                           0.46                       0.06                          0.77
0.15                           0.42                       0.09                          0.71
0.2                            0.35                       0.12                          0.62
0.27                           0.28                       0.16                          0.5
0.35                           0.21                       0.21                          0.44
0.47                           0.15                       0.28                          0.37
0.6                            0.09                       0.41                          0.23
0.7                            0.05                       0.59                          0.14
0.82                           0.02                       0.82                          0.05
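
One way to use these values is to pick the most permissive threshold that still keeps the false positive rate within a tolerance. The sketch below transcribes the rows above as (false positive rate, threshold) pairs; the function name and the selection rule are assumptions for illustration, not a feature of the web server.

```python
# (false positive rate, threshold) pairs transcribed from the table above.
BITTER = [
    (0, 0.98), (0, 0.87), (0, 0.74), (0.01, 0.67), (0.06, 0.5),
    (0.09, 0.46), (0.15, 0.42), (0.2, 0.35), (0.27, 0.28), (0.35, 0.21),
    (0.47, 0.15), (0.6, 0.09), (0.7, 0.05), (0.82, 0.02),
]
SWEET = [
    (0, 1), (0, 0.98), (0.01, 0.97), (0.02, 0.94), (0.03, 0.86),
    (0.06, 0.77), (0.09, 0.71), (0.12, 0.62), (0.16, 0.5), (0.21, 0.44),
    (0.28, 0.37), (0.41, 0.23), (0.59, 0.14), (0.82, 0.05),
]


def lowest_threshold(table, max_fpr):
    """Return the smallest listed threshold whose false positive rate <= max_fpr."""
    candidates = [threshold for fpr, threshold in table if fpr <= max_fpr]
    if not candidates:
        raise ValueError("no listed threshold meets the requested rate")
    return min(candidates)


bitter_cutoff = lowest_threshold(BITTER, max_fpr=0.01)
sweet_cutoff = lowest_threshold(SWEET, max_fpr=0.05)
```

Only the listed thresholds are considered; values between rows are not interpolated.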

Molecular Properties

These are chemical properties generated using ChemoPy, an open-source library for Python, and they portray various characteristic properties of a molecule.