CHEESE API

If you want to integrate CHEESE into your workflows programmatically, you can use the CHEESE API. The API currently provides the main molecular search service and ADMET property prediction.

The CHEESE API is available at https://api.cheese.themama.ai/. Only registered CHEESE users can access it, using an API key generated in the CHEESE UI.

How to generate the API key?

Go to the CHEESE webpage
Sign in with your account.
You should see a button to generate the API key in the top right corner
Copy the API key and use it to call the API.
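
Once you have a key, every request carries it in the Authorization header as a Bearer token. The following is a minimal sketch of one way to keep the key out of your scripts. The environment variable name CHEESE_API_KEY and the helper auth_headers are illustrative assumptions, not part of the CHEESE API.

```python
import os


def auth_headers(api_key=None):
    """Build the Authorization header for a CHEESE API request.

    CHEESE_API_KEY is a hypothetical environment variable name,
    used only when no key is passed explicitly.
    """
    key = api_key or os.environ.get("CHEESE_API_KEY")
    if not key:
        raise ValueError("No API key given and CHEESE_API_KEY is not set")
    return {"Authorization": f"Bearer {key}"}


# Example with a placeholder key
headers = auth_headers("XXXXXXX")
print(headers)
```

The returned dictionary can be passed as the headers argument of requests.get or requests.post.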

Molecular search

Searching by a single molecule

This is the main CHEESE service where you can supply a query molecule, the number of neighbors to retrieve, the search type and quality, and you'll get a JSON object containing the resulting molecules together with their descriptors, predicted properties and Morgan Tanimoto similarities.

To call the service, send a GET request with the parameters below in the query string and your generated API key in the headers.

Request URL : https://api.cheese.themama.ai/molsearch

Parameters :

search_input (str) : SMILES string of the query molecule
search_type (str) : Can be one of the following : 'morgan', 'espsim_electrostatic', 'espsim_shape', 'usr_shape'
search_quality (str) : Can be one of the following : 'fast','accurate','very accurate'
n_neighbors (int) : Number of neighbors
descriptors (bool) : Whether to get descriptors or not
properties (bool) : Whether to get properties or not
filter_molecules (bool): Whether to apply 'No solvants' filter
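
Because search_type and search_quality only accept a fixed set of values, it can help to check them before sending a request. The sketch below assumes the allowed values are exactly those listed above (the service may accept others); the function name build_molsearch_params and its default values are illustrative, not part of the API.

```python
# Allowed values copied from the parameter list above
ALLOWED_SEARCH_TYPES = {"morgan", "espsim_electrostatic", "espsim_shape", "usr_shape"}
ALLOWED_SEARCH_QUALITIES = {"fast", "accurate", "very accurate"}


def build_molsearch_params(smiles, search_type="morgan", search_quality="fast",
                           n_neighbors=10, descriptors=True, properties=True,
                           filter_molecules=False):
    """Return the query parameters for /molsearch after basic validation.

    The defaults here are arbitrary choices for the sketch.
    """
    if search_type not in ALLOWED_SEARCH_TYPES:
        raise ValueError(f"Unknown search_type: {search_type!r}")
    if search_quality not in ALLOWED_SEARCH_QUALITIES:
        raise ValueError(f"Unknown search_quality: {search_quality!r}")
    if n_neighbors < 1:
        raise ValueError("n_neighbors must be at least 1")
    return {
        "search_input": smiles,
        "search_type": search_type,
        "search_quality": search_quality,
        "n_neighbors": n_neighbors,
        "descriptors": descriptors,
        "properties": properties,
        "filter_molecules": filter_molecules,
    }


params = build_molsearch_params("CCCCCC", n_neighbors=2)
print(params)
```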

Headers :

Authorization : Your API key, in the form Bearer $api_key

Python example

import requests

# Query molecule in SMILES format
query = "CCC(O)(CC)C[N@@H+](C)CC(=O)NCc1cccc(OC(C)C)c1"

# Search type
search_type = "morgan"

# Search quality
search_quality = "fast"

# Number of neighbors
n_neighbors = 2

# Paste your API key here
api_key = "XXXXXXX"

# Perform the request; requests URL-encodes the query parameters
response = requests.get(
    "https://api.cheese.themama.ai/molsearch",
    params={
        "search_input": query,
        "search_type": search_type,
        "n_neighbors": n_neighbors,
        "search_quality": search_quality,
        "descriptors": True,
        "properties": True,
        "filter_molecules": False,
    },
    headers={"Authorization": f"Bearer {api_key}"},
    verify=False,  # skips TLS certificate verification
)

# Parse the JSON body into a Python dictionary
result = response.json()

CURL example

Please make sure that the search input SMILES is correctly URL-encoded when building the URI. SMILES strings often contain characters such as =, /, [ or + that make the request invalid if they are sent unencoded.

Here is an example of such a query SMILES string (CCC(O)(CC)C[N@@H+](C)CC(=O)NCc1cccc(OC(C)C)c1) and how it should be sent to the API.

curl -X 'GET' \
  'https://api.cheese.themama.ai/molsearch?search_input=CCC%28O%29%28CC%29C%5BN%40%40H%2B%5D%28C%29CC%28%3DO%29NCc1cccc%28OC%28C%29C%29c1&search_type=morgan&n_neighbors=2&search_quality=fast&descriptors=true&properties=true&filter_molecules=false' \
  -H 'Accept: application/json' \
  -H 'Authorization: Bearer {$API_KEY}'
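
The percent-encoded SMILES in the curl command above can be produced with Python's standard library. This is a small sketch using urllib.parse.quote; the variable names are illustrative. Note that when you pass parameters to requests through its params argument, this encoding is done for you.

```python
from urllib.parse import quote

smiles = "CCC(O)(CC)C[N@@H+](C)CC(=O)NCc1cccc(OC(C)C)c1"

# safe="" percent-encodes every reserved character, including "/",
# which quote() would otherwise leave unchanged
encoded = quote(smiles, safe="")
print(encoded)

# Build the request URL by hand (only two parameters shown for brevity)
url = (
    "https://api.cheese.themama.ai/molsearch"
    f"?search_input={encoded}&search_type=morgan"
)
print(url)
```

The encoded string matches the search_input value used in the curl example.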

The JSON response should look like this :

{
  "remarks": "",
  "neighbors": [
    {
      "smiles": "CCC(O)(CC)C[N@@H+](C)CC(=O)NCc1cccc(OC(C)C)c1",
      "zinc_id": "ZINC000626110675",
      "properties": {
        "absorption": {
          "caco2_wang": -5.708,
          "lipophilicity_astrazeneca": -0.799,
          "solubility_aqsoldb": -1.443,
          "bioavailability_ma": 0.349,
          "hia_hou": 0.075,
          "pgp_broccatelli": 0.017,
          "clogp": 1.1558
        },
        "excretion": {
          "clearance_hepatocyte_az": 40.041,
          "clearance_microsome_az": 14.665,
          "half_life_obach": 4.941
        },
        "toxicity": {
          "ld50_zhu": 2.701,
          "ames": 0,
          "dili": 0,
          "herg": 0.992
        },
        "distribution": {
          "ppbr_az": 27.154,
          "vdss_lombardo": 1.932,
          "bbb_martins": 0.926
        },
        "metabolism": {
          "cyp2c9_veith": 0,
          "cyp2d6_veith": 0.666,
          "cyp3a4_veith": 0
        },
        "basics": {
          "molecular_weight": 337.24857,
          "formal_charge": 1,
          "heavy_atoms": 24,
          "h_bond_acceptors": 3,
          "h_bond_donor": 3,
          "rotatable_bonds": 10,
          "num_of_rings": 1,
          "molar_refractivity": 96.2052,
          "number_of_atoms": 24,
          "topological_surface_area_mapping": 63
        }
      },
      "Morgan Tanimoto": 1
    },
    {
      "smiles": "CCC(O)(CC)C[N@H+](C)CC(=O)NCc1cccc(OC(C)C)c1",
      "zinc_id": "ZINC000626110675",
      "properties": {
        "absorption": {
          "caco2_wang": -5.708,
          "lipophilicity_astrazeneca": -0.799,
          "solubility_aqsoldb": -1.443,
          "bioavailability_ma": 0.349,
          "hia_hou": 0.075,
          "pgp_broccatelli": 0.017,
          "clogp": 1.1558
        },
        "excretion": {
          "clearance_hepatocyte_az": 40.041,
          "clearance_microsome_az": 14.665,
          "half_life_obach": 4.941
        },
        "toxicity": {
          "ld50_zhu": 2.701,
          "ames": 0,
          "dili": 0,
          "herg": 0.992
        },
        "distribution": {
          "ppbr_az": 27.154,
          "vdss_lombardo": 1.932,
          "bbb_martins": 0.926
        },
        "metabolism": {
          "cyp2c9_veith": 0,
          "cyp2d6_veith": 0.666,
          "cyp3a4_veith": 0
        },
        "basics": {
          "molecular_weight": 337.24857,
          "formal_charge": 1,
          "heavy_atoms": 24,
          "h_bond_acceptors": 3,
          "h_bond_donor": 3,
          "rotatable_bonds": 10,
          "num_of_rings": 1,
          "molar_refractivity": 96.2052,
          "number_of_atoms": 24,
          "topological_surface_area_mapping": 63
        }
      },
      "Morgan Tanimoto": 1
    }
  ],
  "query_properties": {
    "smiles": "CCC(O)(CC)C[N@@H+](C)CC(=O)NCc1cccc(OC(C)C)c1",
    "zinc_id": "",
    "properties": {
      "absorption": {
        "caco2_wang": -5.707998275756836,
        "lipophilicity_astrazeneca": -0.7990449666976929,
        "solubility_aqsoldb": -1.4428932666778564,
        "bioavailability_ma": 0.34873905777931213,
        "hia_hou": 0.0752602070569992,
        "pgp_broccatelli": 0.016872582957148552,
        "clogp": 1.1558000000000013
      },
      "excretion": {
        "clearance_hepatocyte_az": 40.041385650634766,
        "clearance_microsome_az": 14.665279388427734,
        "half_life_obach": 4.940968990325928
      },
      "toxicity": {
        "ld50_zhu": 2.7007973194122314,
        "ames": 0.0000671601592330262,
        "dili": 0.000006393878720700741,
        "herg": 0.9918881058692932
      },
      "distribution": {
        "ppbr_az": 27.15413475036621,
        "vdss_lombardo": 1.9315871000289917,
        "bbb_martins": 0.9258340001106262
      },
      "metabolism": {
        "cyp2c9_veith": 0.0002478980168234557,
        "cyp2d6_veith": 0.6658723950386047,
        "cyp3a4_veith": 0.00003590650521800853
      },
      "basics": {
        "molecular_weight": 337.24856933609,
        "formal_charge": 1,
        "heavy_atoms": 24,
        "h_bond_acceptors": 3,
        "h_bond_donor": 3,
        "rotatable_bonds": 10,
        "num_of_rings": 1,
        "molar_refractivity": 96.20520000000006,
        "number_of_atoms": 24,
        "topological_surface_area_mapping": 63
      }
    }
  },
  "search_info": {
    "query_embedding_time": 0.05881309509277344,
    "search_time": 0.027311325073242188,
    "filter_time": 0.00004315376281738281,
    "sorting_time": 0.021863937377929688,
    "property_prediction_time": 0.043591976165771484,
    "total_time": 0.15162348747253418
  }
}

Under the neighbors key, the response lists the neighbors together with their descriptors, predicted properties and Morgan Tanimoto similarities to the query. The query_properties key holds the descriptors and predicted properties of the query molecule itself. Finally, search_info reports timing details for the search.
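
To illustrate how this structure can be consumed, here is a short sketch that pulls the SMILES, ZINC ID and similarity of each neighbor out of a parsed response. The helper name neighbor_similarities is an assumption, and the sample dictionary is a trimmed copy of the response shown above rather than a live API result.

```python
def neighbor_similarities(response):
    """Return (smiles, zinc_id, Morgan Tanimoto) for each neighbor."""
    return [
        (n["smiles"], n["zinc_id"], n["Morgan Tanimoto"])
        for n in response.get("neighbors", [])
    ]


# Trimmed-down copy of the example response (properties omitted)
sample = {
    "remarks": "",
    "neighbors": [
        {
            "smiles": "CCC(O)(CC)C[N@@H+](C)CC(=O)NCc1cccc(OC(C)C)c1",
            "zinc_id": "ZINC000626110675",
            "Morgan Tanimoto": 1,
        },
        {
            "smiles": "CCC(O)(CC)C[N@H+](C)CC(=O)NCc1cccc(OC(C)C)c1",
            "zinc_id": "ZINC000626110675",
            "Morgan Tanimoto": 1,
        },
    ],
    "search_info": {"total_time": 0.15162348747253418},
}

for smiles, zinc_id, similarity in neighbor_similarities(sample):
    print(zinc_id, similarity, smiles)
```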

Searching by a list of molecules

You can search a list of molecules and get a response either in JSON or CSV format.

To call the service, send a GET request with the parameters below in the query string, and your generated API key and the desired response format in the headers.

Request URL : https://api.cheese.themama.ai/molsearch_array

Parameters :

search_input (List[str]) : List of SMILES strings of the query molecules
search_type (str) : Can be one of the following : 'morgan', 'espsim_electrostatic', 'espsim_shape', 'usr_shape'
search_quality (str) : Can be one of the following : 'fast','accurate','very accurate'
n_neighbors (int) : Number of neighbors
descriptors (bool) : Whether to get descriptors or not
properties (bool) : Whether to get properties or not
filter_molecules (bool): Whether to apply 'No solvants' filter

Headers :

Authorization : Your API key, in the form Bearer $api_key
Accept : Either application/json for a JSON response or application/csv for a CSV response

Important notes : If you are searching for a large set of molecules, the search becomes significantly slower if you :

Set the search quality to very accurate
Set the search type to consensus
Enable the molecule filtering
Combine all of the above, which gives the slowest search

Python example

import requests

# Query molecules in SMILES format ("XXX?" is deliberately invalid)
query = ["CCC(O)(CC)C[N@@H+](C)CC(=O)NCc1cccc(OC(C)C)c1", "CN(C)c1ccccc1NC(=O)CSc1nc2ccccc2[nH]1", "XXX?"]

# Search type
search_type = "morgan"

# Search quality
search_quality = "fast"

# Number of neighbors
n_neighbors = 2

# Paste your API key here
api_key = "XXXXXXX"

# Perform the request; a list value is sent as repeated search_input parameters
new_resp = requests.get(
    "https://api.cheese.themama.ai/molsearch_array",
    params={
        "search_input": query,
        "search_type": search_type,
        "n_neighbors": n_neighbors,
        "search_quality": search_quality,
        "descriptors": False,
        "properties": True,
        "filter_molecules": False,
    },
    headers={
        "Authorization": f"Bearer {api_key}",
        "Accept": "application/csv",  # ask for a CSV file instead of JSON
    },
    verify=False,  # skips TLS certificate verification
)

# Download result file to local storage
result_filename = new_resp.headers["content-disposition"].split()[-1].replace("filename=", "")
save_path = "./"
with open(save_path + result_filename, "wb") as new_file:
    new_file.write(new_resp.content)

CURL example

curl -X 'GET' \
  'https://api.cheese.themama.ai/molsearch_array?search_type=morgan&n_neighbors=2&search_quality=fast&search_input=CCC%28O%29%28CC%29C%5BN%40%40H%2B%5D%28C%29CC%28%3DO%29NCc1cccc%28OC%28C%29C%29c1&search_input=CN%28C%29c1ccccc1NC%28%3DO%29CSc1nc2ccccc2%5BnH%5D1&search_input=XXX%3F&descriptors=true&properties=false&filter_molecules=false&filename=search_results' \
  -H 'Accept: application/json' \
  -H 'Authorization: Bearer {$API_KEY}'

The CSV response file should have the following format :

query,neighbor,neighbor ZINC ID,Morgan Tanimoto,prop1,prop2,....,desc1,desc2... where prop are properties and desc are descriptors.
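
As an illustration of reading such a file, the sketch below parses the four leading columns with Python's csv module. The sample text is hand-written from the header layout above, with values taken from the JSON example further down; the property and descriptor columns are left out, and the helper name read_neighbors_csv is an assumption.

```python
import csv
import io

# Hand-written sample following the documented column order
SAMPLE_CSV = (
    "query,neighbor,neighbor ZINC ID,Morgan Tanimoto\n"
    "CN(C)c1ccccc1NC(=O)CSc1nc2ccccc2[nH]1,"
    "CN(C)c1ccccc1NC(=O)CSc1nccc2ccccc21,"
    "ZINC000129508613,0.543859649122807\n"
)


def read_neighbors_csv(text):
    """Return (query, neighbor, zinc_id, similarity) tuples from CSV text."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append((
            row["query"],
            row["neighbor"],
            row["neighbor ZINC ID"],
            float(row["Morgan Tanimoto"]),
        ))
    return rows


for record in read_neighbors_csv(SAMPLE_CSV):
    print(record)
```

For a downloaded file, the same function can be applied to the file's text content.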

The JSON response is similar to that of the /molsearch call. The difference is that the top-level keys are the SMILES strings of the query molecules, each mapping to its own result, plus a single search_info key.

Example JSON response :

{
  "CCC(O)(CC)C[N@@H+](C)CC(=O)NCc1cccc(OC(C)C)c1": {
    "remarks": "",
    "neighbors": [
      {
        "smiles": "CCC(O)(CC)C[N@@H+](C)CC(=O)NCc1cccc(OC(C)C)c1",
        "zinc_id": "ZINC000626110675",
        "properties": {
          "basics": {
            "molecular_weight": 337.24857,
            "formal_charge": 1,
            "heavy_atoms": 24,
            "h_bond_acceptors": 3,
            "h_bond_donor": 3,
            "rotatable_bonds": 10,
            "num_of_rings": 1,
            "molar_refractivity": 96.2052,
            "number_of_atoms": 24,
            "topological_surface_area_mapping": 63
          },
          "absorption": {
            "clogp": 1.1558
          }
        },
        "Morgan Tanimoto": 1
      },
      {
        "smiles": "CCC(O)(CC)C[N@H+](C)CC(=O)NCc1cccc(OC(C)C)c1",
        "zinc_id": "ZINC000626110675",
        "properties": {
          "basics": {
            "molecular_weight": 337.24857,
            "formal_charge": 1,
            "heavy_atoms": 24,
            "h_bond_acceptors": 3,
            "h_bond_donor": 3,
            "rotatable_bonds": 10,
            "num_of_rings": 1,
            "molar_refractivity": 96.2052,
            "number_of_atoms": 24,
            "topological_surface_area_mapping": 63
          },
          "absorption": {
            "clogp": 1.1558
          }
        },
        "Morgan Tanimoto": 1
      }
    ],
    "query_properties": {
      "smiles": "CCC(O)(CC)C[N@@H+](C)CC(=O)NCc1cccc(OC(C)C)c1",
      "zinc_id": "",
      "properties": {
        "basics": {
          "molecular_weight": 337.24856933609,
          "formal_charge": 1,
          "heavy_atoms": 24,
          "h_bond_acceptors": 3,
          "h_bond_donor": 3,
          "rotatable_bonds": 10,
          "num_of_rings": 1,
          "molar_refractivity": 96.20520000000006,
          "number_of_atoms": 24,
          "topological_surface_area_mapping": 63
        },
        "absorption": {
          "clogp": 1.1558000000000013
        }
      }
    }
  },
  "CN(C)c1ccccc1NC(=O)CSc1nc2ccccc2[nH]1": {
    "remarks": "",
    "neighbors": [
      {
        "smiles": "CN(C)c1ccccc1NC(=O)CSc1nc2ccccc2[nH]1",
        "zinc_id": "ZINC000154371242",
        "properties": {
          "basics": {
            "molecular_weight": 326.12013,
            "formal_charge": 0,
            "heavy_atoms": 23,
            "h_bond_acceptors": 4,
            "h_bond_donor": 2,
            "rotatable_bonds": 5,
            "num_of_rings": 3,
            "molar_refractivity": 96.2164,
            "number_of_atoms": 23,
            "topological_surface_area_mapping": 61.02
          },
          "absorption": {
            "clogp": 3.3597
          }
        },
        "Morgan Tanimoto": 1
      },
      {
        "smiles": "CN(C)c1ccccc1NC(=O)CSc1nccc2ccccc21",
        "zinc_id": "ZINC000129508613",
        "properties": {
          "basics": {
            "molecular_weight": 337.12488,
            "formal_charge": 0,
            "heavy_atoms": 24,
            "h_bond_acceptors": 4,
            "h_bond_donor": 1,
            "rotatable_bonds": 5,
            "num_of_rings": 3,
            "molar_refractivity": 101.8657,
            "number_of_atoms": 24,
            "topological_surface_area_mapping": 45.23
          },
          "absorption": {
            "clogp": 4.0316
          }
        },
        "Morgan Tanimoto": 0.543859649122807
      }
    ],
    "query_properties": {
      "smiles": "CN(C)c1ccccc1NC(=O)CSc1nc2ccccc2[nH]1",
      "zinc_id": "",
      "properties": {
        "basics": {
          "molecular_weight": 326.120132196,
          "formal_charge": 0,
          "heavy_atoms": 23,
          "h_bond_acceptors": 4,
          "h_bond_donor": 2,
          "rotatable_bonds": 5,
          "num_of_rings": 3,
          "molar_refractivity": 96.21640000000002,
          "number_of_atoms": 23,
          "topological_surface_area_mapping": 61.02
        },
        "absorption": {
          "clogp": 3.359700000000001
        }
      }
    }
  },
  "XXX?": {
    "remarks": "SMILES string is not valid !!"
  },
  "search_info": {
    "query_embedding_time": 0.08516359329223633,
    "search_time": 0.029165983200073242,
    "filter_time": 0.000051021575927734375,
    "sorting_time": 0.0447239875793457,
    "property_prediction_time": 0.06223559379577637,
    "total_time": 0.22134017944335938
  }
}
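
When processing this response, the search_info key and invalid queries need to be handled separately from the real results. The following sketch shows one way to do that, assuming (as in the example above) that an invalid query maps to an object with only a remarks field. The helper name split_array_response is illustrative, and the sample is a trimmed copy of the example response.

```python
def split_array_response(response):
    """Separate neighbor lists, invalid queries and timing information."""
    results, invalid = {}, {}
    for key, value in response.items():
        if key == "search_info":
            continue  # timing details, not a query molecule
        if "neighbors" in value:
            results[key] = [n["smiles"] for n in value["neighbors"]]
        else:
            invalid[key] = value.get("remarks", "")
    return results, invalid, response.get("search_info", {})


# Trimmed-down copy of the example response
sample = {
    "CN(C)c1ccccc1NC(=O)CSc1nc2ccccc2[nH]1": {
        "remarks": "",
        "neighbors": [
            {"smiles": "CN(C)c1ccccc1NC(=O)CSc1nc2ccccc2[nH]1",
             "zinc_id": "ZINC000154371242", "Morgan Tanimoto": 1},
            {"smiles": "CN(C)c1ccccc1NC(=O)CSc1nccc2ccccc21",
             "zinc_id": "ZINC000129508613", "Morgan Tanimoto": 0.543859649122807},
        ],
    },
    "XXX?": {"remarks": "SMILES string is not valid !!"},
    "search_info": {"total_time": 0.22134017944335938},
}

results, invalid, info = split_array_response(sample)
print(results)
print(invalid)
print(info)
```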

Searching by a file of molecules

You can search with a file of molecules and get a response in either JSON or CSV format. The input file can be in .csv, .sdf, .smi, or .txt format.

Please note that for .csv format, the inputs column should be named "SMILES".
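
If your molecules are held in a Python list, a compatible .csv input can be written with the standard csv module. This is a minimal sketch: the helper name write_smiles_csv and the output file name are assumptions, and only the "SMILES" column header comes from the requirement above.

```python
import csv


def write_smiles_csv(path, smiles_list):
    """Write one SMILES per row under a column named "SMILES"."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["SMILES"])  # column name required for .csv input
        for smiles in smiles_list:
            writer.writerow([smiles])


# Hypothetical output file name
write_smiles_csv("queries.csv", ["CCCCCC", "CCC", "CCC=O"])
```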

To call the service, send a POST request with the parameters below as form data, the query file as an uploaded file, and your generated API key and the desired response format in the headers.

Request URL : https://api.cheese.themama.ai/molsearch_file

Parameters :

search_type (str) : Can be one of the following : 'morgan', 'espsim_electrostatic', 'espsim_shape', 'usr_shape'
search_quality (str) : Can be one of the following : 'fast','accurate','very accurate'
n_neighbors (int) : Number of neighbors
descriptors (bool) : Whether to get descriptors or not
properties (bool) : Whether to get properties or not
filter_molecules (bool): Whether to apply 'No solvants' filter

Files:

search_input (file : _io.BufferedReader) file of the query molecules. The file must be provided under the search_input parameter of the POST request files (see example below).

You can find some file examples to try the API in the CHEESE web assets folder

Headers :

Authorization : Your API key, in the form Bearer $api_key
Accept : Either application/json for a JSON response or application/csv for a CSV response

The following example calls the service using the Python requests library :

Python example

import requests

# Query file
filename = "00_chembl_subset.sdf"

# Paste your API key here
api_key = "XXXXXXX"

# Upload the file under the search_input field of the POST request
with open(filename, "rb") as file:
    new_resp = requests.post(
        "https://api.cheese.themama.ai/molsearch_file",
        data={
            "search_type": "morgan",
            "n_neighbors": 5,
            "search_quality": "very accurate",
            "descriptors": False,
            "properties": True,
            "filter_molecules": False,
        },
        files={"search_input": file},
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/csv",
        },
        verify=False,  # skips TLS certificate verification
    )

# Download result file to local storage
result_filename = new_resp.headers["content-disposition"].split()[-1].replace("filename=", "")
save_path = "./"
with open(save_path + result_filename, "wb") as new_file:
    new_file.write(new_resp.content)

The response formats are similar to those of the /molsearch_array call.

Searching by a file of molecules (Example of a KNIME workflow)

As KNIME is a tool frequently used by computational chemists, we provide an example template of a KNIME workflow that processes a list of SMILES strings and returns a CSV file as a result.

You can download it from here.

Property prediction

ADMET property prediction is also supported by the API. You have the option to supply molecules and get ADMET properties as well as basic descriptors if needed.

Property prediction for a single molecule

You can perform property prediction and descriptor calculation on a single molecule query and get a JSON object as a response.

To call the service, send a GET request with the parameters below in the query string and your generated API key in the headers.

Request URL : https://api.cheese.themama.ai/predict

Parameters :

search_input (str) : SMILES string of the query molecule
descriptors (bool) : Whether to get descriptors or not
properties (bool) : Whether to get properties or not
At least one of the two above parameters needs to be True
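
The constraint on descriptors and properties can be checked on the client before a request is sent. A minimal sketch, where the helper name build_predict_params is an assumption and not part of the API:

```python
def build_predict_params(smiles, descriptors=True, properties=True):
    """Return the query parameters for /predict, enforcing the rule above."""
    if not (descriptors or properties):
        raise ValueError("At least one of descriptors or properties must be True")
    return {
        "search_input": smiles,
        "descriptors": descriptors,
        "properties": properties,
    }


print(build_predict_params("CCCCCC", descriptors=False))
```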

Headers :

Authorization : Your API key, in the form Bearer $api_key

The following example calls the service using the Python requests library :

Python example

import requests

# Query molecule in SMILES format
query = "CCCCCC"

# Paste your API key here
api_key = "XXXXXXX"

# Perform the request
response = requests.get(
    "https://api.cheese.themama.ai/predict",
    params={
        "search_input": query,
        "descriptors": True,
        "properties": True,
    },
    headers={"Authorization": f"Bearer {api_key}"},
    verify=False,  # skips TLS certificate verification
)

# Parse the JSON body into a Python dictionary
result = response.json()

CURL example

curl -X 'GET' \
  'https://api.cheese.themama.ai/predict?search_input=CCCCCC&descriptors=true&properties=true&filename=properties_and_descriptors' \
  -H 'Accept: application/json' \
  -H 'Authorization: Bearer {$API_KEY}'

An example JSON response looks like this :

{
  "descriptors": {
    "CCCCCC": {
      "molecular_weight": 86.109550448,
      "formal_charge": 0,
      "clogp": 2.5866000000000007,
      "heavy_atoms": 6,
      "h_bond_acceptors": 0,
      "h_bond_donor": 0,
      "rotatable_bonds": 3,
      "num_of_rings": 0,
      "molar_refractivity": 29.81599999999998,
      "number_of_atoms": 6,
      "topological_surface_area_mapping": 0
    }
  },
  "properties": {
    "CCCCCC": {
      "caco2_wang": -3.950169086456299,
      "clearance_hepatocyte_az": 90.212158203125,
      "clearance_microsome_az": 37.758026123046875,
      "half_life_obach": 6.711719036102295,
      "ld50_zhu": 0.6453738212585449,
      "lipophilicity_astrazeneca": 1.429108738899231,
      "ppbr_az": 71.03226470947266,
      "solubility_aqsoldb": -4.1111674308776855,
      "vdss_lombardo": 3.2378389835357666,
      "ames": 0.0000046125546759867575,
      "bbb_martins": 0.9992955923080444,
      "bioavailability_ma": 0.9417303204536438,
      "cyp2c9_substrate_carbonmangels": 0.30764755606651306,
      "cyp2c9_veith": 0.000007601168363180477,
      "cyp2d6_substrate_carbonmangels": 0.10820775479078293,
      "cyp2d6_veith": 0.0003342859272379428,
      "cyp3a4_substrate_carbonmangels": 0.004517389927059412,
      "cyp3a4_veith": 0.00001898099981190171,
      "dili": 0.12116408348083496,
      "herg": 0.7191941142082214,
      "hia_hou": 0.9802113771438599,
      "pgp_broccatelli": 0.0000023020054413791513
    }
  }
}
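
Since the response groups values first by kind (descriptors, properties) and then by SMILES, it is often convenient to merge them into one record per molecule. The sketch below shows one possible way; the helper name records_by_smiles is an assumption, and the sample keeps only a few fields from the response above.

```python
def records_by_smiles(response):
    """Merge descriptors and properties into a single dict per SMILES."""
    records = {}
    for section in ("descriptors", "properties"):
        for smiles, values in response.get(section, {}).items():
            records.setdefault(smiles, {}).update(values)
    return records


# A few fields copied from the example response
sample = {
    "descriptors": {
        "CCCCCC": {"molecular_weight": 86.109550448, "heavy_atoms": 6},
    },
    "properties": {
        "CCCCCC": {"herg": 0.7191941142082214, "dili": 0.12116408348083496},
    },
}

print(records_by_smiles(sample))
```

The same function works when only one of the two sections is present in the response.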

Property prediction for a list of molecules

You can perform property prediction and descriptor calculation on a list of molecules and get a JSON object or a CSV file as a response.

To call the service, send a GET request with the parameters below in the query string, and your generated API key and the desired response format in the headers.

Request URL : https://api.cheese.themama.ai/predict_array

Parameters :

search_input (List[str]) : List of SMILES strings of the query molecules
descriptors (bool) : Whether to get descriptors or not
properties (bool) : Whether to get properties or not
At least one of the two above parameters needs to be True

Headers :

Authorization : Your API key, in the form Bearer $api_key
Accept : Either application/json for a JSON response or application/csv for a CSV response

The following example calls the service using the Python requests library :

Python example

import requests

# Query molecules in SMILES format
queries = ["CCCCCC", "CCC", "CCC=O"]

# Paste your API key here
api_key = "XXXXXXX"

# Perform the request; a list value is sent as repeated search_input parameters
new_resp = requests.get(
    "https://api.cheese.themama.ai/predict_array",
    params={
        "search_input": queries,
        "descriptors": True,
        "properties": True,
    },
    headers={
        "Accept": "application/csv",  # ask for a CSV file instead of JSON
        "Authorization": f"Bearer {api_key}",
    },
    verify=False,  # skips TLS certificate verification
)

# Download result file to local storage
result_filename = new_resp.headers["content-disposition"].split()[-1].replace("filename=", "")
save_path = "./"
with open(save_path + result_filename, "wb") as new_file:
    new_file.write(new_resp.content)

CURL example

curl -X 'GET' \
  'https://api.cheese.themama.ai/predict_array?search_input=CCCCCC&search_input=CCC&search_input=CCC%3DO&descriptors=true&properties=true&filename=properties_and_descriptors' \
  -H 'Accept: application/json' \
  -H 'Authorization: Bearer {$API_KEY}'

The CSV response file should have the following format :

query,prop1,prop2,....,desc1,desc2... where prop are properties and desc are descriptors.

The JSON response is similar to that of the /predict call. The difference is that the descriptors and properties objects each contain one key per query SMILES.

Property prediction for a file of molecules

You can predict properties for a file of molecules and get a response either in JSON or CSV format. The file formats can either be in .csv, .sdf, .smi, or .txt formats.

To call the service, send a POST request with the parameters below as form data, the query file as an uploaded file, and your generated API key and the desired response format in the headers.

Request URL : https://api.cheese.themama.ai/predict_file

Parameters :

descriptors (bool) : Whether to get descriptors or not
properties (bool) : Whether to get properties or not

Files:

search_input (file : _io.BufferedReader) file of the query molecules. The file must be provided under the search_input parameter of the POST request files (see example below).

You can find some file examples to try the API in the CHEESE web assets folder

Headers :

Authorization : Your API key, in the form Bearer $api_key
Accept : Either application/json for a JSON response or application/csv for a CSV response

The following example calls the service using the Python requests library :

Python example

import requests

# Query file
filename = "00_chembl_subset.sdf"

# Paste your API key here
api_key = "XXXXXXX"

# Upload the file under the search_input field of the POST request
with open(filename, "rb") as file:
    new_resp = requests.post(
        "https://api.cheese.themama.ai/predict_file",
        data={
            "descriptors": False,
            "properties": True,
        },
        files={"search_input": file},
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/csv",
        },
        verify=False,  # skips TLS certificate verification
    )

# Download result file to local storage
result_filename = new_resp.headers["content-disposition"].split()[-1].replace("filename=", "")
save_path = "./"
with open(save_path + result_filename, "wb") as new_file:
    new_file.write(new_resp.content)

The response formats are similar to those of the /predict_array call.