The Small Molecule Suite

The Small Molecule Suite (SMS) is a free, open-access tool developed by the Harvard Program in Therapeutic Sciences (HiTS) and funded by the NIH. The goal of the SMS is to help scientists understand and work with the targets of molecular probes, approved drugs and other drug-like molecules, while acknowledging the complexity of polypharmacology: the phenomenon that virtually all drug-like molecules bind multiple target proteins. The SMS combines data from the ChEMBL database with prepublished data from the Laboratory of Systems Pharmacology. The methodology for calculating selectivities and similarities is explained in Moret et al., Cell Chem Biol 2019 (which can also be used to cite the Small Molecule Suite).

This work is licensed under the Creative Commons Attribution-ShareAlike license.

Use cases

Applications

Selectivity

Show the affinity and selectivity of compounds for a protein of interest.

Library

Compose custom chemical genetics libraries for gene-sets of interest.

Compound affinity and binding assertions

Filter by compound

Filter by target

Commercial availability

Include only commercially available compounds. Vendors are listed in the cross-references at ZINC or eMolecules.

Target Affinity Spectrum (TAS) values are binding affinity assertions that aggregate compound binding data from heterogeneous sources, such as full dose-response affinity measurements, single dose binding assays and binding assertions from the literature.

The binding assertions are 1, 2 and 3, with 1 representing the strongest and 3 the weakest binding. 10 represents confirmed non-binding.

See the publication for details.
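
As a small illustration of the scale described above, the following Python sketch maps TAS values to readable labels. The names `TAS_LABELS`, `describe_tas` and `is_binder` are hypothetical and not part of the SMS, and the label wording (including "intermediate" for the value 2) is a paraphrase of the text rather than official terminology.

```python
# Hypothetical helper: labels paraphrase the TAS scale described in the text.
TAS_LABELS = {
    1: "strongest binding",
    2: "intermediate binding",
    3: "weakest binding",
    10: "confirmed non-binding",
}


def describe_tas(value):
    """Return a readable label for a TAS value, or raise ValueError."""
    try:
        return TAS_LABELS[value]
    except KeyError:
        raise ValueError(f"unknown TAS value: {value!r}") from None


def is_binder(value):
    """True for binding assertions 1 to 3, False for 10 (non-binding)."""
    describe_tas(value)  # validates the value first
    return value != 10
```

Treating 10 as a separate case, rather than as the far end of a numeric scale, keeps confirmed non-binders from being mistaken for very weak binders when values are sorted or thresholded.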

Target gene

Include non-human genes

Minimum/maximum affinity (nM)

Minimum number of affinity measurements

Commercial availability

Include only commercially available compounds. Vendors are listed in the cross-references at ZINC or eMolecules.

The Selectivity app helps you find selective and potent small molecules against your target of interest.

To use the Selectivity app:

1. Select a gene of interest in the top left corner of the application.
2. Change the filter settings as needed.
3. Look at the 'Affinity and selectivity plot' and select a region of compounds you are interested in.
4. The 'Affinity and selectivity data' table will update to reflect your selection in (3). Select the compound you are most interested in to see all its known targets in the 'Affinity and selectivity reference' (you may have to scroll down).
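
The filter settings used in these steps (target gene, affinity range in nM, minimum number of measurements) can also be applied to downloaded data. The sketch below is only an illustration of that filtering logic, not the app's own implementation: the `AffinityRecord` class, its field names and the example records are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AffinityRecord:
    # All field names are hypothetical, chosen for this sketch only.
    compound: str
    target_gene: str
    affinity_nm: float
    n_measurements: int


def filter_records(records, gene, min_nm, max_nm, min_measurements=1):
    """Keep records for `gene` with affinity in [min_nm, max_nm] nM
    and at least `min_measurements` affinity measurements."""
    return [
        r
        for r in records
        if r.target_gene == gene
        and min_nm <= r.affinity_nm <= max_nm
        and r.n_measurements >= min_measurements
    ]


# Made-up records, for illustration only.
records = [
    AffinityRecord("compound_a", "GENE1", 5.0, 4),
    AffinityRecord("compound_b", "GENE1", 2500.0, 6),
    AffinityRecord("compound_c", "GENE1", 12.0, 1),
    AffinityRecord("compound_d", "GENE2", 3.0, 9),
]

selected = filter_records(records, "GENE1", 0.0, 100.0, min_measurements=2)
```

Here only `compound_a` passes all three filters: `compound_b` is outside the affinity range, `compound_c` has too few measurements, and `compound_d` is annotated to a different gene.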

x-axis

y-axis

Reference compound

Compare selected compound to all other compounds in the database.

Minimum number of affinity assays in common with reference compound

Minimum number of phenotypic assays in common with reference compound

Commercial availability

Include only commercially available compounds. Vendors are listed in the cross-references at ZINC or eMolecules.

The Similarity app helps you find compounds similar to your compound of interest.

To use the Similarity app:

1. Select a reference compound and set filters as desired. Three plots appear under 'Compound similarity plots'. These plots describe the similarity to the reference compound in phenotype (PFP), targets (TAS), and chemical structure (structural similarity, calculated using Morgan2 fingerprints in RDKit).
2. Select an area of the compound similarity plots you are interested in. The selected compounds will appear in table format under 'Compound similarity data'.
3. In the 'Compound similarity data' table, select a compound to see its Target Affinity Spectrum in the 'Compound similarity selections' section that appears below (you may have to scroll down).
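
To give a feel for how fingerprint-based structural similarity can be scored, here is a minimal pure-Python sketch. It assumes the Tanimoto coefficient as the similarity measure, which is a common choice for Morgan fingerprints but is not stated on this page, and it represents each fingerprint as a set of on-bit indices instead of calling RDKit. The function names and the toy fingerprints are hypothetical.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as
    collections of on-bit indices: |A & B| / |A | B|."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return len(a & b) / union


def rank_by_similarity(reference_bits, candidates):
    """Return (name, score) pairs sorted from most to least similar.
    `candidates` maps a compound name to its on-bit indices."""
    scored = [(name, tanimoto(reference_bits, bits)) for name, bits in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy fingerprints, for illustration only.
reference = {1, 2, 3}
candidates = {
    "compound_a": {2, 3, 4},   # shares 2 of 4 distinct bits
    "compound_b": {1, 2, 3},   # identical to the reference
    "compound_c": {7, 8},      # no bits in common
}

ranking = rank_by_similarity(reference, candidates)
```

In practice the bit sets would come from a cheminformatics toolkit; the scoring and ranking logic stays the same.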

Download Small Molecule Suite data

SMS version based on ChEMBL v29

The entire Small Molecule Suite dataset is available for download.
The data are organized in separate tables. Documentation for each table and for the relationships between tables is available.

Table documentation Download tables from Synapse

Understanding compound and target identifiers

Compounds and targets in all tables are referred to using ID numbers in the columns lspci_id and lspci_target_id, respectively.

Compound and target IDs can be translated into compound names and target symbols using the tables lsp_compound_dictionary and lsp_target_dictionary.

The table lsp_compound_dictionary also contains mappings for the most common compound databases, such as ChEMBL, eMolecules and HMS LINCS.
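
The ID translation described above amounts to two dictionary lookups. The sketch below shows the idea with the Python standard library on small in-memory stand-ins for the downloaded CSV files. Only the column names lspci_id and lspci_target_id come from the text; the `name`, `symbol` and `tas` column headings, the helper functions and the toy rows are assumptions, so check the table documentation for the real column names before adapting it.

```python
import csv
import io

# Toy stand-ins for downloaded CSV files. The `name`, `symbol` and `tas`
# column headings are assumptions made for this sketch.
COMPOUNDS_CSV = "lspci_id,name\n1,compound_a\n2,compound_b\n"
TARGETS_CSV = "lspci_target_id,symbol\n10,GENE1\n11,GENE2\n"
TAS_CSV = "lspci_id,lspci_target_id,tas\n1,10,1\n2,11,3\n3,10,10\n"


def read_lookup(text, key, value):
    """Build a {key column: value column} dictionary from CSV text."""
    return {row[key]: row[value] for row in csv.DictReader(io.StringIO(text))}


def annotate(tas_text, compound_names, target_symbols):
    """Replace IDs with names and symbols; unknown IDs are kept as-is."""
    rows = []
    for row in csv.DictReader(io.StringIO(tas_text)):
        rows.append(
            {
                "compound": compound_names.get(row["lspci_id"], row["lspci_id"]),
                "target": target_symbols.get(row["lspci_target_id"], row["lspci_target_id"]),
                "tas": int(row["tas"]),
            }
        )
    return rows


compound_names = read_lookup(COMPOUNDS_CSV, "lspci_id", "name")
target_symbols = read_lookup(TARGETS_CSV, "lspci_target_id", "symbol")
annotated = annotate(TAS_CSV, compound_names, target_symbols)
```

Falling back to the raw ID when a lookup misses makes gaps in a dictionary table visible in the output instead of silently dropping rows.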

Download tables in CSV format

Name	Description	Size
lsp_biochem_agg	Table of aggregated biochemical affinity measurements. All available data for a single compound target pair were aggregated by taking the first quartile.	15.2 MB
lsp_biochem	Table of biochemical affinity measurements.	52.9 MB
lsp_clinical_info	Table of the clinical approval status of compounds. Sourced from ChEMBL	45.5 kB
lsp_commercial_availability	Table of the commercial availability of compounds. Sourced from eMolecules (https://www.emolecules.com/).	388.0 MB
lsp_compound_dictionary	Primary table listing all compounds in the database. During compound processing distinct salts of the same compound are aggregated into a single compound entry in this table. The constituent compound IDs for each compound in this table are available in the lsp_compound_mapping table.	1.2 GB
lsp_compound_library	Library of optimal compounds for each target. See 10.1016/j.chembiol.2019.02.018 for details.	115.5 kB
lsp_compound_mapping	Table of mappings between compound IDs from different sources to the internal lspci_ids.	295.6 MB
lsp_compound_names	Table of all annotated names for compounds. The sources for compound names generally distinguish between primary and alternative (secondary) names.	13.2 MB
lsp_manual_curation	Table of manual compound-target binding assertions.	9.6 kB
lsp_one_dose_scan_agg	Table of single dose compound activity measurements as opposed to full dose-response affinity measurements. All available data for a single concentration and compound target pair were aggregated by taking the first quartile.	2.2 MB
lsp_one_dose_scans	Table of single dose compound activity measurements as opposed to full dose-response affinity measurements.	5.2 MB
lsp_phenotypic_agg	Table of aggregated phenotypic assays performed on the compounds. All available data for a single assay and compound target pair were aggregated by taking the first quartile.	70.6 MB
lsp_phenotypic	Table of phenotypic assays performed on the compounds.	82.7 MB
lsp_references	External references for the data in the database.	2.2 MB
lsp_selectivity	Table of selectivity assertions of compounds to their targets. See 10.1016/j.chembiol.2019.02.018 for details.	24.9 MB
lsp_structures	Additional secondary InChIs for compounds.	14.2 MB
lsp_target_dictionary	Table of drug targets. The original drug targets are mostly annotated as ChEMBL or UniProt IDs. For convenience we converted these IDs to Entrez gene IDs. The original mapping between ChEMBL and UniProt target IDs is in the table `lsp_target_mapping`.	1.4 MB
lsp_target_mapping	Mapping between the original ChEMBL target IDs, their corresponding UniProt IDs and Entrez gene IDs. A single UniProt or ChEMBL ID can refer to protein complexes, therefore multiple gene IDs often map to the same UniProt or ChEMBL ID.	273.8 kB
lsp_tas_references	Table that makes it easier to link TAS values to the references that were used to compute the TAS values	3.8 MB
lsp_tas	Table of Target Affinity Spectrum (TAS) values for the affinity between compound and target. TAS enables aggregation of affinity measurements from heterogeneous sources and assays into a single value. See 10.1016/j.chembiol.2019.02.018 for details.	10.6 MB
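
Several of the aggregated tables above are described as taking the first quartile of all measurements for a compound-target pair. The following sketch shows one way to compute such an aggregate with the Python standard library. The quantile interpolation method used by the SMS is not stated here, so the "inclusive" method below is an assumption, as are the function names and the example values.

```python
from collections import defaultdict
from statistics import quantiles


def first_quartile(values):
    """First quartile of a non-empty list of numbers.
    A single measurement is returned unchanged."""
    values = sorted(values)
    if len(values) == 1:
        return float(values[0])
    # Assumption: "inclusive" linear interpolation between data points.
    return quantiles(values, n=4, method="inclusive")[0]


def aggregate(measurements):
    """Group (lspci_id, lspci_target_id, value) tuples by compound-target
    pair and reduce each group to its first quartile."""
    grouped = defaultdict(list)
    for lspci_id, lspci_target_id, value in measurements:
        grouped[(lspci_id, lspci_target_id)].append(value)
    return {pair: first_quartile(vals) for pair, vals in grouped.items()}


# Made-up affinity values in nM, for illustration only.
measurements = [
    (1, 10, 10), (1, 10, 20), (1, 10, 30), (1, 10, 40), (1, 10, 50),
    (2, 11, 7),
]

aggregated = aggregate(measurements)
```

For affinity values, where lower numbers mean tighter binding, the first quartile leans toward the stronger measurements while being less sensitive to a single extreme value than the minimum would be.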

Use cases

I want pre-calculated optimized libraries for—

I have a gene and—

I have a compound and—

I want to analyze large amounts of data and—

Applications

Selectivity

Similarity

Library

Compound affinity and binding assertions

Target gene

Reference compound

Selectivity levels

Clinical phases

Expert opinion compounds

Output table

Minimum affinity for query target (nM)

Minimum number of affinity measurements
