Screening Libraries
Our ready-to-use ChemScene compound libraries feature over 4,000 small molecules with validated biological and pharmacological activities, ideal for high-throughput screening (HTS) and high-content screening (HCS). These libraries serve as essential tools for drug discovery and research into new indications.
A diverse lead-like library is the foundation for achieving diverse biological activity. The ChemScene Lead-like Diverse Library Plus extends the 50K Lead-like Compound Library (HY-L901): it consists of over 80,000 lead-like compounds, adding 30,000 structurally novel lead-like molecules. These compounds occupy a broader "chemical space", making the library a powerful tool for new drug discovery.
Cysteine proteases (CPs), a key enzyme family regulating physiological metabolism and mediating pathological processes (such as abnormal bone resorption, tumour invasion, and pathogen infection), represent a core therapeutic target for developing specific inhibitors in disease intervention. Currently reported CP inhibitors primarily achieve their inhibitory function by precisely binding to CP active pockets (e.g., S1-S4 non-primed regions or S1'-S2' primed regions) and forming covalent/non-covalent interactions with the active site cysteine residues, providing clear structural references for the development of novel inhibitors. This compound library, designed based on the core strategy of "similarity-based known active structures", contains over 200 cysteine protease inhibitors. Leveraging AI-driven molecular screening technology, it retains the critical pharmacological and shape features of reported CP inhibitors, serving as a specialized tool for efficiently discovering novel cysteine protease inhibitors.
Cereblon (CRBN) is the substrate recognition subunit of the E3 ubiquitin ligase complex in the ubiquitin-proteasome system. A CRBN ligand library is a collection of fragments that can specifically bind to the CRBN protein. These ligands are mostly designed based on validated CRBN-binding warheads and modified through AI-driven molecular generation and optimization systems. They include classic lenalidomide-derived structures as well as novel non-lenalidomide scaffolds. After drug-likeness filtering, these ligands exhibit structural diversity and favorable druggable properties. They can be further optimized and modified to facilitate the development of novel molecular glue degraders, accelerate the discovery of molecular glues that induce interactions between CRBN and new substrate proteins, and enable the exploration of novel CRBN substrates to identify previously unknown CRBN-binding proteins. ChemScene has compiled 118 fragments that can specifically bind to the CRBN protein, with molecular weights ranging from 200 to 500. Compounds developed from the library ligands can address multiple disease areas, such as cancer and autoimmune diseases, further advancing the development of molecular glue and PROTAC therapeutic agents.
With the aging population and the increasing competitive pressures of modern society, diseases of the central nervous system (CNS), including Parkinson's disease, Alzheimer's disease, brain tumors, and multiple sclerosis, have become a serious medical challenge. The CNS MPO (Multi-Parameter Optimization) score is a widely recognized algorithm in medicinal chemistry. Developed by Pfizer, the method is based on an analysis of approved CNS drugs and Pfizer's internal CNS drug candidates, from which the CNS MPO rules were established. It incorporates six key physicochemical properties (ClogP, ClogD, MW, TPSA, HBD, and pKa) to prospectively optimize CNS drug attributes, such as high blood-brain barrier (BBB) permeability, low P-gp efflux liability, low metabolic clearance, and high safety, thereby improving the clinical success rate of CNS drug candidates. The CNS MPO compound library is a collection of compounds with CNS MPO scores greater than 5, specifically designed for CNS drug discovery.
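As a rough illustration of how a six-property score of this kind can be computed, the sketch below sums one desirability value (0 to 1) per property, giving a total between 0 and 6. The breakpoints used are the commonly cited ones for the published CNS MPO scheme and are an assumption here, since this page does not state them. The function names are hypothetical, and the property values are assumed to be precomputed elsewhere.

```python
def _falling(value: float, full: float, zero: float) -> float:
    """Desirability of 1.0 at or below `full`, 0.0 at or above `zero`, linear in between."""
    if value <= full:
        return 1.0
    if value >= zero:
        return 0.0
    return (zero - value) / (zero - full)


def _hump(value: float, zero_lo: float, full_lo: float,
          full_hi: float, zero_hi: float) -> float:
    """Desirability of 1.0 inside [full_lo, full_hi], falling linearly to 0.0 on both sides."""
    if value <= zero_lo or value >= zero_hi:
        return 0.0
    if value < full_lo:
        return (value - zero_lo) / (full_lo - zero_lo)
    if value <= full_hi:
        return 1.0
    return (zero_hi - value) / (zero_hi - full_hi)


def cns_mpo_score(clogp: float, clogd: float, mw: float,
                  tpsa: float, hbd: float, pka: float) -> float:
    """Sum of six desirabilities (assumed breakpoints, not taken from this page)."""
    return (
        _falling(clogp, 3.0, 5.0)                 # lipophilicity
        + _falling(clogd, 2.0, 4.0)               # distribution coefficient at pH 7.4
        + _falling(mw, 360.0, 500.0)              # molecular weight
        + _hump(tpsa, 20.0, 40.0, 90.0, 120.0)    # topological polar surface area
        + _falling(hbd, 0.5, 3.5)                 # hydrogen bond donors
        + _falling(pka, 8.0, 10.0)                # most basic pKa
    )


def passes_library_cutoff(score: float, cutoff: float = 5.0) -> bool:
    """Library criterion stated in the text: CNS MPO score greater than 5."""
    return score > cutoff


if __name__ == "__main__":
    # Hypothetical property values for a single compound.
    score = cns_mpo_score(clogp=4.0, clogd=1.0, mw=300.0, tpsa=60.0, hbd=0, pka=7.0)
    print(round(score, 2), passes_library_cutoff(score))
```

Because each property contributes at most 1.0, a cutoff of 5 means a compound can fall short of the ideal range on roughly one property in total.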
In contrast to highly conserved conventional orthosteric sites, allosteric sites are characterized by low conservation, high hydrophobicity, weak polarity, confined spatial geometry, and dynamic, cryptic behavior. Their core structures differ significantly from orthosteric pockets: allosteric pockets are mostly dynamic grooves formed by protein conformational changes, subunit interface clefts, or shallow depressions, rather than the rigid "keyhole" structure of orthosteric sites. With looser spatial constraints, allosteric sites offer high selectivity and low off-target risk, and have become an important direction in new drug discovery. Based on the dynamic, hydrophobic, and long, narrow spatial characteristics of allosteric pockets, ChemScene has performed targeted modification and screening of fragments. The screening criteria strictly conform to the requirements of allosteric binding: molecular weight of 120–280 Da (to keep fragments small and readily derivatizable), hydrogen bond donors (HBD) ≤ 2, hydrogen bond acceptors (HBA) ≤ 3, polar surface area (PSA) of 30–80 Ų, rotatable bonds ≤ 2, moderate hydrophobicity (cLogP of 1–3.5), no strongly ionizable groups, and a balance of rigidity and conformational flexibility to adapt to the dynamic changes of the pocket. In addition, principal moment of inertia (PMI) analysis was used to select fragments with high 3D diversity. Such fragments have good shape complementarity with allosteric pockets, so they can enter the pockets and bind stably while leaving room for subsequent optimization and derivatization. This library contains 1,800 structurally diverse fragment molecules with excellent drug-like properties, suitable for allosteric drug development and for the design and optimization of ligands for allosteric sites. It combines the
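The numeric screening criteria listed above can be expressed as a simple property filter. The following is a minimal sketch, assuming the descriptors have already been computed with a cheminformatics toolkit; the `Fragment` record, its field names, and the example values are hypothetical. It does not cover the PMI-based 3D diversity selection.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Fragment:
    """Precomputed descriptors for one fragment (hypothetical record layout)."""
    name: str
    mw: float                 # molecular weight, Da
    hbd: int                  # hydrogen bond donors
    hba: int                  # hydrogen bond acceptors
    psa: float                # polar surface area, square angstroms
    rotatable_bonds: int
    clogp: float
    strongly_ionizable: bool  # True if a strongly ionizable group is present


def meets_allosteric_criteria(f: Fragment) -> bool:
    """Apply the numeric thresholds quoted in the library description."""
    return (
        120.0 <= f.mw <= 280.0
        and f.hbd <= 2
        and f.hba <= 3
        and 30.0 <= f.psa <= 80.0
        and f.rotatable_bonds <= 2
        and 1.0 <= f.clogp <= 3.5
        and not f.strongly_ionizable
    )


def filter_fragments(fragments: Iterable[Fragment]) -> List[Fragment]:
    """Keep only the fragments that satisfy every criterion."""
    return [f for f in fragments if meets_allosteric_criteria(f)]


if __name__ == "__main__":
    # Hypothetical example fragments, not real library members.
    candidates = [
        Fragment("frag-ok", 200.0, 1, 2, 45.0, 1, 2.0, False),
        Fragment("frag-heavy", 320.0, 1, 2, 45.0, 1, 2.0, False),
        Fragment("frag-charged", 200.0, 1, 2, 45.0, 1, 2.0, True),
    ]
    print([f.name for f in filter_fragments(candidates)])
```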
Covalent inhibitors are small molecules that bind specifically to target proteins through covalent bonds and inhibit their biological functions. Covalent targeting long played a subordinate role in drug discovery, but with a growing number of reports of successful clinical applications, the potential of these agents is now being acknowledged. Currently, cysteine is the amino acid residue most commonly targeted by covalent drugs, and various cysteine-reactive warheads have been developed, providing the key building blocks that allow covalent drugs to form covalent bonds. To meet the development needs of covalent inhibitors targeting cysteine, ChemScene has designed a unique collection of 3,844 fragments with different cysteine-targeting covalent warheads. The ChemScene Cysteine Targeted Covalent Fragment Library is designed using the following covalent warheads: acrylamides, propiolic acid esters, dimethylamine-functionalized acrylamides, chloroacetamides, acrylonitriles, 2-cyanoacrylamides, aziridines, haloacetamides, etc. All fragments are pre-filtered with the Rule of Three restrictions and can be used for fragment-based covalent drug development.
Lysine is the second most common target residue used in the design of targeted covalent inhibitors (TCIs) and related covalent ligands. Its appeal lies in its abundance in human proteins, which is approximately three times higher than that of cysteine (5.8% vs. 1.9%). This significantly increases the number of proteins suitable for covalent targeting, especially given that many human proteins lack ligandable cysteine residues. Moreover, it has been suggested that functional lysines are less likely to be replaced by mutation, as they often play a crucial role in catalysis by acting as bases or nucleophiles. Lysines are also essential for maintaining the structural integrity of proteins and are sites of regulatory post-translational modifications (PTMs). Consequently, targeting lysine has garnered significant interest in recent years. Through careful selection, we constructed a structural filter containing over 110 electrophilic groups. We then analyzed the electrophilic fragments selected by this filter and removed any molecules with trivial or undesirable structural features. Ultimately, we obtained 445 fragment molecules that can target lysine residues and can be used for fragment-based covalent drug discovery.
In drug research and development (R&D), target binding and druggability optimization are core processes. High solubility is critical for a compound to achieve druggability, as it directly affects the progress of drug R&D. Good solubility supports rapid dissolution and uniform distribution of drug molecules in vivo, thereby enhancing bioavailability and mitigating problems caused by insufficient solubility, such as suboptimal efficacy, increased dosage requirements, or exacerbated toxicity and side effects. From the perspective of medicinal chemistry, high-solubility drug fragments serve as high-quality "molecular building blocks". Starting from these fragments, lead compounds with potential druggability can be identified rapidly, which significantly shortens the drug R&D cycle and reduces R&D costs. A high-solubility fragment library can also provide diverse options for drug development in different therapeutic areas, offer solutions for the solubility defects of existing clinical drugs, and facilitate the development of novel, highly effective targeted drugs with higher bioavailability and better safety profiles. ChemScene has collected and compiled 2,527 small-molecule fragments with experimentally validated high solubility. These fragments can be used directly in drug molecular design, improving the efficiency of lead compound screening and accelerating the progress of drug R&D.
19F-NMR has proved to be an effective detection mode in fragment-based drug discovery (FBDD) for studies of protein structure and interactions. The 19F nucleus is highly sensitive in NMR detection, and 19F chemical shifts and linewidths respond sensitively to ligand binding, which makes 19F-NMR a valuable approach in FBDD. Fluorine-containing fragments (F-fragments) can be detected by 19F-NMR after binding to target proteins and therefore serve as an effective 19F-NMR tool for FBDD. ChemScene has designed a unique collection of 5,124 F-fragments, all of which obey a heuristic rule called the “Rule of Three (RO3)”: molecular weight ≤ 300 Da, number of hydrogen bond donors (H-donors) ≤ 3, number of hydrogen bond acceptors (H-acceptors) ≤ 3, and cLogP ≤ 3. This F-fragment library is an important source of lead-like compounds.
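The Rule of Three thresholds quoted above can be checked in a few lines. This is a minimal sketch that reports which rules a fragment breaks; the function name and the example values are hypothetical, and the descriptors are assumed to be precomputed.

```python
from typing import List


def ro3_violations(mw: float, hbd: int, hba: int, clogp: float) -> List[str]:
    """Return the Rule of Three thresholds (as quoted in the text) that are exceeded."""
    violations = []
    if mw > 300.0:
        violations.append("MW > 300")
    if hbd > 3:
        violations.append("HBD > 3")
    if hba > 3:
        violations.append("HBA > 3")
    if clogp > 3.0:
        violations.append("cLogP > 3")
    return violations


if __name__ == "__main__":
    # Hypothetical descriptor values for two fragments.
    print(ro3_violations(mw=250.0, hbd=1, hba=2, clogp=2.1))
    print(ro3_violations(mw=320.0, hbd=4, hba=2, clogp=2.1))
```

Returning the list of violated rules, rather than a single boolean, makes it easier to see why a candidate was rejected.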
RNA is crucial for the regulation of numerous cellular processes and functions. With the in-depth study of disease mechanisms, processes such as RNA expression, splicing, translation, and stability regulation have become new targets for disease intervention. RNA targeting has provided new therapeutic modalities for patients with metabolic diseases, genetic disorders, and cancer, resulting in several innovative drugs. The ChemScene R&D team collected RNA-targeting small molecules from the PDB, R-BIND, ROBIN, and an internal database as the positive dataset, and small molecules from ROBIN that do not target RNA as the negative dataset. We encoded the molecules with the GeminiMol pre-trained model and calculated over 1,700 molecular descriptors using Mordred as model inputs. We then trained 13 deep learning models on these data, all of which yielded good training results, with AUROCs greater than 0.75. Ultimately, we selected the Finetune model, which exhibited the best classification performance (an AUROC of 0.82 and a prediction accuracy of 0.76), to screen HY-L901P. We then applied filtering based on StaR rules (requiring at least two of the following properties: cLogP ≥ 1.5, Molar Refractivity ≥ 4, Relative Polar Surface Area ≤ 0.3) to obtain a library containing approximately 5,000 small-molecule compounds targeting RNA. This library serves as a valuable tool for screening small molecules that interact with RNA.
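The final StaR filtering step is an "at least two of three" test, which can be sketched as below. The function name is hypothetical, the thresholds are those quoted in the text, and the three descriptor values are assumed to be precomputed in the same units the page uses.

```python
def passes_star_rules(clogp: float, molar_refractivity: float,
                      relative_psa: float, min_hits: int = 2) -> bool:
    """True if at least `min_hits` of the three StaR conditions quoted in the text hold."""
    conditions = [
        clogp >= 1.5,
        molar_refractivity >= 4.0,
        relative_psa <= 0.3,
    ]
    # Booleans sum as 0/1, so this counts the satisfied conditions.
    return sum(conditions) >= min_hits


if __name__ == "__main__":
    # Hypothetical descriptor values for two compounds.
    print(passes_star_rules(clogp=2.0, molar_refractivity=5.0, relative_psa=0.5))
    print(passes_star_rules(clogp=1.0, molar_refractivity=3.0, relative_psa=0.2))
```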