7.9
CiteScore
 
3.6
Impact Factor
Generic selectors
Exact matches only
Search in title
Search in content
Post Type Selectors
Search in posts
Search in pages
Filter by Categories
ABUNDANCE ESTIMATION IN AN ARID ENVIRONMENT
Case Study
Correspondence
Corrigendum
Editorial
Full Length Article
Invited review
Letter to the Editor
Original Article
Research Article
Retraction notice
REVIEW
Review Article
SHORT COMMUNICATION
Short review
7.2
CiteScore
3.7
Impact Factor
Generic selectors
Exact matches only
Search in title
Search in content
Post Type Selectors
Search in posts
Search in pages
Filter by Categories
ABUNDANCE ESTIMATION IN AN ARID ENVIRONMENT
Case Study
Correspondence
Corrigendum
Editorial
Full Length Article
Invited review
Letter to the Editor
Original Article
Research Article
Retraction notice
REVIEW
Review Article
SHORT COMMUNICATION
Short review
View/Download PDF

Translate this page into:

Research Article
2025
:37;
7792025
doi:
10.25259/JKSUS_779_2025

Integrative machine learning and molecular simulation strategies for BCR-ABL inhibition in chronic myeloid leukemia

Department of Biology, University of Hail, S-153 Department of Biology, Hail, 2240, Saudi Arabia
Department of CLS, King Khalid University, Abha, Abha, 61421, Saudi Arabia
Department of Biology, Princess Nourah Bint Abdulrahman University, Riyadh, Riyadh, 11671, Saudi Arabia
Department of Biological Sciences, King Abdul Aziz University, King Abdul Aziz University, Jeddah, 21589, Mecca, Saudi Arabia
Department of Bioengineering, Integral University, Bioengineering, Lucknow, 22026, India
Department of Pharmacy, Gachon University, Hambakmoeiro 191, Yeonsu-gu,, Incheon City, 21924, Korea, Republic of Korea

*Corresponding author E-mail address: dharmendra30oct@gmail.com (DK Yadav)

Licence
This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial-Share Alike 4.0 License, which allows others to remix, transform, and build upon the work non-commercially, as long as the author is credited and the new creations are licensed under the identical terms.

Abstract

Chronic myeloid leukemia (CML) is predominantly driven by the oncogenic fusion protein BCR-ABL1, with tyrosine kinase inhibitors (TKIs) representing the primary therapeutic approach. Nevertheless, the emergence of the T315I mutation in the ABL1 kinase domain markedly reduces the effectiveness of TKIs, posing a major therapeutic challenge. In this study, we present a robust in silico workflow aimed at discovering potent and selective inhibitors against the T315I-mutated ABL1 kinase. The strategy combines machine learning-assisted quantitative structure-activity relationship (QSAR) modeling with molecular docking, followed by extensive molecular dynamics (MD) simulations performed under both physiological (310 K) and stress-induced (500 K) temperature conditions to evaluate stability and binding dynamics. A total of 2,727 phytochemicals from the PhytoHub database were mapped to PubChem IDs and screened based on drug-likeness, binding affinity, docking scores, hydrogen bonding, and predicted IC₅₀ values. Among them, 15 compounds exhibited superior predicted activity (IC₅₀ < 31.65 nM) compared to the reference inhibitor, rebastinib. Notably, compound 5281238 (flavoxanthin) emerged as a top candidate, with a predicted IC₅₀ of 31.46 nM. MD simulations revealed its remarkable stability (RMSD = 0.15 nm over 100 ns), and interaction analysis confirmed key hydrogen bonding with HIS138. Free energy landscape (FEL) profiling and molecular mechanics general born surface area (MM/GBSA) analysis (ΔG_TOTAL = -61.91 kcal/mol) further supported its favorable binding profile. This study identifies Flavoxanthin as a promising lead for combating T315I-associated resistance in CML and advancing targeted therapy development.

Keywords

BCR-ABL1
Drug discovery
Machine learning
Molecular dynamics simulations
Myeloid Leukaemia
QSAR

1. Introduction

Chronic myeloid leukemia (CML), sometimes called chronic myelogenous leukemia, is a malignant disorder of the hematopoietic system that originates in the bone marrow (Eden and Coviello, 2023). It is characterized by uncontrolled proliferation of myeloid lineage cells, a subset of leukocytes. CML is recognized as belonging to one of the four major categories of leukemia, alongside acute myeloid leukemia (AML), acute lymphoblastic leukemia (ALL), and chronic lymphocytic leukemia (CLL) (Chennamadhavuni et al., 2023). It was estimated that 2023 would see approximately 8,930 new CML diagnoses, with about 1,310 disease-related deaths (“Chronic Myeloid Leukemia - Cancer Stat Facts,” n.d.). As a myeloproliferative neoplasm, CML occurs with a frequency of 1-2 cases per 100,000 individuals and accounts for nearly 15% of all newly diagnosed leukaemia cases in adults (Jabbour and Kantarjian, 2022). The molecular hallmark of CML is the presence of the BCR-ABL fusion oncogene, which arises due to a reciprocal chromosomal translocation between chromosomes 9 and 22 (Deininger et al., 2000). This rearrangement brings together the Abelson murine leukemia (ABL) gene on chromosome 9 and the breakpoint cluster region (BCR) gene on chromosome 22, leading to the formation of the BCR-ABL1 fusion gene (Kang et al., 2016). The resulting abnormal chromosomal 22 is commonly referred to as the “Philadelphia chromosome” or Ph chromosome [7]. The BCR-ABL1 fusion gene encodes the constitutively active BCR-ABL oncoprotein, a dysregulated tyrosine kinase that lacks normal regulatory control. This persistent kinase activity continuously phosphorylates tyrosine residues on target proteins, driving leukemogenesis in CML (Faderl et al., 1999). As a consequence, multiple downstream signalling cascades, including JAK/STAT, PI3K/AKT, and RAS/MEK, are aberrantly activated, promoting uncontrolled cellular proliferation, enhanced survival, inhibition of apoptosis, and activation of oncogenic transcription factors (Faderl et al., 1999).

Tyrosine kinase inhibitors (TKIs) are currently the standard therapeutic option for patients with CML (Berke et al., 2000). Extensive research has focused on the BCR-ABL1 tyrosine kinase with the aim of identifying clinically relevant inhibitors (Abdelgawad et al., 2022; Chan et al., 2011; Jha et al., 2022; Kumar et al., 2016; Natarajan et al., 2019). The introduction of imatinib, followed by second-generation TKIs such as dasatinib, nilotinib, and bosutinib, and later the third-generation TKI ponatinib, has markedly improved clinical outcomes, offering CML patients a life expectancy approaching that of the general population (Akard, 2010; Bower et al., 2016; Sacha, 2013; Shamroe and Comeau, 2013). Imatinib functions by binding to the tyrosine kinase domain of ABL, thereby blocking the downstream signalling process that drives leukemogenesis. Despite these advances, resistance, primarily due to point mutations within the BCR-ABL kinase domain, impairs drug binding (Krishnamurty and Maly, 2010). Among these, the T315I mutation, commonly known as “the gatekeeper mutation,” is particularly problematic, as it renders nilotinib ineffective. Ponatinib was specifically developed to overcome this mutation (Cortes et al., 2013; O’Hare et al., 2009). Nevertheless, cases of resistance to ponatinib have also been reported, underscoring the urgent need for novel inhibitors that can selectively target the T315I-mutated ABL1 kinase. In this context, the present study is directed towards the design and evaluation of compounds targeting the T315I variant of the ABL tyrosine kinase.

A recently developed therapeutic agent, rebastinib, has shown potential in the treatment of patients with relapsed cases of CML as well as acute myeloid leukemia (Cortes et al., 2017a). Classified as a type II inhibitor, rebastinib interacts with the switch control pocket of the ABL1 kinase domain, a region containing crucial amino-acid residues that regulate the conformational shift from the inactive to the active state (Chan et al., 2011). By engaging this pocket, rebastinib employs a unique mechanism of action that stabilizes the ABL1 kinase domain in its inactive conformation. Evidence from a phase I clinical trial involving 40 CML patients indicated encouraging therapeutic outcomes, with the drug achieving eight complete hematologic responses. Notably, four responses occurred in patients harboring the drug-resistant T315I mutation, highlighting rebastinib’s potential to overcome resistance in refractory CML cases. (Cortes et al., 2017b; Fontana et al., 2024; Rossari et al., 2018). Biochemically, it maintains potent inhibition of ABL1-T315I in vitro, with IC₅₀ values comparable to those against wild-type ABL and has demonstrated specificity similar to ponatinib when evaluated against T315I-mutant BCR-ABL1 profiles (Ernst et al., 2024; Fontana et al., 2024; Rossari et al., 2018). Structurally, rebastinib reinforces the critical GLU282-ARG386 hydrogen bond that is essential for inactive kinase conformation, and T315I appears to enhance hydrophobic stabilization of its binding (Ernst et al., 2024; Mu et al., 2024; Rea et al., 2018; Zabriskie et al., 2014). Together, these clinical, biochemical, and structural data validate rebastinib as a high-confidence control compound: its known ability to neutralize T315I-mutated ABL1 provides essential baseline performance against which novel hits can be benchmarked within our computational modeling workflow.

Numerous phytochemicals have demonstrated the capacity to synergistically interact with standard anti-cancer medications, thereby enhancing the eradication of cancer stem cells by targeting various signaling pathways and reducing drug resistance (Choudhari et al., 2020; Egbuna et al., 2023; Kaufman-Szymczyk et al., 2019; Khajapeer et al., 2016; Soverini et al., 2021). Previous studies have demonstrated the effectiveness of in silico strategies, including molecular docking and MD simulations, in predicting ligand-protein interactions and identifying potential therapeutic candidates. Building upon this framework, the present study focuses on screening phytochemical compounds against the T315I-mutated tyrosine-protein kinase ABL1 to discover novel inhibitors capable of overcoming resistance to existing TKIs. ML-based QSAR modelling combined with molecular docking was employed in a two-step screening strategy to identify promising phytochemical hits. Within the crystal structure of tyrosine-protein kinase ABL1, rebastinib was observed as a reference ligand, and earlier studies have reported its ability to bind at the functional site of the protein. On this basis, rebastinib was selected as the control compound for comparative evaluation. To investigate the dynamic stability of protein-ligand complexes, MD simulations were carried out, followed by principal component analysis (PCA) and free energy landscape (FEL) assessments. Furthermore, binding free energies of the phytochemicals were calculated using the MM/GBSA approach to estimate their binding affinity towards the protein target. Overall, tyrosine-protein kinase ABL1 was established as a critical therapeutic target for the discovery of novel inhibitors against CML.

2. Materials and Methods

2.1 Binding pocket and compound library

The 3D structure of the T315I-mutated tyrosine-protein kinase ABL1, associated with CML, was retrieved from the Protein Data Bank (PDB ID: 3QRJ) (Berman et al., 2000). This X-ray crystallographic structure, resolved at a high resolution of 1.8 Å, is co-crystallized with rebastinib (PubChem CID: 25066467), a potent inhibitor specifically active against the T315I mutation [10, 36]. A few missing residues in the loop region of chain A were generated by employing a homology modeling approach using the SWISS-MODEL server (Schwede et al., 2003). This modeled structure was further used for molecular docking. In this study, 2,727 phytochemicals from the PhytoHub database (https://phytohub.eu/) were retrieved (Bento da Silva et al., 2016). The selected compounds were mapped to their corresponding PubChem CIDs using the “PubChem Identifier Exchange Service” (https://pubchem.ncbi.nlm.nih.gov/idexchange/) (Kim et al., 2023). Duplicate entries were removed, and only unique CIDs were retained for further processing. Compounds with available 3D-SDF structures were directly retrieved through the PubChem API, whereas those available only in 2D-SDF format were converted into 3D-SDF structures using Open-Babel (O’Boyle et al., 2011). Additionally, the SMILES representations of all compounds were stored for subsequent use in virtual screening through the QSAR model.

2.2 Machine learning-based QSAR modelling

A machine learning-based QSAR framework was employed to enable the computer-aided screening of naturally derived molecules with anticipated inhibitory action against BCR-ABL1. To construct and evaluate predictive models, six regression algorithms were applied: Linear Regression and Bayesian Ridge Regression representing linear approaches, Decision Tree Regressor as a rule-based learner, Support Vector Regressor (SVR) as a kernel-based method, and Random Forest and Gradient Boosting Regressors as ensemble learning techniques.

The compound dataset used for model development was retrieved from the ChEMBL database (https://www.ebi.ac.uk/chembl/) (Davies et al., 2015), where a query for the target “ABL1” resulted in the identification of 13 unique target entries. The target with the highest compound coverage, CHEMBL1862, comprised 17,582 compounds, of which 3,119 (Supplementary File 1) had experimentally reported IC₅₀ values. After preprocessing (removal of duplicates and incomplete entries), a curated dataset of 3,105 compounds was retained for QSAR modelling. SMILES notations of the compounds were retrieved, and the reported IC₅₀values, originally expressed in micrograms per milliliter (µg/mL), were converted into nanomolar (nM) concentrations using the formula:

Supplementary File

nM = 10 6 × IC 50 ( µg/mL ) Molecular Weight

Subsequently, the IC₅₀ values were normalized using a log₁₀ transformation to generate pIC₅₀ values suitable for regression modeling. Molecular descriptors and Morgan fingerprints were computed using RDKit (Landrum, 2014). The curated dataset was then partitioned into 70% for training (Supplementary File 2) and 30% for testing (Supplementary File 3), ensuring representative chemical diversity through random sampling. The machine learning models were trained to predict pIC₅₀ values based on the calculated molecular features.

A total of 2,727 phytochemicals retrieved from the PhytoHub database were subsequently screened using the optimized QSAR model. The predicted IC₅₀ values were employed to prioritize potential candidates, and compounds exhibiting greater predicted potency than the reference inhibitor, rebastinib, were shortlisted for validation through molecular docking studies.

2.3 Molecular docking

Protein preparation was performed using AutoDock 4.2 (Morris et al., 2009). The preparation steps included removal of water molecules, repairing of missing atoms, addition of all hydrogen atoms, and assignment of Kollman charges followed by saving the structure in PDBQT format. The grid box was generated around the binding site residues of the co-crystallized ligand rebastinib, with a spacing of 6 Å. The grid center was defined at coordinates (x = 4.75 Å, y = -14.0 Å, and z = 28.59 Å), and the box dimensions were set to 24 Å × 24 Å × 24 Å along the x, y, and z axes, respectively. Docking parameters were specified with 20 runs, an energy range of 4 kcal/mol, and an exhaustiveness of 100. For each compound, 20 docking poses were generated, and binding scores were calculated.

The top five compounds were selected for MD simulations based on their docking score and hydrogen-bond interactions within the binding pocket. Binding free energy (ΔG) values were predicted using PRODIGY-ligand (Kurkcuoglu et al., 2018; Vangone et al., 2019). Additionally, Ligand Efficiency (LE) was computed as:

L E = Δ G N h e a v y   a t o m s

Where N refers to the number of heavy atoms in the ligand.

The Lipophilic Ligand Efficiency (LLE) was calculated using:

L L E = Δ G log P

2.4 Molecular dynamics simulation

MD simulations were performed to investigate the dynamic stability and conformational flexibility of the protein-ligand complexes. Five lead compounds obtained from initial screening, along with a reference ligand, were subjected to MD simulations using GROMACS 2021.2 with the CHARMM27 force field (Berendsen et al., 1995; Hess et al., 2008). Ligand topologies and parameter files were generated using SwissParam (Zoete et al., 2011). Electrostatic interactions were calculated via Ewald Particle Mesh method (Darden et al., 1993). The systems were neutralized with Na+ and Cl⁻ counterions and solvated in a dodecahedral box with a 1 Å buffer using the TIP3P water model (Izadi et al., 2014). Energy minimization was conducted using 5000 iterations of the steepest descent algorithm. The systems were subsequently heated to 500 K, with all hydrogen bonds constrained by the LINCS algorithm. Equilibration was carried out under NVT and NPT ensembles for 1ns each at 500 K and 1 atm. Production MD runs of 10 ns were then performed to assess thermodynamic stability. The most stable compound, identified by its ability to remain securely bound within the protein’s binding site at elevated temperatures, was subjected to an extended 100 ns simulation at 310 K to evaluate long-term stability. Multiple iterations of the 100 ns run ensured the reliability of the MD approach. Control simulations with both the apo-protein and control ligand were conducted for comparative analysis. Pressure coupling was maintained using the Parrinello-Rahman barostat (Parrinello and Rahman, 1981), while temperature was regulated through the velocity-rescaling thermostat (Bussi et al., 2007). Post-simulation trajectory analysis included root mean square deviation (RMSD), root mean square fluctuation (RMSF), and hydrogen bonding profiles, all processed with standard GROMACS utilities.

2.5 PCA

PCA was performed on the protein-ligand complex trajectories using the standard protocols in GROMACS (Hess et al., 2008). Prior to analysis, the trajectories were preprocessed to remove periodic boundary artifacts. The covariance matrix of atomic positional fluctuations was then generated using the ‘gmx covar’ utility, which quantifies the correlated motions within the protein-ligand complexes. Subsequently, the eigenvalues and eigenvectors of the covariance matrix were extracted with the ‘gmx anaeig’ tool. Finally, the ‘gmx anaproj’ program was used to project the MD trajectories onto the principal components (PCs), thereby allowing the visualization and evaluation of the dominant collective motions of the complexes.

2.6 FEL

The FEL approach was employed to characterize the conformational changes in the protein-ligand complexes and to relate them to different energy states (Maisuradze et al., 2010). FEL analysis provides insight into biomolecular recognition, aggregation, and folding by distinguishing between stable conformations, represented as minima, and transient conformations, expressed as energy barriers. The Gibbs free energy was calculated using the following expression:

Δ G ( X )   =   k B T ln P ( X )

where ΔG(X) denotes the Gibbs free energy, kB is the Boltzmann constant, T is the absolute temperature (T), and P(X) is the probability distribution of conformational states.

2.7 MM/GBSA calculations

The MM/GBSA method was employed to estimate the binding free energy of the protein-ligand complexes. This approach integrates molecular mechanics energies, polar solvation energies (calculated using the Generalized Born model), and non-polar solvation energies (derived from solvent-accessible surface area, SASA). In this study, the GROMACS plugin gmx_MMPBSA was utilized to perform MM/GBSA calculations on the final 20 ns trajectory segment of the molecular dynamics simulation, providing quantitative insights into the stability and strength of the protein-ligand interactions (Miller et al., 2012; Valdés-Tresanco et al., 2021). The binding free energy was computed using the following set of equations:

(1)
Δ G = G c o m p l e x    [ G r e c e p t o r + G l i g a n d ]

(2)
Δ G b i n d i n g = Δ H T Δ S

(3)
Δ H = Δ G G A S + Δ G S O L V

(4)
Δ G G A S = Δ E E L + Δ E V D W A A L S

(5)
Δ G S O L V = Δ E G B + Δ E S U R F

(6)
Δ E S U R F   =   γ . S A S A

Here, Gcomplex, Greceptor, and Gligand represent the total free energies of the bound complex, the free receptor, and the free ligand, respectively. ΔH corresponds to the enthalpic contribution, comprising gas-phase energy (ΔGGAS) and solvation energy (ΔGSOLV). ΔGGAS is the sum of electrostatic (ΔEEL) and van der Waals (ΔEVDWAALS) interactions. ΔGSOLV includes the polar solvation energy (ΔEGB) and the nonpolar solvation energy (ΔESURF), the latter approximated by the product of the solvent surface tension coefficient (γ) and the SASA. The entropic term (-TΔS) represents the conformational entropy contribution to binding.

2.8 Evaluation of ADMET properties

The final compounds retrieved after the above steps were evaluated for the ADMET properties using the SwissADME (Daina et al., 2017) and ProTox-II server (Banerjee et al., 2018). Further, SMARTCyp was used for metabolic hotspots and CYP isoform inhibition (Rydberg et al., 2010).

3. Results and Discussions

3.1 Protein structure and compound library

The T315I-mutated tyrosine-protein kinase ABL1 (PDB ID 3QRJ) was selected as the receptor protein for molecular docking studies. The binding pocket was defined around the native ligand rebastinib, as illustrated in Fig. 1. The key amino acid residues forming the binding pocket include: LEU26, GLY27, VAL34, ALA47, VAL48, LYS49, MET56, GLU60, PHE61, LYS63, GLU64, VAL67, MET68, ILE71, LEU76, VAL77, ILE91, ILE93, GLU94, PHE95, MET96, THR97, TYR98, GLY99, ASN100, LEU132, PHE137, HIS139, LEU148, VAL157, ALA158, ASP159, PHE160, and GLY161. Importantly, residues within the ATP binding site of BCR-ABL, namely GLU64, MET96, ILE138, ALA158, and ASP159, were identified within this binding region (Banavath et al., 2014). These residues are critical as the reference points for constructing the docking grid box.

The binding pocket in the crystal structure of Tyrosine-protein kinase ABL1, defined around the control compound rebastinib, highlights the key residues involved in ligand recognition and stabilization
Fig. 1.
The binding pocket in the crystal structure of Tyrosine-protein kinase ABL1, defined around the control compound rebastinib, highlights the key residues involved in ligand recognition and stabilization

In this study, a total of 2,727 phytochemicals were retrieved from the PhytoHub database for virtual screening against the T315I-mutated tyrosine-protein kinase ABL1, a critical target in CML). Previous reports have demonstrated the utility of PhytoHub phytochemicals in drug discovery, including screening against the main protease of SARS-CoV-2 (Patel et al., 2022), as well as in other pharmacological investigations (Fiamoncini et al., 2017; Verma et al., 2021). Building on this rationale, the present study employed the PhytoHub collection to explore the potential inhibitors of the T315I-mutated ABL1 protein. These compounds were first converted into PubChem Compound Identifiers (CIDs) using the PubChem Identifier Exchange Service. Following duplicate removal, 1,288 unique CIDs were obtained. For compounds lacking direct CIDs, available InChI Keys were converted to CIDs, resulting in an additional 759 unique entries. After further duplicate elimination, a total of 1,652 unique CIDs were finalized for downstream analysis. Among these, 1,427 compounds had readily available 3D-SDF structures, which were directly retrieved via the PubChem API. For the remaining 225 compounds, 2D-SDF files were downloaded and subsequently converted into 3D-SDF format using Open Babel. All 1,652 3D-SDF files subjected to energy-minimization and converted into PDBQT format through Open Babel preprocessing. These curated structures formed the basis for the development of a machine learning-based QSAR model. The optimized QSAR framework was then applied to predict the pharmacological activity of the phytochemicals, enabling systematic screening and prioritization of potential inhibitor against the T315I-mutated ABL1 protein.

3.2 ML-based QSAR model

In this study, six machine learning regression models were employed to develop the QSAR framework. The models included linear regression, logistic regression, ridge regression, decision trees, support vector machines (SVM), and gradient boosting. To construct the training dataset, a search for “ABL1” within the “Targets” section of the ChEMBL database identified 13 unique target identifiers, which were subsequently used to retrieve compound-bioactivity data for model development. The compound with the target “CHEMBL1862” showed the highest number of compounds (17,582 compounds) when ranked according to compound activity. A total of 3,119 compounds exhibiting IC50 were identified within the target “CHEMBL1862”. Following the elimination of empty values from the IC50 dataset and the removal of duplicate entries, the compounds associated with IC50 values were successfully identified based on their corresponding SMILES representations. After preprocessing (removal of duplicates and incomplete entries), a curated dataset of 3,105 compounds was retained for QSAR modelling. The IC50 value was normalized using a logarithmic transformation with base 10 (log10). A total of 30% of the compounds was allocated as a test set, while the remaining 70% was used for training the regression models. Model accuracy was assessed based on the correlation strength between predicted and observed values, quantified by the coefficient of determination (R2). The model with the highest R2 value was selected for subsequent screening. As summarized in Table 1, the Random Forest regressor achieved the best performance, yielding an R2 of 0.69, which indicates a strong positive correlation between predicted and experimental values. Importantly, previous studies have suggested that models with an R2 value greater than 0.6 are considered acceptable for predictive QSAR applications (Guerra et al., 2016). In addition to its strong R2, the Random Forest model exhibited the lowest root mean square error (RMSE: 0.42) and mean absolute error (MAE: 0.35), highlighting its accuracy and robustness. Moreover, it achieved a Q2 value of 0.91, further supporting its predictive reliability (Table 1). To ensure the generalizability and robustness of the model, a 5-fold cross-validation was performed exclusively on the Random Forest regressor. This model exhibited the highest coefficient of determination (R2 = 0.69) among all tested algorithms. The Random Forest model was selected for cross-validation owing to its optimal balance of predictive accuracy and interpretability, supported by consistently low RMSE and MAE values across iterations. The 5-fold cross-validation confirmed the model’s stability and reliability in predicting IC₅₀ values, withR2 ranging from 0.66-0.70, mean RMSE: 0.43, and mean MAE of 0.35 (Table S1). Based on these results, the Random Forest model was finalized as the representative predictor for external validation and further screening.

Table 1. Performance metrics of the six regression models used for QSAR development, including the coefficient of determination (R2), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and cross-validated R2 (Q2). The R2 value indicates the proportion of variance in the dependent variable explained by the independent variables.
Model R2 RMSE MAE Q2
Bayesian Ridge 0.67 0.45 0.37 0.9
Linear Regression -9.2 0.2 0.16 0.98
Random Forest 0.69 0.42 0.35 0.91
Decision Tree Regressor 0.45 0.81 0.67 0.69
Support Vector Regression 0.67 0.45 0.37 0.9
Gradient Boosting Regression 0.66 0.47 0.38 0.9

Feature importance analysis (Fig. S1) identified BCUT2D_LOGPHI and VSA_EState7 as the most influential descriptors, underscoring the relevance of electronic and surface area properties. Other key contributors included MinAbsEStateIndex, FpDensityMorgan1, and MolLogP, highlighting the importance of topological, physicochemical, and lipophilic characteristics. The presence of TPSA and multiple EState and VSA descriptors further supports the role of electronic distribution in activity prediction.

The employed random forest regressor model was utilized for the purpose of conducting QSAR on a dataset consisting of 1,652 phytochemicals for activity determination. Following QSAR analysis, 15 compounds were identified as having significantly higher predicted activity than the control compound, rebastinib (Table S2). These 15 compounds, along with the control, were subsequently selected for molecular docking studies.

3.3 Molecular docking

Fifteen QSAR-screened phytochemicals, along with the control compound, were docked into the binding pocket of the T315I-mutated tyrosine-protein kinase ABL1. To validate the docking protocol, rebastinib was re-docked into the active site of the T315I-mutated BCR-ABL1 protein using the same parameters applied to the screened compounds. Structural alignment of the re-docked pose with the experimentally determined rebastinib-ABL1 crystal complex (Fig. S2) showed minimal deviation, confirming the reliability and robustness of the docking methodology. The validated re-docked conformation was subsequently employed for molecular dynamics simulations and MM/GBSA calculations to maintain methodological consistency.

Docking results indicated that the top poses of the ten highest-scoring phytochemicals exhibited binding energies below -6.0 kcal/mol. Previous studies have suggested that binding energies lower than -6.0 kcal/mol are indicative of favorable drug-target interactions [64–67]. In this study, all top ten compounds satisfied this threshold, suggesting strong binding affinity for the mutant ABL1 protein. Table 2 shows the binding scores of the top ten QSAR-screened phytochemicals and the control compound, rebastinib. For comparison, a previous study reported docking scores of ponatinib (−11.05 kcal/mol) and weaker interactions for bosutinib (−5.51 kcal/mol), bafetinib (−4.68 kcal/mol), dasatinib (−4.70 kcal/mol), nilotinib (−3.77 kcal/mol), and imatinib (−3.59 kcal/mol) (Banavath et al., 2014). Another study reported active compounds with binding scores ranging from -7.50 to -12.60 kcal/mol (Rahman et al., 2022; Kalinichenko et al., 2019). In the present analysis, rebastinib showed the lowest binding energy (-12.80 kcal/mol), consistent with prior reports, while the top ten phytochemicals demonstrated comparable or stronger binding scores within this favorable range. Although docking poses may undergo conformational rearrangements during molecular dynamics simulations, the consistently high binding affinities observed here suggest strong compatibility of these phytochemicals with the T315I-mutated ABL1 protein.

Table 2. Binding scores and hydrogen-bond interactions of the ten selected compounds and the control (rebastinib) with the T315I-mutated BCR-ABL1 protein. The binding score reflects the relative binding affinity of each ligand to the protein, while the listed residues represent the key hydrogen bonds formed within the binding pocket.
Ligand Binding energy of the top pose (kcal/mol) H-bond residues
25066467 (control) -12.8 GLU64
10168 -7.6 GLU64
14985 -8.8 NA
86749 -6 ASP159
1268142 -6.8 NA
9998730 -7.3 NA
11953937 -7.8 NA
70697714 -7.4 NA
71434444 -7.1 LYS25, GLY27, GLY29
92729 -6.5 GLU60, ARG164
5281238 -8.8 HIS139

The protein-ligand interactions were visualized using Maestro (Version 13.6) (Schrödinger, 2023). Fig. 2 presents the 2D representation of the intermolecular interactions between the protein and the ligands. Hydrogen bonding plays a pivotal role in maintaining the structural stability of protein-ligand complexes and is therefore critical high importance in drug design. These interactions not only stabilize complex conformations but also enhance binding affinity, which is essential for developing efficacious therapeutics with selective targeting of specific biological processes (Cichero et al., 2021). Based on the hydrogen-bonding interactions, five compounds: 10168, 86749, 71434444, 92729, and 5281238 were shortlisted for further investigation. Compound 10168 formed a single hydrogen bond with residue GLU64, similar to the control compound. However, despite this similarity, a significant difference in binding scores was observed (Table 2). The higher binding affinity of the control suggests the involvement of additional non-covalent interactions beyond hydrogen bonding. Compounds 86749 and 5281238 also formed a single hydrogen bond with residues ASP159 and HIS139, respectively. In contrast, compound 71434444 exhibited three hydrogen bonds with LYS25, GLY27, and GLY29, while compound 92729 formed two hydrogen bonds with GLU60, and ARG164. Further, these shortlisted compounds, together with the control, were subjected to molecular dynamics simulation. In a previous study, the binder exhibited hydrogen bonding interactions with residues ASP159 and GLU64 (Kalinichenko et al., 2019). Similarly, another report demonstrated that the lead compound DB01172 formed hydrogen bonds with GLU60, LYS63, GLU64, ILE138, and ASP159 (Banavath et al., 2014). Notably, compounds 92729, 86749, and 10168 in the present study displayed hydrogen bonding with several of these key residues, thereby corroborating earlier findings. Overall, the hydrogen-bond analysis provided additional insights into ligand-protein interactions; however, it should be noted that these observations are not directly corelated with the binding scores shown in Fig. 2. The control compound and 5281238 exhibited the most favorable binding scores, although each formed only a single hydrogen bond. It is also noteworthy that hydrogen bond identification was carried out using the Maestro tool, which applies predefined criteria to determine the presence of hydrogen bonds. The geometric criteria for defining protein-ligand hydrogen bonding are as follows: a donor-acceptor distance (D-H···A) of ≤2.5 Å, a donor angle (D-H···A) of ≥ 120°, and an acceptor angle (H···A-X) of ≥ 90°. During the binding process, conformational adjustments of both the ligand and the protein may occur to achieve the most stable pose. Consequently, previously formed hydrogen bonds may be disrupted, while new hydrogen bonds may emerge, reflecting the dynamic nature of protein-ligand interactions.

Intermolecular interaction analysis in 2D for complexes of (a) control, (b) 10168, (c) 14985, (d) 86749, (e) 1268142, (f) 9998730, (g) 11953937, (h) 70697714, (i) 71434444, (j) 92729, and (k) 5281238, highlighting key residues involved in binding
Fig. 2.
Intermolecular interaction analysis in 2D for complexes of (a) control, (b) 10168, (c) 14985, (d) 86749, (e) 1268142, (f) 9998730, (g) 11953937, (h) 70697714, (i) 71434444, (j) 92729, and (k) 5281238, highlighting key residues involved in binding

In order to strengthen the foundation for compound selection, the predicted binding free energy (ΔG), LE, and LLE of all docked compounds were assessed, as illustrated in Table S3. The control ligand (25066467) showed a ΔG of −5.5 kcal/mol with a relatively low LE of 0.134 and LLE of 1.98. Among the screened compounds, several candidates demonstrated comparable or slightly improved ΔG values, ranging from −5.33 to −5.45 kcal/mol. Notably, compound 86749 exhibited the highest LE of 0.494, suggesting efficient binding relative to its size, while 10168 displayed the highest LLE (4.09), indicating favorable binding energy concerning its lipophilicity. These metrics offer a more refined understanding of binding quality beyond ΔG alone and support the prioritization of select candidates over the control.

However, based on the hydrogen bonds formed with the protein, five compounds (10168, 86749, 71434444, 92729, and 5281238), along with the control, were selected for further analysis.

3.4 MD simulation

MD simulations provide detailed insights into protein-ligand interactions at the atomic level, making them an indispensable tool in rational drug design and development. Binding affinity, stability, and conformational changes assist in refining drug candidates and understanding their mechanisms. Phytochemicals bound to the T315I-mutated tyrosine-protein kinase ABL1 protein were evaluated for robustness and adaptation using MD simulations. The RMSD was employed to assess the stability and convergence of the protein-ligand complexes during MD simulations. Tocapture the temporal behavior of the complexes, RMSD, RMSF, and hydrogen bond interactions were analyzed throughout the trajectories. The RMSD was calculated by aligning each trajectory frame with the initial equilibrated structure of the protein-ligand complex, thereby providing a quantitative measure of structural deviation over time. Thermal Titration Molecular Dynamics (TTMD) simulations were carried out following previously reported protocols (Menin et al., 2023; Pavan et al., 2022). This approach was employed to further evaluate the stability of the protein-ligand complexes under elevated thermal conditions. The simulations were performed over a duration of 10 ns at progressively increasing temperature of 350 K, 400 K, 450 K, and 500 K. Initially, the top five hit compounds along with the control were subjected to simulations at these four elevated temperatures for 10 ns each to evaluate their conformational compatibility with the protein. Based on stability assessments from these runs, the most stable ligand was identified and subsequently subjected to extended molecular dynamics simulations under physiological conditions (300K) for 100 ns in duplicate, to ensure reproducibility and reliability of the observed results.

3.5 Trajectory analysis

RMSD calculations for the protein were performed by aligning the Cα atoms to the reference positions of the initial complex structure. This alignment strategy enabled a rigorous evaluation of structural fidelity and conformational deviations relative to the initial complex throughout the simulation. Here, the ligand RMSD was calculated by first aligning the protein backbone to the reference structure, followed by computing the positional deviations of the ligand, providing a measure of its stability within the binding pocket.

The RMSD of the ligands at 350 K, 400 K, and 450 K have been shown in the Fig. S3. It was observed that 71434444 dissociated from the protein binding site at 350 K and 400 K, whereas the other compounds remained stable at these temperatures (Figs. S3(a and b). At 450 K MD simulation, 10168, 86749, along with 71434444, had high RMSD, indicating that they moved out of the protein binding site (Fig. S3c). However, 92729, 5281238, and the control showed stable RMSD at 450 K. Further, at 500 K, high RMSD was observed for all the compounds except 5281238 and the control. In the initial simulation of 10 ns at high temperature, ligands 10168, 86749, 71434444, and 92729 had a higher (RMSD) compared to the control molecule, rebastinib. As shown in Fig. 3(a), 10 ns simulations at 500 K revealed that four of the top five compounds experienced significant displacement from the protein’s binding pocket. The application of 500 K in these simulations was not intended to mimic physiological conditions but rather to accelerate conformational sampling. Elevated-temperature MD is a widely used strategy to explore rare or slow-timescale events such as unfolding, binding pocket flexibility, or ligand dissociation, which are difficult to capture at physiological temperatures within practical timeframes. This approach provides insight into the relative stability of ligand–protein complexes under thermal stress, highlighting key differences in their dynamic behavior. The RMSD plot for the initial 2 ns was further examined in Fig. 3(b), showing that after 1 ns, the molecules with 10168, 86749, 71434444, and 92729 exhibited a noticeable deviation from the protein, indicating a reduced binding at elevated temperatures. At high temperatures, the compounds are supposed to show higher conformational variability. Notably, compound 5281238 remained stable within the protein’s binding pocket under elevated temperatures, demonstrating greater resilience compared to the control. Fig. 3(c) presents the RMSD values of ligand 5281238 and the control over 10 ns simulations at elevated temperatures. The experimental findings indicated that the control compound maintained consistent binding, with RMSD values ranging from 0.3 nm to 0.5 nm. In contrast, ligand 5281238 exhibited slightly higher RMSD values, ranging from 0.5 nm to 1 nm, reflecting modest fluctuations within the binding pocket. Moreover, these simulations confirmed that 5281238 exhibited stronger binding affinity to the protein compared to the other four compounds. Thus, it was selected for a longer run of 100 ns under physiological conditions (T=310 K).

(a) RMSD of the ligands 10168, 86749, 71434444, 92729, and 5281238 along with the control for the 10 ns simulation at 500 K, (b) RMSD of the ligands for the first 2 ns, and (c) RMSD of the 5281238 and the control (rebastinib) for 10 ns simulation, indicating structural stability
Fig. 3.
(a) RMSD of the ligands 10168, 86749, 71434444, 92729, and 5281238 along with the control for the 10 ns simulation at 500 K, (b) RMSD of the ligands for the first 2 ns, and (c) RMSD of the 5281238 and the control (rebastinib) for 10 ns simulation, indicating structural stability

The interaction analysis (Figs. S4a-l) performed at multiple time points (0 ns, 5 ns, and 10 ns) and temperatures during the MD simulations revealed significant differences in the binding behavior of the compounds towards the target protein. At the outset (0 ns) of the MD simulation at 350 K, all compounds exhibited interactions with the protein (Fig. S4a). However, by 5 ns and 10 ns, compound 71434444 had disengaged from the protein’s binding site. The control maintained robust hydrogen bonds (H-bonds) with GLU64, GLU94, GLY99, ALA158, and ASP159 at 0 ns and retained these interactions through to 5 ns (Fig. S4b). By 10 ns (Fig. S4c), its interactions were limited to GLU64, MET96, and ASP159. Other compounds, such as 10168, 86749, and 92729, initially had H-bonds at 0 ns but showed a progressive loss of these interactions over time, except for 10168, which formed new H-bonds with GLY161, PHE160, and LYS49 by 10 ns at 350 K. 5281238 showed H-bonds with ILE138 and ASP159. At a higher temperature of 400 K (Figs. S4d-f), the control continued to show stable interactions, forming new bonds with Lys49, Met96, and Val34 by 5 ns, maintaining these through 10 ns. Compound 10168 exhibited a shift in its binding partners over time, indicating dynamic interaction profiles that varied with the increased temperature. In contrast, 86749 and 92729 lost all interactions by 10 ns at 400 K. However, 5281238 showed H-bond with ASP159 and ILE138 at 0 ns and 10 ns, respectively. At 450 K (Figs. S4g-i), compounds like 10168 and 86749 moved out of the binding site, indicating a decrease in stability or affinity under higher temperatures. However, 5281238 demonstrated consistent H-bond interactions with HIS138 across all timeframes at 450 K, suggesting a robust binding even at elevated temperatures. At the highest temperature tested (500 K), all compounds except the control and 5281238 disengaged from the protein (Figs. S4j-l), showcasing the exceptional stability and sustained interaction of 5281238 and the control with ASP159 even under extreme conditions. These observations suggest that both the control and 5281238 maintain critical interactions with key residues that might be crucial for their functional efficacy, highlighting their potential resilience and effectiveness as therapeutic agents under varying physiological temperatures.

A 100 ns production run of 5281238 and a control compound were carried out in duplicate at 310 K. In addition, the apoprotein of T315I-mutated tyrosine-protein kinase ABL1 was simulated over 100 ns to allow comparison of bound and unbound states. RMSD analyses of the protein backbone (Cα atoms) and ligands were performed to assess the structural stability and equilibration of the protein-ligand complexes over the 100 ns molecular dynamics simulations. As depicted in Fig. 4(a), the protein RMSD converged within the first 20 ns and remained stable thereafter, indicating that the system had reached equilibration. Furthermore, Fig. 4(b) shows that both the reference ligand and compound 5281238 exhibited consistently low and stable RMSD values throughout the simulation period, suggesting that the ligands remained stably bound within the active site without undergoing significant conformational drift. These RMSD plateaus confirm that the simulated systems were well-equilibrated, allowing for reliable interpretation of their dynamic behavior.

(a) RMSD of protein (T315I-mutated tyrosine-protein kinase ABL1) Cα-atoms aligned to the initial structure; (b) RMSD of the ligands, 5281238 and control, over a 100 ns simulation, illustrating the maintenance of ligand binding stability over time
Fig. 4.
(a) RMSD of protein (T315I-mutated tyrosine-protein kinase ABL1) Cα-atoms aligned to the initial structure; (b) RMSD of the ligands, 5281238 and control, over a 100 ns simulation, illustrating the maintenance of ligand binding stability over time

Fig. 4(a) shows the RMSD of the protein (T315I-mutated tyrosine-protein kinase ABL1) C-α atoms in complex with 5281238 and the control, while Fig. 4(b) depicts the RMSD profiles of the ligands. The protein, both in its apo form and when bound to ligands, exhibited a consistent RMSD ranging from 0.2 nm to 0.3 nm, indicating overall structural stability throughout the simulation. In comparison, the protein demonstrated high stability in both its bound and unbound forms, indicating that no significant structural alterations occurred during the simulation. Nevertheless, the protein that was bound to 5281238 exhibited a consistent RMSD with few variations, potentially leading to a subtle alteration in its conformation. Fig. 4(b) shows the ligand RMSD, indicating that the control and 5281238 remained stable throughout the simulation, with RMSD values of 0.2 nm and 0.15 nm, respectively. Compound 5281238 exhibited the highest stability, as indicated by consistently low RMSD values, suggesting that it remained firmly bound to the protein’s initial binding site. Likewise, the 100 ns replicate simulation shown in Supplementary Fig. S5(a) confirmed that the protein RMSD remained consistently low (0.2 to 0.3 nm) when bound to both the control and 5281238, further supporting the structural stability of these complexes. As shown in Supplementary Fig. S5(b), the ligand RMSD values further corroborated this stability, with the control maintaining an RMSD of about 0.2 nm and 5281238 exhibiting minor fluctuations between 0.2 and 0.3 nm. These values were in close agreement with those obtained from the initial MD simulation, indicating reproducibility across independent runs. This consistent RMSD behavior suggests that compound 5281238 adopts a stable conformation and binding orientation throughout the simulation period. These consistently low deviations indicate the formation of a stable and conformationally well-maintained protein–ligand complex. Taken together, the results from both replicates provide strong evidence that 5281238 forms a reproducibly stable binding pose with the target protein, justifying its selection as a potential lead compound.

The 100 ns molecular dynamics simulations revealed notable differences in the conformational stability of the complexes formed with compound 5281238 and the control. Initially, both compounds (shown in Figs. S4(a and c) were well-positioned within the binding site of the target protein. By the end of the simulation (100 ns; Figs. S4(b and d), compound 5281238 maintained stable interactions within the binding site, indicating strong and sustained binding affinity. In contrast, the control compound underwent significant conformational changes, suggesting a comparatively less stable interaction. These findings underscore the potential of compound 5281238 as a more effective inhibitor, exhibiting superior stability and binding affinity relative to the control.

The RMSF of T315I-mutated tyrosine-protein kinase ABL1 in both bound and unbound states was calculated over 100 ns, as shown in Fig. 5(a). The apo form of the protein exhibited 8 residues with RMSF values > 0.3 nm, with one peak exceeding 0.4 nm, indicating high flexibility. When bound to the control, the protein showed 12 residues with RMSF > 0.3 nm, and one peak reached > 0.5 nm. In contrast, binding to compound 5281238 resulted in only 11 residues exceeding 0.3 nm, with a maximum peak of 0.4 nm. Notabl, Arg164 displayed high RMSF in both the control- and 5281238-bound complexes. This residue is critical for mediating conformational transitions between the inactive and active state of the protein (Chan et al., 2011). Therefore, the binding of 5281238 may induce conformational changes that contribute to the inactivation of the T315I-mutated tyrosine-protein kinase ABL1.

(a) RMSF of the protein (T315I-mutated tyrosine-protein kinase ABL1) bound to ligands 5281238 and control over 100 ns simulation, highlighting residue flexibility. (b) Hydrogen bond analysis of the 100 ns protein-ligand simulations, showing the number and stability of hydrogen bonds formed
Fig. 5.
(a) RMSF of the protein (T315I-mutated tyrosine-protein kinase ABL1) bound to ligands 5281238 and control over 100 ns simulation, highlighting residue flexibility. (b) Hydrogen bond analysis of the 100 ns protein-ligand simulations, showing the number and stability of hydrogen bonds formed

The hydrogen bond interactions between the protein and ligands, 5281238, and the control, were analyzed over the 100 ns simulation. As shown in Fig. 5(b), the control maintained the highest number of hydrogen bonds, ranging from 4 to 5 for most of the simulation, with occasional frames showing 6 hydrogen bonds, indicating stable binding. In contrast, 5281238 formed 1 to 2 hydrogen bonds for the majority of the simulation. Notably, during the last 20 ns, 5281238 consistently maintained 2 hydrogen bonds, suggesting strong and stable binding to T315I-mutated tyrosine-protein kinase ABL1. The replicate 100 ns MD simulation (Supplementary Figs. S5(c and d) confirmed these findings, showing similar RMSF fluctuations and hydrogen bond patterns for both the control and 5281238, reinforcing the reproducibility of the results.

The hydrogen bonds could be compared with the docked pose (initial pose), and as it is observed, the number of hydrogen bonds has increased for both control and 5281238 in the simulation. This indicates that the ligand has found a more stable pose during the simulation to maximize the hydrogen bonds. Additional hydrogen bonds can contribute to stronger interactions, leading to improved binding affinity. It was also observed that hydrogen bonds were at their maximum during the last 20 ns of the simulation, which further supports the attainment of a more stable form as the simulation progresses.

The interaction analysis of the protein-ligand complexes with 5281238 and the control has been shown in Fig. 6. The control formed three hydrogen bonds with key residues GLU64, MET96, and ARG164 of T315I-mutated tyrosine-protein kinase ABL1 (Fig. 6c). In comparison, 5281238 established a single hydrogen bond with HIS139 (Fig. 6a). Notably, GLU64 and MET96 are critical for ATP binding, while ARG164 plays a significant role in protein activation (Banavath et al., 2014). Using structure-based drug design, inhibitors were developed to selectively interact with specific residues, including ARG164 and GLU60, within the ABL1 protein (Chan et al., 2011). These residues are critical for mediating the transition between the inactive and active conformations of the protein. A previous study demonstrated that the inhibitor DCC-2036 (rebastinib) also forms hydrophobic interactions with the mutant gatekeeper ILE93 and spine residues MET168 and HIS139 (Chan et al., 2011). Consistently, in the current study, rebastinib interacted with ARG164, while compound 5281238 engaged the key residue HIS139. These findings suggest that the selected compound may inhibit the protein, thereby impairing its activity. The 3D structure shown in Figs. 6(b,d) indicates that 5281238 binds within the same pocket as the control compound. Furthermore, structural alignment reveals a high degree of similarity between the conformations of 5281238 and the control, strongly suggesting that 5281238 could exert inhibitory effects on T315I-mutated tyrosine-protein kinase ABL1, comparable to the control compound.

Protein-ligand interactions in 2D and 3D for (a, b) Compound 5281238 and (c, d) the control compound. Key binding residues and the spatial orientation of the ligands within the binding pocket are highlighted
Fig. 6.
Protein-ligand interactions in 2D and 3D for (a, b) Compound 5281238 and (c, d) the control compound. Key binding residues and the spatial orientation of the ligands within the binding pocket are highlighted

3.6 PCA and FEL

PCA is a widely used method for reducing the dimensionality of large datasets, allowing the extraction of meaningful insights. In this study, PCA was applied to examine the mobility patterns of the control and 5281238 complexes with T315I-mutated tyrosine-protein kinase ABL1, focusing on the two main components (PC1 and PC2). Fig. 7(a) presents the superimposed PCA plots of these complexes. The data points depicted in the figure correspond to individual conformations, and a greater concentration of these data points indicates a higher degree of similarity among various conformations or a reduced level of structural variation. Here, the control showed high-density cloud with less conformational variation for the 100 ns simulation. Though 5281238 showed minor dispersion but the overall data points were highly concentrated, showing less structural variation. 5281238-protein complex showed comparable results to the control-protein complex.

(a) PCA showing the distribution of conformations along the first two eigenvectors (PC1 and PC2) for the protein complexes with compound 5281238 and the control, highlighting differences in conformational variance. (b, c) FEL plots derived from PCA for the 5281238-protein complex and the control-protein complex, respectively, illustrating the relative stability and energy basins of binding conformations during the simulation
Fig. 7.
(a) PCA showing the distribution of conformations along the first two eigenvectors (PC1 and PC2) for the protein complexes with compound 5281238 and the control, highlighting differences in conformational variance. (b, c) FEL plots derived from PCA for the 5281238-protein complex and the control-protein complex, respectively, illustrating the relative stability and energy basins of binding conformations during the simulation

As shown in Supplementary Fig. S6(a), the 5281238–ABL1 complex demonstrates that the first three PCs (PC1-PC3) account for 48.4% of the total variance, with contributions of 33.9% from PC1, 9.1% from PC2, and 5.4% from PC3. The cumulative variance reaches 78.5% by the 20th component, indicating that most of the system’s motion is captured within the top few PCs. This sharp decline suggests a more directed and less flexible conformational landscape, implying that 5281238 binding leads to relatively rigid structural behavior. In contrast, Fig. S6(b) for the control-ABL1 complex shows a more distributed variance. PC1 explains 24.6%, PC2 adds 12.3%, and PC3 contributes 9.0%, totaling only 46.0% by the third component. The cumulative variance by the 20th PC is slightly lower at 77.6%, indicating a broader spread of dynamic motion across more components. This reflects greater conformational diversity and flexibility in the control complex during simulation. Overall, these PCA eigenvalue spectra suggest that the 5281238-ABL1 complex displays a more constrained and stable motion profile compared to the control, consistent with tighter binding or reduced dynamic fluctuations induced by 5281238.

The FEL provides a visual representation of the diverse conformations that a protein-ligand complex can adopt during a molecular dynamics simulation. To show this energy landscape, two variables, PC1 and PC2, were employed to gauge conformational variability, each reflecting specific system properties. An assessment of system stability can be performed by examining the shape and extent of the minimal energy regions, highlighted in blue. Compact and less expansive blue areas indicate increased stability within specific conformations adopted by the complex. The 3D projections reveal the formation of a narrow funnel, illustrating the dynamic conformational transitions occurring throughout the simulation as the system approaches a structure with minimal energy (Baidya et al., 2022). Additionally, the presence of deep violet-colored basins within the broader FEL represents local energy minima, indicating that the protein attains low-energy states during the simulation (Kannan and Kolandaivel, 2018). As depicted in Figs. 7(b,c), the complexes formed a single narrow funnel, suggesting the existence of one dominant local minimum. Both complexes exhibited local energy minima within a broader FEL, highlighted in violet, with deep basins indicating that the complexes reached low-energy, stable conformations. The control complex displayed a steeper basin compared to 5281238, suggesting greater stability for the control. Additionally, both complexes featured one wider, shallower basin corresponding to higher-energy conformations relative to the global minimum. Control has one deeper, narrower basin, while in 5281238, both basins are wider. The energy barrier in the case of control suggests that reaching global minima is more challenging. The color gradient showed that there are more deep violet color conformations, which represent the minimum state in the energy landscape of 5281238, compared to the control compound. Overall, minimum energy is more favored in the case of 5281238. However, it is noteworthy that the energy in kJ/mol shown on the Z-axis is the relative energy with respect to the minimum most energy attained by the conformation. Thus, the minimum most value on the Z-axis is zero.

Further, the hit compound 5281238 was examined for the conformation that shows the lowest energy in FEL. Fig. 8 shows the two-dimensional representation of the energy landscape and the three conformations from the FEL. Here, conformation (a) represents the minimum most state with free energy 0 kJ/mol, while (b) is the middle state with free energy of 5.47 kJ/mol, and (c) is from the top of the funnel where free energy is 11.41 kJ/mol. The orientation and interactions varied for all three states. Moreover, the surroundings of the ligand in all three poses were similar, but due to a change in orientation, the interactions changed.

Three representative conformation states derived from the FEL: (a) lowest-energy state, (b) average-energy state, and (c) highest-energy state. These states illustrate the most stable, intermediate, and least stable conformations of the protein-ligand complex, providing insights into dynamic structural behavior
Fig. 8.
Three representative conformation states derived from the FEL: (a) lowest-energy state, (b) average-energy state, and (c) highest-energy state. These states illustrate the most stable, intermediate, and least stable conformations of the protein-ligand complex, providing insights into dynamic structural behavior

In the study of the molecular interactions involving a control molecule and compound 5281238, distinct patterns of hydrogen bonding with specific amino acids across three energy states-lowest, highest, and average minimum energy states were observed in Fig. S7. In the lowest energy conformation, the control molecule formed hydrogen bonds with MET96, GLU64, and LYS63, whereas 5281238 interacted only with ILE138. As the energy state increased to the highest minimum, the control was maintained by maintaining interaction with GLU64 and forming a new bond with ASP159, whereas 5281238 shifted its interaction to HIS139. Notably, in the average minimum energy state, both molecules shared a conserved interaction with ASP159, indicating a critical binding site potentially essential for stabilizing the protein’s active or binding conformation. The conservation of interaction with ASP159 across different energy states, observed for both the control and 5281238, underscores its critical role in protein function and ligand binding, highlighting it as a key site for therapeutic targeting or further structural investigation.

3.7 MM/GBSA (Binding free energy)

The binding free energies of the protein-ligand complexes were estimated using the MM/GBSA method over the final 20 ns of the molecular dynamics trajectories (Fig. 9 and Table 3). All calculated ΔG values were negative, indicating a stable complex formation for each ligand. The contributions of ΔGGAS significantly influenced the total binding free energy (ΔGTotal). The control compound exhibited a ΔGTotal of -56.10 ± 4.50 kcal/mol, whereas compound 5281238 showed a lower ΔGTOTAL of -61.91 ± 3.82 kcal/mol. This lower binding free energy suggests that 5281238 forms a stronger and more favorable interaction with T315I-mutated tyrosine-protein kinase ABL1 compared to the control. In molecular and structural biology, a lower binding free energy is indicative of higher binding affinity, implying that 5281238 may possess enhanced inhibitory potential. The second simulation run confirmed similar trends, with the control and 5281238 both showing negative ΔGTOTAL values, further supporting the formation of a stable protein-ligand complex.

Total binding free energy (ΔGtotaL) of the protein-ligand complexes for: (a) Compound 5281238, and (b) the control. The figure illustrates the relative binding affinity and stability of each compound when complexed with T315I-mutated tyrosine-protein kinase ABL1.
Fig. 9.
Total binding free energy (ΔGtotaL) of the protein-ligand complexes for: (a) Compound 5281238, and (b) the control. The figure illustrates the relative binding affinity and stability of each compound when complexed with T315I-mutated tyrosine-protein kinase ABL1.
Table 3. Total Binding Free Energy (ΔG total) of complexes formed between T315I-mutated tyrosine-protein kinase ABL1 and the compounds 5281238 and the control.
Compounds Binding free energy first run (kcal/mol) Binding free energy second run (kcal/mol)
5281238 (last 20 ns) -61.91 ± 3.82 -52.14 ± 3.58
Control (last 20 ns) -56.10 ± 4.50 -57.59 ± 3.19
5281238 (last 50 ns) -59.32 ± 5.65 -53.83 ± 4.99
Control (last 50 ns) -55.18 ± 3.58 -56.25 ± 3.81

Over the 50 ns simulation period, compound 5281238 consistently demonstrates a more favorable and stable binding interaction with T315I-mutated tyrosine-protein kinase ABL1 compared to the control. The binding free energy of 5281238 is -59.32 ± 5.65 kcal/mol in the first run and -53.83 ± 4.99 kcal/mol in the second run. In contrast, the control compound shows binding free energies of -55.18 ± 3.58 kcal/mol in the first run and -56.25 ± 3.81 kcal/mol in the second run. These results indicate that compound 5281238 maintains a stronger binding affinity throughout the extended simulation period, further reinforcing its potential as a more effective inhibitor than the control compound.

Comparing the binding free energies over the last 50 ns and 20 ns of the simulations reveals that compound 5281238 consistently exhibits a stronger and more stable interaction with T315I-mutated tyrosine-protein kinase ABL1 than the control compound. For 5281238, the binding free energy slightly increases over the longer 50 ns period (-59.32 ± 5.65 and -53.83 ± 4.99 kcal/mol) compared to the last 20 ns (-61.91 ± 3.82 and -52.14 ± 3.58 kcal/mol), but remains significantly more favorable than the control compound, whose binding free energy shows less variation and stability between the two periods (-56.10 ± 4.50 and -57.59 ± 3.19 kcal/mol for 20 ns; -55.18 ± 3.58 and -56.25 ± 3.81 kcal/mol for 50 ns). These findings underscore the potential of 5281238 as a more effective inhibitor, demonstrating sustained strong binding affinity throughout extended simulations.

Following the 100 ns MD simulations of the protein-ligand complexes for compound 5281238 and the control, clustering analysis was performed on the trajectory data. Here, the trajectories were aligned to the protein structure to isolate and analyze the ligand’s movements independently. The protein co-ordinates were then removed, allowing for clustering analysis based solely on the ligand trajectories. A cutoff of 0.1 nm was applied, and only clusters comprising more than 10% of the total structures, corresponding to over 1000 members, were selected for further analysis. In the general case, the cut-off used is 0.3 nm, while for ligand, this value could be high. Thus, it was clustered at 0.1 nm. Clusters with 1000 members (10% of total frames) correspond to the significance of the cluster, and only these were considered for analysis. It was observed that 5281238 had one cluster in the first run and 3 clusters in the second run that had more than 10% members. The control compound formed a single cluster in both the simulation runs, and the central representative structure from each cluster was used to calculate the binding free energy via MM/GBSA. Fig. 10 illustrates the binding free energies calculated for the representative clusters of both compound 5281238 and the control. The analysis revealed that the control exhibited binding free energies of -47.24 kcal/mol and -54.49 kcal/mol for the clusters in the first and the second runs, respectively. Compound 5281238 displayed -50.08 kcal/mol in the first run and -56.4 kcal/mol, -57.19 kcal/mol, and -44.71 kcal/mol across the clusters in the second run. Notably, the first two clusters of 5281238 in the second run exhibited lower binding free energies than the control, indicating stronger binding affinity. The formation of multiple clusters for 5281238 in the second run also suggests its ability to adopt diverse conformations while interacting with the target protein. The figure shows a subtle conformational shift in ligand 5281238 from the first to the second run. Initially, the ligand was observed in close proximity to a loop in the protein structure during the first run cluster. However, in the second run, it exhibited a slight positional shift away from the loop, which likely contributed to its increased binding affinity. This subtle conformational shift may be crucial, as it could reduce steric clashes and optimize interactions with surrounding residues, thereby enhancing the overall binding affinity of the ligand for the protein. Similarly, the control ligand depicted close contact with the loop in the first run, but it slightly moved away in the second run, potentially explaining the enhanced affinity observed in the latter. This adjustment may have enabled a more optimal alignment within the protein’s active site, enhancing interactions with surrounding residues and thereby contributing to the increased binding affinity observed in the second run.

Binding free energy of the representative middle structure from the selected clusters for compounds 5281238 and control. This analysis illustrates the energy landscape and relative stability of the protein-ligand interactions within the most representative conformational clusters observed during the simulations.
Fig. 10.
Binding free energy of the representative middle structure from the selected clusters for compounds 5281238 and control. This analysis illustrates the energy landscape and relative stability of the protein-ligand interactions within the most representative conformational clusters observed during the simulations.

The entropic contributions (TΔS) were included in the MM/GBSA calculations to provide a more comprehensive understanding of the binding energetics. These values were derived using normal mode analysis and incorporated into the total free energy (ΔG = ΔH − TΔS) for both the control and the lead compound 5281238. As illustrated in Fig. S8, the entropic penalty for 5281238 was 6.36 kcal/mol, whereas the control exhibited a higher entropic penalty of 21.38 kcal/mol for the control, suggesting a more favorable entropic contribution for 5281238. This relatively lower TΔS contribution suggests reduced conformational restriction upon binding and supports the higher binding affinity of 5281238. Incorporating entropy refines the free energy analysis and underscores the thermodynamic advantage of 5281238 over the control compound.

3.8 ADMET properties

The absorption, distribution, metabolism, excretion, and toxicity (ADMET) profiles of the selected compound were evaluated, as summarized in Table 4. Compound 5281238 (flavoxanthin) exhibits a slightly higher molecular weight (584.87 Da) compared to the control compound (rebastinib) (553.59 Da). Both compounds have a similar number of rotatable bonds (9 for flavoxanthin and 10 for rebastinib), indicating comparable flexibility, which is beneficial for fitting into the binding pocket. Flavoxanthin has fewer hydrogen bond acceptors (3) and donors (2) compared to rebastinib (7 acceptors and 3 donors). This reduced hydrogen bonding potential may decrease non-specific interactions, enhancing specificity towards the target protein. Flavoxanthin exhibits a higher molar refractivity (186.24) than rebastinib (154.12), indicating a greater ability to polarize light, which correlates with stronger interactions within the binding pocket. Flavoxanthin has a lower TPSA (49.69) compared to rebastinib (123.06), suggesting better membrane permeability and, thus, potentially higher bioavailability. The higher iLOGP value of flavoxanthin (7.15) compared to rebastinib (3.52) indicates greater lipophilicity, which can facilitate better absorption through cell membranes. Flavoxanthin is moderately soluble, whereas rebastinib is insoluble. This improved solubility of flavoxanthin enhances its potential for bioavailability and easier formulation. Both compounds fall within the acceptable toxicity classes 3 and 4, according to the Globally Harmonized System of Classification and Labeling of Chemicals (GHS) and the Hazard Communication Standard (HCS). These results suggest favorable drug-like properties for 5281238 and the control, particularly in terms of their potential oral bioavailability. Both flavoxanthin and rebastinib have zero PAINS alerts, reducing the likelihood of assay interference, which supports the reliability of the screening results. The selection of flavoxanthin is further justified by its natural origin, which aligns with the current trend towards utilizing bioactive natural compounds in drug discovery due to their generally favorable safety and efficacy profiles. Overall, flavoxanthin’s combination of strong binding affinity, favorable physicochemical and ADMET properties, lower toxicity, and natural origin supports its selection as a promising candidate for further experimental validation and potential development as an inhibitor for the T315I-mutated tyrosine-protein kinase ABL1 in CML therapy. Table 4 illustrates the evaluation of key pharmacokinetic parameters alongside SMARTCyp-predicted metabolic hotspots, providing insights into the drug-likeness and potential metabolic behavior of the selected compounds. SMARTCyp analysis predicted C.30, N.38, and N.24 in Rebastinib, and C.7, C.8, and C.26 in flavoxanthin, as the most probable sites of oxidative metabolism mediated by CYP3A4, CYP2D6, and CYP2C9. Both compounds displayed low gastrointestinal (GI) absorption and were anticipated not to cross the blood-brain barrier (BBB), suggesting limited central nervous system exposure. Additionally, neither compound was identified as a P-glycoprotein (P-gp) substrate, suggesting limited active efflux. Despite their similar metabolic liabilities, both compounds are predicted to inhibit major CYP450 isoforms, including CYP2C19, CYP2C9, and CYP3A4, while showing no inhibition of CYP1A2 or CYP2D6. The predicted skin permeation values were equally low for both (Log Kp = –6.18 cm/s), further indicating limited transdermal absorption. Collectively, these findings suggest comparable metabolic and pharmacokinetic profiles, with potential implications for drug–drug interaction risk and bioavailability.

Table 4. ADMET properties of the selected compound and control.
Property Rebastinib Flavoxanthin
MW 553.59 584.87
Heavy atoms 41 43
Aromatic heavy atoms 27 0
Rotatable bonds 10 9
H-bond acceptors 7 3
H-bond donors 3 2
Molar Refractivity (MR) 154.12 186.24
TPSA 123.06 49.69
iLOGP 3.52 7.15
Solubility Insoluble Moderately soluble
PAINS alerts 0 0
Toxicity Class 4 3
SMARTCyp Top SoMs (CYPs) C.30, N.38, N.24 (3A4, 2D6, 2C9) C.7, C.8, C.26 (3A4, 2D6, 2C9)
GI absorption Low Low
BBB permeant No No
P-gp substrate No No
CYP1A2 inhibitor No No
CYP2C19 inhibitor Yes Yes
CYP2C9 inhibitor Yes Yes
CYP2D6 inhibitor No No
CYP3A4 inhibitor Yes Yes
Log Kp (skin permeation) -6.18 cm/s -6.18 cm/s

3.9 Limitations and future applications

While this study employed robust in silico approaches, including molecular docking, QSAR modeling, and MD simulations, certain limitations remain that could be addressed in future research. The study focused specifically on the T315I mutation, a critical mutation associated with drug resistance. Expanding the scope to include other clinically relevant ABL1 mutations could enhance the therapeutic applicability of the findings.

Although 2,727 phytochemicals were screened, future studies could explore a broader chemical space to identify additional potent inhibitors. Moreover, investigating potential synergistic effects of 5281238 with other TKIs or chemotherapeutic agents may improve treatment efficacy and overcome resistance mechanisms.

Crucially, experimental validation in vitro and in vivo is required to confirm the computational predictions and provide mechanistic insights. The structural and binding information generated here can guide the design of optimized TKIs. Following successful preclinical studies, clinical trials could evaluate the safety and efficacy of 5281238 in CML patients harboring the T315I mutation.

4. Conclusions

CML is primarily driven by the BCR-ABL fusion oncogene and is commonly treated with TKIs. However, the T315I mutation in the ABL1 kinase domain confers resistance to many TKIs, limiting their clinical efficacy. Rebastinib, a known inhibitor, interacts with key residues ARG164 and GLU60, effectively targeting T315I-mutated ABL1. In this study, we aimed to identify natural phytochemical inhibitors of T315I-mutated ABL1, using rebastinib as a reference. A two-step virtual screening pipeline, combining machine learning-based QSAR modeling and molecular docking, shortlisted the top five candidate compounds. MD simulations at elevated temperature (500 K) revealed that only compound 5281238 and the control maintained stable binding. Subsequent evaluations at physiological temperature (310 K), supported by PCA, FEL mapping, and MM/GBSA calculations, demonstrated that compound 5281238 maintained exceptional binding stability and exhibited favorable energetics. Notably, it formed crucial interactions with His138, a functional residue essential for kinase activity. These findings identify compound 5281238, flavaxanthin, as a promising lead for overcoming T315I-mediated resistance in CML. Overall, this study underscored the potential of flavaxanthin as a natural therapeutic candidate targeting the T315I-mutated ABL1 kinase, providing a novel strategy for managing resistant forms of CML.

Acknowledgment

This research was funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R82), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia also the authors are thankful to the Deanship of Research and Graduate Studies, King Khalid University, Abha, Saudi Arabia, for financially supporting this work through the Large Research Group Project under Grant no. R.G.P.2/480/46.

CRediT authorship contribution statement

Mohd Saeed: Writing-review & editing. Ali G. Alkhathami: Writing-review. Lamya Al-Keridis: Writing-review & editing. Nawaf Alshammari: Writing-review & editing. Hadba Al-Amrah: Writing-review & editing. Alvina Farooqui: Writing-review & editing. Dharmendra Kumar Yadav: Writing-review & editing, Writing-original draft, supervision, conceptualization. All authors have read and agreed to the published version of the manuscript.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Declaration of Generative AI and AI-assisted technologies in the writing process

The authors confirm that there was no use of artificial intelligence (AI)-assisted technology for assisting in the writing or editing of the manuscript and no images were manipulated using AI.

Supplementary data

Supplementary material to this article can be found online at https://dx.doi.org/10.25259/JKSUS_779_2025.

References

  1. , , , , , , , , . Phytochemical investigation of Egyptian spinach leaves, a potential source for antileukemic metabolites: In vitro and in silico study. Rev Bras Farmacogn. 2022;32:774-785. https://doi.org/10.1007/s43450-022-00307-0
    [Google Scholar]
  2. . Second-generation BCR-ABL kinase inhibitors in CML. N Engl J Med. 2010;363:1672-1673. https://doi.org/10.1056/NEJMc1007927
    [Google Scholar]
  3. , , , . Allosteric binding sites of aβ peptides on the acetylcholine synthesizing enzyme ChAT as deduced by in silico molecular modeling. Int J Mol Sci. 2022;23:6073. https://doi.org/10.3390/ijms23116073
    [Google Scholar]
  4. , , , . Identification of novel tyrosine kinase inhibitors for drug resistant T315I mutant BCR-ABL: A virtual screening and molecular dynamics simulations study. Sci Rep. 2014;4:6948. https://doi.org/10.1038/srep06948
    [Google Scholar]
  5. , , , . ProTox-II: A webserver for the prediction of toxicity of chemicals. Nucleic Acids Res. 2018;46:W257-W263. https://doi.org/10.1093/nar/gky318
    [Google Scholar]
  6. Bento da Silva, A., Giacomoni, F., Pavot, B., Fillatre, Y., Rothwell, J., Bartolomé Sualdea, B., Veyrat, C., Garcia-Villalba, R., Gladine, C., Kopec, R., Hollman, P., Landberg, R., Morand, C., Nunes dos Santos, C., Nyström, L., Pujos-Guillot, E., Bronze, M., Tomas-Barberan, F., Urpi-Sarda, M., Wiczkowski, W., Knox, C., Manach, C., 2016. PhytoHub V1.4: A new release for the online database dedicated to food phytochemicals and their human metabolites.
  7. , , . GROMACS: A message-passing parallel molecular dynamics implementation. Comput Phys Commun. 1995;91:43-56. https://doi.org/10.1016/0010-4655(95)00042-e
    [Google Scholar]
  8. , , , , , . Peptides spanning the junctional region of both the abl/bcr and the bcr/abl fusion proteins bind common HLA class I molecules. Leukemia. 2000;14:419-426. https://doi.org/10.1038/sj.leu.2401703
    [Google Scholar]
  9. , , , , , , , . The protein data bank. Nucleic Acids Res. 2000;28:235-242. https://doi.org/10.1093/nar/28.1.235
    [Google Scholar]
  10. , , , , , . Life expectancy of patients with chronic myeloid leukemia approaches the life expectancy of the general population. J Clin Oncol. 2016;34:2851-2857. https://doi.org/10.1200/JCO.2015.66.2866
    [Google Scholar]
  11. , , . Canonical sampling through velocity rescaling. J Chem Phys. 2007;126:014101. https://doi.org/10.1063/1.2408420
    [Google Scholar]
  12. , , , , , , , , , , , , , , , , , , , , , , , , , , . Conformational control inhibition of the BCR-ABL1 tyrosine kinase, including the gatekeeper T315I mutant, by the switch-control inhibitor DCC-2036. Cancer Cell. 2011;19:556-568. https://doi.org/10.1016/j.ccr.2011.03.003
    [Google Scholar]
  13. , , , . Leukemia, in: StatPearls. Treasure Island (FL): StatPearls Publishing; .
  14. , , , , . Phytochemicals in cancer treatment: from preclinical studies to clinical practice. Frontiers in Pharmacology. 2020;10
    [Google Scholar]
  15. Chronic Myeloid Leukemia - Cancer Stat Facts [WWW Document], n.d. SEER. URL [accessed 2023 Dec 9]. Available from: https://seer.cancer.gov/statfacts/html/cmyl.html.
  16. , , , , , . Probing in silico the benzimidazole privileged scaffold for the development of drug-like anti-RSV agents. Pharmaceuticals (Basel). 2021;14:1307. https://doi.org/10.3390/ph14121307
    [Google Scholar]
  17. , , , , , , , , , , , , , , , . Phase 1 dose-finding study of rebastinib (DCC-2036) in patients with relapsed chronic myeloid leukemia and acute myeloid leukemia. Haematologica. 2017;102:519-528. https://doi.org/10.3324/haematol.2016.152710
    [Google Scholar]
  18. , , , , , , , , , , , , , , , . Phase 1 dose-finding study of rebastinib (DCC-2036) in patients with relapsed chronic myeloid leukemia and acute myeloid leukemia. Haematologica. 2017;102:519-528. https://doi.org/10.3324/haematol.2016.152710
    [Google Scholar]
  19. , , , , , , , , , , , , , , , , , , , , , , , , , , , , , . A phase 2 trial of ponatinib in philadelphia chromosome–positive leukemias. N Engl J Med. 2013;369:1783-1796. https://doi.org/10.1056/nejmoa1306494
    [Google Scholar]
  20. , , , , , . Virtual screening with AutoDock: Theory and practice. Expert Opin Drug Discov. 2010;5:597-607. https://doi.org/10.1517/17460441.2010.484460
    [Google Scholar]
  21. , , . SwissADME: A free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules. Sci Rep. 2017;7:42717. https://doi.org/10.1038/srep42717
    [Google Scholar]
  22. , , . Particle mesh Ewald: An N⋅log(N) method for Ewald sums in large systems. J Chem Phys. 1993;98:10089-10092. https://doi.org/10.1063/1.464397
    [Google Scholar]
  23. , , , , , , , . ChEMBL web services: Streamlining access to drug discovery data and utilities. Nucleic Acids Res. 2015;43:W612-W620. https://doi.org/10.1093/nar/gkv352
    [Google Scholar]
  24. , , . The molecular biology of chronic myeloid leukemia. Blood. 2000;96:3343-3356.
    [Google Scholar]
  25. , . Chronic Myelogenous Leukemia. In: StatPearls. Treasure Island (FL): StatPearls Publishing; .
    [Google Scholar]
  26. , , , , , , , , , , , , , , , , , , , , , , , , , , . Phytochemicals and bioactive compounds effective against acute myeloid leukemia: A systematic review. Food Sci Nutr. 2023;11:4191-4210. https://doi.org/10.1002/fsn3.3420
    [Google Scholar]
  27. , , , , , , . Treatment-free remission after third-line therapy with asciminib in chronic myeloid leukemia with an atypical e19a2 BCR: ABL1 transcript and T315I mutation. Leukemia. 2024;38:2037-2040. https://doi.org/10.1038/s41375-024-02327-2
    [Google Scholar]
  28. , , , , , . The biology of chronic myeloid leukemia. N Engl J Med. 1999;341:164-172. https://doi.org/10.1056/NEJM199907153410306
    [Google Scholar]
  29. Fiamoncini, J., Feunang, Y., Dalle, C., Durand, S., Petera, M., Wishart, D., Manach, C., 2017. New metabolites of dietary terpenoids identified using&lt;em&gt; in silico&lt;/em&gt; prediction of metabolism and high-resolution mass spectrometry. The 2nd International Electronic Conference on Metabolomics Sciforum.net, pp. 4996.
  30. , , , , , . Repurposing pexmetinib as an inhibitor of TKI-resistant BCR: ABL1. Leukemia. 2024;38:1843-1847. https://doi.org/10.1038/s41375-024-02282-y
    [Google Scholar]
  31. , , , , . 2D-QSAR analysis of derivatives of quinoxaline 1,4-di-n-oxides with activity against chagas’ disease. Química Nova. 2016 https://doi.org/10.5935/0100-4042.20160078
    [Google Scholar]
  32. , , , . GROMACS 4:  Algorithms for highly efficient, load-balanced, and scalable molecular simulation. J Chem Theory Comput. 2008;4:435-447. https://doi.org/10.1021/ct700301q
    [Google Scholar]
  33. , , . Building water models: A different approach. J Phys Chem Lett. 2014;5:3863-3871. https://doi.org/10.1021/jz501780a
    [Google Scholar]
  34. , . Chronic myeloid leukemia: 2022 update on diagnosis, therapy, and monitoring. Am J Hematol. 2022;97:1236-1256. https://doi.org/10.1002/ajh.26642
    [Google Scholar]
  35. , , , , , , , , . Multitargeted molecular docking study of phytochemicals on hepatocellular carcinoma. J App Biol biotech. 2022 https://doi.org/10.7324/jabb.2023.110117
    [Google Scholar]
  36. , , , , , , , . Synthesis, biological activities and docking studies of novel 4-(Arylaminomethyl)benzamide derivatives as potential tyrosine kinase inhibitors. Molecules. 2019;24:3543. https://doi.org/10.3390/molecules24193543
    [Google Scholar]
  37. , , , , , , , , , , . The philadelphia chromosome in leukemogenesis. Chin J Cancer. 2016;35:48. https://doi.org/10.1186/s40880-016-0108-0
    [Google Scholar]
  38. , . The inhibitory performance of flavonoid cyanidin-3-sambubiocide against H274Y mutation in H1N1 influenza virus. J Biomol Struct Dyn. 2018;36:4255-4269. https://doi.org/10.1080/07391102.2017.1413422
    [Google Scholar]
  39. , , , , . Clofarabine-phytochemical combination exposures in CML cells inhibit DNA methylation machinery, upregulate tumor suppressor genes and promote caspase-dependent apoptosis. Mol Med Report. 2019 https://doi.org/10.3892/mmr.2019.10619
    [Google Scholar]
  40. , . Natural products for treatment of chronic myeloid leukemia. In: Anti-cancer Drugs - Nature, Synthesis and Cell Anti-cancer Drugs - Nature, Synthesis and Cell. InTech;
    [Google Scholar]
  41. , , , , , , , , , , , , . PubChem 2023 update. Nucleic Acids Res.. 2023;51:D1373-D1380. https://doi.org/10.1093/nar/gkac956
    [Google Scholar]
  42. , . Biochemical mechanisms of resistance to small-molecule protein kinase inhibitors. ACS Chem Biol. 2010;5:121-138. https://doi.org/10.1021/cb9002656
    [Google Scholar]
  43. , , , . In-silico identification of inhibitors against mutated BCR-ABL protein of chronic myeloid leukemia: A virtual screening and molecular dynamics simulation study. J Biomol Struct Dyn. 2016;34:2171-2183. https://doi.org/10.1080/07391102.2015.1110046
    [Google Scholar]
  44. , , , , , , , , , , , , . Performance of HADDOCK and a simple contact-based protein–ligand binding affinity predictor in the D3R grand challenge 2. J Comput Aided Mol Des. 2018;32:175-185. https://doi.org/10.1007/s10822-017-0049-y
    [Google Scholar]
  45. . Rdkit: Open-source cheminformatics. Release 2014.03.1. Zenodo 2014 https://doi.org/10.5281/ZENODO.10398
    [Google Scholar]
  46. , , . Relation between free energy landscapes of proteins and dynamics. J Chem Theory Comput. 2010;6:583-595. https://doi.org/10.1021/ct9005745
    [Google Scholar]
  47. , , , , . Thermal titration molecular dynamics (TTMD): Not your usual post-docking refinement. Int J Mol Sci. 2023;24:3596. https://doi.org/10.3390/ijms24043596
    [Google Scholar]
  48. , , , , , . MMPBSA.py: An efficient program for end-state free energy calculations. J Chem Theory Comput. 2012;8:3314-3321. https://doi.org/10.1021/ct300418h
    [Google Scholar]
  49. , , , , , , . AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. J Comput Chem. 2009;30:2785-2791. https://doi.org/10.1002/jcc.21256
    [Google Scholar]
  50. , , . Quantitative detection of T315I mutations of BCR: ABL1 using digital droplet polymerase chain reaction. Hematol Transfus Cell Ther 46 Suppl. 2024;6:S79-S85. https://doi.org/10.1016/j.htct.2023.12.007
    [Google Scholar]
  51. , , . Repurposing drugs by in silico methods to target BCR kinase domain in chronic myeloid leukemia. Asian Pac J Cancer Prev. 2019;20:3399-3406. https://doi.org/10.31557/APJCP.2019.20.11.3399
    [Google Scholar]
  52. , , . Synthesis, computational and molecular docking study of some 2, 3-dihydrobenzofuran and its derivatives. J Mol Struct. 2021;1224:129225. https://doi.org/10.1016/j.molstruc.2020.129225
    [Google Scholar]
  53. , , , . Investigating the binding affinity, molecular dynamics, and ADMET properties of 2,3-dihydrobenzofuran derivatives as an inhibitor of fungi, bacteria, and virus protein. Beni-Suef Univ J Basic Appl Sci. 2021;10 https://doi.org/10.1186/s43088-021-00117-8
    [Google Scholar]
  54. , , , , , . Open babel: An open chemical toolbox. J Cheminform. 2011;3:33. https://doi.org/10.1186/1758-2946-3-33
    [Google Scholar]
  55. , , , , , , , , , , , , , , , , , , , , , , , , , , , . AP24534, a pan-BCR-ABL inhibitor for chronic myeloid leukemia, potently inhibits the T315I mutant and overcomes mutation-based resistance. Cancer Cell. 2009;16:401-412. https://doi.org/10.1016/j.ccr.2009.09.028
    [Google Scholar]
  56. , . Polymorphic transitions in single crystals: A new molecular dynamics method. J Appl Phys. 1981;52:7182-7190. https://doi.org/10.1063/1.328693
    [Google Scholar]
  57. , , , , . Computational investigation of natural compounds as potential main protease (Mpro) inhibitors for SARS-CoV-2 virus. Comput Biol Med. 2022;151:106318. https://doi.org/10.1016/j.compbiomed.2022.106318
    [Google Scholar]
  58. , , , , . Qualitative estimation of protein–ligand complex stability through thermal titration molecular dynamics simulations. J Chem Inf Model. 2022;62:5715-5728. https://doi.org/10.1021/acs.jcim.2c00995
    [Google Scholar]
  59. , , , , , , , , , , , , , , , . In silico investigation and potential therapeutic approaches of natural products for COVID-19: Computer-aided drug design perspective. Front Cell Infect Microbiol. 2022;12:929430. https://doi.org/10.3389/fcimb.2022.929430
    [Google Scholar]
  60. , , , , , , , , , , , , , , , , , , . Asciminib, a specific allosteric BCR-ABL1 inhibitor, in patients with chronic myeloid leukemia carrying the T315I mutation in a phase 1 trial. Blood. 2018;132:792. https://doi.org/10.1182/blood-2018-99-113609
    [Google Scholar]
  61. , , . Past, present, and future of Bcr-Abl inhibitors: From chemical development to clinical efficacy. J Hematol Oncol. 2018;11:84. https://doi.org/10.1186/s13045-018-0624-2
    [Google Scholar]
  62. , , , , . SMARTCyp: A 2D method for prediction of cytochrome P450-mediated drug metabolism. ACS Med Chem Lett. 2010;1:96-100. https://doi.org/10.1021/ml100016x
    [Google Scholar]
  63. . Imatinib in chronic myeloid leukemia: An overview. Mediterr J Hematol Infect Dis. 2014;6:e2014007. https://doi.org/10.4084/MJHID.2014.007
    [Google Scholar]
  64. . Maestro. New York, NY: LLC; .
  65. , , , . SWISS-MODEL: An automated protein homology-modeling server. Nucleic Acids Res. 2003;31:3381-3385. https://doi.org/10.1093/nar/gkg520
    [Google Scholar]
  66. , . Ponatinib. Ann Pharmacother. 2013;47:1540-1546. https://doi.org/10.1177/1060028013501144
    [Google Scholar]
  67. , , , , . Targeting leukemic stem cells in chronic myeloid leukemia: is it worth the effort? Int J Mol Sci. 2021;22:7093. https://doi.org/10.3390/ijms22137093
    [Google Scholar]
  68. , , , . gmx_MMPBSA: A new tool to perform end-state free energy calculations with GROMACS. J Chem Theory Comput. 2021;17:6281-6291. https://doi.org/10.1021/acs.jctc.1c00645
    [Google Scholar]
  69. , , , , , , , . Large-scale prediction of binding affinity in protein–small ligand complexes: The PRODIGY-LIG web server. Bioinformatics. 2019;35:1585-1587. https://doi.org/10.1093/bioinformatics/bty816
    [Google Scholar]
  70. , , , , , , , . In silico screening of DNA gyrase B potent flavonoids for the treatment of clostridium difficile infection from PhytoHub database. Braz Arch Biol Technol. 2021;64 https://doi.org/10.1590/1678-4324-2021200402
    [Google Scholar]
  71. , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , . BCR-ABL1 compound mutations combining key kinase domain positions confer clinical resistance to ponatinib in Ph chromosome-positive leukemia. Cancer Cell. 2014;26:428-442. https://doi.org/10.1016/j.ccr.2014.07.006
    [Google Scholar]
  72. , , , . SwissParam: A fast force field generation tool for small organic molecules. J Comput Chem. 2011;32:2359-2368. https://doi.org/10.1002/jcc.21816
    [Google Scholar]
Show Sections