Enhanced major depressive disorder diagnosis: A hybrid deep learning framework utilizing EEG and fMRI
*Corresponding author E-mail address: changchunlxiuy@163.com (X. Lin)
Abstract
Major Depressive Disorder (MDD) is a severe mental disorder that requires early detection, which poses a need to enhance diagnostic accuracy. Although Electroencephalogram (EEG) and functional magnetic resonance imaging (fMRI) have demonstrated the ability to diagnose MDD, their combined application has not been extensively investigated. This work presents a new two-stream deep learning model for MDD diagnosis with EEG and fMRI data acquired simultaneously. The proposed approach is to process EEG and fMRI data independently by two different Convolutional Neural Networks (CNNs) to obtain discriminative features simultaneously. These features were merged and then classified using a support vector machine optimized by a genetic algorithm. The proposed model achieved a mean diagnostic accuracy of 98.67%, along with a recall of 96.43% and a precision of 100%. This study demonstrates the framework’s effectiveness as a tool to assist clinicians in enhancing the accuracy of MDD diagnosis and facilitating timely treatment planning.
Keywords
Convolutional neural network
Genetic algorithm
Major depression disorders
Support vector machine
1. Introduction
Major Depressive Disorder (MDD) is a disorder characterized by two main features: experiencing a depressed mood (low mood) in various situations and a lack of interest in activities that were previously enjoyable for a minimum of 2 weeks. The lifetime prevalence of major depression is 16.6%, and the 1-year prevalence is 6.7% (Thapar et al., 2022). The prevalence of the disease is twice as high in women as in men. In men, the rate of incidence is 5-12%, while in women, it is 14-19%. The onset of the disease occurs around the age of 30. According to available statistics, 2-7% of depressed adults die by suicide (Li et al., 2022a). Furthermore, more than 60% of individuals who die by suicide suffer from severe depression or other mood disorders (Rönnqvist et al., 2021).
It is believed that the main causes of depression are genetic, environmental, and psychological factors (Fig. 1) (Thapar et al., 2022). In about 40% of cases, genetic factors are estimated to be the main cause of susceptibility. The diagnosis of depression is based entirely on the reported experiences of the individual and on cognitive status evaluations, and there are no laboratory tests for its rapid diagnosis (Abdoli et al., 2022). Tests may, however, occasionally be performed to rule out medical disorders that exhibit similar symptoms. It should be highlighted that ordinary low mood, which is milder and a normal part of life, differs from clinical depression. Differentiating between these two is essential (Park & Kim, 2020).

Fig. 1. Factors contributing to MDD.
Many researchers have conducted studies to evaluate Electroencephalogram (EEG) and functional magnetic resonance imaging (fMRI) as diagnostic tools for MDD. The research on EEG-based diagnosis of MDD has focused on power spectral density as linear features (Mumtaz et al., 2015; Cai et al., 2020; Čukić et al., 2020; Mahato & Paul, 2019; Shi et al., 2020; Sun et al., 2019) and entropy and fractal dimension as nonlinear features (Čukić et al., 2020; Shi et al., 2020; Shen et al., 2019; Li et al., 2020) and functional connectivity measures as additional features (Sun et al., 2019; Sun et al., 2020). Each of these studies shows that EEG effectively captures MDD brain patterns and produces different levels of accuracy measurements (Bandopadhyay et al., 2020; Mahato et al., 2020; Li et al., 2022b; Rafiei et al., 2022; Lin et al., 2022; Zhang et al., 2023). Alongside these efforts, fMRI research analyzes MDD brain connection dynamics (Liu & Gui, 2024) and combines various data scales to recognize MDD symptoms (Liu & Gui, 2024). MDD diagnosis through fMRI data analysis can be done using graph neural networks as described in (Xia et al., 2023). However, the integration of EEG and fMRI data within deep learning frameworks for MDD diagnosis remains an area with limited exploration.
While EEG and fMRI have each shown promise in MDD diagnosis, their combined use within a deep learning framework has not been extensively explored, and this study addresses that gap. Research on deep learning architectures that learn features from such multimodal data is also still limited. A more elaborate method is therefore needed to exploit the complementary information obtained by combining EEG and fMRI signals. The rationale of this study is rooted in the need for early and accurate MDD diagnosis, since early identification can play a pivotal role in improving patient prognosis. Our goal is thus to contribute to mental healthcare by creating an efficient and reliable diagnostic instrument.
Achieving this goal requires efficient feature extraction and a well-tuned classifier, both of which are addressed in this paper. The proposed method employs convolutional neural networks (CNNs) to extract features from EEG and fMRI signals, and a combination of a genetic algorithm (GA) and a support vector machine (SVM) for classification. The novelty of this study is threefold. First, a new deep learning model that integrates EEG and fMRI data is presented to extract distinct features from both modalities. Second, the signals are processed in parallel, with a separate CNN extracting features from each, which improves the model's capacity to distinguish between healthy and depressed people. Finally, fusing the features extracted from EEG and fMRI offers a more holistic picture of brain activity, enhancing the diagnosis. The contributions of the research include the following:
- Developing a new hybrid framework for diagnosing MDD based on EEG and fMRI data.
- Effective feature extraction from EEG and fMRI signals using CNNs, and enhanced diagnostic performance through the fusion of EEG and fMRI features.
- Optimization of support vector machine (SVM) hyperparameters using a GA for improved classification.
The remainder of the article is structured as follows: Section 2 describes the materials and methods of the proposed approach, Section 3 presents and discusses the outcomes obtained with it, and the final section presents the research's conclusions.
2. Materials and Methods
The presented model for MDD diagnosis includes the following steps:
1. Preprocessing of EEG and fMRI signals.
2. Extracting features by CNN.
3. Classification by SVM and GA.
The stages of the introduced approach have been illustrated in Fig. 2. This method performs parallel processing of EEG and fMRI signals. A sequence of temporal fMRI images and simultaneous EEG signals form the input of the proposed diagnostic system. In the first phase of the introduced approach, the fMRI image and EEG signal are simultaneously preprocessed.

Fig. 2. Diagram of the introduced approach.
In the second phase of the introduced approach, the preprocessed spectrogram obtained from the EEG signal and fMRI image are used as inputs for two CNN models to extract descriptive features of EEG and fMRI signals based on these CNN models. At the end of the feature extraction step, the two feature vectors, EEG and fMRI, are merged, and the resulting vector is used as the input for the SVM classifier in the third step. The predicted label (MDD/healthy) for each input instance is the output of the diagnosis system.
2.1 Preprocessing
The presented model starts with the preprocessing of EEG and fMRI signals. Due to the different nature of these two signals, the proposed method employs two different mechanisms for preprocessing these signals, which are described in this section.
2.1.1 EEG preprocessing
In most MDD diagnosis studies based on EEG, signals from 64 or 128 electrodes are used to record brain activities. However, using this configuration presents complexities and challenges. Preparing the patient for recording signals from 64 or 128 electrodes is time-consuming, and the use of conductive gels complicates the recording process. The prefrontal lobe exhibits high correlation with emotional processes, making it sufficient to rely on signals recorded by electrodes in this region for the MDD detection process. Additionally, the absence of hair in this area eliminates the need for conductive gels. Therefore, in this study, an EEG recording device with three electrodes (Fp1, Fpz, and Fp2) was used. The device used for EEG signal recording and the electrode positions have been shown in Fig. 3.

Fig. 3. EEG signal recording in the proposed diagnostic system: (a) the device used and (b) electrode positions.
All EEG signals were collected with a sampling frequency of 250 Hz. We selected this sampling rate following prior MDD EEG analysis studies; it is sufficient to capture the brain-activity frequency bands below 100 Hz that are relevant for MDD diagnosis, whereas higher sampling rates add no diagnostic information while increasing data storage and processing demands. To remove the destructive effects of noise, each signal was first passed through a high-pass filter with a cutoff frequency of 1 Hz, and then a low-pass filter with a cutoff frequency of 40 Hz was applied. Both filters were 4th-order Butterworth filters, chosen for their maximally flat passband response. The 1 Hz high-pass filter eliminates slow drifts and DC offsets present in the recorded signal, a step that is widely used in EEG preprocessing. This cutoff is comparable to those applied in other research, which employed 0.5 Hz or greater values. The 40 Hz low-pass filter removes high-frequency noise components, including muscle artifacts and line noise, and is a typical choice in EEG analysis. Subsequently, residual noise in the resulting signals was reduced using discrete wavelet decomposition. To achieve this, the EEG signal was decomposed into four levels using the Haar wavelet function (Zhang et al., 2019) and then reconstructed using the inverse wavelet transform. We selected the Haar wavelet because it is easy to implement and computationally efficient.
The analysis used four decomposition stages, which were determined by observing signal decomposition through visual assessment and their ability to remove unwanted noise from EEG signals. We selected this decomposition level because it filters out eye blinks and muscle movements that appear predominantly in higher decomposition levels. The outcome of this process is an EEG signal free from noise.
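The four-level Haar decomposition and reconstruction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the source does not say how the detail coefficients are treated between decomposition and reconstruction, so the soft-thresholding rule and the `thr` parameter are assumptions, and the function names are hypothetical.

```python
import math

def haar_dwt(signal):
    """One Haar analysis step on an even-length list: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleaves the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

def soft_threshold(coeffs, thr):
    """Shrink coefficients toward zero by thr (assumed denoising rule, not from the source)."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]

def haar_denoise(signal, levels=4, thr=0.0):
    """Decompose into `levels` Haar levels, threshold the details, and reconstruct."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(soft_threshold(detail, thr))
    for detail in reversed(details):
        approx = haar_idwt(approx, detail)
    return approx
```

With `thr=0.0` the function reproduces its input, which is a convenient sanity check; larger thresholds suppress progressively more fine-scale detail.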
After noise removal from the input signals, the spectrogram is extracted. For this purpose, the Short-Time Fourier Transform (STFT) is used to describe the features of each signal and generate its spectrogram matrix. The STFT is well suited to expressing how a signal's frequency content fluctuates over time. It provides a good input for deep learning methods such as CNNs and yields interpretable features that show signal variations at closely spaced time points. The spectrogram produced by this approach is a time-frequency matrix built from the magnitudes of the STFT coefficients of the signal. To generate this matrix, the input signal is first separated into sections of equal length using a windowing technique. The Fourier transform is then calculated for each segment. Given a signal, the STFT may be written as follows (Le-Roux et al., 2010):

$$X(m, \omega) = \sum_{n=-\infty}^{\infty} S[n]\, w[n-m]\, e^{-j\omega n} \tag{1}$$
In the equation above, S[n] represents the normalized signal and w[n - m] is the window function centered at m. Once the STFT has been calculated, the signal's spectrogram is obtained by squaring the magnitude of the STFT, denoted X(m, ω) (Le-Roux et al., 2010):

$$\mathrm{spectrogram}(m, \omega) = \left| X(m, \omega) \right|^{2} \tag{2}$$
A spectrogram matrix, with the vertical and horizontal dimensions signifying frequency and time, respectively, is the outcome of the equation. We applied the STFT using a Hamming window of 256 samples with 50% overlap. At the 250 Hz sampling rate, a 256-sample window spans about 1.02 s and gives a frequency resolution of about 0.98 Hz, a balance between time and frequency resolution. A 50% overlap is common in STFT analysis because it samples the time axis more densely without changing the frequency resolution, which is set by the window length. The spectrogram was restricted to the 0 to 40 Hz range, which contains the frequency bands that are important for MDD analysis.
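As an illustration of the spectrogram settings above (Hamming window of 256 samples, 50% overlap, 250 Hz sampling rate, 0 to 40 Hz), the following sketch computes the squared STFT magnitude with a direct discrete Fourier sum. The function names are hypothetical, and the direct sum is chosen only to keep the example dependency-free; a real pipeline would use an FFT routine.

```python
import cmath
import math

def hamming(length):
    """Symmetric Hamming window of the given length."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1)) for n in range(length)]

def spectrogram(signal, fs=250.0, win_len=256, overlap=0.5, f_max=40.0):
    """Return a matrix with rows = frequency bins (0..f_max) and columns = time frames."""
    hop = int(win_len * (1.0 - overlap))
    window = hamming(win_len)
    # Bin k corresponds to frequency k * fs / win_len; keep bins at or below f_max.
    n_bins = int(f_max * win_len / fs) + 1
    columns = []
    for start in range(0, len(signal) - win_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(win_len)]
        column = []
        for k in range(n_bins):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / win_len)
                      for n in range(win_len))
            column.append(abs(acc) ** 2)  # squared magnitude, as in the spectrogram definition
        columns.append(column)
    # Transpose so that the vertical axis is frequency and the horizontal axis is time.
    return [[col[k] for col in columns] for k in range(n_bins)]

if __name__ == "__main__":
    # Synthetic 10 Hz sine, about 4 s long, used only as a sanity check.
    test_signal = [math.sin(2 * math.pi * 10.0 * n / 250.0) for n in range(1024)]
    spec = spectrogram(test_signal)
    print(len(spec), "frequency bins x", len(spec[0]), "time frames")
```

For a pure 10 Hz tone, the strongest row is the bin nearest 10 Hz, which gives a quick check that the frequency axis is scaled correctly.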
2.1.2 fMRI preprocessing
The fMRI image preprocessing in the proposed method is performed based on the fMRIPrep framework (Esteban et al., 2019). After applying this process to each fMRI sample, the three slices of the image with the highest level of Blood-Oxygen-Level Dependent (BOLD) signal change in the image sequence are selected, and the corresponding brain region is segmented from these slices. To select these slices, the temporal standard deviation of the BOLD signal was calculated for each voxel across the entire fMRI time series, and the mean BOLD signal was calculated across all voxels. The three slices containing the voxels with the maximum temporal standard deviation were then selected.
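The slice selection rule can be sketched as below. The source does not state exactly how voxel-level standard deviations are aggregated into a per-slice score, so this sketch assumes that each slice is scored by the largest temporal standard deviation among its voxels. The data layout (`volumes[t][z]` as a flat list of voxel intensities) and the function name are hypothetical.

```python
import statistics

def select_slices(volumes, n_select=3):
    """Pick the n_select slices whose voxels vary most over time.

    volumes[t][z] is a flat list of voxel intensities for slice z at time point t.
    """
    n_slices = len(volumes[0])
    scores = []
    for z in range(n_slices):
        n_voxels = len(volumes[0][z])
        # Temporal standard deviation of every voxel in this slice.
        voxel_std = [statistics.pstdev([vol[z][v] for vol in volumes])
                     for v in range(n_voxels)]
        # Assumed aggregation: score the slice by its most variable voxel.
        scores.append(max(voxel_std))
    ranked = sorted(range(n_slices), key=lambda z: scores[z], reverse=True)
    return sorted(ranked[:n_select])
```

Other aggregations, such as the mean standard deviation per slice, would fit the description equally well and only require changing the line that computes the slice score.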
To separate the brain region in each image slice, the input image I is first transformed into a binary image B using a brightness threshold of 12. The threshold value was selected through empirical observation of the obtained binary masks to validate their proper identification of brain regions. The same threshold value was used for processing all fMRI image data. In this binary image, pixels with a brightness value less than 12 are set to 0, while the rest of the pixels are set to 1. The binary image B provides an approximation of the skull region. Furthermore, any holes present in the approximate skull region are filled with a value of 1, and the edges of this region are enhanced using an erosion operator. The erosion operation applied a kernel with dimensions of . The selected kernel size removed small regions outside the brain mask, as determined through visual inspection. The hole-filling operation utilized a morphological closing operation.
The resulting binary image B is multiplied by the original image I, so that regions outside the skull region in the image are assigned a value of 0. In the resulting image, the skull and brain regions are separated from each other through a margin with values of 0. Consequently, the second-largest connected region, indicating brain tissue, in the preprocessed image will be identified, and by isolating it from other regions, the brain tissue will be determined.
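A reduced sketch of the segmentation idea is given below: threshold the slice at a brightness of 12, label the connected regions, and keep the second-largest one as brain tissue. It deliberately omits the hole-filling and erosion steps, assumes 4-connectivity (the source does not specify the connectivity), and uses hypothetical function names.

```python
from collections import deque

def connected_components(mask):
    """Label 4-connected regions of a binary mask; returns a list of pixel lists."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                component = []
                queue = deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    component.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                components.append(component)
    return components

def extract_brain(image, threshold=12):
    """Keep only the second-largest bright region of the slice; zero everything else."""
    # Pixels darker than the threshold become 0, the rest become 1.
    mask = [[1 if px >= threshold else 0 for px in row] for row in image]
    components = sorted(connected_components(mask), key=len, reverse=True)
    if len(components) < 2:
        raise ValueError("expected at least two regions (skull and brain)")
    brain = set(components[1])
    return [[image[r][c] if (r, c) in brain else 0 for c in range(len(image[0]))]
            for r in range(len(image))]
```

The sketch relies on the same property as the text: once the skull and brain are separated by a zero-valued margin, they form distinct connected regions that can be ranked by size.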
2.2 CNN-based feature extraction
After preprocessing the simultaneous EEG and fMRI signals, two CNN models with similar structures are used for feature extraction from the preprocessed signals. In this step, each EEG and fMRI signal is fed into a separate CNN model to extract features and represent each signal as a descriptive vector. In the following text, the feature extraction model for EEG signals is denoted CNNEEG and the feature extraction model for fMRI images is denoted CNNfMRI. The base architecture of the CNNs used for extracting features from EEG and fMRI signals is illustrated in Fig. 4. Both CNNs have the same number and type of layers, but their hyperparameters differ.

Fig. 4. The base architecture for the employed CNNs used for extracting features from EEG and fMRI signals.
Each feature extraction model is fed with preprocessed signal data. In CNNEEG, spectrogram matrices are used as inputs, while CNNfMRI is fed with preprocessed fMRI images. Both proposed CNN models use three convolutional blocks to extract feature maps from the input signals. Each convolutional block includes a sequence of convolution, Batch Normalization (BN), ReLU, and pooling layers. After the feature maps are extracted, three consecutive fully connected (FC) layers further extract structural attributes and compress them. The FC1 layer transforms the feature maps into vector form, and the extracted features are then compressed by the FC2 and FC3 layers. Finally, the activations of FC3 are used as the attributes extracted from the input EEG/fMRI signals and form the output of the feature extraction model. As mentioned, both CNNs consist of the same combination of layer types, but the parameters of their layers are configured differently. The parameter configuration details for each of the layers of CNNfMRI and CNNEEG are provided in Table 1.
| Layer | CNNfMRI parameter setting | CNNEEG parameter setting |
| Input | 128×128×3 | 32×100 |
| Convolution1 | 7×7×24 | 7×7×16 |
| Pooling1 | Max pooling (3×3) | Average pooling (3×3) |
| Convolution2 | 5×5×32 | 5×5×32 |
| Pooling2 | Max pooling (3×3) | Average pooling (3×3) |
| Convolution3 | 3×3×64 | 3×3×48 |
| Pooling3 | Max pooling (2×2) | Average pooling (3×3) |
| FC1 | 1000 | 500 |
| FC2 | 500 | 200 |
| FC3 | 200 | 100 |
The parameter configurations for the CNNEEG and CNNfMRI models were determined by a grid search strategy. In this regard, various hyperparameter settings of the CNNs were examined using the training loss metric. The examined hyperparameters for tuning the model include the dimensions and number of convolutional filters, the type and dimensions of the pooling layers, the type of activation layers, and the size of FC layers. Additionally, various settings for training-related parameters of mini-batch size and optimizer were considered in the tuning step. Details for the search bounds of each configurable parameter of the CNNs have been listed in Table 2.
| CNN parameter | Search space |
| Dimension of convolution layers | {3, 5, 7, 9, 11} |
| Number of convolution filters | {8, 16, 24, 32, 48, 64, 128} |
| Type of pooling layers | Maximum, Average |
| Dimension of pooling layers | {2, 3, 4, 5} |
| Activation function | ReLU, PReLU, Leaky ReLU |
| FC layer | {100, 200, …, 1000} |
| Optimizer | SGDM, Adam |
| Mini batch size | {16, 32, 64} |
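To give a sense of the size of the search space in Table 2, the sketch below enumerates it as a plain grid. It is a simplification under stated assumptions: each parameter is treated as a single global choice rather than a per-layer one, the FC sizes are read as multiples of 100 from 100 to 1000, and `loss_fn` is a hypothetical callback standing in for training a CNN and returning its training loss.

```python
from itertools import product

# Search space transcribed from Table 2 (key names are illustrative).
SEARCH_SPACE = {
    "kernel_size": [3, 5, 7, 9, 11],
    "num_filters": [8, 16, 24, 32, 48, 64, 128],
    "pool_type": ["max", "average"],
    "pool_size": [2, 3, 4, 5],
    "activation": ["relu", "prelu", "leaky_relu"],
    "fc_size": list(range(100, 1001, 100)),  # assumed step of 100
    "optimizer": ["sgdm", "adam"],
    "batch_size": [16, 32, 64],
}

def iter_configs(space):
    """Yield every combination of the search space as a dict."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))

def grid_search(space, loss_fn):
    """Return the configuration with the lowest loss according to loss_fn."""
    return min(iter_configs(space), key=loss_fn)

if __name__ == "__main__":
    print(sum(1 for _ in iter_configs(SEARCH_SPACE)), "configurations")
```

Under this simplified reading the grid already contains tens of thousands of configurations, so per-layer choices would make an exhaustive search considerably more expensive.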
The investigations based on the grid search strategy and according to the search space employed (Table 2) revealed that feature extraction from fMRI images requires a higher number of filters in the convolutional layers compared to EEG signals to extract a greater number of patterns from the input images. Therefore, the number of filters in the CNNfMRI convolutional layers was set to 64, while the CNNEEG model effectively extracts features using 32 filters in its convolutional layers.
Furthermore, the nature of the EEG spectrogram matrices allows desirable results to be achieved with the average pooling function in the CNNEEG layers. In contrast, feature extraction from fMRI images is performed more efficiently with the max pooling function in the CNNfMRI model. Additionally, considering the larger dimensions and more detailed features of fMRI images compared to EEG signals, the length of the feature vectors extracted from fMRI images was set to twice the length of those extracted from EEG signals.
At the end of the feature extraction step, the two feature vectors generated by the CNNEEG and CNNfMRI models are merged to create a single feature vector with a length of 300.
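The fusion step amounts to a simple concatenation, as the short sketch below shows. The vector lengths follow the FC3 sizes in Table 1 (200 for CNNfMRI and 100 for CNNEEG); the concatenation order and the function name are assumptions, since the source only states that the vectors are merged.

```python
FMRI_DIM = 200  # FC3 output size of CNNfMRI (Table 1)
EEG_DIM = 100   # FC3 output size of CNNEEG (Table 1)

def fuse_features(fmri_features, eeg_features):
    """Concatenate the two FC3 activation vectors into one 300-dimensional vector."""
    if len(fmri_features) != FMRI_DIM or len(eeg_features) != EEG_DIM:
        raise ValueError("unexpected feature vector length")
    # Assumed order: fMRI features first, then EEG features.
    return list(fmri_features) + list(eeg_features)
```

Whatever order is chosen, it has to be identical at training and inference time so that each SVM input dimension keeps the same meaning.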
2.3 SVM and GA-based classification
After feature extraction and fusion from the simultaneous EEG-fMRI data, an SVM model tuned by a genetic algorithm is used for MDD identification. The relationship between the input features and the target categories in this problem is nonlinear. Consequently, a nonlinear kernel, the radial basis function (RBF) kernel, is used with the SVM. In the SVM model, the RBF kernel is expressed as follows (Farquad et al., 2012):

$$K(x_i, x_j) = \exp\left(-\gamma \left\| x_i - x_j \right\|^{2}\right) \tag{3}$$
In the equation above, $K(x_i, x_j)$ represents the nonlinear inner product of the feature vectors of instances $x_i$ and $x_j$, and the symbol γ denotes the kernel function's parameter. Additionally, the numbers of instances in the positive and negative categories can be imbalanced, which may lead to classification errors. To address this, the SVM uses a separate correction factor for each category. In this case, the SVM optimization problem may be written as follows, where $\phi$ denotes the feature map induced by the kernel (Farquad et al., 2012):

$$\min_{w,\, b,\, \xi} \ \frac{1}{2}\left\| w \right\|^{2} + C^{+} \sum_{i:\, y_i = +1} \xi_i + C^{-} \sum_{i:\, y_i = -1} \xi_i \quad \text{s.t.} \quad y_i\left(w^{\top}\phi(x_i) + b\right) \ge 1 - \xi_i,\ \ \xi_i \ge 0 \tag{4}$$
The hyperplane's normal vector is denoted by w in the equation above, and b is the bias term. Furthermore, $y_i$ specifies the label corresponding to $x_i$, the i-th training sample, and $\xi_i$ represents the slack variable. Lastly, the correction factors for the negative and positive categories are denoted by C- and C+, respectively. The proposed model treats the RBF parameter γ, along with the correction factors C+ and C-, as tunable parameters for improving the SVM. Tuning them can be expressed as an optimization problem, which the proposed approach solves using a GA.
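The two ingredients just described, the RBF kernel and the class-specific penalty on slack variables, can be illustrated with a few lines of code. This is only a sketch of the standard formulas; the function names are hypothetical and labels are assumed to be +1 (MDD) and -1 (healthy).

```python
import math

def rbf_kernel(x_i, x_j, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance between x_i and x_j)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)

def weighted_slack_penalty(labels, slacks, c_pos, c_neg):
    """Penalty term of the class-weighted SVM objective.

    Each slack value is weighted by C+ for positive samples and by C- for negative ones.
    """
    return sum((c_pos if y == 1 else c_neg) * xi for y, xi in zip(labels, slacks))
```

Raising the correction factor of the minority class makes errors on that class more expensive, which is how the formulation counteracts class imbalance.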
A real-valued GA served as the optimization method for the SVM hyperparameters. Because these hyperparameters (C+, C-, and γ) are continuous, a real-valued GA is a suitable choice: it represents parameters directly as floating-point numbers rather than binary or integer codes, so no encoding and decoding steps are needed and the search of the continuous space is more efficient. The GA optimized the SVM parameters by maximizing the classification accuracy on the validation set.
Considering the three tunable parameters, each chromosome has a length of three. The variables C+ and C- in each chromosome have search ranges set to [0.01, 36000], whereas the parameter γ has a range set to [0.0001, 35]. Since the fitness function measures the worth of a solution, defining it is the essential step in addressing an optimization problem. In the proposed technique, the fitness function is based on the validation error. To assess each chromosome's fitness, an SVM is first configured with the parameters encoded in that chromosome. The training samples are then used to train this SVM. Finally, the trained model is applied to the validation samples, and the resulting validation error is taken as the fitness value:

$$\mathrm{fitness} = \frac{E}{N} \tag{5}$$
In the equation above, E is the number of validation instances for which the SVM output differs from the true class, and N is the total number of validation samples. In the suggested strategy, the GA's goal is to configure the SVM model so as to minimize the output of Eq. (5). Considering the presented coding for the chromosomes and their fitness metric, the GA's steps for optimizing the SVM are as follows:
Step 1) The initial population of chromosomes is generated randomly based on the defined boundaries for the optimization variables C+, C-, and γ.
Step 2) The fitness of each chromosome in the population is calculated based on Eq. (5).
Step 3) The number of parents are selected from the current population using the roulette wheel selection algorithm, where N represents the population size.
Step 4) Each selected pair of parents from the previous step undergoes one-point crossover to generate two new offspring chromosomes. Then, the fitness of the offspring chromosomes is calculated.
Step 5) In each offspring chromosome, a mutation operation is performed with a probability of 0.01. During this step, if the mutation probability is satisfied, one of the optimization variables C+, C-, or γ is randomly selected, and its value is replaced with a randomly generated number within the specified bounds. Finally, the fitness of the mutated chromosome is calculated using Eq. (5).
Step 6) The entire set of existing chromosomes is sorted in ascending order based on their fitness. Then, N chromosomes with lower fitness values are selected as the new population for the next generation.
Step 7) The chromosome with the lowest fitness discovered so far is considered the current best-found solution.
Step 8) If one of the following conditions is met, terminate the algorithm; otherwise, go to Step 3:
- The number of algorithm iterations reaches the threshold G.
- The fitness of the best discovered solution reaches zero.
Following the preceding procedures, the SVM model is trained with the configuration found in the optimal solution, and this model is then utilized to identify MDD in new instances.
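The eight steps above can be sketched as a compact real-valued GA. Several details are assumptions rather than the authors' choices: the number of parents per generation is taken to equal the population size, roulette-wheel probabilities are made inversely proportional to the error (the source does not state how a minimized fitness is mapped to selection probability), and `toy_validation_error` is a hypothetical stand-in for training an SVM and returning E/N from Eq. (5).

```python
import random

# Search ranges from the text: C+, C- in [0.01, 36000], gamma in [0.0001, 35].
BOUNDS = [(0.01, 36000.0), (0.01, 36000.0), (0.0001, 35.0)]

def toy_validation_error(chrom):
    """Hypothetical stand-in for Eq. (5); a real run would train and validate an SVM."""
    target = [100.0, 300.0, 0.5]
    return sum(abs(v - t) / (hi - lo) for v, t, (lo, hi) in zip(chrom, target, BOUNDS)) / 3.0

def roulette_select(population, fitnesses, k, rng):
    """Roulette-wheel selection; lower error gets a larger slice (assumed weighting)."""
    weights = [1.0 / (f + 1e-9) for f in fitnesses]
    return rng.choices(population, weights=weights, k=k)

def one_point_crossover(p1, p2, rng):
    cut = rng.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def mutate(chrom, rng, prob=0.01):
    """With probability prob, redraw one randomly chosen gene within its bounds."""
    if rng.random() < prob:
        idx = rng.randrange(len(chrom))
        lo, hi = BOUNDS[idx]
        chrom = chrom[:]
        chrom[idx] = rng.uniform(lo, hi)
    return chrom

def run_ga(fitness_fn, pop_size=20, generations=50, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial population; Step 2: evaluate fitness.
    population = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size)]
    fitnesses = [fitness_fn(c) for c in population]
    history = [min(fitnesses)]
    for _ in range(generations):          # Step 8: stop after G generations ...
        if history[-1] == 0:              # ... or when the best error reaches zero.
            break
        parents = roulette_select(population, fitnesses, pop_size, rng)   # Step 3
        offspring = []
        for i in range(0, pop_size - 1, 2):
            c1, c2 = one_point_crossover(parents[i], parents[i + 1], rng)  # Step 4
            offspring += [mutate(c1, rng), mutate(c2, rng)]                # Step 5
        combined = population + offspring
        combined_fit = fitnesses + [fitness_fn(c) for c in offspring]
        # Step 6: sort ascending by error and keep the best pop_size chromosomes.
        ranked = sorted(zip(combined_fit, combined), key=lambda t: t[0])[:pop_size]
        fitnesses = [f for f, _ in ranked]
        population = [c for _, c in ranked]
        history.append(fitnesses[0])      # Step 7: track the best solution so far.
    return population[0], fitnesses[0], history
```

Because survivors are chosen from parents and offspring together, the best error never increases from one generation to the next, which matches the monotone best-fitness curve one would expect from this scheme.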
2.4 Complexity analysis and resource requirement
This section examines computational complexity together with resource needs for the proposed MDD diagnosis framework based on hybrid deep learning from a practical implementation perspective.
Multiple factors contribute to the computational complexity of the proposed model. The preprocessing steps for EEG signals consist of filtering operations ($O(N)$), wavelet denoising ($O(N)$), and STFT-based spectrogram generation ($O(M L \log L)$), where $N$ represents the EEG signal length, $M$ denotes the window count, and $L$ stands for the window size. fMRI preprocessing through fMRIPrep, on the other hand, follows a complex pipeline. Its exact complexity is challenging to determine, but it requires more computational power than EEG preprocessing. Complete information regarding fMRIPrep can be found in (Esteban et al., 2019). In the CNN feature extraction process, convolutional layers operate at $O(H \cdot W \cdot C_{in} \cdot K^{2} \cdot C_{out})$ complexity, while FC layers contribute $O(N_{in} \cdot N_{out})$ complexity based on the input neuron count $N_{in}$ and output neuron count $N_{out}$. Here, $H$ and $W$ denote the input feature map height and width, $C_{in}$ represents the input channel number, $K$ indicates the kernel size, and $C_{out}$ indicates the output channel number. Feature fusion has negligible cost. SVM classification has a complexity of $O(N_{sv} \cdot d)$, which depends on the number of support vectors $N_{sv}$ and the feature vector dimension $d$. The GA-based hyperparameter optimization runs offline and therefore does not affect inference time. The main computational cost of the proposed model stems from CNN feature extraction, particularly within CNNfMRI. Real-world deployment requires optimizing this stage because it determines the overall efficiency.
The necessary resources consist of hardware equipment and software applications, together with data storage facilities. The training process took place on a workstation equipped with an Intel core i7 13700H CPU alongside an NVIDIA GeForce RTX 4050 GPU that had 6 gigabytes of VRAM. The computation of CNN requires graphical processing unit (GPU) acceleration to be effective. MATLAB 2020a with Deep Learning Toolbox served as the software component. The system also requires at least 8 gigabytes of RAM and data storage space to contain datasets, together with preprocessed data and trained models, and training file records. When implementing the model in real-world clinical settings, the speed of making predictions becomes the most important factor. For successful operation, models must be optimized along with efficient inference methods. Available hardware equipment within clinical settings plays an important role. The data acquisition process needs to be standardized to achieve results that can be generalized. Model explainability builds trust. Data privacy stands as the most important ethical issue alongside other considerations. Future research efforts will focus on deploying the model by addressing aspects such as compression techniques and performance speed, and explainable model behavior, together with ethical considerations.
2.5 Database and evaluation metrics
The database used in this study consists of 150 simultaneous EEG-fMRI signals. The EEG signals were collected using a portable three-electrode device. The sampling frequency for the EEG signals is set to 250 Hz. The fMRI images in the database were acquired using two Siemens 3T scanners. The data for these samples were collected using an EPI sequence, where each image slice has a thickness of 4 millimeters, and a total of 34 slices were obtained. The time of repetition (TR) is set to 2 seconds, and the time of echo (TE) is 30 ms. Additionally, the matrix size for each fMRI slice is 64×64. Moreover, for each patient, a T1-weighted image with a matrix size of 256x256, TR=1.9 seconds, and TE=2.26 milliseconds is available. The age range of the participants is between 21 and 76 years. Out of the total samples, 83 samples belong to female participants, and 67 samples belong to male participants. Furthermore, in the mentioned database, there are 94 samples belonging to healthy individuals and 56 samples belonging to individuals with MDD. All samples were recorded prior to the commencement of treatment for the disease.
A 10-fold cross-validation method was applied during the experiments, with an 80/10/10 split for training/validation/test instances. In other words, in each fold of the cross-validation, 80% and 10% of instances were used for training and validating the model, respectively. After training the model, the remaining 10% of instances were used as unseen test instances. In each fold, a new 10% portion of instances was used for testing the model, so that after completing the 10 folds, all instances had been used in the test phase. After every iteration, the true labels of the test instances were compared with the labels predicted by the proposed model. Each test sample then falls into one of the following four groups:
1. True Positive (TP): A test sample belonging to an MDD patient that has been correctly identified as having the disease by the proposed method.
2. False Positive (FP): A test sample belonging to a healthy individual that has been mistakenly classified as a patient by the proposed method.
3. True Negative (TN): A test sample belonging to a healthy individual that has been correctly labeled as healthy by the proposed method.
4. False Negative (FN): A test sample of an MDD patient that has been mistakenly classified as healthy by the proposed method.
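Based on the above conditions, the metrics of accuracy, precision, recall, and F-measure can be used to evaluate the performance of the diagnostic system. These metrics are formulated as follows:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{6}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{7}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{8}$$

$$\mathrm{F\text{-}Measure} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{9}$$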
In Eq. (6), the accuracy metric describes the system’s ability to correctly identify each of the positive (patient) or negative (healthy) categories. The precision metric in Eq. (7) indicates the proportion of correctly identified patient outputs among the total outputs classified as patients by the system. Additionally, the Recall metric in Eq. (8) describes how many of the actual patient samples the detection system was able to correctly identify. Lastly, the F-Measure metric in Eq. (9) provides the overall performance of the diagnostic system in identifying MDD, represented as the harmonic mean of the precision and recall metrics.
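The four metrics can be computed directly from the confusion-matrix counts, as in the sketch below. The example counts (TP = 54, FP = 0, TN = 94, FN = 2) are not reported in the source; they are one back-calculated confusion matrix that is arithmetically consistent with the abstract's figures under the assumption that the results are pooled over all 150 samples (56 MDD, 94 healthy).

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F-measure from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f_measure = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}

if __name__ == "__main__":
    # Illustrative counts, back-calculated rather than taken from the source.
    for name, value in diagnostic_metrics(tp=54, fp=0, tn=94, fn=2).items():
        print(f"{name}: {100 * value:.2f}%")
```

With these counts the accuracy is 148/150 (98.67%), the recall is 54/56 (96.43%), and the precision is 100%, which gives an F-measure of about 98.18%.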
3. Results and Discussion
The proposed approach was implemented in MATLAB 2018a and assessed on a dataset of 143 concurrent EEG-fMRI signals. All experiments were carried out on a desktop computer with 32 GB of RAM and an Intel Core i7 CPU running at 3.4 GHz. The EEG and fMRI preprocessing and feature extraction procedures were carried out concurrently, and an NVIDIA GeForce GTX 1080 graphics card was used for model training. The database requirements, assessment metrics, and a summary of the implementation outcomes are given in the next sections.
3.1 Results
The proposed method was evaluated for MDD detection according to the procedure described in section 4-1. To optimize the SVM configuration, the population size and the number of iterations of the genetic algorithm (GA) were set to 150 and 300, respectively. Fig. 5 shows how the GA fitness varies across iterations: the horizontal axis gives the number of GA generations, and the vertical axis gives the minimum and the average fitness of the population. As the figure shows, the GA reaches its best solution after 103 iterations, with a validation error of 0.73%. The decreasing trend of the average population fitness also indicates that the population moves steadily toward better solutions.

Fig. 5. The graph of GA fitness variations during different iterations for optimizing the configuration of SVM.
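The optimization loop whose fitness curve is plotted above can be sketched as a generic GA. Everything below other than the population size (150) and the number of iterations (300) is an assumption made for illustration: the real-valued chromosome (log10 C, log10 gamma), the tournament selection, uniform crossover, Gaussian mutation, and elitism are not taken from the paper, and `surrogate_validation_error` is a hypothetical stand-in for training an SVM and measuring its validation error.

```python
import random


def run_ga(fitness, bounds, pop_size=150, generations=300,
           mutation_rate=0.1, seed=0):
    """Minimize `fitness` over real-valued chromosomes inside `bounds`."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    history = []  # (minimum fitness, average fitness) per generation
    for _ in range(generations):
        scored = sorted(((fitness(ind), ind) for ind in population),
                        key=lambda pair: pair[0])
        history.append((scored[0][0], sum(s for s, _ in scored) / pop_size))
        # Elitism: carry the best individual over unchanged.
        next_population = [scored[0][1][:]]
        while len(next_population) < pop_size:
            # Binary tournament selection of two parents.
            parent_a = min(rng.sample(scored, 2), key=lambda pair: pair[0])[1]
            parent_b = min(rng.sample(scored, 2), key=lambda pair: pair[0])[1]
            # Uniform crossover: each gene comes from either parent.
            child = [a if rng.random() < 0.5 else b
                     for a, b in zip(parent_a, parent_b)]
            # Gaussian mutation, clipped to the search bounds.
            for g, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    step = rng.gauss(0.0, 0.1 * (hi - lo))
                    child[g] = min(hi, max(lo, child[g] + step))
            next_population.append(child)
        population = next_population
    best = min(population, key=fitness)
    return best, fitness(best), history


def surrogate_validation_error(chromosome):
    """Hypothetical placeholder for 'train an SVM, return validation error'."""
    log_c, log_gamma = chromosome
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


# Assumed search ranges for log10(C) and log10(gamma).
search_bounds = [(-3.0, 3.0), (-5.0, 1.0)]
best, best_error, history = run_ga(surrogate_validation_error, search_bounds)
```

The `history` list holds the two quantities that the figure plots, the minimum and the average fitness per generation. With elitism, the minimum fitness can never increase from one generation to the next.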
To evaluate the effectiveness of the techniques employed in the proposed diagnostic system, the performance of this model has been compared with the following scenarios:
- SVM-GA-EEG: Only features extracted from EEG signals are used for MDD detection; fMRI images are disregarded.
- SVM-GA-fMRI: Only features extracted from fMRI images are used for MDD detection; EEG signal features are disregarded.
- SVM-EEG-fMRI: Features extracted from EEG-fMRI signals are classified using an SVM model with a radial basis function (RBF) kernel. In other words, the SVM configuration is not optimized by GA in this scenario.
- Random forest (RF)-EEG-fMRI: Features extracted from EEG-fMRI signals are classified using a random forest classifier.
- Multilayer perceptron (MLP)-EEG-fMRI: Features extracted from EEG-fMRI signals are classified using an MLP neural network. This MLP model has two hidden layers with 10 and 8 neurons and is trained by the Levenberg-Marquardt algorithm.
Note that the above scenarios are alternative configurations of the proposed model (e.g., using only EEG or only fMRI features). The remainder of this section compares the proposed approach with these scenarios. Fig. 6 shows the average accuracy of each method in recognizing MDD; the values in this figure and in the other figures of this section are averages over the ten cross-validation folds. According to Fig. 6, the proposed technique outperforms the compared methods, determining the healthy or MDD condition of the database instances with an accuracy of 98.67%. If the GA-optimized SVM is replaced with a conventional SVM with the RBF kernel, the detection accuracy drops to 94.67%. If detection is based on the EEG or the fMRI feature set alone, the accuracy is 94% in both cases. If the proposed classifier is replaced with the RF or MLP model, the accuracy is 93.33% or 92%, respectively. Thus, optimizing the SVM configuration with the genetic algorithm raises the accuracy by 4 percentage points over the conventional SVM classifier. Likewise, combining the EEG and fMRI feature sets raises the accuracy by 4.67 percentage points over either single-modality scenario, which suggests that the combination captures MDD-related features more completely. These findings support the contribution of each strategy in the proposed approach to MDD detection accuracy.

Fig. 6. Comparing the average accuracy of different methods in MDD detection.
The method closest to the proposed one in accuracy is SVM-EEG-fMRI, in which an SVM model with a radial basis function kernel classifies the EEG-fMRI features. The support vector machine therefore classifies the features extracted by CNNEEG and CNNfMRI more accurately than other conventional learning models such as random forest and multilayer perceptron, and the optimized configuration used in the proposed method makes this advantage more pronounced.
A confusion matrix gives more detailed insight into how well each classification technique detects MDD. Fig. 7 shows the confusion matrices of the proposed method and of the alternative approaches on the database samples.

Fig. 7. (a-f) Confusion matrix of different methods in MDD diagnosis after 10 cross-validation iterations.
In these confusion matrices, each column gives the true labels of the test instances and each row gives the labels predicted by the classification technique. The first row and column represent the healthy class, and the second row and column represent the MDD class. For instance, Fig. 7(a) shows that the proposed technique correctly classified all 94 samples of the healthy class (the sum of the values in the first column of the matrix).
In contrast, Fig. 7(d) shows that the base SVM model misclassified 6 samples from this class and assigned them to the MDD class. According to Fig. 7(a), the proposed method misclassified only 2 of the 150 database samples, both of which belonged to the MDD class. The number of FP samples in the proposed method is zero, which means that every sample it labeled as MDD was a true MDD case and makes its positive diagnoses more reliable. The classification results for the other cases in Fig. 7 can be interpreted similarly. Overall, comparing these confusion matrices indicates that the proposed method classifies samples from the healthy and MDD classes at least as well as every compared scenario and raises the overall diagnostic accuracy by at least 4 percentage points.
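As a worked check, the four metrics can be recomputed from the confusion-matrix counts described above for the proposed method. This is a minimal sketch: the function name is hypothetical, and TP = 54 is inferred from the stated totals (150 samples, 94 healthy, 2 missed MDD samples) rather than quoted from the figure.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, recall, and F-measure from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f_measure


# Counts for the proposed method, from the description of Fig. 7(a):
# all 94 healthy samples correct (TN = 94, FP = 0), 2 MDD samples missed
# (FN = 2), and TP = 150 - 94 - 2 = 54 (inferred, not stated in the text).
accuracy, precision, recall, f_measure = classification_metrics(
    tp=54, fp=0, tn=94, fn=2
)
print(f"{accuracy:.4f} {precision:.4f} {recall:.4f} {f_measure:.4f}")
# prints: 0.9867 1.0000 0.9643 0.9818
```

These values agree with the accuracy of 98.67%, precision of 1, recall of 0.9643, and F-measure of 0.9818 reported for the proposed method.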
As shown in Fig. 7, the proposed model has a lower FP rate (FPR) than every compared scenario and an FN rate (FNR) that is no higher than any of them. This is especially important for MDD diagnosis, because both types of errors have important clinical consequences.
- FNs: A false negative leaves a patient with MDD undiagnosed. Because MDD requires early diagnosis and treatment, every missed case delays intervention. Keeping FNs low therefore helps patients receive early and appropriate care.
- FPs: Although less severe than FNs, FPs can also have serious consequences. They can cause avoidable concern and stress in patients, who may then need additional tests that incur costs or even risks. FPs can also strain the health sector and lead to a misallocation of available treatment. Avoiding FPs therefore saves time and healthcare resources.
These results indicate that the techniques used in the proposed model help keep both false-positive and false-negative cases low, which improves the ability to diagnose MDD and can ultimately improve patient care.
Fig. 8 shows the average precision, recall, and F-measure of the compared methods, illustrating how they fare overall on these classification metrics. Table 3 lists the numerical results of the tests carried out in this section and also compares the proposed method’s performance with several earlier studies.

Fig. 8. Average precision, recall, and F-measure metrics.
| Method | Accuracy (%) | F-measure | Recall | Precision |
| --- | --- | --- | --- | --- |
| Proposed | 98.67 | 0.9818 | 0.9643 | 1.0000 |
| SVM-GA-EEG | 94.00 | 0.9231 | 0.9643 | 0.8852 |
| SVM-GA-fMRI | 94.00 | 0.9174 | 0.8929 | 0.9434 |
| SVM-EEG-fMRI | 94.67 | 0.9310 | 0.9643 | 0.9000 |
| RF-EEG-fMRI | 93.33 | 0.9091 | 0.8929 | 0.9259 |
| MLP-EEG-fMRI | 92.00 | 0.8983 | 0.9464 | 0.8548 |
| (Čukić et al., 2020) | 97.56 | 0.9794 | 0.9700 | 0.9889 |
| (Mahato & Paul, 2019) | 93.33 | 0.9329 | 0.9444 | 0.9217 |
| (Bandopadhyay et al., 2020) | 91.67 | 0.9135 | 0.9215 | 0.9056 |
| (Mahato et al., 2020) | 96.02 | 0.9589 | 0.9480 | 0.9700 |
As the comparison of accuracy, precision, recall, and F-measure in Table 3 and Fig. 8 shows, the proposed approach achieves the highest accuracy, precision, and F-measure of all compared methods. Its recall of 0.9643 equals the best recall among the compared configurations (SVM-GA-EEG and SVM-EEG-fMRI) and is slightly below the 0.9700 reported by Čukić et al. (2020). The precision of 1 means that every sample the proposed method labeled as MDD in these experiments was a true MDD case. The high recall, in turn, indicates that a large proportion of the MDD samples were correctly identified.
Fig. 9 shows the Receiver Operating Characteristic (ROC) curves obtained from classifying the database samples. Compared with the other approaches, the proposed method has a larger area under the ROC curve, with a lower FPR and a higher true positive rate (TPR). The proposed approach therefore discriminates between MDD and healthy individuals better than the compared scenarios.

Fig. 9. ROC curve for MDD detection.
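For readers unfamiliar with how such a curve is built, the following minimal sketch sweeps a decision threshold over classifier scores to obtain (FPR, TPR) points and integrates them with the trapezoidal rule. The function names, labels, and scores are hypothetical examples; the text does not describe how the scores behind the published ROC curves were produced.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by lowering the decision threshold.

    `labels` holds 1 for MDD and 0 for healthy; higher scores mean 'more MDD'.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Handle tied scores together so they produce a single point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical scores for four samples (two MDD, two healthy).
example_labels = [1, 0, 1, 0]
example_scores = [0.9, 0.8, 0.7, 0.1]
example_auc = area_under_curve(roc_points(example_labels, example_scores))
```

In this toy case one healthy sample is scored above one MDD sample, so three of the four MDD-healthy pairs are ranked correctly and the area under the curve is 0.75.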
3.2 Statistical analysis
Statistical analysis allows a more rigorous assessment of the proposed method’s effectiveness relative to the baseline techniques. One-way Analysis of Variance (ANOVA) was employed for this purpose. First, the predictions made by the models for all test instances are organized as a matrix in which each row corresponds to a test sample and each column to either the proposed method or one of the baseline techniques. Each predicted label is then compared with the sample’s ground-truth label to obtain a per-instance accuracy value: +1 if the prediction matches the ground-truth label and -1 if it does not. Each column of the resulting accuracy matrix was checked for normality using a quantile-quantile (Q-Q) plot. According to this check, the distribution of every model’s column was normal, which allowed one-way ANOVA to be used for the statistical analysis.
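The construction of the accuracy matrix and the F statistic of the one-way ANOVA can be sketched as follows. This is an illustration under stated assumptions: the function names and the toy labels are hypothetical, no data from the study are used, and the p-value (which requires the F distribution) is left out.

```python
def correctness_matrix(true_labels, predictions_by_model):
    """Rows are test samples, columns are models: +1 = match, -1 = mismatch."""
    return [
        [1 if preds[i] == true else -1 for preds in predictions_by_model]
        for i, true in enumerate(true_labels)
    ]


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic and its degrees of freedom.

    Assumes the within-group sum of squares is non-zero.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy example with three hypothetical models and six test samples.
true_labels = [1, 1, 0, 0, 1, 0]
predictions_by_model = [
    [1, 1, 0, 0, 1, 0],  # hypothetical model A: no errors
    [1, 0, 0, 1, 1, 0],  # hypothetical model B: two errors
    [0, 0, 1, 0, 1, 0],  # hypothetical model C: three errors
]
matrix = correctness_matrix(true_labels, predictions_by_model)
# ANOVA compares the columns (one group of +1/-1 values per model).
columns = [list(col) for col in zip(*matrix)]
f_stat, df_between, df_within = one_way_anova_f(columns)
```

A large F statistic relative to the F distribution with (`df_between`, `df_within`) degrees of freedom indicates that at least two models differ in mean accuracy, which is the question the ANOVA step answers before any pairwise comparison.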
The test produced a significant effect (), from which it is possible to conclude that there is a statistically significant difference () in the accuracy of at least two approaches. Because the one-way ANOVA test is unable to identify the models from which this difference originates, a multiple comparison analysis was employed to conduct a deeper investigation.
Tukey’s honestly significant difference (HSD) test was used as the post-hoc multiple comparison procedure to determine which specific models differ significantly in accuracy. This test is well suited to comparing pairs of groups after a significant F-test in an analysis of variance. Because Tukey’s HSD controls the family-wise error rate, the probability of making at least one Type I error (a false positive) across the whole set of pairwise comparisons is kept at the chosen alpha level (0.05 in this study), which makes it preferable to unadjusted pairwise tests. By controlling this error rate, the test provides a practical way to determine which models perform significantly better or worse. It identified the specific models responsible for the significant differences revealed by the ANOVA. The results are displayed in Fig. 10, which indicates that the proposed method outperforms the compared scenarios.

Fig. 10. Multiple comparison analysis on the accuracy of the MDD diagnosis models.
This analysis supports the contribution of each technique in the proposed method to the significant improvement in the accuracy of the diagnostic system. For instance, the significant difference between the accuracy of the proposed model and the “SVM-GA-EEG” and “SVM-GA-fMRI” cases indicates that using simultaneous EEG/fMRI for MDD diagnosis results in more accurate identification. Likewise, the comparison with “SVM-EEG-fMRI” shows that optimizing the SVM hyperparameters improves the accuracy.
3.3 Discussion
The hybrid deep learning model trained in this study has the potential to improve MDD diagnosis. By combining EEG and fMRI data with machine learning algorithms, the model achieved higher accuracy than the previous methods it was compared with. The remainder of this section discusses the practical significance of these findings, the factors behind the model’s performance, its limitations, and potential future research directions.
The proposed hybrid deep learning model diagnosed MDD with a classification accuracy of 98.67% in these experiments. This improvement over earlier approaches indicates what the model could contribute to mental health care. It could help clinicians identify the disorder early, differentiate between its various types, and develop management strategies. Such early intervention is critical for improving patients’ prognosis. In addition, the model’s accuracy could improve diagnostic agreement and reduce dependence on clinicians’ subjective assessments.
The combination of EEG and fMRI data, together with the proposed hybrid deep learning architecture, played a crucial role in the model’s performance. The CNN-based feature extraction captured intricate features of both the EEG and the fMRI signals and combined them into a holistic representation of brain activity. This raised the accuracy in distinguishing between healthy and depressed individuals by 4.67 percentage points over either modality alone. Tuning the parameters of the SVM classifier with the genetic algorithm further improved the model’s accuracy.
The results also show that the proposed method outperforms other machine learning methods, such as the conventional SVM, random forest, and MLP, when the same EEG-fMRI feature set is used. For example, the GA-tuned SVM was 4 percentage points more accurate than the conventional SVM. The model’s precision, recall, and F-measure values of 1, 0.9643, and 0.9818, respectively, demonstrate its effectiveness in identifying MDD cases. In addition, the proposed method improved the diagnostic accuracy by at least 1 percentage point compared with the methods presented in (Čukić et al., 2020), (Mahato & Paul, 2019), (Bandopadhyay et al., 2020), and (Mahato et al., 2020), which supports the value of using EEG and fMRI data simultaneously for diagnosing MDD.
The use of AI, and of deep learning in particular, in the proposed framework offers several potential benefits for MDD diagnosis.
- Enhanced Diagnostic Accuracy: The proposed hybrid deep learning model has a higher diagnostic accuracy than conventional approaches, which can result in earlier and more accurate diagnosis of MDD. This may support timely intervention and improve treatment results.
- Improved Treatment Outcomes: A timely and accurate diagnosis helps clinicians start appropriate treatment regimens sooner, which may lead to better patient outcomes, shorter treatment, and a better overall quality of life for patients with MDD.
- Personalized Medicine: Based on the patient’s own data, such as EEG and fMRI, AI can help develop an individual treatment plan. Treatment options such as medication dosage, type of therapy, and length of treatment could then be adapted to the patient’s needs, gender, or age.
- Reduced Diagnostic Bias: AI algorithms can help reduce the influence of human biases on the diagnostic process. This can help minimize diagnostic disparities and improve the quality of care for patients with MDD.
We emphasize that the method described in this paper is a computational diagnostic tool that operates on existing EEG and fMRI data. It does not intervene on or interact with patients in any way, so the method itself has no direct side effects. Important questions nevertheless remain about the practical use of AI in health care, such as privacy, fairness, accountability, and the appropriate weight to give AI outputs in diagnosis and treatment decisions.
3.4 Limitations, ethical considerations, and future directions
Although the results are promising, the proposed model has several limitations. The small number of participants limits how well the model generalizes across population types. Model performance might also be affected by EEG and fMRI acquisition parameters, which differ between medical centers. In addition, the opaque (black-box) nature of deep learning models makes it difficult to relate the extracted features to the neurobiology of MDD, which limits what the method can reveal about disease mechanisms.
A central ethical concern in this field is protecting the privacy of subjects’ data. Patient data confidentiality and data security are the highest priority, and clinical use requires strict compliance with data privacy and data management best practices. A further ethical problem with AI-driven decisions is the lack of interpretability of their underlying processes. In a clinical setting, medical practitioners must understand the reasoning behind each diagnosis made by the model. Explainable AI (XAI) techniques should therefore be investigated to increase the transparency of, and trust in, AI-based diagnostic systems. Responsible clinical implementation depends on this.
The model also faces a risk of overfitting because the available sample size is limited. A model overfits when it memorizes the training data, including its noise and idiosyncrasies, and consequently performs poorly on new data. We used regularization in the CNNs together with cross-validation to mitigate overfitting, but the problem may reappear if additional inputs, such as age, are combined with the existing features. Independent validation on datasets from different populations is required to determine how well the model generalizes across groups.
Several research directions could improve the model’s clinical effectiveness. The most important is to validate the model on large datasets that represent diverse patient populations. This requires data from multiple clinical centers that use different acquisition methods and treat patients from different demographic groups. A prospective clinical trial is also needed to assess the model’s real-world clinical performance and its effect on patient outcomes. Such trials would provide evidence about the model’s performance and its potential benefits for diagnostic precision and treatment planning.
Integrating supplementary data types could improve MDD diagnosis accuracy and give a more complete picture of the disorder. Future versions of the model should incorporate genetic information together with clinical scores from standardized depression scales and lifestyle data. A combined analysis of these data would show how the various factors interact in the development of MDD. Including genetic information could also allow clinicians to stratify patients by their individual susceptibility to depression and tailor treatment accordingly.
Finally, the interpretability challenge needs to be addressed. Research must explore XAI techniques to understand how the model reaches its decisions. Visualizing the CNN features and identifying the major brain regions and connectivity patterns that influence the MDD diagnosis would provide useful knowledge for both clinical practitioners and neuroscientists. This would build trust in the model and create opportunities to discover neurobiological mechanisms of MDD. Future research will also explore using age as an input, together with appropriate regularization to limit overfitting, in order to improve the model’s generalizability across age groups.
4. Conclusions
Early detection of MDD can lead to faster improvement in patients and more effective therapeutic interventions. EEG and fMRI can each be effective in MDD detection; however, combining the two techniques can compensate for the weaknesses of each and yield a more accurate diagnostic model. The current study investigated this possibility. In the proposed method, various preprocessing techniques were used to enhance simultaneous EEG-fMRI signals, and the significant features of each signal were extracted separately using a CNN model. An SVM model configured by GA was then employed for sample classification. The performance of the proposed method on EEG-fMRI samples demonstrated its efficacy in increasing diagnostic accuracy. Based on the results, using the EEG or the fMRI signals alone for MDD detection gives the same accuracy of 94%, whereas combining the features of the simultaneous EEG-fMRI signals increases the detection accuracy by 4.67 percentage points. Furthermore, optimizing the SVM model by GA increases the accuracy by 4 percentage points compared to the conventional SVM learner. Together, these techniques enable MDD detection with an accuracy of 98.67%, an improvement over the previous methods considered. The proposed method also achieves precision and recall values of 100% and 96.43%, respectively.
CRediT authorship contribution statement
Guojin Ma: Conceptualization, methodology, formal analysis, investigation, writing – original draft, writing – review & editing, visualization. Jiajing Li: Conceptualization, resources, writing – review & editing, supervision, project administration, funding acquisition, visualization. Jungu Liu: Software, writing – original draft, investigation, writing – review & editing. Xiuyu Lin: Formal analysis, writing – review & editing.
Declaration of competing interest
The authors declare that they have no competing financial interests or personal relationships that could have influenced the work presented in this paper.
Data availability
All data generated or analysed during this study are included in this published article.
Declaration of Generative AI and AI-assisted technologies in the writing process
The authors confirm that there was no use of Artificial Intelligence (AI)-Assisted Technology for assisting in the writing or editing of the manuscript and no images were manipulated using AI.
References
- The global prevalence of major depressive disorder (MDD) among the elderly: A systematic review and meta-analysis. Neurosci Biobehav Rev. 2022;132:1067-1073. https://doi.org/10.1016/j.neubiorev.2021.10.041
- Bandopadhyay, S., Nag, S., Saha, S., Ghosh, A., 2020. Identification of major depressive disorder. ISMSI ‘20: 2020 4th International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence Thimphu Bhutan, pp. 65-70. https://doi.org/10.1145/3396474.3396480
- Feature-level fusion approaches based on multimodal EEG data for depression recognition. Information Fusion. 2020;59:127-138. https://doi.org/10.1016/j.inffus.2020.01.008
- The successful discrimination of depression from EEG could be attributed to proper feature extraction and not to a particular classification method. Cogn Neurodyn. 2020;14:443-455. https://doi.org/10.1007/s11571-020-09581-x
- fMRIPrep: A robust preprocessing pipeline for functional MRI. Nat Methods. 2019;16:111-116. https://doi.org/10.1038/s41592-018-0235-4
- Preprocessing unbalanced data using support vector machine. Decis Support Syst. 2012;53:226-233. https://doi.org/10.1016/j.dss.2012.01.016
- Fast signal reconstruction from magnitude STFT spectrogram based on spectrogram consistency. In Proc DAFx. 2010, September;10:397-403.
- Method of depression classification based on behavioral and physiological signals of eye movement. Complexity. 2020;2020:1-9. https://doi.org/10.1155/2020/4174857
- Predictors of suicidal ideation, suicide attempt and suicide death among people with major depressive disorder: A systematic review and meta-analysis of cohort studies. J Affect Disord. 2022;302:332-351. https://doi.org/10.1016/j.jad.2022.01.103
- A novel EEG-based major depressive disorder detection framework with two-stage feature selection. BMC Med Inform Decis Mak. 2022;22:209. https://doi.org/10.1186/s12911-022-01956-w
- MDD-TSVM: A novel semisupervised-based method for major depressive disorder detection using electroencephalogram signals. Comput Biol Med. 2022;140:105039. https://doi.org/10.1016/j.compbiomed.2021.105039
- Fusing multi-scale fMRI features using a brain-inspired multi-channel graph neural network for major depressive disorder diagnosis. Biomed Signal Process Control. 2024;90:105837. https://doi.org/10.1016/j.bspc.2023.105837
- Detection of major depressive disorder using linear and non-linear features from EEG signals. Microsyst Technol. 2019;25:1065-1076. https://doi.org/10.1007/s00542-018-4075-z
- Detection of depression and scaling of severity using six channel EEG data. J Med Syst. 2020;44:118. https://doi.org/10.1007/s10916-020-01573-y
- Review on EEG and ERP predictive biomarkers for major depressive disorder. Biomed Signal Process Control. 2015;22:85-98. https://doi.org/10.1016/j.bspc.2015.07.003
- The centrality of depression and anxiety symptoms in major depressive disorder determined using a network analysis. J Affect Disord. 2020;271:19-26. https://doi.org/10.1016/j.jad.2020.03.078
- Automated detection of major depressive disorder with EEG signals: A time series classification using deep learning. IEEE Access. 2022;10:73804-73817.
- Electroconvulsive Therapy and the risk of suicide in hospitalized patients with major depressive disorder. JAMA Netw Open. 2021;4:e2116589. https://doi.org/10.1001/jamanetworkopen.2021.16589
- Shen, J., Zhang, X., Li, J., Li, Y., Feng, L., Hu, C., Ding, Z., Wang, G., Hu, B., 2019. Depression detection from electroencephalogram signals induced by affective auditory Stimuli. 2019 8th International Conference on Affective Computing and Intelligent Interaction (ACII) Cambridge, UK, pp. 76-82. https://doi.org/10.1109/acii.2019.8925528
- Depression detection using resting state three-channel EEG signal. arXiv preprint arXiv:2002.09175. 2020.
- A study of resting-state EEG biomarkers for depression recognition. arXiv preprint arXiv:2002.11039. 2020.
- Graph Theory analysis of functional connectivity in major depression disorder with high-density resting state EEG data. IEEE Trans Neural Syst Rehabil Eng. 2019;27:429-439. https://doi.org/10.1109/TNSRE.2019.2894423
- DepressionGraph: A two-channel graph neural network for the diagnosis of major depressive disorders using rs-fMRI. Electronics. 2023;12:5040. https://doi.org/10.3390/electronics12245040
- Spatial–Temporal EEG fusion based on neural network for major depressive disorder detection. Interdiscip Sci Comput Life Sci. 2023;15:542-559. https://doi.org/10.1007/s12539-023-00567-x
- Wavelet transform. In: Fundamentals of image data mining. Texts in computer science. Cham: Springer International Publishing; p. 35-44. https://doi.org/10.1007/978-3-030-17989-2_3
