Diagnosis of COVID-19 and non-COVID-19 patients by classifying only a single cough sound
Access
info:eu-repo/semantics/openAccess
Date
2021
Abstract
In December 2019, a new virus emerged in China and spread rapidly, affecting the whole world. This coronavirus is among the most contagious viruses that humanity has encountered. It has caused a worldwide crisis, as it leads to severe infections and can ultimately cause death. On March 11, 2020, the World Health Organization declared COVID-19 a pandemic. Computer-aided digital technologies, which solve many problems and make daily life more convenient, were quickly brought to bear on this crisis. One important area in which these technologies can be effective is the diagnosis of the disease. Reverse transcription-polymerase chain reaction (RT-PCR), the standard and precise technique for diagnosing the disease, is expensive and time-consuming, and its availability varies around the world. For this reason, distinguishing COVID-19 from a cold or flu by analyzing cough sounds on smartphones, which have become widespread in recent years, is an attractive and important alternative. In this study, we propose a machine learning-based system that distinguishes patients with COVID-19 from non-COVID-19 patients by analyzing only a single cough sound. Two data sets were used, one publicly accessible and the other available on request. After the data sets were combined, features were extracted from the cough sounds using mel-frequency cepstral coefficients (MFCCs) and then classified with seven different machine learning classifiers. The leave-one-out cross-validation (LOO-CV) strategy was used to determine the optimum hyperparameter values for the MFCCs and the classifiers.
Based on the results, the k-nearest neighbors classifier with Euclidean distance (kNN Euclidean) outperformed the other classifiers, achieving an accuracy, COVID-19 sensitivity, non-COVID-19 sensitivity, F-measure, and area under the ROC curve (AUC) of 0.9833, 1.0000, 0.9720, 0.9799, and 0.9860, respectively. Finally, the most effective features were determined for each classifier using the sequential forward selection (SFS) method. According to the results, the proposed system compares favorably with similar studies in the literature and could be readily deployed on smartphones to facilitate the diagnosis of COVID-19 patients. In addition, since the data set used includes reflex and unconscious coughs, the results indicate that whether a cough is conscious or unconscious has no effect on the cough-based diagnosis of COVID-19 patients.
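
The classification and evaluation stage described in the abstract (kNN with Euclidean distance, tuned and scored by leave-one-out cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each cough has already been reduced to a fixed-length feature vector (for example, averaged MFCCs); the 13-dimensional synthetic vectors, the candidate values of k, the function names, and the choice to compute the F-measure for the COVID-19 class only are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training vectors closest to x (Euclidean)."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def loo_cv_predict(X, y, k):
    """Leave-one-out CV: predict each sample from all the other samples."""
    preds = []
    for i in range(len(X)):
        preds.append(knn_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], k))
    return preds


def scores(y_true, y_pred, positive=1):
    """Accuracy, per-class sensitivity, and F-measure of the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sens_pos = tp / (tp + fn) if tp + fn else 0.0
    sens_neg = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sens_pos
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity_positive": sens_pos,
        "sensitivity_negative": sens_neg,
        "f_measure": 2 * precision * sens_pos / denom if denom else 0.0,
    }


if __name__ == "__main__":
    # Synthetic stand-in for MFCC feature vectors (hypothetical data, not the
    # study's cough recordings): label 1 = COVID-19, label 0 = non-COVID-19.
    rng = random.Random(0)
    X = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(20)]
    X += [[rng.gauss(3.0, 1.0) for _ in range(13)] for _ in range(20)]
    y = [0] * 20 + [1] * 20

    # Pick k by LOO-CV accuracy; the candidate grid is an assumption.
    best_k = max((1, 3, 5, 7),
                 key=lambda k: scores(y, loo_cv_predict(X, y, k))["accuracy"])
    print(best_k, scores(y, loo_cv_predict(X, y, best_k)))
```

Leave-one-out is a natural fit for small data sets because every sample serves as the test case exactly once while all remaining samples are used for training. The same loop could, in principle, wrap an MFCC extraction step so that its hyperparameters are tuned together with k.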