EA Licence No. 78592

2025-06-29

Machine Learning for Case Fatality Rate Classification

The world has witnessed outbreaks such as the 1918 influenza [1], Ebola [2], SARS [3], H1N1 [4], and MERS [3]. The COVID-19 pandemic is the latest, and it has infected millions of people all over the world. Millions of lives have been lost. As of 17 March 2024, the United States leads in reported coronavirus cases with 103,436,209, including 1,182,424 deaths. The Americas and Europe account for 75.1% of total deaths, while the Western Pacific, which includes China, Japan, South Korea, and Singapore, accounts for only 6.0% [5]. The first cases were discovered at a seafood market in Wuhan, China in late 2019, but there is no consensus about the origin of the virus. It spread to other parts of the world in early 2020 and then escalated into a pandemic.
COVID-19 disrupted civilization [6] for nearly three years. It affected international society more severely than previous epidemics. Mask mandates were enforced. Social distancing was imposed. Large gatherings were banned. Travel bans were imposed. Airports were deserted because flights were restricted. Cruises were cancelled for fear of concentrated infection. Attractions lacked tourists because "non-urgent and unnecessary" trips abroad were restricted [7]. Classrooms were closed, so students could only attend classes online. Conferences and meetings moved online. There were fewer cars on the roads. Polymerase chain reaction (PCR) tests and health QR codes [7] became mandatory prerequisites for residents to go outdoors. No one was spared the wrath of COVID-19 [6].
Successful containment strategies in various countries have demonstrated the pivotal role of data analysis, scientific evidence, expert opinion, and depoliticization. For instance, Singapore implemented stringent "circuit breaker" measures [8] in 2020 but swiftly lifted restrictions in 2021 [9], citing a death rate of 0.2% [10], the lowest worldwide at the time.
Meanwhile, China, following the suggestions of Zhong Nanshan et al. in 2020, enforced a rigorous lockdown in Wuhan [7, 11]. The same approach was applied to other major cities across the nation to stop the virus from spreading further as quickly as possible. Despite its side effects on the economy, tourism, education, international trade, and daily life, the zero-COVID policy of 2020 and 2021 successfully limited mortality in China and made it one of the few countries where confirmed COVID cases were infrequent. When Wuhan reopened on April 8, 2020, the government hailed the lockdown as a demonstration of "institutional advantage" [12], especially in comparison with Western countries, where inaction led to massive deaths [13].
However, despite its success in 2020, the city-wide lockdown showed its shortcomings during the Omicron outbreak in 2022 [11]. Owing to China's complex administrative structure, junior officials, security guards, and volunteers exaggerated the urgency of the zero-COVID policy and often implemented excessive epidemic prevention measures. For example, citizens were required to take PCR tests every 48 hours, sometimes even every 24 hours, to keep their QR codes green; otherwise security would refuse them entry to many buildings and communities. As three doses of the vaccine were sufficient to protect lives, many citizens complained that it was inconvenient and meaningless to maintain such strict policies after the virus had changed.

This dichotomy highlights the importance of flexible policymaking for enhancing epidemic response in large, multi-tiered governance systems. Countries like Singapore minimized the side effects of epidemic prevention successfully and consistently because their decisions were informed by an advanced machine learning model and precise data analysis.
Scientists have carried out various studies to help stakeholders understand the situation. The authors of [14] trained linear and polynomial models to predict the case fatality rate of COVID-19 in Nigeria. In another study [15], scientists discovered spatiotemporal transmission patterns of the epidemic in Hong Kong via deep learning. Although classifying the fatality rate serves as a basis for effective disease control in Singapore, machine learning articles on this task appear to be scarce. Therefore, this study aims to design, develop, and evaluate a classifier of the COVID-19 fatality rate using learning algorithms. The proposed model classifies the case fatality rate at the national level as low, moderate, or high.
 

2.Methodology
2.1.Data
Two datasets were collected, from the WHO coronavirus (COVID-19) dashboard [5] and Our World in Data (OWID) [16] respectively. The features of both include new cases, cumulative cases, new deaths, and cumulative deaths. WHO updates the numbers of confirmed COVID-19 cases and deaths worldwide weekly [5]. For this study, only records from ten industrial countries are selected, as they contain only one missing value and one noisy value, both of which can be handled during preprocessing. These countries are China, Singapore, Korea, Japan, the USA, Canada, the UK, France, Germany, and Russia. The OWID data comprises daily cases and deaths in the USA and also provides their smoothed values. We select data from January 2020 to March 2023. Redundant and irrelevant features are removed during the data cleaning phase. We then calculate and classify the case fatality rate using five learning algorithms.
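To make the target variable concrete, the following is a minimal sketch of how a case fatality rate (deaths divided by confirmed cases) could be computed and mapped to the three levels. The cut-off values `LOW_BELOW` and `HIGH_FROM` and both function names are hypothetical; the article does not state the thresholds it uses in this section.

```python
from typing import Optional

# Hypothetical cut-offs in percent; the article does not state its thresholds.
LOW_BELOW = 1.0
HIGH_FROM = 3.0


def case_fatality_rate(deaths: int, cases: int) -> Optional[float]:
    """Return deaths / confirmed cases as a percentage, or None if undefined."""
    if cases <= 0:
        return None  # no confirmed cases in the period, so the ratio is undefined
    return 100.0 * deaths / cases


def cfr_level(cfr_percent: float) -> str:
    """Map a CFR percentage to 'low', 'moderate', or 'high'."""
    if cfr_percent < LOW_BELOW:
        return "low"
    if cfr_percent < HIGH_FROM:
        return "moderate"
    return "high"


# Cumulative US figures quoted earlier in the article (as of 17 March 2024).
us_cfr = case_fatality_rate(deaths=1_182_424, cases=103_436_209)
print(round(us_cfr, 2), cfr_level(us_cfr))
```

With the US totals quoted earlier, this gives a cumulative rate of about 1.14%, which falls in the "moderate" band only under the assumed cut-offs. The same function applies to weekly or monthly rates when given the new deaths and new cases for that period.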
2.2.Preprocessing
a. Interpolating Missing Values
Given that our datasets contain both incremental and cumulative fields, a missing value can be easily estimated from the values in adjacent rows and columns.
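One way to read this is that each cumulative value equals the previous cumulative value plus the new value for the same row, so a single gap in either column can be recovered from its neighbours. The sketch below illustrates that idea only; the function name `fill_missing`, the use of `None` for gaps, and the assumption that the count before the first row is zero are ours, not the article's.

```python
from typing import List, Optional, Tuple

Series = List[Optional[int]]


def fill_missing(new: Series, cumulative: Series) -> Tuple[Series, Series]:
    """Fill gaps using the identity cumulative[t] = cumulative[t-1] + new[t]."""
    new, cumulative = list(new), list(cumulative)  # leave the inputs untouched
    n = len(new)
    for t in range(n):
        # Assumption: the series starts at the first report, so the prior total is 0.
        prev = 0 if t == 0 else cumulative[t - 1]
        if cumulative[t] is None:
            if new[t] is not None and prev is not None:
                # Same row, adjacent column: add this period's increment.
                cumulative[t] = prev + new[t]
            elif t + 1 < n and new[t + 1] is not None and cumulative[t + 1] is not None:
                # Adjacent row: subtract the next increment from the next total.
                cumulative[t] = cumulative[t + 1] - new[t + 1]
        if new[t] is None and cumulative[t] is not None and prev is not None:
            # The increment is the difference of consecutive totals.
            new[t] = cumulative[t] - prev
    return new, cumulative


weekly_new, weekly_total = fill_missing([5, None, 7], [5, 8, 15])
print(weekly_new, weekly_total)
```

A gap that cannot be resolved from its neighbours is left as `None` rather than guessed.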
b. Smoothing Noisy Values

5.Conclusion
The COVID-19 pandemic caused major disruption to society for nearly 4 years. It has infected millions of people and taken millions of lives. Aggressive containment strategies were vital while the virus had a high fatality rate, but when they were not relaxed as the fatality rate dropped, they increasingly brought major inconvenience to society. Countries and regions such as Singapore have implemented effective disease control by observing the case fatality rate within their borders and categorizing it to decide on different measures. Therefore, in this study, we have designed, developed, and evaluated a COVID-19 fatality rate classifier using datasets from the World Health Organization and Our World in Data. The classifier labels the fatality rate as low, moderate, or high. Imbalanced classes are balanced with SMOTE.


a. In WHO's data, RFA and KNN perform best for the weekly and cumulative CFR levels respectively, while LRA and SVM give suboptimal results.
b. In OWID's data, CART and Bagging achieve full accuracy for the cumulative label. The highest accuracies for the monthly and smoothed labels are achieved by RFA and LRA.
c. Combining oversampling with ensemble learning algorithms proved to be an effective strategy. However, it requires a large dataset to achieve optimal performance.
d. The high accuracy, F1-score, and precision of the RFA and KNN models suggest that they are capable of classifying weekly and cumulative COVID-19 fatality rates, respectively, into low, moderate, or high.
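The oversampling step named above, SMOTE, creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The following standard-library sketch illustrates that core idea only; it is not the implementation used in the study, and the function name `smote`, its parameters, and the toy data are assumptions for illustration. In practice a maintained library implementation would normally be preferred.

```python
import math
import random
from typing import List, Sequence

Point = Sequence[float]


def smote(minority: List[Point], n_new: int, k: int = 5, seed: int = 0) -> List[List[float]]:
    """Generate n_new synthetic points from the minority class (basic SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)          # seeded for reproducible output
    k = min(k, len(minority) - 1)      # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point, excluding the point itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()             # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy example: three "high" CFR samples with two hypothetical features each.
high_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
extra = smote(high_class, n_new=4, k=2)
print(len(extra))
```

To balance the classes, `n_new` would be set to the difference between the majority and minority class counts, and the synthetic points would be added to the training set only, not to the test set.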

 

 

# AI and Machine Learning


EDPS Systems Limited EDPS 電腦系統有限公司 EA Licence No. 78592

© Copyright of EDPS Systems Limited 2025. All Rights Reserved.