Identification of Characteristics of Covid-19 Infection Using the K-Means Clustering Method

Rina Fitriana; Yanto; Isdaryanto Iskandar

Publication Date: 2022/09/28

Abstract: The 2019 novel coronavirus (2019-nCoV) is a coronavirus identified as the cause of an outbreak of respiratory illness first detected in Wuhan, China. Early on, many of the patients in the Wuhan outbreak reportedly had some link to a large seafood and animal market, suggesting animal-to-person spread. However, a growing number of patients reportedly have not had exposure to animal markets, indicating that person-to-person spread is occurring. At this time, it is unclear how easily or sustainably the virus spreads between people. The purpose of this study is to identify the type of data on the COVID-19 outbreak. Datasets on the infection were compiled from the outbreak in several areas around the first identified cases, using the following criteria: reporting date, location, country, gender, and age. The study evaluates how the data can be grouped by similar characteristics so that reports on this new virus can be identified. The data are analyzed with a clustering method, specifically k-means clustering, which groups records by their similarity to one another in order to visualize the unclassified COVID-19 data. The data on COVID-19 infection used in the study were obtained from Kaggle. The data mining design uses k-means clustering.
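As a rough illustration of the grouping step described in the abstract, the sketch below runs a plain k-means loop (Lloyd's algorithm) on a handful of made-up records. It is not the authors' implementation: the function name `kmeans`, the two numeric features, the choice of k = 2, and the sample values are all assumptions made for the example. Categorical criteria such as location, country, and gender would first need to be encoded as numbers, and features would normally be scaled before clustering.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Minimal k-means (Lloyd's algorithm) over a list of numeric tuples.

    Returns (centroids, labels). Illustrative sketch only.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k records chosen at random
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each record to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # centroids stopped moving, so the assignment is stable
        centroids = new_centroids
    return centroids, labels


# Hypothetical records, invented for this example:
# (age in years, days between the first report and this case's reporting date).
records = [(23, 2), (27, 3), (31, 1), (62, 10), (68, 12), (71, 11)]

centroids, labels = kmeans(records, k=2)
for record, label in zip(records, labels):
    print(record, "-> cluster", label)
```

Cluster numbers are arbitrary and depend on the random start, so only the grouping itself is meaningful. On real data the number of clusters would have to be chosen, for example by comparing results across several values of k.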

Keywords: COVID-19, Data Mining, Clustering, K-means

DOI: https://doi.org/10.5281/zenodo.7117881

PDF: https://ijirst.demo4.arinfotech.co/assets/upload/files/IJISRT22SEP141.pdf
