Coronavirus Infection Probability using machine learning | A COVID-19 project using Machine Learning


Coronavirus infection probability using machine learning. Here we will build a simple machine learning model that predicts whether a person has a coronavirus infection (or the probability of infection). The goal is only to understand how machine learning can help with this kind of problem.

by Goeduhub's Expert (3.1k points)

Coronavirus

Coronaviruses are a group of related RNA viruses that cause diseases in mammals and birds. In humans, these viruses cause respiratory tract infections that can range from mild to lethal. Mild illnesses include some cases of the common cold (which is also caused by certain other viruses, predominantly rhinoviruses), while more lethal varieties can cause SARS, MERS, and COVID-19. The virus that causes COVID-19 (SARS-CoV-2) was first identified in Wuhan, China.

Note

1. Here we will build a simple machine learning model that predicts whether a person has a coronavirus infection (or the probability of infection).
2. The data used here is not official data; it was created randomly.
3. Because the data is not accurate, the model's predictions should not be expected to be correct.
4. The goal is only to understand how machine learning can help.
5. With official and accurate data, a more accurate model could be built.

Practical Implementation

Required Libraries

#importing required libraries

import pandas as pd

import numpy as np

import sklearn

from sklearn.metrics import mean_squared_error

Note: Here we imported the basic libraries needed to solve the problem: NumPy, pandas and sklearn.

The data used here is not accurate or official, but if you want to practice you can download it (Click here for data)

Output

Note: As you can see from the above output, the data contains basic features related to coronavirus infection (fever, cold, age, etc.). The last column is the label (1 or 0), where 1 means infected and 0 means not infected.
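To make the layout described above concrete, here is a minimal sketch that builds a small table of the same general shape. The rows and the column names `fever`, `age` and `cold` are made up for illustration; only the `Probability` column name comes from the code in this article, and the real file may have different columns.

```python
import io
import pandas as pd

# Hypothetical stand-in for the downloadable file: invented rows and
# column names (only 'Probability' appears in the article's code).
csv_text = """fever,age,cold,Probability
98.0,20,0,0
101.5,45,1,1
99.2,33,1,0
102.3,60,1,1
"""

# pd.read_csv accepts a file path or any file-like object
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.head())
# How many rows carry each label (1 = infected, 0 = not infected)
print(sample['Probability'].value_counts())
```

With a real file, the `io.StringIO(...)` argument would be replaced by the path to the downloaded CSV.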

#Information of data

Data.info()

Note: We check the information of the data so that we can make any corrections it needs (null values, column types, etc.) and avoid problems in later processing.
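The checks mentioned in the note can be sketched as follows. The small frame below is invented for illustration, with one missing value added on purpose; the column names other than `Probability` are assumptions, and filling with the median is just one simple option.

```python
import numpy as np
import pandas as pd

# Invented example frame; one missing fever reading added on purpose
demo = pd.DataFrame({
    'fever': [98.0, np.nan, 101.5, 99.2],
    'age': [20, 45, 33, 60],
    'Probability': [0, 1, 1, 0],
})

demo.info()                        # column types and non-null counts
null_counts = demo.isnull().sum()  # number of missing values per column
print(null_counts)

# One simple fix: fill missing numeric values with the column median
cleaned = demo.fillna(demo.median(numeric_only=True))
print(cleaned.isnull().sum().sum())
```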

#Defining our target (Y) and features (X)

X = Data.drop('Probability', axis = 1)

Y = Data['Probability']

print("data in Y")

print(Y)

Output

Note: In this section we defined our target (Y) and our features (X). The goal is to predict infection from the features, so we separated the 'Probability' column (Y) from the remaining feature columns (X).

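#Splitting train and test data
from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.33, random_state = 5)

Note: In this section we applied the train_test_split function to split the data into a training set and a test set. With test_size = 0.33, about one third of the rows are held out for testing, and random_state fixes the shuffle so the split is reproducible.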
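As a quick check of what the split produces, this sketch runs `train_test_split` with the same `test_size` and `random_state` on 100 rows of made-up data and prints the resulting shapes. The data and the feature names are placeholders, not the article's dataset.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Placeholder data: 100 rows, 2 invented features, random 0/1 target
X_demo = pd.DataFrame({
    'fever': rng.uniform(97, 104, size=100),
    'age': rng.integers(1, 90, size=100),
})
Y_demo = pd.Series(rng.integers(0, 2, size=100), name='Probability')

X_tr, X_te, Y_tr, Y_te = train_test_split(
    X_demo, Y_demo, test_size=0.33, random_state=5
)

# Roughly two thirds of the rows go to training, one third to testing
print(X_tr.shape, X_te.shape)
print(Y_tr.shape, Y_te.shape)
```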

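#Converting into numpy array (to_numpy() returns a new array, so assign the result)

X_train = X_train.to_numpy()

Y_train = Y_train.to_numpy()

X_test = X_test.to_numpy()

Y_test = Y_test.to_numpy()

print(X_train)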

Output

#Importing logistic regression model

from sklearn.linear_model import LogisticRegression

clf = LogisticRegression()

#training the model (fit returns the fitted model itself, not predictions)

clf.fit(X_train, Y_train)

Note: In this section of code we imported the logistic regression model and trained it using the fit function.
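Once a model is fitted, it can be checked against the held-out test rows. The sketch below is self-contained and uses synthetic data in which the label depends on fever, an assumption made only so the model has something to learn. The `StandardScaler` step is an addition for numerical stability and is not part of the article's code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
# Synthetic data: temperature (F) and age; label is 1 when fever > 100
fever = rng.uniform(97, 104, size=300)
age = rng.integers(1, 90, size=300)
X_demo = np.column_stack([fever, age])
Y_demo = (fever > 100).astype(int)

X_tr, X_te, Y_tr, Y_te = train_test_split(
    X_demo, Y_demo, test_size=0.33, random_state=5
)

# Scale the features, then fit logistic regression on the training rows
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_tr, Y_tr)

# Fraction of test rows where the predicted label matches the true label
test_accuracy = accuracy_score(Y_te, model.predict(X_te))
print(test_accuracy)
```

Accuracy on randomly generated labels, as in this article's dataset, would be expected to sit near chance level; the number is only meaningful with real data.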

#Predicting using model

#Infection (0,1) prediction

infection=clf.predict([[98,20,0,1,0,0,0]])

#Infection probability prediction

infection_probability= clf.predict_proba([[98,20,0,0,0,0,1]])

print(infection)

print(infection_probability)

Output

Note

1. In this part, we predicted infection and infection probability with the model we trained.
2. As you can see from the output, there are two kinds of output. The first is a direct prediction (1 or 0), while the second is the predicted probability of each class.
3. We used logistic regression here because the target is categorical (0 or 1). You can use another model if you prefer; choose between models by comparing their accuracy.
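To connect the two kinds of output described above: `predict_proba` returns one column per class, ordered as in `classes_`, and for a binary logistic regression `predict` returns the class with the larger probability. A minimal sketch on synthetic data follows; the single feature and the rule that generates the labels are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
# Synthetic single feature: degrees of temperature above 98 F
excess_temp = rng.uniform(0, 6, size=200).reshape(-1, 1)
labels = (excess_temp[:, 0] > 2.5).astype(int)

model = LogisticRegression()
model.fit(excess_temp, labels)

sample = [[4.0]]
label = model.predict(sample)        # array holding a single 0 or 1
proba = model.predict_proba(sample)  # shape (1, 2), one column per class

print(model.classes_)  # column order used by predict_proba
print(label, proba)

# predict picks the class whose probability is largest
predicted_from_proba = model.classes_[np.argmax(proba[0])]
print(predicted_from_proba == label[0])
```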