in Python Programming by Goeduhub's Expert (2.2k points)

Predict employee retention within an organization, that is, whether an employee will leave the company or stay with it. An organization is only as good as its employees, and these people are the true source of its competitive advantage. The dataset is downloaded from Kaggle. Link: https://www.kaggle.com/giripujar/hr-analytics

First explore and visualize the data, then create a logistic regression model to predict employee attrition using machine learning and Python.
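As a rough illustration of the exploration step, the sketch below compares employees who stayed with those who left. It runs on a small synthetic table rather than the Kaggle file; the column names are assumed to match HR_comma_sep.csv, and the rule that generates `left` is made up purely so the example has some signal.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for HR_comma_sep.csv (assumption: the real file has these columns).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "satisfaction_level": rng.uniform(0.1, 1.0, n).round(2),
    "average_montly_hours": rng.integers(96, 311, n),
    "promotion_last_5years": rng.integers(0, 2, n),
    "salary": rng.choice(["low", "medium", "high"], n),
})
# Illustrative rule only: leaving is more likely when satisfaction is low.
df["left"] = (rng.uniform(0, 1, n) > df["satisfaction_level"]).astype(int)

# Mean of each numeric feature for employees who stayed (0) and left (1).
numeric_cols = ["satisfaction_level", "average_montly_hours", "promotion_last_5years"]
group_means = df.groupby("left")[numeric_cols].mean()
print(group_means)

# Share of stayers and leavers within each salary band (each row sums to 1).
attrition_by_salary = pd.crosstab(df["salary"], df["left"], normalize="index")
print(attrition_by_salary)
```

With the real dataset, replacing the synthetic frame with `pd.read_csv("HR_comma_sep.csv")` should give the same kind of tables, which can then guide which columns to feed into the model.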


4 Answers

by (110 points)

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

df = pd.read_csv("/content/HR_comma_sep.csv")
#df.head()

# Logistic regression model
# Keep the selected features plus the target column 'left'
df1 = df[['salary', 'satisfaction_level', 'average_montly_hours',
          'promotion_last_5years', 'left']]

# One-hot encode salary; drop 'medium' so it serves as the reference category
dummies = pd.get_dummies(df1.salary)
df1 = pd.concat([df1, dummies], axis='columns')
df1 = df1.drop(['salary', 'medium'], axis='columns')

# Hold out two thirds of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    df1[['satisfaction_level', 'average_montly_hours', 'promotion_last_5years', 'high', 'low']],
    df1.left, test_size=2/3, random_state=1)

model = LogisticRegression()
model.fit(X_train, y_train)
model.score(X_test, y_test)

***************************** O U T P U T *****************************

0.7828
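An accuracy figure like the one above is easier to judge next to a trivial baseline, because attrition data is usually imbalanced and a model that always predicts "stays" can already score well. The sketch below shows one way to make that comparison with scikit-learn's `DummyClassifier`. The data is synthetic and only stands in for the HR features; the feature matrix and the labelling rule are assumptions for illustration.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced data (assumption: stands in for the HR features).
rng = np.random.default_rng(1)
n = 1000
X = rng.normal(size=(n, 3))
# Roughly a quarter of the rows end up labelled 1 ("left").
y = (X[:, 0] + rng.normal(scale=1.0, size=n) > 1.0).astype(int)

# Stratify so train and test keep the same class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

# Baseline: always predict the most frequent class seen in training.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
model = LogisticRegression().fit(X_train, y_train)

baseline_acc = baseline.score(X_test, y_test)
model_acc = model.score(X_test, y_test)
# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, model.predict(X_test))

print("baseline accuracy:", baseline_acc)
print("model accuracy:   ", model_acc)
print("confusion matrix:\n", cm)
```

If the model's accuracy is only slightly above the baseline, the confusion matrix shows whether it is actually identifying any leavers or mostly predicting the majority class.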

by (122 points)
#GO_STP_379
# In this task we predict whether an employee will leave the company.
# This is a binary classification problem, so logistic regression is used.
import pandas as pd
data = pd.read_csv('HR_comma_sep.csv')
# exploration of data
print("-------exploration of data------------")
print(data.info())
print(data.head())
# label-encode the categorical columns
from sklearn.preprocessing import LabelEncoder
label_encoder = LabelEncoder()
data['Department'] = label_encoder.fit_transform(data['Department'])
data['salary'] = label_encoder.fit_transform(data['salary'])
print("after label encoding: \n", data)
# LogisticRegression of data
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix,accuracy_score
ft=data[['Department','satisfaction_level','salary']]
label=data['left']
xtrain,xtest,ytrain,ytest=train_test_split(ft,label)
my_model=LogisticRegression()
my_model.fit(xtrain,ytrain)
y_pred=my_model.predict(xtest)  # predictions for the test set
cm=confusion_matrix(ytest,y_pred)
print("confusion matrix: ",cm)
print("accuracy score: ",accuracy_score(ytest,y_pred))
print("training score: ",my_model.score(xtrain,ytrain))
 
# visualization of data
import matplotlib.pyplot as plt
plt.subplot(2,2,1)
plt.scatter(ytest, y_pred, marker='+', label='actual vs predicted')
plt.xlabel('ytest')
plt.ylabel('y prediction')
plt.legend()
plt.title('Prediction of company')
plt.subplot(2,2,2)
plt.scatter(x=data['salary'], y=data['left'],label='salary and left')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.title('salary and left')
plt.subplot(2,2,3)
plt.scatter(x=data['satisfaction_level'], y=data['left'],label='satisfaction level and left')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.title('satisfaction level and left')
plt.subplot(2,2,4)
plt.scatter(x=data['time_spend_company'], y=data['left'],label='time_spend_company and left')
plt.xlabel('x')
plt.ylabel('y')
plt.title('time_spend_company and left')
plt.legend()
plt.show()
 
# save the trained logistic regression model to disk
# (sklearn.externals.joblib was removed from scikit-learn; import joblib directly)
import joblib
import pickle
with open('model_save','wb') as file:
    pickle.dump(my_model,file)
# load the model back for prediction
with open('model_save','rb') as file:
    newmodel=pickle.load(file)
# newmodel.coef_
joblib.dump(my_model,'model_joblib')
mymodel=joblib.load('model_joblib')
print("my model: ",mymodel)
print("new model: ",newmodel)
print("file is :",file)
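The answer above label-encodes `Department` and `salary`, which gives nominal categories an arbitrary numeric order. One alternative, sketched below, is to one-hot encode them inside a scikit-learn `Pipeline` so that the encoding and the model are fitted and saved together. The data here is synthetic, the category values and the file name `attrition_pipeline.joblib` are hypothetical, and this is only one possible arrangement rather than the answerer's method.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for the HR data (assumption: column names match the CSV).
rng = np.random.default_rng(2)
n = 300
data = pd.DataFrame({
    "Department": rng.choice(["sales", "technical", "support"], n),
    "salary": rng.choice(["low", "medium", "high"], n),
    "satisfaction_level": rng.uniform(0.1, 1.0, n),
})
# Illustrative rule only: low satisfaction makes leaving more likely.
data["left"] = (rng.uniform(0, 1, n) > data["satisfaction_level"]).astype(int)

features = ["Department", "salary", "satisfaction_level"]

# One-hot encode the two categorical columns; pass the numeric column through unchanged.
preprocess = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), ["Department", "salary"])],
    remainder="passthrough",
)
pipe = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])
pipe.fit(data[features], data["left"])

# Persist the whole pipeline, then reload it and check the predictions match.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "attrition_pipeline.joblib")  # hypothetical file name
    joblib.dump(pipe, path)
    restored = joblib.load(path)

same = np.array_equal(pipe.predict(data[features]), restored.predict(data[features]))
print("restored pipeline gives identical predictions:", same)
```

Saving the pipeline rather than the bare model means raw rows with string categories can be passed straight to `restored.predict` without repeating the encoding by hand.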

...