in Python Programming by Goeduhub's Expert (2.2k points)

Task: Predict a startup's profit (success rate) using multiple linear regression in Python. Download the data set: click here.

The 50 Startups dataset has 5 columns: “R&D Spend”, “Administration”, “Marketing Spend”, “State”, and “Profit”.

The first three columns give each startup's spending on research and development, administration, and marketing, respectively. State is the state in which the startup is based, and Profit is the profit the startup earned.

This is a multiple linear regression problem because there is more than one independent variable.
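As a rough illustration of what "more than one independent variable" means, the model has the form Profit = b0 + b1*x1 + b2*x2 + ..., with one coefficient per input column plus an intercept. The sketch below fits that form by ordinary least squares with NumPy. It uses synthetic, noise-free data; the feature values, the intercept, and the coefficients are made up for illustration and are not taken from 50_Startups.csv.

```python
import numpy as np

# Synthetic data for illustration only (not from 50_Startups.csv).
rng = np.random.default_rng(0)
n_samples = 50
features = rng.uniform(0, 100, size=(n_samples, 3))  # three made-up spending columns

# Hypothetical "true" relationship: one intercept plus one coefficient per feature.
true_intercept = 10.0
true_coefs = np.array([0.8, -0.1, 0.05])
target = true_intercept + features @ true_coefs  # kept noise-free for clarity

# Prepend a column of ones so the intercept is estimated with the coefficients.
design = np.column_stack([np.ones(n_samples), features])

# Ordinary least squares: find the weights that minimise the squared error.
solution, *_ = np.linalg.lstsq(design, target, rcond=None)
est_intercept, est_coefs = solution[0], solution[1:]

print("intercept:", est_intercept)
print("coefficients:", est_coefs)
```

Because the synthetic target has no noise, the estimated values should match the chosen ones up to floating-point error. On real data the fit is only approximate.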

Prepare a prediction model for Profit on the 50_Startups data in Python.


4 Answers

by Goeduhub's Expert (2.2k points)
 
Best answer
# Multiple Linear Regression for 50 Startups

# Importing the libraries
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Importing the dataset
dataset = pd.read_csv('50_Startups.csv')
X = dataset.iloc[:, :-1]   # every column except the last (Profit)
y = dataset.iloc[:, -1]    # the last column (Profit) is the target

# Convert the State column into dummy (indicator) columns.
# drop_first=True leaves out one category to avoid redundant columns.
states = pd.get_dummies(X['State'], drop_first=True)

# Drop the original State column
X = X.drop('State', axis=1)

# Concatenate the dummy variables with the numeric features
X = pd.concat([X, states], axis=1)

# Splitting the dataset into the Training set and Test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Fitting Multiple Linear Regression to the Training set
regressor = LinearRegression()
regressor.fit(X_train, y_train)

# Predicting the Test set results
y_pred = regressor.predict(X_test)

# Evaluating the predictions with the R^2 score
score = r2_score(y_test, y_pred)
print(score)
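The drop_first=True argument in the answer above keeps one fewer dummy column than there are categories. With all columns kept, the indicators in every row sum to 1, which duplicates the information already carried by the intercept. A minimal sketch on a hand-made State column follows; the three state names are assumed for illustration, so check the actual values in the file.

```python
import pandas as pd

# Hand-made example column; the state names are assumptions for illustration.
toy = pd.DataFrame({'State': ['New York', 'California', 'Florida', 'California']})

# One indicator column per category.
all_dummies = pd.get_dummies(toy['State'])

# Same encoding with the first (alphabetically sorted) category left out.
reduced = pd.get_dummies(toy['State'], drop_first=True)

print(list(all_dummies.columns))
print(list(reduced.columns))

# With every category kept, each row has exactly one indicator set,
# so the columns always add up to 1.
row_sums = all_dummies.astype(int).sum(axis=1)
print(row_sums.tolist())
```

A row in which every remaining dummy is 0 then stands for the dropped category, so no information is lost.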
by (342 points)
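# Importing the libraries
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Load and inspect the dataset
df = pd.read_csv('50_Startups.csv')
print(df.head())
df.info()
print(df.describe())

# Encode State as integer codes.
# Note: a linear model treats these codes as an ordered numeric feature.
lab_enc = LabelEncoder()
df['State'] = lab_enc.fit_transform(df['State'])
print(df['State'].unique())

# Separate the features (x) from the target (y)
x = df.drop(['Profit'], axis=1)
print(x.head())
print(x.shape)
y = df['Profit']
print(y.head())
print(y.shape)

# Hold out one third of the rows for testing
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=1/3, random_state=45)

# Fit the model and predict on the test set
model = LinearRegression()
model.fit(x_train, y_train)
y_pred = model.predict(x_test)
print(y_pred)

# R^2 score on the test set
score = r2_score(y_test, y_pred)
print(score)
# Output reported in the original answer: 0.9426922836763976

# Mean squared error on the test set
mse = mean_squared_error(y_test, y_pred)
print(mse)

# Actual and predicted profit side by side
newdf = pd.DataFrame({'Actual': y_test, 'Predicted': y_pred})
print(newdf)

# Scatter plot of actual against predicted profit
plt.scatter(y_test, y_pred, marker='^')
plt.xlabel('Actual profit')
plt.ylabel('Predicted profit')
plt.show()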
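To make the two metrics used above more concrete, here is a small sketch that computes MSE, RMSE, and R^2 by hand with NumPy. The numbers are made up and are not results from the dataset. RMSE is the square root of MSE, so it is in the same units as Profit, which can make it easier to read than the raw MSE.

```python
import numpy as np

# Made-up actual and predicted values, for illustration only.
y_true = np.array([100.0, 200.0, 300.0])
y_hat = np.array([110.0, 190.0, 300.0])

residuals = y_true - y_hat

# Mean squared error: the average of the squared residuals.
mse = np.mean(residuals ** 2)

# Root mean squared error: same units as the target.
rmse = np.sqrt(mse)

# R^2: one minus the ratio of residual to total sum of squares.
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print(mse, rmse, r2)
```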

...