# demonstrate the working of the decision tree based ID3 algorithm

Write a program to demonstrate the working of the decision-tree-based ID3 algorithm. Use an appropriate data set to build the decision tree, and apply this knowledge to classify a new sample.


by Goeduhub's Expert (3.1k points)

# Decision Tree

• A decision tree is basically an inverted tree, with each internal node representing a feature (attribute) test.
• The leaf nodes represent the output (class labels).
• Except for the leaf nodes, the remaining nodes act as decision-making nodes.
• Basic terms in a decision tree: root node, decision node, and leaf node.

## Algorithms

• CART (Gini Index)
• ID3 (Entropy, Information Gain)

Note: here we will focus on the ID3 algorithm.

## Algorithm Concepts

1. To understand the concept, we take an example: the classic PlayTennis data set, which is used in the code below.
2. Based on this data, we have to find out whether tennis can be played on a given day.
3. The data set has four attributes. How do we decide which attribute to put at the root node?
4. For this, we calculate the information gain of all the attributes (features); the attribute with the maximum information gain becomes our root node.

### Step 1: Creating the Root Node

Entropy of the whole data set:

Entropy(S) = -(p/(p+n)) * log2(p/(p+n)) - (n/(p+n)) * log2(n/(p+n))

where p is the number of positive examples and n is the number of negative examples.
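As a quick check of the formula: the classic PlayTennis data set has 9 positive and 5 negative examples, which gives an entropy of about 0.940. A minimal sketch (the helper name is just for illustration):

```python
import math

def entropy_pn(p, n):
    """Entropy of a set with p positive and n negative examples."""
    total = p + n
    pp, pn = p / total, n / total
    return -pp * math.log2(pp) - pn * math.log2(pn)

# PlayTennis: 9 "Yes" and 5 "No" examples
e = entropy_pn(9, 5)
print(round(e, 3))  # ≈ 0.940
```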

### Step 2: For Every Attribute/Feature

Average information of an attribute:

I(Attribute) = Sum over values i of ((pi + ni)/(p + n)) * Entropy(i)

where pi and ni are the numbers of positive and negative examples among the rows taking the i-th value of the attribute.

Entropy(i) is the entropy of that subset, calculated in the same way as for the whole data set.

### Step 3: Calculate the Information Gain

Gain(Attribute) = Entropy(S) - I(Attribute)
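The two formulas can be verified by hand for the Outlook attribute, using the well-known per-value class counts of the standard 14-row PlayTennis data (a sketch; the counts are assumptions about that data set):

```python
import math

def entropy_pn(p, n):
    """Entropy of a split with p positive and n negative examples."""
    total = p + n
    return sum(-x / total * math.log2(x / total) for x in (p, n) if x)

# Class counts per Outlook value in the standard PlayTennis data:
# Sunny: 2 yes / 3 no, Overcast: 4 yes / 0 no, Rain: 3 yes / 2 no
total = 14
branches = [(2, 3), (4, 0), (3, 2)]

# I(Outlook): weighted average of the branch entropies
avg_info = sum((p + n) / total * entropy_pn(p, n) for p, n in branches)

# Gain = Entropy(S) - I(Outlook); the whole set has 9 yes / 5 no
gain = entropy_pn(9, 5) - avg_info
print(round(gain, 3))  # ≈ 0.247
```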

1. If all examples are positive, return the single-node tree with label +.
2. If all examples are negative, return the single-node tree with label -.
3. If the attribute list is empty, return the single-node tree labelled with the most common class.

### Step 4: Pick the Highest-Gain Attribute

1. The attribute with the most information gain becomes a node; its values split the data into subsets, and each subset is processed in the same way as the parent (root) node.
2. Again, the attribute with the maximum information gain in each subset becomes the next node, and this process continues until we reach a leaf node.

### Step 5: Repeat Until a Leaf Node Is Reached

Let's take a look at what the final tree will look like.
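For the standard PlayTennis data, the well-known result is the tree below, sketched as the nested dictionary that the implementation later in this answer builds (the attribute and value spellings are assumptions about the data file's column names):

```python
# Expected ID3 result for the classic PlayTennis data (illustrative sketch):
# Outlook is the root; Overcast days are always "Yes", Sunny days split on
# Humidity, and Rain days split on Wind.
expected_tree = {
    'Outlook': {
        'Overcast': 'Yes',
        'Rain': {'Wind': {'Strong': 'No', 'Weak': 'Yes'}},
        'Sunny': {'Humidity': {'High': 'No', 'Normal': 'Yes'}},
    }
}
print(expected_tree)
```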

## Implementation of Decision-Tree (ID3) Algorithm

```python
# Importing important libraries
import pandas as pd
from pandas import DataFrame

# Load the PlayTennis data set; the file name is an assumption, and 'PT'
# (used below) is the Yes/No "Play Tennis" target column
df_tennis = pd.read_csv('PlayTennis.csv')
print(df_tennis)
```

Output

### Calculating the Entropy of the Whole Data Set

```python
import math
from collections import Counter

# Function to calculate the final entropy from a list of probabilities
def entropy(probs):
    return sum(-prob * math.log(prob, 2) for prob in probs)

# Function to calculate the probabilities of positive and negative examples
def entropy_of_list(a_list):
    cnt = Counter(x for x in a_list)  # count the positive and negative examples
    num_instances = len(a_list)
    # The probabilities required for the entropy formula
    probs = [x / num_instances for x in cnt.values()]
    # Call the entropy function for the final entropy
    return entropy(probs)

total_entropy = entropy_of_list(df_tennis['PT'])
print("\nTotal Entropy of PlayTennis Data Set:", total_entropy)
```

Output

Note: `collections.Counter` is a container that stores elements as dictionary keys and their counts as dictionary values.
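For example:

```python
from collections import Counter

# Count class labels in a small example list
labels = ['Yes', 'No', 'Yes', 'Yes']
cnt = Counter(labels)
print(cnt)         # Counter({'Yes': 3, 'No': 1})
print(cnt['Yes'])  # 3

# The counts convert directly into the probabilities used by the entropy formula
probs = [v / len(labels) for v in cnt.values()]
print(probs)       # [0.75, 0.25]
```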

### Calculate Information Gain for each Attribute

```python
# Defining the information gain function
def information_gain(df, split_attribute_name, target_attribute_name, trace=0):
    print("Information Gain Calculation of ", split_attribute_name)
    print("target_attribute_name", target_attribute_name)

    # Group the rows by the values of the current attribute
    df_split = df.groupby(split_attribute_name)
    for name, group in df_split:
        print("Name: ", name)
        print("Group: ", group)

    nobs = len(df.index) * 1.0
    print("NOBS", nobs)

    # Entropy of each attribute value and its weight (the probability part of the formula)
    df_agg_ent = df_split.agg(
        {target_attribute_name: [entropy_of_list, lambda x: len(x) / nobs]}
    )[target_attribute_name]
    df_agg_ent.columns = ['Entropy', 'Prob1']  # name the aggregated columns
    print("df_agg_ent", df_agg_ent)

    # Calculate the information gain
    avg_info = sum(df_agg_ent['Entropy'] * df_agg_ent['Prob1'])
    old_entropy = entropy_of_list(df[target_attribute_name])
    return old_entropy - avg_info

print('Info-gain for Outlook is :' + str(information_gain(df_tennis, 'Outlook', 'PT')), "\n")
```

Output

Note: in the same way, we calculate the information gain of the remaining attributes; the attribute with the most information gain is then chosen as the best attribute.


### Defining ID3 Algorithm

```python
# Defining the ID3 algorithm function
def id3(df, target_attribute_name, attribute_names, default_class=None):
    # Count the Yes/No classes (positive and negative examples)
    from collections import Counter
    cnt = Counter(x for x in df[target_attribute_name])

    # If all examples share one class, return that class as a leaf
    if len(cnt) == 1:
        return next(iter(cnt))

    # Return the default class for an empty data set or empty attribute list
    elif df.empty or (not attribute_names):
        return default_class

    else:
        default_class = cnt.most_common(1)[0][0]  # majority class becomes the default
        print("attribute_names:", attribute_names)

        # Information gain of every remaining attribute
        gainz = [information_gain(df, attr, target_attribute_name)
                 for attr in attribute_names]

        # Select the attribute with the maximum information gain
        index_of_max = gainz.index(max(gainz))     # index of the best attribute
        best_attr = attribute_names[index_of_max]  # choose the best attribute

        # The tree is initially an empty dictionary
        tree = {best_attr: {}}  # initiate the tree with the best attribute as a node
        remaining_attribute_names = [i for i in attribute_names if i != best_attr]

        # Recurse on each subset of rows sharing one value of the best attribute
        for attr_val, data_subset in df.groupby(best_attr):
            subtree = id3(data_subset,
                          target_attribute_name,
                          remaining_attribute_names,
                          default_class)
            tree[best_attr][attr_val] = subtree
        return tree
```


```python
# Get predictor names (all attributes except the class column 'PT')
attribute_names = list(df_tennis.columns)
print("List of Attributes:", attribute_names)

# Remove the class attribute
attribute_names.remove('PT')
print("Predicting Attributes:", attribute_names)
```

Output

```python
# Run the algorithm (calling the ID3 function)
from pprint import pprint

tree = id3(df_tennis, 'PT', attribute_names)
print("\n\nThe Resultant Decision Tree is :\n")
pprint(tree)

attribute = next(iter(tree))
print("Best Attribute :\n", attribute)
print("Tree Keys:\n", tree[attribute].keys())
```

Note: the pprint module pretty-prints arbitrary Python data structures in a well-formatted, readable way.

Note: the output of this run is very large, because the information gain function (which ID3 requires) is also called and prints its trace.

Note: only the tree is shown here; run the code to see the rest of the process output.
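To see what pprint does, here is a small nested dictionary shaped like the decision tree (the values are illustrative):

```python
from pprint import pprint, pformat

# A small nested dictionary shaped like a decision-tree branch
nested = {'Outlook': {'Sunny': {'Humidity': {'High': 'No', 'Normal': 'Yes'}}}}

# pprint breaks the structure across indented lines instead of one long line
pprint(nested, width=40)
```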

### ACCURACY

```python
# Defining a function that classifies one instance with the tree
# (used below to calculate accuracy)
def classify(instance, tree, default=None):
    attribute = next(iter(tree))
    print("Key:", tree.keys())
    print("Attribute:", attribute)
    print("Instance of Attribute :", instance[attribute], attribute)

    if instance[attribute] in tree[attribute].keys():
        result = tree[attribute][instance[attribute]]
        print("Instance Attribute:", instance[attribute], "TreeKeys :", tree[attribute].keys())
        # Recurse while the result is a subtree; otherwise it is a leaf label
        if isinstance(result, dict):
            return classify(instance, result)
        else:
            return result
    else:
        return default
```


```python
# Classify every row and compare with the actual class (training accuracy)
df_tennis['predicted'] = df_tennis.apply(classify, axis=1, args=(tree, 'No'))
print(df_tennis['predicted'])

print('\n Accuracy is:\n' + str(sum(df_tennis['PT'] == df_tennis['predicted'])
                                / (1.0 * len(df_tennis.index))))

# View actual vs predicted side by side (displayed in a notebook cell)
df_tennis[['PT', 'predicted']]
```

Note: we can also split the data into training and test sets and measure accuracy on rows the tree has not seen:

```python
# Train on all but the last four rows, test on the last four
training_data = df_tennis.iloc[1:-4]
test_data = df_tennis.iloc[-4:]

train_tree = id3(training_data, 'PT', attribute_names)

# Copy first to avoid pandas' SettingWithCopyWarning when adding a column
test_data = test_data.copy()
test_data['predicted2'] = test_data.apply(classify, axis=1, args=(train_tree, 'Yes'))

print('\n\n Accuracy is : ' + str(sum(test_data['PT'] == test_data['predicted2'])
                                  / (1.0 * len(test_data.index))))
```

Output