Online Courses
Free Tutorials  Go to Your University  Placement Preparation 
Artificial Intelligence(AI) & Machine Learning(ML) Training in Jaipur
Online Training - Youtube Live Class Link
0 like 0 dislike
in RTU B.Tech (CSE-VI Sem) Machine Learning Lab by Goeduhub's Expert (3.1k points)
Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an appropriate data set for building the decision tree and apply this knowledge toclassify a new sample

Goeduhub's Online Courses @Udemy

For Indian Students- INR 570/- || For International Students- $12.99/-


Course Name

Apply Coupon


Tensorflow 2 & Keras:Deep Learning & Artificial Intelligence

Apply Coupon


Computer Vision with OpenCV | Deep Learning CNN Projects

Apply Coupon


Complete Machine Learning & Data Science with Python Apply Coupon


Natural Language Processing-NLP with Deep Learning in Python Apply Coupon


Computer Vision OpenCV Python | YOLO| Deep Learning in Colab Apply Coupon


Complete Python Programming from scratch with Projects Apply Coupon

2 Answers

0 like 0 dislike
by Goeduhub's Expert (3.1k points)
edited by
Best answer

Decision Tree 

  • The Decision Tree Basically  is an inverted tree, with each node representing features and attributes.
  • While the leaf node represents the output
  • Except for the leaf node, the remaining nodes act as decision making nodes.
  • Basic Terms in Decision tree 


  • CART (Gini Index)
  • ID3 (Entropy, Information Gain)

Note:- Here we will understand the ID3 algorithm 

Algorithm Concepts

  1. To understand this concept, we take an example, assuming we have a data set (link is given here Click Here).
  2. Based on this data, we have to find out if we can play someday or not.
  3. We have four attributes in the data-set. Now how do we decide which attribute we should put on the root node?
  4. For this, we will Calculate the information gain of all the attributes (Features), which will have maximum information will be our root node.

Step1 : Creating a root node  

Entorpy(Entropy of whole data-set)

Entropy (S) = (-p/p+n)*log2 (p/p+n) - (n/n+p)*log2 ((n/n+p))

p- p stand for number of positive examples

n- n stand for number of negative examples.

Step2: For Every Attribute/Features 

Average Information (AIG of a particular attribute)

I(Attribute) = Sum of {(pi+ni/p+n)*Entropy(Entropy of Attribute)}

pi- Here pi stand for number of positive examples in particular attribute.

ni- Here ni stand for number of negative examples in particular attribute.

Entropy (Attribute) - Entropy of Attribute calculated in same as we calculated for System (Whole Data-Set)

Information Gain

Gain = Entropy(S) - I (Attribute)


  1. If all examples are positive, Return the single-node tree ,with label=+
  2. If all examples are Negative, Return the single-node tree,with label= -
  3. If Attribute empty, Return the single-node tree

Step4: Pick The Highest Gain Attribute

  1. The attribute that has the most information gain has to create a group of all the its attributes and process them in same as  which we have done for the parent (Root) node.
  2. Again, the feature which has maximum information gain will become  a node and this process will continue until we get the leaf node.

Step5: Repeat Until we get final node (Leaf node )

Let's take a look how our tree will look like 

Implementation of Decision-Tree (ID3) Algorithm

(link is given here for Data-set Click Here )

#Importing important libraries 

import pandas as pd

from pandas import DataFrame 

#Reading Dataset 

df_tennis = pd.read_csv('DS.csv')

print( df_tennis)


Calculating Entropy of Whole Data-set 

#Function to calculate final Entropy 

def entropy(probs):  

    import math

return sum( [-prob*math.log(prob, 2) for prob in probs] )

#Function to calculate Probabilities of positive and negative examples 

def entropy_of_list(a_list):

from collections import Counter

    cnt = Counter(x for x in a_list) #Count the positive and negative ex

num_instances = len(a_list)

#Calculate the probabilities that we required for our entropy formula 

probs = [x / num_instances for x in cnt.values()] 

#Calling entropy function for final entropy 

 return entropy(probs)

total_entropy = entropy_of_list(df_tennis['PT'])

print("\n Total Entropy of PlayTennis Data Set:",total_entropy)


A counter is a container that stores elements as dictionary keys, and their counts are stored as dictionary values.

Calculate Information Gain for each Attribute 

#Defining Information Gain Function 

def information_gain(df, split_attribute_name, target_attribute_name, trace=0):

print("Information Gain Calculation of ",split_attribute_name)


#Grouping features of Current Attribute

 df_split = df.groupby(split_attribute_name)

    for name,group in df_split:

        print("Name: ",name)

        print("Group: ",group)

nobs = len(df.index) * 1.0


#Calculating Entropy of the Attribute and probability part of formula 

    df_agg_ent = df_split.agg({target_attribute_name : [entropy_of_list, lambda x: len(x)/nobs] })[target_attribute_name]


# Calculate Information Gain

    avg_info = sum( df_agg_ent['Entropy'] * df_agg_ent['Prob1'] )

    old_entropy = entropy_of_list(df[target_attribute_name])

    return old_entropy - avg_info

print('Info-gain for Outlook is :'+str(information_gain(df_tennis, 'Outlook', 'PT')),"\n")



In the same way, we will Calculate the information gain of the remaining attributes and then the attribute who has the most information will be named the best attribute

0 like 0 dislike
by Goeduhub's Expert (3.1k points)


Defining ID3 Algorithm 

#Defining ID3  Algorithm Function

def id3(df, target_attribute_name, attribute_names, default_class=None):

#Counting Total number of yes and no classes (Positive and negative Ex)

from collections import Counter

    cnt = Counter(x for x in df[target_attribute_name])

if len(cnt) == 1:

        return next(iter(cnt))

# Return None for Empty Data Set 

 elif df.empty or (not attribute_names):

        return default_class


default_class = max(cnt.keys())


        gainz = [information_gain(df, attr, target_attribute_name) for attr in attribute_names] 

#Separating the maximum information gain attribute after calculating the information gain 

        index_of_max = gainz.index(max(gainz)) #Index of Best Attribute 

best_attr = attribute_names[index_of_max] #choosing best attribute 

#The tree is initially an empty dictionary

tree = {best_attr:{}} # Initiate the tree with best attribute as a node 

        remaining_attribute_names = [i for i in attribute_names if i != best_attr]

for attr_val, data_subset in df.groupby(best_attr):

subtree = id3(data_subset,




            tree[best_attr][attr_val] = subtree

        return tree


# Get Predictor Names (all but 'class')

attribute_names = list(df_tennis.columns)

print("List of Attributes:", attribute_names) 


#Remove the class attribute 

print("Predicting Attributes:", attribute_names)


# Run Algorithm (Calling ID3 function)

from pprint import pprint

tree = id3(df_tennis,'PT',attribute_names)

print("\n\nThe Resultant Decision Tree is :\n")


attribute = next(iter(tree))

print("Best Attribute :\n",attribute)

print("Tree Keys:\n",tree[attribute].keys())

Note:-The pprint module provides a capability to pretty-print arbitrary Python data structures in a well-formatted and more readable way.

Note:- After running the algorithm the output will be very large because we have also called the information gain function in it, which is required for ID3 Algorithm.

Note:- Here I am showing only the tree,to see the rest of the process result, run the code and see.


#Defining a function to calculate accuracy

def classify(instance, tree, default=None):

attribute = next(iter(tree))



print("Insance of Attribute :",instance[attribute],attribute)

    if instance[attribute] in tree[attribute].keys(): 

        result = tree[attribute][instance[attribute]]

        print("Instance Attribute:",instance[attribute],"TreeKeys :",tree[attribute].keys())

        if isinstance(result, dict): 

            return classify(instance, result)


            return result 


        return default


df_tennis['predicted'] = df_tennis.apply(classify, axis=1, args=(tree,'No') ) 


print('\n Accuracy is:\n' + str( sum(df_tennis['PT']==df_tennis['predicted'] ) / (1.0*len(df_tennis.index)) ))

df_tennis[['PT', 'predicted']]


training_data = df_tennis.iloc[1:-4] 

test_data  = df_tennis.iloc[-4:]

train_tree = id3(training_data, 'PT', attribute_names)

test_data['predicted2'] = test_data.apply(  

classify, axis=1, args=(train_tree,'Yes') ) 

print ('\n\n Accuracy is : ' + str( sum(test_data['PT']==test_data['predicted2'] ) / (1.0*len(test_data.index)) ))


Click here for more programs of  RTU ML LAB

Our Mentors(For AI-ML)

Sharda Godara Chaudhary

Mrs. Sharda Godara Chaudhary

An alumna of MNIT-Jaipur and ACCENTURE, Pune


Ms. Nisha

An alumna of IIT-BHU


About Us | Contact Us || Terms & Conditions | Privacy Policy || Youtube Channel || Telegram Channel © Social::   |  |