# Implementation of EM (Expectation Maximization) Algorithm using Gaussian Mixture Model


by Goeduhub's Expert (3.1k points)

## EM Algorithm concept

In a machine learning application, a data set can contain two types of data: observed and unobserved (latent variables or missing values that are not directly observable and must instead be inferred from the values of the observed variables).

The EM algorithm is widely used for density estimation (maximum likelihood estimation), and it is also well known as a clustering algorithm.

The EM algorithm is an approach for performing maximum likelihood estimation in the presence of latent variables.

**Probability density estimation**

Probability density estimation is the construction of an estimate based on observed data. It involves selecting a probability distribution function, and the parameters of that function, that best explain the joint probability of the observed data.

Fitting the joint probability distribution therefore requires density estimation: finding a probability distribution function and the parameters of that distribution. The most common technique for solving this problem is maximum likelihood estimation, or simply "maximum likelihood".

**Maximum likelihood estimation**

In statistics, maximum likelihood estimation is the method of estimating the parameters of a probability distribution by maximizing the likelihood function, so that the observed data are most probable under the statistical model.

But maximum likelihood has a limitation: it assumes the data is complete and fully observed. It does not strictly require that the model have access to all the data; rather, it assumes that all variables relevant to the model are present. In some cases, relevant variables remain hidden, which causes inconsistencies.
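For a fully observed Gaussian sample, maximum likelihood has a closed-form solution: the sample mean and the (biased) sample standard deviation. A quick illustration with synthetic data (the numbers here are illustrative, not from the original post):

```python
import numpy as np

# Synthetic, fully observed data from a single Gaussian (illustrative only)
rng = np.random.default_rng(0)
data = rng.normal(loc=20, scale=5, size=4000)

# Closed-form maximum likelihood estimates for a Gaussian:
# mean -> sample mean, variance -> mean squared deviation (ddof=0)
mu_hat = data.mean()
sigma_hat = data.std(ddof=0)

print(mu_hat, sigma_hat)  # close to the true values 20 and 5
```

When latent variables are present, no such closed form exists, which is exactly the gap EM fills.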

So, to apply the EM algorithm to a clustering problem, we use the Gaussian mixture model.

### Steps involved in the EM Algorithm

1. Start with a set of initial parameters for the incomplete data (treat it as complete data with latent variables or missing values).

2. E-step (Expectation step): estimate the missing values and latent variables from the observed data, i.e. compute their expected values under the current parameters (updating the variables and data).

3. M-step (Maximization step): use the completed data from the E-step to re-estimate the parameters, updating the hypothesis.

4. If convergence has not been reached, repeat steps 2 and 3.

See the diagram for clarification
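The loop above can be sketched by hand for a two-component, one-dimensional Gaussian mixture. This is an illustrative sketch of the E- and M-steps, not a production implementation (the sklearn model used later does this internally):

```python
import numpy as np

rng = np.random.default_rng(1)
# Two overlapping Gaussian processes, as in the example further below
x = np.hstack([rng.normal(20, 5, 4000), rng.normal(40, 5, 8000)])

# Step 1: starting parameters (weights, means, std devs) -- rough guesses
w = np.array([0.5, 0.5])
mu = np.array([10.0, 50.0])
sd = np.array([10.0, 10.0])

for _ in range(100):
    # E-step: responsibility of each component for each point
    dens = w * np.exp(-0.5 * ((x[:, None] - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
    resp = dens / dens.sum(axis=1, keepdims=True)

    # M-step: re-estimate parameters from the responsibilities
    nk = resp.sum(axis=0)
    w = nk / len(x)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    sd = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)

print(mu, sd)  # means approach the true values 20 and 40
```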

### Use of GMM (Gaussian mixture model) in EM

1. A Gaussian mixture model represents the data as a combination of Gaussian probability distributions and requires estimating the mean and standard deviation parameters of each component.
2. A Gaussian mixture model can be fitted with a number of estimation techniques, the most common of which is maximum likelihood.

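Concretely, the mixture density is a weighted sum of component Gaussians, p(x) = Σₖ wₖ · N(x; μₖ, σₖ). A minimal sketch in plain NumPy, using weights and parameters that match the two-process example below:

```python
import numpy as np

def gaussian_pdf(x, mu, sd):
    """Density of a single Gaussian."""
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def mixture_pdf(x, weights, means, sds):
    """GMM density: weighted sum of the component Gaussian densities."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, sds))

# Two components with equal spread, weighted 1:2 as in the example below
p = mixture_pdf(30.0, weights=[1/3, 2/3], means=[20, 40], sds=[5, 5])
print(p)
```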

Let's see this with an example.

1. Suppose the data points we have are generated by two different, unknown processes.
2. We do not know which data point belongs to which process, and the data also includes latent variables and missing values.
3. Both processes have Gaussian probability distributions.
4. We will apply the EM algorithm with a Gaussian mixture model to determine which data points come from which process.

```python
# Importing required libraries
from numpy import hstack
from numpy.random import normal
import matplotlib.pyplot as plt

# Drawing samples from the two processes
sample1 = normal(loc=20, scale=5, size=4000)
sample2 = normal(loc=40, scale=5, size=8000)
sample = hstack((sample1, sample2))

# Showing the data points
plt.hist(sample, bins=50, density=True)
plt.show()
```

Output: a histogram of the combined sample, with overlapping peaks around 20 and 40.

Note:

1. We generated one-dimensional data with mean 20 for process 1, mean 40 for process 2, and standard deviation 5 for both.
2. We drew 4000 points for process 1 and 8000 points for process 2.
3. We then plotted a histogram showing the data distribution.
4. As the graph shows, it is unclear which distribution the points between 20 and 40 belong to.

```python
# Gaussian mixture model with expectation maximization
from numpy import hstack
from numpy.random import normal

# Importing the Gaussian mixture model
from sklearn.mixture import GaussianMixture

# Generate a sample
sample1 = normal(loc=20, scale=5, size=4000)
sample2 = normal(loc=40, scale=5, size=8000)
sample = hstack((sample1, sample2))

# Reshape into a table with one column to fit the data
sample = sample.reshape((len(sample), 1))

# Training the model
model = GaussianMixture(n_components=2, init_params='random')
model.fit(sample)

# Predict latent values (cluster labels)
yhat = model.predict(sample)

# Check latent values for the first few points
print(yhat[:80])

# Check latent values for the last few points
print(yhat[-80:])
```

Output: two arrays of predicted cluster labels (0s and 1s) for the first and last 80 points.

Note: In the code above we applied the EM algorithm via a Gaussian mixture model, and the output shows that the model assigns each point to one of the two classes, labelled 0 and 1.
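Beyond the cluster labels, the fitted model also exposes the parameters it recovered, which can be checked against the true generating values. A self-contained sketch (the seeds and the default k-means initialization are choices made here for reproducibility; they are not part of the original code):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Regenerate the same two-process sample
rng = np.random.default_rng(2)
sample = np.hstack([rng.normal(20, 5, 4000), rng.normal(40, 5, 8000)])
sample = sample.reshape(-1, 1)

# Default k-means initialization; seeded for reproducibility
model = GaussianMixture(n_components=2, random_state=0)
model.fit(sample)

# Recovered parameters: means near 20 and 40, weights near 1/3 and 2/3
means = np.sort(model.means_.ravel())
weights = np.sort(model.weights_)
print(means, weights)
```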