Linear Algebra for Data Science
Linear algebra is a branch of mathematics that studies lines, planes, vectors, matrices, and systems of linear equations. It has a wide range of applications in data science, a glimpse of which we will see here.
Why should you learn Linear Algebra?
We can easily visualize data in 2D and 3D, but what about 4D, 5D, or 10D? Human eyes cannot visualize more than three dimensions, yet data scientists often need to work with hundreds of dimensions (features). This is where linear algebra comes into play. Even if we cannot visualize that many dimensions, we can still represent the data as vectors and matrices. For example, the distance of a point (a, b, c) in three dimensions from the origin is sqrt(a^{2} + b^{2} + c^{2}), and the same formula extends to a point with n coordinates. Sometimes we also reduce or increase the number of dimensions for better visualization and analysis.
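As a small illustration of the distance formula, here is a minimal sketch using NumPy. The point values are made up for the example, and `numpy.linalg.norm` is used as one convenient way to compute the same square root of summed squares.

```python
import numpy as np

# A point in 3 dimensions: its distance from the origin is sqrt(a^2 + b^2 + c^2).
point_3d = np.array([1.0, 2.0, 2.0])  # illustrative values
dist_3d = np.sqrt(np.sum(point_3d ** 2))
print(dist_3d)  # 3.0

# The same formula works unchanged for a point with 100 features.
point_100d = np.ones(100)  # illustrative values
dist_100d = np.linalg.norm(point_100d)
print(dist_100d)  # 10.0
```

Nothing in the code changes when the number of dimensions grows, which is the practical benefit of treating data as vectors.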
Vectors
Linear algebra deals with vector spaces. A vector space is a set of objects called vectors, which can be added together and multiplied by scalars. For our purposes, a vector is a list of numbers. It can be thought of as a point in space, represented numerically as a list of coordinates.
Row Vector - A vector with 1 row and n columns. Its dimension is 1×n.
Column Vector - A vector with n rows and 1 column. Its dimension is n×1.
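One possible way to see the two shapes is to build them as 2D NumPy arrays. This is only a sketch; the values 1, 2, 3 are arbitrary.

```python
import numpy as np

row_vector = np.array([[1, 2, 3]])          # 1 row, 3 columns
column_vector = np.array([[1], [2], [3]])   # 3 rows, 1 column

print(row_vector.shape)     # (1, 3)
print(column_vector.shape)  # (3, 1)

# Transposing a row vector gives the corresponding column vector.
print(np.array_equal(row_vector.T, column_vector))  # True
```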
Vector Addition - Suppose you have two vectors A and B.
A=[1,2,3] B=[3,2,1]
Then, C=A+B = [1+3,2+2,3+1] = [4,4,4]
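The same addition can be sketched in NumPy, where `+` on two arrays of equal length adds them component by component:

```python
import numpy as np

A = np.array([1, 2, 3])
B = np.array([3, 2, 1])

# Component-wise addition of the two vectors.
C = A + B
print(C)  # [4 4 4]
```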
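Vector Multiplication - We can multiply vectors using dot and cross products. Cross products are not widely used in machine learning, so we will only look at the dot product here. The dot product of two vectors is the sum of the products of their corresponding components: for A and B above, A·B = 1*3 + 2*2 + 3*1 = 10.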
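A minimal sketch of the dot product for the vectors A and B used above, computed once by hand (sum of component-wise products) and once with `numpy.dot`:

```python
import numpy as np

A = np.array([1, 2, 3])
B = np.array([3, 2, 1])

# Sum of the products of corresponding components: 1*3 + 2*2 + 3*1.
dot_manual = sum(a * b for a, b in zip(A, B))

# The same computation using NumPy's built-in dot product.
dot_numpy = np.dot(A, B)

print(dot_manual, dot_numpy)  # 10 10
```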
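Projection and Unit Vector - A unit vector is a vector of length 1; dividing any non-zero vector by its length gives the unit vector in its direction. The projection of a vector A onto a vector B is the component of A that lies along the direction of B.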
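Assuming the standard definitions (a unit vector is a vector divided by its length, and the projection of a onto b is ((a·b) / (b·b)) b), both can be sketched as follows. The vectors `a` and `b` are illustrative values chosen so the arithmetic is easy to check.

```python
import numpy as np

a = np.array([3.0, 4.0])  # illustrative vector with length 5
b = np.array([1.0, 0.0])  # illustrative vector along the x-axis

# Unit vector: divide the vector by its length.
unit_a = a / np.linalg.norm(a)
print(unit_a)  # [0.6 0.8]

# Projection of a onto b: scale b by (a . b) / (b . b).
proj_a_on_b = (np.dot(a, b) / np.dot(b, b)) * b
print(proj_a_on_b)  # [3. 0.]
```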
Plane
In two dimensions (x,y), a linear boundary is a line. In three dimensions (x,y,z), it is a plane. Similarly, in n dimensions it is called a hyperplane.
y=mx+c is the equation of a line, where m is the slope and c is the y-intercept.
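The equation of a plane in n dimensions is w^{T}x+w_{0}=0, where w is the vector of coefficients, x is a point, and w_{0} is the intercept term.
Hence, the equation of a plane passing through the origin (w_{0}=0) is w^{T}x=0.
We use w-transpose because a vector is represented as a column vector by default. Transposing w turns it into a row vector so that it can be multiplied with the column vector x.
The normal to a plane is a vector perpendicular to the plane. In the equation w^{T}x+w_{0}=0, the vector w is the normal.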
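To make the plane equation concrete, here is a small sketch that evaluates w^{T}x+w_{0} for a few points. The values of `w`, `w0`, and the points are hypothetical, and `plane_value` is a helper name introduced only for this example.

```python
import numpy as np

w = np.array([2.0, -1.0, 3.0])  # normal vector of the plane (illustrative values)
w0 = -4.0                       # intercept term (illustrative value)

def plane_value(x):
    """Evaluate w^T x + w0; the result is 0 for points on the plane."""
    # For 1D NumPy arrays, np.dot(w, x) already computes w^T x.
    return np.dot(w, x) + w0

p1 = np.array([2.0, 0.0, 0.0])
p2 = np.array([0.0, -4.0, 0.0])
q = np.array([3.0, 1.0, 1.0])

print(plane_value(p1))  # 0.0 -> p1 lies on the plane
print(plane_value(p2))  # 0.0 -> p2 lies on the plane
print(plane_value(q))   # 4.0 -> q lies off the plane, on the side w points to

# w is perpendicular to any direction within the plane, such as p1 - p2.
print(np.dot(w, p1 - p2))  # 0.0
```

The sign of `plane_value` tells which side of the plane a point falls on, which is how linear classifiers use hyperplanes to separate data.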
Matrix
A matrix is a two-dimensional array of numbers arranged in rows and columns. We often work with matrices when applying machine learning algorithms. Operations such as addition, subtraction, multiplication, and transposition can be performed on matrices.
Sometimes we get similarity matrices computed from very large datasets. Such matrices are far too large to inspect visually, so we rely on linear algebra to analyze them.
Matrices also have eigenvalues and eigenvectors, which are used in principal component analysis (PCA).
import numpy

x = numpy.array([[1, 2], [3, 4]])
y = numpy.array([[5, 6], [7, 8]])

# Addition of two matrices
print(numpy.add(x, y))
# Subtraction of two matrices
print(numpy.subtract(x, y))
# Element-wise division of two matrices
print(numpy.divide(x, y))
# Element-wise multiplication of two matrices
print(numpy.multiply(x, y))
# Matrix product of two matrices
print(numpy.dot(x, y))
# Transpose of a matrix
print(x.T)

OUTPUT
Addition of two matrices:
[[ 6  8]
 [10 12]]
Subtraction of two matrices:
[[-4 -4]
 [-4 -4]]
Element-wise division of two matrices:
[[0.2        0.33333333]
 [0.42857143 0.5       ]]
Element-wise multiplication of two matrices:
[[ 5 12]
 [21 32]]
Matrix product of two matrices:
[[19 22]
 [43 50]]
Matrix transposition:
[[1 3]
 [2 4]]
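The eigenvalues and eigenvectors mentioned in this section can be sketched in the same way. The example below is a minimal illustration, not a full PCA: it assumes a small symmetric matrix `M` with made-up values (PCA applies the same decomposition to a covariance matrix) and uses `numpy.linalg.eigh`, which is intended for symmetric matrices.

```python
import numpy as np

# A small symmetric matrix (illustrative values).
M = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh returns eigenvalues in ascending order and eigenvectors as columns.
eigenvalues, eigenvectors = np.linalg.eigh(M)
print(eigenvalues)

# Check the defining property M v = lambda v for each eigenpair.
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    print(np.allclose(M @ v, eigenvalues[i] * v))  # True
```

For this matrix the eigenvalues are 1 and 3. The eigenvector belonging to the largest eigenvalue points in the direction along which the matrix stretches vectors the most, which is the idea PCA builds on.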