Vector and matrix operations in Python


6 min read

Series: Linear algebra for Machine Learning

2 parts

Have you ever wondered how Netflix decides what to recommend to you, or how an algorithm measures how “good” a candidate is? The answer is not magic; it's linear algebra.

Subtraction and multiplication

  • In subtraction (-), each component of the second vector is subtracted from the corresponding component of the first.
  • In multiplication (*), corresponding components are multiplied together. This is the element-wise product, not the dot product.
import numpy as np

v1 = np.array([10, 20, 30])
v2 = np.array([1, 2, 3])

# Subtraction
resta = v1 - v2  # Result: [9, 18, 27]

# Multiplication
mult = v1 * v2  # Result: [10, 40, 90]

Subtraction allows us to measure distances, errors, and changes in direction in multidimensional spaces.
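The component-wise behavior described above comes from NumPy arrays, not from Python itself. As a minimal sketch (the names lista1, lista2 and lists_subtract are illustrative, not from the original example), the following shows that plain lists reject subtraction, while the same values wrapped in np.array behave component by component:

```python
import numpy as np

lista1 = [10, 20, 30]
lista2 = [1, 2, 3]

# Plain Python lists do not define subtraction, so this raises TypeError
try:
    lista1 - lista2
    lists_subtract = True
except TypeError:
    lists_subtract = False

# Converting to NumPy arrays enables component-wise arithmetic
resta = np.array(lista1) - np.array(lista2)
mult = np.array(lista1) * np.array(lista2)

print(lists_subtract)
print(resta, mult)
```

This is why the examples in this article always build vectors with np.array before operating on them.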

Norm

It is a function that assigns a non-negative real number to a vector (only the zero vector has norm 0). Simply put, it is the measure of a vector’s length or magnitude.

If you think of a vector as an arrow that goes from the origin to a point, the norm is the “distance” that arrow travels.

Uses of the norm

  • Data normalization: the norm is used to rescale vectors so that all of them have length 1 (unit vectors), allowing the model to compare “apples to apples.”
  • Error and distance: as we saw with subtraction, if you subtract two vectors and compute the norm of the result, you obtain the Euclidean (straight-line) distance between those two data points.

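To compute the norm of a vector, we can use NumPy’s np.linalg.norm function, which returns the Euclidean (L2) norm by default.

v = np.array([10, 20, 30])
np.linalg.norm(v)  # Result: ≈ 37.42 (the square root of 10² + 20² + 30² = 1400)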
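To make the definition concrete, here is a small sketch that computes the Euclidean norm by hand and compares it with np.linalg.norm. It also illustrates the two uses listed above: rescaling to a unit vector and measuring the distance between two points. The vectors a and b and the variable names are made up for this illustration.

```python
import numpy as np

v = np.array([10, 20, 30])

# Euclidean (L2) norm by hand: square root of the sum of squared components
manual_norm = np.sqrt(np.sum(v ** 2))
library_norm = np.linalg.norm(v)

# Unit vector: same direction as v, but with length 1
v_unit = v / library_norm

# Distance between two points: the norm of their difference
a = np.array([1, 2, 3])
b = np.array([4, 6, 3])
distance = np.linalg.norm(a - b)  # sqrt(3² + 4² + 0²) = 5.0

print(manual_norm, library_norm)
print(np.linalg.norm(v_unit))
print(distance)
```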

Dot product

The dot product is behind recommendation systems, semantic search, and language models.

The dot product helps us measure the geometric relationship between two vectors, returning a scalar instead of a vector.

One rule must be taken into account: both vectors must have the same number of components.

v1 = np.array([10, 20, 30])
v2 = np.array([1, 2, 3])

prod_punto = np.dot(v1, v2)  # Result: 140 (10*1 + 20*2 + 30*3)
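As a rough sketch of what np.dot is doing for one-dimensional vectors, the snippet below rebuilds the dot product from an element-wise multiplication followed by a sum, and then shows what happens when the same-length rule is broken. The vector v3 and the flag mismatch_detected are hypothetical names added for the demonstration.

```python
import numpy as np

v1 = np.array([10, 20, 30])
v2 = np.array([1, 2, 3])

# Dot product by hand: multiply corresponding components, then add them up
manual_dot = np.sum(v1 * v2)
library_dot = np.dot(v1, v2)

# A vector with a different number of components cannot be combined with v1
v3 = np.array([1, 2])
try:
    np.dot(v1, v3)
    mismatch_detected = False
except ValueError:
    # NumPy reports that the shapes are not aligned
    mismatch_detected = True

print(manual_dot, library_dot, mismatch_detected)
```

For one-dimensional arrays, v1 @ v2 gives the same result as np.dot(v1, v2).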

Practice

Let’s move on to practice. For this, we will build a small program that helps a company recruiter find the best candidates.

We will define two vectors: one for the ideal profile and another for a candidate, where each component represents their skill level in (Java, SQL, English):

perfil_ideal = np.array([10, 8, 9])
perfil_candidato = np.array([8, 9, 4])

We will calculate which skills the candidate is lacking by using subtraction:

error = perfil_ideal - perfil_candidato
print(f"Skill gap: {error}")  
# [2, -1, 5] -> Missing 2 in Java, exceeds by 1 in SQL, and missing 5 in English.

Now we will use the norm to determine how large the total error is:

distancia_error = np.linalg.norm(error)
print(f"Total error magnitude: {distancia_error:.2f}")

The norm of the difference between two vectors is the Euclidean distance between them. Here it is √(2² + (-1)² + 5²) = √30 ≈ 5.48.

Next, we normalize both profiles so that they are on the same scale. Dividing each vector by its own norm converts it into a unit vector:

perfil_ideal_unit = perfil_ideal / np.linalg.norm(perfil_ideal)
perfil_cand_unit = perfil_candidato / np.linalg.norm(perfil_candidato)
print(f"Normalized ideal profile: {perfil_ideal_unit}")

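Finally, we take the dot product of the two unit vectors, which gives the cosine similarity between the profiles. A value of 1 means both profiles point in the same direction (the same proportions of skills), and 0 means they are orthogonal, with nothing in common. In general the value can go down to -1 for opposite directions, but with non-negative skill scores it stays between 0 and 1: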

similitud = np.dot(perfil_ideal_unit, perfil_cand_unit)
print(f"Similarity: {similitud:.4f}")
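The same steps can be wrapped in a small helper to compare several candidates at once. This is only a sketch: the function cosine_similarity and the candidates A, B and C are invented for illustration and are not part of the original program. Candidate B is deliberately set to exactly half of the ideal profile to show that cosine similarity compares direction only, so it is worth reading alongside the Euclidean distance.

```python
import numpy as np

def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their norms."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

perfil_ideal = np.array([10, 8, 9])

# Hypothetical candidates (Java, SQL, English), made up for this example
candidatos = {
    "A": np.array([8, 9, 4]),
    "B": np.array([5, 4, 4.5]),  # exactly half of the ideal profile
    "C": np.array([2, 9, 10]),
}

# Cosine similarity and Euclidean distance of each candidate to the ideal
similitudes = {n: cosine_similarity(perfil_ideal, v) for n, v in candidatos.items()}
distancias = {n: np.linalg.norm(perfil_ideal - v) for n, v in candidatos.items()}

# Rank candidates from most to least similar in direction
ranking = sorted(similitudes, key=similitudes.get, reverse=True)

for nombre in ranking:
    print(f"{nombre}: similarity={similitudes[nombre]:.4f}, distance={distancias[nombre]:.2f}")
```

Candidate B ranks first by similarity because its skills have the same proportions as the ideal profile, yet it is farther from the ideal than candidate A in absolute terms. A recruiter would probably want to look at both numbers.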

Summary

  • Subtraction helps us measure how far we are from the goal. Reality - Prediction = Error.
  • The norm represents how large a vector is (its magnitude).
  • The dot product multiplies corresponding components and sums them.


Andres Parra

Software Engineer

I'm Andres Parra, Software Engineer passionate about developing scalable and innovative technological solutions. I specialize in building modern web applications, mastering a versatile stack that includes JavaScript, TypeScript, Python, and Java, along with frameworks like React, Next.js, and Spring Boot. I'm also interested in the latest technologies and tools for development.