How can I calculate the Euclidean distance in Python, and could you provide some examples?

The Euclidean distance is the straight-line distance between two points in a multi-dimensional space. It is widely used in mathematics, statistics, and machine learning. In Python, it can be calculated with the norm function from the NumPy library's linalg module: applied to the difference of two vectors, norm returns the Euclidean distance between them. Typical uses include finding the distance between two points in 2D or 3D space and calculating the distance between two vectors of any length. The same function also serves as a building block for related problems, such as finding the shortest distance between a point and a line.

Calculate Euclidean Distance in Python (With Examples)


The Euclidean distance between two vectors, A and B, is calculated as:

Euclidean distance = $\sqrt{\sum_{i} (A_i - B_i)^2}$
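
To make each step of the formula explicit, it can be sketched in plain Python before turning to NumPy. The function name euclidean_distance below is a hypothetical helper written for illustration; it is not part of any library:

```python
import math

def euclidean_distance(a, b):
    """Hypothetical helper: apply the formula above term by term."""
    if len(a) != len(b):
        raise ValueError("both vectors must have the same length")
    # Square each difference (Ai - Bi), sum the squares, take the square root
    squared_diffs = [(ai - bi) ** 2 for ai, bi in zip(a, b)]
    return math.sqrt(sum(squared_diffs))

# The squared differences are 1 and 1, so the distance is sqrt(2)
result = euclidean_distance([2, 6], [3, 5])
print(result)
```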

To calculate the Euclidean distance between two vectors in Python, we can use the numpy.linalg.norm function:

#import functions
import numpy as np
from numpy.linalg import norm

#define two vectors
a = np.array([2, 6, 7, 7, 5, 13, 14, 17, 11, 8])
b = np.array([3, 5, 5, 3, 7, 12, 13, 19, 22, 7])

#calculate Euclidean distance between the two vectors 
norm(a-b)

12.409673645990857

The Euclidean distance between the two vectors turns out to be 12.40967.
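
The same call works for points in 2D or 3D space, since a point is simply a short vector. The following is a minimal sketch with made-up coordinates, chosen so the results are easy to check by hand:

```python
import numpy as np
from numpy.linalg import norm

# 2D: the points form a 3-4-5 right triangle with the origin
p2 = np.array([0, 0])
q2 = np.array([3, 4])
dist_2d = norm(p2 - q2)

# 3D: the differences are (1, 2, 2), so the distance is sqrt(1 + 4 + 4)
p3 = np.array([1, 2, 3])
q3 = np.array([2, 4, 5])
dist_3d = norm(p3 - q3)

print(dist_2d, dist_3d)
```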

Note that this code raises a ValueError if the two vectors are not of equal length, because NumPy cannot subtract arrays with mismatched shapes:

#import functions
import numpy as np
from numpy.linalg import norm

#define two vectors
a = np.array([2, 6, 7, 7, 5, 13, 14])
b = np.array([3, 5, 5, 3, 7, 12, 13, 19, 22, 7])

#calculate Euclidean distance between the two vectors 
norm(a-b)

ValueError: operands could not be broadcast together with shapes (7,) (10,) 
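
One possible way to guard against this is to compare the shapes before subtracting. The helper name safe_euclidean below is hypothetical and only a sketch. An explicit check is also useful because NumPy broadcasting lets a length-1 array be subtracted from a longer one without any error, which would silently give a misleading result:

```python
import numpy as np
from numpy.linalg import norm

def safe_euclidean(a, b):
    """Hypothetical helper: check shapes explicitly before computing the distance."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError(
            f"vectors must have the same shape, got {a.shape} and {b.shape}"
        )
    return norm(a - b)

# Equal lengths: returns the distance as usual
ok = safe_euclidean([2, 6, 7], [3, 5, 5])

# Unequal lengths: the error message now names the problem directly
try:
    safe_euclidean([2, 6, 7], [3, 5])
except ValueError as err:
    message = str(err)
    print(message)
```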

Note that we can also use this function to calculate the Euclidean distance between two columns of a pandas DataFrame:

#import functions
import pandas as pd 
import numpy as np
from numpy.linalg import norm

#define DataFrame with three columns
df = pd.DataFrame({'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#calculate Euclidean distance between 'points' and 'assists' 
norm(df['points'] - df['assists'])

40.496913462633174

The Euclidean distance between the two columns turns out to be 40.49691.
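
If the distance is needed for every pair of columns rather than a single pair, one option is to loop over the column pairs with itertools.combinations. This is a sketch that reuses the DataFrame from the example above; the dictionary name distances is an arbitrary choice:

```python
from itertools import combinations

import pandas as pd
from numpy.linalg import norm

# Same DataFrame as in the example above
df = pd.DataFrame({'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

# Map each pair of column names to the distance between those columns
distances = {
    (c1, c2): norm(df[c1] - df[c2])
    for c1, c2 in combinations(df.columns, 2)
}

for pair, dist in distances.items():
    print(pair, round(dist, 5))
```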

Notes

1. There are multiple ways to calculate Euclidean distance in Python, but as this Stack Overflow thread explains, the method shown here is reported to be among the fastest for NumPy arrays.

2. You can find the complete documentation for the numpy.linalg.norm function in the official NumPy documentation.

3. You can refer to the Wikipedia page on Euclidean distance to learn more about it.
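
As an illustration of the first note, here is a sketch of two other ways to get the same result for the vectors used earlier: writing the formula out with NumPy operations, and using math.dist from the standard library (available in Python 3.8 and later). No timing claims are made here; the sketch only checks that the three approaches agree:

```python
import math

import numpy as np
from numpy.linalg import norm

a = np.array([2, 6, 7, 7, 5, 13, 14, 17, 11, 8])
b = np.array([3, 5, 5, 3, 7, 12, 13, 19, 22, 7])

# Approach used in this article
d_norm = norm(a - b)

# The formula written out with NumPy operations
d_manual = np.sqrt(np.sum((a - b) ** 2))

# Standard library; the arrays are converted to plain lists first
d_math = math.dist(a.tolist(), b.tolist())

# All three values should match up to floating-point rounding
print(d_norm, d_manual, d_math)
```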
