When we model the world through mathematics, we frequently find ourselves talking about magnitudes. Mass, length, price, and so on are all examples of quantities that we use numbers to represent. We build our mathematical models in such a way that these numbers, and the operations involving them, have some largely idealized correspondence to the entities, relations and processes out there that we aim to understand and predict. Numbers that are used in this way are called scalars.

Sometimes we need to couple the concept of magnitude with a direction. The prime example is the concept of force that we use in physics. A force has both a magnitude and a particular direction in space. The mathematical object that brings together magnitude and direction is called a vector.

As a simple example, take a vector with a magnitude of 5 units and a direction of \(32^\circ\) measured counterclockwise from the positive \(x\)-axis. We can represent this vector in several ways. Initially, assume we don’t care about where the vector starts (its tail); we just want to encode its length and direction. We represent it as a two-component vector, where the first component is the horizontal (\(x\)) magnitude and the second component is the vertical (\(y\)) magnitude. The components can be calculated using trigonometry, via the cosine and sine of the direction angle (see note 1):

\[\begin{align} \vect{v} &= \begin{pmatrix} 5\cos 32^\circ & 5\sin 32^\circ \end{pmatrix}\nonumber\\ &= \begin{pmatrix}4.2402406 & 2.6495962\end{pmatrix} \end{align}\]

In other words, if you do not care about the tail point, a vector in a plane (2D) is represented by two real numbers.
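The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `polar_to_components` is made up for this example, and only the standard `math` module is assumed.

```python
import math

def polar_to_components(magnitude, angle_degrees):
    """Return the (x, y) components of a 2D vector from its length and direction."""
    angle = math.radians(angle_degrees)  # math.cos and math.sin expect radians
    x = magnitude * math.cos(angle)      # horizontal component
    y = magnitude * math.sin(angle)      # vertical component
    return (x, y)

v = polar_to_components(5, 32)
print(v)  # roughly (4.24, 2.65)
```

The last digits may differ slightly from the values printed in the text, depending on the floating-point precision used.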

You can isolate the direction of a vector with a unit vector, that is, a vector whose magnitude is 1. Such a unit vector for our example would be:

\[\vect{u} = \begin{pmatrix} \cos 32^\circ & \sin 32^\circ \end{pmatrix}\]

or, with the angle in radians (\(180^\circ = \pi\) radians):

\[\vect{u} = \begin{pmatrix} \cos \frac{32\pi}{180} & \sin \frac{32\pi}{180} \end{pmatrix}\]

or, with the values of the trig functions plugged in:

\[\vect{u} = \begin{pmatrix} 0.8480481 & 0.52991927 \end{pmatrix}\]
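Going the other way, a unit vector can be obtained from any non-zero vector by dividing each component by the vector's length. The sketch below assumes a hypothetical helper `unit_vector` and uses the component values from the example above.

```python
import math

def unit_vector(v):
    """Return the vector of length 1 pointing in the same direction as the 2D vector v."""
    length = math.hypot(v[0], v[1])  # sqrt(x**2 + y**2)
    if length == 0:
        raise ValueError("the zero vector has no direction")
    return (v[0] / length, v[1] / length)

v = (4.2402406, 2.6495962)
u = unit_vector(v)

# The direction angle can be recovered from the components with atan2.
angle_degrees = math.degrees(math.atan2(u[1], u[0]))
print(u, angle_degrees)  # roughly (0.848, 0.530) and 32 degrees
```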

Element-wise operations

Given a unit vector, you can factor the magnitude back in by multiplying the unit vector by the scalar magnitude. This is an example of an element-wise operation, where you apply the same operation to each component of the vector. In this case, you multiply each component of the unit vector by the magnitude:

\[\vect{r} = 5\vect{u} = \begin{pmatrix} 4.2402406 & 2.6495962\end{pmatrix}\]

An operation between vectors can also be element-wise. For example, if you have two vectors \(\vect{v}\) and \(\vect{w}\), you can add them together by adding their components:

\[\begin{align} \vect{v} + \vect{w} &= \begin{pmatrix} v_1 & v_2 \end{pmatrix} + \begin{pmatrix} w_1 & w_2 \end{pmatrix}\nonumber\\ &= \begin{pmatrix} v_1 + w_1 & v_2 + w_2 \end{pmatrix} \end{align}\]

Geometrically, adding two vectors means placing the tail of the second at the head of the first and constructing a third vector that spans from the tail of the first to the head of the second. This arrangement gives a triangle when the vectors are not parallel (that is, when they do not point in the same or opposite directions).
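Both element-wise operations described in this section translate directly into code. In the sketch below, vectors are plain Python tuples and the helper names `scale` and `add` are illustrative choices, not part of any library.

```python
def scale(a, v):
    """Multiply every component of v by the scalar a."""
    return tuple(a * c for c in v)

def add(v, w):
    """Add two vectors of the same dimension component by component."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same number of components")
    return tuple(x + y for x, y in zip(v, w))

u = (0.8480481, 0.52991927)   # unit vector from the running example
r = scale(5, u)               # factor the magnitude back in
s = add((1, 2), (3, -1))      # element-wise sum of two sample vectors

print(r)  # roughly (4.24, 2.65)
print(s)  # (4, 1)
```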

Norm of a vector

We’ve been talking about the magnitude of a vector. The interpretation of magnitude may change from context to context. In geometry, the magnitude of a vector is its length; in physics, the magnitude of a force vector is the strength of the force, and so on. The general mathematical term for the magnitude of a vector is its norm. The norm of a vector is a non-negative real number that represents the size or length of the vector.

The simplest and most common type of norm is the Euclidean norm, which is based on Pythagoras’ theorem. For a vector \(\vect{v} = \begin{pmatrix} 3 & 4 \end{pmatrix}\), the Euclidean norm is

\[\|\vect{v}\|_2 = \sqrt{3^2 + 4^2} = 5\]

This is the ordinary geometric length of the arrow from \((0,0)\) to \((3,4)\).

A useful related idea is distance between two vectors. This is usually defined by the norm of their difference:

\[\text{distance}(x,y) = \|x-y\|\]

That formula says that two vectors are close when their difference has small norm, where difference is the element-wise subtraction.
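As a small illustration, the Euclidean norm and the distance it induces can be written as follows. The function names are hypothetical, and the second pair of vectors is an arbitrary example chosen so that the difference is again \((3, 4)\).

```python
import math

def euclidean_norm(v):
    """Square root of the sum of squared components."""
    return math.sqrt(sum(c * c for c in v))

def distance(x, y):
    """Norm of the element-wise difference x - y."""
    difference = tuple(a - b for a, b in zip(x, y))
    return euclidean_norm(difference)

print(euclidean_norm((3, 4)))    # 5.0
print(distance((4, 5), (1, 1)))  # 5.0, since the difference is (3, 4)
```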

Properties of the norm function

The concept of norm is not limited to just measuring the length of a vector. Any function with the following properties can be considered a norm:

  • Non-negativity: \(\|x\| \ge 0\).
  • Zero only at the zero vector: \(\|x\| = 0\) exactly when \(x=0\).
  • Homogeneity: \(\|a x\| = \lvert a\rvert \,\|x\|\) for any scalar \(a\).
  • Triangle inequality: \(\|x+y\| \le \|x\| + \|y\|\).
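These four properties can be spot-checked numerically for the Euclidean norm. The sketch below uses arbitrary sample values for `x`, `y` and `a`; passing such a check illustrates the properties for these inputs but does not prove them in general.

```python
import math

def norm(v):
    """Euclidean norm of a vector given as a tuple."""
    return math.sqrt(sum(c * c for c in v))

x = (3.0, -4.0)
y = (1.0, 2.0)
a = -2.5

non_negative = norm(x) >= 0
zero_at_zero = norm((0.0, 0.0)) == 0
# Homogeneity: scaling the vector by a scales the norm by |a|.
homogeneous = math.isclose(norm(tuple(a * c for c in x)), abs(a) * norm(x))
# Triangle inequality: the norm of a sum never exceeds the sum of the norms.
triangle = norm(tuple(p + q for p, q in zip(x, y))) <= norm(x) + norm(y)

print(non_negative, zero_at_zero, homogeneous, triangle)
```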

Here are some examples of different norms (subscript indicates the type of norm):

  • Euclidean norm: \(\|x\|_2 = \sqrt{x_1^2 + \cdots + x_n^2}\), the usual geometric length.
  • 1-norm: \(\|x\|_1 = \lvert x_1\rvert + \cdots + \lvert x_n\rvert\), the sum of absolute values.
  • Infinity norm: \(\|x\|_\infty = \max_i \lvert x_i\rvert\), the largest absolute coordinate.
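The three norms listed above can be compared on the same vector. This is a minimal sketch with made-up function names and an arbitrary sample vector:

```python
import math

def norm_1(x):
    """Sum of absolute values of the components."""
    return sum(abs(c) for c in x)

def norm_2(x):
    """Euclidean norm: square root of the sum of squares."""
    return math.sqrt(sum(c * c for c in x))

def norm_inf(x):
    """Largest absolute value among the components."""
    return max(abs(c) for c in x)

x = (3, -4, 12)
print(norm_1(x))    # 19
print(norm_2(x))    # 13.0
print(norm_inf(x))  # 12
```

The same vector gets a different size under each norm, which is why the subscript matters.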

You might find it a bit confusing that we defined these norms for vectors in \(n\) dimensions; the next section explains what that means.

Dimensions

Except for the definitions of the different types of norms above, we’ve been dealing with vectors in a 2-dimensional plane. But the dimensionality of a vector can be any positive integer. A vector in 3D will naturally be conceived as an arrow in a 3D space. But what about vectors in 8D?

We, as humans, can visualize and intuitively understand geometry in 2D and 3D, but higher dimensions require proper mathematical thinking. In mathematics, we operate on definitions and on the properties that follow from these definitions by the laws of logic. In mathematics education, we often start with intuitively familiar concepts and give them certain names. A good example is the circle and the sphere. It is easy to grasp a circle in 2D and a sphere in 3D, as we think of them as idealized forms of objects we see around us. But actually, what we call a circle in 2D and a sphere in 3D are just sets. How?

Start with the notion of a point. A point, in the common-sense view, is either a pair (2D) or a triple (3D) of numbers. In mathematics, a point is any \(n\)-tuple of numbers. The concepts of circle and sphere are just the names we give to the notion of “the set of points that are at a fixed distance from a given point”, in 2D and 3D, respectively.

Likewise for vectors. A vector is a mathematical object in \(\mathbb{R}^n\), where \(n\) is any positive integer. When you need its Euclidean length, you simply compute the sum of squares of its components and take the square root, no matter how many components it has.

For a right triangle with legs \(a\) and \(b\) and hypotenuse \(c\), Pythagoras says

\[a^{2} + b^{2} = c^{2}.\]

If you view the hypotenuse as a vector from the origin to the point \((a,b)\), then that vector’s length is

\[\|\begin{pmatrix} a & b \end{pmatrix}\| = \sqrt{a^{2} + b^{2}},\]

which is exactly “take the hypotenuse using Pythagoras.”

In \(\mathbb{R}^{n}\), the Euclidean norm of a vector \(x = \begin{pmatrix}x_{1} & \dots & x_{n}\end{pmatrix}\) is

\[\|x\| = \sqrt{x_{1}^{2} + \dots + x_{n}^{2}}\]

which is just repeatedly applying Pythagoras along mutually perpendicular axes. Each coordinate plays the role of a leg in a higher‑dimensional right triangle whose “hypotenuse” is the vector.
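The "repeated Pythagoras" reading can be made concrete: start with a length of zero and, for each coordinate, take the hypotenuse of a right triangle whose legs are the length so far and the next coordinate. The sketch below, with a hypothetical function name and an arbitrary 4D sample vector, compares this with the direct formula.

```python
import math

def norm_by_repeated_pythagoras(x):
    """Build the Euclidean length one coordinate at a time."""
    length = 0.0
    for c in x:
        # Legs: the length accumulated so far and the next coordinate.
        length = math.hypot(length, c)
    return length

x = (1, 2, 2, 4)  # sum of squares is 1 + 4 + 4 + 16 = 25
direct = math.sqrt(sum(c * c for c in x))
step_by_step = norm_by_repeated_pythagoras(x)

# The two results agree up to floating-point rounding.
print(direct, step_by_step)
```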

Dot product

A key operation that maps two vectors to a scalar is the dot product, which is an instance of a family of operations called inner products.

The dot product of two vectors \(u\) and \(v\) in \(\mathbb{R}^n\) is defined as:

\[u \cdot v = u_{1}v_{1} + \dots + u_{n}v_{n}.\]

Another common notation for the dot product is \(\langle u, v \rangle\).
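In code, the definition is a sum of pairwise products. A minimal sketch, assuming vectors are stored as tuples and using a hypothetical helper `dot`:

```python
def dot(u, v):
    """Sum of the products of corresponding components."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(u, v))

print(dot((1, 2, 3), (4, -5, 6)))  # 1*4 + 2*(-5) + 3*6 = 12
```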

Its close relation to Pythagoras’ theorem makes the dot product a fundamental tool in linear algebra.

Given \(\vect{v} = \pmatrix{x & y}\):

\[\|\vect{v}\|^2 = \vect{v} \cdot \vect{v} = x^2 + y^2.\]

Now take two vectors \(\vect{u} = \pmatrix{u_1& u_2}\) and \(\vect{v}=\pmatrix{v_1&v_2}\), and imagine you form a triangle by putting the tail of \(\vect{v}\) at the head of \(\vect{u}\). The vector from the tail of \(\vect{u}\) to the head of \(\vect{v}\) is \(\vect{u} + \vect{v}\). We know from above that:

\[\begin{align}\label{vec-pytho} \|\vect{u} + \vect{v}\|^2 & = (\vect{u} + \vect{v}) \cdot (\vect{u} + \vect{v})\\ & = \pmatrix{u_1 + v_1 & u_2 + v_2} \cdot \pmatrix{u_1 + v_1 & u_2 + v_2}\nonumber\\ & = (u_1 + v_1)^2 + (u_2 + v_2)^2\nonumber\\ & = u_1^2 + 2u_1v_1 + v_1^2 + u_2^2 + 2u_2v_2 + v_2^2\nonumber\\ & = (u_1^2 + u_2^2) + (v_1^2 + v_2^2) + 2(u_1v_1 + u_2v_2)\nonumber\\ & = \|\vect{u}\|^2 + \|\vect{v}\|^2 + 2\,(\vect{u} \cdot \vect{v})\nonumber \end{align}\]
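The identity just derived can be spot-checked with concrete numbers. The vectors below are arbitrary samples, and `dot` is a hypothetical helper defined only for this sketch.

```python
def dot(u, v):
    """Sum of the products of corresponding components."""
    return sum(a * b for a, b in zip(u, v))

u = (2.0, 1.0)
v = (-1.0, 3.0)
s = tuple(a + b for a, b in zip(u, v))  # u + v = (1.0, 4.0)

lhs = dot(s, s)                              # squared norm of u + v
rhs = dot(u, u) + dot(v, v) + 2 * dot(u, v)  # 5 + 10 + 2*1

print(lhs, rhs)  # 17.0 17.0
```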

To relate this equality to the angle between the vectors \(\vect{u}\) and \(\vect{v}\), we turn to a theorem from elementary geometry:

Given a triangle with sides of lengths \(a\), \(b\), and \(c\), and an angle \(\theta\) opposite the side of length \(c\), the law of cosines states that:

\[c^2 = a^2 + b^2 - 2ab \cos(\theta) \nonumber\]

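Let \(\theta\) be the angle between \(\vect{u}\) and \(\vect{v}\) when they share a tail. In the triangle formed above, the angle opposite the side \(\vect{u} + \vect{v}\) is then \(\pi - \theta\), and \(\cos(\pi - \theta) = -\cos\theta\). By \eqref{vec-pytho} and the law of cosines, we get: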

\[\begin{align} \|\vect{u} + \vect{v}\|^2 & = \|\vect{u}\|^2 + \|\vect{v}\|^2 + 2\,\|\vect{u}\|\|\vect{v}\|\cos\theta\\ & = \|\vect{u}\|^2 + \|\vect{v}\|^2 + 2\,(\vect{u} \cdot \vect{v})\nonumber \end{align}\]

yielding the equality:

\[\cos\theta = \frac{\vect{u} \cdot \vect{v}}{\|\vect{u}\|\|\vect{v}\|}\]

which is of prime importance in linear algebra and its applications in artificial intelligence and machine learning.

Its importance lies in the fact that it measures the directional similarity of two vectors independently of their magnitudes: the cosine is 1 when the vectors point in the same direction, 0 when they are perpendicular, and \(-1\) when they point in opposite directions. This is the basis of the widely used cosine similarity measure in machine learning and information retrieval. In 2 and 3 dimensions, the concept of angle is intuitive; in higher dimensions there is neither an intuitive picture nor a need for one, since the formula can be computed in any dimension.
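A minimal sketch of cosine similarity follows, for vectors of any dimension stored as tuples. The function names are illustrative, and the clamping step is an assumption added here to guard against floating-point results that fall just outside \([-1, 1]\).

```python
import math

def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for the zero vector")
    return dot / (norm_u * norm_v)

def angle_degrees(u, v):
    """Angle between u and v, recovered from the cosine."""
    c = max(-1.0, min(1.0, cosine_similarity(u, v)))  # clamp rounding noise
    return math.degrees(math.acos(c))

same_direction = cosine_similarity((3, 4), (6, 8))   # same direction, different lengths
perpendicular = cosine_similarity((1, 0), (0, 1))    # right angle
opposite = cosine_similarity((1, 2), (-1, -2))       # opposite directions
print(same_direction, perpendicular, opposite)       # close to 1, 0 and -1
print(angle_degrees((1, 0), (1, 1)))                 # close to 45
```

Note that \((3, 4)\) and \((6, 8)\) have different lengths but a similarity of 1, which is the magnitude independence described above.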

This was a very brief introduction to linear algebra through vectors.

  1. Let \(\theta\) be one of the acute angles of a right triangle: \(\cos\theta\) gives the ratio of the adjacent leg to the hypotenuse, and \(\sin\theta\) gives the ratio of the opposite leg to the hypotenuse.