
Vector Spaces: An Intuitive Introduction

Introduction

It is undoubtedly a difficult concept to grasp, one of the most abstract that undergraduate students encounter during their studies. Yet it is so ubiquitous that nearly every science student eventually finds themselves baffled by its abstractness, and so essential that they cannot do without it.

So what exactly is this concept? In lofty, difficult-to-understand terms, a vector space is an abelian group under vector addition, together with a scalar multiplication (by elements of an underlying field) that distributes over vector addition. But what does all this mean? Another approach, more concrete, yet one that completely disregards the idea and importance of the abstraction, is to visualize a vector space as $\mathbb{R}^n$. Although it makes more intuitive sense to view it this way, it undermines the mathematical formalism and makes abstract vector spaces harder to understand, since in doing so we take a great deal for granted ($\mathbb{R}^n$ has additional structures defined on it!).

Vector Space

So, the way I like to think about a vector space is that it is a collection (set) of objects with a binary operation (one that takes two objects as arguments and returns an object belonging to the same collection) defined on it, having some basic symmetric properties with respect to that operation. The properties are:

  • There’s an element that, when combined with any object, produces that same object (the identity element). It is conventionally called the zero vector, analogous to how zero behaves with respect to addition on the real numbers.
  • Similarly, for every object, there’s an object in the collection that does the exact opposite of what the original object does: combining the two gives the identity. It is called the inverse.
  • Then, there’s associativity, which guarantees that a chain of applications of the operator can be grouped in any way, since every grouping gives the same result. Its importance lies in the fact that we break large computations, or a sequence of operations, into small chunks and proceed gradually; if there were no associativity, the different groupings would give different answers, and none of them could be singled out as the right one.
  • Finally, there’s commutativity, which guarantees that the order of the arguments makes no difference (a + b = b + a).

Note that closure is already guaranteed by the definition of the operation on the set, and the objects that constitute a vector space are called vectors.

What all of these properties guarantee is that there is some kind of symmetry in our structure: if any vector produces some effect, there is a vector that produces the opposite effect relative to the operator. The only vector that produces no effect is the identity, and everything is balanced around it (as in the case of the real numbers). Lastly, no matter what we do, we always stay within the set. This matters because, if we ever arrived at an object outside the set, we could not use the properties guaranteed for every vector of the vector space, leading to inconsistencies.

Furthermore, we fix an underlying field and define a scalar multiplication of vectors by scalars from that field, satisfying certain distributive and compatibility properties. What this multiplication does is scale the effect of each vector, and the accompanying properties ensure that the space stays closed under all such scaled vectors and that scaling interacts consistently with the operator.
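
Collected in one place (writing $u, v, w$ for vectors and $c, d$ for scalars from $\mathbb{F}$), the axioms read:

$$ \begin{gathered} u + v = v + u, \qquad (u + v) + w = u + (v + w), \\ \exists\, 0 : v + 0 = v, \qquad \forall v \; \exists\, (-v) : v + (-v) = 0, \\ c(u + v) = cu + cv, \qquad (c + d)v = cv + dv, \\ (cd)v = c(dv), \qquad 1v = v. \end{gathered} $$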

These definitions naturally give rise to two important ideas that are intimately connected with that of a vector space.

Span

The span of any subset of a vector space is the set of all vectors generated by combinations of the vectors in that subset; a subset whose span is the entire space is said to span the space. By combinations, what we mean is taking each vector, scaling it in some way, and combining it with the others using the operator defined on the space. More formally, this can be expressed as:

$$ \begin{gathered} \text{Let } S = \{a_1, a_2, \dots, a_i\} \text{ be a subset of a vector space } \mathbb{V} \text{ over a field } \mathbb{F}. \\ \text{Then, the span of } S \text{ is the set of all linear combinations:} \\ c_1a_1 + c_2a_2 + \dots + c_ia_i, \quad \text{where } c_1, c_2, \dots, c_i \in \mathbb{F}. \end{gathered} $$

But it is evident from the definition that spanning by itself doesn’t tell us much, since every vector space $\mathbb{V}$ trivially spans itself, which gives us no extra information about the space.
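
As a quick illustration (leaning on $\mathbb{R}^2$ only because it is familiar):

$$ \begin{gathered} span(\{(1, 2)\}) = \{c(1, 2) \mid c \in \mathbb{R}\}, \text{ the line through the origin in the direction of } (1, 2), \\ span(\{(1, 0), (0, 1)\}) = \mathbb{R}^2, \text{ since } (x, y) = x(1, 0) + y(0, 1). \end{gathered} $$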

The other important idea is that of linear independence.

Linear Independence

A subset of a vector space is linearly independent if no vector in it can be written as a combination of the other vectors. Formally, this is expressed as:

$$ \begin{gathered} \text{Let } S = \{a_1, a_2, \dots, a_i\} \text{ be a subset of a vector space } \mathbb{V} \text{ over a field } \mathbb{F}. \\ \text{Then, S is linearly independent iff } \\ c_1a_1 + c_2a_2 + \dots + c_ia_i = 0 \quad \text{implies that } c_j = 0 \text{ for all } 1 \le j \le i \end{gathered} $$

This seemingly innocuous statement may not mean much at first sight, but what it essentially tells us is that every vector in the span of such a set is uniquely identified by a tuple of scalars, where each scalar can be thought of as how much of the corresponding vector from the linearly independent set is in that vector. (Note that this makes sense only if the linearly independent set is ordered.) To see why this is the case:

$$ \begin{gathered} \text{Let } a \in span(S), \text{ where } S = \{a_1, a_2, \dots, a_i\} \text{ is a linearly independent, ordered set,} \\ \text{and suppose } a \text{ has two representations:} \\ c_1a_1 + c_2a_2 + \dots + c_ia_i = a \quad \text{and} \quad b_1a_1 + b_2a_2 + \dots + b_ia_i = a. \\ \text{Subtracting, we have} \\ 0 = a - a = (c_1 - b_1)a_1 + (c_2 - b_2)a_2 + \dots + (c_i - b_i)a_i, \\ \text{and by linear independence, } c_j - b_j = 0 \text{ for all } 1 \le j \le i, \\ \text{which implies that } c_j = b_j \text{ for all } 1 \le j \le i. \end{gathered} $$

Another important thing to note is that any set containing the identity vector (the zero vector) can never be linearly independent, since $c \cdot 0 = 0$ for every scalar $c \in \mathbb{F}$; the coefficient attached to the zero vector therefore does not have to be zero.
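
For instance (again borrowing $\mathbb{R}^2$ purely for illustration):

$$ \begin{gathered} \{(1, 0), (2, 0)\} \text{ is linearly dependent, since } 2(1, 0) + (-1)(2, 0) = (0, 0), \\ \{(1, 0), (0, 1)\} \text{ is linearly independent, since } c_1(1, 0) + c_2(0, 1) = (c_1, c_2) = (0, 0) \text{ forces } c_1 = c_2 = 0. \end{gathered} $$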

Basis

These two ideas, considered separately, may not mean much, but when taken together, they give us a powerful way to express and think about any vector space in general, namely in terms of its basis. A basis is defined to be a subset of a vector space $\mathbb{V}$ that is linearly independent and spans the whole vector space (there are many such sets).

The cardinality of this set is defined to be the (algebraic) dimension of the vector space; it can be shown that all bases of a given space have the same cardinality, so the dimension is well defined. The basis can be an infinite set or a finite set, and accordingly we speak of infinite- or finite-dimensional vector spaces.
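
A standard example of the infinite-dimensional case is the space of all polynomials with coefficients in $\mathbb{F}$, which has the infinite basis

$$ \{1, x, x^2, x^3, \dots\}, $$

since every polynomial is a finite linear combination of powers of $x$, and no power of $x$ is a combination of the others.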

But the most important idea that naturally arises from this concept, in the context of finite-dimensional spaces, is the isomorphism that results from ordering the basis in some way:

$$ \begin{gathered} \text{Let } B = (a_1, a_2, \dots, a_i) \text{ be an ordered basis of a vector space } \mathbb{V} \text{ over a field } \mathbb{F}. \\ \text{Then the map sending } v \text{ to its coordinate tuple } (c_1, c_2, \dots, c_i) \text{ is an isomorphism between } \mathbb{V} \text{ and } \mathbb{F}^i, \\ \text{where, for every } v \in \mathbb{V}, \ (c_1, c_2, \dots, c_i) \in \mathbb{F}^i \text{ is the unique tuple such that} \\ v = c_1a_1 + c_2a_2 + \dots + c_ia_i. \end{gathered} $$
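
As a concrete (and deliberately non-$\mathbb{R}^n$) instance: the space of polynomials of degree at most 2 over $\mathbb{R}$, with the ordered basis $(1, x, x^2)$, is identified with $\mathbb{R}^3$ via

$$ c_0 + c_1x + c_2x^2 \longmapsto (c_0, c_1, c_2). $$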

Conclusion

What we’ve seen so far is that a vector space is a collection of objects with an operator defined on it satisfying certain properties, together with an underlying field and a scalar multiplication that also satisfies certain properties. But what does all of this even mean? Why do we even care about any of it?

The importance of all these abstract definitions and theorems is that they provide us with a language and framework with which we can rigorously think about and prove things about these spaces. For example, consider the statement:

The image of a line in $ \mathbb {R} ^2$ under a linear transformation is always a line (or a point).

This statement, although seemingly innocuous, is actually a deep statement about the nature of linear transformations and vector spaces. For it to be true, the image of every set of the form \(\{(x, y) \in \mathbb{R}^2 \mid ax + by = c\}\) under the transformation must again be a set of the same form, or collapse to a single point. And this must hold for every such line, i.e., for every choice of a, b, and c. This is precisely what the definitions of vector spaces and linear transformations let us prove, as sketched below.
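
A sketch of that argument, writing the line in parametric form (an equivalent description of the same set):

$$ \begin{gathered} \text{Write the line as } L = \{p + td \mid t \in \mathbb{R}\} \text{ for some } p \in \mathbb{R}^2 \text{ and nonzero direction } d \in \mathbb{R}^2. \\ \text{By linearity, } T(p + td) = T(p) + tT(d), \\ \text{so } T(L) = \{T(p) + tT(d) \mid t \in \mathbb{R}\}, \\ \text{which is a line if } T(d) \neq 0, \text{ and the single point } T(p) \text{ if } T(d) = 0. \end{gathered} $$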

Thus, the study of vector spaces and linear transformations is not just about understanding lines and planes, but about understanding the very nature of mathematical transformations and the spaces they act on. It is about developing a rigorous, abstract language with which we can describe and understand mathematical phenomena. And this language and these concepts are not just limited to mathematics, but are also fundamental to fields like physics, computer science, and engineering, where they underpin much of modern technology and scientific understanding.

In conclusion, vector spaces are a fundamental concept in mathematics that provide a framework for understanding and working with linear transformations and systems of linear equations. They are used in a wide range of mathematical and scientific disciplines and are an essential tool for anyone looking to understand or work in these fields.

This post is licensed under CC BY 4.0 by the author.