Enforcing Symmetries in Neural Networks#
We are going to explain what we mean by symmetries, why they are important, and how we can enforce them in neural networks.
What are symmetries?#
Generally speaking, we say that some mathematical object possesses a certain symmetry if it either remains unchanged, or changes in a predictable way, under some transformation. We are particularly interested in symmetries of mathematical equations that describe a physical system. Typically these equations are written in terms of variables expressed in a coordinate system. But the origin and orientation of that coordinate system are arbitrary, and the form of the equations should not depend on them.
How do we describe symmetries?#
Symmetries are described using the language of group theory.
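A group $(G, \cdot)$ is a set $G$ together with a binary operation $\cdot : G \times G \to G$
that combines two elements to produce a third element: $(g, h) \mapsto g \cdot h$.
Typically, we do not need to write the $\cdot$ explicitly, and we simply write $gh$. The operation must satisfy four properties:

- Closure: For all $g, h \in G$, $gh \in G$.
- Associativity: For all $g, h, k \in G$, $(gh)k = g(hk)$.
- Identity: There exists an element $e \in G$ such that for all $g \in G$, $eg = ge = g$.
- Inverses: For all $g \in G$, there exists an element $g^{-1} \in G$ such that $g g^{-1} = g^{-1} g = e$.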
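For a finite set, the four properties can be checked by brute force. The sketch below is only an illustration: the helper `is_group` and the choice of the integers modulo 5 are assumptions made for this example, not part of the text.

```python
from itertools import product

def is_group(elements, op):
    """Brute-force check of the four group properties on a finite set."""
    elements = list(elements)
    # Closure: the result of the operation stays in the set.
    closure = all(op(a, b) in elements for a, b in product(elements, repeat=2))
    # Associativity: (ab)c == a(bc) for every triple.
    associative = all(op(op(a, b), c) == op(a, op(b, c))
                      for a, b, c in product(elements, repeat=3))
    # Identity: some e with ea == ae == a for all a.
    identities = [e for e in elements
                  if all(op(e, a) == a and op(a, e) == a for a in elements)]
    # Inverses: every a has some b with ab == ba == e.
    inverses = bool(identities) and all(
        any(op(a, b) == identities[0] and op(b, a) == identities[0]
            for b in elements)
        for a in elements)
    return closure and associative and bool(identities) and inverses

n = 5  # hypothetical modulus chosen for the example
add_mod_n = lambda a, b: (a + b) % n
mul_mod_n = lambda a, b: (a * b) % n

print(is_group(range(n), add_mod_n))  # {0,...,4} with addition mod 5: True
print(is_group(range(n), mul_mod_n))  # False, because 0 has no inverse
```

The second call fails only on the inverse property, which is why the multiplicative examples below exclude zero.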
Examples of groups#
You already know many examples of groups. Here are some of them.
The set of integers with addition#
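The set of integers $\mathbb{Z}$ forms a group under addition: the identity element is $0$ and the inverse of $n$ is $-n$.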
The set of non-zero rational numbers with multiplication#
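The set of non-zero rational numbers, $\mathbb{Q} \setminus \{0\}$, forms a group under multiplication: the identity element is $1$ and the inverse of $q$ is $1/q$.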
The set of real numbers with addition#
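The set of real numbers $\mathbb{R}$ forms a group under addition: the identity element is $0$ and the inverse of $x$ is $-x$.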
The set of non-zero real numbers with multiplication#
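The set of non-zero real numbers, $\mathbb{R} \setminus \{0\}$, forms a group under multiplication: the identity element is $1$ and the inverse of $x$ is $1/x$.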
The translation group#
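Consider the Euclidean space $\mathbb{R}^n$. A translation by a vector $a \in \mathbb{R}^n$ is the map $t_a : \mathbb{R}^n \to \mathbb{R}^n$ with $t_a(x) = x + a$. The set of all translations, $T(n)$, forms a group under composition: $t_a \circ t_b = t_{a+b}$, the identity element is $t_0$, and the inverse of $t_a$ is $t_{-a}$.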
The general linear group#
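The general linear group of dimension $n$, denoted $\text{GL}(n)$, is the set of invertible $n \times n$ real matrices:

$$\text{GL}(n) = \{A \in \mathbb{R}^{n \times n} : \det A \neq 0\}.$$

The group operation is matrix multiplication. Check the group properties for yourself.
What is the identity element of $\text{GL}(n)$?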
The special orthogonal group#
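The special orthogonal group of dimension $n$, denoted $\text{SO}(n)$, is the set of $n \times n$ orthogonal matrices with determinant one:

$$\text{SO}(n) = \{R \in \text{GL}(n) : R^T R = I,\ \det R = 1\}.$$

The group operation is matrix multiplication.
You can think of $\text{SO}(n)$ as the group of rotations of $n$-dimensional space about the origin.
This means that every $R \in \text{SO}(n)$ preserves lengths, angles, and orientation.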
Let’s look at some examples of these groups in 3D space.
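Rotation by an angle $\theta$ about the $z$-axis is given by:

$$R_z(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Check that $R_z(\theta)^T R_z(\theta) = I$ and $\det R_z(\theta) = 1$.
Similarly, rotation by an angle $\phi$ about the $x$-axis is given by:

$$R_x(\phi) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\phi & -\sin\phi \\ 0 & \sin\phi & \cos\phi \end{pmatrix}.$$

Check again that $R_x(\phi)^T R_x(\phi) = I$ and $\det R_x(\phi) = 1$.
If you multiply the two, you get:

$$R_z(\theta) R_x(\phi) = \begin{pmatrix} \cos\theta & -\sin\theta\cos\phi & \sin\theta\sin\phi \\ \sin\theta & \cos\theta\cos\phi & -\cos\theta\sin\phi \\ 0 & \sin\phi & \cos\phi \end{pmatrix}.$$

Check that the product is also an element of $\text{SO}(3)$.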
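The checks above can also be done numerically. The following is a minimal sketch in plain Python, assuming rotations about the $z$- and $x$-axes as the two examples; the helper names (`rot_z`, `rot_x`, `is_special_orthogonal`) and the angles are hypothetical choices for illustration.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def det3(A):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = A
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def rot_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def is_special_orthogonal(R, tol=1e-12):
    """True if R^T R = I and det R = 1, up to the tolerance tol."""
    RtR = matmul(transpose(R), R)
    orthogonal = all(abs(RtR[i][j] - (1.0 if i == j else 0.0)) < tol
                     for i in range(3) for j in range(3))
    return orthogonal and abs(det3(R) - 1.0) < tol

Rz, Rx = rot_z(0.4), rot_x(1.1)
print(is_special_orthogonal(Rz), is_special_orthogonal(Rx))
print(is_special_orthogonal(matmul(Rz, Rx)))  # closure: the product is again a rotation
```

A tolerance is needed because the entries are floating-point numbers, so $R^T R$ equals the identity only up to rounding.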
The orthogonal group#
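The orthogonal group of dimension $n$, denoted $\text{O}(n)$, is the set of $n \times n$ orthogonal matrices:

$$\text{O}(n) = \{Q \in \text{GL}(n) : Q^T Q = I\}.$$

Again, the group operation is matrix multiplication.
You can think of $\text{O}(n)$ as the group of rotations and reflections of $n$-dimensional space.
An element of the orthogonal group not in the special orthogonal group is a reflection, for example:

$$Q = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}.$$

Check that $Q^T Q = I$ and $\det Q = -1$.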
Group homomorphisms#
Two groups are homomorphic if there is a function between them that preserves the group structure, i.e., it maps products to products. The function that does this is called a homomorphism. A homomorphism may send several elements to the same image, so it is weaker than a relabeling of the group elements.
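Mathematically, let $(G, \cdot)$ and $(H, *)$ be two groups. A function $\phi : G \to H$ is a homomorphism if

$$\phi(g_1 \cdot g_2) = \phi(g_1) * \phi(g_2)$$

for all $g_1, g_2 \in G$.
If the homomorphism is bijective, i.e., one-to-one and onto, then it is called an isomorphism.
When $G$ and $H$ are isomorphic, we write $G \cong H$.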
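Example: $\mathbb{Z}$ is homomorphic to $\mathbb{Z}_n$#
Let’s look at an example of a homomorphism.
Consider the group of integers modulo $n$, $\mathbb{Z}_n = \{0, 1, \dots, n-1\}$, with addition modulo $n$ as the group operation.
Note that $\mathbb{Z}_n$ is a finite group with $n$ elements.
Now, consider the function

$$\phi : \mathbb{Z} \to \mathbb{Z}_n, \quad \phi(k) = k \bmod n,$$

that gives the remainder when an integer is divided by $n$.
This is a homomorphism. Check that:

$$\phi(a + b) = \big(\phi(a) + \phi(b)\big) \bmod n.$$

But it is not an isomorphism because it is not one-to-one: for example, $\phi(0) = \phi(n) = 0$.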
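The homomorphism property of the remainder map can be spot-checked numerically. This is a small sketch; the modulus $n = 7$ and the range of test integers are arbitrary choices for illustration.

```python
n = 7  # hypothetical modulus chosen for the example

def phi(k):
    """Remainder of the integer k on division by n (Python's % is non-negative for n > 0)."""
    return k % n

# Compare phi(a + b) with phi(a) + phi(b) computed in Z_n (addition modulo n).
pairs = [(a, b) for a in range(-20, 21) for b in range(-20, 21)]
is_homomorphism = all(phi(a + b) == (phi(a) + phi(b)) % n for a, b in pairs)
print(is_homomorphism)  # True

# Not one-to-one: two different integers share the same image.
print(phi(3), phi(3 + n))  # 3 3
```

A finite check like this is not a proof, but it catches mistakes such as forgetting the final reduction modulo $n$ on the right-hand side.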
Example: The group of real numbers with addition is isomorphic to the group of positive real numbers with multiplication#
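Let $(\mathbb{R}, +)$ be the group of real numbers with addition and $(\mathbb{R}_{>0}, \times)$ be the group of positive real numbers with multiplication.
Consider the function:

$$\phi : \mathbb{R} \to \mathbb{R}_{>0}$$

defined by:

$$\phi(x) = e^x.$$

The function is one-to-one and onto. Show that it is a homomorphism.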
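Before writing the proof, it can help to test the claim numerically. The sketch below assumes the map is the exponential, $\phi(x) = e^x$, with the natural logarithm as its inverse; the sample range and tolerances are arbitrary.

```python
import math
import random

random.seed(0)
samples = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(1000)]

# Homomorphism: addition in (R, +) becomes multiplication in (R_{>0}, x).
homomorphism_ok = all(
    math.isclose(math.exp(x + y), math.exp(x) * math.exp(y), rel_tol=1e-12)
    for x, y in samples)

# Bijectivity: log undoes exp, so every positive real is hit exactly once.
inverse_ok = all(
    math.isclose(math.log(math.exp(x)), x, rel_tol=1e-12, abs_tol=1e-12)
    for x, _ in samples)

print(homomorphism_ok, inverse_ok)
```

The comparison uses `math.isclose` because floating-point exponentials agree only up to rounding error.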
Group of transformations#
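Let $X$ be a set. A transformation of $X$ is a function

$$f : X \to X$$

that is one-to-one and onto (bijective). A group of transformations $G$ is a set of such functions that forms a group.
Here the group operation is the composition of functions and we are assuming that if $f, g \in G$, then the composition $f \circ g$
defined by:

$$(f \circ g)(x) = f(g(x))$$

is also in $G$.
Why is $G$ a group? First, the composition of functions is always associative.
Second, the identity function $\text{id} : X \to X$, $\text{id}(x) = x$, is assumed to be in $G$
and it is the identity element of $G$. Third, every $f \in G$ is bijective, so it has an inverse function $f^{-1}$, which we also assume to be in $G$.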
Group representations#
A group representation is a way to represent the elements of a group as matrices. Using group representations you can study the group using linear algebra.
Example: The group of invertible linear transformations is isomorphic to the general linear group#
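Let $\text{GL}(\mathbb{R}^n)$ be the group of invertible linear transformations $f : \mathbb{R}^n \to \mathbb{R}^n$, with composition as the group operation.
We will show that it is isomorphic to the general linear group $\text{GL}(n)$.
The matrix $A_f$ of a linear transformation $f$ in the standard basis $e_1, \dots, e_n$ has the vectors $f(e_j)$ as its columns, so that $f(x) = A_f x$.
The map

$$\rho : \text{GL}(\mathbb{R}^n) \to \text{GL}(n)$$

that sends a linear transformation to its matrix representation

$$\rho(f) = A_f$$

is an isomorphism between $\text{GL}(\mathbb{R}^n)$ and $\text{GL}(n)$: it is a bijection, and composition of transformations corresponds to matrix multiplication, $A_{f \circ g} = A_f A_g$.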
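The correspondence between composition and matrix multiplication can be illustrated with a short sketch. The two linear maps below (`shear` and `scale`) and the helper `matrix_of` are hypothetical examples, not part of the text; the matrix is built column by column from the images of the standard basis vectors.

```python
def matrix_of(f, n):
    """Matrix of the linear map f on R^n: column j is f applied to the j-th basis vector."""
    columns = [f([1.0 if i == j else 0.0 for i in range(n)]) for j in range(n)]
    return [list(row) for row in zip(*columns)]  # turn the list of columns into rows

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Two invertible linear maps on R^2, chosen only for illustration.
def shear(v):
    return [v[0] + 2.0 * v[1], v[1]]

def scale(v):
    return [3.0 * v[0], -1.0 * v[1]]

def compose(f, g):
    """The composition f o g, i.e. apply g first and then f."""
    return lambda v: f(g(v))

lhs = matrix_of(compose(shear, scale), 2)               # matrix of the composition
rhs = matmul(matrix_of(shear, 2), matrix_of(scale, 2))  # product of the matrices
print(lhs == rhs)  # True
```

The entries here are small integers stored as floats, so the comparison is exact; for general maps a tolerance would be needed.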
The Euclidean group#
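Consider the Euclidean space $\mathbb{R}^n$.
The Euclidean group $\text{E}(n)$ is built from isometries. An isometry is a function

$$f : \mathbb{R}^n \to \mathbb{R}^n$$

where

$$\|f(x) - f(y)\| = \|x - y\|$$

for all $x, y \in \mathbb{R}^n$.
Now, we can define the Euclidean group $\text{E}(n)$ as the set of all isometries of $\mathbb{R}^n$.
Again, the group operation is function composition.
We can show that

$$\text{E}(n) \cong T(n) \rtimes \text{O}(n).$$

Here $T(n)$ is the translation group, $\text{O}(n)$ is the orthogonal group, and the symbol $\rtimes$ denotes the semidirect product.
Intuitively, the semidirect product means that the group $\text{E}(n)$ is made of translations and rotations/reflections: every isometry is a rotation/reflection followed by a translation, $x \mapsto Rx + a$.
The semidirect product specifies how the translation and the rotation/reflection interact.
Take two elements $(a, R)$ and $(b, S)$, with $a, b \in \mathbb{R}^n$ and $R, S \in \text{O}(n)$. Applying $(b, S)$ first and $(a, R)$ second gives $x \mapsto R(Sx + b) + a = RSx + (a + Rb)$, so the product is $(a + Rb, RS)$ and not $(a + b, RS)$: the rotation/reflection $R$ also acts on the translation $b$.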
Understanding the semidirect product is not trivial. We do it in the next section, but feel free to skip it if you are not interested.
Semidirect products#
You can skip this section if you are not interested in the details of the semidirect product.
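Let $G$ be a group and let $N$ and $H$ be two subgroups of $G$.
Furthermore, we assume that every element of $G$ is a product of an element of $N$ and an element of $H$,

$$G = NH = \{nh : n \in N, h \in H\},$$

and that elements of $N$ and $H$ have only the identity in common: $N \cap H = \{e\}$.
We are going to assume that $N$ is a normal subgroup of $G$, i.e., $g n g^{-1} \in N$ for all $g \in G$ and $n \in N$.
This is a weird assumption, but it is necessary for the semidirect product to work.
We will show that under these assumptions the group $G$ is isomorphic to the semidirect product $N \rtimes H$.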
The meaning of everything will become apparent as we go.
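First, we construct a one-to-one and onto map

$$\phi : N \times H \to G$$

such that:

$$\phi(n, h) = nh.$$

This map is indeed onto because every element of $G$ can be written as $g = nh$
for some $n \in N$ and $h \in H$. To see that it is one-to-one, suppose that $n_1 h_1 = n_2 h_2$.
Multiplying with $n_2^{-1}$ from the left and with $h_1^{-1}$ from the right gives

$$n_2^{-1} n_1 = h_2 h_1^{-1}.$$

Now, the left hand side is an element of $N$ and the right hand side is an element of $H$. Since $N \cap H = \{e\}$, both sides are equal to $e$, and so $n_1 = n_2$ and $h_1 = h_2$.
To show that the map preserves the group structure, we need to explicitly define the group operation on $N \rtimes H$. We want it to satisfy

$$\phi\big((n_1, h_1)(n_2, h_2)\big) = \phi(n_1, h_1)\,\phi(n_2, h_2).$$

On the right hand side, we just use the definition of the map:

$$\phi(n_1, h_1)\,\phi(n_2, h_2) = n_1 h_1 n_2 h_2.$$

Now, we try to turn the result into something that looks like a product of an element of $N$ and an element of $H$:

$$n_1 h_1 n_2 h_2 = (n_1 h_1 n_2 h_1^{-1})(h_1 h_2).$$

Clearly, the term in the second parenthesis is in $H$. The term in the first parenthesis is in $N$ because $N$ is normal, so $h_1 n_2 h_1^{-1} \in N$. The group operation on $N \rtimes H$ is therefore

$$(n_1, h_1)(n_2, h_2) = (n_1 h_1 n_2 h_1^{-1},\ h_1 h_2).$$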
Connection to the Euclidean group#
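This is all too theoretical. How does this connect to the Euclidean group? Take:

$$G = \text{E}(n), \quad N = T(n), \quad H = \text{O}(n).$$

First, we do have that $T(n)$ and $\text{O}(n)$ are subgroups of $\text{E}(n)$, since translations and orthogonal transformations preserve distances.
We start by proving that $\text{E}(n) = T(n)\,\text{O}(n)$. Take any isometry $f \in \text{E}(n)$.
Now, define the translation $t_{f(0)}$, with $t_{f(0)}(x) = x + f(0)$.
This is an element of $T(n)$. Next, define $R = t_{f(0)}^{-1} \circ f$.
Or in terms of its action on a vector $x$:

$$R(x) = f(x) - f(0).$$

We will show that $R \in \text{O}(n)$. The map $R$ is an isometry that fixes the origin, so it preserves norms, $\|R(x)\| = \|x\|$. Expanding $\|R(x) - R(y)\|^2 = \|x - y\|^2$ then shows that it preserves inner products, $R(x) \cdot R(y) = x \cdot y$, and a map that preserves inner products is linear and orthogonal. Therefore $f = t_{f(0)} \circ R$ with $t_{f(0)} \in T(n)$ and $R \in \text{O}(n)$.
Try to show that the decomposition is unique.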
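Now, we need to show that $T(n)$ is a normal subgroup of $\text{E}(n)$. Translations commute with each other and every element of $\text{E}(n)$ is a translation composed with an orthogonal transformation, so it is enough to consider conjugation by an $R \in \text{O}(n)$. Let $t_a$ be the translation $t_a(x) = x + a$.
Now, consider the composition:

$$R \circ t_a \circ R^{-1}.$$

We need to show that this is a translation, i.e., it is in $T(n)$. Acting on a vector $x$:

$$(R \circ t_a \circ R^{-1})(x) = R(R^{-1}x + a) = x + Ra.$$

This shows that

$$R \circ t_a \circ R^{-1} = t_{Ra},$$

which proves the desired result.
Finally, we need to show that $T(n) \cap \text{O}(n) = \{\text{id}\}$.
This is obvious. Why?
Having proved all these things, we can use the result above to write:

$$\text{E}(n) \cong T(n) \rtimes \text{O}(n).$$

Let’s write down the group operation explicitly:

$$(t_a, R)(t_b, S) = (t_a \circ R \circ t_b \circ R^{-1},\ RS).$$

We can simplify the first term of the right hand side:

$$t_a \circ R \circ t_b \circ R^{-1} = t_a \circ t_{Rb} = t_{a + Rb}.$$

If we identify $t_a$ with the vector $a$, the group operation becomes

$$(a, R)(b, S) = (a + Rb,\ RS).$$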
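The composition rule can be tested numerically. The sketch below works in two dimensions and assumes the convention that a pair $(a, R)$ acts on a vector as $x \mapsto Rx + a$, so that the product of $(a, R)$ and $(b, S)$ is $(a + Rb, RS)$; all function names and numerical values are hypothetical.

```python
import math

def matvec(R, x):
    return [sum(r * xi for r, xi in zip(row, x)) for row in R]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def act(g, x):
    """Apply the element g = (a, R) to x: rotate/reflect first, then translate."""
    a, R = g
    return [ri + ai for ri, ai in zip(matvec(R, x), a)]

def compose(g1, g2):
    """Semidirect product rule: (a, R)(b, S) = (a + R b, R S)."""
    (a, R), (b, S) = g1, g2
    return ([ai + rbi for ai, rbi in zip(a, matvec(R, b))], matmul(R, S))

def close(u, v):
    return all(math.isclose(p, q, abs_tol=1e-12) for p, q in zip(u, v))

g1 = ([1.0, -2.0], rotation(0.7))
g2 = ([0.5, 3.0], [[1.0, 0.0], [0.0, -1.0]])  # a reflection
x = [2.0, 1.0]

applied_in_sequence = act(g1, act(g2, x))  # g2 first, then g1
composition_ok = close(applied_in_sequence, act(compose(g1, g2), x))

# Adding the translations componentwise (the direct product rule) does not work.
naive = ([p + q for p, q in zip(g1[0], g2[0])], matmul(g1[1], g2[1]))
naive_ok = close(applied_in_sequence, act(naive, x))

print(composition_ok, naive_ok)  # True False
```

The second result shows why the product is only "semi" direct: the orthogonal part of the first element must act on the translation of the second.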
Invariance#
Now that we know what symmetries are, we can talk about invariance.
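Consider a function $f : X \to Y$ and a group $G$ acting on $X$. We say that $f$ is invariant under $G$ if

$$f(gx) = f(x)$$

for all $g \in G$ and $x \in X$.
If you have a physical problem with a known symmetry like that, you had better construct a model that respects that symmetry.
If the symmetry group is finite, then there is an easy way to construct an invariant function from an arbitrary function.
Say $G$ is a finite group with $|G|$ elements and $h : X \to Y$ is an arbitrary function. Define

$$f(x) = \frac{1}{|G|} \sum_{g \in G} h(gx).$$

Show that $f$ is invariant under $G$.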
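Averaging over a finite group is easy to try out. The sketch below uses, as an assumed example, the four rotations of the plane by multiples of 90 degrees and an arbitrary non-invariant function `h`; the helper `symmetrize` is a hypothetical name for the averaging construction.

```python
import math

# The four rotations of the plane by 0, 90, 180 and 270 degrees.
C4 = [
    lambda p: (p[0], p[1]),
    lambda p: (-p[1], p[0]),
    lambda p: (-p[0], -p[1]),
    lambda p: (p[1], -p[0]),
]

def h(p):
    """An arbitrary function with no particular symmetry."""
    x, y = p
    return x + 2.0 * y**2 + x * y

def symmetrize(func, group):
    """Average func over the group: f(p) = (1/|G|) * sum of func(g p) over g in G."""
    return lambda p: sum(func(g(p)) for g in group) / len(group)

f = symmetrize(h, C4)
p = (0.3, -1.2)

print([math.isclose(f(g(p)), f(p), abs_tol=1e-12) for g in C4])  # all True
print(math.isclose(h(C4[1](p)), h(p)))  # False: h itself is not invariant
```

The average is invariant because applying a group element to the input only reorders the terms of the sum.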
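When $G$ is a continuous group, you can average an arbitrary function $h$ over the group with an integral instead of a sum:

$$f(x) = \int_G h(gx)\, d\mu(g),$$

where $\mu$ is an invariant measure on $G$.
But integrating over a continuous group is not trivial. You get into the theory of Lie groups. It is also very likely that you will not be able to find an explicit expression for the integral.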
Equivariance#
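Let $f : X \to Y$ be a function and let $G$ be a group that acts on both $X$ and $Y$. Suppose that

$$f(gx) = g f(x)$$

for all $g \in G$ and $x \in X$.
When this happens, we say that $f$ is equivariant under $G$.
In the equation above, on the left hand-side $g$ acts on an element of $X$, while on the right hand-side it acts on an element of $Y$. The two actions can be different. For example, let $X = \mathbb{R}^{3N}$ hold the positions $x = (x_1, \dots, x_N)$ of $N$ particles and let $g = (a, R) \in \text{E}(3)$ act on them as

$$gx = (Rx_1 + a, \dots, Rx_N + a),$$

or for an individual particle:

$$(gx)_i = Rx_i + a,$$

where $R \in \text{O}(3)$ is the rotation/reflection and $a \in \mathbb{R}^3$ is the translation.
Now, suppose that $Y = \mathbb{R}^{3N}$ holds the forces $F = (F_1, \dots, F_N)$ on the particles. Forces are rotated but not translated, so $g$ acts on them as

$$gF = (RF_1, \dots, RF_N),$$

and for an individual particle:

$$(gF)_i = RF_i,$$

where $R$ is the same rotation/reflection as before.
Now, take $f$ to be the function that maps the particle positions to the forces, $F = f(x)$.
How do the forces change under the action of $g$? Equivariance says that $f(gx) = g f(x)$, i.e., $F_i(gx) = R F_i(x)$: transforming the positions rotates the forces and leaves their magnitudes unchanged.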
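To make the difference between invariance and equivariance concrete, here is a small numerical sketch. It assumes a toy pairwise potential, $E = \tfrac{1}{2}\sum_{i<j} \|x_i - x_j\|^2$, whose forces are $F_i = -\sum_j (x_i - x_j)$; the potential, the particle positions, and the function names are hypothetical choices for illustration. The energy should not change under a rotation plus translation, while the forces should rotate.

```python
import math

def matvec(R, v):
    return [sum(r * vk for r, vk in zip(row, v)) for row in R]

def energy(xs):
    """Toy potential: E = 1/2 * sum over pairs i<j of |x_i - x_j|^2."""
    return 0.5 * sum((p - q) ** 2
                     for i, xi in enumerate(xs) for xj in xs[i + 1:]
                     for p, q in zip(xi, xj))

def forces(xs):
    """F_i = -dE/dx_i = -sum_j (x_i - x_j)."""
    return [[-sum(xi[k] - xj[k] for xj in xs) for k in range(3)] for xi in xs]

def act_on_positions(R, a, xs):
    """Positions are rotated and translated: x_i -> R x_i + a."""
    return [[rk + ak for rk, ak in zip(matvec(R, x), a)] for x in xs]

def act_on_forces(R, Fs):
    """Forces are only rotated: F_i -> R F_i."""
    return [matvec(R, F) for F in Fs]

theta = 0.9
R = [[math.cos(theta), -math.sin(theta), 0.0],
     [math.sin(theta), math.cos(theta), 0.0],
     [0.0, 0.0, 1.0]]
a = [0.5, -1.0, 2.0]
xs = [[0.0, 0.0, 0.0], [1.0, 0.2, -0.3], [-0.7, 1.5, 0.4]]

moved = act_on_positions(R, a, xs)
energy_invariant = math.isclose(energy(moved), energy(xs), rel_tol=1e-12)
forces_equivariant = all(
    math.isclose(u, v, abs_tol=1e-12)
    for Fm, Fr in zip(forces(moved), act_on_forces(R, forces(xs)))
    for u, v in zip(Fm, Fr))
print(energy_invariant, forces_equivariant)  # True True
```

The translation drops out of both checks because the potential depends only on differences of positions.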
Euclidean neural networks#
Euclidean neural networks, see (Geiger et al. 2022), are neural networks that respect the symmetries of the Euclidean group. They rely on spherical harmonics to construct invariant and equivariant functions. More details can be found in their paper. We are going to demonstrate what they are capable of using numerical examples.