Activation Functions Registry

The ActivationFunctionsRegistry module provides the most commonly used activation functions, so that they are easy to access when building a neural network.

from basic_deep_learning import *
from basic_deep_learning import ActivationFunctionsRegistry as afr

The current version of the registry contains the following activation functions:

| Function | Definition | Derivative |
|---|---|---|
| Sigmoid | \(\sigma(z) = \displaystyle\frac{1}{1+e^{-z}}\) | \(\sigma'(z)= \sigma(z)\left(1-\sigma(z)\right)\) |
| ReLU | \(\mathrm{ReLU}(z) = \max(0, z) = \begin{cases} z \quad & z\geq 0\\ 0 \quad & z<0\end{cases}\) | \(\mathrm{ReLU}'(z) =\begin{cases} 1 \quad & z\geq 0 \\ 0 \quad & z<0\end{cases}\) |
| Linear | \(f(z)=z\) | \(f'(z)=1\) |
| Hyperbolic tangent | \(f(z) = \tanh(z)\) | \(f'(z)=1-\tanh^2(z)\) |
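
The sigmoid and hyperbolic tangent derivatives listed above both follow from the quotient rule. A short derivation, added here as a reminder rather than taken from the library:

\[\sigma'(z) = \frac{e^{-z}}{\left(1+e^{-z}\right)^2} = \frac{1}{1+e^{-z}}\cdot\frac{e^{-z}}{1+e^{-z}} = \sigma(z)\left(1-\sigma(z)\right)\]

\[\tanh'(z) = \frac{\cosh^2(z)-\sinh^2(z)}{\cosh^2(z)} = 1-\tanh^2(z)\]

The last step of the first line uses \(1-\sigma(z) = \displaystyle\frac{e^{-z}}{1+e^{-z}}\).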

Each function and its derivative are grouped in a tuple (activation_function, derivative) and stored in a dictionary named Activations, whose keys are strings giving the name of the function.

For example,

from basic_deep_learning import *
from basic_deep_learning import ActivationFunctionsRegistry as afr

# Each registry entry is a (function, derivative) tuple
sigmoid, sigmoid_prime = afr.Activations['sigmoid']

print(sigmoid(1))
print(sigmoid_prime(0))

0.7310585786300049
0.25

The keys for the four activation functions listed above are, in order, 'sigmoid', 'ReLU', 'linear' and 'tanh'.
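
To make the layout of the dictionary concrete, here is a minimal stand-in written in plain Python. It is not the library's code: the lowercase `activations` dictionary, the helper functions and `max_derivative_error` are hypothetical names introduced for this sketch. It only assumes the documented structure (string key mapped to a `(function, derivative)` tuple) and checks each derivative formula against a central finite difference.

```python
import math

# Stand-in scalar definitions written for this sketch (not the library's code).
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def relu(z):
    return z if z >= 0 else 0

def relu_prime(z):
    return 1 if z >= 0 else 0

def linear(z):
    return z

def linear_prime(z):
    return 1

def tanh_prime(z):
    return 1.0 - math.tanh(z) ** 2

# Hypothetical stand-in mirroring the documented layout: name -> (function, derivative).
activations = {
    'sigmoid': (sigmoid, sigmoid_prime),
    'ReLU': (relu, relu_prime),
    'linear': (linear, linear_prime),
    'tanh': (math.tanh, tanh_prime),
}

def numerical_derivative(f, z, h=1e-6):
    """Central finite difference, used only to sanity-check the formulas."""
    return (f(z + h) - f(z - h)) / (2 * h)

def max_derivative_error(name, points=(-2.0, -0.5, 0.5, 2.0)):
    """Largest gap between the stored derivative and the numerical one.

    The sample points avoid z = 0, where ReLU has a kink.
    """
    f, f_prime = activations[name]
    return max(abs(f_prime(z) - numerical_derivative(f, z)) for z in points)

for name in activations:
    print(name, max_derivative_error(name) < 1e-6)
```

Looking functions up by name in this way lets a network description store the activation of each layer as a plain string.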

These activation functions accept integers, floating-point values, lists and even Matrix instances. The return value has the same structure as the input; for lists and Matrix instances, the function is applied component-wise.

from basic_deep_learning import *
from basic_deep_learning import ActivationFunctionsRegistry as afr

# Index 0 of the tuple is the function itself, index 1 is its derivative
tanh = afr.Activations['tanh'][0]

z = 2                                  # scalar
v = [-1, 0, 1]                         # list
M = Matrix([[1, 2, 3], [0, 1, -2]])    # Matrix instance

print(tanh(z))
print(tanh(v))
print(tanh(M))

0.9640275800758169
[-0.7615941559557649, 0.0, 0.7615941559557649]
matrix([
        [0.7615941559557649, 0.9640275800758169, 0.9950547536867305],
        [0.0, 0.7615941559557649, -0.9640275800758169]
])
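
The component-wise behaviour shown above can be pictured with a small recursive helper. This is a sketch of one possible approach, not the library's implementation: `apply_componentwise` is a hypothetical name, and it handles only numbers and (nested) lists, since the Matrix class belongs to the library and is left out here.

```python
import math
from numbers import Real

def apply_componentwise(f, x):
    """Apply a scalar function f to a number or to every entry of a (nested) list."""
    if isinstance(x, Real):
        # Scalars are passed straight to the function.
        return f(x)
    if isinstance(x, list):
        # Lists are rebuilt entry by entry, so the nesting is preserved.
        return [apply_componentwise(f, item) for item in x]
    raise TypeError(f"unsupported input type: {type(x).__name__}")

print(apply_componentwise(math.tanh, 2))
print(apply_componentwise(math.tanh, [-1, 0, 1]))
print(apply_componentwise(math.tanh, [[1, 2, 3], [0, 1, -2]]))
```

Because the helper recurses on lists, a list of rows keeps its row structure in the result.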

The registry also contains the softmax function, which turns a column vector into a probability distribution. More formally, if \(X = \begin{pmatrix}x_1\\x_2\\ \vdots\\x_n\end{pmatrix}\) is a column matrix, then \(\mathrm{softmax}(X) = \begin{pmatrix} y_1\\ y_2 \\ \vdots \\ y_n\end{pmatrix}\) where

\[\forall i \in \{1, \dots, n\}, \quad y_i = \frac{e^{x_i}}{\displaystyle\sum_{k=1}^n e^{x_k}}\]

For example,

from basic_deep_learning import *
from basic_deep_learning import ActivationFunctionsRegistry as afr

# Column vector of shape (7, 1)
M = Matrix([[-5], [4], [2.3], [-2], [3], [4], [7]])

print(afr.softmax(M))

matrix([
        [5.451275599280189e-06],
        [0.04417214369331117],
        [0.00806952287485786],
        [0.00010949179732781423],
        [0.01625002353723996],
        [0.04417214369331117],
        [0.8872212231283527]
])

If the matrix passed is not a column vector, i.e. its shape cannot be written as (n, 1), a TypeError is raised.
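
The formula and the shape check can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's `softmax`: the hypothetical `softmax_column` takes a list of one-element rows instead of a Matrix, and it subtracts the largest entry before exponentiating. That is a common numerical-stability trick which leaves the result mathematically unchanged; whether the library does the same is not stated in this documentation.

```python
import math

def softmax_column(column):
    """Softmax of a column vector given as a list of one-element rows, e.g. [[1.0], [2.0]]."""
    # Reject anything whose shape is not (n, 1), mirroring the documented TypeError.
    if not column or any(not isinstance(row, list) or len(row) != 1 for row in column):
        raise TypeError("expected a column vector of shape (n, 1)")
    values = [row[0] for row in column]
    # Subtracting the maximum avoids overflow in exp() without changing the ratios.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [[e / total] for e in exps]

probabilities = softmax_column([[-5], [4], [2.3], [-2], [3], [4], [7]])

# The entries are positive and sum to 1 (up to rounding), so they form a distribution.
print(sum(p[0] for p in probabilities))
# Index of the most probable entry.
print(max(range(len(probabilities)), key=lambda i: probabilities[i][0]))
```

The largest input always receives the largest probability, which is why softmax is typically used on the output layer of a classifier.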