Activation Functions Registry
The ActivationFunctionsRegistry module contains
the most commonly used activation functions, giving easy access
to them when building a neural network.
from basic_deep_learning import*
from basic_deep_learning import ActivationFunctionsRegistry as afr
As of the current version, the registry contains the following activation functions:
| Function | Definition | Derivative |
|---|---|---|
| Sigmoid | \(\sigma(z) = \displaystyle\frac{1}{1+e^{-z}}\) | \(\sigma'(z)= \sigma(z)\left(1-\sigma(z)\right)\) |
| ReLU | \(\mathrm{ReLU}(z) = \max(0, z) = \begin{cases} z \quad & z\geq 0\\ 0 \quad & z<0\end{cases}\) | \(\mathrm{ReLU}'(z) =\begin{cases} 1 \quad & z\geq 0 \\ 0 \quad & z<0\end{cases}\) |
| Linear | \(f(z)=z\) | \(f'(z)=1\) |
| Hyperbolic tangent | \(f(z) = \tanh(z)\) | \(f'(z)=1-\tanh^2(z)\) |
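The derivative column can be checked numerically. The sketch below does not use the library: it re-implements the four functions with Python's `math` module and compares each closed-form derivative with a central finite difference. All names here (`sigmoid`, `relu`, `central_difference`, `max_derivative_error`, `PAIRS`) are illustrative assumptions, not the library's own identifiers.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def relu(z):
    return max(0.0, z)

def relu_prime(z):
    # Convention taken from the table: the value at z = 0 is 1.
    return 1.0 if z >= 0 else 0.0

def linear(z):
    return z

def linear_prime(z):
    return 1.0

def tanh_prime(z):
    return 1.0 - math.tanh(z) ** 2

def central_difference(f, z, h=1e-6):
    """Numerical estimate of f'(z)."""
    return (f(z + h) - f(z - h)) / (2.0 * h)

# Illustrative pairs of (function, closed-form derivative).
PAIRS = {
    "sigmoid": (sigmoid, sigmoid_prime),
    "ReLU": (relu, relu_prime),
    "linear": (linear, linear_prime),
    "tanh": (math.tanh, tanh_prime),
}

def max_derivative_error(points=(-2.0, -0.5, 0.7, 3.0)):
    """Largest gap between the closed form and the numerical estimate.

    z = 0 is left out on purpose: ReLU is not differentiable there,
    so the finite difference would not match the tabulated convention.
    """
    worst = 0.0
    for f, f_prime in PAIRS.values():
        for z in points:
            worst = max(worst, abs(central_difference(f, z) - f_prime(z)))
    return worst

print(max_derivative_error())
```

The printed value should be tiny (far below `1e-6`), which is what agreement between the two columns of the table looks like in floating-point arithmetic.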
Each function and its derivative are then grouped in a tuple of the form (activation_function, derivative)
and stored in a dictionary named Activations, whose keys are strings giving the name of the function.
For example,
from basic_deep_learning import*
from basic_deep_learning import ActivationFunctionsRegistry as afr
sigmoid, sigmoid_prime = afr.Activations['sigmoid']
print(sigmoid(1))
print(sigmoid_prime(0))
0.7310585786300049
0.25
The keys for the activation functions are respectively 'sigmoid', 'ReLU',
'linear' and 'tanh'.
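To illustrate the lookup pattern without depending on the library, here is a minimal stand-in for such a registry. It assumes the registry behaves like a plain `dict` of `(function, derivative)` tuples; `ACTIVATIONS` and `get_activation` are hypothetical names introduced for this sketch only.

```python
import math

def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical stand-in for a registry keyed by the names listed above.
ACTIVATIONS = {
    "sigmoid": (_sigmoid, lambda z: _sigmoid(z) * (1.0 - _sigmoid(z))),
    "ReLU": (lambda z: max(0.0, z), lambda z: 1.0 if z >= 0 else 0.0),
    "linear": (lambda z: z, lambda z: 1.0),
    "tanh": (math.tanh, lambda z: 1.0 - math.tanh(z) ** 2),
}

def get_activation(name):
    """Return (function, derivative) or raise a KeyError listing the valid keys."""
    try:
        return ACTIVATIONS[name]
    except KeyError:
        valid = ", ".join(sorted(ACTIVATIONS))
        raise KeyError(f"unknown activation {name!r}; expected one of: {valid}") from None

# Pick one activation per layer by name, as one might when describing a network.
layer_activations = [get_activation(name) for name in ("ReLU", "tanh", "sigmoid")]

z = 0.5
for f, _ in layer_activations:
    z = f(z)  # scalar forward pass through the three activations only
print(z)
```

With a plain dictionary the keys are case-sensitive, so under this assumption `'ReLU'` is found while `'relu'` is not.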
These activation functions accept integers, floating-point values,
lists and even Matrix instances; the return type matches the input type,
and for lists and matrices the function is applied component-wise.
from basic_deep_learning import*
from basic_deep_learning import ActivationFunctionsRegistry as afr
tanh = afr.Activations['tanh'][0]
z = 2
v = [-1, 0, 1]
M = Matrix([[1, 2, 3],[0, 1, -2]])
print(tanh(z))
print(tanh(v))
print(tanh(M))
0.9640275800758169
[-0.7615941559557649, 0.0, 0.7615941559557649]
matrix([
[0.7615941559557649, 0.9640275800758169, 0.9950547536867305],
[0.0, 0.7615941559557649, -0.9640275800758169]
])
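Component-wise application of this kind can be sketched with a small recursive helper. This is not the library's implementation: `apply_componentwise` is a hypothetical name, and nested lists stand in for Matrix instances, whose internals are not described here.

```python
import math

def apply_componentwise(f, x):
    """Apply the scalar function f to a number, or recursively to (nested) lists.

    The result has the same shape as the input: a number gives a number,
    a list gives a list, and a list of rows gives a list of rows.
    """
    if isinstance(x, (int, float)):
        return f(x)
    if isinstance(x, list):
        return [apply_componentwise(f, item) for item in x]
    raise TypeError(f"unsupported input type: {type(x).__name__}")

print(apply_componentwise(math.tanh, 2))                         # scalar in, scalar out
print(apply_componentwise(math.tanh, [-1, 0, 1]))                # list in, list out
print(apply_componentwise(math.tanh, [[1, 2, 3], [0, 1, -2]]))   # rows in, rows out
```

Dispatching on the input type keeps a single scalar definition of each activation, which is one plausible way to obtain the "same type in, same type out" behaviour described above.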
The registry also contains the softmax function, which turns a column vector into a probability distribution. More formally, if \(X = \begin{pmatrix}x_1\\x_2\\ \vdots\\x_n\end{pmatrix}\) is a column matrix, then \(\mathrm{softmax}(X) = \begin{pmatrix} y_1\\ y_2 \\ \vdots \\ y_n\end{pmatrix}\) where \(y_i = \displaystyle\frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}\) for \(i = 1, \dots, n\).
For example,
from basic_deep_learning import*
from basic_deep_learning import ActivationFunctionsRegistry as afr
M = Matrix([[-5],[4],[2.3],[-2],[3],[4],[7]])
print(afr.softmax(M))
matrix([
[5.451275599280189e-06],
[0.04417214369331117],
[0.00806952287485786],
[0.00010949179732781423],
[0.01625002353723996],
[0.04417214369331117],
[0.8872212231283527]
])
If the matrix passed is not a column vector, i.e. its shape is not of the form (n, 1),
a TypeError is raised.
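The behaviour described for softmax, including the shape check, can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's code: `softmax_column` is a hypothetical name, a column vector is represented as a list of one-element rows, and subtracting the maximum before exponentiating is a standard numerical-stability trick added here (it leaves the result mathematically unchanged), which the library may or may not use.

```python
import math

def softmax_column(column):
    """Softmax of a column vector given as a list of one-element rows.

    Example input: [[-5], [4], [2.3]] (shape (3, 1)).
    """
    is_column = (
        isinstance(column, list)
        and len(column) > 0
        and all(isinstance(row, list) and len(row) == 1 for row in column)
    )
    if not is_column:
        raise TypeError("softmax expects a column vector of shape (n, 1)")

    values = [row[0] for row in column]
    shift = max(values)                           # avoids overflow for large inputs
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [[e / total] for e in exps]            # same (n, 1) shape as the input

probabilities = softmax_column([[-5], [4], [2.3], [-2], [3], [4], [7]])
print(sum(row[0] for row in probabilities))       # should be 1 up to rounding
```

Equal inputs produce equal probabilities and the largest input receives the largest share, which is consistent with the example output above.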