Forward Propagation
Definition:
forward_propagate(input_vector: Matrix) -> tuple[Matrix, tuple[list[Matrix], list[Matrix]]]
If input_vector is not a column vector, a ValueError is raised.
Let \(L\) be the number of layers, let \(\left(W^{(1)},\cdots,W^{(L-1)}\right)\) be the \(L-1\) weight matrices, and let \(\left(B^{(1)},\cdots,B^{(L-1)}\right)\) be the \(L-1\) bias matrices. Starting from the input vector \(A^{(0)}=X\), the output of the MLP is computed iteratively via the recursive formula:
\[A^{(i)} = f\left(W^{(i)}A^{(i-1)} + B^{(i)}\right)\]
where \(f\) is the hidden-layer activation function if \(i < L\) and the output-layer activation function if \(i = L\). The output is therefore the last column vector \(A^{(L)}\).
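As a hypothetical worked instance of this recursion, take a network with two weight matrices. The numbers, the ReLU hidden activation, and the identity output activation are assumptions chosen for illustration, not values from the library:
\[A^{(0)} = X = \begin{pmatrix}1\\2\end{pmatrix},\quad W^{(1)} = \begin{pmatrix}1 & -1\\0 & 1\end{pmatrix},\quad B^{(1)} = \begin{pmatrix}0\\1\end{pmatrix},\quad W^{(2)} = \begin{pmatrix}1 & 1\end{pmatrix},\quad B^{(2)} = \begin{pmatrix}-1\end{pmatrix}\]
\[A^{(1)} = \operatorname{ReLU}\left(W^{(1)}A^{(0)} + B^{(1)}\right) = \operatorname{ReLU}\begin{pmatrix}-1\\3\end{pmatrix} = \begin{pmatrix}0\\3\end{pmatrix}\]
\[A^{(2)} = W^{(2)}A^{(1)} + B^{(2)} = \begin{pmatrix}0 + 3 - 1\end{pmatrix} = \begin{pmatrix}2\end{pmatrix}\]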
The method returns a tuple (output_vector, (activations, pre_activations)).
output_vector: the \(A^{(L)}\) column vector, as a Matrix instance.
activations: the list of the \(A^{(i)}\) column vectors.
pre_activations: the list of the \(Z^{(i)}\) column vectors (\(1\leq i \leq L\)), that is, the values passed to the activation function to obtain \(A^{(i)}\), where:
\[Z^{(i)} = W^{(i)}A^{(i-1)} + B^{(i)}\]
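The following is a minimal sketch of how such a method could assemble its return value. It is not the library's implementation. It uses nested Python lists as a stand-in for Matrix, and the names forward_propagate_sketch, affine, and apply_elementwise are hypothetical. It also assumes that activations starts with the input \(A^{(0)}\), which the description above does not state, and it uses ReLU and identity activations purely for illustration.

```python
from typing import Callable

# Stand-in for the library's Matrix class (assumption): an n x 1 column
# vector is a list of n single-element rows, e.g. [[1.0], [2.0]].
Column = list[list[float]]
Weights = list[list[float]]


def affine(w: Weights, a: Column, b: Column) -> Column:
    """Return the column vector W * A + B."""
    return [
        [sum(w_rk * a[k][0] for k, w_rk in enumerate(row)) + b[r][0]]
        for r, row in enumerate(w)
    ]


def apply_elementwise(f: Callable[[float], float], z: Column) -> Column:
    """Apply a scalar activation function to every entry of a column vector."""
    return [[f(row[0])] for row in z]


def forward_propagate_sketch(
    x: Column,
    weights: list[Weights],
    biases: list[Column],
    hidden_f: Callable[[float], float],
    output_f: Callable[[float], float],
) -> tuple[Column, tuple[list[Column], list[Column]]]:
    # Reject anything that is not an n x 1 column vector.
    if not x or any(len(row) != 1 for row in x):
        raise ValueError("input must be a column vector")

    activations = [x]  # assumption: A^(0) = X is stored as the first entry
    pre_activations = []
    for i, (w, b) in enumerate(zip(weights, biases), start=1):
        z = affine(w, activations[-1], b)  # Z^(i) = W^(i) A^(i-1) + B^(i)
        # The last weight matrix feeds the output-layer activation.
        f = output_f if i == len(weights) else hidden_f
        pre_activations.append(z)
        activations.append(apply_elementwise(f, z))  # A^(i) = f(Z^(i))
    return activations[-1], (activations, pre_activations)


def relu(v: float) -> float:
    return max(0.0, v)


def identity(v: float) -> float:
    return v


output_vector, (activations, pre_activations) = forward_propagate_sketch(
    [[1.0], [2.0]],
    weights=[[[1.0, -1.0], [0.0, 1.0]], [[1.0, 1.0]]],
    biases=[[[0.0], [1.0]], [[-1.0]]],
    hidden_f=relu,
    output_f=identity,
)
```

Keeping both lists is useful because a later backward pass typically needs \(Z^{(i)}\) to evaluate the derivative of the activation function and \(A^{(i-1)}\) to form the weight gradients.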