Layers
RecurrentLayers.RAN — Type

    RAN(input_size => hidden_size;
        return_state = false, kwargs...)

Recurrent Additive Network. See RANCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.

Keyword arguments

- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.
- `return_state`: option to return the last state together with the output. Default is `false`.

Equations

\[\begin{aligned} \tilde{c}_t &= W_c x_t, \\ i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i), \\ f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\ c_t &= i_t \odot \tilde{c}_t + f_t \odot c_{t-1}, \\ h_t &= g(c_t) \end{aligned}\]

Forward

    ran(inp, (state, cstate))
    ran(inp)

Arguments

- `inp`: The input to the ran. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `(state, cstate)`: A tuple containing the hidden and cell states of the RAN. They should be vectors of size `hidden_size` or matrices of size `hidden_size x batch_size`. If not provided, they are assumed to be vectors of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
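A minimal usage sketch, following the constructor and forward signatures above; the sizes (4 => 8, sequence length 10, batch 3) are arbitrary illustration values:

```julia
using Flux, RecurrentLayers

ran = RAN(4 => 8)                  # input_size = 4, hidden_size = 8
inp = rand(Float32, 4, 10, 3)      # input_size x len x batch_size
out = ran(inp)                     # 8 x 10 x 3, zero initial states

ran_rs = RAN(4 => 8; return_state = true)
out, last_state = ran_rs(inp)      # also returns the last (state, cstate)
```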
RecurrentLayers.IndRNN — Type

    IndRNN(input_size, hidden_size, [activation];
        return_state = false, kwargs...)

Independently recurrent network. See IndRNNCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.
- `activation`: activation function. Default is `tanh`.

Keyword arguments

- `return_state`: option to return the last state together with the output. Default is `false`.
- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.

Equations

\[\mathbf{h}_{t} = \sigma(\mathbf{W} \mathbf{x}_t + \mathbf{u} \odot \mathbf{h}_{t-1} + \mathbf{b})\]

Forward

    indrnn(inp, state)
    indrnn(inp)

Arguments

- `inp`: The input to the indrnn. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `state`: The hidden state of the IndRNN. If given, it is a vector of size `hidden_size` or a matrix of size `hidden_size x batch_size`. If not provided, it is assumed to be a vector of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
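A short sketch with a non-default activation, assuming the pair constructor used by the other layers on this page; sizes are arbitrary:

```julia
using Flux, RecurrentLayers

indrnn = IndRNN(4 => 8, relu)      # override the default tanh activation
inp = rand(Float32, 4, 10, 3)
state = zeros(Float32, 8, 3)       # optional explicit initial state
out = indrnn(inp, state)           # 8 x 10 x 3
```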
RecurrentLayers.LightRU — Type

    LightRU(input_size => hidden_size;
        return_state = false, kwargs...)

Light recurrent unit network. See LightRUCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.

Keyword arguments

- `return_state`: option to return the last state together with the output. Default is `false`.
- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.

Equations

\[\begin{aligned} \tilde{h}_t &= \tanh(W_h x_t), \\ f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t. \end{aligned}\]

Forward

    lightru(inp, state)
    lightru(inp)

Arguments

- `inp`: The input to the lightru. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `state`: The hidden state of the LightRU. If given, it is a vector of size `hidden_size` or a matrix of size `hidden_size x batch_size`. If not provided, it is assumed to be a vector of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
RecurrentLayers.LiGRU — Type

    LiGRU(input_size => hidden_size;
        return_state = false, kwargs...)

Light gated recurrent network. The implementation does not include the batch normalization described in the original paper. See LiGRUCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.

Keyword arguments

- `return_state`: option to return the last state together with the output. Default is `false`.
- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.

Equations

\[\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1}), \\ \tilde{h}_t &= \text{ReLU}(W_h x_t + U_h h_{t-1}), \\ h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t \end{aligned}\]

Forward

    ligru(inp, state)
    ligru(inp)

Arguments

- `inp`: The input to the ligru. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `state`: The hidden state of the LiGRU. If given, it is a vector of size `hidden_size` or a matrix of size `hidden_size x batch_size`. If not provided, it is assumed to be a vector of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
RecurrentLayers.MGU — Type

    MGU(input_size => hidden_size;
        return_state = false, kwargs...)

Minimal gated unit network. See MGUCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.

Keyword arguments

- `return_state`: option to return the last state together with the output. Default is `false`.
- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.

Equations

\[\begin{aligned} f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\ \tilde{h}_t &= \tanh(W_h x_t + U_h (f_t \odot h_{t-1}) + b_h), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t \end{aligned}\]

Forward

    mgu(inp, state)
    mgu(inp)

Arguments

- `inp`: The input to the mgu. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `state`: The hidden state of the MGU. If given, it is a vector of size `hidden_size` or a matrix of size `hidden_size x batch_size`. If not provided, it is assumed to be a vector of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
RecurrentLayers.NAS — Type

    NAS(input_size => hidden_size;
        return_state = false,
        kwargs...)

Neural Architecture Search unit. See NASCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.

Keyword arguments

- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.
- `return_state`: option to return the last state together with the output. Default is `false`.

Equations

\[\begin{aligned} \text{First Layer Outputs:} & \\ o_1 &= \sigma(W_i^{(1)} x_t + W_h^{(1)} h_{t-1} + b^{(1)}), \\ o_2 &= \text{ReLU}(W_i^{(2)} x_t + W_h^{(2)} h_{t-1} + b^{(2)}), \\ o_3 &= \sigma(W_i^{(3)} x_t + W_h^{(3)} h_{t-1} + b^{(3)}), \\ o_4 &= \text{ReLU}(W_i^{(4)} x_t \cdot W_h^{(4)} h_{t-1}), \\ o_5 &= \tanh(W_i^{(5)} x_t + W_h^{(5)} h_{t-1} + b^{(5)}), \\ o_6 &= \sigma(W_i^{(6)} x_t + W_h^{(6)} h_{t-1} + b^{(6)}), \\ o_7 &= \tanh(W_i^{(7)} x_t + W_h^{(7)} h_{t-1} + b^{(7)}), \\ o_8 &= \sigma(W_i^{(8)} x_t + W_h^{(8)} h_{t-1} + b^{(8)}). \\ \text{Second Layer Computations:} & \\ l_1 &= \tanh(o_1 \cdot o_2) \\ l_2 &= \tanh(o_3 + o_4) \\ l_3 &= \tanh(o_5 \cdot o_6) \\ l_4 &= \sigma(o_7 + o_8) \\ \text{Inject Cell State:} & \\ l_1 &= \tanh(l_1 + c_{\text{state}}) \\ \text{Final Layer Computations:} & \\ c_{\text{new}} &= l_1 \cdot l_2 \\ l_5 &= \tanh(l_3 + l_4) \\ h_{\text{new}} &= \tanh(c_{\text{new}} \cdot l_5) \end{aligned}\]

Forward

    nas(inp, (state, cstate))
    nas(inp)

Arguments

- `inp`: The input to the nas. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `(state, cstate)`: A tuple containing the hidden and cell states of the NAS. They should be vectors of size `hidden_size` or matrices of size `hidden_size x batch_size`. If not provided, they are assumed to be vectors of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
RecurrentLayers.RHN — Type

    RHN(input_size => hidden_size, [depth];
        return_state = false,
        kwargs...)

Recurrent highway network. See RHNCellUnit for the unit component of this layer. See RHNCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.
- `depth`: depth of the recurrence. Default is 3.

Keyword arguments

- `couple_carry`: couples the carry gate and the transform gate. Default is `true`.
- `init_kernel`: initializer for the input to hidden weights.
- `bias`: include a bias or not. Default is `true`.
- `return_state`: option to return the last state together with the output. Default is `false`.

Equations

\[\begin{aligned} s_{\ell}^{[t]} &= h_{\ell}^{[t]} \odot t_{\ell}^{[t]} + s_{\ell-1}^{[t]} \odot c_{\ell}^{[t]}, \\ \text{where} \\ h_{\ell}^{[t]} &= \tanh(W_h x^{[t]}\mathbb{I}_{\ell = 1} + U_{h_{\ell}} s_{\ell-1}^{[t]} + b_{h_{\ell}}), \\ t_{\ell}^{[t]} &= \sigma(W_t x^{[t]}\mathbb{I}_{\ell = 1} + U_{t_{\ell}} s_{\ell-1}^{[t]} + b_{t_{\ell}}), \\ c_{\ell}^{[t]} &= \sigma(W_c x^{[t]}\mathbb{I}_{\ell = 1} + U_{c_{\ell}} s_{\ell-1}^{[t]} + b_{c_{\ell}}) \end{aligned}\]
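This entry does not spell out a forward section; a minimal sketch, assuming the forward call follows the same sequence-processing convention as the other layers on this page (sizes arbitrary):

```julia
using RecurrentLayers

rhn = RHN(4 => 8, 3)               # depth-3 recurrence
inp = rand(Float32, 4, 10, 3)      # input_size x len x batch_size
out = rhn(inp)                     # initial state assumed to be zeros
```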
RecurrentLayers.MUT1 — Type

    MUT1(input_size => hidden_size;
        return_state = false,
        kwargs...)

Mutated unit 1 network. See MUT1Cell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.

Keyword arguments

- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.
- `return_state`: option to return the last state together with the output. Default is `false`.

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + b_z), \\ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + \tanh(W_h x_t) + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

    mut(inp, state)
    mut(inp)

Arguments

- `inp`: The input to the mut. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `state`: The hidden state of the MUT. If given, it is a vector of size `hidden_size` or a matrix of size `hidden_size x batch_size`. If not provided, it is assumed to be a vector of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
RecurrentLayers.MUT2 — Type

    MUT2(input_size => hidden_size;
        return_state = false,
        kwargs...)

Mutated unit 2 network. See MUT2Cell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.

Keyword arguments

- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.
- `return_state`: option to return the last state together with the output. Default is `false`.

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + U_z h_t + b_z), \\ r &= \sigma(x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

    mut(inp, state)
    mut(inp)

Arguments

- `inp`: The input to the mut. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `state`: The hidden state of the MUT. If given, it is a vector of size `hidden_size` or a matrix of size `hidden_size x batch_size`. If not provided, it is assumed to be a vector of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
RecurrentLayers.MUT3 — Type

    MUT3(input_size => hidden_size;
        return_state = false, kwargs...)

Mutated unit 3 network. See MUT3Cell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.

Keyword arguments

- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.
- `return_state`: option to return the last state together with the output. Default is `false`.

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + U_z \tanh(h_t) + b_z), \\ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

    mut(inp, state)
    mut(inp)

Arguments

- `inp`: The input to the mut. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `state`: The hidden state of the MUT. If given, it is a vector of size `hidden_size` or a matrix of size `hidden_size x batch_size`. If not provided, it is assumed to be a vector of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
RecurrentLayers.SCRN — Type

    SCRN(input_size => hidden_size;
        init_kernel = glorot_uniform,
        init_recurrent_kernel = glorot_uniform,
        bias = true, alpha = 0.0,
        return_state = false)

Structurally constrained recurrent network. See SCRNCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.

Keyword arguments

- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.
- `alpha`: structural constraint. Default is 0.0.
- `return_state`: option to return the last state together with the output. Default is `false`.

Equations

\[\begin{aligned} s_t &= (1 - \alpha) W_s x_t + \alpha s_{t-1}, \\ h_t &= \sigma(W_h s_t + U_h h_{t-1} + b_h), \\ y_t &= f(U_y h_t + W_y s_t) \end{aligned}\]

Forward

    scrn(inp, (state, cstate))
    scrn(inp)

Arguments

- `inp`: The input to the scrn. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `(state, cstate)`: A tuple containing the hidden and cell states of the SCRN. They should be vectors of size `hidden_size` or matrices of size `hidden_size x batch_size`. If not provided, they are assumed to be vectors of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
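A short sketch based on the signature above, passing the state tuple explicitly; `alpha = 0.5` and the sizes are arbitrary illustration values:

```julia
using RecurrentLayers

scrn = SCRN(4 => 8; alpha = 0.5)   # nonzero structural constraint
inp = rand(Float32, 4, 10, 3)
state = (zeros(Float32, 8, 3), zeros(Float32, 8, 3))  # (state, cstate)
out = scrn(inp, state)             # 8 x 10 x 3
```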
RecurrentLayers.PeepholeLSTM — Type

    PeepholeLSTM(input_size => hidden_size;
        return_state = false,
        kwargs...)

Peephole long short term memory network. See PeepholeLSTMCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.

Keyword arguments

- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.
- `return_state`: option to return the last state together with the output. Default is `false`.

Equations

\[\begin{aligned} f_t &= \sigma_g(W_f x_t + U_f c_{t-1} + b_f), \\ i_t &= \sigma_g(W_i x_t + U_i c_{t-1} + b_i), \\ o_t &= \sigma_g(W_o x_t + U_o c_{t-1} + b_o), \\ c_t &= f_t \odot c_{t-1} + i_t \odot \sigma_c(W_c x_t + b_c), \\ h_t &= o_t \odot \sigma_h(c_t). \end{aligned}\]

Forward

    peepholelstm(inp, (state, cstate))
    peepholelstm(inp)

Arguments

- `inp`: The input to the peepholelstm. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `(state, cstate)`: A tuple containing the hidden and cell states of the PeepholeLSTM. They should be vectors of size `hidden_size` or matrices of size `hidden_size x batch_size`. If not provided, they are assumed to be vectors of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
RecurrentLayers.FastRNN — Type

    FastRNN(input_size => hidden_size, [activation];
        return_state = false, kwargs...)

Fast recurrent neural network. See FastRNNCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.
- `activation`: the activation function. Default is `tanh_fast`.

Keyword arguments

- `return_state`: option to return the last state together with the output. Default is `false`.
- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `init_alpha`: initializer for the alpha parameter. Default is 3.0.
- `init_beta`: initializer for the beta parameter. Default is -3.0.
- `bias`: include a bias or not. Default is `true`.

Equations

\[\begin{aligned} \tilde{h}_t &= \sigma(W_h x_t + U_h h_{t-1} + b), \\ h_t &= \alpha \tilde{h}_t + \beta h_{t-1} \end{aligned}\]

Forward

    fastrnn(inp, state)
    fastrnn(inp)

Arguments

- `inp`: The input to the fastrnn. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `state`: The hidden state of the FastRNN. If given, it is a vector of size `hidden_size` or a matrix of size `hidden_size x batch_size`. If not provided, it is assumed to be a vector of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
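A minimal sketch of the constructor with a non-default activation and the alpha/beta initializers described above; the values shown are arbitrary:

```julia
using Flux, RecurrentLayers

fastrnn = FastRNN(4 => 8, relu; init_alpha = 3.0, init_beta = -3.0)
inp = rand(Float32, 4, 10, 3)
out = fastrnn(inp)                 # 8 x 10 x 3
```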
RecurrentLayers.FastGRNN — Type

    FastGRNN(input_size => hidden_size, [activation];
        return_state = false, kwargs...)

Fast gated recurrent neural network. See FastGRNNCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.
- `activation`: the activation function. Default is `tanh_fast`.

Keyword arguments

- `return_state`: option to return the last state together with the output. Default is `false`.
- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `init_zeta`: initializer for the zeta parameter. Default is 1.0.
- `init_nu`: initializer for the nu parameter. Default is -4.0.
- `bias`: include a bias or not. Default is `true`.

Equations

\[\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z), \\ \tilde{h}_t &= \tanh(W_h x_t + U_h h_{t-1} + b_h), \\ h_t &= \big((\zeta (1 - z_t) + \nu) \odot \tilde{h}_t\big) + z_t \odot h_{t-1} \end{aligned}\]

Forward

    fastgrnn(inp, state)
    fastgrnn(inp)

Arguments

- `inp`: The input to the fastgrnn. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `state`: The hidden state of the FastGRNN. If given, it is a vector of size `hidden_size` or a matrix of size `hidden_size x batch_size`. If not provided, it is assumed to be a vector of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
RecurrentLayers.FSRNN — Type

    FSRNN(input_size => hidden_size,
        fast_cells, slow_cell;
        return_state = false)

Fast slow recurrent neural network. See FSRNNCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.
- `fast_cells`: a vector of the fast cells. Must be at least of length 2.
- `slow_cell`: the chosen slow cell.

Keyword arguments

- `return_state`: option to return the last state. Default is `false`.

Equations

\[\begin{aligned} h_t^{F_1} &= f^{F_1}\left(h_{t-1}^{F_k}, x_t\right) \\ h_t^S &= f^S\left(h_{t-1}^S, h_t^{F_1}\right) \\ h_t^{F_2} &= f^{F_2}\left(h_t^{F_1}, h_t^S\right) \\ h_t^{F_i} &= f^{F_i}\left(h_t^{F_{i-1}}\right) \quad \text{for } 3 \leq i \leq k \end{aligned}\]

Forward

    fsrnn(inp, (fast_state, slow_state))
    fsrnn(inp)

Arguments

- `inp`: The input to the fsrnn. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `(fast_state, slow_state)`: A tuple containing the fast and slow hidden states of the FSRNN. They should be vectors of size `hidden_size` or matrices of size `hidden_size x batch_size`. If not provided, they are assumed to be vectors of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
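A construction sketch, assuming Flux-style recurrent cells are accepted, that the first fast cell maps the input dimension, and that the remaining cells operate at `hidden_size`; consult the FSRNNCell docstring for the exact cell requirements:

```julia
using Flux, RecurrentLayers

fast = [Flux.LSTMCell(4 => 8), Flux.LSTMCell(8 => 8)]  # at least two fast cells
slow = Flux.LSTMCell(8 => 8)
fsrnn = FSRNN(4 => 8, fast, slow)
inp = rand(Float32, 4, 10, 3)
out = fsrnn(inp)
```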
RecurrentLayers.LEM — Type

    LEM(input_size => hidden_size, [dt];
        return_state = false, init_kernel = glorot_uniform,
        init_recurrent_kernel = glorot_uniform, bias = true)

Long expressive memory network. See LEMCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.
- `dt`: timestep. Default is 1.0.

Keyword arguments

- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.
- `return_state`: option to return the last state together with the output. Default is `false`.

Equations

\[\begin{aligned} \boldsymbol{\Delta t_n} &= \Delta \hat{t} \hat{\sigma} (W_1 y_{n-1} + V_1 u_n + b_1) \\ \overline{\boldsymbol{\Delta t_n}} &= \Delta \hat{t} \hat{\sigma} (W_2 y_{n-1} + V_2 u_n + b_2) \\ z_n &= (1 - \boldsymbol{\Delta t_n}) \odot z_{n-1} + \boldsymbol{\Delta t_n} \odot \sigma (W_z y_{n-1} + V_z u_n + b_z) \\ y_n &= (1 - \boldsymbol{\Delta t_n}) \odot y_{n-1} + \boldsymbol{\Delta t_n} \odot \sigma (W_y z_n + V_y u_n + b_y) \end{aligned}\]

Forward

    lem(inp, (state, zstate))
    lem(inp)

Arguments

- `inp`: The input to the LEM. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `(state, zstate)`: A tuple containing the hidden and cell states of the LEM. They should be vectors of size `hidden_size` or matrices of size `hidden_size x batch_size`. If not provided, they are assumed to be vectors of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
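A short sketch showing the optional `dt` positional argument; `dt = 0.5` and the sizes are arbitrary illustration values:

```julia
using RecurrentLayers

lem = LEM(4 => 8, 0.5)             # dt = 0.5 instead of the default 1.0
inp = rand(Float32, 4, 10, 3)
out = lem(inp)                     # (state, zstate) default to zeros
```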
RecurrentLayers.coRNN — Type

    coRNN(input_size => hidden_size, [dt];
        gamma = 0.0, epsilon = 0.0,
        return_state = false, init_kernel = glorot_uniform,
        init_recurrent_kernel = glorot_uniform, bias = true)

Coupled oscillatory recurrent neural network. See coRNNCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.
- `dt`: time step. Default is 1.0.

Keyword arguments

- `gamma`: damping for state. Default is 0.0.
- `epsilon`: damping for candidate state. Default is 0.0.
- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.
- `return_state`: option to return the last state together with the output. Default is `false`.

Equations

\[\begin{aligned} \mathbf{y}_n &= y_{n-1} + \Delta t \mathbf{z}_n, \\ \mathbf{z}_n &= z_{n-1} + \Delta t \sigma \left( \mathbf{W} y_{n-1} + \mathcal{W} z_{n-1} + \mathbf{V} u_n + \mathbf{b} \right) - \Delta t \gamma y_{n-1} - \Delta t \epsilon \mathbf{z}_n, \end{aligned}\]

Forward

    cornn(inp, (state, zstate))
    cornn(inp)

Arguments

- `inp`: The input to the cornn. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `(state, zstate)`: A tuple containing the hidden and cell states of the coRNN. They should be vectors of size `hidden_size` or matrices of size `hidden_size x batch_size`. If not provided, they are assumed to be vectors of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
RecurrentLayers.AntisymmetricRNN — Type

    AntisymmetricRNN(input_size, hidden_size, [activation];
        return_state = false, kwargs...)

Antisymmetric recurrent neural network. See AntisymmetricRNNCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.
- `activation`: activation function. Default is `tanh`.

Keyword arguments

- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.
- `epsilon`: step size. Default is 1.0.
- `gamma`: strength of diffusion. Default is 0.0.
- `return_state`: option to return the last state together with the output. Default is `false`.

Equations

\[h_t = h_{t-1} + \epsilon \tanh \left( (W_h - W_h^T - \gamma I) h_{t-1} + V_h x_t + b_h \right),\]

Forward

    asymrnn(inp, state)
    asymrnn(inp)

Arguments

- `inp`: The input to the asymrnn. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `state`: The hidden state of the AntisymmetricRNN. If given, it is a vector of size `hidden_size` or a matrix of size `hidden_size x batch_size`. If not provided, it is assumed to be a vector of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
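A minimal sketch with the stability-related keywords described above, assuming the pair constructor used throughout this page; the `epsilon` and `gamma` values are arbitrary illustration choices:

```julia
using Flux, RecurrentLayers

asymrnn = AntisymmetricRNN(4 => 8, tanh; epsilon = 0.1, gamma = 0.01)
inp = rand(Float32, 4, 10, 3)
out = asymrnn(inp)                 # 8 x 10 x 3
```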
RecurrentLayers.GatedAntisymmetricRNN — Type

    GatedAntisymmetricRNN(input_size, hidden_size;
        return_state = false, kwargs...)

Antisymmetric recurrent neural network with gating. See GatedAntisymmetricRNNCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.

Keyword arguments

- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.
- `epsilon`: step size. Default is 1.0.
- `gamma`: strength of diffusion. Default is 0.0.
- `return_state`: option to return the last state together with the output. Default is `false`.

Equations

\[\begin{aligned} z_t &= \sigma \left( (W_h - W_h^T - \gamma I) h_{t-1} + V_z x_t + b_z \right), \\ h_t &= h_{t-1} + \epsilon z_t \odot \tanh \left( (W_h - W_h^T - \gamma I) h_{t-1} + V_h x_t + b_h \right). \end{aligned}\]

Forward

    asymrnn(inp, state)
    asymrnn(inp)

Arguments

- `inp`: The input to the asymrnn. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `state`: The hidden state of the GatedAntisymmetricRNN. If given, it is a vector of size `hidden_size` or a matrix of size `hidden_size x batch_size`. If not provided, it is assumed to be a vector of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
RecurrentLayers.JANET — Type

    JANET(input_size => hidden_size;
        return_state = false, kwargs...)

Just another network. See JANETCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.

Keyword arguments

- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.
- `return_state`: option to return the last state together with the output. Default is `false`.
- `beta_value`: control over the input data flow. Default is 1.0.

Equations

\[\begin{aligned} \mathbf{s}_t &= \mathbf{U}_f \mathbf{h}_{t-1} + \mathbf{W}_f \mathbf{x}_t + \mathbf{b}_f \\ \tilde{\mathbf{c}}_t &= \tanh (\mathbf{U}_c \mathbf{h}_{t-1} + \mathbf{W}_c \mathbf{x}_t + \mathbf{b}_c) \\ \mathbf{c}_t &= \sigma(\mathbf{s}_t) \odot \mathbf{c}_{t-1} + (1 - \sigma (\mathbf{s}_t - \beta)) \odot \tilde{\mathbf{c}}_t \\ \mathbf{h}_t &= \mathbf{c}_t. \end{aligned}\]

Forward

    janet(inp, (state, cstate))
    janet(inp)

Arguments

- `inp`: The input to the janet. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `(state, cstate)`: A tuple containing the hidden and cell states of the JANET. They should be vectors of size `hidden_size` or matrices of size `hidden_size x batch_size`. If not provided, they are assumed to be vectors of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
RecurrentLayers.CFN — Type

    CFN(input_size => hidden_size;
        return_state = false, kwargs...)

Chaos free network unit. See CFNCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.

Keyword arguments

- `return_state`: option to return the last state together with the output. Default is `false`.
- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.

Equations

\[\begin{aligned} h_t &= \theta_t \odot \tanh(h_{t-1}) + \eta_t \odot \tanh(W x_t), \\ \theta_t &:= \sigma (U_\theta h_{t-1} + V_\theta x_t + b_\theta), \\ \eta_t &:= \sigma (U_\eta h_{t-1} + V_\eta x_t + b_\eta). \end{aligned}\]

Forward

    cfn(inp, state)
    cfn(inp)

Arguments

- `inp`: The input to the cfn. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `state`: The hidden state of the CFN. If given, it is a vector of size `hidden_size` or a matrix of size `hidden_size x batch_size`. If not provided, it is assumed to be a vector of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
RecurrentLayers.TRNN — Type

    TRNN(input_size => hidden_size, [activation];
        return_state = false, kwargs...)

Strongly typed recurrent unit. See TRNNCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.
- `activation`: activation function. Default is `tanh`.

Keyword arguments

- `return_state`: option to return the last state together with the output. Default is `false`.
- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.

Equations

\[\begin{aligned} z_t &= \mathbf{W} x_t \\ f_t &= \sigma (\mathbf{V} x_t + b) \\ h_t &= f_t \odot h_{t-1} + (1 - f_t) \odot z_t \end{aligned}\]

Forward

    trnn(inp, state)
    trnn(inp)

Arguments

- `inp`: The input to the trnn. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `state`: The hidden state of the TRNN. If given, it is a vector of size `hidden_size` or a matrix of size `hidden_size x batch_size`. If not provided, it is assumed to be a vector of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
RecurrentLayers.TGRU — Type

    TGRU(input_size => hidden_size;
        return_state = false, kwargs...)

Strongly typed recurrent gated unit. See TGRUCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.

Keyword arguments

- `return_state`: option to return the last state together with the output. Default is `false`.
- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.

Equations

\[\begin{aligned} z_t &= \mathbf{V}_z \mathbf{x}_{t-1} + \mathbf{W}_z \mathbf{x}_t + \mathbf{b}_z \\ f_t &= \sigma (\mathbf{V}_f \mathbf{x}_{t-1} + \mathbf{W}_f \mathbf{x}_t + \mathbf{b}_f) \\ o_t &= \tau (\mathbf{V}_o \mathbf{x}_{t-1} + \mathbf{W}_o \mathbf{x}_t + \mathbf{b}_o) \\ h_t &= f_t \odot h_{t-1} + z_t \odot o_t \end{aligned}\]

Forward

    tgru(inp, state)
    tgru(inp)

Arguments

- `inp`: The input to the tgru. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `state`: The hidden state of the TGRU. If given, it is a vector of size `hidden_size` or a matrix of size `hidden_size x batch_size`. If not provided, it is assumed to be a vector of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
RecurrentLayers.TLSTM — Type

    TLSTM(input_size => hidden_size;
        return_state = false, kwargs...)

Strongly typed long short term memory. See TLSTMCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.

Keyword arguments

- `return_state`: option to return the last state together with the output. Default is `false`.
- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.

Equations

\[\begin{aligned} z_t &= \mathbf{V}_z \mathbf{x}_{t-1} + \mathbf{W}_z \mathbf{x}_t + \mathbf{b}_z \\ f_t &= \sigma (\mathbf{V}_f \mathbf{x}_{t-1} + \mathbf{W}_f \mathbf{x}_t + \mathbf{b}_f) \\ o_t &= \tau (\mathbf{V}_o \mathbf{x}_{t-1} + \mathbf{W}_o \mathbf{x}_t + \mathbf{b}_o) \\ c_t &= f_t \odot c_{t-1} + (1 - f_t) \odot z_t \\ h_t &= c_t \odot o_t \end{aligned}\]

Forward

    tlstm(inp, state)
    tlstm(inp)

Arguments

- `inp`: The input to the tlstm. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `state`: The hidden state of the TLSTM. If given, it is a vector of size `hidden_size` or a matrix of size `hidden_size x batch_size`. If not provided, it is assumed to be a vector of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.
RecurrentLayers.UnICORNN — Type

    UnICORNN(input_size => hidden_size, [dt];
        alpha = 0.0, return_state = false, init_kernel = glorot_uniform,
        init_recurrent_kernel = glorot_uniform, bias = true)

Undamped independent controlled oscillatory recurrent neural network. See UnICORNNCell for a layer that processes a single sequence.

Arguments

- `input_size => hidden_size`: input and inner dimension of the layer.
- `dt`: time step. Default is 1.0.

Keyword arguments

- `alpha`: control parameter. Default is 0.0.
- `init_kernel`: initializer for the input to hidden weights. Default is `glorot_uniform`.
- `init_recurrent_kernel`: initializer for the hidden to hidden weights. Default is `glorot_uniform`.
- `bias`: include a bias or not. Default is `true`.
- `return_state`: option to return the last state together with the output. Default is `false`.

Equations

\[\begin{aligned} y_n &= y_{n-1} + \Delta t \, \hat{\sigma}(c) \odot z_n, \\ z_n &= z_{n-1} - \Delta t \, \hat{\sigma}(c) \odot \left[ \sigma \left( w \odot y_{n-1} + V y_{n-1} + b \right) + \alpha y_{n-1} \right]. \end{aligned}\]

Forward

    unicornn(inp, (state, zstate))
    unicornn(inp)

Arguments

- `inp`: The input to the unicornn. It should be a matrix of size `input_size x len` or an array of size `input_size x len x batch_size`.
- `(state, zstate)`: A tuple containing the hidden and cell states of the UnICORNN. They should be vectors of size `hidden_size` or matrices of size `hidden_size x batch_size`. If not provided, they are assumed to be vectors of zeros, initialized by `Flux.initialstates`.

Returns

- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`. When `return_state = true` it returns a tuple of the hidden states `new_states` and the last state of the iteration.