Cells
RecurrentLayers.RANCell — Type

RANCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Recurrent Additive Network cell. See RAN for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} \tilde{c}_t &= W_c x_t, \\ i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i), \\ f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\ c_t &= i_t \odot \tilde{c}_t + f_t \odot c_{t-1}, \\ h_t &= g(c_t) \end{aligned}\]
Forward
rancell(inp, (state, cstate))
rancell(inp)
Arguments
inp: The input to the rancell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
(state, cstate): A tuple containing the hidden and cell states of the RANCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
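
Example

A minimal usage sketch (the dimensions and random data here are illustrative; it assumes Flux and RecurrentLayers are loaded):

using Flux, RecurrentLayers

rancell = RANCell(3 => 5)
inp = rand(Float32, 3, 4)                     # input_size x batch_size
output, (state, cstate) = rancell(inp)        # zero states via Flux.initialstates
output, (state, cstate) = rancell(inp, (state, cstate))
size(output)                                  # (5, 4)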

RecurrentLayers.IndRNNCell — Type

IndRNNCell(input_size => hidden_size, [activation];
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Independently recurrent cell. See IndRNN for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.
activation: activation function. Default is tanh.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
Equations
\[\mathbf{h}_{t} = \sigma(\mathbf{W} \mathbf{x}_t + \mathbf{u} \odot \mathbf{h}_{t-1} + \mathbf{b})\]
Forward
indrnncell(inp, state)
indrnncell(inp)
Arguments
inp: The input to the indrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
state: The hidden state of the IndRNNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
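
Example

A short sketch showing the optional activation argument (relu and the sizes are illustrative choices):

using Flux, RecurrentLayers

indrnncell = IndRNNCell(3 => 5, relu)
inp = rand(Float32, 3, 4)                # input_size x batch_size
output, state = indrnncell(inp)          # zero state via Flux.initialstates
output, state = indrnncell(inp, state)   # both returned elements equal new_state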

RecurrentLayers.LightRUCell — Type

LightRUCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Light recurrent unit. See LightRU for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} \tilde{h}_t &= \tanh(W_h x_t), \\ f_t &= \delta(W_f x_t + U_f h_{t-1} + b_f), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t. \end{aligned}\]
Forward
lightrucell(inp, state)
lightrucell(inp)
Arguments
inp: The input to the lightrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
state: The hidden state of the LightRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

RecurrentLayers.LiGRUCell — Type

LiGRUCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Light gated recurrent unit. The implementation does not include the batch normalization described in the original paper. See LiGRU for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1}), \\ \tilde{h}_t &= \text{ReLU}(W_h x_t + U_h h_{t-1}), \\ h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t \end{aligned}\]
Forward
ligrucell(inp, state)
ligrucell(inp)
Arguments
inp: The input to the ligrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
state: The hidden state of the LiGRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

RecurrentLayers.MGUCell — Type

MGUCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Minimal gated unit. See MGU for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\ \tilde{h}_t &= \tanh(W_h x_t + U_h (f_t \odot h_{t-1}) + b_h), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t \end{aligned}\]
Forward
mgucell(inp, state)
mgucell(inp)
Arguments
inp: The input to the mgucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
state: The hidden state of the MGUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

RecurrentLayers.NASCell — Type

NASCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Neural Architecture Search unit. See NAS for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} \text{First Layer Outputs:} & \\ o_1 &= \sigma(W_i^{(1)} x_t + W_h^{(1)} h_{t-1} + b^{(1)}), \\ o_2 &= \text{ReLU}(W_i^{(2)} x_t + W_h^{(2)} h_{t-1} + b^{(2)}), \\ o_3 &= \sigma(W_i^{(3)} x_t + W_h^{(3)} h_{t-1} + b^{(3)}), \\ o_4 &= \text{ReLU}(W_i^{(4)} x_t \cdot W_h^{(4)} h_{t-1}), \\ o_5 &= \tanh(W_i^{(5)} x_t + W_h^{(5)} h_{t-1} + b^{(5)}), \\ o_6 &= \sigma(W_i^{(6)} x_t + W_h^{(6)} h_{t-1} + b^{(6)}), \\ o_7 &= \tanh(W_i^{(7)} x_t + W_h^{(7)} h_{t-1} + b^{(7)}), \\ o_8 &= \sigma(W_i^{(8)} x_t + W_h^{(8)} h_{t-1} + b^{(8)}). \\ \text{Second Layer Computations:} & \\ l_1 &= \tanh(o_1 \cdot o_2) \\ l_2 &= \tanh(o_3 + o_4) \\ l_3 &= \tanh(o_5 \cdot o_6) \\ l_4 &= \sigma(o_7 + o_8) \\ \text{Inject Cell State:} & \\ l_1 &= \tanh(l_1 + c_{\text{state}}) \\ \text{Final Layer Computations:} & \\ c_{\text{new}} &= l_1 \cdot l_2 \\ l_5 &= \tanh(l_3 + l_4) \\ h_{\text{new}} &= \tanh(c_{\text{new}} \cdot l_5) \end{aligned}\]
Forward
nascell(inp, (state, cstate))
nascell(inp)
Arguments
inp: The input to the nascell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
(state, cstate): A tuple containing the hidden and cell states of the NASCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.

RecurrentLayers.RHNCell — Type

RHNCell(input_size => hidden_size, [depth];
    couple_carry = true,
    cell_kwargs...)

Recurrent highway network. See RHNCellUnit for the unit component of this layer. See RHN for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.
depth: depth of the recurrence. Default is 3.

Keyword arguments

couple_carry: couples the carry gate and the transform gate. Default is true.
init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} s_{\ell}^{[t]} &= h_{\ell}^{[t]} \odot t_{\ell}^{[t]} + s_{\ell-1}^{[t]} \odot c_{\ell}^{[t]}, \\ \text{where} \\ h_{\ell}^{[t]} &= \tanh(W_h x^{[t]}\mathbb{I}_{\ell = 1} + U_{h_{\ell}} s_{\ell-1}^{[t]} + b_{h_{\ell}}), \\ t_{\ell}^{[t]} &= \sigma(W_t x^{[t]}\mathbb{I}_{\ell = 1} + U_{t_{\ell}} s_{\ell-1}^{[t]} + b_{t_{\ell}}), \\ c_{\ell}^{[t]} &= \sigma(W_c x^{[t]}\mathbb{I}_{\ell = 1} + U_{c_{\ell}} s_{\ell-1}^{[t]} + b_{c_{\ell}}) \end{aligned}\]
Forward
rhncell(inp, [state])
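
Example

A minimal construction sketch with an explicit depth (the sizes and depth are illustrative; the call mirrors the other cells, with the state defaulting to zeros via Flux.initialstates):

using Flux, RecurrentLayers

rhncell = RHNCell(3 => 5, 4)     # recurrence depth of 4 instead of the default 3
inp = rand(Float32, 3, 4)        # input_size x batch_size
output, state = rhncell(inp)
output, state = rhncell(inp, state)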

RecurrentLayers.RHNCellUnit — Type

RHNCellUnit(input_size => hidden_size;
    init_kernel = glorot_uniform,
    bias = true)

The unit component of an RHNCell.

RecurrentLayers.MUT1Cell — Type

MUT1Cell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Mutated unit 1 cell. See MUT1 for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} z &= \sigma(W_z x_t + b_z), \\ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + \tanh(W_h x_t) + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]
Forward
mutcell(inp, state)
mutcell(inp)
Arguments
inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
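
Example

A minimal sketch with an unbatched (vector) input; the sizes are illustrative:

using Flux, RecurrentLayers

mutcell = MUT1Cell(3 => 5)
inp = rand(Float32, 3)        # a single input vector of size input_size
output, state = mutcell(inp)  # zero state via Flux.initialstates
size(output)                  # (5,)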

RecurrentLayers.MUT2Cell — Type

MUT2Cell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Mutated unit 2 cell. See MUT2 for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} z &= \sigma(W_z x_t + U_z h_t + b_z), \\ r &= \sigma(x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]
Forward
mutcell(inp, state)
mutcell(inp)
Arguments
inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

RecurrentLayers.MUT3Cell — Type

MUT3Cell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Mutated unit 3 cell. See MUT3 for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} z &= \sigma(W_z x_t + U_z \tanh(h_t) + b_z), \\ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]
Forward
mutcell(inp, state)
mutcell(inp)
Arguments
inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

RecurrentLayers.SCRNCell — Type

SCRNCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true, alpha = 0.0)

Structurally constrained recurrent unit. See SCRN for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
alpha: structural constraint. Default is 0.0.
Equations
\[\begin{aligned} s_t &= (1 - \alpha) W_s x_t + \alpha s_{t-1}, \\ h_t &= \sigma(W_h s_t + U_h h_{t-1} + b_h), \\ y_t &= f(U_y h_t + W_y s_t) \end{aligned}\]
Forward
scrncell(inp, (state, cstate))
scrncell(inp)
Arguments
inp: The input to the scrncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
(state, cstate): A tuple containing the hidden and cell states of the SCRNCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.

RecurrentLayers.PeepholeLSTMCell — Type

PeepholeLSTMCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Peephole long short-term memory cell. See PeepholeLSTM for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} f_t &= \sigma_g(W_f x_t + U_f c_{t-1} + b_f), \\ i_t &= \sigma_g(W_i x_t + U_i c_{t-1} + b_i), \\ o_t &= \sigma_g(W_o x_t + U_o c_{t-1} + b_o), \\ c_t &= f_t \odot c_{t-1} + i_t \odot \sigma_c(W_c x_t + b_c), \\ h_t &= o_t \odot \sigma_h(c_t). \end{aligned}\]
Forward
peepholelstmcell(inp, (state, cstate))
peepholelstmcell(inp)
Arguments
inp: The input to the peepholelstmcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
(state, cstate): A tuple containing the hidden and cell states of the PeepholeLSTMCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
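
Example

A sketch of manually unrolling the cell over a sequence of timesteps (the sequence length and sizes are illustrative); the PeepholeLSTM layer performs this iteration for you:

using Flux, RecurrentLayers

function unroll(cell, seq)
    state = Flux.initialstates(cell)    # zero (state, cstate) for this cell
    return map(seq) do inp
        out, state = cell(inp, state)   # out is the new hidden state
        out
    end
end

cell = PeepholeLSTMCell(3 => 5)
seq = [rand(Float32, 3, 4) for _ in 1:10]   # 10 steps of input_size x batch_size
outputs = unroll(cell, seq)                 # 10 outputs of size 5 x 4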

RecurrentLayers.FastRNNCell — Type

FastRNNCell(input_size => hidden_size, [activation];
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    init_alpha = 3.0, init_beta = -3.0,
    bias = true)

Fast recurrent neural network cell. See FastRNN for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.
activation: the activation function. Default is tanh_fast.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
init_alpha: initializer for the alpha parameter. Default is 3.0.
init_beta: initializer for the beta parameter. Default is -3.0.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} \tilde{h}_t &= \sigma(W_h x_t + U_h h_{t-1} + b), \\ h_t &= \alpha \tilde{h}_t + \beta h_{t-1} \end{aligned}\]
Forward
fastrnncell(inp, state)
fastrnncell(inp)
Arguments
inp: The input to the fastrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
state: The hidden state of the FastRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
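
Example

A sketch showing the activation argument together with the alpha and beta initializers (all values here are illustrative choices):

using Flux, RecurrentLayers

fastrnncell = FastRNNCell(3 => 5, tanh_fast; init_alpha = 2.0, init_beta = -2.0)
inp = rand(Float32, 3, 4)           # input_size x batch_size
output, state = fastrnncell(inp)    # zero state via Flux.initialstates
output, state = fastrnncell(inp, state)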

RecurrentLayers.FastGRNNCell — Type

FastGRNNCell(input_size => hidden_size, [activation];
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    init_zeta = 1.0, init_nu = -4.0,
    bias = true)

Fast gated recurrent neural network cell. See FastGRNN for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.
activation: the activation function. Default is tanh_fast.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
init_zeta: initializer for the zeta parameter. Default is 1.0.
init_nu: initializer for the nu parameter. Default is -4.0.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} z_t &= \sigma(W x_t + U h_{t-1} + b_z), \\ \tilde{h}_t &= \tanh(W x_t + U h_{t-1} + b_h), \\ h_t &= \big((\zeta (1 - z_t) + \nu) \odot \tilde{h}_t\big) + z_t \odot h_{t-1} \end{aligned}\]
Forward
fastgrnncell(inp, state)
fastgrnncell(inp)
Arguments
inp: The input to the fastgrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
state: The hidden state of the FastGRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

RecurrentLayers.FSRNNCell — Type

FSRNNCell(input_size => hidden_size,
    fast_cells, slow_cell)

Fast slow recurrent neural network cell. See FSRNN for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.
fast_cells: a vector of the fast cells. Must be of length 2 or more.
slow_cell: the chosen slow cell.
Equations
\[\begin{aligned} h_t^{F_1} &= f^{F_1}\left(h_{t-1}^{F_k}, x_t\right) \\ h_t^S &= f^S\left(h_{t-1}^S, h_t^{F_1}\right) \\ h_t^{F_2} &= f^{F_2}\left(h_t^{F_1}, h_t^S\right) \\ h_t^{F_i} &= f^{F_i}\left(h_t^{F_{i-1}}\right) \quad \text{for } 3 \leq i \leq k \end{aligned}\]
Forward
fsrnncell(inp, (fast_state, slow_state))
fsrnncell(inp)
Arguments
inp: The input to the fsrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
(fast_state, slow_state): A tuple containing the fast and slow states of the FSRNNCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where output = new_state is the new hidden state and state = (fast_state, slow_state) is the new fast and slow state. They are tensors of size hidden_size or hidden_size x batch_size.
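
Example

A construction sketch, assuming (as in the package examples) that fast_cells and slow_cell are given as cell constructors such as Flux's RNNCell and LSTMCell; the sizes are illustrative:

using Flux, RecurrentLayers

fsrnncell = FSRNNCell(3 => 5, [RNNCell, LSTMCell], LSTMCell)  # two fast cells, one slow cell
inp = rand(Float32, 3, 4)         # input_size x batch_size
output, state = fsrnncell(inp)    # state == (fast_state, slow_state)
output, state = fsrnncell(inp, state)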

RecurrentLayers.LEMCell — Type

LEMCell(input_size => hidden_size, [dt];
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Long expressive memory unit. See LEM for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.
dt: timestep. Default is 1.0.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} \boldsymbol{\Delta t_n} &= \Delta \hat{t} \hat{\sigma} (W_1 y_{n-1} + V_1 u_n + b_1) \\ \overline{\boldsymbol{\Delta t_n}} &= \Delta \hat{t} \hat{\sigma} (W_2 y_{n-1} + V_2 u_n + b_2) \\ z_n &= (1 - \boldsymbol{\Delta t_n}) \odot z_{n-1} + \boldsymbol{\Delta t_n} \odot \sigma (W_z y_{n-1} + V_z u_n + b_z) \\ y_n &= (1 - \boldsymbol{\Delta t_n}) \odot y_{n-1} + \boldsymbol{\Delta t_n} \odot \sigma (W_y z_n + V_y u_n + b_y) \end{aligned}\]
Forward
lemcell(inp, (state, cstate))
lemcell(inp)
Arguments
inp: The input to the lemcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
(state, cstate): A tuple containing the hidden and cell states of the LEMCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
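
Example

A minimal sketch with an explicit timestep (the dt value and sizes are illustrative):

using Flux, RecurrentLayers

lemcell = LEMCell(3 => 5, 0.5)    # dt = 0.5 instead of the default 1.0
inp = rand(Float32, 3, 4)         # input_size x batch_size
output, (state, cstate) = lemcell(inp)
output, (state, cstate) = lemcell(inp, (state, cstate))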

RecurrentLayers.coRNNCell — Type

coRNNCell(input_size => hidden_size, [dt];
    gamma = 0.0, epsilon = 0.0,
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Coupled oscillatory recurrent neural unit. See coRNN for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.
dt: time step. Default is 1.0.

Keyword arguments

gamma: damping for state. Default is 0.0.
epsilon: damping for candidate state. Default is 0.0.
init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} \mathbf{y}_n &= y_{n-1} + \Delta t \mathbf{z}_n, \\ \mathbf{z}_n &= z_{n-1} + \Delta t \sigma \left( \mathbf{W} y_{n-1} + \mathcal{W} z_{n-1} + \mathbf{V} u_n + \mathbf{b} \right) - \Delta t \gamma y_{n-1} - \Delta t \epsilon \mathbf{z}_n, \end{aligned}\]
Forward
cornncell(inp, (state, cstate))
cornncell(inp)
Arguments
inp: The input to the cornncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
(state, cstate): A tuple containing the hidden and cell states of the coRNNCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.

RecurrentLayers.AntisymmetricRNNCell — Type

AntisymmetricRNNCell(input_size => hidden_size, [activation];
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true, epsilon = 1.0, gamma = 0.0)

Antisymmetric recurrent cell. See AntisymmetricRNN for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.
activation: activation function. Default is tanh.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
epsilon: step size. Default is 1.0.
gamma: strength of diffusion. Default is 0.0.
Equations
\[h_t = h_{t-1} + \epsilon \tanh \left( (W_h - W_h^T - \gamma I) h_{t-1} + V_h x_t + b_h \right),\]
Forward
asymrnncell(inp, state)
asymrnncell(inp)
Arguments
inp: The input to the asymrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
state: The hidden state of the AntisymmetricRNNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
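
Example

A sketch showing the stability-related keyword arguments (the epsilon and gamma values here are illustrative):

using Flux, RecurrentLayers

asymcell = AntisymmetricRNNCell(3 => 5, tanh; epsilon = 0.1, gamma = 0.01)
inp = rand(Float32, 3, 4)        # input_size x batch_size
output, state = asymcell(inp)    # zero state via Flux.initialstates
output, state = asymcell(inp, state)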

RecurrentLayers.GatedAntisymmetricRNNCell — Type

GatedAntisymmetricRNNCell(input_size => hidden_size, [activation];
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true, epsilon = 1.0, gamma = 0.0)

Antisymmetric recurrent cell with gating. See GatedAntisymmetricRNN for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.
activation: activation function. Default is tanh.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
epsilon: step size. Default is 1.0.
gamma: strength of diffusion. Default is 0.0.
Equations
\[\begin{aligned} z_t &= \sigma \left( (W_h - W_h^T - \gamma I) h_{t-1} + V_z x_t + b_z \right), \\ h_t &= h_{t-1} + \epsilon z_t \odot \tanh \left( (W_h - W_h^T - \gamma I) h_{t-1} + V_h x_t + b_h \right). \end{aligned}\]
Forward
asymrnncell(inp, state)
asymrnncell(inp)
Arguments
inp: The input to the asymrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
state: The hidden state of the GatedAntisymmetricRNNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

RecurrentLayers.JANETCell — Type

JANETCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true, beta_value = 1.0)

Just another network unit. See JANET for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
beta_value: control over the input data flow. Default is 1.0.
Equations
\[\begin{aligned} \mathbf{s}_t &= \mathbf{U}_f \mathbf{h}_{t-1} + \mathbf{W}_f \mathbf{x}_t + \mathbf{b}_f \\ \tilde{\mathbf{c}}_t &= \tanh (\mathbf{U}_c \mathbf{h}_{t-1} + \mathbf{W}_c \mathbf{x}_t + \mathbf{b}_c) \\ \mathbf{c}_t &= \sigma(\mathbf{s}_t) \odot \mathbf{c}_{t-1} + (1 - \sigma (\mathbf{s}_t - \beta)) \odot \tilde{\mathbf{c}}_t \\ \mathbf{h}_t &= \mathbf{c}_t. \end{aligned}\]
Forward
janetcell(inp, (state, cstate))
janetcell(inp)
Arguments
inp: The input to the janetcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
(state, cstate): A tuple containing the hidden and cell states of the JANETCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.

RecurrentLayers.CFNCell — Type

CFNCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Chaos free network unit. See CFN for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} h_t &= \theta_t \odot \tanh(h_{t-1}) + \eta_t \odot \tanh(W x_t), \\ \theta_t &:= \sigma (U_\theta h_{t-1} + V_\theta x_t + b_\theta), \\ \eta_t &:= \sigma (U_\eta h_{t-1} + V_\eta x_t + b_\eta). \end{aligned}\]
Forward
cfncell(inp, state)
cfncell(inp)
Arguments
inp: The input to the cfncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
state: The hidden state of the CFNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

RecurrentLayers.TRNNCell — Type

TRNNCell(input_size => hidden_size, [activation];
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Strongly typed recurrent unit. See TRNN for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.
activation: activation function. Default is tanh.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} z_t &= \mathbf{W} x_t \\ f_t &= \sigma (\mathbf{V} x_t + b) \\ h_t &= f_t \odot h_{t-1} + (1 - f_t) \odot z_t \end{aligned}\]
Forward
trnncell(inp, state)
trnncell(inp)
Arguments
inp: The input to the trnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
state: The hidden state of the TRNNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

RecurrentLayers.TGRUCell — Type

TGRUCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Strongly typed gated recurrent unit. See TGRU for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} z_t &= \mathbf{V}_z \mathbf{x}_{t-1} + \mathbf{W}_z \mathbf{x}_t + \mathbf{b}_z \\ f_t &= \sigma (\mathbf{V}_f \mathbf{x}_{t-1} + \mathbf{W}_f \mathbf{x}_t + \mathbf{b}_f) \\ o_t &= \tau (\mathbf{V}_o \mathbf{x}_{t-1} + \mathbf{W}_o \mathbf{x}_t + \mathbf{b}_o) \\ h_t &= f_t \odot h_{t-1} + z_t \odot o_t \end{aligned}\]
Forward
tgrucell(inp, state)
tgrucell(inp)
Arguments
inp: The input to the tgrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
state: The hidden state of the TGRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, inp) is the new hidden state together with the current input. They are tensors of size hidden_size or hidden_size x batch_size.
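
Example

A minimal sketch (sizes are illustrative); note that the returned state carries the current input along with the hidden state, since the update uses x_{t-1}, so it can be fed back directly:

using Flux, RecurrentLayers

tgrucell = TGRUCell(3 => 5)
inp = rand(Float32, 3, 4)            # input_size x batch_size
output, state = tgrucell(inp)        # state == (new_state, inp)
output, state = tgrucell(inp, state)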

RecurrentLayers.TLSTMCell — Type

TLSTMCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Strongly typed long short-term memory cell. See TLSTM for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} z_t &= \mathbf{V}_z \mathbf{x}_{t-1} + \mathbf{W}_z \mathbf{x}_t + \mathbf{b}_z \\ f_t &= \sigma (\mathbf{V}_f \mathbf{x}_{t-1} + \mathbf{W}_f \mathbf{x}_t + \mathbf{b}_f) \\ o_t &= \tau (\mathbf{V}_o \mathbf{x}_{t-1} + \mathbf{W}_o \mathbf{x}_t + \mathbf{b}_o) \\ c_t &= f_t \odot c_{t-1} + (1 - f_t) \odot z_t \\ h_t &= c_t \odot o_t \end{aligned}\]
Forward
tlstmcell(inp, state)
tlstmcell(inp)
Arguments
inp: The input to the tlstmcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
state: The hidden state of the TLSTMCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate, inp) is the new hidden and cell state, together with the current input. They are tensors of size hidden_size or hidden_size x batch_size.
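
Example

A minimal sketch (sizes are illustrative); as with TGRUCell, the returned state bundles the hidden state, cell state, and current input:

using Flux, RecurrentLayers

tlstmcell = TLSTMCell(3 => 5)
inp = rand(Float32, 3, 4)             # input_size x batch_size
output, state = tlstmcell(inp)        # state == (new_state, new_cstate, inp)
output, state = tlstmcell(inp, state)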

RecurrentLayers.UnICORNNCell — Type

UnICORNNCell(input_size => hidden_size, [dt];
    alpha = 0.0, init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform, bias = true)

Undamped independent controlled oscillatory recurrent neural unit. See UnICORNN for a layer that processes entire sequences.

Arguments

input_size => hidden_size: input and inner dimension of the layer.
dt: time step. Default is 1.0.

Keyword arguments

alpha: control parameter. Default is 0.0.
init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
bias: include a bias or not. Default is true.
Equations
\[\begin{aligned} y_n &= y_{n-1} + \Delta t \, \hat{\sigma}(c) \odot z_n, \\ z_n &= z_{n-1} - \Delta t \, \hat{\sigma}(c) \odot \left[ \sigma \left( w \odot y_{n-1} + V y_{n-1} + b \right) + \alpha y_{n-1} \right]. \end{aligned}\]
Forward
unicornncell(inp, (state, cstate))
unicornncell(inp)
Arguments
inp: The input to the unicornncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
(state, cstate): A tuple containing the hidden and cell states of the UnICORNNCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

- A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.