Layers

RecurrentLayers.RAN - Type

RAN(input_size => hidden_size;
    return_state = false, kwargs...)

Recurrent Additive Network. See RANCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.
  • return_state: Option to return the last state together with the output. Default is false.

Equations

\[\begin{aligned} \tilde{c}_t &= W_c x_t, \\ i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i), \\ f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\ c_t &= i_t \odot \tilde{c}_t + f_t \odot c_{t-1}, \\ h_t &= g(c_t) \end{aligned}\]

Forward

ran(inp, (state, cstate))
ran(inp)

Arguments

  • inp: The input to the ran. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the RAN. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
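
Example

A minimal usage sketch; the 4 => 6 sizes, sequence length, and batch size are illustrative, not part of the API:

julia> using RecurrentLayers

julia> ran = RAN(4 => 6);

julia> inp = rand(Float32, 4, 10);  # input_size x len

julia> size(ran(inp))
(6, 10)

julia> size(ran(rand(Float32, 4, 10, 3)))  # batched input
(6, 10, 3)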
source
RecurrentLayers.IndRNN - Type

IndRNN(input_size => hidden_size, [activation];
    return_state = false, kwargs...)

Independently recurrent network. See IndRNNCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • activation: activation function. Default is tanh.

Keyword arguments

  • return_state: Option to return the last state together with the output. Default is false.
  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\mathbf{h}_{t} = \sigma(\mathbf{W} \mathbf{x}_t + \mathbf{u} \odot \mathbf{h}_{t-1} + \mathbf{b})\]

Forward

indrnn(inp, state)
indrnn(inp)

Arguments

  • inp: The input to the indrnn. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the IndRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
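
Example

A minimal usage sketch; the sizes are illustrative, and relu is Flux's standard activation passed as the optional activation argument:

julia> using Flux, RecurrentLayers

julia> indrnn = IndRNN(4 => 6, relu);

julia> inp = rand(Float32, 4, 10);  # input_size x len

julia> size(indrnn(inp))
(6, 10)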
source
RecurrentLayers.LightRU - Type

LightRU(input_size => hidden_size;
    return_state = false, kwargs...)

Light recurrent unit network. See LightRUCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • return_state: Option to return the last state together with the output. Default is false.
  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} \tilde{h}_t &= \tanh(W_h x_t), \\ f_t &= \delta(W_f x_t + U_f h_{t-1} + b_f), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t. \end{aligned}\]

Forward

lightru(inp, state)
lightru(inp)

Arguments

  • inp: The input to the lightru. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the LightRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
source
RecurrentLayers.LiGRU - Type

LiGRU(input_size => hidden_size;
    return_state = false, kwargs...)

Light gated recurrent network. The implementation does not include the batch normalization described in the original paper. See LiGRUCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • return_state: Option to return the last state together with the output. Default is false.
  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1}), \\ \tilde{h}_t &= \text{ReLU}(W_h x_t + U_h h_{t-1}), \\ h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t \end{aligned}\]

Forward

ligru(inp, state)
ligru(inp)

Arguments

  • inp: The input to the ligru. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the LiGRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
source
RecurrentLayers.MGU - Type

MGU(input_size => hidden_size;
    return_state = false, kwargs...)

Minimal gated unit network. See MGUCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • return_state: Option to return the last state together with the output. Default is false.
  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\ \tilde{h}_t &= \tanh(W_h x_t + U_h (f_t \odot h_{t-1}) + b_h), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t \end{aligned}\]

Forward

mgu(inp, state)
mgu(inp)

Arguments

  • inp: The input to the mgu. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the MGU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
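
Example

A sketch of the return_state = true variant; the sizes are illustrative:

julia> using RecurrentLayers

julia> mgu = MGU(4 => 6; return_state = true);

julia> inp = rand(Float32, 4, 10);  # input_size x len

julia> output, last_state = mgu(inp);  # last_state is the state of the final step

julia> size(output)
(6, 10)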
source
RecurrentLayers.NAS - Type

NAS(input_size => hidden_size;
    return_state = false,
    kwargs...)

Neural Architecture Search unit. See NASCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.
  • return_state: Option to return the last state together with the output. Default is false.

Equations

\[\begin{aligned} \text{First Layer Outputs:} & \\ o_1 &= \sigma(W_i^{(1)} x_t + W_h^{(1)} h_{t-1} + b^{(1)}), \\ o_2 &= \text{ReLU}(W_i^{(2)} x_t + W_h^{(2)} h_{t-1} + b^{(2)}), \\ o_3 &= \sigma(W_i^{(3)} x_t + W_h^{(3)} h_{t-1} + b^{(3)}), \\ o_4 &= \text{ReLU}(W_i^{(4)} x_t \cdot W_h^{(4)} h_{t-1}), \\ o_5 &= \tanh(W_i^{(5)} x_t + W_h^{(5)} h_{t-1} + b^{(5)}), \\ o_6 &= \sigma(W_i^{(6)} x_t + W_h^{(6)} h_{t-1} + b^{(6)}), \\ o_7 &= \tanh(W_i^{(7)} x_t + W_h^{(7)} h_{t-1} + b^{(7)}), \\ o_8 &= \sigma(W_i^{(8)} x_t + W_h^{(8)} h_{t-1} + b^{(8)}). \\ \text{Second Layer Computations:} & \\ l_1 &= \tanh(o_1 \cdot o_2) \\ l_2 &= \tanh(o_3 + o_4) \\ l_3 &= \tanh(o_5 \cdot o_6) \\ l_4 &= \sigma(o_7 + o_8) \\ \text{Inject Cell State:} & \\ l_1 &= \tanh(l_1 + c_{\text{state}}) \\ \text{Final Layer Computations:} & \\ c_{\text{new}} &= l_1 \cdot l_2 \\ l_5 &= \tanh(l_3 + l_4) \\ h_{\text{new}} &= \tanh(c_{\text{new}} \cdot l_5) \end{aligned}\]

Forward

nas(inp, (state, cstate))
nas(inp)

Arguments

  • inp: The input to the nas. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the NAS. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
source
RecurrentLayers.RHN - Type

RHN(input_size => hidden_size, [depth];
    return_state = false,
    kwargs...)

Recurrent highway network. See RHNCellUnit for the unit component of this layer. See RHNCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • depth: depth of the recurrence. Default is 3.

Keyword arguments

  • couple_carry: couples the carry gate and the transform gate. Default is true.
  • init_kernel: initializer for the input to hidden weights.
  • bias: include a bias or not. Default is true.
  • return_state: Option to return the last state together with the output. Default is false.

Equations

\[\begin{aligned} s_{\ell}^{[t]} &= h_{\ell}^{[t]} \odot t_{\ell}^{[t]} + s_{\ell-1}^{[t]} \odot c_{\ell}^{[t]}, \\ \text{where} \\ h_{\ell}^{[t]} &= \tanh(W_h x^{[t]}\mathbb{I}_{\ell = 1} + U_{h_{\ell}} s_{\ell-1}^{[t]} + b_{h_{\ell}}), \\ t_{\ell}^{[t]} &= \sigma(W_t x^{[t]}\mathbb{I}_{\ell = 1} + U_{t_{\ell}} s_{\ell-1}^{[t]} + b_{t_{\ell}}), \\ c_{\ell}^{[t]} &= \sigma(W_c x^{[t]}\mathbb{I}_{\ell = 1} + U_{c_{\ell}} s_{\ell-1}^{[t]} + b_{c_{\ell}}) \end{aligned}\]
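
Example

A construction sketch; the sizes and depth are illustrative, and the forward pass is assumed to follow the same sequence convention as the other layers on this page:

julia> using RecurrentLayers

julia> rhn = RHN(4 => 6, 5);  # recurrence depth of 5 instead of the default 3

julia> inp = rand(Float32, 4, 10);  # input_size x len

julia> size(rhn(inp))
(6, 10)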

source
RecurrentLayers.MUT1 - Type

MUT1(input_size => hidden_size;
    return_state=false,
    kwargs...)

Mutated unit 1 network. See MUT1Cell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.
  • return_state: Option to return the last state together with the output. Default is false.

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + b_z), \\ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + \tanh(W_h x_t) + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

mut(inp, state)
mut(inp)

Arguments

  • inp: The input to the mut. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
source
RecurrentLayers.MUT2 - Type

MUT2(input_size => hidden_size;
    return_state=false,
    kwargs...)

Mutated unit 2 network. See MUT2Cell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.
  • return_state: Option to return the last state together with the output. Default is false.

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + U_z h_t + b_z), \\ r &= \sigma(x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

mut(inp, state)
mut(inp)

Arguments

  • inp: The input to the mut. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
source
RecurrentLayers.MUT3 - Type

MUT3(input_size => hidden_size;
    return_state = false, kwargs...)

Mutated unit 3 network. See MUT3Cell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.
  • return_state: Option to return the last state together with the output. Default is false.

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + U_z \tanh(h_t) + b_z), \\ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

mut(inp, state)
mut(inp)

Arguments

  • inp: The input to the mut. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
source
RecurrentLayers.SCRN - Type

SCRN(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true, alpha = 0.0,
    return_state = false)

Structurally constrained recurrent unit. See SCRNCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.
  • alpha: structural constraint. Default is 0.0.
  • return_state: Option to return the last state together with the output. Default is false.

Equations

\[\begin{aligned} s_t &= (1 - \alpha) W_s x_t + \alpha s_{t-1}, \\ h_t &= \sigma(W_h s_t + U_h h_{t-1} + b_h), \\ y_t &= f(U_y h_t + W_y s_t) \end{aligned}\]

Forward

scrn(inp, (state, cstate))
scrn(inp)

Arguments

  • inp: The input to the scrn. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the SCRN. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
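
Example

A minimal usage sketch; the sizes and the alpha value are illustrative:

julia> using RecurrentLayers

julia> scrn = SCRN(4 => 6; alpha = 0.5);

julia> inp = rand(Float32, 4, 10);  # input_size x len

julia> size(scrn(inp))
(6, 10)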
source
RecurrentLayers.PeepholeLSTM - Type

PeepholeLSTM(input_size => hidden_size;
    return_state=false,
    kwargs...)

Peephole long short term memory network. See PeepholeLSTMCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.
  • return_state: Option to return the last state together with the output. Default is false.

Equations

\[\begin{aligned} f_t &= \sigma_g(W_f x_t + U_f c_{t-1} + b_f), \\ i_t &= \sigma_g(W_i x_t + U_i c_{t-1} + b_i), \\ o_t &= \sigma_g(W_o x_t + U_o c_{t-1} + b_o), \\ c_t &= f_t \odot c_{t-1} + i_t \odot \sigma_c(W_c x_t + b_c), \\ h_t &= o_t \odot \sigma_h(c_t). \end{aligned}\]

Forward

peepholelstm(inp, (state, cstate))
peepholelstm(inp)

Arguments

  • inp: The input to the peepholelstm. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the PeepholeLSTM. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
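
Example

A sketch of passing an explicit (state, cstate) tuple; the sizes and the zero initialization are illustrative:

julia> using RecurrentLayers

julia> plstm = PeepholeLSTM(4 => 6);

julia> inp = rand(Float32, 4, 10);  # input_size x len

julia> state, cstate = zeros(Float32, 6), zeros(Float32, 6);  # hidden and cell states

julia> size(plstm(inp, (state, cstate)))
(6, 10)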
source
RecurrentLayers.FastRNN - Type

FastRNN(input_size => hidden_size, [activation];
    return_state = false, kwargs...)

Fast recurrent neural network. See FastRNNCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • activation: the activation function, defaults to tanh_fast.

Keyword arguments

  • return_state: Option to return the last state together with the output. Default is false.
  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • init_alpha: initializer for the alpha parameter. Default is 3.0.
  • init_beta: initializer for the beta parameter. Default is -3.0.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} \tilde{h}_t &= \sigma(W_h x_t + U_h h_{t-1} + b), \\ h_t &= \alpha \tilde{h}_t + \beta h_{t-1} \end{aligned}\]

Forward

fastrnn(inp, state)
fastrnn(inp)

Arguments

  • inp: The input to the fastrnn. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the FastRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
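
Example

A minimal usage sketch; the sizes are illustrative and the activation is left at its tanh_fast default:

julia> using RecurrentLayers

julia> fastrnn = FastRNN(4 => 6);

julia> inp = rand(Float32, 4, 10);  # input_size x len

julia> size(fastrnn(inp))
(6, 10)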
source
RecurrentLayers.FastGRNN - Type

FastGRNN(input_size => hidden_size, [activation];
    return_state = false, kwargs...)

Fast gated recurrent neural network. See FastGRNNCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • activation: the activation function, defaults to tanh_fast.

Keyword arguments

  • return_state: Option to return the last state together with the output. Default is false.
  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • init_zeta: initializer for the zeta parameter. Default is 1.0.
  • init_nu: initializer for the nu parameter. Default is -4.0.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z), \\ \tilde{h}_t &= \tanh(W_h x_t + U_h h_{t-1} + b_h), \\ h_t &= \big((\zeta (1 - z_t) + \nu) \odot \tilde{h}_t\big) + z_t \odot h_{t-1} \end{aligned}\]

Forward

fastgrnn(inp, state)
fastgrnn(inp)

Arguments

  • inp: The input to the fastgrnn. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the FastGRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
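
Example

A minimal usage sketch; the sizes are illustrative:

julia> using RecurrentLayers

julia> fastgrnn = FastGRNN(4 => 6);

julia> inp = rand(Float32, 4, 10);  # input_size x len

julia> size(fastgrnn(inp))
(6, 10)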
source
RecurrentLayers.FSRNN - Type

FSRNN(input_size => hidden_size,
    fast_cells, slow_cell;
    return_state=false)

Fast slow recurrent neural network. See FSRNNCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • fast_cells: a vector of the fast cells. Must contain at least two cells.
  • slow_cell: the chosen slow cell.
  • return_state: Option to return the last state together with the output. Default is false.

Equations

\[\begin{aligned} h_t^{F_1} &= f^{F_1}\left(h_{t-1}^{F_k}, x_t\right) \\ h_t^S &= f^S\left(h_{t-1}^S, h_t^{F_1}\right) \\ h_t^{F_2} &= f^{F_2}\left(h_t^{F_1}, h_t^S\right) \\ h_t^{F_i} &= f^{F_i}\left(h_t^{F_{i-1}}\right) \quad \text{for } 3 \leq i \leq k \end{aligned}\]

Forward

fsrnn(inp, (fast_state, slow_state))
fsrnn(inp)

Arguments

  • inp: The input to the fsrnn. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • (fast_state, slow_state): A tuple containing the fast and slow states of the FSRNN. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
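
Example

A construction sketch; the cell choices are illustrative, and it assumes fast_cells and slow_cell are supplied as cell types from this package (see FSRNNCell for the exact cell specification):

julia> using RecurrentLayers

julia> fast_cells = [MGUCell, LiGRUCell];  # at least two fast cells

julia> fsrnn = FSRNN(4 => 6, fast_cells, SCRNCell);

julia> inp = rand(Float32, 4, 10);  # input_size x len

julia> size(fsrnn(inp))
(6, 10)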
source
RecurrentLayers.LEM - Type

LEM(input_size => hidden_size, [dt];
    return_state=false, init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform, bias = true)

Long expressive memory network. See LEMCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • dt: timestep. Default is 1.0.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.
  • return_state: Option to return the last state together with the output. Default is false.

Equations

\[\begin{aligned} \boldsymbol{\Delta t_n} &= \Delta \hat{t} \hat{\sigma} (W_1 y_{n-1} + V_1 u_n + b_1) \\ \overline{\boldsymbol{\Delta t_n}} &= \Delta \hat{t} \hat{\sigma} (W_2 y_{n-1} + V_2 u_n + b_2) \\ z_n &= (1 - \boldsymbol{\Delta t_n}) \odot z_{n-1} + \boldsymbol{\Delta t_n} \odot \sigma (W_z y_{n-1} + V_z u_n + b_z) \\ y_n &= (1 - \boldsymbol{\Delta t_n}) \odot y_{n-1} + \boldsymbol{\Delta t_n} \odot \sigma (W_y z_n + V_y u_n + b_y) \end{aligned}\]

Forward

lem(inp, (state, zstate))
lem(inp)

Arguments

  • inp: The input to the lem. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • (state, zstate): A tuple containing the hidden and cell states of the LEM. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
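
Example

A minimal usage sketch; the sizes and the dt value are illustrative:

julia> using RecurrentLayers

julia> lem = LEM(4 => 6, 0.5);  # dt = 0.5 instead of the default 1.0

julia> inp = rand(Float32, 4, 10);  # input_size x len

julia> size(lem(inp))
(6, 10)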
source
RecurrentLayers.coRNN - Type

coRNN(input_size => hidden_size, [dt];
    gamma=0.0, epsilon=0.0,
    return_state=false, init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform, bias = true)

Coupled oscillatory recurrent neural unit. See coRNNCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • dt: time step. Default is 1.0.

Keyword arguments

  • gamma: damping for state. Default is 0.0.
  • epsilon: damping for candidate state. Default is 0.0.
  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.
  • return_state: Option to return the last state together with the output. Default is false.

Equations

\[\begin{aligned} \mathbf{y}_n &= y_{n-1} + \Delta t \mathbf{z}_n, \\ \mathbf{z}_n &= z_{n-1} + \Delta t \sigma \left( \mathbf{W} y_{n-1} + \mathcal{W} z_{n-1} + \mathbf{V} u_n + \mathbf{b} \right) - \Delta t \gamma y_{n-1} - \Delta t \epsilon \mathbf{z}_n, \end{aligned}\]

Forward

cornn(inp, (state, zstate))
cornn(inp)

Arguments

  • inp: The input to the cornn. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • (state, zstate): A tuple containing the hidden and cell states of the coRNN. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
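
Example

A minimal usage sketch; the sizes and damping values are illustrative:

julia> using RecurrentLayers

julia> cornn = coRNN(4 => 6; gamma = 0.1, epsilon = 0.1);

julia> inp = rand(Float32, 4, 10);  # input_size x len

julia> size(cornn(inp))
(6, 10)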
source
RecurrentLayers.AntisymmetricRNN - Type

AntisymmetricRNN(input_size => hidden_size, [activation];
    return_state = false, kwargs...)

Antisymmetric recurrent neural network. See AntisymmetricRNNCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • activation: activation function. Default is tanh.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.
  • epsilon: step size. Default is 1.0.
  • gamma: strength of diffusion. Default is 0.0.
  • return_state: Option to return the last state together with the output. Default is false.

Equations

\[h_t = h_{t-1} + \epsilon \tanh \left( (W_h - W_h^T - \gamma I) h_{t-1} + V_h x_t + b_h \right),\]

Forward

asymrnn(inp, state)
asymrnn(inp)

Arguments

  • inp: The input to the asymrnn. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the AntisymmetricRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
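
Example

A minimal usage sketch; the sizes, step size, and diffusion strength are illustrative:

julia> using RecurrentLayers

julia> asymrnn = AntisymmetricRNN(4 => 6; epsilon = 0.1, gamma = 0.01);

julia> inp = rand(Float32, 4, 10);  # input_size x len

julia> size(asymrnn(inp))
(6, 10)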
source
RecurrentLayers.GatedAntisymmetricRNN - Type

GatedAntisymmetricRNN(input_size => hidden_size;
    return_state = false, kwargs...)

Antisymmetric recurrent neural network with gating. See GatedAntisymmetricRNNCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.
  • epsilon: step size. Default is 1.0.
  • gamma: strength of diffusion. Default is 0.0.
  • return_state: Option to return the last state together with the output. Default is false.

Equations

\[\begin{aligned} z_t &= \sigma \left( (W_h - W_h^T - \gamma I) h_{t-1} + V_z x_t + b_z \right), \\ h_t &= h_{t-1} + \epsilon z_t \odot \tanh \left( (W_h - W_h^T - \gamma I) h_{t-1} + V_h x_t + b_h \right). \end{aligned}\]

Forward

asymrnn(inp, state)
asymrnn(inp)

Arguments

  • inp: The input to the asymrnn. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the GatedAntisymmetricRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
source
RecurrentLayers.JANET - Type

JANET(input_size => hidden_size;
    return_state = false, kwargs...)

Just another network. See JANETCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.
  • return_state: Option to return the last state together with the output. Default is false.
  • beta_value: control over the input data flow. Default is 1.0.

Equations

\[\begin{aligned} \mathbf{s}_t &= \mathbf{U}_f \mathbf{h}_{t-1} + \mathbf{W}_f \mathbf{x}_t + \mathbf{b}_f \\ \tilde{\mathbf{c}}_t &= \tanh (\mathbf{U}_c \mathbf{h}_{t-1} + \mathbf{W}_c \mathbf{x}_t + \mathbf{b}_c) \\ \mathbf{c}_t &= \sigma(\mathbf{s}_t) \odot \mathbf{c}_{t-1} + (1 - \sigma (\mathbf{s}_t - \beta)) \odot \tilde{\mathbf{c}}_t \\ \mathbf{h}_t &= \mathbf{c}_t. \end{aligned}\]

Forward

janet(inp, (state, cstate))
janet(inp)

Arguments

  • inp: The input to the janet. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the JANET. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
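
Example

A minimal usage sketch; the sizes and the beta_value are illustrative:

julia> using RecurrentLayers

julia> janet = JANET(4 => 6; beta_value = 0.5);

julia> inp = rand(Float32, 4, 10);  # input_size x len

julia> size(janet(inp))
(6, 10)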
source
RecurrentLayers.CFN - Type

CFN(input_size => hidden_size;
    return_state = false, kwargs...)

Chaos free network unit. See CFNCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • return_state: Option to return the last state together with the output. Default is false.
  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} h_t &= \theta_t \odot \tanh(h_{t-1}) + \eta_t \odot \tanh(W x_t), \\ \theta_t &:= \sigma (U_\theta h_{t-1} + V_\theta x_t + b_\theta), \\ \eta_t &:= \sigma (U_\eta h_{t-1} + V_\eta x_t + b_\eta). \end{aligned}\]

Forward

cfn(inp, state)
cfn(inp)

Arguments

  • inp: The input to the cfn. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the CFN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
source
RecurrentLayers.TRNN - Type

TRNN(input_size => hidden_size, [activation];
    return_state = false, kwargs...)

Strongly typed recurrent unit. See TRNNCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • activation: activation function. Default is tanh.

Keyword arguments

  • return_state: Option to return the last state together with the output. Default is false.
  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} z_t &= \mathbf{W} x_t \\ f_t &= \sigma (\mathbf{V} x_t + b) \\ h_t &= f_t \odot h_{t-1} + (1 - f_t) \odot z_t \end{aligned}\]

Forward

trnn(inp, state)
trnn(inp)

Arguments

  • inp: The input to the trnn. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the TRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
source
RecurrentLayers.TGRU - Type

TGRU(input_size => hidden_size;
    return_state = false, kwargs...)

Strongly typed gated recurrent unit. See TGRUCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • return_state: Option to return the last state together with the output. Default is false.
  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} z_t &= \mathbf{V}_z \mathbf{x}_{t-1} + \mathbf{W}_z \mathbf{x}_t + \mathbf{b}_z \\ f_t &= \sigma (\mathbf{V}_f \mathbf{x}_{t-1} + \mathbf{W}_f \mathbf{x}_t + \mathbf{b}_f) \\ o_t &= \tau (\mathbf{V}_o \mathbf{x}_{t-1} + \mathbf{W}_o \mathbf{x}_t + \mathbf{b}_o) \\ h_t &= f_t \odot h_{t-1} + z_t \odot o_t \end{aligned}\]

Forward

tgru(inp, state)
tgru(inp)

Arguments

  • inp: The input to the tgru. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the TGRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
source
RecurrentLayers.TLSTM - Type

TLSTM(input_size => hidden_size;
    return_state = false, kwargs...)

Strongly typed long short term memory. See TLSTMCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • return_state: Option to return the last state together with the output. Default is false.
  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} z_t &= \mathbf{V}_z \mathbf{x}_{t-1} + \mathbf{W}_z \mathbf{x}_t + \mathbf{b}_z \\ f_t &= \sigma (\mathbf{V}_f \mathbf{x}_{t-1} + \mathbf{W}_f \mathbf{x}_t + \mathbf{b}_f) \\ o_t &= \tau (\mathbf{V}_o \mathbf{x}_{t-1} + \mathbf{W}_o \mathbf{x}_t + \mathbf{b}_o) \\ c_t &= f_t \odot c_{t-1} + (1 - f_t) \odot z_t \\ h_t &= c_t \odot o_t \end{aligned}\]

Forward

tlstm(inp, state)
tlstm(inp)

Arguments

  • inp: The input to the tlstm. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the TLSTM. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
source
RecurrentLayers.UnICORNN - Type

UnICORNN(input_size => hidden_size, [dt];
    alpha=0.0, return_state=false, init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform, bias = true)

Undamped independent controlled oscillatory recurrent neural network. See UnICORNNCell for a cell that processes a single time step.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • dt: time step. Default is 1.0.

Keyword arguments

  • alpha: Control parameter. Default is 0.0.
  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.
  • return_state: Option to return the last state together with the output. Default is false.

Equations

\[\begin{aligned} y_n &= y_{n-1} + \Delta t \, \hat{\sigma}(c) \odot z_n, \\ z_n &= z_{n-1} - \Delta t \, \hat{\sigma}(c) \odot \left[ \sigma \left( w \odot y_{n-1} + V u_n + b \right) + \alpha y_{n-1} \right]. \end{aligned}\]

Forward

unicornn(inp, (state, zstate))
unicornn(inp)

Arguments

  • inp: The input to the unicornn. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • (state, zstate): A tuple containing the hidden and cell states of the UnICORNN. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as a matrix of size hidden_size x len or an array of size hidden_size x len x batch_size. When return_state = true it returns a tuple of the hidden states new_states and the last state of the iteration.
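
Example

A minimal usage sketch; the sizes, dt, and alpha values are illustrative:

julia> using RecurrentLayers

julia> unicornn = UnICORNN(4 => 6, 0.5; alpha = 0.9);

julia> inp = rand(Float32, 4, 10);  # input_size x len

julia> size(unicornn(inp))
(6, 10)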
source