Cells

RecurrentLayers.RANCell
RANCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Recurrent Additive Network cell. See RAN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} \tilde{c}_t &= W_c x_t, \\ i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i), \\ f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\ c_t &= i_t \odot \tilde{c}_t + f_t \odot c_{t-1}, \\ h_t &= g(c_t) \end{aligned}\]

Forward

rancell(inp, (state, cstate))
rancell(inp)

Arguments

  • inp: The input to the rancell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the RANCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
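
A minimal usage sketch (not part of the original docstring; the sizes and Float32 data are illustrative, and the call pattern follows the Forward section above):

using Flux, RecurrentLayers

rancell = RANCell(4 => 8)
inp = rand(Float32, 4, 3)                             # input_size x batch_size
out, (state, cstate) = rancell(inp)                   # states default to zeros
out, (state, cstate) = rancell(inp, (state, cstate))  # carry states forward
size(out)                                             # (8, 3)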
RecurrentLayers.IndRNNCell
IndRNNCell(input_size => hidden_size, [activation];
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Independently recurrent cell. See IndRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • activation: activation function. Default is tanh.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\mathbf{h}_{t} = \sigma(\mathbf{W} \mathbf{x}_t + \mathbf{u} \odot \mathbf{h}_{t-1} + \mathbf{b})\]

Forward

indrnncell(inp, state)
indrnncell(inp)

Arguments

  • inp: The input to the indrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the IndRNNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
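
A minimal usage sketch (illustrative sizes, following the Forward section above):

using Flux, RecurrentLayers

indrnncell = IndRNNCell(4 => 8, relu)   # activation is optional; tanh by default
inp = rand(Float32, 4)                  # unbatched vector of size input_size
out, state = indrnncell(inp)            # state defaults to zeros
out, state = indrnncell(inp, state)     # subsequent step with an explicit state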
RecurrentLayers.LightRUCell
LightRUCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Light recurrent unit. See LightRU for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} \tilde{h}_t &= \tanh(W_h x_t), \\ f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t. \end{aligned}\]

Forward

lightrucell(inp, state)
lightrucell(inp)

Arguments

  • inp: The input to the lightrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the LightRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
RecurrentLayers.LiGRUCell
LiGRUCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Light gated recurrent unit. Unlike the original paper, this implementation does not include batch normalization. See LiGRU for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1}), \\ \tilde{h}_t &= \text{ReLU}(W_h x_t + U_h h_{t-1}), \\ h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t \end{aligned}\]

Forward

ligrucell(inp, state)
ligrucell(inp)

Arguments

  • inp: The input to the ligrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the LiGRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
RecurrentLayers.MGUCell
MGUCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Minimal gated unit. See MGU for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\ \tilde{h}_t &= \tanh(W_h x_t + U_h (f_t \odot h_{t-1}) + b_h), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t \end{aligned}\]

Forward

mgucell(inp, state)
mgucell(inp)

Arguments

  • inp: The input to the mgucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MGUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
RecurrentLayers.NASCell
NASCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Neural Architecture Search unit. See NAS for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} \text{First Layer Outputs:} & \\ o_1 &= \sigma(W_i^{(1)} x_t + W_h^{(1)} h_{t-1} + b^{(1)}), \\ o_2 &= \text{ReLU}(W_i^{(2)} x_t + W_h^{(2)} h_{t-1} + b^{(2)}), \\ o_3 &= \sigma(W_i^{(3)} x_t + W_h^{(3)} h_{t-1} + b^{(3)}), \\ o_4 &= \text{ReLU}(W_i^{(4)} x_t \cdot W_h^{(4)} h_{t-1}), \\ o_5 &= \tanh(W_i^{(5)} x_t + W_h^{(5)} h_{t-1} + b^{(5)}), \\ o_6 &= \sigma(W_i^{(6)} x_t + W_h^{(6)} h_{t-1} + b^{(6)}), \\ o_7 &= \tanh(W_i^{(7)} x_t + W_h^{(7)} h_{t-1} + b^{(7)}), \\ o_8 &= \sigma(W_i^{(8)} x_t + W_h^{(8)} h_{t-1} + b^{(8)}). \\ \text{Second Layer Computations:} & \\ l_1 &= \tanh(o_1 \cdot o_2) \\ l_2 &= \tanh(o_3 + o_4) \\ l_3 &= \tanh(o_5 \cdot o_6) \\ l_4 &= \sigma(o_7 + o_8) \\ \text{Inject Cell State:} & \\ l_1 &= \tanh(l_1 + c_{\text{state}}) \\ \text{Final Layer Computations:} & \\ c_{\text{new}} &= l_1 \cdot l_2 \\ l_5 &= \tanh(l_3 + l_4) \\ h_{\text{new}} &= \tanh(c_{\text{new}} \cdot l_5) \end{aligned}\]

Forward

nascell(inp, (state, cstate))
nascell(inp)

Arguments

  • inp: The input to the nascell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the NASCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
RecurrentLayers.RHNCell
RHNCell(input_size => hidden_size, [depth];
    couple_carry = true,
    cell_kwargs...)

Recurrent highway network. See RHNCellUnit for the unit component of this layer. See RHN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • depth: depth of the recurrence. Default is 3.

Keyword arguments

  • couple_carry: couples the carry gate and the transform gate. Default is true.
  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} s_{\ell}^{[t]} &= h_{\ell}^{[t]} \odot t_{\ell}^{[t]} + s_{\ell-1}^{[t]} \odot c_{\ell}^{[t]}, \\ \text{where} \\ h_{\ell}^{[t]} &= \tanh(W_h x^{[t]}\mathbb{I}_{\ell = 1} + U_{h_{\ell}} s_{\ell-1}^{[t]} + b_{h_{\ell}}), \\ t_{\ell}^{[t]} &= \sigma(W_t x^{[t]}\mathbb{I}_{\ell = 1} + U_{t_{\ell}} s_{\ell-1}^{[t]} + b_{t_{\ell}}), \\ c_{\ell}^{[t]} &= \sigma(W_c x^{[t]}\mathbb{I}_{\ell = 1} + U_{c_{\ell}} s_{\ell-1}^{[t]} + b_{c_{\ell}}) \end{aligned}\]

Forward

rhncell(inp, [state])
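
A minimal construction sketch (illustrative; the docstring does not document the return value, so only the call is shown):

using Flux, RecurrentLayers

rhncell = RHNCell(4 => 8, 5)   # depth 5 instead of the default 3
inp = rand(Float32, 4, 2)
rhncell(inp)                   # state defaults to zeros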
RecurrentLayers.MUT1Cell
MUT1Cell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Mutated unit 1 cell. See MUT1 for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + b_z), \\ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + \tanh(W_h x_t) + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

mutcell(inp, state)
mutcell(inp)

Arguments

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
RecurrentLayers.MUT2Cell
MUT2Cell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Mutated unit 2 cell. See MUT2 for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + U_z h_t + b_z), \\ r &= \sigma(x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

mutcell(inp, state)
mutcell(inp)

Arguments

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
RecurrentLayers.MUT3Cell
MUT3Cell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Mutated unit 3 cell. See MUT3 for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + U_z \tanh(h_t) + b_z), \\ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

mutcell(inp, state)
mutcell(inp)

Arguments

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
RecurrentLayers.SCRNCell
SCRNCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true, alpha = 0.0)

Structurally constrained recurrent unit. See SCRN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.
  • alpha: structural constraint. Default is 0.0.

Equations

\[\begin{aligned} s_t &= (1 - \alpha) W_s x_t + \alpha s_{t-1}, \\ h_t &= \sigma(W_h s_t + U_h h_{t-1} + b_h), \\ y_t &= f(U_y h_t + W_y s_t) \end{aligned}\]

Forward

scrncell(inp, (state, cstate))
scrncell(inp)

Arguments

  • inp: The input to the scrncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the SCRNCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
RecurrentLayers.PeepholeLSTMCell
PeepholeLSTMCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Peephole long short-term memory cell. See PeepholeLSTM for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} f_t &= \sigma_g(W_f x_t + U_f c_{t-1} + b_f), \\ i_t &= \sigma_g(W_i x_t + U_i c_{t-1} + b_i), \\ o_t &= \sigma_g(W_o x_t + U_o c_{t-1} + b_o), \\ c_t &= f_t \odot c_{t-1} + i_t \odot \sigma_c(W_c x_t + b_c), \\ h_t &= o_t \odot \sigma_h(c_t). \end{aligned}\]

Forward

peepholelstmcell(inp, (state, cstate))
peepholelstmcell(inp)

Arguments

  • inp: The input to the peepholelstmcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the PeepholeLSTMCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
RecurrentLayers.FastRNNCell
FastRNNCell(input_size => hidden_size, [activation];
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    init_alpha = 3.0, init_beta = -3.0,
    bias = true)

Fast recurrent neural network cell. See FastRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • activation: the activation function, defaults to tanh_fast.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • init_alpha: initializer for the alpha parameter. Default is 3.0.
  • init_beta: initializer for the beta parameter. Default is -3.0.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} \tilde{h}_t &= \sigma(W_h x_t + U_h h_{t-1} + b), \\ h_t &= \alpha \tilde{h}_t + \beta h_{t-1} \end{aligned}\]

Forward

fastrnncell(inp, state)
fastrnncell(inp)

Arguments

  • inp: The input to the fastrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the FastRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
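
A minimal sketch showing the scalar gate parameters (illustrative values):

using Flux, RecurrentLayers

fastrnncell = FastRNNCell(4 => 8, tanh_fast; init_alpha = 2.0, init_beta = -2.0)
inp = rand(Float32, 4, 5)
out, state = fastrnncell(inp)
out, state = fastrnncell(inp, state)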
RecurrentLayers.FastGRNNCell
FastGRNNCell(input_size => hidden_size, [activation];
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    init_zeta = 1.0, init_nu = -4.0,
    bias = true)

Fast gated recurrent neural network cell. See FastGRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • activation: the activation function, defaults to tanh_fast.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • init_zeta: initializer for the zeta parameter. Default is 1.0.
  • init_nu: initializer for the nu parameter. Default is -4.0.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} z_t &= \sigma(W x_t + U h_{t-1} + b_z), \\ \tilde{h}_t &= \tanh(W x_t + U h_{t-1} + b_h), \\ h_t &= \big((\zeta (1 - z_t) + \nu) \odot \tilde{h}_t\big) + z_t \odot h_{t-1} \end{aligned}\]

Forward

fastgrnncell(inp, state)
fastgrnncell(inp)

Arguments

  • inp: The input to the fastgrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the FastGRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
RecurrentLayers.FSRNNCell
FSRNNCell(input_size => hidden_size,
    fast_cells, slow_cell)

Fast slow recurrent neural network cell. See FSRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • fast_cells: a vector of the fast cells. Must contain at least two cells.
  • slow_cell: the chosen slow cell.

Equations

\[\begin{aligned} h_t^{F_1} &= f^{F_1}\left(h_{t-1}^{F_k}, x_t\right) \\ h_t^S &= f^S\left(h_{t-1}^S, h_t^{F_1}\right) \\ h_t^{F_2} &= f^{F_2}\left(h_t^{F_1}, h_t^S\right) \\ h_t^{F_i} &= f^{F_i}\left(h_t^{F_{i-1}}\right) \quad \text{for } 3 \leq i \leq k \end{aligned}\]

Forward

fsrnncell(inp, (fast_state, slow_state))
fsrnncell(inp)

Arguments

  • inp: The input to the fsrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (fast_state, slow_state): A tuple containing the fast and slow hidden states of the FSRNNCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (fast_state, slow_state) is the new fast and slow state. They are tensors of size hidden_size or hidden_size x batch_size.
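
A construction sketch. Whether fast_cells takes instantiated cells or cell constructors sized internally is an assumption here, not confirmed by the docstring:

using Flux, RecurrentLayers

# assumption: fast cells and the slow cell are passed as cell constructors
# and are sized internally from input_size => hidden_size
fsrnncell = FSRNNCell(4 => 8, [MGUCell, LiGRUCell], MGUCell)
inp = rand(Float32, 4, 3)
out, (fast_state, slow_state) = fsrnncell(inp)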
RecurrentLayers.LEMCell
LEMCell(input_size => hidden_size, [dt];
    init_kernel = glorot_uniform, init_recurrent_kernel = glorot_uniform,
    bias = true)

Long expressive memory unit. See LEM for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • dt: timestep. Default is 1.0.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} \boldsymbol{\Delta t_n} &= \Delta \hat{t} \hat{\sigma} (W_1 y_{n-1} + V_1 u_n + b_1) \\ \overline{\boldsymbol{\Delta t_n}} &= \Delta \hat{t} \hat{\sigma} (W_2 y_{n-1} + V_2 u_n + b_2) \\ z_n &= (1 - \boldsymbol{\Delta t_n}) \odot z_{n-1} + \boldsymbol{\Delta t_n} \odot \sigma (W_z y_{n-1} + V_z u_n + b_z) \\ y_n &= (1 - \boldsymbol{\Delta t_n}) \odot y_{n-1} + \boldsymbol{\Delta t_n} \odot \sigma (W_y z_n + V_y u_n + b_y) \end{aligned}\]

Forward

lemcell(inp, (state, cstate))
lemcell(inp)

Arguments

  • inp: The input to the lemcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the LEMCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
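
A minimal usage sketch with an explicit timestep (illustrative values):

using Flux, RecurrentLayers

lemcell = LEMCell(4 => 8, 0.5)   # dt = 0.5 instead of the default 1.0
inp = rand(Float32, 4, 3)
out, (state, cstate) = lemcell(inp)
out, (state, cstate) = lemcell(inp, (state, cstate))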
RecurrentLayers.coRNNCell
coRNNCell(input_size => hidden_size, [dt];
    gamma=0.0, epsilon=0.0,
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Coupled oscillatory recurrent neural unit. See coRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • dt: time step. Default is 1.0.

Keyword arguments

  • gamma: damping for state. Default is 0.0.
  • epsilon: damping for candidate state. Default is 0.0.
  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} \mathbf{y}_n &= y_{n-1} + \Delta t \mathbf{z}_n, \\ \mathbf{z}_n &= z_{n-1} + \Delta t \sigma \left( \mathbf{W} y_{n-1} + \mathcal{W} z_{n-1} + \mathbf{V} u_n + \mathbf{b} \right) - \Delta t \gamma y_{n-1} - \Delta t \epsilon \mathbf{z}_n, \end{aligned}\]

Forward

cornncell(inp, (state, cstate))
cornncell(inp)

Arguments

  • inp: The input to the cornncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the coRNNCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
RecurrentLayers.AntisymmetricRNNCell
AntisymmetricRNNCell(input_size => hidden_size, [activation];
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true, epsilon=1.0)

Antisymmetric recurrent cell. See AntisymmetricRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • activation: activation function. Default is tanh.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.
  • epsilon: step size. Default is 1.0.
  • gamma: strength of diffusion. Default is 0.0.

Equations

\[h_t = h_{t-1} + \epsilon \tanh \left( (W_h - W_h^T - \gamma I) h_{t-1} + V_h x_t + b_h \right),\]

Forward

asymrnncell(inp, state)
asymrnncell(inp)

Arguments

  • inp: The input to the asymrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the AntisymmetricRNNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
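
A minimal usage sketch with explicit step size and diffusion (illustrative values):

using Flux, RecurrentLayers

asymrnncell = AntisymmetricRNNCell(4 => 8, tanh; epsilon = 0.1, gamma = 0.01)
inp = rand(Float32, 4, 2)
out, state = asymrnncell(inp)
out, state = asymrnncell(inp, state)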
RecurrentLayers.GatedAntisymmetricRNNCell
GatedAntisymmetricRNNCell(input_size => hidden_size, [activation];
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true, epsilon=1.0)

Antisymmetric recurrent cell with gating. See GatedAntisymmetricRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • activation: activation function. Default is tanh.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.
  • epsilon: step size. Default is 1.0.
  • gamma: strength of diffusion. Default is 0.0.

Equations

\[\begin{aligned} z_t &= \sigma \left( (W_h - W_h^T - \gamma I) h_{t-1} + V_z x_t + b_z \right), \\ h_t &= h_{t-1} + \epsilon z_t \odot \tanh \left( (W_h - W_h^T - \gamma I) h_{t-1} + V_h x_t + b_h \right). \end{aligned}\]

Forward

asymrnncell(inp, state)
asymrnncell(inp)

Arguments

  • inp: The input to the asymrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the GatedAntisymmetricRNNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
RecurrentLayers.JANETCell
JANETCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true, beta_value=1.0)

Just another network unit. See JANET for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.
  • beta_value: control over the input data flow. Default is 1.0.

Equations

\[\begin{aligned} \mathbf{s}_t &= \mathbf{U}_f \mathbf{h}_{t-1} + \mathbf{W}_f \mathbf{x}_t + \mathbf{b}_f \\ \tilde{\mathbf{c}}_t &= \tanh (\mathbf{U}_c \mathbf{h}_{t-1} + \mathbf{W}_c \mathbf{x}_t + \mathbf{b}_c) \\ \mathbf{c}_t &= \sigma(\mathbf{s}_t) \odot \mathbf{c}_{t-1} + (1 - \sigma (\mathbf{s}_t - \beta)) \odot \tilde{\mathbf{c}}_t \\ \mathbf{h}_t &= \mathbf{c}_t. \end{aligned}\]

Forward

janetcell(inp, (state, cstate))
janetcell(inp)

Arguments

  • inp: The input to the janetcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the JANETCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
RecurrentLayers.CFNCell
CFNCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Chaos free network unit. See CFN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} h_t &= \theta_t \odot \tanh(h_{t-1}) + \eta_t \odot \tanh(W x_t), \\ \theta_t &:= \sigma (U_\theta h_{t-1} + V_\theta x_t + b_\theta), \\ \eta_t &:= \sigma (U_\eta h_{t-1} + V_\eta x_t + b_\eta). \end{aligned}\]

Forward

cfncell(inp, state)
cfncell(inp)

Arguments

  • inp: The input to the cfncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the CFNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
RecurrentLayers.TRNNCell
TRNNCell(input_size => hidden_size, [activation];
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Strongly typed recurrent unit. See TRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • activation: activation function. Default is tanh.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} z_t &= \mathbf{W} x_t \\ f_t &= \sigma (\mathbf{V} x_t + b) \\ h_t &= f_t \odot h_{t-1} + (1 - f_t) \odot z_t \end{aligned}\]

Forward

trnncell(inp, state)
trnncell(inp)

Arguments

  • inp: The input to the trnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the TRNNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
RecurrentLayers.TGRUCell
TGRUCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Strongly typed gated recurrent unit. See TGRU for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} z_t &= \mathbf{V}_z \mathbf{x}_{t-1} + \mathbf{W}_z \mathbf{x}_t + \mathbf{b}_z \\ f_t &= \sigma (\mathbf{V}_f \mathbf{x}_{t-1} + \mathbf{W}_f \mathbf{x}_t + \mathbf{b}_f) \\ o_t &= \tau (\mathbf{V}_o \mathbf{x}_{t-1} + \mathbf{W}_o \mathbf{x}_t + \mathbf{b}_o) \\ h_t &= f_t \odot h_{t-1} + z_t \odot o_t \end{aligned}\]

Forward

tgrucell(inp, state)
tgrucell(inp)

Arguments

  • inp: The input to the tgrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the TGRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, inp) is the new hidden state together with the current input. They are tensors of size hidden_size or hidden_size x batch_size.
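
A minimal usage sketch. Per the Returns above, the carried state packs the current input alongside the hidden state, so the returned tuple is passed back on the next step (an inference from the stated return value, not spelled out in the Forward section):

using Flux, RecurrentLayers

tgrucell = TGRUCell(4 => 8)
inp1 = rand(Float32, 4)
out, state = tgrucell(inp1)         # state == (new_state, inp1)
inp2 = rand(Float32, 4)
out, state = tgrucell(inp2, state)  # previous input enters the update equations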
RecurrentLayers.TLSTMCell
TLSTMCell(input_size => hidden_size;
    init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform,
    bias = true)

Strongly typed long short-term memory cell. See TLSTM for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.

Keyword arguments

  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} z_t &= \mathbf{V}_z \mathbf{x}_{t-1} + \mathbf{W}_z \mathbf{x}_t + \mathbf{b}_z \\ f_t &= \sigma (\mathbf{V}_f \mathbf{x}_{t-1} + \mathbf{W}_f \mathbf{x}_t + \mathbf{b}_f) \\ o_t &= \tau (\mathbf{V}_o \mathbf{x}_{t-1} + \mathbf{W}_o \mathbf{x}_t + \mathbf{b}_o) \\ c_t &= f_t \odot c_{t-1} + (1 - f_t) \odot z_t \\ h_t &= c_t \odot o_t \end{aligned}\]

Forward

tlstmcell(inp, state)
tlstmcell(inp)

Arguments

  • inp: The input to the tlstmcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the TLSTMCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate, inp) is the new hidden and cell state, together with the current input. They are tensors of size hidden_size or hidden_size x batch_size.
RecurrentLayers.UnICORNNCell
UnICORNNCell(input_size => hidden_size, [dt];
    alpha=0.0, init_kernel = glorot_uniform,
    init_recurrent_kernel = glorot_uniform, bias = true)

Undamped independent controlled oscillatory recurrent neural unit. See UnICORNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer.
  • dt: time step. Default is 1.0.

Keyword arguments

  • alpha: Control parameter. Default is 0.0.
  • init_kernel: initializer for the input to hidden weights. Default is glorot_uniform.
  • init_recurrent_kernel: initializer for the hidden to hidden weights. Default is glorot_uniform.
  • bias: include a bias or not. Default is true.

Equations

\[\begin{aligned} y_n &= y_{n-1} + \Delta t \, \hat{\sigma}(c) \odot z_n, \\ z_n &= z_{n-1} - \Delta t \, \hat{\sigma}(c) \odot \left[ \sigma \left( w \odot y_{n-1} + V y_{n-1} + b \right) + \alpha y_{n-1} \right]. \end{aligned}\]

Forward

unicornncell(inp, (state, cstate))
unicornncell(inp)

Arguments

  • inp: The input to the unicornncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the UnICORNNCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
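
A minimal usage sketch with explicit dt and alpha (illustrative values):

using Flux, RecurrentLayers

unicornncell = UnICORNNCell(4 => 8, 0.1; alpha = 0.9)   # dt = 0.1
inp = rand(Float32, 4, 3)
out, (state, cstate) = unicornncell(inp)
out, (state, cstate) = unicornncell(inp, (state, cstate))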