RHN

RecurrentLayers.RHN (Type)
RHN(input_size => hidden_size, [depth];
    return_state = false,
    kwargs...)

Recurrent highway network [Zilly2017]. See RHNCellUnit for the unit component of this layer. See RHNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and hidden dimensions of the layer.
  • depth: depth of the recurrence, i.e. the number of stacked highway units applied at each time step. Default is 3.

Keyword arguments

  • couple_carry: whether to couple the carry gate and the transform gate. Default is true.
  • init_kernel: initializer for the input-to-hidden weights.
  • bias: whether to include a bias. Default is true.
  • return_state: whether to return the last state together with the output. Default is false.

Equations

\[\begin{aligned} \mathbf{s}_{\ell}(t) &= \mathbf{h}_{\ell}(t) \odot \mathbf{t}_{\ell}(t) + \mathbf{s}_{\ell-1}(t) \odot \mathbf{c}_{\ell}(t) \\ \mathbf{h}_{\ell}(t) &= \tanh\left( \mathbf{W}^{h}_{ih} \mathbf{x}(t) \, \mathbb{I}_{\ell = 1} + \mathbf{W}^{h_{\ell}}_{hh} \mathbf{s}_{\ell-1}(t) + \mathbf{b}^{h_{\ell}} \right) \\ \mathbf{t}_{\ell}(t) &= \sigma\left( \mathbf{W}^{t}_{ih} \mathbf{x}(t) \, \mathbb{I}_{\ell = 1} + \mathbf{W}^{t_{\ell}}_{hh} \mathbf{s}_{\ell-1}(t) + \mathbf{b}^{t_{\ell}} \right) \\ \mathbf{c}_{\ell}(t) &= \sigma\left( \mathbf{W}^{c}_{ih} \mathbf{x}(t) \, \mathbb{I}_{\ell = 1} + \mathbf{W}^{c_{\ell}}_{hh} \mathbf{s}_{\ell-1}(t) + \mathbf{b}^{c_{\ell}} \right) \end{aligned}\]
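To make the recurrence concrete, here is a minimal sketch of a single time step written directly from the equations above, in plain Julia. It is not the RecurrentLayers.jl implementation. The names `RHNLayerParams`, `rhn_step`, and `sigmoid` are hypothetical helpers introduced for illustration. Two assumptions are made that the text above does not spell out: the state fed to the first unit is the final state of the previous time step, and coupling the gates means setting the carry gate to `1 - t`, as in the coupled variant described by [Zilly2017].

```julia
# Illustrative sketch only; not the RecurrentLayers.jl implementation.
sigmoid(z) = 1 / (1 + exp(-z))

# Hypothetical container for the hidden-to-hidden parameters of one unit ℓ.
struct RHNLayerParams
    Wh::Matrix{Float64}  # W_hh^{h_ℓ}
    Wt::Matrix{Float64}  # W_hh^{t_ℓ}
    Wc::Matrix{Float64}  # W_hh^{c_ℓ}
    bh::Vector{Float64}  # b^{h_ℓ}
    bt::Vector{Float64}  # b^{t_ℓ}
    bc::Vector{Float64}  # b^{c_ℓ}
end

# One time step: x is the input x(t), s is the state carried over from the
# previous time step (assumed to play the role of s_0(t)).
function rhn_step(x, s, Wih_h, Wih_t, Wih_c, layers; couple_carry = true)
    for (l, p) in enumerate(layers)
        # The indicator I_{ℓ = 1}: the input only enters the first unit.
        xh = l == 1 ? Wih_h * x : zero(s)
        xt = l == 1 ? Wih_t * x : zero(s)
        xc = l == 1 ? Wih_c * x : zero(s)
        h = tanh.(xh .+ p.Wh * s .+ p.bh)
        t = sigmoid.(xt .+ p.Wt * s .+ p.bt)
        # Assumption: coupled gates use c = 1 - t; otherwise c has its own weights.
        c = couple_carry ? 1 .- t : sigmoid.(xc .+ p.Wc * s .+ p.bc)
        s = h .* t .+ s .* c
    end
    return s
end

# Usage with small random weights (sizes chosen arbitrarily for the example).
input_size, hidden_size, depth = 4, 8, 3
small(m, n) = 0.1 .* randn(m, n)
layers = [RHNLayerParams(small(hidden_size, hidden_size),
                         small(hidden_size, hidden_size),
                         small(hidden_size, hidden_size),
                         zeros(hidden_size), zeros(hidden_size), zeros(hidden_size))
          for _ in 1:depth]
x  = randn(input_size)
s0 = zeros(hidden_size)
s1 = rhn_step(x, s0,
              small(hidden_size, input_size),
              small(hidden_size, input_size),
              small(hidden_size, input_size),
              layers)
```

With coupled gates each unit computes a convex combination of the candidate `h` and the incoming state, so a state that starts in [-1, 1] stays in that range however large `depth` is.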

  • [Zilly2017] Zilly, J. G. et al. Recurrent Highway Networks. ICML 2017.