RHN
RecurrentLayers.RHN — Type
RHN(input_size => hidden_size, [depth]; return_state = false, kwargs...)
Recurrent highway network [Zilly2017]. See RHNCellUnit for the unit component of this layer. See RHNCell for a layer that processes a single sequence.
Arguments
- input_size => hidden_size: input and inner dimension of the layer.
- depth: depth of the recurrence. Default is 3.
Keyword arguments
- couple_carry: couples the carry gate and the transform gate. Default is true.
- init_kernel: initializer for the input-to-hidden weights.
- bias: include a bias or not. Default is true.
- return_state: option to return the last state together with the output. Default is false.
Equations
\[\begin{aligned} \mathbf{s}_{\ell}(t) &= \mathbf{h}_{\ell}(t) \odot \mathbf{t}_{\ell}(t) + \mathbf{s}_{\ell-1}(t) \odot \mathbf{c}_{\ell}(t) \\ \mathbf{h}_{\ell}(t) &= \tanh\left( \mathbf{W}^{h}_{ih} \mathbf{x}(t) \, \mathbb{I}_{\ell = 1} + \mathbf{W}^{h_{\ell}}_{hh} \mathbf{s}_{\ell-1}(t) + \mathbf{b}^{h_{\ell}} \right) \\ \mathbf{t}_{\ell}(t) &= \sigma\left( \mathbf{W}^{t}_{ih} \mathbf{x}(t) \, \mathbb{I}_{\ell = 1} + \mathbf{W}^{t_{\ell}}_{hh} \mathbf{s}_{\ell-1}(t) + \mathbf{b}^{t_{\ell}} \right) \\ \mathbf{c}_{\ell}(t) &= \sigma\left( \mathbf{W}^{c}_{ih} \mathbf{x}(t) \, \mathbb{I}_{\ell = 1} + \mathbf{W}^{c_{\ell}}_{hh} \mathbf{s}_{\ell-1}(t) + \mathbf{b}^{c_{\ell}} \right) \end{aligned}\]
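The recurrence above can be sketched in plain Julia for a single time step. This is a minimal illustration, not the RecurrentLayers.jl implementation: rhn_step, run_sequence, and the parameter containers Wih and layers are hypothetical names. Two details are assumptions taken from Zilly et al. rather than from this page: with couple_carry the carry gate is computed as c = 1 - t, and the state entering the first layer is the final state of the previous time step.

```julia
# Logistic sigmoid, defined locally so the sketch needs no packages.
sigmoid(z) = 1 / (1 + exp(-z))

# One time step of the recurrence (hypothetical helper).
#   x      : input vector of length input_size
#   s      : state vector of length hidden_size from the previous time step
#   Wih    : named tuple (h, t, c) of input-to-hidden matrices
#   layers : vector of named tuples (Wh, Wt, Wc, bh, bt, bc), one per depth level
function rhn_step(x, s, Wih, layers; couple_carry = true)
    for (l, p) in enumerate(layers)
        # Indicator term: only the first layer (l == 1) sees the input.
        xh = l == 1 ? Wih.h * x : zero(s)
        xt = l == 1 ? Wih.t * x : zero(s)
        xc = l == 1 ? Wih.c * x : zero(s)

        h = tanh.(xh .+ p.Wh * s .+ p.bh)      # candidate state
        t = sigmoid.(xt .+ p.Wt * s .+ p.bt)   # transform gate
        # Assumption: coupling means c = 1 - t, as in Zilly et al.
        c = couple_carry ? 1 .- t : sigmoid.(xc .+ p.Wc * s .+ p.bc)

        s = h .* t .+ s .* c                   # highway update
    end
    return s
end

# Apply rhn_step along the columns of xs (one column per time step).
function run_sequence(xs, s0, Wih, layers; couple_carry = true)
    step(s, x) = rhn_step(x, s, Wih, layers; couple_carry = couple_carry)
    return foldl(step, eachcol(xs); init = s0)
end

# Example with small random parameters (sizes are arbitrary).
input_size, hidden_size, depth = 3, 4, 3
small(m, n) = 0.1 .* randn(m, n)
Wih = (h = small(hidden_size, input_size),
       t = small(hidden_size, input_size),
       c = small(hidden_size, input_size))
layers = [(Wh = small(hidden_size, hidden_size),
           Wt = small(hidden_size, hidden_size),
           Wc = small(hidden_size, hidden_size),
           bh = zeros(hidden_size),
           bt = zeros(hidden_size),
           bc = zeros(hidden_size)) for _ in 1:depth]
s_final = run_sequence(randn(input_size, 5), zeros(hidden_size), Wih, layers)
```

With all weights and biases set to zero, each layer gives h = 0 and t = c = 0.5, so the state is halved once per depth level. This is a convenient sanity check for the update rule.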
- [Zilly2017] Zilly, J. G. et al. Recurrent Highway Networks. ICML 2017.