torchrecurrent.RAN
- class torchrecurrent.RAN(input_size, hidden_size, num_layers=1, dropout=0.0, batch_first=False, **kwargs)
- Multi-layer Recurrent Additive Network (RAN) [arXiv].
- Each layer consists of a RANCell, which replaces the standard LSTM nonlinearities with purely additive memory updates gated by input and forget gates:
\[\begin{split}\begin{aligned}
\tilde{c}(t) &= W_{ih}^c x(t) + b_{ih}^c, \\
i(t) &= \sigma(W_{ih}^i x(t) + b_{ih}^i + W_{hh}^i h(t-1) + b_{hh}^i), \\
f(t) &= \sigma(W_{ih}^f x(t) + b_{ih}^f + W_{hh}^f h(t-1) + b_{hh}^f), \\
c(t) &= i(t) \circ \tilde{c}(t) + f(t) \circ c(t-1), \\
h(t) &= \tanh(c(t)).
\end{aligned}\end{split}\]
- Parameters:
- input_size – The number of expected features in the input x. 
- hidden_size – The number of features in the hidden and cell states. 
- num_layers – Number of recurrent layers. E.g., setting num_layers=2 stacks two RAN layers, with the second receiving the outputs of the first. Default: 1
- dropout – If non-zero, introduces a Dropout layer on the outputs of each layer except the last, with dropout probability equal to dropout. Default: 0
- batch_first – If True, input and output tensors are provided as (batch, seq, feature) instead of (seq, batch, feature). Default: False
- bias – If False, the layer does not use input-side bias b_{ih}. Default: True
- recurrent_bias – If False, the layer does not use recurrent bias b_{hh}. Default: True
- kernel_init – Initializer for W_{ih}. Default: torch.nn.init.xavier_uniform_()
- recurrent_kernel_init – Initializer for W_{hh}. Default: torch.nn.init.xavier_uniform_()
- bias_init – Initializer for b_{ih} when bias=True. Default: torch.nn.init.zeros_()
- recurrent_bias_init – Initializer for b_{hh} when recurrent_bias=True. Default: torch.nn.init.zeros_()
- device – The desired device of parameters. 
- dtype – The desired floating point type of parameters. 
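The update equations above can be sketched as a single-step function in plain PyTorch. This is a minimal reference sketch, not the library's implementation: the function name ran_step and the row ordering of the stacked weights (candidate, input gate, forget gate for weight_ih; input gate, forget gate for weight_hh) are assumptions made for illustration.

```python
import torch


def ran_step(x, h_prev, c_prev, weight_ih, weight_hh, bias_ih=None, bias_hh=None):
    """One RAN time step for one layer (reference sketch, not the library code).

    Assumed row layout (hypothetical): weight_ih stacks [candidate, input gate,
    forget gate]; weight_hh stacks [input gate, forget gate].
    """
    hidden_size = h_prev.shape[-1]

    # Input-side projection: 3 * hidden_size features.
    gi = x @ weight_ih.T
    if bias_ih is not None:
        gi = gi + bias_ih
    # Recurrent projection: 2 * hidden_size features (the candidate has none).
    gh = h_prev @ weight_hh.T
    if bias_hh is not None:
        gh = gh + bias_hh

    c_tilde, i_in, f_in = gi.split(hidden_size, dim=-1)
    i_rec, f_rec = gh.split(hidden_size, dim=-1)

    i = torch.sigmoid(i_in + i_rec)  # input gate i(t)
    f = torch.sigmoid(f_in + f_rec)  # forget gate f(t)
    c = i * c_tilde + f * c_prev     # additive memory update c(t)
    h = torch.tanh(c)                # hidden state h(t)
    return h, c


torch.manual_seed(0)
N, H_in, H = 3, 16, 32
x = torch.randn(N, H_in)
h0 = torch.zeros(N, H)
c0 = torch.zeros(N, H)
w_ih = 0.1 * torch.randn(3 * H, H_in)
w_hh = 0.1 * torch.randn(2 * H, H)

h1, c1 = ran_step(x, h0, c0, w_ih, w_hh)
print(h1.shape, c1.shape)  # torch.Size([3, 32]) torch.Size([3, 32])
```

Because the candidate \(\tilde{c}(t)\) depends only on the input, the recurrent projection needs two blocks rather than three, which matches the (2*hidden_size, hidden_size) shape listed for weight_hh.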
 
 - Inputs: input, (h_0, c_0)
- input: tensor of shape \((L, H_{in})\) for unbatched input, \((L, N, H_{in})\) when batch_first=False or \((N, L, H_{in})\) when batch_first=True containing the features of the input sequence.
- h_0: tensor of shape \((\text{num\_layers}, H_{out})\) for unbatched input or \((\text{num\_layers}, N, H_{out})\) containing the initial hidden state. Defaults to zeros if not provided. 
- c_0: tensor of shape \((\text{num\_layers}, H_{out})\) for unbatched input or \((\text{num\_layers}, N, H_{out})\) containing the initial cell state. Defaults to zeros if not provided. 
 - where: \[\begin{split}\begin{aligned} N &= \text{batch size} \\ L &= \text{sequence length} \\ H_{in} &= \text{input\_size} \\ H_{out} &= \text{hidden\_size} \end{aligned}\end{split}\]
- Outputs: output, (h_n, c_n)
- output: tensor of shape \((L, H_{out})\) for unbatched input, \((L, N, H_{out})\) when batch_first=False or \((N, L, H_{out})\) when batch_first=True containing the output features from the last layer, for each timestep.
- h_n: tensor of shape \((\text{num\_layers}, H_{out})\) for unbatched input or \((\text{num\_layers}, N, H_{out})\) containing the final hidden state of each layer. 
- c_n: tensor of shape \((\text{num\_layers}, H_{out})\) for unbatched input or \((\text{num\_layers}, N, H_{out})\) containing the final cell state of each layer. 
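The shape rules above can be collected into a small helper. This is an illustrative sketch only: ran_shapes is a hypothetical function, not part of torchrecurrent, and it simply restates the documented shapes as tuples. As documented, batch_first affects only input and output; the state tensors have a single batched layout.

```python
def ran_shapes(seq_len, input_size, hidden_size, num_layers=1,
               batch_size=None, batch_first=False):
    """Return the documented tensor shapes; batch_size=None means unbatched."""
    if batch_size is None:
        inp = (seq_len, input_size)
        out = (seq_len, hidden_size)
        state = (num_layers, hidden_size)
    else:
        if batch_first:
            inp = (batch_size, seq_len, input_size)
            out = (batch_size, seq_len, hidden_size)
        else:
            inp = (seq_len, batch_size, input_size)
            out = (seq_len, batch_size, hidden_size)
        # The state layout does not depend on batch_first.
        state = (num_layers, batch_size, hidden_size)
    return {"input": inp, "h_0": state, "c_0": state,
            "output": out, "h_n": state, "c_n": state}


shapes = ran_shapes(seq_len=5, input_size=16, hidden_size=32,
                    num_layers=2, batch_size=3)
print(shapes["input"], shapes["output"], shapes["h_n"])
# (5, 3, 16) (5, 3, 32) (2, 3, 32)
```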
 
 - cells.{k}.weight_ih
- the learnable input–hidden weights of the \(k\)-th layer, of shape (3*hidden_size, input_size) for k=0, otherwise (3*hidden_size, hidden_size). 
 - cells.{k}.weight_hh
- the learnable hidden–hidden weights of the \(k\)-th layer, of shape (2*hidden_size, hidden_size). 
 - cells.{k}.bias_ih
- the learnable input–hidden biases of the \(k\)-th layer, of shape (3*hidden_size). Only present when bias=True.
 - cells.{k}.bias_hh
- the learnable hidden–hidden biases of the \(k\)-th layer, of shape (2*hidden_size). Only present when recurrent_bias=True.
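As a worked example of the shapes listed above, the following sketch enumerates the expected parameter shapes per layer and totals the parameter count. The helper name ran_param_shapes is hypothetical and the code is pure arithmetic over the documented shapes; it does not inspect a real module.

```python
import math


def ran_param_shapes(input_size, hidden_size, num_layers=1,
                     bias=True, recurrent_bias=True):
    """Map documented parameter names to their documented shapes."""
    shapes = {}
    for k in range(num_layers):
        # Layer 0 reads the input; deeper layers read the previous layer's output.
        in_features = input_size if k == 0 else hidden_size
        shapes[f"cells.{k}.weight_ih"] = (3 * hidden_size, in_features)
        shapes[f"cells.{k}.weight_hh"] = (2 * hidden_size, hidden_size)
        if bias:
            shapes[f"cells.{k}.bias_ih"] = (3 * hidden_size,)
        if recurrent_bias:
            shapes[f"cells.{k}.bias_hh"] = (2 * hidden_size,)
    return shapes


param_shapes = ran_param_shapes(16, 32, num_layers=2)
total = sum(math.prod(shape) for shape in param_shapes.values())
print(total)  # 9024
```

For input_size=16, hidden_size=32 and two layers, layer 0 contributes 1536 + 2048 + 96 + 64 = 3744 parameters and layer 1 contributes 3072 + 2048 + 96 + 64 = 5280, giving 9024 in total.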
 - See also
 - Examples:
>>> import torch
>>> from torchrecurrent import RAN
>>> rnn = RAN(16, 32, num_layers=2)
>>> x = torch.randn(5, 3, 16)  # (seq_len, batch, input_size)
>>> h0 = torch.zeros(2, 3, 32)
>>> c0 = torch.zeros(2, 3, 32)
>>> output, (hn, cn) = rnn(x, (h0, c0))
- __init__(input_size, hidden_size, num_layers=1, dropout=0.0, batch_first=False, **kwargs)
- Initialize internal Module state, shared by both nn.Module and ScriptModule. 
 - Methods
- __init__(input_size, hidden_size[, ...]) – Initialize internal Module state, shared by both nn.Module and ScriptModule.
- add_module(name, module) – Add a child module to the current module.
- apply(fn) – Apply fn recursively to every submodule (as returned by .children()) as well as self.
- bfloat16() – Casts all floating point parameters and buffers to bfloat16 datatype.
- buffers([recurse]) – Return an iterator over module buffers.
- children() – Return an iterator over immediate children modules.
- compile(*args, **kwargs) – Compile this Module's forward using torch.compile().
- cpu() – Move all model parameters and buffers to the CPU.
- cuda([device]) – Move all model parameters and buffers to the GPU.
- double() – Casts all floating point parameters and buffers to double datatype.
- eval() – Set the module in evaluation mode.
- extra_repr() – Return the extra representation of the module.
- float() – Casts all floating point parameters and buffers to float datatype.
- forward(inp[, state]) – Define the computation performed at every call.
- get_buffer(target) – Return the buffer given by target if it exists, otherwise throw an error.
- get_extra_state() – Return any extra state to include in the module's state_dict.
- get_parameter(target) – Return the parameter given by target if it exists, otherwise throw an error.
- get_submodule(target) – Return the submodule given by target if it exists, otherwise throw an error.
- half() – Casts all floating point parameters and buffers to half datatype.
- initialize_cells(cell_class, **kwargs) – Helper method to initialize cells for the derived recurrent layer class.
- ipu([device]) – Move all model parameters and buffers to the IPU.
- load_state_dict(state_dict[, strict, assign]) – Copy parameters and buffers from state_dict into this module and its descendants.
- modules() – Return an iterator over all modules in the network.
- mtia([device]) – Move all model parameters and buffers to the MTIA.
- named_buffers([prefix, recurse, ...]) – Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
- named_children() – Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
- named_modules([memo, prefix, remove_duplicate]) – Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
- named_parameters([prefix, recurse, ...]) – Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
- parameters([recurse]) – Return an iterator over module parameters.
- register_backward_hook(hook) – Register a backward hook on the module.
- register_buffer(name, tensor[, persistent]) – Add a buffer to the module.
- register_forward_hook(hook, *[, prepend, ...]) – Register a forward hook on the module.
- register_forward_pre_hook(hook, *[, ...]) – Register a forward pre-hook on the module.
- register_full_backward_hook(hook[, prepend]) – Register a backward hook on the module.
- register_full_backward_pre_hook(hook[, prepend]) – Register a backward pre-hook on the module.
- register_load_state_dict_post_hook(hook) – Register a post-hook to be run after module's load_state_dict() is called.
- register_load_state_dict_pre_hook(hook) – Register a pre-hook to be run before module's load_state_dict() is called.
- register_module(name, module) – Alias for add_module().
- register_parameter(name, param) – Add a parameter to the module.
- register_state_dict_post_hook(hook) – Register a post-hook for the state_dict() method.
- register_state_dict_pre_hook(hook) – Register a pre-hook for the state_dict() method.
- requires_grad_([requires_grad]) – Change if autograd should record operations on parameters in this module.
- set_extra_state(state) – Set extra state contained in the loaded state_dict.
- set_submodule(target, module[, strict]) – Set the submodule given by target if it exists, otherwise throw an error.
- share_memory() – See torch.Tensor.share_memory_().
- state_dict(*args[, destination, prefix, ...]) – Return a dictionary containing references to the whole state of the module.
- to(*args, **kwargs) – Move and/or cast the parameters and buffers.
- to_empty(*, device[, recurse]) – Move the parameters and buffers to the specified device without copying storage.
- train([mode]) – Set the module in training mode.
- type(dst_type) – Casts all parameters and buffers to dst_type.
- xpu([device]) – Move all model parameters and buffers to the XPU.
- zero_grad([set_to_none]) – Reset gradients of all model parameters.
 - Attributes
- T_destination
- call_super_init
- dump_patches
- input_size
- hidden_size
- bias
- dropout
- batch_first
- training
