torchrecurrent.STARCell
- class torchrecurrent.STARCell(input_size, hidden_size, bias=True, recurrent_bias=True, kernel_init=<function xavier_uniform_>, recurrent_kernel_init=<function xavier_uniform_>, bias_init=<function zeros_>, recurrent_bias_init=<function zeros_>, device=None, dtype=None)
- A Stackable Recurrent (STAR) cell. [arXiv]

  \[\begin{split}\begin{aligned}
  \mathbf{z}(t) &= \tanh\bigl(\mathbf{W}_{ih}^{z}\,\mathbf{x}(t) + \mathbf{b}_{ih}^{z}\bigr), \\
  \mathbf{k}(t) &= \sigma\bigl(\mathbf{W}_{ih}^{k}\,\mathbf{x}(t) + \mathbf{b}_{ih}^{k} + \mathbf{W}_{hh}^{k}\,\mathbf{h}(t-1) + \mathbf{b}_{hh}^{k}\bigr), \\
  \mathbf{h}(t) &= \tanh\bigl((1 - \mathbf{k}(t)) \circ \mathbf{h}(t-1) + \mathbf{k}(t) \circ \mathbf{z}(t)\bigr),
  \end{aligned}\end{split}\]

  where \(\sigma\) is the sigmoid function and \(\circ\) denotes element-wise multiplication.

- Parameters:
- input_size – Number of features in the input \(\mathbf{x}(t)\) 
- hidden_size – Number of features in the hidden state \(\mathbf{h}(t)\) 
- bias – If False, the layer does not use the input-side bias \(\mathbf{b}_{ih}\). Default: True
- recurrent_bias – If False, the layer does not use the recurrent bias \(\mathbf{b}_{hh}\). Default: True
- kernel_init – Initializer for weight_ih. Default: torch.nn.init.xavier_uniform_()
- recurrent_kernel_init – Initializer for weight_hh. Default: torch.nn.init.xavier_uniform_()
- bias_init – Initializer for bias_ih when bias=True. Default: torch.nn.init.zeros_()
- recurrent_bias_init – Initializer for bias_hh when recurrent_bias=True. Default: torch.nn.init.zeros_()
- device – The desired device of parameters 
- dtype – The desired floating point type of parameters 
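
The default initializers listed above are functions from torch.nn.init that take a tensor and fill it in place. The sketch below applies them by hand to tensors with the shapes documented for this cell. It also shows two alternatives, on the assumption (not confirmed by this page) that any callable with the same one-tensor, in-place signature would be accepted. The name scaled_xavier is hypothetical.

```python
from functools import partial

import torch
from torch import nn

input_size, hidden_size = 16, 32

# Plain tensors with the shapes documented for weight_ih, weight_hh and bias_ih.
weight_ih = torch.empty(2 * hidden_size, input_size)
weight_hh = torch.empty(hidden_size, hidden_size)
bias_ih = torch.empty(2 * hidden_size)

# The documented defaults, applied by hand.
nn.init.xavier_uniform_(weight_ih)
nn.init.zeros_(bias_ih)

# Hypothetical alternative: partial() fixes the extra `gain` argument so the
# result is still a one-argument, in-place initializer.
scaled_xavier = partial(nn.init.xavier_uniform_, gain=0.5)
scaled_xavier(weight_ih)

# Another candidate for the square recurrent matrix: an orthogonal initializer.
nn.init.orthogonal_(weight_hh)
```

Whether the constructor accepts such callables unchanged should be checked against the library source before relying on it.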
 
 - Inputs: input, hidden
- input of shape (batch, input_size) or (input_size,): tensor containing input features
- hidden of shape (batch, hidden_size) or (hidden_size,): tensor containing the previous hidden state
 - If hidden is not provided, it defaults to zero. 
- Outputs: h_1
- h_1 of shape (batch, hidden_size) or (hidden_size,): tensor containing the next hidden state
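
The shape conventions above (batched or unbatched input, hidden state defaulting to zero) can be illustrated with a small helper. This is a minimal sketch of one way to normalise the arguments, not the library's own code. The function name prepare_step_inputs and its hidden_size argument are hypothetical.

```python
import torch


def prepare_step_inputs(inp, hidden=None, hidden_size=32):
    """Hypothetical helper: bring (inp, hidden) to batched 2-D tensors."""
    unbatched = inp.dim() == 1
    if unbatched:
        # (input_size,) -> (1, input_size)
        inp = inp.unsqueeze(0)
    if hidden is None:
        # A missing hidden state is replaced by zeros of shape (batch, hidden_size).
        hidden = torch.zeros(
            inp.size(0), hidden_size, dtype=inp.dtype, device=inp.device
        )
    elif hidden.dim() == 1:
        # (hidden_size,) -> (1, hidden_size)
        hidden = hidden.unsqueeze(0)
    return inp, hidden, unbatched


# Unbatched input without a hidden state.
x_single = torch.randn(16)
inp_1, hidden_1, was_unbatched_1 = prepare_step_inputs(x_single)

# Batched input with an explicit hidden state.
x_batch = torch.randn(8, 16)
h_batch = torch.randn(8, 32)
inp_2, hidden_2, was_unbatched_2 = prepare_step_inputs(x_batch, h_batch)
```

A caller using such a helper would squeeze the batch dimension back out of the result when the unbatched flag is set, so that an (input_size,) input yields a (hidden_size,) output.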
 
 - Variables:
- weight_ih – input–hidden weights, of shape (2*hidden_size, input_size) (first half W_{ih}^z, second half W_{ih}^k)
- weight_hh – hidden–hidden weights for gate k, of shape (hidden_size, hidden_size)
- bias_ih – input biases [b_{ih}^z, b_{ih}^k], of shape (2*hidden_size,) if bias=True
- bias_hh – hidden bias for gate k, of shape (hidden_size,) if recurrent_bias=True
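
To make the packed layout concrete, the update equations can be written in plain PyTorch against tensors of the documented shapes. This is a sketch derived from the formulas and the "first half z, second half k" layout stated above. It is not the library's forward implementation, and the function name star_step is hypothetical.

```python
import torch


def star_step(x, h_prev, weight_ih, weight_hh, bias_ih=None, bias_hh=None):
    """One STAR update written directly from the documented equations."""
    hidden_size = weight_hh.size(0)
    # A single matmul gives the input projections of z and k: (..., 2*hidden_size).
    gates_x = x @ weight_ih.T
    if bias_ih is not None:
        gates_x = gates_x + bias_ih
    # First half belongs to z, second half to k.
    z_in, k_in = gates_x.split(hidden_size, dim=-1)
    # Only the gate k has a recurrent term.
    k_rec = h_prev @ weight_hh.T
    if bias_hh is not None:
        k_rec = k_rec + bias_hh
    z = torch.tanh(z_in)
    k = torch.sigmoid(k_in + k_rec)
    return torch.tanh((1 - k) * h_prev + k * z)


torch.manual_seed(0)
batch, input_size, hidden_size = 8, 16, 32
weight_ih = torch.randn(2 * hidden_size, input_size) * 0.1
weight_hh = torch.randn(hidden_size, hidden_size) * 0.1
bias_ih = torch.zeros(2 * hidden_size)
bias_hh = torch.zeros(hidden_size)

x = torch.randn(batch, input_size)
h0 = torch.zeros(batch, hidden_size)
h1 = star_step(x, h0, weight_ih, weight_hh, bias_ih, bias_hh)  # (batch, hidden_size)
```

Because the new state passes through a final tanh, every entry of the hidden state stays within [-1, 1], whatever the weights are.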
 
 - Examples:

   >>> import torch
   >>> from torchrecurrent import STARCell
   >>> cell = STARCell(16, 32)
   >>> seq = torch.randn(10, 8, 16)  # (time, batch, input_size)
   >>> h = torch.zeros(8, 32)  # (batch, hidden_size)
   >>> outs = []
   >>> for t in range(seq.size(0)):
   ...     h = cell(seq[t], h)
   ...     outs.append(h)
   >>> outs = torch.stack(outs, dim=0)  # (time, batch, hidden_size)

- __init__(input_size, hidden_size, bias=True, recurrent_bias=True, kernel_init=<function xavier_uniform_>, recurrent_kernel_init=<function xavier_uniform_>, bias_init=<function zeros_>, recurrent_bias_init=<function zeros_>, device=None, dtype=None)
- Initialize internal Module state, shared by both nn.Module and ScriptModule. 
 - Methods
- __init__(input_size, hidden_size[, bias, ...]) – Initialize internal Module state, shared by both nn.Module and ScriptModule.
- add_module(name, module) – Add a child module to the current module.
- apply(fn) – Apply fn recursively to every submodule (as returned by .children()) as well as self.
- bfloat16() – Casts all floating point parameters and buffers to bfloat16 datatype.
- buffers([recurse]) – Return an iterator over module buffers.
- children() – Return an iterator over immediate children modules.
- compile(*args, **kwargs) – Compile this Module's forward using torch.compile().
- cpu() – Move all model parameters and buffers to the CPU.
- cuda([device]) – Move all model parameters and buffers to the GPU.
- double() – Casts all floating point parameters and buffers to double datatype.
- eval() – Set the module in evaluation mode.
- extra_repr() – Return the extra representation of the module.
- float() – Casts all floating point parameters and buffers to float datatype.
- forward(inp[, state]) – Run one step of the recurrent cell.
- get_buffer(target) – Return the buffer given by target if it exists, otherwise throw an error.
- get_extra_state() – Return any extra state to include in the module's state_dict.
- get_parameter(target) – Return the parameter given by target if it exists, otherwise throw an error.
- get_submodule(target) – Return the submodule given by target if it exists, otherwise throw an error.
- half() – Casts all floating point parameters and buffers to half datatype.
- init_weights()
- ipu([device]) – Move all model parameters and buffers to the IPU.
- load_state_dict(state_dict[, strict, assign]) – Copy parameters and buffers from state_dict into this module and its descendants.
- modules() – Return an iterator over all modules in the network.
- mtia([device]) – Move all model parameters and buffers to the MTIA.
- named_buffers([prefix, recurse, ...]) – Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
- named_children() – Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
- named_modules([memo, prefix, remove_duplicate]) – Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
- named_parameters([prefix, recurse, ...]) – Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
- parameters([recurse]) – Return an iterator over module parameters.
- register_backward_hook(hook) – Register a backward hook on the module.
- register_buffer(name, tensor[, persistent]) – Add a buffer to the module.
- register_forward_hook(hook, *[, prepend, ...]) – Register a forward hook on the module.
- register_forward_pre_hook(hook, *[, ...]) – Register a forward pre-hook on the module.
- register_full_backward_hook(hook[, prepend]) – Register a backward hook on the module.
- register_full_backward_pre_hook(hook[, prepend]) – Register a backward pre-hook on the module.
- register_load_state_dict_post_hook(hook) – Register a post-hook to be run after module's load_state_dict() is called.
- register_load_state_dict_pre_hook(hook) – Register a pre-hook to be run before module's load_state_dict() is called.
- register_module(name, module) – Alias for add_module().
- register_parameter(name, param) – Add a parameter to the module.
- register_state_dict_post_hook(hook) – Register a post-hook for the state_dict() method.
- register_state_dict_pre_hook(hook) – Register a pre-hook for the state_dict() method.
- requires_grad_([requires_grad]) – Change if autograd should record operations on parameters in this module.
- set_extra_state(state) – Set extra state contained in the loaded state_dict.
- set_submodule(target, module[, strict]) – Set the submodule given by target if it exists, otherwise throw an error.
- share_memory() – See torch.Tensor.share_memory_().
- state_dict(*args[, destination, prefix, ...]) – Return a dictionary containing references to the whole state of the module.
- to(*args, **kwargs) – Move and/or cast the parameters and buffers.
- to_empty(*, device[, recurse]) – Move the parameters and buffers to the specified device without copying storage.
- train([mode]) – Set the module in training mode.
- type(dst_type) – Casts all parameters and buffers to dst_type.
- uses_double_state() – Return True if forward returns (h, c), else just h.
- xpu([device]) – Move all model parameters and buffers to the XPU.
- zero_grad([set_to_none]) – Reset gradients of all model parameters.
 - Attributes
- T_destination
- call_super_init
- dump_patches
- weight_ih
- weight_hh
- bias_ih
- bias_hh
- input_size
- hidden_size
- bias
- recurrent_bias
- training
