torchrecurrent.benchmarks.copy_memory
- torchrecurrent.benchmarks.copy_memory(seq_len, n_samples, num_classes=10, **kwargs)
Generate data for the copy memory benchmark.
The copy memory task is a synthetic sequence learning problem where a model must memorize and reproduce an input sequence after a long delay. Each sample consists of:
A random sequence of integers (the content to be memorized).
A delimiter symbol marking the end of the input.
A sequence of zeros acting as distractors.
The target sequence requires the model to output padding until the delimiter, then reproduce the original random sequence.
- Parameters:
seq_len (int) – Length of the random sequence to memorize.
n_samples (int) – Number of samples to generate.
num_classes (int, optional) – Number of distinct classes used for the random sequence. Defaults to 10. The delimiter token uses the value num_classes.
**kwargs – Additional keyword arguments passed to torch.utils.data.DataLoader (e.g. batch_size, shuffle).
- Returns:
A DataLoader yielding batches of (input_seq, target_seq) where:
input_seq has shape (n_samples, 2 * seq_len + 1) and contains the random sequence, followed by a delimiter token, followed by distractor zeros.
target_seq has shape (n_samples, 2 * seq_len + 1) and contains padding, then the delimiter, followed by the original random sequence.
- Return type:
torch.utils.data.DataLoader
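The per-sample layout described in the Returns section can be sketched in plain Python. This is an illustration of the documented data layout only, not the torchrecurrent implementation (which generates torch tensors and wraps them in a DataLoader); the helper name below is hypothetical.

```python
import random

def make_copy_memory_sample(seq_len, num_classes=10, seed=0):
    """Build one (input_seq, target_seq) pair following the documented layout."""
    rng = random.Random(seed)
    delimiter = num_classes  # the delimiter token uses the value num_classes
    # Content to memorize: random integers in [0, num_classes).
    content = [rng.randrange(num_classes) for _ in range(seq_len)]
    # Input: content, then the delimiter, then seq_len distractor zeros.
    input_seq = content + [delimiter] + [0] * seq_len
    # Target: padding, then the delimiter, then the original content.
    target_seq = [0] * seq_len + [delimiter] + content
    return input_seq, target_seq

inp, tgt = make_copy_memory_sample(seq_len=5)
# Both sequences have length 2 * seq_len + 1, the delimiter sits at index
# seq_len, and the tail of the target reproduces the head of the input.
```

The model's task is then exactly what the description states: emit padding while reading the input, and reproduce the memorized content only after seeing the delimiter.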