# Loss Functions

## MLE Loss

### sequence_softmax_cross_entropy

texar.losses.sequence_softmax_cross_entropy(labels, logits, sequence_length, average_across_batch=True, average_across_timesteps=False, sum_over_batch=False, sum_over_timesteps=True, time_major=False, stop_gradient_to_label=False, name=None)

Computes softmax cross entropy for each time step of sequence predictions.

Parameters:

- labels – Target class distributions. If time_major is False (default), this must be a Tensor of shape [batch_size, max_time, num_classes]. If time_major is True, this must be a Tensor of shape [max_time, batch_size, num_classes]. Each row of labels should be a valid probability distribution; otherwise, the computed gradient will be incorrect.
- logits – Unscaled log probabilities. This must have shape [max_time, batch_size, num_classes] or [batch_size, max_time, num_classes], according to the value of time_major.
- sequence_length – A Tensor of shape [batch_size]. Time steps beyond the respective sequence lengths will have zero losses.
- average_across_timesteps (bool) – If set, average the loss across the time dimension. Must not be set together with sum_over_timesteps.
- average_across_batch (bool) – If set, average the loss across the batch dimension. Must not be set together with sum_over_batch.
- sum_over_timesteps (bool) – If set, sum the loss across the time dimension. Must not be set together with average_across_timesteps.
- sum_over_batch (bool) – If set, sum the loss across the batch dimension. Must not be set together with average_across_batch.
- time_major (bool) – The shape format of the inputs. If True, labels and logits must have shape [max_time, batch_size, …]. If False (default), they must have shape [batch_size, max_time, …].
- stop_gradient_to_label (bool) – If set, gradient propagation to labels will be disabled.
- name (str, optional) – A name for the operation.

Returns: A Tensor containing the loss, of rank 0, 1, or 2 depending on the arguments {average_across}/{sum_over}_{timesteps}/{batch}. For example:

- If sum_over_timesteps and average_across_batch are True (default), the returned Tensor is of rank 0.
- If average_across_batch is True and the other arguments are False, the returned Tensor is of shape [max_time].
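
The masking and default reductions described above can be sketched in NumPy. This is an illustrative re-implementation under stated assumptions (batch-major inputs, default reduction flags), not Texar's code; the function name sequence_softmax_xent_sketch is hypothetical.

```python
import numpy as np

def sequence_softmax_xent_sketch(labels, logits, sequence_length):
    """Masked softmax cross entropy with the default reductions.

    labels, logits: [batch_size, max_time, num_classes] (time_major=False).
    sequence_length: [batch_size].
    """
    labels = np.asarray(labels, dtype=np.float64)
    logits = np.asarray(logits, dtype=np.float64)
    # Numerically stable log-softmax over the class dimension.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Cross entropy per time step, shape [batch_size, max_time].
    step_loss = -(labels * log_probs).sum(axis=-1)
    # Steps at or beyond each sequence length contribute zero loss.
    steps = np.arange(logits.shape[1])[None, :]
    mask = steps < np.asarray(sequence_length)[:, None]
    step_loss = step_loss * mask
    # sum_over_timesteps=True, then average_across_batch=True.
    return step_loss.sum(axis=1).mean(axis=0)

# Uniform logits over 4 classes: every valid step costs log(4).
class_ids = np.array([[0, 1, 2], [3, 0, 0]])
labels = np.eye(4)[class_ids]          # one-hot, shape [2, 3, 4]
logits = np.zeros((2, 3, 4))
loss = sequence_softmax_xent_sketch(labels, logits, sequence_length=[3, 1])
```

With lengths 3 and 1, the two sequences contribute 3·log(4) and log(4), so the batch average is 2·log(4).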

### sequence_sparse_softmax_cross_entropy

texar.losses.sequence_sparse_softmax_cross_entropy(labels, logits, sequence_length, average_across_batch=True, average_across_timesteps=False, sum_over_batch=False, sum_over_timesteps=True, time_major=False, name=None)

Computes sparse softmax cross entropy for each time step of sequence predictions.

Parameters:

- labels – Target class indexes, i.e., classes are mutually exclusive (each entry is in exactly one class). If time_major is False (default), this must be a Tensor of shape [batch_size, max_time]. If time_major is True, this must be a Tensor of shape [max_time, batch_size].
- logits – Unscaled log probabilities. This must have shape [max_time, batch_size, num_classes] or [batch_size, max_time, num_classes], according to the value of time_major.
- sequence_length – A Tensor of shape [batch_size]. Time steps beyond the respective sequence lengths will have zero losses.
- average_across_timesteps (bool) – If set, average the loss across the time dimension. Must not be set together with sum_over_timesteps.
- average_across_batch (bool) – If set, average the loss across the batch dimension. Must not be set together with sum_over_batch.
- sum_over_timesteps (bool) – If set, sum the loss across the time dimension. Must not be set together with average_across_timesteps.
- sum_over_batch (bool) – If set, sum the loss across the batch dimension. Must not be set together with average_across_batch.
- time_major (bool) – The shape format of the inputs. If True, labels and logits must have shape [max_time, batch_size, …]. If False (default), they must have shape [batch_size, max_time, …].
- name (str, optional) – A name for the operation.

Returns: A Tensor containing the loss, of rank 0, 1, or 2 depending on the arguments {average_across}/{sum_over}_{timesteps}/{batch}. For example:

- If sum_over_timesteps and average_across_batch are True (default), the returned Tensor is of rank 0.
- If average_across_batch is True and the other arguments are False, the returned Tensor is of shape [max_time].

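Example

    from texar.losses import sequence_sparse_softmax_cross_entropy
    from texar.modules import BasicRNNDecoder, WordEmbedder

    # `data` and `data_batch` are assumed to come from a Texar text dataset
    # and its iterator; they are not defined here.
    embedder = WordEmbedder(vocab_size=data.vocab.size)
    decoder = BasicRNNDecoder(vocab_size=data.vocab.size)
    # The last token is never fed as an input, hence length - 1.
    outputs, _, _ = decoder(
        decoding_strategy='train_greedy',
        inputs=embedder(data_batch['text_ids']),
        sequence_length=data_batch['length'] - 1)

    # Targets are the input tokens shifted left by one step.
    loss = sequence_sparse_softmax_cross_entropy(
        labels=data_batch['text_ids'][:, 1:],
        logits=outputs.logits,
        sequence_length=data_batch['length'] - 1)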
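
To make the effect of the reduction flags on the output shape concrete, here is a small NumPy sketch. It is not Texar's implementation: sparse_step_losses is a hypothetical helper, and plain sums and means stand in for the library's reductions.

```python
import numpy as np

def sparse_step_losses(labels, logits):
    """Sparse softmax cross entropy per time step, shape [batch_size, max_time]."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Select the log-probability of the target class index at every step.
    index = np.asarray(labels)[..., None]
    return -np.take_along_axis(log_probs, index, axis=-1)[..., 0]

rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 3, 5))        # [batch_size, max_time, num_classes]
labels = np.array([[1, 4, 0], [2, 0, 0]])  # [batch_size, max_time]
lengths = np.array([3, 1])

# Zero out the losses of padded time steps.
mask = np.arange(3)[None, :] < lengths[:, None]
step_loss = sparse_step_losses(labels, logits) * mask

# Default flags (sum over time, average across batch) give a rank-0 result.
scalar_loss = step_loss.sum(axis=1).mean(axis=0)
# Only average_across_batch=True gives one value per time step: [max_time].
per_step_loss = step_loss.mean(axis=0)
```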


### sequence_sigmoid_cross_entropy

texar.losses.sequence_sigmoid_cross_entropy(labels, logits, sequence_length, average_across_batch=True, average_across_timesteps=False, average_across_classes=True, sum_over_batch=False, sum_over_timesteps=True, sum_over_classes=False, time_major=False, stop_gradient_to_label=False, name=None)

Computes sigmoid cross entropy for each time step of sequence predictions.

Parameters:

- labels – Target class distributions. If time_major is False (default), this must be a Tensor of shape [batch_size, max_time(, num_classes)]. If time_major is True, this must be a Tensor of shape [max_time, batch_size(, num_classes)]. Each row of labels should be a valid probability distribution; otherwise, the computed gradient will be incorrect.
- logits – Unscaled log probabilities with the same shape as labels.
- sequence_length – A Tensor of shape [batch_size]. Time steps beyond the respective sequence lengths will have zero losses.
- average_across_timesteps (bool) – If set, average the loss across the time dimension. Must not be set together with sum_over_timesteps.
- average_across_batch (bool) – If set, average the loss across the batch dimension. Must not be set together with sum_over_batch.
- average_across_classes (bool) – If set, average the loss across the class dimension (if it exists). Must not be set together with sum_over_classes. Ignored if logits is a 2D Tensor.
- sum_over_timesteps (bool) – If set, sum the loss across the time dimension. Must not be set together with average_across_timesteps.
- sum_over_batch (bool) – If set, sum the loss across the batch dimension. Must not be set together with average_across_batch.
- sum_over_classes (bool) – If set, sum the loss across the class dimension. Must not be set together with average_across_classes. Ignored if logits is a 2D Tensor.
- time_major (bool) – The shape format of the inputs. If True, labels and logits must have shape [max_time, batch_size, …]. If False (default), they must have shape [batch_size, max_time, …].
- stop_gradient_to_label (bool) – If set, gradient propagation to labels will be disabled.
- name (str, optional) – A name for the operation.

Returns: A Tensor containing the loss, of rank 0, 1, 2, or 3 depending on the arguments {average_across}/{sum_over}_{timesteps}/{batch}/{classes}. For example, if the class dimension does not exist:

- If sum_over_timesteps and average_across_batch are True (default), the returned Tensor is of rank 0.
- If average_across_batch is True and the other arguments are False, the returned Tensor is of shape [max_time].

### binary_sigmoid_cross_entropy

texar.losses.binary_sigmoid_cross_entropy(pos_logits=None, neg_logits=None, average_across_batch=True, average_across_classes=True, sum_over_batch=False, sum_over_classes=False, return_pos_neg_losses=False, name=None)

Computes sigmoid cross entropy of binary predictions.

Parameters:

- pos_logits – The logits of predicting positive on positive data. A Tensor of shape [batch_size(, num_classes)].
- neg_logits – The logits of predicting positive on negative data. A Tensor of shape [batch_size(, num_classes)].
- average_across_batch (bool) – If set, average the loss across the batch dimension. Must not be set together with sum_over_batch.
- average_across_classes (bool) – If set, average the loss across the class dimension (if it exists). Must not be set together with sum_over_classes. Ignored if logits is a 1D Tensor.
- sum_over_batch (bool) – If set, sum the loss across the batch dimension. Must not be set together with average_across_batch.
- sum_over_classes (bool) – If set, sum the loss across the class dimension. Must not be set together with average_across_classes. Ignored if logits is a 1D Tensor.
- return_pos_neg_losses (bool) – If set, additionally returns the losses on pos_logits and neg_logits, respectively.
- name (str, optional) – A name for the operation.

Returns: By default, a Tensor containing the loss, of rank 0, 1, or 2 depending on the arguments {average_across}/{sum_over}_{batch}/{classes}. For example:

- If average_across_batch and average_across_classes are True (default), the returned Tensor is of rank 0.
- If all arguments are False, the returned Tensor is of shape [batch_size(, num_classes)].

If return_pos_neg_losses is True, returns a tuple (loss, pos_loss, neg_loss), where loss is the loss above, pos_loss is the loss on pos_logits only, and neg_loss is the loss on neg_logits only. They satisfy loss = pos_loss + neg_loss.
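
The relation loss = pos_loss + neg_loss can be illustrated with a short NumPy sketch. It assumes 1D logits and batch averaging only; the helper names sigmoid_xent and binary_sigmoid_xent_sketch are hypothetical, and the code is not taken from Texar.

```python
import numpy as np

def sigmoid_xent(logits, labels):
    """Elementwise sigmoid cross entropy in a numerically stable form."""
    x = np.asarray(logits, dtype=np.float64)
    return np.maximum(x, 0.0) - x * labels + np.log1p(np.exp(-np.abs(x)))

def binary_sigmoid_xent_sketch(pos_logits, neg_logits):
    """Returns (loss, pos_loss, neg_loss), each averaged across the batch."""
    pos_loss = sigmoid_xent(pos_logits, 1.0).mean()  # positive data, label 1
    neg_loss = sigmoid_xent(neg_logits, 0.0).mean()  # negative data, label 0
    return pos_loss + neg_loss, pos_loss, neg_loss

# Zero logits mean probability 0.5 everywhere, so each part costs log(2).
loss, pos_loss, neg_loss = binary_sigmoid_xent_sketch(
    pos_logits=np.array([0.0, 0.0]), neg_logits=np.array([0.0]))
```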

### binary_sigmoid_cross_entropy_with_clas

texar.losses.binary_sigmoid_cross_entropy_with_clas(clas_fn, pos_inputs=None, neg_inputs=None, average_across_batch=True, average_across_classes=True, sum_over_batch=False, sum_over_classes=False, return_pos_neg_losses=False, name=None)

Computes sigmoid cross entropy of a binary classifier.

Parameters:

- clas_fn – A callable that takes data (e.g., pos_inputs and neg_inputs) and returns the logits of being positive. The signature of clas_fn must be: logits (, …) = clas_fn(inputs). The return value of clas_fn can be the logits, or a tuple where the logits are the first element.
- pos_inputs – The positive data fed into clas_fn.
- neg_inputs – The negative data fed into clas_fn.
- average_across_batch (bool) – If set, average the loss across the batch dimension. Must not be set together with sum_over_batch.
- average_across_classes (bool) – If set, average the loss across the class dimension (if it exists). Must not be set together with sum_over_classes. Ignored if logits is a 1D Tensor.
- sum_over_batch (bool) – If set, sum the loss across the batch dimension. Must not be set together with average_across_batch.
- sum_over_classes (bool) – If set, sum the loss across the class dimension. Must not be set together with average_across_classes. Ignored if logits is a 1D Tensor.
- return_pos_neg_losses (bool) – If set, additionally returns the losses on the positive and negative logits, respectively.
- name (str, optional) – A name for the operation.

Returns: By default, a Tensor containing the loss, of rank 0, 1, or 2 depending on the arguments {average_across}/{sum_over}_{batch}/{classes}. For example:

- If average_across_batch and average_across_classes are True (default), the returned Tensor is of rank 0.
- If all arguments are False, the returned Tensor is of shape [batch_size(, num_classes)].

If return_pos_neg_losses is True, returns a tuple (loss, pos_loss, neg_loss), where loss is the loss above, pos_loss is the loss on the positive logits only, and neg_loss is the loss on the negative logits only. They satisfy loss = pos_loss + neg_loss.

### pg_loss_with_logits

texar.losses.pg_loss_with_logits(actions, logits, advantages, rank=None, batched=False, sequence_length=None, average_across_batch=True, average_across_timesteps=False, average_across_remaining=False, sum_over_batch=False, sum_over_timesteps=True, sum_over_remaining=True, time_major=False)

Policy gradient loss with logits. Used for discrete actions.

pg_loss = reduce( advantages * -log_prob( actions ) ), where advantages and actions do not back-propagate gradients.

All arguments except logits and actions are the same as in pg_loss_with_log_probs().

Parameters:

- actions – Tensor of shape [(batch_size,) max_time, d_3, …, d_rank] and of dtype int32 or int64. The rank of the Tensor is specified with rank. The batch dimension exists only if batched is True. The batch and time dimensions are exchanged, i.e., [max_time, batch_size, …], if time_major is True.
- logits – Unscaled log probabilities of shape [(batch_size,) max_time, d_3, …, d_{rank+1}] and dtype float32 or float64. The batch and time dimensions are exchanged if time_major is True.
- advantages – Tensor of shape [(batch_size,) max_time, d_3, …, d_rank] and dtype float32 or float64. The batch and time dimensions are exchanged if time_major is True.
- rank (int, optional) – The rank of actions. If None (default), rank is automatically inferred from actions or advantages. If the inference fails, rank is set to 1 if batched is False, and to 2 if batched is True.
- batched (bool) – True if the inputs are batched.
- sequence_length (optional) – A Tensor of shape [batch_size]. Time steps beyond the respective sequence lengths will have zero losses. Used if batched is True.
- average_across_timesteps (bool) – If set, average the loss across the time dimension. Must not be set together with sum_over_timesteps.
- average_across_batch (bool) – If set, average the loss across the batch dimension. Must not be set together with sum_over_batch. Ignored if batched is False.
- average_across_remaining (bool) – If set, average the loss across the remaining dimensions. Must not be set together with sum_over_remaining. Ignored if there are no dimensions other than the batch and time dimensions.
- sum_over_timesteps (bool) – If set, sum the loss across the time dimension. Must not be set together with average_across_timesteps.
- sum_over_batch (bool) – If set, sum the loss across the batch dimension. Must not be set together with average_across_batch. Ignored if batched is False.
- sum_over_remaining (bool) – If set, sum the loss across the remaining dimensions. Must not be set together with average_across_remaining. Ignored if there are no dimensions other than the batch and time dimensions.
- time_major (bool) – The shape format of the inputs. If True, logits, actions and advantages must have shape [max_time, batch_size, …]. If False (default), they must have shape [batch_size, max_time, …]. Ignored if batched is False.

Returns: A Tensor containing the loss to minimize, whose rank depends on the reduce arguments. For example, the batch dimension is reduced if either average_across_batch or sum_over_batch is True, which decreases the rank of the output Tensor by 1.
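
The formula pg_loss = reduce(advantages * -log_prob(actions)) can be sketched in NumPy for the batched, rank-2 case. This is a sketch under stated assumptions (batch-major inputs, default reduction flags, no gradient handling), and pg_loss_with_logits_sketch is a hypothetical name rather than the library function.

```python
import numpy as np

def pg_loss_with_logits_sketch(actions, logits, advantages, sequence_length):
    """Batched case: actions and advantages are [batch_size, max_time],
    logits is [batch_size, max_time, num_actions]."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # log_prob(actions): pick the log-probability of each taken action.
    index = np.asarray(actions)[..., None]
    action_log_probs = np.take_along_axis(log_probs, index, axis=-1)[..., 0]
    # advantages * -log_prob(actions), per time step.
    step_loss = -action_log_probs * np.asarray(advantages, dtype=np.float64)
    # Mask padded steps, then sum over time and average across the batch.
    steps = np.arange(logits.shape[1])[None, :]
    mask = steps < np.asarray(sequence_length)[:, None]
    return (step_loss * mask).sum(axis=1).mean(axis=0)

actions = np.array([[0, 1], [1, 0]])
logits = np.zeros((2, 2, 2))             # uniform policy over 2 actions
advantages = np.array([[1.0, 2.0], [3.0, 5.0]])
loss = pg_loss_with_logits_sketch(
    actions, logits, advantages, sequence_length=[2, 1])
```

Under the uniform policy every action has log-probability -log(2). The masked advantage sums are 3 for both sequences (the advantage 5.0 falls on a padded step), so the loss is 3·log(2).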

### pg_loss_with_log_probs

texar.losses.pg_loss_with_log_probs(log_probs, advantages, rank=None, batched=False, sequence_length=None, average_across_batch=True, average_across_timesteps=False, average_across_remaining=False, sum_over_batch=False, sum_over_timesteps=True, sum_over_remaining=True, time_major=False)

Policy gradient loss with log probs of actions.

All arguments except log_probs are the same as pg_loss_with_logits().

Parameters:

- log_probs – Log probabilities of shape [(batch_size,) max_time, …, d_rank] and dtype float32 or float64. The rank of the Tensor is specified with rank. The batch dimension exists only if batched is True. The batch and time dimensions are exchanged, i.e., [max_time, batch_size, …], if time_major is True.
- advantages – Tensor of shape [(batch_size,) max_time, d_3, …, d_rank] and dtype float32 or float64. The batch dimension exists only if batched is True. The batch and time dimensions are exchanged if time_major is True.
- rank (int, optional) – The rank of log_probs. If None (default), rank is automatically inferred from log_probs or advantages. If the inference fails, rank is set to 1 if batched is False, and to 2 if batched is True.
- batched (bool) – True if the inputs are batched.
- sequence_length (optional) – A Tensor of shape [batch_size]. Time steps beyond the respective sequence lengths will have zero losses. Used if batched is True.
- average_across_timesteps (bool) – If set, average the loss across the time dimension. Must not be set together with sum_over_timesteps.
- average_across_batch (bool) – If set, average the loss across the batch dimension. Must not be set together with sum_over_batch. Ignored if batched is False.
- average_across_remaining (bool) – If set, average the loss across the remaining dimensions. Must not be set together with sum_over_remaining. Ignored if there are no dimensions other than the batch and time dimensions.
- sum_over_timesteps (bool) – If set, sum the loss across the time dimension. Must not be set together with average_across_timesteps.
- sum_over_batch (bool) – If set, sum the loss across the batch dimension. Must not be set together with average_across_batch. Ignored if batched is False.
- sum_over_remaining (bool) – If set, sum the loss across the remaining dimensions. Must not be set together with average_across_remaining. Ignored if there are no dimensions other than the batch and time dimensions.
- time_major (bool) – The shape format of the inputs. If True, log_probs and advantages must have shape [max_time, batch_size, …]. If False (default), they must have shape [batch_size, max_time, …]. Ignored if batched is False.

Returns: A Tensor containing the loss to minimize, whose rank depends on the reduce arguments. For example, the batch dimension is reduced if either average_across_batch or sum_over_batch is True, which decreases the rank of the output Tensor by 1.

## Reward

### discount_reward

texar.losses.discount_reward(reward, sequence_length=None, discount=1.0, normalize=False, dtype=None, tensor_rank=1)

Computes discounted reward.

reward and sequence_length can be either Tensors or Python arrays. If both are Python arrays (or None), the return value is a Python array as well. Otherwise, TensorFlow Tensors are returned.

Parameters:

- reward – A Tensor or Python array. Can be 1D with shape [batch_size], or 2D with shape [batch_size, max_time].
- sequence_length (optional) – A Tensor or Python array of shape [batch_size]. Time steps beyond the respective sequence lengths will be masked. Required if reward is 1D.
- discount (float) – A scalar. The discount factor.
- normalize (bool) – Whether to normalize the discounted reward by (discounted_reward - mean) / std. Here mean and std are computed over all time steps and all samples in the batch.
- dtype (dtype) – Type of reward. If None, it is inferred from reward automatically.
- tensor_rank (int) – The number of dimensions of reward. Default is 1, i.e., reward is a 1D Tensor consisting of a batch dimension. Ignored if reward and sequence_length are Python arrays (or None).

Returns: A 2D Tensor or Python array of the discounted reward. If reward and sequence_length are Python arrays (or None), the returned value is a Python array as well.

Example

    from texar.losses import discount_reward

    # 1D reward: one reward per sequence.
    r = [2., 1.]
    seq_length = [3, 2]
    discounted_r = discount_reward(r, seq_length, discount=0.1)
    # discounted_r == [[2. * 0.1^2, 2. * 0.1, 2.],
    #                  [1. * 0.1,   1.,       0.]]

    # 2D reward: one reward per time step.
    r = [[3., 4., 5.], [6., 7., 0.]]
    seq_length = [3, 2]
    discounted_r = discount_reward(r, seq_length, discount=0.1)
    # discounted_r == [[3. + 4.*0.1 + 5.*0.1^2, 4. + 5.*0.1, 5.],
    #                  [6. + 7.*0.1,            7.,          0.]]
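
The two cases in the example can be reproduced with a plain-Python reference sketch. It is written only to match the documented outputs, skips normalization, and is not Texar's implementation; the function names are hypothetical, and taking max_time as the largest sequence length in the 1D case is an assumption.

```python
def discount_reward_1d(reward, sequence_length, discount=1.0):
    """1D case: reward[i] is placed at the last valid step of sequence i
    and discounted backwards in time."""
    max_time = max(sequence_length)  # assumed padding length
    out = []
    for r, length in zip(reward, sequence_length):
        row = [r * discount ** (length - 1 - t) if t < length else 0.0
               for t in range(max_time)]
        out.append(row)
    return out

def discount_reward_2d(reward, sequence_length, discount=1.0):
    """2D case: reward[i][t] is received at step t; accumulate from the end."""
    out = []
    for row, length in zip(reward, sequence_length):
        acc = 0.0
        discounted = [0.0] * len(row)  # steps beyond `length` stay zero
        for t in reversed(range(length)):
            acc = row[t] + discount * acc
            discounted[t] = acc
        out.append(discounted)
    return out

discounted_1d = discount_reward_1d([2., 1.], [3, 2], discount=0.1)
discounted_2d = discount_reward_2d(
    [[3., 4., 5.], [6., 7., 0.]], [3, 2], discount=0.1)
```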


## Adversarial Loss

### binary_adversarial_losses

texar.losses.binary_adversarial_losses(real_data, fake_data, discriminator_fn, mode='max_real')

Computes adversarial losses of a real/fake binary discrimination game.

Parameters:

- real_data (Tensor or array) – Real data of shape [num_real_examples, …].
- fake_data (Tensor or array) – Fake data of shape [num_fake_examples, …]. num_real_examples does not necessarily equal num_fake_examples.
- discriminator_fn – A callable that takes data (e.g., real_data and fake_data) and returns the logits of being real. The signature of discriminator_fn must be: logits, … = discriminator_fn(data). The return value of discriminator_fn can be the logits, or a tuple where the logits are the first element.
- mode (str) – Mode of the generator loss. Either "max_real" or "min_fake".
    - "max_real" (default): minimizing the generator loss maximizes the probability of fake data being classified as real.
    - "min_fake": minimizing the generator loss minimizes the probability of fake data being classified as fake.

Returns: A tuple (generator_loss, discriminator_loss), each of which is a scalar Tensor holding a loss to be minimized.
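
One way to read the two modes is sketched below in NumPy. The exact loss formulas are an assumption chosen to match the descriptions above (sigmoid cross entropy on the discriminator logits), not a copy of Texar's code; sigmoid_xent, binary_adversarial_losses_sketch, and the toy discriminator are hypothetical.

```python
import numpy as np

def sigmoid_xent(logits, labels):
    """Elementwise sigmoid cross entropy in a numerically stable form."""
    x = np.asarray(logits, dtype=np.float64)
    return np.maximum(x, 0.0) - x * labels + np.log1p(np.exp(-np.abs(x)))

def binary_adversarial_losses_sketch(real_data, fake_data, discriminator_fn,
                                     mode="max_real"):
    def logits_of(data):
        out = discriminator_fn(data)
        # The callable may return the logits or a tuple led by the logits.
        return out[0] if isinstance(out, tuple) else out

    real_logits, fake_logits = logits_of(real_data), logits_of(fake_data)
    real_loss = sigmoid_xent(real_logits, 1.0).mean()  # real should score "real"
    fake_loss = sigmoid_xent(fake_logits, 0.0).mean()  # fake should score "fake"
    d_loss = real_loss + fake_loss
    if mode == "max_real":
        # Push fake data towards being classified as real.
        g_loss = sigmoid_xent(fake_logits, 1.0).mean()
    elif mode == "min_fake":
        # Push fake data away from being classified as fake.
        g_loss = -fake_loss
    else:
        raise ValueError("mode must be 'max_real' or 'min_fake'")
    return g_loss, d_loss

# Toy discriminator (hypothetical): the logit is the sum of the features.
toy_discriminator = lambda data: np.asarray(data, dtype=np.float64).sum(axis=-1)
real = [[1.0, 1.0]]              # logit 2
fake = [[0.0, 0.0], [0.0, 0.0]]  # logit 0
g_loss, d_loss = binary_adversarial_losses_sketch(real, fake, toy_discriminator)
g_loss_min_fake, _ = binary_adversarial_losses_sketch(
    real, fake, toy_discriminator, mode="min_fake")
```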

## Entropy

### entropy_with_logits

texar.losses.entropy_with_logits(logits, rank=None, average_across_batch=True, average_across_remaining=False, sum_over_batch=False, sum_over_remaining=True)

Shannon entropy given logits.

Parameters:

- logits – Unscaled log probabilities of shape [batch_size, d_2, …, d_{rank-1}, distribution_dim] and of dtype float32 or float64. The rank of the Tensor is optionally specified by the argument rank. The Tensor is considered as having [batch_size, …, d_{rank-1}] elements, each of which has a distribution of length d_rank (i.e., distribution_dim). So the last dimension is always summed out to compute the entropy.
- rank (int, optional) – The rank of logits. If None (default), rank is inferred automatically from logits. If the inference fails, rank is set to 2, i.e., logits is assumed to be of shape [batch_size, distribution_dim].
- average_across_batch (bool) – If set, average the entropy across the batch dimension. Must not be set together with sum_over_batch.
- average_across_remaining (bool) – If set, average the entropy across the remaining dimensions. Must not be set together with sum_over_remaining. Used only when logits has rank >= 3.
- sum_over_batch (bool) – If set, sum the entropy across the batch dimension. Must not be set together with average_across_batch.
- sum_over_remaining (bool) – If set, sum the entropy across the remaining dimensions. Must not be set together with average_across_remaining. Used only when logits has rank >= 3.

Returns: A Tensor containing the Shannon entropy. The dimensionality of the Tensor depends on the configuration of the reduction arguments. For example, if both batch and remaining dimensions are reduced (by either sum or average), the returned Tensor is a scalar Tensor.
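
As a rough illustration of how the last dimension is summed out and the default reductions applied, consider this NumPy sketch. It assumes the default flags (sum over remaining dimensions, average across batch) and is not Texar's code; entropy_with_logits_sketch is a hypothetical name.

```python
import numpy as np

def entropy_with_logits_sketch(logits):
    """Shannon entropy of softmax(logits) with the default reductions."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    probs = np.exp(log_probs)
    # The distribution dimension is always summed out: [batch_size, d_2, ...].
    entropy = -(probs * log_probs).sum(axis=-1)
    # sum_over_remaining=True: sum every dimension after the batch dimension.
    remaining_axes = tuple(range(1, entropy.ndim))
    if remaining_axes:
        entropy = entropy.sum(axis=remaining_axes)
    # average_across_batch=True.
    return entropy.mean(axis=0)

# A uniform distribution over 4 outcomes has entropy log(4).
h_rank2 = entropy_with_logits_sketch(np.zeros((2, 4)))
# With one remaining dimension of size 3, the entropies add up to 3 * log(4).
h_rank3 = entropy_with_logits_sketch(np.zeros((2, 3, 4)))
```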

### sequence_entropy_with_logits

texar.losses.sequence_entropy_with_logits(logits, rank=None, sequence_length=None, average_across_batch=True, average_across_timesteps=False, average_across_remaining=False, sum_over_batch=False, sum_over_timesteps=True, sum_over_remaining=True, time_major=False)

Shannon entropy given logits.

Parameters:

- logits – Unscaled log probabilities of shape [batch_size, max_time, d_3, …, d_{rank-1}, distribution_dim] and of dtype float32 or float64. The rank of the Tensor is optionally specified by the argument rank. The Tensor is considered as having [batch_size, …, d_{rank-1}] elements, each of which has a distribution of length d_rank (i.e., distribution_dim). So the last dimension is always summed out to compute the entropy. The batch and time dimensions are exchanged if time_major is True.
- rank (int, optional) – The rank of logits. If None (default), rank is inferred automatically from logits. If the inference fails, rank is set to 3, i.e., logits is assumed to be of shape [batch_size, max_time, distribution_dim].
- sequence_length (optional) – A Tensor of shape [batch_size]. Time steps beyond the respective sequence lengths are not counted into the entropy.
- average_across_timesteps (bool) – If set, average the entropy across the time dimension. Must not be set together with sum_over_timesteps.
- average_across_batch (bool) – If set, average the entropy across the batch dimension. Must not be set together with sum_over_batch.
- average_across_remaining (bool) – If set, average the entropy across the remaining dimensions. Must not be set together with sum_over_remaining. Used only when logits has rank >= 4.
- sum_over_timesteps (bool) – If set, sum the entropy across the time dimension. Must not be set together with average_across_timesteps.
- sum_over_batch (bool) – If set, sum the entropy across the batch dimension. Must not be set together with average_across_batch.
- sum_over_remaining (bool) – If set, sum the entropy across the remaining dimensions. Must not be set together with average_across_remaining. Used only when logits has rank >= 4.
- time_major (bool) – The shape format of the inputs. If True, logits must have shape [max_time, batch_size, …]. If False (default), it must have shape [batch_size, max_time, …].

Returns: A Tensor containing the Shannon entropy. The dimensionality of the Tensor depends on the configuration of the reduction arguments. For example, if batch, time, and remaining dimensions are all reduced (by either sum or average), the returned Tensor is a scalar Tensor.

## Loss Utils

### mask_and_reduce

texar.losses.mask_and_reduce(sequence, sequence_length, rank=2, average_across_batch=True, average_across_timesteps=False, average_across_remaining=False, sum_over_batch=False, sum_over_timesteps=True, sum_over_remaining=True, dtype=None, time_major=False)

Masks out sequence entries that are beyond the respective sequence lengths, and reduces (average or sum) away dimensions.

This is a combination of mask_sequences() and reduce_batch_time().

Parameters:

- sequence – A Tensor of sequence values. If time_major is False (default), this must be a Tensor of shape [batch_size, max_time, d_3, …, d_rank], where the rank of the Tensor is specified with rank. The batch and time dimensions are exchanged if time_major is True.
- sequence_length – A Tensor of shape [batch_size]. Time steps beyond the respective sequence lengths will be made zero. If None, no masking is performed.
- rank (int) – The rank of sequence. Must be >= 2. Default is 2, i.e., sequence is a 2D Tensor consisting of batch and time dimensions.
- average_across_timesteps (bool) – If set, average the sequence across the time dimension. Must not be set together with sum_over_timesteps.
- average_across_batch (bool) – If set, average the sequence across the batch dimension. Must not be set together with sum_over_batch.
- average_across_remaining (bool) – If set, average the sequence across the remaining dimensions. Must not be set together with sum_over_remaining.
- sum_over_timesteps (bool) – If set, sum the sequence across the time dimension. Must not be set together with average_across_timesteps.
- sum_over_batch (bool) – If set, sum the sequence across the batch dimension. Must not be set together with average_across_batch.
- sum_over_remaining (bool) – If set, sum the sequence across the remaining dimensions. Must not be set together with average_across_remaining.
- time_major (bool) – The shape format of the inputs. If True, sequence must have shape [max_time, batch_size, …]. If False (default), sequence must have shape [batch_size, max_time, …].
- dtype (dtype) – Type of sequence. If None, it is inferred from sequence automatically.

Returns: A Tensor containing the masked and reduced sequence.
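
For the rank-2 case with default flags, the masking and reduction amount to the following NumPy sketch. It is an illustration under those assumptions, not the library implementation, and mask_and_reduce_sketch is a hypothetical name.

```python
import numpy as np

def mask_and_reduce_sketch(sequence, sequence_length):
    """rank=2 input of shape [batch_size, max_time]; default reductions."""
    sequence = np.asarray(sequence, dtype=np.float64)
    # True for valid steps, False for steps beyond each sequence length.
    steps = np.arange(sequence.shape[1])[None, :]
    mask = steps < np.asarray(sequence_length)[:, None]
    masked = sequence * mask
    # sum_over_timesteps=True, then average_across_batch=True.
    return masked.sum(axis=1).mean(axis=0)

values = [[1., 2., 3.],
          [4., 5., 6.]]
# Row sums after masking are 1 + 2 + 3 = 6 and 4, so the result is 5.0.
result = mask_and_reduce_sketch(values, sequence_length=[3, 1])
```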

### reduce_batch_time

texar.losses.reduce_batch_time(sequence, sequence_length, average_across_batch=True, average_across_timesteps=False, sum_over_batch=False, sum_over_timesteps=True)

Average or sum over the respective dimensions of sequence, which is of shape [batch_size, max_time, …].

Assumes sequence has been properly masked according to sequence_length.

### reduce_dimensions

texar.losses.reduce_dimensions(tensor, average_axes=None, sum_axes=None, keepdims=None)

Average or sum over dimensions of tensor.

average_axes and sum_axes must be mutually exclusive. That is, elements in average_axes must not be contained in sum_axes, and vice versa.

Parameters:

- tensor – A tensor to reduce.
- average_axes (optional) – A (list of) int that indicates the dimensions to reduce by taking the average.
- sum_axes (optional) – A (list of) int that indicates the dimensions to reduce by taking the sum.
- keepdims (optional) – If True, retains reduced dimensions with length 1.
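
The combination of averaged and summed axes can be sketched in NumPy as follows. This is a minimal sketch, assuming both reductions keep their dimensions until the end so that axis indices stay valid; reduce_dimensions_sketch is a hypothetical name, not the Texar function.

```python
import numpy as np

def reduce_dimensions_sketch(tensor, average_axes=None, sum_axes=None,
                             keepdims=False):
    def as_tuple(axes):
        if axes is None:
            return ()
        return (axes,) if isinstance(axes, int) else tuple(axes)

    average_axes, sum_axes = as_tuple(average_axes), as_tuple(sum_axes)
    if set(average_axes) & set(sum_axes):
        raise ValueError("average_axes and sum_axes must be mutually exclusive")

    out = np.asarray(tensor, dtype=np.float64)
    # Keep reduced dimensions for now so later axis indices remain valid.
    if average_axes:
        out = out.mean(axis=average_axes, keepdims=True)
    if sum_axes:
        out = out.sum(axis=sum_axes, keepdims=True)
    reduced = average_axes + sum_axes
    if reduced and not keepdims:
        out = out.squeeze(axis=reduced)
    return out

x = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
# Average over axis 0 gives [1.5, 2.5, 3.5]; summing over axis 1 gives 7.5.
reduced = reduce_dimensions_sketch(x, average_axes=0, sum_axes=1)
kept = reduce_dimensions_sketch(x, average_axes=0, sum_axes=1, keepdims=True)
```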