Evaluations

BLEU

sentence_bleu

texar.tf.evals.sentence_bleu(references, hypothesis, max_order=4, lowercase=False, smooth=False, return_all=False)

Calculates BLEU score of a hypothesis sentence.

Parameters:
  • references – A list of references for the hypothesis. Each reference can be either a list of string tokens or a string of whitespace-separated tokens. The list can also be a numpy array.
  • hypothesis – The hypothesis sentence, given either as a list of string tokens or as a string of whitespace-separated tokens. The list can also be a numpy array.
  • lowercase (bool) – If True, lowercase reference and hypothesis tokens.
  • max_order (int) – Maximum n-gram order to use when computing BLEU score.
  • smooth (bool) – Whether or not to apply (Lin et al. 2004) smoothing.
  • return_all (bool) – If True, returns BLEU and all n-gram precisions.
Returns:

If return_all is False (default), returns a float32 BLEU score.

If return_all is True, returns a list of float32 scores: [BLEU] + n-gram precisions, which is of length max_order + 1.
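
The n-gram precisions mentioned above can be illustrated with a small standalone sketch. It does not call Texar and is not the library's implementation; the helper names (to_tokens, ngrams, clipped_precisions) are hypothetical, and the result is on a 0 to 1 scale, which may differ from the scale Texar reports. It shows how both accepted input formats reduce to token lists and how each hypothesis n-gram is credited at most as often as it appears in any single reference.

```python
from collections import Counter


def to_tokens(sentence, lowercase=False):
    """Accept a whitespace-tokenized string or a sequence of string tokens."""
    tokens = sentence.split() if isinstance(sentence, str) else list(sentence)
    return [t.lower() for t in tokens] if lowercase else tokens


def ngrams(tokens, n):
    """Count the n-grams of a single order n."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precisions(references, hypothesis, max_order=4, lowercase=False):
    """Return the clipped n-gram precision for each order 1..max_order."""
    refs = [to_tokens(r, lowercase) for r in references]
    hyp = to_tokens(hypothesis, lowercase)
    precisions = []
    for n in range(1, max_order + 1):
        hyp_counts = ngrams(hyp, n)
        # Per n-gram, keep the largest count seen in any one reference.
        max_ref = Counter()
        for r in refs:
            max_ref |= ngrams(r, n)
        # Counter "&" takes the element-wise minimum, i.e. the clipped matches.
        matched = sum((hyp_counts & max_ref).values())
        total = sum(hyp_counts.values())
        precisions.append(matched / total if total else 0.0)
    return precisions


# One reference as a string, one as a token list: both formats are accepted.
refs = ["the cat is on the mat", ["there", "is", "a", "cat", "on", "the", "mat"]]
print(clipped_precisions(refs, "the the the cat", max_order=2))
```

In this example "the" occurs three times in the hypothesis but at most twice in any reference, so only two of those occurrences count toward the unigram precision.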

corpus_bleu

texar.tf.evals.corpus_bleu(list_of_references, hypotheses, max_order=4, lowercase=False, smooth=False, return_all=True)

Computes corpus-level BLEU score.

Parameters:
  • list_of_references – A list of lists of references, one inner list per hypothesis. Each reference can be either a list of string tokens or a string of whitespace-separated tokens. The lists can also be numpy arrays.
  • hypotheses – A list of hypothesis sentences. Each hypothesis can be either a list of string tokens or a string of whitespace-separated tokens. The list can also be a numpy array.
  • lowercase (bool) – If True, lowercase reference and hypothesis tokens.
  • max_order (int) – Maximum n-gram order to use when computing BLEU score.
  • smooth (bool) – Whether or not to apply (Lin et al. 2004) smoothing.
  • return_all (bool) – If True, returns BLEU and all n-gram precisions.
Returns:

If return_all is False, returns a float32 BLEU score.

If return_all is True (default), returns a list of float32 scores: [BLEU] + n-gram precisions, which is of length max_order + 1.
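
To make the shape of the return value concrete, here is a minimal pure-Python sketch of corpus-level BLEU without smoothing. It is not Texar's code: the name corpus_bleu_sketch is hypothetical, the brevity penalty uses the shortest reference per hypothesis (an assumption), and scores are on a 0 to 1 scale, which may differ from what the library returns. The point is that matches and totals are summed over the whole corpus before the precisions are computed, and that return_all=True yields max_order + 1 values.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a single order n."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu_sketch(list_of_references, hypotheses, max_order=4, return_all=True):
    matched = [0] * max_order  # clipped matches per order, summed over the corpus
    total = [0] * max_order    # hypothesis n-grams per order, summed over the corpus
    hyp_len = ref_len = 0
    for references, hypothesis in zip(list_of_references, hypotheses):
        refs = [r.split() if isinstance(r, str) else list(r) for r in references]
        hyp = hypothesis.split() if isinstance(hypothesis, str) else list(hypothesis)
        hyp_len += len(hyp)
        ref_len += min(len(r) for r in refs)  # shortest reference: an assumption
        for n in range(1, max_order + 1):
            hyp_counts = ngrams(hyp, n)
            max_ref = Counter()
            for r in refs:
                max_ref |= ngrams(r, n)
            matched[n - 1] += sum((hyp_counts & max_ref).values())
            total[n - 1] += sum(hyp_counts.values())

    precisions = [m / t if t else 0.0 for m, t in zip(matched, total)]
    # Geometric mean of the precisions; zero if any order has no match.
    if min(precisions) > 0:
        geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_order)
    else:
        geo_mean = 0.0
    # Brevity penalty: penalize a corpus of hypotheses shorter than the references.
    if hyp_len == 0:
        bp = 0.0
    elif hyp_len >= ref_len:
        bp = 1.0
    else:
        bp = math.exp(1 - ref_len / hyp_len)
    bleu = geo_mean * bp
    return [bleu] + precisions if return_all else bleu


scores = corpus_bleu_sketch([["a b c d e"], [["f", "g", "h", "i"]]],
                            ["a b c d e", "f g h i"])
print(len(scores))  # max_order + 1 entries: [BLEU, 1-gram, ..., 4-gram precision]
```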

sentence_bleu_moses

texar.tf.evals.sentence_bleu_moses(references, hypothesis, lowercase=False, return_all=False)

Calculates BLEU score of a hypothesis sentence using the MOSES multi-bleu.perl script.

Parameters:
  • references – A list of references for the hypothesis. Each reference can be either a string or a list of string tokens. The list can also be a numpy array.
  • hypothesis – The hypothesis sentence, given either as a string or as a list of string tokens. The list can also be a numpy array.
  • lowercase (bool) – If True, pass the “-lc” flag to the multi-bleu script.
  • return_all (bool) – If True, returns BLEU and all n-gram precisions.
Returns:

If return_all is False (default), returns a float32 BLEU score.

If return_all is True, returns a list of 5 float32 scores: [BLEU, 1-gram precision, …, 4-gram precision].
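
The five-element return value mirrors the summary line that multi-bleu.perl prints. The sketch below shows one way such a line could be turned into floats; it is not Texar's parsing code, the function name parse_multi_bleu_output is hypothetical, and the sample line uses made-up numbers purely for illustration.

```python
import re

# Matches the leading part of the script's summary line:
# "BLEU = <score>, <p1>/<p2>/<p3>/<p4> (BP=..., ...)"
_BLEU_LINE = re.compile(r"BLEU = ([\d.]+), ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+)")


def parse_multi_bleu_output(line, return_all=False):
    """Extract BLEU (and optionally the 1- to 4-gram precisions) from one line."""
    match = _BLEU_LINE.search(line)
    if match is None:
        raise ValueError("no BLEU summary found in: %r" % line)
    scores = [float(g) for g in match.groups()]
    return scores if return_all else scores[0]


# Illustrative line only; the numbers are invented for this example.
sample = "BLEU = 27.50, 60.0/33.3/20.0/12.5 (BP=1.000, ratio=1.000, hyp_len=10, ref_len=10)"
print(parse_multi_bleu_output(sample))
print(parse_multi_bleu_output(sample, return_all=True))
```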

corpus_bleu_moses

texar.tf.evals.corpus_bleu_moses(list_of_references, hypotheses, lowercase=False, return_all=False)

Calculates corpus-level BLEU score using the MOSES multi-bleu.perl script.

Parameters:
  • list_of_references – A list of lists of references, one inner list per hypothesis. Each reference can be either a string or a list of string tokens. The lists can also be numpy arrays.
  • hypotheses – A list of hypothesis sentences. Each hypothesis can be either a string or a list of string tokens. The list can also be a numpy array.
  • lowercase (bool) – If True, pass the “-lc” flag to the multi-bleu script.
  • return_all (bool) – If True, returns BLEU and all n-gram precisions.
Returns:

If return_all is False (default), returns a float32 BLEU score.

If return_all is True, returns a list of 5 float32 scores: [BLEU, 1-gram precision, …, 4-gram precision].
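
multi-bleu.perl reads one hypothesis per line and one reference file per reference index, so the nested list_of_references has to be regrouped before the script can consume it. The sketch below shows that regrouping in memory. How Texar does this internally is not stated here, so the function names and the choice to pad missing references with empty lines are assumptions of this example.

```python
def to_line(sentence):
    """Render a sentence (string or token sequence) as one line of text."""
    if isinstance(sentence, str):
        return sentence
    return " ".join(str(token) for token in sentence)


def build_multi_bleu_inputs(list_of_references, hypotheses):
    """Return (reference_files, hypothesis_lines), each as lists of text lines.

    reference_files[k][i] is the k-th reference of the i-th hypothesis.
    """
    if len(list_of_references) != len(hypotheses):
        raise ValueError("need exactly one list of references per hypothesis")
    n_files = max((len(refs) for refs in list_of_references), default=0)
    reference_files = [[] for _ in range(n_files)]
    for refs in list_of_references:
        for k in range(n_files):
            # Pad with an empty line when a hypothesis has fewer references
            # (an assumption of this sketch).
            reference_files[k].append(to_line(refs[k]) if k < len(refs) else "")
    hypothesis_lines = [to_line(h) for h in hypotheses]
    return reference_files, hypothesis_lines


ref_files, hyp_lines = build_multi_bleu_inputs(
    [["the cat sat", ["a", "cat", "sat"]], [["hello", "world"]]],
    [["the", "cat", "sat"], "hello world"],
)
print(ref_files)
print(hyp_lines)
```

Each inner list of reference_files would then be written to its own file sharing a common prefix (for example ref0, ref1), with the hypothesis lines fed to the script on standard input.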

Accuracy

accuracy

texar.tf.evals.accuracy(labels, preds)

Calculates the accuracy of predictions.

Parameters:
  • labels – The ground truth values. A Tensor with the same shape as preds.
  • preds – A Tensor of any shape containing the predicted values.
Returns:

A float scalar Tensor containing the accuracy.
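
As a rough illustration of the quantity being computed, the sketch below measures the fraction of positions where labels and preds agree. Nested Python lists stand in for Tensors here, and accuracy_sketch and flatten are hypothetical names; the actual function operates on Tensors and returns a Tensor.

```python
def flatten(values):
    """Yield the scalar entries of an arbitrarily nested list or tuple."""
    for value in values:
        if isinstance(value, (list, tuple)):
            yield from flatten(value)
        else:
            yield value


def accuracy_sketch(labels, preds):
    """Fraction of entries where the prediction equals the label."""
    flat_labels = list(flatten(labels))
    flat_preds = list(flatten(preds))
    if len(flat_labels) != len(flat_preds):
        raise ValueError("labels and preds must have the same number of entries")
    if not flat_labels:
        raise ValueError("accuracy is undefined for empty inputs")
    correct = sum(1 for l, p in zip(flat_labels, flat_preds) if l == p)
    return correct / len(flat_labels)


labels = [[1, 0, 2], [2, 1, 0]]
preds = [[1, 0, 1], [2, 1, 1]]
print(accuracy_sketch(labels, preds))  # 4 of the 6 entries match
```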

binary_clas_accuracy

texar.tf.evals.binary_clas_accuracy(pos_preds=None, neg_preds=None)

Calculates the accuracy of binary predictions.

Parameters:
  • pos_preds (optional) – A Tensor of any shape containing the predicted values on positive data (i.e., ground truth labels are 1).
  • neg_preds (optional) – A Tensor of any shape containing the predicted values on negative data (i.e., ground truth labels are 0).
Returns:

A float scalar Tensor containing the accuracy.
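
Because the ground truth is implied by which argument a prediction is passed in, the combined accuracy can be sketched as follows: a positive example is correct when predicted 1, a negative example when predicted 0. Flat Python lists stand in for Tensors, the name binary_clas_accuracy_sketch is hypothetical, and treating an omitted argument as an empty set is an assumption of this example.

```python
def binary_clas_accuracy_sketch(pos_preds=None, neg_preds=None):
    """Accuracy over predictions on positive (label 1) and negative (label 0) data."""
    # Assumption: an omitted argument contributes no examples.
    pos = list(pos_preds) if pos_preds is not None else []
    neg = list(neg_preds) if neg_preds is not None else []
    total = len(pos) + len(neg)
    if total == 0:
        raise ValueError("at least one prediction is required")
    correct_pos = sum(1 for p in pos if p == 1)  # positives predicted as 1
    correct_neg = sum(1 for p in neg if p == 0)  # negatives predicted as 0
    return (correct_pos + correct_neg) / total


print(binary_clas_accuracy_sketch(pos_preds=[1, 1, 0, 1], neg_preds=[0, 1]))  # 4 of 6 correct
print(binary_clas_accuracy_sketch(pos_preds=[1, 1, 0, 1]))                    # 3 of 4 correct
```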