Evaluations¶
BLEU¶
sentence_bleu¶
texar.tf.evals.sentence_bleu(references, hypothesis, max_order=4, lowercase=False, smooth=False, return_all=False)
Calculates the BLEU score of a hypothesis sentence.
Parameters: - references – A list of references for the hypothesis. Each reference can be either a list of string tokens, or a string of tokens separated by whitespace. The list can also be a numpy array.
- hypothesis – A hypothesis sentence. The hypothesis can be either a list of string tokens, or a string of tokens separated by whitespace. The list can also be a numpy array.
- lowercase (bool) – If True, lowercase reference and hypothesis tokens.
- max_order (int) – Maximum n-gram order to use when computing BLEU score.
- smooth (bool) – Whether or not to apply (Lin et al. 2004) smoothing.
- return_all (bool) – If True, returns BLEU and all n-gram precisions.
Returns: If return_all is False (default), returns a float32 BLEU score. If return_all is True, returns a list of float32 scores: [BLEU] + n-gram precisions, which has length max_order + 1.
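To make the parameters above concrete, the following is a minimal pure-Python sketch of an unsmoothed sentence-level BLEU. It is not the texar implementation: the function name `sentence_bleu_sketch`, the choice of the shortest reference for the brevity penalty, and the 0 to 1 score scale are all assumptions made for illustration, and the `smooth` option is left out.

```python
import math
from collections import Counter


def _tokens(sentence, lowercase=False):
    """Accept a whitespace-tokenized string or a list of string tokens."""
    toks = sentence.split() if isinstance(sentence, str) else list(sentence)
    return [t.lower() for t in toks] if lowercase else toks


def _ngrams(tokens, max_order):
    """Count all n-grams of order 1..max_order."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def sentence_bleu_sketch(references, hypothesis, max_order=4,
                         lowercase=False, return_all=False):
    """Illustrative unsmoothed BLEU in [0, 1] (hypothetical helper)."""
    refs = [_tokens(r, lowercase) for r in references]
    hyp = _tokens(hypothesis, lowercase)

    # Clip each hypothesis n-gram count by its maximum count in any reference.
    max_ref = Counter()
    for r in refs:
        max_ref |= _ngrams(r, max_order)
    overlap = _ngrams(hyp, max_order) & max_ref

    matches = [0] * max_order
    for gram, count in overlap.items():
        matches[len(gram) - 1] += count
    # A hypothesis of length L has max(L - n, 0) n-grams of order n + 1.
    possible = [max(len(hyp) - n, 0) for n in range(max_order)]
    precisions = [m / p if p > 0 else 0.0 for m, p in zip(matches, possible)]

    # Geometric mean of the precisions; zero if any order has no match.
    if min(precisions) > 0:
        geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_order)
    else:
        geo_mean = 0.0

    # Brevity penalty against the shortest reference (an assumption here).
    ref_len = min(len(r) for r in refs)
    if len(hyp) == 0:
        bp = 0.0
    elif len(hyp) >= ref_len:
        bp = 1.0
    else:
        bp = math.exp(1.0 - ref_len / len(hyp))

    scores = [geo_mean * bp] + precisions
    return scores if return_all else scores[0]
```

With return_all=True the sketch returns max_order + 1 values, matching the list length described above.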
corpus_bleu¶
texar.tf.evals.corpus_bleu(list_of_references, hypotheses, max_order=4, lowercase=False, smooth=False, return_all=True)
Computes the corpus-level BLEU score.
Parameters: - list_of_references – A list of lists of references, one list per hypothesis. Each reference can be either a list of string tokens, or a string of tokens separated by whitespace. The list can also be a numpy array.
- hypotheses – A list of hypothesis sentences. Each hypothesis can be either a list of string tokens, or a string of tokens separated by whitespace. The list can also be a numpy array.
- lowercase (bool) – If True, lowercase reference and hypothesis tokens.
- max_order (int) – Maximum n-gram order to use when computing BLEU score.
- smooth (bool) – Whether or not to apply (Lin et al. 2004) smoothing.
- return_all (bool) – If True, returns BLEU and all n-gram precisions.
Returns: If return_all is False, returns a float32 BLEU score. If return_all is True (the default shown in the signature), returns a list of float32 scores: [BLEU] + n-gram precisions, which has length max_order + 1.
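The nested input layout described above (one list of references per hypothesis) and the idea of a corpus-level score can be sketched as follows. This is an illustrative pure-Python version, not the library code: `corpus_bleu_sketch` is a hypothetical name, the brevity penalty uses the shortest reference per sentence as an assumption, scores are on a 0 to 1 scale, and smoothing and lowercasing are omitted. The point it shows is that n-gram statistics are pooled over the whole corpus before a single score is computed, rather than averaging per-sentence scores.

```python
import math
from collections import Counter


def _ngram_counts(tokens, max_order):
    """Count all n-grams of order 1..max_order."""
    return Counter(
        tuple(tokens[i:i + n])
        for n in range(1, max_order + 1)
        for i in range(len(tokens) - n + 1)
    )


def corpus_bleu_sketch(list_of_references, hypotheses, max_order=4):
    """Pool n-gram statistics over all sentence pairs, then score once."""
    if len(list_of_references) != len(hypotheses):
        raise ValueError("need one list of references per hypothesis")

    matches = [0] * max_order
    possible = [0] * max_order
    hyp_len = ref_len = 0
    for refs, hyp in zip(list_of_references, hypotheses):
        # Accept whitespace-tokenized strings or lists of tokens.
        refs = [r.split() if isinstance(r, str) else list(r) for r in refs]
        hyp = hyp.split() if isinstance(hyp, str) else list(hyp)
        hyp_len += len(hyp)
        ref_len += min(len(r) for r in refs)  # shortest reference (assumed)

        # Clip hypothesis counts by the maximum count in any reference.
        max_ref = Counter()
        for r in refs:
            max_ref |= _ngram_counts(r, max_order)
        clipped = _ngram_counts(hyp, max_order) & max_ref
        for gram, count in clipped.items():
            matches[len(gram) - 1] += count
        for n in range(max_order):
            possible[n] += max(len(hyp) - n, 0)

    precisions = [m / p if p else 0.0 for m, p in zip(matches, possible)]
    if min(precisions) > 0:
        geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_order)
    else:
        geo_mean = 0.0

    if hyp_len == 0:
        bp = 0.0
    elif hyp_len >= ref_len:
        bp = 1.0
    else:
        bp = math.exp(1.0 - ref_len / hyp_len)

    # [BLEU] + n-gram precisions, i.e. max_order + 1 values.
    return [geo_mean * bp] + precisions
```

A call might look like `corpus_bleu_sketch([["a b c d e"], [["x", "y", "z", "w"], "x y z"]], ["a b c d e", ["x", "y", "z", "w"]])`, where the second hypothesis has two references in mixed string and token-list form.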
sentence_bleu_moses¶
texar.tf.evals.sentence_bleu_moses(references, hypothesis, lowercase=False, return_all=False)
Calculates the BLEU score of a hypothesis sentence using the MOSES multi-bleu.perl script.
Parameters: - references – A list of references for the hypothesis. Each reference can be either a string, or a list of string tokens. The list can also be a numpy array.
- hypothesis – A hypothesis sentence. The hypothesis can be either a string, or a list of string tokens. The list can also be a numpy array.
- lowercase (bool) – If True, pass the “-lc” flag to the multi-bleu script.
- return_all (bool) – If True, returns BLEU and all n-gram precisions.
Returns: If return_all is False (default), returns a float32 BLEU score. If return_all is True, returns a list of 5 float32 scores: [BLEU, 1-gram precision, …, 4-gram precision].
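The five values returned with return_all=True correspond to the numbers that multi-bleu.perl prints on its summary line. As a hedged illustration, the sketch below parses such a line into that shape. The helper name `parse_multi_bleu_output` is hypothetical, the sample line in the usage note is made up, and how texar itself reads the script output is not stated here.

```python
import re

# Matches the summary line printed by multi-bleu.perl, for example:
# "BLEU = 27.35, 60.1/33.2/20.9/14.0 (BP=0.984, ratio=0.984, ...)"
_BLEU_LINE = re.compile(
    r"BLEU = ([\d.]+), ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+)")


def parse_multi_bleu_output(output, return_all=False):
    """Extract BLEU and the 1- to 4-gram precisions from script output."""
    match = _BLEU_LINE.search(output)
    if match is None:
        raise ValueError("no BLEU line found in script output")
    scores = [float(group) for group in match.groups()]
    # scores is [BLEU, 1-gram, 2-gram, 3-gram, 4-gram precision].
    return scores if return_all else scores[0]
```

For instance, passing an invented line such as `"BLEU = 27.35, 60.1/33.2/20.9/14.0 (BP=0.984, ratio=0.984, hyp_len=100, ref_len=102)"` with return_all=True yields a list of five floats in the order listed above.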
corpus_bleu_moses¶
texar.tf.evals.corpus_bleu_moses(list_of_references, hypotheses, lowercase=False, return_all=False)
Calculates the corpus-level BLEU score using the MOSES multi-bleu.perl script.
Parameters: - list_of_references – A list of lists of references, one list per hypothesis. Each reference can be either a string, or a list of string tokens. The list can also be a numpy array.
- hypotheses – A list of hypothesis sentences. Each hypothesis can be either a string, or a list of string tokens. The list can also be a numpy array.
- lowercase (bool) – If True, pass the “-lc” flag to the multi-bleu script.
- return_all (bool) – If True, returns BLEU and all n-gram precisions.
Returns: If return_all is False (default), returns a float32 BLEU score. If return_all is True, returns a list of 5 float32 scores: [BLEU, 1-gram precision, …, 4-gram precision].
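multi-bleu.perl reads hypotheses from standard input, one sentence per line, and reads references from files that share a prefix followed by an index (prefix0, prefix1, and so on). The sketch below shows one way the nested inputs described above could be laid out in that form. It only builds the text in memory, the helper names and the "ref" prefix are hypothetical, and it assumes every hypothesis has the same number of references, which the text above does not require.

```python
def _as_line(sentence):
    """Turn a string or a list of tokens into one whitespace-separated line."""
    if isinstance(sentence, str):
        return sentence.strip()
    return " ".join(str(tok) for tok in sentence)


def layout_for_multi_bleu(list_of_references, hypotheses, prefix="ref"):
    """Return ({file name: content}, hypothesis text) for the script."""
    if len(list_of_references) != len(hypotheses):
        raise ValueError("need one list of references per hypothesis")
    num_refs = len(list_of_references[0])
    if any(len(refs) != num_refs for refs in list_of_references):
        raise ValueError(
            "this sketch assumes the same number of references everywhere")

    files = {}
    for k in range(num_refs):
        # File k holds the k-th reference of every hypothesis, in order.
        lines = [_as_line(refs[k]) for refs in list_of_references]
        files[f"{prefix}{k}"] = "\n".join(lines) + "\n"
    hyp_text = "\n".join(_as_line(h) for h in hypotheses) + "\n"
    return files, hyp_text
```

Writing each entry of the returned mapping to disk and piping the hypothesis text into the script (with "-lc" when lowercase is True) would then be a separate step.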
Accuracy¶
accuracy¶
binary_clas_accuracy¶
texar.tf.evals.binary_clas_accuracy(pos_preds=None, neg_preds=None)
Calculates the accuracy of binary predictions.
Parameters: - pos_preds (optional) – A Tensor of any shape containing the predicted values on positive data (i.e., ground truth labels are 1).
- neg_preds (optional) – A Tensor of any shape containing the predicted values on negative data (i.e., ground truth labels are 0).
Returns: A float scalar Tensor containing the accuracy.
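The computation described above can be illustrated without TensorFlow. The sketch below is a plain-Python analogue, not the texar function: `binary_clas_accuracy_sketch` is a hypothetical name, nested lists stand in for Tensors of any shape, and predictions are assumed to be hard 0/1 labels rather than probabilities. A prediction on positive data counts as correct when it equals 1, and a prediction on negative data counts as correct when it equals 0.

```python
def _flatten(values):
    """Yield scalars from an arbitrarily nested list or tuple."""
    if isinstance(values, (list, tuple)):
        for value in values:
            yield from _flatten(value)
    else:
        yield values


def binary_clas_accuracy_sketch(pos_preds=None, neg_preds=None):
    """Fraction of correct 0/1 predictions over positive and negative data."""
    pos = list(_flatten(pos_preds)) if pos_preds is not None else []
    neg = list(_flatten(neg_preds)) if neg_preds is not None else []
    total = len(pos) + len(neg)
    if total == 0:
        raise ValueError("at least one prediction is required")
    # Ground truth is 1 for every element of pos and 0 for every element of neg.
    correct = sum(p == 1 for p in pos) + sum(n == 0 for n in neg)
    return correct / total
```

Either argument can be omitted, in which case the accuracy is computed over the remaining group only.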