Evaluations

BLEU

texar.tf.evals.sentence_bleu(references, hypothesis, max_order=4, lowercase=False, smooth=False, return_all=False)

Calculates BLEU score of a hypothesis sentence.

Parameters:

references – A list of references for the hypothesis. Each reference can be either a list of string tokens, or a string of whitespace-separated tokens. The list can also be a NumPy array.
hypothesis – A hypothesis sentence. The hypothesis can be either a list of string tokens, or a string of whitespace-separated tokens. The list can also be a NumPy array.
lowercase (bool) – If True, lowercase reference and hypothesis tokens.
max_order (int) – Maximum n-gram order to use when computing BLEU score.
smooth (bool) – Whether or not to apply (Lin et al. 2004) smoothing.
return_all (bool) – If True, returns BLEU and all n-gram precisions.

Returns:

If return_all is False (default), returns a float32 BLEU score.

If return_all is True, returns a list of float32 scores: [BLEU] + n-gram precisions, which is of length max_order + 1.
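To make the inputs and the two return shapes concrete, here is a minimal pure-Python sketch of sentence-level BLEU. It is not Texar's implementation: the names `to_tokens`, `ngram_counts`, and `sentence_bleu_sketch` are hypothetical, smoothing and lowercasing are omitted, the brevity penalty uses the shortest reference length, and scores are on a 0 to 1 scale, all of which are assumptions of this sketch and may differ from the library.

```python
import math
from collections import Counter


def to_tokens(sentence):
    """Accept a whitespace-tokenized string or a sequence of string tokens."""
    return sentence.split() if isinstance(sentence, str) else list(sentence)


def ngram_counts(tokens, max_order):
    """Count every n-gram of order 1..max_order in a token list."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def sentence_bleu_sketch(references, hypothesis, max_order=4, return_all=False):
    """Illustrative BLEU for one hypothesis (hypothetical helper, no smoothing)."""
    refs = [to_tokens(r) for r in references]
    hyp = to_tokens(hypothesis)

    # For each n-gram, keep the maximum count seen in any single reference.
    max_ref = Counter()
    for ref in refs:
        max_ref |= ngram_counts(ref, max_order)

    # Clip hypothesis n-gram counts by the reference maximum.
    clipped = ngram_counts(hyp, max_order) & max_ref

    matches = [0] * max_order
    for gram, count in clipped.items():
        matches[len(gram) - 1] += count
    totals = [max(len(hyp) - n + 1, 0) for n in range(1, max_order + 1)]
    precisions = [m / t if t > 0 else 0.0 for m, t in zip(matches, totals)]

    # Geometric mean of the n-gram precisions (zero if any precision is zero).
    if min(precisions) > 0:
        geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_order)
    else:
        geo_mean = 0.0

    # Brevity penalty against the shortest reference (an assumption).
    ref_len = min(len(r) for r in refs)
    if len(hyp) == 0:
        bp = 0.0
    elif len(hyp) > ref_len:
        bp = 1.0
    else:
        bp = math.exp(1.0 - ref_len / len(hyp))

    bleu = geo_mean * bp
    return [bleu] + precisions if return_all else bleu
```

With `return_all=True` the sketch returns a list of length `max_order + 1`, mirroring the shape described above; with the default it returns a single float.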

texar.tf.evals.corpus_bleu(list_of_references, hypotheses, max_order=4, lowercase=False, smooth=False, return_all=True)

Computes corpus-level BLEU score.

Parameters:

list_of_references – A list of lists of references, one inner list per hypothesis. Each reference can be either a list of string tokens, or a string of whitespace-separated tokens. The list can also be a NumPy array.
hypotheses – A list of hypothesis sentences. Each hypothesis can be either a list of string tokens, or a string of whitespace-separated tokens. The list can also be a NumPy array.
lowercase (bool) – If True, lowercase reference and hypothesis tokens.
max_order (int) – Maximum n-gram order to use when computing BLEU score.
smooth (bool) – Whether or not to apply (Lin et al. 2004) smoothing.
return_all (bool) – If True, returns BLEU and all n-gram precisions.

Returns:

If return_all is False, returns a float32 BLEU score.

If return_all is True (default), returns a list of float32 scores: [BLEU] + n-gram precisions, which is of length max_order + 1.
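The nesting of `list_of_references` is easy to get wrong, so the following sketch shows the expected layout and how a corpus-level score pools n-gram statistics over all sentence pairs before taking precisions, rather than averaging per-sentence scores. It is an illustration only: `corpus_bleu_sketch` and `_ngrams` are hypothetical names, smoothing and lowercasing are left out, and the shortest-reference brevity penalty and 0 to 1 scale are assumptions rather than the library's documented behavior.

```python
import math
from collections import Counter


def _ngrams(tokens, max_order):
    """Count every n-gram of order 1..max_order in a token list."""
    return Counter(
        tuple(tokens[i:i + n])
        for n in range(1, max_order + 1)
        for i in range(len(tokens) - n + 1)
    )


def _tokens(sentence):
    """Accept a whitespace-tokenized string or a sequence of string tokens."""
    return sentence.split() if isinstance(sentence, str) else list(sentence)


def corpus_bleu_sketch(list_of_references, hypotheses, max_order=4):
    """Illustrative corpus BLEU; returns [BLEU] + n-gram precisions."""
    matches = [0] * max_order
    totals = [0] * max_order
    hyp_len = 0
    ref_len = 0

    # list_of_references[i] holds all references for hypotheses[i].
    for references, hypothesis in zip(list_of_references, hypotheses):
        refs = [_tokens(r) for r in references]
        hyp = _tokens(hypothesis)
        hyp_len += len(hyp)
        ref_len += min(len(r) for r in refs)  # shortest reference (assumption)

        max_ref = Counter()
        for ref in refs:
            max_ref |= _ngrams(ref, max_order)
        for gram, count in (_ngrams(hyp, max_order) & max_ref).items():
            matches[len(gram) - 1] += count
        for n in range(1, max_order + 1):
            totals[n - 1] += max(len(hyp) - n + 1, 0)

    # Precisions are computed once, from the pooled counts.
    precisions = [m / t if t > 0 else 0.0 for m, t in zip(matches, totals)]
    if min(precisions) > 0:
        geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_order)
    else:
        geo_mean = 0.0

    if hyp_len == 0:
        bp = 0.0
    elif hyp_len > ref_len:
        bp = 1.0
    else:
        bp = math.exp(1.0 - ref_len / hyp_len)

    return [geo_mean * bp] + precisions
```

A call such as `corpus_bleu_sketch([["a b c d"], ["e f g h"]], ["a b c d", "x y z w"], max_order=2)` pairs the first inner list with the first hypothesis and the second inner list with the second hypothesis.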

texar.tf.evals.sentence_bleu_moses(references, hypothesis, lowercase=False, return_all=False)

Calculates BLEU score of a hypothesis sentence using the MOSES multi-bleu.perl script.

Parameters:

references – A list of references for the hypothesis. Each reference can be either a string, or a list of string tokens. The list can also be a NumPy array.
hypothesis – A hypothesis sentence. The hypothesis can be either a string, or a list of string tokens. The list can also be a NumPy array.
lowercase (bool) – If True, pass the “-lc” flag to the multi-bleu script.
return_all (bool) – If True, returns BLEU and all n-gram precisions.

Returns:

If return_all is False (default), returns a float32 BLEU score.

If return_all is True, returns a list of 5 float32 scores: [BLEU, 1-gram precision, …, 4-gram precision].
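The five-element return value corresponds to the summary line that multi-bleu.perl prints, which has the form `BLEU = <score>, <p1>/<p2>/<p3>/<p4> (BP=..., ratio=..., hyp_len=..., ref_len=...)`. How Texar itself reads that line is not documented here, so the following is only a sketch of one way to extract the numbers; `parse_multi_bleu_output` is a hypothetical helper, and the sample line in the usage note is made up for illustration.

```python
import re

# Matches the leading part of the multi-bleu.perl summary line.
_MULTI_BLEU_LINE = re.compile(
    r"BLEU = ([\d.]+), ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+)"
)


def parse_multi_bleu_output(output, return_all=False):
    """Extract BLEU (and optionally the 1- to 4-gram precisions) from script output."""
    match = _MULTI_BLEU_LINE.search(output)
    if match is None:
        raise ValueError("no BLEU summary line found in script output")
    scores = [float(group) for group in match.groups()]
    # scores[0] is BLEU; scores[1:] are the 1-gram to 4-gram precisions.
    return scores if return_all else scores[0]
```

For example, passing the illustrative line `"BLEU = 27.32, 59.1/33.4/21.0/13.6 (BP=0.981, ratio=0.981, hyp_len=1245, ref_len=1269)"` with `return_all=True` yields a list of five floats in the order `[BLEU, 1-gram precision, …, 4-gram precision]`.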

texar.tf.evals.corpus_bleu_moses(list_of_references, hypotheses, lowercase=False, return_all=False)

Calculates corpus-level BLEU score using the MOSES multi-bleu.perl script.

Parameters:

list_of_references – A list of lists of references, one inner list per hypothesis. Each reference can be either a string, or a list of string tokens. The list can also be a NumPy array.
hypotheses – A list of hypothesis sentences. Each hypothesis can be either a string, or a list of string tokens. The list can also be a NumPy array.
lowercase (bool) – If True, pass the “-lc” flag to the multi-bleu script.
return_all (bool) – If True, returns BLEU and all n-gram precisions.

Returns:

If return_all is False (default), returns a float32 BLEU score.

If return_all is True, returns a list of 5 float32 scores: [BLEU, 1-gram precision, …, 4-gram precision].

texar.tf.evals.accuracy(labels, preds)

Calculates the accuracy of predictions.

Parameters:

labels – The ground truth values. A Tensor of the same shape as preds.
preds – A Tensor of any shape containing the predicted values.

Returns:

A float scalar Tensor containing the accuracy.
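As a plain-Python illustration of the quantity being computed, the sketch below treats accuracy as the fraction of positions where the label and the prediction are equal. It operates on nested lists instead of Tensors, and `flatten` and `accuracy_sketch` are hypothetical names, so it shows the arithmetic only and not how the library evaluates it.

```python
def flatten(values):
    """Yield the scalar elements of an arbitrarily nested list or tuple."""
    for value in values:
        if isinstance(value, (list, tuple)):
            yield from flatten(value)
        else:
            yield value


def accuracy_sketch(labels, preds):
    """Fraction of element-wise matches between labels and preds."""
    flat_labels = list(flatten(labels))
    flat_preds = list(flatten(preds))
    if len(flat_labels) != len(flat_preds):
        raise ValueError("labels and preds must contain the same number of elements")
    if not flat_labels:
        raise ValueError("labels and preds must not be empty")
    correct = sum(1 for label, pred in zip(flat_labels, flat_preds) if label == pred)
    return correct / len(flat_labels)
```

For instance, labels `[[1, 0], [1, 1]]` and predictions `[[1, 1], [1, 0]]` agree at two of four positions.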

texar.tf.evals.binary_clas_accuracy(pos_preds=None, neg_preds=None)

Calculates the accuracy of binary predictions.

Parameters:

pos_preds (optional) – A Tensor of any shape containing the predicted values on positive data (i.e., ground truth labels are 1).
neg_preds (optional) – A Tensor of any shape containing the predicted values on negative data (i.e., ground truth labels are 0).

Returns:

A float scalar Tensor containing the accuracy.
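Because the two inputs are split by ground-truth class, a prediction counts as correct when it is 1 in the positive set or 0 in the negative set. The sketch below shows that arithmetic on flat Python lists; `binary_clas_accuracy_sketch` is a hypothetical name, and it assumes the predictions are already hard 0/1 labels rather than Tensors or probabilities.

```python
def binary_clas_accuracy_sketch(pos_preds=None, neg_preds=None):
    """Accuracy over predictions on positive (label 1) and negative (label 0) data."""
    pos = list(pos_preds) if pos_preds is not None else []
    neg = list(neg_preds) if neg_preds is not None else []
    total = len(pos) + len(neg)
    if total == 0:
        raise ValueError("at least one of pos_preds or neg_preds must be non-empty")
    # Positive examples are correct when predicted 1, negative ones when predicted 0.
    correct = sum(1 for p in pos if p == 1) + sum(1 for p in neg if p == 0)
    return correct / total
```

Either argument can be left out, in which case the result is simply the accuracy on the set that was provided.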