scripts/resulttool: do not count newly passing tests as regressions

The resulttool regression module simply compares a base test status to a
target test status. This approach raises many false positives, since every
XXX -> PASS transition (XXX being any status other than PASS) is flagged as a
regression.

- Do not list XXX -> PASS transitions in the regression report; instead, count
  them and print a summary of "newly passing tests"
- If an inspected pair has only "newly passing tests", do not print the
  detailed list, and report the pair as "Improvement" instead of "Regression"

The updated output looks like the following:
[...]
Improvement: oeselftest_fedora-37_qemux86-64_20230127010225
             oeselftest_ubuntu-22.04_qemux86-64_20230226120516
             (+1 test(s) passing)
[...]
Match:       oeselftest_almalinux-8.7_qemuarm64_20230127015830
             oeselftest_almalinux-8.7_qemuarm64_20230227015258
[...]
Regression:  oeselftest_almalinux-9.1_qemumips_20230127000217
             oeselftest_opensuseleap-15.4_qemumips_20230226130046
    ptestresult.glibc-user.debug/tst-read-chk-cancel: PASS -> None
    ptestresult.glibc-user.nptl/tst-mutexpi4: PASS -> FAIL
    ptestresult.glibc-user.nptl/tst-mutexpi5a: PASS -> FAIL
    Additionally, 44 previously failing test(s) is/are now passing

(From OE-Core rev: c335f96f687c73fde443ac330ca3e17113794d9e)

Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Alexis Lothoré 2023-02-28 19:10:52 +01:00 committed by Richard Purdie
parent c31c140746
commit 1eae53e277

@@ -190,11 +190,20 @@ def compare_result(logger, base_name, target_name, base_result, target_result):
             else:
                 logger.error('Failed to retrieved base test case status: %s' % k)
     if result:
-        resultstring = "Regression: %s\n            %s\n" % (base_name, target_name)
-        for k in sorted(result):
-            resultstring += '    %s: %s -> %s\n' % (k, result[k]['base'], result[k]['target'])
+        new_pass_count = sum(test['target'] is not None and test['target'].startswith("PASS") for test in result.values())
+        # Print a regression report only if at least one test has a regression status (FAIL, SKIPPED, absent...)
+        if new_pass_count < len(result):
+            resultstring = "Regression:  %s\n             %s\n" % (base_name, target_name)
+            for k in sorted(result):
+                if not result[k]['target'] or not result[k]['target'].startswith("PASS"):
+                    resultstring += '    %s: %s -> %s\n' % (k, result[k]['base'], result[k]['target'])
+            if new_pass_count > 0:
+                resultstring += f'    Additionally, {new_pass_count} previously failing test(s) is/are now passing\n'
+        else:
+            resultstring = "Improvement: %s\n             %s\n             (+%d test(s) passing)" % (base_name, target_name, new_pass_count)
+            result = None
     else:
-        resultstring = "Match:      %s\n            %s" % (base_name, target_name)
+        resultstring = "Match:       %s\n             %s" % (base_name, target_name)
     return result, resultstring
 def get_results(logger, source):
@@ -269,9 +278,9 @@ def regression_common(args, logger, base_results, target_results):
         else:
             notfound.append("%s not found in target" % a)
     print("\n".join(sorted(matches)))
+    print("\n")
     print("\n".join(sorted(regressions)))
     print("\n".join(sorted(notfound)))
     return 0
 def regression_git(args, logger):
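
The classification described in the commit message can be tried in isolation with a small sketch. The helper name `classify_changes` and the test name `ptestresult.example/now-ok` are hypothetical and are not part of resulttool. The input is assumed to have the same shape as the `result` dict built in `compare_result()`, that is, only tests whose status differs between base and target.

```python
def classify_changes(changes):
    """Classify {test: {'base': status, 'target': status}} status changes.

    Hypothetical helper, not part of resulttool: it mirrors the decision
    compare_result() makes on the tests whose status changed.
    """
    if not changes:
        return "Match", [], 0

    def now_passing(change):
        # A target of None means the test is absent from the target run.
        target = change['target']
        return target is not None and target.startswith("PASS")

    new_pass_count = sum(now_passing(c) for c in changes.values())
    # Everything that did not end up passing is reported as a regression.
    regressions = sorted(k for k, c in changes.items() if not now_passing(c))
    label = "Regression" if regressions else "Improvement"
    return label, regressions, new_pass_count


changes = {
    "ptestresult.glibc-user.nptl/tst-mutexpi4": {"base": "PASS", "target": "FAIL"},
    "ptestresult.glibc-user.debug/tst-read-chk-cancel": {"base": "PASS", "target": None},
    # Made-up test name, used only to show a newly passing test.
    "ptestresult.example/now-ok": {"base": "FAIL", "target": "PASS"},
}
label, regressions, new_pass_count = classify_changes(changes)
print(label, regressions, new_pass_count)
```

A pair is reported as "Improvement" only when every changed test now passes; a single FAIL, SKIPPED, or missing target status keeps it a "Regression", with the newly passing tests reduced to a count.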