From: | Michael Käppler |
Subject: | Re: search for better regtest comparison algorithm |
Date: | Wed, 31 Jul 2024 22:20:28 +0200 |
User-agent: | Mozilla Thunderbird |
Point taken. Maybe it would be good to take a step back, though. The original example that you came up with was a false negative, namely a missing object that went unnoticed. Now we're discussing all kinds of complicated algorithms to reduce the probability of false negatives, while also trying to avoid false positives.

My question is: do we really have a problem with false positives? I had a quick glance at the last MRs that still had their artifacts available and found no example except !2391 with scores "below threshold". Does this happen frequently? If not, wouldn't it suffice to improve the sensitivity of the comparison process and not introduce machinery that tries to discriminate between "good" and "bad" changes? We could, for example, render tests with unclear results a second time at higher resolution.

I fear that every algorithm, however sophisticated it may be, will mainly be designed with the failure modes we're expecting in mind. However, a test system should be "objective" in the sense that it catches frequent failure modes as well as strange and rare ones.

Michael

On 29.07.2024 at 16:54, Werner LEMBERG wrote:
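The two-pass idea above could look roughly like the following sketch. Everything here is an assumption for illustration only: the `render` callback, the score metric (fraction of differing pixels), and the threshold values are hypothetical and do not reflect LilyPond's actual `output-distance` code.

```python
# Hypothetical sketch of "render unclear results again at higher resolution".
# The render callback, thresholds, and score metric are illustrative
# assumptions, not LilyPond's real comparison machinery.

def diff_score(old, new):
    """Fraction of differing pixels between two equally sized bitmaps."""
    assert len(old) == len(new)
    return sum(a != b for a, b in zip(old, new)) / len(old)

def compare(render, test, hi=0.01, resolutions=(101, 303)):
    """Two-pass comparison with escalation for unclear results.

    render(test, resolution) -> (old_bitmap, new_bitmap), where a bitmap
    is a flat sequence of pixel values at that resolution.
    Returns "pass" or "fail".
    """
    for res in resolutions:
        old, new = render(test, res)
        score = diff_score(old, new)
        if score == 0.0:
            return "pass"   # identical output: no second pass needed
        if score > hi:
            return "fail"   # clearly above threshold: flag for review
        # 0 < score <= hi: unclear, so retry at the next (higher) resolution
    return "fail"           # still unclear at maximum resolution: flag it
```

The point of the sketch is that no clever "good vs. bad change" discrimination is attempted; borderline scores simply trigger a more sensitive re-render before a verdict is given.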
> Or you run it "horizontally": [...]

Please don't change the topic of this thread to how to improve/modify/whatever the regression test system. It runs just fine, and IMHO we don't have to change that (except if you *insist* on doing your suggested changes yourself, also maintaining them for the next 20 years :-).

Right now, the only thing that needs improvement is the threshold algorithm.

Werner