Robust tests for graphics packages

Robust tests for graphics packages

Tom Breloff
I'd like to make my testing robust for Plots.jl (especially since I'll want to know if plotting backends change)... what is the recommended way to test packages that produce graphics?  I suppose I can test a lot by generating plots, writing them to PNG, and then checking that the raw bytes equal those of a known-good PNG?  Is there an accepted best practice here?  What would be the best way to compare the files?  Is there any good way to test that a GUI window is working properly?

Any advice is appreciated.  Thanks.

Re: Robust tests for graphics packages

Tim Holy
This is an important issue, and I think it's fair to say we don't have a great
solution. Both Winston and Compose do something like this in their tests.
Unfortunately, both projects have issues with different versions of Julia---or
different versions of underlying libraries like libcairo on the maintainer's vs.
Travis' machines---giving minor differences in output.

Images has an ssd method that computes the sum of squared differences---perhaps
we should use that? Or do it manually in combination with imfilter_gaussian, to
blur out any differences in alignment on the scale of a pixel or two?

Best,
--Tim
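
A minimal sketch of that ssd-plus-blur idea, written against the same Images API used elsewhere in this thread; the imread and ssd calls and the 0.1% tolerance are assumptions, not a tested recipe:

# Untested sketch: compare a freshly generated plot against a stored reference.
using Images

ref = float32(imread("reference.png"))   # known-good output
cur = float32(imread("current.png"))     # output from this test run

# Blur both slightly so one-pixel shifts don't dominate the score
refb = imfilter_gaussian(ref, [1, 1])
curb = imfilter_gaussian(cur, [1, 1])

err = ssd(refb, curb)                            # sum of squared differences
pow = (sum(abs2, refb) + sum(abs2, curb)) / 2    # normalization, as in the demo below
if err > 1e-3 * pow
    error("current.png differs from reference.png")
end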


Re: Robust tests for graphics packages

Tom Breloff
Tim: can you produce a minimal working example of how you would compare two images (using something like your second paragraph) to within 99.9% similarity using Images.jl?  Ideally the correctness threshold would be flexible, so you could validate exact matches and approximate matches with the same method.

Should I start a new package (it could go under JuliaGraphics) to streamline these sorts of comparisons/tests so that other packages can use them?

Re: Robust tests for graphics packages

Tim Holy
I don't think we need a whole package for this, because it's only a handful of
lines. If you want, we could add a @test_images_approx_eq_eps macro to
Images.jl. Or we could add it to ImageView and, in cases where they differ,
make it pop up a window that displays the two images :-).

Here's a demo:

# Get set up
using Images, TestImages
img1 = testimage("mandrill")
# Let's make img2 a single-pixel circular shift of img1
img2 = shareproperties(img1, img1[:,[size(img1,2);1:size(img1,2)-1]])

# Here's the calculation
img1f = float32(img1)   # to make sure Ufixed8 doesn't overflow
img2f = float32(img2)
imgdiff = img1f - img2f
imgblur = imfilter_gaussian(imgdiff, [1,1])  # gaussian with sigma=1pixel
err = sum(abs, imgblur)
pow = (sum(abs, img1f) + sum(abs, img2f))/2
if err > 0.001*pow
    error("img1 and img2 differ")
end

This example gives a 6% difference, which is over your threshold. If you blur
with sigma = 2 pixels, the error drops to 2.6%.

The idea behind blurring is that shifts result in adjacent negative and
positive values, which cancel when you blur.

--Tim
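
Wrapped up as a reusable predicate (a hypothetical images_approx_equal helper distilled from the demo above; the default sigma and tolerance are arbitrary):

# Hypothetical helper based on the demo above; untested sketch.
using Images

function images_approx_equal(img1, img2; sigma = 1.0, tol = 0.001)
    a = float32(img1)                        # avoid Ufixed8 overflow
    b = float32(img2)
    blurred = imfilter_gaussian(a - b, [sigma, sigma])
    err = sum(abs, blurred)
    pow = (sum(abs, a) + sum(abs, b)) / 2    # normalize by the average image "energy"
    err <= tol * pow
end

# e.g.  @assert images_approx_equal(reference_img, rendered_img; tol = 0.01)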


Re: Robust tests for graphics packages

Tom Breloff
I like the idea of adding the underlying comparison logic to Images.  

In regards to ImageView, it might be cool to be able to show the images side by side, along with a third image that is the "difference between each pixel" (I'm not exactly sure what that means... maybe a grayscale image using `colordiff` on each pixel color?)  I don't know ImageView well, but based on my assumptions this would be a relatively easy thing to make.  Thoughts?
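
For what it's worth, a rough sketch of that third image, assuming colordiff from Colors and the grayim constructor behave as expected; the division by 100 to squeeze the color differences into [0,1] is arbitrary:

# Untested sketch: per-pixel perceptual difference (colordiff) as a grayscale image.
using Images, Colors

function diffimage(img1, img2)
    d = Float32[colordiff(img1[i], img2[i]) for i in 1:length(img1)]
    grayim(reshape(d, size(img1)) / 100)   # scale CIEDE2000 differences roughly into [0,1]
end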

Re: Robust tests for graphics packages

Seth
There's a related use case here with graphs and compressed file formats / visualizations that may argue for a more general solution. It would be great to be able to take a file of arbitrary format and do comparisons on it. (Too complicated for a first cut?)

Seth.


Re: Robust tests for graphics packages

Tim Holy
In reply to this post by Tom Breloff

Yep, pretty easy. I can try to tackle this in the next day or two; if I
forget, file an issue at Images.

--Tim

Re: Robust tests for graphics packages

Tim Holy
In reply to this post by Seth
If you mean byte-level comparisons, that's easy. But it's also relatively unrelated
to the solution Tom and I are converging on, which is quite specific to image
comparisons.

--Tim
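
For the byte-level case, a single line does it (on a recent Julia, read(path) returns the file's raw bytes):

# Byte-for-byte comparison of two arbitrary files; no image semantics involved.
files_identical(a, b) = read(a) == read(b)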

Re: Robust tests for graphics packages

Simon Danisch
In reply to this post by Tom Breloff
I wanted to try this at some point: http://pdiff.sourceforge.net/
I also opened an issue so I don't forget it:
https://github.com/SimonDanisch/GLPlot.jl/issues/22

Best,
Simon

Re: Robust tests for graphics packages

Chris Foster
I used pdiff a lot when I used to hack on the aqsis renderer a while
back.  It was quite useful in the regression test suite where we
wanted to render out a whole bunch of different scenes and check that
they matched a set of reference images.  (Page describing the aqsis
image regression test suite:
https://github.com/aqsis/aqsis/wiki/Regression-Testing.  Test suite
code: https://svn.code.sf.net/p/aqsis/svn/trunk/testing/regression/
(yikes, is that an svn repo!?))

Having said that, I'm not sure pdiff is quite what you want: it was
designed for comparing output from 3D software renderers where you
typically use stochastic light integration to compute a solution to
the rendering equation.  In the film rendering world, if you can get
away with a cheaper approximation which is perceptually the same, then
you're winning.

If it's just rendering of 2D vector graphics or 3D OpenGL without
sophisticated lighting, you may be better off just looking at the
image differences and comparing with some simple absolute thresholds.
I remember a discussion about image differencing on the OpenImageIO
list a while back.  In particular, this comment from Larry Gritz:

> I haven't been very satisfied by the Yee (--pdiff) method. More specifically, although it sounds good theoretically, in practice I just haven't found any situations in which it's a superior approach to just using regular diffs with carefully thought-out thresholds and percentages.

See the thread here:
http://lists.openimageio.org/pipermail/oiio-dev-openimageio.org/2014-August/013434.html

Cheers,
~Chris
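
A sketch of that simple-threshold idea, operating on the two images as plain Float arrays; both limits below are placeholders that would need the "careful thought" Larry mentions:

# Untested sketch: plain absolute differences against two fixed limits.
# `a` and `b` are the two images converted to Float arrays (e.g. float32(img)).
function within_thresholds(a, b; maxdiff = 0.05, maxcount = 100)
    d = map(abs, a - b)                     # per-pixel absolute difference
    maximum(d) <= maxdiff || return false   # no single pixel wildly off
    sum(d .> 0.004) <= maxcount             # not too many pixels off at all (0.004 ~ one 8-bit level)
end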

Re: Robust tests for graphics packages

J Luis
In GMT we use GraphicsMagick to compare PostScript figures; the output is a PNG. I have implemented it in my WIP port of GMT, see an example in

https://github.com/joa-quim/GMT.jl/blob/master/src/gmtest.jl
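
For reference, roughly what driving that from Julia could look like; the gm compare flags below are from memory and should be checked against `gm compare -help` for your GraphicsMagick version:

# Untested sketch: compare two rasterized figures with GraphicsMagick.
function gm_compare(ref::AbstractString, test::AbstractString)
    # -metric MSE prints the error metric; -file writes a difference image
    readall(`gm compare -metric MSE -file diff.png $ref $test`)
end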

Re: Robust tests for graphics packages

Gunnar Farnebäck
In reply to this post by Tim Holy
A common approach today, if you want more leeway than pixelwise comparison, is the Structural Similarity Index (SSIM),
https://en.wikipedia.org/wiki/Structural_similarity

Not necessarily worth the added complexity though.
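
For concreteness, a global (single-window) form of the index is only a few lines; real SSIM computes this over local windows and averages the result, and the constants below assume intensities in [0,1]:

# Untested sketch: global SSIM for two images given as Float arrays in [0,1].
function ssim_global(x, y; C1 = 0.01^2, C2 = 0.03^2)
    mx, my = mean(x), mean(y)
    vx, vy = var(x), var(y)
    cxy = mean((x .- mx) .* (y .- my))      # covariance
    ((2 * mx * my + C1) * (2 * cxy + C2)) /
        ((mx^2 + my^2 + C1) * (vx + vy + C2))
end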

Re: Robust tests for graphics packages

Tom Breloff
SSIM looks relatively straightforward... could we add that to Images as well?  I'm happy to put it on my todo list.

Re: Robust tests for graphics packages

Andreas Lobinger
In reply to this post by Simon Danisch
Hello colleagues,

libcairo has a local pdiff (in test/pdiff) that is used in its extensive regression testing. Reading the header of the main program, it looks like an earlier version of pdiff. The first step is a comparison of sizes, then a binary comparison, then a perceptual one.
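
In Julia terms, that staged comparison is roughly the following, where perceptually_close is a placeholder name for whatever perceptual check gets settled on:

# Untested sketch of cairo-style staged comparison: cheap checks first,
# perceptual comparison only as the last resort.
function images_match(refpath, testpath)
    if filesize(refpath) == filesize(testpath) && read(refpath) == read(testpath)
        return true                        # byte-identical files trivially match
    end
    perceptually_close(refpath, testpath)  # hypothetical perceptual fallback
end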

Re: Robust tests for graphics packages

Andreas Lobinger
In reply to this post by Tom Breloff
Did you come to a conclusion about what to use? I'd be interested in sharing the effort.

Re: Robust tests for graphics packages

Tom Breloff
I haven't come to any conclusions yet.  If you could produce some sample tests using Cairo's pdiff, that would be helpful.

Re: Robust tests for graphics packages

Andreas Lobinger
Hello colleague,

Cairo's pdiff (built in test/pdiff with `make perceptualdiff`) reports the difference between two images [two PNG screenshots inline in the original post; the second has a small Gaussian blur applied to the middle of the picture] as:

lobi@orange3:~/cairo_int/cairo-1.14.0/test/pdiff$ ./perceptualdiff  ~/cairo-speed-lines.png ~/cairo-speed-lines1.png -verbose
Field of view is 45.000000 degrees
Threshold pixels is 100 pixels
The Gamma is 2.200000
The Display's luminance is 100.000000 candela per meter squared
PASS: Images are perceptually indistinguishable

Re: Robust tests for graphics packages

Tom Breloff
Thanks. Can this be automated? Or do we have to scrape/parse the command line output?

Re: Robust tests for graphics packages

Andreas Lobinger


The main program (perceptualdiff.c -> main) returns a status, so running it in a shell and checking the exit status should be easy...
However, to get the program you need a full local cairo build, and that pulls in a lot of dependencies. Initially I thought about building the compare function into a library and attaching it via ccall, but my recent experiences with BinDeps have cooled my optimism about using external libraries with Julia.
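
Getting at that status from Julia needs no parsing at all; something like this, with the binary path pointing at your local cairo build:

# Untested sketch: use perceptualdiff's exit status as the pass/fail signal.
const PDIFF = "/path/to/cairo/test/pdiff/perceptualdiff"   # adjust to your build

images_perceptually_equal(a, b) = success(`$PDIFF $a $b`)

# images_perceptually_equal("reference.png", "current.png") || error("images differ")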

But looking at the paper and around the software: why not do a re-implementation (of the compare function) in Julia?