Julia 0.5 discussions: ReshapedArrays and indexing performance

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
19 messages Options
Reply | Threaded
Open this post in threaded view
|

Julia 0.5 discussions: ReshapedArrays and indexing performance

Tim Holy
Since there seems to be some interest in discussing strategy for 0.5 (and not
endlessly hijacking the thread on the debugger), I'll start a new thread. I've
been pretty involved in reworking julia's AbstractArray infrastructure in the
past, and I intend to keep participating, but I'm also pretty excited about
the possibilities right now in graphics. Since there's only so much time in
the day, I'll throw out my biggest concern and see whether folks have any good
ideas and/or whether someone wants to pick up a ball and run with it.

In making the transition to defaulting to array views (SubArrays and their
future cousins), I think the insufficiently-recognized elephant in the room is
`reshape`. There seems to be quite a lot of consensus that `reshape` should
return a view rather than a copy. I've demonstrated in the past that `reshape`
does not compose in a closed fashion with `sub`, and come up with a plan for a
2nd view type currently called a ReshapedArray
(https://github.com/JuliaLang/julia/pull/10507).

The kicker is performance. Because indexing a reshaped array often involves a
call to `div`, and `div` is dirt-slow, an alternative is highly desirable.
There are clever ways around its slowness
(http://www.flounder.com/multiplicative_inverse.htm) and we (Simon Kornblith,
with small extensions by me) implemented these algorithms. Unfortunately, in
preliminary tests (see #10507) they are still not nearly as fast as I'd like.
I am not sure how much of this is limitations in strategy, my implementation,
poor (improvable) codegen, or just "that's how it is." It would be interesting
to benchmark against implementations like http://libdivide.com/ and see how
we're doing.

In my opinion, this highly-focused issue is likely to result in the most
difficult decisions we have to make for Arraymagedon. If somebody wants to dive
into this and see how much can be satisfactorily resolved, it would be a big
contribution.

Best,
--Tim

Reply | Threaded
Open this post in threaded view
|

Re: Julia 0.5 discussions: ReshapedArrays and indexing performance

Matt Bauman
I'll begin looking into this in-depth next week, but others are very welcome to dig in, too.  I've been intending to create an "Array Nirvana 2" umbrella issue to track all the things we want to accomplish for Arrays during 0.5, but haven't had a chance to do it yet.

Here's a quick random spattering of topics off the top of my head:

* User-extensible @inbounds
* ReshapedArrays (requires @inbounds, better perf - although if we get relatively close we could keep the current Array implementation and only use these guys as a fallback)
* Slices as views (requires ReshapedArrays, @inbounds)

* Drop scalar dimensions (could happen independently, relatively simple implementation-wise, unknown how breaking it will prove to be)

* NTuple/Vararg #11242
* Allow arbitrary elements through non-scalar indexing, like #12567

It'd also be cool to do some explorations into the Buffer type: #12447, but ideally that shouldn't be breaking and could happen at any point… I think.

On Monday, September 14, 2015 at 10:00:50 PM UTC-4, Tim Holy wrote:
Since there seems to be some interest in discussing strategy for 0.5 (and not
endlessly hijacking the thread on the debugger), I'll start a new thread. I've
been pretty involved in reworking julia's AbstractArray infrastructure in the
past, and I intend to keep participating, but I'm also pretty excited about
the possibilities right now in graphics. Since there's only so much time in
the day, I'll throw out my biggest concern and see whether folks have any good
ideas and/or whether someone wants to pick up a ball and run with it.

In making the transition to defaulting to array views (SubArrays and their
future cousins), I think the insufficiently-recognized elephant in the room is
`reshape`. There seems to be quite a lot of consensus that `reshape` should
return a view rather than a copy. I've demonstrated in the past that `reshape`
does not compose in a closed fashion with `sub`, and come up with a plan for a
2nd view type currently called a ReshapedArray
(<a href="https://github.com/JuliaLang/julia/pull/10507" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2FJuliaLang%2Fjulia%2Fpull%2F10507\46sa\75D\46sntz\0751\46usg\75AFQjCNEYHpBCIlHXC2yEeP3_lvUwFjTWYQ&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2FJuliaLang%2Fjulia%2Fpull%2F10507\46sa\75D\46sntz\0751\46usg\75AFQjCNEYHpBCIlHXC2yEeP3_lvUwFjTWYQ&#39;;return true;">https://github.com/JuliaLang/julia/pull/10507).

The kicker is performance. Because indexing a reshaped array often involves a
call to `div`, and `div` is dirt-slow, an alternative is highly desirable.
There are clever ways around its slowness
(<a href="http://www.flounder.com/multiplicative_inverse.htm" target="_blank" rel="nofollow" onmousedown="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.flounder.com%2Fmultiplicative_inverse.htm\46sa\75D\46sntz\0751\46usg\75AFQjCNFjN8qRGz_--Vj9QN9hOMK5dlK3JA&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.flounder.com%2Fmultiplicative_inverse.htm\46sa\75D\46sntz\0751\46usg\75AFQjCNFjN8qRGz_--Vj9QN9hOMK5dlK3JA&#39;;return true;">http://www.flounder.com/multiplicative_inverse.htm) and we (Simon Kornblith,
with small extensions by me) implemented these algorithms. Unfortunately, in
preliminary tests (see #10507) they are still not nearly as fast as I'd like.
I am not sure how much of this is limitations in strategy, my implementation,
poor (improvable) codegen, or just "that's how it is." It would be interesting
to benchmark against implementations like <a href="http://libdivide.com/" target="_blank" rel="nofollow" onmousedown="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Flibdivide.com%2F\46sa\75D\46sntz\0751\46usg\75AFQjCNGwR7EyfIn5GWMhJsOdznT1jvg2xA&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Flibdivide.com%2F\46sa\75D\46sntz\0751\46usg\75AFQjCNGwR7EyfIn5GWMhJsOdznT1jvg2xA&#39;;return true;">http://libdivide.com/ and see how
we're doing.

In my opinion, this highly-focused issue is likely to result in the most
difficult decisions we have to make for Arraymagedon. If somebody wants to dive
into this and see how much can be satisfactorily resolved, it would be a big
contribution.

Best,
--Tim

Reply | Threaded
Open this post in threaded view
|

Re: Julia 0.5 discussions: ReshapedArrays and indexing performance

Tim Holy
Yep, that's a good list. Really, I think from a technical perspective the two
big items are @inbounds and ReshapedArrays. I guess views of BitArrays could
be another stumbling block. The rest is mostly dealing with the resulting
chaos :-).

--Tim

On Tuesday, September 15, 2015 08:44:36 AM Matt Bauman wrote:

> I'll begin looking into this in-depth next week, but others are very
> welcome to dig in, too.  I've been intending to create an "Array Nirvana 2"
> umbrella issue to track all the things we want to accomplish for Arrays
> during 0.5, but haven't had a chance to do it yet.
>
> Here's a quick random spattering of topics off the top of my head:
>
> * User-extensible @inbounds
> * ReshapedArrays (requires @inbounds, better perf - although if we get
> relatively close we could keep the current Array implementation and only
> use these guys as a fallback)
> * Slices as views (requires ReshapedArrays, @inbounds)
>
> * Drop scalar dimensions (could happen independently, relatively simple
> implementation-wise, unknown how breaking it will prove to be)
>
> * NTuple/Vararg #11242
> * Allow arbitrary elements through non-scalar indexing, like #12567
>
> It'd also be cool to do some explorations into the Buffer type: #12447, but
> ideally that shouldn't be breaking and could happen at any point… I think.
>
> On Monday, September 14, 2015 at 10:00:50 PM UTC-4, Tim Holy wrote:
> > Since there seems to be some interest in discussing strategy for 0.5 (and
> > not
> > endlessly hijacking the thread on the debugger), I'll start a new thread.
> > I've
> > been pretty involved in reworking julia's AbstractArray infrastructure in
> > the
> > past, and I intend to keep participating, but I'm also pretty excited
> > about
> > the possibilities right now in graphics. Since there's only so much time
> > in
> > the day, I'll throw out my biggest concern and see whether folks have any
> > good
> > ideas and/or whether someone wants to pick up a ball and run with it.
> >
> > In making the transition to defaulting to array views (SubArrays and their
> > future cousins), I think the insufficiently-recognized elephant in the
> > room is
> > `reshape`. There seems to be quite a lot of consensus that `reshape`
> > should
> > return a view rather than a copy. I've demonstrated in the past that
> > `reshape`
> > does not compose in a closed fashion with `sub`, and come up with a plan
> > for a
> > 2nd view type currently called a ReshapedArray
> > (https://github.com/JuliaLang/julia/pull/10507).
> >
> > The kicker is performance. Because indexing a reshaped array often
> > involves a
> > call to `div`, and `div` is dirt-slow, an alternative is highly desirable.
> > There are clever ways around its slowness
> > (http://www.flounder.com/multiplicative_inverse.htm) and we (Simon
> > Kornblith,
> > with small extensions by me) implemented these algorithms. Unfortunately,
> > in
> > preliminary tests (see #10507) they are still not nearly as fast as I'd
> > like.
> > I am not sure how much of this is limitations in strategy, my
> > implementation,
> > poor (improvable) codegen, or just "that's how it is." It would be
> > interesting
> > to benchmark against implementations like http://libdivide.com/ and see
> > how
> > we're doing.
> >
> > In my opinion, this highly-focused issue is likely to result in the most
> > difficult decisions we have to make for Arraymagedon. If somebody wants to
> > dive
> > into this and see how much can be satisfactorily resolved, it would be a
> > big
> > contribution.
> >
> > Best,
> > --Tim

Reply | Threaded
Open this post in threaded view
|

Re: Julia 0.5 discussions: ReshapedArrays and indexing performance

Steven G. Johnson
In reply to this post by Tim Holy
On Monday, September 14, 2015 at 10:00:50 PM UTC-4, Tim Holy wrote:
The kicker is performance. Because indexing a reshaped array often involves a
call to `div`, and `div` is dirt-slow, an alternative is highly desirable.

Presumably this is not a problem for any array type that has fast linear indexing?  You don't need div to index into a reshaped Array, for example. 
Reply | Threaded
Open this post in threaded view
|

Re: Julia 0.5 discussions: ReshapedArrays and indexing performance

Matt Bauman
On Monday, September 21, 2015 at 12:46:56 PM UTC-4, Steven G. Johnson wrote:
On Monday, September 14, 2015 at 10:00:50 PM UTC-4, Tim Holy wrote:
The kicker is performance. Because indexing a reshaped array often involves a
call to `div`, and `div` is dirt-slow, an alternative is highly desirable.

Presumably this is not a problem for any array type that has fast linear indexing?  You don't need div to index into a reshaped Array, for example. 

The trouble is the interaction with views.  SubArrays require indexing at full dimensionality, and we'll soon have tons of them.
Reply | Threaded
Open this post in threaded view
|

Re: Julia 0.5 discussions: ReshapedArrays and indexing performance

Andy Ferris
I'd like to add an new item to the wish list for "arraymageddon", and this seemed like the best place to post it. The basic idea is a simple (and perhaps built-in) way of doing multilinear algebra (tensor contractions, einsum, or whatever you wish to call it).

In github.com/andyferris/Tensors.jl I've implemented a naming scheme for the indices of arrays that is convenient for multilinear algebra, tensor contractions, index permutations, etc. Though there are different ways of doing this (via macros, Jutho's TensorOperations.jl package, etc), the names of the indices are included in the type (as a type-Tuple{} of integers or symbols, as the user prefers) and assigned via getindex(::Type{Tuple{...}}) (and setindex!, etc). The idea was (a) remove run-time overhead associated with figuring out where indices match and (b) have persistent types that remember what they are. For (a), generated functions are used, so compile-time might be a bit slower. At the moment, the syntax is a bit annoying and thus helper macros are useful, but... there is hope:

If we use {...} as shorthand for Tuple{...} in Julia 0.5 we the nice tensor indexing via [{ ... }]. For a couple of examples

    A[{3,2,1}] = B[{1,2,3}]   # A = permutedims(B,[3,2,1])

or 

    A[{:left,:middle}] * B[{:middle,:right}]    # matrix multiplication, returning a tensor with indices {:left,:right}

which is relatively obvious, pretty terse, quite flexible and convenient for higher-order tensors, while letting the user give sensible, self-documenting names to the indices if they want.

Anyway, what do people think? Is this of broad interest? Does the way I've implemented this seem OK? 

On a general note, I strongly support using {} for Tuple{} in Julia 0.5 as it is a very convenient way of passing *compile-time* information to functions, which is something I can see myself doing more and more (and more) in Julia. Combined with generated functions, it's way more powerful than I've seen in any other programming language.

A couple of related questions ('cause I don't know where to post them).

(1) Is there much difference between Val{...} and Tuple types of a single index? If not, we could use both as {} (i.e. remove Val), and if so, use {X} for Val{X} and {X,} for Tuple{X} in analogy with tuple instances. Then people could pass {true} and {false} to functions instead of Val{true} or Val{false}.
(2) Is there going to be some fix to A += B assigning extra memory? To my use cases, this is *clearly* Julia's biggest current problem. Do we need a mutating + operator like +! (that way we can remember to only use it with mutating types)? Approaches in other languages (I'm thinking of C++) is allowing to overload the meaning of =, +=, etc and/or having a distinction between lvalues and rvalues (which help for delayed evaluation on a line-by-line level, which I feel is the least surprising version (to the programmer) of delayed evaluation).


Reply | Threaded
Open this post in threaded view
|

Re: Julia 0.5 discussions: ReshapedArrays and indexing performance

Simon Kornblith
In reply to this post by Tim Holy
The ideal approach would be some kind of optimization pass that can transform loops to eliminate the division. For the simplest case, you want to be able to transform:

for i = a:b
    i2
, i1 = divrem(i-1, m)
    i1
, i2 = i1+1, i2+1
    f
(i1, i2)
end

into:

l2, l1 = divrem(a-1, m)
u2
, u1 = divrem(b-1, m)
l1
, l2, u1, u2 = l1+1, l2+1, u1+1, u2+1
for i2 = l2:u2, i1 = (i2 == l2 ? l1 : 1):(i2 == u2 ? u1 : m)
    f
(i1, i2)
end

It may be possible to speed up fast integer division, but I don't think it will ever be able to reach this level of performance. Additionally, the per-iteration cost of fast integer division scales with the number of dimensions, whereas an appropriately transformed loop only has an add.

Simon

On Monday, September 14, 2015 at 10:00:50 PM UTC-4, Tim Holy wrote:
Since there seems to be some interest in discussing strategy for 0.5 (and not
endlessly hijacking the thread on the debugger), I'll start a new thread. I've
been pretty involved in reworking julia's AbstractArray infrastructure in the
past, and I intend to keep participating, but I'm also pretty excited about
the possibilities right now in graphics. Since there's only so much time in
the day, I'll throw out my biggest concern and see whether folks have any good
ideas and/or whether someone wants to pick up a ball and run with it.

In making the transition to defaulting to array views (SubArrays and their
future cousins), I think the insufficiently-recognized elephant in the room is
`reshape`. There seems to be quite a lot of consensus that `reshape` should
return a view rather than a copy. I've demonstrated in the past that `reshape`
does not compose in a closed fashion with `sub`, and come up with a plan for a
2nd view type currently called a ReshapedArray
(<a href="https://github.com/JuliaLang/julia/pull/10507" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2FJuliaLang%2Fjulia%2Fpull%2F10507\46sa\75D\46sntz\0751\46usg\75AFQjCNEYHpBCIlHXC2yEeP3_lvUwFjTWYQ&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2FJuliaLang%2Fjulia%2Fpull%2F10507\46sa\75D\46sntz\0751\46usg\75AFQjCNEYHpBCIlHXC2yEeP3_lvUwFjTWYQ&#39;;return true;">https://github.com/JuliaLang/julia/pull/10507).

The kicker is performance. Because indexing a reshaped array often involves a
call to `div`, and `div` is dirt-slow, an alternative is highly desirable.
There are clever ways around its slowness
(<a href="http://www.flounder.com/multiplicative_inverse.htm" target="_blank" rel="nofollow" onmousedown="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.flounder.com%2Fmultiplicative_inverse.htm\46sa\75D\46sntz\0751\46usg\75AFQjCNFjN8qRGz_--Vj9QN9hOMK5dlK3JA&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.flounder.com%2Fmultiplicative_inverse.htm\46sa\75D\46sntz\0751\46usg\75AFQjCNFjN8qRGz_--Vj9QN9hOMK5dlK3JA&#39;;return true;">http://www.flounder.com/multiplicative_inverse.htm) and we (Simon Kornblith,
with small extensions by me) implemented these algorithms. Unfortunately, in
preliminary tests (see #10507) they are still not nearly as fast as I'd like.
I am not sure how much of this is limitations in strategy, my implementation,
poor (improvable) codegen, or just "that's how it is." It would be interesting
to benchmark against implementations like <a href="http://libdivide.com/" target="_blank" rel="nofollow" onmousedown="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Flibdivide.com%2F\46sa\75D\46sntz\0751\46usg\75AFQjCNGwR7EyfIn5GWMhJsOdznT1jvg2xA&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Flibdivide.com%2F\46sa\75D\46sntz\0751\46usg\75AFQjCNGwR7EyfIn5GWMhJsOdznT1jvg2xA&#39;;return true;">http://libdivide.com/ and see how
we're doing.

In my opinion, this highly-focused issue is likely to result in the most
difficult decisions we have to make for Arraymagedon. If somebody wants to dive
into this and see how much can be satisfactorily resolved, it would be a big
contribution.

Best,
--Tim

Reply | Threaded
Open this post in threaded view
|

Re: Julia 0.5 discussions: ReshapedArrays and indexing performance

Tobias Grosser
In reply to this post by Tim Holy
On 09/15/2015 04:00 AM, Tim Holy wrote:

> Since there seems to be some interest in discussing strategy for 0.5 (and not
> endlessly hijacking the thread on the debugger), I'll start a new thread. I've
> been pretty involved in reworking julia's AbstractArray infrastructure in the
> past, and I intend to keep participating, but I'm also pretty excited about
> the possibilities right now in graphics. Since there's only so much time in
> the day, I'll throw out my biggest concern and see whether folks have any good
> ideas and/or whether someone wants to pick up a ball and run with it.
>
> In making the transition to defaulting to array views (SubArrays and their
> future cousins), I think the insufficiently-recognized elephant in the room is
> `reshape`. There seems to be quite a lot of consensus that `reshape` should
> return a view rather than a copy. I've demonstrated in the past that `reshape`
> does not compose in a closed fashion with `sub`, and come up with a plan for a
> 2nd view type currently called a ReshapedArray
> (https://github.com/JuliaLang/julia/pull/10507).
>
> The kicker is performance. Because indexing a reshaped array often involves a
> call to `div`, and `div` is dirt-slow, an alternative is highly desirable.
> There are clever ways around its slowness
> (http://www.flounder.com/multiplicative_inverse.htm) and we (Simon Kornblith,
> with small extensions by me) implemented these algorithms. Unfortunately, in
> preliminary tests (see #10507) they are still not nearly as fast as I'd like.
> I am not sure how much of this is limitations in strategy, my implementation,
> poor (improvable) codegen, or just "that's how it is." It would be interesting
> to benchmark against implementations like http://libdivide.com/ and see how
> we're doing.
>
> In my opinion, this highly-focused issue is likely to result in the most
> difficult decisions we have to make for Arraymagedon. If somebody wants to dive
> into this and see how much can be satisfactorily resolved, it would be a big
> contribution.

Dear all,

this might be an interesting moment to give some status update on Polly
and its capabilities of optimizing Julia (like) code:

1) We just gained a first-version of run-time bounds-check elimination in Polly

https://github.com/llvm-mirror/polly/commit/41ce0a26932208b1d74ef7c6783f45da37f16e0e

This bound check elimination drops bounds checks as added by address sanitizer
with the option: -fsanitize=vla-bound

For Julia, we still require a small patch to LLVM's post-dominator tree. More
importantly, Julia formulates its bounds checks in a way that we can not drop them.
Specifically, Julia does not check each individual dimension, but only checks if an
access is inside the memory region allocated for an array. As the size of the
region is a polynomial expression in the array-size parameters, we can not
reason with our linear models about Julia bounds checks and can consequently
not drop them. I wonder if it would not make more sense to check each dimensionS
individually?

2) Delinearization

www.grosser.es/#pub-optimistic-delinerization

Polly is now capable of reason about array accesses as they are currently generated
by Julia (sometimes still requiring some loop-invariant-code-motion which we
currently automatize) and can give some nice speed-ups for common dense linear-algebra
kernels or stencils. The point to make here is that for Polly it is important to
know about the shape of the array at compile-time. Reshaped arrays will make it
harder for Polly to reason about them. They will break our array-recovery code (which
we might be able to fix), but to my understanding they will make it impossible to
detect the stride-one dimension of the array, which will make auto-vectorization or
any other kind of compiler optimization hard.

It seems you are currently mostly concerned about the div slowness, but I think it
is of equal importance to understand what other optimizations are hindered by this
array-shape by default approach. Does your approach hinder vectorization?

Best,
Tobias

Reply | Threaded
Open this post in threaded view
|

Re: Julia 0.5 discussions: ReshapedArrays and indexing performance

Matt Bauman
In reply to this post by Andy Ferris
I think that this is a great use of packages.  I think it's very valuable to allow some syntax in base to be "free" for different disciplines to use however it best suits them.  Beyond the Tuple{} to {} syntax change (which is on the table already), I'm not sure what more you need in base to better support your package.

The (ab)use of Tuples is an interesting way to lift values into the type domain, but be aware that there was already one attempt to remove it: https://github.com/JuliaLang/julia/commit/85f45974a581ab9af955bac600b90d9ab00f093b

On Sunday, September 27, 2015 at 10:47:45 AM UTC-4, Andy Ferris wrote:
I'd like to add an new item to the wish list for "arraymageddon", and this seemed like the best place to post it. The basic idea is a simple (and perhaps built-in) way of doing multilinear algebra (tensor contractions, einsum, or whatever you wish to call it).

In <a href="http://github.com/andyferris/Tensors.jl" target="_blank" rel="nofollow" onmousedown="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fgithub.com%2Fandyferris%2FTensors.jl\46sa\75D\46sntz\0751\46usg\75AFQjCNHYCpOHfGPyxznJprwsicyVvRI7Vg&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fgithub.com%2Fandyferris%2FTensors.jl\46sa\75D\46sntz\0751\46usg\75AFQjCNHYCpOHfGPyxznJprwsicyVvRI7Vg&#39;;return true;">github.com/andyferris/Tensors.jl I've implemented a naming scheme for the indices of arrays that is convenient for multilinear algebra, tensor contractions, index permutations, etc. Though there are different ways of doing this (via macros, Jutho's TensorOperations.jl package, etc), the names of the indices are included in the type (as a type-Tuple{} of integers or symbols, as the user prefers) and assigned via getindex(::Type{Tuple{...}}) (and setindex!, etc). The idea was (a) remove run-time overhead associated with figuring out where indices match and (b) have persistent types that remember what they are. For (a), generated functions are used, so compile-time might be a bit slower. At the moment, the syntax is a bit annoying and thus helper macros are useful, but... there is hope:

If we use {...} as shorthand for Tuple{...} in Julia 0.5 we the nice tensor indexing via [{ ... }]. For a couple of examples

    A[{3,2,1}] = B[{1,2,3}]   # A = permutedims(B,[3,2,1])

or 

    A[{:left,:middle}] * B[{:middle,:right}]    # matrix multiplication, returning a tensor with indices {:left,:right}

which is relatively obvious, pretty terse, quite flexible and convenient for higher-order tensors, while letting the user give sensible, self-documenting names to the indices if they want.

Anyway, what do people think? Is this of broad interest? Does the way I've implemented this seem OK? 

On a general note, I strongly support using {} for Tuple{} in Julia 0.5 as it is a very convenient way of passing *compile-time* information to functions, which is something I can see myself doing more and more (and more) in Julia. Combined with generated functions, it's way more powerful than I've seen in any other programming language.

A couple of related questions ('cause I don't know where to post them).

(1) Is there much difference between Val{...} and Tuple types of a single index? If not, we could use both as {} (i.e. remove Val), and if so, use {X} for Val{X} and {X,} for Tuple{X} in analogy with tuple instances. Then people could pass {true} and {false} to functions instead of Val{true} or Val{false}.
(2) Is there going to be some fix to A += B assigning extra memory? To my use cases, this is *clearly* Julia's biggest current problem. Do we need a mutating + operator like +! (that way we can remember to only use it with mutating types)? Approaches in other languages (I'm thinking of C++) is allowing to overload the meaning of =, +=, etc and/or having a distinction between lvalues and rvalues (which help for delayed evaluation on a line-by-line level, which I feel is the least surprising version (to the programmer) of delayed evaluation).


Reply | Threaded
Open this post in threaded view
|

Re: Julia 0.5 discussions: ReshapedArrays and indexing performance

Andy Ferris
Thanks Matt - you're right that for now Tensors.jl should remain a package (unless people think that kind of notation is just better for general consumption than the current permutedims() in Base, or something).

In addition to {} being Tuple{}, I'd also like operations for tensor sum and tensor product in Base (https://github.com/JuliaLang/julia/issues/13333).

Regarding that attempted removal you mentioned - I'm not sure what will happen if we can't create Tuple's with Int's. How could we create Array{T,N} where N is an Integer? And, if we allow Int's, why not anything isbits() (plus Symbol)? I can also do Val{(...)} - perhaps that's more "correct". I'm finding the ability to pass compile-time information to generated function the most powerful feature I've seen in a programming language. I'm sure it makes compilation a bit slower, but wow, its so much easier than C++ template metaprogramming, and we can't be removing that now.


On Sunday, September 27, 2015 at 5:20:14 PM UTC+2, Matt Bauman wrote:
I think that this is a great use of packages.  I think it's very valuable to allow some syntax in base to be "free" for different disciplines to use however it best suits them.  Beyond the Tuple{} to {} syntax change (which is on the table already), I'm not sure what more you need in base to better support your package.

The (ab)use of Tuples is an interesting way to lift values into the type domain, but be aware that there was already one attempt to remove it: <a href="https://github.com/JuliaLang/julia/commit/85f45974a581ab9af955bac600b90d9ab00f093b" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2FJuliaLang%2Fjulia%2Fcommit%2F85f45974a581ab9af955bac600b90d9ab00f093b\46sa\75D\46sntz\0751\46usg\75AFQjCNHOQPO6TxJ3c6RaFColNJsqtX8W1A&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2FJuliaLang%2Fjulia%2Fcommit%2F85f45974a581ab9af955bac600b90d9ab00f093b\46sa\75D\46sntz\0751\46usg\75AFQjCNHOQPO6TxJ3c6RaFColNJsqtX8W1A&#39;;return true;">https://github.com/JuliaLang/julia/commit/85f45974a581ab9af955bac600b90d9ab00f093b

On Sunday, September 27, 2015 at 10:47:45 AM UTC-4, Andy Ferris wrote:
I'd like to add an new item to the wish list for "arraymageddon", and this seemed like the best place to post it. The basic idea is a simple (and perhaps built-in) way of doing multilinear algebra (tensor contractions, einsum, or whatever you wish to call it).

In <a href="http://github.com/andyferris/Tensors.jl" rel="nofollow" target="_blank" onmousedown="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fgithub.com%2Fandyferris%2FTensors.jl\46sa\75D\46sntz\0751\46usg\75AFQjCNHYCpOHfGPyxznJprwsicyVvRI7Vg&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fgithub.com%2Fandyferris%2FTensors.jl\46sa\75D\46sntz\0751\46usg\75AFQjCNHYCpOHfGPyxznJprwsicyVvRI7Vg&#39;;return true;">github.com/andyferris/Tensors.jl I've implemented a naming scheme for the indices of arrays that is convenient for multilinear algebra, tensor contractions, index permutations, etc. Though there are different ways of doing this (via macros, Jutho's TensorOperations.jl package, etc), the names of the indices are included in the type (as a type-Tuple{} of integers or symbols, as the user prefers) and assigned via getindex(::Type{Tuple{...}}) (and setindex!, etc). The idea was (a) remove run-time overhead associated with figuring out where indices match and (b) have persistent types that remember what they are. For (a), generated functions are used, so compile-time might be a bit slower. At the moment, the syntax is a bit annoying and thus helper macros are useful, but... there is hope:

If we use {...} as shorthand for Tuple{...} in Julia 0.5 we the nice tensor indexing via [{ ... }]. For a couple of examples

    A[{3,2,1}] = B[{1,2,3}]   # A = permutedims(B,[3,2,1])

or 

    A[{:left,:middle}] * B[{:middle,:right}]    # matrix multiplication, returning a tensor with indices {:left,:right}

which is relatively obvious, pretty terse, quite flexible and convenient for higher-order tensors, while letting the user give sensible, self-documenting names to the indices if they want.

Anyway, what do people think? Is this of broad interest? Does the way I've implemented this seem OK? 

On a general note, I strongly support using {} for Tuple{} in Julia 0.5 as it is a very convenient way of passing *compile-time* information to functions, which is something I can see myself doing more and more (and more) in Julia. Combined with generated functions, it's way more powerful than I've seen in any other programming language.

A couple of related questions ('cause I don't know where to post them).

(1) Is there much difference between Val{...} and Tuple types of a single index? If not, we could use both as {} (i.e. remove Val), and if so, use {X} for Val{X} and {X,} for Tuple{X} in analogy with tuple instances. Then people could pass {true} and {false} to functions instead of Val{true} or Val{false}.
(2) Is there going to be some fix to A += B assigning extra memory? To my use cases, this is *clearly* Julia's biggest current problem. Do we need a mutating + operator like +! (that way we can remember to only use it with mutating types)? Approaches in other languages (I'm thinking of C++) is allowing to overload the meaning of =, +=, etc and/or having a distinction between lvalues and rvalues (which help for delayed evaluation on a line-by-line level, which I feel is the least surprising version (to the programmer) of delayed evaluation).


Reply | Threaded
Open this post in threaded view
|

Re: Julia 0.5 discussions: ReshapedArrays and indexing performance

Matt Bauman
On Sunday, September 27, 2015 at 4:00:10 PM UTC-4, Andy Ferris wrote:
Regarding that attempted removal you mentioned - I'm not sure what will happen if we can't create Tuple's with Int's. How could we create Array{T,N} where N is an Integer? And, if we allow Int's, why not anything isbits() (plus Symbol)? I can also do Val{(...)} - perhaps that's more "correct".

General type parameters are separate from Tuple{} parameters.  Tuple{} types are intended to describe the type of tuple () values.  So from a first-principles approach, they shouldn't allow non-Types since that couldn't possibly describe a tuple value.  That said, there is no other data structure in the language that supports variable number of arbitrary type parameters, so they are uniquely useful.

The ability to store bitstypes in general type parameters is definitely not going anywhere.
Reply | Threaded
Open this post in threaded view
|

Re: Julia 0.5 discussions: ReshapedArrays and indexing performance

Andy Ferris
Of course - this makes perfect sense. So the sensible way of passing data to functions is a value tuple Val{(...)}. 

If that is so, then perhaps this is a better usage of the {} operator: {...} = Val{(...)}, or at least {T,} for Tuple{T} (like (x,) is a tuple instance) and {T} for Val{T}, thus {(...)} = Val{(...)}. (The point being to create a super-convenient way of passing compile-time arguments to functions, like f(...,Val{true}) but shorter, just using {true} without the Val)
Reply | Threaded
Open this post in threaded view
|

Re: Julia 0.5 discussions: ReshapedArrays and indexing performance

Sisyphuss
In reply to this post by Tim Holy
I don't understand why reshape is a problem here.
In R, an array is nothing else than a data vector and the dim attribute. Reshape is just to modify the dim attribute.


On Tuesday, September 15, 2015 at 4:00:50 AM UTC+2, Tim Holy wrote:
Since there seems to be some interest in discussing strategy for 0.5 (and not
endlessly hijacking the thread on the debugger), I'll start a new thread. I've
been pretty involved in reworking julia's AbstractArray infrastructure in the
past, and I intend to keep participating, but I'm also pretty excited about
the possibilities right now in graphics. Since there's only so much time in
the day, I'll throw out my biggest concern and see whether folks have any good
ideas and/or whether someone wants to pick up a ball and run with it.

In making the transition to defaulting to array views (SubArrays and their
future cousins), I think the insufficiently-recognized elephant in the room is
`reshape`. There seems to be quite a lot of consensus that `reshape` should
return a view rather than a copy. I've demonstrated in the past that `reshape`
does not compose in a closed fashion with `sub`, and come up with a plan for a
2nd view type currently called a ReshapedArray
(<a href="https://github.com/JuliaLang/julia/pull/10507" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2FJuliaLang%2Fjulia%2Fpull%2F10507\46sa\75D\46sntz\0751\46usg\75AFQjCNEYHpBCIlHXC2yEeP3_lvUwFjTWYQ&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2FJuliaLang%2Fjulia%2Fpull%2F10507\46sa\75D\46sntz\0751\46usg\75AFQjCNEYHpBCIlHXC2yEeP3_lvUwFjTWYQ&#39;;return true;">https://github.com/JuliaLang/julia/pull/10507).

The kicker is performance. Because indexing a reshaped array often involves a
call to `div`, and `div` is dirt-slow, an alternative is highly desirable.
There are clever ways around its slowness
(<a href="http://www.flounder.com/multiplicative_inverse.htm" target="_blank" rel="nofollow" onmousedown="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.flounder.com%2Fmultiplicative_inverse.htm\46sa\75D\46sntz\0751\46usg\75AFQjCNFjN8qRGz_--Vj9QN9hOMK5dlK3JA&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.flounder.com%2Fmultiplicative_inverse.htm\46sa\75D\46sntz\0751\46usg\75AFQjCNFjN8qRGz_--Vj9QN9hOMK5dlK3JA&#39;;return true;">http://www.flounder.com/multiplicative_inverse.htm) and we (Simon Kornblith,
with small extensions by me) implemented these algorithms. Unfortunately, in
preliminary tests (see #10507) they are still not nearly as fast as I'd like.
I am not sure how much of this is limitations in strategy, my implementation,
poor (improvable) codegen, or just "that's how it is." It would be interesting
to benchmark against implementations like <a href="http://libdivide.com/" target="_blank" rel="nofollow" onmousedown="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Flibdivide.com%2F\46sa\75D\46sntz\0751\46usg\75AFQjCNGwR7EyfIn5GWMhJsOdznT1jvg2xA&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Flibdivide.com%2F\46sa\75D\46sntz\0751\46usg\75AFQjCNGwR7EyfIn5GWMhJsOdznT1jvg2xA&#39;;return true;">http://libdivide.com/ and see how
we're doing.

In my opinion, this highly-focused issue is likely to result in the most
difficult decisions we have to make for Arraymagedon. If somebody wants to dive
into this and see how much can be satisfactorily resolved, it would be a big
contribution.

Best,
--Tim

Reply | Threaded
Open this post in threaded view
|

Re: Julia 0.5 discussions: ReshapedArrays and indexing performance

Sisyphuss
I understand why reshape can be a problem now: reshape invalidates the previous ArrayView.

I have a solution: let ArrayView include the dim information (an ArrayView can only be defined when there exists a dimension);
reshape creates an ArrayView with full indices;
an Array is an ArrayView now, the original Array becomes DataVector.


On Monday, September 28, 2015 at 8:23:25 AM UTC+2, Sisyphuss wrote:
I don't understand why reshape is a problem here.
In R, an array is nothing else than a data vector and the dim attribute. Reshape is just to modify the dim attribute.


On Tuesday, September 15, 2015 at 4:00:50 AM UTC+2, Tim Holy wrote:
Since there seems to be some interest in discussing strategy for 0.5 (and not
endlessly hijacking the thread on the debugger), I'll start a new thread. I've
been pretty involved in reworking julia's AbstractArray infrastructure in the
past, and I intend to keep participating, but I'm also pretty excited about
the possibilities right now in graphics. Since there's only so much time in
the day, I'll throw out my biggest concern and see whether folks have any good
ideas and/or whether someone wants to pick up a ball and run with it.

In making the transition to defaulting to array views (SubArrays and their
future cousins), I think the insufficiently-recognized elephant in the room is
`reshape`. There seems to be quite a lot of consensus that `reshape` should
return a view rather than a copy. I've demonstrated in the past that `reshape`
does not compose in a closed fashion with `sub`, and come up with a plan for a
2nd view type currently called a ReshapedArray
(<a href="https://github.com/JuliaLang/julia/pull/10507" rel="nofollow" target="_blank" onmousedown="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2FJuliaLang%2Fjulia%2Fpull%2F10507\46sa\75D\46sntz\0751\46usg\75AFQjCNEYHpBCIlHXC2yEeP3_lvUwFjTWYQ&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2FJuliaLang%2Fjulia%2Fpull%2F10507\46sa\75D\46sntz\0751\46usg\75AFQjCNEYHpBCIlHXC2yEeP3_lvUwFjTWYQ&#39;;return true;">https://github.com/JuliaLang/julia/pull/10507).

The kicker is performance. Because indexing a reshaped array often involves a
call to `div`, and `div` is dirt-slow, an alternative is highly desirable.
There are clever ways around its slowness
(<a href="http://www.flounder.com/multiplicative_inverse.htm" rel="nofollow" target="_blank" onmousedown="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.flounder.com%2Fmultiplicative_inverse.htm\46sa\75D\46sntz\0751\46usg\75AFQjCNFjN8qRGz_--Vj9QN9hOMK5dlK3JA&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.flounder.com%2Fmultiplicative_inverse.htm\46sa\75D\46sntz\0751\46usg\75AFQjCNFjN8qRGz_--Vj9QN9hOMK5dlK3JA&#39;;return true;">http://www.flounder.com/multiplicative_inverse.htm) and we (Simon Kornblith,
with small extensions by me) implemented these algorithms. Unfortunately, in
preliminary tests (see #10507) they are still not nearly as fast as I'd like.
I am not sure how much of this is limitations in strategy, my implementation,
poor (improvable) codegen, or just "that's how it is." It would be interesting
to benchmark against implementations like <a href="http://libdivide.com/" rel="nofollow" target="_blank" onmousedown="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Flibdivide.com%2F\46sa\75D\46sntz\0751\46usg\75AFQjCNGwR7EyfIn5GWMhJsOdznT1jvg2xA&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Flibdivide.com%2F\46sa\75D\46sntz\0751\46usg\75AFQjCNGwR7EyfIn5GWMhJsOdznT1jvg2xA&#39;;return true;">http://libdivide.com/ and see how
we're doing.

In my opinion, this highly-focused issue is likely to result in the most
difficult decisions we have to make for Arraymagedon. If somebody wants to dive
into this and see how much can be satisfactorily resolved, it would be a big
contribution.

Best,
--Tim

Reply | Threaded
Open this post in threaded view
|

Re: Julia 0.5 discussions: ReshapedArrays and indexing performance

Tim Holy
It's not quite that simple: see the analysis in
https://github.com/JuliaLang/julia/issues/9874#issuecomment-75979041,
especially the part about how sub and reshape compose with each other.

--Tim

On Monday, September 28, 2015 12:37:50 AM Sisyphuss wrote:

> I understand why reshape can be a problem now: reshape invalidates the
> previous ArrayView.
>
> I have a solution: let ArrayView include the dim information (an ArrayView
> can only be defined when there exists a dimension);
> reshape creates an ArrayView with full indices;
> an Array is an ArrayView now, the original Array becomes DataVector.
>
> On Monday, September 28, 2015 at 8:23:25 AM UTC+2, Sisyphuss wrote:
> > I don't understand why reshape is a problem here.
> > In R, an array is nothing else than a data vector and the dim attribute.
> > Reshape is just to modify the dim attribute.
> >
> > On Tuesday, September 15, 2015 at 4:00:50 AM UTC+2, Tim Holy wrote:
> >> Since there seems to be some interest in discussing strategy for 0.5 (and
> >> not
> >> endlessly hijacking the thread on the debugger), I'll start a new thread.
> >> I've
> >> been pretty involved in reworking julia's AbstractArray infrastructure in
> >> the
> >> past, and I intend to keep participating, but I'm also pretty excited
> >> about
> >> the possibilities right now in graphics. Since there's only so much time
> >> in
> >> the day, I'll throw out my biggest concern and see whether folks have any
> >> good
> >> ideas and/or whether someone wants to pick up a ball and run with it.
> >>
> >> In making the transition to defaulting to array views (SubArrays and
> >> their
> >> future cousins), I think the insufficiently-recognized elephant in the
> >> room is
> >> `reshape`. There seems to be quite a lot of consensus that `reshape`
> >> should
> >> return a view rather than a copy. I've demonstrated in the past that
> >> `reshape`
> >> does not compose in a closed fashion with `sub`, and come up with a plan
> >> for a
> >> 2nd view type currently called a ReshapedArray
> >> (https://github.com/JuliaLang/julia/pull/10507).
> >>
> >> The kicker is performance. Because indexing a reshaped array often
> >> involves a
> >> call to `div`, and `div` is dirt-slow, an alternative is highly
> >> desirable.
> >> There are clever ways around its slowness
> >> (http://www.flounder.com/multiplicative_inverse.htm) and we (Simon
> >> Kornblith,
> >> with small extensions by me) implemented these algorithms. Unfortunately,
> >> in
> >> preliminary tests (see #10507) they are still not nearly as fast as I'd
> >> like.
> >> I am not sure how much of this is limitations in strategy, my
> >> implementation,
> >> poor (improvable) codegen, or just "that's how it is." It would be
> >> interesting
> >> to benchmark against implementations like http://libdivide.com/ and see
> >> how
> >> we're doing.
> >>
> >> In my opinion, this highly-focused issue is likely to result in the most
> >> difficult decisions we have to make for Arraymagedon. If somebody wants
> >> to dive
> >> into this and see how much can be satisfactorily resolved, it would be a
> >> big
> >> contribution.
> >>
> >> Best,
> >> --Tim

Reply | Threaded
Open this post in threaded view
|

Re: Julia 0.5 discussions: ReshapedArrays and indexing performance

Sisyphuss
You mean there's cases where some result could not be represented as view of original data vector?


On Monday, September 28, 2015 at 11:43:12 AM UTC+2, Tim Holy wrote:
It's not quite that simple: see the analysis in
<a href="https://github.com/JuliaLang/julia/issues/9874#issuecomment-75979041" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2FJuliaLang%2Fjulia%2Fissues%2F9874%23issuecomment-75979041\46sa\75D\46sntz\0751\46usg\75AFQjCNEsamUnFgYrld_W0pPRr0iShR80uQ&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2FJuliaLang%2Fjulia%2Fissues%2F9874%23issuecomment-75979041\46sa\75D\46sntz\0751\46usg\75AFQjCNEsamUnFgYrld_W0pPRr0iShR80uQ&#39;;return true;">https://github.com/JuliaLang/julia/issues/9874#issuecomment-75979041,
especially the part about how sub and reshape compose with each other.

--Tim

On Monday, September 28, 2015 12:37:50 AM Sisyphuss wrote:

> I understand why reshape can be a problem now: reshape invalidates the
> previous ArrayView.
>
> I have a solution: let ArrayView include the dim information (an ArrayView
> can only be defined when there exists a dimension);
> reshape creates an ArrayView with full indices;
> an Array is an ArrayView now, the original Array becomes DataVector.
>
> On Monday, September 28, 2015 at 8:23:25 AM UTC+2, Sisyphuss wrote:
> > I don't understand why reshape is a problem here.
> > In R, an array is nothing else than a data vector and the dim attribute.
> > Reshape is just to modify the dim attribute.
> >
> > On Tuesday, September 15, 2015 at 4:00:50 AM UTC+2, Tim Holy wrote:
> >> Since there seems to be some interest in discussing strategy for 0.5 (and
> >> not
> >> endlessly hijacking the thread on the debugger), I'll start a new thread.
> >> I've
> >> been pretty involved in reworking julia's AbstractArray infrastructure in
> >> the
> >> past, and I intend to keep participating, but I'm also pretty excited
> >> about
> >> the possibilities right now in graphics. Since there's only so much time
> >> in
> >> the day, I'll throw out my biggest concern and see whether folks have any
> >> good
> >> ideas and/or whether someone wants to pick up a ball and run with it.
> >>
> >> In making the transition to defaulting to array views (SubArrays and
> >> their
> >> future cousins), I think the insufficiently-recognized elephant in the
> >> room is
> >> `reshape`. There seems to be quite a lot of consensus that `reshape`
> >> should
> >> return a view rather than a copy. I've demonstrated in the past that
> >> `reshape`
> >> does not compose in a closed fashion with `sub`, and come up with a plan
> >> for a
> >> 2nd view type currently called a ReshapedArray
> >> (<a href="https://github.com/JuliaLang/julia/pull/10507" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2FJuliaLang%2Fjulia%2Fpull%2F10507\46sa\75D\46sntz\0751\46usg\75AFQjCNEYHpBCIlHXC2yEeP3_lvUwFjTWYQ&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2FJuliaLang%2Fjulia%2Fpull%2F10507\46sa\75D\46sntz\0751\46usg\75AFQjCNEYHpBCIlHXC2yEeP3_lvUwFjTWYQ&#39;;return true;">https://github.com/JuliaLang/julia/pull/10507).
> >>
> >> The kicker is performance. Because indexing a reshaped array often
> >> involves a
> >> call to `div`, and `div` is dirt-slow, an alternative is highly
> >> desirable.
> >> There are clever ways around its slowness
> >> (<a href="http://www.flounder.com/multiplicative_inverse.htm" target="_blank" rel="nofollow" onmousedown="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.flounder.com%2Fmultiplicative_inverse.htm\46sa\75D\46sntz\0751\46usg\75AFQjCNFjN8qRGz_--Vj9QN9hOMK5dlK3JA&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.flounder.com%2Fmultiplicative_inverse.htm\46sa\75D\46sntz\0751\46usg\75AFQjCNFjN8qRGz_--Vj9QN9hOMK5dlK3JA&#39;;return true;">http://www.flounder.com/multiplicative_inverse.htm) and we (Simon
> >> Kornblith,
> >> with small extensions by me) implemented these algorithms. Unfortunately,
> >> in
> >> preliminary tests (see #10507) they are still not nearly as fast as I'd
> >> like.
> >> I am not sure how much of this is limitations in strategy, my
> >> implementation,
> >> poor (improvable) codegen, or just "that's how it is." It would be
> >> interesting
> >> to benchmark against implementations like <a href="http://libdivide.com/" target="_blank" rel="nofollow" onmousedown="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Flibdivide.com%2F\46sa\75D\46sntz\0751\46usg\75AFQjCNGwR7EyfIn5GWMhJsOdznT1jvg2xA&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Flibdivide.com%2F\46sa\75D\46sntz\0751\46usg\75AFQjCNGwR7EyfIn5GWMhJsOdznT1jvg2xA&#39;;return true;">http://libdivide.com/ and see
> >> how
> >> we're doing.
> >>
> >> In my opinion, this highly-focused issue is likely to result in the most
> >> difficult decisions we have to make for Arraymagedon. If somebody wants
> >> to dive
> >> into this and see how much can be satisfactorily resolved, it would be a
> >> big
> >> contribution.
> >>
> >> Best,
> >> --Tim

Reply | Threaded
Open this post in threaded view
|

Re: Julia 0.5 discussions: ReshapedArrays and indexing performance

Simon Kornblith
In reply to this post by Tobias Grosser
Hi Tobias,

On Sunday, September 27, 2015 at 2:53:15 AM UTC-4, Tobias Grosser wrote:
On 09/15/2015 04:00 AM, Tim Holy wrote:

> Since there seems to be some interest in discussing strategy for 0.5 (and not
> endlessly hijacking the thread on the debugger), I'll start a new thread. I've
> been pretty involved in reworking julia's AbstractArray infrastructure in the
> past, and I intend to keep participating, but I'm also pretty excited about
> the possibilities right now in graphics. Since there's only so much time in
> the day, I'll throw out my biggest concern and see whether folks have any good
> ideas and/or whether someone wants to pick up a ball and run with it.
>
> In making the transition to defaulting to array views (SubArrays and their
> future cousins), I think the insufficiently-recognized elephant in the room is
> `reshape`. There seems to be quite a lot of consensus that `reshape` should
> return a view rather than a copy. I've demonstrated in the past that `reshape`
> does not compose in a closed fashion with `sub`, and come up with a plan for a
> 2nd view type currently called a ReshapedArray
> (<a href="https://github.com/JuliaLang/julia/pull/10507" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2FJuliaLang%2Fjulia%2Fpull%2F10507\46sa\75D\46sntz\0751\46usg\75AFQjCNEYHpBCIlHXC2yEeP3_lvUwFjTWYQ&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2FJuliaLang%2Fjulia%2Fpull%2F10507\46sa\75D\46sntz\0751\46usg\75AFQjCNEYHpBCIlHXC2yEeP3_lvUwFjTWYQ&#39;;return true;">https://github.com/JuliaLang/julia/pull/10507).
>
> The kicker is performance. Because indexing a reshaped array often involves a
> call to `div`, and `div` is dirt-slow, an alternative is highly desirable.
> There are clever ways around its slowness
> (<a href="http://www.flounder.com/multiplicative_inverse.htm" target="_blank" rel="nofollow" onmousedown="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.flounder.com%2Fmultiplicative_inverse.htm\46sa\75D\46sntz\0751\46usg\75AFQjCNFjN8qRGz_--Vj9QN9hOMK5dlK3JA&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.flounder.com%2Fmultiplicative_inverse.htm\46sa\75D\46sntz\0751\46usg\75AFQjCNFjN8qRGz_--Vj9QN9hOMK5dlK3JA&#39;;return true;">http://www.flounder.com/multiplicative_inverse.htm) and we (Simon Kornblith,
> with small extensions by me) implemented these algorithms. Unfortunately, in
> preliminary tests (see #10507) they are still not nearly as fast as I'd like.
> I am not sure how much of this is limitations in strategy, my implementation,
> poor (improvable) codegen, or just "that's how it is." It would be interesting
> to benchmark against implementations like <a href="http://libdivide.com/" target="_blank" rel="nofollow" onmousedown="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Flibdivide.com%2F\46sa\75D\46sntz\0751\46usg\75AFQjCNGwR7EyfIn5GWMhJsOdznT1jvg2xA&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Flibdivide.com%2F\46sa\75D\46sntz\0751\46usg\75AFQjCNGwR7EyfIn5GWMhJsOdznT1jvg2xA&#39;;return true;">http://libdivide.com/ and see how
> we're doing.
>
> In my opinion, this highly-focused issue is likely to result in the most
> difficult decisions we have to make for Arraymagedon. If somebody wants to dive
> into this and see how much can be satisfactorily resolved, it would be a big
> contribution.

Dear all,

this might be an interesting moment to give some status update on Polly
and its capabilities of optimizing Julia (like) code:

1) We just gained a first-version of run-time bounds-check elimination in Polly

<a href="https://github.com/llvm-mirror/polly/commit/41ce0a26932208b1d74ef7c6783f45da37f16e0e" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2Fllvm-mirror%2Fpolly%2Fcommit%2F41ce0a26932208b1d74ef7c6783f45da37f16e0e\46sa\75D\46sntz\0751\46usg\75AFQjCNGX5-gjVfPnNfQXv1icGXlRn4pIJA&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\75https%3A%2F%2Fgithub.com%2Fllvm-mirror%2Fpolly%2Fcommit%2F41ce0a26932208b1d74ef7c6783f45da37f16e0e\46sa\75D\46sntz\0751\46usg\75AFQjCNGX5-gjVfPnNfQXv1icGXlRn4pIJA&#39;;return true;">https://github.com/llvm-mirror/polly/commit/41ce0a26932208b1d74ef7c6783f45da37f16e0e

This bound check elimination drops bounds checks as added by address sanitizer
with the option: -fsanitize=vla-bound

For Julia, we still require a small patch to LLVM's post-dominator tree. More
importantly, Julia formulates its bounds checks in a way that we can not drop them.
Specifically, Julia does not check each individual dimension, but only checks if an
access is inside the memory region allocated for an array. As the size of the
region is a polynomial expression in the array-size parameters, we can not
reason with our linear models about Julia bounds checks and can consequently
not drop them. I wonder if it would not make more sense to check each dimensionS
individually?

I suspect this is an artifact of linear indexing. In Julia, if you have a 2x2x2 array X, then the expressions X[7] and X[1,4] are both legal and equivalent to X[1,2,2]. However, when the number of indices provided is equal to the size of the array, I believe all of the checks could just be simple checks on the individual dimensions, but it looks like the check for the last dimension is more complicated that necessary.

2) Delinearization

<a href="http://www.grosser.es/#pub-optimistic-delinerization" target="_blank" rel="nofollow" onmousedown="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.grosser.es%2F%23pub-optimistic-delinerization\46sa\75D\46sntz\0751\46usg\75AFQjCNERrYyHgbpJyfHozbR8J_oeZBHaBQ&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\75http%3A%2F%2Fwww.grosser.es%2F%23pub-optimistic-delinerization\46sa\75D\46sntz\0751\46usg\75AFQjCNERrYyHgbpJyfHozbR8J_oeZBHaBQ&#39;;return true;">www.grosser.es/#pub-optimistic-delinerization

Polly is now capable of reason about array accesses as they are currently generated
by Julia (sometimes still requiring some loop-invariant-code-motion which we
currently automatize) and can give some nice speed-ups for common dense linear-algebra
kernels or stencils. The point to make here is that for Polly it is important to
know about the shape of the array at compile-time. Reshaped arrays will make it
harder for Polly to reason about them. They will break our array-recovery code (which
we might be able to fix), but to my understanding they will make it impossible to
detect the stride-one dimension of the array, which will make auto-vectorization or
any other kind of compiler optimization hard.

It seems you are currently mostly concerned about the div slowness, but I think it
is of equal importance to understand what other optimizations are hindered by this
array-shape by default approach. Does your approach hinder vectorization?

In cases where we can determine based on the types involved that an array can be reshaped to contiguous memory, I think we will will be able to express that in the type of the reshaped array, so LLVM/Polly would only get code where the stride has to be loaded from memory if this is not the case.

Simon
Reply | Threaded
Open this post in threaded view
|

Re: Julia 0.5 discussions: ReshapedArrays and indexing performance

Tim Holy
In reply to this post by Sisyphuss
Meaning that, if you want to present the result as a view of the original
vector, you need two different types of views (SubArray and ReshapedArray).
There's no good way to combine both into a single type. Since you can stuff one
inside the other, it should all work out, but there may be more layers than
we'd like.

--Tim

On Monday, September 28, 2015 06:50:32 AM Sisyphuss wrote:

> You mean there's cases where some result could not be represented as view
> of original data vector?
>
> On Monday, September 28, 2015 at 11:43:12 AM UTC+2, Tim Holy wrote:
> > It's not quite that simple: see the analysis in
> > https://github.com/JuliaLang/julia/issues/9874#issuecomment-75979041,
> > especially the part about how sub and reshape compose with each other.
> >
> > --Tim
> >
> > On Monday, September 28, 2015 12:37:50 AM Sisyphuss wrote:
> > > I understand why reshape can be a problem now: reshape invalidates the
> > > previous ArrayView.
> > >
> > > I have a solution: let ArrayView include the dim information (an
> >
> > ArrayView
> >
> > > can only be defined when there exists a dimension);
> > > reshape creates an ArrayView with full indices;
> > > an Array is an ArrayView now, the original Array becomes DataVector.
> > >
> > > On Monday, September 28, 2015 at 8:23:25 AM UTC+2, Sisyphuss wrote:
> > > > I don't understand why reshape is a problem here.
> > > > In R, an array is nothing else than a data vector and the dim
> >
> > attribute.
> >
> > > > Reshape is just to modify the dim attribute.
> > > >
> > > > On Tuesday, September 15, 2015 at 4:00:50 AM UTC+2, Tim Holy wrote:
> > > >> Since there seems to be some interest in discussing strategy for 0.5
> >
> > (and
> >
> > > >> not
> > > >> endlessly hijacking the thread on the debugger), I'll start a new
> >
> > thread.
> >
> > > >> I've
> > > >> been pretty involved in reworking julia's AbstractArray
> >
> > infrastructure in
> >
> > > >> the
> > > >> past, and I intend to keep participating, but I'm also pretty excited
> > > >> about
> > > >> the possibilities right now in graphics. Since there's only so much
> >
> > time
> >
> > > >> in
> > > >> the day, I'll throw out my biggest concern and see whether folks have
> >
> > any
> >
> > > >> good
> > > >> ideas and/or whether someone wants to pick up a ball and run with it.
> > > >>
> > > >> In making the transition to defaulting to array views (SubArrays and
> > > >> their
> > > >> future cousins), I think the insufficiently-recognized elephant in
> >
> > the
> >
> > > >> room is
> > > >> `reshape`. There seems to be quite a lot of consensus that `reshape`
> > > >> should
> > > >> return a view rather than a copy. I've demonstrated in the past that
> > > >> `reshape`
> > > >> does not compose in a closed fashion with `sub`, and come up with a
> >
> > plan
> >
> > > >> for a
> > > >> 2nd view type currently called a ReshapedArray
> > > >> (https://github.com/JuliaLang/julia/pull/10507).
> > > >>
> > > >> The kicker is performance. Because indexing a reshaped array often
> > > >> involves a
> > > >> call to `div`, and `div` is dirt-slow, an alternative is highly
> > > >> desirable.
> > > >> There are clever ways around its slowness
> > > >> (http://www.flounder.com/multiplicative_inverse.htm) and we (Simon
> > > >> Kornblith,
> > > >> with small extensions by me) implemented these algorithms.
> >
> > Unfortunately,
> >
> > > >> in
> > > >> preliminary tests (see #10507) they are still not nearly as fast as
> >
> > I'd
> >
> > > >> like.
> > > >> I am not sure how much of this is limitations in strategy, my
> > > >> implementation,
> > > >> poor (improvable) codegen, or just "that's how it is." It would be
> > > >> interesting
> > > >> to benchmark against implementations like http://libdivide.com/ and
> >
> > see
> >
> > > >> how
> > > >> we're doing.
> > > >>
> > > >> In my opinion, this highly-focused issue is likely to result in the
> >
> > most
> >
> > > >> difficult decisions we have to make for Arraymagedon. If somebody
> >
> > wants
> >
> > > >> to dive
> > > >> into this and see how much can be satisfactorily resolved, it would
> >
> > be a
> >
> > > >> big
> > > >> contribution.
> > > >>
> > > >> Best,
> > > >> --Tim

Reply | Threaded
Open this post in threaded view
|

Re: Julia 0.5 discussions: ReshapedArrays and indexing performance

Tobias Grosser
In reply to this post by Simon Kornblith
Hi Simon.

On 09/28/2015 07:07 PM, Simon Kornblith wrote:

> Hi Tobias,
>
> On Sunday, September 27, 2015 at 2:53:15 AM UTC-4, Tobias Grosser wrote:
>>
>> On 09/15/2015 04:00 AM, Tim Holy wrote:
>>> Since there seems to be some interest in discussing strategy for 0.5
>> (and not
>>> endlessly hijacking the thread on the debugger), I'll start a new
>> thread. I've
>>> been pretty involved in reworking julia's AbstractArray infrastructure
>> in the
>>> past, and I intend to keep participating, but I'm also pretty excited
>> about
>>> the possibilities right now in graphics. Since there's only so much time
>> in
>>> the day, I'll throw out my biggest concern and see whether folks have
>> any good
>>> ideas and/or whether someone wants to pick up a ball and run with it.
>>>
>>> In making the transition to defaulting to array views (SubArrays and
>> their
>>> future cousins), I think the insufficiently-recognized elephant in the
>> room is
>>> `reshape`. There seems to be quite a lot of consensus that `reshape`
>> should
>>> return a view rather than a copy. I've demonstrated in the past that
>> `reshape`
>>> does not compose in a closed fashion with `sub`, and come up with a plan
>> for a
>>> 2nd view type currently called a ReshapedArray
>>> (https://github.com/JuliaLang/julia/pull/10507).
>>>
>>> The kicker is performance. Because indexing a reshaped array often
>> involves a
>>> call to `div`, and `div` is dirt-slow, an alternative is highly
>> desirable.
>>> There are clever ways around its slowness
>>> (http://www.flounder.com/multiplicative_inverse.htm) and we (Simon
>> Kornblith,
>>> with small extensions by me) implemented these algorithms.
>> Unfortunately, in
>>> preliminary tests (see #10507) they are still not nearly as fast as I'd
>> like.
>>> I am not sure how much of this is limitations in strategy, my
>> implementation,
>>> poor (improvable) codegen, or just "that's how it is." It would be
>> interesting
>>> to benchmark against implementations like http://libdivide.com/ and see
>> how
>>> we're doing.
>>>
>>> In my opinion, this highly-focused issue is likely to result in the most
>>> difficult decisions we have to make for Arraymagedon. If somebody wants
>> to dive
>>> into this and see how much can be satisfactorily resolved, it would be a
>> big
>>> contribution.
>>
>> Dear all,
>>
>> this might be an interesting moment to give some status update on Polly
>> and its capabilities of optimizing Julia (like) code:
>>
>> 1) We just gained a first-version of run-time bounds-check elimination in
>> Polly
>>
>>
>> https://github.com/llvm-mirror/polly/commit/41ce0a26932208b1d74ef7c6783f45da37f16e0e
>>
>> This bound check elimination drops bounds checks as added by address
>> sanitizer
>> with the option: -fsanitize=vla-bound
>>
>> For Julia, we still require a small patch to LLVM's post-dominator tree.
>> More
>> importantly, Julia formulates its bounds checks in a way that we can not
>> drop them.
>> Specifically, Julia does not check each individual dimension, but only
>> checks if an
>> access is inside the memory region allocated for an array. As the size of
>> the
>> region is a polynomial expression in the array-size parameters, we can not
>> reason with our linear models about Julia bounds checks and can
>> consequently
>> not drop them. I wonder if it would not make more sense to check each
>> dimensionS
>> individually?
>>
>
> I suspect this is an artifact of linear indexing. In Julia, if you have a
> 2x2x2 array X, then the expressions X[7] and X[1,4] are both legal and
> equivalent to X[1,2,2]. However, when the number of indices provided is
> equal to the size of the array, I believe all of the checks could just be
> simple checks on the individual dimensions, but it looks like the check for
> the last dimension is more complicated that necessary.

Ah, thanks for explaining. Yes, per-dimension checks if possible would be
helpful here.
 

> 2) Delinearization
>>
>> www.grosser.es/#pub-optimistic-delinerization
>>
>> Polly is now capable of reason about array accesses as they are currently
>> generated
>> by Julia (sometimes still requiring some loop-invariant-code-motion which
>> we
>> currently automatize) and can give some nice speed-ups for common dense
>> linear-algebra
>> kernels or stencils. The point to make here is that for Polly it is
>> important to
>> know about the shape of the array at compile-time. Reshaped arrays will
>> make it
>> harder for Polly to reason about them. They will break our array-recovery
>> code (which
>> we might be able to fix), but to my understanding they will make it
>> impossible to
>> detect the stride-one dimension of the array, which will make
>> auto-vectorization or
>> any other kind of compiler optimization hard.
>>
>> It seems you are currently mostly concerned about the div slowness, but I
>> think it
>> is of equal importance to understand what other optimizations are hindered
>> by this
>> array-shape by default approach. Does your approach hinder vectorization?
>>
>
> In cases where we can determine based on the types involved that an array
> can be reshaped to contiguous memory, I think we will will be able to
> express that in the type of the reshaped array, so LLVM/Polly would only
> get code where the stride has to be loaded from memory if this is not the
> case.

This sounds as if we can get the best of both worlds. So you plan to specialize
code for special layouts/shapes whenever possible? This would indeed make it
a lot easier to analyze.

Best,
Tobias