Re: iterator construct seems incorrect?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: iterator construct seems incorrect?

Iain Dunning
Two separate calls to next certainly aren't in general necessary, and I haven't hit any issues implementing the iterator interface before. In fact, it felt quite natural in the end.

I suspect that no matter which way you do the iteration interface, you'll be able to find an example where you'd prefer it to be the other way - it'd be nice to formalize that.

On Tuesday, July 22, 2014 7:34:18 PM UTC-4, Tim Holy wrote:
Thanks for clarifying. While I know exactly what you're talking about, and I
wasn't around when iterators were designed, I've always assumed that the
current behavior follows from two principles: (1) `done` must evaluate to a
boolean, and (2) the `item` should not leak out of the scope block defined by
the `while` loop. Last I thought through it carefully, the current design
seemed inevitable.

Best,
--Tim

On Tuesday, July 22, 2014 02:59:33 PM <a href="javascript:" target="_blank" gdf-obfuscated-mailto="mqtl11DxtEMJ" onmousedown="this.href='javascript:';return true;" onclick="this.href='javascript:';return true;">vav...@... wrote:

> Tim,
>
> The fact that 'next' is called before the loop body rather than after means
> that the 'done' predicate must provide an answer to the question: "if
> 'next' is called on the current state, then would the result be the end of
> the data structure?"  The obvious way to answer that question is to
> actually invoke 'next' to see what happens. For a balanced tree, there is
> no simpler way that I know of to answer the question other than calling
> 'next'.  It is possible to avoid the performance hit with additional code
> complexity by caching the result of 'next', but still, the question
> remains, why are start/done/next implemented in this way instead of the
> more obvious way (for example, in the C implementation of for(..; .. ; ..)
> the 'next' operation is invoked at the end of the body).
>
> -- Steve
>
> On Wednesday, July 23, 2014 12:29:12 AM UTC+3, [hidden email] wrote:
> > Dear Julia users,
> >
> > As I have mentioned in earlier posts, I am working on a 2-3 tree
> > implementation of a sort-order dict, that is, a dict in which the (key,
> > value) pairs can be retrieved in the sort-order of the keys.  According to
> >
> > section 2.1.7 of the manual, "for i = I; <body> ; end" translates to:
> >   state = start(I)
> >   while !done(I, state)
> >  
> >       (i,state) = next(I,state)
> >       <body>
> >  
> >   end
> >
> > The more obvious way to implement the loop would be to put the body BEFORE
> > the 'next' statement, and at first I thought that maybe the manual has a
> > typo.  But then I checked the file dict.jl, and I found indeed the code:
> >
> > done(t::ObjectIdDict, i) = is(next(t,i),())
> >
> > So this means that every loop iteration over an ObjectIdDict requires two
> > calls to 'next', one as part of the call to 'done', and a second one to
> > actually advance the loop.
> >
> > I also looked at the code for 'done' for Dict in dict.jl, and it appears
> > to be buggy(??).  It does not check whether everything after the current i
> > is deleted(??)
> >
> > For a 2-3 tree, the 'next' operation is logarithmic time, so requiring it
> > twice per loop iteration is undesirable.
> >
> > Can anyone shed light on why the Julia loop construction is implemented
> > this way, so that two separate calls to 'next' are necessary?  I suppose
> > that it is possible with additional code complexity to cache the result of
> > the first call to 'next' inside the state, but what exactly is the
> > rationale for the current design?
> >
> > Thanks,
> > Steve Vavasis

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: iterator construct seems incorrect?

Tim Holy
The other point to make is that if you find it more natural to do your
incrementing inside done, there's no rule saying that next has to do any
actual work. You can cache the item somewhere and just have next fetch it.

--Tim

On Tuesday, July 22, 2014 06:13:25 PM Iain Dunning wrote:

> Two separate calls to next certainly aren't in general necessary, and I
> haven't hit any issues implementing the iterator interface before. In fact,
> it felt quite natural in the end.
>
> I suspect that no matter which way you do the iteration interface, you'll
> be able to find an example where you'd prefer it to be the other way - it'd
> be nice to formalize that.
>
> On Tuesday, July 22, 2014 7:34:18 PM UTC-4, Tim Holy wrote:
> > Thanks for clarifying. While I know exactly what you're talking about, and
> > I
> > wasn't around when iterators were designed, I've always assumed that the
> > current behavior follows from two principles: (1) `done` must evaluate to
> > a
> > boolean, and (2) the `item` should not leak out of the scope block defined
> > by
> > the `while` loop. Last I thought through it carefully, the current design
> > seemed inevitable.
> >
> > Best,
> > --Tim
> >
> > On Tuesday, July 22, 2014 02:59:33 PM [hidden email] <javascript:>
> >
> > wrote:
> > > Tim,
> > >
> > > The fact that 'next' is called before the loop body rather than after
> >
> > means
> >
> > > that the 'done' predicate must provide an answer to the question: "if
> > > 'next' is called on the current state, then would the result be the end
> >
> > of
> >
> > > the data structure?"  The obvious way to answer that question is to
> > > actually invoke 'next' to see what happens. For a balanced tree, there
> >
> > is
> >
> > > no simpler way that I know of to answer the question other than calling
> > > 'next'.  It is possible to avoid the performance hit with additional
> >
> > code
> >
> > > complexity by caching the result of 'next', but still, the question
> > > remains, why are start/done/next implemented in this way instead of the
> > > more obvious way (for example, in the C implementation of for(..; .. ;
> >
> > ..)
> >
> > > the 'next' operation is invoked at the end of the body).
> > >
> > > -- Steve
> > >
> > > On Wednesday, July 23, 2014 12:29:12 AM UTC+3, [hidden email]
> >
> > wrote:
> > > > Dear Julia users,
> > > >
> > > > As I have mentioned in earlier posts, I am working on a 2-3 tree
> > > > implementation of a sort-order dict, that is, a dict in which the
> >
> > (key,
> >
> > > > value) pairs can be retrieved in the sort-order of the keys.
> >  
> >  According to
> >  
> > > > section 2.1.7 of the manual, "for i = I; <body> ; end" translates to:
> > > >   state = start(I)
> > > >   while !done(I, state)
> > > >  
> > > >       (i,state) = next(I,state)
> > > >       <body>
> > > >  
> > > >   end
> > > >
> > > > The more obvious way to implement the loop would be to put the body
> >
> > BEFORE
> >
> > > > the 'next' statement, and at first I thought that maybe the manual has
> >
> > a
> >
> > > > typo.  But then I checked the file dict.jl, and I found indeed the
> >
> > code:
> > > > done(t::ObjectIdDict, i) = is(next(t,i),())
> > > >
> > > > So this means that every loop iteration over an ObjectIdDict requires
> >
> > two
> >
> > > > calls to 'next', one as part of the call to 'done', and a second one
> >
> > to
> >
> > > > actually advance the loop.
> > > >
> > > > I also looked at the code for 'done' for Dict in dict.jl, and it
> >
> > appears
> >
> > > > to be buggy(??).  It does not check whether everything after the
> >
> > current i
> >
> > > > is deleted(??)
> > > >
> > > > For a 2-3 tree, the 'next' operation is logarithmic time, so requiring
> >
> > it
> >
> > > > twice per loop iteration is undesirable.
> > > >
> > > > Can anyone shed light on why the Julia loop construction is
> >
> > implemented
> >
> > > > this way, so that two separate calls to 'next' are necessary?  I
> >
> > suppose
> >
> > > > that it is possible with additional code complexity to cache the
> >
> > result of
> >
> > > > the first call to 'next' inside the state, but what exactly is the
> > > > rationale for the current design?
> > > >
> > > > Thanks,
> > > > Steve Vavasis

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: iterator construct seems incorrect?

Jameson Nash
it can however to better for done() to do no work, but to have next actually compute the next+1 state:

start(x) = next(x)
next(x,i) = (i, next(x))
done(x,i) = (i===sentinel)

this makes next somewhat more expensive to call repeatedly, but has the benefit that the user can hold onto a reference to any arbitrary node (until they mutate the underlying object) to perform custom iteration sequences

(where sentinel could either be a special value, or a boolean that is part of the state tuple)


On Tue, Jul 22, 2014 at 9:42 PM, Tim Holy <[hidden email]> wrote:
The other point to make is that if you find it more natural to do your
incrementing inside done, there's no rule saying that next has to do any
actual work. You can cache the item somewhere and just have next fetch it.

--Tim

On Tuesday, July 22, 2014 06:13:25 PM Iain Dunning wrote:
> Two separate calls to next certainly aren't in general necessary, and I
> haven't hit any issues implementing the iterator interface before. In fact,
> it felt quite natural in the end.
>
> I suspect that no matter which way you do the iteration interface, you'll
> be able to find an example where you'd prefer it to be the other way - it'd
> be nice to formalize that.
>
> On Tuesday, July 22, 2014 7:34:18 PM UTC-4, Tim Holy wrote:
> > Thanks for clarifying. While I know exactly what you're talking about, and
> > I
> > wasn't around when iterators were designed, I've always assumed that the
> > current behavior follows from two principles: (1) `done` must evaluate to
> > a
> > boolean, and (2) the `item` should not leak out of the scope block defined
> > by
> > the `while` loop. Last I thought through it carefully, the current design
> > seemed inevitable.
> >
> > Best,
> > --Tim
> >
> > On Tuesday, July 22, 2014 02:59:33 PM [hidden email] <javascript:>
> >
> > wrote:
> > > Tim,
> > >
> > > The fact that 'next' is called before the loop body rather than after
> >
> > means
> >
> > > that the 'done' predicate must provide an answer to the question: "if
> > > 'next' is called on the current state, then would the result be the end
> >
> > of
> >
> > > the data structure?"  The obvious way to answer that question is to
> > > actually invoke 'next' to see what happens. For a balanced tree, there
> >
> > is
> >
> > > no simpler way that I know of to answer the question other than calling
> > > 'next'.  It is possible to avoid the performance hit with additional
> >
> > code
> >
> > > complexity by caching the result of 'next', but still, the question
> > > remains, why are start/done/next implemented in this way instead of the
> > > more obvious way (for example, in the C implementation of for(..; .. ;
> >
> > ..)
> >
> > > the 'next' operation is invoked at the end of the body).
> > >
> > > -- Steve
> > >
> > > On Wednesday, July 23, 2014 12:29:12 AM UTC+3, [hidden email]
> >
> > wrote:
> > > > Dear Julia users,
> > > >
> > > > As I have mentioned in earlier posts, I am working on a 2-3 tree
> > > > implementation of a sort-order dict, that is, a dict in which the
> >
> > (key,
> >
> > > > value) pairs can be retrieved in the sort-order of the keys.
> >
> >  According to
> >
> > > > section 2.1.7 of the manual, "for i = I; <body> ; end" translates to:
> > > >   state = start(I)
> > > >   while !done(I, state)
> > > >
> > > >       (i,state) = next(I,state)
> > > >       <body>
> > > >
> > > >   end
> > > >
> > > > The more obvious way to implement the loop would be to put the body
> >
> > BEFORE
> >
> > > > the 'next' statement, and at first I thought that maybe the manual has
> >
> > a
> >
> > > > typo.  But then I checked the file dict.jl, and I found indeed the
> >
> > code:
> > > > done(t::ObjectIdDict, i) = is(next(t,i),())
> > > >
> > > > So this means that every loop iteration over an ObjectIdDict requires
> >
> > two
> >
> > > > calls to 'next', one as part of the call to 'done', and a second one
> >
> > to
> >
> > > > actually advance the loop.
> > > >
> > > > I also looked at the code for 'done' for Dict in dict.jl, and it
> >
> > appears
> >
> > > > to be buggy(??).  It does not check whether everything after the
> >
> > current i
> >
> > > > is deleted(??)
> > > >
> > > > For a 2-3 tree, the 'next' operation is logarithmic time, so requiring
> >
> > it
> >
> > > > twice per loop iteration is undesirable.
> > > >
> > > > Can anyone shed light on why the Julia loop construction is
> >
> > implemented
> >
> > > > this way, so that two separate calls to 'next' are necessary?  I
> >
> > suppose
> >
> > > > that it is possible with additional code complexity to cache the
> >
> > result of
> >
> > > > the first call to 'next' inside the state, but what exactly is the
> > > > rationale for the current design?
> > > >
> > > > Thanks,
> > > > Steve Vavasis


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: iterator construct seems incorrect?

Mauro
In reply to this post by Iain Dunning
Hi Steve

> The more obvious way to implement the loop would be to put the body BEFORE
> the 'next' statement, and at first I thought that maybe the manual has a
> typo.  But then I checked the file dict.jl, and I found indeed the code:
>
> done(t::ObjectIdDict, i) = is(next(t,i),())
>
> So this means that every loop iteration over an ObjectIdDict requires two
> calls to 'next', one as part of the call to 'done', and a second one to
> actually advance the loop.

Presumably for ObjectIdDict calling next is cheap, so there is no reason
to avoid one extra call.  In particular as ObjectIdDict is coded in C,
this avoids having to write more C code.

> I also looked at the code for 'done' for Dict in dict.jl, and it appears to
> be buggy(??).  It does not check whether everything after the current i is
> deleted(??)

For `Dict` `done` does not call `next` again.  Maybe you can model your
iterator after that?  Also, I think there is no bug in it, as `skip_deleted`
will return the index of the last `d.vals` if there are no more filled
slots, which in turn makes `done` true.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: iterator construct seems incorrect?

vavasis
In reply to this post by Iain Dunning
Dear Julia users,

Thank you for all your thoughtful replies on this matter.  Indeed, you are all correct, and I misunderstood the construct.  (The point that confused me was that in the call (i,state)=next(I,state), I mistakenly assumed that the returned "state" should correspond with returned data item "i", when in fact the proper approach is for the data item "i" to correspond to the old "state" and to lag one position behind the returned "state".)  I no longer think there is a problem with the start/done/next paradigm in Julia.

--- Steve Vavasis



On Wednesday, July 23, 2014 12:29:12 AM UTC+3, [hidden email] wrote:
Dear Julia users,

As I have mentioned in earlier posts, I am working on a 2-3 tree implementation of a sort-order dict, that is, a dict in which the (key, value) pairs can be retrieved in the sort-order of the keys.  According to section 2.1.7 of the manual, "for i = I; <body> ; end" translates to:

  state = start(I)
  while !done(I, state)
      (i,state) = next(I,state)
      <body> 
  end

The more obvious way to implement the loop would be to put the body BEFORE the 'next' statement, and at first I thought that maybe the manual has a typo.  But then I checked the file dict.jl, and I found indeed the code:

done(t::ObjectIdDict, i) = is(next(t,i),())

So this means that every loop iteration over an ObjectIdDict requires two calls to 'next', one as part of the call to 'done', and a second one to actually advance the loop.

I also looked at the code for 'done' for Dict in dict.jl, and it appears to be buggy(??).  It does not check whether everything after the current i is deleted(??)

For a 2-3 tree, the 'next' operation is logarithmic time, so requiring it twice per loop iteration is undesirable.

Can anyone shed light on why the Julia loop construction is implemented this way, so that two separate calls to 'next' are necessary?  I suppose that it is possible with additional code complexity to cache the result of the first call to 'next' inside the state, but what exactly is the rationale for the current design?

Thanks,
Steve Vavasis

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: iterator construct seems incorrect?

Stefan Karpinski-2
Glad it makes sense. Maybe we could explain that better.

On Jul 23, 2014, at 3:20 PM, [hidden email] wrote:

Dear Julia users,

Thank you for all your thoughtful replies on this matter.  Indeed, you are all correct, and I misunderstood the construct.  (The point that confused me was that in the call (i,state)=next(I,state), I mistakenly assumed that the returned "state" should correspond with returned data item "i", when in fact the proper approach is for the data item "i" to correspond to the old "state" and to lag one position behind the returned "state".)  I no longer think there is a problem with the start/done/next paradigm in Julia.

--- Steve Vavasis



On Wednesday, July 23, 2014 12:29:12 AM UTC+3, vav...@uwaterloo.ca wrote:
Dear Julia users,

As I have mentioned in earlier posts, I am working on a 2-3 tree implementation of a sort-order dict, that is, a dict in which the (key, value) pairs can be retrieved in the sort-order of the keys.  According to section 2.1.7 of the manual, "for i = I; <body> ; end" translates to:

  state = start(I)
  while !done(I, state)
      (i,state) = next(I,state)
      <body> 
  end

The more obvious way to implement the loop would be to put the body BEFORE the 'next' statement, and at first I thought that maybe the manual has a typo.  But then I checked the file dict.jl, and I found indeed the code:

done(t::ObjectIdDict, i) = is(next(t,i),())

So this means that every loop iteration over an ObjectIdDict requires two calls to 'next', one as part of the call to 'done', and a second one to actually advance the loop.

I also looked at the code for 'done' for Dict in dict.jl, and it appears to be buggy(??).  It does not check whether everything after the current i is deleted(??)

For a 2-3 tree, the 'next' operation is logarithmic time, so requiring it twice per loop iteration is undesirable.

Can anyone shed light on why the Julia loop construction is implemented this way, so that two separate calls to 'next' are necessary?  I suppose that it is possible with additional code complexity to cache the result of the first call to 'next' inside the state, but what exactly is the rationale for the current design?

Thanks,
Steve Vavasis

Loading...