Differences between == and indexing/keys?

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Differences between == and indexing/keys?

Joshua Job
Hello all,

I have defined a structure type ProblemInstance with 3 fields (h,J, and a dictionary called properties), and which to store a dictionary that actually uses these structures as the key. I've defined == for these objects as follows:

==(a::ProblemInstance,b::ProblemInstance)=(a.h==b.h)&&(a.J==b.J)&&(a.properties==b.properties)

This function works as expected. And in a single session, if I create my list of problem instances and then generate the dictionary from that, it works fine. However, if I save my dictionary and a list of keys with JLD.save and reload them in a new session:
instances=load("instances.jld","instances"); # the problem instances
my_dict
=load("my_dict.jld","thedict"); # the dictionary, keys are the instances
length
([key==instances[1] for key in keys(my_dict)]) # returns 1, as each problem instance occurs in my_dict only once
my_dict
[instances[1]] # gives error: key not found

This is really odd behavior to me. Surely if == claims two things are equal, then I should be able to index the dictionary by either one and receive the result, no different than
a=5
b
=5
a
==b # returns true
[a=>"eleven"][b] # returns "eleven"

Am I missing some epistemological issue here or is this a bug of some kind?
Reply | Threaded
Open this post in threaded view
|

Differences between == and indexing/keys?

Steven G. Johnson
Defining == is not enough. You have to define a hash(x) method, otherwise it will default to hashing based on the memory address (which changes between runs).
Reply | Threaded
Open this post in threaded view
|

Re: Differences between == and indexing/keys?

Joshua Job
So would something like
hash(x::ProblemInstance,h::Uint64=uint(0)) = 3*hash(x.h)+2*hash(x.J)+hash(x.properties)+h
be a reasonable thing to do? If the h,J,and properties fields are the same, this yields the same hash, but I'm not sure if we might run into problems with hashes potentially overlapping (though provided the base array and dictionary types have good hashes, that shouldn't occur except in highly rare circumstances).


On Monday, October 20, 2014 4:37:48 PM UTC-7, Steven G. Johnson wrote:
Defining == is not enough. You have to define a hash(x) method, otherwise it will default to hashing based on the memory address (which changes between runs).
Reply | Threaded
Open this post in threaded view
|

Re: Differences between == and indexing/keys?

Steven G. Johnson


On Tuesday, October 21, 2014 2:35:01 AM UTC-4, Joshua Job wrote:
So would something like
hash(x::ProblemInstance,h::Uint64=uint(0)) = 3*hash(x.h)+2*hash(x.J)+hash(x.properties)+h
be a reasonable thing to do?

No, it is enough to define:

import Base: hash
hash(x::ProblemInstance, h::Uint64) = hash(x.h, hash(x.J, hash(x.properties, h)))

That is, you should use the hash(x,y) function itself to combine multiple hashes, not addition or any other ad-hoc bitmixing strategy.
Reply | Threaded
Open this post in threaded view
|

Re: Differences between == and indexing/keys?

Stefan Karpinski
I wonder if we should just define

hash(x::Any, ys::Any...) = hash(x, hash(ys...))

so that you can write hash(x.h, x.J, x.properties) for this. The danger is that the two-argument form where the second argument happens to be a Uint already means something particular, but that may be ok (I'm not sure).

On Tue, Oct 21, 2014 at 11:20 AM, Steven G. Johnson <[hidden email]> wrote:


On Tuesday, October 21, 2014 2:35:01 AM UTC-4, Joshua Job wrote:
So would something like
hash(x::ProblemInstance,h::Uint64=uint(0)) = 3*hash(x.h)+2*hash(x.J)+hash(x.properties)+h
be a reasonable thing to do?

No, it is enough to define:

import Base: hash
hash(x::ProblemInstance, h::Uint64) = hash(x.h, hash(x.J, hash(x.properties, h)))

That is, you should use the hash(x,y) function itself to combine multiple hashes, not addition or any other ad-hoc bitmixing strategy.

Reply | Threaded
Open this post in threaded view
|

Re: Differences between == and indexing/keys?

Steven G. Johnson


On Tuesday, October 21, 2014 11:24:32 AM UTC-4, Stefan Karpinski wrote:
I wonder if we should just define

hash(x::Any, ys::Any...) = hash(x, hash(ys...))

so that you can write hash(x.h, x.J, x.properties) for this. The danger is that the two-argument form where the second argument happens to be a Uint already means something particular, but that may be ok (I'm not sure).


You could define hash(x,y,z...) = hash(x, hash(y, z...)) so that the 2-argument form is not changed. 
Reply | Threaded
Open this post in threaded view
|

Re: Differences between == and indexing/keys?

Stefan Karpinski-2
Seems pretty annoying for the one, three, four, etc arg for a to mean one thing and the two arg form to mean something else.


On Oct 21, 2014, at 1:36 PM, Steven G. Johnson <[hidden email]> wrote:



On Tuesday, October 21, 2014 11:24:32 AM UTC-4, Stefan Karpinski wrote:
I wonder if we should just define

hash(x::Any, ys::Any...) = hash(x, hash(ys...))

so that you can write hash(x.h, x.J, x.properties) for this. The danger is that the two-argument form where the second argument happens to be a Uint already means something particular, but that may be ok (I'm not sure).


You could define hash(x,y,z...) = hash(x, hash(y, z...)) so that the 2-argument form is not changed.