Custom array backed dataframe

Custom array backed dataframe

Lee Bates
Hi,

I've created a wrapper around an mmap array that allows me to efficiently grow and shrink the array.
I would like to use this custom array as the data for the DataArrays backing a DataFrame. My issue is that the DataFrames and DataArrays packages use Array rather than AbstractArray for most of the methods and types they declare.

I don't believe there is any performance loss from using abstract types. Would it be possible to migrate to abstract types?

Also, if this is OK, would a find-and-replace of all instances of Array{T, N} with AbstractArray{T, N} be enough to update the packages?

Thanks,
Lee

--
You received this message because you are subscribed to the Google Groups "julia-stats" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

Re: Custom array backed dataframe

Milan Bouchet-Valat
On Tuesday, 25 October 2016 at 19:45 -0700, Lee Bates wrote:

> Hi,
>
> I've created a wrapper around an mmap array that allows me to
> efficiently grow and shrink the array.
> I would like to use this custom array as the data for the DataArrays
> backing a DataFrame. My issue is that the DataFrames and DataArrays
> packages use Array rather than AbstractArray for most of the methods
> and types they declare.
>
> I don't believe there is any performance loss from using abstract
> types. Would it be possible to migrate to abstract types?
>
> Also, if this is OK, would a find-and-replace of all instances of
> Array{T, N} with AbstractArray{T, N} be enough to update the packages?

DataFrames can already have columns of arbitrary array types, but
currently conversion to DataArrays is done automatically in most cases.
We're still discussing whether this behavior should be kept or not:
https://github.com/JuliaStats/DataFrames.jl/issues/1091
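
To make the Array versus AbstractArray point concrete, here is a minimal sketch in plain Julia (current 1.x syntax, no packages). Every name in it (GrowableVector, colsum_strict, colsum_generic, LooseColumn, TightColumn) is hypothetical and invented for illustration, and a plain Vector stands in for the mmapped buffer. It is not how DataFrames or DataArrays are implemented. It only shows that a method annotated with AbstractArray accepts a custom array type while one annotated with Array does not, and that an abstractly typed struct field is a different matter from an abstract method signature.

```julia
# Hypothetical growable vector; a plain Vector stands in for the mmapped buffer.
struct GrowableVector{T} <: AbstractVector{T}
    data::Vector{T}
end

# Minimal AbstractArray interface: size, indexing style, getindex, setindex!.
Base.size(v::GrowableVector) = size(v.data)
Base.IndexStyle(::Type{<:GrowableVector}) = IndexLinear()
Base.getindex(v::GrowableVector, i::Int) = v.data[i]
Base.setindex!(v::GrowableVector, x, i::Int) = (v.data[i] = x; v)
Base.push!(v::GrowableVector, x) = (push!(v.data, x); v)

# A signature restricted to Array rejects the wrapper ...
colsum_strict(col::Array{T,1}) where {T} = sum(col)
# ... while an AbstractArray signature accepts it and is still compiled
# separately for each concrete argument type.
colsum_generic(col::AbstractArray{T,1}) where {T} = sum(col)

# Struct fields are different: an abstract field type is not concrete,
# so accesses through it cannot be specialised at compile time.
struct LooseColumn
    data::AbstractVector{Float64}
end

# A type parameter keeps the field concrete for any array type.
struct TightColumn{A<:AbstractVector{Float64}}
    data::A
end

v = GrowableVector([1.0, 2.0, 3.0])
push!(v, 4.0)
total = colsum_generic(v)                 # works on the custom type
strict_ok = applicable(colsum_strict, v)  # false: no matching method
```

So a blanket find-and-replace would probably be fine for method signatures, but type definitions with Array fields would more likely need a type parameter, as in TightColumn, rather than an AbstractArray field.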

Anyway, DataArrays is going to be deprecated in favor of NullableArrays.
You can have a look at how Feather.jl handles the creation of
NullableArrays based on mmapped data:
https://github.com/JuliaStats/Feather.jl/blob/245dea9cffd264e25415ae29237ba44e245edaf7/src/Feather.jl#L154
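
As a rough illustration of the general idea (not a copy of the Feather.jl code linked above), the sketch below memory-maps a file of Float64 values with the Mmap standard library and pairs it with a separate Bool mask marking missing entries. It is written for current Julia 1.x and uses `missing` where a NullableArray would have returned a null value. MaskedVector and its field names are hypothetical, and the temporary file is only there to make the example self-contained.

```julia
using Mmap

# Write a few Float64 values to a temporary file so there is something to map.
path, io = mktemp()
write(io, [1.0, 2.0, 3.0, 4.0])
close(io)

# Map the file contents as a Vector{Float64} without copying them into memory.
vals = open(path, "r") do f
    Mmap.mmap(f, Vector{Float64}, 4)
end

# Hypothetical column type: mmapped values plus a mask of missing entries.
struct MaskedVector{T} <: AbstractVector{Union{T,Missing}}
    values::Vector{T}
    isnull::Vector{Bool}
end

Base.size(m::MaskedVector) = size(m.values)
Base.IndexStyle(::Type{<:MaskedVector}) = IndexLinear()
Base.getindex(m::MaskedVector, i::Int) = m.isnull[i] ? missing : m.values[i]

# Mark the second entry as missing; the mapped data itself is untouched.
col = MaskedVector(vals, [false, true, false, false])
present_sum = sum(skipmissing(col))
```

Because the mask lives apart from the values, the mapped data can stay read-only while missingness is tracked in ordinary memory.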


Regards
