Statistics roadmap post

Simon Byrne
Hi All,

The roadmap for the Moore Foundation work has been posted here:
http://juliacomputing.com/blog/2016/01/14/stats-roadmap.html

Comments/thoughts welcome.

-Simon

--
You received this message because you are subscribed to the Google Groups "julia-stats" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

Re: Statistics roadmap post

Iain Dunning
Hey Simon,

Interesting reading! I think it's a good summary of work that could be done, but it feels like it's missing a final paragraph that outlines JC's timeline for how the grant money will be spent and the particular things JC will be tackling first. This would probably be good advertising for JC in itself, and would also guide possible contributors towards things that JC might not be able to get around to.

Cheers,
Iain




Re: Statistics roadmap post

jock.lawrie
In reply to this post by Simon Byrne
Simon, that's a great list. Nailing the basics and bringing them up to a modern, best-in-class standard is definitely in order and totally the right use of the grant. Done well, there's nothing that couldn't be built on top of this foundation. I second Iain's comment re deliverables and timelines - it'd be good not only for devs and collaborators, but also for businesses that are assessing Julia for commercial use.



Re: Statistics roadmap post

David Anthoff
In reply to this post by Iain Dunning

Yes, great write-up and plan!

And I concur with Iain, it would be great if the role JC will play in this could be made more explicit. Maybe even say who from JC is going to coordinate this and contribute? I for one don’t even know who is part of JC other than the original creators of Julia (I assume?), so it would be nice to associate names with the statistics effort that is sponsored by the Moore Foundation.

Cheers,
David


Re: Statistics roadmap post

Viral Shah
I think a timeframe is a good next step. The post was meant to get the discussion started, and we can now create a rough prioritization that we can all debate here. Some of these things are straightforward and some are more exploratory, but the goal is exactly as discussed in this thread: get going on a solid foundation and on longer-term projects so that other folks can build on them.

Thinking aloud, should the DataFrames work be the highest priority? The challenge with that is that it is more exploratory and could possibly have false starts. Or should we focus on making the stats packages higher quality?

There will be some dependencies on the compiler in terms of optimizations, which will be folded into the regular Julia development as part of this work.

On other JC-related questions, I really should just update our website.

-viral


Re: Statistics roadmap post

Lars Tonkard
The modeling stuff can be done temporarily through PyCall if more involved tests are needed, so my vote (not that it counts much) is for DataFrames and tidy data.

- Any reason not to just lift the API from dplyr?
- How will the expression system work? Can macros be repurposed for this?

Questions about the modeling:

- Do we want to improve on R's formula syntax and have a generic model-building front end?
- Aside from streaming tuples into iterative models, how is it intended to fit models on disparate backends? Can one hook into the C API and do in-database linear algebra? This seems technically infeasible otherwise.

Re: Statistics roadmap post

Simon Byrne
To answer several questions:

> ... it would be great if the role JC will play in this could be made more explicit. Maybe even say who from JC is going to coordinate this and contribute?

I'll be the one mostly dedicated to this on the Julia Computing side, though others will be involved.

> Any reason not to just lift the API from dplyr?

dplyr has lots of good ideas (which is why I mentioned it explicitly), but it relies heavily on R's nonstandard evaluation and weird scoping. Establishing a clear, straightforward syntax here is certainly one of the more challenging parts.

> How will the expression system work? Can macros be repurposed for this?

This is one of the main challenges. Macros are incredibly useful, but can often be a bit too magical, making code difficult to reason about.

> Do we want to improve on R's formula syntax and have a generic model building front end?

Yes. R's formula interface is certainly powerful, but does have its drawbacks. Once you step out of the linear model context, the idea of using + to specify the columns to use in the model is a bit odd. It would also be useful if the syntax for referring to columns was the same as the data manipulation framework.

> Aside from streaming tuples into iterative models, how is it intended to fit models on disparate backends? Can one hook into the C API and do in-database linear algebra? This seems technically infeasible otherwise.

My idea is that you should be able to pass any "data table" object (e.g. a DataFrame, or some DB query) to a model-fitting function and have it "just work". The exact method would depend on the model and backend, but there are a variety of approaches we could use so that you don't have to keep the whole dataset in memory (e.g. chunked QR, stochastic gradient descent, distributed linear algebra).
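
To make the out-of-memory idea concrete, here is a minimal sketch in Python (not Julia, and not any existing package's API). It swaps in a simpler technique than chunked QR: it accumulates the normal equations X'X and X'y one chunk at a time, so only a single chunk plus a small p-by-p matrix is ever held in memory. This is less numerically stable than a QR-based update, and the names `fit_least_squares_chunked`, `solve` and `row_chunks` are all hypothetical, made up for illustration.

```python
from typing import Iterable, Iterator, List, Sequence, Tuple

Row = Tuple[Sequence[float], float]  # (feature values, response)


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small dense system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def fit_least_squares_chunked(chunks: Iterable[List[Row]], n_features: int) -> List[float]:
    """Least-squares fit that sees the data one chunk at a time."""
    p = n_features
    xtx = [[0.0] * p for _ in range(p)]  # running X'X
    xty = [0.0] * p                      # running X'y
    for chunk in chunks:                 # a chunk could come from a file or a DB cursor
        for x, y in chunk:
            for i in range(p):
                xty[i] += x[i] * y
                for j in range(p):
                    xtx[i][j] += x[i] * x[j]
    return solve(xtx, xty)


def row_chunks(n_rows: int, chunk_size: int) -> Iterator[List[Row]]:
    """Hypothetical data source: rows of y = 1 + 2*x, delivered in chunks."""
    chunk: List[Row] = []
    for k in range(n_rows):
        x = float(k)
        chunk.append(((1.0, x), 1.0 + 2.0 * x))  # intercept column plus one predictor
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


beta = fit_least_squares_chunked(row_chunks(10, 3), n_features=2)
```

The fitting function only asks for something it can iterate over in chunks, which is the sense in which a DataFrame or a database query could both be passed in. A chunked-QR version would keep the same outer loop and replace the X'X accumulation with an update of the triangular factor.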

-Simon


Re: Statistics roadmap post

Lars Tonkard
> My idea is that you should be able to pass any "data table" object (e.g. a DataFrame, or some DB query) to a model-fitting function and have it "just work".

This sounds amazing.

> It would also be useful if the syntax for referring to columns was the same as the data manipulation framework.

You mean some unification of the two?

What would this look like? A DAG of columns and Distributions.jl?


Re: Statistics roadmap post

Johan Sigfrids
In reply to this post by Simon Byrne
Will migrating DataFrames to NullableArrays be part of this?


Re: Statistics roadmap post

Drew G
In reply to this post by Simon Byrne
This is a fantastic list and makes me extremely excited about Julia's future as a modern language for data analysis.

Simon, I'm glad to hear that you recognize the need for a clear, straightforward syntax for dataframes, and would like to reiterate your point.  I hope that much of the time and attention goes into making the task of working with data on a day-to-day basis a clear and simple pleasure, even if that data is small and sits in memory.  In that sense, I think of "modern" as being equally about the elegance of the front-end interface and syntax as about the breadth of back-end support.

I'm sure you have thought about this as well, but just wanted to share my perspective.  Looking forward to all of the great work ahead!


Re: Statistics roadmap post

Benjamin Deonovic
In reply to this post by Simon Byrne
> Complementing the above work, we intend to support a more flexible choice of algorithms, such as QR, Cholesky, stochastic gradient descent, MCMC techniques (for example via Lora.jl or Stan.jl), and variational methods for Bayesian models.

Just want to point out that Mamba.jl is a much more mature MCMC package in Julia. Lora.jl has just recently gone through a major revamp and is still in heavy development: it doesn't have any convergence diagnostics or plotting features, and the master branch only contains a few samplers (the devel branch has several more). Stan.jl requires the user to have Stan installed, so I don't think that would be an appropriate addition to GLM.jl. Stan.jl also uses Mamba for convergence diagnostics and plotting.

This isn't a slight against Lora.jl or Stan.jl. Theodore Papamarkou is doing a great job with Lora.jl, and Rob Goedman's port of Stan to Julia is fantastic. I just wish Mamba got a bit more traffic than it does. Of course, having several packages that do the same thing is not a bad thing; it can encourage innovation and development. It does seem a bit unfair that Lora.jl gets to be featured in JuliaStats.

Re: Statistics roadmap post

Alex Williams
This is perhaps a question/comment for the julia-opt mailing list, but I've previously thought that it would be nice if GLM.jl was linked into the optimization environment (mostly Optim.jl) rather than calling its own internal optimization routines.

It would be really awesome to develop packages for stochastic gradient descent and related techniques. This one looks like a good start: https://github.com/lindahua/SGDOptim.jl


Re: Statistics roadmap post

Rob J. Goedman
In reply to this post by Benjamin Deonovic
I fully support Benjamin’s email (with respect to Stan.jl vs. Mamba.jl).

Mamba.jl should be part of JuliaStats in my opinion.

Regards,
Rob



Re: Statistics roadmap post

Lars Tonkard
I think it's the lack of autodiff, which is an extension of a difference in philosophy of use.

Re: Statistics roadmap post

Simon Byrne
In reply to this post by Benjamin Deonovic
>> It would also be useful if the syntax for referring to columns was the same as the data manipulation framework. 
> You mean some unification of the two?

What I mean is that how you refer to a column of data should be the same across all packages. I'm not 100% sure what that should be yet, though.

> Will migrating DataFrames to NullableArrays be part of this?

Possibly. Though NullableArrays is currently the most performant option, it is somewhat unsatisfying in that values all end up being wrapped with a Nullable type, which is awkward to use. One of the parts of this project is to improve the performance of small Union types, for example by generating explicit branches at compile time rather than relying on runtime dispatch. If this could be made reasonably fast then it might be worth sticking with the current approach.

Another approach would be to allow DataFrames to accept different array types, so that ordinary well-typed vectors can be used if there is no missing data, or even other AbstractVectors such as Ranges. This would then allow use of either DataArrays or NullableArrays.
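
As a rough illustration of that second approach, here is a small Python analogy (not Julia, and not the DataFrames API). The `Table` class is hypothetical: it accepts any equal-length sequence as a column, so a lazy `range` or a plain list with no missing values can sit next to a column that uses `None` as a stand-in for missing data.

```python
from collections.abc import Sequence
from typing import Any, Dict


class Table:
    """Hypothetical column table: each column may be any Sequence of equal length."""

    def __init__(self, **columns: Sequence) -> None:
        lengths = {len(col) for col in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same length")
        self.columns: Dict[str, Sequence] = dict(columns)
        self.nrows = lengths.pop() if lengths else 0

    def __getitem__(self, name: str) -> Sequence:
        # Columns are returned as stored; nothing is copied or converted.
        return self.columns[name]

    def row(self, i: int) -> Dict[str, Any]:
        return {name: col[i] for name, col in self.columns.items()}

    def has_missing(self, name: str) -> bool:
        return any(value is None for value in self.columns[name])


t = Table(
    id=range(1, 4),             # lazy range, never materialized into a list
    height=[1.5, 1.7, 1.6],     # plain list, no missing values
    weight=[60.0, None, 72.5],  # None stands in for a missing value
)
```

Only the `weight` column pays any cost for supporting missing values here; the other two are stored exactly as they were passed in, which is the appeal of letting the container accept different array types.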

> In that sense, I think of "modern" as being equally about the elegance of the front-end interface and syntax as about the breadth of back-end support.

I certainly agree. This will probably take several iterations, but hopefully we can get something elegant.

> Just want to point out that Mamba.jl is a much more mature MCMC package in julia.

Sorry, I had forgotten about Mamba.jl (the list wasn't intended to be exhaustive), but I would certainly hope to include that as well. It is a very nice package.
