Hi,
I'm trying to do the simplest linear model, OLS regression of X1=[2,4,6] on X2=[1,2,3] and an intercept (the intercept estimate should be 0, the slope estimate should be 2), and obtain t-statistics and R-squared. Here are four commands that ideally would work, yet only one of them is successful:

    # Setup
    using GLM
    using DataFrames
    x1 = [2,4,6]
    x2 = [1,2,3]

    # Version 1: Succeeds
    julia> A = DataFrame(hcat(x1,x2))
    julia> lm(x1 ~ x2, A)
    Coefficients:
                    Estimate    Std.Error     t value  Pr(>|t|)
    (Intercept)  2.56395e-15  2.44585e-15     1.04828    0.4850
    x2                   2.0  1.13221e-15  1.76646e15    <1e-15

    # Version 2: Fails
    julia> A = DataFrame(hcat(x1,x2))
    ERROR: `lm` has no method matching lm(::Formula)

    # Version 3: Fails
    julia> A = DataFrame(hcat(x1,x2))
    julia> lm(A[:x2], A[:x1])
    ERROR: `fit` has no method matching fit(::Type{LinearModel{T<:LinPred}}, ::DataArray{Int64,1}, ::DataArray{Int64,1})
     in lm at /.../.julia/v0.3/GLM/src/lm.jl:43

    # Version 4: Fails
    julia> lm(x2,x1)
    ERROR: `fit` has no method matching fit(::Type{LinearModel{T<:LinPred}}, ::Array{Int64,1}, ::Array{Int64,1})
     in lm at /.../.julia/v0.3/GLM/src/lm.jl:43

Is it supposed to be this way? Shouldn't there at least be support for lm(X,y) where X and y are Floats? linreg(X,y) works but doesn't have the desired hypothesis tests.

lm says it has method lm(X,y), yet Version 4 above didn't work:

    julia> methods(lm)
    # 3 methods for generic function "lm":
    lm(e::Expr,df,args...) at /.../.julia/v0.3/GLM/src/deprecated.jl:6
    lm(s::String,df,args...) at /.../.julia/v0.3/GLM/src/deprecated.jl:12
    lm(X,y) at /.../.julia/v0.3/GLM/src/lm.jl:43

Thanks,
Bradley

You received this message because you are subscribed to the Google Groups "julia-stats" group. To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email]. For more options, visit https://groups.google.com/d/optout.
Not having written the LM code, here’s my personal take on this:
Version 1 should work and does. It’s the official interface, which means it's the one that people should generally be using.

Version 2 definitely shouldn’t work. Our formula interface isn’t like R’s at all: the inputs must always be the names of columns of a DataFrame, and that DataFrame must be passed explicitly as an argument to lm. If this interface worked, you’d end up conflating values with the names of values.

Version 3 probably shouldn’t work since it adds no expressive power over Version 1. But this is definitely debatable.

Version 4 probably shouldn’t work since the X input should always be a matrix, not a vector. We could do a reshape when vectors are supplied, but I suspect that the requirement that the design matrix actually be a matrix is a good way to catch typos. Not totally sure about this, though.

-- John
On Aug 24, 2014, at 10:21 AM, Bradley Setzler <[hidden email]> wrote:
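Editorial sketch of the point about Version 4: the design matrix has to be an actual matrix that includes the intercept column. The snippet below sidesteps GLM entirely and solves the least-squares problem with the backslash operator, so it is not how lm is implemented and it gives no t-statistics; it only checks the expected estimates (intercept 0, slope 2) and R-squared. It assumes a current Julia 1.x release rather than the v0.3 used in the thread, and the names X, y, beta, resid and r2 are made up for illustration.

```julia
using LinearAlgebra  # assumption: Julia 1.x standard library

x1 = [2, 4, 6]   # response
x2 = [1, 2, 3]   # regressor

# Design matrix: a column of ones for the intercept, then the regressor.
# hcat promotes the integer column to Float64 because ones() is Float64.
X = hcat(ones(length(x2)), x2)   # 3x2 matrix of Float64
y = float(x1)                    # integer vector converted to Float64

beta = X \ y                     # least-squares solution [intercept, slope]
resid = y - X * beta             # residuals of the fit

# R-squared = 1 - SSR/SST, with SST taken around the mean of y.
r2 = 1 - sum(abs2, resid) / sum(abs2, y .- sum(y) / length(y))
```

Because the data lie exactly on a line, the residuals are zero up to rounding error, which is also why the standard errors in the Version 1 output are on the order of 1e-15.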
Thanks for your response, John.

One of the most common frustrations that new Julia users have expressed to me, and that I have been frustrated by repeatedly, is that many Julia commands treat column and row vectors as distinct objects, even when the context makes it perfectly clear that they are meant to be the same object.

Take for example MvNormal(mu,Sigma), which requires that mu be a column rather than row vector, even though from the context, there is no reason to believe that the user has misunderstood the multivariate normal distribution when he uses a row vector.

The accumulating reshape() or vec() commands in the code are messy and unintuitive to new users, not to mention that the error messages which result from providing the wrong vector shape often do not include the obvious suggestion of using a vec() command, so new users will have to figure it out on their own. (Note: MvNormal is an exception, as the error message explains the issue and suggests vec() as a solution.)

In summary, I think the user should be trusted to know what he's doing when he provides a row vector where technically a column vector is required; the conversion can be done automatically instead of throwing an error.

Hope this is helpful,
Bradley

PS - Incidentally, neither lm([1. 2 3],[2 4 6]) nor lm([1.,2,3],[2,4,6]) works, so it seems reshaping is not the solution to the issue in Version 4 above.

PPS - Another issue with MvNormal is that it cannot accept mu or Sigma as integers, causing unnecessary frustration when it is valid conceptually to characterize a multivariate normal distribution with integers, and the error message does not suggest a solution.

On Sunday, August 24, 2014 5:37:59 PM UTC-5, John Myles White wrote:
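Editorial sketch of the automatic conversion proposed above: a small helper that accepts a column vector, a 1xn row "vector" (which Julia stores as a matrix), or an nx1 matrix, and returns a floating-point column vector. The name ascolumn is hypothetical and does not exist in GLM, Distributions, or Base; only vec and float are real Base functions. It assumes a current Julia 1.x release.

```julia
# Hypothetical helper, not part of any package discussed in this thread.

# A plain vector only needs its elements converted to floating point.
ascolumn(v::AbstractVector) = float(v)

# A matrix is accepted only when it is shaped like a vector (1xn or nx1).
function ascolumn(m::AbstractMatrix)
    if !(1 in size(m))
        throw(ArgumentError("expected a vector-shaped input, got size $(size(m))"))
    end
    return float(vec(m))   # vec flattens to one dimension, float converts integers
end

row = [1 2 3]              # 1x3 integer matrix, i.e. a "row vector"
col = ascolumn(row)        # 3-element Float64 vector
```

The size check keeps the typo-catching benefit mentioned earlier in the thread: a genuinely two-dimensional matrix still raises an error instead of being flattened silently.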
Following up on the PPS: MvNormal failing with integers is Issue #279.

Bradley

On Sunday, August 24, 2014 7:34:02 PM UTC-5, Bradley Setzler wrote:
Following up on the original topic, lm(X,y) works if and only if X is a float array. y can be an integer array, but it cannot be a row vector.

Issue #84.

Bradley

On Tuesday, August 26, 2014 9:39:52 AM UTC-5, Bradley Setzler wrote:
Hi Brad,
You’re raising a lot of good issues with the desire to support automatic conversion of integer arrays to float arrays. It would be great if you’d write a few pull requests to move things forward. A quick round of feature requests adds to a lot of people’s already full workloads, so they may sit without action for some time.

-- John

On Aug 26, 2014, at 8:10 AM, Bradley Setzler <[hidden email]> wrote:
Hi John,
Thanks, I would prefer to do it that way, but my lack of programming experience has kept me from contributing.

I previously made an attempt to prepare pull requests but wasn't able to test the modified packages on my local machine. In other words, I cloned Distributions.jl, modified MvNormal to accept row vectors and integers for mu, but then couldn't convince my installation of Julia to use the new version of Distributions.jl instead of the version I installed previously so that I could see how it was working.

I want to be a contributor to the Julia community, but admittedly I need the for-dummies version to start. Ideally I would want to see a step-by-step guide to modifying and testing packages locally before submitting the pull request. Do any resources like that exist?

These are probably very easy things, but keep in mind that I literally taught myself computing this year and still have a lot to learn.

Best,
Bradley

On Tuesday, August 26, 2014 10:12:31 AM UTC-5, John Myles White wrote:
There are probably other approaches to testing locally, but the easiest one is to completely replace the copy in .julia/v0.3/PKGNAME with your custom copy. Then your local version will be the version that Julia loads.
-- John

On Aug 26, 2014, at 8:34 AM, Bradley Setzler <[hidden email]> wrote:
On Tuesday, August 26, 2014 10:34:12 AM UTC-5, Bradley Setzler wrote:
[...]
I may need to write (...or convince someone else to write...) a one- or two-page cheat sheet like that in the next few weeks. I'll ping you when it's done.
If someone writes up a guide, I'm happy to edit it.
One point: packages in Julia are just a pile of code found in a specific directory. They're just structured according to conventions that allow them to be loaded automatically.

Specifically, if I type using Foo, then Julia looks in ~/.julia/v0.3/Foo/src for a file called Foo.jl and attempts to include that file as well as import all of the exports from the module Foo. Julia assumes a Foo module is defined inside of the Foo.jl file, which is why the exports are imported.

That description may be subtle (and UNIX-centric), but it is almost the entirety of what you need to know to make changes to packages locally. Submitting those changes for review requires some understanding of git and GitHub, but the generic guides for working with GitHub are better than anything we'd produce.

-- John

On Aug 26, 2014, at 9:09 AM, Gray Calhoun <[hidden email]> wrote:
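Editorial sketch of the convention John describes: the file Foo/src/Foo.jl is expected to define a module named Foo, and the names in its export list are what become available after loading it. Foo comes from the message above; the function greet is made up for illustration. To keep the snippet runnable on its own, the module is defined inline and loaded with the relative form using .Foo, which assumes a current Julia 1.x release; for a package sitting in the package directory you would write using Foo instead.

```julia
# What a hypothetical Foo/src/Foo.jl might contain.
module Foo

export greet   # exported names are brought into scope by `using`

# Illustrative function; any edit made here is what Julia runs after reloading.
greet(name) = "Hello, $(name)!"

end # module Foo

# Load the module defined above. The leading dot means "a module defined in
# the current scope" rather than an installed package.
using .Foo

message = greet("Bradley")
```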
The other thing that's important to know is that ~/.julia/v0.3/Foo/ is actually itself the git repository for Foo.jl, so you don't need to clone Foo manually to create a pull request. You just need to fork Foo on github, and then:
    cd ~/.julia/v0.3/Foo
    git remote add myfork [hidden email]:<your user name>/Foo.jl.git

(Or whatever the clone URL is for your fork of Foo.jl.)

After that, the steps for preparing a pull request are the same as for any other github project (e.g. create a branch, push that branch to your fork, create a pull request on github.com).

Dave

On Wednesday, August 27, 2014 6:18:44 PM UTC-4, John Myles White wrote:
Thank you Dave, I think this will solve my problems.
-- Bradley

On Saturday, August 30, 2014 5:47:00 PM UTC-5, Dave Kleinschmidt wrote:
I used this little handy gist from dmbates to create my first pull request a few weeks ago: https://gist.github.com/dmbates/2712118
Good luck,
K

On Sunday, 31 August 2014 15:53:22 UTC+2, Bradley Setzler wrote: