I sent this message to the [hidden email] mailing list today inviting follow-ups to take place on this group. Some users have taken the plunge and used Julia because the model fits in R using lme4 were taking a long time, as in many hours. They have seen one to two orders of magnitude differences in speed which, when things are taking that long, is worth the pain of switching. If the fit in R is only taking a few seconds then it is not worthwhile learning a new language just to make that faster. I think that performing an lmer-like fit in Julia is now sufficiently straightforward that it will be worthwhile for others to try doing so. We have developed a Julia package called RCall which allows a Julia user to run an R process from within Julia. In particular, RCall makes it easy to create a copy of an R data.frame as a Julia DataFrame object and use that to fit a linear mixed model. The steps are reasonably straightforward. I will illustrate with data on ratings of about 3700 movies by about 6000 users. Of course, not every user rates every movie. There are about 1,000,000 ratings in the data set. These data are available as the MovieLens 1M Dataset at http://grouplens.org/datasets/movielens/ and as the ml1m data set in the Timings R package available at https://github.com/Stat990-033/Timings. That package is not on CRAN because the data sets are too large. You must install it in R with install.packages("devtools") devtools::Install_github("Stat990-033/Timings") Downloads of Julia itself are at http://julialang.org/downloads/ After installing Julia start it and add the packages Pkg.add("RCall") Pkg.add("MixedModels") The actual model fit is performed as julia> using DataFrames, MixedModels, RCall julia> ml1m = rcopy("Timings::ml1m"); julia> @time m1 = fit!(lmm(Y ~ 1 + (1|G) + (1|H), ml1m)) 24.956579 seconds (39.95 M allocations: 1.398 GB, 1.06% gc time) Linear mixed model fit by maximum likelihood logLik: -1331986.005811, deviance: 2663972.011622, AIC: 2663980.011622, BIC: 2664027.274500 Variance components: Variance Std.Dev. G 0.12985210 0.36034996 H 0.36879694 0.60728654 Residual 0.81390867 0.90216887 Number of obs: 1000209; levels of grouping factors: 6040, 3706 Fixed-effects parameters: Estimate Std.Error z value (Intercept) 3.33902 0.0114624 291.302 The "using" directive is similar to the "library" or "require" functions in R. The named Julia packages are, in R terminology, loaded and attached. The "Timings::ml1m" expression is an R expression. It accesses the ml1m object in the Timings package, loading the package first, if necessary. The call to the Julia function lmm is similar to lmer but only creates the model. The call to fit! is what causes the model to be fit. As you can see, this fit takes about 25 seconds. A similar fit using lmer takes about 40 minutes on the same machine. I would be happy to answer questions about the MixedModels package but I don't think this forum would be appropriate. It is a forum for questions about fitting mixed-effects models with R. For the time being I would suggest asking questions on the Google group called julia-stats https://groups.google.com/forum/#!forum/julia-stats to which I am sending a copy of this message. You received this message because you are subscribed to the Google Groups "julia-stats" group. To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email]. For more options, visit https://groups.google.com/d/optout. |
I quoted an incorrect R time. When I reran the model using the lme4 package for R it converged in 162 seconds elapsed on this machine. The gain in speed is a factor of 6 not "one or two orders of magnitude"
-- On Tuesday, February 23, 2016 at 11:45:41 AM UTC-6, Douglas Bates wrote:
You received this message because you are subscribed to the Google Groups "julia-stats" group. To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email]. For more options, visit https://groups.google.com/d/optout. |
This is great! Maybe worth just noting that the R command is devtools::install_github("Stat990-033/Timings") #'i' in install should not be in caps On Tue, Feb 23, 2016 at 7:00 PM, Douglas Bates <[hidden email]> wrote:
You received this message because you are subscribed to the Google Groups "julia-stats" group. To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email]. For more options, visit https://groups.google.com/d/optout. |
Free forum by Nabble | Edit this page |