ANN: OnlineStats.jl

ANN: OnlineStats.jl

Josh Day
Introducing OnlineStats.jl (https://github.com/joshday/OnlineStats.jl): Online statistics for Julia

This package contains a catalogue of online algorithms for statistics.  Each model uses O(1) memory and is designed for processing streaming data or data too large to hold in memory.  The functionality from StreamStats.jl (https://github.com/johnmyleswhite/StreamStats.jl) has been absorbed and many other models are included.  Short descriptions of the models are here: http://onlinestatsjl.readthedocs.org/en/latest/Models/
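
As a rough illustration of what "online" with O(1) memory means, here is a generic sketch of a running mean and variance (Welford's update). It is written in Python purely for illustration and is not OnlineStats.jl's API; the class name RunningMeanVar and its fields are made up for this example.

```python
class RunningMeanVar:
    """Running mean and variance; the whole state is three numbers."""

    def __init__(self):
        self.n = 0       # observations seen so far
        self.mean = 0.0  # current mean
        self.m2 = 0.0    # sum of squared deviations from the current mean

    def update(self, x):
        # Fold one observation into the state, then it can be discarded.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Unbiased sample variance; not defined for fewer than two observations.
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")


stat = RunningMeanVar()
for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stat.update(x)
print(stat.mean, stat.variance)
```

The state never grows with the number of observations, so the same loop works whether the data come from a list, a file read line by line, or a network stream.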

Some Highlights:
  • Summary Statistics
    • Mean
    • Variance
    • Covariance Matrix
    • Quantiles
  • Simple interface for stochastic gradient descent (SGD) and Adagrad algorithms with optional regularization
    • L1 Regression
    • L2 Regression
    • Logistic Regression
    • Quantile Regression
    • Support Vector Machine
    • Huber Loss Regression  
  • Parametric fitting of distributions
  • Gaussian Mixtures
  • Exact estimates for least squares, ridge regression, and soon LASSO/Elastic Net
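
To make the SGD/Adagrad item above more concrete, here is a generic sketch of an Adagrad update for squared-error (L2) regression, processing one observation at a time and without regularization. Again this is Python for illustration only, not the package's interface; AdagradLinReg, eta and eps are hypothetical names, and the simulated data (intercept 1, slope 2) are an assumption of the example.

```python
import math
import random


class AdagradLinReg:
    """Linear regression with squared-error loss, fit one observation at a time."""

    def __init__(self, p, eta=0.5, eps=1e-8):
        self.beta = [0.0] * p  # coefficients
        self.g2 = [0.0] * p    # running sum of squared gradients, per coordinate
        self.eta = eta         # base step size
        self.eps = eps         # guards against division by zero

    def update(self, x, y):
        # Gradient of 0.5 * (x'beta - y)^2 with respect to beta_j is resid * x_j.
        resid = sum(b * xj for b, xj in zip(self.beta, x)) - y
        for j, xj in enumerate(x):
            g = resid * xj
            self.g2[j] += g * g
            # Adagrad: scale each coordinate's step by its gradient history.
            self.beta[j] -= self.eta * g / (math.sqrt(self.g2[j]) + self.eps)


rng = random.Random(0)
model = AdagradLinReg(p=2)
for _ in range(20000):
    x = [1.0, rng.gauss(0.0, 1.0)]              # intercept column plus one predictor
    y = 1.0 + 2.0 * x[1] + rng.gauss(0.0, 0.1)  # simulated response
    model.update(x, y)
print(model.beta)
```

Memory use is fixed at two vectors of length p, independent of how many observations are streamed through.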

Future Work
In the near future, expect to see:
  • More support for LASSO/Elastic Net penalties for stochastic gradient types and the sparse regression type SparseReg.
  • More examples in the documentation
  • As OnlineStats is partially the implementation of my research in the PhD Statistics program at NC State University, there will hopefully be some unique algorithms that don't exist elsewhere.  For example, quantile regression using an online MM algorithm (QuantRegMM).
A big thanks to Tom Breloff for implementing some cool algorithms and being a general programming advisor to this statistician.  

Collaboration and feedback are very welcome.

Best,
-Josh

--
You received this message because you are subscribed to the Google Groups "julia-stats" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

Re: ANN: OnlineStats.jl

David Gold
I'm excited to see all this, and I've started to play around with an interface for a SQLite3-backed data structure. Have you registered this package? I had to clone it.

On Thursday, August 27, 2015 at 10:01:13 AM UTC-7, Josh Day wrote:
[...]

Re: ANN: OnlineStats.jl

Josh Day
Very cool!  It is registered.  A new version was merged into METADATA last night.

On Sep 4, 2015, at 10:44 AM, David Gold <[hidden email]> wrote:

[...]
