Goodness-of-fit for Gaussian distribution of values X,Y


Fred
Hi,

I read in the Distributions.jl documentation that it is possible to fit a distribution to a given set of samples using: d = fit(D, sample)

I have a set of (X,Y) values. I would like to determine whether these values follow a Gaussian distribution, and to obtain a goodness-of-fit measure or p-value with which to accept or reject the hypothesis that the distribution is Gaussian.

1- First of all, in the call d = fit(Normal, sample), it is unclear to me how sample should be organised: can I concatenate the X and Y vectors like this: sample = [X Y]?

2- params(d) does not give a goodness-of-fit measure. How can I obtain this information?

Many thanks for your comments!

--
You received this message because you are subscribed to the Google Groups "julia-stats" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Re: Goodness-of-fit for Gaussian distribution of values X,Y

Dan
The `D` in the code snippet could be replaced with `MultivariateNormal`. `sample` should then be a 2×n matrix with one row for `X` and one row for `Y`, so that each column is one (X,Y) observation. This could be arranged using `sample = [X'; Y']`.

The resulting fit is a 2D Gaussian, with a mean for each of X and Y and a covariance matrix describing their variances and correlation.
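
As a rough illustration of the layout described above, here is a minimal sketch. It assumes a recent Distributions.jl, where the type is spelled `MvNormal`, and uses synthetic `X` and `Y` vectors invented for the example; `samples` plays the role of `sample` in the snippet above.

```julia
using Distributions, Random, Statistics

Random.seed!(1)
n = 500
X = randn(n)                  # synthetic data, for illustration only
Y = 0.5 .* X .+ randn(n)

# One row per variable, one column per observation (2 x n)
samples = [X'; Y']

d = fit_mle(MvNormal, samples)

mu = mean(d)                  # length-2 vector: fitted means of X and Y
Sigma = cov(d)                # 2x2 fitted covariance matrix
```

Note that `[X Y]` would instead produce an n×2 matrix, which is the transpose of what the fit expects here.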

If this is not the desired fit, perhaps an ordinary least squares (OLS) regression of Y on X would do the job.
Hope this helps.
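
The second question (a goodness-of-fit measure or p-value) is not covered by the fit itself. One possible approach is sketched below. It is not part of the thread, and it assumes the HypothesisTests.jl package is installed alongside a recent Distributions.jl. The idea is that, for bivariate normal data, squared Mahalanobis distances from the fitted mean are approximately chi-squared with 2 degrees of freedom, so a Kolmogorov-Smirnov test against `Chisq(2)` gives a rough p-value. Because the mean and covariance are estimated from the same data, the p-value is only approximate. The data below are synthetic.

```julia
using Distributions, HypothesisTests, Random

Random.seed!(42)
n = 1000
X = randn(n)                        # synthetic data, for illustration only
Y = 0.3 .* X .+ randn(n)
samples = [X'; Y']                  # 2 x n, one observation per column

d = fit_mle(MvNormal, samples)

# Squared Mahalanobis distance of every observation from the fitted mean
d2 = sqmahal(d, samples)

# Joint check: compare the distances with a chi-squared(2) distribution
joint_test = ApproximateOneSampleKSTest(d2, Chisq(2))
p_joint = pvalue(joint_test)

# Marginal checks: Jarque-Bera normality test on each coordinate
p_x = pvalue(JarqueBeraTest(X))
p_y = pvalue(JarqueBeraTest(Y))
```

A small p-value is evidence against normality; a large one only means that this particular check found no evidence against it. Normal marginals do not by themselves imply joint normality, which is why the joint check is included.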

On Wednesday, April 13, 2016 at 4:12:30 PM UTC+3, Fred wrote:
Hi,

I read in the Distributions.jl documentation (http://distributionsjl.readthedocs.org/en/latest/fit.html) that it is possible to fit a distribution to a given set of samples using: d = fit(D, sample)

I have a set of (X,Y) values. I would like to determine whether these values follow a Gaussian distribution, and to obtain a goodness-of-fit measure or p-value with which to accept or reject the hypothesis that the distribution is Gaussian.

1- First of all, in the call d = fit(Normal, sample), it is unclear to me how sample should be organised: can I concatenate the X and Y vectors like this: sample = [X Y]?

2- params(d) does not give a goodness-of-fit measure. How can I obtain this information?

Many thanks for your comments!

Re: Goodness-of-fit for Gaussian distribution of values X,Y

Fred
Thank you very much, Dan! I tried with a large array and got an error:

julia> typeof(mi)
Array{Float64,2}

julia> size(mi)
(1913,2)

julia> d = fit_mle(MultivariateNormal, mi)
ERROR: Base.LinAlg.PosDefException(2)
 in chol! at linalg/cholesky.jl:28
 in cholfact at linalg/cholesky.jl:126 (repeats 2 times)
 in fit_mle at /home/fred/.julia/v0.4/Distributions/src/multivariate/mvnormal.jl:261
 in fit_mle at /home/fred/.julia/v0.4/Distributions/src/multivariate/mvnormal.jl:251


So I created a simple example:

julia> x
6-element Array{Any,1}:
 1
 2
 3
 4
 5
 2

julia> y
6-element Array{Any,1}:
 2
 3
 3
 5
 4
 2

julia> s = [x y]
6x2 Array{Any,2}:
 1  2
 2  3
 3  3
 4  5
 5  4
 2  2

julia> d = fit_mle(MultivariateNormal, s)
ERROR: suffstats is not implemented for (Distributions.MvNormal{Cov<:PDMats.AbstractPDMat{T<:Real},Mean<:Union{Array{Float64,1},Distributions.ZeroVector{Float64}}},Array{Any,2}).
 in error at ./error.jl:21
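
The `suffstats is not implemented` message points at the element type: the matrix is an `Array{Any,2}`, and the fitting method appears to be defined for `Float64` matrices. A minimal sketch of one way around it, assuming a recent Distributions.jl with the `MvNormal` spelling, is to convert the data to `Float64` and put one observation per column before fitting:

```julia
using Distributions, Statistics

x = Any[1, 2, 3, 4, 5, 2]
y = Any[2, 3, 3, 5, 4, 2]

# Convert the Any-typed data to a concrete 6x2 Float64 matrix
s = Float64.(hcat(x, y))

# Transpose to 2x6 so that each column is one (x, y) observation
d = fit_mle(MvNormal, permutedims(s))

mu = mean(d)      # fitted means of x and y
Sigma = cov(d)    # fitted 2x2 covariance matrix
```

Building `x` and `y` as `Float64` vectors in the first place (for example `x = [1.0, 2.0, 3.0, 4.0, 5.0, 2.0]`) avoids the conversion step.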





Re: Goodness-of-fit for Gaussian distribution of values X,Y

Fred
julia> d = fit_mle(MvNormal, mi)
ERROR: Base.LinAlg.PosDefException(2)
 in chol! at linalg/cholesky.jl:28
 in cholfact at linalg/cholesky.jl:126 (repeats 2 times)
 in fit_mle at /home/fred/.julia/v0.4/Distributions/src/multivariate/mvnormal.jl:261
 in fit_mle at /home/fred/.julia/v0.4/Distributions/src/multivariate/mvnormal.jl:251
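
A likely cause of the `PosDefException`, assuming `mi` holds one (X,Y) observation per row: the fit reads each column as one observation, so a 1913×2 matrix is treated as 2 observations of a 1913-dimensional variable. The estimated covariance matrix is then singular and its Cholesky factorization fails. The sketch below uses a hypothetical helper, `fit_mvnormal_rows`, and a random stand-in for `mi`; neither comes from the thread, and the code targets a recent Distributions.jl.

```julia
using Distributions, Random

# Hypothetical helper: fit a multivariate normal to data stored
# with one observation per ROW (n x dims), as in the post above.
function fit_mvnormal_rows(data::AbstractMatrix{<:Real})
    n, dims = size(data)
    n > dims || throw(ArgumentError(
        "need more observations ($n) than dimensions ($dims)"))
    # The fit expects one observation per column, so transpose first
    return fit_mle(MvNormal, Matrix{Float64}(permutedims(data)))
end

Random.seed!(0)
mi = randn(1913, 2)            # stand-in for the 1913x2 array in the post

d = fit_mvnormal_rows(mi)
ndims_fitted = length(d)       # dimension of the fitted distribution
```

For the real data, `fit_mle(MvNormal, mi')` or `fit_mle(MvNormal, permutedims(mi))` should be the equivalent one-line change, provided the two columns are not perfectly collinear.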

