Unusual amount of storage allocation

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Unusual amount of storage allocation

Douglas Bates
I am encountering an unexpected amount of storage allocation in the cfactor!{A::HBlkDiag) method in the MixedModels package.  See
https://github.com/dmbates/MixedModels.jl/blob/master/src/cfactor.jl  for the code.

An HBlkDiag matrix is a homogeneous block diagonal matrix where "homogeneous" refers to the fact that all the diagonal blocks are square and of the same size.  Because of this homogeneity the blocks can be stored in an r by r by k array where r is the size of each of the square block and k is the number of such blocks.

On entry to this method the blocks are symmetric, positive semi-definite.  I want to overwrite the upper triangle of each of these blocks with its Cholesky factor.  I call LAPACK.potrf! directly because I don't want cholfact! to throw a non-positive-definite error.  The strange thing to me is that when I monitor the storage allocation, I get a huge amount of storage being allocated in the line with that call.  This may be because LAPACK.potrf! returns a tuple of the original matrix and an Int (the info code) but I'm not sure.

To see an example of this unusual amount of allocation try the following code with julia --track-allocation=user

using Feather, MixedModels
cd(Pkg.dir("MixedModels", "test", "data"))
sleepstudy = Feather.read("sleepstudy.feather", nullable=false)
fm1 = fit!(lmm(Reaction ~ 1 + Days + (1 + Days | Subject), sleepstudy))
Profile.clear_malloc_data()
devs, vars, betas, thetas = bootstrap(10_000, fm1)

I get

        - function cfactor!{T}(A::HBlkDiag{T})
        -     Aa = A.arr
        0     r, s, t = size(Aa)
        0     if r ≠ s
        0         throw(ArgumentError("HBlkDiag matrix A must be square"))
        -     end
 94428000     scm = Array(T, (r, r))
        0     for k in 1 : t  # FIXME: Lots of allocations in this loop
        0         for j in 1 : r, i in 1 : j
        0             scm[i, j] = Aa[i, j, k]
        -         end
566568000         LAPACK.potrf!('U', scm)
        0         for j in 1 : r, i in 1 : j
        0             Aa[i, j, k] = scm[i, j]
        -         end
        -     end
 10492000     UpperTriangular(A)
        - end

In this case the HBlkDiag matrix being decomposed has is 36 by 36 consisting of 18 2 by 2 diagonal blocks.  scm is a scratch 2 by 2 matrixthat is overwritten in sequence by the upper triangle of each of the original 2 by 2 blocks and passed to LAPACK.potrf!
Reply | Threaded
Open this post in threaded view
|

Re: Unusual amount of storage allocation

Douglas Bates
I moved this discussion to https://discourse.julialang.org where it is easier to format the code chunks.

On Monday, October 31, 2016 at 2:04:02 PM UTC-5, Douglas Bates wrote:
I am encountering an unexpected amount of storage allocation in the cfactor!{A::HBlkDiag) method in the MixedModels package.  See
<a href="https://github.com/dmbates/MixedModels.jl/blob/master/src/cfactor.jl" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Fdmbates%2FMixedModels.jl%2Fblob%2Fmaster%2Fsrc%2Fcfactor.jl\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNG_0eyFRACV_P9izBDH7wVqI0x-0A&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Fdmbates%2FMixedModels.jl%2Fblob%2Fmaster%2Fsrc%2Fcfactor.jl\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNG_0eyFRACV_P9izBDH7wVqI0x-0A&#39;;return true;">https://github.com/dmbates/MixedModels.jl/blob/master/src/cfactor.jl  for the code.

An HBlkDiag matrix is a homogeneous block diagonal matrix where "homogeneous" refers to the fact that all the diagonal blocks are square and of the same size.  Because of this homogeneity the blocks can be stored in an r by r by k array where r is the size of each of the square block and k is the number of such blocks.

On entry to this method the blocks are symmetric, positive semi-definite.  I want to overwrite the upper triangle of each of these blocks with its Cholesky factor.  I call LAPACK.potrf! directly because I don't want cholfact! to throw a non-positive-definite error.  The strange thing to me is that when I monitor the storage allocation, I get a huge amount of storage being allocated in the line with that call.  This may be because LAPACK.potrf! returns a tuple of the original matrix and an Int (the info code) but I'm not sure.

To see an example of this unusual amount of allocation try the following code with julia --track-allocation=user

using Feather, MixedModels
cd(Pkg.dir("MixedModels", "test", "data"))
sleepstudy = Feather.read("sleepstudy.feather", nullable=false)
fm1 = fit!(lmm(Reaction ~ 1 + Days + (1 + Days | Subject), sleepstudy))
Profile.clear_malloc_data()
devs, vars, betas, thetas = bootstrap(10_000, fm1)

I get

        - function cfactor!{T}(A::HBlkDiag{T})
        -     Aa = A.arr
        0     r, s, t = size(Aa)
        0     if r ≠ s
        0         throw(ArgumentError("HBlkDiag matrix A must be square"))
        -     end
 94428000     scm = Array(T, (r, r))
        0     for k in 1 : t  # FIXME: Lots of allocations in this loop
        0         for j in 1 : r, i in 1 : j
        0             scm[i, j] = Aa[i, j, k]
        -         end
566568000         LAPACK.potrf!('U', scm)
        0         for j in 1 : r, i in 1 : j
        0             Aa[i, j, k] = scm[i, j]
        -         end
        -     end
 10492000     UpperTriangular(A)
        - end

In this case the HBlkDiag matrix being decomposed has is 36 by 36 consisting of 18 2 by 2 diagonal blocks.  scm is a scratch 2 by 2 matrixthat is overwritten in sequence by the upper triangle of each of the original 2 by 2 blocks and passed to LAPACK.potrf!