I am encountering an unexpected amount of storage allocation in the cfactor!{A::HBlkDiag) method in the MixedModels package. See

<a href="https://github.com/dmbates/MixedModels.jl/blob/master/src/cfactor.jl" target="_blank" rel="nofollow" onmousedown="this.href='https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Fdmbates%2FMixedModels.jl%2Fblob%2Fmaster%2Fsrc%2Fcfactor.jl\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNG_0eyFRACV_P9izBDH7wVqI0x-0A';return true;" onclick="this.href='https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Fdmbates%2FMixedModels.jl%2Fblob%2Fmaster%2Fsrc%2Fcfactor.jl\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNG_0eyFRACV_P9izBDH7wVqI0x-0A';return true;">https://github.com/dmbates/MixedModels.jl/blob/master/src/cfactor.jl for the code.

An HBlkDiag matrix is a homogeneous block diagonal matrix where "homogeneous" refers to the fact that all the diagonal blocks are square and of the same size. Because of this homogeneity the blocks can be stored in an r by r by k array where r is the size of each of the square block and k is the number of such blocks.

On entry to this method the blocks are symmetric, positive semi-definite. I want to overwrite the upper triangle of each of these blocks with its Cholesky factor. I call LAPACK.potrf! directly because I don't want cholfact! to throw a non-positive-definite error. The strange thing to me is that when I monitor the storage allocation, I get a huge amount of storage being allocated in the line with that call. This may be because LAPACK.potrf! returns a tuple of the original matrix and an Int (the info code) but I'm not sure.

To see an example of this unusual amount of allocation try the following code with julia --track-allocation=user

using Feather, MixedModels

cd(Pkg.dir("MixedModels", "test", "data"))

sleepstudy = Feather.read("sleepstudy.feather", nullable=false)

fm1 = fit!(lmm(Reaction ~ 1 + Days + (1 + Days | Subject), sleepstudy))

Profile.clear_malloc_data()

devs, vars, betas, thetas = bootstrap(10_000, fm1)

I get

- function cfactor!{T}(A::HBlkDiag{T})

- Aa = A.arr

0 r, s, t = size(Aa)

0 if r ≠ s

0 throw(ArgumentError("HBlkDiag matrix A must be square"))

- end

94428000 scm = Array(T, (r, r))

0 for k in 1 : t # FIXME: Lots of allocations in this loop

0 for j in 1 : r, i in 1 : j

0 scm[i, j] = Aa[i, j, k]

- end

566568000 LAPACK.potrf!('U', scm)

0 for j in 1 : r, i in 1 : j

0 Aa[i, j, k] = scm[i, j]

- end

- end

10492000 UpperTriangular(A)

- end

In this case the HBlkDiag matrix being decomposed has is 36 by 36 consisting of 18 2 by 2 diagonal blocks. scm is a scratch 2 by 2 matrixthat is overwritten in sequence by the upper triangle of each of the original 2 by 2 blocks and passed to LAPACK.potrf!