# Analysis of four-dimensional datasets

5 messages

## Analysis of four-dimensional datasets

Hi all,

I started working with Julia when I started my PhD in physics, and I have written quite a bit of code for my research. However, not being educated in programming, I am now reaching some limits: performance-wise, and my knowledge of existing algorithms is limited as well. So I am here to ask for (general) help.

I work in physics, on scattering techniques. I therefore deal a lot with four-dimensional datasets (reciprocal space and energy, (H,K,L,E)), which tend to be huge (several gigabytes). I found it quite easy to work with DataFrames here.

From a 4D dataset, I need to reduce dimensions. Example: say H and E get binned, K and L are integrated and normalized, and the statistical error is calculated. The result would be something like a 2D dataset with axes along H and E:

H = [-2,0.1,2]; -0.1 < K < 0.1; 1 < L < 4; E = [0,1,45]

I need to find the optimal region in which to present the data, and searching in a 4D dataset is hugely inconvenient. So far I have used a DataFrame's join() on several fields, which creates a small set of duplicate / similar data points, which are then combined. This works on small datasets, but since each point is compared against a huge set of points, I can't rely on this technique for sets > 1 M points :)

I guess clustering is the best way to go; are there any algorithms that come to mind? I googled and found the canopy clustering algorithm by McCallum et al. Considering that it creates overlapping canopies, it should be a good starting point. However, it has not been implemented in Julia, right?

Last but not least, I need to fit my 1D data to custom models. I had huge problems fitting arbitrary curves (background, several normal distributions, linear offset) to my dataset. The fit should use the Chi^2 algorithm, weighted by the statistical error. Is that already implemented in any package?

Again, for each of these tasks I have written some routines, but I have the feeling there is much faster and more elegant stuff out there.
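For reference, the reduction described above (bin H and E, integrate over a K and L window, normalize, propagate the error) can be sketched as a single pass over the points, which avoids comparing every point against every other. This is only a sketch under assumptions that are not in the post: each point is assumed to carry an intensity `I` and a statistical error `dI`, normalization is assumed to mean dividing by the number of contributing points, the three-number ranges are read as start, step, stop, and the name `reduce_HE` is made up for illustration.

```julia
# Hypothetical helper (not from any package): reduce (H, K, L, E) points
# to a 2D grid along H and E. I and dI are assumed per-point intensity and error.
function reduce_HE(H, K, L, E, I, dI;
                   Hedges = -2:0.1:2, Eedges = 0:1:45,
                   Krange = (-0.1, 0.1), Lrange = (1.0, 4.0))
    nH, nE = length(Hedges) - 1, length(Eedges) - 1
    sumI = zeros(nH, nE)        # summed intensity per bin
    sumvar = zeros(nH, nE)      # summed variance per bin
    npts = zeros(Int, nH, nE)   # number of contributing points per bin
    for n in eachindex(H)
        # Integration over K and L: keep only points inside the window.
        (Krange[1] < K[n] < Krange[2] && Lrange[1] < L[n] < Lrange[2]) || continue
        # Index of the last bin edge <= value; bins are half-open [lower, upper).
        i = searchsortedlast(Hedges, H[n])
        j = searchsortedlast(Eedges, E[n])
        (1 <= i <= nH && 1 <= j <= nE) || continue
        sumI[i, j] += I[n]
        sumvar[i, j] += dI[n]^2
        npts[i, j] += 1
    end
    # Assumed normalization: mean per bin; errors add in quadrature.
    norm = max.(npts, 1)        # avoid dividing by zero in empty bins
    return sumI ./ norm, sqrt.(sumvar) ./ norm, npts
end
```

The cost grows linearly with the number of points, and the columns of a DataFrame could be passed in directly as vectors.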
It would be a huge help if you guys dropped me a line with your thoughts on these points. Looking forward to your input!
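To make the fitting question concrete, here is a minimal sketch of the error-weighted Chi^2 statistic for the kind of model mentioned in the post (background, linear offset, several Gaussian peaks). The parameter layout and the names `model`, `chi2` and `reduced_chi2` are assumptions chosen for illustration, not taken from any package, and the sketch only evaluates the statistic; the minimization itself is left to whatever optimizer is used.

```julia
# Assumed parameter layout (illustrative only):
# p = [background, slope, A1, mu1, sigma1, A2, mu2, sigma2, ...]
function model(x, p)
    y = p[1] .+ p[2] .* x                      # constant background + linear term
    for k in 3:3:length(p)                     # one Gaussian per (A, mu, sigma) triple
        A, mu, sigma = p[k], p[k + 1], p[k + 2]
        y = y .+ A .* exp.(-0.5 .* ((x .- mu) ./ sigma) .^ 2)
    end
    return y
end

# Chi^2: squared residuals, each weighted by its statistical error dy.
chi2(p, x, y, dy) = sum(((y .- model(x, p)) ./ dy) .^ 2)

# Reduced Chi^2: divide by the degrees of freedom (points minus parameters).
reduced_chi2(p, x, y, dy) = chi2(p, x, y, dy) / (length(x) - length(p))
```

A function like `p -> chi2(p, x, y, dy)` can then be handed to a general-purpose minimizer; which package suits best, and how it expects weights to be passed, should be checked against that package's documentation.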

## Re: Analysis of four-dimensional datasets


## Re: Analysis of four-dimensional datasets


## Re: Analysis of four-dimensional datasets
