Hi all,
I am having a problem predicting from a regression with pooled variables. By pooled variables I mean the ones created with the pool() function. I pooled the variable by group (36 groups in total), so putting the pooled variable in the regression automatically fits the regression with indicators for the 36 groups (some will be dropped due to collinearity). My current code is something like the following:

    # Create pooled data array from group_index column
    sampledata[:group_pooled] = pool(sampledata[:group_index])

    # Run regression
    IPW_treat_fml = Formula(:attr_treat, :group_pooled)
    IPW_treat_reg = glm(IPW_treat_fml, sampledata, Normal(), IdentityLink())

    # Predict
    predict(IPW_treat_reg, sampledata)

However, predict(IPW_treat_reg, sampledata) does not work and gives me this error: DimensionMismatch("second dimension of A, 36, does not match length of x, 35"). If I write predict(IPW_treat_reg), the code works, but I need to pass sampledata to the prediction function in order to see the NA predictions as well; predict(IPW_treat_reg) drops all the NA results.

Any help will be greatly appreciated!

You received this message because you are subscribed to the Google Groups "julia-stats" group. To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email]. For more options, visit https://groups.google.com/d/optout.
Okay, so I have come up with a temporary solution for this.
I think predict(IPW_treat_reg, sampledata) does not work in this case because, as I said, some pooled variables are dropped due to collinearity. predict(IPW_treat_reg) works, although it only returns predictions for the rows where the dependent variable is not NA. I don't need predictions for the rows where the dependent variable is NA, so I decided to do the following:

    sampledata[:predict] = 0.0  # Make sure that it is a float!
    p_index = 1  # Index for the prediction values. Going to be increased in the loop.
    for i in 1:length(sampledata[:attr_treat])
        if !isna(sampledata[i, :attr_treat])
            sampledata[i, :predict] = predict(IPW_treat_reg)[p_index]
            p_index = p_index + 1
        else
            sampledata[i, :predict] = NA
        end
    end

The code above creates a new column called "predict" in sampledata that holds NA where the dependent variable is NA and the predicted value everywhere else. Let me know if there is an easier way to do this!

In reply to: Jessica Koh, Tuesday, May 31, 2016 at 11:19:02 AM UTC-5
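One possibly simpler way to line the fitted values back up with the original rows is to build a mask of the rows that took part in the fit and assign through it, instead of walking an index by hand. The sketch below is only an illustration in plain Julia: it uses current syntax, where `missing` takes the place of `NA`, and the vectors `y` and `fitted` are made-up stand-ins for sampledata[:attr_treat] and predict(IPW_treat_reg). It also assumes, as the loop does, that the only rows left out of the fit are those with a missing dependent variable; if the grouping column can be missing too, the mask would need to account for that.

```julia
# Hypothetical toy data: `y` stands in for the dependent variable column,
# `fitted` for the vector returned by predict(model), one value per fitted row.
y = [1.2, missing, 0.7, missing, 2.5]
fitted = [1.0, 0.8, 2.4]

# Mask of the rows that took part in the fit (assumption: only rows with a
# missing dependent variable were dropped).
observed = .!ismissing.(y)
@assert count(observed) == length(fitted)

# Start from an all-missing column, then fill the observed rows in order.
pred = Vector{Union{Missing, Float64}}(missing, length(y))
pred[observed] = fitted
```

Here `pred` has the same length as `y`, with `missing` in the rows that were dropped and the fitted values, in their original order, everywhere else. The length check is there to fail early if the mask and the fitted vector disagree, which would point to rows being dropped for some other reason.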