Abstract
We consider problems in model selection caused by the geometry of models close to their points of intersection. In some cases---including common classes of causal or graphical models, as well as time series models---distinct models may nevertheless have identical tangent spaces. This has two immediate consequences: first, in order to obtain constant power to reject one model in favour of another we need local alternative hypotheses that decrease to the null at a slower rate than the usual parametric n^{-1/2} (typically we will require n^{-1/4} or slower); in other words, to distinguish between the models we need large effect sizes or very large sample sizes. Second, we show that under even weaker conditions on their tangent cones, models in these classes cannot be made simultaneously convex by a reparameterization. This shows that Bayesian network models, amongst others, cannot be learned directly with a convex method similar to the graphical lasso. However, we are able to use our results to suggest methods for model selection that learn the tangent space directly, rather than the model itself. If there is time, we will give a generic algorithm for learning Bayesian network models.
Reference
Evans, R.J. Model selection and local geometry. Annals of Statistics, 48(6): 3514-3544, 2020.