Geometry of smooth random fields

J. Taylor

September 3-6, 2017

Outline

\[ \newcommand{\Ee}{\mathbb{E}} \newcommand{\Pp}{\mathbb{P}} \newcommand{\real}{\mathbb{R}} \newcommand{\hauss}{{\cal H}} \newcommand{\lips}{{\cal L}} \]

Talk I

Talk II

Random sets

PET study

Random sets

A simplified neuroimaging study

\(M\) is the brain

Random sets

Question of interest

Random sets

PET study

Random sets

Gaussian processes: basic building blocks

Why Gaussian?

Random sets

Why manifolds and not Euclidean?

Manifolds can have boundary

Random sets

Model for PET data

Basic problem

Random sets

Excursion sets
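
For reference, the excursion set notation used throughout below: the excursion set of \(f\) above level \(u\) is

\[ A_u(f, M) = \{ t \in M : f(t) \geq u \} = M \cap f^{-1}[u, +\infty). \]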

Vector valued fields

Random sets

High level heuristics

Critical points

Random sets

Excursion above 0

Random sets

Excursion above 1

Random sets

Excursion above 1.5

Random sets

Excursion above 2

Random sets

Excursion above 2.5

Random sets

Excursion above 3

Random sets

High level heuristics

Random sets

What geometric features? Why?

EC tells you very little

Let \(M\) be a 2-manifold without boundary. Then \[ \Ee \left\{\chi\left(M \cap f^{-1}[0,+\infty)\right)\right\} = \frac{1}{2} \cdot \chi(M). \]
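
For instance, for the 2-sphere and the 2-torus this gives, regardless of the covariance of \(f\),

\[ \Ee \left\{\chi\left(S^2 \cap f^{-1}[0,+\infty)\right)\right\} = \frac{1}{2}\chi(S^2) = 1, \qquad \Ee \left\{\chi\left(T^2 \cap f^{-1}[0,+\infty)\right)\right\} = \frac{1}{2}\chi(T^2) = 0, \]

so at level 0 the mean EC is blind to the dependence structure of the field.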

Random sets

What is the Euler characteristic?

Random sets

EC tells you a lot

For nice enough \(M\) \[ \Ee \left\{\chi\left(M \cap f^{-1}[u,+\infty)\right)\right\} \overset{u \rightarrow \infty}{\simeq} \Pp \left\{\sup_{t \in M} f_t \geq u \right\}\]

EC counts local maxima
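
The heuristic: for large \(u\) the excursion set is, with high probability, either empty or a disjoint union of small, nearly convex blobs, one around each local maximum exceeding \(u\). Each blob contributes \(+1\) to the EC, so

\[ \chi\left(M \cap f^{-1}[u,+\infty)\right) \approx \#\{\text{local maxima of } f \text{ above } u\} \approx 1_{\left\{\sup_{t \in M} f_t \geq u\right\}}, \]

and taking expectations gives the approximation displayed above.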

Random sets

EC is computable

Of all quantities in the study of Gaussian processes, the EC stands out as being explicitly computable in wide generality \[ \Ee \left\{\chi\left(M \cap f^{-1}[u,+\infty)\right)\right\} = \sum_{j=0}^{\text{dim}(M)} \lips_j(M) \rho_j(u).\]
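
As an illustration (not from the talk), a minimal R sketch of the right-hand side for a unit-variance, stationary, isotropic field on a rectangle; the second spectral moment lambda2 and the rectangle dimensions are placeholders, and the \(\rho_j\) below are the standard Gaussian EC densities.

# Expected EC of the excursion above u on [0,a] x [0,b] (illustrative sketch)
expected_EC <- function(u, a = 10, b = 10, lambda2 = 1) {
  rho0 <- pnorm(u, lower.tail = FALSE)          # rho_0(u) = P(N(0,1) >= u)
  rho1 <- exp(-u^2 / 2) / (2 * pi)              # rho_1(u)
  rho2 <- u * exp(-u^2 / 2) / (2 * pi)^(3 / 2)  # rho_2(u)
  # Lipschitz-Killing curvatures of the rectangle in the induced metric
  L0 <- 1; L1 <- sqrt(lambda2) * (a + b); L2 <- lambda2 * a * b
  L0 * rho0 + L1 * rho1 + L2 * rho2
}
curve(expected_EC(x), from = -3, to = 5, xlab = "u", ylab = "E{EC}")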

Integral geometry

Tube formulae

Intrinsic volumes

Integral geometry

Riemannian metric

Metric induced by a Gaussian process
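
For reference (this is the standard construction): a centered, unit-variance Gaussian process \(f\) induces a Riemannian metric on \(M\) through the covariance of its derivatives,

\[ g_t(X_t, Y_t) = \Ee \left\{ (X_t f) \, (Y_t f) \right\}, \qquad X_t, Y_t \in T_t M, \]

i.e. in coordinates \(g_{ij}(t) = \mathrm{Cov}\left(\partial f_t / \partial t_i, \, \partial f_t / \partial t_j\right)\). The \(\lips_j(M)\) appearing in the EC formula are computed with respect to this metric.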

Integral geometry

\[ \hauss_3\left( \text{Tube}([0,a] \times [0,b] \times [0,c],r)\right) = abc + 2r \cdot ( ab+bc+ac) + (\pi r^2) \cdot (a+b+c) + \frac{4\pi r^3}{3} \]
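
A quick numerical sanity check (illustrative, not from the talk): estimate the tube volume by Monte Carlo, using the fact that the distance from a point to an axis-aligned box is the norm of its coordinate-wise overshoots.

# Monte Carlo check of the tube volume of [0,a] x [0,b] x [0,cc] at radius r
a <- 1; b <- 2; cc <- 3; r <- 0.5
n <- 1e6
# sample uniformly in the bounding box [-r, a+r] x [-r, b+r] x [-r, cc+r]
pts <- cbind(runif(n, -r, a + r), runif(n, -r, b + r), runif(n, -r, cc + r))
# coordinate-wise distance to the box: 0 inside the slab, overshoot outside
d1 <- pmax(pts[, 1] - a, -pts[, 1], 0)
d2 <- pmax(pts[, 2] - b, -pts[, 2], 0)
d3 <- pmax(pts[, 3] - cc, -pts[, 3], 0)
in_tube <- sqrt(d1^2 + d2^2 + d3^2) <= r
mc_vol <- mean(in_tube) * (a + 2 * r) * (b + 2 * r) * (cc + 2 * r)
steiner_vol <- a * b * cc + 2 * r * (a * b + b * cc + a * cc) +
  pi * r^2 * (a + b + cc) + 4 * pi * r^3 / 3
c(monte_carlo = mc_vol, steiner = steiner_vol)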

Integral geometry

Local convexity is important!

Integral geometry

Curvature measures

\[ \hauss_{k-1} \left(\exp_{\nu} \left(A \cap \partial D, r\nu \right) \right) = \sum_{j=1}^k r^{j-1} \cdot \omega_j \int_A \; \lips_{k-j}(D; dp) \]

Integral geometry

Curvature measures

Integral geometry

Normal bundle of the cube

Integral geometry

Parametrization of tube

Talk II: Inference for the LASSO

Running example

Inference for the LASSO

dim(X)
## [1] 633  91
colnames(X)[1:5]
## [1] "P6D"  "P20R" "P21I" "P35I" "P35M"

Inference for the LASSO

Inference for the LASSO

Model selection with the LASSO

\[ \hat{\beta}_{\lambda} = \text{argmin}_{\beta} \frac{1}{2} \|Y-X\beta\|^2_2 + \lambda \|\beta\|_1 \]

Choice of \(\lambda\)

\[ \lambda = \kappa \cdot \mathbb{E}( \|X^T\epsilon\|_{\infty}), \qquad \epsilon \sim N(0, \sigma^2 I). \]

arxiv.org/1311.6238, arxiv.org/1504.08031
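
A minimal sketch of this choice in R (illustrative: sigma, kappa, and the number of noise draws are placeholders; X and Y are the data shown above). Note that glmnet penalizes \(\frac{1}{2n}\|Y-X\beta\|^2_2 + \lambda \|\beta\|_1\), so a \(\lambda\) on the scale above is divided by \(n\) before being passed to coef.

library(glmnet)
set.seed(1)
sigma <- 1; kappa <- 1; ndraw <- 500                         # placeholders
eps <- matrix(rnorm(nrow(X) * ndraw, sd = sigma), nrow(X), ndraw)
lam <- kappa * mean(apply(abs(crossprod(X, eps)), 2, max))   # E ||X^T eps||_inf
G <- glmnet(X, Y)
beta.hat <- coef(G, s = lam / nrow(X))[-1]                   # drop the intercept
which(beta.hat != 0)                                         # selected variables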

Inference for the LASSO

Selected variables

selected_vars
##  [1] "P62V"  "P65R"  "P67N"  "P69i"  "P75I"  "P77L"  "P83K"  "P90I" 
##  [9] "P115F" "P151M" "P181C" "P184V" "P190A" "P215F" "P215Y" "P219R"
Xselect = X[,selected_vars]

Inference after LASSO?

Is this OK?

lm.selected = lm(Y ~ ., data=data.frame(Y, Xselect))
summary(lm.selected)
## 
## Call:
## lm(formula = Y ~ ., data = data.frame(Y, Xselect))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.9654 -0.3327 -0.0651  0.2653  4.9333 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -2.252e-12  2.497e-02   0.000 1.000000    
## P62V         4.465e-02  2.999e-02   1.489 0.136994    
## P65R         2.837e-01  2.612e-02  10.860  < 2e-16 ***
## P67N         1.423e-01  2.884e-02   4.935 1.03e-06 ***
## P69i         1.649e-01  2.691e-02   6.126 1.61e-09 ***
## P75I         2.770e-02  3.955e-02   0.700 0.483889    
## P77L         4.774e-02  4.327e-02   1.103 0.270371    
## P83K        -7.475e-02  2.551e-02  -2.930 0.003513 ** 
## P90I         1.038e-01  2.536e-02   4.094 4.81e-05 ***
## P115F        6.754e-02  2.729e-02   2.475 0.013593 *  
## P151M        9.424e-02  3.541e-02   2.661 0.007992 ** 
## P181C        9.691e-02  2.623e-02   3.694 0.000240 ***
## P184V        2.218e+00  2.610e-02  84.973  < 2e-16 ***
## P190A        4.797e-02  2.582e-02   1.858 0.063696 .  
## P215F        1.152e-01  2.896e-02   3.976 7.83e-05 ***
## P215Y        1.748e-01  2.992e-02   5.845 8.24e-09 ***
## P219R        8.512e-02  2.569e-02   3.313 0.000976 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6281 on 616 degrees of freedom
## Multiple R-squared:  0.9313, Adjusted R-squared:  0.9296 
## F-statistic: 522.2 on 16 and 616 DF,  p-value: < 2.2e-16

Inference after LASSO?

No!

No!

Inference after LASSO

Exploratory Data Analysis (EDA)

more emphasis on using data to suggest hypotheses to test.

bias: testing hypotheses suggested by the data.

Inference after LASSO

Selective inference

Inference after LASSO

Data splitting

# split the rows: fit the LASSO on the selection half only
library(glmnet)
X1 = X[selection_half,]; Y1 = Y[selection_half]
G1 = glmnet(X1, Y1)
# coefficients at the (rescaled) lambda; entries 2:92 drop the intercept
beta.hat1 = coef(G1, 43/(633 * sqrt(2)), exact=TRUE)[2:92]
selected1 = which(beta.hat1 != 0)
selected1
##  [1]  1  8 16 17 19 23 29 31 32 42 50 65 67 68 78 79 81 82 84 86 87 91

Inference after LASSO

Data splitting: no assumptions?

summary(lm(Y ~ ., data=data.frame(X[,selected1], Y), subset=inference_half))
## 
## Call:
## lm(formula = Y ~ ., data = data.frame(X[, selected1], Y), subset = inference_half)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.0572 -0.3701 -0.0860  0.3240  4.9064 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.037984   0.039323   0.966 0.334871    
## P6D          0.040582   0.044685   0.908 0.364533    
## P41L        -0.121952   0.075873  -1.607 0.109056    
## P62V         0.061613   0.049640   1.241 0.215522    
## P65R         0.313896   0.039398   7.967 3.60e-14 ***
## P67N         0.211301   0.061126   3.457 0.000627 ***
## P69i         0.170076   0.049531   3.434 0.000681 ***
## P75T         0.104660   0.041891   2.498 0.013023 *  
## P83K        -0.083117   0.041709  -1.993 0.047206 *  
## P90I         0.094192   0.037612   2.504 0.012811 *  
## P116Y        0.151594   0.037910   3.999 8.06e-05 ***
## P135M        0.017384   0.041423   0.420 0.675030    
## P178M       -0.046817   0.040655  -1.152 0.250434    
## P181C        0.136794   0.042864   3.191 0.001570 ** 
## P184V        2.203686   0.041395  53.236  < 2e-16 ***
## P210W       -0.011475   0.069596  -0.165 0.869152    
## P211K       -0.005614   0.041842  -0.134 0.893348    
## P215F        0.189774   0.051694   3.671 0.000287 ***
## P215Y        0.288219   0.076418   3.772 0.000196 ***
## P219E       -0.041281   0.043536  -0.948 0.343806    
## P219Q       -0.033336   0.059221  -0.563 0.573921    
## P219R        0.081919   0.038202   2.144 0.032822 *  
## P228R       -0.044568   0.044285  -1.006 0.315055    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6924 on 294 degrees of freedom
## Multiple R-squared:  0.9201, Adjusted R-squared:  0.9141 
## F-statistic: 153.8 on 22 and 294 DF,  p-value: < 2.2e-16

Inference after LASSO

Data splitting

Inference after LASSO

Data splitting

Random hypotheses

Inference after LASSO

arxiv.org/1311.6238, arxiv.org/1410.2597.

selected_vars
##  [1] "P62V"  "P65R"  "P67N"  "P69i"  "P75I"  "P77L"  "P83K"  "P90I" 
##  [9] "P115F" "P151M" "P181C" "P184V" "P190A" "P215F" "P215Y" "P219R"

Inference after LASSO

Inference after LASSO

Using selectiveInference

fixedLassoInf(X, Y, beta.hat, 43, intercept=FALSE)
## 
## Call:
## fixedLassoInf(x = X, y = Y, beta = beta.hat, lambda = 43, intercept = FALSE)
## 
## Standard deviation of noise (specified or estimated) sigma = 0.634
## 
## Testing results at lambda = 43.000, with alpha = 0.100
## 
##  Var   Coef Z-score P-value LowConfPt UpConfPt LowTailArea UpTailArea
##   16  0.045   1.476   0.184    -0.045    0.145       0.050      0.050
##   17  0.284  10.761   0.000     0.240    0.336       0.049      0.049
##   19  0.142   4.890   0.000     0.095    0.225       0.049      0.050
##   23  0.165   6.070   0.000     0.114    0.210       0.048      0.049
##   27  0.028   0.694   0.723    -0.434    0.094       0.050      0.049
##   30  0.048   1.093   0.234    -0.108    0.324       0.050      0.050
##   31 -0.075  -2.904   0.025    -0.117   -0.013       0.050      0.049
##   32  0.104   4.057   0.007     0.041    0.146       0.050      0.049
##   41  0.068   2.453   0.083    -0.015    0.119       0.050      0.050
##   54  0.094   2.637   0.040     0.006    0.168       0.049      0.050
##   67  0.097   3.661   0.001     0.056    0.282       0.049      0.050
##   68  2.218  84.205   0.000     2.162    2.262       0.049      0.049
##   69  0.048   1.841   0.845    -0.973    0.053       0.050      0.049
##   81  0.115   3.940   0.008     0.044    0.214       0.050      0.049
##   82  0.175   5.792   0.000     0.124    0.225       0.048      0.050
##   87  0.085   3.283   0.049     0.000    0.139       0.050      0.049
## 
## Note: coefficients shown are partial regression coefficients
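
For completeness, a sketch of the steps leading up to this call (illustrative: the talk shows only the fixedLassoInf call, so the exact fit producing beta.hat is an assumption):

library(glmnet)
library(selectiveInference)
# LASSO fit on the full data; glmnet scales the penalty by 1/n, hence the
# division by nrow(X). No intercept, matching intercept=FALSE below.
G <- glmnet(X, Y, intercept = FALSE)
beta.hat <- coef(G, s = 43 / nrow(X))[-1]
# selective p-values and intervals for the variables active at lambda = 43
fixedLassoInf(X, Y, beta.hat, 43, intercept = FALSE)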

Inference after LASSO

Some intervals seem to be long…

Inference after LASSO

Selective inference for LASSO

\[ {\cal L}_F\left(X^T_Ey \big\vert (\hat{E}, z_{\hat{E}}) = (E, z_E) \right) \]

\[ {\cal L}_F\left(X^T_Ey \big\vert \hat{E}=E \right) \]

\[ F \in \left\{N(X_E\beta_E, \sigma^2_E I): \beta_E \in \mathbb{R}^E \right\} ={\cal M}_E. \]

Inference after LASSO

Dual problem

Inference after LASSO

Dual problem and critical points

Inference after LASSO

Visualizing LASSO partition

(Credit Naftali Harris)

Inference after LASSO

What we know after conditioning

Inference after LASSO

KKT conditions

Inference after LASSO

KKT conditions
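
For reference, the KKT conditions for the LASSO program above: \(\hat{\beta}_{\lambda}\) is a solution if and only if there is a subgradient \(\hat{z} \in \partial \|\hat{\beta}_{\lambda}\|_1\) with

\[ X^T\left(Y - X\hat{\beta}_{\lambda}\right) = \lambda \hat{z}, \qquad \hat{z}_j = \text{sign}(\hat{\beta}_{\lambda,j}) \text{ if } \hat{\beta}_{\lambda,j} \neq 0, \quad \hat{z}_j \in [-1,1] \text{ otherwise}. \]

The active set \(\hat{E}\) and sign vector \(z_{\hat{E}}\) being conditioned on are the support and signs of \(\hat{\beta}_{\lambda}\) read off from these conditions.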

Inference after LASSO

Selection event

Inference after LASSO

Polyhedral constraints
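
Concretely (this is the polyhedral characterization from the references above): for fixed \((E, z_E)\), the event that the LASSO selects exactly that set and sign pattern is an intersection of affine constraints on \(y\),

\[ \left\{ y : (\hat{E}, z_{\hat{E}}) = (E, z_E) \right\} = \left\{ y : A(E, z_E)\, y \leq b(E, z_E) \right\}, \]

so under \({\cal M}_E\) the law of a linear functional \(\eta^T y\), conditional on this event (and on the directions orthogonal to \(\eta\)), is a Gaussian truncated to an interval. This is what makes the selective p-values and intervals computable.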

Inference after LASSO

Inference after LASSO

Selective hypothesis tests

\[ H_{0,(j|E)} : \beta_{j|E} = 0. \]

\[\phi_{(j| \hat{E})}(y), j \in \hat{E}.\]

Summary