Compute lesion overlap¶
# ConWhAt stuff
from conwhat import VolConnAtlas,StreamConnAtlas,VolTractAtlas,StreamTractAtlas
from conwhat.viz.volume import plot_vol_scatter
# Neuroimaging stuff
import nibabel as nib
from nilearn.plotting import (plot_stat_map,plot_surf_roi,plot_roi,
plot_connectome,find_xyz_cut_coords)
from nilearn.image import resample_to_img
# Viz stuff
%matplotlib inline
from matplotlib import pyplot as plt
import seaborn as sns
# Generic stuff
import glob, numpy as np, pandas as pd, networkx as nx
from datetime import datetime
We now use the synthetic lesion constructed in the previous example in a ConWhAt lesion analysis.
lesion_file = 'synthetic_lesion_20mm_sphere_-46_-60_6.nii.gz' # we created this file from scratch in the previous example
Take another quick look at this mask:
lesion_img = nib.load(lesion_file)
plot_roi(lesion_file);
Since our lesion mask does not (by construction) have a huge amount of spatial detail, it makes sense to use one of the lower-resolution atlas. As one might expect, computation time is considerably faster for lower-resolution atlases.
>>> cw_atlases_dir = '/global/scratch/hpc3230/Data/conwhat_atlases' # change this accordingly
>>> atlas_name = 'CWL2k8Sc33Vol3d100s_v01'
>>> atlas_dir = '%s/%s' %(cw_atlases_dir, atlas_name)
See the previous tutorial on ‘exploring the conwhat atlases’ for more info on how to examine the components of a given atlas in ConWhAt.
Initialize the atlas
>>> cw_vca = VolConnAtlas(atlas_dir=atlas_dir)
loading file mapping
loading vol bbox
loading connectivity
Choose which connections to evaluate.
This is normally an array of numbers indexing entries in
cw_vca.vfms
.
Pre-defining connection subsets is a useful way of speeding up large analyses, especially if one is only interested in connections between specific sets of regions.
As we are using a relatively small atlas, and our lesion is not too extensive, we can assess all connections.
>>> idxs = 'all' # alternatively, something like: range(1,100), indicates the first 100 cnxns (rows in .vmfs)
Now, compute lesion overlap statistics.
>>> jlc_dir = '/global/scratch/hpc3230/joblib_cache_dir' # this is the cache dir where joblib writes temporary files
>>> lo_df,lo_nx = cw_vca.compute_hit_stats(lesion_file,idxs,n_jobs=4,joblib_cache_dir=jlc_dir)
computing hit stats for roi synthetic_lesion_20mm_sphere_-46_-60_6.nii.gz
This takes about 20 minutes to run.
vca.compute_hit_stats()
returns a pandas
dataframe, lo_df
,
and a networkx
object, lo_nx
.
Both contain mostly the same information, which is sometimes more useful in one of these formats and sometimes in the other.
lo_df
is a table, with rows corresponding to each connection, and
columns for each of a wide set of statistical
metrics
for evaluating sensitivity and specificity of binary hit/miss data:
>>> lo_df.head()
metric | ACC | BM | F1 | FDR | FN | FNR | FP | FPR | Kappa | MCC | MK | NPV | PPV | TN | TNR | TP | TPR | corr_nothr | corr_thr | corr_thrbin |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
idx | ||||||||||||||||||||
0 | 0.990646 | 0.104859 | 0.098135 | 0.911501 | 29696.0 | 0.889874 | 37851.0 | 0.005266 | 0.330534 | 0.094054 | 0.084363 | 0.995864 | 0.088499 | 7149810.0 | 0.994734 | 3675.0 | 0.110126 | 0.042205 | 0.042205 | 0.094054 |
3 | 0.987324 | 0.011683 | 0.014279 | 0.988855 | 32708.0 | 0.980132 | 58828.0 | 0.008185 | 0.329134 | 0.008766 | 0.006577 | 0.995433 | 0.011145 | 7128833.0 | 0.991815 | 663.0 | 0.019868 | -0.001487 | -0.001487 | 0.008766 |
7 | 0.987160 | -0.006617 | 0.001185 | 0.999075 | 33316.0 | 0.998352 | 59404.0 | 0.008265 | 0.329023 | -0.004966 | -0.003727 | 0.995348 | 0.000925 | 7128257.0 | 0.991735 | 55.0 | 0.001648 | -0.003549 | -0.003549 | -0.004966 |
10 | 0.994367 | -0.000926 | 0.000147 | 0.999589 | 33368.0 | 0.999910 | 7305.0 | 0.001016 | 0.331450 | -0.001976 | -0.004215 | 0.995374 | 0.000411 | 7180356.0 | 0.998984 | 3.0 | 0.000090 | -0.001975 | -0.001975 | -0.001976 |
11 | 0.989105 | 0.048907 | 0.044941 | 0.962227 | 31520.0 | 0.944533 | 47152.0 | 0.006560 | 0.329846 | 0.040403 | 0.033378 | 0.995605 | 0.037773 | 7140509.0 | 0.993440 | 1851.0 | 0.055467 | 0.017664 | 0.017664 | 0.040403 |
Typically we will be mainly interested in two of these metric scores:
TPR
- True positive (i.e. hit) rate: number of true positives,
divided by number of true positives + number of false negatives
corr_thrbin
- Pearson correlation between the lesion amge and the
thresholded, binarized connectome edge image (group-level visitation
map)
>>> lo_df[['TPR', 'corr_thrbin']].iloc[:10].T
idx | 0 | 3 | 7 | 10 | 11 | 13 | 14 | 15 | 18 | 19 |
---|---|---|---|---|---|---|---|---|---|---|
metric | ||||||||||
TPR | 0.110126 | 0.019868 | 0.001648 | 0.000090 | 0.055467 | 0.002128 | 0.000569 | 0.000000 | 0.098469 | 0.023523 |
corr_thrbin | 0.094054 | 0.008766 | -0.004966 | -0.001976 | 0.040403 | 0.005801 | 0.000641 | -0.002543 | 0.169234 | 0.029414 |
We can obtain these numbers as a ‘modification matrix’ (connectivity matrix)
>>> tpr_adj = nx.to_pandas_adjacency(lo_nx,weight='TPR')
>>> cpr_adj = nx.to_pandas_adjacency(lo_nx,weight='corr_thrbin')
These two maps are, unsurprisingly, very similar:
>>> np.corrcoef(tpr_adj.values.ravel(), cpr_adj.values.ravel())
array([[1. , 0.96271946],
[0.96271946, 1. ]])
>>> fig, ax = plt.subplots(ncols=2, figsize=(12,4))
>>> sns.heatmap(tpr_adj,xticklabels='',yticklabels='',vmin=0,vmax=0.5,ax=ax[0]);
>>> sns.heatmap(cpr_adj,xticklabels='',yticklabels='',vmin=0,vmax=0.5,ax=ax[1]);
(…with an alternative color scheme…)
>>> fig, ax = plt.subplots(ncols=2, figsize=(12,4))
>>> sns.heatmap(tpr_adj, xticklabels='',yticklabels='',cmap='Reds',
>>> mask=tpr_adj.values==0,vmin=0,vmax=0.5,ax=ax[0]);
>>> sns.heatmap(cpr_adj,xticklabels='',yticklabels='',cmap='Reds',
>>> mask=cpr_adj.values==0,vmin=0,vmax=0.5,ax=ax[1]);
We can list directly the most affected (greatest % overlap) connections,
>>> cw_vca.vfms.loc[lo_df.index].head()
name | nii_file | nii_file_id | 4dvolind | |
---|---|---|---|---|
idx | ||||
0 | 61_to_80 | vismap_grp_62-81_norm.nii.gz | 0 | NaN |
3 | 18_to_19 | vismap_grp_19-20_norm.nii.gz | 3 | NaN |
7 | 45_to_48 | vismap_grp_46-49_norm.nii.gz | 7 | NaN |
10 | 19_to_68 | vismap_grp_20-69_norm.nii.gz | 10 | NaN |
11 | 21_to_61 | vismap_grp_22-62_norm.nii.gz | 11 | NaN |
To plot the modification matrix information on a brain, we first need to some spatial locations to plot as nodes. For these, we calculate (an approprixation to) each atlas region’s centriod location:
>>> parc_img = cw_vca.region_nii
>>> parc_dat = parc_img.get_data()
>>> parc_vals = np.unique(parc_dat)[1:]
>>> ccs = {roival: find_xyz_cut_coords(nib.Nifti1Image((dat==roival).astype(int),img.affine),
>>> activation_threshold=0) for roival in roivals}
>>> ccs_arr = np.array(ccs.values())
Now plotting on a glass brain:
>>> fig, ax = plt.subplots(figsize=(16,6))
>>> plot_connectome(tpr_adj.values,ccs_arr,axes=ax,edge_threshold=0.2,colorbar=True,
>>> edge_cmap='Reds',edge_vmin=0,edge_vmax=1.,
>>> node_color='lightgrey',node_kwargs={'alpha': 0.4});
>>> #edge_vmin=0,edge_vmax=1)
>>> fig, ax = plt.subplots(figsize=(16,6))
>>> plot_connectome(cpr_adj.values,ccs_arr,axes=ax)
The lines in this figure show network connections (drawn as a straight line between two nodes) whose atlas image volume have a non-zero level of overlap with the synthetic lesion volume. Transparency and colour intensity indicate the magnitude of overlap. Thus the thickest, brightest red lines correspond to tracts that pass directly through the centre of the synthetic lesion mask, and for whom the lesion overlaps with a substantial amount of their total volume. Light, thinner lines, extending to/from the contralateral hemisphere and frontal cortex, correspond to connections with a proportionally smaller degree of lesion load.