Compute lesion overlap

# ConWhAt stuff
from conwhat import VolConnAtlas,StreamConnAtlas,VolTractAtlas,StreamTractAtlas
from conwhat.viz.volume import plot_vol_scatter

# Neuroimaging stuff
import nibabel as nib
from nilearn.plotting import (plot_stat_map,plot_surf_roi,plot_roi,
                             plot_connectome,find_xyz_cut_coords)
from nilearn.image import resample_to_img

# Viz stuff
%matplotlib inline
from matplotlib import pyplot as plt
import seaborn as sns

# Generic stuff
import glob, numpy as np, pandas as pd, networkx as nx
from datetime import datetime

We now use the synthetic lesion constructed in the previous example in a ConWhAt lesion analysis.

lesion_file = 'synthetic_lesion_20mm_sphere_-46_-60_6.nii.gz' # we created this file from scratch in the previous example

Take another quick look at this mask:

lesion_img = nib.load(lesion_file)
plot_roi(lesion_file);
../_images/output_5_0.png

Since our lesion mask does not (by construction) have a huge amount of spatial detail, it makes sense to use one of the lower-resolution atlases. As one might expect, computation is considerably faster for lower-resolution atlases.

>>> cw_atlases_dir = '/global/scratch/hpc3230/Data/conwhat_atlases'  # change this accordingly
>>> atlas_name = 'CWL2k8Sc33Vol3d100s_v01'
>>> atlas_dir = '%s/%s' %(cw_atlases_dir, atlas_name)

See the previous tutorial on ‘exploring the conwhat atlases’ for more info on how to examine the components of a given atlas in ConWhAt.

Initialize the atlas

>>> cw_vca = VolConnAtlas(atlas_dir=atlas_dir)
loading file mapping
loading vol bbox
loading connectivity

Choose which connections to evaluate.

This is normally an array of numbers indexing entries in cw_vca.vfms.

Pre-defining connection subsets is a useful way of speeding up large analyses, especially if one is only interested in connections between specific sets of regions.

As we are using a relatively small atlas, and our lesion is not too extensive, we can assess all connections.

>>> idxs = 'all' # alternatively, something like range(100), which selects the first 100 connections (rows in .vfms)
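As a minimal sketch of pre-defining a connection subset, the snippet below picks out the connections that touch a chosen set of regions. It uses a small toy table standing in for cw_vca.vfms; the rows, the `regions_of_interest` set, and the helper column names `src` and `trg` are illustrative assumptions, and only the 'A_to_B' naming pattern is taken from the atlas listing shown later in this example.

```python
import pandas as pd

# Toy stand-in for cw_vca.vfms (hypothetical rows; only the 'A_to_B'
# name format mirrors the real table).
toy_vfms = pd.DataFrame(
    {"name": ["61_to_80", "18_to_19", "45_to_48", "19_to_68", "21_to_61"]},
    index=pd.Index([0, 3, 7, 10, 11], name="idx"),
)

# Split 'A_to_B' into integer source / target region labels.
regions = toy_vfms["name"].str.split("_to_", expand=True).astype(int)
regions.columns = ["src", "trg"]

# Keep only connections that involve at least one region of interest.
regions_of_interest = {19, 61}  # hypothetical choice
keep = regions["src"].isin(regions_of_interest) | regions["trg"].isin(regions_of_interest)
subset_idxs = toy_vfms.index[keep].tolist()

print(subset_idxs)
```

A list like `subset_idxs` could then be used in place of 'all', assuming the index labels are what the analysis expects.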

Now, compute lesion overlap statistics.

>>> jlc_dir = '/global/scratch/hpc3230/joblib_cache_dir' # this is the cache dir where joblib writes temporary files
>>> lo_df,lo_nx = cw_vca.compute_hit_stats(lesion_file,idxs,n_jobs=4,joblib_cache_dir=jlc_dir)
computing hit stats for roi synthetic_lesion_20mm_sphere_-46_-60_6.nii.gz

This takes about 20 minutes to run.

cw_vca.compute_hit_stats() returns a pandas dataframe, lo_df, and a networkx object, lo_nx.

Both contain mostly the same information, which is sometimes more useful in one of these formats and sometimes in the other.

lo_df is a table, with rows corresponding to each connection, and columns for each of a wide set of statistical metrics for evaluating sensitivity and specificity of binary hit/miss data:

>>> lo_df.head()
metric ACC BM F1 FDR FN FNR FP FPR Kappa MCC MK NPV PPV TN TNR TP TPR corr_nothr corr_thr corr_thrbin
idx
0 0.990646 0.104859 0.098135 0.911501 29696.0 0.889874 37851.0 0.005266 0.330534 0.094054 0.084363 0.995864 0.088499 7149810.0 0.994734 3675.0 0.110126 0.042205 0.042205 0.094054
3 0.987324 0.011683 0.014279 0.988855 32708.0 0.980132 58828.0 0.008185 0.329134 0.008766 0.006577 0.995433 0.011145 7128833.0 0.991815 663.0 0.019868 -0.001487 -0.001487 0.008766
7 0.987160 -0.006617 0.001185 0.999075 33316.0 0.998352 59404.0 0.008265 0.329023 -0.004966 -0.003727 0.995348 0.000925 7128257.0 0.991735 55.0 0.001648 -0.003549 -0.003549 -0.004966
10 0.994367 -0.000926 0.000147 0.999589 33368.0 0.999910 7305.0 0.001016 0.331450 -0.001976 -0.004215 0.995374 0.000411 7180356.0 0.998984 3.0 0.000090 -0.001975 -0.001975 -0.001976
11 0.989105 0.048907 0.044941 0.962227 31520.0 0.944533 47152.0 0.006560 0.329846 0.040403 0.033378 0.995605 0.037773 7140509.0 0.993440 1851.0 0.055467 0.017664 0.017664 0.040403

Typically we will be mainly interested in two of these metric scores:

TPR - True positive (i.e. hit) rate: number of true positives, divided by number of true positives + number of false negatives

corr_thrbin - Pearson correlation between the lesion image and the thresholded, binarized connectome edge image (group-level visitation map)

>>> lo_df[['TPR', 'corr_thrbin']].iloc[:10].T
idx                 0         3         7        10        11        13        14        15        18        19
metric
TPR          0.110126  0.019868  0.001648  0.000090  0.055467  0.002128  0.000569  0.000000  0.098469  0.023523
corr_thrbin  0.094054  0.008766 -0.004966 -0.001976  0.040403  0.005801  0.000641 -0.002543  0.169234  0.029414
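To make the two metrics concrete, here is a small sketch that computes them by hand on toy binary arrays. The arrays are invented for illustration. Treating the lesion as the reference image is an assumption, though it is consistent with the lo_df.head() output above, where TP + FN is the same (33371) in every row shown. The sketch also checks that the Pearson correlation of two binary images equals the Matthews correlation coefficient, which matches the identical MCC and corr_thrbin columns in that output.

```python
import numpy as np

# Toy flattened binary images (hypothetical values).
lesion = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=bool)  # reference image
tract = np.array([1, 0, 0, 0, 1, 1, 0, 0], dtype=bool)   # binarized connection map

# Confusion-matrix counts, with the lesion as the reference.
tp = int(np.sum(lesion & tract))    # lesion voxels covered by the connection
fn = int(np.sum(lesion & ~tract))   # lesion voxels missed by the connection
fp = int(np.sum(~lesion & tract))   # connection voxels outside the lesion
tn = int(np.sum(~lesion & ~tract))  # voxels in neither image

# True positive rate: TP / (TP + FN).
tpr = tp / (tp + fn)

# Pearson correlation between the two binary images.
corr = np.corrcoef(lesion.astype(float), tract.astype(float))[0, 1]

# Matthews correlation coefficient from the same counts.
mcc = (tp * tn - fp * fn) / np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

print(tpr, corr, mcc)
```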

We can also arrange these numbers as a ‘modification matrix’ (a region-by-region connectivity matrix):

>>> tpr_adj = nx.to_pandas_adjacency(lo_nx,weight='TPR')
>>> cpr_adj = nx.to_pandas_adjacency(lo_nx,weight='corr_thrbin')
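As a rough picture of what this conversion amounts to, the sketch below builds a symmetric region-by-region matrix from a short edge list using pandas alone. The three TPR weights and region pairs are taken from the tables in this example; using the region numbers as node labels is an assumption about how lo_nx is labelled.

```python
import pandas as pd

# (region_a, region_b, TPR) triples; labels are assumed node names.
edges = [(61, 80, 0.110126), (18, 19, 0.019868), (45, 48, 0.001648)]

# One row and one column per region that appears in any edge.
nodes = sorted({n for a, b, _ in edges for n in (a, b)})
adj = pd.DataFrame(0.0, index=nodes, columns=nodes)

# Fill both triangles so the matrix is symmetric (undirected connections).
for a, b, w in edges:
    adj.loc[a, b] = w
    adj.loc[b, a] = w

print(adj)
```

Pairs with no entry in the edge list stay at zero, which is why the heatmaps of such matrices are mostly empty.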

These two maps are, unsurprisingly, very similar:

>>> np.corrcoef(tpr_adj.values.ravel(), cpr_adj.values.ravel())
array([[1.        , 0.96271946],
       [0.96271946, 1.        ]])
>>> fig, ax = plt.subplots(ncols=2, figsize=(12,4))
>>> sns.heatmap(tpr_adj,xticklabels='',yticklabels='',vmin=0,vmax=0.5,ax=ax[0]);
>>> sns.heatmap(cpr_adj,xticklabels='',yticklabels='',vmin=0,vmax=0.5,ax=ax[1]);
../_images/output_24_0.png

(…with an alternative color scheme…)

>>> fig, ax = plt.subplots(ncols=2, figsize=(12,4))
>>> sns.heatmap(tpr_adj, xticklabels='',yticklabels='',cmap='Reds',
>>>                   mask=tpr_adj.values==0,vmin=0,vmax=0.5,ax=ax[0]);
>>> sns.heatmap(cpr_adj,xticklabels='',yticklabels='',cmap='Reds',
>>>                   mask=cpr_adj.values==0,vmin=0,vmax=0.5,ax=ax[1]);
../_images/output_26_0.png

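We can look up the atlas entries (name and image file) for the evaluated connections directly. The listing below is in index order; sort lo_df by TPR beforehand to list the most affected (greatest % overlap) connections first: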

>>> cw_vca.vfms.loc[lo_df.index].head()
         name                      nii_file  nii_file_id  4dvolind
idx
0    61_to_80  vismap_grp_62-81_norm.nii.gz            0       NaN
3    18_to_19  vismap_grp_19-20_norm.nii.gz            3       NaN
7    45_to_48  vismap_grp_46-49_norm.nii.gz            7       NaN
10   19_to_68  vismap_grp_20-69_norm.nii.gz           10       NaN
11   21_to_61  vismap_grp_22-62_norm.nii.gz           11       NaN
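To rank connections by overlap rather than by index, one option is to sort on TPR before looking up the names. The sketch below does this on two small toy tables whose values are copied from the outputs shown above; `toy_lo_df` and `toy_vfms` are stand-ins for lo_df and cw_vca.vfms, not the real objects.

```python
import pandas as pd

idx = pd.Index([0, 3, 7, 10, 11], name="idx")

# TPR values copied from the lo_df output above.
toy_lo_df = pd.DataFrame(
    {"TPR": [0.110126, 0.019868, 0.001648, 0.000090, 0.055467]}, index=idx
)

# Connection names copied from the vfms listing above.
toy_vfms = pd.DataFrame(
    {"name": ["61_to_80", "18_to_19", "45_to_48", "19_to_68", "21_to_61"]}, index=idx
)

# Sort by overlap (largest first), then attach the connection names.
ranked = toy_lo_df.sort_values("TPR", ascending=False).join(toy_vfms["name"])

print(ranked)
```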

To plot the modification matrix information on a brain, we first need some spatial locations to plot as nodes. For these, we calculate (an approximation to) each atlas region’s centroid location:

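>>> parc_img = cw_vca.region_nii
>>> parc_dat = parc_img.get_fdata()  # get_data() is deprecated in newer nibabel releases
>>> parc_vals = np.unique(parc_dat)[1:]  # skip the first (lowest) value, typically the 0 background label

>>> # one binary mask image per region; uint8 avoids passing int64 data to Nifti1Image
>>> ccs = {roival: find_xyz_cut_coords(nib.Nifti1Image((parc_dat==roival).astype(np.uint8),parc_img.affine),
>>>                                   activation_threshold=0) for roival in parc_vals}
>>> ccs_arr = np.array(list(ccs.values()))  # list() is needed on Python 3, where .values() is a view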
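For intuition about what a region centroid in world coordinates is, here is a simpler alternative written with numpy only: average the voxel indices of each label, then map the result through the image affine. This is not what find_xyz_cut_coords does internally; the function name `region_centroids`, the toy parcellation, and the 2 mm affine are all hypothetical.

```python
import numpy as np

# Toy 4x4x4 parcellation with two labelled regions (0 = background).
parc = np.zeros((4, 4, 4), dtype=int)
parc[0:2, 0:2, 0:2] = 1
parc[2:4, 2:4, 2:4] = 2

# Hypothetical affine: 2 mm isotropic voxels, origin at voxel (0, 0, 0).
affine = np.diag([2.0, 2.0, 2.0, 1.0])


def region_centroids(parc, affine):
    """Return {label: xyz centroid in world space} for each non-zero label."""
    out = {}
    for val in np.unique(parc):
        if val == 0:
            continue  # skip background
        ijk = np.argwhere(parc == val).mean(axis=0)  # mean voxel index
        xyz = affine @ np.append(ijk, 1.0)           # voxel -> world coordinates
        out[int(val)] = xyz[:3]
    return out


centroids = region_centroids(parc, affine)
centroids_arr = np.array([centroids[k] for k in sorted(centroids)])

print(centroids_arr)
```

The resulting array has one row of x, y, z coordinates per region, which is the shape plot_connectome expects for its node coordinates.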

Now plotting on a glass brain:

>>> fig, ax = plt.subplots(figsize=(16,6))
>>> plot_connectome(tpr_adj.values,ccs_arr,axes=ax,edge_threshold=0.2,colorbar=True,
>>>                    edge_cmap='Reds',edge_vmin=0,edge_vmax=1.,
>>>                    node_color='lightgrey',node_kwargs={'alpha': 0.4});
>>> fig, ax = plt.subplots(figsize=(16,6))
>>> plot_connectome(cpr_adj.values,ccs_arr,axes=ax)
../_images/output_33_1.png

The lines in this figure show network connections (each drawn as a straight line between two nodes) whose atlas image volumes have a non-zero level of overlap with the synthetic lesion volume. Transparency and colour intensity indicate the magnitude of overlap. Thus the thickest, brightest red lines correspond to tracts that pass directly through the centre of the synthetic lesion mask, and for which the lesion overlaps with a substantial amount of their total volume. Lighter, thinner lines, extending to/from the contralateral hemisphere and frontal cortex, correspond to connections with a proportionally smaller degree of lesion load.
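The first plot_connectome call above passes edge_threshold=0.2. As a rough sketch of the effect of a numeric threshold, the snippet below lists which edges of a toy symmetric matrix exceed it. The matrix values are made up, and the strict "greater than" comparison is an assumption about how the threshold is applied.

```python
import numpy as np

# Toy symmetric overlap matrix for three regions (hypothetical values).
toy_adj = np.array([
    [0.00, 0.50, 0.10],
    [0.50, 0.00, 0.25],
    [0.10, 0.25, 0.00],
])
threshold = 0.2

# Each undirected edge appears once in the upper triangle (diagonal excluded).
rows, cols = np.triu_indices_from(toy_adj, k=1)
weights = toy_adj[rows, cols]

# Edges that would still be drawn under the assumed thresholding rule.
kept_pairs = [
    (int(i), int(j)) for i, j, w in zip(rows, cols, weights) if abs(w) > threshold
]

print(kept_pairs)
```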