What we will cover today

  • Overview of Nipype
  • Semantics of Nipype
  • Playing with interfaces
  • Creating workflows
  • Advanced features
  • Future directions

Presenter Notes

Why Nipype?

Presenter Notes

... one ring to bind them ...

Presenter Notes

Brain imaging: the process

From design to databases [1]

images/EDC.png

Presenter Notes

Brain imaging software

a plethora of evolving options

images/brainimagingsoftware.png

Presenter Notes

Brain imaging software: issues

  • different algorithms
  • different assumptions
  • different platforms
  • different interfaces
  • different file formats

Presenter Notes

Leads to many questions ...

neuroscientist:

  • which packages should I use?
  • why should I use these packages?
  • how do they differ?
  • how should I use these packages?

developer:

  • which package(s) should I develop for?
  • how do I disseminate my software?

Presenter Notes

... and more questions

How do we:

  • Install, use, maintain and test multiple packages
  • Reduce manual intervention
  • Train people
  • Tailor to specific projects
  • Develop new tools
  • Perform reproducible research
images/fmri.png

Presenter Notes

Solution requirements

Coming at it from a developer's perspective, we needed something that:

  • was lightweight
  • was scriptable
  • provided formal, common semantics
  • allowed interactive exploration
  • supported efficient batch processing
  • enabled rapid algorithm prototyping
  • was flexible and adaptive

Presenter Notes

Existing technologies

shell scripting:

Quick to write and powerful, but scripts tend to be application specific, scale poorly, and are not easy to port across different architectures.

make/CMake:

Similar in concept to workflow execution in Nipype, but again limited by the need for command line tools and by a lack of flexibility in scaling across hardware architectures (although see makeflow).

Octave/MATLAB:

Integration with other tools is ad hoc (i.e., via system calls) and dataflow is managed at a programmatic level. However, see PSOM, which offers a very nice alternative to some aspects of Nipype for Octave/MATLAB users.

Graphical options: (e.g., LONI pipeline)

Adding or reusing components across different projects requires XML manipulation or subscribing to specific databases.

Presenter Notes

We built Nipype in Python

Presenter Notes

Why Python?

  • easy to learn
  • coding style makes for easy readability
  • cross-platform
  • extensive infrastructure for
    • development and distribution
    • scientific computing
    • brain imaging
  • several institutions are adopting it in computer science classes

Presenter Notes

What can we use Python for?

  • scripting (like shell scripts, e.g., bash, csh)
  • making web sites (like these slides)
  • science (like R, MATLAB, IDL, Octave, Scilab)
  • etc.

You just need to know one language to do almost everything!

Presenter Notes

Scientific Python building blocks

Presenter Notes

Brain Imaging in Python

  • NiPy, an umbrella project for Neuroimaging in Python: http://nipy.org
    • DiPy, diffusion imaging
    • Nibabel, file reading and writing
    • NiPy, preprocessing and statistical routines
    • Nipype, interfaces and workflows
    • Nitime, time series analysis
    • PySurfer, Surface visualization
  • PyMVPA, machine learning for neuroimaging: http://pymvpa.org
  • PsychoPy, stimulus presentation: http://psychopy.org

Presenter Notes

What is Nipype?

Presenter Notes

Nipype architecture [2]

  • Interface
  • Engine
  • Executable Plugins
images/arch.png

Presenter Notes

Semantics: Interface

  • Interface: Wraps a program or function
images/arch.png

Presenter Notes

Semantics: Engine

  • Node/MapNode: Wraps an Interface for use in a Workflow that provides caching and other goodies (e.g., pseudo-sandbox)
  • Workflow: A graph or forest of graphs whose nodes are of type Node, MapNode or Workflow and whose edges represent data flow
images/arch.png

Presenter Notes

Semantics

  • Plugin: A component that describes how a Workflow should be executed
images/arch.png

Presenter Notes

Software interfaces

Currently supported (4-2-2012). Click here for latest

AFNI, ANTS, BRAINS, Camino, Camino-TrackVis, ConnectomeViewerToolkit,
dcm2nii, Diffusion Toolkit, FreeSurfer, FSL, MRtrix, Nipy, Nitime,
PyXNAT, Slicer, SPM

Most used/contributed policy!

Not every component of these packages is available.

Presenter Notes

Workflows

Properties:

  • processing pipeline is a directed acyclic graph (DAG)
  • nodes are processes
  • edges represent data flow
  • compact representation for any process
  • code and data separation
images/workflow.png
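
The DAG execution model above can be sketched in plain Python. This is a toy illustration only (the node names and edges are made up, and this is not Nipype's actual engine): nodes become runnable once all their upstream dependencies have finished.

```python
from collections import deque

# Toy workflow DAG: keys are nodes, values are downstream nodes.
edges = {'convert2nii': ['motion_correct'],
         'motion_correct': ['datasink'],
         'datasink': []}

def topological_order(edges):
    """Return one valid execution order for a DAG (Kahn's algorithm)."""
    indegree = {node: 0 for node in edges}
    for targets in edges.values():
        for t in targets:
            indegree[t] += 1
    # Nodes with no unmet dependencies are ready to run.
    ready = deque(n for n, d in indegree.items() if d == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for t in edges[node]:
            indegree[t] -= 1
            if indegree[t] == 0:
                ready.append(t)
    return order

execution_order = topological_order(edges)
```

Independent branches of the graph have no ordering constraint between them, which is exactly what lets the execution plugins run them in parallel.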

Presenter Notes

Execution Plugins

Allows seamless execution across many architectures

  • local
    • serially
    • multicore
  • clusters
    • Condor
    • PBS/Torque
    • SGE
    • SSH (via IPython)

Presenter Notes

How can I use Nipype?

  • Environment and installing
  • Nipype as a brain imaging library
  • Building and executing workflows
  • Contributing to Nipype

Presenter Notes

  • imperative style caching
  • Workflow concepts
  • Hello World! of workflows
  • Grabbing and Sinking
  • iterables and iterfields
  • Distributed computing
  • The Function interface
  • Config options
  • Debugging
  • actual workflows (resting, task, diffusion)

Installing and environment

Scientific Python:

  • Debian/Ubuntu/Scientific Fedora
  • Enthought Python Distribution (EPD)

Installing Nipype:

Running Nipype (Quickstart):

  • Ensure tools are installed and accessible
  • Nipype is a wrapper, not a substitute for AFNI, ANTS, FreeSurfer, FSL, SPM, NiPy, etc.

Presenter Notes

For today's tutorial

At MIT you can configure your environment as:

source /software/python/EPD/virtualenvs/7.2/nipype0.5/bin/activate
export TUT_DIR=/mindhive/scratch/mri_class/$LOGNAME/nipype-tutorial
mkdir -p $TUT_DIR
cd $TUT_DIR
ln -s /mindhive/xnat/data/nki_test_retest nki
ln -s /mindhive/xnat/data/openfmri/ds107 ds107
ln -s /mindhive/xnat/surfaces/nki_test_retest nki_surfaces
ln -s /mindhive/xnat/surfaces/openfmri/ds107 ds107_surfaces
module add torque
export ANTSPATH=/software/ANTS/versions/120325/bin/
export PATH=/software/common/bin:$ANTSPATH:$PATH
. fss 5.1.0
. /etc/fsl/4.1/fsl.sh

For our interactive session we will use IPython:

ipython notebook --pylab=inline

Presenter Notes

Tutorial data and subject ids

Presenter Notes

Hello nipype!

  • Nipype as a library
  • Imperative programming with caching
  • Workflow concepts
  • Hello World! of workflows
  • Data grabbing and sinking
  • Loops: iterables and iterfields
  • The IdentityInterface and Function interfaces
  • Config options, Debugging, Distributed computing

Presenter Notes

Nipype as a library

Importing functionality

>>> from nipype.interfaces.camino import DTIFit
>>> from nipype.interfaces.spm import Realign

Finding interface inputs and outputs and examples

>>> DTIFit.help()
>>> Realign.help()

Executing the interfaces

>>> fitter = DTIFit(scheme_file='A.sch',
                    in_file='data.bfloat')
>>> fitter.run()

>>> aligner = Realign(in_file='A.nii')
>>> aligner.run()

Presenter Notes

Work in a directory

import os
from shutil import copyfile
library_dir = os.path.join(os.getenv('TUT_DIR'), 'as_a_library')
os.mkdir(library_dir)
os.chdir(library_dir)

Presenter Notes

Using interfaces: comparison

We will use FreeSurfer to convert the file to uncompressed Nifti

from nipype.interfaces.freesurfer import MRIConvert
MRIConvert(in_file='../ds107/sub001/BOLD/task001_run001/bold.nii.gz',
           out_file='ds107.nii').run()

Normally:

$ mri_convert ../ds107/sub001/BOLD/task001_run001/bold.nii.gz
       ds107.nii

Shell script wins!

Presenter Notes

Using interfaces: more Interfaces

Import the motion-correction interfaces

from nipype.interfaces.spm import Realign
from nipype.interfaces.fsl import MCFLIRT

Run SPM first

>>> results1 = Realign(in_files='ds107.nii',
                       register_to_mean=False).run()
>>> ls
ds107.mat  ds107.nii  meands107.nii  pyscript_realign.m  rds107.mat
rds107.nii  rp_ds107.txt

Shell script goes into hiding. Of course it could do this too ;)

$ python -c "from nipype.interfaces.spm import Realign;
             Realign(...).run()"

Presenter Notes

Let's use FSL

but how?

>>> MCFLIRT.help()

or go to: MCFLIRT help

>>> results2 = MCFLIRT(in_file='ds107.nii', ref_vol=0,
                       save_plots=True).run()

Now we can look at some results

subplot(211);plot(genfromtxt('ds107_mcf.nii.gz.par')[:, 3:]);
title('FSL')
subplot(212);plot(genfromtxt('rp_ds107.txt')[:,:3]);title('SPM')

If I execute the MCFLIRT line again, well, it runs again!

Presenter Notes

Using Nipype caching

Setup

>>> from nipype.caching import Memory
>>> mem = Memory('.')

Create cacheable objects

>>> spm_realign = mem.cache(Realign)
>>> fsl_realign = mem.cache(MCFLIRT)

Execute interfaces

>>> spm_results = spm_realign(in_files='./as_a_library/ds107.nii',
                              register_to_mean=False)
>>> fsl_results = fsl_realign(in_file='./as_a_library/ds107.nii',
                              ref_vol=0, save_plots=True)

Compare

subplot(211);plot(genfromtxt(fsl_results.outputs.par_file)[:, 3:])
subplot(212);
plot(genfromtxt(spm_results.outputs.realignment_parameters)[:,:3])

Presenter Notes

More caching

Execute interfaces again

>>> spm_results = spm_realign(in_files='./as_a_library/ds107.nii',
                              register_to_mean=False)
>>> fsl_results = fsl_realign(in_file='./as_a_library/ds107.nii',
                              ref_vol=0, save_plots=True)

Output

120401-23:16:21,144 workflow INFO:
Executing node 43650b0cabb14ef502659398b944be8b in dir: /mindhive/gablab/satra/mri_class/nipype_mem/nipype-interfaces-spm-preprocess-Realign/43650b0cabb14ef502659398b944be8b
120401-23:16:21,145 workflow INFO:
Collecting precomputed outputs
120401-23:16:21,158 workflow INFO:
Executing node e91bcd85558ecd0a2786c9fdd2bcb65a in dir: /mindhive/gablab/satra/mri_class/nipype_mem/nipype-interfaces-fsl-preprocess-MCFLIRT/e91bcd85558ecd0a2786c9fdd2bcb65a
120401-23:16:21,159 workflow INFO:
Collecting precomputed outputs

Presenter Notes

More files to process

what if we had more files?

>>> from os.path import abspath as opap
>>> files = [opap('ds107/sub001/BOLD/task001_run001/bold.nii.gz'),
             opap('ds107/sub001/BOLD/task001_run002/bold.nii.gz')]
>>> fsl_results = fsl_realign(in_file=files, ref_vol=0,
                              save_plots=True)
>>> spm_results = spm_realign(in_files=files, register_to_mean=False)

They will both break, but for different reasons:

1. Interface incompatibility
2. File format

Fix the file format by converting first:
converter = mem.cache(MRIConvert)
newfiles = []
for idx, fname in enumerate(files):
    newfiles.append(converter(in_file=fname,
                              out_type='nii').outputs.out_file)

Presenter Notes

Workflow concepts

Where:

>>> from nipype.pipeline.engine import Node, MapNode, Workflow

Node:

>>> spm_realign = mem.cache(Realign)
>>> realign_spm = Node(Realign(), name='motion_correct')

Mapnode:

>>> realign_fsl = MapNode(MCFLIRT(), iterfield=['in_file'],
                          name='motion_correct_with_fsl')

Workflow:

>>> myflow = Workflow(name='realign')
>>> myflow.add_nodes([realign_spm, realign_fsl])

Presenter Notes

Workflow: set inputs and run

Node:

>>> realign_spm.inputs.in_files = newfiles
>>> realign_spm.inputs.register_to_mean = False
>>> realign_spm.run()

Mapnode:

>>> realign_fsl.inputs.in_file = files
>>> realign_fsl.inputs.ref_vol = 0
>>> realign_fsl.run()

Workflow:

>>> myflow = Workflow(name='realign')
>>> myflow.add_nodes([realign_spm, realign_fsl])
>>> myflow.base_dir = opap('.')
>>> myflow.run()

Presenter Notes

Workflow: setting inputs

Workflow:

>>> myflow = Workflow(name='realign')
>>> myflow.add_nodes([realign_spm, realign_fsl])
>>> myflow.base_dir = opap('.')
>>> myflow.inputs.motion_correct.in_files = newfiles
>>> myflow.inputs.motion_correct.register_to_mean = False
>>> myflow.inputs.motion_correct_with_fsl.in_file = files
>>> myflow.inputs.motion_correct_with_fsl.ref_vol = 0
>>> myflow.run()

Presenter Notes

"Hello World" of Nipype workflows

Create two nodes:

>>> convert2nii = MapNode(MRIConvert(out_type='nii'),
                          iterfield=['in_file'],
                          name='convert2nii')
>>> realign_spm = Node(Realign(), name='motion_correct')

Set inputs:

>>> convert2nii.inputs.in_file = files
>>> realign_spm.inputs.register_to_mean = False

Connect them up:

>>> realignflow = Workflow(name='realign_with_spm')
>>> realignflow.connect(convert2nii, 'out_file',
                        realign_spm, 'in_files')
>>> realignflow.base_dir = opap('.')
>>> realignflow.run()

Presenter Notes

Visualize the workflow

>>> realignflow.write_graph()
images/graph.dot.png
>>> realignflow.write_graph(graph2use='orig')
images/graph_detailed.dot.png

Presenter Notes

Data grabbing

Instead of assigning data ourselves, let's glob it

>>> from nipype.interfaces.io import DataGrabber
>>> ds = Node(DataGrabber(infields=['subject_id'],
                          outfields=['func']),
              name='datasource')
>>> ds.inputs.base_directory = opap('ds107')
>>> ds.inputs.template = '%s/BOLD/task001*/bold.nii.gz'

>>> ds.inputs.subject_id = 'sub001'
>>> ds.run().outputs
func = ['...mri_class/ds107/sub001/BOLD/task001_run001/bold.nii.gz',
        '...mri_class/ds107/sub001/BOLD/task001_run002/bold.nii.gz']

>>> ds.inputs.subject_id = 'sub049'
>>> ds.run().outputs
func = ['...mri_class/ds107/sub049/BOLD/task001_run001/bold.nii.gz',
        '...mri_class/ds107/sub049/BOLD/task001_run002/bold.nii.gz']

Presenter Notes

Multiple files

A little more practical usage

>>> ds = Node(DataGrabber(infields=['subject_id', 'task_id'],
                          outfields=['func', 'anat']),
              name='datasource')
>>> ds.inputs.base_directory = opap('ds107')
>>> ds.inputs.template = '*'
>>> ds.inputs.template_args = {'func': [['subject_id', 'task_id']],
                               'anat': [['subject_id']]}
>>> ds.inputs.field_template = \
                     {'func': '%s/BOLD/task%03d*/bold.nii.gz',
                      'anat': '%s/anatomy/highres001.nii.gz'}

>>> ds.inputs.subject_id = 'sub001'
>>> ds.inputs.task_id = 1
>>> ds.run().outputs
anat = '...mri_class/ds107/sub001/anatomy/highres001.nii.gz'
func = ['...mri_class/ds107/sub001/BOLD/task001_run001/bold.nii.gz',
        '...mri_class/ds107/sub001/BOLD/task001_run002/bold.nii.gz']

Presenter Notes

Loops: iterfield (MapNode)

MapNode + iterfield: runs underlying interface several times

>>> convert2nii = MapNode(MRIConvert(out_type='nii'),
                          iterfield=['in_file'],
                          name='convert2nii')
images/mapnode.png
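
Conceptually, a MapNode maps its interface over each element of the iterfield and collects the outputs as a list. A toy sketch (the lambda stands in for MRIConvert; this is not Nipype's internal code):

```python
def map_node(run_interface, in_files):
    """Toy MapNode: run the interface once per input file and
    collect the outputs in a matching list (illustrative only)."""
    return [run_interface(f) for f in in_files]

# Stand-in for MRIConvert(out_type='nii'): strip the .gz suffix.
outputs = map_node(lambda f: f[:-3] if f.endswith('.gz') else f,
                   ['task001_run001/bold.nii.gz',
                    'task001_run002/bold.nii.gz'])
```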

Presenter Notes

Loops: iterables (subgraph)

Workflow + iterables: runs subgraph several times, attribute not input

>>> multiworkflow = Workflow(name='iterables')
>>> ds.iterables = ('subject_id', ['sub001', 'sub049'])
>>> multiworkflow.add_nodes([ds])
>>> multiworkflow.run()
images/iterables.png
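
iterables, by contrast, replicate the entire downstream subgraph once per parameter value. A toy sketch (the subgraph here is a stand-in function, not Nipype's actual expansion logic):

```python
def run_with_iterables(subgraph, values):
    """Toy iterables: execute the whole downstream subgraph once
    per parameter value (illustrative only)."""
    return {value: subgraph(value) for value in values}

# Stand-in subgraph: "preprocess" one subject.
results = run_with_iterables(lambda subject_id: 'preprocessed-' + subject_id,
                             ['sub001', 'sub049'])
```

So a MapNode fans out over one node's inputs, while iterables fan out over everything downstream of the node that declares them.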

Presenter Notes

Reminder

>>> convert2nii = MapNode(MRIConvert(out_type='nii'),
                          iterfield=['in_file'],
                          name='convert2nii')
>>> realign_spm = Node(Realign(), name='motion_correct')

Set inputs:

>>> convert2nii.inputs.in_file = files
>>> realign_spm.inputs.register_to_mean = False

Connect them up:

>>> realignflow = Workflow(name='realign_with_spm')
>>> realignflow.connect(convert2nii, 'out_file',
                        realign_spm, 'in_files')

Presenter Notes

Connecting to computation

>>> ds = Node(DataGrabber(infields=['subject_id', 'task_id'],
                          outfields=['func']),
              name='datasource')
>>> ds.inputs.base_directory = opap('ds107')
>>> ds.inputs.template = '%s/BOLD/task%03d*/bold.nii.gz'
>>> ds.inputs.template_args = {'func': [['subject_id', 'task_id']]}
>>> ds.inputs.task_id = 1
>>> convert2nii = MapNode(MRIConvert(out_type='nii'),
                          iterfield=['in_file'],
                          name='convert2nii')
>>> realign_spm = Node(Realign(), name='motion_correct')
>>> realign_spm.inputs.register_to_mean = False

>>> connectedworkflow = Workflow(name='connectedtogether')
>>> ds.iterables = ('subject_id', ['sub001', 'sub049'])
>>> connectedworkflow.connect(ds, 'func', convert2nii, 'in_file')
>>> connectedworkflow.connect(convert2nii, 'out_file',
                              realign_spm, 'in_files')
>>> connectedworkflow.run()

Presenter Notes

Data sinking

Take output computed in a workflow out of it.

>>> sinker = Node(DataSink(), name='sinker')
>>> sinker.inputs.base_directory = opap('output')
>>> connectedworkflow.connect(realign_spm, 'realigned_files',
                              sinker, 'realigned')
>>> connectedworkflow.connect(realign_spm, 'realignment_parameters',
                              sinker, 'realigned.@parameters')

How to determine output location:

'base_directory/container/parameterization/destloc/filename'

destloc = string[[.[@]]string[[.[@]]string]] and
filename comes from the input to the connect statement.
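
The destloc rule can be illustrated with a small sketch: '.' separates folder levels, and a token starting with '@' does not create a subfolder of its own. This is an illustration of the behavior described above, not DataSink's actual implementation:

```python
import os

def destloc_to_path(base_directory, destloc, filename):
    """Sketch: map a DataSink-style destloc to an output path.

    '.' separates folder levels; a token starting with '@' does not
    create a folder, so its files land one level up.
    """
    folders = [tok for tok in destloc.split('.')
               if not tok.startswith('@')]
    return os.path.join(base_directory, *folders, filename)

# 'realigned.@parameters': files go under output/realigned/ with no
# extra 'parameters' subfolder.
path = destloc_to_path('output', 'realigned.@parameters', 'rp_ds107.txt')
```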

Presenter Notes

Putting it all together

iterables + MapNode + Node + Workflow + DataGrabber + DataSink

images/alltogether.png

Presenter Notes

Two utility interfaces

  1. IdentityInterface: Whatever comes in goes out
  2. Function: The do anything you want card

Presenter Notes

IdentityInterface

>>> from nipype.interfaces.utility import IdentityInterface
>>> subject_id = Node(IdentityInterface(fields=['subject_id']),
                      name='subject_id')
>>> subject_id.iterables = ('subject_id', [0, 1, 2, 3])

or my usual test mode

>>> subject_id.iterables = ('subject_id', subjects[:1])

or

>>> subject_id.iterables = ('subject_id', subjects[:10])

Presenter Notes

Function Interface

Nipype's "do anything you want" card!

>>> from nipype.interfaces.utility import Function

>>> def myfunc(input1, input2):
        """Add and subtract two inputs
        """
        return input1 + input2, input1 - input2

>>> calcfunc = Node(Function(input_names=['input1', 'input2'],
                             output_names = ['sum', 'difference'],
                             function=myfunc),
                    name='mycalc')
>>> calcfunc.inputs.input1 = 1
>>> calcfunc.inputs.input2 = 2
>>> res = calcfunc.run()
>>> res.outputs
sum = 3
difference = -1

Presenter Notes

Distributed computing

Normally calling run executes the workflow in series

>>> connectedworkflow.run()

but you can scale to a cluster very easily

>>> connectedworkflow.run('MultiProc', plugin_args={'n_procs': 4})
>>> connectedworkflow.run('PBS', plugin_args={'qsub_args': '-q many'})
>>> connectedworkflow.run('SGE', plugin_args={'qsub_args': '-q many'})
>>> connectedworkflow.run('Condor',
                           plugin_args={'qsub_args': '-q many'})
>>> connectedworkflow.run('IPython')

Requirement: shared filesystem

where art thou shell script?

Presenter Notes

Databases

>>> from nipype.interfaces.io import XNATSource
>>> from nipype.pipeline.engine import Node, Workflow
>>> from nipype.interfaces.fsl import BET

>>> dg = Node(XNATSource(infields=['subject_id', 'mpr_id'],
                         outfields=['struct'],
                         config='/Users/satra/xnatconfig'),
              name='xnatsource')
>>> dg.inputs.query_template = ('/projects/CENTRAL_OASIS_CS/subjects/'
                                '%s/experiments/%s_MR1/scans/mpr-%d/'
                                'resources/files')
>>> dg.inputs.query_template_args['struct'] = [['subject_id',
                                                'subject_id',
                                                'mpr_id']]
>>> dg.inputs.subject_id = 'OAS1_0002'
>>> dg.inputs.mpr_id = 1

>>> bet = Node(BET(), name='skull_stripper')
>>> wf = Workflow(name='testxnat')
>>> wf.base_dir = '/software/temp/xnattest'
>>> wf.connect(dg, ('struct', select_img), bet, 'in_file')

Presenter Notes

Databases

['/var/.../c67d371..._OAS1_0002_MR1_mpr-1_anon.img',
 '/var/.../c67d371..._OAS1_0002_MR1_mpr-1_anon.hdr',
 '/var/.../c67d371..._OAS1_0002_MR1_mpr-1_anon_sag_66.gif']
>>> wf.connect(dg, ('struct', select_img), bet, 'in_file')

>>> def select_img(central_list):
        for fname in central_list:
            if fname.endswith('img'):
                return fname

Presenter Notes

Miscellaneous topics

  1. Config options: controlling behavior
>>> from nipype import config, logging

>>> config.set_debug_mode()
>>> logging.update_logging()

>>> config.set('execution', 'keep_unnecessary_outputs', 'true')
  2. Reusing workflows
>>> from nipype.workflows.smri.freesurfer.utils import \
          create_getmask_flow

>>> getmask = create_getmask_flow()
>>> getmask.inputs.inputspec.source_file = 'mean.nii'
>>> getmask.inputs.inputspec.subject_id = 's1'
>>> getmask.inputs.inputspec.subjects_dir = '.'
>>> getmask.inputs.inputspec.contrast_type = 't2'
>>> getmask.run()

Presenter Notes

Where to go from here

Nipype website

Presenter Notes

Future Directions

  • Reproducible research (standards)
  • Scalability
    • AWS
    • Graph submission with depth first order
  • Social collaboration and workflow development
    • Google docs for scientific workflows

Presenter Notes

References

[1] Poline J, Breeze JL, Ghosh SS, Gorgolewski K, Halchenko YO, Hanke M, Haselgrove C, Helmer KG, Marcus DS, Poldrack RA, Schwartz Y, Ashburner J and Kennedy DN (2012). Data sharing in neuroimaging research. Front. Neuroinform. 6:9. http://dx.doi.org/10.3389/fninf.2012.00009
[2] Gorgolewski K, Burns CD, Madison C, Clark D, Halchenko YO, Waskom ML, Ghosh SS (2011). Nipype: a flexible, lightweight and extensible neuroimaging data processing framework in Python. Front. Neuroinform. 5:13. http://dx.doi.org/10.3389/fninf.2011.00013

Presenter Notes