GPU Acceleration in ITK v4: Overview and Implementation

 
ITK v4 summer meeting
June 28, 2011
Won-Ki Jeong
Harvard University
 
Overview
 
Introduction
Current status
Examples
Future work
 
GPU Acceleration
 
GPU as a fast co-processor
Massively parallel
Huge speed-up for certain types of problems
Physically independent system
Problems
Memory management
Process management
Implementation
 
Goals
 
Provide high-level GPU abstraction
GPU resource management
Transparent to existing ITK code
Pipeline and object factory support
Basic CMake setup
GPU module
 
Status
 
28 new GPU classes
GPU image
GPU manager classes
GPU filter base classes
6 example GPU image filters
Gradient anisotropic diffusion
Demons registration
 
Code Development
 
Github (most recent version)
https://graphor@github.com/graphor/ITK.git
Branch: GPU-Alpha
Gerrit
http://review.source.kitware.com/#change,1923
Waiting for review
 
CMake Setup
 
Enabling GPU module
ITK_USE_GPU
Module_ITK-GPUCommon
OpenCL source files will be copied to
${ITK_BINARY_DIR}/bin/OpenCL
${CMAKE_CURRENT_BINARY_DIR}/OpenCL
 
Naming Convention
 
File
itkGPU***
ex) itkMeanImageFilter -> itkGPUMeanImageFilter
Class
GPU***
ex) MeanImageFilter -> GPUMeanImageFilter
Method
GPU***
ex) GenerateData() -> GPUGenerateData()
 
GPU Core Classes
 
GPUContextManager
Manage context and command queues
GPUKernelManager
Load, compile, run GPU code
GPUDataManager
Data container for GPU
GPUImageDataManager
 
GPU Image Class
 
Derived from itk::Image
Compatible with existing ITK filters
GPUImageDataManager as a member
Separate GPU implementation from Image class
Graft(const GPUDataManager *)
Implicit (automatic) memory synchronization
Dirty flags
Time stamp (Modified())
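
The dirty-flag idea behind implicit synchronization can be sketched in plain C++. This is only an illustration under stated assumptions: SyncedBuffer and its methods are hypothetical names, the "GPU" copy is an ordinary host vector, and the time stamp mechanism is left out. It is not the GPUImage or GPUDataManager implementation.

```cpp
#include <utility>
#include <vector>

// Hypothetical two-copy buffer. Each side has a flag saying its copy is stale;
// a copy is refreshed lazily, only when that side is next accessed.
class SyncedBuffer
{
public:
  explicit SyncedBuffer(std::vector<float> data)
    : m_Cpu(std::move(data)) {}

  // Read access refreshes the requested copy if it is stale.
  const std::vector<float> & ReadOnCpu() { UpdateCpu(); return m_Cpu; }
  const std::vector<float> & ReadOnGpu() { UpdateGpu(); return m_Gpu; }

  // Write access also marks the other copy as stale.
  std::vector<float> & WriteOnCpu() { UpdateCpu(); m_GpuDirty = true; return m_Cpu; }
  std::vector<float> & WriteOnGpu() { UpdateGpu(); m_CpuDirty = true; return m_Gpu; }

  int TransferCount() const { return m_Transfers; }

private:
  void UpdateCpu()
  {
    if (m_CpuDirty) { m_Cpu = m_Gpu; m_CpuDirty = false; ++m_Transfers; }
  }
  void UpdateGpu()
  {
    if (m_GpuDirty) { m_Gpu = m_Cpu; m_GpuDirty = false; ++m_Transfers; }
  }

  std::vector<float> m_Cpu;
  std::vector<float> m_Gpu;           // stand-in for device memory
  bool               m_CpuDirty = false;
  bool               m_GpuDirty = true;  // nothing uploaded yet
  int                m_Transfers = 0;
};
```

Two consecutive writes on the GPU side trigger a single upload, and the download happens only when the CPU side next touches the data, which is the behavior the dirty flags are meant to provide.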
 
GPU Filter Classes
 
GPUImageToImageFilter
GPUInPlaceImageFilter
GPUBoxImageFilter
GPUNeighborhoodOperatorImageFilter
GPUDiscreteGaussianImageFilter
GPUMeanImageFilter
GPUUnaryFunctorImageFilter
GPUFiniteDifferenceImageFilter
GPUDenseFiniteDifferenceImageFilter
GPUAnisotropicDiffusionImageFilter
GPUPDEDeformableRegistrationFilter
GPUBinaryThresholdImageFilter
GPUGradientAnisotropicDiffusionImageFilter
GPUDemonsRegistrationFilter
 
GPU Functor/Function Classes
 
GPUFunctorBase
GPUBinaryThreshold
GPUFiniteDifferenceFunction
GPUAnisotropicDiffusionFunction
GPUPDEDeformableRegistrationFunction
GPUScalarAnisotropicDiffusionFunction
GPUGradientNDAnisotropicDiffusionFunction
GPUDemonsRegistrationFunction
 
GPUImageToImageFilter
 
Base class for GPU image filters
Extend existing ITK filters
Turn on/off GPU filter
IsGPUEnabled()
GPU filter implementation
GPUGenerateData()

template< class TInputImage, class TOutputImage, class TParentImageFilter >
class ITK_EXPORT GPUImageToImageFilter: public TParentImageFilter
{ ... }
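
A minimal sketch of the parent-as-template-parameter pattern, assuming the GPU filter falls back to the inherited CPU path when the GPU is switched off. CpuMeanFilter, GpuFilter, SetGPUEnabled() and the string return values are hypothetical stand-ins used only to make the dispatch visible; they are not ITK classes or signatures.

```cpp
#include <string>

// Hypothetical stand-in for an existing CPU filter.
class CpuMeanFilter
{
public:
  virtual ~CpuMeanFilter() = default;
  virtual std::string GenerateData() { return "cpu"; }
};

// Sketch of a GPU filter that inherits from the filter it extends.
template <class TParentImageFilter>
class GpuFilter : public TParentImageFilter
{
public:
  void SetGPUEnabled(bool on) { m_GPUEnabled = on; }
  bool IsGPUEnabled() const { return m_GPUEnabled; }

  // Dispatch: GPU path when enabled, otherwise the parent's CPU path.
  std::string GenerateData() override
  {
    return m_GPUEnabled ? GPUGenerateData() : TParentImageFilter::GenerateData();
  }

protected:
  virtual std::string GPUGenerateData() { return "gpu"; }

private:
  bool m_GPUEnabled = true;
};
```

Because the parent is a template argument, the same wrapper can extend any existing filter without changing that filter's code.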
 
GPUBinaryThresholdImageFilter
 
Example of functor-based filter
GPUUnaryFunctorImageFilter
GPU Functor
Per-pixel operator
SetGPUKernelArguments()
Set up GPU kernel arguments
Returns # of arguments that have been set
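
The reason the functor returns the number of arguments it set is that the filter continues numbering from that value. A rough sketch with a mock kernel manager follows; MockKernelManager, ThresholdFunctor and SetAllArguments are hypothetical names, and the image arguments that a real filter would also pass are omitted.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical recorder standing in for a kernel manager: it only logs
// the byte size stored in each argument slot.
struct MockKernelManager
{
  std::vector<std::size_t> argSizes;

  void SetKernelArg(int /*kernel*/, int index, std::size_t size, const void * /*value*/)
  {
    const std::size_t slot = static_cast<std::size_t>(index);
    if (slot >= argSizes.size())
    {
      argSizes.resize(slot + 1, 0);
    }
    argSizes[slot] = size;
  }
};

// Functor side: fill slots 0..3 and report how many were used.
template <class TInput, class TOutput>
struct ThresholdFunctor
{
  TInput  lower{};
  TInput  upper{};
  TOutput inside{};
  TOutput outside{};

  int SetGPUKernelArguments(MockKernelManager & km, int kernel) const
  {
    km.SetKernelArg(kernel, 0, sizeof(TInput), &lower);
    km.SetKernelArg(kernel, 1, sizeof(TInput), &upper);
    km.SetKernelArg(kernel, 2, sizeof(TOutput), &inside);
    km.SetKernelArg(kernel, 3, sizeof(TOutput), &outside);
    return 4;
  }
};

// Filter side: continue numbering from the functor's return value.
template <class TFunctor>
int SetAllArguments(MockKernelManager & km, const TFunctor & functor, const int (&imgSize)[3])
{
  const int kernel = 0;
  int argidx = functor.SetGPUKernelArguments(km, kernel);
  for (int i = 0; i < 3; ++i)
  {
    km.SetKernelArg(kernel, argidx++, sizeof(int), &imgSize[i]);
  }
  return argidx;  // total number of arguments set
}
```

With this convention a functor can add or remove parameters without the filter hard-coding argument indices.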
 
template< class TInput, class TOutput >
class GPUBinaryThreshold : public GPUFunctorBase
{
public:
  GPUBinaryThreshold()
  {
    m_LowerThreshold = NumericTraits< TInput >::NonpositiveMin();
    m_UpperThreshold = NumericTraits< TInput >::max();
    m_OutsideValue   = NumericTraits< TOutput >::Zero;
    m_InsideValue    = NumericTraits< TOutput >::max();
  }

  ....

  // set kernel arguments 0..3 and return the number of arguments set
  int SetGPUKernelArguments(GPUKernelManager::Pointer KernelManager, int KernelHandle)
  {
    KernelManager->SetKernelArg(KernelHandle, 0, sizeof(TInput),  &(m_LowerThreshold));
    KernelManager->SetKernelArg(KernelHandle, 1, sizeof(TInput),  &(m_UpperThreshold));
    KernelManager->SetKernelArg(KernelHandle, 2, sizeof(TOutput), &(m_InsideValue));
    KernelManager->SetKernelArg(KernelHandle, 3, sizeof(TOutput), &(m_OutsideValue));
    return 4;
  }
};
 
template< class TInputImage, class TOutputImage, class TFunction, class TParentImageFilter >
void
GPUUnaryFunctorImageFilter< TInputImage, TOutputImage, TFunction, TParentImageFilter >::GPUGenerateData()
{
  ....

  // the functor sets its own arguments and returns how many it set
  int argidx = (this->GetFunctor()).SetGPUKernelArguments(this->m_GPUKernelManager,
                                                          m_UnaryFunctorImageFilterGPUKernelHandle);

  // remaining arguments: input image, output image, image size
  this->m_GPUKernelManager->SetKernelArgWithImage(m_UnaryFunctorImageFilterGPUKernelHandle,
                                                  argidx++, inPtr->GetGPUDataManager());
  this->m_GPUKernelManager->SetKernelArgWithImage(m_UnaryFunctorImageFilterGPUKernelHandle,
                                                  argidx++, otPtr->GetGPUDataManager());
  for(int i=0; i<(int)TInputImage::ImageDimension; i++)
  {
    this->m_GPUKernelManager->SetKernelArg(m_UnaryFunctorImageFilterGPUKernelHandle,
                                           argidx++, sizeof(int), &(imgSize[i]));
  }

  // launch kernel
  this->m_GPUKernelManager->LaunchKernel(m_UnaryFunctorImageFilterGPUKernelHandle,
                                         ImageDim, globalSize, localSize );
}
 
GPUNeighborhoodOperatorImageFilter
 
Pixel-wise inner product of neighborhood and operator coefficients
Convolution
__constant GPU buffer for coefficients
GPU Discrete Gaussian Filter
Uses GPUNeighborhoodOperatorImageFilter with a 1D Gaussian operator per axis
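
As a CPU-side illustration of what one 1D pass computes, the sketch below takes the inner product of each pixel's neighborhood with the operator coefficients. ApplyOperatorAt and ApplyOperator are hypothetical helper names, and clamping at the borders is an assumed boundary condition; the GPU kernel itself is not shown in the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Value at one pixel: inner product of its neighborhood and the coefficients.
// Indices outside the signal are clamped to the nearest edge (assumption).
float ApplyOperatorAt(const std::vector<float> & signal,
                      const std::vector<float> & coeffs,
                      std::size_t center)
{
  assert(!signal.empty() && coeffs.size() % 2 == 1);
  const long radius = static_cast<long>(coeffs.size() / 2);
  const long last   = static_cast<long>(signal.size()) - 1;
  float sum = 0.0f;
  for (long k = -radius; k <= radius; ++k)
  {
    long idx = static_cast<long>(center) + k;
    if (idx < 0)    { idx = 0; }
    if (idx > last) { idx = last; }
    sum += signal[static_cast<std::size_t>(idx)] *
           coeffs[static_cast<std::size_t>(k + radius)];
  }
  return sum;
}

// One 1D pass over the whole signal. A separable filter such as the
// discrete Gaussian would run one such pass per image axis.
std::vector<float> ApplyOperator(const std::vector<float> & signal,
                                 const std::vector<float> & coeffs)
{
  std::vector<float> out(signal.size());
  for (std::size_t i = 0; i < signal.size(); ++i)
  {
    out[i] = ApplyOperatorAt(signal, coeffs, i);
  }
  return out;
}
```

On the GPU each output pixel would typically be handled by one work-item, with the small coefficient array shared by all of them, which is why a constant buffer is a natural fit.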
 
GPUFiniteDifferenceImageFilter
 
Base class for GPU finite difference filters
GPUGradientAnisotropicDiffusionImageFilter
GPUDemonsRegistrationFilter
New virtual methods
GPUApplyUpdate()
GPUCalculateChange()
Need finite difference function
 
GPUFiniteDifferenceFunction
 
Base class for GPU finite difference functions
GPUGradientNDAnisotropicDiffusionFunction
GPUDemonsRegistrationFunction
New virtual method
GPUComputeUpdate()
Compute update buffer using GPU kernel
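
The split between computing an update buffer and applying it can be sketched on the CPU with a 1D diffusion example. DiffusionSolver is a hypothetical class, the fixed time step of 0.25 and the zero-flux boundaries are assumptions, and the method names only mirror the roles of GPUCalculateChange() and GPUApplyUpdate().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct DiffusionSolver
{
  std::vector<float> image;
  std::vector<float> update;

  // Fill the update buffer with the discrete Laplacian (zero-flux ends)
  // and return the time step to use for this iteration.
  float CalculateChange()
  {
    const std::size_t n = image.size();
    assert(n >= 2);
    update.assign(n, 0.0f);
    for (std::size_t i = 0; i < n; ++i)
    {
      const float left  = image[i == 0 ? 0 : i - 1];
      const float right = image[i + 1 == n ? i : i + 1];
      update[i] = left - 2.0f * image[i] + right;
    }
    return 0.25f;
  }

  // Add the scaled update buffer to the image.
  void ApplyUpdate(float dt)
  {
    for (std::size_t i = 0; i < image.size(); ++i)
    {
      image[i] += dt * update[i];
    }
  }

  // Solver loop: compute the change, then apply it, once per iteration.
  void Run(int iterations)
  {
    for (int it = 0; it < iterations; ++it)
    {
      const float dt = CalculateChange();
      ApplyUpdate(dt);
    }
  }
};
```

Keeping both steps on the GPU means the image and the update buffer can stay in device memory for the whole iteration loop.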
 
GPUGradientAnisotropicDiffusionImageFilter
 
GPUScalarAnisotropicDiffusionFunction
New virtual method
GPUCalculateAverageGradientMagnitudeSquared()
GPUGradientNDAnisotropicDiffusionFunction
GPU function for gradient-based anisotropic diffusion
 
GPUDemonsRegistrationFilter
 
Baohua from UPenn
New method
GPUSmoothDeformationField()
GPUReduction
 
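Performance

          Binary Threshold   Anisotropic Diffusion   Gaussian   Mean
CPU 1     0.09346            0.7696                  24.68      4.069
CPU 2     0.0408             0.7546                  13.83      2.086
CPU 3     0.02865            0.6986                  10.12      1.542
CPU 4     0.02313            0.763                   9.14       1.572
GPU       0.019              0.0532                  0.46       0.059
Speed up  1.2~4.9x           13~14x                  19~53x     26~68x

Intel Xeon Quad Core 3.2GHz CPU vs. NVIDIA GTX 480 GPU
256x256x100 CT volume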
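
The reported speed-up ranges are the ratios of the slowest and fastest CPU timings to the GPU timing. A small sketch of that arithmetic follows; SpeedUpRange is a hypothetical helper, not part of the benchmark code.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Smallest and largest CPU/GPU time ratio over a set of CPU runs.
std::pair<double, double> SpeedUpRange(const std::vector<double> & cpuTimes, double gpuTime)
{
  assert(!cpuTimes.empty() && gpuTime > 0.0);
  const auto mm = std::minmax_element(cpuTimes.begin(), cpuTimes.end());
  return { *mm.first / gpuTime, *mm.second / gpuTime };
}
```

For the mean filter timings reported for this benchmark (CPU runs of 4.069, 2.086, 1.542 and 1.572 against 0.059 on the GPU), the ratios fall between roughly 26x and 69x, consistent with the quoted 26~68x once the upper bound is truncated.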
 
Create Your Own GPU Image Filter
 
Step 1: Derive your filter from GPUImageToImageFilter, using an existing ITK image filter as the parent filter type
Step 2: Load and compile GPU source code and create kernels in the constructor
Step 3: Implement the filter by calling GPU kernels in GPUGenerateData()
 
Example: GPUMeanImageFilter
 
Step 1: Class declaration
 
template< class TInputImage, class TOutputImage >
class ITK_EXPORT GPUMeanImageFilter :
  public GPUImageToImageFilter< TInputImage, TOutputImage,
                                MeanImageFilter< TInputImage, TOutputImage > >
{ ... }
 
Step 2: Constructor

template< class TInputImage, class TOutputImage >
GPUMeanImageFilter< TInputImage, TOutputImage >::GPUMeanImageFilter()
{
  // preprocessor definitions passed to the OpenCL compiler
  std::ostringstream defines;
  defines << "#define DIM_" << TInputImage::ImageDimension << "\n";
  defines << "#define PIXELTYPE ";
  GetTypenameInString( typeid( typename TInputImage::PixelType ), defines );

  // OpenCL source path
  std::string oclSrcPath = "./../OpenCL/GPUMeanImageFilter.cl";

  // load and build OpenCL program
  m_KernelManager->LoadProgramFromFile( oclSrcPath.c_str(), defines.str().c_str() );

  // create GPU kernel
  m_KernelHandle = m_KernelManager->CreateKernel("MeanFilter");
}
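
The defines string lets one OpenCL source file serve several dimensions and pixel types. Below is a sketch of how such a string might be assembled; OpenCLTypeName and BuildKernelDefines are hypothetical helpers standing in for whatever GetTypenameInString() does, and only three pixel types are mapped.

```cpp
#include <sstream>
#include <string>

// Hypothetical mapping from a C++ pixel type to an OpenCL C type name.
template <class T> struct OpenCLTypeName;
template <> struct OpenCLTypeName<float>         { static const char * Get() { return "float"; } };
template <> struct OpenCLTypeName<unsigned char> { static const char * Get() { return "uchar"; } };
template <> struct OpenCLTypeName<short>         { static const char * Get() { return "short"; } };

// Build the preprocessor header prepended to the kernel source.
template <class TPixel, unsigned int VDimension>
std::string BuildKernelDefines()
{
  std::ostringstream defines;
  defines << "#define DIM_" << VDimension << "\n";
  defines << "#define PIXELTYPE " << OpenCLTypeName<TPixel>::Get() << "\n";
  return defines.str();
}
```

The kernel source can then select code with #ifdef DIM_3 and declare its buffers as PIXELTYPE pointers, so the program is specialized at build time rather than per launch.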
 
Step 3: GPUGenerateData()

template< class TInputImage, class TOutputImage >
void
GPUMeanImageFilter< TInputImage, TOutputImage >::GPUGenerateData()
{
  typedef typename itk::GPUTraits< TInputImage >::Type  GPUInputImage;
  typedef typename itk::GPUTraits< TOutputImage >::Type GPUOutputImage;

  // get input & output image pointers
  typename GPUInputImage::Pointer inPtr =
    dynamic_cast< GPUInputImage * >( this->ProcessObject::GetInput(0) );
  typename GPUOutputImage::Pointer otPtr =
    dynamic_cast< GPUOutputImage * >( this->ProcessObject::GetOutput(0) );
  typename GPUOutputImage::SizeType outSize = otPtr->GetLargestPossibleRegion().GetSize();

  int radius[3], imgSize[3];
  for(int i=0; i<(int)TInputImage::ImageDimension; i++)
  {
    radius[i]  = (this->GetRadius())[i];
    imgSize[i] = outSize[i];
  }

  // round the global work size up to a multiple of the local work size
  size_t localSize[3], globalSize[3];
  localSize[0] = localSize[1] = localSize[2] = 8;
  for(int i=0; i<(int)TInputImage::ImageDimension; i++)
  {
    globalSize[i] = localSize[i]*(unsigned int)ceil((float)outSize[i]/(float)localSize[i]);
  }

  // kernel arguments set up
  int argidx = 0;
  m_KernelManager->SetKernelArgWithImage(m_KernelHandle, argidx++, inPtr->GetGPUDataManager());
  m_KernelManager->SetKernelArgWithImage(m_KernelHandle, argidx++, otPtr->GetGPUDataManager());

  for(int i=0; i<(int)TInputImage::ImageDimension; i++)
    m_KernelManager->SetKernelArg(m_KernelHandle, argidx++, sizeof(int), &(radius[i]));
  for(int i=0; i<(int)TInputImage::ImageDimension; i++)
    m_KernelManager->SetKernelArg(m_KernelHandle, argidx++, sizeof(int), &(imgSize[i]));

  // launch kernel
  m_KernelManager->LaunchKernel(m_KernelHandle, (int)TInputImage::ImageDimension,
                                globalSize, localSize);
}
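
The global work size is rounded up because OpenCL 1.x requires it to be a multiple of the local work size in each dimension. An integer-only version of the same rounding might look like the sketch below; RoundUpToMultiple is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>

// Smallest multiple of `local` that is greater than or equal to `size`.
std::size_t RoundUpToMultiple(std::size_t size, std::size_t local)
{
  assert(local > 0);
  return ((size + local - 1) / local) * local;
}
```

For a 256x256x100 volume with a local size of 8 this gives 256, 256 and 104, so the kernel has to compare each work-item's index with the image size (passed here as imgSize) and skip the work-items that fall outside the image.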

Pipeline Support

Allow combining CPU and GPU filters
Efficient CPU/GPU synchronization

ReaderType::Pointer reader = ReaderType::New();
WriterType::Pointer writer = WriterType::New();
GPUMeanFilterType::Pointer filter1 = GPUMeanFilterType::New();
GPUMeanFilterType::Pointer filter2 = GPUMeanFilterType::New();
ThresholdFilterType::Pointer filter3 = ThresholdFilterType::New();

filter1->SetInput( reader->GetOutput() );  // copy CPU->GPU implicitly
filter2->SetInput( filter1->GetOutput() );
filter3->SetInput( filter2->GetOutput() );
writer->SetInput( filter3->GetOutput() );  // copy GPU->CPU implicitly
writer->Update();
 
Object Factory Support
 
Create GPU object when possible
No need to explicitly define GPU objects
 
// register object factory for GPU image and filter objects
ObjectFactoryBase::RegisterFactory(GPUImageFactory::New());
ObjectFactoryBase::RegisterFactory(GPUMeanImageFilterFactory::New());

typedef itk::Image< InputPixelType,  2 >  InputImageType;
typedef itk::Image< OutputPixelType, 2 >  OutputImageType;

typedef itk::MeanImageFilter< InputImageType, OutputImageType > MeanFilterType;
MeanFilterType::Pointer filter = MeanFilterType::New();
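
How a registered factory can make an unchanged New() call return a GPU object is sketched below in plain C++. OverrideRegistry, Filter, MeanFilter and GpuMeanFilter are hypothetical stand-ins, and the lookup by class-name string is an assumption for illustration; this is not the ObjectFactoryBase implementation.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

struct Filter
{
  virtual ~Filter() = default;
  virtual std::string Name() const = 0;
};

// Hypothetical registry of overrides, keyed by the class being created.
class OverrideRegistry
{
public:
  using Creator = std::function<std::unique_ptr<Filter>()>;

  static void Register(const std::string & className, Creator creator)
  {
    Table()[className] = std::move(creator);
  }

  // Returns an override instance, or a null pointer if none is registered.
  static std::unique_ptr<Filter> Create(const std::string & className)
  {
    const auto it = Table().find(className);
    if (it == Table().end())
    {
      return nullptr;
    }
    return it->second();
  }

private:
  static std::map<std::string, Creator> & Table()
  {
    static std::map<std::string, Creator> table;
    return table;
  }
};

struct MeanFilter : Filter
{
  std::string Name() const override { return "MeanFilter"; }

  // New() asks the registry first and falls back to the CPU class.
  static std::unique_ptr<Filter> New()
  {
    if (auto overridden = OverrideRegistry::Create("MeanFilter"))
    {
      return overridden;
    }
    return std::unique_ptr<Filter>(new MeanFilter);
  }
};

struct GpuMeanFilter : MeanFilter
{
  std::string Name() const override { return "GpuMeanFilter"; }
};
```

Since the GPU class derives from the CPU class, code written against the CPU type keeps compiling and simply receives the GPU version once the override is registered.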
 
Type Casting

An Image must be cast to GPUImage to get automatic synchronization in a non-pipelined workflow that uses the object factory
Use GPUTraits

template <class T> class GPUTraits
{
public:
  typedef T   Type;
};

template <class T, unsigned int D> class GPUTraits< Image< T, D > >
{
public:
  typedef GPUImage<T,D>   Type;
};

InputImageType::Pointer img;
typedef itk::GPUTraits< InputImageType >::Type GPUImageType;
GPUImageType::Pointer otPtr = dynamic_cast< GPUImageType* >( img.GetPointer() );
 
Future Work
 
Multi-GPU support
GPUThreadedGenerateData()
GPUImage internal types
Image (texture)
GPU ND Neighbor Iterator
 
Discussion