GPU Acceleration in ITK v4 Overview

GPU Acceleration in ITK v4
ITK v4 
 
winter 
meeting
Feb 2
nd
 2011
Won-Ki Jeong, Harvard University
(wkjeong@seas.harvard.edu)
Overview
Introduction
Current status in GPU ITK v4
GPU managers
GPU image
GPU image filters
Examples
Future work
2
GPU Acceleration
GPU as a fast co-processor
Massively parallel
Huge speed up for certain types of problem
Physically independent system
Problems
Memory management
Process management
Implementation
3
Proposal
 
Provide high-
GPU data structure and filter
Developer only need to implement GPU kernel
ITK will do dirty jobs
GPU image filter framework
GPU filter template
Pipelining support
OpenCL
Industry standard
4
Quick Summary
Is GPU Draft Implementation done?
Yes: 
http://review.source.kitware.com/#change,800
What do we have now?
Basic GPU computing framework
GPU image and filter class
Pipeline and Object Factory supports
Basic CMake setup for GPU code
5
CMake Setup
ITK_USE_GPU
OFF by default
Self-contained in Code/GPU except a few minor
modification of existing files
OpenCL source file location
binary_dir/Code/GPU
binary_dir is written into
pathToOpenCLSourceCode.h
6
7
New GPU Classes
GPUContextManager
GPUDataManager
GPUImageDataManager
GPUKernelManager
GPUImage
GPUImageToImageFilter
GPUMeanImageFilter
8
GPU Context Manager
Global GPU resource manager
One instance per process
All GPU objects should have a pointer to it
Resources
Platforms
Devices
Contexts
Command queues
GetCommandQueue(), GetNumCommandQueue()
9
GPU Data Manager
Base class to manage GPU memory
GPU data container
Synchronization between CPU & GPU memory
10
Synchronization
Dirty flags
Lightweight
Pixel access functions in GPU images
Time stamp
Better to use sparingly
Pipeline
11
Data Manager Example
12
unsigned int arraySize = 100;
// create CPU memory
float *a = new float[arraySize];
// create GPU memory
GPUDataManager::Pointer b = GPUDataManager::New();
b->SetBufferSize(arraySize*sizeof(float));
b->SetCPUBufferPointer(a);
b->Allocate();
// change values in CPU memory
a[10] = 8;
// mark GPU as dirty and synchronize CPU -> GPU
b->SetGPUBufferDirty();
b->MakeUpToDate();
Data Manager Example (cont’d)
13
// change values in GPU memory
... (run GPU kernel)
// mark CPU as dirty and synchronize GPU -> CPU
b->SetCPUBufferDirty();
b->MakeUpToDate();
Create Your Own GPU Data Manager
14
GPUDataManager
GPUImageDataManager
GPUMeshDataManager
GPUVideoDataManager
....
....
GPU Image
Derived from 
itk::Image
Compatible to existing ITK filters
GPUImageDataManager as a member
Separate GPU implementation from Image class
Implicit(automatic) synchronization
Override CPU buffer access functions to
properly set the dirty buffer flag
Provide a single view of CPU/GPU memory
15
GPU Image
16
itk::Image::GPUImage
itk::Image
CPU Memory
GPUImageDataManager
FillBuffer()
GetPixel()
SetPixel()
GetBufferPointer()
GetPixelAccessor()
GetNeighborhoodAccessor()
...
FillBuffer()
GetPixel()
SetPixel()
GetBufferPointer()
GetPixelAccessor()
GetNeighborhoodAccessor()
...
SetGPUBufferDirty()
MakeUpToDate()
GPU Kernel Manager
Load and compile GPU source code
LoadProgramFromFile()
Create GPU kernels
CreateKernel()
Execute GPU kernels
SetKernelArg()
SetKernelArgWithImage()
LaunchKernel()
17
Kernel Manager Example
18
// create GPU images
itk::GPUImage<float,2>::Pointer srcA, srcB, dest;
srcA = itk::GPUImage<float,2>::New();
...
// create GPU kernel manager
  
GPUKernelManager::Pointer kernelManager = GPUKernelManager::New();
 
 
// load program and compile
 
kernelManager->LoadProgramFromFile( “ImageOps.cl”,
          
"#define PIXELTYPE float\n"
);
// create ADD kernel
int kernel_add = kernelManager->CreateKernel("ImageAdd");
Kernel Manager Example(cont’d)
19
unsigned int nElem = 256*256;
// set kernel arguments
kernelManager->SetKernelArgWithImage(kernel_add, 0,
          
srcA->GetGPUDataManager());
 
kernelManager->SetKernelArgWithImage(kernel_add, 1,
          
srcB->GetGPUDataManager());
kernelManager->SetKernelArgWithImage(kernel_add, 2,
          
dest->GetGPUDataManager());
 
kernelManager->SetKernelArg(kernel_add, 3, sizeof(unsigned int),
       
  &nElem);
// launch kernel
kernelManager->LaunchKernel2D(kernel_add, 256, 256, 16, 16);
 
OpenCL Source Code Example
20
//
// pixel by pixel addition of 2D images
 
//
__kernel void ImageAdd(__global const PIXELTYPE* a, 
  
     
 __global const PIXELTYPE* b,
       
 __global PIXELTYPE* c,
       
 unsigned int nElem)
 
{
 
 
 unsigned int width = get_global_size(0);
 
 
 unsigned int gix = get_global_id(0);
 
 
 unsigned int giy = get_global_id(1);
 
 
 unsigned int gidx = giy*width + gix;
 
 
 
 // bound check
 
 
 if (gidx < nElem)
 
  
 
 {
 
  
 
 
 
c[gidx] = a[gidx] + b[gidx];
 
 
 }
  
}
GPUImageToImageFilter
Base class for GPU image filter
Extend existing itk filters using CRTP
Turn on/off GPU filter
IsGPUEnabled(bool)
GPU filter implementation
GPUGenerateData()
21
template< class TInputImage, class TOutputImage, class 
TParentImageFilter
 >
class ITK_EXPORT GPUImageToImageFilter: public 
TParentImageFilter
{ ... }
Create Your Own GPU Image Filter
Step 1: Derive your filter from
GPUImageToImageFilter 
using an existing
itk image filter
Step 2: Load and compile GPU source code
and create kernels in the constructor
Step 3: Implement filter by calling GPU
kernels in 
GPUGenerateData()
22
Example: GPUMeanImageFilter
Step 1: Class declaration
23
template< class TInputImage, class TOutputImage >
class ITK_EXPORT GPUMeanImageFilter :
 
public GPUImageToImageFilter< TInputImage, TOutputImage, 
 
    
MeanImageFilter< TInputImage, TOutputImage > 
>
{ ... }
Example: GPUMeanImageFilter
Step 2: Constructor
24
template< class TInputImage, class TOutputImage >
GPUMeanImageFilter< TInputImage, TOutputImage>::GPUMeanImageFilter()
{
 
 
char buf[100];
 
 
 
// OpenCL source path
 
 
char oclSrcPath[100];
 
 
sprintf(oclSrcPath,
   
 "%s/Code/GPU/GPUMeanImageFilter.cl", 
itk_root_path
);
 
 
// load and build OpenCL program
 
 
m_KernelManager
->LoadProgramFromFile( oclSrcPath, buf );
 
 
 
// create GPU kernel
 
 
m_KernelHandle = m_KernelManager->CreateKernel("MeanFilter");
 
}
 
Defined in pathToOpenCLSourceCode.h
Example: GPUMeanImageFilter
Step 3: GPUGenerateData()
25
template< class TInputImage, class TOutputImage >
     
void
 
GPUMeanImageFilter< TInputImage, TOutputImage >::GPUGenerateData()
 
{
 
  typedef itk::
GPUTraits
< TInputImage >::Type  GPUInputImage;
 
  typedef itk::GPUTraits< TOutputImage >::Type  GPUOutputImage;
 
  // get input & output image pointer
  GPUInputImage::Pointer inPtr =
  
dynamic_cast
< GPUInputImage * >( this->ProcessObject::GetInput(0) );
 
  GPUOutputImage::Pointer otPtr =
  
dynamic_cast
< GPUOutputImage * >( this->ProcessObject::GetOutput(0) );
 
  GPUOutputImage::SizeType outSize = otPtr->GetLargestPossibleRegion().GetSize();
 
  int radius[3], imgSize[3];
 
  for(int i=0; i<(int)TInputImage::ImageDimension; i++)
 
  {
 
 
 radius[i] = (this->
GetRadius
())[i];
 
     imgSize[i] = outSize[i];
 
  }
 
26
(Continued..)
 
  size_t localSize[2], globalSize[2];
 
  localSize[0] = localSize[1] = 16;
 
  
globalSize[0]
        = localSize[0]*(unsigned int)ceil((float)outSize[0]/(float)localSize[0]);
  globalSize[1]
        = localSize[1]*(unsigned int)ceil((float)outSize[1]/(float)localSize[1]);
   
  // kernel arguments set up
 
  int argidx = 0;
 
  m_KernelManager->SetKernelArgWithImage(m_KernelHandle, argidx++,
         
   inPtr->GetGPUDataManager());
 
  m_KernelManager->SetKernelArgWithImage(m_KernelHandle, argidx++,
         
   otPtr->GetGPUDataManager());
 
  for(int i=0; i<(int)TInputImage::ImageDimension; i++)
 
    m_KernelManager->SetKernelArg(m_KernelHandle, argidx++, sizeof(int),
    
       
&(radius[i]));
 
  for(int i=0; i<(int)TInputImage::ImageDimension; i++)
 
    m_KernelManager->SetKernelArg(m_KernelHandle, argidx++, sizeof(int),
 
        
&(imgSize[i]));
 
 
  // launch kernel
 
  m_KernelManager->LaunchKernel(m_KernelHandle,
       
  (int)TInputImage::ImageDimension,
       
  globalSize, localSize);
 
}
Big Picture: Collaboration Diagram
27
GPU Context Manager
GPU Kernel Manager
GPU Image
Pipeline Support
Allow combining CPU and GPU filters
Efficient CPU/GPU synchronization
Currently 
ImageToImageFilter 
is supported
28
 
ReaderType::Pointer reader = ReaderType::New();
WriterType::Pointer writer = WriterType::New();
GPUMeanFilterType::Pointer filter1 = GPUMeanFilterType::New();
GPUMeanFilterType::Pointer filter2 = GPUMeanFilterType::New();
ThresholdFilterType::Pointer filter3 = ThresholdFilterType::New();
filter1->SetInput( reader->GetOutput() ); 
// copy CPU->GPU implicitly
filter2->SetInput( filter1->GetOutput() );
filter3->SetInput( filter2->GetOutput() );
writer->SetInput( filter3->GetOutput() ); 
// copy GPU->CPU implicitly
writer->Update();
Complicated Filter Design
Multiple kernel launches
Single kernel, multiple calls
Multiple kernels
Design choices
Each kernel is a filter, pipelining
Reusable, memory overhead
Put multiple kernels in a single filter
Not reusable, less memory overhead
29
Object Factory Support
Create GPU object when possible
No need to explicitly define GPU objects
30
 
// register object factory for GPU image and filter objects
ObjectFactoryBase::RegisterFactory(GPUImageFactory::New());
ObjectFactoryBase::RegisterFactory(GPUMeanImageFilterFactory::New());
typedef itk::
Image
< InputPixelType,  2 >  InputImageType;
typedef itk::
Image
< OutputPixelType, 2 >  OutputImageType;
typedef itk::
MeanImageFilter
< InputImageType, OutputImageType >
             
  MeanFilterType;
MeanFilterType::Pointer filter = MeanFilterType::New
();
Type Casting
Image must be casted to GPUImage for
auto-synchronization for non-pipelined
workflow with object factory
Use GPUTraits
31
template <class T> class GPUTraits
 
{
 
public:
 
  typedef T   Type;
 
};
 
template <class T, unsigned int D> class GPUTraits< Image< T, D > >
 
{
 
public:
 
  typedef GPUImage<T,D>   Type;
 
};
 
InputImageType::Pointer img;
typedef itk::GPUTraits< InputImageType >::Type GPUImageType;
GPUImageType::Pointer otPtr = dynamic_cast< GPUImageType* >( img );
 
Examples
test_itkGPUImage.cxx
Simple image algebra
Multiple kernels and command queues
test_itkGPUImageFilter.cxx
GPUMeanImageFilter
Pipeline and object factory
ctest –R gpuImageFilterTest -V
32
ToDo List
Multi-GPU support
GPUThreadedGenerateData()
InPlace filter base class
Grafting for GPUImage
GPUImage internal types
Buffer, image (texture)
Basic filters
Level set, registration, etc
33
Useful Links
Current code in gerrit
http://review.source.kitware.com/#change,800
OpenCL
http://www.khronos.org/opencl/
http://www.nvidia.com/object/cuda_opencl_new.html
http://developer.amd.com/zones/OpenCLZone/pages/de
fault.aspx
http://software.intel.com/en-us/articles/intel-opencl-sdk/
34
Questions?
Slide Note
Embed
Share

This presentation by Won-Ki Jeong from Harvard University at the ITK v4 winter meeting in 2011 discusses the implementation and advantages of GPU acceleration in ITK v4. Topics covered include the use of GPUs as co-processors for massively parallel processing, memory and process management, new GPU classes introduced, and the setup in CMake for GPU code. It also outlines the current status, future work, and platforms supported for GPU acceleration in ITK v4.

  • GPU Acceleration
  • ITK v4
  • Harvard University
  • Parallel Processing
  • Implementation

Uploaded on Sep 26, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. GPU Acceleration in ITK v4 ITK v4 winter meeting Feb 2nd2011 Won-Ki Jeong, Harvard University (wkjeong@seas.harvard.edu)

  2. Overview Introduction Current status in GPU ITK v4 GPU managers GPU image GPU image filters Examples Future work 2

  3. GPU Acceleration GPU as a fast co-processor Massively parallel Huge speed up for certain types of problem Physically independent system Problems Memory management Process management Implementation 3

  4. Quick Summary Is GPU Draft Implementation done? Yes: http://review.source.kitware.com/#change,800 What do we have now? Basic GPU computing framework GPU image and filter class Pipeline and Object Factory supports Basic CMake setup for GPU code 5

  5. CMake Setup ITK_USE_GPU OFF by default Self-contained in Code/GPU except a few minor modification of existing files OpenCL source file location binary_dir/Code/GPU binary_dir is written into pathToOpenCLSourceCode.h 6

  6. Platforms: NVIDIA, ATI, Intel Devices Context Programs Command Queues Kernels Kernels GPU Images 7

  7. New GPU Classes GPUContextManager GPUDataManager GPUImageDataManager GPUKernelManager Basic GPU Objects GPUImage GPUImageToImageFilter GPUMeanImageFilter ITK Object Extension 8

  8. GPU Context Manager Global GPU resource manager One instance per process All GPU objects should have a pointer to it Resources Platforms Devices Contexts Command queues GetCommandQueue(), GetNumCommandQueue() 9

  9. GPU Data Manager Base class to manage GPU memory GPU data container Synchronization between CPU & GPU memory Synchronization APIs: Data Container APIs: SetCPUDirtyFlag() SetGPUDirtyFlag() SetCPUBufferDirty() SetGPUBufferDirty() MakeCPUBufferUpToDate() MakeGPUBufferUpToDate() MakeUpToDate() SetBufferSize() SetCPUBufferPointer() Allocate() protected: GetGPUBufferPointer() 10

  10. Synchronization Dirty flags Lightweight Pixel access functions in GPU images Time stamp Better to use sparingly Pipeline 11

  11. Data Manager Example unsigned int arraySize = 100; // create CPU memory float *a = new float[arraySize]; // create GPU memory GPUDataManager::Pointer b = GPUDataManager::New(); b->SetBufferSize(arraySize*sizeof(float)); b->SetCPUBufferPointer(a); b->Allocate(); // change values in CPU memory a[10] = 8; // mark GPU as dirty and synchronize CPU -> GPU b->SetGPUBufferDirty(); b->MakeUpToDate(); 12

  12. Data Manager Example (contd) // change values in GPU memory ... (run GPU kernel) // mark CPU as dirty and synchronize GPU -> CPU b->SetCPUBufferDirty(); b->MakeUpToDate(); 13

  13. Create Your Own GPU Data Manager GPUDataManager GPUImageDataManager GPUMeshDataManager GPUVideoDataManager SetImagePointer() MakeCPUBufferUpToDate( ) MakeGPUBufferUpToDate( ) .... .... 14

  14. GPU Image Derived from itk::Image Compatible to existing ITK filters GPUImageDataManager as a member Separate GPU implementation from Image class Implicit(automatic) synchronization Override CPU buffer access functions to properly set the dirty buffer flag Provide a single view of CPU/GPU memory 15

  15. GPU Image itk::Image::GPUImage itk::Image FillBuffer() GetPixel() SetPixel() GetBufferPointer() GetPixelAccessor() GetNeighborhoodAccessor() ... FillBuffer() GetPixel() SetPixel() GetBufferPointer() GetPixelAccessor() GetNeighborhoodAccessor() ... GPUImageDataManager CPU Memory SetGPUBufferDirty() MakeUpToDate() 16

  16. GPU Kernel Manager Load and compile GPU source code LoadProgramFromFile() Create GPU kernels CreateKernel() Execute GPU kernels SetKernelArg() SetKernelArgWithImage() LaunchKernel() 17

  17. Kernel Manager Example // create GPU images itk::GPUImage<float,2>::Pointer srcA, srcB, dest; srcA = itk::GPUImage<float,2>::New(); ... // create GPU kernel manager GPUKernelManager::Pointer kernelManager = GPUKernelManager::New(); // load program and compile kernelManager->LoadProgramFromFile( ImageOps.cl , ); "#define PIXELTYPE float\n" // create ADD kernel int kernel_add = kernelManager->CreateKernel("ImageAdd"); 18

  18. Kernel Manager Example(contd) unsigned int nElem = 256*256; // set kernel arguments kernelManager->SetKernelArgWithImage(kernel_add, 0, kernelManager->SetKernelArgWithImage(kernel_add, 1, kernelManager->SetKernelArgWithImage(kernel_add, 2, kernelManager->SetKernelArg(kernel_add, 3, sizeof(unsigned int), &nElem); srcA->GetGPUDataManager()); srcB->GetGPUDataManager()); dest->GetGPUDataManager()); // launch kernel kernelManager->LaunchKernel2D(kernel_add, 256, 256, 16, 16); 19

  19. OpenCL Source Code Example // // pixel by pixel addition of 2D images // __kernel void ImageAdd(__global const PIXELTYPE* a, __global const PIXELTYPE* b, __global PIXELTYPE* c, unsigned int nElem) { unsigned int width = get_global_size(0); unsigned int gix = get_global_id(0); unsigned int giy = get_global_id(1); unsigned int gidx = giy*width + gix; // bound check if (gidx < nElem) { c[gidx] = a[gidx] + b[gidx]; } } 20

  20. GPUImageToImageFilter Base class for GPU image filter Extend existing itk filters using CRTP template< class TInputImage, class TOutputImage, class TParentImageFilter > class ITK_EXPORT GPUImageToImageFilter: public TParentImageFilter { ... } Turn on/off GPU filter IsGPUEnabled(bool) GPU filter implementation GPUGenerateData() 21

  21. Create Your Own GPU Image Filter Step 1: Derive your filter from GPUImageToImageFilter using an existing itk image filter Step 2: Load and compile GPU source code and create kernels in the constructor Step 3: Implement filter by calling GPU kernels in GPUGenerateData() 22

  22. Example: GPUMeanImageFilter Step 1: Class declaration template< class TInputImage, class TOutputImage > class ITK_EXPORT GPUMeanImageFilter : public GPUImageToImageFilter< TInputImage, TOutputImage, MeanImageFilter< TInputImage, TOutputImage > > { ... } 23

  23. Example: GPUMeanImageFilter Step 2: Constructor template< class TInputImage, class TOutputImage > GPUMeanImageFilter< TInputImage, TOutputImage>::GPUMeanImageFilter() { char buf[100]; // OpenCL source path char oclSrcPath[100]; sprintf(oclSrcPath, "%s/Code/GPU/GPUMeanImageFilter.cl", itk_root_path); // load and build OpenCL program m_KernelManager->LoadProgramFromFile( oclSrcPath, buf ); // create GPU kernel m_KernelHandle = m_KernelManager->CreateKernel("MeanFilter"); } Defined in pathToOpenCLSourceCode.h 24

  24. Example: GPUMeanImageFilter Step 3: GPUGenerateData() template< class TInputImage, class TOutputImage > void GPUMeanImageFilter< TInputImage, TOutputImage >::GPUGenerateData() { typedef itk::GPUTraits< TInputImage >::Type GPUInputImage; typedef itk::GPUTraits< TOutputImage >::Type GPUOutputImage; // get input & output image pointer GPUInputImage::Pointer inPtr = dynamic_cast< GPUInputImage * >( this->ProcessObject::GetInput(0) ); GPUOutputImage::Pointer otPtr = dynamic_cast< GPUOutputImage * >( this->ProcessObject::GetOutput(0) ); GPUOutputImage::SizeType outSize = otPtr->GetLargestPossibleRegion().GetSize(); int radius[3], imgSize[3]; for(int i=0; i<(int)TInputImage::ImageDimension; i++) { radius[i] = (this->GetRadius())[i]; imgSize[i] = outSize[i]; } 25

  25. (Continued..) size_t localSize[2], globalSize[2]; localSize[0] = localSize[1] = 16; globalSize[0] = localSize[0]*(unsigned int)ceil((float)outSize[0]/(float)localSize[0]); globalSize[1] = localSize[1]*(unsigned int)ceil((float)outSize[1]/(float)localSize[1]); // kernel arguments set up int argidx = 0; m_KernelManager->SetKernelArgWithImage(m_KernelHandle, argidx++, inPtr->GetGPUDataManager()); m_KernelManager->SetKernelArgWithImage(m_KernelHandle, argidx++, otPtr->GetGPUDataManager()); for(int i=0; i<(int)TInputImage::ImageDimension; i++) m_KernelManager->SetKernelArg(m_KernelHandle, argidx++, sizeof(int), &(radius[i])); for(int i=0; i<(int)TInputImage::ImageDimension; i++) m_KernelManager->SetKernelArg(m_KernelHandle, argidx++, sizeof(int), &(imgSize[i])); // launch kernel m_KernelManager->LaunchKernel(m_KernelHandle, (int)TInputImage::ImageDimension, globalSize, localSize); } 26

  26. Pipeline Support Allow combining CPU and GPU filters Efficient CPU/GPU synchronization Currently ImageToImageFilter is supported ReaderType::Pointer reader = ReaderType::New(); WriterType::Pointer writer = WriterType::New(); GPUMeanFilterType::Pointer filter1 = GPUMeanFilterType::New(); GPUMeanFilterType::Pointer filter2 = GPUMeanFilterType::New(); ThresholdFilterType::Pointer filter3 = ThresholdFilterType::New(); (GPU) (GPU) (CPU) Reader Filter1 Filter2 Filter3 (CPU) Writer (CPU) filter1->SetInput( reader->GetOutput() ); // copy CPU->GPU implicitly filter2->SetInput( filter1->GetOutput() ); filter3->SetInput( filter2->GetOutput() ); writer->SetInput( filter3->GetOutput() ); // copy GPU->CPU implicitly Synchronize Synchronize writer->Update(); 28

  27. Complicated Filter Design Multiple kernel launches Single kernel, multiple calls Multiple kernels Design choices Each kernel is a filter, pipelining Reusable, memory overhead Put multiple kernels in a single filter Not reusable, less memory overhead 29

  28. Object Factory Support Create GPU object when possible No need to explicitly define GPU objects // register object factory for GPU image and filter objects ObjectFactoryBase::RegisterFactory(GPUImageFactory::New()); ObjectFactoryBase::RegisterFactory(GPUMeanImageFilterFactory::New()); typedef itk::Image< InputPixelType, 2 > InputImageType; typedef itk::Image< OutputPixelType, 2 > OutputImageType; typedef itk::MeanImageFilter< InputImageType, OutputImageType > MeanFilterType::Pointer filter = MeanFilterType::New(); MeanFilterType; 30

  29. Type Casting Image must be casted to GPUImage for auto-synchronization for non-pipelined workflow with object factory Use GPUTraits template <class T> class GPUTraits { public: typedef T Type; }; template <class T, unsigned int D> class GPUTraits< Image< T, D > > { public: typedef GPUImage<T,D> Type; }; InputImageType::Pointer img; typedef itk::GPUTraits< InputImageType >::Type GPUImageType; GPUImageType::Pointer otPtr = dynamic_cast< GPUImageType* >( img ); 31

  30. Examples test_itkGPUImage.cxx Simple image algebra Multiple kernels and command queues test_itkGPUImageFilter.cxx GPUMeanImageFilter Pipeline and object factory ctest R gpuImageFilterTest -V 32

  31. ToDo List Multi-GPU support GPUThreadedGenerateData() InPlace filter base class Grafting for GPUImage GPUImage internal types Buffer, image (texture) Basic filters Level set, registration, etc 33

  32. Useful Links Current code in gerrit http://review.source.kitware.com/#change,800 OpenCL http://www.khronos.org/opencl/ http://www.nvidia.com/object/cuda_opencl_new.html http://developer.amd.com/zones/OpenCLZone/pages/de fault.aspx http://software.intel.com/en-us/articles/intel-opencl-sdk/ 34

  33. Questions?

Related


More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#