OpenCL Design Examples

The Altera® SDK for OpenCL™* provides a design environment for you to easily implement Open Computing Language (OpenCL) applications with FPGA-based accelerators. For more information, visit the following web pages:

Design Examples

 
The following examples demonstrate how to describe various applications in OpenCL along with their respective host applications, which you can compile and execute on a host with an FPGA board that supports the Altera SDK for OpenCL.

Basic Examples

Design Example | Features | Benefits | Description
Hello World
Features:
  • OpenCL application programming interface (API) to initialize a device and run a kernel
Benefits:
  • Getting started
Description: This simple design example demonstrates a basic OpenCL kernel containing a printf call and its corresponding host program.
Vector Addition
Features:
  • OpenCL API
  • Partition a large problem across multiple devices
  • OpenCL events and event profiling
Benefits:
  • Getting started
Description: This simple design example demonstrates a basic vector addition OpenCL kernel and its corresponding host program.
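The idea of partitioning a large problem across multiple devices can be pictured with a short host-side sketch. This is plain Python rather than the example's host code: `partition` and `vector_add_partitioned` are hypothetical names, and each "device" is simulated by an ordinary loop over its chunk.

```python
def partition(n, num_devices):
    """Split n elements into contiguous (offset, count) chunks, one per device."""
    base, extra = divmod(n, num_devices)
    chunks = []
    offset = 0
    for d in range(num_devices):
        # The first `extra` devices take one additional element each.
        count = base + (1 if d < extra else 0)
        chunks.append((offset, count))
        offset += count
    return chunks


def vector_add_partitioned(a, b, num_devices):
    """Add two equal-length vectors chunk by chunk (devices are simulated)."""
    out = [0.0] * len(a)
    for offset, count in partition(len(a), num_devices):
        # Assumption: on real hardware each chunk would go to its own device queue.
        for i in range(offset, offset + count):
            out[i] = a[i] + b[i]
    return out
```

Splitting into contiguous ranges keeps each device's input and output buffers independent, so the chunks could be processed in any order or in parallel.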

Network Platform Examples

Design Example | Features | Benefits | Description
OPRA FAST Parser
Features:
  • Single work-item kernel
  • I/O channels
Benefits:
  • Low latency
  • 10G link saturation
Description: This design example demonstrates a streaming parser commonly used in high-frequency trading algorithms. The parser accepts an OPRA FAST data stream and decompresses the fields for use upstream. It illustrates how you can process streaming messages efficiently to achieve 10G link saturation.
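One building block of field decompression in FAST-encoded streams is stop-bit decoding, where each byte carries 7 data bits and a set high bit marks the last byte of a field. The sketch below assumes that standard encoding for unsigned integers only. It does not cover presence maps, templates, or field operators, and the function names are hypothetical rather than taken from the design example.

```python
def decode_stop_bit_uint(data, pos=0):
    """Decode one stop-bit encoded unsigned integer starting at data[pos].

    Returns (value, next_pos). Raises IndexError if the field is truncated.
    """
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        # Append the 7 payload bits, most significant group first.
        value = (value << 7) | (byte & 0x7F)
        if byte & 0x80:  # stop bit set: this was the last byte of the field
            return value, pos


def decode_all(data):
    """Decode consecutive stop-bit encoded unsigned integers from a byte string."""
    fields = []
    pos = 0
    while pos < len(data):
        value, pos = decode_stop_bit_uint(data, pos)
        fields.append(value)
    return fields
```

Because a field boundary is known only after the stop bit is seen, a streaming implementation processes the input byte by byte, which suits a single work-item kernel fed from an I/O channel.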

HPC Platform Examples

Design Example | Features | Benefits | Description
Channelizer
Features:
  • Kernel channels
  • Multiple simultaneous kernels
  • Single work-item kernels
Benefits:
  • Performance
  • Getting started with kernel channels
Description: This design example demonstrates a high-performance channelizer design using OpenCL. The channelizer combines a polyphase filter bank (PFB) with a fast Fourier transform to reduce the effects of spectral leakage on the resulting frequency spectrum.
Finite Difference Computation (3D)
Features:
  • Single-precision floating-point optimizations
  • Single work-item kernel
  • Optimizations to minimize redundant memory use
Benefits:
  • Performance
Description: This design example demonstrates a high-performance 3D finite-difference stencil-only computation using OpenCL. It shows how to efficiently describe a sliding window data reuse pattern.
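The sliding window data reuse pattern can be sketched on the CPU as follows. This is a simplified illustration, not the example's kernel: it assumes a 7-point Laplacian stencil (the source does not state the stencil order or coefficients) and slides a window of three planes along z so that each plane is fetched only once. The name `laplacian_3d` is hypothetical.

```python
from collections import deque


def laplacian_3d(volume):
    """7-point Laplacian of volume[z][y][x] for interior points.

    Planes are streamed along z through a 3-plane sliding window, so each
    plane is read from the input once and reused for three output planes.
    Boundary points are left at 0.0.
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    out = [[[0.0] * nx for _ in range(ny)] for _ in range(nz)]
    planes = deque(maxlen=3)  # the sliding window: oldest plane drops out
    for z in range(nz):
        planes.append(volume[z])
        if len(planes) < 3:
            continue  # window not yet full
        below, mid, above = planes
        zc = z - 1  # the output plane is the middle of the window
        for y in range(1, ny - 1):
            for x in range(1, nx - 1):
                out[zc][y][x] = (
                    below[y][x] + above[y][x]
                    + mid[y - 1][x] + mid[y + 1][x]
                    + mid[y][x - 1] + mid[y][x + 1]
                    - 6.0 * mid[y][x]
                )
    return out
```

On an FPGA the window would typically live in on-chip shift registers, which is what removes the redundant global memory reads.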
FFT (1D)
Features:
  • Single-precision floating-point optimizations
  • Single work-item kernel
Benefits:
  • Performance
Description: This design example demonstrates a high-performance 1D radix-4 complex fast Fourier transform (FFT) or inverse fast Fourier transform (IFFT) engine using OpenCL. This example takes advantage of the efficient sliding window data reuse pattern.
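For readers unfamiliar with the radix-4 decomposition, the following is a minimal reference sketch in plain Python. It is a textbook recursive decimation-in-time formulation, not the pipelined engine from the design example, and it assumes the input length is a power of 4. The function names are hypothetical.

```python
import cmath


def fft_radix4(x):
    """Recursive radix-4 decimation-in-time FFT; len(x) must be a power of 4."""
    n = len(x)
    if n == 0 or (n > 1 and n % 4 != 0):
        raise ValueError("length must be a power of 4")
    if n == 1:
        return [complex(x[0])]
    quarter = n // 4
    # Four interleaved sub-transforms of length n/4.
    subs = [fft_radix4(x[r::4]) for r in range(4)]
    out = [0j] * n
    for k in range(quarter):
        # Apply twiddle factors W_n^(r*k) to the four sub-transform outputs.
        t0, t1, t2, t3 = (
            subs[r][k] * cmath.exp(-2j * cmath.pi * r * k / n) for r in range(4)
        )
        # Radix-4 butterfly: combine with powers of -1j.
        out[k] = t0 + t1 + t2 + t3
        out[k + quarter] = t0 - 1j * t1 - t2 + 1j * t3
        out[k + 2 * quarter] = t0 - t1 + t2 - t3
        out[k + 3 * quarter] = t0 + 1j * t1 - t2 - 1j * t3
    return out


def ifft_radix4(spectrum):
    """Inverse transform computed with the forward FFT via conjugation."""
    n = len(spectrum)
    conj = fft_radix4([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in conj]
```

The butterfly needs only additions and multiplications by ±1 and ±j, which is part of why radix-4 is attractive for hardware implementations.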
FFT (2D)
Features:
  • Single-precision floating-point optimizations
  • Kernel channels
  • Memory access pattern optimizations
  • Multiple simultaneous kernels
  • Mix of single work-item and NDRange kernels
Benefits:
  • Performance
  • Getting started with kernel channels
Description: This design example demonstrates a high-performance 2D radix-4 complex FFT/IFFT engine using OpenCL. This engine is targeted at large problem sizes (1024x1024 by default) and uses global memory to store the intermediate transposition. One aspect highlighted by this example is how to efficiently perform matrix transposition in global memory.
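The role of the intermediate transposition can be seen in a small row-column sketch. It assumes the common decomposition of a 2D transform into 1D transforms over rows, a transpose, 1D transforms over the former columns, and a final transpose. A naive DFT stands in for the radix-4 engine to keep the sketch short, and all names are hypothetical.

```python
import cmath


def dft(row):
    """Naive 1D DFT, used here as a stand-in for a fast 1D engine."""
    n = len(row)
    return [
        sum(row[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def transpose(matrix):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def fft2d(matrix):
    """2D transform as rows, transpose, rows, transpose."""
    rows_done = [dft(r) for r in matrix]
    # Intermediate transposition: in the design example this data is held in
    # global memory between the two passes.
    flipped = transpose(rows_done)
    cols_done = [dft(r) for r in flipped]
    return transpose(cols_done)
```

Transposing between passes lets both passes read contiguous rows, which is why the efficiency of the transposition itself becomes the interesting problem at sizes such as 1024x1024.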
JPEG Decoder
Features:
  • Single work-item kernels
  • Kernel channels
  • Overlapping memory transfers and kernel invocations
Benefits:
  • Scalable performance
  • Getting started with kernel channels
Description: This design example showcases a high-performance JPEG decoding solution.
Mandelbrot Fractal Rendering
Features:
  • Double-precision floating-point optimizations
  • Multiple device partitioning
Benefits:
  • Scalable performance
Description: This design example includes a kernel that implements the Mandelbrot fractal convergence algorithm and displays the results to the screen.
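The convergence test at the heart of Mandelbrot rendering is the escape-time iteration z → z² + c. The sketch below is a generic CPU version, assuming the usual escape radius of 2. The iteration limit and the function name are illustrative choices, not values from the design example.

```python
def mandelbrot_iterations(c, max_iter=100):
    """Return how many iterations z = z*z + c takes to escape |z| > 2.

    Returns max_iter if the point has not escaped, meaning it is treated
    as belonging to the set.
    """
    z = 0j
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter
```

Each pixel maps to one value of `c` and is computed independently of every other pixel, which is what makes the image easy to split across several devices.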
Matrix Multiplication
Features:
  • Single-precision floating-point optimizations
  • Local memory buffering
  • Compiler optimizations
  • Multiple device execution
Benefits:
  • Scalable performance
  • Getting started with optimization methods
Description: This example shows the optimization of the fundamental matrix multiplication operation using loop tiling to take advantage of the data reuse inherent in the computation.
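Loop tiling can be illustrated with a short CPU sketch. The version below is a generic blocked matrix multiplication in plain Python, not the example's kernel. The tile size and the function name are arbitrary illustrative choices.

```python
def matmul_tiled(a, b, tile=2):
    """Multiply a (n x k) by b (k x m) using square tiles of side `tile`."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for k0 in range(0, k, tile):
                # Work on one tile of A and one tile of B at a time. On an
                # accelerator these tiles would be staged in local memory
                # and reused for every element of the output tile.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c
```

Each element of a loaded tile contributes to `tile` different outputs before it is discarded, so larger tiles mean fewer trips to global memory for the same amount of arithmetic.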
Monte Carlo Black-Scholes Asian Options Pricing
Features:
  • Double-precision floating-point optimizations
  • Kernel channels
  • Multiple device execution
  • Multiple simultaneous kernels
Benefits:
  • Scalable
  • Power-efficient performance
  • Getting started with kernel channels
Description: This design example implements the Monte Carlo Black-Scholes simulation for Asian option pricing. This example shows how to run multiple kernels simultaneously, with each performing different parts of the simulation (random number generation, path simulation, and accumulation) and communicating using Altera's channels vendor extension.
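The three-stage structure (random number generation, path simulation, accumulation) can be mimicked on the CPU with Python generators standing in for channels. This is a sketch under stated assumptions: it prices an arithmetic-average Asian call under geometric Brownian motion, which the source does not specify, and every function and parameter name is hypothetical.

```python
import math
import random


def gaussian_source(seed):
    """Stage 1: endless stream of standard normal samples."""
    rng = random.Random(seed)
    while True:
        yield rng.gauss(0.0, 1.0)


def path_averages(normals, num_paths, steps, s0, rate, sigma, maturity):
    """Stage 2: simulate price paths and emit each path's average price."""
    dt = maturity / steps
    drift = (rate - 0.5 * sigma * sigma) * dt
    vol = sigma * math.sqrt(dt)
    for _ in range(num_paths):
        price, total = s0, 0.0
        for _ in range(steps):
            # Geometric Brownian motion step driven by one normal sample.
            price *= math.exp(drift + vol * next(normals))
            total += price
        yield total / steps


def asian_call_price(averages, strike, rate, maturity):
    """Stage 3: accumulate payoffs and discount the mean to present value."""
    payoff_sum, count = 0.0, 0
    for avg in averages:
        payoff_sum += max(avg - strike, 0.0)
        count += 1
    return math.exp(-rate * maturity) * payoff_sum / count
```

A usage sketch: `asian_call_price(path_averages(gaussian_source(1), 10000, 64, 100.0, 0.05, 0.2, 1.0), 100.0, 0.05, 1.0)`. Each stage consumes values only as the next stage requests them, which loosely mirrors kernels exchanging data through channels without a round trip to global memory.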
Sobel Filter
Features:
  • Integer arithmetic
  • Single work-item kernel
  • Efficient 2D sliding window line buffer
Benefits:
  • Performance
Description: This design example implements a Sobel filter in OpenCL that performs edge detection on an image and displays the resulting filtered image on the screen.
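A 2D sliding window line buffer can be sketched as a shift register that holds just over two image rows, so that a full 3x3 neighborhood is available as each pixel streams in. The version below is an integer-only CPU illustration. It approximates the gradient magnitude as |gx| + |gy|, an assumption not stated in the source, and the function name is hypothetical.

```python
def sobel_stream(pixels, width, height):
    """Apply a 3x3 Sobel operator to a raster-order list of integer pixels.

    Border pixels are left at 0. Each input pixel is read exactly once.
    """
    # Shift register spanning two full rows plus three pixels.
    reg = [0] * (2 * width + 3)
    out = [0] * (width * height)
    for idx, p in enumerate(pixels):
        reg = reg[1:] + [p]  # shift in the newest pixel
        y, x = divmod(idx, width)
        if y >= 2 and x >= 2:
            # The newest pixel is the bottom-right corner of the window;
            # w[r][c] is window row r (0 = top) and column c (0 = left).
            w = [[reg[r * width + c] for c in range(3)] for r in range(3)]
            gx = (w[0][2] + 2 * w[1][2] + w[2][2]) - (w[0][0] + 2 * w[1][0] + w[2][0])
            gy = (w[2][0] + 2 * w[2][1] + w[2][2]) - (w[0][0] + 2 * w[0][1] + w[0][2])
            # The result belongs to the window's center pixel.
            out[(y - 1) * width + (x - 1)] = abs(gx) + abs(gy)
    return out
```

Fixed tap positions in the register (offsets 0 to 2, `width` to `width + 2`, and `2 * width` to `2 * width + 2`) are what let hardware read all nine window pixels in the same cycle.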
Time-Domain FIR Filter
Features:
  • Single-precision floating-point optimizations
  • Efficient 1D sliding window buffer implementation
  • Single work-item kernel
  • Optimization methods
Benefits:
  • Performance
  • Getting started with optimization methods
Description: This design example implements the time-domain finite impulse response (FIR) filter benchmark from the HPEC Challenge Benchmark Suite. It illustrates how FPGAs can provide far better performance than a graphics processing unit (GPU) architecture for floating-point FIR filters.
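A time-domain FIR filter computes y[n] = Σ taps[k] · x[n − k], and a 1D sliding window holds the most recent samples so each input is read once. The sketch below is a generic CPU version with a hypothetical function name. It is not the benchmark's reference code and assumes samples before the start of the input are zero.

```python
def fir_filter(samples, taps):
    """Time-domain FIR: y[n] = sum over k of taps[k] * samples[n - k]."""
    # Sliding window of the most recent len(taps) samples, newest first.
    window = [0.0] * len(taps)
    out = []
    for x in samples:
        window = [x] + window[:-1]  # shift the newest sample in
        out.append(sum(t * w for t, w in zip(taps, window)))
    return out
```

For example, feeding a unit impulse through the filter returns the tap values themselves. The same code also accepts Python complex numbers for samples and taps.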
Video Downscaling
Features:
  • Kernel channels
  • Multiple simultaneous kernels
  • Memory access pattern optimizations
Benefits:
  • Performance
  • Getting started with kernel channels
Description: This design example implements a video downscaler that takes 1080p input video and outputs 720p video at 110 frames per second. This example uses multiple kernels to efficiently read from and write to global memory.

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.

* Product is based on a published Khronos Specification, and has passed the Khronos Conformance Testing Process. Current conformance status can be found at www.khronos.org/conformance.

Design Examples Disclaimer

These design examples may only be used within Altera devices and remain the property of Altera Corporation. They are being provided on an “as-is” basis and as an accommodation; therefore, all warranties, representations, or guarantees of any kind (whether express, implied, or statutory) including, without limitation, warranties of merchantability, non-infringement, or fitness for a particular purpose, are specifically disclaimed. Altera expressly does not recommend, suggest, or require that these examples be used in combination with any other product not provided by Altera.