## Papers on hgpu.org (.txt-file)

A Self-Optimizing Framework for Developing Metrology Software on Massive Parallel Processor Architectures

A self-organization based optical flow estimator with GPU implementation

A self-organization based optical flow estimator with GPU implementation (thesis)

A Semi-Automated Tool Flow for Roofline Analysis of OpenCL Kernels on Accelerators

A Shader Library for OpenGL 4 and GLSL 4.3 Learning and Development

A shared file system abstraction for heterogeneous architectures

A shared-scene-graph image-warping architecture for VR: Low latency versus image quality

A short guide to CUDA C: For physicists with multi-core graphics cards

A Short Note on Gaussian Process Modeling for Large Datasets using Graphics Processing Units

A SIMD Interpreter for Genetic Programming on GPU Graphics Cards

A SIMD-efficient 14 instruction shader program for high-throughput microtriangle rasterization

A Similarity Measure for GPU Kernel Subgraph Matching

A Similarity-Based Analysis Tool for Scientific Application Porting

A simple and efficient way to compute depth maps for multi-view videos

A simple and flexible volume rendering framework for graphics-hardware-based raycasting

A simple GPU-based approach for 3D Voronoi diagram construction and visualization

A simple method to accelerate fringe analysis algorithms based on graphics processing unit and MATLAB

A Simplified and Accurate Model of Power-Performance Efficiency on Emergent GPU Architectures

A Simulation Framework for Scheduling Performance Evaluation on CPU-GPU Heterogeneous System

A simulation suite for lattice Boltzmann based real time CFD applications exploiting multi-level parallelism on modern multi-and many-core architectures

A Simulation Suite for Lattice-Boltzmann based Real-Time CFD Applications Exploiting Multi-Level Parallelism on modern Multi- and Many-Core Architectures

A Simulator for the Cafadis Real Time 3DTV Camera

A Single (Unified) Shader GPU Microarchitecture for Embedded Systems

A small-world network model for distributed storage of semantic metadata

A Smart GPU Implementation of an Elliptic Kernel for an Ocean Global Circulation Model

A smooth particle hydrodynamics code to model collisions between solid, self-gravitating objects

A Software Framework for the Detection and Classification of Biological Targets in Bio-Nano Sensing

A Software-Based Self Test of CUDA Fermi GPUs

A Sorting Library for FPGA Implementation in OpenCL Programming

A Sparse Matrix Personality for the Convey HC-1

A sparse octree gravitational N-body code that runs entirely on the GPU processor

A Spiking Neural P system simulator based on CUDA

A Splitting Algorithm for Directional Regularization and Sparsification

A stand-alone Finite Difference Time Domain (FDTD) simulation for Integrated Optoelectronics Laboratory

A state-of-the-art password strength analysis demonstrator

A Static Analysis-based Cross-Architecture Performance Prediction Using Machine Learning

A Static Load Balancing Scheme for Parallel Volume Rendering on Multi-GPU Clusters

A Static Task Partitioning Approach for Heterogeneous Systems Using OpenCL

A Stencil DSEL for Single Code Accelerated Computing with SYCL

A stencil-based implementation of Parareal in the C++ domain specific embedded language STELLA

A Step towards Energy Efficient Computing: Redesigning A Hydrodynamic Application on CPU-GPU

A stereoscopic movie player with real-time content adaptation to the display geometry

A Stochastic-based Optimized Schwarz Method for the Gravimetry Equations on GPU Clusters

A straightforward CUDA implementation for interactive ray-tracing

A Straightforward Preprocessing Approach for Accelerating Convex Hull Computations on the GPU

A Strategy for Automatic Performance Tuning of Stencil Computations on GPUs

A Strategy for Automatically Generating High Performance CUDA Code for a GPU Accelerator from a Specialized Fortran Code Expression

A Stream Processor Cluster Architecture Model with the Hybrid Technology of MPI and CUDA

A stream-computing extension to OpenMP

A streaming model for nested data parallelism

A streaming narrow-band algorithm: interactive computation and visualization of level sets

A structural analysis of the A5/1 state transition graph

A structured parallel periodic Arnoldi shooting algorithm for RF-PSS analysis based on GPU platforms

A Study of Complex Deep Learning Networks on High Performance, Neuromorphic, and Quantum Computers

A Study of CUDA Acceleration and Impact of Data Transfer Overhead in Heterogeneous Environment

A Study of Data Partitioning on OpenCL-based FPGAs

A study of integer sorting on multicores

A Study of Mixed Precision Strategies for GMRES on GPUs

A study of parallel evolution strategy: pattern search on a GPU computing platform

A Study of Parallel Sorting Algorithms Using CUDA and OpenMP

A Study of Productivity and Performance of Modern Vector Processors

A Study of Real-Time Lighting Effects

A Study of Scheduling a Neuro-imaging Application On a Heterogeneous CPU-GPU Cluster

A Study of Single and Multi-device Synchronization Methods in Nvidia GPUs

A Study of Successive Over-relaxation Method Parallelization Over Modern HPC Languages

A Study of the Parallelization of Hybrid SAT Solver using CUDA

A Study of the Potential of Locality-Aware Thread Scheduling for GPUs

A study of the speed and the accuracy of the Boundary Element Method as applied to the computational simulation of biological organs

A Study of Time and Energy Efficient Algorithms for Parallel and Heterogeneous Computing

A Study on Efficient Application Mapping on Parallel Computing Accelerators

A Study on GPU Computing and Accelerating Simulation of Sedimentary Rock Structure

A Study on Parallel Imaging Algorithm of 3D Geological Data

A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation

A Study on the Acceleration of Arrival Curve Construction and Regular Specification Mining using GPUs

A Summary of Recent GPU Developments and Key Enabling Technologies for Digital Media Applications

A Superresolution Framework for High-Accuracy Multiview Reconstruction

A Survey Of Architectural Approaches for Data Compression in Cache and Main Memory Systems

A Survey Of Architectural Approaches for Managing Embedded DRAM and Non-volatile On-chip Caches

A Survey of Architectural Techniques For DRAM Power Management

A Survey of Architectural Techniques For Improving Cache Power Efficiency

A Survey Of Architectural Techniques for Managing Process Variation

A Survey Of Architectural Techniques for Near-Threshold Computing

A Survey of Big Data, High Performance Computing, and Machine Learning Benchmarks

A survey of BRDF models for computer graphics

A Survey of Cache Bypassing Techniques

A Survey of Cache Partitioning Techniques for Multicore Processors

A Survey of Cloud Lighting and Rendering Techniques

A Survey of Convolutional Neural Networks on Edge with Reconfigurable Computing

A Survey of CPU-GPU Heterogeneous Computing Techniques

A Survey of CUDA-based Multidimensional Scaling on GPU Architecture

A Survey of FPGA Based Deep Learning Accelerators: Challenges and Opportunities

A Survey of FPGA Based Neural Network Accelerator

A Survey of FPGA-based Accelerators for Convolutional Neural Networks

A Survey of General-Purpose Computation on Graphics Hardware

A survey of GPU-based medical image computing techniques

A Survey of Machine Learning for Computer Architecture and Systems

A survey of medical image registration on graphics hardware

A Survey of Medical Image Registration on Multicore and the GPU

A Survey of Methods For Analyzing and Improving GPU Energy Efficiency

Titles: 100

open PDFs: 90

packages: 10