December 05, 2016
Performance Optimization on Modern Processor Architecture through Vectorization
by Yong Fu, Software Engineer
In this article, we focus on optimizing performance through SIMD (single instruction, multiple data). SIMD processing exploits data-level parallelism: a single instruction applies the same operation to several data elements at once. All mainstream processors today, such as Intel Sandy/Ivy Bridge and Haswell/Broadwell, AMD Bulldozer, ARM, and PowerPC, implement SIMD features, although the implementation details vary.
Since most SIMD instructions on specific hardware operate on vector operands, the use of SIMD is sometimes also referred to as vectorization. Although there is a subtle difference between these two terms, in this article we use them interchangeably.
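To make the idea of vector operands concrete, here is a minimal sketch in portable C. It does not use any real SIMD instruction set: the `vec4f` type and the `vec4f_load`, `vec4f_add`, and `vec4f_store` helpers are hypothetical names we introduce only to model a 4-lane register, and the lane count of 4 is an assumption chosen for illustration. The scalar function processes one element per iteration, while the vector-style function processes one group of four elements per iteration and handles any leftover elements with a scalar loop.

```c
#include <stddef.h>

/* Scalar version: one addition per loop iteration. */
void add_scalar(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* Hypothetical 4-lane vector type, standing in for a SIMD register.
   Real hardware would hold these lanes in a single wide register. */
enum { LANES = 4 };
typedef struct { float lane[LANES]; } vec4f;

/* Load LANES consecutive floats into a vector operand. */
static vec4f vec4f_load(const float *p)
{
    vec4f v;
    for (int k = 0; k < LANES; ++k)
        v.lane[k] = p[k];
    return v;
}

/* Lane-wise addition: models what a single SIMD add instruction does. */
static vec4f vec4f_add(vec4f x, vec4f y)
{
    vec4f r;
    for (int k = 0; k < LANES; ++k)
        r.lane[k] = x.lane[k] + y.lane[k];
    return r;
}

/* Store a vector operand back to LANES consecutive floats. */
static void vec4f_store(float *p, vec4f v)
{
    for (int k = 0; k < LANES; ++k)
        p[k] = v.lane[k];
}

/* Vector-style version: one vector operation per group of LANES elements. */
void add_vector_style(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;

    /* Main loop: process full groups of LANES elements. */
    for (; i + LANES <= n; i += LANES)
        vec4f_store(out + i, vec4f_add(vec4f_load(a + i), vec4f_load(b + i)));

    /* Remainder loop: handle the elements that do not fill a vector. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Both functions compute the same result. On real hardware, each helper would ideally map to a single instruction, so the main loop would run roughly a quarter as many iterations as the scalar loop. This model itself runs as ordinary scalar code unless the compiler vectorizes it.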