This repository has been archived by the owner on Jun 4, 2018. It is now read-only.

Optimizations techniques for code generation #58

Open
felippezacarias opened this issue Jan 12, 2016 · 0 comments

Comments

@felippezacarias
Contributor

| Pull Request | Why | Reference | Code parameters used | Time before/after (Xeon) | Time before/after (Xeon Phi) |
| --- | --- | --- | --- | --- | --- |
| #51 | Thread-blocked access is achieved with the directive `schedule(static,1)` on the outermost loop. It lets threads that process neighbouring z planes reuse the y and x planes already in cache. | Muhong Zhou and William W. Symes, Rice University, "Wave Equation Based Stencil Optimizations on Multi-core CPU". Section: Reducing L3 Cache Misses, Blocking thread accesses. | Xeon: 8th-order code, grid size 512x512x512. Xeon Phi: 8th-order code, grid size 420x420x420. | 288 sec - 258 sec | 123 sec - 112 sec |
| #52 | Modifies the array access pattern by applying loop fission to the innermost loop and rearranging the accesses by stride. In addition, this change helps to reduce register pressure during vectorization. | Borges, L., 2011, "3D finite differences on multi-core processors" (available online at [https://software.intel.com/en-us/articles/3d-finite-differences-on-multi-core-processors](https://software.intel.com/en-us/articles/3d-finite-differences-on-multi-core-processors)). | Xeon: 8th-order code, grid size 512x512x512. Xeon Phi: 8th-order code, grid size 420x420x420. | 258 sec - 158 sec | 112 sec - 196 sec |
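
To make the two rows above concrete, here is a minimal sketch of both ideas on a toy kernel. It is not the generated code from #51 or #52: the radius-1 star stencil, the `IDX` macro and the function names `stencil_blocked` and `stencil_fissioned` are assumptions made for illustration, and the split by stride is only one possible reading of the fission described in #52.

```c
#include <stddef.h>

/* Hypothetical row-major index helper: x is the unit-stride dimension. */
#define IDX(z, y, x, ny, nx) (((size_t)(z) * (ny) + (y)) * (nx) + (x))

/* Idea from #51: schedule(static,1) on the outermost (z) loop hands
 * consecutive z planes to different threads, so threads working at the
 * same time touch neighbouring planes that may already be in shared cache.
 * Toy radius-1 stencil (assumption), not the 8th-order kernel. */
void stencil_blocked(const float *in, float *out, int nz, int ny, int nx)
{
    #pragma omp parallel for schedule(static, 1)
    for (int z = 1; z < nz - 1; z++)
        for (int y = 1; y < ny - 1; y++)
            for (int x = 1; x < nx - 1; x++)
                out[IDX(z, y, x, ny, nx)] =
                      in[IDX(z, y, x - 1, ny, nx)] + in[IDX(z, y, x + 1, ny, nx)]
                    + in[IDX(z, y - 1, x, ny, nx)] + in[IDX(z, y + 1, x, ny, nx)]
                    + in[IDX(z - 1, y, x, ny, nx)] + in[IDX(z + 1, y, x, ny, nx)]
                    - 6.0f * in[IDX(z, y, x, ny, nx)];
}

/* Idea from #52 (one possible reading): split the innermost loop into
 * one pass per access stride, so each pass reads only a few streams. */
void stencil_fissioned(const float *in, float *out, int nz, int ny, int nx)
{
    const size_t plane = (size_t)ny * (size_t)nx;

    #pragma omp parallel for schedule(static, 1)
    for (int z = 1; z < nz - 1; z++)
        for (int y = 1; y < ny - 1; y++) {
            const float *c = &in[IDX(z, y, 0, ny, nx)];
            float *o = &out[IDX(z, y, 0, ny, nx)];

            /* Pass 1: unit-stride neighbours along x. */
            for (int x = 1; x < nx - 1; x++)
                o[x] = c[x - 1] + c[x + 1] - 6.0f * c[x];

            /* Pass 2: neighbours one row away (stride nx). */
            const float *ym = c - nx, *yp = c + nx;
            for (int x = 1; x < nx - 1; x++)
                o[x] += ym[x] + yp[x];

            /* Pass 3: neighbours one plane away (stride ny * nx). */
            const float *zm = c - plane, *zp = c + plane;
            for (int x = 1; x < nx - 1; x++)
                o[x] += zm[x] + zp[x];
        }
}
```

Both functions compute the same interior values; without OpenMP enabled the pragmas are ignored and the loops run serially. Whether either variant is faster depends on the machine and grid size, as the timings in the table show.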