Skip to content

Systolic array implementations for Cholesky, LU, and QR decomposition

Notifications You must be signed in to change notification settings

Sibylau/HLS_designs

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

24 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Dataflow Systolic Array Implementations of Matrix Decomposition Using High Level Synthesis

Systolic array implementations for Cholesky, LU, and QR decomposition using HLS

Get started

Environment

  • Ubuntu 16.04.5 LTS
  • Xilinx Vivado HLS v2017.4
  • Matlab R2017a

Directory Tree

Inside each design folder, here are:

|-- Design_Folder/
  |- common/ 
  |- model4x4/
  |- template/

Folder common/ includes script files shared for different designs.
Folder model4x4/ gives an example of 4x4 implementation, with detailed comments alongside the codes.
Folder template/ includes template cpp files used for generating codes.
For a understanding of each design, please go to model4x4/ and view the comments in design_name.cpp, and refer to the illustrations shown below if necessary : )

Run it

For each design,

  • Go to common/. Find algorithm_name.cfg.xml, revise it according to your matrix size MxN.
    Please manually modify the parameter BIT according to BIT = ceiling(log2(SIZE)).
  • Run runit.csh. It will generate a new folder design_files/ with the design MxN/ inside:
|-- Design_Folder/
  |- common/
  |- design_files/
    |- MxN/
  |- model4x4/
  |- template/
  • Go to Design_Folder/ and call genA() in MATLAB to generate a random operand matrix A required for testbench.
    Some specific cases like 8x8, 16x16 etc. are provided under Design_Folder/. You can use it or generate a new one.
How to use function genA():
For Cholesky: generating NxN symmetric positive definite matrix by calling genA(N)
For LU: generating NxN full rank matrix by calling genA(N)
For QR: generating MxN full rank matrix by calling genA(M,N)
  • Go to Design_Folder/design_files/MxN/, run script.tcl in vivado_hls environment by $vivado_hls script.tcl.
  • Revise script.tcl according to your demands. It by default runs through csim, synthesis and cosim.

Versions

Cholesky

  • cholesky_v1.3:
    A 1-D systolic array design for Cholesky Decomposition along projection vector (i,j,k)=(0,1,0) and (i,k)=(0,1), as illustrated below in (b).
  • cholesky_v4.0:
    A 1-D systolic array design for Cholesky Decomposition along projection vector (i,j,k)=(0,1,0) and (i,k)=(1,0), as illustrated below in (c).

  • cholesky_v3.2:
    A 2-D systolic array design for Cholesky Decomposition along projection vector (i,j,k)=(1,0,0), as illustrated below in (b).

  • cholesky_v2.2:
    A 1-D systolic array design for Cholesky Decomposition along projection vector (i,j,k)=(0,1,0) and (i,k)=(0,1), as illustrated below in (b).

LU

  • lu1D_v1.0:
    A 1-D systolic array design for LU Decomposition along projection vector (i,j,k)=(0,1,0) and (i,k)=(0,1), as illustrated at the bottom of the picture below.

  • lu1D_v2.0:
    A 1-D systolic array design for LU Decomposition along projection vector (i,j,k)=(0,1,0) and (i,k)=(1,0), as illustrated at the right of the picture below.

  • lu2D_v1.0:
    A 2-D systolic array design for LU Decomposition along projection vector (i,j,k)=(0,1,0), as illustrated below in (b).

QR

  • qr_v1.1:
    A 1-D systolic array design for QR Decomposition along projection vector (i,j,k)=(1,0,0) and (j,k)=(0,1), as illustrated at the bottom of the picture below.

  • qr_v1.2:
    Replace unroll with pipeline in qr_v1.1 for better performance, as automatically unrolled rotations will not be parallelized due to FIFO conficts.

  • qr_v2.1:
    A 2-D systolic array design for QR Decomposition along projection vector (i,j,k)=(1,0,0), as illustrated below in (b).

Citation

I'd appreciate it if you could take a look at the following abstract and please cite it if it helps your work. :)

@inproceedings{Liu:2019:DSA:3289602.3293969,
 author = {Liu, Jie and Cong, Jason},
 title = {Dataflow Systolic Array Implementations of Matrix Decomposition Using High Level Synthesis},
 booktitle = {Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays},
 series = {FPGA '19},
 year = {2019},
 isbn = {978-1-4503-6137-8},
 location = {Seaside, CA, USA},
 pages = {187--187},
 numpages = {1},
 url = {http://doi.acm.org/10.1145/3289602.3293969},
 doi = {10.1145/3289602.3293969},
 acmid = {3293969},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {dataflow, high-level synthesis, matrix decomposition, systolic array, throughput},
} 

About

Systolic array implementations for Cholesky, LU, and QR decomposition

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published