cpu/vexiiriscv integration #1923

Merged
merged 28 commits into from
Jun 8, 2024

@Dolu1990 Dolu1990 commented Apr 10, 2024

Do not merge yet (WIP)

Hi,
This PR adds VexiiRiscv to LiteX. Here is an example to generate a quad-core, Linux-capable RISC-V on digilent_nexys_video:

(single issue, RV32IMASU, Linux ready, performance looks quite good so far)

A few notes about the VexiiRiscv config.
It comes in 3 base CPU variants (standard, linux, debian), but on top of that there are quite a few parameters which can have a big impact on performance.

For instance, when I test it running Debian at max performance, I do:

python3 -m litex_boards.targets.digilent_nexys_video --cpu-type=vexiiriscv --cpu-variant=debian --with-jtag-tap  --bus-standard axi-lite \
--vexii-args="--lsu-software-prefetch --lsu-hardware-prefetch rpt --performance-counters 9 --regfile-async --lsu-l1-store-buffer-ops=32  --lsu-l1-store-buffer-slots=4 --lsu-l1-refill-count 4 --lsu-l1-writeback-count 4" \
--cpu-count=4 --with-jtag-tap  --with-video-framebuffer --l2-self-flush=40c00000,40dd4c00,1666666  --with-sdcard --with-ethernet --with-coherent-dma --l2-byte=262144  --sys-clk-freq 100000000 \
--update-repo=no --soc-json build/csr.json --build   --load

Here is an overview:

Arguments which go into --vexii-args="xxxx"

  • --lsu-software-prefetch --lsu-hardware-prefetch rpt: for fast sequential memory accesses
  • --lsu-l1-store-buffer-ops=32 --lsu-l1-store-buffer-slots=4: to avoid stalling the CPU on store misses
  • --lsu-l1-refill-count 4 --lsu-l1-writeback-count 4: to allow the data cache to have multiple in-flight memory requests
  • --relaxed-btb: to improve timing of the BTB at the cost of a 1-cycle penalty per predicted-taken branch/jump
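Because everything inside --vexii-args must reach LiteX as a single argument, it can help to assemble the command programmatically rather than by hand. The following is a minimal sketch, not part of the PR: the grouping and the variable names are illustrative assumptions, and only flags quoted in this thread are used.

```python
import shlex

# Option groups copied from the command above; the grouping and the
# variable names are illustrative only (not from the PR).
prefetch = ["--lsu-software-prefetch", "--lsu-hardware-prefetch", "rpt"]
store_buffer = ["--lsu-l1-store-buffer-ops=32", "--lsu-l1-store-buffer-slots=4"]
inflight = ["--lsu-l1-refill-count", "4", "--lsu-l1-writeback-count", "4"]

# All CPU-level options are joined into one string for --vexii-args.
vexii_args = " ".join(prefetch + store_buffer + inflight)

# A shortened SoC-level command; extend it with the other flags as needed.
command = [
    "python3", "-m", "litex_boards.targets.digilent_nexys_video",
    "--cpu-type=vexiiriscv", "--cpu-variant=debian",
    f"--vexii-args={vexii_args}",
    "--cpu-count=4",
]

# shlex.join quotes the --vexii-args element so the shell keeps it whole.
print(shlex.join(command))
```

Keeping the command as a list also makes it easier to spot a flag that has been passed twice before the string is handed to a shell.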

SoC-level arguments

  • --l2-byte=262144: enables the Vexii shared L2 cache
  • --l2-self-flush=40c00000,40dd4c00,1666666: a workaround which periodically flushes a given memory range in the L2 to ensure the video DMA gets updated pictures
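The numbers in these two arguments are easier to read once decoded. The sketch below is only an interpretation: it assumes the three --l2-self-flush fields are a start address (hex), an end address (hex), and a period in system-clock cycles (decimal), which the thread does not state explicitly. The helper name parse_self_flush is hypothetical.

```python
def parse_self_flush(value):
    """Split a --l2-self-flush value into integers.

    ASSUMPTION: the fields are start address (hex), end address (hex),
    and flush period in clock cycles (decimal). This is a guess based
    on the description in this thread, not a documented format.
    """
    start_hex, end_hex, period = value.split(",")
    return int(start_hex, 16), int(end_hex, 16), int(period)


start, end, period_cycles = parse_self_flush("40c00000,40dd4c00,1666666")
sys_clk_hz = 100_000_000  # from --sys-clk-freq 100000000 in the command

region_bytes = end - start              # size of the flushed range
flush_hz = sys_clk_hz / period_cycles   # flushes per second, if cycles are sys-clk cycles
l2_kib = 262144 // 1024                 # --l2-byte value expressed in KiB

print(region_bytes)        # 1920000
print(round(flush_hz, 2))  # 60.0
print(l2_kib)              # 256
```

Under that reading, the range is 1,920,000 bytes, which happens to equal 800 x 600 pixels at 4 bytes per pixel, flushed about 60 times per second. That would be consistent with a framebuffer refreshed once per video frame, but it remains an inference from the numbers.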


Dolu1990 commented May 6, 2024

Here is an example of a Debian-ready dual-core configuration (WIP):

python3 -m litex_boards.targets.digilent_nexys_video --cpu-type=vexiiriscv  --with-jtag-tap  --bus-standard axi-lite --vexii-args=" \
--allow-bypass-from=0 --debug-privileged --with-mul --with-div --div-ipc --with-rva --with-supervisor --performance-counters 0 \
--regfile-async --xlen=64 --with-rvc --with-rvf --with-rvd --fma-reduced-accuracy \
--fetch-l1 --fetch-l1-ways=4 --fetch-l1-mem-data-width-min=64 \
--lsu-l1 --lsu-l1-ways=4  --lsu-l1-mem-data-width-min=64 --lsu-l1-store-buffer-ops=32 --lsu-l1-refill-count 2 --lsu-l1-writeback-count 2 --lsu-l1-store-buffer-slots=2  --with-lsu-bypass \
--with-btb --with-ras --with-gshare --relaxed-branch"  --cpu-count=2 --with-jtag-tap  --with-video-framebuffer --with-sdcard --with-ethernet --with-coherent-dma --l2-byte=131072 --update-repo=no  --sys-clk-freq 100000000 --build   --load

@Dolu1990

Updated it with 3 base variants:
cached, linux, debian

On digilent_nexys_video:

# Debian ready:
python3 -m litex_boards.targets.digilent_nexys_video --cpu-type=vexiiriscv --cpu-variant=debian  --cpu-count=1 --with-video-framebuffer --with-sdcard --with-ethernet --with-coherent-dma --build --load

# Debian ready, with more performance:
python3 -m litex_boards.targets.digilent_nexys_video --cpu-type=vexiiriscv --cpu-variant=debian --bus-standard axi-lite --vexii-args="--regfile-async --lsu-l1-store-buffer-ops=32 --lsu-l1-refill-count 2 --lsu-l1-writeback-count 2 --lsu-l1-store-buffer-slots=2" --cpu-count=4 --with-jtag-tap  --with-video-framebuffer --with-sdcard --with-ethernet --with-coherent-dma --l2-byte=262144  --build --load

--l2-self-flush=40c00000,40DD4C00,1666666

Dolu1990 commented Jun 7, 2024

@enjoy-digital
Hi,
I think this is OK to merge now :)

@enjoy-digital enjoy-digital left a comment

Awesome, thanks @Dolu1990!

@enjoy-digital enjoy-digital merged commit 7f81499 into enjoy-digital:master Jun 8, 2024
1 check passed

@enjoy-digital

Great, thanks @Dolu1990!
