- What is the Turbo9?
- What are the target applications?
- Why use the 6809 instruction set? Why not RISC?
- But wait 6809 is CISC and CISC is bad!
- Key Features
- Presentations
- Publications
- Directory Structure
- Third-Party Tools
- Current Status
- Team Members
- Faculty
- Contact
The Turbo9 is a pipelined microprocessor IP written in Verilog that executes a superset of the Motorola 6809 instruction set. Its modern microarchitecture uses 16-bit internal datapaths and balances high performance against small area and low power. The Turbo9R, with a 16-bit memory interface, achieves 0.69 DMIPS/MHz, which is 3.8 times faster than Motorola's original 8-bit MC6809 implementation. It is an active graduate research project at the Department of Electrical & Computer Engineering at the University of Florida.
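For readers unfamiliar with the DMIPS/MHz unit, the arithmetic behind the figures above can be sketched in C. This is only an illustration: the function names are hypothetical, and the only constant used is the conventional Dhrystone definition of 1 DMIPS as 1757 Dhrystones per second. The baseline is simply derived from the two numbers quoted above and is not a separately measured result.

```c
#include <assert.h>

/* One DMIPS is conventionally defined as 1757 Dhrystones per second
   (the score of the VAX 11/780 reference machine). */
#define DHRYSTONES_PER_DMIPS 1757.0

/* Convert a measured Dhrystone rate into DMIPS/MHz.
   (Hypothetical helper name, for illustration only.) */
double dmips_per_mhz(double dhrystones_per_sec, double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return (dhrystones_per_sec / DHRYSTONES_PER_DMIPS) / clock_mhz;
}

/* Baseline score implied by a score and its speedup over that baseline. */
double implied_baseline(double score, double speedup)
{
    assert(speedup > 0.0);
    return score / speedup;
}
```

With the quoted figures, `implied_baseline(0.69, 3.8)` works out to roughly 0.18 DMIPS/MHz for the original MC6809.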
The target applications are SoC sub-blocks or small mixed-signal ASICs that require a compact and efficient microprocessor for programmable high-level control. Many 32-bit or 64-bit RISC-V and ARM cores try to fill this niche, but they prove to be inefficient solutions because many of these applications only require 16-bit precision.
The current industry trend is to adapt 32-bit RISC IP for microcontroller use; however, the large 32x32 register files and loosely encoded instructions of these cores limit their absolute minimum footprint. So, with the goal of creating a high-performance and compact microprocessor IP, we need a 16-bit instruction set architecture (ISA). We also want an architecture that is capable of running C code effectively. Given these requirements, the Motorola 6809 ISA stands out with its minimal number of registers (shown below), orthogonal instruction set, and powerful indexed and indirect addressing modes that map well to C concepts, such as arrays and pointers.
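To make the link between C and the 6809 addressing modes concrete, here is a small C sketch. The function names are hypothetical, and the assembly forms named in the comments (`,X+`, `D,X`, `[,X]`) are standard 6809 indexed addressing syntax, mentioned only as the kind of mode each C pattern lines up with. The code a particular compiler actually emits may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pointer walk: `*p++` is the pattern that auto-increment indexed
   addressing (written `,X+` in 6809 assembly) is suited to. */
uint16_t sum_bytes(const uint8_t *p, size_t n)
{
    assert(p != NULL || n == 0);
    uint16_t sum = 0;
    while (n--) {
        sum += *p++;
    }
    return sum;
}

/* Array indexing: base address plus a computed index, comparable to
   accumulator-offset indexed addressing (for example `D,X`). */
uint8_t table_lookup(const uint8_t *table, uint16_t index)
{
    return table[index];
}

/* Pointer to pointer: a double dereference, comparable to
   indexed indirect addressing (`[,X]`). */
uint8_t deref_twice(const uint8_t *const *pp)
{
    return **pp;
}
```

Each function body is a single C idiom, so it is easy to compile it with one of the 6809 compilers and compare the generated assembly against the comments.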
The 6809 was designed before RISC was defined and is therefore retroactively classed as a CISC processor. However, the instruction set is actually simpler than many RISC ISAs. The main RISC rule that the 6809 instruction set breaks is that it is not a "load-store" architecture: it is a simple accumulator architecture where one of its operands is in memory. Even so, the instruction set is very elegant and well thought through. This presents the challenge of pipelining a CISC processor while remaining as small as possible and attempting to rival the performance levels of RISC implementations. To do this, the Turbo9 implements a novel CISC to RISC micro-op decode stage (shown below).
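The general idea of translating a memory-operand instruction into load-store style micro-ops can be sketched in C as follows. This is a conceptual illustration only: the micro-op names, the `decode` function, and the two-step split are assumptions made for this example and do not describe the Turbo9's actual micro-op set or decode logic. The two opcodes are the real 6809 encodings of `ADDA` immediate (`$8B`) and `ADDA` direct (`$9B`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical micro-op kinds; not the Turbo9's actual micro-op set. */
typedef enum {
    UOP_LOAD_TMP,   /* tmp <- memory[effective address] */
    UOP_ADD_A_TMP,  /* A   <- A + tmp                   */
    UOP_ADD_A_IMM   /* A   <- A + immediate operand     */
} uop_t;

#define MAX_UOPS 2

/* Real 6809 opcodes for two forms of ADDA. */
#define OP_ADDA_IMM 0x8B  /* ADDA #n  (immediate operand)        */
#define OP_ADDA_DIR 0x9B  /* ADDA <n  (direct-page memory operand) */

/* Expand one opcode into micro-ops. Returns the micro-op count,
   or 0 if the opcode is not covered by this sketch. */
size_t decode(uint8_t opcode, uop_t out[MAX_UOPS])
{
    assert(out != NULL);
    switch (opcode) {
    case OP_ADDA_IMM:
        /* Operand is in the instruction stream: one ALU micro-op. */
        out[0] = UOP_ADD_A_IMM;
        return 1;
    case OP_ADDA_DIR:
        /* Operand is in memory: a load followed by a register ALU op. */
        out[0] = UOP_LOAD_TMP;
        out[1] = UOP_ADD_A_TMP;
        return 2;
    default:
        return 0;
    }
}
```

The point of the split is that each micro-op either accesses memory or operates on registers, never both, which is the property that makes a load-store pipeline straightforward.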
- Professional Level IP
  - Modern RTL design techniques & "good practice"
    - Fully synchronous with single clock
    - Well-defined separation of control and datapath
    - Separate hierarchy into smaller, easier-to-maintain modules
    - Design for efficient synthesis into ASIC standard cell libraries & FPGAs
    - Written in Verilog 2001 for EDA tool compatibility
  - Optimized for speed, power and area
    - Design for performance, but not at the expense of power and area
    - Minimize timing paths for max clock rate
    - Implement multi-cycle to reduce area / power
- Executes a Superset of the Motorola 6809 Instruction Set
  - Compatible with existing 6809 compilers, assemblers and code base
  - 16/32-bit multiply & divide instruction extensions
- Modern pipelined 16-bit micro-architecture
  - Instruction prefetch stage
  - Advanced decode stage (CISC to RISC micro-op translation)
  - Single/Multi-cycle execute stage
  - Turbo9R with a 16-bit memory interface achieves 0.69 DMIPS/MHz
    - ~3.8 times faster than original 8-bit MC6809 implementation
- Pipelined Wishbone bus
  - Public domain industry standard
  - Internal separate Program Bus & Data Bus
  - External shared Program/Data Bus
  - Adjustable pipeline stages w/ automatic latency adjustment
  - Different bus configurations available:
    - Turbo9: 8-bit shared data/program bus
    - Turbo9S: 16-bit aligned shared data/program bus
    - Turbo9R: 16-bit non-aligned shared data/program bus
    - Turbo9GTR: 16-bit non-aligned dual data & program bus
- Custom uRTL microcode assembler
  - Written in C
  - Macro-based assembler
  - Verilog output for efficient synthesis into gates, no ROMs
  - Statistics output
  - Unlike traditional sequential microcode, it is also capable of direct parallel decoding
- Professional Verification Testbench
  - Full self-checking Verilog testbench to verify instruction set
  - Full randomized regression capable
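The uRTL assembler feature listed above, "Verilog output for efficient synthesis into gates, no ROMs", can be illustrated with a minimal C sketch that writes a lookup table out as a Verilog-2001 `case` statement, which a synthesis tool can reduce to combinational logic. This is not the actual uRTL assembler: the `uword_t` type, the `emit_verilog_case` function, the 8-bit field widths, and the `addr`/`ctrl` signal names are all assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical microcode entry: an address and a control word. */
typedef struct {
    unsigned addr;
    unsigned ctrl;
} uword_t;

/* Write the table as a Verilog-2001 case statement into buf.
   Returns the number of characters written, or -1 if buf is too small. */
int emit_verilog_case(const uword_t *tab, size_t n, char *buf, size_t cap)
{
    assert(tab != NULL && buf != NULL && cap > 0);
    size_t used = 0;

    int w = snprintf(buf, cap, "always @* begin\n  case (addr)\n");
    if (w < 0 || (size_t)w >= cap) return -1;
    used = (size_t)w;

    for (size_t i = 0; i < n; i++) {
        /* One case arm per table entry, masked to the assumed 8-bit widths. */
        w = snprintf(buf + used, cap - used, "    8'h%02X: ctrl = 8'h%02X;\n",
                     tab[i].addr & 0xFFu, tab[i].ctrl & 0xFFu);
        if (w < 0 || (size_t)w >= cap - used) return -1;
        used += (size_t)w;
    }

    /* A default arm keeps the block purely combinational (no latches). */
    w = snprintf(buf + used, cap - used,
                 "    default: ctrl = 8'h00;\n  endcase\nend\n");
    if (w < 0 || (size_t)w >= cap - used) return -1;
    used += (size_t)w;

    return (int)used;
}
```

Because the table becomes ordinary logic rather than a memory macro, the synthesis tool is free to share and minimize terms across entries.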
- Youtube videos:
| Directory | Description |
| --- | --- |
| asm/ | Assembly code for the Turbo9 |
| docs/ | Documents |
| images/ | Images |
| c_code/ | C code for the Turbo9 |
| build_gcc/ | Build directory for GCC |
| build_vbcc/ | Build directory for VBCC |
| byte_sieve_src/ | BYTE Sieve source |
| dhrystone_src/ | Dhrystone source |
| hello_world_src/ | Hello World source |
| lib_gcc/ | Library for GCC |
| lib_vbcc/ | Library for VBCC |
| fpga/ | FPGA project directory |
| bit_files/ | .bit files for Arty A7-100T |
| regress/ | Nightly regression run directory |
| rtl/ | Verilog RTL for micro-architecture |
| urtl/ | uRTL microcode for micro-architecture |
| sim/ | Simulation run directory |
| tb/ | Testbench & Testcases |
| urtl_asm_src/ | uRTL microcode assembler source code |
- Linux environment with C shell, bash, gcc, and make
  - A Linux environment is required for running and building Turbo9 scripts and tools
  - We recommend Ubuntu MATE ;-)
- Required for running testbench
- Required for viewing waveforms from Icarus Verilog
- vbcc - portable ISO C compiler
  - An excellent C compiler for the 6809 / Turbo9
  - makefile and library provided in c_code/
- A port of the GCC compiler for the 6809
  - makefile and library provided in c_code/
- CMOC 6809 C language cross-compiler
  - A very nice C compiler for the 6809
- LWTOOLS cross-dev tools for Motorola 6809
  - An excellent assembler for the 6809
  - Required for several scripts in asm/
The current version of the Turbo9 is thoroughly verified and is capable of running C code. However, we still consider this version v0.9 because a few items are missing. All the 6809 instructions and addressing modes have been implemented and tested except SYNC and CWAI. The signed versions of the Turbo9's 16-bit divide and multiply need to be completed. Interrupts are partially implemented, including SWI and Reset. To reach version 1.0, we require the following:
- Finish SYNC and CWAI (6809 instructions)
- Finish EDIVS & EMULS (Turbo9 extensions)
- Finish Interrupts
- Finish Turbo9S bus version
- Implement testcases to verify the above
Other things to do:
- Fix stim bench
- Verify pipeline bubbles on reset are benign
- Project Leader
- Responsibilities
- Microarchitecture design
- RTL & Microcode development
- 15 years of industry experience in ASIC design
- Bachelor's Degree in Electrical Engineering from University of Florida 2008
- Master's Degree in Electrical Engineering from University of Florida in 2022
- Currently pursuing a PhD from University of Florida
- Master's Thesis: A Compact & Efficient Microprocessor IP for SoC Sub-Blocks and Mixed-Signal ASICs
- Principal Contributor
- Responsibilities
- Custom uRTL microcode assembler
- Verification & Tools
- 15 years of industry experience in ASIC design
- Bachelor of Science Degree in Computer Science and Software Engineering from Florida Institute of Technology 2008
- Currently pursuing a Master's Degree in Electrical Engineering from University of Florida
- Master's Thesis: Verification of a compact & efficient microprocessor IP
- Associate Professor
- NSF Center for Space, High-Performance, and Resilient Computing (SHREC)
- Research interests: Embedded systems with an emphasis in synthesis, compilers, reconfigurable computing, hardware/software co-design
- Website: www.gstitt.ece.ufl.edu
- Instructional Professor
- Machine Intelligence Laboratory Director
- Research interests: Robotics, embedded systems, controls, autonomous mobile agents
- Website: mil.ufl.edu/ems
- Director of School of Computing and Informatics - University of Louisiana Lafayette
- Academia: Former Professor and Chair of the Electrical and Computer Engineering Department at the University of Massachusetts Lowell
- Website: people.cmix.louisiana.edu/margala/
You may contact us at team[at]turbo9[dot]org. Thank you!