Skip to content

Latest commit

 

History

History
239 lines (212 loc) · 10.9 KB

README.adoc

File metadata and controls

239 lines (212 loc) · 10.9 KB

Architecture Specification

RISC-V is a highly configurable standard. Numerous unnamed implementation options exist, and the presence/absence of hundreds of named optional extensions alter the behavior of the architecture. Additionally, implementations are free to add custom, non-standard extensions on top of those ratified by RISC-V International (RVI). As a result, two equally valid implementations of RISC-V can have wildly different specifications.

This makes creating and maintaining generic RISC-V specifications a daunting task. A few challenging issues that arise include:

  • The RISC-V standard actually specifies two different architectures, namely RV32 and RV64.

  • The presence/absence of standard instructions and CSRs is affected by both unnamed implementation options (e.g., the number of implemented PMP registers) and named extensions.

  • The behavior of architecturally-visible state is dependent on the set of implemented extensions. For example, the legal values of satp.MODE depend on whether or not the Sv32, Sv39, Sv48, and/or Sv57 extensions are implemented.

To tame this challenge, this specification generator takes the following approach:

  • The generic architecture (in the arch/ folder) is described in a way that covers all implementation options. As much as possible, the architecture is defined in a structured way that can be easily parsed using any programming language with a YAML library. Complex behavior (e.g., instruction operation) and some configuration-dependent metadata is specified in an architecture definition language (IDL) that can formally define the architecture in arbitrary ways.

  • An implementation of RISC-V is specified as a configuration containing all unnamed parameters and list of named supported extensions (in the cfgs/ folder).

  • A tool, included in this repository, can generate an implmentation-specific specification by applying the configuration to the generic spec. Behaviors that are not relevant are left out, and behaviors that are affected by multiple extensions are merged into a single description.

The architecture is specified in a series of YAML files for Extensions, Instructions, and Control and Status Registers (CSRs). Each extension/instruction/CSR has its own file.

Flow

+----------------------------------------------------------------------------------------------+
| Config (cfgs/NAME)                                                                           |
| +-------------------------+ +-------------------------+ +---------------------------------+  |
| | Implementation Options  | | Extension List          | | Implementation Overlay          |  |
| | (cfgs/NAME/params.yaml) | | (cfgs/NAME/params.yaml) | | (cfgs/NAME/arch_overlay/*.yaml) |  |
| +-------------------------+ +-------------------------+ +---------------------------------+  |
+------------------------|-----|--------------------------------|------------------------------+
                         |     |                                |
 +----------------+      v     v                                v
 |{s} Generic     |   /----------\                        /----------\
 |    Arch Spec   |-->|  Render  |                        |  Render  |
 | (arch)         |   |  (ERB)   :                        |  (ERB)   :
 +----------------+   \----------/                        \----------/
                           |                                    |
                           v                                    v
              +-----------------------------+     +-----------------------------------+
              | {s} Redered Arch Spec       |     | {s} Redered Overlay               |
              | (gen/NAME/.rendered_arch)   |     | (gen/NAME/.rendered_overlay_arch) |
              +-----------------------------+     +------------------------------------+
                           |                                    |
                           v                                    |
                 /-------------------------\                    |
                 |  Overlay                |<-------------------+
                 | (gen/NAME/.merged_arch) :
                 \-------------------------/
                           |
                           |
                     /-----------\
                     | Normalize |
                     \-----------/
                           |
                           v
              +-----------------------------+
              | {s} Implementation-specific |
              |     Archiecture Spec        |
              | (gen/NAME/arch)             |
              +-----------------------------+

The normalization step transitively determines all extensions/instructions/csrs that are implemented (e.g., because some extensions have implied dependencies) and ensures that missing fields are filled in with their defaults.

Data Format

All specification data is written in YAML. The data for Extensions, Instructions, and CSRs follow their own schemas, documented below. The files are checked for validity using Json Schema, and the precise schemas are located in the schemas/ directory.

Extensions

Example extension specification
H: # (1)
  type: privileged # (2)
  versions: # (3)
  - version: 1.0
    ratification_date: 2019-12
    requires: [S, '>= 1.12'] # (4)
  interrupt_codes: # (5)
  - num: 2
    name: Virtual supervisor software interrupt
  - num: 6
    name: Virtual supervisor timer interrupt
  - num: 10
    name: Virtual supervisor external interrupt
  - num: 12
    name: Supervisor guest external interrupt
  exception_codes: # (6)
  - num: 10
    name: Environment call from VS-mode
  - num: 20
    name: Instruction guest page fault
  - num: 21
    name: Load guest page fault
  - num: 22
    name: Virtual instruction
  - num: 23
    name: Store/AMO guest page fault
  description: | # (7)
    An Asciidoc description... (6)
  1. Name of the extension, which must follow the RVI naming scheme.

  2. Extension type: privileged or unprivileged

  3. List of versions

  4. [Optional] Declares a dependency on another extension (may be a list if there is more than one dependency).

  5. [Optional] List of asynchronous interrupts added by this extension

  6. [Optional] List of synchronous exceptions added by this extension

  7. A description of the extension, as Asciidoc source

Instructions

add: # (1)
  long_name: Add
  description: | # (2)
    Add the value in rs1 to rs2, and store the result in rd.
    Any overflow is thrown away.
  encoding: # (3)
    mask: 0000000----------000-----0110011
    fields:
    - name: rs2
      location: 24-20
    - name: rs1
      location: 19-15
    - name: rd
      location: 11-7
  definedBy: I # (4)
  assembly: xd, xs1, xs2 # (5)
  access: # (6)
    s: always
    u: always
    vs: always
    vu: always
  operation(): X[rd] = X[rs1] + X[rs2]; # (7)
  1. The instruction mnemonic, in lowercase

  2. Asciidoc description of the instruction

  3. Encoding of the instruction. 'mask' specifies the values and position of opcode fields, and 'fields' specifies the locations of decode variables.

  4. Extension that defines this instruction. May be a list if the instruction is defined by multiple extensions.

  5. Assembly format, to be used by ISS/disassembler/compiler/etc.

  6. Per-mode access rights (always, sometimes, or none). When 'sometimes', a field 'access-detail' should also be provided.

  7. Formal definition of the instruction operation, in IDL

Some instructions have decode fields that cannot take a certain value. This is especially common in the C extension where, for example, some register specifier fields can be anything but x0. That can be represented by adding a not_mask key to the encoding:

encoding for c.addi
encoding:
  mask:     000-----------01
  not_mask: ----00000------- # rs1/rd cannot be 0
  fields:
  - name: imm
    location: 12|6-2
  - name: rs1_rd
    location: 11-7

Not mask can also be a list when more than one value is prohibited (e.g., c.lui prohibits both x0 and x2 for rd).

Some fields are shifted before use, and can be represented using the left_shift key:

encoding for jal
  encoding:
    mask: -------------------------1101111
    fields:
    - name: imm
      # lsb of the immediate is always zero, so it isn't encoded in the instruction
      # this is also an example of representing decode variables that are split in the
      # encoding
      location: 31|19-12|20|30-21
      left_shift: 1
    - name: rd
      location: 11-7

CSRs

CSR specification for marchid
marchid: # (1)
  long_name: Machine Architecture ID
  address: 0xf12 # (2)
  priv_mode: M # (3)
  length: MXLEN # (4)
  description: | # (5)
    Asciidoc description
  definedBy: Sm # (6)
  fields: # (7)
    Architecture:
      location_rv32: 31-0 # (8)
      location_rv64: 63-0
      type: RO # (7)
      description: Vendor-specific microarchitecture ID. # (9)
      reset_value(): return ARCH_ID; # (10)
  1. CSR name

  2. CSR address (used by CSRs that not indirect)

  3. Least-privileged mode required to access the CSR

  4. Length of the CSR, in bits. Can either be an integer (e.g. 32, 64), or 'MXLEN', 'SXLEN', or 'VSXLEN' when the length is equal to the XLEN in M, S, or VS mode, respectively.

  5. Asciidoc description

  6. Defining extension. Can be list when more than one extension defines the CSR.

  7. List of fields in the CSR

  8. Location. In this case, the location changes with XLEN, so location_rv32 and location_rv64 are used. When the location does not change, use the single key location.

  9. Type of the field. See below for more information.

  10. Reset value. In this case, the reset value is determined by the configuration, so it is specified as an IDL function.

CSR fields are given a type, which does not necessarily correspond to the WARL/WLRL types in the RVI specs. We use a different format here because the RVI CSR types are vauge and inconsistent. The types are:

Type Meaning

RO

Read-only

RO-H

Read-only, and hardware updates the field

RW

Read-write

RW-R

Read-write, but only a restricted set of values are allowed

RW-H

Read-write, and hardware updates the field

RW-RH

Read-write, only a restricted set of values are allowed, and hardware updates the field

In many cases, the values of CSR and/or CSR field data are configuration dependent. Some of that is covered directly by the data model (e.g., with location_rv32, location_rv64), but some cases are too complex to express with YAML. For this reason, many of the keys can be specified as IDL functions. See the schema documentation and examples in the arch/csr folder for more information.

Some keys that only apply to certain CSRs are not shown above.