diff --git a/docs/Advanced-Usage/Resources.rst b/docs/Advanced-Usage/Resources.rst index 538c43b0..788e43be 100644 --- a/docs/Advanced-Usage/Resources.rst +++ b/docs/Advanced-Usage/Resources.rst @@ -1,8 +1,8 @@ Accessing Scala Resources =============================== -A simple way to copy over a source file to the build directory to be used for a simulation compile or VLSI flow is to use the ``setResource`` or ``addResource`` functions given by FIRRTL. -They can be used in the following way: +A simple way to copy over a source file to the build directory to be used for a simulation compile or VLSI flow is to use the ``addResource`` function given by FIRRTL. +It can be used in the following way: .. code-block:: scala @@ -14,13 +14,13 @@ They can be used in the following way: val exit = Output(Bool()) }) - setResource("/testchipip/vsrc/SimSerial.v") - setResource("/testchipip/csrc/SimSerial.cc") + addResource("/testchipip/vsrc/SimSerial.v") + addResource("/testchipip/csrc/SimSerial.cc") } In this example, the ``SimSerial`` files will be copied from a specific folder (in this case the ``path/to/testchipip/src/main/resources/testchipip/...``) to the build folder. -The ``set/addResource`` path retrieves resources from the ``src/main/resources`` directory. -So to get an item at ``src/main/resources/fileA.v`` you can use ``setResource("/fileA.v")``. +The ``addResource`` path retrieves resources from the ``src/main/resources`` directory. +So to get an item at ``src/main/resources/fileA.v`` you can use ``addResource("/fileA.v")``. However, one caveat of this approach is that to retrieve the file during the FIRRTL compile, you must have that project in the FIRRTL compiler's classpath. Thus, you need to add the SBT project as a dependency to the FIRRTL compiler in the Chipyard ``build.sbt``, which in Chipyards case is the ``tapeout`` project. For example, you added a new project called ``myAwesomeAccel`` in the Chipyard ``build.sbt``. 
diff --git a/docs/Chipyard-Basics/Chipyard-Components.rst b/docs/Chipyard-Basics/Chipyard-Components.rst index 49aff86a..250beb85 100644 --- a/docs/Chipyard-Basics/Chipyard-Components.rst +++ b/docs/Chipyard-Basics/Chipyard-Components.rst @@ -86,12 +86,12 @@ Sims **verilator (Verilator wrapper)** Verilator is an open source Verilog simulator. The ``verilator`` directory provides wrappers which construct Verilator-based simulators from relevant generated RTL, allowing for execution of test RISC-V programs on the simulator (including vcd waveform files). - See :ref:`Verilator` for more information. + See :ref:`Verilator (Open-Source)` for more information. **vcs (VCS wrapper)** VCS is a proprietary Verilog simulator. Assuming the user has valid VCS licenses and installations, the ``vcs`` directory provides wrappers which construct VCS-based simulators from relevant generated RTL, allowing for execution of test RISC-V programs on the simulator (including vcd/vpd waveform files). - See :ref:`VCS` for more information. + See :ref:`Synopsys VCS (License Required)` for more information. **FireSim** FireSim is an open-source FPGA-accelerated simulation platform, using Amazon Web Services (AWS) EC2 F1 instances on the public cloud. @@ -109,4 +109,4 @@ VLSI The HAMMER flow provide automated scripts which generate relevant tool commands based on a higher level description of physical design constraints. The HAMMER flow also allows for re-use of process technology knowledge by enabling the construction of process-technology-specific plug-ins, which describe particular constraints relating to that process technology (obsolete standard cells, metal layer routing constraints, etc.). The HAMMER flow requires access to proprietary EDA tools and process technology libraries. - See :ref:`HAMMER` for more information. + See :ref:`Core HAMMER` for more information. 
diff --git a/docs/Chipyard-Basics/Configs-Parameters-Mixins.rst b/docs/Chipyard-Basics/Configs-Parameters-Mixins.rst index d9becfdf..1d0aa813 100644 --- a/docs/Chipyard-Basics/Configs-Parameters-Mixins.rst +++ b/docs/Chipyard-Basics/Configs-Parameters-Mixins.rst @@ -21,7 +21,7 @@ Configs are additive, can override each other, and can be composed of other Conf The naming convention for an additive Config is ``With``, while the naming convention for a non-additive Config will be ````. Configs can take arguments which will in-turn set parameters in the design or reference other parameters in the design (see :ref:`Parameters`). -:numref:`basic-config-example` shows a basic additive Config class that takes in zero arguments and instead uses hardcoded values to set the RTL design parameters. +This example shows a basic additive Config class that takes in zero arguments and instead uses hardcoded values to set the RTL design parameters. In this example, ``MyAcceleratorConfig`` is a Scala case class that defines a set of variables that the generator can use when referencing the ``MyAcceleratorKey`` in the design. .. _basic-config-example: @@ -38,7 +38,7 @@ In this example, ``MyAcceleratorConfig`` is a Scala case class that defines a se someLength = 256) }) -This next example (:numref:`complex-config-example`) shows a "higher-level" additive Config that uses prior parameters that were set to derive other parameters. +This next example shows a "higher-level" additive Config that uses prior parameters that were set to derive other parameters. .. _complex-config-example: .. code-block:: scala @@ -52,7 +52,7 @@ This next example (:numref:`complex-config-example`) shows a "higher-level" addi hartId = up(RocketTilesKey, site).length) }) -:numref:`top-level-config` shows a non-additive Config that combines the prior two additive Configs using ``++``. +The following example shows a non-additive Config that combines the prior two additive Configs using ``++``. 
The additive Configs are applied from the right to left in the list (or bottom to top in the example). Thus, the order of the parameters being set will first start with the ``DefaultExampleConfig``, then ``WithMyAcceleratorParams``, then ``WithMyMoreComplexAcceleratorConfig``. @@ -65,13 +65,18 @@ Thus, the order of the parameters being set will first start with the ``DefaultE new DefaultExampleConfig ) +The ``site``, ``here``, and ``up`` objects in ``WithMyMoreComplexAcceleratorConfig`` are maps from configuration keys to their definitions. +The ``site`` map gives you the definitions as seen from the root of the configuration hierarchy (in this example, ``SomeAdditiveConfig``). +The ``here`` map gives the definitions as seen at the current level of the hierarchy (i.e. in ``WithMyMoreComplexAcceleratorConfig`` itself). +The ``up`` map gives the definitions as seen from the next level up from the current (i.e. from ``WithMyAcceleratorParams``). + Cake Pattern ------------------------- A cake pattern is a Scala programming pattern, which enable "mixing" of multiple traits or interface definitions (sometimes referred to as dependency injection). It is used in the Rocket Chip SoC library and Chipyard framework in merging multiple system components and IO interfaces into a large system component. -:numref:`cake-example` shows a Rocket Chip based SoC that merges multiple system components (BootROM, UART, etc) into a single top-level design. +This example shows a Rocket Chip based SoC that merges multiple system components (BootROM, UART, etc) into a single top-level design. .. _cake-example: .. code-block:: scala @@ -92,7 +97,7 @@ Mix-in A mix-in is a Scala trait, which sets parameters for specific system components, as well as enabling instantiation and wiring of the relevant system components to system buses. The naming convention for an additive mix-in is ``Has``. 
-This is show in :numref:`cake-example` where things such as ``HasPeripherySerial`` connect a RTL component to a bus and expose signals to the top-level. +This is shown in the MySoC class where things such as ``HasPeripherySerial`` connect a RTL component to a bus and expose signals to the top-level. Additional References --------------------------- diff --git a/docs/Chipyard-Basics/Running-A-Simulation.rst b/docs/Chipyard-Basics/Running-A-Simulation.rst index 7b3f0cc1..76eb0acb 100644 --- a/docs/Chipyard-Basics/Running-A-Simulation.rst +++ b/docs/Chipyard-Basics/Running-A-Simulation.rst @@ -9,7 +9,7 @@ Software RTL Simulation ------------------------ The Chipyard framework provides wrappers for two common software RTL simulators: the open-source Verilator simulator and the proprietary VCS simulator. -For more information on either of these simulators, please refer to :ref:`Verilator` or :ref:`VCS`. +For more information on either of these simulators, please refer to :ref:`Verilator (Open-Source)` or :ref:`Synopsys VCS (License Required)`. The following instructions assume at least one of these simulators is installed. Verilator/VCS Flows diff --git a/docs/Customization/Adding-An-Accelerator.rst b/docs/Customization/Adding-An-Accelerator.rst index 90a74733..3c0ea4eb 100644 --- a/docs/Customization/Adding-An-Accelerator.rst +++ b/docs/Customization/Adding-An-Accelerator.rst @@ -1,6 +1,6 @@ .. _adding-an-accelerator: -Adding An Accelerator/Device +Adding an Accelerator/Device =============================== Accelerators or custom IO devices can be added to your SoC in several ways: @@ -66,50 +66,23 @@ MMIO Peripheral The easiest way to create a TileLink peripheral is to use the ``TLRegisterRouter``, which abstracts away the details of handling the TileLink protocol and provides a convenient interface for specifying memory-mapped registers. 
To create a RegisterRouter-based peripheral, you will need to specify a parameter case class for the configuration settings, a bundle trait with the extra top-level ports, and a module implementation containing the actual RTL. +In this case we use a submodule ``PWMBase`` to actually perform the pulse-width modulation. The ``PWMModule`` class only creates the registers and hooks them +up using ``regmap``. -.. code-block:: scala - - case class PWMParams(address: BigInt, beatBytes: Int) - - trait PWMTLBundle extends Bundle { - val pwmout = Output(Bool()) - } - - trait PWMTLModule { - val io: PWMTLBundle - implicit val p: Parameters - def params: PWMParams - - val w = params.beatBytes * 8 - val period = Reg(UInt(w.W)) - val duty = Reg(UInt(w.W)) - val enable = RegInit(false.B) - - // ... Use the registers to drive io.pwmout ... - - regmap( - 0x00 -> Seq( - RegField(w, period)), - 0x04 -> Seq( - RegField(w, duty)), - 0x08 -> Seq( - RegField(1, enable))) - } - +.. literalinclude:: ../../generators/example/src/main/scala/PWM.scala + :language: scala + :start-after: DOC include start: PWM generic traits + :end-before: DOC include end: PWM generic traits Once you have these classes, you can construct the final peripheral by extending the ``TLRegisterRouter`` and passing the proper arguments. The first set of arguments determines where the register router will be placed in the global address map and what information will be put in its device tree entry. The second set of arguments is the IO bundle constructor, which we create by extending ``TLRegBundle`` with our bundle trait. The final set of arguments is the module constructor, which we create by extends ``TLRegModule`` with our module trait. -.. code-block:: scala - - class PWMTL(c: PWMParams)(implicit p: Parameters) - extends TLRegisterRouter( - c.address, "pwm", Seq("ucbbar,pwm"), - beatBytes = c.beatBytes)( - new TLRegBundle(c, _) with PWMTLBundle)( - new TLRegModule(c, _, _) with PWMTLModule) +.. 
literalinclude:: ../../generators/example/src/main/scala/PWM.scala + :language: scala + :start-after: DOC include start: PWMTL + :end-before: DOC include end: PWMTL The full module code can be found in ``generators/example/src/main/scala/PWM.scala``. @@ -121,20 +94,10 @@ In the Rocket Chip cake, there are two kinds of traits: a ``LazyModule`` trait a The ``LazyModule`` trait runs setup code that must execute before all the hardware gets elaborated. For a simple memory-mapped peripheral, this just involves connecting the peripheral's TileLink node to the MMIO crossbar. -.. code-block:: scala - - trait HasPeripheryPWM extends HasSystemNetworks { - implicit val p: Parameters - - private val address = 0x2000 - - val pwm = LazyModule(new PWMTL( - PWMParams(address, peripheryBusConfig.beatBytes))(p)) - - pwm.node := TLFragmenter( - peripheryBusConfig.beatBytes, cacheBlockBytes)(peripheryBus.node) - } - +.. literalinclude:: ../../generators/example/src/main/scala/PWM.scala + :language: scala + :start-after: DOC include start: HasPeripheryPWMTL + :end-before: DOC include end: HasPeripheryPWMTL Note that the ``PWMTL`` class we created from the register router is itself a ``LazyModule``. Register routers have a TileLink node simply named "node", which we can hook up to the Rocket Chip bus. @@ -144,77 +107,43 @@ The module implementation trait is where we instantiate our PWM module and conne Since this module has an extra `pwmout` output, we declare that in this trait, using Chisel's multi-IO functionality. We then connect the ``PWMTL``'s pwmout to the pwmout we declared. -.. code-block:: scala - - trait HasPeripheryPWMModuleImp extends LazyMultiIOModuleImp { - implicit val p: Parameters - val outer: HasPeripheryPWM - - val pwmout = IO(Output(Bool())) - - pwmout := outer.pwm.module.io.pwmout - } +.. 
literalinclude:: ../../generators/example/src/main/scala/PWM.scala + :language: scala + :start-after: DOC include start: HasPeripheryPWMTLModuleImp + :end-before: DOC include end: HasPeripheryPWMTLModuleImp Now we want to mix our traits into the system as a whole. This code is from ``generators/example/src/main/scala/Top.scala``. -.. code-block:: scala - - class ExampleTopWithPWM(q: Parameters) extends ExampleTop(q) - with PeripheryPWM { - override lazy val module = Module( - new ExampleTopWithPWMModule(p, this)) - } - - class ExampleTopWithPWMModule(l: ExampleTopWithPWM) - extends ExampleTopModule(l) with HasPeripheryPWMModuleImp - +.. literalinclude:: ../../generators/example/src/main/scala/Top.scala + :language: scala + :start-after: DOC include start: TopWithPWMTL + :end-before: DOC include end: TopWithPWMTL Just as we need separate traits for ``LazyModule`` and module implementation, we need two classes to build the system. -The ``ExampleTop`` classes already have the basic peripherals included for us, so we will just extend those. +The ``Top`` classes already have the basic peripherals included for us, so we will just extend those. -The ``ExampleTop`` class includes the pre-elaboration code and also a ``lazy val`` to produce the module implementation (hence ``LazyModule``). -The ``ExampleTopModule`` class is the actual RTL that gets synthesized. +The ``Top`` class includes the pre-elaboration code and also a ``lazy val`` to produce the module implementation (hence ``LazyModule``). +The ``TopModule`` class is the actual RTL that gets synthesized. -Finally, we need to add a configuration class in ``generators/example/src/main/scala/Configs.scala`` that tells the ``TestHarness`` to instantiate ``ExampleTopWithPWM`` instead of the default ``ExampleTop``. +Next, we need to add a configuration mixin in ``generators/example/src/main/scala/ConfigMixins.scala`` that tells the ``TestHarness`` to instantiate ``TopWithPWMTL`` instead of the default ``Top``. -.. 
code-block:: scala +.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala + :language: scala + :start-after: DOC include start: WithPWMTop + :end-before: DOC include end: WithPWMTop - class WithPWM extends Config((site, here, up) => { - case BuildTop => (p: Parameters) => - Module(LazyModule(new ExampleTopWithPWM()(p)).module) - }) - - class PWMConfig extends Config(new WithPWM ++ new BaseExampleConfig) +And finally, we create a configuration class in ``generators/example/src/main/scala/RocketConfigs.scala`` that uses this mixin. +.. literalinclude:: ../../generators/example/src/main/scala/RocketConfigs.scala + :language: scala + :start-after: DOC include start: PWMRocketConfig + :end-before: DOC include end: PWMRocketConfig Now we can test that the PWM is working. The test program is in ``tests/pwm.c``. -.. code-block:: c - - #define PWM_PERIOD 0x2000 - #define PWM_DUTY 0x2008 - #define PWM_ENABLE 0x2010 - - static inline void write_reg(unsigned long addr, unsigned long data) - { - volatile unsigned long *ptr = (volatile unsigned long *) addr; - *ptr = data; - } - - static inline unsigned long read_reg(unsigned long addr) - { - volatile unsigned long *ptr = (volatile unsigned long *) addr; - return *ptr; - } - - int main(void) - { - write_reg(PWM_PERIOD, 20); - write_reg(PWM_DUTY, 5); - write_reg(PWM_ENABLE, 1); - } - +.. literalinclude:: ../../tests/pwm.c + :language: c This just writes out to the registers we defined earlier. The base of the module's MMIO region is at 0x2000. @@ -226,9 +155,9 @@ Now with all of that done, we can go ahead and run our simulation. .. 
code-block:: shell - cd verilator - make CONFIG=PWMConfig - ./simulator-example-PWMConfig ../tests/pwm.riscv + cd sims/verilator + make CONFIG=PWMRocketConfig TOP=TopWithPWMTL + ./simulator-example-PWMRocketConfig ../../tests/pwm.riscv Adding a RoCC Accelerator ---------------------------- @@ -293,47 +222,40 @@ For instance, if we wanted to add the previously defined accelerator and route c }) class CustomAcceleratorConfig extends Config( - new WithCustomAccelerator ++ new DefaultExampleConfig) + new WithCustomAccelerator ++ new RocketConfig) + +To add RoCC instructions in your program, use the RoCC C macros provided in ``tests/rocc.h``. You can find examples in the files ``tests/accum.c`` and ``tests/charcount.c``. Adding a DMA port ------------------- -IO devices or accelerators (like a disk or network driver), we may want to have the device write directly to the coherent memory system instead. -To add a device like that, you would do the following. +For IO devices or accelerators (like a disk or network driver), instead of +having the CPU poll data from the device, we may want to have the device write +directly to the coherent memory system. For example, here is a device +that writes zeros to the memory at a configured address. -.. code-block:: scala +.. literalinclude:: ../../generators/example/src/main/scala/InitZero.scala + :language: scala - class DMADevice(implicit p: Parameters) extends LazyModule { - val node = TLClientNode(TLClientParameters( - name = "dma-device", sourceId = IdRange(0, 1))) +.. literalinclude:: ../../generators/example/src/main/scala/Top.scala + :language: scala + :start-after: DOC include start: TopWithInitZero + :end-before: DOC include end: TopWithInitZero - lazy val module = new DMADeviceModule(this) - } +We use ``TLHelper.makeClientNode`` to create a TileLink client node for us. +We then connect the client node to the memory system through the front bus (fbus). 
+For more info on creating TileLink client nodes, take a look at :ref:`Client Node`. - class DMADeviceModule(outer: DMADevice) extends LazyModuleImp(outer) { - val io = IO(new Bundle { - val mem = outer.node.bundleOut - val ext = new ExtBundle - }) +Once we've created our top-level module including the DMA widget, we can create a configuration for it as we did before. - // ... rest of the code ... - } +.. literalinclude:: ../../generators/example/src/main/scala/ConfigMixins.scala + :language: scala + :start-after: DOC include start: WithInitZero + :end-before: DOC include end: WithInitZero - trait HasPeripheryDMA extends HasSystemNetworks { - implicit val p: Parameters - - val dma = LazyModule(new DMADevice) - - fsb.node := dma.node - } - - trait HasPeripheryDMAModuleImp extends LazyMultiIOModuleImp { - val ext = IO(new ExtBundle) - ext <> outer.dma.module.io.ext - } +.. literalinclude:: ../../generators/example/src/main/scala/RocketConfigs.scala + :language: scala + :start-after: DOC include start: InitZeroRocketConfig + :end-before: DOC include end: InitZeroRocketConfig -The ``ExtBundle`` contains the signals we connect off-chip that we get data from. -The DMADevice also has a Tilelink client port that we connect into the L1-L2 crossbar through the front-side buffer (fsb). -The sourceId variable given in the ``TLClientNode`` instantiation determines the range of ids that can be used in acquire messages from this device. -Since we specified [0, 1) as our range, only the ID 0 can be used. diff --git a/docs/Customization/Boot-Process.rst b/docs/Customization/Boot-Process.rst new file mode 100644 index 00000000..0557b8bb --- /dev/null +++ b/docs/Customization/Boot-Process.rst @@ -0,0 +1,78 @@ +Chipyard Boot Process +======================= + +This section will describe in detail the process by which a Chipyard-based +SoC boots a Linux kernel and the changes you can make to customize this process. 
+ +BootROM and RISC-V Frontend Server +---------------------------------- + +The first instructions to run when the SoC is powered on are those stored in +the BootROM. The assembly for the BootROM code is located in +`generators/testchipip/src/main/resources/testchipip/bootrom/bootrom.S `_. +The BootROM address space starts at ``0x10000`` and execution starts at address +``0x10040``, which is marked by the ``_hang`` label in the BootROM assembly. + +The Chisel generator encodes the assembled instructions into the BootROM +hardware at elaboration time, so if you want to change the BootROM code, you +will need to run ``make`` in the bootrom directory and then regenerate the +verilog. If you don't want to overwrite the existing ``bootrom.S``, you can +also point the generator to a different bootrom image by overriding the +``BootROMParams`` key in the configuration. + +.. code-block:: scala + + class WithMyBootROM extends Config((site, here, up) => { + case BootROMParams => + BootROMParams(contentFileName = "/path/to/your/bootrom.img") + }) + +The default bootloader simply loops on a wait-for-interrupt (WFI) instruction +as the RISC-V frontend-server (FESVR) loads the actual program. +FESVR is a program that runs on the host CPU and can read/write arbitrary +parts of the target system memory using the Tethered Serial Interface (TSI). + +FESVR uses TSI to load a baremetal executable or second-stage bootloader into +the SoC memory. In :ref:`Software RTL Simulation`, this will be the binary you +pass to the simulator. Once it is finished loading the program, FESVR will +write to the software interrupt register for CPU 0, which will bring CPU 0 +out of its WFI loop. Once it receives the interrupt, CPU 0 will write to +the software interrupt registers for the other CPUs in the system and then +jump to the beginning of DRAM to execute the first instruction of the loaded +executable. The other CPUs will be woken up by the first CPU and also jump +to the beginning of DRAM. 
+ +The executable loaded by FESVR should have memory locations designated +as *tohost* and *fromhost*. FESVR uses these memory locations to communicate +with the executable once it is running. The executable uses *tohost* to send +commands to FESVR for things like printing to the console, +proxying system calls, and shutting down the SoC. The *fromhost* register is +used to send back responses for *tohost* commands and for sending console +input. + +The Berkeley Boot Loader and RISC-V Linux +----------------------------------------- + +For baremetal programs, the story ends here. The loaded executable will run in +machine mode until it sends a command through the *tohost* register telling the +FESVR to power off the SoC. + +However, for booting the Linux Kernel, you will need to use a second-stage +bootloader called the Berkeley Boot Loader, or BBL. This program reads the +device tree encoded in the boot ROM and transforms it into a format compatible +with the Linux kernel. It then sets up virtual memory and the interrupt +controller, loads the kernel, which is embedded in the bootloader binary as a +payload, and starts executing the kernel in supervisor mode. The bootloader is +also responsible for servicing machine-mode traps from the kernel and +proxying them over FESVR. + +Once BBL jumps into supervisor mode, the Linux kernel takes over and begins +its process. It eventually loads the ``init`` program and runs it in user +mode, thus starting userspace execution. + +The easiest way to build a BBL image that boots Linux is to use the FireMarshal +tool that lives in the `firesim-software `_ +repository. Directions on how to use FireMarshal can be found in the +`FireSim documentation `_. +Using FireMarshal, you can add custom kernel configurations and userspace software +to your workload. 
diff --git a/docs/Customization/Memory-Hierarchy.rst b/docs/Customization/Memory-Hierarchy.rst index fc9792c4..7864ddcc 100644 --- a/docs/Customization/Memory-Hierarchy.rst +++ b/docs/Customization/Memory-Hierarchy.rst @@ -1,4 +1,129 @@ Memory Hierarchy =============================== -TODO: Talk about SiFive Cache, and integration with L1 and backing main memory models -(maybe even Tilelink) + +The L1 Caches +-------------- + +Each CPU tile has an L1 instruction cache and L1 data cache. The size and +associativity of these caches can be configured. The default ``RocketConfig`` +uses 16 KiB, 4-way set-associative instruction and data caches. However, +if you use the ``WithNMediumCores`` or ``WithNSmallCores`` mixins, you can +configure 4 KiB direct-mapped caches for L1I and L1D. + +.. code-block:: scala + + import freechips.rocketchip.subsystem.{WithNMediumCores, WithNSmallCores} + + class SmallRocketConfig extends Config( + new WithNSmallCores(1) ++ + new RocketConfig) + + class MediumRocketConfig extends Config( + new WithNMediumCores(1) ++ + new RocketConfig) + +If you only want to change the size or associativity, there are configuration +mixins for those too. + +.. code-block:: scala + + import freechips.rocketchip.subsystem.{WithL1ICacheSets, WithL1DCacheSets, WithL1ICacheWays, WithL1DCacheWays} + + class MyL1RocketConfig extends Config( + new WithL1ICacheSets(128) ++ + new WithL1ICacheWays(2) ++ + new WithL1DCacheSets(128) ++ + new WithL1DCacheWays(2) ++ + new RocketConfig) + +You can also configure the L1 data cache as a data scratchpad instead. +However, there are some limitations on this. If you are using a data scratchpad, +you can only use a single core and you cannot give the design an external DRAM. + +.. 
code-block:: scala + + import freechips.rocketchip.subsystem.{WithNoMemPort, WithScratchpadsOnly} + + class ScratchpadRocketConfig extends Config( + new WithNoMemPort ++ + new WithScratchpadsOnly ++ + new SmallRocketConfig) + +The SiFive L2 Cache +------------------- + +The default RocketConfig provided in the Chipyard example project uses SiFive's +InclusiveCache generator to produce a shared L2 cache. In the default +configuration, the L2 uses a single cache bank with 512 KiB capacity and 8-way +set-associativity. However, you can change these parameters to obtain your +desired cache configuration. The main restriction is that the number of ways +and the number of banks must be powers of 2. + +.. code-block:: scala + + import freechips.rocketchip.subsystem.WithInclusiveCache + + // Create an SoC with a 1 MB, 4-way, 4-bank cache + class MyCacheRocketConfig extends Config( + new WithInclusiveCache( + capacityKB = 1024, + nWays = 4, + nBanks = 4) ++ + new RocketConfig) + +The Broadcast Hub +----------------- + +If you do not want to use the L2 cache (say, for a resource-limited embedded +design), you can create a configuration without it. Instead of using the L2 +cache, you will use RocketChip's TileLink broadcast hub. +To make such a configuration, you can just copy the definition of +``RocketConfig`` but omit the ``WithInclusiveCache`` mixin from the +list of included mixins. + +.. code-block:: scala + + import freechips.rocketchip.subsystem.{WithNBigCores, BaseConfig} + + class CachelessRocketConfig extends Config( + new WithTop ++ + new WithBootROM ++ + new WithNBigCores(1) ++ + new BaseConfig) + +If you want to reduce the resources used even further, you can configure +the Broadcast Hub to use a bufferless design. + +.. 
code-block:: scala + + import freechips.rocketchip.subsystem.WithBufferlessBroadcastHub + + class BufferlessRocketConfig extends Config( + new WithBufferlessBroadcastHub ++ + new CachelessRocketConfig) + +The Outer Memory System +----------------------- + +The L2 coherence agent (either the L2 cache or the Broadcast Hub) makes requests to +an outer memory system consisting of an AXI4-compatible DRAM controller. + +The default configuration uses a single memory channel, but you can configure +the system to use multiple channels. As with the number of L2 banks, the +number of DRAM channels is restricted to powers of two. + +.. code-block:: scala + + import freechips.rocketchip.subsystem.WithNMemoryChannels + + class DualChannelRocketConfig extends Config( + new WithNMemoryChannels(2) ++ + new RocketConfig) + +In VCS and Verilator simulation, the DRAM is simulated using the +``SimAXIMem`` module, which simply attaches a single-cycle SRAM to each +memory channel. + +If you want a more realistic memory simulation, you can use FireSim, which +can simulate the timing of DDR3 controllers. More documentation on FireSim +memory models is available in the `FireSim docs `_. diff --git a/docs/Customization/index.rst b/docs/Customization/index.rst index 8d61801e..c0432e73 100644 --- a/docs/Customization/index.rst +++ b/docs/Customization/index.rst @@ -16,3 +16,4 @@ Hit next to get started! Heterogeneous-SoCs Adding-An-Accelerator Memory-Hierarchy + Boot-Process diff --git a/docs/Generators/RocketChip.rst b/docs/Generators/RocketChip.rst new file mode 100644 index 00000000..b6050534 --- /dev/null +++ b/docs/Generators/RocketChip.rst @@ -0,0 +1,72 @@ +RocketChip +========== + +RocketChip is an SoC generator developed at Berkeley and now supported by +SiFive. Chipyard uses RocketChip as the basis for producing a RISC-V SoC. + +RocketChip is distinct from Rocket, the in-order RISC-V CPU generator. +RocketChip includes many parts of the SoC besides the CPU. 
Though RocketChip +uses Rocket CPUs by default, it can also be configured to use the BOOM +out-of-order core generator or some other custom CPU generator instead. + +A detailed diagram of a typical RocketChip system is shown below. + +.. image:: ../_static/images/rocketchip-diagram.png + +Tiles +----- + +The diagram shows a dual-core ``Rocket`` system. Each ``Rocket`` core is +grouped with a page-table walker, L1 instruction cache, and L1 data cache into +a ``RocketTile``. + +The ``Rocket`` core can also be swapped for a ``BOOM`` core. Each tile can +also be configured with a RoCC accelerator that connects to the core as a +coprocessor. + +Memory System +------------- +The tiles connect to the ``SystemBus``, which connects them to the L2 cache banks. +The L2 cache banks then connect to the ``MemoryBus``, which connects to the +DRAM controller through a TileLink to AXI converter. + +To learn more about the memory hierarchy, see :ref:`Memory Hierarchy`. + +MMIO +---- + +For MMIO peripherals, the ``SystemBus`` connects to the ``ControlBus`` and ``PeripheryBus``. + +The ``ControlBus`` attaches standard peripherals like the BootROM, the +Platform-Level Interrupt Controller (PLIC), the Core-Local Interruptor (CLINT), +and the Debug Unit. + +The BootROM contains the first stage bootloader, the first instructions to run +when the system comes out of reset. It also contains the Device Tree, which is +used by Linux to determine what other peripherals are attached. + +The PLIC aggregates and masks device interrupts and external interrupts. + +The core-local interrupts include software interrupts and timer interrupts for +each CPU. + +The Debug Unit is used to control the chip externally. It can be used to load +data and instructions to memory or pull data from memory. It can be controlled +through a custom DMI or standard JTAG protocol. + +The ``PeripheryBus`` attaches additional peripherals like the NIC and Block Device. 
+It can also optionally expose an external AXI4 port, which can be attached to +vendor-supplied AXI4 IP. + +To learn more about adding MMIO peripherals, check out the :ref:`MMIO Peripheral` +section of :ref:`Adding an Accelerator/Device`. + +DMA +--- + +You can also add DMA devices that read and write directly from the memory +system. These are attached to the ``FrontendBus``. The ``FrontendBus`` can also +connect vendor-supplied AXI4 DMA devices through an AXI4 to TileLink converter. + +To learn more about adding DMA devices, see the :ref:`Adding a DMA port` section +of :ref:`Adding an Accelerator/Device`. diff --git a/docs/Generators/index.rst b/docs/Generators/index.rst index a01b5adc..a147ffeb 100644 --- a/docs/Generators/index.rst +++ b/docs/Generators/index.rst @@ -14,4 +14,4 @@ The following pages introduce the generators integrated with the Chipyard framew Rocket BOOM Hwacha - + RocketChip diff --git a/docs/TileLink-Diplomacy-Reference/Diplomacy-Connectors.rst b/docs/TileLink-Diplomacy-Reference/Diplomacy-Connectors.rst new file mode 100644 index 00000000..f066dcf6 --- /dev/null +++ b/docs/TileLink-Diplomacy-Reference/Diplomacy-Connectors.rst @@ -0,0 +1,38 @@ +Diplomacy Connectors +==================== + +Nodes in a Diplomacy graph are connected to each other with edges. The Diplomacy +library provides four operators that can be used to form edges between nodes. + +:= +-- + +This is the basic connection operator. It is the same syntax as the Chisel +uni-directional connector, but it is not equivalent. This operator connects +Diplomacy nodes, not Chisel bundles. + +The basic connection operator always creates a single edge between the two +nodes. + +:=\* +---- + +This is a "query" type connection operator. It can create multiple edges +between nodes, with the number of edges determined by the client node +(the node on the right side of the operator). This can be useful if you +are connecting a multi-edge client to a nexus node or adapter node. 
+ +:\*= +---- + +This is a "star" type connection operator. It also creates multiple edges, +but the number of edges is determined by the manager (left side of operator), +rather than the client. It's useful for connecting nexus nodes to multi-edge +manager nodes. + +:\*=\* +------ + +This is a "flex" connection operator. It creates multiple edges based on +whichever side of the operator has a known number of edges. This can be used +in generators where the type of node on either side isn't known until runtime. diff --git a/docs/TileLink-Diplomacy-Reference/EdgeFunctions.rst b/docs/TileLink-Diplomacy-Reference/EdgeFunctions.rst new file mode 100644 index 00000000..6385f908 --- /dev/null +++ b/docs/TileLink-Diplomacy-Reference/EdgeFunctions.rst @@ -0,0 +1,252 @@ +TileLink Edge Object Methods +============================ + +The edge object associated with a TileLink node has several helpful methods +for constructing TileLink messages and retrieving data from them. + + +Get +--- + +Constructor for a TLBundleA encoding a ``Get`` message, which requests data +from memory. The D channel response to this message will be an +``AccessAckData``, which may have multiple beats. + +**Arguments:** + + - ``fromSource: UInt`` - Source ID for this transaction + - ``toAddress: UInt`` - The address to read from + - ``lgSize: UInt`` - Base two logarithm of the number of bytes to be read + +**Returns:** + +A ``(Bool, TLBundleA)`` tuple. The first item in the pair is a boolean +indicating whether or not the operation is legal for this edge. The second +is the A channel bundle. + +Put +--- + +Constructor for a TLBundleA encoding a ``PutFull`` or ``PutPartial`` message, +which writes data to memory. It will be a ``PutPartial`` if the ``mask`` is +specified and a ``PutFull`` if it is omitted. The put may require multiple +beats. If that is the case, only ``data`` and ``mask`` should change for each +beat.
All other fields must be the same for all beats in the transaction, +including the address. The manager will respond to this message with a single +``AccessAck``. + +**Arguments:** + + - ``fromSource: UInt`` - Source ID for this transaction. + - ``toAddress: UInt`` - The address to write to. + - ``lgSize: UInt`` - Base two logarithm of the number of bytes to be written. + - ``data: UInt`` - The data to write on this beat. + - ``mask: UInt`` - (optional) The write mask for this beat. + +**Returns:** + +A ``(Bool, TLBundleA)`` tuple. The first item in the pair is a boolean +indicating whether or not the operation is legal for this edge. The second +is the A channel bundle. + +Arithmetic +---------- + +Constructor for a TLBundleA encoding an ``Arithmetic`` message, which is an +atomic operation. The possible values for the ``atomic`` field are defined +in the ``TLAtomics`` object. It can be ``MIN``, ``MAX``, ``MINU``, ``MAXU``, or +``ADD``, which correspond to atomic minimum, maximum, unsigned minimum, unsigned +maximum, or addition operations, respectively. The previous value at the +memory location will be returned in the response, which will be in the form +of an ``AccessAckData``. + +**Arguments:** + + - ``fromSource: UInt`` - Source ID for this transaction. + - ``toAddress: UInt`` - The address to perform an arithmetic operation on. + - ``lgSize: UInt`` - Base two logarithm of the number of bytes to operate on. + - ``data: UInt`` - Right-hand operand of the arithmetic operation + - ``atomic: UInt`` - Arithmetic operation type (from ``TLAtomics``) + +**Returns:** + +A ``(Bool, TLBundleA)`` tuple. The first item in the pair is a boolean +indicating whether or not the operation is legal for this edge. The second +is the A channel bundle. + +Logical +------- + +Constructor for a TLBundleA encoding a ``Logical`` message, an atomic operation. 
+The possible values for the ``atomic`` field are ``XOR``, ``OR``, ``AND``, and +``SWAP``, which correspond to atomic bitwise exclusive or, bitwise inclusive or, +bitwise and, and swap operations, respectively. The previous value at the +memory location will be returned in an ``AccessAckData`` response. + +**Arguments:** + + - ``fromSource: UInt`` - Source ID for this transaction. + - ``toAddress: UInt`` - The address to perform a logical operation on. + - ``lgSize: UInt`` - Base two logarithm of the number of bytes to operate on. + - ``data: UInt`` - Right-hand operand of the logical operation + - ``atomic: UInt`` - Logical operation type (from ``TLAtomics``) + +**Returns:** + +A ``(Bool, TLBundleA)`` tuple. The first item in the pair is a boolean +indicating whether or not the operation is legal for this edge. The second +is the A channel bundle. + +Hint +---- + +Constructor for a TLBundleA encoding a ``Hint`` message, which is used to +send prefetch hints to caches. The ``param`` argument determines what kind +of hint it is. The possible values come from the ``TLHints`` object and are +``PREFETCH_READ`` and ``PREFETCH_WRITE``. The first one tells caches to +acquire data in a shared state. The second one tells caches to acquire data +in an exclusive state. If the cache this message reaches is a last-level cache, +there won't be any difference. If the manager this message reaches is not a +cache, it will simply be ignored. In any case, a ``HintAck`` message will be +sent in response. + +**Arguments:** + + - ``fromSource: UInt`` - Source ID for this transaction. + - ``toAddress: UInt`` - The address to prefetch + - ``lgSize: UInt`` - Base two logarithm of the number of bytes to prefetch + - ``param: UInt`` - Hint type (from ``TLHints``) + +**Returns:** + +A ``(Bool, TLBundleA)`` tuple. The first item in the pair is a boolean +indicating whether or not the operation is legal for this edge. The second +is the A channel bundle.
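As a rough illustration of how the A channel constructors above might be used, the following sketch shows a client that issues a single 8-byte ``Get``. The ``MyGetClient`` class, its node name, and the target address ``0x10000`` are hypothetical. The code assumes the ``TLHelper`` object from testchipip and a RocketChip version contemporary with this document; package names may differ in other releases.

.. code-block:: scala

    import chisel3._
    import freechips.rocketchip.config.Parameters
    import freechips.rocketchip.diplomacy._
    import freechips.rocketchip.tilelink._
    import testchipip.TLHelper

    // Hypothetical client that reads one 8-byte word from a fixed address.
    class MyGetClient(implicit p: Parameters) extends LazyModule {
      val node = TLHelper.makeClientNode(TLClientParameters(
        name = "my-get-client",
        sourceId = IdRange(0, 1)))

      lazy val module = new LazyModuleImp(this) {
        val (tl, edge) = node.out(0)

        // True until the request has been accepted on the A channel.
        val sending = RegInit(true.B)

        // lgSize = 3 requests 2^3 = 8 bytes.
        // The constructor returns a legality flag and the A channel bundle.
        val (legal, get) = edge.Get(
          fromSource = 0.U,
          toAddress = 0x10000.U,
          lgSize = 3.U)

        // Only offer the request if it is legal for this edge.
        tl.a.valid := sending && legal
        tl.a.bits := get

        // Always accept the AccessAckData response.
        tl.d.ready := true.B

        when (tl.a.fire()) { sending := false.B }
      }
    }

The other constructors (``Put``, ``Arithmetic``, ``Logical``, ``Hint``) return the same kind of tuple, so they can be dropped into this pattern by swapping the call and supplying the extra ``data``, ``atomic``, or ``param`` arguments.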
+ +AccessAck +--------- + +Constructor for a TLBundleD encoding an ``AccessAck`` or ``AccessAckData`` +message. If the optional ``data`` field is supplied, it will be an +``AccessAckData``. Otherwise, it will be an ``AccessAck``. + +**Arguments:** + + - ``a: TLBundleA`` - The A channel message to acknowledge + - ``data: UInt`` - (optional) The data to send back + +**Returns:** + +The ``TLBundleD`` for the D channel message. + +HintAck +------- + +Constructor for a TLBundleD encoding a ``HintAck`` message. + +**Arguments:** + + - ``a: TLBundleA`` - The A channel message to acknowledge + +**Returns:** + +The ``TLBundleD`` for the D channel message. + +first +----- + +This method takes a decoupled channel (either the A channel or D channel) +and determines whether the current beat is the first beat in the transaction. + +**Arguments:** + + - ``x: DecoupledIO[TLChannel]`` - The decoupled channel to snoop on. + +**Returns:** + +A ``Boolean`` which is true if the current beat is the first, or false otherwise. + +last +---- + +This method takes a decoupled channel (either the A channel or D channel) +and determines whether the current beat is the last in the transaction. + +**Arguments:** + + - ``x: DecoupledIO[TLChannel]`` - The decoupled channel to snoop on. + +**Returns:** + +A ``Boolean`` which is true if the current beat is the last, or false otherwise. + +done +---- + +Equivalent to ``x.fire() && last(x)``. + +**Arguments:** + + - ``x: DecoupledIO[TLChannel]`` - The decoupled channel to snoop on. + +**Returns:** + +A ``Boolean`` which is true if the current beat is the last and a beat is +sent on this cycle. False otherwise. + +count +----- + +This method takes a decoupled channel (either the A channel or D channel) and +determines the count (starting from 0) of the current beat in the transaction. + +**Arguments:** + + - ``x: DecoupledIO[TLChannel]`` - The decoupled channel to snoop on. + +**Returns:** + +A ``UInt`` indicating the count of the current beat.
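The beat-tracking helpers ``first`` and ``done`` can be combined roughly as in the sketch below, which latches the data of the first D channel beat and counts completed responses. The ``ResponseTracker`` object and its ``apply`` signature are hypothetical helpers written for illustration, not part of RocketChip. The sketch assumes it is called from inside a module implementation that already has a ``(tl, edge)`` pair, and that ``edge.bundle.dataBits`` gives the data width of the interface.

.. code-block:: scala

    import chisel3._
    import freechips.rocketchip.tilelink._

    // Hypothetical helper: call from inside a LazyModuleImp with the
    // (tl, edge) pair obtained from node.out.
    object ResponseTracker {
      // Returns (data of the first beat of the latest response,
      //          number of responses fully received so far).
      def apply(tl: TLBundle, edge: TLEdge): (UInt, UInt) = {
        val firstData = Reg(UInt(edge.bundle.dataBits.W))
        val completed = RegInit(0.U(32.W))

        // Capture the data only when the first beat is actually transferred.
        when (tl.d.fire() && edge.first(tl.d)) {
          firstData := tl.d.bits.data
        }

        // done is true only on the cycle the last beat is transferred.
        when (edge.done(tl.d)) {
          completed := completed + 1.U
        }

        (firstData, completed)
      }
    }

A caller might write ``val (firstData, completed) = ResponseTracker(tl, edge)`` next to the logic that drives ``tl.d.ready``. Note that ``first`` alone does not imply a transfer happened on this cycle, which is why the sketch also checks ``fire()``.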
+ +numBeats +--------- + +This method takes in a TileLink bundle and gives the number of beats expected +for the transaction. + +**Arguments:** + + - ``x: TLChannel`` - The TileLink bundle to get the number of beats from + +**Returns:** + +A ``UInt`` that is the number of beats in the current transaction. + +numBeats1 +--------- + +Similar to ``numBeats`` except it gives the number of beats minus one. If this +is what you need, you should use this instead of doing ``numBeats - 1.U``, as +this is more efficient. + +**Arguments:** + + - ``x: TLChannel`` - The TileLink bundle to get the number of beats from + +**Returns:** + +A ``UInt`` that is the number of beats in the current transaction minus one. + +hasData +-------- + +Determines whether the TileLink message contains data or not. This is true +if the message is a PutFull, PutPartial, Arithmetic, Logical, or AccessAckData. + +**Arguments:** + + - ``x: TLChannel`` - The TileLink bundle to check + +**Returns:** + +A ``Boolean`` that is true if the current message has data and false otherwise. diff --git a/docs/TileLink-Diplomacy-Reference/NodeTypes.rst b/docs/TileLink-Diplomacy-Reference/NodeTypes.rst new file mode 100644 index 00000000..be3fb33d --- /dev/null +++ b/docs/TileLink-Diplomacy-Reference/NodeTypes.rst @@ -0,0 +1,237 @@ +TileLink Node Types +=================== + +Diplomacy represents the different components of an SoC as nodes of a +directed acyclic graph. TileLink nodes can come in several different types. + +Client Node +----------- + +TileLink clients are modules that initiate TileLink transactions by sending +requests on the A channel and receive responses on the D channel. If the +client implements TL-C, it will receive probes on the B channel, send releases +on the C channel, and send grant acknowledgements on the E channel. + +The L1 caches and DMA devices in RocketChip/Chipyard have client nodes. 
+ +You can add a TileLink client node to your LazyModule using the TLHelper +object from testchipip like so: + +.. literalinclude:: ../../generators/example/src/main/scala/NodeTypes.scala + :language: scala + :start-after: DOC include start: MyClient + :end-before: DOC include end: MyClient + +The ``name`` argument identifies the node in the Diplomacy graph. It is the +only required argument for TLClientParameters. + +The ``sourceId`` argument specifies the range of source identifiers that this +client will use. Since we have set the range to [0, 4) here, this client will +be able to send up to four requests in flight at a time. Each request will +have a distinct value in its source field. The default value for this field +is ``IdRange(0, 1)``, which means it would only be able to send a single +request inflight. + +The ``requestFifo`` argument is a boolean option which defaults to false. +If it is set to true, the client will request that downstream managers that +support it send responses in FIFO order (that is, in the same order the +corresponding requests were sent). + +The ``visibility`` argument specifies the address ranges that the client will +access. By default it is set to include all addresses. In this example, we set +it to contain a single address range ``AddressSet(0x10000, 0xffff)``, which +means that the client will only be able to access addresses from 0x10000 to +0x1ffff. You normally do not specify this, but it can help downstream crossbar +generators optimize the hardware by not arbitrating the client to managers with +address ranges that don't overlap with its visibility. + +Inside your lazy module implementation, you can call ``node.out`` to get a +list of bundle/edge pairs. If you used the TLHelper, you only specified a +single client edge, so this list will only have one pair. + +The ``tl`` bundle is a Chisel hardware bundle that connects to the IO of this +module.
It contains two (in the case of TL-UL and TL-UH) or five (in the case +of TL-C) decoupled bundles corresponding to the TileLink channels. This is +what you should connect your hardware logic to in order to actually send/receive +TileLink messages. + +The ``edge`` object represents the edge of the Diplomacy graph. It contains +some useful helper functions which will be documented in +:ref:`TileLink Edge Object Methods`. + +Manager Node +------------ + +TileLink managers take requests from clients on the A channel and send +responses back on the D channel. You can create a manager node using the +TLHelper like so: + +.. literalinclude:: ../../generators/example/src/main/scala/NodeTypes.scala + :language: scala + :start-after: DOC include start: MyManager + :end-before: DOC include end: MyManager + +The ``makeManagerNode`` method takes two arguments. The first is ``beatBytes``, +which is the physical width of the TileLink interface in bytes. The second +is a TLManagerParameters object. + +The only required argument for ``TLManagerParameters`` is the ``address``, +which is the set of address ranges that this manager will serve. +This information is used to route requests from the clients. In this example, +the manager will only take requests for addresses from 0x20000 to 0x20fff. +The second argument in ``AddressSet`` is a mask, not a size. +You should generally set it to be one less than a power of two. Otherwise, +the addressing behavior may not be what you expect. + +The second argument is ``resources``, which is usually retrieved from a +``Device`` object. In this case, we use a ``SimpleDevice`` object. +This argument is necessary if you want to add an entry to the DeviceTree in +the BootROM so that it can be read by a Linux driver. The two arguments to +``SimpleDevice`` are the name and compatibility list for the device tree +entry. For this manager, then, the device tree entry would look like + +.. 
code-block:: text + + L12: my-device@20000 { + compatible = "tutorial,my-device0"; + reg = <0x20000 0x1000>; + }; + +The next argument is ``regionType``, which gives some information about +the caching behavior of the manager. There are seven region types, listed below: + +1. ``CACHED`` - An intermediate agent may have cached a copy of the region for you. +2. ``TRACKED`` - The region may have been cached by another master, but coherence is being provided. +3. ``UNCACHED`` - The region has not been cached yet, but should be cached when possible. +4. ``IDEMPOTENT`` - Gets return most recently put content, but content should not be cached. +5. ``VOLATILE`` - Content may change without a put, but puts and gets have no side effects. +6. ``PUT_EFFECTS`` - Puts produce side effects and so must not be combined/delayed. +7. ``GET_EFFECTS`` - Gets produce side effects and so must not be issued speculatively. + +Next is the ``executable`` argument, which determines if the CPU is allowed to +fetch instructions from this manager. By default it is false, which is what +most MMIO peripherals should set it to. + +The next six arguments start with ``support`` and determine the different +A channel message types that the manager can accept. The definitions of the +message types are explained in :ref:`TileLink Edge Object Methods`. +The ``TransferSizes`` case class specifies the range of logical sizes (in bytes) +that the manager can accept for the particular message type. This is an inclusive +range and all logical sizes must be powers of two. So in this case, the manager +can accept requests with sizes of 1, 2, 4, or 8 bytes. + +The final argument shown here is the ``fifoId`` setting, which determines +which FIFO domain (if any) the manager is in. If this argument is set to ``None`` +(the default), the manager will not guarantee any ordering of the responses. +If the ``fifoId`` is set, it will share a FIFO domain with all other managers +that specify the same ``fifoId``. 
This means that client requests sent to +that FIFO domain will see responses in the same order. + +Register Node +------------- + +While you can directly specify a manager node and write all of the logic +to handle TileLink requests, it is usually much easier to use a register node. +This type of node provides a ``regmap`` method that allows you to specify +control/status registers and automatically generates the logic to handle the +TileLink protocol. More information about how to use register nodes can be +found in :ref:`Register Router`. + +Identity Node +------------- + +Unlike the previous node types, which had only inputs or only outputs, the +identity node has both. As its name suggests, it simply connects the inputs +to the outputs unchanged. This node is mainly used to combine multiple +nodes into a single node with multiple edges. For instance, say we have two +client lazy modules, each with their own client node. + +.. literalinclude:: ../../generators/example/src/main/scala/NodeTypes.scala + :language: scala + :start-after: DOC include start: MyClient1+MyClient2 + :end-before: DOC include end: MyClient1+MyClient2 + +Now we instantiate these two clients in another lazy module and expose their +nodes as a single node. + +.. literalinclude:: ../../generators/example/src/main/scala/NodeTypes.scala + :language: scala + :start-after: DOC include start: MyClientGroup + :end-before: DOC include end: MyClientGroup + +We can also do the same for managers. + +.. literalinclude:: ../../generators/example/src/main/scala/NodeTypes.scala + :language: scala + :start-after: DOC include start: MyManagerGroup + :end-before: DOC include end: MyManagerGroup + +If we want to connect the client and manager groups together, we can now do this. + +.. 
literalinclude:: ../../generators/example/src/main/scala/NodeTypes.scala + :language: scala + :start-after: DOC include start: MyClientManagerComplex + :end-before: DOC include end: MyClientManagerComplex + +The meaning of the ``:=*`` operator is explained in more detail in the +:ref:`Diplomacy Connectors` section. In summary, it connects two nodes together +using multiple edges. The edges in the identity node are assigned in order, +so in this case ``client1.node`` will eventually connect to ``manager1.node`` +and ``client2.node`` will connect to ``manager2.node``. + +The number of inputs to an identity node should match the number of outputs. +A mismatch will cause an elaboration error. + +Adapter Node +------------ + +Like the identity node, the adapter node takes some number of inputs and +produces the same number of outputs. However, unlike the identity node, the +adapter node does not simply pass the connections through unchanged. +It can change the logical and physical interfaces between input and output and +rewrite messages going through. RocketChip provides a library of adapters, +which are catalogued in :ref:`Diplomatic Widgets`. + +You will rarely need to create an adapter node yourself, but the invocation is +as follows. + +.. code-block:: scala + + val node = TLAdapterNode( + clientFn = { cp => + // .. + }, + managerFn = { mp => + // .. + }) + +The ``clientFn`` is a function that takes the ``TLClientPortParameters`` of +the input as an argument and returns the corresponding parameters for the +output. The ``managerFn`` takes the ``TLManagerPortParameters`` of the output +as an argument and returns the corresponding parameters for the input. + +Nexus Node +---------- + +The nexus node is similar to the adapter node in that it has a different +output interface than input interface. But it can also have a different +number of inputs than it does outputs. 
This node type is mainly used by +the ``TLXbar`` widget, which provides a TileLink crossbar generator. You will +also likely not need to define this node type manually, but its invocation is +as follows. + +.. code-block:: scala + + val node = TLNexusNode( + clientFn = { seq => + // .. + }, + managerFn = { seq => + // .. + }) + +This has similar arguments as the adapter node's constructor, but instead of +taking single parameters objects as arguments and returning single objects +as results, the functions take and return sequences of parameters. And as you +might expect, the size of the returned sequence need not be the same size as +the input sequence. diff --git a/docs/TileLink-Diplomacy-Reference/Register-Router.rst b/docs/TileLink-Diplomacy-Reference/Register-Router.rst new file mode 100644 index 00000000..cc735578 --- /dev/null +++ b/docs/TileLink-Diplomacy-Reference/Register-Router.rst @@ -0,0 +1,141 @@ +Register Router +=============== + +Memory-mapped devices generally follow a common pattern. They expose a set +of registers to the CPUs. By writing to a register, the CPU can change the +device's settings or send a command. By reading from a register, the CPU can +query the device's state or retrieve results. + +While designers can manually instantiate a manager node and write the logic +for exposing registers themselves, it's much easier to use RocketChip's +``regmap`` interface, which can generate most of the glue logic. + +For TileLink devices, you can use the ``regmap`` interface by extending +the ``TLRegisterRouter`` class, as shown in :ref:`Adding An Accelerator/Device`, +or you can create a regular LazyModule and instantiate a ``TLRegisterNode``. +This section will focus on the second method. + +Basic Usage +----------- + +.. 
literalinclude:: ../../generators/example/src/main/scala/RegisterNodeExample.scala + :language: scala + :start-after: DOC include start: MyDeviceController + :end-before: DOC include end: MyDeviceController + +The code example above shows a simple lazy module that uses the ``TLRegisterNode`` +to memory map hardware registers of different sizes. The constructor has +two required arguments: ``address``, which is the base address of the registers, +and ``device``, which is the device tree entry. There are also two optional +arguments. The ``beatBytes`` argument is the interface width in bytes. +The default value is 4 bytes. The ``concurrency`` argument is the size of the +internal queue for TileLink requests. By default, this value is 0, which means +there will be no queue. This value must be greater than 0 if you wish to +decouple requests and responses for register accesses. This is discussed +in :ref:`Using Functions`. + +The main way to interact with the node is to call the ``regmap`` method, which +takes a sequence of pairs. The first element of the pair is an offset from the +base address. The second is a sequence of ``RegField`` objects, each of +which maps a different register. The ``RegField`` constructor takes two +arguments. The first argument is the width of the register in bits. +The second is the register itself. + +Since the argument is a sequence, you can associate multiple ``RegField`` +objects with an offset. If you do, the registers are read or written in parallel +when the offset is accessed. The registers are in little endian order, so the +first register in the list corresponds to the least significant bits in the +value written. In this example, if the CPU wrote to offset 0x0E with the value +0xAB, ``smallReg0`` will get the value 0xB and ``smallReg1`` would get 0xA. + +Decoupled Interfaces +-------------------- + +Sometimes you may want to do something other than read and write from a hardware +register.
The ``RegField`` interface also provides support for reading +and writing ``DecoupledIO`` interfaces. For instance, you can implement a +hardware FIFO like so. + +.. literalinclude:: ../../generators/example/src/main/scala/RegisterNodeExample.scala + :language: scala + :start-after: DOC include start: MyQueueRegisters + :end-before: DOC include end: MyQueueRegisters + +This variant of the ``RegField`` constructor takes three arguments instead of +two. The first argument is still the bit width. The second is the decoupled +interface to read from. The third is the decoupled interface to write to. +In this example, writing to the "register" will push the data into the queue +and reading from it will pop data from the queue. + +You need not specify both read and write for a register. You can also create +read-only or write-only registers. So for the previous example, if you wanted +enqueue and dequeue to use different addresses, you could write the following. + +.. literalinclude:: ../../generators/example/src/main/scala/RegisterNodeExample.scala + :language: scala + :start-after: DOC include start: MySeparateQueueRegisters + :end-before: DOC include end: MySeparateQueueRegisters + +The read-only register function can also be used to read signals +that aren't registers. + +.. code-block:: scala + + val constant = 0xf00d.U + + node.regmap( + 0x00 -> Seq(RegField.r(8, constant))) + +Using Functions +--------------- + +You can also create registers using functions. Say, for instance, that you +want to create a counter that gets incremented on a write and decremented on +a read. + +.. literalinclude:: ../../generators/example/src/main/scala/RegisterNodeExample.scala + :language: scala + :start-after: DOC include start: MyCounterRegisters + :end-before: DOC include end: MyCounterRegisters + +The functions here are essentially the same as a decoupled interface. +The read function gets passed the ``ready`` signal and returns the +``valid`` and ``bits`` signals. 
The write function gets passed ``valid`` and +``bits`` and returns ``ready``. + +You can also pass functions that decouple the read/write request and response. +The request will appear as a decoupled input and the response as a decoupled +output. So for instance, if we wanted to do this for the previous example. + +.. literalinclude:: ../../generators/example/src/main/scala/RegisterNodeExample.scala + :language: scala + :start-after: DOC include start: MyCounterReqRespRegisters + :end-before: DOC include end: MyCounterReqRespRegisters + +In each function, we set up a state variable ``responding``. The function +is ready to take requests when this is false and is sending a response when +this is true. + +In this variant, both read and write take an input valid and return an +output ready. The only difference is that bits is an input for read and an +output for write. + +In order to use this variant, you need to set ``concurrency`` to a value +larger than 0. + +Register Routers for Other Protocols +------------------------------------ + +One useful feature of the register router interface is that you can easily +change the protocol being used. For instance, in the first example in +:ref:`Basic Usage`, you could simply change the ``TLRegisterNode`` to +an ``AXI4RegisterNode``. + +.. literalinclude:: ../../generators/example/src/main/scala/RegisterNodeExample.scala + :language: scala + :start-after: DOC include start: MyAXI4DeviceController + :end-before: DOC include end: MyAXI4DeviceController + +Other than the fact that AXI4 nodes don't take a ``device`` argument, and can +only have a single AddressSet instead of multiple, everything else is +unchanged.
diff --git a/docs/TileLink-Diplomacy-Reference/Widgets.rst b/docs/TileLink-Diplomacy-Reference/Widgets.rst new file mode 100644 index 00000000..41fc033b --- /dev/null +++ b/docs/TileLink-Diplomacy-Reference/Widgets.rst @@ -0,0 +1,467 @@ +Diplomatic Widgets +================== + +RocketChip provides a library of diplomatic TileLink and AXI4 widgets. +The most commonly used widgets are documented here. The TileLink widgets +are available from ``freechips.rocketchip.tilelink`` and the AXI4 widgets +from ``freechips.rocketchip.amba.axi4``. + +TLBuffer +-------- + +A widget for buffering TileLink transactions. It simply instantiates queues +for each of the 2 (or 5 for TL-C) decoupled channels. To configure the queue +for each channel, you pass the constructor a +``freechips.rocketchip.diplomacy.BufferParams`` object. The arguments for +this case class are: + + - ``depth: Int`` - The number of entries in the queue + - ``flow: Boolean`` - If true, combinationally couple the valid signals so + that an input can be consumed on the same cycle it is enqueued. + - ``pipe: Boolean`` - If true, combinationally couple the ready signals so + that single-entry queues can run at full rate. + +There is an implicit conversion from ``Int`` available. If you pass an +integer instead of a BufferParams object, the queue will be the depth +given in the integer and ``flow`` and ``pipe`` will both be false. + +You can also use one of the predefined BufferParams objects. + + - ``BufferParams.default`` = ``BufferParams(2, false, false)`` + - ``BufferParams.none`` = ``BufferParams(0, false, false)`` + - ``BufferParams.flow`` = ``BufferParams(1, true, false)`` + - ``BufferParams.pipe`` = ``BufferParams(1, false, true)`` + +**Arguments:** + +There are four constructors available with zero, one, two, or five arguments. + +The zero-argument constructor uses ``BufferParams.default`` for all of the +channels. + +The single-argument constructor takes a ``BufferParams`` object to use for all +channels. 
+ +The arguments for the two-argument constructor are: + + - ``ace: BufferParams`` - Parameters to use for the A, C, and E channels. + - ``bd: BufferParams`` - Parameters to use for the B and D channels + +The arguments for the five-argument constructor are + + - ``a: BufferParams`` - Buffer parameters for the A channel + - ``b: BufferParams`` - Buffer parameters for the B channel + - ``c: BufferParams`` - Buffer parameters for the C channel + - ``d: BufferParams`` - Buffer parameters for the D channel + - ``e: BufferParams`` - Buffer parameters for the E channel + +**Example Usage:** + +.. code-block:: scala + + // Default settings + manager0.node := TLBuffer() := client0.node + + // Using implicit conversion to make buffer with 8 queue entries per channel + manager1.node := TLBuffer(8) := client1.node + + // Use default on A channel but pipe on D channel + manager2.node := TLBuffer(BufferParams.default, BufferParams.pipe) := client2.node + + // Only add queues for the A and D channel + manager3.node := TLBuffer( + BufferParams.default, + BufferParams.none, + BufferParams.none, + BufferParams.default, + BufferParams.none) := client3.node + +AXI4Buffer +---------- + +Similar to the :ref:`TLBuffer`, but for AXI4. It also takes ``BufferParams`` objects +as arguments. + +**Arguments:** + +Like TLBuffer, AXI4Buffer has zero, one, two, and five-argument constructors. + +The zero-argument constructor uses the default BufferParams for all channels. + +The one-argument constructor uses the provided BufferParams for all channels. + +The two-argument constructor has the following arguments. + + - ``aw: BufferParams`` - Buffer parameters for the "ar", "aw", and "w" channels. + - ``br: BufferParams`` - Buffer parameters for the "b", and "r" channels. 
+ +The five-argument constructor has the following arguments: + + - ``aw: BufferParams`` - Buffer parameters for the "aw" channel + - ``w: BufferParams`` - Buffer parameters for the "w" channel + - ``b: BufferParams`` - Buffer parameters for the "b" channel + - ``ar: BufferParams`` - Buffer parameters for the "ar" channel + - ``r: BufferParams`` - Buffer parameters for the "r" channel + +**Example Usage:** + +.. code-block:: scala + + // Default settings + slave0.node := AXI4Buffer() := master0.node + + // Using implicit conversion to make buffer with 8 queue entries per channel + slave1.node := AXI4Buffer(8) := master1.node + + // Use default on aw/w/ar channels but pipe on b/r channels + slave2.node := AXI4Buffer(BufferParams.default, BufferParams.pipe) := master2.node + + // Single-entry queues for aw, b, and ar but two-entry queues for w and r + slave3.node := AXI4Buffer(1, 2, 1, 1, 2) := master3.node + +AXI4UserYanker +-------------- + +This widget takes an AXI4 port that has a user field and turns it into +one without a user field. The values of the user field from input AR and AW +requests are kept in internal queues associated with the ARID/AWID, which are +then used to attach the correct user field to the responses. + +**Arguments:** + + - ``capMaxFlight: Option[Int]`` - (optional) An option which can hold the + number of requests that can be inflight for each ID. If ``None`` (the default), + the UserYanker will support the maximum number of inflight requests. + +**Example Usage:** + +.. code-block:: scala + + nouser.node := AXI4UserYanker(Some(1)) := hasuser.node + +AXI4Deinterleaver +----------------- + +Multi-beat AXI4 read responses for different IDs can potentially be interleaved. +This widget reorders read responses from the slave so that all of the beats +for a single transaction are consecutive. + +**Arguments:** + + - ``maxReadBytes: Int`` - The maximum number of bytes that can be read + in a single transaction. + +**Example Usage:** + +.. 
code-block:: scala + + interleaved.node := AXI4Deinterleaver(64) := consecutive.node + +TLFragmenter +------------ + +The TLFragmenter widget shrinks the maximum logical transfer size of the +TileLink interface by breaking larger transactions into multiple smaller +transactions. + +**Arguments:** + + - ``minSize: Int`` - Minimum size of transfers supported by all outward managers. + - ``maxSize: Int`` - Maximum size of transfers supported after the Fragmenter is applied. + - ``alwaysMin: Boolean`` - (optional) Fragment all requests down to minSize (else fragment to maximum supported by manager). (default: false) + - ``earlyAck: EarlyAck.T`` - (optional) Should a multibeat Put be acknowledged on the first beat or last beat? + Possible values (default: ``EarlyAck.None``): + + - ``EarlyAck.AllPuts`` - always acknowledge on first beat. + - ``EarlyAck.PutFulls`` - acknowledge on first beat if PutFull, otherwise acknowledge on last beat. + - ``EarlyAck.None`` - always acknowledge on last beat. + + - ``holdFirstDenied: Boolean`` - (optional) Allow the Fragmenter to unsafely combine multibeat Gets by taking the first denied for the whole burst. (default: false) + +**Example Usage:** + +.. code-block:: scala + + val beatBytes = 8 + val blockBytes = 64 + + single.node := TLFragmenter(beatBytes, blockBytes) := multi.node + + axi4lite.node := AXI4Fragmenter() := axi4full.node + +**Additional Notes** + + - TLFragmenter modifies: PutFull, PutPartial, LogicalData, Get, Hint + - TLFragmenter passes: ArithmeticData (truncated to minSize if alwaysMin) + - TLFragmenter cannot modify acquire (could livelock); thus it is unsafe to put caches on both sides + +AXI4Fragmenter +-------------- + +The AXI4Fragmenter is similar to the :ref:`TLFragmenter`, except it can only +break multi-beat AXI4 transactions into single-beat transactions. This +effectively serves as an AXI4 to AXI4-Lite converter. The constructor for this +widget does not take any arguments. + +**Example Usage:** + +.. 
code-block:: scala + + axi4lite.node := AXI4Fragmenter() := axi4full.node + +TLSourceShrinker +---------------- + +The number of source IDs that a manager sees is usually computed based on the +clients that connect to it. In some cases, you may wish to fix the +number of source IDs. For instance, you might do this if you wish to export +the TileLink port to a Verilog black box. This will pose a problem, however, +if the clients require a larger number of source IDs. In this situation, +you will want to use a TLSourceShrinker. + +**Arguments:** + + - ``maxInFlight: Int`` - The maximum number of source IDs that will be sent + from the TLSourceShrinker to the manager. + +**Example Usage:** + +.. code-block:: scala + + // client.node may have >16 source IDs + // manager.node will only see 16 + manager.node := TLSourceShrinker(16) := client.node + +AXI4IdIndexer +------------- + +The AXI4 equivalent of :ref:`TLSourceShrinker`. This limits the number of +AWID/ARID bits in the slave AXI4 interface. Useful for connecting to external +or black box AXI4 ports. + +**Arguments:** + + - ``idBits: Int`` - The number of ID bits on the slave interface. + +**Example Usage:** + +.. code-block:: scala + + // master.node may have >16 unique IDs + // slave.node will only see 4 ID bits + slave.node := AXI4IdIndexer(4) := master.node + +**Notes:** + +The AXI4IdIndexer will create a ``user`` field on the slave interface, as it +stores the ID of the master requests in this field. If connecting to an AXI4 +interface that doesn't have a ``user`` field, you'll need to use the :ref:`AXI4UserYanker`. + +TLWidthWidget +------------- + +This widget changes the physical width of the TileLink interface. The width +of a TileLink interface is configured by managers, but sometimes you want +the client to see a particular width. + +**Arguments:** + + - ``innerBeatBytes: Int`` - The physical width (in bytes) seen by the client + +**Example Usage:** + +.. 
code-block:: scala + + // Assume the manager node sets beatBytes to 8 + // With TLWidthWidget, client sees beatBytes of 4 + manager.node := TLWidthWidget(4) := client.node + +TLFIFOFixer +----------- + +TileLink managers that declare a FIFO domain must ensure that all requests to +that domain from clients which have requested FIFO ordering see responses in +order. However, they can only control the ordering of their own responses, and +do not have control over how those responses interleave with responses from +other managers in the same FIFO domain. Responsibility for ensuring FIFO order +across managers goes to the TLFIFOFixer. + +**Arguments:** + + - ``policy: TLFIFOFixer.Policy`` - (optional) Which managers will the + TLFIFOFixer enforce ordering on? (default: ``TLFIFOFixer.all``) + +The possible values of ``policy`` are: + + - ``TLFIFOFixer.all`` - All managers (including those without a FIFO domain) + will have ordering guaranteed + - ``TLFIFOFixer.allFIFO`` - All managers that define a FIFO domain will have + ordering guaranteed + - ``TLFIFOFixer.allVolatile`` - All managers that have a RegionType of + ``VOLATILE``, ``PUT_EFFECTS``, or ``GET_EFFECTS`` will have ordering + guaranteed (see :ref:`Manager Node` for explanation of region types). + +TLXbar and AXI4Xbar +------------------- + +These are crossbar generators for TileLink and AXI4 which will route requests +from TL client / AXI4 master nodes to TL manager / AXI4 slave nodes based on +the addresses defined in the managers / slaves. Normally, these are constructed +without arguments. However, you can change the arbitration policy, which +determines which client ports get precedence in the arbiters. The default policy +is ``TLArbiter.roundRobin``, but you can change it to ``TLArbiter.lowestIndexFirst`` +if you want a fixed arbitration precedence. + +**Arguments:** + +All arguments are optional. + + - ``arbitrationPolicy: TLArbiter.Policy`` - The arbitration policy to use. 
+ - ``maxFlightPerId: Int`` - (AXI4 only) The number of transactions with the + same ID that can be inflight at a time. (default: 7) + - ``awQueueDepth: Int`` - (AXI4 only) The depth of the write address queue. + (default: 2) + +**Example Usage:** + +.. code-block:: scala + + // Instantiate the crossbar lazy module + val tlBus = LazyModule(new TLXbar) + + // Connect a single input edge + tlBus.node := tlClient0.node + // Connect multiple input edges + tlBus.node :=* tlClient1.node + + // Connect a single output edge + tlManager0.node := tlBus.node + // Connect multiple output edges + tlManager1.node :*= tlBus.node + + // Instantiate a crossbar with lowestIndexFirst arbitration policy + // Yes, we still use the TLArbiter singleton even though this is AXI4 + val axiBus = LazyModule(new AXI4Xbar(TLArbiter.lowestIndexFirst)) + + // The connections work the same as TL + axiBus.node := axiClient0.node + axiBus.node :=* axiClient1.node + axiManager0.node := axiBus.node + axiManager1.node :*= axiBus.node + + + +TLToAXI4 and AXI4ToTL +--------------------- + +These are converters between the TileLink and AXI4 protocols. TLToAXI4 +takes a TileLink client and connects to an AXI4 slave. AXI4ToTL takes an +AXI4 master and connects to a TileLink manager. Generally you don't want to +override the default arguments of the constructors for these widgets. + +**Example Usage:** + +.. code-block:: scala + + axi4slave.node := + AXI4UserYanker() := + AXI4Deinterleaver(64) := + TLToAXI4() := + tlclient.node + + tlmanager.node := + AXI4ToTL() := + AXI4UserYanker() := + AXI4Fragmenter() := + axi4master.node + +You will need to add an :ref:`AXI4Deinterleaver` after the TLToAXI4 converter +because it cannot deal with interleaved read responses. The TLToAXI4 converter +also uses the AXI4 user field to store some information, so you will need an +:ref:`AXI4UserYanker` if you want to connect to an AXI4 port without user +fields. 
+ +Before you connect an AXI4 port to the AXI4ToTL widget, you will need to +add an :ref:`AXI4Fragmenter` and :ref:`AXI4UserYanker` because the converter cannot +deal with multi-beat transactions or user fields. + +TLROM +------ + +The TLROM widget provides a read-only memory that can be accessed using +TileLink. Note: this widget is in the ``freechips.rocketchip.devices.tilelink`` +package, not the ``freechips.rocketchip.tilelink`` package like the others. + +**Arguments:** + + - ``base: BigInt`` - The base address of the memory + - ``size: Int`` - The size of the memory in bytes + - ``contentsDelayed: => Seq[Byte]`` - A function which, when called generates + the byte contents of the ROM. + - ``executable: Boolean`` - (optional) Specify whether the CPU can fetch + instructions from the ROM (default: ``true``). + - ``beatBytes: Int`` - (optional) The width of the interface in bytes. + (default: 4). + - ``resources: Seq[Resource]`` - (optional) Sequence of resources to add to + the device tree. + +**Example Usage:** + +.. code-block:: scala + + val rom = LazyModule(new TLROM( + base = 0x100A0000, + size = 64, + contentsDelayed = Seq.tabulate(64) { i => i.toByte }, + beatBytes = 8)) + rom.node := TLFragmenter(8, 64) := client.node + +**Supported Operations:** + +The TLROM only supports single-beat reads. If you want to perform multi-beat +reads, you should attach a TLFragmenter in front of the ROM. + +TLRAM and AXI4RAM +----------------- + +The TLRAM and AXI4RAM widgets provide read-write memories implemented as SRAMs. + +**Arguments:** + + - ``address: AddressSet`` - The address range that this RAM will cover. + - ``cacheable: Boolean`` - (optional) Can the contents of this RAM be cached. + (default: ``true``) + - ``executable: Boolean`` - (optional) Can the contents of this RAM be fetched + as instructions. (default: ``true``) + - ``beatBytes: Int`` - (optional) Width of the TL/AXI4 interface in bytes. 
+ (default: 4) + - ``atomics: Boolean`` - (optional, TileLink only) Does the RAM support + atomic operations? (default: ``false``) + +**Example Usage:** + +.. code-block:: scala + + val xbar = LazyModule(new TLXbar) + + val tlram = LazyModule(new TLRAM( + address = AddressSet(0x1000, 0xfff))) + + val axiram = LazyModule(new AXI4RAM( + address = AddressSet(0x2000, 0xfff))) + + tlram.node := xbar.node + axiram.node := TLToAXI4() := xbar.node + +**Supported Operations:** + +TLRAM only supports single-beat TL-UL requests. If you set ``atomics`` to true, +it will also support Logical and Arithmetic operations. Use a ``TLFragmenter`` +if you want multi-beat reads/writes. + +AXI4RAM only supports AXI4-Lite operations, so multi-beat reads/writes and +reads/writes smaller than full-width are not supported. Use an ``AXI4Fragmenter`` +if you want to use the full AXI4 protocol. + + + diff --git a/docs/TileLink-Diplomacy-Reference/index.rst b/docs/TileLink-Diplomacy-Reference/index.rst new file mode 100644 index 00000000..23a5a175 --- /dev/null +++ b/docs/TileLink-Diplomacy-Reference/index.rst @@ -0,0 +1,30 @@ +TileLink and Diplomacy Reference +================================ + +TileLink is the cache coherence and memory protocol used by RocketChip and +other Chipyard generators. It is how different modules like caches, memories, +peripherals, and DMA devices communicate with each other. + +RocketChip's TileLink implementation is built on top of Diplomacy, a framework +for exchanging configuration information among Chisel generators in a two-phase +elaboration scheme. For a detailed explanation of Diplomacy, see `the paper +by Cook, Terpstra, and Lee `_. + +A brief overview of how to connect simple TileLink widgets can be found +in the :ref:`Adding-an-Accelerator` section. This section will provide a +detailed reference for the TileLink and Diplomacy functionality provided by +RocketChip. 
+ +A detailed specification of the TileLink 1.7 protocol can be found on the +`SiFive website `_. + + +.. toctree:: + :maxdepth: 2 + :caption: Reference + + NodeTypes + Diplomacy-Connectors + EdgeFunctions + Register-Router + Widgets diff --git a/docs/_static/images/rocketchip-diagram.png b/docs/_static/images/rocketchip-diagram.png new file mode 100644 index 00000000..febc2e87 Binary files /dev/null and b/docs/_static/images/rocketchip-diagram.png differ diff --git a/docs/index.rst b/docs/index.rst index 1f41b4ed..61acfae3 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -12,43 +12,35 @@ New to Chipyard? Jump to the :ref:`Chipyard Basics` page for more info. .. include:: Quick-Start.rst +Getting Help +------------ + +If you have a question about Chipyard that isn't answered by the existing +documentation, feel free to ask for help on the +`Chipyard Google Group `_. + +Table of Contents +----------------- + .. toctree:: :maxdepth: 3 - :caption: Contents: :numbered: Chipyard-Basics/index - :maxdepth: 3 - :caption: Simulation: - :numbered: Simulation/index - :maxdepth: 3 - :caption: Generators: - :numbered: Generators/index - :maxdepth: 3 - :caption: Tools: - :numbered: Tools/index - :maxdepth: 3 - :caption: VLSI Production: - :numbered: VLSI/index - :maxdepth: 3 - :caption: Customization: - :numbered: Customization/index - :maxdepth: 3 - :caption: Advanced Usage: - :numbered: Advanced-Usage/index + TileLink-Diplomacy-Reference/index Indices and tables diff --git a/generators/example/src/main/scala/ConfigMixins.scala b/generators/example/src/main/scala/ConfigMixins.scala index 17cbcaa5..a829db22 100644 --- a/generators/example/src/main/scala/ConfigMixins.scala +++ b/generators/example/src/main/scala/ConfigMixins.scala @@ -70,10 +70,12 @@ class WithDTMTop extends Config((site, here, up) => { /** * Class to specify a top level BOOM and/or Rocket system with PWM */ +// DOC include start: WithPWMTop class WithPWMTop extends Config((site, here, up) => { case BuildTop 
=> (clock: Clock, reset: Bool, p: Parameters) => Module(LazyModule(new TopWithPWMTL()(p)).module) }) +// DOC include end: WithPWMTop /** * Class to specify a top level BOOM and/or Rocket system with a PWM AXI4 @@ -157,3 +159,14 @@ class WithMultiRoCCHwacha(harts: Int*) extends Config((site, here, up) => { } } }) + +// DOC include start: WithInitZero +class WithInitZero(base: BigInt, size: BigInt) extends Config((site, here, up) => { + case InitZeroKey => InitZeroConfig(base, size) +}) + +class WithInitZeroTop extends Config((site, here, up) => { + case BuildTop => (clock: Clock, reset: Bool, p: Parameters) => + Module(LazyModule(new TopWithInitZero()(p)).module) +}) +// DOC include end: WithInitZero diff --git a/generators/example/src/main/scala/InitZero.scala b/generators/example/src/main/scala/InitZero.scala new file mode 100644 index 00000000..3a90bfcc --- /dev/null +++ b/generators/example/src/main/scala/InitZero.scala @@ -0,0 +1,69 @@ +package example + +import chisel3._ +import chisel3.util._ +import freechips.rocketchip.subsystem.{BaseSubsystem, CacheBlockBytes} +import freechips.rocketchip.config.{Parameters, Field} +import freechips.rocketchip.diplomacy.{LazyModule, LazyModuleImp, IdRange} +import testchipip.TLHelper + +case class InitZeroConfig(base: BigInt, size: BigInt) +case object InitZeroKey extends Field[InitZeroConfig] + +class InitZero(implicit p: Parameters) extends LazyModule { + val node = TLHelper.makeClientNode( + name = "init-zero", sourceId = IdRange(0, 1)) + + lazy val module = new InitZeroModuleImp(this) +} + +class InitZeroModuleImp(outer: InitZero) extends LazyModuleImp(outer) { + val config = p(InitZeroKey) + + val (mem, edge) = outer.node.out(0) + val addrBits = edge.bundle.addressBits + val blockBytes = p(CacheBlockBytes) + + require(config.size % blockBytes == 0) + + val s_init :: s_write :: s_resp :: s_done :: Nil = Enum(4) + val state = RegInit(s_init) + + val addr = Reg(UInt(addrBits.W)) + val bytesLeft = 
Reg(UInt(log2Ceil(config.size+1).W)) + + mem.a.valid := state === s_write + mem.a.bits := edge.Put( + fromSource = 0.U, + toAddress = addr, + lgSize = log2Ceil(blockBytes).U, + data = 0.U)._2 + mem.d.ready := state === s_resp + + when (state === s_init) { + addr := config.base.U + bytesLeft := config.size.U + state := s_write + } + + when (edge.done(mem.a)) { + addr := addr + blockBytes.U + bytesLeft := bytesLeft - blockBytes.U + state := s_resp + } + + when (mem.d.fire()) { + state := Mux(bytesLeft === 0.U, s_done, s_write) + } +} + +trait HasPeripheryInitZero { this: BaseSubsystem => + implicit val p: Parameters + + val initZero = LazyModule(new InitZero()(p)) + fbus.fromPort(Some("init-zero"))() := initZero.node +} + +trait HasPeripheryInitZeroModuleImp extends LazyModuleImp { + // Don't need anything here +} diff --git a/generators/example/src/main/scala/NodeTypes.scala b/generators/example/src/main/scala/NodeTypes.scala new file mode 100644 index 00000000..577b9baf --- /dev/null +++ b/generators/example/src/main/scala/NodeTypes.scala @@ -0,0 +1,128 @@ +package example + +import freechips.rocketchip.config.Parameters +import freechips.rocketchip.diplomacy._ +import freechips.rocketchip.tilelink._ +import testchipip.TLHelper + +// These modules are not meant to be synthesized. +// They are used as examples in the documentation and are only here +// to check that they compile. 
+ +// DOC include start: MyClient +class MyClient(implicit p: Parameters) extends LazyModule { + val node = TLHelper.makeClientNode(TLClientParameters( + name = "my-client", + sourceId = IdRange(0, 4), + requestFifo = true, + visibility = Seq(AddressSet(0x10000, 0xffff)))) + + lazy val module = new LazyModuleImp(this) { + val (tl, edge) = node.out(0) + + // Rest of code here + } +} +// DOC include end: MyClient + +// DOC include start: MyManager +class MyManager(implicit p: Parameters) extends LazyModule { + val device = new SimpleDevice("my-device", Seq("tutorial,my-device0")) + val beatBytes = 8 + val node = TLHelper.makeManagerNode(beatBytes, TLManagerParameters( + address = Seq(AddressSet(0x20000, 0xfff)), + resources = device.reg, + regionType = RegionType.UNCACHED, + executable = true, + supportsArithmetic = TransferSizes(1, beatBytes), + supportsLogical = TransferSizes(1, beatBytes), + supportsGet = TransferSizes(1, beatBytes), + supportsPutFull = TransferSizes(1, beatBytes), + supportsPutPartial = TransferSizes(1, beatBytes), + supportsHint = TransferSizes(1, beatBytes), + fifoId = Some(0))) + + lazy val module = new LazyModuleImp(this) { + val (tl, edge) = node.in(0) + } +} +// DOC include end: MyManager + +// DOC include start: MyClient1+MyClient2 +class MyClient1(implicit p: Parameters) extends LazyModule { + val node = TLHelper.makeClientNode("my-client1", IdRange(0, 1)) + + lazy val module = new LazyModuleImp(this) { + // ... + } +} + +class MyClient2(implicit p: Parameters) extends LazyModule { + val node = TLHelper.makeClientNode("my-client2", IdRange(0, 1)) + + lazy val module = new LazyModuleImp(this) { + // ... 
+ } +} +// DOC include end: MyClient1+MyClient2 + +// DOC include start: MyClientGroup +class MyClientGroup(implicit p: Parameters) extends LazyModule { + val client1 = LazyModule(new MyClient1) + val client2 = LazyModule(new MyClient2) + val node = TLIdentityNode() + + node := client1.node + node := client2.node + + lazy val module = new LazyModuleImp(this) { + // Nothing to do here + } +} +// DOC include end: MyClientGroup + +// DOC include start: MyManagerGroup +class MyManager1(beatBytes: Int)(implicit p: Parameters) extends LazyModule { + val node = TLHelper.makeManagerNode(beatBytes, TLManagerParameters( + address = Seq(AddressSet(0x0, 0xfff)))) + + lazy val module = new LazyModuleImp(this) { + // ... + } +} + +class MyManager2(beatBytes: Int)(implicit p: Parameters) extends LazyModule { + val node = TLHelper.makeManagerNode(beatBytes, TLManagerParameters( + address = Seq(AddressSet(0x1000, 0xfff)))) + + lazy val module = new LazyModuleImp(this) { + // ... + } +} + +class MyManagerGroup(beatBytes: Int)(implicit p: Parameters) extends LazyModule { + val man1 = LazyModule(new MyManager1(beatBytes)) + val man2 = LazyModule(new MyManager2(beatBytes)) + val node = TLIdentityNode() + + man1.node := node + man2.node := node + + lazy val module = new LazyModuleImp(this) { + // Nothing to do here + } +} +// DOC include end: MyManagerGroup + +// DOC include start: MyClientManagerComplex +class MyClientManagerComplex(implicit p: Parameters) extends LazyModule { + val client = LazyModule(new MyClientGroup) + val manager = LazyModule(new MyManagerGroup(8)) + + manager.node :=* client.node + + lazy val module = new LazyModuleImp(this) { + // Nothing to do here + } +} +// DOC include end: MyClientManagerComplex diff --git a/generators/example/src/main/scala/PWM.scala b/generators/example/src/main/scala/PWM.scala index 2cec926e..bd6056f4 100644 --- a/generators/example/src/main/scala/PWM.scala +++ b/generators/example/src/main/scala/PWM.scala @@ -10,6 +10,7 @@ import 
freechips.rocketchip.regmapper.{HasRegMap, RegField} import freechips.rocketchip.tilelink._ import freechips.rocketchip.util.UIntIsOneOf +// DOC include start: PWM generic traits case class PWMParams(address: BigInt, beatBytes: Int) class PWMBase(w: Int) extends Module { @@ -64,19 +65,23 @@ trait PWMModule extends HasRegMap { 0x08 -> Seq( RegField(1, enable))) } +// DOC include end: PWM generic traits +// DOC include start: PWMTL class PWMTL(c: PWMParams)(implicit p: Parameters) extends TLRegisterRouter( c.address, "pwm", Seq("ucbbar,pwm"), beatBytes = c.beatBytes)( new TLRegBundle(c, _) with PWMBundle)( new TLRegModule(c, _, _) with PWMModule) +// DOC include end: PWMTL class PWMAXI4(c: PWMParams)(implicit p: Parameters) extends AXI4RegisterRouter(c.address, beatBytes = c.beatBytes)( new AXI4RegBundle(c, _) with PWMBundle)( new AXI4RegModule(c, _, _) with PWMModule) +// DOC include start: HasPeripheryPWMTL trait HasPeripheryPWMTL { this: BaseSubsystem => implicit val p: Parameters @@ -88,7 +93,9 @@ trait HasPeripheryPWMTL { this: BaseSubsystem => pbus.toVariableWidthSlave(Some(portName)) { pwm.node } } +// DOC include end: HasPeripheryPWMTL +// DOC include start: HasPeripheryPWMTLModuleImp trait HasPeripheryPWMTLModuleImp extends LazyModuleImp { implicit val p: Parameters val outer: HasPeripheryPWMTL @@ -97,6 +104,7 @@ trait HasPeripheryPWMTLModuleImp extends LazyModuleImp { pwmout := outer.pwm.module.io.pwmout } +// DOC include end: HasPeripheryPWMTLModuleImp trait HasPeripheryPWMAXI4 { this: BaseSubsystem => implicit val p: Parameters diff --git a/generators/example/src/main/scala/RegisterNodeExample.scala b/generators/example/src/main/scala/RegisterNodeExample.scala new file mode 100644 index 00000000..cda91ffe --- /dev/null +++ b/generators/example/src/main/scala/RegisterNodeExample.scala @@ -0,0 +1,181 @@ +// DOC include start: MyDeviceController + +import chisel3._ +import chisel3.util._ +import freechips.rocketchip.config.Parameters +import 
freechips.rocketchip.diplomacy._ +import freechips.rocketchip.regmapper._ +import freechips.rocketchip.tilelink.TLRegisterNode + +class MyDeviceController(implicit p: Parameters) extends LazyModule { + val device = new SimpleDevice("my-device", Seq("tutorial,my-device0")) + val node = TLRegisterNode( + address = Seq(AddressSet(0x10028000, 0xfff)), + device = device, + beatBytes = 8, + concurrency = 1) + + lazy val module = new LazyModuleImp(this) { + val bigReg = RegInit(0.U(64.W)) + val mediumReg = RegInit(0.U(32.W)) + val smallReg = RegInit(0.U(16.W)) + + val tinyReg0 = RegInit(0.U(4.W)) + val tinyReg1 = RegInit(0.U(4.W)) + + node.regmap( + 0x00 -> Seq(RegField(64, bigReg)), + 0x08 -> Seq(RegField(32, mediumReg)), + 0x0C -> Seq(RegField(16, smallReg)), + 0x0E -> Seq( + RegField(4, tinyReg0), + RegField(4, tinyReg1))) + } +} + +// DOC include end: MyDeviceController + +// DOC include start: MyAXI4DeviceController +import freechips.rocketchip.amba.axi4.AXI4RegisterNode + +class MyAXI4DeviceController(implicit p: Parameters) extends LazyModule { + val node = AXI4RegisterNode( + address = AddressSet(0x10029000, 0xfff), + beatBytes = 8, + concurrency = 1) + + lazy val module = new LazyModuleImp(this) { + val bigReg = RegInit(0.U(64.W)) + val mediumReg = RegInit(0.U(32.W)) + val smallReg = RegInit(0.U(16.W)) + + val tinyReg0 = RegInit(0.U(4.W)) + val tinyReg1 = RegInit(0.U(4.W)) + + node.regmap( + 0x00 -> Seq(RegField(64, bigReg)), + 0x08 -> Seq(RegField(32, mediumReg)), + 0x0C -> Seq(RegField(16, smallReg)), + 0x0E -> Seq( + RegField(4, tinyReg0), + RegField(4, tinyReg1))) + } +} +// DOC include end: MyAXI4DeviceController + +class MyQueueRegisters(implicit p: Parameters) extends LazyModule { + val device = new SimpleDevice("my-queue", Seq("tutorial,my-queue0")) + val node = TLRegisterNode( + address = Seq(AddressSet(0x1002A000, 0xfff)), + device = device, + beatBytes = 8, + concurrency = 1) + + lazy val module = new LazyModuleImp(this) { +// DOC include start: 
MyQueueRegisters + // 4-entry 64-bit queue + val queue = Module(new Queue(UInt(64.W), 4)) + + node.regmap( + 0x00 -> Seq(RegField(64, queue.io.deq, queue.io.enq))) +// DOC include end: MyQueueRegisters + } +} + +class MySeparateQueueRegisters(implicit p: Parameters) extends LazyModule { + val device = new SimpleDevice("my-queue", Seq("tutorial,my-queue1")) + val node = TLRegisterNode( + address = Seq(AddressSet(0x1002B000, 0xfff)), + device = device, + beatBytes = 8, + concurrency = 1) + + lazy val module = new LazyModuleImp(this) { + val queue = Module(new Queue(UInt(64.W), 4)) + +// DOC include start: MySeparateQueueRegisters + node.regmap( + 0x00 -> Seq(RegField.r(64, queue.io.deq)), + 0x08 -> Seq(RegField.w(64, queue.io.enq))) +// DOC include end: MySeparateQueueRegisters + } +} + +class MyCounterRegisters(implicit p: Parameters) extends LazyModule { + val device = new SimpleDevice("my-counters", Seq("tutorial,my-counters0")) + val node = TLRegisterNode( + address = Seq(AddressSet(0x1002C000, 0xfff)), + device = device, + beatBytes = 8, + concurrency = 1) + + lazy val module = new LazyModuleImp(this) { +// DOC include start: MyCounterRegisters + val counter = RegInit(0.U(64.W)) + + def readCounter(ready: Bool): (Bool, UInt) = { + when (ready) { counter := counter - 1.U } + // Return (valid, bits) + (true.B, counter) + } + + def writeCounter(valid: Bool, bits: UInt): Bool = { + when (valid) { counter := counter + 1.U } + // Ignore bits + // Return ready + true.B + } + + node.regmap( + 0x00 -> Seq(RegField.r(64, readCounter(_))), + 0x08 -> Seq(RegField.w(64, writeCounter(_, _)))) +// DOC include end: MyCounterRegisters + } +} + +class MyCounterReqRespRegisters(implicit p: Parameters) extends LazyModule { + val device = new SimpleDevice("my-counters", Seq("tutorial,my-counters1")) + val node = TLRegisterNode( + address = Seq(AddressSet(0x1002D000, 0xfff)), + device = device, + beatBytes = 8, + concurrency = 1) + + lazy val module = new LazyModuleImp(this) { +// DOC 
include start: MyCounterReqRespRegisters + val counter = RegInit(0.U(64.W)) + + def readCounter(ivalid: Bool, oready: Bool): (Bool, Bool, UInt) = { + val responding = RegInit(false.B) + + when (ivalid && !responding) { responding := true.B } + + when (responding && oready) { + counter := counter - 1.U + responding := false.B + } + + // (iready, ovalid, obits) + (!responding, responding, counter) + } + + def writeCounter(ivalid: Bool, oready: Bool, ibits: UInt): (Bool, Bool) = { + val responding = RegInit(false.B) + + when (ivalid && !responding) { responding := true.B } + + when (responding && oready) { + counter := counter + 1.U + responding := false.B + } + + // (iready, ovalid) + (!responding, responding) + } + + node.regmap( + 0x00 -> Seq(RegField.r(64, readCounter(_, _))), + 0x08 -> Seq(RegField.w(64, writeCounter(_, _, _)))) +// DOC include end: MyCounterReqRespRegisters + } +} diff --git a/generators/example/src/main/scala/RocketConfigs.scala b/generators/example/src/main/scala/RocketConfigs.scala index 21a7d13a..cd78b9d2 100644 --- a/generators/example/src/main/scala/RocketConfigs.scala +++ b/generators/example/src/main/scala/RocketConfigs.scala @@ -39,12 +39,14 @@ class jtagRocketConfig extends Config( new freechips.rocketchip.subsystem.WithNBigCores(1) ++ new freechips.rocketchip.system.BaseConfig) +// DOC include start: PWMRocketConfig class PWMRocketConfig extends Config( new WithPWMTop ++ // use top with tilelink-controlled PWM new WithBootROM ++ new freechips.rocketchip.subsystem.WithInclusiveCache ++ new freechips.rocketchip.subsystem.WithNBigCores(1) ++ new freechips.rocketchip.system.BaseConfig) +// DOC include end: PWMRocketConfig class PWMRAXI4ocketConfig extends Config( new WithPWMAXI4Top ++ // use top with axi4-controlled PWM @@ -107,3 +109,13 @@ class Sha3RocketConfig extends Config( new freechips.rocketchip.subsystem.WithInclusiveCache ++ new freechips.rocketchip.subsystem.WithNBigCores(1) ++ new freechips.rocketchip.system.BaseConfig) + +// 
DOC include start: InitZeroRocketConfig +class InitZeroRocketConfig extends Config( + new WithInitZero(0x88000000L, 0x1000L) ++ + new WithInitZeroTop ++ + new WithBootROM ++ + new freechips.rocketchip.subsystem.WithInclusiveCache ++ + new freechips.rocketchip.subsystem.WithNBigCores(1) ++ + new freechips.rocketchip.system.BaseConfig) +// DOC include end: InitZeroRocketConfig diff --git a/generators/example/src/main/scala/Top.scala b/generators/example/src/main/scala/Top.scala index 19990817..94bed0de 100644 --- a/generators/example/src/main/scala/Top.scala +++ b/generators/example/src/main/scala/Top.scala @@ -30,6 +30,7 @@ class TopModule[+L <: Top](l: L) extends SystemModule(l) with DontTouch //--------------------------------------------------------------------------------------------------------- +// DOC include start: TopWithPWMTL class TopWithPWMTL(implicit p: Parameters) extends Top with HasPeripheryPWMTL { @@ -39,6 +40,7 @@ class TopWithPWMTL(implicit p: Parameters) extends Top class TopWithPWMTLModule(l: TopWithPWMTL) extends TopModule(l) with HasPeripheryPWMTLModuleImp +// DOC include end: TopWithPWMTL //--------------------------------------------------------------------------------------------------------- class TopWithPWMAXI4(implicit p: Parameters) extends Top @@ -78,3 +80,14 @@ class TopWithDTM(implicit p: Parameters) extends System } class TopWithDTMModule[+L <: TopWithDTM](l: L) extends SystemModule(l) + +//--------------------------------------------------------------------------------------------------------- +// DOC include start: TopWithInitZero +class TopWithInitZero(implicit p: Parameters) extends Top + with HasPeripheryInitZero { + override lazy val module = new TopWithInitZeroModuleImp(this) +} + +class TopWithInitZeroModuleImp(l: TopWithInitZero) extends TopModule(l) + with HasPeripheryInitZeroModuleImp +// DOC include end: TopWithInitZero
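The ``InitZero`` state machine added in ``InitZero.scala`` (``s_init``, ``s_write``, ``s_resp``, ``s_done``) can be traced with a plain-Scala software model. This is a rough sketch rather than the hardware itself: ``InitZeroModel`` and ``run`` are hypothetical names, each loop iteration stands in for one state transition instead of one clock cycle, and the 64-byte block size used in ``main`` is an assumption about ``CacheBlockBytes``. The ``size > 0`` check is an extra guard for the model only; the Chisel module only requires that the size be a multiple of the block size.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical software model of the InitZero FSM; no Chisel required.
object InitZeroModel {
  sealed trait State
  case object Init extends State
  case object Write extends State
  case object Resp extends State
  case object Done extends State

  /** Returns the block addresses that would be written, in order. */
  def run(base: BigInt, size: BigInt, blockBytes: Int): Seq[BigInt] = {
    val step = BigInt(blockBytes)
    require(size % step == BigInt(0), "size must be a multiple of the block size")
    require(size > BigInt(0), "model-only guard: size must be positive")

    var state: State = Init
    var addr = BigInt(0)
    var bytesLeft = BigInt(0)
    val writes = ArrayBuffer.empty[BigInt]

    while (state != Done) {
      state match {
        case Init =>
          // Load the base address and the byte count from the config.
          addr = base
          bytesLeft = size
          state = Write
        case Write =>
          // One Put of a full block, then advance to the next block.
          writes += addr
          addr = addr + step
          bytesLeft = bytesLeft - step
          state = Resp
        case Resp =>
          // After the acknowledgement, finish or issue the next write.
          state = if (bytesLeft == BigInt(0)) Done else Write
        case Done => ()
      }
    }
    writes.toSeq
  }

  def main(args: Array[String]): Unit = {
    // Base and size taken from InitZeroRocketConfig; 64 is an assumed block size.
    val writes = run(BigInt(0x88000000L), BigInt(0x1000L), 64)
    println(s"${writes.length} block writes, first=0x${writes.head.toString(16)}, last=0x${writes.last.toString(16)}")
  }
}
```

With the base and size from ``InitZeroRocketConfig`` and the assumed 64-byte block, the model issues 64 writes, one per block of the 4 KiB region.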