Vortex 2.0 changes:
+ Microarchitecture optimizations + 64-bit support + Xilinx FPGA support + LLVM-16 support + Refactoring and quality control fixes minor update minor update minor update minor update minor update minor update cleanup cleanup cache bindings and memory perf refactory minor update minor update hw unit tests fixes minor update minor update minor update minor update minor update minor udpate minor update minor update minor update minor update minor update minor update minor update minor updates minor updates minor update minor update minor update minor update minor update minor update minor updates minor updates minor updates minor updates minor update minor update
This commit is contained in:
@@ -40,9 +40,9 @@ VX.cache.v is the top module of the cache verilog code located in the `/hw/rtl/c
|
||||
- Core Response Merge
|
||||
- Cache accesses one line at a time. As a result, each request may not come back in the same response. This module tries to recombine the responses by thread ID.
|
||||
|
||||
### VX_bank.v
|
||||
### VX_cache_bank.v
|
||||
|
||||
VX_bank.v is the verilog code that handles cache bank functionality and is located in the `/hw/rtl/cache` directory.
|
||||
VX_cache_bank.v is the verilog code that handles cache bank functionality and is located in the `/hw/rtl/cache` directory.
|
||||
|
||||

|
||||
|
||||
|
||||
@@ -3,38 +3,39 @@
|
||||
The directory/file layout of the Vortex codebase is as followed:
|
||||
|
||||
- `hw`:
|
||||
- `rtl`: hardware rtl sources
|
||||
- `cache`: cache subsystem code
|
||||
- `fp_cores`: floating point unit code
|
||||
- `rtl`: hardware rtl sources
|
||||
- `core`: core pipeline
|
||||
- `cache`: cache subsystem
|
||||
- `mem`: memory subsystem
|
||||
- `fpu`: floating point unit
|
||||
- `interfaces`: interfaces for inter-module communication
|
||||
- `libs`: general-purpose RTL modules
|
||||
- `libs`: general-purpose RTL modules
|
||||
- `syn`: synthesis directory
|
||||
- `opae`: OPAE synthesis scripts
|
||||
- `quartus`: Quartus synthesis scripts
|
||||
- `altera`: Altera synthesis scripts
|
||||
- `xilinx`: Xilinx synthesis scripts
|
||||
- `synopsys`: Synopsys synthesis scripts
|
||||
- `modelsim`: Modelsim synthesis scripts
|
||||
- `yosys`: Yosys synthesis scripts
|
||||
- `unit_tests`: unit tests for some hardware components
|
||||
- `driver`: host drivers repository
|
||||
- `runtime`: host runtime software APIs
|
||||
- `include`: Vortex driver public headers
|
||||
- `stub`: Vortex stub driver library
|
||||
- `fpga`: software driver that uses Intel OPAE FPGA
|
||||
- `asesim`: software driver that uses Intel ASE simulator
|
||||
- `vlsim`: software driver that uses vlsim simulator
|
||||
- `opae`: software driver that uses Intel OPAE API with device targets=fpga|asesim|opaesim
|
||||
- `xrt`: software driver that uses Xilinx XRT API with device targets=hw|hw_emu|sw_emu
|
||||
- `rtlsim`: software driver that uses rtlsim simulator
|
||||
- `simx`: software driver that uses simX simulator
|
||||
- `runtime`: kernel runtime software
|
||||
- `kernel`: GPU kernel software APIs
|
||||
- `include`: Vortex runtime public headers
|
||||
- `linker`: linker file for compiling kernels
|
||||
- `src`: runtime implementation
|
||||
- `sim`:
|
||||
- `vlsim`: AFU RTL simulator
|
||||
- `opaesim`: Intel OPAE AFU RTL simulator
|
||||
- `rtlsim`: processor RTL simulator
|
||||
- `simX`: cycle approximate simulator for vortex
|
||||
- `tests`: tests repository.
|
||||
- `runtime`: runtime tests
|
||||
- `regression`: regression tests
|
||||
- `riscv`: RISC-V standard tests
|
||||
- `riscv`: RISC-V conformance tests
|
||||
- `kernel`: kernel tests
|
||||
- `regression`: regression tests
|
||||
- `opencl`: opencl benchmarks and tests
|
||||
- `ci`: continuous integration scripts
|
||||
- `miscs`: miscellaneous resources.
|
||||
|
||||
@@ -1,29 +1,37 @@
|
||||
# Debugging Vortex Hardware
|
||||
# Debugging Vortex GPU
|
||||
|
||||
## Testing changes to the RTL or simulator GPU driver.
|
||||
|
||||
The Blackbox utility script will not pick up your changes if the h/w configuration is the same as during teh last run.
|
||||
To force the utility to build the driver, you need pass the --rebuild=1 option when running tests.
|
||||
Using --rebuild=0 will prevent the rebuild even if the h/w configuration is different from last run.
|
||||
|
||||
$ ./ci/blackbox.sh --driver=simx --app=demo --rebuild=1
|
||||
|
||||
## SimX Debugging
|
||||
|
||||
SimX cycle-approximate simulator allows faster debugging of Vortex kernels' execution.
|
||||
The recommended method to enable debugging is to pass the `--debug` flag to `blackbox` tool when running a program.
|
||||
The recommended method to enable debugging is to pass the `--debug=<level>` flag to `blackbox` tool when running a program.
|
||||
|
||||
// Running demo program on SimX in debug mode
|
||||
$ ./ci/blackbox.sh --driver=simx --app=demo --debug
|
||||
$ ./ci/blackbox.sh --driver=simx --app=demo --debug=1
|
||||
|
||||
A debug trace `run.log` is generated in the current directory during the program execution. The trace includes important states of the simulated processor (decoded instruction, register states, pipeline states, etc..). You can increase the verbosity level of the trace by changing the `DEBUG_LEVEL` variable to a value [1-5] (default is 3).
|
||||
A debug trace `run.log` is generated in the current directory during the program execution. The trace includes important states of the simulated processor (decoded instruction, register states, pipeline states, etc..). You can increase the verbosity of the trace by changing the debug level.
|
||||
|
||||
// Using SimX in debug mode with verbose level 4
|
||||
$ CONFIGS=-DDEBUG_LEVEL=4 ./ci/blackbox.sh --driver=simx --app=demo --debug
|
||||
// Using SimX in debug mode with verbose level 3
|
||||
$ ./ci/blackbox.sh --driver=simx --app=demo --debug=3
|
||||
|
||||
## RTL Debugging
|
||||
|
||||
To debug the processor RTL, you need to use VLSIM or RTLSIM driver. VLSIM simulates the full processor including the AFU command processor (using `/rtl/afu/vortex_afu.sv` as top module). RTLSIM simulates the Vortex processor only (using `/rtl/Vortex.v` as top module).
|
||||
To debug the processor RTL, you need to use VLSIM or RTLSIM driver. VLSIM simulates the full processor including the AFU command processor (using `/rtl/afu/opae/vortex_afu.sv` as top module). RTLSIM simulates the Vortex processor only (using `/rtl/Vortex.v` as top module).
|
||||
|
||||
The recommended method to enable debugging is to pass the `--debug` flag to `blackbox` tool when running a program.
|
||||
|
||||
// Running demo program on vlsim in debug mode
|
||||
$ ./ci/blackbox.sh --driver=vlsim --app=demo --debug
|
||||
// Running demo program on the opae simulator in debug mode
|
||||
$ TARGET=opaesim ./ci/blackbox.sh --driver=opae --app=demo --debug=1
|
||||
|
||||
// Running demo program on rtlsim in debug mode
|
||||
$ ./ci/blackbox.sh --driver=rtlsim --app=demo --debug
|
||||
$ ./ci/blackbox.sh --driver=rtlsim --app=demo --debug=1
|
||||
|
||||
A debug trace `run.log` is generated in the current directory during the program execution. The trace includes important states of the simulated processor (memory, caches, pipeline, stalls, etc..). A waveform trace `trace.vcd` is also generated in the current directory during the program execution. You can visualize the waveform trace using any tool that can open VCD files (Modelsim, Quartus, Vivado, etc..). [GTKwave] (http://gtkwave.sourceforge.net) is a great open-source scope analyzer that also works with VCD files.
|
||||
|
||||
@@ -32,7 +40,7 @@ A debug trace `run.log` is generated in the current directory during the program
|
||||
Debugging the FPGA directly may be necessary to investigate runtime bugs that the RTL simulation cannot catch. We have implemented an in-house scope analyzer for Vortex that works when the FPGA is running. To enable the FPGA scope analyzer, the FPGA bitstream should be built using `SCOPE=1` flag
|
||||
|
||||
& cd /hw/syn/opae
|
||||
$ CONFIGS=-DSCOPE=1 make fpga-4c
|
||||
$ CONFIGS="-DSCOPE=1" TARGET=fpga make
|
||||
|
||||
When running the program on the FPGA, you need to pass the `--scope` flag to the `blackbox` tool.
|
||||
|
||||
@@ -40,4 +48,18 @@ When running the program on the FPGA, you need to pass the `--scope` flag to the
|
||||
$ ./ci/blackbox.sh --driver=fpga --app=demo --scope
|
||||
|
||||
|
||||
A waveform trace `trace.vcd` will be generated in the current directory during the program execution. This trace includes a limited set of signals that are defined in `/hw/scripts/scope.json`. You can expand your signals' selection by updating the json file.
|
||||
A waveform trace `trace.vcd` will be generated in the current directory during the program execution. This trace includes a limited set of signals that are defined in `/hw/scripts/scope.json`. You can expand your signals' selection by updating the json file.
|
||||
|
||||
## Analyzing Vortex trace log
|
||||
|
||||
When debugging Vortex RTL or SimX Simulator, reading the trace run.log file can be overwhelming when the trace gets really large.
|
||||
We provide a trace sanitizer tool under ./hw/scripts/trace_csv.py that you can use to convert the large trace into a CSV file containing all the instructions that executed with their source and destination operands.
|
||||
|
||||
$ ./ci/blackbox.sh --driver=rtlsim --app=demo --debug=3 --log=run_rtlsim.log
|
||||
$ ./ci/trace_csv.py -trtlsim run_rtlsim.log -otrace_rtlsim.csv
|
||||
|
||||
$ ./ci/blackbox.sh --driver=simx --app=demo --debug=3 --log=run_simx.log
|
||||
$ ./ci/trace_csv.py -tsimx run_simx.log -otrace_simx.csv
|
||||
|
||||
The first column in the CSV trace is UUID (universal unique identifier) of the instruction and the content is sorted by the UUID. You can use the UUID to trace the same instruction running on either the RTL hw or SimX simulator.
|
||||
This can be very effective if you want to use SimX to debugging your RTL hardware by comparing CSV traces.
|
||||
@@ -1,128 +0,0 @@
|
||||
# Execute OpenCL on Vortex backend
|
||||
|
||||
## Requirements
|
||||
- [Vortex](https://github.com/vortexgpgpu/vortex)
|
||||
- [POCL for Vortex](https://github.com/vortexgpgpu/pocl)
|
||||
- [riscv-toolchain](https://github.com/riscv-collab/riscv-gnu-toolchain)
|
||||
- [llvm-riscv](https://github.com/llvm-mirror/llvm)
|
||||
|
||||
For installation, please see [Build Instructions](../README.md) for more details.
|
||||
|
||||
**For Ubuntu18.04 users, you can directly download pre-build toolchains with [toolchain_install.sh](https://github.com/vortexgpgpu/vortex/blob/master/ci/toolchain_install.sh) script.**
|
||||
```bash
|
||||
# please modify the DESTDIR variable in the script before execution
|
||||
bash toolchain_install.sh -all
|
||||
```
|
||||
Assuming we have installed all dependencies in `/opt` path, we can get the following environment:
|
||||
```bash
|
||||
tree -L 2 /opt
|
||||
'''
|
||||
/opt/
|
||||
├── llvm-riscv
|
||||
│ ├── bin
|
||||
│ ├── include
|
||||
│ ├── lib
|
||||
│ ├── libexec
|
||||
│ └── share
|
||||
├── pocl
|
||||
│ ├── compiler
|
||||
│ └── runtime
|
||||
├── riscv-gnu-toolchain
|
||||
│ ├── bin
|
||||
│ ├── drops
|
||||
│ ├── include
|
||||
│ ├── lib
|
||||
│ ├── libexec
|
||||
│ ├── riscv32-unknown-elf
|
||||
│ ├── share
|
||||
│ └── var
|
||||
└── verilator
|
||||
├── bin
|
||||
├── examples
|
||||
├── include
|
||||
├── verilator-config.cmake
|
||||
└── verilator-config-version.cmake
|
||||
'''
|
||||
```
|
||||
## Execute OpenCL on Vortex
|
||||
In this tutorial, we show the example of executing a vecadd programs on SIMX backend.
|
||||
To execute a OpenCL program on Vortex, we have the following steps:
|
||||
- Compile the [OpenCL kernels](https://github.com/vortexgpgpu/vortex/blob/master/tests/opencl/vecadd/kernel.cl) into risc-v binary by POCL compiler.
|
||||
- Compile the [OpenCL host](https://github.com/vortexgpgpu/vortex/blob/master/tests/opencl/vecadd/main.cc) and link with Vortex driver(```-lvortex```).
|
||||
- Execute the compiled host programs on a backend.
|
||||
|
||||
Thus, we can write a Makefile as following:
|
||||
```Makefile
|
||||
LLVM_PREFIX ?= /opt/llvm-riscv
|
||||
RISCV_TOOLCHAIN_PATH ?= /opt/riscv-gnu-toolchain
|
||||
SYSROOT ?= $(RISCV_TOOLCHAIN_PATH)/riscv32-unknown-elf
|
||||
POCL_CC_PATH ?= /opt/pocl/compiler
|
||||
POCL_RT_PATH ?= /opt/pocl/runtime
|
||||
|
||||
OPTS ?= -n64
|
||||
|
||||
# please edit these two variable to your environment
|
||||
VORTEX_DRV_PATH ?= $(realpath ../../../driver)
|
||||
VORTEX_RT_PATH ?= $(realpath ../../../runtime)
|
||||
|
||||
K_LLCFLAGS += "-O3 -march=riscv32 -target-abi=ilp32f -mcpu=generic-rv32 -mattr=+m,+f -mattr=+vortex -float-abi=hard -code-model=small"
|
||||
K_CFLAGS += "-v -O3 --sysroot=$(SYSROOT) --gcc-toolchain=$(RISCV_TOOLCHAIN_PATH) -march=rv32imf -mabi=ilp32f -Xclang -target-feature -Xclang +vortex -I$(VORTEX_RT_PATH)/include -fno-rtti -fno-exceptions -ffreestanding -nostartfiles -fdata-sections -ffunction-sections"
|
||||
K_LDFLAGS += "-Wl,-Bstatic,-T$(VORTEX_RT_PATH)/linker/vx_link.ld -Wl,--gc-sections $(VORTEX_RT_PATH)/libvortexrt.a -lm"
|
||||
|
||||
CXXFLAGS += -std=c++11 -O2 -Wall -Wextra -Wfatal-errors
|
||||
|
||||
CXXFLAGS += -Wno-deprecated-declarations -Wno-unused-parameter
|
||||
|
||||
CXXFLAGS += -I$(POCL_RT_PATH)/include
|
||||
|
||||
LDFLAGS += -L$(POCL_RT_PATH)/lib -L$(VORTEX_DRV_PATH)/stub -lOpenCL -lvortex
|
||||
|
||||
PROJECT = vecadd
|
||||
|
||||
SRCS = main.cc
|
||||
|
||||
all: $(PROJECT) kernel.pocl
|
||||
|
||||
kernel.pocl: kernel.cl
|
||||
LLVM_PREFIX=$(LLVM_PREFIX) POCL_DEBUG=all LD_LIBRARY_PATH=$(LLVM_PREFIX)/lib:$(POCL_CC_PATH)/lib $(POCL_CC_PATH)/bin/poclcc -LLCFLAGS $(K_LLCFLAGS) -CFLAGS $(K_CFLAGS) -LDFLAGS $(K_LDFLAGS) -o kernel.pocl kernel.cl
|
||||
|
||||
$(PROJECT): $(SRCS)
|
||||
$(CXX) $(CXXFLAGS) $^ $(LDFLAGS) -o $@
|
||||
|
||||
run-fpga: $(PROJECT) kernel.pocl
|
||||
LD_LIBRARY_PATH=$(POCL_RT_PATH)/lib:$(VORTEX_DRV_PATH)/fpga:$(LD_LIBRARY_PATH) ./$(PROJECT) $(OPTS)
|
||||
|
||||
run-asesim: $(PROJECT) kernel.pocl
|
||||
LD_LIBRARY_PATH=$(POCL_RT_PATH)/lib:$(VORTEX_DRV_PATH)/asesim:$(LD_LIBRARY_PATH) ./$(PROJECT) $(OPTS)
|
||||
|
||||
run-vlsim: $(PROJECT) kernel.pocl
|
||||
LD_LIBRARY_PATH=$(POCL_RT_PATH)/lib:$(VORTEX_DRV_PATH)/vlsim:$(LD_LIBRARY_PATH) ./$(PROJECT) $(OPTS)
|
||||
|
||||
run-simx: $(PROJECT) kernel.pocl
|
||||
LD_LIBRARY_PATH=$(POCL_RT_PATH)/lib:$(VORTEX_DRV_PATH)/simx:$(LD_LIBRARY_PATH) ./$(PROJECT) $(OPTS)
|
||||
|
||||
run-rtlsim: $(PROJECT) kernel.pocl
|
||||
LD_LIBRARY_PATH=$(POCL_RT_PATH)/lib:$(VORTEX_DRV_PATH)/rtlsim:$(LD_LIBRARY_PATH) ./$(PROJECT) $(OPTS)
|
||||
|
||||
.depend: $(SRCS)
|
||||
$(CXX) $(CXXFLAGS) -MM $^ > .depend;
|
||||
|
||||
clean:
|
||||
rm -rf $(PROJECT) *.o .depend
|
||||
|
||||
clean-all: clean
|
||||
rm -rf *.pocl *.dump
|
||||
|
||||
ifneq ($(MAKECMDGOALS),clean)
|
||||
-include .depend
|
||||
endif
|
||||
```
|
||||
|
||||
First, build the host program.
|
||||
```bash
|
||||
make all
|
||||
```
|
||||
If we want to execute on SIMX, we can execute the command below.
|
||||
```bash
|
||||
make run-simx
|
||||
```
|
||||
@@ -17,22 +17,16 @@ OPAE Build
|
||||
------------------
|
||||
|
||||
The FPGA has to following configuration options:
|
||||
- 1 core fpga (fpga-1c)
|
||||
- 2 cores fpga (fpga-2c)
|
||||
- 4 cores fpga (fpga-4c)
|
||||
- 8 cores fpga (fpga-8c)
|
||||
- 16 cores fpga (fpga-16c)
|
||||
- 32 cores fpga (fpga-32c)
|
||||
- 64 cores fpga (fpga-64c)
|
||||
- DEVICE_FAMILY=arria10 | stratix10
|
||||
- NUM_CORES=#n
|
||||
|
||||
Command line:
|
||||
|
||||
$ cd hw/syn/opae
|
||||
$ make fpga-<num-of-cores>c
|
||||
$ cd hw/syn/altera/opae
|
||||
$ PREFIX=test1 TARGET=fpga NUM_CORES=4 make
|
||||
|
||||
Example: `make fpga-4c`
|
||||
|
||||
A new folder (ex: `build_fpga_4c`) will be created and the build will start and take ~30-480 min to complete.
|
||||
A new folder (ex: `test1_xxx_4c`) will be created and the build will start and take ~30-480 min to complete.
|
||||
Setting TARGET=ase will build the project for simulation using Intel ASE.
|
||||
|
||||
|
||||
OPAE Build Configuration
|
||||
@@ -45,35 +39,32 @@ The hardware configuration file `/hw/rtl/VX_config.vh` defines all the hardware
|
||||
|
||||
You configure the syntesis build from the command line:
|
||||
|
||||
$ CONFIGS="-DPERF_ENABLE -DNUM_THREADS=8" make fpga-4c
|
||||
$ CONFIGS="-DPERF_ENABLE -DNUM_THREADS=8" make
|
||||
|
||||
OPAE Build Progress
|
||||
-------------------
|
||||
|
||||
You could check the last 10 lines in the build log for possible errors until build completion.
|
||||
|
||||
$ tail -n 10 ./build_fpga_<num-of-cores>c/build.log
|
||||
$ tail -n 10 <build_dir>/build.log
|
||||
|
||||
Check if the build is still running by looking for quartus_sh, quartus_syn, or quartus_fit programs.
|
||||
|
||||
$ ps -u <username>
|
||||
|
||||
|
||||
If the build fails and you need to restart it, clean up the build folder using the following command:
|
||||
|
||||
$ make clean-fpga-<num-of-cores>c
|
||||
|
||||
Example: `make clean-fpga-4c`
|
||||
$ make clean
|
||||
|
||||
The file `vortex_afu.gbs` should exist when the build is done:
|
||||
|
||||
$ ls -lsa ./build_fpga_<num-of-cores>c/vortex_afu.gbs
|
||||
$ ls -lsa <build_dir>/vortex_afu.gbs
|
||||
|
||||
|
||||
Signing the bitstream and Programming the FPGA
|
||||
----------------------------------------------
|
||||
|
||||
$ cd ./build_fpga_<num-of-cores>c
|
||||
$ cd <build_dir>
|
||||
$ PACSign PR -t UPDATE -H openssl_manager -i vortex_afu.gbs -o vortex_afu_unsigned_ssl.gbs
|
||||
$ fpgasupdate vortex_afu_unsigned_ssl.gbs
|
||||
|
||||
|
||||
@@ -11,7 +11,6 @@
|
||||
- [Debugging](debugging.md)
|
||||
- [Useful Links](references.md)
|
||||
|
||||
|
||||
## Installation
|
||||
|
||||
- Refer to the build instructions in [README](../README.md).
|
||||
@@ -22,9 +21,11 @@ Running Vortex simulators with different configurations:
|
||||
- Run basic driver test with rtlsim driver and Vortex config of 2 clusters, 2 cores, 2 warps, 4 threads
|
||||
|
||||
$ ./ci/blackbox.sh --driver=rtlsim --clusters=2 --cores=2 --warps=2 --threads=4 --app=basic
|
||||
- Run demo driver test with vlsim driver and Vortex config of 1 clusters, 4 cores, 4 warps, 2 threads
|
||||
|
||||
$ ./ci/blackbox.sh --driver=vlsim --clusters=1 --cores=4 --warps=4 --threads=2 --app=demo
|
||||
- Run demo driver test with opae driver and Vortex config of 1 clusters, 4 cores, 4 warps, 2 threads
|
||||
|
||||
$ ./ci/blackbox.sh --driver=opae --clusters=1 --cores=4 --warps=4 --threads=2 --app=demo
|
||||
|
||||
- Run dogfood driver test with simx driver and Vortex config of 4 cluster, 4 cores, 8 warps, 6 threads
|
||||
|
||||
$ ./ci/blackbox.sh --driver=simx --clusters=4 --cores=4 --warps=8 --threads=6 --app=dogfood
|
||||
@@ -10,7 +10,7 @@ SimX is a C++ cycle-level in-house simulator developed for Vortex. The relevant
|
||||
|
||||
### FGPA Simulation
|
||||
|
||||
The current target FPGA for simulation is the Arria10 Intel Accelerator Card v1.0. The guide to build the fpga with specific configurations is located [here.](https://github.com/vortexgpgpu/vortex-dev/blob/master/doc/FPGA_Startup_Guide.md)
|
||||
The current target FPGA for simulation is the Arria10 Intel Accelerator Card v1.0. The guide to build the fpga with specific configurations is located [here.](fpga_setup.md)
|
||||
|
||||
### How to Test
|
||||
|
||||
@@ -22,15 +22,15 @@ Running tests under specific drivers (rtlsim,simx,fpga) is done using the script
|
||||
- *Threads* - used to specify the number of threads (smallest unit of computation) within a configuration.
|
||||
- *L2cache* - used to enable the shard l2cache among the Vortex cores.
|
||||
- *L3cache* - used to enable the shared l3cache among the Vortex clusters.
|
||||
- *Driver* - used to specify which driver to run the Vortex simulation (either rtlsim, vlsim, fpga, or simx).
|
||||
- *Driver* - used to specify which driver to run the Vortex simulation (either rtlsim, opae, xrt, simx).
|
||||
- *Debug* - used to enable debug mode for the Vortex simulation.
|
||||
- *Perf* - used to enable the detailed performance counters within the Vortex simulation.
|
||||
- *App* - used to specify which test/benchmark to run in the Vortex simulation. The main choices are vecadd, sgemm, basic, demo, and dogfood. Other tests/benchmarks are located in the `/benchmarks/opencl` folder though not all of them work wit the current version of Vortex.
|
||||
- *Args* - used to pass additional arguments to the application.
|
||||
|
||||
Example use of command line arguments: Run the sgemm benchmark using the vlsim driver with a Vortex configuration of 1 cluster, 4 cores, 4 warps, and 4 threads.
|
||||
Example use of command line arguments: Run the sgemm benchmark using the opae driver with a Vortex configuration of 1 cluster, 4 cores, 4 warps, and 4 threads.
|
||||
|
||||
$ ./ci/blackbox.sh --clusters=1 --cores=4 --warps=4 --threads=4 --driver=vlsim --app=sgemm
|
||||
$ ./ci/blackbox.sh --clusters=1 --cores=4 --warps=4 --threads=4 --driver=opae --app=sgemm
|
||||
|
||||
Output from terminal:
|
||||
```
|
||||
|
||||
33
docs/testing.md
Normal file
33
docs/testing.md
Normal file
@@ -0,0 +1,33 @@
|
||||
# Testing
|
||||
|
||||
## Running a Vortex application
|
||||
|
||||
The framework provides a utility script: blakcbox.sh under the /ci/ folder for executing applications in the tests tree.
|
||||
You can query the commandline options of the tool using:
|
||||
|
||||
$ ./ci/blakcbox.sh --help
|
||||
|
||||
To execute sgemm test program on the simx driver and passing "-n10" as argument to sgemm:
|
||||
|
||||
$ ./ci/blakcbox.sh --driver=simx --app=sgemm --args="-n10"
|
||||
|
||||
You can execute the same application of a GPU architecture with 2 cores:
|
||||
|
||||
$ ./ci/blakcbox.sh --core=2 --driver=simx --app=sgemm --args="-n10"
|
||||
|
||||
When excuting, Blackbox needs to recompile the driver if the desired architecture changes.
|
||||
It tracks the latest configuration in a file under the current directory blackbox.<driver>.cache.
|
||||
To avoid having to rebuild the driver all the time, Blackbox checks if the latest cached configuration matches the current.
|
||||
|
||||
## Running Benchmarks
|
||||
|
||||
The Vortex test suite is located under the /test/ folder
|
||||
You can execute the default regression suite by running the following commands at the root folder.
|
||||
|
||||
$ make -C tests/regression run-simx
|
||||
$ make -C tests/regression run-rtlsim
|
||||
|
||||
You can execute the default opncl suite by running the following commands at the root folder.
|
||||
|
||||
$ make -C tests/opencl run-simx
|
||||
$ make -C tests/opencl run-rtlsim
|
||||
Reference in New Issue
Block a user