Compare commits

..

46 Commits

Author SHA1 Message Date
29f75e60a5 Merge remote-tracking branch 'origin/IRPrinter' into IRPrinter 2025-06-23 00:24:19 +08:00
9d8930f5df fix % repeat in IR print 2025-06-23 00:22:15 +08:00
3da2f3ec80 修复函数类型判断,终端跑通所有测试代码。Printer格式需要修复 2025-06-22 18:40:33 +08:00
496e2abfb6 构建IR打印器,llvm风格,跑通大部分样例(9/10),待修复 2025-06-22 17:59:19 +08:00
4711fb603b fixed bugs brought out by merging 2025-06-22 14:39:38 +08:00
dda8bbe444 Merge branch 'array_add' 2025-06-22 14:24:00 +08:00
d90330af3f add Utils::initExternalFunction 2025-06-22 14:14:02 +08:00
25a8c72a9b [backend] it works 1.0 2025-06-22 14:06:14 +08:00
4828c18f96 前端基本构建完毕,build前端部分无报错,argument类删除后端报错,llvmIR输出待完成 2025-06-22 00:25:43 +08:00
73b382773a 暂存旧符号表结构定义,TODO.md中添加相关说明 2025-06-21 18:07:32 +08:00
0a04c816cf 更新IR,.g4修改 2025-06-21 18:06:29 +08:00
232ed6d023 [backend] introduced rv32 backend 2025-06-21 17:26:50 +08:00
3ed1c7fecd 更改前置声明,IR生成更新 2025-06-21 16:39:13 +08:00
ba5f2a0620 删除格式化功能 2025-06-21 15:40:00 +08:00
8109d44232 工具类方法部分实现,实现部分IR生成 2025-06-21 14:33:56 +08:00
2b038e671b 修复bug 2025-06-21 14:33:22 +08:00
c1583e447d 更改g4文件,优化IR生成流程 2025-06-21 13:44:51 +08:00
30f89bba23 更新IR结构,重写IRBuilder 2025-06-21 12:53:41 +08:00
c54543bff3 更新目录结构,修改IR结构,部分修复IR生成 2025-06-20 22:46:04 +08:00
1de8c0e7d7 引入了常量池优化,修改constvalue类并对IR生成修复,能够编译通过 2025-06-19 00:18:58 +08:00
1aa785efc3 add arraytype def 2025-06-16 20:56:32 +08:00
5727d3bde5 [IR Gen] debugging SIGSEGV 2025-06-09 21:11:17 +08:00
3c5fb7d17b [IR Gen] fixed build errors 2025-06-09 20:06:05 +08:00
7d08763b2e [IR gen] debugging 2025-06-09 19:30:37 +08:00
c47d522e3a [IR Gen] debugging expreimental IR generator 2025-06-09 19:29:59 +08:00
5e84961dcf [IR gen] introduced IR builder into LLVMIRGenerator 2025-06-09 00:47:47 +08:00
df209f976e fixed bugs brought out by merging 2025-05-30 02:13:17 +08:00
969c83125d Merge branch 'lab2-IRGen' 2025-05-30 02:06:43 +08:00
fbb2f5f310 replace "i++;" with "i = i + 1;" in testcase 20 2025-05-30 01:57:24 +08:00
77f79dcbaf merging, fixed some bugs 2025-05-30 01:34:47 +08:00
551d727733 merging 2025-05-29 22:09:16 +08:00
1c799bd04f merging 2025-05-29 19:25:46 +08:00
09d67fdaf1 merging branch lab2-IRGen into master 2025-05-29 17:14:42 +08:00
1e47af2771 merging branch lab2-IRGen into master 2025-05-29 16:09:17 +08:00
bb73ce3b5a merging branch lab2-IRGen into master 2025-05-28 23:49:02 +08:00
90ba6db318 [lab2]IR-Gen implementation using given code by xxy
Merge pull request !4 from Downright/master
2025-03-26 11:42:19 +00:00
a35c63245e Remove .vscode dir 2025-03-26 18:34:36 +08:00
f01c38d3e8 commit 4 cmakelist and .gitignore 2025-03-26 18:29:17 +08:00
f74d319399 pass test 11_add2 2025-03-26 11:44:34 +08:00
1322ed8e08 gdb json 2025-03-26 11:39:29 +08:00
9bea0879e0 pass test1,but test2 segmentation fault 2025-03-26 11:39:22 +08:00
7f364abffb frame finished but bad_any_cast 2025-03-24 19:06:49 +08:00
a36f73c8a2 add file 2025-03-24 00:44:52 +08:00
3e80cb8f3f add tag4stmt in .g4,format cpp modification needed 2025-03-19 20:35:37 +08:00
d4d7e6494b exp2 first commit 2025-03-19 19:06:14 +08:00
qzx
a6f95366c8 fix a bug of issue1 2025-03-06 11:22:13 +08:00
43 changed files with 6932 additions and 2779 deletions

7
.gitignore vendored
View File

@ -37,8 +37,9 @@ doxygen
!/testdata/functional/*.out
!/testdata/performance/*.out
build
build/
.antlr
.vscode/
tmp
@ -48,4 +49,6 @@ GTAGS
__init__.py
*.pyc
*.pyc
.DS_*

View File

@ -29,7 +29,8 @@ sudo apt install -y uuid-dev libutfcpp-dev pkg-config make git cmake openjdk-11-
git clone https://gitee.com/xsu1989/sysy.git
cd sysy
cmake -S . -B build
cmake --build build
cmake --build build --target all -- -j $(nproc)
cmake --build build --target all -- -j $(nproc) -DCMAKE_BUILD_TYPE=Debug
```
构建完成后,可以运行一个小的测试用例。该测试将逗号分隔的整数或字符串列表进行格式化后重新输出,即将相邻参数之间的分隔统一调整为逗号外加一个空格。
@ -81,7 +82,7 @@ cmake_install.cmake SysYBaseVisitor.h SysYLexer.cpp SysYLexer.interp SysYP
完成格式化器进阶内容后,可以通过执行`sysyc`观察格式化器的执行效果。
```bash
./build/sysyc -f test/format-test.sy
./build/bin/sysyc -f test/format-test.sy
```
## 实验2从AST生成中间表示
@ -100,7 +101,7 @@ cmake_install.cmake SysYBaseVisitor.h SysYLexer.cpp SysYLexer.interp SysYP
完成实验2后可以通过执行`sysyc`观察生成的IR。
```bash
./build/sysyc -s ir 01_add.sy
./build/bin/sysyc -s ir test/01_add.sy
```
请同学们仔细阅读代码学习IR的定义。可以使用doxygen工具自动生成HTML文档
@ -112,6 +113,8 @@ doxygen doc/Doxyfile
上述命令执行完毕后将在doxygen/html下找到生成的代码文档。
该版本还未实现数组初始化定义等功能
## 实验3从SysY IR 生成ARMv7汇编代码
### 后端相关源码

87
TODO.md Normal file
View File

@ -0,0 +1,87 @@
要打通从SysY到RISC-V的完整编译流程以下是必须实现的核心模块和关键步骤按编译流程顺序。在你们当前IR生成阶段可以优先实现这些基础模块来快速获得可工作的RISC-V汇编输出
### 1. **前端必须模块**
- **词法/语法分析**(已完成):
- `SysYLexer`/`SysYParser`ANTLR生成的解析器
- **IR生成核心**
- `SysYIRGenerator`将AST转换为中间表示IR
- `IRBuilder`:构建指令和基本块的工具类(你们正在实现的部分)
### 2. **中端必要优化(最小集合)**
常量传播
| 优化阶段 | 关键作用 | 是否必须 |
|-------------------|----------------------------------|----------|
| `Mem2Reg` | 消除冗余内存访问转换为SSA形式 | ✅ 核心 |
| `DCE` (死代码消除) | 移除无用指令 | ✅ 必要 |
| `DFE` (死函数消除) | 移除未使用的函数 | ✅ 必要 |
| `FuncAnalysis` | 函数调用关系分析 | ✅ 基础 |
| `Global2Local` | 全局变量降级为局部变量 | ✅ 重要 |
### 3. **后端核心流程(必须实现)**
```mermaid
graph LR
A[IR指令选择] --> B[寄存器分配]
B --> C[指令调度]
C --> D[汇编生成]
```
1. **指令选择**(关键步骤):
- `DAGBuilder`将IR转换为有向无环图DAG
- `DAGCoverage`DAG到目标指令的映射
- `Mid2End`IR到机器指令的转换接口
2. **寄存器分配**
- `RegisterAlloc`:基础寄存器分配器(可先实现简单算法如线性扫描)
3. **汇编生成**
- `RiscvPrinter`将机器指令输出为RISC-V汇编
- 实现基础指令集:`add`/`sub`/`lw`/`sw`/`beq`/`jal`
### 4. **最小可工作流程**
```cpp
// 精简版编译流程(跳过复杂优化)
int main() {
// 1. 前端解析
auto module = sysy::SysYIRGenerator().genIR(input);
// 2. 关键中端优化
sysy::Mem2Reg(module).run(); // 必须
sysy::Global2Local(module).run(); // 必须
sysy::DCE(module).run(); // 推荐
// 3. 后端代码生成
auto backendModule = mid2end::CodeGenerater().run(module);
riscv::RiscvPrinter().print("output.s", backendModule);
}
```
### 5. **当前开发优先级建议**
1. **完成IR生成**
- 确保能构建基本块、函数、算术/内存/控制流指令
- 实现`createCall`/`createLoad`/`createStore`等核心方法
2. **实现Mem2Reg**
- 插入Phi节点
- 变量重命名(关键算法)
3. **构建基础后端**
- 指令选择实现IR到RISC-V的简单映射例如`IRAdd``add`
- 寄存器分配:使用无限寄存器方案(后期替换为真实分配)
- 汇编打印:支持基础指令输出
> **注意**:循环优化、函数内联、高级寄存器分配等可在基础流程打通后逐步添加。初期可跳过复杂优化。
### 6. 调试建议
- 添加IR打印模块`SysYPrinter`)验证前端输出
- 使用简化测试用例:
```c
int main() {
int a = 1;
int b = a + 2;
return b;
}
```
- 逐步扩展支持:
1. 算术运算 → 2. 条件分支 → 3. 函数调用 → 4. 数组访问
通过聚焦这些核心模块你们可以快速打通从SysY到RISC-V的基础编译流程后续再逐步添加优化传递提升代码质量。

169
doc/markdowns/IR.md Normal file
View File

@ -0,0 +1,169 @@
这个头文件定义了一个用于生成中间表示IR的数据结构主要用于编译器前端将抽象语法树AST转换为中间代码。以下是文件中定义的主要类和功能的整理和解释
---
### **1. 类型系统Type System**
#### **1.1 `Type` 类**
- **作用**:表示所有基本标量类型(如 `int``float``void` 等)以及指针类型和函数类型。
- **成员**
- `Kind` 枚举:表示类型的种类(如 `kInt``kFloat``kPointer` 等)。
- `kind`:当前类型的种类。
- 构造函数:`Type(Kind kind)`,用于初始化类型。
- 静态方法:如 `getIntType()``getFloatType()` 等,用于获取特定类型的单例对象。
- 类型检查方法:如 `isInt()``isFloat()` 等,用于检查当前类型是否为某种类型。
- `getSize()`:获取类型的大小。
- `as<T>()`:将当前类型动态转换为派生类(如 `PointerType``FunctionType`)。
#### **1.2 `PointerType` 类**
- **作用**:表示指针类型,派生自 `Type`
- **成员**
- `baseType`:指针指向的基础类型。
- 静态方法:`get(Type *baseType)`,用于获取指向 `baseType` 的指针类型。
- `getBaseType()`:获取指针指向的基础类型。
#### **1.3 `FunctionType` 类**
- **作用**:表示函数类型,派生自 `Type`
- **成员**
- `returnType`:函数的返回类型。
- `paramTypes`:函数的参数类型列表。
- 静态方法:`get(Type *returnType, const std::vector<Type *> &paramTypes)`,用于获取函数类型。
- `getReturnType()`:获取函数的返回类型。
- `getParamTypes()`:获取函数的参数类型列表。
---
### **2. 中间表示IR**
#### **2.1 `Value` 类**
- **作用**:表示 IR 中的所有值(如指令、常量、参数等)。
- **成员**
- `Kind` 枚举:表示值的种类(如 `kAdd``kSub``kConstant` 等)。
- `kind`:当前值的种类。
- `type`:值的类型。
- `name`:值的名称。
- `uses`:值的用途列表(表示哪些指令使用了该值)。
- 构造函数:`Value(Kind kind, Type *type, const std::string &name)`
- 类型检查方法:如 `isInt()``isFloat()` 等。
- `getUses()`:获取值的用途列表。
- `replaceAllUsesWith(Value *value)`:将该值的所有用途替换为另一个值。
- `print(std::ostream &os)`:打印值的表示。
#### **2.2 `ConstantValue` 类**
- **作用**:表示编译时常量(如整数常量、浮点数常量)。
- **成员**
- `iScalar``fScalar`:分别存储整数和浮点数常量的值。
- 静态方法:`get(int value)``get(float value)`,用于获取常量值。
- `getInt()``getFloat()`:获取常量的值。
#### **2.3 `Argument` 类**
- **作用**:表示函数或基本块的参数。
- **成员**
- `block`:参数所属的基本块。
- `index`:参数的索引。
- 构造函数:`Argument(Type *type, BasicBlock *block, int index, const std::string &name)`
- `getParent()`:获取参数所属的基本块。
- `getIndex()`:获取参数的索引。
#### **2.4 `BasicBlock` 类**
- **作用**:表示基本块,包含一系列指令。
- **成员**
- `parent`:基本块所属的函数。
- `instructions`:基本块中的指令列表。
- `arguments`:基本块的参数列表。
- `successors``predecessors`:基本块的后继和前驱列表。
- 构造函数:`BasicBlock(Function *parent, const std::string &name)`
- `getParent()`:获取基本块所属的函数。
- `getInstructions()`:获取基本块中的指令列表。
- `createArgument()`:为基本块创建一个参数。
#### **2.5 `Instruction` 类**
- **作用**:表示 IR 中的指令,派生自 `User`
- **成员**
- `Kind` 枚举:表示指令的种类(如 `kAdd``kSub``kLoad` 等)。
- `kind`:当前指令的种类。
- `parent`:指令所属的基本块。
- 构造函数:`Instruction(Kind kind, Type *type, BasicBlock *parent, const std::string &name)`
- `getParent()`:获取指令所属的基本块。
- `getFunction()`:获取指令所属的函数。
- 指令分类方法:如 `isBinary()``isUnary()``isMemory()` 等。
#### **2.6 `User` 类**
- **作用**:表示使用其他值的指令或全局值,派生自 `Value`
- **成员**
- `operands`:指令的操作数列表。
- 构造函数:`User(Kind kind, Type *type, const std::string &name)`
- `getOperand(int index)`:获取指定索引的操作数。
- `addOperand(Value *value)`:添加一个操作数。
- `replaceOperand(int index, Value *value)`:替换指定索引的操作数。
#### **2.7 具体指令类**
- **`CallInst`**:表示函数调用指令。
- **`UnaryInst`**:表示一元操作指令(如取反、类型转换)。
- **`BinaryInst`**:表示二元操作指令(如加法、减法)。
- **`ReturnInst`**:表示返回指令。
- **`UncondBrInst`**:表示无条件跳转指令。
- **`CondBrInst`**:表示条件跳转指令。
- **`AllocaInst`**:表示栈内存分配指令。
- **`LoadInst`**:表示从内存加载值的指令。
- **`StoreInst`**:表示将值存储到内存的指令。
---
### **3. 模块和函数**
#### **3.1 `Function` 类**
- **作用**:表示函数,包含多个基本块。
- **成员**
- `parent`:函数所属的模块。
- `blocks`:函数中的基本块列表。
- 构造函数:`Function(Module *parent, Type *type, const std::string &name)`
- `getReturnType()`:获取函数的返回类型。
- `getParamTypes()`:获取函数的参数类型列表。
- `addBasicBlock()`:为函数添加一个基本块。
#### **3.2 `GlobalValue` 类**
- **作用**:表示全局变量或常量。
- **成员**
- `parent`:全局值所属的模块。
- `hasInit`:是否有初始化值。
- `isConst`:是否是常量。
- 构造函数:`GlobalValue(Module *parent, Type *type, const std::string &name, const std::vector<Value *> &dims, Value *init)`
- `init()`:获取全局值的初始化值。
#### **3.3 `Module` 类**
- **作用**:表示整个编译单元(如一个源文件)。
- **成员**
- `children`:模块中的所有值(如函数、全局变量)。
- `functions`:模块中的函数列表。
- `globals`:模块中的全局变量列表。
- `createFunction()`:创建一个函数。
- `createGlobalValue()`:创建一个全局变量。
- `getFunction()`:获取指定名称的函数。
- `getGlobalValue()`:获取指定名称的全局变量。
---
### **4. 工具类**
#### **4.1 `Use` 类**
- **作用**:表示值与其使用者之间的关系。
- **成员**
- `index`:值在使用者操作数列表中的索引。
- `user`:使用者。
- `value`:被使用的值。
- 构造函数:`Use(int index, User *user, Value *value)`
- `getValue()`:获取被使用的值。
#### **4.2 `range` 类**
- **作用**:封装迭代器对 `[begin, end)`,用于遍历容器。
- **成员**
- `begin()``end()`:返回范围的起始和结束迭代器。
- `size()`:返回范围的大小。
- `empty()`:判断范围是否为空。
---
### **5. 总结**
- **类型系统**`Type``PointerType``FunctionType` 用于表示 IR 中的类型。
- **中间表示**`Value``ConstantValue``Instruction` 等用于表示 IR 中的值和指令。
- **模块和函数**`Module``Function``GlobalValue` 用于组织 IR 的结构。
- **工具类**`Use``range` 用于辅助实现 IR 的数据结构和遍历。
这个头文件定义了一个完整的 IR 数据结构,适用于编译器前端将 AST 转换为中间代码,并支持后续的优化和目标代码生成。

156
doc/markdowns/IRbuilder.md Normal file
View File

@ -0,0 +1,156 @@
`IRBuilder.h` 文件定义了一个 `IRBuilder`用于简化中间表示IR的构建过程。`IRBuilder` 提供了创建各种 IR 指令的便捷方法,并将这些指令插入到指定的基本块中。以下是对文件中主要内容的整理和解释:
---
### **1. `IRBuilder` 类的作用**
`IRBuilder` 是一个工具类用于在生成中间表示IR时简化指令的创建和插入操作。它的主要功能包括
- 提供创建各种 IR 指令的工厂方法。
- 将创建的指令插入到指定的基本块中。
- 支持在基本块的任意位置插入指令。
---
### **2. 主要成员**
#### **2.1 成员变量**
- **`block`**:当前操作的基本块。
- **`position`**:当前操作的插入位置(基本块中的迭代器)。
#### **2.2 构造函数**
- **默认构造函数**`IRBuilder()`
- **带参数的构造函数**
- `IRBuilder(BasicBlock *block)`:初始化 `IRBuilder`,并设置当前基本块和插入位置(默认在基本块末尾)。
- `IRBuilder(BasicBlock *block, BasicBlock::iterator position)`:初始化 `IRBuilder`,并设置当前基本块和插入位置。
#### **2.3 设置方法**
- **`setPosition(BasicBlock *block, BasicBlock::iterator position)`**:设置当前基本块和插入位置。
- **`setPosition(BasicBlock::iterator position)`**:设置当前插入位置。
#### **2.4 获取方法**
- **`getBasicBlock()`**:获取当前基本块。
- **`getPosition()`**:获取当前插入位置。
---
### **3. 指令创建方法**
`IRBuilder` 提供了多种工厂方法,用于创建不同类型的 IR 指令。这些方法会将创建的指令插入到当前基本块的指定位置。
#### **3.1 函数调用指令**
- **`createCallInst(Function *callee, const std::vector<Value *> &args, const std::string &name)`**
- 创建一个函数调用指令。
- 参数:
- `callee`:被调用的函数。
- `args`:函数参数列表。
- `name`:指令的名称(可选)。
- 返回:`CallInst*`
#### **3.2 一元操作指令**
- **`createUnaryInst(Instruction::Kind kind, Type *type, Value *operand, const std::string &name)`**
- 创建一个一元操作指令(如取反、类型转换)。
- 参数:
- `kind`:指令的类型(如 `kNeg``kFtoI` 等)。
- `type`:指令的结果类型。
- `operand`:操作数。
- `name`:指令的名称(可选)。
- 返回:`UnaryInst*`
- **具体一元操作指令**
- `createNegInst(Value *operand, const std::string &name)`:创建整数取反指令。
- `createNotInst(Value *operand, const std::string &name)`:创建逻辑取反指令。
- `createFtoIInst(Value *operand, const std::string &name)`:创建浮点数转整数指令。
- `createFNegInst(Value *operand, const std::string &name)`:创建浮点数取反指令。
- `createIToFInst(Value *operand, const std::string &name)`:创建整数转浮点数指令。
#### **3.3 二元操作指令**
- **`createBinaryInst(Instruction::Kind kind, Type *type, Value *lhs, Value *rhs, const std::string &name)`**
- 创建一个二元操作指令(如加法、减法)。
- 参数:
- `kind`:指令的类型(如 `kAdd``kSub` 等)。
- `type`:指令的结果类型。
- `lhs``rhs`:左操作数和右操作数。
- `name`:指令的名称(可选)。
- 返回:`BinaryInst*`
- **具体二元操作指令**
- 整数运算:
- `createAddInst(Value *lhs, Value *rhs, const std::string &name)`:创建整数加法指令。
- `createSubInst(Value *lhs, Value *rhs, const std::string &name)`:创建整数减法指令。
- `createMulInst(Value *lhs, Value *rhs, const std::string &name)`:创建整数乘法指令。
- `createDivInst(Value *lhs, Value *rhs, const std::string &name)`:创建整数除法指令。
- `createRemInst(Value *lhs, Value *rhs, const std::string &name)`:创建整数取余指令。
- 整数比较:
- `createICmpEQInst(Value *lhs, Value *rhs, const std::string &name)`:创建整数相等比较指令。
- `createICmpNEInst(Value *lhs, Value *rhs, const std::string &name)`:创建整数不等比较指令。
- `createICmpLTInst(Value *lhs, Value *rhs, const std::string &name)`:创建整数小于比较指令。
- `createICmpLEInst(Value *lhs, Value *rhs, const std::string &name)`:创建整数小于等于比较指令。
- `createICmpGTInst(Value *lhs, Value *rhs, const std::string &name)`:创建整数大于比较指令。
- `createICmpGEInst(Value *lhs, Value *rhs, const std::string &name)`:创建整数大于等于比较指令。
- 浮点数运算:
- `createFAddInst(Value *lhs, Value *rhs, const std::string &name)`:创建浮点数加法指令。
- `createFSubInst(Value *lhs, Value *rhs, const std::string &name)`:创建浮点数减法指令。
- `createFMulInst(Value *lhs, Value *rhs, const std::string &name)`:创建浮点数乘法指令。
- `createFDivInst(Value *lhs, Value *rhs, const std::string &name)`:创建浮点数除法指令。
- `createFRemInst(Value *lhs, Value *rhs, const std::string &name)`:创建浮点数取余指令。
- 浮点数比较:
- `createFCmpEQInst(Value *lhs, Value *rhs, const std::string &name)`:创建浮点数相等比较指令。
- `createFCmpNEInst(Value *lhs, Value *rhs, const std::string &name)`:创建浮点数不等比较指令。
- `createFCmpLTInst(Value *lhs, Value *rhs, const std::string &name)`:创建浮点数小于比较指令。
- `createFCmpLEInst(Value *lhs, Value *rhs, const std::string &name)`:创建浮点数小于等于比较指令。
- `createFCmpGTInst(Value *lhs, Value *rhs, const std::string &name)`:创建浮点数大于比较指令。
- `createFCmpGEInst(Value *lhs, Value *rhs, const std::string &name)`:创建浮点数大于等于比较指令。
#### **3.4 控制流指令**
- **`createReturnInst(Value *value)`**
- 创建返回指令。
- 参数:
- `value`:返回值(可选)。
- 返回:`ReturnInst*`
- **`createUncondBrInst(BasicBlock *block, std::vector<Value *> args)`**
- 创建无条件跳转指令。
- 参数:
- `block`:目标基本块。
- `args`:跳转参数(可选)。
- 返回:`UncondBrInst*`
- **`createCondBrInst(Value *condition, BasicBlock *thenBlock, BasicBlock *elseBlock, const std::vector<Value *> &thenArgs, const std::vector<Value *> &elseArgs)`**
- 创建条件跳转指令。
- 参数:
- `condition`:跳转条件。
- `thenBlock`:条件为真时的目标基本块。
- `elseBlock`:条件为假时的目标基本块。
- `thenArgs``elseArgs`:跳转参数(可选)。
- 返回:`CondBrInst*`
#### **3.5 内存操作指令**
- **`createAllocaInst(Type *type, const std::vector<Value *> &dims, const std::string &name)`**
- 创建栈内存分配指令。
- 参数:
- `type`:分配的类型。
- `dims`:数组维度(可选)。
- `name`:指令的名称(可选)。
- 返回:`AllocaInst*`
- **`createLoadInst(Value *pointer, const std::vector<Value *> &indices, const std::string &name)`**
- 创建加载指令。
- 参数:
- `pointer`:指针值。
- `indices`:数组索引(可选)。
- `name`:指令的名称(可选)。
- 返回:`LoadInst*`
- **`createStoreInst(Value *value, Value *pointer, const std::vector<Value *> &indices, const std::string &name)`**
- 创建存储指令。
- 参数:
- `value`:要存储的值。
- `pointer`:指针值。
- `indices`:数组索引(可选)。
- `name`:指令的名称(可选)。
- 返回:`StoreInst*`
---
### **4. 总结**
- `IRBuilder` 是一个用于简化 IR 构建的工具类,提供了创建各种 IR 指令的工厂方法。
- 通过 `IRBuilder`,可以方便地在指定基本块的任意位置插入指令。
- 该类的设计使得 IR 的生成更加模块化和易于维护。

121
doc/markdowns/IRcpp.md Normal file
View File

@ -0,0 +1,121 @@
这个 `IR.cpp` 文件实现了 `IR.h` 中定义的中间表示IR数据结构的功能。它包含了类型系统、值、指令、基本块、函数和模块的具体实现以及一些辅助函数用于打印 IR 的内容。以下是对文件中主要内容的整理和解释:
---
### **1. 辅助函数**
#### **1.1 `interleave` 函数**
- **作用**:用于在输出流中插入分隔符(如逗号)来打印容器中的元素。
- **示例**
```cpp
interleave(os, container, ", ");
```
#### **1.2 打印函数**
- **`printVarName`**:打印变量名,全局变量以 `@` 开头,局部变量以 `%` 开头。
- **`printBlockName`**:打印基本块名,以 `^` 开头。
- **`printFunctionName`**:打印函数名,以 `@` 开头。
- **`printOperand`**:打印操作数,如果是常量则直接打印值,否则打印变量名。
---
### **2. 类型系统**
#### **2.1 `Type` 类的实现**
- **静态方法**
- `getIntType()`、`getFloatType()`、`getVoidType()`、`getLabelType()`:返回对应类型的单例对象。
- `getPointerType(Type *baseType)`:返回指向 `baseType` 的指针类型。
- `getFunctionType(Type *returnType, const vector<Type *> &paramTypes)`:返回函数类型。
- **`getSize()`**:返回类型的大小(如 `int` 和 `float` 为 4 字节,指针为 8 字节)。
- **`print()`**:打印类型的表示。
#### **2.2 `PointerType` 类的实现**
- **静态方法**
- `get(Type *baseType)`:返回指向 `baseType` 的指针类型,使用 `std::map` 缓存已创建的指针类型。
- **`getBaseType()`**:返回指针指向的基础类型。
#### **2.3 `FunctionType` 类的实现**
- **静态方法**
- `get(Type *returnType, const vector<Type *> &paramTypes)`:返回函数类型,使用 `std::set` 缓存已创建的函数类型。
- **`getReturnType()`** 和 `getParamTypes()`:分别返回函数的返回类型和参数类型列表。
---
### **3. 值Value**
#### **3.1 `Value` 类的实现**
- **`replaceAllUsesWith(Value *value)`**:将该值的所有用途替换为另一个值。
- **`isConstant()`**:判断值是否为常量(包括常量值、全局值和函数)。
#### **3.2 `ConstantValue` 类的实现**
- **静态方法**
- `get(int value)``get(float value)`:返回整数或浮点数常量,使用 `std::map` 缓存已创建的常量。
- **`getInt()``getFloat()`**:返回常量的值。
- **`print()`**:打印常量的值。
#### **3.3 `Argument` 类的实现**
- **构造函数**:初始化参数的类型、所属基本块和索引。
- **`print()`**:打印参数的表示。
---
### **4. 基本块BasicBlock**
#### **4.1 `BasicBlock` 类的实现**
- **构造函数**:初始化基本块的名称和所属函数。
- **`print()`**:打印基本块的表示,包括参数和指令。
---
### **5. 指令Instruction**
#### **5.1 `Instruction` 类的实现**
- **构造函数**:初始化指令的类型、所属基本块和名称。
- **`print()`**:由具体指令类实现。
#### **5.2 具体指令类的实现**
- **`CallInst`**:表示函数调用指令。
- **`print()`**:打印函数调用的表示。
- **`UnaryInst`**:表示一元操作指令(如取反、类型转换)。
- **`print()`**:打印一元操作的表示。
- **`BinaryInst`**:表示二元操作指令(如加法、减法)。
- **`print()`**:打印二元操作的表示。
- **`ReturnInst`**:表示返回指令。
- **`print()`**:打印返回指令的表示。
- **`UncondBrInst`**:表示无条件跳转指令。
- **`print()`**:打印无条件跳转的表示。
- **`CondBrInst`**:表示条件跳转指令。
- **`print()`**:打印条件跳转的表示。
- **`AllocaInst`**:表示栈内存分配指令。
- **`print()`**:打印内存分配的表示。
- **`LoadInst`**:表示从内存加载值的指令。
- **`print()`**:打印加载指令的表示。
- **`StoreInst`**:表示将值存储到内存的指令。
- **`print()`**:打印存储指令的表示。
---
### **6. 函数Function**
#### **6.1 `Function` 类的实现**
- **构造函数**:初始化函数的名称、返回类型和参数类型。
- **`print()`**:打印函数的表示,包括基本块和指令。
---
### **7. 模块Module**
#### **7.1 `Module` 类的实现**
- **`print()`**:打印模块的表示,包括所有函数和全局变量。
---
### **8. 用户User**
#### **8.1 `User` 类的实现**
- **`setOperand(int index, Value *value)`**:设置指定索引的操作数。
- **`replaceOperand(int index, Value *value)`**:替换指定索引的操作数,并更新用途列表。
---
### **9. 总结**
- **类型系统**:实现了 `Type``PointerType``FunctionType`,用于表示 IR 中的类型。
- **值**:实现了 `Value``ConstantValue``Argument`,用于表示 IR 中的值和参数。
- **基本块**:实现了 `BasicBlock`,用于组织指令。
- **指令**:实现了多种具体指令类(如 `CallInst``BinaryInst` 等),用于表示 IR 中的操作。
- **函数和模块**:实现了 `Function``Module`,用于组织 IR 的结构。
- **打印功能**:通过 `print()` 方法,可以将 IR 的内容输出为可读的文本格式。
这个文件是编译器中间表示的核心实现能够将抽象语法树AST转换为中间代码并支持后续的优化和目标代码生成。

62
olddef.h Normal file
View File

@ -0,0 +1,62 @@
class SymbolTable{
private:
enum Kind
{
kModule,
kFunction,
kBlock,
};
std::forward_list<std::pair<Kind, std::unordered_map<std::string, Value*>>> Scopes;
public:
struct ModuleScope {
SymbolTable& tables_ref;
ModuleScope(SymbolTable& tables) : tables_ref(tables) {
tables.enter(kModule);
}
~ModuleScope() { tables_ref.exit(); }
};
struct FunctionScope {
SymbolTable& tables_ref;
FunctionScope(SymbolTable& tables) : tables_ref(tables) {
tables.enter(kFunction);
}
~FunctionScope() { tables_ref.exit(); }
};
struct BlockScope {
SymbolTable& tables_ref;
BlockScope(SymbolTable& tables) : tables_ref(tables) {
tables.enter(kBlock);
}
~BlockScope() { tables_ref.exit(); }
};
SymbolTable() = default;
bool isModuleScope() const { return Scopes.front().first == kModule; }
bool isFunctionScope() const { return Scopes.front().first == kFunction; }
bool isBlockScope() const { return Scopes.front().first == kBlock; }
Value *lookup(const std::string &name) const {
for (auto &scope : Scopes) {
auto iter = scope.second.find(name);
if (iter != scope.second.end())
return iter->second;
}
return nullptr;
}
auto insert(const std::string &name, Value *value) {
assert(not Scopes.empty());
return Scopes.front().second.emplace(name, value);
}
private:
void enter(Kind kind) {
Scopes.emplace_front();
Scopes.front().first = kind;
}
void exit() {
Scopes.pop_front();
}
};

View File

@ -1,358 +0,0 @@
#include <algorithm>
#include <iostream>
using namespace std;
#include "ASTPrinter.h"
#include "SysYParser.h"
any ASTPrinter::visitCompUnit(SysYParser::CompUnitContext *ctx) {
if(ctx->decl().empty() && ctx->funcDef().empty())
return nullptr;
for (auto dcl : ctx->decl()) {dcl->accept(this);cout << '\n';}cout << '\n';
for (auto func : ctx->funcDef()) {func->accept(this);cout << "\n";}
return nullptr;
}
// std::any ASTPrinter::visitBType(SysYParser::BTypeContext *ctx);
// std::any ASTPrinter::visitDecl(SysYParser::DeclContext *ctx);
std::any ASTPrinter::visitConstDecl(SysYParser::ConstDeclContext *ctx) {
cout << getIndent() << ctx->CONST()->getText() << ' ' << ctx->bType()->getText() << ' ';
auto numConstDefs = ctx->constDef().size();
ctx->constDef(0)->accept(this);
for (int i = 1; i < numConstDefs; ++i) {
cout << ctx->COMMA(i - 1)->getText() << ' ';
ctx->constDef(i)->accept(this);
}
cout << ctx->SEMICOLON()->getText() << '\n';
return nullptr;
}
std::any ASTPrinter::visitConstDef(SysYParser::ConstDefContext *ctx) {
cout << ctx->Ident()->getText();
auto numConstExps = ctx->constExp().size();
for (int i = 0; i < numConstExps; ++i) {
cout << ctx->LBRACK(i)->getText();
ctx->constExp(i)->accept(this);
cout << ctx->RBRACK(i)->getText();
}
cout << ' ' << ctx->ASSIGN()->getText() << ' ';
ctx->constInitVal()->accept(this);
return nullptr;
}
// std::any ASTPrinter::visitConstInitVal(SysYParser::ConstInitValContext *ctx);
std::any ASTPrinter::visitVarDecl(SysYParser::VarDeclContext *ctx){
cout << getIndent() << ctx->bType()->getText() << ' ';
auto numVarDefs = ctx->varDef().size();
ctx->varDef(0)->accept(this);
for (int i = 1; i < numVarDefs; ++i) {
cout << ", ";
ctx->varDef(i)->accept(this);
}
cout << ctx->SEMICOLON()->getText() << '\n';
return nullptr;
}
std::any ASTPrinter::visitVarDef(SysYParser::VarDefContext *ctx){
cout << ctx->Ident()->getText();
auto numConstExps = ctx->constExp().size();
for (int i = 0; i < numConstExps; ++i) {
cout << ctx->LBRACK(i)->getText();
ctx->constExp(i)->accept(this);
cout << ctx->RBRACK(i)->getText();
}
if (ctx->initVal()) {
cout << ' ' << ctx->ASSIGN()->getText() << ' ';
ctx->initVal()->accept(this);
}
return nullptr;
}
std::any ASTPrinter::visitInitVal(SysYParser::InitValContext *ctx){
if (ctx->exp()) {
ctx->exp()->accept(this);
} else {
cout << ctx->LBRACE()->getText();
auto numInitVals = ctx->initVal().size();
ctx->initVal(0)->accept(this);
for (int i = 1; i < numInitVals; ++i) {
cout << ctx->COMMA(i - 1)->getText() << ' ';
ctx->initVal(i)->accept(this);
}
cout << ctx->RBRACE()->getText();
}
return nullptr;
}
std::any ASTPrinter::visitFuncDef(SysYParser::FuncDefContext *ctx){
cout << getIndent() << ctx->funcType()->getText() << ' ' << ctx->Ident()->getText();
cout << ctx->LPAREN()->getText();
if (ctx->funcFParams()) ctx->funcFParams()->accept(this);
cout << ctx->RPAREN()->getText();
ctx->blockStmt()->accept(this);
return nullptr;
}
// std::any ASTPrinter::visitFuncType(SysYParser::FuncTypeContext *ctx);
std::any ASTPrinter::visitFuncFParams(SysYParser::FuncFParamsContext *ctx){
auto numFuncFParams = ctx->funcFParam().size();
ctx->funcFParam(0)->accept(this);
for (int i = 1; i < numFuncFParams; ++i) {
cout << ctx->COMMA(i - 1)->getText() << ' ';
ctx->funcFParam(i)->accept(this);
}
return nullptr;
}
std::any ASTPrinter::visitFuncFParam(SysYParser::FuncFParamContext *ctx){
cout << ctx->bType()->getText() << ' ' << ctx->Ident()->getText();
if (!ctx->exp().empty()) {
cout << "[]";
for (auto exp : ctx->exp()) {
cout << '[';
exp->accept(this);
cout << ']';
}
}
return nullptr;
}
std::any ASTPrinter::visitBlockStmt(SysYParser::BlockStmtContext *ctx){
cout << ' ' << ctx->LBRACE()->getText() << endl;
indentLevel++;
for (auto item : ctx->blockItem()) item->accept(this);
indentLevel--;
cout << getIndent() << ctx->RBRACE()->getText() << endl;
return nullptr;
}
// std::any ASTPrinter::visitBlockItem(SysYParser::BlockItemContext *ctx);
std::any ASTPrinter::visitStmt(SysYParser::StmtContext *ctx){
if(ctx->lValue()&& ctx->exp()) {
cout << getIndent();
ctx->lValue()->accept(this);
cout << ' ' << ctx->ASSIGN()->getText() <<' ';
ctx->exp()->accept(this);
cout << ctx->SEMICOLON()->getText() << endl;
}
else if(ctx->blockStmt()){
ctx->blockStmt()->accept(this);
}
else if(ctx->IF()){
cout << getIndent() << "if (";
ctx->cond()->accept(this);
cout << ")";
// visit ctx->stmt(0) to judge if it is a blockStmt or a Lvale=exp
if(ctx->stmt(0)->blockStmt())ctx->stmt(0)->accept(this);
else{
cout << " {"<< endl;
indentLevel++;
ctx->stmt(0)->accept(this);
indentLevel--;
cout << getIndent() << "}" << endl;
}
if (ctx->stmt().size() > 1) {
cout << getIndent() << "else";
if(ctx->stmt(1)->blockStmt())ctx->stmt(1)->accept(this);
else{
cout << " {"<< endl;
indentLevel++;
ctx->stmt(1)->accept(this);
indentLevel--;
cout << getIndent() << "}" << endl;
}
}
}
else if(ctx->WHILE()){
cout << getIndent() << "while (";
ctx->cond()->accept(this);
cout << ")";
if(ctx->stmt(0)->blockStmt())ctx->stmt(0)->accept(this);
else{
cout << " {"<< endl;
indentLevel++;
ctx->stmt(0)->accept(this);
indentLevel--;
cout << getIndent() << "}" << endl;
}
}
else if(ctx->BREAK()){
cout << getIndent() << "break;" << endl;
}
else if(ctx->CONTINUE()){
cout << getIndent() << "continue;" << endl;
}
else if(ctx->RETURN()){
cout << getIndent() << "return";
if(ctx->exp()){
cout << ' ';
ctx->exp()->accept(this);
}
cout << ctx->SEMICOLON()->getText() << endl;
}
return nullptr;
}
// std::any ASTPrinter::visitExp(SysYParser::ExpContext *ctx);
// std::any ASTPrinter::visitCond(SysYParser::CondContext *ctx);
std::any ASTPrinter::visitLValue(SysYParser::LValueContext *ctx){
cout << ctx->Ident()->getText();
for (auto exp : ctx->exp()) {
cout << "[";
exp->accept(this);
cout << "]";
}
return nullptr;
}
// std::any ASTPrinter::visitPrimaryExp(SysYParser::PrimaryExpContext *ctx);
std::any ASTPrinter::visitNumber(SysYParser::NumberContext *ctx) {
if(ctx->ILITERAL())cout << ctx->ILITERAL()->getText();
if(ctx->FLITERAL())cout << ctx->FLITERAL()->getText();
return nullptr;
}
std::any ASTPrinter::visitString(SysYParser::StringContext *ctx) {
cout << ctx->STRING()->getText();
return nullptr;
}
std::any ASTPrinter::visitUnaryExp(SysYParser::UnaryExpContext *ctx){
if(ctx->primaryExp())
ctx->primaryExp()->accept(this);
else if(ctx->Ident()){
cout << ctx->Ident()->getText() << ctx->LPAREN()->getText();
if(ctx->funcRParams())
ctx->funcRParams()->accept(this);
if (ctx->RPAREN())
cout << ctx->RPAREN()->getText();
else
cout << "<missing \')\'?>";
}
else if(ctx->unaryExp()){
cout << ctx->unaryOp()->getText();
ctx->unaryExp()->accept(this);
}
else
ctx->accept(this);
return nullptr;
}
// std::any ASTPrinter::visitUnaryOp(SysYParser::UnaryOpContext *ctx);
any ASTPrinter::visitFuncRParams(SysYParser::FuncRParamsContext *ctx) {
if (ctx->exp().empty())
return nullptr;
auto numParams = ctx->exp().size();
ctx->exp(0)->accept(this);
for (int i = 1; i < numParams; ++i) {
cout << ctx->COMMA(i - 1)->getText() << ' ';
ctx->exp(i)->accept(this);
}
return nullptr;
}
std::any ASTPrinter::visitMulExp(SysYParser::MulExpContext *ctx){
auto unaryExps = ctx->unaryExp();
if (unaryExps.size() == 1) {
unaryExps[0]->accept(this);
} else {
for (size_t i = 0; i < unaryExps.size() - 1; ++i) {
auto opNode = dynamic_cast<antlr4::tree::TerminalNode *>(ctx->children[2 * i + 1]);
if (opNode) {
unaryExps[i]->accept(this);
cout << " " << opNode->getText() << " ";
}
}
unaryExps.back()->accept(this);
}
return nullptr;
}
std::any ASTPrinter::visitAddExp(SysYParser::AddExpContext *ctx){
auto mulExps = ctx->mulExp();
if (mulExps.size() == 1) {
mulExps[0]->accept(this);
} else {
for (size_t i = 0; i < mulExps.size() - 1; ++i) {
auto opNode = dynamic_cast<antlr4::tree::TerminalNode *>(ctx->children[2 * i + 1]);
if (opNode) {
mulExps[i]->accept(this);
cout << " " << opNode->getText() << " ";
}
}
mulExps.back()->accept(this);
}
return nullptr;
}
// 以下表达式待补全形式同addexp mulexp
std::any ASTPrinter::visitRelExp(SysYParser::RelExpContext *ctx){
auto relExps = ctx->addExp();
if (relExps.size() == 1) {
relExps[0]->accept(this);
} else {
for (size_t i = 0; i < relExps.size() - 1; ++i) {
auto opNode = dynamic_cast<antlr4::tree::TerminalNode *>(ctx->children[2 * i + 1]);
if (opNode) {
relExps[i]->accept(this);
cout << " " << opNode->getText() << " ";
}
}
relExps.back()->accept(this);
}
return nullptr;
}
std::any ASTPrinter::visitEqExp(SysYParser::EqExpContext *ctx){
auto eqExps = ctx->relExp();
if (eqExps.size() == 1) {
eqExps[0]->accept(this);
} else {
for (size_t i = 0; i < eqExps.size() - 1; ++i) {
auto opNode = dynamic_cast<antlr4::tree::TerminalNode *>(ctx->children[2 * i + 1]);
if (opNode) {
eqExps[i]->accept(this);
cout << " " << opNode->getText() << " ";
}
}
eqExps.back()->accept(this);
}
return nullptr;
}
std::any ASTPrinter::visitLAndExp(SysYParser::LAndExpContext *ctx){
auto lAndExps = ctx->eqExp();
if (lAndExps.size() == 1) {
lAndExps[0]->accept(this);
} else {
for (size_t i = 0; i < lAndExps.size() - 1; ++i) {
auto opNode = dynamic_cast<antlr4::tree::TerminalNode *>(ctx->children[2 * i + 1]);
if (opNode) {
lAndExps[i]->accept(this);
cout << " " << opNode->getText() << " ";
}
}
lAndExps.back()->accept(this);
}
return nullptr;
}
std::any ASTPrinter::visitLOrExp(SysYParser::LOrExpContext *ctx){
auto lOrExps = ctx->lAndExp();
if (lOrExps.size() == 1) {
lOrExps[0]->accept(this);
} else {
for (size_t i = 0; i < lOrExps.size() - 1; ++i) {
auto opNode = dynamic_cast<antlr4::tree::TerminalNode *>(ctx->children[2 * i + 1]);
if (opNode) {
lOrExps[i]->accept(this);
cout << " " << opNode->getText() << " ";
}
}
lOrExps.back()->accept(this);
}
return nullptr;
}
std::any ASTPrinter::visitConstExp(SysYParser::ConstExpContext *ctx){
ctx->addExp()->accept(this);
return nullptr;
}

View File

@ -1,46 +0,0 @@
#pragma once
#include "SysYBaseVisitor.h"
#include "SysYParser.h"
class ASTPrinter : public SysYBaseVisitor {
private:
int indentLevel = 0;
std::string getIndent() {
return std::string(indentLevel * 4, ' ');
}
public:
std::any visitCompUnit(SysYParser::CompUnitContext *ctx) override;
// std::any visitBType(SysYParser::BTypeContext *ctx) override;
// std::any visitDecl(SysYParser::DeclContext *ctx) override;
std::any visitConstDecl(SysYParser::ConstDeclContext *ctx) override;
std::any visitConstDef(SysYParser::ConstDefContext *ctx) override;
// std::any visitConstInitVal(SysYParser::ConstInitValContext *ctx) override;
std::any visitVarDecl(SysYParser::VarDeclContext *ctx) override;
std::any visitVarDef(SysYParser::VarDefContext *ctx) override;
std::any visitInitVal(SysYParser::InitValContext *ctx) override;
std::any visitFuncDef(SysYParser::FuncDefContext *ctx) override;
// std::any visitFuncType(SysYParser::FuncTypeContext *ctx) override;
std::any visitFuncFParams(SysYParser::FuncFParamsContext *ctx) override;
std::any visitFuncFParam(SysYParser::FuncFParamContext *ctx) override;
std::any visitBlockStmt(SysYParser::BlockStmtContext *ctx) override;
// std::any visitBlockItem(SysYParser::BlockItemContext *ctx) override;
std::any visitStmt(SysYParser::StmtContext *ctx) override;
// std::any visitExp(SysYParser::ExpContext *ctx) override;
// std::any visitCond(SysYParser::CondContext *ctx) override;
std::any visitLValue(SysYParser::LValueContext *ctx) override;
// std::any visitPrimaryExp(SysYParser::PrimaryExpContext *ctx) override;
std::any visitNumber(SysYParser::NumberContext *ctx) override;
std::any visitString(SysYParser::StringContext *ctx) override;
std::any visitUnaryExp(SysYParser::UnaryExpContext *ctx) override;
// std::any visitUnaryOp(SysYParser::UnaryOpContext *ctx) override;
std::any visitFuncRParams(SysYParser::FuncRParamsContext *ctx) override;
std::any visitMulExp(SysYParser::MulExpContext *ctx) override;
std::any visitAddExp(SysYParser::AddExpContext *ctx) override;
std::any visitRelExp(SysYParser::RelExpContext *ctx) override;
std::any visitEqExp(SysYParser::EqExpContext *ctx) override;
std::any visitLAndExp(SysYParser::LAndExpContext *ctx) override;
std::any visitLOrExp(SysYParser::LOrExpContext *ctx) override;
std::any visitConstExp(SysYParser::ConstExpContext *ctx) override;
};

View File

@ -13,12 +13,14 @@ target_link_libraries(SysYParser PUBLIC antlr4_shared)
add_executable(sysyc
sysyc.cpp
ASTPrinter.cpp
IR.cpp
SysYIRGenerator.cpp
Backend.cpp
# Backend.cpp
SysYIRPrinter.cpp
RISCv32Backend.cpp
)
target_include_directories(sysyc PRIVATE ${CMAKE_CURRENT_BINARY_DIR})
target_include_directories(sysyc PRIVATE ${CMAKE_CURRENT_BINARY_DIR} ${CMAKE_CURRENT_SOURCE_DIR}/include)
target_compile_options(sysyc PRIVATE -frtti)
target_link_libraries(sysyc PRIVATE SysYParser)
# set(THREADS_PREFER_PTHREAD_FLAG ON)

1038
src/IR.cpp

File diff suppressed because it is too large Load Diff

994
src/IR.h
View File

@ -1,994 +0,0 @@
#pragma once
#include "range.h"
#include <cassert>
#include <cstdint>
#include <iterator>
#include <list>
#include <map>
#include <memory>
#include <ostream>
#include <string>
#include <type_traits>
#include <vector>
namespace sysy {
/*!
* \defgroup type Types
* The SysY type system is quite simple.
* 1. The base class `Type` is used to represent all primitive scalar types,
* include `int`, `float`, `void`, and the label type representing branch
* targets.
* 2. `PointerType` and `FunctionType` derive from `Type` and represent pointer
* type and function type, respectively.
*
* NOTE `Type` and its derived classes have their ctors declared as 'protected'.
* Users must use Type::getXXXType() methods to obtain `Type` pointers.
* @{
*/
/*!
* `Type` is used to represent all primitive scalar types,
* include `int`, `float`, `void`, and the label type representing branch
* targets
*/
class Type {
public:
enum Kind {
kInt,
kFloat,
kVoid,
kLabel,
kPointer,
kFunction,
};
Kind kind;
protected:
Type(Kind kind) : kind(kind) {}
virtual ~Type() = default;
public:
static Type *getIntType();
static Type *getFloatType();
static Type *getVoidType();
static Type *getLabelType();
static Type *getPointerType(Type *baseType);
static Type *getFunctionType(Type *returnType,
const std::vector<Type *> &paramTypes = {});
public:
Kind getKind() const { return kind; }
bool isInt() const { return kind == kInt; }
bool isFloat() const { return kind == kFloat; }
bool isVoid() const { return kind == kVoid; }
bool isLabel() const { return kind == kLabel; }
bool isPointer() const { return kind == kPointer; }
bool isFunction() const { return kind == kFunction; }
bool isIntOrFloat() const { return kind == kInt or kind == kFloat; }
int getSize() const;
template <typename T>
std::enable_if_t<std::is_base_of_v<Type, T>, T *> as() const {
return dynamic_cast<T *>(const_cast<Type *>(this));
}
void print(std::ostream &os) const;
}; // class Type
//! Pointer type
class PointerType : public Type {
protected:
Type *baseType;
protected:
PointerType(Type *baseType) : Type(kPointer), baseType(baseType) {}
public:
static PointerType *get(Type *baseType);
public:
Type *getBaseType() const { return baseType; }
}; // class PointerType
//! Function type
class FunctionType : public Type {
private:
Type *returnType;
std::vector<Type *> paramTypes;
protected:
FunctionType(Type *returnType, const std::vector<Type *> &paramTypes = {})
: Type(kFunction), returnType(returnType), paramTypes(paramTypes) {}
public:
static FunctionType *get(Type *returnType,
const std::vector<Type *> &paramTypes = {});
public:
Type *getReturnType() const { return returnType; }
auto getParamTypes() const { return make_range(paramTypes); }
int getNumParams() const { return paramTypes.size(); }
}; // class FunctionType
/*!
* @}
*/
/*!
* \defgroup ir IR
*
* The SysY IR is an instruction level language. The IR is orgnized
* as a four-level tree structure, as shown below
*
* \dotfile ir-4level.dot IR Structure
*
* - `Module` corresponds to the top level "CompUnit" syntax structure
* - `GlobalValue` corresponds to the "Decl" syntax structure
* - `Function` corresponds to the "FuncDef" syntax structure
* - `BasicBlock` is a sequence of instructions without branching. A `Function`
* made up by one or more `BasicBlock`s.
* - `Instruction` represents a primitive operation on values, e.g., add or sub.
*
* The fundamental data concept in SysY IR is `Value`. A `Value` is like
* a register and is used by `Instruction`s as input/output operand. Each value
* has an associated `Type` indicating the data type held by the value.
*
* Most `Instruction`s have a three-address signature, i.e., there are at most 2
* input values and at most 1 output value.
*
* The SysY IR adots a Static-Single-Assignment (SSA) design. That is, `Value`
* is defined (as the output operand ) by some instruction, and used (as the
* input operand) by other instructions. While a value can be used by multiple
* instructions, the `definition` occurs only once. As a result, there is a
* one-to-one relation between a value and the instruction defining it. In other
* words, any instruction defines a value can be viewed as the defined value
* itself. So `Instruction` is also a `Value` in SysY IR. See `Value` for the
* type hierachy.
*
* @{
*/
class User;
class Value;
//! `Use` represents the relation between a `Value` and its `User`
class Use {
private:
//! the position of value in the user's operands, i.e.,
//! user->getOperands[index] == value
int index;
User *user;
Value *value;
public:
Use() = default;
Use(int index, User *user, Value *value)
: index(index), user(user), value(value) {}
public:
int getIndex() const { return index; }
User *getUser() const { return user; }
Value *getValue() const { return value; }
void setValue(Value *value) { value = value; }
}; // class Use
template <typename T>
inline std::enable_if_t<std::is_base_of_v<Value, T>, bool>
isa(const Value *value) {
return T::classof(value);
}
template <typename T>
inline std::enable_if_t<std::is_base_of_v<Value, T>, T *>
dyncast(Value *value) {
return isa<T>(value) ? static_cast<T *>(value) : nullptr;
}
template <typename T>
inline std::enable_if_t<std::is_base_of_v<Value, T>, const T *>
dyncast(const Value *value) {
return isa<T>(value) ? static_cast<const T *>(value) : nullptr;
}
//! The base class of all value types
class Value {
public:
enum Kind : uint64_t {
kInvalid,
// Instructions
// Binary
kAdd = 0x1UL << 0,
kSub = 0x1UL << 1,
kMul = 0x1UL << 2,
kDiv = 0x1UL << 3,
kRem = 0x1UL << 4,
kICmpEQ = 0x1UL << 5,
kICmpNE = 0x1UL << 6,
kICmpLT = 0x1UL << 7,
kICmpGT = 0x1UL << 8,
kICmpLE = 0x1UL << 9,
kICmpGE = 0x1UL << 10,
kFAdd = 0x1UL << 14,
kFSub = 0x1UL << 15,
kFMul = 0x1UL << 16,
kFDiv = 0x1UL << 17,
kFRem = 0x1UL << 18,
kFCmpEQ = 0x1UL << 19,
kFCmpNE = 0x1UL << 20,
kFCmpLT = 0x1UL << 21,
kFCmpGT = 0x1UL << 22,
kFCmpLE = 0x1UL << 23,
kFCmpGE = 0x1UL << 24,
// Unary
kNeg = 0x1UL << 25,
kNot = 0x1UL << 26,
kFNeg = 0x1UL << 27,
kFtoI = 0x1UL << 28,
kIToF = 0x1UL << 29,
// call
kCall = 0x1UL << 30,
// terminator
kCondBr = 0x1UL << 31,
kBr = 0x1UL << 32,
kReturn = 0x1UL << 33,
// mem op
kAlloca = 0x1UL << 34,
kLoad = 0x1UL << 35,
kStore = 0x1UL << 36,
kFirstInst = kAdd,
kLastInst = kStore,
// others
kArgument = 0x1UL << 37,
kBasicBlock = 0x1UL << 38,
kFunction = 0x1UL << 39,
kConstant = 0x1UL << 40,
kGlobal = 0x1UL << 41,
};
protected:
Kind kind;
Type *type;
std::string name;
std::list<Use *> uses;
protected:
Value(Kind kind, Type *type, const std::string &name = "")
: kind(kind), type(type), name(name), uses() {}
public:
virtual ~Value() = default;
public:
Kind getKind() const { return kind; }
static bool classof(const Value *) { return true; }
public:
Type *getType() const { return type; }
const std::string &getName() const { return name; }
void setName(const std::string &n) { name = n; }
bool hasName() const { return not name.empty(); }
bool isInt() const { return type->isInt(); }
bool isFloat() const { return type->isFloat(); }
bool isPointer() const { return type->isPointer(); }
const std::list<Use *> &getUses() { return uses; }
void addUse(Use *use) { uses.push_back(use); }
void replaceAllUsesWith(Value *value);
void removeUse(Use *use) { uses.remove(use); }
bool isConstant() const;
public:
virtual void print(std::ostream &os) const {};
}; // class Value
/*!
* Static constants known at compile time.
*
* `ConstantValue`s are not defined by instructions, and do not use any other
* `Value`s. It's type is either `int` or `float`.
*/
class ConstantValue : public Value {
protected:
union {
int iScalar;
float fScalar;
};
protected:
ConstantValue(int value)
: Value(kConstant, Type::getIntType(), ""), iScalar(value) {}
ConstantValue(float value)
: Value(kConstant, Type::getFloatType(), ""), fScalar(value) {}
public:
static ConstantValue *get(int value);
static ConstantValue *get(float value);
public:
static bool classof(const Value *value) {
return value->getKind() == kConstant;
}
public:
int getInt() const {
assert(isInt());
return iScalar;
}
float getFloat() const {
assert(isFloat());
return fScalar;
}
public:
void print(std::ostream &os) const override;
}; // class ConstantValue
class BasicBlock;
/*!
* Arguments of `BasicBlock`s.
*
* SysY IR is an SSA language, however, it does not use PHI instructions as in
* LLVM IR. `Value`s from different predecessor blocks are passed explicitly as
* block arguments. This is also the approach used by MLIR.
* NOTE that `Function` does not own `Argument`s, function arguments are
* implemented as its entry block's arguments.
*/
class Argument : public Value {
protected:
BasicBlock *block;
int index;
public:
Argument(Type *type, BasicBlock *block, int index,
const std::string &name = "");
public:
static bool classof(const Value *value) {
return value->getKind() == kConstant;
}
public:
BasicBlock *getParent() const { return block; }
int getIndex() const { return index; }
public:
void print(std::ostream &os) const override;
};
class Instruction;
class Function;
/*!
* The container for `Instruction` sequence.
*
* `BasicBlock` maintains a list of `Instruction`s, with the last one being
* a terminator (branch or return). Besides, `BasicBlock` stores its arguments
* and records its predecessor and successor `BasicBlock`s.
*/
class BasicBlock : public Value {
friend class Function;
public:
using inst_list = std::list<std::unique_ptr<Instruction>>;
using iterator = inst_list::iterator;
using arg_list = std::vector<std::unique_ptr<Argument>>;
using block_list = std::vector<BasicBlock *>;
protected:
Function *parent;
inst_list instructions;
arg_list arguments;
block_list successors;
block_list predecessors;
protected:
explicit BasicBlock(Function *parent, const std::string &name = "");
public:
static bool classof(const Value *value) {
return value->getKind() == kBasicBlock;
}
public:
int getNumInstructions() const { return instructions.size(); }
int getNumArguments() const { return arguments.size(); }
int getNumPredecessors() const { return predecessors.size(); }
int getNumSuccessors() const { return successors.size(); }
Function *getParent() const { return parent; }
inst_list &getInstructions() { return instructions; }
auto getArguments() const { return make_range(arguments); }
block_list &getPredecessors() { return predecessors; }
block_list &getSuccessors() { return successors; }
iterator begin() { return instructions.begin(); }
iterator end() { return instructions.end(); }
iterator terminator() { return std::prev(end()); }
Argument *createArgument(Type *type, const std::string &name = "") {
auto arg = new Argument(type, this, arguments.size(), name);
assert(arg);
arguments.emplace_back(arg);
return arguments.back().get();
};
public:
void print(std::ostream &os) const override;
}; // class BasicBlock
//! User is the abstract base type of `Value` types which use other `Value` as
//! operands. Currently, there are two kinds of `User`s, `Instruction` and
//! `GlobalValue`.
class User : public Value {
protected:
std::vector<Use> operands;
protected:
User(Kind kind, Type *type, const std::string &name = "")
: Value(kind, type, name), operands() {}
public:
using use_iterator = std::vector<Use>::const_iterator;
struct operand_iterator : public std::vector<Use>::const_iterator {
using Base = std::vector<Use>::const_iterator;
operand_iterator(const Base &iter) : Base(iter) {}
using value_type = Value *;
value_type operator->() { return Base::operator*().getValue(); }
value_type operator*() { return Base::operator*().getValue(); }
};
// struct const_operand_iterator : std::vector<Use>::const_iterator {
// using Base = std::vector<Use>::const_iterator;
// const_operand_iterator(const Base &iter) : Base(iter) {}
// using value_type = Value *;
// value_type operator->() { return operator*().getValue(); }
// };
public:
int getNumOperands() const { return operands.size(); }
operand_iterator operand_begin() const { return operands.begin(); }
operand_iterator operand_end() const { return operands.end(); }
auto getOperands() const {
return make_range(operand_begin(), operand_end());
}
Value *getOperand(int index) const { return operands[index].getValue(); }
void addOperand(Value *value) {
operands.emplace_back(operands.size(), this, value);
value->addUse(&operands.back());
}
template <typename ContainerT> void addOperands(const ContainerT &operands) {
for (auto value : operands)
addOperand(value);
}
void replaceOperand(int index, Value *value);
void setOperand(int index, Value *value);
}; // class User
/*!
* Base of all concrete instruction types.
*/
class Instruction : public User {
public:
// enum Kind : uint64_t {
// kInvalid = 0x0UL,
// // Binary
// kAdd = 0x1UL << 0,
// kSub = 0x1UL << 1,
// kMul = 0x1UL << 2,
// kDiv = 0x1UL << 3,
// kRem = 0x1UL << 4,
// kICmpEQ = 0x1UL << 5,
// kICmpNE = 0x1UL << 6,
// kICmpLT = 0x1UL << 7,
// kICmpGT = 0x1UL << 8,
// kICmpLE = 0x1UL << 9,
// kICmpGE = 0x1UL << 10,
// kFAdd = 0x1UL << 14,
// kFSub = 0x1UL << 15,
// kFMul = 0x1UL << 16,
// kFDiv = 0x1UL << 17,
// kFRem = 0x1UL << 18,
// kFCmpEQ = 0x1UL << 19,
// kFCmpNE = 0x1UL << 20,
// kFCmpLT = 0x1UL << 21,
// kFCmpGT = 0x1UL << 22,
// kFCmpLE = 0x1UL << 23,
// kFCmpGE = 0x1UL << 24,
// // Unary
// kNeg = 0x1UL << 25,
// kNot = 0x1UL << 26,
// kFNeg = 0x1UL << 27,
// kFtoI = 0x1UL << 28,
// kIToF = 0x1UL << 29,
// // call
// kCall = 0x1UL << 30,
// // terminator
// kCondBr = 0x1UL << 31,
// kBr = 0x1UL << 32,
// kReturn = 0x1UL << 33,
// // mem op
// kAlloca = 0x1UL << 34,
// kLoad = 0x1UL << 35,
// kStore = 0x1UL << 36,
// // constant
// // kConstant = 0x1UL << 37,
// };
protected:
Kind kind;
BasicBlock *parent;
protected:
Instruction(Kind kind, Type *type, BasicBlock *parent = nullptr,
const std::string &name = "");
public:
static bool classof(const Value *value) {
return value->getKind() >= kFirstInst and value->getKind() <= kLastInst;
}
public:
Kind getKind() const { return kind; }
BasicBlock *getParent() const { return parent; }
Function *getFunction() const { return parent->getParent(); }
void setParent(BasicBlock *bb) { parent = bb; }
bool isBinary() const {
static constexpr uint64_t BinaryOpMask =
(kAdd | kSub | kMul | kDiv | kRem) |
(kICmpEQ | kICmpNE | kICmpLT | kICmpGT | kICmpLE | kICmpGE) |
(kFAdd | kFSub | kFMul | kFDiv | kFRem) |
(kFCmpEQ | kFCmpNE | kFCmpLT | kFCmpGT | kFCmpLE | kFCmpGE);
return kind & BinaryOpMask;
}
bool isUnary() const {
static constexpr uint64_t UnaryOpMask = kNeg | kNot | kFNeg | kFtoI | kIToF;
return kind & UnaryOpMask;
}
bool isMemory() const {
static constexpr uint64_t MemoryOpMask = kAlloca | kLoad | kStore;
return kind & MemoryOpMask;
}
bool isTerminator() const {
static constexpr uint64_t TerminatorOpMask = kCondBr | kBr | kReturn;
return kind & TerminatorOpMask;
}
bool isCmp() const {
static constexpr uint64_t CmpOpMask =
(kICmpEQ | kICmpNE | kICmpLT | kICmpGT | kICmpLE | kICmpGE) |
(kFCmpEQ | kFCmpNE | kFCmpLT | kFCmpGT | kFCmpLE | kFCmpGE);
return kind & CmpOpMask;
}
bool isBranch() const {
static constexpr uint64_t BranchOpMask = kBr | kCondBr;
return kind & BranchOpMask;
}
bool isCommutative() const {
static constexpr uint64_t CommutativeOpMask =
kAdd | kMul | kICmpEQ | kICmpNE | kFAdd | kFMul | kFCmpEQ | kFCmpNE;
return kind & CommutativeOpMask;
}
bool isUnconditional() const { return kind == kBr; }
bool isConditional() const { return kind == kCondBr; }
}; // class Instruction
class Function;
//! Function call.
class CallInst : public Instruction {
friend class IRBuilder;
protected:
CallInst(Function *callee, const std::vector<Value *> &args = {},
BasicBlock *parent = nullptr, const std::string &name = "");
public:
static bool classof(const Value *value) { return value->getKind() == kCall; }
public:
Function *getCallee() const;
auto getArguments() const {
return make_range(std::next(operand_begin()), operand_end());
}
public:
void print(std::ostream &os) const override;
}; // class CallInst
//! Unary instruction, includes '!', '-' and type conversion.
class UnaryInst : public Instruction {
friend class IRBuilder;
protected:
UnaryInst(Kind kind, Type *type, Value *operand, BasicBlock *parent = nullptr,
const std::string &name = "")
: Instruction(kind, type, parent, name) {
addOperand(operand);
}
public:
static bool classof(const Value *value) {
return Instruction::classof(value) and
static_cast<const Instruction *>(value)->isUnary();
}
public:
Value *getOperand() const { return User::getOperand(0); }
public:
void print(std::ostream &os) const override;
}; // class UnaryInst
//! Binary instruction, e.g., arithmatic, relation, logic, etc.
class BinaryInst : public Instruction {
friend class IRBuilder;
protected:
BinaryInst(Kind kind, Type *type, Value *lhs, Value *rhs, BasicBlock *parent,
const std::string &name = "")
: Instruction(kind, type, parent, name) {
addOperand(lhs);
addOperand(rhs);
}
public:
static bool classof(const Value *value) {
return Instruction::classof(value) and
static_cast<const Instruction *>(value)->isBinary();
}
public:
Value *getLhs() const { return getOperand(0); }
Value *getRhs() const { return getOperand(1); }
public:
void print(std::ostream &os) const override;
}; // class BinaryInst
//! The return statement
class ReturnInst : public Instruction {
friend class IRBuilder;
protected:
ReturnInst(Value *value = nullptr, BasicBlock *parent = nullptr)
: Instruction(kReturn, Type::getVoidType(), parent, "") {
if (value)
addOperand(value);
}
public:
static bool classof(const Value *value) {
return value->getKind() == kReturn;
}
public:
bool hasReturnValue() const { return not operands.empty(); }
Value *getReturnValue() const {
return hasReturnValue() ? getOperand(0) : nullptr;
}
public:
void print(std::ostream &os) const override;
}; // class ReturnInst
//! Unconditional branch
class UncondBrInst : public Instruction {
friend class IRBuilder;
protected:
UncondBrInst(BasicBlock *block, std::vector<Value *> args,
BasicBlock *parent = nullptr)
: Instruction(kCondBr, Type::getVoidType(), parent, "") {
assert(block->getNumArguments() == args.size());
addOperand(block);
addOperands(args);
}
public:
static bool classof(const Value *value) { return value->getKind() == kBr; }
public:
BasicBlock *getBlock() const { return dyncast<BasicBlock>(getOperand(0)); }
auto getArguments() const {
return make_range(std::next(operand_begin()), operand_end());
}
public:
void print(std::ostream &os) const override;
}; // class UncondBrInst
//! Conditional branch
class CondBrInst : public Instruction {
friend class IRBuilder;
protected:
CondBrInst(Value *condition, BasicBlock *thenBlock, BasicBlock *elseBlock,
const std::vector<Value *> &thenArgs,
const std::vector<Value *> &elseArgs, BasicBlock *parent = nullptr)
: Instruction(kCondBr, Type::getVoidType(), parent, "") {
assert(thenBlock->getNumArguments() == thenArgs.size() and
elseBlock->getNumArguments() == elseArgs.size());
addOperand(condition);
addOperand(thenBlock);
addOperand(elseBlock);
addOperands(thenArgs);
addOperands(elseArgs);
}
public:
static bool classof(const Value *value) {
return value->getKind() == kCondBr;
}
public:
Value *getCondition() const { return getOperand(0); }
BasicBlock *getThenBlock() const {
return dyncast<BasicBlock>(getOperand(1));
}
BasicBlock *getElseBlock() const {
return dyncast<BasicBlock>(getOperand(2));
}
auto getThenArguments() const {
auto begin = std::next(operand_begin(), 3);
auto end = std::next(begin, getThenBlock()->getNumArguments());
return make_range(begin, end);
}
auto getElseArguments() const {
auto begin =
std::next(operand_begin(), 3 + getThenBlock()->getNumArguments());
auto end = operand_end();
return make_range(begin, end);
}
public:
void print(std::ostream &os) const override;
}; // class CondBrInst
//! Allocate memory for stack variables, used for non-global variable declartion
class AllocaInst : public Instruction {
friend class IRBuilder;
protected:
AllocaInst(Type *type, const std::vector<Value *> &dims = {},
BasicBlock *parent = nullptr, const std::string &name = "")
: Instruction(kAlloca, type, parent, name) {
addOperands(dims);
}
public:
static bool classof(const Value *value) {
return value->getKind() == kAlloca;
}
public:
int getNumDims() const { return getNumOperands(); }
auto getDims() const { return getOperands(); }
Value *getDim(int index) { return getOperand(index); }
public:
void print(std::ostream &os) const override;
}; // class AllocaInst
//! Load a value from memory address specified by a pointer value
class LoadInst : public Instruction {
friend class IRBuilder;
protected:
LoadInst(Value *pointer, const std::vector<Value *> &indices = {},
BasicBlock *parent = nullptr, const std::string &name = "")
: Instruction(kLoad, pointer->getType()->as<PointerType>()->getBaseType(),
parent, name) {
addOperand(pointer);
addOperands(indices);
}
public:
static bool classof(const Value *value) { return value->getKind() == kLoad; }
public:
int getNumIndices() const { return getNumOperands() - 1; }
Value *getPointer() const { return getOperand(0); }
auto getIndices() const {
return make_range(std::next(operand_begin()), operand_end());
}
Value *getIndex(int index) const { return getOperand(index + 1); }
public:
void print(std::ostream &os) const override;
}; // class LoadInst
//! Store a value to memory address specified by a pointer value
class StoreInst : public Instruction {
friend class IRBuilder;
protected:
StoreInst(Value *value, Value *pointer,
const std::vector<Value *> &indices = {},
BasicBlock *parent = nullptr, const std::string &name = "")
: Instruction(kStore, Type::getVoidType(), parent, name) {
addOperand(value);
addOperand(pointer);
addOperands(indices);
}
public:
static bool classof(const Value *value) { return value->getKind() == kStore; }
public:
int getNumIndices() const { return getNumOperands() - 2; }
Value *getValue() const { return getOperand(0); }
Value *getPointer() const { return getOperand(1); }
auto getIndices() const {
return make_range(std::next(operand_begin(), 2), operand_end());
}
Value *getIndex(int index) const { return getOperand(index + 2); }
public:
void print(std::ostream &os) const override;
}; // class StoreInst
class Module;
//! Function definition
class Function : public Value {
friend class Module;
protected:
Function(Module *parent, Type *type, const std::string &name)
: Value(kFunction, type, name), parent(parent), variableID(0), blocks() {
blocks.emplace_back(new BasicBlock(this, "entry"));
}
public:
static bool classof(const Value *value) {
return value->getKind() == kFunction;
}
public:
using block_list = std::list<std::unique_ptr<BasicBlock>>;
protected:
Module *parent;
int variableID;
int blockID;
block_list blocks;
public:
Type *getReturnType() const {
return getType()->as<FunctionType>()->getReturnType();
}
auto getParamTypes() const {
return getType()->as<FunctionType>()->getParamTypes();
}
auto getBasicBlocks() const { return make_range(blocks); }
BasicBlock *getEntryBlock() const { return blocks.front().get(); }
BasicBlock *addBasicBlock(const std::string &name = "") {
blocks.emplace_back(new BasicBlock(this, name));
return blocks.back().get();
}
void removeBasicBlock(BasicBlock *block) {
blocks.remove_if([&](std::unique_ptr<BasicBlock> &b) -> bool {
return block == b.get();
});
}
int allocateVariableID() { return variableID++; }
int allocateblockID() { return blockID++; }
public:
void print(std::ostream &os) const override;
}; // class Function
// class ArrayValue : public User {
// protected:
// ArrayValue(Type *type, const std::vector<Value *> &values = {})
// : User(type, "") {
// addOperands(values);
// }
// public:
// static ArrayValue *get(Type *type, const std::vector<Value *> &values);
// static ArrayValue *get(const std::vector<int> &values);
// static ArrayValue *get(const std::vector<float> &values);
// public:
// auto getValues() const { return getOperands(); }
// public:
// void print(std::ostream &os) const override{};
// }; // class ConstantArray
//! Global value declared at file scope
class GlobalValue : public User {
friend class Module;
protected:
Module *parent;
bool hasInit;
bool isConst;
protected:
GlobalValue(Module *parent, Type *type, const std::string &name,
const std::vector<Value *> &dims = {}, Value *init = nullptr)
: User(kGlobal, type, name), parent(parent), hasInit(init) {
assert(type->isPointer());
addOperands(dims);
if (init)
addOperand(init);
}
public:
static bool classof(const Value *value) {
return value->getKind() == kGlobal;
}
public:
Value *init() const { return hasInit ? operands.back().getValue() : nullptr; }
int getNumDims() const { return getNumOperands() - (hasInit ? 1 : 0); }
Value *getDim(int index) { return getOperand(index); }
public:
void print(std::ostream &os) const override{};
}; // class GlobalValue
//! IR unit for representing a SysY compile unit
class Module {
protected:
std::vector<std::unique_ptr<Value>> children;
std::map<std::string, Function *> functions;
std::map<std::string, GlobalValue *> globals;
public:
Module() = default;
public:
Function *createFunction(const std::string &name, Type *type) {
if (functions.count(name))
return nullptr;
auto func = new Function(this, type, name);
assert(func);
children.emplace_back(func);
functions.emplace(name, func);
return func;
};
GlobalValue *createGlobalValue(const std::string &name, Type *type,
const std::vector<Value *> &dims = {},
Value *init = nullptr) {
if (globals.count(name))
return nullptr;
auto global = new GlobalValue(this, type, name, dims, init);
assert(global);
children.emplace_back(global);
globals.emplace(name, global);
return global;
}
Function *getFunction(const std::string &name) const {
auto result = functions.find(name);
if (result == functions.end())
return nullptr;
return result->second;
}
GlobalValue *getGlobalValue(const std::string &name) const {
auto result = globals.find(name);
if (result == globals.end())
return nullptr;
return result->second;
}
std::map<std::string, Function *> *getFunctions(){
return &functions;
}
std::map<std::string, GlobalValue *> *getGlobalValues(){
return &globals;
}
public:
void print(std::ostream &os) const;
}; // class Module
/*!
* @}
*/
inline std::ostream &operator<<(std::ostream &os, const Type &type) {
type.print(os);
return os;
}
inline std::ostream &operator<<(std::ostream &os, const Value &value) {
value.print(os);
return os;
}
} // namespace sysy

View File

@ -1,232 +0,0 @@
#pragma once
#include "IR.h"
#include <cassert>
#include <memory>
namespace sysy {
class IRBuilder {
private:
BasicBlock *block;
BasicBlock::iterator position;
public:
IRBuilder() = default;
IRBuilder(BasicBlock *block) : block(block), position(block->end()) {}
IRBuilder(BasicBlock *block, BasicBlock::iterator position)
: block(block), position(position) {}
public:
BasicBlock *getBasicBlock() const { return block; }
BasicBlock::iterator getPosition() const { return position; }
void setPosition(BasicBlock *block, BasicBlock::iterator position) {
this->block = block;
this->position = position;
}
void setPosition(BasicBlock::iterator position) { this->position = position; }
public:
CallInst *createCallInst(Function *callee,
const std::vector<Value *> &args = {},
const std::string &name = "") {
auto inst = new CallInst(callee, args, block, name);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
}
UnaryInst *createUnaryInst(Instruction::Kind kind, Type *type, Value *operand,
const std::string &name = "") {
auto inst = new UnaryInst(kind, type, operand, block, name);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
}
UnaryInst *createNegInst(Value *operand, const std::string &name = "") {
return createUnaryInst(Instruction::kNeg, Type::getIntType(), operand,
name);
}
UnaryInst *createNotInst(Value *operand, const std::string &name = "") {
return createUnaryInst(Instruction::kNot, Type::getIntType(), operand,
name);
}
UnaryInst *createFtoIInst(Value *operand, const std::string &name = "") {
return createUnaryInst(Instruction::kFtoI, Type::getIntType(), operand,
name);
}
UnaryInst *createFNegInst(Value *operand, const std::string &name = "") {
return createUnaryInst(Instruction::kFNeg, Type::getFloatType(), operand,
name);
}
UnaryInst *createIToFInst(Value *operand, const std::string &name = "") {
return createUnaryInst(Instruction::kIToF, Type::getFloatType(), operand,
name);
}
BinaryInst *createBinaryInst(Instruction::Kind kind, Type *type, Value *lhs,
Value *rhs, const std::string &name = "") {
auto inst = new BinaryInst(kind, type, lhs, rhs, block, name);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
}
BinaryInst *createAddInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kAdd, Type::getIntType(), lhs, rhs,
name);
}
BinaryInst *createSubInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kSub, Type::getIntType(), lhs, rhs,
name);
}
BinaryInst *createMulInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kMul, Type::getIntType(), lhs, rhs,
name);
}
BinaryInst *createDivInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kDiv, Type::getIntType(), lhs, rhs,
name);
}
BinaryInst *createRemInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kRem, Type::getIntType(), lhs, rhs,
name);
}
BinaryInst *createICmpEQInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kICmpEQ, Type::getIntType(), lhs, rhs,
name);
}
BinaryInst *createICmpNEInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kICmpNE, Type::getIntType(), lhs, rhs,
name);
}
BinaryInst *createICmpLTInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kICmpLT, Type::getIntType(), lhs, rhs,
name);
}
BinaryInst *createICmpLEInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kICmpLE, Type::getIntType(), lhs, rhs,
name);
}
BinaryInst *createICmpGTInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kICmpGT, Type::getIntType(), lhs, rhs,
name);
}
BinaryInst *createICmpGEInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kICmpGE, Type::getIntType(), lhs, rhs,
name);
}
BinaryInst *createFAddInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kFAdd, Type::getFloatType(), lhs, rhs,
name);
}
BinaryInst *createFSubInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kFSub, Type::getFloatType(), lhs, rhs,
name);
}
BinaryInst *createFMulInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kFMul, Type::getFloatType(), lhs, rhs,
name);
}
BinaryInst *createFDivInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kFDiv, Type::getFloatType(), lhs, rhs,
name);
}
BinaryInst *createFRemInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kFRem, Type::getFloatType(), lhs, rhs,
name);
}
BinaryInst *createFCmpEQInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kFCmpEQ, Type::getFloatType(), lhs,
rhs, name);
}
BinaryInst *createFCmpNEInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kFCmpNE, Type::getFloatType(), lhs,
rhs, name);
}
BinaryInst *createFCmpLTInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kFCmpLT, Type::getFloatType(), lhs,
rhs, name);
}
BinaryInst *createFCmpLEInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kFCmpLE, Type::getFloatType(), lhs,
rhs, name);
}
BinaryInst *createFCmpGTInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kFCmpGT, Type::getFloatType(), lhs,
rhs, name);
}
BinaryInst *createFCmpGEInst(Value *lhs, Value *rhs,
const std::string &name = "") {
return createBinaryInst(Instruction::kFCmpGE, Type::getFloatType(), lhs,
rhs, name);
}
ReturnInst *createReturnInst(Value *value = nullptr) {
auto inst = new ReturnInst(value);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
}
UncondBrInst *createUncondBrInst(BasicBlock *block,
std::vector<Value *> args) {
auto inst = new UncondBrInst(block, args, block);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
}
CondBrInst *createCondBrInst(Value *condition, BasicBlock *thenBlock,
BasicBlock *elseBlock,
const std::vector<Value *> &thenArgs,
const std::vector<Value *> &elseArgs) {
auto inst = new CondBrInst(condition, thenBlock, elseBlock, thenArgs,
elseArgs, block);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
}
AllocaInst *createAllocaInst(Type *type,
const std::vector<Value *> &dims = {},
const std::string &name = "") {
auto inst = new AllocaInst(type, dims, block, name);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
}
LoadInst *createLoadInst(Value *pointer,
const std::vector<Value *> &indices = {},
const std::string &name = "") {
auto inst = new LoadInst(pointer, indices, block, name);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
}
StoreInst *createStoreInst(Value *value, Value *pointer,
const std::vector<Value *> &indices = {},
const std::string &name = "") {
auto inst = new StoreInst(value, pointer, indices, block, name);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
}
};
} // namespace sysy

674
src/LLVMIRGenerator.cpp Normal file
View File

@ -0,0 +1,674 @@
// LLVMIRGenerator.cpp
// TODO类型转换及其检查
// TODOsysy库函数处理
// TODO数组处理
// TODO对while、continue、break的测试
#include "LLVMIRGenerator.h"
#include <iomanip>
using namespace std;
namespace sysy {
std::string LLVMIRGenerator::generateIR(SysYParser::CompUnitContext* unit) {
// 初始化自定义IR数据结构
irModule = std::make_unique<sysy::Module>();
irBuilder = sysy::IRBuilder(); // 初始化IR构建器
tempCounter = 0;
symbolTable.clear();
tmpTable.clear();
globalVars.clear();
inFunction = false;
visitCompUnit(unit);
return irStream.str();
}
std::string LLVMIRGenerator::getNextTemp() {
std::string ret = "%." + std::to_string(tempCounter++);
tmpTable[ret] = "void";
return ret;
}
std::string LLVMIRGenerator::getLLVMType(const std::string& type) {
if (type == "int") return "i32";
if (type == "float") return "float";
if (type.find("[]") != std::string::npos)
return getLLVMType(type.substr(0, type.size()-2)) + "*";
return "i32";
}
sysy::Type* LLVMIRGenerator::getSysYType(const std::string& typeStr) {
if (typeStr == "int") return sysy::Type::getIntType();
if (typeStr == "float") return sysy::Type::getFloatType();
if (typeStr == "void") return sysy::Type::getVoidType();
// 处理指针类型等
return sysy::Type::getIntType();
}
std::any LLVMIRGenerator::visitCompUnit(SysYParser::CompUnitContext* ctx) {
auto type_i32 = Type::getIntType();
auto type_f32 = Type::getFloatType();
auto type_void = Type::getVoidType();
auto type_i32p = Type::getPointerType(type_i32);
auto type_f32p = Type::getPointerType(type_f32);
// 创建运行时库函数
irModule->createFunction("getint", sysy::FunctionType::get(type_i32, {}));
irModule->createFunction("getch", sysy::FunctionType::get(type_i32, {}));
irModule->createFunction("getfloat", sysy::FunctionType::get(type_f32, {}));
//TODO: 添加更多运行时库函数
irStream << "declare i32 @getint()\n";
irStream << "declare i32 @getch()\n";
irStream << "declare float @getfloat()\n";
//TODO: 添加更多运行时库函数的文本IR
for (auto decl : ctx->decl()) {
decl->accept(this);
}
for (auto funcDef : ctx->funcDef()) {
inFunction = true; // 进入函数定义
funcDef->accept(this);
inFunction = false; // 离开函数定义
}
return nullptr;
}
std::any LLVMIRGenerator::visitVarDecl(SysYParser::VarDeclContext* ctx) {
// TODO数组初始化
std::string type = ctx->bType()->getText();
currentVarType = getLLVMType(type);
for (auto varDef : ctx->varDef()) {
if (!inFunction) {
// 全局变量声明
std::string varName = varDef->Ident()->getText();
std::string llvmType = getLLVMType(type);
std::string value = "0"; // 默认值为 0
if (varDef->ASSIGN()) {
value = std::any_cast<std::string>(varDef->initVal()->accept(this));
} else {
std::cout << "[WR-Release-01]Warning: Global variable '" << varName
<< "' is declared without initialization, defaulting to 0.\n";
}
irStream << "@" << varName << " = dso_local global " << llvmType << " " << value << ", align 4\n";
globalVars.push_back(varName); // 记录全局变量
} else {
// 局部变量声明
varDef->accept(this);
}
}
return nullptr;
}
std::any LLVMIRGenerator::visitConstDecl(SysYParser::ConstDeclContext* ctx) {
// TODO数组初始化
std::string type = ctx->bType()->getText();
for (auto constDef : ctx->constDef()) {
if (!inFunction) {
// 全局常量声明
std::string varName = constDef->Ident()->getText();
std::string llvmType = getLLVMType(type);
std::string value = "0"; // 默认值为 0
try {
value = std::any_cast<std::string>(constDef->constInitVal()->accept(this));
} catch (...) {
throw std::runtime_error("[ERR-Release-01]Const value must be initialized upon definition.");
}
// 如果是 float 类型,转换为十六进制表示
if (llvmType == "float") {
try {
double floatValue = std::stod(value);
uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
std::stringstream ss;
ss << "0x" << std::hex << std::uppercase << hexValue;
value = ss.str();
} catch (...) {
throw std::runtime_error("[ERR-Release-02]Invalid float literal: " + value);
}
}
irStream << "@" << varName << " = dso_local constant " << llvmType << " " << value << ", align 4\n";
globalVars.push_back(varName); // 记录全局变量
} else {
// 局部常量声明
std::string varName = constDef->Ident()->getText();
std::string llvmType = getLLVMType(type);
std::string allocaName = getNextTemp();
std::string value = "0"; // 默认值为 0
try {
value = std::any_cast<std::string>(constDef->constInitVal()->accept(this));
} catch (...) {
throw std::runtime_error("Const value must be initialized upon definition.");
}
irStream << " " << allocaName << " = alloca " << llvmType << ", align 4\n";
if (llvmType == "float") {
try {
double floatValue = std::stod(value);
uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
std::stringstream ss;
ss << "0x" << std::hex << std::uppercase << hexValue;
value = ss.str();
} catch (...) {
throw std::runtime_error("Invalid float literal: " + value);
}
}
irStream << " store " << llvmType << " " << value << ", " << llvmType
<< "* " << allocaName << ", align 4\n";
symbolTable[varName] = {allocaName, llvmType};
tmpTable[allocaName] = llvmType;
}
}
return nullptr;
}
std::any LLVMIRGenerator::visitVarDef(SysYParser::VarDefContext* ctx) {
// TODO数组初始化
std::string varName = ctx->Ident()->getText();
std::string type = currentVarType;
std::string llvmType = getLLVMType(type);
std::string allocaName = getNextTemp();
irStream << " " << allocaName << " = alloca " << llvmType << ", align 4\n";
if (ctx->ASSIGN()) {
std::string value = std::any_cast<std::string>(ctx->initVal()->accept(this));
if (llvmType == "float") {
try {
double floatValue = std::stod(value);
uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
std::stringstream ss;
ss << "0x" << std::hex << std::uppercase << (hexValue & (0xffffffffUL << 32));
value = ss.str();
} catch (...) {
throw std::runtime_error("Invalid float literal: " + value);
}
}
irStream << " store " << llvmType << " " << value << ", " << llvmType
<< "* " << allocaName << ", align 4\n";
}
symbolTable[varName] = {allocaName, llvmType};
tmpTable[allocaName] = llvmType;
return nullptr;
}
std::any LLVMIRGenerator::visitFuncDef(SysYParser::FuncDefContext* ctx) {
currentFunction = ctx->Ident()->getText();
currentReturnType = getLLVMType(ctx->funcType()->getText());
symbolTable.clear();
tmpTable.clear();
tempCounter = 0;
hasReturn = false;
irStream << "define dso_local " << currentReturnType << " @" << currentFunction << "(";
if (ctx->funcFParams()) {
auto params = ctx->funcFParams()->funcFParam();
tempCounter += params.size();
for (size_t i = 0; i < params.size(); ++i) {
if (i > 0) irStream << ", ";
std::string paramType = getLLVMType(params[i]->bType()->getText());
irStream << paramType << " noundef %" << i;
symbolTable[params[i]->Ident()->getText()] = {"%" + std::to_string(i), paramType};
tmpTable["%" + std::to_string(i)] = paramType;
}
}
tempCounter++;
irStream << ") #0 {\n";
if (ctx->funcFParams()) {
auto params = ctx->funcFParams()->funcFParam();
for (size_t i = 0; i < params.size(); ++i) {
std::string varName = params[i]->Ident()->getText();
std::string type = params[i]->bType()->getText();
std::string llvmType = getLLVMType(type);
std::string allocaName = getNextTemp();
tmpTable[allocaName] = llvmType;
irStream << " " << allocaName << " = alloca " << llvmType << ", align 4\n";
irStream << " store " << llvmType << " " << symbolTable[varName].first << ", " << llvmType
<< "* " << allocaName << ", align 4\n";
symbolTable[varName] = {allocaName, llvmType};
}
}
ctx->blockStmt()->accept(this);
if (!hasReturn) {
if (currentReturnType == "void") {
irStream << " ret void\n";
} else {
irStream << " ret " << currentReturnType << " 0\n";
}
}
irStream << "}\n";
return nullptr;
}
std::any LLVMIRGenerator::visitBlockStmt(SysYParser::BlockStmtContext* ctx) {
for (auto item : ctx->blockItem()) {
item->accept(this);
}
return nullptr;
}
std::any LLVMIRGenerator::visitAssignStmt(SysYParser::AssignStmtContext *ctx)
{
std::string lhsAlloca = std::any_cast<std::string>(ctx->lValue()->accept(this));
std::string lhsType = symbolTable[ctx->lValue()->Ident()->getText()].second;
std::string rhs = std::any_cast<std::string>(ctx->exp()->accept(this));
if (lhsType == "float") {
try {
double floatValue = std::stod(rhs);
uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
std::stringstream ss;
ss << "0x" << std::hex << std::uppercase << (hexValue & (0xffffffffUL << 32));
rhs = ss.str();
} catch (...) {
throw std::runtime_error("Invalid float literal: " + rhs);
}
}
irStream << " store " << lhsType << " " << rhs << ", " << lhsType
<< "* " << lhsAlloca << ", align 4\n";
return nullptr;
}
std::any LLVMIRGenerator::visitIfStmt(SysYParser::IfStmtContext *ctx)
{
std::string cond = std::any_cast<std::string>(ctx->cond()->accept(this));
std::string trueLabel = "if.then." + std::to_string(tempCounter);
std::string falseLabel = "if.else." + std::to_string(tempCounter);
std::string mergeLabel = "if.end." + std::to_string(tempCounter++);
irStream << " br i1 " << cond << ", label %" << trueLabel << ", label %" << falseLabel << "\n";
irStream << trueLabel << ":\n";
ctx->stmt(0)->accept(this);
irStream << " br label %" << mergeLabel << "\n";
irStream << falseLabel << ":\n";
if (ctx->ELSE()) {
ctx->stmt(1)->accept(this);
}
irStream << " br label %" << mergeLabel << "\n";
irStream << mergeLabel << ":\n";
return nullptr;
}
std::any LLVMIRGenerator::visitWhileStmt(SysYParser::WhileStmtContext *ctx)
{
std::string loop_cond = "while.cond." + std::to_string(tempCounter);
std::string loop_body = "while.body." + std::to_string(tempCounter);
std::string loop_end = "while.end." + std::to_string(tempCounter++);
loopStack.push({loop_end, loop_cond});
irStream << " br label %" << loop_cond << "\n";
irStream << loop_cond << ":\n";
std::string cond = std::any_cast<std::string>(ctx->cond()->accept(this));
irStream << " br i1 " << cond << ", label %" << loop_body << ", label %" << loop_end << "\n";
irStream << loop_body << ":\n";
ctx->stmt()->accept(this);
irStream << " br label %" << loop_cond << "\n";
irStream << loop_end << ":\n";
loopStack.pop();
return nullptr;
}
std::any LLVMIRGenerator::visitBreakStmt(SysYParser::BreakStmtContext *ctx)
{
if (loopStack.empty()) {
throw std::runtime_error("Break statement outside of a loop.");
}
irStream << " br label %" << loopStack.top().breakLabel << "\n";
return nullptr;
}
std::any LLVMIRGenerator::visitContinueStmt(SysYParser::ContinueStmtContext *ctx)
{
if (loopStack.empty()) {
throw std::runtime_error("Continue statement outside of a loop.");
}
irStream << " br label %" << loopStack.top().continueLabel << "\n";
return nullptr;
}
std::any LLVMIRGenerator::visitReturnStmt(SysYParser::ReturnStmtContext *ctx)
{
hasReturn = true;
if (ctx->exp()) {
std::string value = std::any_cast<std::string>(ctx->exp()->accept(this));
irStream << " ret " << currentReturnType << " " << value << "\n";
} else {
irStream << " ret void\n";
}
return nullptr;
}
// std::any LLVMIRGenerator::visitStmt(SysYParser::StmtContext* ctx) {
// if (ctx->lValue() && ctx->ASSIGN()) {
// std::string lhsAlloca = std::any_cast<std::string>(ctx->lValue()->accept(this));
// std::string lhsType = symbolTable[ctx->lValue()->Ident()->getText()].second;
// std::string rhs = std::any_cast<std::string>(ctx->exp()->accept(this));
// if (lhsType == "float") {
// try {
// double floatValue = std::stod(rhs);
// uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
// std::stringstream ss;
// ss << "0x" << std::hex << std::uppercase << (hexValue & (0xffffffffUL << 32));
// rhs = ss.str();
// } catch (...) {
// throw std::runtime_error("Invalid float literal: " + rhs);
// }
// }
// irStream << " store " << lhsType << " " << rhs << ", " << lhsType
// << "* " << lhsAlloca << ", align 4\n";
// } else if (ctx->RETURN()) {
// hasReturn = true;
// if (ctx->exp()) {
// std::string value = std::any_cast<std::string>(ctx->exp()->accept(this));
// irStream << " ret " << currentReturnType << " " << value << "\n";
// } else {
// irStream << " ret void\n";
// }
// } else if (ctx->IF()) {
// std::string cond = std::any_cast<std::string>(ctx->cond()->accept(this));
// std::string trueLabel = "if.then." + std::to_string(tempCounter);
// std::string falseLabel = "if.else." + std::to_string(tempCounter);
// std::string mergeLabel = "if.end." + std::to_string(tempCounter++);
// irStream << " br i1 " << cond << ", label %" << trueLabel << ", label %" << falseLabel << "\n";
// irStream << trueLabel << ":\n";
// ctx->stmt(0)->accept(this);
// irStream << " br label %" << mergeLabel << "\n";
// irStream << falseLabel << ":\n";
// if (ctx->ELSE()) {
// ctx->stmt(1)->accept(this);
// }
// irStream << " br label %" << mergeLabel << "\n";
// irStream << mergeLabel << ":\n";
// } else if (ctx->WHILE()) {
// std::string loop_cond = "while.cond." + std::to_string(tempCounter);
// std::string loop_body = "while.body." + std::to_string(tempCounter);
// std::string loop_end = "while.end." + std::to_string(tempCounter++);
// loopStack.push({loop_end, loop_cond});
// irStream << " br label %" << loop_cond << "\n";
// irStream << loop_cond << ":\n";
// std::string cond = std::any_cast<std::string>(ctx->cond()->accept(this));
// irStream << " br i1 " << cond << ", label %" << loop_body << ", label %" << loop_end << "\n";
// irStream << loop_body << ":\n";
// ctx->stmt(0)->accept(this);
// irStream << " br label %" << loop_cond << "\n";
// irStream << loop_end << ":\n";
// loopStack.pop();
// } else if (ctx->BREAK()) {
// if (loopStack.empty()) {
// throw std::runtime_error("Break statement outside of a loop.");
// }
// irStream << " br label %" << loopStack.top().breakLabel << "\n";
// } else if (ctx->CONTINUE()) {
// if (loopStack.empty()) {
// throw std::runtime_error("Continue statement outside of a loop.");
// }
// irStream << " br label %" << loopStack.top().continueLabel << "\n";
// } else if (ctx->blockStmt()) {
// ctx->blockStmt()->accept(this);
// } else if (ctx->exp()) {
// ctx->exp()->accept(this);
// }
// return nullptr;
// }
std::any LLVMIRGenerator::visitLValue(SysYParser::LValueContext* ctx) {
std::string varName = ctx->Ident()->getText();
return symbolTable[varName].first;
}
// std::any LLVMIRGenerator::visitPrimaryExp(SysYParser::PrimaryExpContext* ctx) {
// if (ctx->lValue()) {
// std::string allocaPtr = std::any_cast<std::string>(ctx->lValue()->accept(this));
// std::string varName = ctx->lValue()->Ident()->getText();
// std::string type = symbolTable[varName].second;
// std::string temp = getNextTemp();
// irStream << " " << temp << " = load " << type << ", " << type << "* " << allocaPtr << ", align 4\n";
// tmpTable[temp] = type;
// return temp;
// } else if (ctx->exp()) {
// return ctx->exp()->accept(this);
// } else {
// return ctx->number()->accept(this);
// }
// }
std::any LLVMIRGenerator::visitPrimExp(SysYParser::PrimExpContext *ctx){
// irStream << "visitPrimExp\n";
// std::cout << "Type name: " << typeid(*(ctx->primaryExp())).name() << std::endl;
SysYParser::PrimaryExpContext* pExpCtx = ctx->primaryExp();
if (auto* lvalCtx = dynamic_cast<SysYParser::LValContext*>(pExpCtx)) {
std::string allocaPtr = std::any_cast<std::string>(lvalCtx->lValue()->accept(this));
std::string varName = lvalCtx->lValue()->Ident()->getText();
std::string type = symbolTable[varName].second;
std::string temp = getNextTemp();
irStream << " " << temp << " = load " << type << ", " << type << "* " << allocaPtr << ", align 4\n";
tmpTable[temp] = type;
return temp;
} else if (auto* expCtx = dynamic_cast<SysYParser::ParenExpContext*>(pExpCtx)) {
return expCtx->exp()->accept(this);
} else if (auto* strCtx = dynamic_cast<SysYParser::StrContext*>(pExpCtx)) {
return strCtx->string()->accept(this);
} else if (auto* numCtx = dynamic_cast<SysYParser::NumContext*>(pExpCtx)) {
return numCtx->number()->accept(this);
} else {
// 没有成功转换,说明 ctx->primaryExp() 不是 NumContext 或其他已知类型
// 可能是其他类型的表达式,或者是一个空的 PrimaryExpContext
std::cout << "Unknown primary expression type." << std::endl;
throw std::runtime_error("Unknown primary expression type.");
}
// return visitChildren(ctx);
}
std::any LLVMIRGenerator::visitParenExp(SysYParser::ParenExpContext* ctx) {
return ctx->exp()->accept(this);
}
std::any LLVMIRGenerator::visitNumber(SysYParser::NumberContext* ctx) {
if (ctx->ILITERAL()) {
return ctx->ILITERAL()->getText();
} else if (ctx->FLITERAL()) {
return ctx->FLITERAL()->getText();
}
return "";
}
std::any LLVMIRGenerator::visitString(SysYParser::StringContext *ctx)
{
if (ctx->STRING()) {
// 处理字符串常量
std::string str = ctx->STRING()->getText();
// 去掉引号
str = str.substr(1, str.size() - 2);
// 转义处理
std::string escapedStr;
for (char c : str) {
if (c == '\\') {
escapedStr += "\\\\";
} else if (c == '"') {
escapedStr += "\\\"";
} else {
escapedStr += c;
}
}
return "\"" + escapedStr + "\"";
}
return ctx->STRING()->getText();
}
std::any LLVMIRGenerator::visitUnExp(SysYParser::UnExpContext* ctx) {
if (ctx->unaryOp()) {
std::string operand = std::any_cast<std::string>(ctx->unaryExp()->accept(this));
std::string op = ctx->unaryOp()->getText();
std::string temp = getNextTemp();
std::string type = operand.substr(0, operand.find(' '));
tmpTable[temp] = type;
if (op == "-") {
irStream << " " << temp << " = sub " << type << " 0, " << operand << "\n";
} else if (op == "!") {
irStream << " " << temp << " = xor " << type << " " << operand << ", 1\n";
}
return temp;
}
return ctx->unaryExp()->accept(this);
}
std::any LLVMIRGenerator::visitCall(SysYParser::CallContext *ctx)
{
std::string funcName = ctx->Ident()->getText();
std::vector<std::string> args;
if (ctx->funcRParams()) {
for (auto argCtx : ctx->funcRParams()->exp()) {
args.push_back(std::any_cast<std::string>(argCtx->accept(this)));
}
}
std::string temp = getNextTemp();
std::string argList = "";
for (size_t i = 0; i < args.size(); ++i) {
if (i > 0) argList += ", ";
argList +=tmpTable[args[i]] + " noundef " + args[i];
}
irStream << " " << temp << " = call " << currentReturnType << " @" << funcName << "(" << argList << ")\n";
tmpTable[temp] = currentReturnType;
return temp;
}
std::any LLVMIRGenerator::visitMulExp(SysYParser::MulExpContext* ctx) {
auto unaryExps = ctx->unaryExp();
std::string left = std::any_cast<std::string>(unaryExps[0]->accept(this));
for (size_t i = 1; i < unaryExps.size(); ++i) {
std::string right = std::any_cast<std::string>(unaryExps[i]->accept(this));
std::string op = ctx->children[2*i-1]->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[left];
if (op == "*") {
irStream << " " << temp << " = mul nsw " << type << " " << left << ", " << right << "\n";
} else if (op == "/") {
irStream << " " << temp << " = sdiv " << type << " " << left << ", " << right << "\n";
} else if (op == "%") {
irStream << " " << temp << " = srem " << type << " " << left << ", " << right << "\n";
}
left = temp;
tmpTable[temp] = type;
}
return left;
}
std::any LLVMIRGenerator::visitAddExp(SysYParser::AddExpContext* ctx) {
auto mulExps = ctx->mulExp();
std::string left = std::any_cast<std::string>(mulExps[0]->accept(this));
for (size_t i = 1; i < mulExps.size(); ++i) {
std::string right = std::any_cast<std::string>(mulExps[i]->accept(this));
std::string op = ctx->children[2*i-1]->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[left];
if (op == "+") {
irStream << " " << temp << " = add nsw " << type << " " << left << ", " << right << "\n";
} else if (op == "-") {
irStream << " " << temp << " = sub nsw " << type << " " << left << ", " << right << "\n";
}
left = temp;
tmpTable[temp] = type;
}
return left;
}
std::any LLVMIRGenerator::visitRelExp(SysYParser::RelExpContext* ctx) {
auto addExps = ctx->addExp();
std::string left = std::any_cast<std::string>(addExps[0]->accept(this));
for (size_t i = 1; i < addExps.size(); ++i) {
std::string right = std::any_cast<std::string>(addExps[i]->accept(this));
std::string op = ctx->children[2*i-1]->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[left];
if (op == "<") {
irStream << " " << temp << " = icmp slt " << type << " " << left << ", " << right << "\n";
} else if (op == ">") {
irStream << " " << temp << " = icmp sgt " << type << " " << left << ", " << right << "\n";
} else if (op == "<=") {
irStream << " " << temp << " = icmp sle " << type << " " << left << ", " << right << "\n";
} else if (op == ">=") {
irStream << " " << temp << " = icmp sge " << type << " " << left << ", " << right << "\n";
}
left = temp;
}
return left;
}
std::any LLVMIRGenerator::visitEqExp(SysYParser::EqExpContext* ctx) {
auto relExps = ctx->relExp();
std::string left = std::any_cast<std::string>(relExps[0]->accept(this));
for (size_t i = 1; i < relExps.size(); ++i) {
std::string right = std::any_cast<std::string>(relExps[i]->accept(this));
std::string op = ctx->children[2*i-1]->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[left];
if (op == "==") {
irStream << " " << temp << " = icmp eq " << type << " " << left << ", " << right << "\n";
} else if (op == "!=") {
irStream << " " << temp << " = icmp ne " << type << " " << left << ", " << right << "\n";
}
left = temp;
}
return left;
}
std::any LLVMIRGenerator::visitLAndExp(SysYParser::LAndExpContext* ctx) {
auto eqExps = ctx->eqExp();
std::string left = std::any_cast<std::string>(eqExps[0]->accept(this));
for (size_t i = 1; i < eqExps.size(); ++i) {
std::string falseLabel = "land.false." + std::to_string(tempCounter);
std::string endLabel = "land.end." + std::to_string(tempCounter++);
std::string temp = getNextTemp();
irStream << " br label %" << falseLabel << "\n";
irStream << falseLabel << ":\n";
std::string right = std::any_cast<std::string>(eqExps[i]->accept(this));
irStream << " " << temp << " = and i1 " << left << ", " << right << "\n";
irStream << " br label %" << endLabel << "\n";
irStream << endLabel << ":\n";
left = temp;
}
return left;
}
std::any LLVMIRGenerator::visitLOrExp(SysYParser::LOrExpContext* ctx) {
auto lAndExps = ctx->lAndExp();
std::string left = std::any_cast<std::string>(lAndExps[0]->accept(this));
for (size_t i = 1; i < lAndExps.size(); ++i) {
std::string trueLabel = "lor.true." + std::to_string(tempCounter);
std::string endLabel = "lor.end." + std::to_string(tempCounter++);
std::string temp = getNextTemp();
irStream << " br label %" << trueLabel << "\n";
irStream << trueLabel << ":\n";
std::string right = std::any_cast<std::string>(lAndExps[i]->accept(this));
irStream << " " << temp << " = or i1 " << left << ", " << right << "\n";
irStream << " br label %" << endLabel << "\n";
irStream << endLabel << ":\n";
left = temp;
}
return left;
}
}

859
src/LLVMIRGenerator_1.cpp Normal file
View File

@ -0,0 +1,859 @@
// LLVMIRGenerator.cpp
// TODO类型转换及其检查
// TODOsysy库函数处理
// TODO数组处理
// TODO对while、continue、break的测试
#include "LLVMIRGenerator_1.h"
#include <iomanip>
#include <stdexcept>
#include <sstream>
// namespace sysy {
std::string LLVMIRGenerator::generateIR(SysYParser::CompUnitContext* unit) {
// 初始化 SysY IR 模块
module = std::make_unique<sysy::Module>();
// 清空符号表和临时变量表
symbolTable.clear();
tmpTable.clear();
irSymbolTable.clear();
irTmpTable.clear();
tempCounter = 0;
globalVars.clear();
hasReturn = false;
loopStack = std::stack<LoopLabels>();
inFunction = false;
// 访问编译单元
visitCompUnit(unit);
return irStream.str();
}
std::string LLVMIRGenerator::getNextTemp() {
std::string ret = "%." + std::to_string(tempCounter++);
tmpTable[ret] = "void";
return ret;
}
std::string LLVMIRGenerator::getIRTempName() {
return "%" + std::to_string(tempCounter++);
}
std::string LLVMIRGenerator::getLLVMType(const std::string& type) {
if (type == "int") return "i32";
if (type == "float") return "float";
if (type.find("[]") != std::string::npos)
return getLLVMType(type.substr(0, type.size() - 2)) + "*";
return "i32";
}
sysy::Type* LLVMIRGenerator::getIRType(const std::string& type) {
if (type == "int") return sysy::Type::getIntType();
if (type == "float") return sysy::Type::getFloatType();
if (type == "void") return sysy::Type::getVoidType();
if (type.find("[]") != std::string::npos) {
std::string baseType = type.substr(0, type.size() - 2);
return sysy::Type::getPointerType(getIRType(baseType));
}
return sysy::Type::getIntType(); // 默认 int
}
void LLVMIRGenerator::setIRPosition(sysy::BasicBlock* block) {
currentIRBlock = block;
}
std::any LLVMIRGenerator::visitCompUnit(SysYParser::CompUnitContext* ctx) {
for (auto decl : ctx->decl()) {
decl->accept(this);
}
for (auto funcDef : ctx->funcDef()) {
inFunction = true;
funcDef->accept(this);
inFunction = false;
}
return nullptr;
}
std::any LLVMIRGenerator::visitVarDecl(SysYParser::VarDeclContext* ctx) {
// TODO数组初始化
std::string type = ctx->bType()->getText();
currentVarType = getLLVMType(type);
sysy::Type* irType = sysy::Type::getPointerType(getIRType(type));
for (auto varDef : ctx->varDef()) {
if (!inFunction) {
// 全局变量(文本 IR
std::string varName = varDef->Ident()->getText();
std::string llvmType = getLLVMType(type);
std::string value = "0";
sysy::Value* initValue = nullptr;
if (varDef->ASSIGN()) {
value = std::any_cast<std::string>(varDef->initVal()->accept(this));
if (irTmpTable.find(value) != irTmpTable.end() && isa<sysy::ConstantValue>(irTmpTable[value])) {
initValue = irTmpTable[value];
}
}
if (llvmType == "float" && initValue) {
try {
double floatValue = std::stod(value);
uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
std::stringstream ss;
ss << "0x" << std::hex << std::uppercase << hexValue;
value = ss.str();
} catch (...) {
throw std::runtime_error("[ERR-Release-02]Invalid float literal: " + value);
}
}
irStream << "@" << varName << " = dso_local global " << llvmType << " " << value << ", align 4\n";
globalVars.push_back(varName);
// 全局变量SysY IR
auto globalValue = module->createGlobalValue(varName, irType, {}, initValue);
irSymbolTable[varName] = globalValue;
} else {
varDef->accept(this);
}
}
return nullptr;
}
std::any LLVMIRGenerator::visitConstDecl(SysYParser::ConstDeclContext* ctx) {
// TODO数组初始化
std::string type = ctx->bType()->getText();
currentVarType = getLLVMType(type);
sysy::Type* irType = sysy::Type::getPointerType(getIRType(type)); // 全局变量为指针类型
for (auto constDef : ctx->constDef()) {
std::string varName = constDef->Ident()->getText();
std::string llvmType = getLLVMType(type);
std::string value = "0";
sysy::Value* initValue = nullptr;
try {
value = std::any_cast<std::string>(constDef->constInitVal()->accept(this));
if (isa<sysy::ConstantValue>(irTmpTable[value])) {
initValue = irTmpTable[value];
}
} catch (...) {
throw std::runtime_error("Const value must be initialized upon definition.");
}
if (!inFunction) {
// 全局常量(文本 IR
if (llvmType == "float") {
try {
double floatValue = std::stod(value);
uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
std::stringstream ss;
ss << "0x" << std::hex << std::uppercase << hexValue;
value = ss.str();
} catch (...) {
throw std::runtime_error("[ERR-Release-03]Invalid float literal: " + value);
}
}
irStream << "@" << varName << " = dso_local constant " << llvmType << " " << value << ", align 4\n";
globalVars.push_back(varName);
// 全局常量SysY IR
auto globalValue = module->createGlobalValue(varName, irType, {}, initValue);
irSymbolTable[varName] = globalValue;
} else {
// 局部常量(文本 IR
std::string allocaName = getNextTemp();
if (llvmType == "float") {
try {
double floatValue = std::stod(value);
uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
std::stringstream ss;
ss << "0x" << std::hex << std::uppercase << hexValue;
value = ss.str();
} catch (...) {
throw std::runtime_error("Invalid float literal: " + value);
}
}
irStream << " " << allocaName << " = alloca " << llvmType << ", align 4\n";
irStream << " store " << llvmType << " " << value << ", " << llvmType
<< "* " << allocaName << ", align 4\n";
symbolTable[varName] = {allocaName, llvmType};
tmpTable[allocaName] = llvmType;
// 局部常量SysY IRTODO:这里可能有bugAI在犯蠢
sysy::IRBuilder builder(currentIRBlock);
auto allocaInst = builder.createAllocaInst(irType, {}, varName);
builder.createStoreInst(initValue, allocaInst);
irSymbolTable[varName] = allocaInst;
irTmpTable[allocaName] = allocaInst;
}
}
return nullptr;
}
std::any LLVMIRGenerator::visitVarDef(SysYParser::VarDefContext* ctx) {
// TODO数组初始化
std::string varName = ctx->Ident()->getText();
std::string llvmType = currentVarType;
sysy::Type* irType = sysy::Type::getPointerType(getIRType(currentVarType == "i32" ? "int" : "float"));
std::string allocaName = getNextTemp();
// 局部变量(文本 IR
irStream << " " << allocaName << " = alloca " << llvmType << ", align 4\n";
// 局部变量SysY IR
sysy::IRBuilder builder(currentIRBlock);
auto allocaInst = builder.createAllocaInst(irType, {}, varName);
sysy::Value* initValue = nullptr;
if (ctx->ASSIGN()) {
std::string value = std::any_cast<std::string>(ctx->initVal()->accept(this));
if (llvmType == "float") {
try {
double floatValue = std::stod(value);
uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
std::stringstream ss;
ss << "0x" << std::hex << std::uppercase << (hexValue & (0xffffffffUL << 32));
value = ss.str();
} catch (...) {
throw std::runtime_error("Invalid float literal: " + value);
}
}
irStream << " store " << llvmType << " " << value << ", " << llvmType
<< "* " << allocaName << ", align 4\n";
if (irTmpTable.find(value) != irTmpTable.end()) {
initValue = irTmpTable[value];
}
builder.createStoreInst(initValue, allocaInst);
}
symbolTable[varName] = {allocaName, llvmType};
tmpTable[allocaName] = llvmType;
irSymbolTable[varName] = allocaInst;//TODO:这里没看懂在干嘛
irTmpTable[allocaName] = allocaInst;//TODO:这里没看懂在干嘛
builder.createStoreInst(initValue, allocaInst);//TODO:这里没看懂在干嘛
return nullptr;
}
std::any LLVMIRGenerator::visitFuncDef(SysYParser::FuncDefContext* ctx) {
currentFunction = ctx->Ident()->getText();
currentReturnType = getLLVMType(ctx->funcType()->getText());
sysy::Type* irReturnType = getIRType(ctx->funcType()->getText());
std::vector<sysy::Type*> paramTypes;
// 清空符号表
symbolTable.clear();
tmpTable.clear();
irSymbolTable.clear();
irTmpTable.clear();
tempCounter = 0;
hasReturn = false;
// 处理函数参数(文本 IR 和 SysY IR
if (ctx->funcFParams()) {
auto params = ctx->funcFParams()->funcFParam();
for (size_t i = 0; i < params.size(); ++i) {
std::string paramType = getLLVMType(params[i]->bType()->getText());
if (i > 0) irStream << ", ";
irStream << paramType << " noundef %" << i;
symbolTable[params[i]->Ident()->getText()] = {"%" + std::to_string(i), paramType};
tmpTable["%" + std::to_string(i)] = paramType;
paramTypes.push_back(getIRType(params[i]->bType()->getText()));
}
tempCounter += params.size();
}
tempCounter++;
// 文本 IR 函数定义
irStream << "define dso_local " << currentReturnType << " @" << currentFunction << "(";
irStream << ") #0 {\n";
// SysY IR 函数定义
sysy::Type* funcType = sysy::Type::getFunctionType(irReturnType, paramTypes);
currentIRFunction = module->createFunction(currentFunction, funcType);
setIRPosition(currentIRFunction->getEntryBlock());
// 处理函数参数分配
if (ctx->funcFParams()) {
auto params = ctx->funcFParams()->funcFParam();
for (size_t i = 0; i < params.size(); ++i) {
std::string varName = params[i]->Ident()->getText();
std::string llvmType = getLLVMType(params[i]->bType()->getText());
sysy::Type* irType = getIRType(params[i]->bType()->getText());
std::string allocaName = getNextTemp();
tmpTable[allocaName] = llvmType;
// 文本 IR 分配
irStream << " " << allocaName << " = alloca " << llvmType << ", align 4\n";
irStream << " store " << llvmType << " %" << i << ", " << llvmType
<< "* " << allocaName << ", align 4\n";
// SysY IR 分配
sysy::IRBuilder builder(currentIRBlock);
auto arg = currentIRBlock->createArgument(irType, varName);
auto allocaInst = builder.createAllocaInst(sysy::Type::getPointerType(irType), {}, varName);
builder.createStoreInst(arg, allocaInst);
symbolTable[varName] = {allocaName, llvmType};
irSymbolTable[varName] = allocaInst;
irTmpTable[allocaName] = allocaInst;
}
}
ctx->blockStmt()->accept(this);
if (!hasReturn) {
if (currentReturnType == "void") {
irStream << " ret void\n";
sysy::IRBuilder builder(currentIRBlock);
builder.createReturnInst();
} else {
irStream << " ret " << currentReturnType << " 0\n";
sysy::IRBuilder builder(currentIRBlock);
builder.createReturnInst(sysy::ConstantValue::get(0));
}
}
irStream << "}\n";
currentIRFunction = nullptr;
currentIRBlock = nullptr;
return nullptr;
}
std::any LLVMIRGenerator::visitBlockStmt(SysYParser::BlockStmtContext* ctx) {
for (auto item : ctx->blockItem()) {
item->accept(this);
}
return nullptr;
}
std::any LLVMIRGenerator::visitAssignStmt(SysYParser::AssignStmtContext* ctx) {
std::string lhsAlloca = std::any_cast<std::string>(ctx->lValue()->accept(this));
std::string lhsType = symbolTable[ctx->lValue()->Ident()->getText()].second;
std::string rhs = std::any_cast<std::string>(ctx->exp()->accept(this));
sysy::Value* rhsValue = irTmpTable[rhs];
// 文本 IR
if (lhsType == "float") {
try {
double floatValue = std::stod(rhs);
uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
std::stringstream ss;
ss << "0x" << std::hex << std::uppercase << (hexValue & (0xffffffffUL << 32));
rhs = ss.str();
} catch (...) {
// 如果 rhs 不是字面量,假设已正确处理
throw std::runtime_error("Invalid float literal: " + rhs);
}
}
irStream << " store " << lhsType << " " << rhs << ", " << lhsType
<< "* " << lhsAlloca << ", align 4\n";
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
builder.createStoreInst(rhsValue, irSymbolTable[ctx->lValue()->Ident()->getText()]);
return nullptr;
}
std::any LLVMIRGenerator::visitIfStmt(SysYParser::IfStmtContext* ctx) {
std::string cond = std::any_cast<std::string>(ctx->cond()->accept(this));
sysy::Value* condValue = irTmpTable[cond];
std::string trueLabel = "if.then." + std::to_string(tempCounter);
std::string falseLabel = "if.else." + std::to_string(tempCounter);
std::string mergeLabel = "if.end." + std::to_string(tempCounter++);
// SysY IR 基本块
sysy::BasicBlock* thenBlock = currentIRFunction->addBasicBlock(trueLabel);
sysy::BasicBlock* elseBlock = ctx->ELSE() ? currentIRFunction->addBasicBlock(falseLabel) : nullptr;
sysy::BasicBlock* mergeBlock = currentIRFunction->addBasicBlock(mergeLabel);
// 文本 IR
irStream << " br i1 " << cond << ", label %" << trueLabel << ", label %"
<< (ctx->ELSE() ? falseLabel : mergeLabel) << "\n";
// SysY IR 条件分支
sysy::IRBuilder builder(currentIRBlock);
builder.createCondBrInst(condValue, thenBlock, ctx->ELSE() ? elseBlock : mergeBlock, {}, {});
// 处理 then 分支
setIRPosition(thenBlock);
irStream << trueLabel << ":\n";
ctx->stmt(0)->accept(this);
irStream << " br label %" << mergeLabel << "\n";
builder.setPosition(thenBlock, thenBlock->end());
builder.createUncondBrInst(mergeBlock, {});
// 处理 else 分支
if (ctx->ELSE()) {
setIRPosition(elseBlock);
irStream << falseLabel << ":\n";
ctx->stmt(1)->accept(this);
irStream << " br label %" << mergeLabel << "\n";
builder.setPosition(elseBlock, elseBlock->end());
builder.createUncondBrInst(mergeBlock, {});
}
// 合并点
setIRPosition(mergeBlock);
irStream << mergeLabel << ":\n";
return nullptr;
}
std::any LLVMIRGenerator::visitWhileStmt(SysYParser::WhileStmtContext* ctx) {
std::string loopCond = "while.cond." + std::to_string(tempCounter);
std::string loopBody = "while.body." + std::to_string(tempCounter);
std::string loopEnd = "while.end." + std::to_string(tempCounter++);
// SysY IR 基本块
sysy::BasicBlock* condBlock = currentIRFunction->addBasicBlock(loopCond);
sysy::BasicBlock* bodyBlock = currentIRFunction->addBasicBlock(loopBody);
sysy::BasicBlock* endBlock = currentIRFunction->addBasicBlock(loopEnd);
loopStack.push({loopEnd, loopCond, endBlock, condBlock});
// 跳转到条件块
sysy::IRBuilder builder(currentIRBlock);
builder.createUncondBrInst(condBlock, {});
irStream << " br label %" << loopCond << "\n";
// 条件块
setIRPosition(condBlock);
irStream << loopCond << ":\n";
std::string cond = std::any_cast<std::string>(ctx->cond()->accept(this));
sysy::Value* condValue = irTmpTable[cond];
irStream << " br i1 " << cond << ", label %" << loopBody << ", label %" << loopEnd << "\n";
builder.setPosition(condBlock, condBlock->end());
builder.createCondBrInst(condValue, bodyBlock, endBlock, {}, {});
// 循环体
setIRPosition(bodyBlock);
irStream << loopBody << ":\n";
ctx->stmt()->accept(this);
irStream << " br label %" << loopCond << "\n";
builder.setPosition(bodyBlock, bodyBlock->end());
builder.createUncondBrInst(condBlock, {});
// 结束块
setIRPosition(endBlock);
irStream << loopEnd << ":\n";
loopStack.pop();
return nullptr;
}
std::any LLVMIRGenerator::visitBreakStmt(SysYParser::BreakStmtContext* ctx) {
if (loopStack.empty()) {
throw std::runtime_error("Break statement outside of a loop.");
}
irStream << " br label %" << loopStack.top().breakLabel << "\n";
sysy::IRBuilder builder(currentIRBlock);
builder.createUncondBrInst(loopStack.top().irBreakBlock, {});
return nullptr;
}
std::any LLVMIRGenerator::visitContinueStmt(SysYParser::ContinueStmtContext* ctx) {
if (loopStack.empty()) {
throw std::runtime_error("Continue statement outside of a loop.");
}
irStream << " br label %" << loopStack.top().continueLabel << "\n";
sysy::IRBuilder builder(currentIRBlock);
builder.createUncondBrInst(loopStack.top().irContinueBlock, {});
return nullptr;
}
std::any LLVMIRGenerator::visitReturnStmt(SysYParser::ReturnStmtContext* ctx) {
hasReturn = true;
sysy::IRBuilder builder(currentIRBlock);
if (ctx->exp()) {
std::string value = std::any_cast<std::string>(ctx->exp()->accept(this));
sysy::Value* irValue = irTmpTable[value];
irStream << " ret " << currentReturnType << " " << value << "\n";
builder.createReturnInst(irValue);
} else {
irStream << " ret void\n";
builder.createReturnInst();
}
return nullptr;
}
std::any LLVMIRGenerator::visitLValue(SysYParser::LValueContext* ctx) {
std::string varName = ctx->Ident()->getText();
if (irSymbolTable.find(varName) == irSymbolTable.end()) {
throw std::runtime_error("Undefined variable: " + varName);
}
// 对于 LValue返回分配的指针文本 IR 和 SysY IR 一致)
return symbolTable[varName].first;
}
std::any LLVMIRGenerator::visitPrimExp(SysYParser::PrimExpContext* ctx) {
SysYParser::PrimaryExpContext* pExpCtx = ctx->primaryExp();
if (auto* lvalCtx = dynamic_cast<SysYParser::LValContext*>(pExpCtx)) {
std::string allocaPtr = std::any_cast<std::string>(lvalCtx->lValue()->accept(this));
std::string varName = lvalCtx->lValue()->Ident()->getText();
std::string type = symbolTable[varName].second;
std::string temp = getNextTemp();
sysy::Type* irType = getIRType(type == "i32" ? "int" : "float");
// 文本 IR
irStream << " " << temp << " = load " << type << ", " << type << "* " << allocaPtr << ", align 4\n";
tmpTable[temp] = type;
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
auto loadInst = builder.createLoadInst(irSymbolTable[varName], {});
irTmpTable[temp] = loadInst;
return temp;
} else if (auto* expCtx = dynamic_cast<SysYParser::ParenExpContext*>(pExpCtx)) {
return expCtx->exp()->accept(this);
} else if (auto* strCtx = dynamic_cast<SysYParser::StrContext*>(pExpCtx)) {
return strCtx->string()->accept(this);
} else if (auto* numCtx = dynamic_cast<SysYParser::NumContext*>(pExpCtx)) {
return numCtx->number()->accept(this);
} else {
// 没有成功转换,说明 ctx->primaryExp() 不是 NumContext 或其他已知类型
// 可能是其他类型的表达式,或者是一个空的 PrimaryExpContext
std::cout << "Unknown primary expression type." << std::endl;
throw std::runtime_error("Unknown primary expression type.");
}
}
std::any LLVMIRGenerator::visitParenExp(SysYParser::ParenExpContext* ctx) {
return ctx->exp()->accept(this);
}
std::any LLVMIRGenerator::visitNumber(SysYParser::NumberContext* ctx) {
std::string value;
sysy::Value* irValue = nullptr;
if (ctx->ILITERAL()) {
value = ctx->ILITERAL()->getText();
irValue = sysy::ConstantValue::get(std::stoi(value));
} else if (ctx->FLITERAL()) {
value = ctx->FLITERAL()->getText();
irValue = sysy::ConstantValue::get(std::stof(value));
} else {
value = "";
}
std::string temp = getNextTemp();
tmpTable[temp] = ctx->ILITERAL() ? "i32" : "float";
irTmpTable[temp] = irValue;
return value;
}
std::any LLVMIRGenerator::visitString(SysYParser::StringContext* ctx) {
if (ctx->STRING()) {
std::string str = ctx->STRING()->getText();
str = str.substr(1, str.size() - 2);
std::string escapedStr;
for (char c : str) {
if (c == '\\') {
escapedStr += "\\\\";
} else if (c == '"') {
escapedStr += "\\\"";
} else {
escapedStr += c;
}
}
// TODO: SysY IR 暂不支持字符串常量,返回文本 IR 结果
return "\"" + escapedStr + "\"";
}
return ctx->STRING()->getText();
}
std::any LLVMIRGenerator::visitUnExp(SysYParser::UnExpContext* ctx) {
if (ctx->unaryOp()) {
std::string operand = std::any_cast<std::string>(ctx->unaryExp()->accept(this));
sysy::Value* irOperand = irTmpTable[operand];
std::string op = ctx->unaryOp()->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[operand];
sysy::Type* irType = getIRType(type == "i32" ? "int" : "float");
tmpTable[temp] = type;
// 文本 IR
if (op == "-") {
irStream << " " << temp << " = sub " << type << " 0, " << operand << "\n";
} else if (op == "!") {
irStream << " " << temp << " = xor " << type << " " << operand << ", 1\n";
}
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
sysy::Instruction::Kind kind = (op == "-") ? (type == "i32" ? sysy::Instruction::kNeg : sysy::Instruction::kFNeg)
: sysy::Instruction::kNot;
auto unaryInst = builder.createUnaryInst(kind, irType, irOperand, temp);
irTmpTable[temp] = unaryInst;
return temp;
}
return ctx->unaryExp()->accept(this);
}
std::any LLVMIRGenerator::visitCall(SysYParser::CallContext* ctx) {
std::string funcName = ctx->Ident()->getText();
std::vector<std::string> args;
std::vector<sysy::Value*> irArgs;
if (ctx->funcRParams()) {
for (auto argCtx : ctx->funcRParams()->exp()) {
std::string arg = std::any_cast<std::string>(argCtx->accept(this));
args.push_back(arg);
irArgs.push_back(irTmpTable[arg]);
}
}
std::string temp = getNextTemp();
std::string argList;
for (size_t i = 0; i < args.size(); ++i) {
if (i > 0) argList += ", ";
argList += tmpTable[args[i]] + " noundef " + args[i];
}
// 文本 IR
irStream << " " << temp << " = call " << currentReturnType << " @" << funcName << "(" << argList << ")\n";
tmpTable[temp] = currentReturnType;
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
sysy::Function* callee = module->getFunction(funcName);
if (!callee) {
throw std::runtime_error("Undefined function: " + funcName);
}
auto callInst = builder.createCallInst(callee, irArgs, temp);
irTmpTable[temp] = callInst;
return temp;
}
std::any LLVMIRGenerator::visitMulExp(SysYParser::MulExpContext* ctx) {
auto unaryExps = ctx->unaryExp();
std::string left = std::any_cast<std::string>(unaryExps[0]->accept(this));
sysy::Value* irLeft = irTmpTable[left];
sysy::Type* irType = irLeft->getType();
for (size_t i = 1; i < unaryExps.size(); ++i) {
std::string right = std::any_cast<std::string>(unaryExps[i]->accept(this));
sysy::Value* irRight = irTmpTable[right];
std::string op = ctx->children[2 * i - 1]->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[left];
tmpTable[temp] = type;
// 文本 IR
if (op == "*") {
irStream << " " << temp << " = mul nsw " << type << " " << left << ", " << right << "\n";
} else if (op == "/") {
irStream << " " << temp << " = sdiv " << type << " " << left << ", " << right << "\n";
} else if (op == "%") {
irStream << " " << temp << " = srem " << type << " " << left << ", " << right << "\n";
}
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
sysy::Instruction::Kind kind;
if (type == "i32") {
if (op == "*") kind = sysy::Instruction::kMul;
else if (op == "/") kind = sysy::Instruction::kDiv;
else kind = sysy::Instruction::kRem;
} else {
if (op == "*") kind = sysy::Instruction::kFMul;
else if (op == "/") kind = sysy::Instruction::kFDiv;
else kind = sysy::Instruction::kFRem;
}
auto binaryInst = builder.createBinaryInst(kind, irType, irLeft, irRight, temp);
irTmpTable[temp] = binaryInst;
left = temp;
irLeft = binaryInst;
}
return left;
}
std::any LLVMIRGenerator::visitAddExp(SysYParser::AddExpContext* ctx) {
auto mulExps = ctx->mulExp();
std::string left = std::any_cast<std::string>(mulExps[0]->accept(this));
sysy::Value* irLeft = irTmpTable[left];
sysy::Type* irType = irLeft->getType();
for (size_t i = 1; i < mulExps.size(); ++i) {
std::string right = std::any_cast<std::string>(mulExps[i]->accept(this));
sysy::Value* irRight = irTmpTable[right];
std::string op = ctx->children[2 * i - 1]->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[left];
tmpTable[temp] = type;
// 文本 IR
if (op == "+") {
irStream << " " << temp << " = add nsw " << type << " " << left << ", " << right << "\n";
} else if (op == "-") {
irStream << " " << temp << " = sub nsw " << type << " " << left << ", " << right << "\n";
}
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
sysy::Instruction::Kind kind = (type == "i32") ? (op == "+" ? sysy::Instruction::kAdd : sysy::Instruction::kSub)
: (op == "+" ? sysy::Instruction::kFAdd : sysy::Instruction::kFSub);
auto binaryInst = builder.createBinaryInst(kind, irType, irLeft, irRight, temp);
irTmpTable[temp] = binaryInst;
left = temp;
irLeft = binaryInst;
}
return left;
}
std::any LLVMIRGenerator::visitRelExp(SysYParser::RelExpContext* ctx) {
auto addExps = ctx->addExp();
std::string left = std::any_cast<std::string>(addExps[0]->accept(this));
sysy::Value* irLeft = irTmpTable[left];
sysy::Type* irType = sysy::Type::getIntType(); // 比较结果为 i1
for (size_t i = 1; i < addExps.size(); ++i) {
std::string right = std::any_cast<std::string>(addExps[i]->accept(this));
sysy::Value* irRight = irTmpTable[right];
std::string op = ctx->children[2 * i - 1]->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[left];
tmpTable[temp] = "i1";
// 文本 IR
if (op == "<") {
irStream << " " << temp << " = icmp slt " << type << " " << left << ", " << right << "\n";
} else if (op == ">") {
irStream << " " << temp << " = icmp sgt " << type << " " << left << ", " << right << "\n";
} else if (op == "<=") {
irStream << " " << temp << " = icmp sle " << type << " " << left << ", " << right << "\n";
} else if (op == ">=") {
irStream << " " << temp << " = icmp sge " << type << " " << left << ", " << right << "\n";
}
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
sysy::Instruction::Kind kind;
if (type == "i32") {
if (op == "<") kind = sysy::Instruction::kICmpLT;
else if (op == ">") kind = sysy::Instruction::kICmpGT;
else if (op == "<=") kind = sysy::Instruction::kICmpLE;
else kind = sysy::Instruction::kICmpGE;
} else {
if (op == "<") kind = sysy::Instruction::kFCmpLT;
else if (op == ">") kind = sysy::Instruction::kFCmpGT;
else if (op == "<=") kind = sysy::Instruction::kFCmpLE;
else kind = sysy::Instruction::kFCmpGE;
}
auto cmpInst = builder.createBinaryInst(kind, irType, irLeft, irRight, temp);
irTmpTable[temp] = cmpInst;
left = temp;
irLeft = cmpInst;
}
return left;
}
std::any LLVMIRGenerator::visitEqExp(SysYParser::EqExpContext* ctx) {
auto relExps = ctx->relExp();
std::string left = std::any_cast<std::string>(relExps[0]->accept(this));
sysy::Value* irLeft = irTmpTable[left];
sysy::Type* irType = sysy::Type::getIntType(); // 比较结果为 i1
for (size_t i = 1; i < relExps.size(); ++i) {
std::string right = std::any_cast<std::string>(relExps[i]->accept(this));
sysy::Value* irRight = irTmpTable[right];
std::string op = ctx->children[2 * i - 1]->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[left];
tmpTable[temp] = "i1";
// 文本 IR
if (op == "==") {
irStream << " " << temp << " = icmp eq " << type << " " << left << ", " << right << "\n";
} else if (op == "!=") {
irStream << " " << temp << " = icmp ne " << type << " " << left << ", " << right << "\n";
}
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
sysy::Instruction::Kind kind = (type == "i32") ? (op == "==" ? sysy::Instruction::kICmpEQ : sysy::Instruction::kICmpNE)
: (op == "==" ? sysy::Instruction::kFCmpEQ : sysy::Instruction::kFCmpNE);
auto cmpInst = builder.createBinaryInst(kind, irType, irLeft, irRight, temp);
irTmpTable[temp] = cmpInst;
left = temp;
irLeft = cmpInst;
}
return left;
}
std::any LLVMIRGenerator::visitLAndExp(SysYParser::LAndExpContext* ctx) {
auto eqExps = ctx->eqExp();
std::string left = std::any_cast<std::string>(eqExps[0]->accept(this));
sysy::Value* irLeft = irTmpTable[left];
for (size_t i = 1; i < eqExps.size(); ++i) {
std::string falseLabel = "land.false." + std::to_string(tempCounter);
std::string endLabel = "land.end." + std::to_string(tempCounter++);
sysy::BasicBlock* falseBlock = currentIRFunction->addBasicBlock(falseLabel);
sysy::BasicBlock* endBlock = currentIRFunction->addBasicBlock(endLabel);
std::string temp = getNextTemp();
tmpTable[temp] = "i1";
// 文本 IR
irStream << " br i1 " << left << ", label %" << falseLabel << ", label %" << endLabel << "\n";
irStream << falseLabel << ":\n";
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
builder.createCondBrInst(irLeft, falseBlock, endBlock, {}, {});
setIRPosition(falseBlock);
std::string right = std::any_cast<std::string>(eqExps[i]->accept(this));
sysy::Value* irRight = irTmpTable[right];
irStream << " " << temp << " = and i1 " << left << ", " << right << "\n";
irStream << " br label %" << endLabel << "\n";
irStream << endLabel << ":\n";
// SysY IR 逻辑与(通过基本块实现短路求值)
builder.setPosition(falseBlock, falseBlock->end());
auto andInst = builder.createBinaryInst(sysy::Instruction::kICmpEQ, sysy::Type::getIntType(), irLeft, irRight, temp);
builder.createUncondBrInst(endBlock, {});
irTmpTable[temp] = andInst;
left = temp;
irLeft = andInst;
setIRPosition(endBlock);
}
return left;
}
std::any LLVMIRGenerator::visitLOrExp(SysYParser::LOrExpContext* ctx) {
auto lAndExps = ctx->lAndExp();
std::string left = std::any_cast<std::string>(lAndExps[0]->accept(this));
sysy::Value* irLeft = irTmpTable[left];
for (size_t i = 1; i < lAndExps.size(); ++i) {
std::string trueLabel = "lor.true." + std::to_string(tempCounter);
std::string endLabel = "lor.end." + std::to_string(tempCounter++);
sysy::BasicBlock* trueBlock = currentIRFunction->addBasicBlock(trueLabel);
sysy::BasicBlock* endBlock = currentIRFunction->addBasicBlock(endLabel);
std::string temp = getNextTemp();
tmpTable[temp] = "i1";
// 文本 IR
irStream << " br i1 " << left << ", label %" << trueLabel << ", label %" << endLabel << "\n";
irStream << trueLabel << ":\n";
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
builder.createCondBrInst(irLeft, trueBlock, endBlock, {}, {});
setIRPosition(trueBlock);
std::string right = std::any_cast<std::string>(lAndExps[i]->accept(this));
sysy::Value* irRight = irTmpTable[right];
irStream << " " << temp << " = or i1 " << left << ", " << right << "\n";
irStream << " br label %" << endLabel << "\n";
irStream << endLabel << ":\n";
// SysY IR 逻辑或(通过基本块实现短路求值)
builder.setPosition(trueBlock, trueBlock->end());
auto orInst = builder.createBinaryInst(sysy::Instruction::kICmpEQ, sysy::Type::getIntType(), irLeft, irRight, temp);
builder.createUncondBrInst(endBlock, {});
irTmpTable[temp] = orInst;
left = temp;
irLeft = orInst;
setIRPosition(endBlock);
}
return left;
}
// } // namespace sysy

0
src/Mem2Reg.cpp Normal file
View File

157
src/RISCv32Backend.cpp Normal file
View File

@ -0,0 +1,157 @@
#include "RISCv32Backend.h"
#include <sstream>
#include <algorithm>
namespace sysy {
const std::vector<RISCv32CodeGen::PhysicalReg> RISCv32CodeGen::allocable_regs = {
PhysicalReg::T0, PhysicalReg::T1, PhysicalReg::T2, PhysicalReg::T3,
PhysicalReg::T4, PhysicalReg::T5, PhysicalReg::T6,
PhysicalReg::A0, PhysicalReg::A1, PhysicalReg::A2, PhysicalReg::A3,
PhysicalReg::A4, PhysicalReg::A5, PhysicalReg::A6, PhysicalReg::A7
};
std::string RISCv32CodeGen::reg_to_string(PhysicalReg reg) {
switch (reg) {
case PhysicalReg::T0: return "t0"; case PhysicalReg::T1: return "t1";
case PhysicalReg::T2: return "t2"; case PhysicalReg::T3: return "t3";
case PhysicalReg::T4: return "t4"; case PhysicalReg::T5: return "t5";
case PhysicalReg::T6: return "t6"; case PhysicalReg::A0: return "a0";
case PhysicalReg::A1: return "a1"; case PhysicalReg::A2: return "a2";
case PhysicalReg::A3: return "a3"; case PhysicalReg::A4: return "a4";
case PhysicalReg::A5: return "a5"; case PhysicalReg::A6: return "a6";
case PhysicalReg::A7: return "a7";
default: return "";
}
}
std::string RISCv32CodeGen::code_gen() {
std::stringstream ss;
ss << ".text\n";
ss << module_gen();
return ss.str();
}
std::string RISCv32CodeGen::module_gen() {
std::stringstream ss;
// 生成全局变量(数据段)
for (const auto& global : module->getGlobals()) {
ss << ".data\n";
ss << ".globl " << global->getName() << "\n";
ss << global->getName() << ":\n";
ss << " .word 0\n"; // 假设初始化为0
}
// 生成函数(文本段)
ss << ".text\n";
for (const auto& func : module->getFunctions()) {
ss << function_gen(func.second.get());
}
return ss.str();
}
std::string RISCv32CodeGen::function_gen(Function* func) {
std::stringstream ss;
// 函数标签
ss << ".globl " << func->getName() << "\n";
ss << func->getName() << ":\n";
// 序言:保存 ra分配堆栈
bool is_leaf = true; // 简化假设
ss << " addi sp, sp, -16\n";
ss << " sw ra, 12(sp)\n";
// 寄存器分配
auto alloc = register_allocation(func);
// 生成基本块代码
for (const auto& bb : func->getBasicBlocks()) {
ss << basicBlock_gen(bb.get(), alloc);
}
// 结尾:恢复 ra释放堆栈
ss << " lw ra, 12(sp)\n";
ss << " addi sp, sp, 16\n";
ss << " ret\n";
return ss.str();
}
std::string RISCv32CodeGen::basicBlock_gen(BasicBlock* bb, const RegAllocResult& alloc) {
std::stringstream ss;
ss << bb->getName() << ":\n";
for (const auto& inst : bb->getInstructions()) {
auto riscv_insts = instruction_gen(inst.get());
for (const auto& riscv_inst : riscv_insts) {
ss << " " << riscv_inst.opcode;
for (size_t i = 0; i < riscv_inst.operands.size(); ++i) {
if (i > 0) ss << ", ";
if (riscv_inst.operands[i].kind == Operand::Kind::Reg) {
auto it = alloc.reg_map.find(riscv_inst.operands[i].value);
if (it != alloc.reg_map.end()) {
ss << reg_to_string(it->second);
} else {
auto stack_it = alloc.stack_map.find(riscv_inst.operands[i].value);
if (stack_it != alloc.stack_map.end()) {
ss << stack_it->second << "(sp)";
} else {
ss << "%" << riscv_inst.operands[i].value->getName();
}
}
} else if (riscv_inst.operands[i].kind == Operand::Kind::Imm) {
ss << riscv_inst.operands[i].label;
} else {
ss << riscv_inst.operands[i].label;
}
}
ss << "\n";
}
}
return ss.str();
}
std::vector<RISCv32CodeGen::RISCv32Inst> RISCv32CodeGen::instruction_gen(Instruction* inst) {
std::vector<RISCv32Inst> insts;
if (auto bin = dynamic_cast<BinaryInst*>(inst)) {
std::string opcode;
if (bin->getKind() == BinaryInst::kAdd) opcode = "add";
else if (bin->getKind() == BinaryInst::kSub) opcode = "sub";
else if (bin->getKind() == BinaryInst::kMul) opcode = "mul";
else return insts; // 其他操作未实现
insts.emplace_back(opcode, std::vector<Operand>{
{Operand::Kind::Reg, bin},
{Operand::Kind::Reg, bin->getLhs()},
{Operand::Kind::Reg, bin->getRhs()}
});
} else if (auto load = dynamic_cast<LoadInst*>(inst)) {
insts.emplace_back("lw", std::vector<Operand>{
{Operand::Kind::Reg, load},
{Operand::Kind::Label, load->getPointer()->getName()}
});
} else if (auto store = dynamic_cast<StoreInst*>(inst)) {
insts.emplace_back("sw", std::vector<Operand>{
{Operand::Kind::Reg, store->getValue()},
{Operand::Kind::Label, store->getPointer()->getName()}
});
}
return insts;
}
void RISCv32CodeGen::eliminate_phi(Function* func) {
// TODO: 实现 phi 指令消除
}
std::map<Instruction*, std::set<Value*>> RISCv32CodeGen::liveness_analysis(Function* func) {
std::map<Instruction*, std::set<Value*>> live_sets;
// TODO: 实现活跃性分析
return live_sets;
}
std::map<Value*, std::set<Value*>> RISCv32CodeGen::build_interference_graph(
const std::map<Instruction*, std::set<Value*>>& live_sets) {
std::map<Value*, std::set<Value*>> graph;
// TODO: 实现干扰图构建
return graph;
}
RISCv32CodeGen::RegAllocResult RISCv32CodeGen::register_allocation(Function* func) {
RegAllocResult result;
// TODO: 实现寄存器分配
return result;
}
} // namespace sysy

67
src/RISCv32Backend.h Normal file
View File

@ -0,0 +1,67 @@
#ifndef RISCV32_BACKEND_H
#define RISCV32_BACKEND_H
#include "IR.h"
#include <string>
#include <vector>
#include <map>
#include <set>
namespace sysy {
class RISCv32CodeGen {
public:
explicit RISCv32CodeGen(Module* mod) : module(mod) {}
std::string code_gen(); // 生成模块的汇编代码
private:
Module* module;
// 物理寄存器
enum class PhysicalReg {
T0, T1, T2, T3, T4, T5, T6, // x5-x7, x28-x31
A0, A1, A2, A3, A4, A5, A6, A7 // x10-x17
};
static const std::vector<PhysicalReg> allocable_regs;
// 操作数
struct Operand {
enum class Kind { Reg, Imm, Label };
Kind kind;
Value* value; // 用于寄存器
std::string label; // 用于标签或立即数
Operand(Kind k, Value* v) : kind(k), value(v), label("") {}
Operand(Kind k, const std::string& l) : kind(k), value(nullptr), label(l) {}
};
// RISC-V 指令
struct RISCv32Inst {
std::string opcode;
std::vector<Operand> operands;
RISCv32Inst(const std::string& op, const std::vector<Operand>& ops)
: opcode(op), operands(ops) {}
};
// 寄存器分配结果
struct RegAllocResult {
std::map<Value*, PhysicalReg> reg_map; // 虚拟寄存器到物理寄存器的映射
std::map<Value*, int> stack_map; // 虚拟寄存器到堆栈槽的映射
int stack_size; // 堆栈帧大小
};
// 后端方法
std::string module_gen();
std::string function_gen(Function* func);
std::string basicBlock_gen(BasicBlock* bb, const RegAllocResult& alloc);
std::vector<RISCv32Inst> instruction_gen(Instruction* inst);
RegAllocResult register_allocation(Function* func);
void eliminate_phi(Function* func);
std::map<Instruction*, std::set<Value*>> liveness_analysis(Function* func);
std::map<Value*, std::set<Value*>> build_interference_graph(
const std::map<Instruction*, std::set<Value*>>& live_sets);
std::string reg_to_string(PhysicalReg reg);
};
} // namespace sysy
#endif // RISCV32_BACKEND_H

View File

@ -101,7 +101,10 @@ BLOCKCOMMENT: '/*' .*? '*/' -> skip;
// CompUnit: (CompUnit)? (decl |funcDef);
compUnit: (decl |funcDef)+;
compUnit: (globalDecl |funcDef)+;
globalDecl: constDecl # globalConstDecl
| varDecl # globalVarDecl;
decl: constDecl | varDecl;
@ -111,16 +114,16 @@ bType: INT | FLOAT;
constDef: Ident (LBRACK constExp RBRACK)* ASSIGN constInitVal;
constInitVal: constExp
| LBRACE (constInitVal (COMMA constInitVal)*)? RBRACE;
constInitVal: constExp # constScalarInitValue
| LBRACE (constInitVal (COMMA constInitVal)*)? RBRACE # constArrayInitValue;
varDecl: bType varDef (COMMA varDef)* SEMICOLON;
varDef: Ident (LBRACK constExp RBRACK)*
| Ident (LBRACK constExp RBRACK)* ASSIGN initVal;
initVal: exp
| LBRACE (initVal (COMMA initVal)*)? RBRACE;
initVal: exp # scalarInitValue
| LBRACE (initVal (COMMA initVal)*)? RBRACE # arrayInitValue;
funcType: VOID | INT | FLOAT;
@ -136,14 +139,14 @@ blockStmt: LBRACE blockItem* RBRACE;
blockItem: decl | stmt;
stmt: lValue ASSIGN exp SEMICOLON
| exp? SEMICOLON
| blockStmt
| IF LPAREN cond RPAREN stmt (ELSE stmt)?
| WHILE LPAREN cond RPAREN stmt
| BREAK SEMICOLON
| CONTINUE SEMICOLON
| RETURN exp? SEMICOLON;
stmt: lValue ASSIGN exp SEMICOLON #assignStmt
| exp? SEMICOLON #expStmt
| blockStmt #blkStmt
| IF LPAREN cond RPAREN stmt (ELSE stmt)? #ifStmt
| WHILE LPAREN cond RPAREN stmt #whileStmt
| BREAK SEMICOLON #breakStmt
| CONTINUE SEMICOLON #continueStmt
| RETURN exp? SEMICOLON #returnStmt;
exp: addExp;
cond: lOrExp;
@ -156,8 +159,9 @@ primaryExp: LPAREN exp RPAREN
| string;
number: ILITERAL | FLITERAL;
unaryExp: primaryExp
| Ident LPAREN (funcRParams)? RPAREN
call: Ident LPAREN (funcRParams)? RPAREN;
unaryExp: primaryExp
| call
| unaryOp unaryExp;
unaryOp: ADD|SUB|NOT;

0
src/SysYIRAnalyser.cpp Normal file
View File

File diff suppressed because it is too large Load Diff

View File

@ -1,55 +0,0 @@
#pragma once
#include "SysYBaseVisitor.h"
#include "SysYParser.h"
#include <sstream>
#include <map>
#include <vector>
#include <stack>
class SysYIRGenerator : public SysYBaseVisitor {
public:
std::string generateIR(SysYParser::CompUnitContext* unit);
std::string getIR() const { return irStream.str(); }
private:
std::stringstream irStream;
int tempCounter = 0;
std::string currentVarType;
std::map<std::string, std::pair<std::string, std::string>> symbolTable;
std::map<std::string, std::string> tmpTable;
std::vector<std::string> globalVars;
std::string currentFunction;
std::string currentReturnType;
std::vector<std::string> breakStack;
std::vector<std::string> continueStack;
bool hasReturn = false;
struct LoopLabels {
std::string breakLabel; // break跳转的目标标签
std::string continueLabel; // continue跳转的目标标签
};
std::stack<LoopLabels> loopStack; // 用于管理循环的break和continue标签
std::string getNextTemp();
std::string getLLVMType(const std::string& type);
bool inFunction = false; // 标识当前是否处于函数内部
// 访问方法
std::any visitCompUnit(SysYParser::CompUnitContext* ctx) override;
std::any visitConstDecl(SysYParser::ConstDeclContext* ctx) override;
std::any visitVarDecl(SysYParser::VarDeclContext* ctx) override;
std::any visitVarDef(SysYParser::VarDefContext* ctx) override;
std::any visitFuncDef(SysYParser::FuncDefContext* ctx) override;
std::any visitBlockStmt(SysYParser::BlockStmtContext* ctx) override;
std::any visitStmt(SysYParser::StmtContext* ctx) override;
std::any visitLValue(SysYParser::LValueContext* ctx) override;
std::any visitPrimaryExp(SysYParser::PrimaryExpContext* ctx) override;
std::any visitNumber(SysYParser::NumberContext* ctx) override;
std::any visitUnaryExp(SysYParser::UnaryExpContext* ctx) override;
std::any visitMulExp(SysYParser::MulExpContext* ctx) override;
std::any visitAddExp(SysYParser::AddExpContext* ctx) override;
std::any visitRelExp(SysYParser::RelExpContext* ctx) override;
std::any visitEqExp(SysYParser::EqExpContext* ctx) override;
std::any visitLAndExp(SysYParser::LAndExpContext* ctx) override;
std::any visitLOrExp(SysYParser::LOrExpContext* ctx) override;
};

479
src/SysYIRPrinter.cpp Normal file
View File

@ -0,0 +1,479 @@
#include "SysYIRPrinter.h"
#include <cassert>
#include <fstream>
#include <iostream>
#include <string>
#include "IR.h"
namespace sysy {
void SysYPrinter::printIR() {
const auto &functions = pModule->getFunctions();
//TODO: Print target datalayout and triple (minimal required by LLVM)
printGlobalVariable();
for (const auto &iter : functions) {
if (iter.second->getName() == "main") {
printFunction(iter.second.get());
break;
}
}
for (const auto &iter : functions) {
if (iter.second->getName() != "main") {
printFunction(iter.second.get());
}
}
}
std::string SysYPrinter::getTypeString(Type *type) {
if (type->isVoid()) {
return "void";
} else if (type->isInt()) {
return "i32";
} else if (type->isFloat()) {
return "float";
} else if (auto ptrType = dynamic_cast<PointerType*>(type)) {
return getTypeString(ptrType->getBaseType()) + "*";
} else if (auto ptrType = dynamic_cast<FunctionType*>(type)) {
return getTypeString(ptrType->getReturnType());
}
assert(false && "Unsupported type");
return "";
}
std::string SysYPrinter::getValueName(Value *value) {
if (auto global = dynamic_cast<GlobalValue*>(value)) {
return "@" + global->getName();
} else if (auto inst = dynamic_cast<Instruction*>(value)) {
return "%" + inst->getName();
} else if (auto constVal = dynamic_cast<ConstantValue*>(value)) {
if (constVal->isFloat()) {
return std::to_string(constVal->getFloat());
}
return std::to_string(constVal->getInt());
} else if (auto constVar = dynamic_cast<ConstantVariable*>(value)) {
return constVar->getName();
}
assert(false && "Unknown value type");
return "";
}
void SysYPrinter::printType(Type *type) {
std::cout << getTypeString(type);
}
void SysYPrinter::printValue(Value *value) {
std::cout << getValueName(value);
}
void SysYPrinter::printGlobalVariable() {
auto &globals = pModule->getGlobals();
for (const auto &global : globals) {
std::cout << "@" << global->getName() << " = global ";
auto baseType = dynamic_cast<PointerType *>(global->getType())->getBaseType();
printType(baseType);
if (global->getNumDims() > 0) {
// Array type
std::cout << " [";
for (unsigned i = 0; i < global->getNumDims(); i++) {
if (i > 0) std::cout << " x ";
std::cout << getValueName(global->getDim(i));
}
std::cout << "]";
}
std::cout << " ";
if (global->getNumDims() > 0) {
// Array initializer
std::cout << "[";
auto values = global->getInitValues();
auto counterValues = values.getValues();
auto counterNumbers = values.getNumbers();
for (size_t i = 0; i < counterNumbers.size(); i++) {
if (i > 0) std::cout << ", ";
if (baseType->isFloat()) {
std::cout << "float " << dynamic_cast<ConstantValue*>(counterValues[i])->getFloat();
} else {
std::cout << "i32 " << dynamic_cast<ConstantValue*>(counterValues[i])->getInt();
}
}
std::cout << "]";
} else {
// Scalar initializer
if (baseType->isFloat()) {
std::cout << "float " << dynamic_cast<ConstantValue*>(global->getByIndex(0))->getFloat();
} else {
std::cout << "i32 " << dynamic_cast<ConstantValue*>(global->getByIndex(0))->getInt();
}
}
std::cout << ", align 4" << std::endl;
}
}
void SysYPrinter::printFunction(Function *function) {
// Function signature
std::cout << "define ";
printType(function->getReturnType());
std::cout << " @" << function->getName() << "(";
auto entryBlock = function->getEntryBlock();
auto &args = entryBlock->getArguments();
for (size_t i = 0; i < args.size(); i++) {
if (i > 0) std::cout << ", ";
printType(args[i]->getType());
std::cout << " %" << args[i]->getName();
}
std::cout << ") {" << std::endl;
// Function body
for (const auto &blockIter : function->getBasicBlocks()) {
// Basic block label
BasicBlock* blockPtr = blockIter.get();
if (blockPtr == function->getEntryBlock()) {
std::cout << "entry:" << std::endl;
} else if (!blockPtr->getName().empty()) {
std::cout << blockPtr->getName() << ":" << std::endl;
}
// Instructions
for (const auto &instIter : blockIter->getInstructions()) {
auto inst = instIter.get();
std::cout << " ";
printInst(inst);
}
}
std::cout << "}" << std::endl << std::endl;
}
void SysYPrinter::printInst(Instruction *pInst) {
using Kind = Instruction::Kind;
switch (pInst->getKind()) {
case Kind::kAdd:
case Kind::kSub:
case Kind::kMul:
case Kind::kDiv:
case Kind::kRem:
case Kind::kFAdd:
case Kind::kFSub:
case Kind::kFMul:
case Kind::kFDiv:
case Kind::kICmpEQ:
case Kind::kICmpNE:
case Kind::kICmpLT:
case Kind::kICmpGT:
case Kind::kICmpLE:
case Kind::kICmpGE:
case Kind::kFCmpEQ:
case Kind::kFCmpNE:
case Kind::kFCmpLT:
case Kind::kFCmpGT:
case Kind::kFCmpLE:
case Kind::kFCmpGE:
case Kind::kAnd:
case Kind::kOr: {
auto binInst = dynamic_cast<BinaryInst *>(pInst);
// Print result variable if exists
if (!binInst->getName().empty()) {
std::cout << "%" << binInst->getName() << " = ";
}
// Operation name
switch (pInst->getKind()) {
case Kind::kAdd: std::cout << "add"; break;
case Kind::kSub: std::cout << "sub"; break;
case Kind::kMul: std::cout << "mul"; break;
case Kind::kDiv: std::cout << "sdiv"; break;
case Kind::kRem: std::cout << "srem"; break;
case Kind::kFAdd: std::cout << "fadd"; break;
case Kind::kFSub: std::cout << "fsub"; break;
case Kind::kFMul: std::cout << "fmul"; break;
case Kind::kFDiv: std::cout << "fdiv"; break;
case Kind::kICmpEQ: std::cout << "icmp eq"; break;
case Kind::kICmpNE: std::cout << "icmp ne"; break;
case Kind::kICmpLT: std::cout << "icmp slt"; break;
case Kind::kICmpGT: std::cout << "icmp sgt"; break;
case Kind::kICmpLE: std::cout << "icmp sle"; break;
case Kind::kICmpGE: std::cout << "icmp sge"; break;
case Kind::kFCmpEQ: std::cout << "fcmp oeq"; break;
case Kind::kFCmpNE: std::cout << "fcmp one"; break;
case Kind::kFCmpLT: std::cout << "fcmp olt"; break;
case Kind::kFCmpGT: std::cout << "fcmp ogt"; break;
case Kind::kFCmpLE: std::cout << "fcmp ole"; break;
case Kind::kFCmpGE: std::cout << "fcmp oge"; break;
case Kind::kAnd: std::cout << "and"; break;
case Kind::kOr: std::cout << "or"; break;
default: break;
}
// Types and operands
std::cout << " ";
printType(binInst->getType());
std::cout << " ";
printValue(binInst->getLhs());
std::cout << ", ";
printValue(binInst->getRhs());
std::cout << std::endl;
} break;
case Kind::kNeg:
case Kind::kNot:
case Kind::kFNeg:
case Kind::kFNot:
case Kind::kFtoI:
case Kind::kBitFtoI:
case Kind::kItoF:
case Kind::kBitItoF: {
auto unyInst = dynamic_cast<UnaryInst *>(pInst);
if (!unyInst->getName().empty()) {
std::cout << "%" << unyInst->getName() << " = ";
}
switch (pInst->getKind()) {
case Kind::kNeg: std::cout << "sub "; break;
case Kind::kNot: std::cout << "xor "; break;
case Kind::kFNeg: std::cout << "fneg "; break;
case Kind::kFNot: std::cout << "fneg "; break; // FNot not standard, map to fneg
case Kind::kFtoI: std::cout << "fptosi "; break;
case Kind::kBitFtoI: std::cout << "bitcast "; break;
case Kind::kItoF: std::cout << "sitofp "; break;
case Kind::kBitItoF: std::cout << "bitcast "; break;
default: break;
}
printType(unyInst->getType());
std::cout << " ";
// Special handling for negation
if (pInst->getKind() == Kind::kNeg || pInst->getKind() == Kind::kNot) {
std::cout << "i32 0, ";
}
printValue(pInst->getOperand(0));
// For bitcast, need to specify destination type
if (pInst->getKind() == Kind::kBitFtoI || pInst->getKind() == Kind::kBitItoF) {
std::cout << " to ";
printType(unyInst->getType());
}
std::cout << std::endl;
} break;
case Kind::kCall: {
auto callInst = dynamic_cast<CallInst *>(pInst);
auto function = callInst->getCallee();
if (!callInst->getName().empty()) {
std::cout << "%" << callInst->getName() << " = ";
}
std::cout << "call ";
printType(callInst->getType());
std::cout << " @" << function->getName() << "(";
auto params = callInst->getArguments();
bool first = true;
for (auto &param : params) {
if (!first) std::cout << ", ";
first = false;
printType(param->getValue()->getType());
std::cout << " ";
printValue(param->getValue());
}
std::cout << ")" << std::endl;
} break;
case Kind::kCondBr: {
auto condBrInst = dynamic_cast<CondBrInst *>(pInst);
std::cout << "br i1 ";
printValue(condBrInst->getCondition());
std::cout << ", label %" << condBrInst->getThenBlock()->getName();
std::cout << ", label %" << condBrInst->getElseBlock()->getName();
std::cout << std::endl;
} break;
case Kind::kBr: {
auto brInst = dynamic_cast<UncondBrInst *>(pInst);
std::cout << "br label %" << brInst->getBlock()->getName();
std::cout << std::endl;
} break;
case Kind::kReturn: {
auto retInst = dynamic_cast<ReturnInst *>(pInst);
std::cout << "ret ";
if (retInst->getNumOperands() != 0) {
printType(retInst->getOperand(0)->getType());
std::cout << " ";
printValue(retInst->getOperand(0));
} else {
std::cout << "void";
}
std::cout << std::endl;
} break;
case Kind::kAlloca: {
auto allocaInst = dynamic_cast<AllocaInst *>(pInst);
std::cout << "%" << allocaInst->getName() << " = alloca ";
auto baseType = dynamic_cast<PointerType *>(allocaInst->getType())->getBaseType();
printType(baseType);
if (allocaInst->getNumDims() > 0) {
std::cout << ", ";
for (size_t i = 0; i < allocaInst->getNumDims(); i++) {
if (i > 0) std::cout << ", ";
printType(Type::getIntType());
std::cout << " ";
printValue(allocaInst->getDim(i));
}
}
std::cout << ", align 4" << std::endl;
} break;
case Kind::kLoad: {
auto loadInst = dynamic_cast<LoadInst *>(pInst);
std::cout << "%" << loadInst->getName() << " = load ";
printType(loadInst->getType());
std::cout << ", ";
printType(loadInst->getPointer()->getType());
std::cout << " ";
printValue(loadInst->getPointer());
if (loadInst->getNumIndices() > 0) {
std::cout << ", ";
for (size_t i = 0; i < loadInst->getNumIndices(); i++) {
if (i > 0) std::cout << ", ";
printType(Type::getIntType());
std::cout << " ";
printValue(loadInst->getIndex(i));
}
}
std::cout << ", align 4" << std::endl;
} break;
case Kind::kLa: {
auto laInst = dynamic_cast<LaInst *>(pInst);
std::cout << "%" << laInst->getName() << " = getelementptr inbounds ";
auto ptrType = dynamic_cast<PointerType*>(laInst->getPointer()->getType());
printType(ptrType->getBaseType());
std::cout << ", ";
printType(laInst->getPointer()->getType());
std::cout << " ";
printValue(laInst->getPointer());
std::cout << ", ";
for (size_t i = 0; i < laInst->getNumIndices(); i++) {
if (i > 0) std::cout << ", ";
printType(Type::getIntType());
std::cout << " ";
printValue(laInst->getIndex(i));
}
std::cout << std::endl;
} break;
case Kind::kStore: {
auto storeInst = dynamic_cast<StoreInst *>(pInst);
std::cout << "store ";
printType(storeInst->getValue()->getType());
std::cout << " ";
printValue(storeInst->getValue());
std::cout << ", ";
printType(storeInst->getPointer()->getType());
std::cout << " ";
printValue(storeInst->getPointer());
if (storeInst->getNumIndices() > 0) {
std::cout << ", ";
for (size_t i = 0; i < storeInst->getNumIndices(); i++) {
if (i > 0) std::cout << ", ";
printType(Type::getIntType());
std::cout << " ";
printValue(storeInst->getIndex(i));
}
}
std::cout << ", align 4" << std::endl;
} break;
case Kind::kMemset: {
auto memsetInst = dynamic_cast<MemsetInst *>(pInst);
std::cout << "call void @llvm.memset.p0.";
printType(memsetInst->getPointer()->getType());
std::cout << "(";
printType(memsetInst->getPointer()->getType());
std::cout << " ";
printValue(memsetInst->getPointer());
std::cout << ", i8 ";
printValue(memsetInst->getValue());
std::cout << ", i32 ";
printValue(memsetInst->getSize());
std::cout << ", i1 false)" << std::endl;
} break;
case Kind::kPhi: {
auto phiInst = dynamic_cast<PhiInst *>(pInst);
std::cout << "%" << phiInst->getName() << " = phi ";
printType(phiInst->getType());
for (unsigned i = 0; i < phiInst->getNumOperands(); i += 2) {
if (i > 0) std::cout << ", ";
std::cout << "[ ";
printValue(phiInst->getOperand(i));
std::cout << ", %" << dynamic_cast<BasicBlock*>(phiInst->getOperand(i+1))->getName() << " ]";
}
std::cout << std::endl;
} break;
case Kind::kGetSubArray: {
auto getSubArrayInst = dynamic_cast<GetSubArrayInst *>(pInst);
std::cout << "%" << getSubArrayInst->getName() << " = getelementptr inbounds ";
auto ptrType = dynamic_cast<PointerType*>(getSubArrayInst->getFatherArray()->getType());
printType(ptrType->getBaseType());
std::cout << ", ";
printType(getSubArrayInst->getFatherArray()->getType());
std::cout << " ";
printValue(getSubArrayInst->getFatherArray());
std::cout << ", ";
bool firstIndex = true;
for (auto &index : getSubArrayInst->getIndices()) {
if (!firstIndex) std::cout << ", ";
firstIndex = false;
printType(Type::getIntType());
std::cout << " ";
printValue(index->getValue());
}
std::cout << std::endl;
} break;
default:
assert(false && "Unsupported instruction kind");
break;
}
}
} // namespace sysy

1711
src/include/IR.h Normal file

File diff suppressed because it is too large Load Diff

349
src/include/IRBuilder.h Normal file
View File

@ -0,0 +1,349 @@
#pragma once
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include "IR.h"
/**
* @file IRBuilder.h
*
* @brief 定义IR构建器的头文件
*/
namespace sysy {
/**
* @brief 中间IR的构建器
*
*/
class IRBuilder {
private:
unsigned labelIndex; ///< 基本块标签编号
unsigned tmpIndex; ///< 临时变量编号
BasicBlock *block; ///< 当前基本块
BasicBlock::iterator position; ///< 当前基本块指令列表位置的迭代器
std::vector<BasicBlock *> trueBlocks; ///< true分支基本块列表
std::vector<BasicBlock *> falseBlocks; ///< false分支基本块列表
std::vector<BasicBlock *> breakBlocks; ///< break目标块列表
std::vector<BasicBlock *> continueBlocks; ///< continue目标块列表
public:
IRBuilder() : labelIndex(0), tmpIndex(0), block(nullptr) {}
explicit IRBuilder(BasicBlock *block) : labelIndex(0), tmpIndex(0), block(block), position(block->end()) {}
IRBuilder(BasicBlock *block, BasicBlock::iterator position)
: labelIndex(0), tmpIndex(0), block(block), position(position) {}
public:
unsigned getLabelIndex() {
labelIndex += 1;
return labelIndex - 1;
} ///< 获取基本块标签编号
unsigned getTmpIndex() {
tmpIndex += 1;
return tmpIndex - 1;
} ///< 获取临时变量编号
BasicBlock * getBasicBlock() const { return block; } ///< 获取当前基本块
BasicBlock * getBreakBlock() const { return breakBlocks.back(); } ///< 获取break目标块
BasicBlock * popBreakBlock() {
auto result = breakBlocks.back();
breakBlocks.pop_back();
return result;
} ///< 弹出break目标块
BasicBlock * getContinueBlock() const { return continueBlocks.back(); } ///< 获取continue目标块
BasicBlock * popContinueBlock() {
auto result = continueBlocks.back();
continueBlocks.pop_back();
return result;
} ///< 弹出continue目标块
BasicBlock * getTrueBlock() const { return trueBlocks.back(); } ///< 获取true分支基本块
BasicBlock * getFalseBlock() const { return falseBlocks.back(); } ///< 获取false分支基本块
BasicBlock * popTrueBlock() {
auto result = trueBlocks.back();
trueBlocks.pop_back();
return result;
} ///< 弹出true分支基本块
BasicBlock * popFalseBlock() {
auto result = falseBlocks.back();
falseBlocks.pop_back();
return result;
} ///< 弹出false分支基本块
BasicBlock::iterator getPosition() const { return position; } ///< 获取当前基本块指令列表位置的迭代器
void setPosition(BasicBlock *block, BasicBlock::iterator position) {
this->block = block;
this->position = position;
} ///< 设置基本块和基本块指令列表位置的迭代器
void setPosition(BasicBlock::iterator position) {
this->position = position;
} ///< 设置当前基本块指令列表位置的迭代器
void pushBreakBlock(BasicBlock *block) { breakBlocks.push_back(block); } ///< 压入break目标基本块
void pushContinueBlock(BasicBlock *block) { continueBlocks.push_back(block); } ///< 压入continue目标基本块
void pushTrueBlock(BasicBlock *block) { trueBlocks.push_back(block); } ///< 压入true分支基本块
void pushFalseBlock(BasicBlock *block) { falseBlocks.push_back(block); } ///< 压入false分支基本块
public:
Instruction * insertInst(Instruction *inst) {
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
} ///< 插入指令
UnaryInst * createUnaryInst(Instruction::Kind kind, Type *type, Value *operand, const std::string &name = "") {
std::string newName;
if (name.empty()) {
std::stringstream ss;
ss << tmpIndex;
newName = ss.str();
tmpIndex++;
} else {
newName = name;
}
auto inst = new UnaryInst(kind, type, operand, block, newName);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
} ///< 创建一元指令
UnaryInst * createNegInst(Value *operand, const std::string &name = "") {
return createUnaryInst(Instruction::kNeg, Type::getIntType(), operand, name);
} ///< 创建取反指令
UnaryInst * createNotInst(Value *operand, const std::string &name = "") {
return createUnaryInst(Instruction::kNot, Type::getIntType(), operand, name);
} ///< 创建取非指令
UnaryInst * createFtoIInst(Value *operand, const std::string &name = "") {
return createUnaryInst(Instruction::kFtoI, Type::getIntType(), operand, name);
} ///< 创建浮点转整型指令
UnaryInst * createBitFtoIInst(Value *operand, const std::string &name = "") {
return createUnaryInst(Instruction::kBitFtoI, Type::getIntType(), operand, name);
} ///< 创建按位浮点转整型指令
UnaryInst * createFNegInst(Value *operand, const std::string &name = "") {
return createUnaryInst(Instruction::kFNeg, Type::getFloatType(), operand, name);
} ///< 创建浮点取反指令
UnaryInst * createFNotInst(Value *operand, const std::string &name = "") {
return createUnaryInst(Instruction::kFNot, Type::getIntType(), operand, name);
} ///< 创建浮点取非指令
UnaryInst * createIToFInst(Value *operand, const std::string &name = "") {
return createUnaryInst(Instruction::kItoF, Type::getFloatType(), operand, name);
} ///< 创建整型转浮点指令
UnaryInst * createBitItoFInst(Value *operand, const std::string &name = "") {
return createUnaryInst(Instruction::kBitItoF, Type::getFloatType(), operand, name);
} ///< 创建按位整型转浮点指令
BinaryInst * createBinaryInst(Instruction::Kind kind, Type *type, Value *lhs, Value *rhs, const std::string &name = "") {
std::string newName;
if (name.empty()) {
std::stringstream ss;
ss << tmpIndex;
newName = ss.str();
tmpIndex++;
} else {
newName = name;
}
auto inst = new BinaryInst(kind, type, lhs, rhs, block, newName);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
} ///< 创建二元指令
BinaryInst * createAddInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kAdd, Type::getIntType(), lhs, rhs, name);
} ///< 创建加法指令
BinaryInst * createSubInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kSub, Type::getIntType(), lhs, rhs, name);
} ///< 创建减法指令
BinaryInst * createMulInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kMul, Type::getIntType(), lhs, rhs, name);
} ///< 创建乘法指令
BinaryInst * createDivInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kDiv, Type::getIntType(), lhs, rhs, name);
} ///< 创建除法指令
BinaryInst * createRemInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kRem, Type::getIntType(), lhs, rhs, name);
} ///< 创建取余指令
BinaryInst * createICmpEQInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kICmpEQ, Type::getIntType(), lhs, rhs, name);
} ///< 创建相等设置指令
BinaryInst * createICmpNEInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kICmpNE, Type::getIntType(), lhs, rhs, name);
} ///< 创建不相等设置指令
BinaryInst * createICmpLTInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kICmpLT, Type::getIntType(), lhs, rhs, name);
} ///< 创建小于设置指令
BinaryInst * createICmpLEInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kICmpLE, Type::getIntType(), lhs, rhs, name);
} ///< 创建小于等于设置指令
BinaryInst * createICmpGTInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kICmpGT, Type::getIntType(), lhs, rhs, name);
} ///< 创建大于设置指令
BinaryInst * createICmpGEInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kICmpGE, Type::getIntType(), lhs, rhs, name);
} ///< 创建大于等于设置指令
BinaryInst * createFAddInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kFAdd, Type::getFloatType(), lhs, rhs, name);
} ///< 创建浮点加法指令
BinaryInst * createFSubInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kFSub, Type::getFloatType(), lhs, rhs, name);
} ///< 创建浮点减法指令
BinaryInst * createFMulInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kFMul, Type::getFloatType(), lhs, rhs, name);
} ///< 创建浮点乘法指令
BinaryInst * createFDivInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kFDiv, Type::getFloatType(), lhs, rhs, name);
} ///< 创建浮点除法指令
BinaryInst * createFCmpEQInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kFCmpEQ, Type::getIntType(), lhs, rhs, name);
} ///< 创建浮点相等设置指令
BinaryInst * createFCmpNEInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kFCmpNE, Type::getIntType(), lhs, rhs, name);
} ///< 创建浮点不相等设置指令
BinaryInst * createFCmpLTInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kFCmpLT, Type::getIntType(), lhs, rhs, name);
} ///< 创建浮点小于设置指令
BinaryInst * createFCmpLEInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kFCmpLE, Type::getIntType(), lhs, rhs, name);
} ///< 创建浮点小于等于设置指令
BinaryInst * createFCmpGTInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kFCmpGT, Type::getIntType(), lhs, rhs, name);
} ///< 创建浮点大于设置指令
BinaryInst * createFCmpGEInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kFCmpGE, Type::getIntType(), lhs, rhs, name);
} ///< 创建浮点相大于等于设置指令
BinaryInst * createAndInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kAnd, Type::getIntType(), lhs, rhs, name);
} ///< 创建按位且指令
BinaryInst * createOrInst(Value *lhs, Value *rhs, const std::string &name = "") {
return createBinaryInst(Instruction::kOr, Type::getIntType(), lhs, rhs, name);
} ///< 创建按位或指令
CallInst * createCallInst(Function *callee, const std::vector<Value *> &args, const std::string &name = "") {
std::string newName;
if (name.empty() && callee->getReturnType() != Type::getVoidType()) {
std::stringstream ss;
ss << tmpIndex;
newName = ss.str();
tmpIndex++;
} else {
newName = name;
}
auto inst = new CallInst(callee, args, block, newName);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
} ///< 创建Call指令
ReturnInst * createReturnInst(Value *value = nullptr, const std::string &name = "") {
auto inst = new ReturnInst(value, block, name);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
} ///< 创建return指令
UncondBrInst * createUncondBrInst(BasicBlock *thenBlock, const std::vector<Value *> &args) {
auto inst = new UncondBrInst(thenBlock, args, block);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
} ///< 创建无条件指令
CondBrInst * createCondBrInst(Value *condition, BasicBlock *thenBlock, BasicBlock *elseBlock,
const std::vector<Value *> &thenArgs, const std::vector<Value *> &elseArgs) {
auto inst = new CondBrInst(condition, thenBlock, elseBlock, thenArgs, elseArgs, block);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
} ///< 创建条件跳转指令
AllocaInst * createAllocaInst(Type *type, const std::vector<Value *> &dims = {}, const std::string &name = "") {
auto inst = new AllocaInst(type, dims, block, name);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
} ///< 创建分配指令
AllocaInst * createAllocaInstWithoutInsert(Type *type, const std::vector<Value *> &dims = {}, BasicBlock *parent = nullptr,
const std::string &name = "") {
auto inst = new AllocaInst(type, dims, parent, name);
assert(inst);
return inst;
} ///< 创建不插入指令列表的分配指令
LoadInst * createLoadInst(Value *pointer, const std::vector<Value *> &indices = {}, const std::string &name = "") {
std::string newName;
if (name.empty()) {
std::stringstream ss;
ss << tmpIndex;
newName = ss.str();
tmpIndex++;
} else {
newName = name;
}
auto inst = new LoadInst(pointer, indices, block, newName);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
} ///< 创建load指令
LaInst * createLaInst(Value *pointer, const std::vector<Value *> &indices = {}, const std::string &name = "") {
std::string newName;
if (name.empty()) {
std::stringstream ss;
ss << tmpIndex;
newName = ss.str();
tmpIndex++;
} else {
newName = name;
}
auto inst = new LaInst(pointer, indices, block, newName);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
} ///< 创建la指令
GetSubArrayInst * createGetSubArray(LVal *fatherArray, const std::vector<Value *> &indices, const std::string &name = "") {
assert(fatherArray->getLValNumDims() > indices.size());
std::vector<Value *> subDims;
auto dims = fatherArray->getLValDims();
auto iter = std::next(dims.begin(), indices.size());
while (iter != dims.end()) {
subDims.emplace_back(*iter);
iter++;
}
std::string childArrayName;
std::stringstream ss;
ss << "A"
<< "%" << tmpIndex;
childArrayName = ss.str();
tmpIndex++;
auto fatherArrayValue = dynamic_cast<Value *>(fatherArray);
auto childArray = new AllocaInst(fatherArrayValue->getType(), subDims, block, childArrayName);
auto inst = new GetSubArrayInst(fatherArray, childArray, indices, block, name);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
} ///< 创建获取部分数组指令
MemsetInst * createMemsetInst(Value *pointer, Value *begin, Value *size, Value *value, const std::string &name = "") {
auto inst = new MemsetInst(pointer, begin, size, value, block, name);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
} ///< 创建memset指令
StoreInst * createStoreInst(Value *value, Value *pointer, const std::vector<Value *> &indices = {},
const std::string &name = "") {
auto inst = new StoreInst(value, pointer, indices, block, name);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
} ///< 创建store指令
PhiInst * createPhiInst(Type *type, Value *lhs, BasicBlock *parent, const std::string &name = "") {
auto predNum = parent->getNumPredecessors();
std::vector<Value *> rhs;
for (size_t i = 0; i < predNum; i++) {
rhs.push_back(lhs);
}
auto inst = new PhiInst(type, lhs, rhs, lhs, parent, name);
assert(inst);
parent->getInstructions().emplace(parent->begin(), inst);
return inst;
} ///< 创建Phi指令
};
} // namespace sysy

View File

@ -0,0 +1,78 @@
#pragma once
#include "SysYBaseVisitor.h"
#include "SysYParser.h"
#include "IR.h"
#include "IRBuilder.h"
#include <sstream>
#include <map>
#include <vector>
#include <stack>
class LLVMIRGenerator : public SysYBaseVisitor {
public:
sysy::Module* getIRModule() const { return irModule.get(); }
std::string generateIR(SysYParser::CompUnitContext* unit);
std::string getIR() const { return irStream.str(); }
private:
std::unique_ptr<sysy::Module> irModule; // IR数据结构
std::stringstream irStream; // 文本输出流
sysy::IRBuilder irBuilder; // IR构建器
int tempCounter = 0;
std::string currentVarType;
// std::map<std::string, sysy::Value*> symbolTable;
std::map<std::string, std::pair<std::string, std::string>> symbolTable;
std::map<std::string, std::string> tmpTable;
std::vector<std::string> globalVars;
std::string currentFunction;
std::string currentReturnType;
std::vector<std::string> breakStack;
std::vector<std::string> continueStack;
bool hasReturn = false;
struct LoopLabels {
std::string breakLabel; // break跳转的目标标签
std::string continueLabel; // continue跳转的目标标签
};
std::stack<LoopLabels> loopStack; // 用于管理循环的break和continue标签
std::string getNextTemp();
std::string getLLVMType(const std::string&);
sysy::Type* getSysYType(const std::string&);
bool inFunction = false; // 标识当前是否处于函数内部
// 访问方法
std::any visitCompUnit(SysYParser::CompUnitContext* ctx);
std::any visitConstDecl(SysYParser::ConstDeclContext* ctx);
std::any visitVarDecl(SysYParser::VarDeclContext* ctx);
std::any visitVarDef(SysYParser::VarDefContext* ctx);
std::any visitFuncDef(SysYParser::FuncDefContext* ctx);
std::any visitBlockStmt(SysYParser::BlockStmtContext* ctx);
// std::any visitStmt(SysYParser::StmtContext* ctx);
std::any visitLValue(SysYParser::LValueContext* ctx);
std::any visitPrimaryExp(SysYParser::PrimaryExpContext* ctx);
std::any visitPrimExp(SysYParser::PrimExpContext* ctx);
std::any visitParenExp(SysYParser::ParenExpContext* ctx);
std::any visitNumber(SysYParser::NumberContext* ctx);
std::any visitString(SysYParser::StringContext* ctx);
std::any visitCall(SysYParser::CallContext *ctx);
std::any visitUnExp(SysYParser::UnExpContext* ctx);
std::any visitMulExp(SysYParser::MulExpContext* ctx);
std::any visitAddExp(SysYParser::AddExpContext* ctx);
std::any visitRelExp(SysYParser::RelExpContext* ctx);
std::any visitEqExp(SysYParser::EqExpContext* ctx);
std::any visitLAndExp(SysYParser::LAndExpContext* ctx);
std::any visitLOrExp(SysYParser::LOrExpContext* ctx);
std::any visitAssignStmt(SysYParser::AssignStmtContext *ctx) override;
std::any visitIfStmt(SysYParser::IfStmtContext *ctx) override;
std::any visitWhileStmt(SysYParser::WhileStmtContext *ctx) override;
std::any visitBreakStmt(SysYParser::BreakStmtContext *ctx) override;
std::any visitContinueStmt(SysYParser::ContinueStmtContext *ctx) override;
std::any visitReturnStmt(SysYParser::ReturnStmtContext *ctx) override;
// 统一创建二元操作(同时生成数据结构和文本)
sysy::Value* createBinaryOp(SysYParser::ExpContext* lhs,
SysYParser::ExpContext* rhs,
sysy::Instruction::Kind opKind);
};

0
src/include/Mem2Reg.h Normal file
View File

View File

View File

@ -0,0 +1,138 @@
#pragma once
#include "IR.h"
#include "IRBuilder.h"
#include "SysYBaseVisitor.h"
#include "SysYParser.h"
#include <memory>
#include <unordered_map>
#include <forward_list>
namespace sysy {
// @brief 用于存储数组值的树结构
// 多位数组本质上是一维数组的嵌套可以用树来表示。
class ArrayValueTree {
private:
Value *value = nullptr; /// 该节点存储的value
std::vector<std::unique_ptr<ArrayValueTree>> children; /// 子节点列表
public:
ArrayValueTree() = default;
public:
auto getValue() const -> Value * { return value; }
auto getChildren() const
-> const std::vector<std::unique_ptr<ArrayValueTree>> & {
return children;
}
void setValue(Value *newValue) { value = newValue; }
void addChild(ArrayValueTree *newChild) { children.emplace_back(newChild); }
void addChildren(const std::vector<ArrayValueTree *> &newChildren) {
for (const auto &child : newChildren) {
children.emplace_back(child);
}
}
};
class Utils {
public:
// transform a tree of ArrayValueTree to a ValueCounter
static void tree2Array(Type *type, ArrayValueTree *root,
const std::vector<Value *> &dims, unsigned numDims,
ValueCounter &result, IRBuilder *builder);
static void
createExternalFunction(const std::vector<Type *> &paramTypes,
const std::vector<std::string> &paramNames,
const std::vector<std::vector<Value *>> &paramDims,
Type *returnType, const std::string &funcName,
Module *pModule, IRBuilder *pBuilder);
static void initExternalFunction(Module *pModule, IRBuilder *pBuilder);
};
class SysYIRGenerator : public SysYBaseVisitor {
private:
std::unique_ptr<Module> module;
IRBuilder builder;
public:
SysYIRGenerator() = default;
public:
Module *get() const { return module.get(); }
IRBuilder *getBuilder(){ return &builder; }
public:
std::any visitCompUnit(SysYParser::CompUnitContext *ctx) override;
std::any visitGlobalConstDecl(SysYParser::GlobalConstDeclContext *ctx) override;
std::any visitGlobalVarDecl(SysYParser::GlobalVarDeclContext *ctx) override;
// std::any visitDecl(SysYParser::DeclContext *ctx) override;
std::any visitConstDecl(SysYParser::ConstDeclContext *ctx) override;
std::any visitVarDecl(SysYParser::VarDeclContext *ctx) override;
std::any visitBType(SysYParser::BTypeContext *ctx) override;
// std::any visitConstDef(SysYParser::ConstDefContext *ctx) override;
// std::any visitVarDef(SysYParser::VarDefContext *ctx) override;
std::any visitScalarInitValue(SysYParser::ScalarInitValueContext *ctx) override;
std::any visitArrayInitValue(SysYParser::ArrayInitValueContext *ctx) override;
std::any visitConstScalarInitValue(SysYParser::ConstScalarInitValueContext *ctx) override;
std::any visitConstArrayInitValue(SysYParser::ConstArrayInitValueContext *ctx) override;
// std::any visitConstInitVal(SysYParser::ConstInitValContext *ctx) override;
std::any visitFuncType(SysYParser::FuncTypeContext* ctx) override;
std::any visitFuncDef(SysYParser::FuncDefContext* ctx) override;
// std::any visitInitVal(SysYParser::InitValContext *ctx) override;
// std::any visitFuncFParam(SysYParser::FuncFParamContext *ctx) override;
// std::any visitFuncFParams(SysYParser::FuncFParamsContext *ctx) override;
std::any visitBlockStmt(SysYParser::BlockStmtContext* ctx) override;
// std::any visitStmt(SysYParser::StmtContext *ctx) override;
std::any visitAssignStmt(SysYParser::AssignStmtContext *ctx) override;
// std::any visitExpStmt(SysYParser::ExpStmtContext *ctx) override;
// std::any visitBlkStmt(SysYParser::BlkStmtContext *ctx) override;
std::any visitIfStmt(SysYParser::IfStmtContext *ctx) override;
std::any visitWhileStmt(SysYParser::WhileStmtContext *ctx) override;
std::any visitBreakStmt(SysYParser::BreakStmtContext *ctx) override;
std::any visitContinueStmt(SysYParser::ContinueStmtContext *ctx) override;
std::any visitReturnStmt(SysYParser::ReturnStmtContext *ctx) override;
// std::any visitExp(SysYParser::ExpContext *ctx) override;
// std::any visitCond(SysYParser::CondContext *ctx) override;
std::any visitLValue(SysYParser::LValueContext *ctx) override;
std::any visitPrimaryExp(SysYParser::PrimaryExpContext *ctx) override;
// std::any visitParenExp(SysYParser::ParenExpContext *ctx) override;
std::any visitNumber(SysYParser::NumberContext *ctx) override;
// std::any visitString(SysYParser::StringContext *ctx) override;
std::any visitCall(SysYParser::CallContext *ctx) override;
std::any visitUnaryExp(SysYParser::UnaryExpContext *ctx) override;
// std::any visitUnaryOp(SysYParser::UnaryOpContext *ctx) override;
// std::any visitUnExp(SysYParser::UnExpContext *ctx) override;
std::any visitFuncRParams(SysYParser::FuncRParamsContext *ctx) override;
std::any visitMulExp(SysYParser::MulExpContext *ctx) override;
std::any visitAddExp(SysYParser::AddExpContext *ctx) override;
std::any visitRelExp(SysYParser::RelExpContext *ctx) override;
std::any visitEqExp(SysYParser::EqExpContext *ctx) override;
std::any visitLAndExp(SysYParser::LAndExpContext *ctx) override;
std::any visitLOrExp(SysYParser::LOrExpContext *ctx) override;
// std::any visitConstExp(SysYParser::ConstExpContext *ctx) override;
}; // class SysYIRGenerator
} // namespace sysy

View File

@ -0,0 +1,29 @@
#pragma once
#include <string>
#include "IR.h"
namespace sysy {
class SysYPrinter {
private:
Module *pModule;
public:
explicit SysYPrinter(Module *pModule) : pModule(pModule) {}
public:
void printIR();
void printGlobalVariable();
void printFunction(Function *function);
void printInst(Instruction *pInst);
void printType(Type *type);
void printValue(Value *value);
public:
static std::string getOperandName(Value *operand);
std::string getTypeString(Type *type);
std::string getValueName(Value *value);
};
} // namespace sysy

View File

@ -1 +0,0 @@
llvmlite==0.41.0

View File

@ -1,68 +0,0 @@
from ..middle_ir import *
class X86Emitter:
def __init__(self):
self.reg_pool = ["eax", "ebx", "ecx", "edx"] # 简化寄存器分配
def emit_function(self, func: MiddleFunction) -> str:
asm = [f".globl {func.name}", f"{func.name}:"]
for block in func.basic_blocks:
asm.append(f".{block.name}:")
for instr in block.instructions:
asm.append(self.emit_instruction(instr))
return "\n".join(asm)
def is_register(self, operand):
return operand.startswith("%")
def get_operand_asm(self, operand):
if self.is_register(operand):
return "%" + self.map_operand(operand)
else:
# 移除类型信息(如 'i32'),提取数值
# 假设操作数格式为 'i32 5' 或直接为 '5'
parts = operand.split()
# 如果包含类型信息,取最后一个部分(数值)
value = parts[-1] if len(parts) > 1 else operand
try:
# 验证是有效整数
int(value)
return "$" + value
except ValueError:
# 如果不是有效整数,抛出错误
raise ValueError(f"Invalid immediate operand: {operand}")
def map_operand(self, operand: str) -> str:
"""将虚拟寄存器映射到物理寄存器(简化版)"""
if operand.startswith('%'):
# 去掉 '%' 和 '.'(如果有)
reg_num = operand[1:].replace('.', '')
try:
idx = int(reg_num)
return self.reg_pool[idx % len(self.reg_pool)]
except ValueError:
return "eax" # 默认
return operand
def emit_instruction(self, instr: MiddleInstruction) -> str:
op = instr.opcode
if op in ["add", "sub", "mul"]:
dest_reg = self.map_operand(instr.dest)
src1, src2 = instr.operands
if op == "add":
op_asm = "addl"
elif op == "sub":
op_asm = "subl"
elif op == "mul":
op_asm = "imull"
asm = f" movl {self.get_operand_asm(src1)}, %{dest_reg}\n"
asm += f" {op_asm} {self.get_operand_asm(src2)}, %{dest_reg}"
return asm
elif op == "ret":
if instr.operands:
op = instr.operands[0]
return f" movl {self.get_operand_asm(op)}, %eax\n ret"
else:
return " ret"
else:
raise NotImplementedError(f"Unsupported opcode: {op}")

View File

@ -1,55 +0,0 @@
class MiddleFunction:
def __init__(self, name: str):
self.name = name
self.basic_blocks = [] # List[MiddleBasicBlock]
def __repr__(self):
return f"Function({self.name}):\n" + "\n".join(str(block) for block in self.basic_blocks)
def __str__(self):
return self.__repr__()
def collect_used_registers(self):
used = set()
for block in self.basic_blocks:
for instr in block.instructions:
for op in instr.operands:
if op.startswith("%"):
used.add(op)
# For ret, if it has operands
if instr.opcode == "ret" and instr.operands:
op = instr.operands[0]
if op.startswith("%"):
used.add(op)
return used
def dead_code_elimination(self):
used = self.collect_used_registers()
for block in self.basic_blocks:
block.instructions = [instr for instr in block.instructions if instr.dest is None or instr.dest in used]
class MiddleBasicBlock:
def __init__(self, name: str):
self.name = name
self.instructions = [] # List[MiddleInstruction]
def __repr__(self):
return f"BasicBlock({self.name}):\n" + "\n".join(str(instr) for instr in self.instructions)
def __str__(self):
return self.__repr__()
class MiddleInstruction:
def __init__(self, opcode: str, operands: list, dest=None):
self.opcode = opcode # e.g., 'add'
self.operands = operands # e.g., ['5', '7']
self.dest = dest # e.g., '%0'
def __repr__(self):
if self.dest:
return f"{self.dest} = {self.opcode} " + ", ".join(self.operands)
else:
return f"{self.opcode} " + ", ".join(self.operands)
def __str__(self):
return self.__repr__()

View File

@ -1,31 +0,0 @@
from llvmlite import ir
from ..middle_ir import *
def parse_llvm_ir(ir_text: str) -> MiddleFunction:
"""将 LLVM IR 文本转换为 MiddleIR 结构(简化版)"""
from llvmlite.binding import parse_assembly
# 使用 binding 模块解析 LLVM IR 文本
with parse_assembly(ir_text) as module:
# 获取第一个函数
func = list(module.functions)[0]
middle_func = MiddleFunction(func.name)
# 转换基本块和指令
for block in func.blocks:
mid_block = MiddleBasicBlock(block.name)
for instr in block.instructions:
opcode = instr.opcode
operands = [str(op) for op in instr.operands]
# 如果是有返回值的指令(如 add第一个操作数是 dest
dest = None
if hasattr(instr, 'name') and instr.name:
dest = f"%{instr.name}" # 获取 LLVM IR 中的目标寄存器名
mid_instr = MiddleInstruction(opcode, operands, dest)
mid_block.instructions.append(mid_instr)
middle_func.basic_blocks.append(mid_block)
return middle_func

View File

@ -1,26 +0,0 @@
import sys
from sysy.utils.ir_parser import parse_llvm_ir
from sysy.backend.x86_emitter import X86Emitter
def main():
if len(sys.argv) != 2:
print("Usage: sysyc.py <input.ll>")
sys.exit(1)
# 读取 LLVM IR
with open(sys.argv[1], 'r') as f:
ir_text = f.read()
# 解析并生成汇编
func = parse_llvm_ir(ir_text)
# print("Before optimization:")
# print(func)
# func.dead_code_elimination()
# print("After optimization:")
# print(func)
asm = X86Emitter().emit_function(func)
print(asm)
if __name__ == "__main__":
main()

Binary file not shown.

View File

@ -6,9 +6,10 @@ using namespace std;
#include "SysYLexer.h"
#include "SysYParser.h"
using namespace antlr4;
#include "ASTPrinter.h"
#include "Backend.h"
// #include "Backend.h"
#include "SysYIRGenerator.h"
#include "SysYIRPrinter.h"
// #include "LLVMIRGenerator.h"
using namespace sysy;
static string argStopAfter;
@ -18,9 +19,9 @@ static bool argFormat = false;
void usage(int code = EXIT_FAILURE) {
const char *msg = "Usage: sysyc [options] inputfile\n\n"
"Supported options:\n"
" -h \tprint help message and exit\n"
" -f \tpretty-format the input file\n"
" -s {ast,ir,asm}\tstop after generating AST/IR/Assembly\n";
" -h \tprint help message and exit\n";
" -f \tpretty-format the input file\n";
" -s {ast,ir,asm,llvmir}\tstop after generating AST/IR/Assembly\n";
cerr << msg;
exit(code);
}
@ -51,14 +52,14 @@ void parseArgs(int argc, char **argv) {
int main(int argc, char **argv) {
parseArgs(argc, argv);
// 打开输入文件
// open the input file
ifstream fin(argInputFile);
if (not fin) {
cerr << "Failed to open file " << argv[1];
return EXIT_FAILURE;
}
// 解析 SysY 源码为 AST
// parse sysy source to AST
ANTLRInputStream input(fin);
SysYLexer lexer(&input);
CommonTokenStream tokens(&lexer);
@ -69,27 +70,16 @@ int main(int argc, char **argv) {
return EXIT_SUCCESS;
}
// 格式化输入文件
if (argFormat) {
ASTPrinter printer;
printer.visitCompUnit(moduleAST);
return EXIT_SUCCESS;
}
// 遍历 AST 生成 IR
SysYIRGenerator generator;
generator.generateIR(moduleAST); // 使用公共接口生成 IR
// visit AST to generate IR
if (argStopAfter == "ir") {
cout << generator.getIR(); // 输出生成的 IR
SysYIRGenerator generator;
generator.visitCompUnit(moduleAST);
auto moduleIR = generator.get();
SysYPrinter printer(moduleIR);
printer.printIR();
return EXIT_SUCCESS;
}
// // 生成汇编代码
// CodeGen codegen(generator.getIR()); // 假设 CodeGen 接受字符串作为输入
// string asmCode = codegen.code_gen();
// cout << asmCode << endl;
// if (argStopAfter == "asm")
// return EXIT_SUCCESS;
return EXIT_SUCCESS;
}

View File

@ -1,18 +1,12 @@
//test add
//const int a33[2] = {66,88};
int main(){
int a, b;
// int g[2][3][4] = {0,1,2,3,4,5,6,7,8,};
// const int f[2] = {33,44};
float c;
float d;
a = 10;
b = 2;
a = 11;
c = 1.6 ;
return a+b+1;
int c = a;
d = 1.1 ;
return a + b + c;
}
int add(int a, int b){
return a+b;
}

View File

@ -1,7 +1,7 @@
//test file for backend lab
int main() {
int a = 1;
const int a = 1;
const int b = 2;
int c;

View File

@ -7,7 +7,7 @@ int mul(int x, int y) {
int main(){
int a, b;
a = 10;
b = 3;
b = 0;
a = mul(a, b);
return a + b;
}