Compare commits

...

62 Commits

Author SHA1 Message Date
50fd9cffe9 [IRPrinter&DCE] Adjust definitions to make debug printing easier; add debug output to DCE 2025-07-16 13:04:05 +08:00
3419f84898 Merge remote-tracking branch 'origin/backend' into loopinfo 2025-07-15 13:09:55 +08:00
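ede6465e8c [IR]: Add logic to insert a default ret instruction 2025-07-15 12:53:03 +08:00
a509dabbf0 [backend] Fixed array memory-access address computation; added flag-controlled debug options for the middle end and backend 2025-07-15 11:32:53 +08:00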
e576f0a21e Merge remote-tracking branch 'origin/DCE' into backend 2025-06-27 22:44:08 +08:00
34ffa39b8a [backend] modified some comments and created a shell script for test inside riscv64-vms 2025-06-25 20:59:40 +08:00
d06c5efae1 [backend] fixed bugs of dead code elimination 2025-06-25 18:56:08 +08:00
019cb6dc0d [backend] debugging array 2025-06-25 17:07:37 +08:00
d9fa9e787a Remove comments 2025-06-25 16:33:43 +08:00
97410d9417 Remove debug output 2025-06-25 16:07:29 +08:00
44fb098aff Merge branch 'DCE' into backend 2025-06-25 16:04:42 +08:00
6f897d797a [backend] debugging array 2025-06-25 16:02:41 +08:00
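0d23475aa1 [Dead code elimination]: Build DCE in an extensible, modular way, covering removal of dead stores, allocas, loads, and global values, dead allocas introduced by mem2reg, and redundant store-load-store sequences caused by reg2mem 2025-06-25 15:33:25 +08:00
b12732f10d Fix analyzer logic so that optimization passes share a single analyzer (mainly in mem2reg) 2025-06-25 15:30:28 +08:00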
15a80bd5cd [backend] fix the logical error of constants in interference graph construction 2025-06-25 14:37:46 +08:00
c8587a6d0b [backend] introduced riscv64 2025-06-25 14:37:46 +08:00
4c9c25aadc Fix IR generation for break and continue 2025-06-25 14:15:54 +08:00
1e06c5a446 debugging 2025-06-25 14:00:27 +08:00
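050113d31d Add Reg2Mem; it generates dead stores, so dead code elimination needs to recognize dead store instructions 2025-06-25 13:17:16 +08:00
3dc7c274cf Fix the dominator tree construction algorithm 2025-06-25 12:42:28 +08:00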
e6c4e91956 fix % repeat 2025-06-25 12:27:02 +08:00
4fabcc9952 mem2reg pipeline mostly working; fix phi function printing; debug prints still need to be removed 2025-06-25 12:23:59 +08:00
9bb300ece5 Created a shell script for testing 2025-06-25 06:27:31 +08:00
c04f508171 [backend] implemented call function parameter passing using registers 2025-06-25 06:27:05 +08:00
24913641f2 [backend] fix bugs of not 2025-06-25 02:24:45 +08:00
bd0b624e87 debugging 2025-06-25 02:22:16 +08:00
af1ad795ff [backend] fix bugs of unary ops 2025-06-25 01:07:13 +08:00
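ac7644f450 Add data-flow analysis classes; implement template actions for forward and backward analysis; implement live-variable analysis, largely based on a senior student's code; the implementation can be optimized later 2025-06-24 23:45:43 +08:00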
eadeadfbad [backend] introduced float instrs and regs 2025-06-24 23:24:09 +08:00
430224cfef Merge commit 'd50f76a77024d830c3dd7311ed910d689c9d5f16' into backend 2025-06-24 22:52:01 +08:00
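3dbb394bc2 Initial analyzer: add control-flow analysis with dominator computation, dominator tree construction, and dominance frontier computation, in preparation for Mem2reg 2025-06-24 22:39:20 +08:00
d50f76a770 Fix IR function parameter output and variable naming 2025-06-24 16:39:42 +08:00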
5222027b68 [backend] almost all test passed 2025-06-24 16:03:39 +08:00
cd91cc98ed Created some shell scripts for testing 2025-06-24 15:13:02 +08:00
f72b9ccc00 [backend] fixed bugs of testcase1 2025-06-24 15:12:07 +08:00
385f2f9712 [backend] fixed the bug of physical register allocation error 2025-06-24 14:15:02 +08:00
73dd8eba22 Remove analysis-related attributes from the IR in preparation for building the analyzer 2025-06-24 10:18:29 +08:00
395e6e4003 [backend] fixed many bugs 2025-06-24 03:23:45 +08:00
20cc08708a [backend] introduced debug option 2025-06-24 02:56:17 +08:00
942cb32976 [backend] fixed bugs 2025-06-24 00:42:14 +08:00
ac7569d890 Merge branch 'IROptPre' into backend 2025-06-24 00:40:36 +08:00
11cd32e6df [backend] fixed some bugs 2025-06-24 00:35:38 +08:00
617244fae7 [backend] switch to simpler implementation for inst selection 2025-06-24 00:30:33 +08:00
3c3f48ee87 [backend] fixed 1 segmentation fault 2025-06-23 22:38:29 +08:00
10b43fc90d Fix several bugs 2025-06-23 17:04:45 +08:00
ab3eb253f9 [backend] debugging segmentation fault caused by branch instr 2025-06-23 17:02:29 +08:00
3d233ff199 CFG optimization mostly complete (IR fixes) 2025-06-23 16:25:52 +08:00
7d37bd7528 [backend] introduced DAG, GraphAlloc 2025-06-23 15:38:01 +08:00
568e9af626 Initial build of IRoptpre 2025-06-23 13:17:15 +08:00
63fc92dcbd Fix array naming 2025-06-23 11:35:44 +08:00
af00612376 [backend] supported if 2025-06-23 06:16:19 +08:00
29f75e60a5 Merge remote-tracking branch 'origin/IRPrinter' into IRPrinter 2025-06-23 00:24:19 +08:00
9d8930f5df fix % repeat in IR print 2025-06-23 00:22:15 +08:00
10e1476ba1 [backend] test01 passed 2025-06-22 20:05:34 +08:00
b94e87637a Merge remote-tracking branch 'origin/IRPrinter' into backend 2025-06-22 20:00:29 +08:00
88a561177d [backend] incorrect asm output 2025-06-22 20:00:03 +08:00
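3da2f3ec80 Fix function type checking; all test programs now run in the terminal. Printer format still needs fixing 2025-06-22 18:40:33 +08:00
496e2abfb6 Build an LLVM-style IR printer; most samples pass (9/10); fixes pending 2025-06-22 17:59:19 +08:00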
4711fb603b fixed bugs brought out by merging 2025-06-22 14:39:38 +08:00
dda8bbe444 Merge branch 'array_add' 2025-06-22 14:24:00 +08:00
25a8c72a9b [backend] it works 1.0 2025-06-22 14:06:14 +08:00
232ed6d023 [backend] introduced rv32 backend 2025-06-21 17:26:50 +08:00
42 changed files with 7284 additions and 2182 deletions

2
.gitignore vendored

@ -50,3 +50,5 @@ GTAGS
__init__.py
*.pyc
.DS_*

17
TODO.md

@ -3,20 +3,27 @@
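### 1. **Required front-end modules**
- **Lexical/syntax analysis** (done):
- `SysYLexer`/`SysYParser`: parsers generated by ANTLR
- **IR generation core**
- **IR generation core** (done)
- `SysYIRGenerator`: converts the AST into the intermediate representation (IR)
- `IRBuilder`: utility class for building instructions and basic blocks (the part you are currently implementing)
- **IR printer** (mostly done)
- `SysYIRPrinter`: prints instructions in LLVM IR format so the effect of each optimization pass can be inspected; the translation scheme for the la instruction and subarray arrays needs improvement
### 2. **Essential middle-end optimizations (minimal set)**
- **CFG optimization** (to be tested)
- `SysYIROptPre`: CFG optimization that also fixes IR generation defects: automatically adds a ret instruction to void functions and merges the multiple exit blocks generated by nested if/while statements; a backpatching mechanism could be implemented later
Constant propagation
| Optimization pass | Key role | Required? |
|-------------------|----------------------------------|----------|
| `Mem2Reg` | Eliminates redundant memory accesses and converts to SSA form | ✅ Core |
| `DCE` (dead code elimination) | Removes useless instructions | ✅ Necessary |
| `DFE` (dead function elimination) | Removes unused functions | ✅ Necessary |
| `FuncAnalysis` | Function call relationship analysis | ✅ Foundational |
| `Mem2Reg` | Eliminates redundant memory accesses and converts to SSA form | ✅ Core | (required)
| `DCE` (dead code elimination) | Removes useless instructions | ✅ Necessary | (required)
| `DFE` (dead function elimination) | Removes unused functions | ✅ Necessary | (required)
| `Global2Local` | Demotes global variables to local variables | ✅ Important |
Reg2Mem still needs to be done
### 3. **Core back-end pipeline (must be implemented)**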
```mermaid
graph LR


@ -16,8 +16,13 @@ add_executable(sysyc
IR.cpp
SysYIRGenerator.cpp
# Backend.cpp
# LLVMIRGenerator.cpp
# LLVMIRGenerator_1.cpp
SysYIRPrinter.cpp
SysYIROptPre.cpp
SysYIRAnalyser.cpp
DeadCodeElimination.cpp
Mem2Reg.cpp
Reg2Mem.cpp
RISCv64Backend.cpp
)
target_include_directories(sysyc PRIVATE ${CMAKE_CURRENT_BINARY_DIR} ${CMAKE_CURRENT_SOURCE_DIR}/include)
target_compile_options(sysyc PRIVATE -frtti)

276
src/DeadCodeElimination.cpp Normal file

@ -0,0 +1,276 @@
#include "DeadCodeElimination.h"
#include <algorithm>
#include <iostream>
#include <iterator>
extern int DEBUG;
namespace sysy {
void DeadCodeElimination::runDCEPipeline() {
const auto& functions = pModule->getFunctions();
for (const auto& function : functions) {
const auto& func = function.second;
bool changed = true;
while (changed) {
changed = false;
eliminateDeadStores(func.get(), changed);
eliminateDeadLoads(func.get(), changed);
eliminateDeadAllocas(func.get(), changed);
eliminateDeadRedundantLoadStore(func.get(), changed);
eliminateDeadGlobals(changed);
}
}
}
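// Eliminate dead stores. A store is removed when:
// - the target pointer is not a global variable (!isGlobal(pointer));
// - the target pointer is not an array parameter (!isArr(pointer), or it is not in the function's argument list);
// - every user of the pointer is an alloca or a store (no load or other instruction uses it).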
void DeadCodeElimination::eliminateDeadStores(Function* func, bool& changed) {
for (const auto& block : func->getBasicBlocks()) {
auto& instrs = block->getInstructions();
for (auto iter = instrs.begin(); iter != instrs.end();) {
auto inst = iter->get();
if (!inst->isStore()) {
++iter;
continue;
}
auto storeInst = dynamic_cast<StoreInst*>(inst);
auto pointer = storeInst->getPointer();
// Skip globals and array parameters of the function
if (isGlobal(pointer) || (isArr(pointer) &&
std::find(func->getEntryBlock()->getArguments().begin(),
func->getEntryBlock()->getArguments().end(),
pointer) != func->getEntryBlock()->getArguments().end())) {
++iter;
continue;
}
bool changetag = true;
for (auto& use : pointer->getUses()) {
// Check each use of the store's pointer in turn
auto user = use->getUser();
auto userInst = dynamic_cast<Instruction*>(user);
// If a user of the pointer is neither an alloca nor a store, keep the store
if (userInst != nullptr && !userInst->isAlloca() && !userInst->isStore()) {
changetag = false;
break;
}
}
if (changetag) {
changed = true;
if(DEBUG){
std::cout << "=== Dead Store Found ===\n";
SysYPrinter::printInst(storeInst);
}
usedelete(storeInst);
iter = instrs.erase(iter);
} else {
++iter;
}
}
}
}
// Eliminate dead loads (and dead binary/unary instructions). An instruction is removed when:
// - its result is unused (inst->getUses().empty()).
void DeadCodeElimination::eliminateDeadLoads(Function* func, bool& changed) {
for (const auto& block : func->getBasicBlocks()) {
auto& instrs = block->getInstructions();
for (auto iter = instrs.begin(); iter != instrs.end();) {
auto inst = iter->get();
if (inst->isBinary() || inst->isUnary() || inst->isLoad()) {
if (inst->getUses().empty()) {
changed = true;
if(DEBUG){
std::cout << "=== Dead Load Binary Unary Found ===\n";
SysYPrinter::printInst(inst);
}
usedelete(inst);
iter = instrs.erase(iter);
continue;
}
}
++iter;
}
}
}
// Eliminate dead allocas. An alloca is removed when:
// - it is not used by any instruction (allocaInst->getUses().empty());
// - it is not a function parameter (not in the entry block's argument list).
void DeadCodeElimination::eliminateDeadAllocas(Function* func, bool& changed) {
for (const auto& block : func->getBasicBlocks()) {
auto& instrs = block->getInstructions();
for (auto iter = instrs.begin(); iter != instrs.end();) {
auto inst = iter->get();
if (inst->isAlloca()) {
auto allocaInst = dynamic_cast<AllocaInst*>(inst);
if (allocaInst->getUses().empty() &&
std::find(func->getEntryBlock()->getArguments().begin(),
func->getEntryBlock()->getArguments().end(),
allocaInst) == func->getEntryBlock()->getArguments().end()) {
changed = true;
if(DEBUG){
std::cout << "=== Dead Alloca Found ===\n";
SysYPrinter::printInst(inst);
}
usedelete(inst);
iter = instrs.erase(iter);
continue;
}
}
++iter;
}
}
}
void DeadCodeElimination::eliminateDeadIndirectiveAllocas(Function* func, bool& changed) {
// Remove implicit allocas that were introduced by mem2reg and are no longer used by any value
FunctionAnalysisInfo* funcInfo = pCFA->getFunctionAnalysisInfo(func);
for (auto it = funcInfo->getIndirectAllocas().begin(); it != funcInfo->getIndirectAllocas().end();) {
auto &allocaInst = *it;
if (allocaInst->getUses().empty()) {
changed = true;
if(DEBUG){
std::cout << "=== Dead Indirect Alloca Found ===\n";
SysYPrinter::printInst(allocaInst.get());
}
it = funcInfo->getIndirectAllocas().erase(it);
} else {
++it;
}
}
}
// Eliminate dead globals: a global is removed when no instruction uses it (global->getUses().empty()).
void DeadCodeElimination::eliminateDeadGlobals(bool& changed) {
auto& globals = pModule->getGlobals();
for (auto it = globals.begin(); it != globals.end();) {
auto& global = *it;
if (global->getUses().empty()) {
changed = true;
if(DEBUG){
std::cout << "=== Dead Global Found ===\n";
SysYPrinter::printValue(global.get());
}
it = globals.erase(it);
} else {
++it;
}
}
}
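// Eliminate redundant loads and stores. Cases handled:
// - a phi whose target pointer is used only by that phi (no other store/load uses it);
// - a memset whose target pointer is unused (pointer->getUses().empty());
// - the store -> load -> store pattern.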
void DeadCodeElimination::eliminateDeadRedundantLoadStore(Function* func, bool& changed) {
for (const auto& block : func->getBasicBlocks()) {
auto& instrs = block->getInstructions();
for (auto iter = instrs.begin(); iter != instrs.end();) {
auto inst = iter->get();
if (inst->isPhi()) {
auto phiInst = dynamic_cast<PhiInst*>(inst);
auto pointer = phiInst->getPointer();
bool tag = true;
for (const auto& use : pointer->getUses()) {
auto user = use->getUser();
if (user != inst) {
tag = false;
break;
}
}
// If the pointer is used only by this phi, the phi can be removed
if (tag) {
changed = true;
usedelete(inst);
iter = instrs.erase(iter);
continue;
}
// Array instructions are still incomplete, so the memset optimization is not guaranteed to be effective
} else if (inst->isMemset()) {
auto memsetInst = dynamic_cast<MemsetInst*>(inst);
auto pointer = memsetInst->getPointer();
if (pointer->getUses().empty()) {
changed = true;
usedelete(inst);
iter = instrs.erase(iter);
continue;
}
}else if(inst->isLoad()) {
if (iter != instrs.begin()) {
auto loadInst = dynamic_cast<LoadInst*>(inst);
auto loadPointer = loadInst->getPointer();
// TODO: store -> load -> store pattern
auto prevIter = std::prev(iter);
auto prevInst = prevIter->get();
if (prevInst->isStore()) {
auto prevStore = dynamic_cast<StoreInst*>(prevInst);
auto prevStorePointer = prevStore->getPointer();
auto prevStoreValue = prevStore->getOperand(0);
// Make sure the previous store is not an array operation
if (prevStore->getIndices().empty()) {
// Check whether the next instruction stores the same value
auto nextIter = std::next(iter);
if (nextIter != instrs.end()) {
auto nextInst = nextIter->get();
if (nextInst->isStore()) {
auto nextStore = dynamic_cast<StoreInst*>(nextInst);
auto nextStorePointer = nextStore->getPointer();
auto nextStoreValue = nextStore->getOperand(0);
// Make sure the next store is not an array operation
if (nextStore->getIndices().empty()) {
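// Conditions for the optimization:
// 1. prevStore's pointer operand == the load's pointer operand
// 2. nextStore's value operand == the load instruction itself
if (prevStorePointer == loadPointer &&
nextStoreValue == loadInst) {
// Optimizable: store the value written through prevStorePointer directly to nextStorePointer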
changed = true;
nextStore->setOperand(0, prevStoreValue);
if(DEBUG){
std::cout << "=== Dead Store Load Store Found(now only del Load) ===\n";
SysYPrinter::printInst(prevStore);
SysYPrinter::printInst(loadInst);
SysYPrinter::printInst(nextStore);
}
usedelete(loadInst);
iter = instrs.erase(iter);
// Removing prevStore: could this be left to dead-store elimination?
// if (prevStore->getUses().empty()) {
// usedelete(prevStore);
// instrs.erase(prevIter); // remove prevStore
// }
continue; // skip ++iter because the iterator has already been advanced
}
}
}
}
}
}
}
}
++iter;
}
}
}
bool DeadCodeElimination::isGlobal(Value *val){
auto gval = dynamic_cast<GlobalValue *>(val);
return gval != nullptr;
}
bool DeadCodeElimination::isArr(Value *val){
auto aval = dynamic_cast<AllocaInst *>(val);
return aval != nullptr && aval->getNumDims() != 0;
}
void DeadCodeElimination::usedelete(Instruction *instr){
for (auto &use1 : instr->getOperands()) {
auto val1 = use1->getValue();
val1->removeUse(use1);
}
}
} // namespace sysy
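
The passes in this file share two mechanics: erasing from the instruction list while iterating over it, and rerunning the whole pipeline until `changed` stays false. The following is a minimal, self-contained sketch of that pattern on a toy instruction list. `ToyInst`, `countUses`, `sweepOnce`, `runToFixedPoint`, `demoChain`, and `checkDemo` are hypothetical names for illustration only; the project tracks uses through `getUses()` instead of rescanning the list.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Hypothetical toy instruction; the real pass works on sysy::Instruction.
struct ToyInst {
    std::string name;                   // name of the result this instruction defines
    std::vector<std::string> operands;  // names of the results it reads
    bool hasSideEffect;                 // e.g. ret or call: never removed
};

// Count how many operands across the list refer to `name`.
inline int countUses(const std::list<ToyInst>& insts, const std::string& name) {
    int n = 0;
    for (const auto& inst : insts)
        for (const auto& op : inst.operands)
            if (op == name) ++n;
    return n;
}

// One sweep: erase each side-effect-free instruction whose result is unused.
inline bool sweepOnce(std::list<ToyInst>& insts) {
    bool changed = false;
    for (auto it = insts.begin(); it != insts.end();) {
        if (!it->hasSideEffect && countUses(insts, it->name) == 0) {
            it = insts.erase(it);  // erase() returns the next valid iterator
            changed = true;
        } else {
            ++it;                  // advance only when nothing was erased
        }
    }
    return changed;
}

// Repeat sweeps until one makes no change; returns the number of sweeps that removed something.
inline int runToFixedPoint(std::list<ToyInst>& insts) {
    int productiveSweeps = 0;
    while (sweepOnce(insts)) ++productiveSweeps;
    return productiveSweeps;
}

// Demo input: a -> b -> c is a dead chain; r has a side effect and survives.
inline std::list<ToyInst> demoChain() {
    return {{"a", {}, false}, {"b", {"a"}, false}, {"c", {"b"}, false}, {"r", {}, true}};
}

// Runs the demo and checks the outcome.
inline void checkDemo() {
    auto insts = demoChain();
    int sweeps = runToFixedPoint(insts);
    assert(sweeps == 3);
    assert(insts.size() == 1 && insts.front().name == "r");
}
```

In this toy, the chain takes three productive sweeps because a forward walk only exposes the tail of a dead chain on each pass, which is one reason a `while (changed)` loop around the individual passes is useful.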


@ -135,7 +135,7 @@ auto Function::getCalleesWithNoExternalAndSelf() -> std::set<Function *> {
}
return result;
}
// Function cloning; needed by later function-level optimizations (inlining, etc.)
Function * Function::clone(const std::string &suffix) const {
std::stringstream ss;
std::map<BasicBlock *, BasicBlock *> oldNewBlockMap;
@ -527,11 +527,7 @@ Function * Function::clone(const std::string &suffix) const {
return newFunction;
}
/**
* @brief Set an operand
*
* @param [in] index Position of the operand to set
* @param [in] value The value to set it to
* @return None
* Set an operand
*/
void User::setOperand(unsigned index, Value *value) {
assert(index < getNumOperands());
@ -539,11 +535,7 @@ void User::setOperand(unsigned index, Value *value) {
value->addUse(operands[index]);
}
/**
* @brief Replace an operand
*
* @param [in] index Position of the operand to replace
* @param [in] value The value to replace it with
* @return None
* Replace an operand
*/
void User::replaceOperand(unsigned index, Value *value) {
assert(index < getNumOperands());
@ -561,17 +553,12 @@ CallInst::CallInst(Function *callee, const std::vector<Value *> &args, BasicBloc
}
}
/**
* @brief Get a pointer to the called function
*
* @return Pointer to the called function
* Get a pointer to the called function
*/
Function * CallInst::getCallee() const { return dynamic_cast<Function *>(getOperand(0)); }
/**
* @brief Get a pointer to a variable
*
* @param [in] name Variable name
* @return Pointer to the variable
* Get a pointer to a variable
*/
auto SymbolTable::getVariable(const std::string &name) const -> User * {
auto node = curNode;
@ -586,11 +573,7 @@ auto SymbolTable::getVariable(const std::string &name) const -> User * {
return nullptr;
}
/**
* @brief Add a variable
*
* @param [in] name Variable name
* @param [in] variable Pointer to the variable
* @return Pointer to the variable
* Add a variable to the symbol table
*/
auto SymbolTable::addVariable(const std::string &name, User *variable) -> User * {
User *result = nullptr;
@ -598,11 +581,11 @@ auto SymbolTable::addVariable(const std::string &name, User *variable) -> User *
std::stringstream ss;
auto iter = variableIndex.find(name);
if (iter != variableIndex.end()) {
ss << name << "(" << iter->second << ")";
ss << name << iter->second ;
iter->second += 1;
} else {
variableIndex.emplace(name, 1);
ss << name << "(" << 0 << ")";
ss << name << 0 ;
}
variable->setName(ss.str());
@ -621,21 +604,15 @@ auto SymbolTable::addVariable(const std::string &name, User *variable) -> User *
return result;
}
/**
* @brief Get the global variables
*
* @return List of global variables
* Get the global variables
*/
auto SymbolTable::getGlobals() -> std::vector<std::unique_ptr<GlobalValue>> & { return globals; }
/**
* @brief Get the constants
*
* @return List of constants
* Get the constants
*/
auto SymbolTable::getConsts() const -> const std::vector<std::unique_ptr<ConstantVariable>> & { return consts; }
/**
* @brief Enter a new scope
*
* @return None
* Enter a new scope
*/
void SymbolTable::enterNewScope() {
auto newNode = new SymbolTableNode;
@ -647,67 +624,20 @@ void SymbolTable::enterNewScope() {
curNode = newNode;
}
/**
* @brief Enter the global scope
*
* @return None
* Enter the global scope
*/
void SymbolTable::enterGlobalScope() { curNode = nodeList.front().get(); }
/**
* @brief Leave the current scope
*
* @return None
* Leave the current scope
*/
void SymbolTable::leaveScope() { curNode = curNode->pNode; }
/**
* @brief Whether the current scope is the global scope
*
* @return Boolean
* Whether the current scope is the global scope
*/
auto SymbolTable::isInGlobalScope() const -> bool { return curNode->pNode == nullptr; }
/**
* @brief Check whether a value is loop-invariant
* @param value: The value to check
* @return true: it is invariant
* @return false: it is not
*/
auto Loop::isSimpleLoopInvariant(Value *value) -> bool {
// auto constValue = dynamic_cast<ConstantValue *>(value);
// if (constValue != nullptr) {
// return false;
// }
if (auto instr = dynamic_cast<Instruction *>(value)) {
if (instr->isLoad()) {
auto loadinst = dynamic_cast<LoadInst *>(instr);
auto loadvalue = dynamic_cast<AllocaInst *>(loadinst->getOperand(0));
if (loadvalue != nullptr) {
if (loadvalue->getParent() != nullptr) {
auto basicblock = loadvalue->getParent();
return !this->isLoopContainsBasicBlock(basicblock);
}
}
auto globalvalue = dynamic_cast<GlobalValue *>(loadinst->getOperand(0));
if (globalvalue != nullptr) {
return true;
}
auto basicblock = instr->getParent();
return !this->isLoopContainsBasicBlock(basicblock);
}
auto basicblock = instr->getParent();
return !this->isLoopContainsBasicBlock(basicblock);
}
return true;
}
/**
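* @brief Move an instruction
*
* @param [in] sourcePos Position in the source instruction list
* @param [in] targetPos Position in the destination instruction list
* @param [in] block Target basic block
* @return None
* Move an instruction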
*/
auto BasicBlock::moveInst(iterator sourcePos, iterator targetPos, BasicBlock *block) -> iterator {
auto inst = sourcePos->release();
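
The `SymbolTable::addVariable` hunk in this file changes generated names from `name(index)` to `name` followed directly by the index. Below is a minimal sketch of the suffix-only scheme, assuming a per-name counter like `variableIndex`. `ToyNamer`, `fresh`, and `checkCollision` are hypothetical names, not project code. The sketch also illustrates one property that may be worth checking: without a separator, two different source names can produce the same generated name. Whether that matters depends on how the names are used downstream, which this diff does not show.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the naming step of SymbolTable::addVariable.
class ToyNamer {
public:
    // Returns name0, name1, ... for successive declarations of `name`.
    std::string fresh(const std::string& name) {
        int& next = counters_[name];  // operator[] value-initializes to 0 on first use
        return name + std::to_string(next++);
    }

private:
    std::map<std::string, int> counters_;
};

// Shows that the 11th declaration of "a" and the 1st declaration of "a1"
// both map to "a10" under the suffix-only scheme.
inline void checkCollision() {
    ToyNamer namer;
    std::string last;
    for (int i = 0; i <= 10; ++i) last = namer.fresh("a");  // a0 .. a10
    assert(last == "a10");
    assert(namer.fresh("a1") == "a10");  // first declaration of a1 collides
}
```

A separator that cannot appear in a source identifier (the removed parentheses were one such choice) would avoid this overlap, at the cost of names that may be less convenient to print.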


@ -1,674 +0,0 @@
// LLVMIRGenerator.cpp
// TODO: type conversion and type checking
// TODO: handle the SysY library functions
// TODO: array handling
// TODO: test while, continue, and break
#include "LLVMIRGenerator.h"
#include <iomanip>
using namespace std;
namespace sysy {
std::string LLVMIRGenerator::generateIR(SysYParser::CompUnitContext* unit) {
// Initialize the custom IR data structures
irModule = std::make_unique<sysy::Module>();
irBuilder = sysy::IRBuilder(); // Initialize the IR builder
tempCounter = 0;
symbolTable.clear();
tmpTable.clear();
globalVars.clear();
inFunction = false;
visitCompUnit(unit);
return irStream.str();
}
std::string LLVMIRGenerator::getNextTemp() {
std::string ret = "%." + std::to_string(tempCounter++);
tmpTable[ret] = "void";
return ret;
}
std::string LLVMIRGenerator::getLLVMType(const std::string& type) {
if (type == "int") return "i32";
if (type == "float") return "float";
if (type.find("[]") != std::string::npos)
return getLLVMType(type.substr(0, type.size()-2)) + "*";
return "i32";
}
sysy::Type* LLVMIRGenerator::getSysYType(const std::string& typeStr) {
if (typeStr == "int") return sysy::Type::getIntType();
if (typeStr == "float") return sysy::Type::getFloatType();
if (typeStr == "void") return sysy::Type::getVoidType();
// Handle pointer types, etc.
return sysy::Type::getIntType();
}
std::any LLVMIRGenerator::visitCompUnit(SysYParser::CompUnitContext* ctx) {
auto type_i32 = Type::getIntType();
auto type_f32 = Type::getFloatType();
auto type_void = Type::getVoidType();
auto type_i32p = Type::getPointerType(type_i32);
auto type_f32p = Type::getPointerType(type_f32);
// Create the runtime library functions
irModule->createFunction("getint", sysy::FunctionType::get(type_i32, {}));
irModule->createFunction("getch", sysy::FunctionType::get(type_i32, {}));
irModule->createFunction("getfloat", sysy::FunctionType::get(type_f32, {}));
// TODO: add more runtime library functions
irStream << "declare i32 @getint()\n";
irStream << "declare i32 @getch()\n";
irStream << "declare float @getfloat()\n";
// TODO: add the textual IR for more runtime library functions
for (auto decl : ctx->decl()) {
decl->accept(this);
}
for (auto funcDef : ctx->funcDef()) {
inFunction = true; // Entering a function definition
funcDef->accept(this);
inFunction = false; // Leaving a function definition
}
return nullptr;
}
std::any LLVMIRGenerator::visitVarDecl(SysYParser::VarDeclContext* ctx) {
// TODO: array initialization
std::string type = ctx->bType()->getText();
currentVarType = getLLVMType(type);
for (auto varDef : ctx->varDef()) {
if (!inFunction) {
// Global variable declaration
std::string varName = varDef->Ident()->getText();
std::string llvmType = getLLVMType(type);
std::string value = "0"; // Default value is 0
if (varDef->ASSIGN()) {
value = std::any_cast<std::string>(varDef->initVal()->accept(this));
} else {
std::cout << "[WR-Release-01]Warning: Global variable '" << varName
<< "' is declared without initialization, defaulting to 0.\n";
}
irStream << "@" << varName << " = dso_local global " << llvmType << " " << value << ", align 4\n";
globalVars.push_back(varName); // Record the global variable
} else {
// Local variable declaration
varDef->accept(this);
}
}
return nullptr;
}
std::any LLVMIRGenerator::visitConstDecl(SysYParser::ConstDeclContext* ctx) {
// TODO: array initialization
std::string type = ctx->bType()->getText();
for (auto constDef : ctx->constDef()) {
if (!inFunction) {
// Global constant declaration
std::string varName = constDef->Ident()->getText();
std::string llvmType = getLLVMType(type);
std::string value = "0"; // Default value is 0
try {
value = std::any_cast<std::string>(constDef->constInitVal()->accept(this));
} catch (...) {
throw std::runtime_error("[ERR-Release-01]Const value must be initialized upon definition.");
}
// For float types, convert the value to its hexadecimal representation
if (llvmType == "float") {
try {
double floatValue = std::stod(value);
uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
std::stringstream ss;
ss << "0x" << std::hex << std::uppercase << hexValue;
value = ss.str();
} catch (...) {
throw std::runtime_error("[ERR-Release-02]Invalid float literal: " + value);
}
}
irStream << "@" << varName << " = dso_local constant " << llvmType << " " << value << ", align 4\n";
globalVars.push_back(varName); // Record the global variable
} else {
// Local constant declaration
std::string varName = constDef->Ident()->getText();
std::string llvmType = getLLVMType(type);
std::string allocaName = getNextTemp();
std::string value = "0"; // Default value is 0
try {
value = std::any_cast<std::string>(constDef->constInitVal()->accept(this));
} catch (...) {
throw std::runtime_error("Const value must be initialized upon definition.");
}
irStream << " " << allocaName << " = alloca " << llvmType << ", align 4\n";
if (llvmType == "float") {
try {
double floatValue = std::stod(value);
uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
std::stringstream ss;
ss << "0x" << std::hex << std::uppercase << hexValue;
value = ss.str();
} catch (...) {
throw std::runtime_error("Invalid float literal: " + value);
}
}
irStream << " store " << llvmType << " " << value << ", " << llvmType
<< "* " << allocaName << ", align 4\n";
symbolTable[varName] = {allocaName, llvmType};
tmpTable[allocaName] = llvmType;
}
}
return nullptr;
}
std::any LLVMIRGenerator::visitVarDef(SysYParser::VarDefContext* ctx) {
// TODO: array initialization
std::string varName = ctx->Ident()->getText();
std::string type = currentVarType;
std::string llvmType = getLLVMType(type);
std::string allocaName = getNextTemp();
irStream << " " << allocaName << " = alloca " << llvmType << ", align 4\n";
if (ctx->ASSIGN()) {
std::string value = std::any_cast<std::string>(ctx->initVal()->accept(this));
if (llvmType == "float") {
try {
double floatValue = std::stod(value);
uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
std::stringstream ss;
ss << "0x" << std::hex << std::uppercase << (hexValue & (0xffffffffUL << 32));
value = ss.str();
} catch (...) {
throw std::runtime_error("Invalid float literal: " + value);
}
}
irStream << " store " << llvmType << " " << value << ", " << llvmType
<< "* " << allocaName << ", align 4\n";
}
symbolTable[varName] = {allocaName, llvmType};
tmpTable[allocaName] = llvmType;
return nullptr;
}
std::any LLVMIRGenerator::visitFuncDef(SysYParser::FuncDefContext* ctx) {
currentFunction = ctx->Ident()->getText();
currentReturnType = getLLVMType(ctx->funcType()->getText());
symbolTable.clear();
tmpTable.clear();
tempCounter = 0;
hasReturn = false;
irStream << "define dso_local " << currentReturnType << " @" << currentFunction << "(";
if (ctx->funcFParams()) {
auto params = ctx->funcFParams()->funcFParam();
tempCounter += params.size();
for (size_t i = 0; i < params.size(); ++i) {
if (i > 0) irStream << ", ";
std::string paramType = getLLVMType(params[i]->bType()->getText());
irStream << paramType << " noundef %" << i;
symbolTable[params[i]->Ident()->getText()] = {"%" + std::to_string(i), paramType};
tmpTable["%" + std::to_string(i)] = paramType;
}
}
tempCounter++;
irStream << ") #0 {\n";
if (ctx->funcFParams()) {
auto params = ctx->funcFParams()->funcFParam();
for (size_t i = 0; i < params.size(); ++i) {
std::string varName = params[i]->Ident()->getText();
std::string type = params[i]->bType()->getText();
std::string llvmType = getLLVMType(type);
std::string allocaName = getNextTemp();
tmpTable[allocaName] = llvmType;
irStream << " " << allocaName << " = alloca " << llvmType << ", align 4\n";
irStream << " store " << llvmType << " " << symbolTable[varName].first << ", " << llvmType
<< "* " << allocaName << ", align 4\n";
symbolTable[varName] = {allocaName, llvmType};
}
}
ctx->blockStmt()->accept(this);
if (!hasReturn) {
if (currentReturnType == "void") {
irStream << " ret void\n";
} else {
irStream << " ret " << currentReturnType << " 0\n";
}
}
irStream << "}\n";
return nullptr;
}
std::any LLVMIRGenerator::visitBlockStmt(SysYParser::BlockStmtContext* ctx) {
for (auto item : ctx->blockItem()) {
item->accept(this);
}
return nullptr;
}
std::any LLVMIRGenerator::visitAssignStmt(SysYParser::AssignStmtContext *ctx)
{
std::string lhsAlloca = std::any_cast<std::string>(ctx->lValue()->accept(this));
std::string lhsType = symbolTable[ctx->lValue()->Ident()->getText()].second;
std::string rhs = std::any_cast<std::string>(ctx->exp()->accept(this));
if (lhsType == "float") {
try {
double floatValue = std::stod(rhs);
uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
std::stringstream ss;
ss << "0x" << std::hex << std::uppercase << (hexValue & (0xffffffffUL << 32));
rhs = ss.str();
} catch (...) {
throw std::runtime_error("Invalid float literal: " + rhs);
}
}
irStream << " store " << lhsType << " " << rhs << ", " << lhsType
<< "* " << lhsAlloca << ", align 4\n";
return nullptr;
}
std::any LLVMIRGenerator::visitIfStmt(SysYParser::IfStmtContext *ctx)
{
std::string cond = std::any_cast<std::string>(ctx->cond()->accept(this));
std::string trueLabel = "if.then." + std::to_string(tempCounter);
std::string falseLabel = "if.else." + std::to_string(tempCounter);
std::string mergeLabel = "if.end." + std::to_string(tempCounter++);
irStream << " br i1 " << cond << ", label %" << trueLabel << ", label %" << falseLabel << "\n";
irStream << trueLabel << ":\n";
ctx->stmt(0)->accept(this);
irStream << " br label %" << mergeLabel << "\n";
irStream << falseLabel << ":\n";
if (ctx->ELSE()) {
ctx->stmt(1)->accept(this);
}
irStream << " br label %" << mergeLabel << "\n";
irStream << mergeLabel << ":\n";
return nullptr;
}
std::any LLVMIRGenerator::visitWhileStmt(SysYParser::WhileStmtContext *ctx)
{
std::string loop_cond = "while.cond." + std::to_string(tempCounter);
std::string loop_body = "while.body." + std::to_string(tempCounter);
std::string loop_end = "while.end." + std::to_string(tempCounter++);
loopStack.push({loop_end, loop_cond});
irStream << " br label %" << loop_cond << "\n";
irStream << loop_cond << ":\n";
std::string cond = std::any_cast<std::string>(ctx->cond()->accept(this));
irStream << " br i1 " << cond << ", label %" << loop_body << ", label %" << loop_end << "\n";
irStream << loop_body << ":\n";
ctx->stmt()->accept(this);
irStream << " br label %" << loop_cond << "\n";
irStream << loop_end << ":\n";
loopStack.pop();
return nullptr;
}
std::any LLVMIRGenerator::visitBreakStmt(SysYParser::BreakStmtContext *ctx)
{
if (loopStack.empty()) {
throw std::runtime_error("Break statement outside of a loop.");
}
irStream << " br label %" << loopStack.top().breakLabel << "\n";
return nullptr;
}
std::any LLVMIRGenerator::visitContinueStmt(SysYParser::ContinueStmtContext *ctx)
{
if (loopStack.empty()) {
throw std::runtime_error("Continue statement outside of a loop.");
}
irStream << " br label %" << loopStack.top().continueLabel << "\n";
return nullptr;
}
std::any LLVMIRGenerator::visitReturnStmt(SysYParser::ReturnStmtContext *ctx)
{
hasReturn = true;
if (ctx->exp()) {
std::string value = std::any_cast<std::string>(ctx->exp()->accept(this));
irStream << " ret " << currentReturnType << " " << value << "\n";
} else {
irStream << " ret void\n";
}
return nullptr;
}
// std::any LLVMIRGenerator::visitStmt(SysYParser::StmtContext* ctx) {
// if (ctx->lValue() && ctx->ASSIGN()) {
// std::string lhsAlloca = std::any_cast<std::string>(ctx->lValue()->accept(this));
// std::string lhsType = symbolTable[ctx->lValue()->Ident()->getText()].second;
// std::string rhs = std::any_cast<std::string>(ctx->exp()->accept(this));
// if (lhsType == "float") {
// try {
// double floatValue = std::stod(rhs);
// uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
// std::stringstream ss;
// ss << "0x" << std::hex << std::uppercase << (hexValue & (0xffffffffUL << 32));
// rhs = ss.str();
// } catch (...) {
// throw std::runtime_error("Invalid float literal: " + rhs);
// }
// }
// irStream << " store " << lhsType << " " << rhs << ", " << lhsType
// << "* " << lhsAlloca << ", align 4\n";
// } else if (ctx->RETURN()) {
// hasReturn = true;
// if (ctx->exp()) {
// std::string value = std::any_cast<std::string>(ctx->exp()->accept(this));
// irStream << " ret " << currentReturnType << " " << value << "\n";
// } else {
// irStream << " ret void\n";
// }
// } else if (ctx->IF()) {
// std::string cond = std::any_cast<std::string>(ctx->cond()->accept(this));
// std::string trueLabel = "if.then." + std::to_string(tempCounter);
// std::string falseLabel = "if.else." + std::to_string(tempCounter);
// std::string mergeLabel = "if.end." + std::to_string(tempCounter++);
// irStream << " br i1 " << cond << ", label %" << trueLabel << ", label %" << falseLabel << "\n";
// irStream << trueLabel << ":\n";
// ctx->stmt(0)->accept(this);
// irStream << " br label %" << mergeLabel << "\n";
// irStream << falseLabel << ":\n";
// if (ctx->ELSE()) {
// ctx->stmt(1)->accept(this);
// }
// irStream << " br label %" << mergeLabel << "\n";
// irStream << mergeLabel << ":\n";
// } else if (ctx->WHILE()) {
// std::string loop_cond = "while.cond." + std::to_string(tempCounter);
// std::string loop_body = "while.body." + std::to_string(tempCounter);
// std::string loop_end = "while.end." + std::to_string(tempCounter++);
// loopStack.push({loop_end, loop_cond});
// irStream << " br label %" << loop_cond << "\n";
// irStream << loop_cond << ":\n";
// std::string cond = std::any_cast<std::string>(ctx->cond()->accept(this));
// irStream << " br i1 " << cond << ", label %" << loop_body << ", label %" << loop_end << "\n";
// irStream << loop_body << ":\n";
// ctx->stmt(0)->accept(this);
// irStream << " br label %" << loop_cond << "\n";
// irStream << loop_end << ":\n";
// loopStack.pop();
// } else if (ctx->BREAK()) {
// if (loopStack.empty()) {
// throw std::runtime_error("Break statement outside of a loop.");
// }
// irStream << " br label %" << loopStack.top().breakLabel << "\n";
// } else if (ctx->CONTINUE()) {
// if (loopStack.empty()) {
// throw std::runtime_error("Continue statement outside of a loop.");
// }
// irStream << " br label %" << loopStack.top().continueLabel << "\n";
// } else if (ctx->blockStmt()) {
// ctx->blockStmt()->accept(this);
// } else if (ctx->exp()) {
// ctx->exp()->accept(this);
// }
// return nullptr;
// }
std::any LLVMIRGenerator::visitLValue(SysYParser::LValueContext* ctx) {
std::string varName = ctx->Ident()->getText();
return symbolTable[varName].first;
}
// std::any LLVMIRGenerator::visitPrimaryExp(SysYParser::PrimaryExpContext* ctx) {
// if (ctx->lValue()) {
// std::string allocaPtr = std::any_cast<std::string>(ctx->lValue()->accept(this));
// std::string varName = ctx->lValue()->Ident()->getText();
// std::string type = symbolTable[varName].second;
// std::string temp = getNextTemp();
// irStream << " " << temp << " = load " << type << ", " << type << "* " << allocaPtr << ", align 4\n";
// tmpTable[temp] = type;
// return temp;
// } else if (ctx->exp()) {
// return ctx->exp()->accept(this);
// } else {
// return ctx->number()->accept(this);
// }
// }
std::any LLVMIRGenerator::visitPrimExp(SysYParser::PrimExpContext *ctx){
// irStream << "visitPrimExp\n";
// std::cout << "Type name: " << typeid(*(ctx->primaryExp())).name() << std::endl;
SysYParser::PrimaryExpContext* pExpCtx = ctx->primaryExp();
if (auto* lvalCtx = dynamic_cast<SysYParser::LValContext*>(pExpCtx)) {
std::string allocaPtr = std::any_cast<std::string>(lvalCtx->lValue()->accept(this));
std::string varName = lvalCtx->lValue()->Ident()->getText();
std::string type = symbolTable[varName].second;
std::string temp = getNextTemp();
irStream << " " << temp << " = load " << type << ", " << type << "* " << allocaPtr << ", align 4\n";
tmpTable[temp] = type;
return temp;
} else if (auto* expCtx = dynamic_cast<SysYParser::ParenExpContext*>(pExpCtx)) {
return expCtx->exp()->accept(this);
} else if (auto* strCtx = dynamic_cast<SysYParser::StrContext*>(pExpCtx)) {
return strCtx->string()->accept(this);
} else if (auto* numCtx = dynamic_cast<SysYParser::NumContext*>(pExpCtx)) {
return numCtx->number()->accept(this);
} else {
// None of the casts succeeded, so ctx->primaryExp() is not a NumContext or any other known type.
// It may be some other kind of expression, or an empty PrimaryExpContext.
std::cout << "Unknown primary expression type." << std::endl;
throw std::runtime_error("Unknown primary expression type.");
}
// return visitChildren(ctx);
}
std::any LLVMIRGenerator::visitParenExp(SysYParser::ParenExpContext* ctx) {
return ctx->exp()->accept(this);
}
std::any LLVMIRGenerator::visitNumber(SysYParser::NumberContext* ctx) {
if (ctx->ILITERAL()) {
return ctx->ILITERAL()->getText();
} else if (ctx->FLITERAL()) {
return ctx->FLITERAL()->getText();
}
return std::string(); // callers any_cast to std::string, so returning a const char* would throw
}
std::any LLVMIRGenerator::visitString(SysYParser::StringContext *ctx)
{
if (ctx->STRING()) {
// Handle a string constant.
std::string str = ctx->STRING()->getText();
// Strip the surrounding quotes.
str = str.substr(1, str.size() - 2);
// Escape backslashes and double quotes.
std::string escapedStr;
for (char c : str) {
if (c == '\\') {
escapedStr += "\\\\";
} else if (c == '"') {
escapedStr += "\\\"";
} else {
escapedStr += c;
}
}
return "\"" + escapedStr + "\"";
}
// No STRING token: return an empty string instead of dereferencing a null node.
return std::string();
}
std::any LLVMIRGenerator::visitUnExp(SysYParser::UnExpContext* ctx) {
if (ctx->unaryOp()) {
std::string operand = std::any_cast<std::string>(ctx->unaryExp()->accept(this));
std::string op = ctx->unaryOp()->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[operand]; // use the type recorded for the operand; its name contains no space to split on
tmpTable[temp] = type;
if (op == "-") {
irStream << " " << temp << " = sub " << type << " 0, " << operand << "\n";
} else if (op == "!") {
irStream << " " << temp << " = xor " << type << " " << operand << ", 1\n";
}
return temp;
}
return ctx->unaryExp()->accept(this);
}
std::any LLVMIRGenerator::visitCall(SysYParser::CallContext *ctx)
{
std::string funcName = ctx->Ident()->getText();
std::vector<std::string> args;
if (ctx->funcRParams()) {
for (auto argCtx : ctx->funcRParams()->exp()) {
args.push_back(std::any_cast<std::string>(argCtx->accept(this)));
}
}
std::string temp = getNextTemp();
std::string argList = "";
for (size_t i = 0; i < args.size(); ++i) {
if (i > 0) argList += ", ";
argList +=tmpTable[args[i]] + " noundef " + args[i];
}
irStream << " " << temp << " = call " << currentReturnType << " @" << funcName << "(" << argList << ")\n";
tmpTable[temp] = currentReturnType;
return temp;
}
std::any LLVMIRGenerator::visitMulExp(SysYParser::MulExpContext* ctx) {
auto unaryExps = ctx->unaryExp();
std::string left = std::any_cast<std::string>(unaryExps[0]->accept(this));
for (size_t i = 1; i < unaryExps.size(); ++i) {
std::string right = std::any_cast<std::string>(unaryExps[i]->accept(this));
std::string op = ctx->children[2*i-1]->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[left];
if (op == "*") {
irStream << " " << temp << " = mul nsw " << type << " " << left << ", " << right << "\n";
} else if (op == "/") {
irStream << " " << temp << " = sdiv " << type << " " << left << ", " << right << "\n";
} else if (op == "%") {
irStream << " " << temp << " = srem " << type << " " << left << ", " << right << "\n";
}
left = temp;
tmpTable[temp] = type;
}
return left;
}
std::any LLVMIRGenerator::visitAddExp(SysYParser::AddExpContext* ctx) {
auto mulExps = ctx->mulExp();
std::string left = std::any_cast<std::string>(mulExps[0]->accept(this));
for (size_t i = 1; i < mulExps.size(); ++i) {
std::string right = std::any_cast<std::string>(mulExps[i]->accept(this));
std::string op = ctx->children[2*i-1]->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[left];
if (op == "+") {
irStream << " " << temp << " = add nsw " << type << " " << left << ", " << right << "\n";
} else if (op == "-") {
irStream << " " << temp << " = sub nsw " << type << " " << left << ", " << right << "\n";
}
left = temp;
tmpTable[temp] = type;
}
return left;
}
std::any LLVMIRGenerator::visitRelExp(SysYParser::RelExpContext* ctx) {
auto addExps = ctx->addExp();
std::string left = std::any_cast<std::string>(addExps[0]->accept(this));
for (size_t i = 1; i < addExps.size(); ++i) {
std::string right = std::any_cast<std::string>(addExps[i]->accept(this));
std::string op = ctx->children[2*i-1]->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[left];
if (op == "<") {
irStream << " " << temp << " = icmp slt " << type << " " << left << ", " << right << "\n";
} else if (op == ">") {
irStream << " " << temp << " = icmp sgt " << type << " " << left << ", " << right << "\n";
} else if (op == "<=") {
irStream << " " << temp << " = icmp sle " << type << " " << left << ", " << right << "\n";
} else if (op == ">=") {
irStream << " " << temp << " = icmp sge " << type << " " << left << ", " << right << "\n";
}
left = temp;
}
return left;
}
std::any LLVMIRGenerator::visitEqExp(SysYParser::EqExpContext* ctx) {
auto relExps = ctx->relExp();
std::string left = std::any_cast<std::string>(relExps[0]->accept(this));
for (size_t i = 1; i < relExps.size(); ++i) {
std::string right = std::any_cast<std::string>(relExps[i]->accept(this));
std::string op = ctx->children[2*i-1]->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[left];
if (op == "==") {
irStream << " " << temp << " = icmp eq " << type << " " << left << ", " << right << "\n";
} else if (op == "!=") {
irStream << " " << temp << " = icmp ne " << type << " " << left << ", " << right << "\n";
}
left = temp;
}
return left;
}
std::any LLVMIRGenerator::visitLAndExp(SysYParser::LAndExpContext* ctx) {
auto eqExps = ctx->eqExp();
std::string left = std::any_cast<std::string>(eqExps[0]->accept(this));
for (size_t i = 1; i < eqExps.size(); ++i) {
std::string falseLabel = "land.false." + std::to_string(tempCounter);
std::string endLabel = "land.end." + std::to_string(tempCounter++);
std::string temp = getNextTemp();
irStream << " br label %" << falseLabel << "\n";
irStream << falseLabel << ":\n";
std::string right = std::any_cast<std::string>(eqExps[i]->accept(this));
irStream << " " << temp << " = and i1 " << left << ", " << right << "\n";
irStream << " br label %" << endLabel << "\n";
irStream << endLabel << ":\n";
left = temp;
}
return left;
}
std::any LLVMIRGenerator::visitLOrExp(SysYParser::LOrExpContext* ctx) {
auto lAndExps = ctx->lAndExp();
std::string left = std::any_cast<std::string>(lAndExps[0]->accept(this));
for (size_t i = 1; i < lAndExps.size(); ++i) {
std::string trueLabel = "lor.true." + std::to_string(tempCounter);
std::string endLabel = "lor.end." + std::to_string(tempCounter++);
std::string temp = getNextTemp();
irStream << " br label %" << trueLabel << "\n";
irStream << trueLabel << ":\n";
std::string right = std::any_cast<std::string>(lAndExps[i]->accept(this));
irStream << " " << temp << " = or i1 " << left << ", " << right << "\n";
irStream << " br label %" << endLabel << "\n";
irStream << endLabel << ":\n";
left = temp;
}
return left;
}
}
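// Note on visitLAndExp / visitLOrExp above: they branch unconditionally and then
// combine both operands with `and` / `or`, so the right-hand side is always evaluated.
// The block below is a minimal sketch of short-circuit lowering for `lhs && rhs` in
// text IR. It is not the project's implementation: emitShortCircuitAnd, its parameters,
// and the land.rhs / land.val names are hypothetical. It assumes the right-hand side is
// computed entirely inside the land.rhs block; if that computation creates further
// blocks, the phi must name the last of them as its predecessor instead.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns text IR that evaluates `rhs` only when `lhs` is true.
// `curLabel` is the label of the block that produced `lhs`.
std::string emitShortCircuitAnd(const std::string& lhs, const std::string& rhs,
                                const std::string& curLabel, int id) {
    const std::string rhsLabel = "land.rhs." + std::to_string(id);
    const std::string endLabel = "land.end." + std::to_string(id);
    std::ostringstream ir;
    // If lhs is false, skip the right-hand side entirely.
    ir << "  br i1 " << lhs << ", label %" << rhsLabel << ", label %" << endLabel << "\n";
    ir << rhsLabel << ":\n";
    // (the instructions computing `rhs` would be emitted here)
    ir << "  br label %" << endLabel << "\n";
    ir << endLabel << ":\n";
    // Merge: false when we came straight from curLabel, otherwise the value of rhs.
    ir << "  %land.val." << id << " = phi i1 [ false, %" << curLabel << " ], [ "
       << rhs << ", %" << rhsLabel << " ]\n";
    return ir.str();
}
```

// For `||` the same shape applies with the branch targets swapped and `true` as the
// constant incoming value of the phi.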


@ -1,859 +0,0 @@
// LLVMIRGenerator.cpp
// TODO: type conversion and type checking
// TODO: SysY library function handling
// TODO: array handling
// TODO: test while, continue, and break
#include "LLVMIRGenerator_1.h"
#include <iomanip>
#include <stdexcept>
#include <sstream>
// namespace sysy {
std::string LLVMIRGenerator::generateIR(SysYParser::CompUnitContext* unit) {
// Initialize the SysY IR module
module = std::make_unique<sysy::Module>();
// Clear the symbol tables and the temporary tables
symbolTable.clear();
tmpTable.clear();
irSymbolTable.clear();
irTmpTable.clear();
tempCounter = 0;
globalVars.clear();
hasReturn = false;
loopStack = std::stack<LoopLabels>();
inFunction = false;
// Visit the compilation unit
visitCompUnit(unit);
return irStream.str();
}
std::string LLVMIRGenerator::getNextTemp() {
std::string ret = "%." + std::to_string(tempCounter++);
tmpTable[ret] = "void";
return ret;
}
std::string LLVMIRGenerator::getIRTempName() {
return "%" + std::to_string(tempCounter++);
}
std::string LLVMIRGenerator::getLLVMType(const std::string& type) {
if (type == "int") return "i32";
if (type == "float") return "float";
if (type.find("[]") != std::string::npos)
return getLLVMType(type.substr(0, type.size() - 2)) + "*";
return "i32";
}
sysy::Type* LLVMIRGenerator::getIRType(const std::string& type) {
if (type == "int") return sysy::Type::getIntType();
if (type == "float") return sysy::Type::getFloatType();
if (type == "void") return sysy::Type::getVoidType();
if (type.find("[]") != std::string::npos) {
std::string baseType = type.substr(0, type.size() - 2);
return sysy::Type::getPointerType(getIRType(baseType));
}
return sysy::Type::getIntType(); // default to int
}
void LLVMIRGenerator::setIRPosition(sysy::BasicBlock* block) {
currentIRBlock = block;
}
std::any LLVMIRGenerator::visitCompUnit(SysYParser::CompUnitContext* ctx) {
for (auto decl : ctx->decl()) {
decl->accept(this);
}
for (auto funcDef : ctx->funcDef()) {
inFunction = true;
funcDef->accept(this);
inFunction = false;
}
return nullptr;
}
std::any LLVMIRGenerator::visitVarDecl(SysYParser::VarDeclContext* ctx) {
// TODO: array initialization
std::string type = ctx->bType()->getText();
currentVarType = getLLVMType(type);
sysy::Type* irType = sysy::Type::getPointerType(getIRType(type));
for (auto varDef : ctx->varDef()) {
if (!inFunction) {
// Global variable (text IR)
std::string varName = varDef->Ident()->getText();
std::string llvmType = getLLVMType(type);
std::string value = "0";
sysy::Value* initValue = nullptr;
if (varDef->ASSIGN()) {
value = std::any_cast<std::string>(varDef->initVal()->accept(this));
if (irTmpTable.find(value) != irTmpTable.end() && sysy::isa<sysy::ConstantValue>(irTmpTable[value])) {
initValue = irTmpTable[value];
}
}
if (llvmType == "float" && initValue) {
try {
double floatValue = std::stod(value);
uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
std::stringstream ss;
ss << "0x" << std::hex << std::uppercase << hexValue;
value = ss.str();
} catch (...) {
throw std::runtime_error("[ERR-Release-02]Invalid float literal: " + value);
}
}
irStream << "@" << varName << " = dso_local global " << llvmType << " " << value << ", align 4\n";
globalVars.push_back(varName);
// Global variable (SysY IR)
auto globalValue = module->createGlobalValue(varName, irType, {}, initValue);
irSymbolTable[varName] = globalValue;
} else {
varDef->accept(this);
}
}
return nullptr;
}
std::any LLVMIRGenerator::visitConstDecl(SysYParser::ConstDeclContext* ctx) {
// TODO: array initialization
std::string type = ctx->bType()->getText();
currentVarType = getLLVMType(type);
sysy::Type* irType = sysy::Type::getPointerType(getIRType(type)); // global variables are pointer-typed
for (auto constDef : ctx->constDef()) {
std::string varName = constDef->Ident()->getText();
std::string llvmType = getLLVMType(type);
std::string value = "0";
sysy::Value* initValue = nullptr;
try {
value = std::any_cast<std::string>(constDef->constInitVal()->accept(this));
if (sysy::isa<sysy::ConstantValue>(irTmpTable[value])) {
initValue = irTmpTable[value];
}
} catch (...) {
throw std::runtime_error("Const value must be initialized upon definition.");
}
if (!inFunction) {
// Global constant (text IR)
if (llvmType == "float") {
try {
double floatValue = std::stod(value);
uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
std::stringstream ss;
ss << "0x" << std::hex << std::uppercase << hexValue;
value = ss.str();
} catch (...) {
throw std::runtime_error("[ERR-Release-03]Invalid float literal: " + value);
}
}
irStream << "@" << varName << " = dso_local constant " << llvmType << " " << value << ", align 4\n";
globalVars.push_back(varName);
// Global constant (SysY IR)
auto globalValue = module->createGlobalValue(varName, irType, {}, initValue);
irSymbolTable[varName] = globalValue;
} else {
// Local constant (text IR)
std::string allocaName = getNextTemp();
if (llvmType == "float") {
try {
double floatValue = std::stod(value);
uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
std::stringstream ss;
ss << "0x" << std::hex << std::uppercase << hexValue;
value = ss.str();
} catch (...) {
throw std::runtime_error("Invalid float literal: " + value);
}
}
irStream << " " << allocaName << " = alloca " << llvmType << ", align 4\n";
irStream << " store " << llvmType << " " << value << ", " << llvmType
<< "* " << allocaName << ", align 4\n";
symbolTable[varName] = {allocaName, llvmType};
tmpTable[allocaName] = llvmType;
// Local constant (SysY IR). TODO: this may be buggy; the AI-generated code here looks wrong.
sysy::IRBuilder builder(currentIRBlock);
auto allocaInst = builder.createAllocaInst(irType, {}, varName);
builder.createStoreInst(initValue, allocaInst);
irSymbolTable[varName] = allocaInst;
irTmpTable[allocaName] = allocaInst;
}
}
return nullptr;
}
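// Note on the float handling above: LLVM text IR writes a `float` constant as the
// 64-bit hex pattern of a double that is exactly representable in single precision.
// The copies above either emit the raw double bits or mask off the low 32 bits
// (truncating instead of rounding), and they read the bits through reinterpret_cast.
// The block below is a sketch of one alternative, not the project's code: the helper
// name floatLiteralToLLVMHex is hypothetical. It rounds to float first, widens back to
// double, and copies the bits with std::memcpy.

```cpp
#include <cstdint>
#include <cstring>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: converts a decimal literal such as "0.1" into the hex form
// LLVM accepts for a `float` constant.
std::string floatLiteralToLLVMHex(const std::string& literal) {
    // Round to single precision, then widen; the result is exactly representable
    // as a float, which is what LLVM requires for `float` constants.
    const float narrowed = std::stof(literal);
    const double widened = static_cast<double>(narrowed);

    // Copy the bit pattern without violating strict aliasing.
    std::uint64_t bits = 0;
    static_assert(sizeof(bits) == sizeof(widened), "double must be 64 bits wide");
    std::memcpy(&bits, &widened, sizeof(bits));

    // Always print all 16 hex digits.
    std::ostringstream out;
    out << "0x" << std::hex << std::uppercase << std::setfill('0') << std::setw(16) << bits;
    return out.str();
}
```

// Example: floatLiteralToLLVMHex("0.1") yields 0x3FB99999A0000000, so a global could be
// written as `@x = dso_local global float 0x3FB99999A0000000, align 4`.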
std::any LLVMIRGenerator::visitVarDef(SysYParser::VarDefContext* ctx) {
// TODO: array initialization
std::string varName = ctx->Ident()->getText();
std::string llvmType = currentVarType;
sysy::Type* irType = sysy::Type::getPointerType(getIRType(currentVarType == "i32" ? "int" : "float"));
std::string allocaName = getNextTemp();
// Local variable (text IR)
irStream << " " << allocaName << " = alloca " << llvmType << ", align 4\n";
// Local variable (SysY IR)
sysy::IRBuilder builder(currentIRBlock);
auto allocaInst = builder.createAllocaInst(irType, {}, varName);
sysy::Value* initValue = nullptr;
if (ctx->ASSIGN()) {
std::string value = std::any_cast<std::string>(ctx->initVal()->accept(this));
// Resolve the SysY IR value before `value` is rewritten into hex form below.
auto irIt = irTmpTable.find(value);
if (irIt != irTmpTable.end()) {
initValue = irIt->second;
}
// Only literals are converted; a temporary such as %.3 is stored as-is.
if (llvmType == "float" && !value.empty() && value[0] != '%') {
try {
double floatValue = std::stod(value);
uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
std::stringstream ss;
ss << "0x" << std::hex << std::uppercase << (hexValue & (0xffffffffULL << 32));
value = ss.str();
} catch (...) {
throw std::runtime_error("Invalid float literal: " + value);
}
}
irStream << " store " << llvmType << " " << value << ", " << llvmType
<< "* " << allocaName << ", align 4\n";
// Emit the SysY IR store once, and only when an initializer value was resolved.
if (initValue) {
builder.createStoreInst(initValue, allocaInst);
}
}
symbolTable[varName] = {allocaName, llvmType};
tmpTable[allocaName] = llvmType;
// Register the alloca so later loads and stores can find the variable.
irSymbolTable[varName] = allocaInst;
irTmpTable[allocaName] = allocaInst;
return nullptr;
}
std::any LLVMIRGenerator::visitFuncDef(SysYParser::FuncDefContext* ctx) {
currentFunction = ctx->Ident()->getText();
currentReturnType = getLLVMType(ctx->funcType()->getText());
sysy::Type* irReturnType = getIRType(ctx->funcType()->getText());
std::vector<sysy::Type*> paramTypes;
// Clear the per-function symbol tables
symbolTable.clear();
tmpTable.clear();
irSymbolTable.clear();
irTmpTable.clear();
tempCounter = 0;
hasReturn = false;
// Process the function parameters (text IR and SysY IR)
std::string paramList;
if (ctx->funcFParams()) {
auto params = ctx->funcFParams()->funcFParam();
for (size_t i = 0; i < params.size(); ++i) {
std::string paramType = getLLVMType(params[i]->bType()->getText());
if (i > 0) paramList += ", ";
paramList += paramType + " noundef %" + std::to_string(i);
symbolTable[params[i]->Ident()->getText()] = {"%" + std::to_string(i), paramType};
tmpTable["%" + std::to_string(i)] = paramType;
paramTypes.push_back(getIRType(params[i]->bType()->getText()));
}
tempCounter += params.size();
}
tempCounter++;
// Text IR function definition; the parameter list must appear inside the parentheses,
// not before the `define` keyword.
irStream << "define dso_local " << currentReturnType << " @" << currentFunction << "(";
irStream << paramList << ") #0 {\n";
// SysY IR function definition
sysy::Type* funcType = sysy::Type::getFunctionType(irReturnType, paramTypes);
currentIRFunction = module->createFunction(currentFunction, funcType);
setIRPosition(currentIRFunction->getEntryBlock());
// Allocate stack slots for the function parameters
if (ctx->funcFParams()) {
auto params = ctx->funcFParams()->funcFParam();
for (size_t i = 0; i < params.size(); ++i) {
std::string varName = params[i]->Ident()->getText();
std::string llvmType = getLLVMType(params[i]->bType()->getText());
sysy::Type* irType = getIRType(params[i]->bType()->getText());
std::string allocaName = getNextTemp();
tmpTable[allocaName] = llvmType;
// Text IR allocation
irStream << " " << allocaName << " = alloca " << llvmType << ", align 4\n";
irStream << " store " << llvmType << " %" << i << ", " << llvmType
<< "* " << allocaName << ", align 4\n";
// SysY IR allocation
sysy::IRBuilder builder(currentIRBlock);
auto arg = currentIRBlock->createArgument(irType, varName);
auto allocaInst = builder.createAllocaInst(sysy::Type::getPointerType(irType), {}, varName);
builder.createStoreInst(arg, allocaInst);
symbolTable[varName] = {allocaName, llvmType};
irSymbolTable[varName] = allocaInst;
irTmpTable[allocaName] = allocaInst;
}
}
ctx->blockStmt()->accept(this);
if (!hasReturn) {
if (currentReturnType == "void") {
irStream << " ret void\n";
sysy::IRBuilder builder(currentIRBlock);
builder.createReturnInst();
} else {
irStream << " ret " << currentReturnType << " 0\n";
sysy::IRBuilder builder(currentIRBlock);
builder.createReturnInst(sysy::ConstantValue::get(getIRType("int"),0));
}
}
irStream << "}\n";
currentIRFunction = nullptr;
currentIRBlock = nullptr;
return nullptr;
}
std::any LLVMIRGenerator::visitBlockStmt(SysYParser::BlockStmtContext* ctx) {
for (auto item : ctx->blockItem()) {
item->accept(this);
}
return nullptr;
}
std::any LLVMIRGenerator::visitAssignStmt(SysYParser::AssignStmtContext* ctx) {
std::string lhsAlloca = std::any_cast<std::string>(ctx->lValue()->accept(this));
std::string lhsType = symbolTable[ctx->lValue()->Ident()->getText()].second;
std::string rhs = std::any_cast<std::string>(ctx->exp()->accept(this));
sysy::Value* rhsValue = irTmpTable[rhs];
// Text IR
// Only literals are converted to hex; a temporary such as %.3 is stored as-is.
if (lhsType == "float" && !rhs.empty() && rhs[0] != '%') {
try {
double floatValue = std::stod(rhs);
uint64_t hexValue = reinterpret_cast<uint64_t&>(floatValue);
std::stringstream ss;
ss << "0x" << std::hex << std::uppercase << (hexValue & (0xffffffffULL << 32));
rhs = ss.str();
} catch (...) {
throw std::runtime_error("Invalid float literal: " + rhs);
}
}
irStream << " store " << lhsType << " " << rhs << ", " << lhsType
<< "* " << lhsAlloca << ", align 4\n";
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
builder.createStoreInst(rhsValue, irSymbolTable[ctx->lValue()->Ident()->getText()]);
return nullptr;
}
std::any LLVMIRGenerator::visitIfStmt(SysYParser::IfStmtContext* ctx) {
std::string cond = std::any_cast<std::string>(ctx->cond()->accept(this));
sysy::Value* condValue = irTmpTable[cond];
std::string trueLabel = "if.then." + std::to_string(tempCounter);
std::string falseLabel = "if.else." + std::to_string(tempCounter);
std::string mergeLabel = "if.end." + std::to_string(tempCounter++);
// SysY IR basic blocks
sysy::BasicBlock* thenBlock = currentIRFunction->addBasicBlock(trueLabel);
sysy::BasicBlock* elseBlock = ctx->ELSE() ? currentIRFunction->addBasicBlock(falseLabel) : nullptr;
sysy::BasicBlock* mergeBlock = currentIRFunction->addBasicBlock(mergeLabel);
// Text IR
irStream << " br i1 " << cond << ", label %" << trueLabel << ", label %"
<< (ctx->ELSE() ? falseLabel : mergeLabel) << "\n";
// SysY IR conditional branch
sysy::IRBuilder builder(currentIRBlock);
builder.createCondBrInst(condValue, thenBlock, ctx->ELSE() ? elseBlock : mergeBlock, {}, {});
// Handle the then branch
setIRPosition(thenBlock);
irStream << trueLabel << ":\n";
ctx->stmt(0)->accept(this);
irStream << " br label %" << mergeLabel << "\n";
builder.setPosition(thenBlock, thenBlock->end());
builder.createUncondBrInst(mergeBlock, {});
// Handle the else branch
if (ctx->ELSE()) {
setIRPosition(elseBlock);
irStream << falseLabel << ":\n";
ctx->stmt(1)->accept(this);
irStream << " br label %" << mergeLabel << "\n";
builder.setPosition(elseBlock, elseBlock->end());
builder.createUncondBrInst(mergeBlock, {});
}
// Merge point
setIRPosition(mergeBlock);
irStream << mergeLabel << ":\n";
return nullptr;
}
std::any LLVMIRGenerator::visitWhileStmt(SysYParser::WhileStmtContext* ctx) {
std::string loopCond = "while.cond." + std::to_string(tempCounter);
std::string loopBody = "while.body." + std::to_string(tempCounter);
std::string loopEnd = "while.end." + std::to_string(tempCounter++);
// SysY IR basic blocks
sysy::BasicBlock* condBlock = currentIRFunction->addBasicBlock(loopCond);
sysy::BasicBlock* bodyBlock = currentIRFunction->addBasicBlock(loopBody);
sysy::BasicBlock* endBlock = currentIRFunction->addBasicBlock(loopEnd);
loopStack.push({loopEnd, loopCond, endBlock, condBlock});
// Jump to the condition block
sysy::IRBuilder builder(currentIRBlock);
builder.createUncondBrInst(condBlock, {});
irStream << " br label %" << loopCond << "\n";
// Condition block
setIRPosition(condBlock);
irStream << loopCond << ":\n";
std::string cond = std::any_cast<std::string>(ctx->cond()->accept(this));
sysy::Value* condValue = irTmpTable[cond];
irStream << " br i1 " << cond << ", label %" << loopBody << ", label %" << loopEnd << "\n";
builder.setPosition(condBlock, condBlock->end());
builder.createCondBrInst(condValue, bodyBlock, endBlock, {}, {});
// Loop body
setIRPosition(bodyBlock);
irStream << loopBody << ":\n";
ctx->stmt()->accept(this);
irStream << " br label %" << loopCond << "\n";
builder.setPosition(bodyBlock, bodyBlock->end());
builder.createUncondBrInst(condBlock, {});
// End block
setIRPosition(endBlock);
irStream << loopEnd << ":\n";
loopStack.pop();
return nullptr;
}
std::any LLVMIRGenerator::visitBreakStmt(SysYParser::BreakStmtContext* ctx) {
if (loopStack.empty()) {
throw std::runtime_error("Break statement outside of a loop.");
}
irStream << " br label %" << loopStack.top().breakLabel << "\n";
sysy::IRBuilder builder(currentIRBlock);
builder.createUncondBrInst(loopStack.top().irBreakBlock, {});
return nullptr;
}
std::any LLVMIRGenerator::visitContinueStmt(SysYParser::ContinueStmtContext* ctx) {
if (loopStack.empty()) {
throw std::runtime_error("Continue statement outside of a loop.");
}
irStream << " br label %" << loopStack.top().continueLabel << "\n";
sysy::IRBuilder builder(currentIRBlock);
builder.createUncondBrInst(loopStack.top().irContinueBlock, {});
return nullptr;
}
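// Note on visitWhileStmt / visitBreakStmt / visitContinueStmt above: each loop pushes
// its exit and condition labels onto loopStack, and break / continue branch to the
// labels on top of that stack, which is what makes nested loops resolve to the
// innermost one. The block below is a stripped-down sketch of that bookkeeping for the
// text IR only. LoopTracker and its member names are hypothetical and do not appear in
// the original sources; the SysY IR block pointers are left out.

```cpp
#include <stack>
#include <stdexcept>
#include <string>

struct LoopLabels {
    std::string breakLabel;     // where `break` jumps: the loop's end block
    std::string continueLabel;  // where `continue` jumps: the loop's condition block
};

// Hypothetical helper that tracks the innermost enclosing loop.
class LoopTracker {
public:
    void enter(int id) {
        loops_.push({"while.end." + std::to_string(id), "while.cond." + std::to_string(id)});
    }
    void leave() {
        if (loops_.empty()) throw std::logic_error("leave() called without a matching enter().");
        loops_.pop();
    }
    std::string emitBreak() const { return branchTo(innermost("Break").breakLabel); }
    std::string emitContinue() const { return branchTo(innermost("Continue").continueLabel); }

private:
    const LoopLabels& innermost(const std::string& what) const {
        if (loops_.empty()) throw std::runtime_error(what + " statement outside of a loop.");
        return loops_.top();
    }
    static std::string branchTo(const std::string& label) {
        return "  br label %" + label + "\n";
    }
    std::stack<LoopLabels> loops_;
};
```

// With two nested loops entered as ids 0 and 1, a break inside the inner loop targets
// while.end.1; after leaving the inner loop, a continue targets while.cond.0.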
std::any LLVMIRGenerator::visitReturnStmt(SysYParser::ReturnStmtContext* ctx) {
hasReturn = true;
sysy::IRBuilder builder(currentIRBlock);
if (ctx->exp()) {
std::string value = std::any_cast<std::string>(ctx->exp()->accept(this));
sysy::Value* irValue = irTmpTable[value];
irStream << " ret " << currentReturnType << " " << value << "\n";
builder.createReturnInst(irValue);
} else {
irStream << " ret void\n";
builder.createReturnInst();
}
return nullptr;
}
std::any LLVMIRGenerator::visitLValue(SysYParser::LValueContext* ctx) {
std::string varName = ctx->Ident()->getText();
if (irSymbolTable.find(varName) == irSymbolTable.end()) {
throw std::runtime_error("Undefined variable: " + varName);
}
// For an LValue, return the allocated pointer (the same name is used in text IR and SysY IR)
return symbolTable[varName].first;
}
std::any LLVMIRGenerator::visitPrimExp(SysYParser::PrimExpContext* ctx) {
SysYParser::PrimaryExpContext* pExpCtx = ctx->primaryExp();
if (auto* lvalCtx = dynamic_cast<SysYParser::LValContext*>(pExpCtx)) {
std::string allocaPtr = std::any_cast<std::string>(lvalCtx->lValue()->accept(this));
std::string varName = lvalCtx->lValue()->Ident()->getText();
std::string type = symbolTable[varName].second;
std::string temp = getNextTemp();
sysy::Type* irType = getIRType(type == "i32" ? "int" : "float");
// Text IR
irStream << " " << temp << " = load " << type << ", " << type << "* " << allocaPtr << ", align 4\n";
tmpTable[temp] = type;
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
auto loadInst = builder.createLoadInst(irSymbolTable[varName], {});
irTmpTable[temp] = loadInst;
return temp;
} else if (auto* expCtx = dynamic_cast<SysYParser::ParenExpContext*>(pExpCtx)) {
return expCtx->exp()->accept(this);
} else if (auto* strCtx = dynamic_cast<SysYParser::StrContext*>(pExpCtx)) {
return strCtx->string()->accept(this);
} else if (auto* numCtx = dynamic_cast<SysYParser::NumContext*>(pExpCtx)) {
return numCtx->number()->accept(this);
} else {
// None of the casts succeeded, so ctx->primaryExp() is not a NumContext or any other known type.
// It may be some other kind of expression, or an empty PrimaryExpContext.
std::cout << "Unknown primary expression type." << std::endl;
throw std::runtime_error("Unknown primary expression type.");
}
}
std::any LLVMIRGenerator::visitParenExp(SysYParser::ParenExpContext* ctx) {
return ctx->exp()->accept(this);
}
std::any LLVMIRGenerator::visitNumber(SysYParser::NumberContext* ctx) {
std::string value;
sysy::Value* irValue = nullptr;
if (ctx->ILITERAL()) {
value = ctx->ILITERAL()->getText();
// Base 0 lets std::stoi accept decimal, octal (0...), and hexadecimal (0x...) literals.
irValue = sysy::ConstantValue::get(getIRType("int"), std::stoi(value, nullptr, 0));
} else if (ctx->FLITERAL()) {
value = ctx->FLITERAL()->getText();
irValue = sysy::ConstantValue::get(getIRType("float"), std::stof(value));
}
// Register the literal under its own text, because callers look operands up
// in tmpTable / irTmpTable by the string returned here.
if (irValue) {
tmpTable[value] = ctx->ILITERAL() ? "i32" : "float";
irTmpTable[value] = irValue;
}
return value;
}
std::any LLVMIRGenerator::visitString(SysYParser::StringContext* ctx) {
if (ctx->STRING()) {
std::string str = ctx->STRING()->getText();
str = str.substr(1, str.size() - 2);
std::string escapedStr;
for (char c : str) {
if (c == '\\') {
escapedStr += "\\\\";
} else if (c == '"') {
escapedStr += "\\\"";
} else {
escapedStr += c;
}
}
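// TODO: SysY IR does not support string constants yet; return the text IR result.
return "\"" + escapedStr + "\"";
}
// No STRING token: return an empty string instead of dereferencing a null node.
return std::string();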
}
std::any LLVMIRGenerator::visitUnExp(SysYParser::UnExpContext* ctx) {
if (ctx->unaryOp()) {
std::string operand = std::any_cast<std::string>(ctx->unaryExp()->accept(this));
sysy::Value* irOperand = irTmpTable[operand];
std::string op = ctx->unaryOp()->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[operand];
sysy::Type* irType = getIRType(type == "i32" ? "int" : "float");
tmpTable[temp] = type;
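// Text IR
if (op == "-") {
if (type == "float") {
// Integer `sub` is not valid on float operands; use fneg instead.
irStream << " " << temp << " = fneg float " << operand << "\n";
} else {
irStream << " " << temp << " = sub " << type << " 0, " << operand << "\n";
}
} else if (op == "!") {
irStream << " " << temp << " = xor " << type << " " << operand << ", 1\n";
}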
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
sysy::Instruction::Kind kind = (op == "-") ? (type == "i32" ? sysy::Instruction::kNeg : sysy::Instruction::kFNeg)
: sysy::Instruction::kNot;
auto unaryInst = builder.createUnaryInst(kind, irType, irOperand, temp);
irTmpTable[temp] = unaryInst;
return temp;
}
return ctx->unaryExp()->accept(this);
}
std::any LLVMIRGenerator::visitCall(SysYParser::CallContext* ctx) {
std::string funcName = ctx->Ident()->getText();
std::vector<std::string> args;
std::vector<sysy::Value*> irArgs;
if (ctx->funcRParams()) {
for (auto argCtx : ctx->funcRParams()->exp()) {
std::string arg = std::any_cast<std::string>(argCtx->accept(this));
args.push_back(arg);
irArgs.push_back(irTmpTable[arg]);
}
}
std::string temp = getNextTemp();
std::string argList;
for (size_t i = 0; i < args.size(); ++i) {
if (i > 0) argList += ", ";
argList += tmpTable[args[i]] + " noundef " + args[i];
}
// Text IR
irStream << " " << temp << " = call " << currentReturnType << " @" << funcName << "(" << argList << ")\n";
tmpTable[temp] = currentReturnType;
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
sysy::Function* callee = module->getFunction(funcName);
if (!callee) {
throw std::runtime_error("Undefined function: " + funcName);
}
auto callInst = builder.createCallInst(callee, irArgs, temp);
irTmpTable[temp] = callInst;
return temp;
}
std::any LLVMIRGenerator::visitMulExp(SysYParser::MulExpContext* ctx) {
auto unaryExps = ctx->unaryExp();
std::string left = std::any_cast<std::string>(unaryExps[0]->accept(this));
sysy::Value* irLeft = irTmpTable[left];
sysy::Type* irType = irLeft->getType();
for (size_t i = 1; i < unaryExps.size(); ++i) {
std::string right = std::any_cast<std::string>(unaryExps[i]->accept(this));
sysy::Value* irRight = irTmpTable[right];
std::string op = ctx->children[2 * i - 1]->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[left];
tmpTable[temp] = type;
// Text IR (float operands need the f-prefixed opcodes)
const bool isFloat = (type == "float");
if (op == "*") {
irStream << " " << temp << (isFloat ? " = fmul " : " = mul nsw ") << type << " " << left << ", " << right << "\n";
} else if (op == "/") {
irStream << " " << temp << (isFloat ? " = fdiv " : " = sdiv ") << type << " " << left << ", " << right << "\n";
} else if (op == "%") {
irStream << " " << temp << (isFloat ? " = frem " : " = srem ") << type << " " << left << ", " << right << "\n";
}
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
sysy::Instruction::Kind kind;
if (type == "i32") {
if (op == "*") kind = sysy::Instruction::kMul;
else if (op == "/") kind = sysy::Instruction::kDiv;
else kind = sysy::Instruction::kRem;
} else {
if (op == "*") kind = sysy::Instruction::kFMul;
else if (op == "/") kind = sysy::Instruction::kFDiv;
else kind = sysy::Instruction::kFRem;
}
auto binaryInst = builder.createBinaryInst(kind, irType, irLeft, irRight, temp);
irTmpTable[temp] = binaryInst;
left = temp;
irLeft = binaryInst;
}
return left;
}
std::any LLVMIRGenerator::visitAddExp(SysYParser::AddExpContext* ctx) {
auto mulExps = ctx->mulExp();
std::string left = std::any_cast<std::string>(mulExps[0]->accept(this));
sysy::Value* irLeft = irTmpTable[left];
sysy::Type* irType = irLeft->getType();
for (size_t i = 1; i < mulExps.size(); ++i) {
std::string right = std::any_cast<std::string>(mulExps[i]->accept(this));
sysy::Value* irRight = irTmpTable[right];
std::string op = ctx->children[2 * i - 1]->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[left];
tmpTable[temp] = type;
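// Text IR (float operands need the f-prefixed opcodes)
const bool isFloat = (type == "float");
if (op == "+") {
irStream << " " << temp << (isFloat ? " = fadd " : " = add nsw ") << type << " " << left << ", " << right << "\n";
} else if (op == "-") {
irStream << " " << temp << (isFloat ? " = fsub " : " = sub nsw ") << type << " " << left << ", " << right << "\n";
}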
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
sysy::Instruction::Kind kind = (type == "i32") ? (op == "+" ? sysy::Instruction::kAdd : sysy::Instruction::kSub)
: (op == "+" ? sysy::Instruction::kFAdd : sysy::Instruction::kFSub);
auto binaryInst = builder.createBinaryInst(kind, irType, irLeft, irRight, temp);
irTmpTable[temp] = binaryInst;
left = temp;
irLeft = binaryInst;
}
return left;
}
std::any LLVMIRGenerator::visitRelExp(SysYParser::RelExpContext* ctx) {
auto addExps = ctx->addExp();
std::string left = std::any_cast<std::string>(addExps[0]->accept(this));
sysy::Value* irLeft = irTmpTable[left];
sysy::Type* irType = sysy::Type::getIntType(); // the comparison result is an i1
for (size_t i = 1; i < addExps.size(); ++i) {
std::string right = std::any_cast<std::string>(addExps[i]->accept(this));
sysy::Value* irRight = irTmpTable[right];
std::string op = ctx->children[2 * i - 1]->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[left];
tmpTable[temp] = "i1";
// Text IR (float operands are compared with fcmp, integers with icmp)
const bool isFloat = (type == "float");
if (op == "<") {
irStream << " " << temp << (isFloat ? " = fcmp olt " : " = icmp slt ") << type << " " << left << ", " << right << "\n";
} else if (op == ">") {
irStream << " " << temp << (isFloat ? " = fcmp ogt " : " = icmp sgt ") << type << " " << left << ", " << right << "\n";
} else if (op == "<=") {
irStream << " " << temp << (isFloat ? " = fcmp ole " : " = icmp sle ") << type << " " << left << ", " << right << "\n";
} else if (op == ">=") {
irStream << " " << temp << (isFloat ? " = fcmp oge " : " = icmp sge ") << type << " " << left << ", " << right << "\n";
}
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
sysy::Instruction::Kind kind;
if (type == "i32") {
if (op == "<") kind = sysy::Instruction::kICmpLT;
else if (op == ">") kind = sysy::Instruction::kICmpGT;
else if (op == "<=") kind = sysy::Instruction::kICmpLE;
else kind = sysy::Instruction::kICmpGE;
} else {
if (op == "<") kind = sysy::Instruction::kFCmpLT;
else if (op == ">") kind = sysy::Instruction::kFCmpGT;
else if (op == "<=") kind = sysy::Instruction::kFCmpLE;
else kind = sysy::Instruction::kFCmpGE;
}
auto cmpInst = builder.createBinaryInst(kind, irType, irLeft, irRight, temp);
irTmpTable[temp] = cmpInst;
left = temp;
irLeft = cmpInst;
}
return left;
}
std::any LLVMIRGenerator::visitEqExp(SysYParser::EqExpContext* ctx) {
auto relExps = ctx->relExp();
std::string left = std::any_cast<std::string>(relExps[0]->accept(this));
sysy::Value* irLeft = irTmpTable[left];
sysy::Type* irType = sysy::Type::getIntType(); // the comparison result is i1
for (size_t i = 1; i < relExps.size(); ++i) {
std::string right = std::any_cast<std::string>(relExps[i]->accept(this));
sysy::Value* irRight = irTmpTable[right];
std::string op = ctx->children[2 * i - 1]->getText();
std::string temp = getNextTemp();
std::string type = tmpTable[left];
tmpTable[temp] = "i1";
// Textual IR
if (op == "==") {
irStream << " " << temp << " = icmp eq " << type << " " << left << ", " << right << "\n";
} else if (op == "!=") {
irStream << " " << temp << " = icmp ne " << type << " " << left << ", " << right << "\n";
}
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
sysy::Instruction::Kind kind = (type == "i32") ? (op == "==" ? sysy::Instruction::kICmpEQ : sysy::Instruction::kICmpNE)
: (op == "==" ? sysy::Instruction::kFCmpEQ : sysy::Instruction::kFCmpNE);
auto cmpInst = builder.createBinaryInst(kind, irType, irLeft, irRight, temp);
irTmpTable[temp] = cmpInst;
left = temp;
irLeft = cmpInst;
}
return left;
}
std::any LLVMIRGenerator::visitLAndExp(SysYParser::LAndExpContext* ctx) {
auto eqExps = ctx->eqExp();
std::string left = std::any_cast<std::string>(eqExps[0]->accept(this));
sysy::Value* irLeft = irTmpTable[left];
for (size_t i = 1; i < eqExps.size(); ++i) {
std::string falseLabel = "land.false." + std::to_string(tempCounter);
std::string endLabel = "land.end." + std::to_string(tempCounter++);
sysy::BasicBlock* falseBlock = currentIRFunction->addBasicBlock(falseLabel);
sysy::BasicBlock* endBlock = currentIRFunction->addBasicBlock(endLabel);
std::string temp = getNextTemp();
tmpTable[temp] = "i1";
// Textual IR
irStream << " br i1 " << left << ", label %" << falseLabel << ", label %" << endLabel << "\n";
irStream << falseLabel << ":\n";
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
builder.createCondBrInst(irLeft, falseBlock, endBlock, {}, {});
setIRPosition(falseBlock);
std::string right = std::any_cast<std::string>(eqExps[i]->accept(this));
sysy::Value* irRight = irTmpTable[right];
irStream << " " << temp << " = and i1 " << left << ", " << right << "\n";
irStream << " br label %" << endLabel << "\n";
irStream << endLabel << ":\n";
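// SysY IR: logical AND (short-circuit evaluation implemented with basic blocks)
// NOTE: the SysY IR side emits kICmpEQ here, while the textual IR above emits `and`.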
builder.setPosition(falseBlock, falseBlock->end());
auto andInst = builder.createBinaryInst(sysy::Instruction::kICmpEQ, sysy::Type::getIntType(), irLeft, irRight, temp);
builder.createUncondBrInst(endBlock, {});
irTmpTable[temp] = andInst;
left = temp;
irLeft = andInst;
setIRPosition(endBlock);
}
return left;
}
std::any LLVMIRGenerator::visitLOrExp(SysYParser::LOrExpContext* ctx) {
auto lAndExps = ctx->lAndExp();
std::string left = std::any_cast<std::string>(lAndExps[0]->accept(this));
sysy::Value* irLeft = irTmpTable[left];
for (size_t i = 1; i < lAndExps.size(); ++i) {
std::string trueLabel = "lor.true." + std::to_string(tempCounter);
std::string endLabel = "lor.end." + std::to_string(tempCounter++);
sysy::BasicBlock* trueBlock = currentIRFunction->addBasicBlock(trueLabel);
sysy::BasicBlock* endBlock = currentIRFunction->addBasicBlock(endLabel);
std::string temp = getNextTemp();
tmpTable[temp] = "i1";
// Textual IR
irStream << " br i1 " << left << ", label %" << trueLabel << ", label %" << endLabel << "\n";
irStream << trueLabel << ":\n";
// SysY IR
sysy::IRBuilder builder(currentIRBlock);
builder.createCondBrInst(irLeft, trueBlock, endBlock, {}, {});
setIRPosition(trueBlock);
std::string right = std::any_cast<std::string>(lAndExps[i]->accept(this));
sysy::Value* irRight = irTmpTable[right];
irStream << " " << temp << " = or i1 " << left << ", " << right << "\n";
irStream << " br label %" << endLabel << "\n";
irStream << endLabel << ":\n";
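// SysY IR: logical OR (short-circuit evaluation implemented with basic blocks)
// NOTE: the SysY IR side emits kICmpEQ here, while the textual IR above emits `or`.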
builder.setPosition(trueBlock, trueBlock->end());
auto orInst = builder.createBinaryInst(sysy::Instruction::kICmpEQ, sysy::Type::getIntType(), irLeft, irRight, temp);
builder.createUncondBrInst(endBlock, {});
irTmpTable[temp] = orInst;
left = temp;
irLeft = orInst;
setIRPosition(endBlock);
}
return left;
}
// } // namespace sysy

801
src/Mem2Reg.cpp Normal file

@ -0,0 +1,801 @@
#include "Mem2Reg.h"
#include <algorithm>
#include <cassert>
#include <iostream>
#include <iterator>
#include <list>
#include <memory>
#include <queue>
#include <stack>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>
#include "IR.h"
#include "SysYIRAnalyser.h"
#include "SysYIRPrinter.h"
namespace sysy {
// Compute the iterated dominance frontier of a variable's set of defining blocks.
// TODO: optimize with a semi-naive IDF computation.
std::unordered_set<BasicBlock *> Mem2Reg::computeIterDf(const std::unordered_set<BasicBlock *> &blocks) {
std::unordered_set<BasicBlock *> workList;
std::unordered_set<BasicBlock *> ret_list;
workList.insert(blocks.begin(), blocks.end());
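while (!workList.empty()) {
// Take the block out of the work list before inserting into it: an insertion
// may rehash the unordered_set and invalidate iterators obtained earlier.
BasicBlock *n = *workList.begin();
workList.erase(workList.begin());
BlockAnalysisInfo* blockInfo = controlFlowAnalysis->getBlockAnalysisInfo(n);
auto DFs = blockInfo->getDomFrontiers();
for (auto c : DFs) {
// c is a block in the dominance frontier of n.
// Frontiers of different blocks can overlap, so c is added to ret_list
// and workList only if it has not been seen before.
if (ret_list.count(c) == 0U) {
ret_list.emplace(c);
workList.emplace(c);
}
}
}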
return ret_list;
}
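/**
 * Compute the value-to-blocks mappings: value2AllocBlocks, value2DefBlocks, and value2UseBlocks.
 * value2DefBlocks is later used to compute the iterated dominance frontier, which decides where to insert the phi nodes of each variable.
 * Filling value2AllocBlocks, value2DefBlocks, and value2UseBlocks modifies the function-level analysis information.
 */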
auto Mem2Reg::computeValue2Blocks() -> void {
SysYPrinter printer(pModule); // initialize the printer
// std::cout << "===== Start computeValue2Blocks =====" << std::endl;
auto &functions = pModule->getFunctions();
for (const auto &function : functions) {
auto func = function.second.get();
// std::cout << "\nProcessing function: " << func->getName() << std::endl;
FunctionAnalysisInfo* funcInfo = controlFlowAnalysis->getFunctionAnalysisInfo(func);
if (!funcInfo) {
std::cerr << "ERROR: No analysis info for function " << func->getName() << std::endl;
continue;
}
auto basicBlocks = func->getBasicBlocks();
// std::cout << "BasicBlocks count: " << basicBlocks.size() << std::endl;
for (auto &it : basicBlocks) {
auto basicBlock = it.get();
// std::cout << "\nProcessing BB: " << basicBlock->getName() << std::endl;
// printer.printBlock(basicBlock); // print the basic block contents
auto &instrs = basicBlock->getInstructions();
for (auto &instr : instrs) {
// std::cout << " Analyzing instruction: ";
// printer.printInst(instr.get());
// std::cout << std::endl;
if (instr->isAlloca()) {
if (!(isArr(instr.get()) || isGlobal(instr.get()))) {
// std::cout << " Found alloca: ";
// printer.printInst(instr.get());
// std::cout << " -> Adding to allocBlocks" << std::endl;
funcInfo->addValue2AllocBlocks(instr.get(), basicBlock);
} else {
// std::cout << " Skip array/global alloca: ";
// printer.printInst(instr.get());
// std::cout << std::endl;
}
}
else if (instr->isStore()) {
auto val = instr->getOperand(1);
// std::cout << " Store target: ";
// printer.printInst(dynamic_cast<Instruction *>(val));
if (!(isArr(val) || isGlobal(val))) {
// std::cout << " Adding store to defBlocks for value: ";
// printer.printInst(dynamic_cast<Instruction *>(instr.get()));
// std::cout << std::endl;
// Record this block as a defining block of the store's target value
funcInfo->addValue2DefBlocks(val, basicBlock);
} else {
// std::cout << " Skip array/global store" << std::endl;
}
}
else if (instr->isLoad()) {
auto val = instr->getOperand(0);
// std::cout << " Load source: ";
// printer.printInst(dynamic_cast<Instruction *>(val));
// std::cout << std::endl;
if (!(isArr(val) || isGlobal(val))) {
// std::cout << " Adding load to useBlocks for value: ";
// printer.printInst(dynamic_cast<Instruction *>(val));
// std::cout << std::endl;
funcInfo->addValue2UseBlocks(val, basicBlock);
} else {
// std::cout << " Skip array/global load" << std::endl;
}
}
}
}
// Print the analysis results
// std::cout << "\nAnalysis results for function " << func->getName() << ":" << std::endl;
// auto &allocMap = funcInfo->getValue2AllocBlocks();
// std::cout << "AllocBlocks (" << allocMap.size() << "):" << std::endl;
// for (auto &[val, bb] : allocMap) {
// std::cout << " ";
// printer.printInst(dynamic_cast<Instruction *>(val));
// std::cout << " in BB: " << bb->getName() << std::endl;
// }
// auto &defMap = funcInfo->getValue2DefBlocks();
// std::cout << "DefBlocks (" << defMap.size() << "):" << std::endl;
// for (auto &[val, bbs] : defMap) {
// std::cout << " ";
// printer.printInst(dynamic_cast<Instruction *>(val));
// for (const auto &[bb, count] : bbs) {
// std::cout << " in BB: " << bb->getName() << " (count: " << count << ")";
// }
// }
// auto &useMap = funcInfo->getValue2UseBlocks();
// std::cout << "UseBlocks (" << useMap.size() << "):" << std::endl;
// for (auto &[val, bbs] : useMap) {
// std::cout << " ";
// printer.printInst(dynamic_cast<Instruction *>(val));
// for (const auto &[bb, count] : bbs) {
// std::cout << " in BB: " << bb->getName() << " (count: " << count << ")";
// }
// }
}
// std::cout << "===== End computeValue2Blocks =====" << std::endl;
}
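/**
 * @brief Cascading removal of instructions, used by the LLVM-style mem2reg pre-optimization 1
 *
 * Simulated with a queue; to some extent this can be seen as walking the UD chains backwards
 *
 * @param [in] instr the instruction used by the store instruction
 * @param [in] changed flag for the fixed-point iteration, passed by reference
 * @param [in] func the function containing the instruction
 * @param [in] block the basic block containing the instruction
 * @param [in] instrs the instruction list of the basic block, passed by reference
 * @return nothing, but instructions that satisfy the conditions are deleted
 */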
auto Mem2Reg::cascade(Instruction *instr, bool &changed, Function *func, BasicBlock *block,
std::list<std::unique_ptr<Instruction>> &instrs) -> void {
if (instr != nullptr) {
if (instr->isUnary() || instr->isBinary() || instr->isLoad()) {
std::queue<Instruction *> toRemove;
toRemove.push(instr);
while (!toRemove.empty()) {
auto top = toRemove.front();
toRemove.pop();
auto operands = top->getOperands();
for (const auto &operand : operands) {
auto elem = dynamic_cast<Instruction *>(operand->getValue());
if (elem != nullptr) {
if ((elem->isUnary() || elem->isBinary() || elem->isLoad()) && elem->getUses().size() == 1 &&
elem->getUses().front()->getUser() == top) {
toRemove.push(elem);
} else if (elem->isAlloca()) {
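// Decrement the use count of this block in value2UseBlock. If the count for this variable in this use block
// drops to 0, nothing else in the block uses the alloca, so the block is removed from value2UseBlock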
FunctionAnalysisInfo* funcInfo = controlFlowAnalysis->getFunctionAnalysisInfo(func);
auto res = funcInfo->removeValue2UseBlock(elem, block);
// A single true result is enough to signal a change
if (res) {
changed = true;
}
}
}
}
auto tofind =
std::find_if(instrs.begin(), instrs.end(), [&top](const auto &instr) { return instr.get() == top; });
assert(tofind != instrs.end());
usedelete(tofind->get());
instrs.erase(tofind);
}
}
}
}
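/**
 * LLVM-style mem2reg pre-optimization 1: remove allocas and stores that have no loads
 *
 * 1. Remove allocas and stores for which no load exists
 * 2. The cascading instructions that computed operand 0 of a removed store become redundant and are removed as well
 * 3. After removal, some variables may in turn have lost all of their loads, so repeat from step 1 until a fixed point is reached
 *
 * Because cascading instructions are removed too, this approach is somewhat aggressive.
 * Function calls in a cascade may have side effects, so cascades that involve a call are not removed.
 * Allocas of function parameters are removed neither from the instructions nor from value2Alloca.
 * Arrays and globals are likewise not considered; this code is based on value2blocks, which already excludes them, so no explicit check is needed
 *
 */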
auto Mem2Reg::preOptimize1() -> void {
SysYPrinter printer(pModule); // initialize the printer
auto &functions = pModule->getFunctions();
// std::cout << "===== Start preOptimize1 =====" << std::endl;
for (const auto &function : functions) {
auto func = function.second.get();
// std::cout << "\nProcessing function: " << func->getName() << std::endl;
FunctionAnalysisInfo* funcInfo = controlFlowAnalysis->getFunctionAnalysisInfo(func);
if (!funcInfo) {
// std::cerr << "ERROR: No analysis info for function " << func->getName() << std::endl;
continue;
}
auto &vToDefB = funcInfo->getValue2DefBlocks();
auto &vToUseB = funcInfo->getValue2UseBlocks();
auto &vToAllocB = funcInfo->getValue2AllocBlocks();
// Print the initial state
// std::cout << "Initial allocas: " << vToAllocB.size() << std::endl;
// for (auto &[val, bb] : vToAllocB) {
// std::cout << " Alloca: ";
// printer.printInst(dynamic_cast<Instruction *>(val));
// std::cout << " in BB: " << bb->getName() << std::endl;
// }
// Phase 1: remove allocas that have no store
// std::cout << "\nPhase 1: Remove unused allocas" << std::endl;
for (auto iter = vToAllocB.begin(); iter != vToAllocB.end();) {
auto val = iter->first;
auto bb = iter->second;
// std::cout << "Checking alloca: ";
// printer.printInst(dynamic_cast<Instruction *>(val));
// std::cout << " in BB: " << bb->getName() << std::endl;
// If the alloca has no corresponding store instruction and is not a function argument
// (vToDefB is value2DefBlocks, vToUseB is value2UseBlocks)
// Print vToDefB
// std::cout << "DefBlocks (" << vToDefB.size() << "):" << std::endl;
// for (auto &[val, bbs] : vToDefB) {
// std::cout << " ";
// printer.printInst(dynamic_cast<Instruction *>(val));
// for (const auto &[bb, count] : bbs) {
// std::cout << " in BB: " << bb->getName() << " (count: " << count << ")" << std::endl;
// }
// }
// std::cout << vToDefB.count(val) << std::endl;
if (vToDefB.count(val) == 0U &&
std::find(func->getEntryBlock()->getArguments().begin(),
func->getEntryBlock()->getArguments().end(),
val) == func->getEntryBlock()->getArguments().end()) {
// std::cout << " Removing unused alloca: ";
// printer.printInst(dynamic_cast<Instruction *>(val));
// std::cout << std::endl;
auto tofind = std::find_if(bb->getInstructions().begin(),
bb->getInstructions().end(),
[val](const auto &instr) {
return instr.get() == val;
});
if (tofind == bb->getInstructions().end()) {
// std::cerr << "ERROR: Alloca not found in BB!" << std::endl;
++iter;
continue;
}
usedelete(tofind->get());
bb->getInstructions().erase(tofind);
iter = vToAllocB.erase(iter);
} else {
++iter;
}
}
// Phase 2: remove stores that have no load
// std::cout << "\nPhase 2: Remove dead stores" << std::endl;
bool changed = true;
int iteration = 0;
while (changed) {
changed = false;
iteration++;
// std::cout << "\nIteration " << iteration << std::endl;
for (auto iter = vToDefB.begin(); iter != vToDefB.end();) {
auto val = iter->first;
// std::cout << "Checking value: ";
// printer.printInst(dynamic_cast<Instruction *>(val));
// std::cout << std::endl;
if (vToUseB.count(val) == 0U) {
// std::cout << " Found dead store for value: ";
// printer.printInst(dynamic_cast<Instruction *>(val));
// std::cout << std::endl;
auto blocks = funcInfo->getDefBlocksByValue(val);
for (auto block : blocks) {
// std::cout << " Processing BB: " << block->getName() << std::endl;
// printer.printBlock(block); // print the basic block contents
auto &instrs = block->getInstructions();
for (auto it = instrs.begin(); it != instrs.end();) {
if ((*it)->isStore() && (*it)->getOperand(1) == val) {
// std::cout << " Removing store: ";
// printer.printInst(it->get());
// std::cout << std::endl;
auto valUsedByStore = dynamic_cast<Instruction *>((*it)->getOperand(0));
usedelete(it->get());
if (valUsedByStore != nullptr &&
valUsedByStore->getUses().size() == 1 &&
valUsedByStore->getUses().front()->getUser() == (*it).get()) {
// std::cout << " Cascade deleting: ";
// printer.printInst(valUsedByStore);
// std::cout << std::endl;
cascade(valUsedByStore, changed, func, block, instrs);
}
it = instrs.erase(it);
changed = true;
} else {
++it;
}
}
}
// Remove the corresponding alloca
if (std::find(func->getEntryBlock()->getArguments().begin(),
func->getEntryBlock()->getArguments().end(),
val) == func->getEntryBlock()->getArguments().end()) {
auto bb = funcInfo->getAllocBlockByValue(val);
if (bb != nullptr) {
// std::cout << " Removing alloca: ";
// printer.printInst(dynamic_cast<Instruction *>(val));
// std::cout << " in BB: " << bb->getName() << std::endl;
funcInfo->removeValue2AllocBlock(val);
auto tofind = std::find_if(bb->getInstructions().begin(),
bb->getInstructions().end(),
[val](const auto &instr) {
return instr.get() == val;
});
if (tofind != bb->getInstructions().end()) {
usedelete(tofind->get());
bb->getInstructions().erase(tofind);
} else {
std::cerr << "ERROR: Alloca not found in BB!" << std::endl;
}
}
}
iter = vToDefB.erase(iter);
} else {
++iter;
}
}
}
}
// std::cout << "===== End preOptimize1 =====" << std::endl;
}
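/**
 * LLVM-style mem2reg pre-optimization 2: handles variables whose def-block set contains exactly one block
 *
 * 1. Every load of the variable that follows the last store to it in that block can be replaced by operand 0 of that store
 * 2. Every load of the variable in a block dominated by that block can likewise be replaced by operand 0 of that last store
 * 3. If all loads of the variable have been replaced, remove the last store in that block; if that store was the only
 *    definition, also remove the alloca (allocas of parameters are kept)
 * 4. If all loads of the value have been replaced but stores to the variable remain, the situation reduces to the
 *    preOptimize1 case, so preOptimize1 is called to remove them
 *
 * Arrays and global variables are again not considered, since mem2reg does not optimize them; value2blocks already excludes them, so no explicit check is needed
 * Replacement goes through the UD chains, which keeps the code simple and efficient
 *
 */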
auto Mem2Reg::preOptimize2() -> void {
auto &functions = pModule->getFunctions();
for (const auto &function : functions) {
auto func = function.second.get();
FunctionAnalysisInfo* funcInfo = controlFlowAnalysis->getFunctionAnalysisInfo(func);
auto values = funcInfo->getValuesOfDefBlock();
for (auto val : values) {
auto blocks = funcInfo->getDefBlocksByValue(val);
// val has exactly one defining block
if (blocks.size() == 1) {
auto block = *blocks.begin();
auto &instrs = block->getInstructions();
auto rit = std::find_if(instrs.rbegin(), instrs.rend(),
[val](const auto &instr) { return instr->isStore() && instr->getOperand(1) == val; });
// Note: base() of a reverse_iterator points at the next instruction, so step back by one to reach the original instruction
assert(rit != instrs.rend());
auto it = --rit.base();
auto propogationVal = (*it)->getOperand(0);
// Loads of val that follow it in this block can be replaced as well
for (auto curit = std::next(it); curit != instrs.end();) {
if ((*curit)->isLoad() && (*curit)->getOperand(0) == val) {
curit->get()->replaceAllUsesWith(propogationVal);
usedelete(curit->get());
curit = instrs.erase(curit);
funcInfo->removeValue2UseBlock(val, block);
} else {
++curit;
}
}
// Replace the loads in the dominator-tree descendants
BlockAnalysisInfo* blockInfo = controlFlowAnalysis->getBlockAnalysisInfo(block);
std::vector<BasicBlock *> blkchildren;
// Collect the dominator-tree descendants of this block
std::queue<BasicBlock *> q;
auto sdoms = blockInfo->getSdoms();
for (auto sdom : sdoms) {
q.push(sdom);
blkchildren.push_back(sdom);
}
while (!q.empty()) {
auto blk = q.front();
q.pop();
BlockAnalysisInfo* blkInfo = controlFlowAnalysis->getBlockAnalysisInfo(blk);
for (auto sdom : blkInfo->getSdoms()) {
q.push(sdom);
blkchildren.push_back(sdom);
}
}
for (auto child : blkchildren) {
auto &childInstrs = child->getInstructions();
for (auto childIter = childInstrs.begin(); childIter != childInstrs.end();) {
if ((*childIter)->isLoad() && (*childIter)->getOperand(0) == val) {
childIter->get()->replaceAllUsesWith(propogationVal);
usedelete(childIter->get());
childIter = childInstrs.erase(childIter);
funcInfo->removeValue2UseBlock(val, child);
} else {
++childIter;
}
}
}
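// If all loads of val have been replaced, the last definition in val's defining block can be removed too.
// Earlier definitions of val in this block have also become dead code; preOptimize1 can be called to remove them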
if (funcInfo->getUseBlocksByValue(val).empty()) {
usedelete(it->get());
instrs.erase(it);
auto change = funcInfo->removeValue2DefBlock(val, block);
if (change) {
// If the definition was the only one and val is not the alloca of a function parameter, remove the alloca directly
if (std::find(func->getEntryBlock()->getArguments().begin(), func->getEntryBlock()->getArguments().end(),
val) == func->getEntryBlock()->getArguments().end()) {
auto bb = funcInfo->getAllocBlockByValue(val);
assert(bb != nullptr);
auto tofind = std::find_if(bb->getInstructions().begin(), bb->getInstructions().end(),
[val](const auto &instr) { return instr.get() == val; });
usedelete(tofind->get());
bb->getInstructions().erase(tofind);
funcInfo->removeValue2AllocBlock(val);
}
} else {
// If the variable still has other definitions, those earlier definitions have become dead code as well
assert(!funcInfo->getDefBlocksByValue(val).empty());
assert(funcInfo->getUseBlocksByValue(val).empty());
preOptimize1();
}
}
}
}
}
}
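/**
 * @brief LLVM-style mem2reg pre-optimization 3: handles variables whose reads and writes all occur in a single block
 *
 * 1. Replace each load with the value of the preceding store and remove the load
 * 2. If no store to the variable precedes a load, keep that load
 * 3. If a store is not followed by any load of the variable, remove the store
 *
 * @note Point 2 needs no explicit handling, because the scan starts at the first store.
 * Point 3 can be read more aggressively: once its loads have been replaced, the corresponding store can be removed too. Do not use preOptimize1 here, because the cascading instructions are still needed to compute the replacement value of the loads.
 * Arrays and global variables are again not considered, since mem2reg does not optimize them; they were already skipped when value2DefBlocks was computed, so no explicit check is needed.
 * Replacement goes through the UD chains, which keeps the code simple and efficient.
 *
 * @param [in] void
 * @return nothing, but operands are replaced and instructions removed where the conditions are met
 */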
auto Mem2Reg::preOptimize3() -> void {
auto &functions = pModule->getFunctions();
for (const auto &function : functions) {
auto func = function.second.get();
FunctionAnalysisInfo* funcInfo = controlFlowAnalysis->getFunctionAnalysisInfo(func);
auto values = funcInfo->getValuesOfDefBlock();
for (auto val : values) {
auto sblocks = funcInfo->getDefBlocksByValue(val);
auto lblocks = funcInfo->getUseBlocksByValue(val);
if (sblocks.size() == 1 && lblocks.size() == 1 && *sblocks.begin() == *lblocks.begin()) {
auto block = *sblocks.begin();
auto &instrs = block->getInstructions();
auto it = std::find_if(instrs.begin(), instrs.end(),
[val](const auto &instr) { return instr->isStore() && instr->getOperand(1) == val; });
while (it != instrs.end()) {
auto propogationVal = (*it)->getOperand(0);
auto last = std::find_if(std::next(it), instrs.end(), [val](const auto &instr) {
return instr->isStore() && instr->getOperand(1) == val;
});
for (auto curit = std::next(it); curit != last;) {
if ((*curit)->isLoad() && (*curit)->getOperand(0) == val) {
curit->get()->replaceAllUsesWith(propogationVal);
usedelete(curit->get());
curit = instrs.erase(curit);
funcInfo->removeValue2UseBlock(val, block);
} else {
++curit;
}
}
// Once its loads have been replaced, the corresponding store can be removed too
if (!(std::find_if(func->getEntryBlock()->getArguments().begin(), func->getEntryBlock()->getArguments().end(),
[val](const auto &instr) { return instr == val; }) !=
func->getEntryBlock()->getArguments().end()) &&
last == instrs.end()) {
usedelete(it->get());
it = instrs.erase(it);
if (funcInfo->removeValue2DefBlock(val, block)) {
auto bb = funcInfo->getAllocBlockByValue(val);
if (bb != nullptr) {
auto tofind = std::find_if(bb->getInstructions().begin(), bb->getInstructions().end(),
[val](const auto &instr) { return instr.get() == val; });
usedelete(tofind->get());
bb->getInstructions().erase(tofind);
funcInfo->removeValue2AllocBlock(val);
}
}
}
it = last;
}
}
}
}
}
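/**
 * Insert phi nodes on the iterated dominance frontier of every variable's set of defining blocks
 *
 * insertPhi is the first core step of mem2reg. It inserts phi nodes on the iterated dominance frontier of every variable; it takes no parameters and returns nothing.
 * Arrays and global variables are skipped, since mem2reg does not optimize them; they were already skipped when value2DefBlocks was computed, so no explicit check is needed.
 * The insertion is pruned: a phi function is inserted only for variables that are live at the entry of the basic block.
 *
 */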
auto Mem2Reg::insertPhi() -> void {
auto &functions = pModule->getFunctions();
for (const auto &function : functions) {
auto func = function.second.get();
FunctionAnalysisInfo* funcInfo = controlFlowAnalysis->getFunctionAnalysisInfo(func);
const auto &vToDefB = funcInfo->getValue2DefBlocks();
for (const auto &map_pair : vToDefB) {
// First find the iterated dominance frontier of each variable
auto val = map_pair.first;
auto blocks = funcInfo->getDefBlocksByValue(val);
auto itDFs = computeIterDf(blocks);
// Then insert phi nodes on that variable's iterated dominance frontier
for (auto basicBlock : itDFs) {
const auto &actiTable = activeVarAnalysis->getActiveTable();
auto dval = dynamic_cast<User *>(val);
// Insert a phi function only for variables that are live at block entry
if (actiTable.at(basicBlock).front().count(dval) != 0U) {
pBuilder->createPhiInst(val->getType(), val, basicBlock);
}
}
}
}
}
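/**
 * Renaming
 *
 * Renaming is the second core step of mem2reg. This function renames a single block and is implemented recursively.
 * Arrays and global variables are skipped, since mem2reg does not optimize them.
 *
 */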
auto Mem2Reg::rename(BasicBlock *block, std::unordered_map<Value *, int> &count,
std::unordered_map<Value *, std::stack<Instruction *>> &stacks) -> void {
auto &instrs = block->getInstructions();
std::unordered_map<Value *, int> valPop;
// Step 1: walk over all instructions in the block
for (auto iter = instrs.begin(); iter != instrs.end();) {
auto instr = iter->get();
// For a load, use the most recent version of the variable
if (instr->isLoad()) {
auto val = instr->getOperand(0);
if (!(isArr(val) || isGlobal(val))) {
if (!stacks[val].empty()) {
instr->replaceOperand(0, stacks[val].top());
}
}
}
// Then handle definitions: alloca, store, and phi instructions
if (instr->isDefine()) {
if (instr->isAlloca()) {
// The alloca keeps its name; versions are simply named x, x_1, x_2, ...
auto val = instr;
if (!(isArr(val) || isGlobal(val))) {
++valPop[val];
stacks[val].push(val);
++count[val];
}
} else if (instr->isPhi()) {
// A phi instruction is also a special kind of definition
auto val = dynamic_cast<PhiInst *>(instr)->getMapVal();
if (!(isArr(val) || isGlobal(val))) {
auto i = count[val];
if (i == 0) {
// A phi that appears before the variable has been alloca'd: remove it directly
usedelete(iter->get());
iter = instrs.erase(iter);
continue;
}
auto newname = dynamic_cast<Instruction *>(val)->getName() + "_" + std::to_string(i);
auto newalloca = pBuilder->createAllocaInstWithoutInsert(val->getType(), {}, block, newname);
FunctionAnalysisInfo* ParentfuncInfo = controlFlowAnalysis->getFunctionAnalysisInfo(block->getParent());
ParentfuncInfo->addIndirectAlloca(newalloca);
instr->replaceOperand(0, newalloca);
++valPop[val];
stacks[val].push(newalloca);
++count[val];
}
} else {
// For a store, look at the operand: by convention in this implementation the variable is operand 1, and it is replaced by a new alloca x_i
auto val = instr->getOperand(1);
if (!(isArr(val) || isGlobal(val))) {
auto i = count[val];
auto newname = dynamic_cast<Instruction *>(val)->getName() + "_" + std::to_string(i);
auto newalloca = pBuilder->createAllocaInstWithoutInsert(val->getType(), {}, block, newname);
FunctionAnalysisInfo* ParentfuncInfo = controlFlowAnalysis->getFunctionAnalysisInfo(block->getParent());
ParentfuncInfo->addIndirectAlloca(newalloca);
// block->getParent()->addIndirectAlloca(newalloca);
instr->replaceOperand(1, newalloca);
++valPop[val];
stacks[val].push(newalloca);
++count[val];
}
}
}
++iter;
}
// Step 2: fill in the corresponding operand of the phi instructions in every CFG successor of this block
for (auto succ : block->getSuccessors()) {
auto position = getPredIndex(block, succ);
for (auto &instr : succ->getInstructions()) {
if (instr->isPhi()) {
auto val = dynamic_cast<PhiInst *>(instr.get())->getMapVal();
if (!stacks[val].empty()) {
instr->replaceOperand(position + 1, stacks[val].top());
}
} else {
// Phi instructions sit at the very beginning of a block, so after the first non-phi there are no more phis: break
break;
}
}
}
// Step 3: recurse into the dominator-tree children; only the dominator tree expresses the define-use relation
BlockAnalysisInfo* blockInfo = controlFlowAnalysis->getBlockAnalysisInfo(block);
for (auto sdom : blockInfo->getSdoms()) {
rename(sdom, count, stacks);
}
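// Step 4: pop the stacks for every definition made in this block. This step is necessary; considering the recursion as a whole shows why.
// count is deliberately not reset, because numbering keeps increasing across sibling blocks. The stacks must be cleaned up, because the
// define-use relation comes from the immediate dominator rather than from siblings; without the cleanup the stacks would be polluted.
// Optimization: the number of pops per variable is recorded in advance, so there is no need to walk over all instructions again.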
for (auto val_pair : valPop) {
auto val = val_pair.first;
for (int i = 0; i < val_pair.second; ++i) {
stacks[val].pop();
}
}
}
/**
 * Rename all blocks
 *
 * Calls rename top-down to perform all renaming
 *
 */
auto Mem2Reg::renameAll() -> void {
auto &functions = pModule->getFunctions();
for (const auto &function : functions) {
auto func = function.second.get();
// Every function is converted to SSA separately, so count and stacks are defined and initialized here
std::unordered_map<Value *, int> count;
std::unordered_map<Value *, std::stack<Instruction *>> stacks;
FunctionAnalysisInfo* funcInfo = controlFlowAnalysis->getFunctionAnalysisInfo(func);
for (const auto &map_pair : funcInfo->getValue2DefBlocks()) {
auto val = map_pair.first;
count[val] = 0;
}
rename(func->getEntryBlock(), count, stacks);
}
}
/**
 * Public entry point of mem2reg
 *
 * Combines SSA construction with the other steps of the mem2reg pass
 *
 */
auto Mem2Reg::mem2regPipeline() -> void {
// First run the analyses that mem2reg depends on
controlFlowAnalysis->clear();
controlFlowAnalysis->runControlFlowAnalysis();
// Live-variable analysis
activeVarAnalysis->clear();
dataFlowAnalysisUtils.addBackwardAnalyzer(activeVarAnalysis);
dataFlowAnalysisUtils.backwardAnalyze(pModule);
// Compute all value-to-blocks mappings
computeValue2Blocks();
// SysYPrinter printer(pModule);
// Following LLVM's mem2reg pass, run some optimizations before inserting phi nodes
preOptimize1();
// printer.printIR();
preOptimize2();
// printer.printIR();
// Optimization 3 may remove all allocas/stores of a block when it optimizes block-local variables
preOptimize3();
// Re-run live-variable analysis
// Error reported here?
// printer.printIR();
dataFlowAnalysisUtils.backwardAnalyze(pModule);
// Insert phi nodes for all variables
insertPhi();
// Rename
renameAll();
}
/**
 * Compute the index of block n among the predecessors of block s
 *
 * Helper function; returns the zero-based position of n in s's predecessor list
 *
 */
auto Mem2Reg::getPredIndex(BasicBlock *n, BasicBlock *s) -> int {
int index = 0;
for (auto elem : s->getPredecessors()) {
if (elem == n) {
break;
}
++index;
}
assert(index < static_cast<int>(s->getPredecessors().size()) && "n is not a predecessor of s.");
return index;
}
/**
 * Check whether a value is a global variable
 */
auto Mem2Reg::isGlobal(Value *val) -> bool {
auto gval = dynamic_cast<GlobalValue *>(val);
return gval != nullptr;
}
/**
 * Check whether a value is an array
 */
auto Mem2Reg::isArr(Value *val) -> bool {
auto aval = dynamic_cast<AllocaInst *>(val);
return aval != nullptr && aval->getNumDims() != 0;
}
/**
 * For each operand of an instruction, remove the corresponding use from the operand's value
 */
auto Mem2Reg::usedelete(Instruction *instr) -> void {
for (auto &use : instr->getOperands()) {
auto val = use->getValue();
val->removeUse(use);
}
}
} // namespace sysy
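
The worklist computation in `Mem2Reg::computeIterDf` can be illustrated in isolation. The following is a minimal sketch, not the project's code: blocks are plain integer ids instead of `BasicBlock *`, the dominance frontiers are assumed to be precomputed and passed in as a map, and the names `Block`, `BlockSet`, `iteratedDominanceFrontier`, and `diamondExample` are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-ins: a block is an integer id, not a BasicBlock*.
using Block = int;
using BlockSet = std::set<Block>;

// df maps each block to its precomputed dominance frontier.
// Returns the iterated dominance frontier of defBlocks, i.e. the
// candidate blocks for phi insertion.
BlockSet iteratedDominanceFrontier(const std::map<Block, BlockSet>& df,
                                   const BlockSet& defBlocks) {
  BlockSet result;
  // A vector used as a stack: pushing never invalidates the value
  // that was already popped, unlike holding a set iterator.
  std::vector<Block> workList(defBlocks.begin(), defBlocks.end());
  while (!workList.empty()) {
    Block n = workList.back();
    workList.pop_back();
    auto it = df.find(n);
    if (it == df.end()) continue;  // no frontier recorded for n
    for (Block c : it->second) {
      // insert().second is true only the first time c is seen,
      // so every block is queued at most once.
      if (result.insert(c).second) workList.push_back(c);
    }
  }
  return result;
}

// Diamond CFG: 0 -> {1, 2}, 1 -> 3, 2 -> 3, so DF(1) = DF(2) = {3}.
BlockSet diamondExample() {
  std::map<Block, BlockSet> df = {{0, {}}, {1, {3}}, {2, {3}}, {3, {}}};
  return iteratedDominanceFrontier(df, {1, 2});
}
```

For a variable stored in both arms of the diamond, the only block returned is the join block 3, which is where a phi node would be needed.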

1348
src/RISCv32Backend.cpp Normal file

File diff suppressed because it is too large

100
src/RISCv32Backend.h Normal file

@ -0,0 +1,100 @@
#ifndef RISCV32_BACKEND_H
#define RISCV32_BACKEND_H
#include "IR.h"
#include <string>
#include <sstream> // For std::stringstream
#include <vector>
#include <map>
#include <set>
#include <memory>
#include <iostream>
#include <functional> // For std::function
namespace sysy {
class RISCv32CodeGen {
public:
enum class PhysicalReg {
ZERO, RA, SP, GP, TP, T0, T1, T2, S0, S1, A0, A1, A2, A3, A4, A5, A6, A7, S2, S3, S4, S5, S6, S7, S8, S9, S10, S11, T3, T4, T5, T6,
F0, F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11, F12, F13, F14, F15,F16, F17, F18, F19, F20, F21, F22, F23, F24, F25, F26, F27, F28, F29, F30, F31
};
// Move DAGNode and RegAllocResult to public section
struct DAGNode {
enum NodeKind { CONSTANT, LOAD, STORE, BINARY, CALL, RETURN, BRANCH, ALLOCA_ADDR, UNARY };
NodeKind kind;
Value* value = nullptr; // For IR Value
std::string inst; // Generated RISC-V instruction(s) for this node
std::string result_vreg; // Virtual register assigned to this node's result
std::vector<DAGNode*> operands;
std::vector<DAGNode*> users; // For debugging and potentially optimizations
DAGNode(NodeKind k) : kind(k) {}
// Debugging / helper
std::string getNodeKindString() const {
switch (kind) {
case CONSTANT: return "CONSTANT";
case LOAD: return "LOAD";
case STORE: return "STORE";
case BINARY: return "BINARY";
case CALL: return "CALL";
case RETURN: return "RETURN";
case BRANCH: return "BRANCH";
case ALLOCA_ADDR: return "ALLOCA_ADDR";
case UNARY: return "UNARY";
default: return "UNKNOWN";
}
}
};
struct RegAllocResult {
std::map<std::string, PhysicalReg> vreg_to_preg; // Virtual register to Physical Register mapping
std::map<Value*, int> stack_map; // Value (AllocaInst) to stack offset
int stack_size = 0; // Total stack frame size for locals and spills
};
RISCv32CodeGen(Module* mod) : module(mod) {}
std::string code_gen();
std::string module_gen();
std::string function_gen(Function* func);
// basicBlock_gen takes an additional int block_idx parameter
std::string basicBlock_gen(BasicBlock* bb, const RegAllocResult& alloc, int block_idx);
// DAG related
std::vector<std::unique_ptr<DAGNode>> build_dag(BasicBlock* bb);
void select_instructions(DAGNode* node, const RegAllocResult& alloc);
// emit_instructions takes a stringstream so that it can append assembly instructions directly to the main ss
void emit_instructions(DAGNode* node, std::stringstream& ss, const RegAllocResult& alloc, std::set<DAGNode*>& emitted_nodes);
// Register Allocation related
std::map<Instruction*, std::set<std::string>> liveness_analysis(Function* func);
std::map<std::string, std::set<std::string>> build_interference_graph(
const std::map<Instruction*, std::set<std::string>>& live_sets);
void color_graph(std::map<std::string, PhysicalReg>& vreg_to_preg,
const std::map<std::string, std::set<std::string>>& interference_graph);
RegAllocResult register_allocation(Function* func);
void eliminate_phi(Function* func); // Phi elimination is typically done before DAG building
// Utility
std::string reg_to_string(PhysicalReg reg);
void print_dag(const std::vector<std::unique_ptr<DAGNode>>& dag, const std::string& bb_name);
private:
static const std::vector<PhysicalReg> allocable_regs;
std::map<Value*, std::string> value_vreg_map; // Maps IR Value* to its virtual register name
Module* module;
int vreg_counter = 0; // Counter for unique virtual register names
int alloca_offset_counter = 0; // Counter for alloca offsets
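// Stores all DAGNodes of the current function so that their lifetime spans the whole function's code generation;
// this lets multiple basicBlock_gen calls access the complete set of DAG nodes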
std::vector<std::unique_ptr<DAGNode>> current_function_dag_nodes;
// Pseudo-name prefix for blocks with an empty label; the block index is appended to keep names unique
const std::string ENTRY_BLOCK_PSEUDO_NAME = "entry_block_";
};
} // namespace sysy
#endif // RISCV32_BACKEND_H
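
The header above only declares `build_interference_graph` and `color_graph`; their implementation is in the suppressed `RISCv32Backend.cpp` diff and is not shown here. As a rough illustration of what coloring an interference graph of this shape involves, here is a minimal greedy sketch. It is an assumption, not the backend's algorithm: colors are plain integers rather than `PhysicalReg` values, `-1` marks a virtual register that would have to be spilled, and `InterferenceGraph` and `greedyColor` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Same shape as the interference graph in the header:
// virtual register name -> names of the registers it interferes with.
using InterferenceGraph = std::map<std::string, std::set<std::string>>;

// Greedy coloring: visit the virtual registers in map order and give
// each one the lowest color not used by an already colored neighbor.
// Returns vreg -> color in [0, numColors), or -1 if no color is free.
std::map<std::string, int> greedyColor(const InterferenceGraph& graph,
                                       int numColors) {
  std::map<std::string, int> color;
  for (const auto& [vreg, neighbors] : graph) {
    std::set<int> used;
    for (const auto& nb : neighbors) {
      auto it = color.find(nb);
      if (it != color.end() && it->second >= 0) used.insert(it->second);
    }
    int chosen = -1;  // -1 means "spill" in this sketch
    for (int c = 0; c < numColors; ++c) {
      if (used.count(c) == 0) {
        chosen = c;
        break;
      }
    }
    color[vreg] = chosen;
  }
  return color;
}
```

With three mutually interfering virtual registers, three colors are enough, while two colors leave the last register visited without a color. A real allocator would typically also choose the visiting order more carefully, for example by node degree.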

1383
src/RISCv64Backend.cpp Normal file

File diff suppressed because it is too large

122
src/RISCv64Backend.h Normal file

@ -0,0 +1,122 @@
#ifndef RISCV64_BACKEND_H
#define RISCV64_BACKEND_H
#include "IR.h"
#include <string>
#include <sstream> // For std::stringstream
#include <vector>
#include <map>
#include <set>
#include <memory>
#include <iostream>
#include <functional> // For std::function
extern int DEBUG;
extern int DEEPDEBUG;
namespace sysy {
class RISCv64CodeGen {
public:
enum class PhysicalReg {
ZERO, RA, SP, GP, TP, T0, T1, T2, S0, S1, A0, A1, A2, A3, A4, A5, A6, A7, S2, S3, S4, S5, S6, S7, S8, S9, S10, S11, T3, T4, T5, T6,
F0, F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11, F12, F13, F14, F15,F16, F17, F18, F19, F20, F21, F22, F23, F24, F25, F26, F27, F28, F29, F30, F31
};
// Move DAGNode and RegAllocResult to public section
struct DAGNode {
enum NodeKind { CONSTANT, LOAD, STORE, BINARY, CALL, RETURN, BRANCH, ALLOCA_ADDR, UNARY };
NodeKind kind;
Value* value = nullptr; // For IR Value
std::string inst; // Generated RISC-V instruction(s) for this node
std::string result_vreg; // Virtual register assigned to this node's result
std::vector<DAGNode*> operands;
std::vector<DAGNode*> users; // For debugging and potentially optimizations
DAGNode(NodeKind k) : kind(k) {}
// Debugging / helper
std::string getNodeKindString() const {
switch (kind) {
case CONSTANT: return "CONSTANT";
case LOAD: return "LOAD";
case STORE: return "STORE";
case BINARY: return "BINARY";
case CALL: return "CALL";
case RETURN: return "RETURN";
case BRANCH: return "BRANCH";
case ALLOCA_ADDR: return "ALLOCA_ADDR";
case UNARY: return "UNARY";
default: return "UNKNOWN";
}
}
};
struct RegAllocResult {
std::map<std::string, PhysicalReg> vreg_to_preg; // Virtual register to Physical Register mapping
std::map<Value*, int> stack_map; // Value (AllocaInst) to stack offset
int stack_size = 0; // Total stack frame size for locals and spills
};
RISCv64CodeGen(Module* mod) : module(mod) {}
std::string code_gen();
std::string module_gen();
std::string function_gen(Function* func);
// basicBlock_gen now takes an extra int block_idx parameter
std::string basicBlock_gen(BasicBlock* bb, const RegAllocResult& alloc, int block_idx);
// DAG related
std::vector<std::unique_ptr<DAGNode>> build_dag(BasicBlock* bb);
void select_instructions(DAGNode* node, const RegAllocResult& alloc);
// emit_instructions now takes the stream as a parameter so it can append assembly directly to the main ss
void emit_instructions(DAGNode* node, std::stringstream& ss, const RegAllocResult& alloc, std::set<DAGNode*>& emitted_nodes);
// Register Allocation related
std::map<Instruction*, std::set<std::string>> liveness_analysis(Function* func);
std::map<std::string, std::set<std::string>> build_interference_graph(
const std::map<Instruction*, std::set<std::string>>& live_sets);
void color_graph(std::map<std::string, PhysicalReg>& vreg_to_preg,
const std::map<std::string, std::set<std::string>>& interference_graph);
RegAllocResult register_allocation(Function* func);
void eliminate_phi(Function* func); // Phi elimination is typically done before DAG building
// Utility
std::string reg_to_string(PhysicalReg reg);
void print_dag(const std::vector<std::unique_ptr<DAGNode>>& dag, const std::string& bb_name);
private:
static const std::vector<PhysicalReg> allocable_regs;
std::map<Value*, std::string> value_vreg_map; // Maps IR Value* to its virtual register name
Module* module;
int vreg_counter = 0; // Counter for unique virtual register names
int alloca_offset_counter = 0; // Counter for alloca offsets
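// New member that owns every DAGNode of the current function, so the nodes stay alive for the whole function's code generation.
// This lets successive basicBlock_gen calls see the complete set of DAG nodes.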
std::vector<std::unique_ptr<DAGNode>> current_function_dag_nodes;
// Pseudo-name prefix for blocks with an empty label; the block index is appended to keep names unique
const std::string ENTRY_BLOCK_PSEUDO_NAME = "entry_block_";
// !!! Changed: the get_operand_node helper now takes references to value_to_node and nodes_storage,
// because both are managed locally by build_dag
DAGNode* get_operand_node(
Value* val_ir,
std::map<Value*, DAGNode*>& value_to_node,
std::vector<std::unique_ptr<DAGNode>>& nodes_storage
);
// !!! New: the create_node helper also takes references to value_to_node and nodes_storage,
// and it is now a real member function rather than a lambda
DAGNode* create_node(
DAGNode::NodeKind kind,
Value* val,
std::map<Value*, DAGNode*>& value_to_node,
std::vector<std::unique_ptr<DAGNode>>& nodes_storage
);
std::vector<std::unique_ptr<Instruction>> temp_instructions_storage; // owns the temporary BinaryInst objects created in build_dag
};
} // namespace sysy
#endif // RISCV64_BACKEND_H
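The header above declares a register-allocation pipeline (`liveness_analysis`, then `build_interference_graph`, then `color_graph`), but the implementation in `src/RISCv64Backend.cpp` is suppressed in this view. As a rough illustration of what the coloring step could look like, here is a minimal greedy sketch. It is not the code from this change: `Reg`, `InterferenceGraph`, `greedy_color` and the `spilled` output set are hypothetical names, and a real allocator would also pick a better visiting order and generate spill code.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for RISCv64CodeGen::PhysicalReg; only three registers are listed.
enum class Reg { T0, T1, T2 };

// Same shape as the graph returned by build_interference_graph in the header.
using InterferenceGraph = std::map<std::string, std::set<std::string>>;

// Greedy coloring: visit virtual registers in map order and give each one the
// first allocable register that no already-colored neighbor uses. Virtual
// registers that cannot be colored are recorded in `spilled` (assumption: the
// caller decides how to spill them).
std::map<std::string, Reg> greedy_color(const InterferenceGraph& graph,
                                        const std::vector<Reg>& allocable,
                                        std::set<std::string>& spilled) {
    std::map<std::string, Reg> assignment;
    for (const auto& entry : graph) {
        const std::string& vreg = entry.first;
        // Collect the registers already taken by colored neighbors.
        std::set<Reg> taken;
        for (const auto& neighbor : entry.second) {
            auto it = assignment.find(neighbor);
            if (it != assignment.end()) taken.insert(it->second);
        }
        bool colored = false;
        for (Reg r : allocable) {
            if (taken.count(r) == 0) {
                assignment[vreg] = r;
                colored = true;
                break;
            }
        }
        if (!colored) spilled.insert(vreg);
    }
    return assignment;
}
```

Two virtual registers that never interfere may share a physical register, which is why a star-shaped graph needs only two colors while a triangle needs three.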

129
src/Reg2Mem.cpp Normal file

@ -0,0 +1,129 @@
#include "Reg2Mem.h"
#include <cstddef>
#include <iostream>
#include <list>
#include <memory>
namespace sysy {
/**
* Remove phi nodes.
* Removing phi nodes may leave redundant stores behind.
*/
void Reg2Mem::DeletePhiInst(){
auto &functions = pModule->getFunctions();
for (auto &function : functions) {
auto basicBlocks = function.second->getBasicBlocks();
for (auto &basicBlock : basicBlocks) {
for (auto iter = basicBlock->begin(); iter != basicBlock->end();) {
auto &instruction = *iter;
if (instruction->isPhi()) {
auto predBlocks = basicBlock->getPredecessors();
// Find the source and the destination:
// the destination is the first operand of the phi instruction,
// the sources are its remaining operands
auto destination = instruction->getOperand(0);
int predBlockindex = 0;
for (auto &predBlock : predBlocks) {
++predBlockindex;
// Check whether the predecessor block has a single successor or several
// Case: several successors
auto source = instruction->getOperand(predBlockindex);
if (source == destination) {
continue;
}
// std::cout << predBlock->getNumSuccessors() << std::endl;
if (predBlock->getNumSuccessors() > 1) {
// Create a new basic block
auto newbasicBlock = function.second->addBasicBlock();
std::stringstream ss;
ss << "phidel.L" << pBuilder->getLabelIndex();
newbasicBlock->setName(ss.str());
ss.str("");
// Update the predecessor/successor relations
basicBlock->replacePredecessor(predBlock, newbasicBlock);
// predBlock = newbasicBlock;
newbasicBlock->addPredecessor(predBlock);
newbasicBlock->addSuccessor(basicBlock.get());
predBlock->removeSuccessor(basicBlock.get());
predBlock->addSuccessor(newbasicBlock);
// std::cout << "the block name is " << basicBlock->getName() << std::endl;
// for (auto pb : basicBlock->getPredecessors()) {
// // newbasicBlock->addPredecessor(pb);
// std::cout << pb->getName() << std::endl;
// }
// sysy::BasicBlock::conectBlocks(newbasicBlock, static_cast<BasicBlock *>(basicBlock.get()));
// If the predecessor ends with a jump, retarget that jump to the new block
auto thelastinst = predBlock->end();
(--thelastinst);
if (thelastinst->get()->isConditional() || thelastinst->get()->isUnconditional()) { // it is a jump instruction
auto opnum = thelastinst->get()->getNumOperands();
for (size_t i = 0; i < opnum; i++) {
if (thelastinst->get()->getOperand(i) == basicBlock.get()) {
thelastinst->get()->replaceOperand(i, newbasicBlock);
}
}
}
// Insert the store instruction into the new block
pBuilder->setPosition(newbasicBlock, newbasicBlock->end());
// pBuilder->createStoreInst(source, destination);
if (source->isInt() || source->isFloat()) {
pBuilder->createStoreInst(source, destination);
} else {
auto loadInst = pBuilder->createLoadInst(source);
pBuilder->createStoreInst(loadInst, destination);
}
// pBuilder->createMoveInst(Instruction::kMove, destination->getType(), destination, source,
// newbasicBlock);
pBuilder->setPosition(newbasicBlock, newbasicBlock->end());
pBuilder->createUncondBrInst(basicBlock.get(), {});
} else {
// The predecessor block has a single successor
auto thelastinst = predBlock->end();
(--thelastinst);
// std::cout << predBlock->getName() << std::endl;
// std::cout << thelastinst->get() << std::endl;
// std::cout << "First point 11 " << std::endl;
if (thelastinst->get()->isConditional() || thelastinst->get()->isUnconditional()) {
// Insert the store instruction before the jump
pBuilder->setPosition(predBlock, thelastinst);
} else {
pBuilder->setPosition(predBlock, predBlock->end());
}
if (source->isInt() || source->isFloat()) {
pBuilder->createStoreInst(source, destination);
} else {
auto loadInst = pBuilder->createLoadInst(source);
pBuilder->createStoreInst(loadInst, destination);
}
}
}
// Delete the phi instruction
auto &instructions = basicBlock->getInstructions();
usedelete(iter->get());
iter = instructions.erase(iter);
if (basicBlock->getNumInstructions() == 0) {
if (basicBlock->getNumSuccessors() == 1) {
pBuilder->setPosition(basicBlock.get(), basicBlock->end());
pBuilder->createUncondBrInst(basicBlock->getSuccessors()[0], {});
}
}
} else {
break;
}
}
}
}
}
void Reg2Mem::usedelete(Instruction *instr) {
for (auto &use : instr->getOperands()) {
auto val = use->getValue();
val->removeUse(use);
}
}
} // namespace sysy
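`DeletePhiInst` above places the copy for each phi operand either at the end of the predecessor or, when the predecessor has several successors, in a freshly created `phidel.L*` block on that edge. The following is a minimal sketch of that edge-splitting decision on a toy CFG. `Block`, `Cfg` and `copy_site_for_edge` are hypothetical names introduced for illustration and do not exist in this repository; the sketch only rewires the graph and leaves instruction insertion and branch retargeting out.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical minimal CFG node; the real BasicBlock class carries much more state.
struct Block {
    std::string name;
    std::vector<Block*> preds;
    std::vector<Block*> succs;
};

struct Cfg {
    std::vector<std::unique_ptr<Block>> blocks;

    Block* add(const std::string& name) {
        blocks.push_back(std::make_unique<Block>());
        blocks.back()->name = name;
        return blocks.back().get();
    }

    static void connect(Block* from, Block* to) {
        from->succs.push_back(to);
        to->preds.push_back(from);
    }
};

// Returns the block in which the copy for the edge pred -> succ should be placed.
// If pred has several successors, the edge is split so that the copy does not
// execute on pred's other outgoing paths; otherwise pred itself is returned.
Block* copy_site_for_edge(Cfg& cfg, Block* pred, Block* succ) {
    if (pred->succs.size() <= 1) return pred;
    Block* mid = cfg.add(pred->name + "_to_" + succ->name);
    // Replace in place so the predecessor order seen by succ's phis is preserved.
    std::replace(pred->succs.begin(), pred->succs.end(), succ, mid);
    std::replace(succ->preds.begin(), succ->preds.end(), pred, mid);
    mid->preds.push_back(pred);
    mid->succs.push_back(succ);
    return mid;
}
```

Replacing the entry in place rather than erasing and appending matters because phi operands are matched to predecessors by position, as the `predBlockindex` counter in `DeletePhiInst` shows.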


@ -0,0 +1,532 @@
#include "SysYIRAnalyser.h"
#include <iostream>
namespace sysy {
void ControlFlowAnalysis::init() {
// Initialize the analysis state
auto &functions = pModule->getFunctions();
for (const auto &function : functions) {
auto func = function.second.get();
auto basicBlocks = func->getBasicBlocks();
for (auto &basicBlock : basicBlocks) {
blockAnalysisInfo[basicBlock.get()] = new BlockAnalysisInfo();
blockAnalysisInfo[basicBlock.get()]->clear();
}
functionAnalysisInfo[func] = new FunctionAnalysisInfo();
functionAnalysisInfo[func]->clear();
}
}
void ControlFlowAnalysis::runControlFlowAnalysis() {
// Run the control-flow analysis
clear(); // discard previous analysis results
init(); // initialize the analysis state
computeDomNode();
computeDomTree();
computeDomFrontierAllBlk();
}
void ControlFlowAnalysis::intersectOP4Dom(std::unordered_set<BasicBlock *> &dom, const std::unordered_set<BasicBlock *> &other) {
// Intersect dom with other, in place
for (auto it = dom.begin(); it != dom.end();) {
if (other.find(*it) == other.end()) {
// Blocks that are missing from other are removed from dom
it = dom.erase(it);
} else {
++it;
}
}
}
auto ControlFlowAnalysis::findCommonDominator(BasicBlock *a, BasicBlock *b) -> BasicBlock * {
// Find the nearest common dominator of two basic blocks
while (a != b) {
BlockAnalysisInfo* infoA = blockAnalysisInfo[a];
BlockAnalysisInfo* infoB = blockAnalysisInfo[b];
// If the depths differ, move the deeper block up to its immediate dominator
// TODO: trade space for time with binary lifting (low priority)
while (infoA->getDomDepth() > infoB->getDomDepth()) {
a = const_cast<BasicBlock*>(infoA->getIdom());
infoA = blockAnalysisInfo[a];
}
while (infoB->getDomDepth() > infoA->getDomDepth()) {
b = const_cast<BasicBlock*>(infoB->getIdom());
infoB = blockAnalysisInfo[b];
}
if (a == b) break;
a = const_cast<BasicBlock*>(infoA->getIdom());
b = const_cast<BasicBlock*>(infoB->getIdom());
}
return a;
}
void ControlFlowAnalysis::computeDomNode(){
auto &functions = pModule->getFunctions();
// Analyze the basic blocks of each function
for (const auto &function : functions) {
auto func = function.second.get();
auto basicBlocks = func->getBasicBlocks();
std::unordered_set<BasicBlock *> domSetTmp;
// Start with domSetTmp containing every block
auto entry_block = func->getEntryBlock();
entry_block->setName("Entry");
blockAnalysisInfo[entry_block]->addDominants(entry_block);
for (auto &basicBlock : basicBlocks) {
domSetTmp.emplace(basicBlock.get());
}
// Initialization
for (auto &basicBlock : basicBlocks) {
if (basicBlock.get() != entry_block) {
blockAnalysisInfo[basicBlock.get()]->setDominants(domSetTmp);
// Every non-entry block starts with the full block set N as its dominator set
}
}
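// Dominator equation:
// DOM[B] = {B} ∪ (⋂_{P ∈ pred(B)} DOM[P])
// where pred(B) is the set of predecessors of B.
// Iterate until the dominator sets stop changing.
// TODO: the Lengauer-Tarjan algorithm computes dominators more efficiently;
// visiting blocks in CFG topological order would also converge faster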
bool changed = true;
while (changed) {
changed = false;
// Visit every block except the entry block
for (auto &basicBlock : basicBlocks) {
if (basicBlock.get() != entry_block) {
auto olddom =
blockAnalysisInfo[basicBlock.get()]->getDominants();
std::unordered_set<BasicBlock *> dom =
blockAnalysisInfo[basicBlock->getPredecessors().front()]->getDominants();
// For each basic block, compute its dominator set:
// the intersection of its predecessors' dominator sets, plus the block itself
for (auto pred : basicBlock->getPredecessors()) {
intersectOP4Dom(dom, blockAnalysisInfo[pred]->getDominants());
}
dom.emplace(basicBlock.get());
blockAnalysisInfo[basicBlock.get()]->setDominants(dom);
if (dom != olddom) {
changed = true;
}
}
}
}
}
}
// TODO: switch to the Semi-NCA algorithm
void ControlFlowAnalysis::computeDomTree() {
// Build the dominator tree
auto &functions = pModule->getFunctions();
for (const auto &function : functions) {
auto func = function.second.get();
auto basicBlocks = func->getBasicBlocks();
auto entry_block = func->getEntryBlock();
blockAnalysisInfo[entry_block]->setIdom(entry_block);
blockAnalysisInfo[entry_block]->setDomDepth(0); // the entry block has depth 0
bool changed = true;
while (changed) {
changed = false;
for (auto &basicBlock : basicBlocks) {
if (basicBlock.get() == entry_block) continue;
BasicBlock *new_idom = nullptr;
for (auto pred : basicBlock->getPredecessors()) {
// Skip predecessors that have not been processed yet
if (blockAnalysisInfo[pred]->getIdom() == nullptr) continue;
// new_idom = (new_idom == nullptr) ? pred : findCommonDominator(new_idom, pred);
if (new_idom == nullptr)
new_idom = pred;
else
new_idom = findCommonDominator(new_idom, pred);
}
// Update the immediate dominator
if (new_idom && new_idom != blockAnalysisInfo[basicBlock.get()]->getIdom()) {
// Remove the old dominance relation
if (blockAnalysisInfo[basicBlock.get()]->getIdom()) {
blockAnalysisInfo[const_cast<BasicBlock*>(blockAnalysisInfo[basicBlock.get()]->getIdom())]->removeSdoms(basicBlock.get());
}
// Record the new dominance relation
// std::cout << "Block: " << basicBlock->getName()
// << " New Idom: " << new_idom->getName() << std::endl;
blockAnalysisInfo[basicBlock.get()]->setIdom(new_idom);
blockAnalysisInfo[new_idom]->addSdoms(basicBlock.get());
// New depth = depth of the immediate dominator + 1
blockAnalysisInfo[basicBlock.get()]->setDomDepth(
blockAnalysisInfo[new_idom]->getDomDepth() + 1);
changed = true;
}
}
}
}
// for (auto &basicBlock : basicBlocks) {
// if (basicBlock.get() != func->getEntryBlock()) {
// auto dominats =
// blockAnalysisInfo[basicBlock.get()]->getDominants();
// bool found = false;
// // Search for the immediate dominator starting from the predecessors
// std::queue<BasicBlock *> q;
// for (auto pred : basicBlock->getPredecessors()) {
// q.push(pred);
// }
// // BFS over the predecessors until the immediate dominator is found
// while (!found && !q.empty()) {
// auto curr = q.front();
// q.pop();
// if (curr == basicBlock.get())
// continue;
// if (dominats.count(curr) != 0U) {
// blockAnalysisInfo[basicBlock.get()]->setIdom(curr);
// blockAnalysisInfo[curr]->addSdoms(basicBlock.get());
// found = true;
// } else {
// for (auto pred : curr->getPredecessors()) {
// q.push(pred);
// }
// }
// }
// }
// }
}
// std::unordered_set<BasicBlock *> ControlFlowAnalysis::computeDomFrontier(BasicBlock *block) {
// std::unordered_set<BasicBlock *> ret_list;
// // Compute localDF
// for (auto local_successor : block->getSuccessors()) {
// if (local_successor->getIdom() != block) {
// ret_list.emplace(local_successor);
// }
// }
// // Compute upDF
// for (auto up_successor : block->getSdoms()) {
// auto childrenDF = computeDF(up_successor);
// for (auto w : childrenDF) {
// if (block != w->getIdom() || block == w) {
// ret_list.emplace(w);
// }
// }
// }
// return ret_list;
// }
void ControlFlowAnalysis::computeDomFrontierAllBlk() {
auto &functions = pModule->getFunctions();
for (const auto &function : functions) {
auto func = function.second.get();
auto basicBlocks = func->getBasicBlocks();
// Sort by dominator-tree depth (deepest first)
std::vector<BasicBlock *> orderedBlocks;
for (auto &bb : basicBlocks) {
orderedBlocks.push_back(bb.get());
}
std::sort(orderedBlocks.begin(), orderedBlocks.end(),
[this](BasicBlock *a, BasicBlock *b) {
return blockAnalysisInfo[a]->getDomDepth() > blockAnalysisInfo[b]->getDomDepth();
});
// Compute the dominance frontiers
for (auto block : orderedBlocks) {
std::unordered_set<BasicBlock *> df;
// Local DF: direct successors that are not immediately dominated by this block
for (auto succ : block->getSuccessors()) {
// The current block is not the successor's immediate dominator
if (blockAnalysisInfo[succ]->getIdom() != block) {
df.insert(succ);
}
}
// Up DF: inherited from the children in the dominator tree
for (auto child : blockAnalysisInfo[block]->getSdoms()) {
for (auto w : blockAnalysisInfo[child]->getDomFrontiers()) {
// Keep w if the current block is not its immediate dominator
if (block != blockAnalysisInfo[w]->getIdom()) {
df.insert(w);
}
}
}
blockAnalysisInfo[block]->setDomFrontiers(df);
}
}
}
// ==========================
// dataflow analysis utils
// ==========================
// Borrowed from a senior student's code for now
// TODO: make the worklist visit blocks in reverse post-order
void DataFlowAnalysisUtils::forwardAnalyze(Module *pModule){
std::map<DataFlowAnalysis *, bool> workAnalysis;
for (auto &dataflow : forwardAnalysisList) {
dataflow->init(pModule);
}
for (const auto &function : pModule->getFunctions()) {
for (auto &dataflow : forwardAnalysisList) {
workAnalysis.emplace(dataflow, false);
}
while (!workAnalysis.empty()) {
for (const auto &block : function.second->getBasicBlocks()) {
for (auto &elem : workAnalysis) {
if (elem.first->analyze(pModule, block.get())) {
elem.second = true;
}
}
}
std::map<DataFlowAnalysis *, bool> tmp;
std::remove_copy_if(workAnalysis.begin(), workAnalysis.end(), std::inserter(tmp, tmp.end()),
[](const std::pair<DataFlowAnalysis *, bool> &elem) -> bool { return !elem.second; });
workAnalysis.swap(tmp);
for (auto &elem : workAnalysis) {
elem.second = false;
}
}
}
}
void DataFlowAnalysisUtils::backwardAnalyze(Module *pModule) {
std::map<DataFlowAnalysis *, bool> workAnalysis;
for (auto &dataflow : backwardAnalysisList) {
dataflow->init(pModule);
}
for (const auto &function : pModule->getFunctions()) {
for (auto &dataflow : backwardAnalysisList) {
workAnalysis.emplace(dataflow, false);
}
while (!workAnalysis.empty()) {
for (const auto &block : function.second->getBasicBlocks()) {
for (auto &elem : workAnalysis) {
if (elem.first->analyze(pModule, block.get())) {
elem.second = true;
}
}
}
std::map<DataFlowAnalysis *, bool> tmp;
std::remove_copy_if(workAnalysis.begin(), workAnalysis.end(), std::inserter(tmp, tmp.end()),
[](const std::pair<DataFlowAnalysis *, bool> &elem) -> bool { return !elem.second; });
workAnalysis.swap(tmp);
for (auto &elem : workAnalysis) {
elem.second = false;
}
}
}
}
std::set<User *> ActiveVarAnalysis::getUsedSet(Instruction *inst) {
using Kind = Instruction::Kind;
std::vector<User *> operands;
for (const auto &operand : inst->getOperands()) {
operands.emplace_back(dynamic_cast<User *>(operand->getValue()));
}
std::set<User *> result;
switch (inst->getKind()) {
// phi op
case Kind::kPhi:
case Kind::kCall:
result.insert(std::next(operands.begin()), operands.end());
break;
case Kind::kCondBr:
result.insert(operands[0]);
break;
case Kind::kBr:
case Kind::kAlloca:
break;
// mem op
case Kind::kStore:
// StoreInst: operand 0 is the stored value, operand 1 is the destination variable;
// any further operands are array dimension indices
result.insert(operands[0]);
result.insert(operands.begin() + 2, operands.end());
break;
case Kind::kLoad:
case Kind::kLa: {
auto variable = dynamic_cast<AllocaInst *>(operands[0]);
auto global = dynamic_cast<GlobalValue *>(operands[0]);
auto constArray = dynamic_cast<ConstantVariable *>(operands[0]);
if ((variable != nullptr && variable->getNumDims() == 0) || (global != nullptr && global->getNumDims() == 0) ||
(constArray != nullptr && constArray->getNumDims() == 0)) {
result.insert(operands[0]);
}
result.insert(std::next(operands.begin()), operands.end());
break;
}
case Kind::kGetSubArray: {
for (unsigned i = 2; i < operands.size(); i++) {
// Array dimension information
result.insert(operands[i]);
}
break;
}
case Kind::kMemset: {
result.insert(std::next(operands.begin()), operands.end());
break;
}
case Kind::kInvalid:
// Binary
case Kind::kAdd:
case Kind::kSub:
case Kind::kMul:
case Kind::kDiv:
case Kind::kRem:
case Kind::kICmpEQ:
case Kind::kICmpNE:
case Kind::kICmpLT:
case Kind::kICmpLE:
case Kind::kICmpGT:
case Kind::kICmpGE:
case Kind::kFAdd:
case Kind::kFSub:
case Kind::kFMul:
case Kind::kFDiv:
case Kind::kFCmpEQ:
case Kind::kFCmpNE:
case Kind::kFCmpLT:
case Kind::kFCmpLE:
case Kind::kFCmpGT:
case Kind::kFCmpGE:
case Kind::kAnd:
case Kind::kOr:
// Unary
case Kind::kNeg:
case Kind::kNot:
case Kind::kFNot:
case Kind::kFNeg:
case Kind::kFtoI:
case Kind::kItoF:
// terminator
case Kind::kReturn:
result.insert(operands.begin(), operands.end());
break;
default:
assert(false);
break;
}
result.erase(nullptr);
return result;
}
User * ActiveVarAnalysis::getDefine(Instruction *inst) {
User *result = nullptr;
if (inst->isStore()) {
StoreInst* store = dynamic_cast<StoreInst *>(inst);
auto operand = store->getPointer();
AllocaInst* variable = dynamic_cast<AllocaInst *>(operand);
GlobalValue* global = dynamic_cast<GlobalValue *>(operand);
if ((variable != nullptr && variable->getNumDims() != 0) || (global != nullptr && global->getNumDims() != 0)) {
// A store into an array (local or global) does not count as a definition
// TODO: support array variables
result = nullptr;
} else {
result = dynamic_cast<User *>(operand);
}
} else if (inst->isPhi()) {
result = dynamic_cast<User *>(inst->getOperand(0));
} else if (inst->isBinary() || inst->isUnary() || inst->isCall() ||
inst->isLoad() || inst->isLa()) {
result = dynamic_cast<User *>(inst);
}
return result;
}
void ActiveVarAnalysis::init(Module *pModule) {
for (const auto &function : pModule->getFunctions()) {
for (const auto &block : function.second->getBasicBlocks()) {
activeTable.emplace(block.get(), std::vector<std::set<User *>>{});
for (unsigned i = 0; i < block->getNumInstructions() + 1; i++)
activeTable.at(block.get()).emplace_back();
}
}
}
// Liveness transfer function for a single block; called by the data-flow driver
bool ActiveVarAnalysis::analyze(Module *pModule, BasicBlock *block) {
bool changed = false; // whether any data-flow result changed
std::set<User *> activeSet{}; // live set currently being computed
// Step 1: compute the live set at the block exit (OUT[B])
// Equation: OUT[B] = ∪_{S ∈ succ(B)} IN[S]
for (const auto &succ : block->getSuccessors()) {
// Live set at the successor's entry (IN[S])
auto succActiveSet = activeTable.at(succ).front();
// Merge the entry live sets of all successors
activeSet.insert(succActiveSet.begin(), succActiveSet.end());
}
// Step 2: update the live set stored for the block exit
const auto &instructions = block->getInstructions();
const auto numInstructions = instructions.size();
// Previous exit live set (the block exit is stored at index numInstructions)
const auto &oldEndActiveSet = activeTable.at(block)[numInstructions];
// Check whether the exit live set changed
if (!std::equal(activeSet.begin(), activeSet.end(),
oldEndActiveSet.begin(), oldEndActiveSet.end()))
{
changed = true; // record the change
activeTable.at(block)[numInstructions] = activeSet; // update the exit live set
}
// Step 3: walk the instructions in reverse order,
// computing the live set at each program point from the last instruction backwards
auto instructionIter = instructions.end();
// Walk from exit to entry (the index runs from numInstructions down to 1)
for (unsigned i = numInstructions; i > 0; i--) {
--instructionIter; // step to instruction i-1; safe for empty blocks and never moves before begin()
auto inst = instructionIter->get(); // current instruction
auto used = getUsedSet(inst);
User *defined = getDefine(inst);
// Compute the live set at the instruction's entry (IN[i])
// Equation: IN[i] = use_i ∪ (OUT[i] - def_i)
activeSet.erase(defined); // remove the defined variable (OUT[i] - def_i)
activeSet.insert(used.begin(), used.end()); // add the used variables
// Previous entry live set (index i-1 is the entry of the current instruction)
const auto &oldActiveSet = activeTable.at(block)[i - 1];
// Check whether the live set changed
if (!std::equal(activeSet.begin(), activeSet.end(),
oldActiveSet.begin(), oldActiveSet.end()))
{
changed = true; // record the change
activeTable.at(block)[i - 1] = activeSet; // update the entry live set
}
}
return changed; // whether the data-flow result changed
}
auto ActiveVarAnalysis::getActiveTable() const -> const std::map<BasicBlock *, std::vector<std::set<User *>>> & {
return activeTable;
}
} // namespace sysy
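The per-instruction step in `ActiveVarAnalysis::analyze` applies the equation IN[i] = use_i ∪ (OUT[i] - def_i) from the last instruction of a block back to the first. The sketch below replays that backward pass on a toy straight-line block so the effect of each step is easy to follow. `Inst` and `block_liveness` are hypothetical names used only for illustration, and variables are plain strings instead of `User*` pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical straight-line instruction: at most one defined name, any number of uses.
struct Inst {
    std::string def;                // empty if the instruction defines nothing
    std::vector<std::string> uses;  // names read by the instruction
};

// Returns live[i] = set of names live just before instruction i,
// with live[n] = liveOut (the set live at the block exit).
std::vector<std::set<std::string>> block_liveness(const std::vector<Inst>& insts,
                                                  const std::set<std::string>& liveOut) {
    std::vector<std::set<std::string>> live(insts.size() + 1);
    live[insts.size()] = liveOut;
    std::set<std::string> cur = liveOut;
    for (std::size_t i = insts.size(); i > 0; --i) {
        const Inst& inst = insts[i - 1];
        if (!inst.def.empty()) cur.erase(inst.def);      // OUT[i] - def_i
        cur.insert(inst.uses.begin(), inst.uses.end());  // union with use_i
        live[i - 1] = cur;
    }
    return live;
}
```

For the block `t = a + b; u = t * c; ret u` with nothing live at the exit, the pass yields `{u}` before the return, `{c, t}` before the multiplication and `{a, b, c}` at the block entry. The layout with one extra slot for the block exit mirrors the `activeTable` vector, which holds `getNumInstructions() + 1` sets per block.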


@ -73,10 +73,12 @@ std::any SysYIRGenerator::visitGlobalVarDecl(SysYParser::GlobalVarDeclContext *c
}
}
- ArrayValueTree* root = std::any_cast<ArrayValueTree *>(varDef->initVal()->accept(this));
- ValueCounter values;
- Utils::tree2Array(type, root, dims, dims.size(), values, &builder);
- delete root;
+ ValueCounter values = {};
+ if (varDef->initVal() != nullptr) {
+ ArrayValueTree* root = std::any_cast<ArrayValueTree *>(varDef->initVal()->accept(this));
+ Utils::tree2Array(type, root, dims, dims.size(), values, &builder);
+ delete root;
+ }
// Create the global variable and update the symbol table
module->createGlobalValue(name, Type::getPointerType(type), dims, values);
}
@ -202,6 +204,7 @@ std::any SysYIRGenerator::visitFuncType(SysYParser::FuncTypeContext *ctx) {
std::any SysYIRGenerator::visitFuncDef(SysYParser::FuncDefContext *ctx){
// Enter a new scope
module->enterNewScope();
+ HasReturnInst = false;
auto name = ctx->Ident()->getText();
std::vector<Type *> paramTypes;
@ -241,6 +244,18 @@ std::any SysYIRGenerator::visitFuncDef(SysYParser::FuncDefContext *ctx){
visitBlockItem(item);
}
+ if(HasReturnInst == false) {
+ // No return statement was generated: return 0 by default
+ if (returnType != Type::getVoidType()) {
+ Value* returnValue = ConstantValue::get(0);
+ if (returnType == Type::getFloatType()) {
+ returnValue = ConstantValue::get(0.0f);
+ }
+ builder.createReturnInst(returnValue);
+ } else {
+ builder.createReturnInst();
+ }
+ }
module->leaveScope();
return std::any();
@ -262,7 +277,7 @@ std::any SysYIRGenerator::visitAssignStmt(SysYParser::AssignStmtContext *ctx) {
dims.push_back(std::any_cast<Value *>(visitExp(exp)));
}
- User* variable = module->getVariable(name);
+ auto variable = module->getVariable(name);
Value* value = std::any_cast<Value *>(visitExp(ctx->exp()));
Type* variableType = dynamic_cast<PointerType *>(variable->getType())->getBaseType();
@ -308,7 +323,7 @@ std::any SysYIRGenerator::visitIfStmt(SysYParser::IfStmtContext *ctx) {
builder.popTrueBlock();
builder.popFalseBlock();
- labelstring << "then.L" << builder.getLabelIndex();
+ labelstring << "if_then.L" << builder.getLabelIndex();
thenBlock->setName(labelstring.str());
labelstring.str("");
function->addBasicBlock(thenBlock);
@ -327,7 +342,7 @@ std::any SysYIRGenerator::visitIfStmt(SysYParser::IfStmtContext *ctx) {
builder.createUncondBrInst(exitBlock, {});
BasicBlock::conectBlocks(builder.getBasicBlock(), exitBlock);
- labelstring << "else.L" << builder.getLabelIndex();
+ labelstring << "if_else.L" << builder.getLabelIndex();
elseBlock->setName(labelstring.str());
labelstring.str("");
function->addBasicBlock(elseBlock);
@ -341,9 +356,10 @@ std::any SysYIRGenerator::visitIfStmt(SysYParser::IfStmtContext *ctx) {
ctx->stmt(1)->accept(this);
module->leaveScope();
}
+ builder.createUncondBrInst(exitBlock, {});
BasicBlock::conectBlocks(builder.getBasicBlock(), exitBlock);
- labelstring << "exit.L" << builder.getLabelIndex();
+ labelstring << "if_exit.L" << builder.getLabelIndex();
exitBlock->setName(labelstring.str());
labelstring.str("");
function->addBasicBlock(exitBlock);
@ -356,7 +372,7 @@ std::any SysYIRGenerator::visitIfStmt(SysYParser::IfStmtContext *ctx) {
builder.popTrueBlock();
builder.popFalseBlock();
- labelstring << "then.L" << builder.getLabelIndex();
+ labelstring << "if_then.L" << builder.getLabelIndex();
thenBlock->setName(labelstring.str());
labelstring.str("");
function->addBasicBlock(thenBlock);
@ -372,7 +388,7 @@ std::any SysYIRGenerator::visitIfStmt(SysYParser::IfStmtContext *ctx) {
}
BasicBlock::conectBlocks(builder.getBasicBlock(), exitBlock);
- labelstring << "exit.L" << builder.getLabelIndex();
+ labelstring << "if_exit.L" << builder.getLabelIndex();
exitBlock->setName(labelstring.str());
labelstring.str("");
function->addBasicBlock(exitBlock);
@ -388,7 +404,7 @@ std::any SysYIRGenerator::visitWhileStmt(SysYParser::WhileStmtContext *ctx) {
Function* function = builder.getBasicBlock()->getParent();
std::stringstream labelstring;
- labelstring << "head.L" << builder.getLabelIndex();
+ labelstring << "while_head.L" << builder.getLabelIndex();
BasicBlock *headBlock = function->addBasicBlock(labelstring.str());
labelstring.str("");
BasicBlock::conectBlocks(curBlock, headBlock);
@ -404,7 +420,7 @@ std::any SysYIRGenerator::visitWhileStmt(SysYParser::WhileStmtContext *ctx) {
builder.popTrueBlock();
builder.popFalseBlock();
- labelstring << "body.L" << builder.getLabelIndex();
+ labelstring << "while_body.L" << builder.getLabelIndex();
bodyBlock->setName(labelstring.str());
labelstring.str("");
function->addBasicBlock(bodyBlock);
@ -428,7 +444,7 @@ std::any SysYIRGenerator::visitWhileStmt(SysYParser::WhileStmtContext *ctx) {
builder.popBreakBlock();
builder.popContinueBlock();
- labelstring << "exit.L" << builder.getLabelIndex();
+ labelstring << "while_exit.L" << builder.getLabelIndex();
exitBlock->setName(labelstring.str());
labelstring.str("");
function->addBasicBlock(exitBlock);
@ -439,7 +455,7 @@ std::any SysYIRGenerator::visitWhileStmt(SysYParser::WhileStmtContext *ctx) {
std::any SysYIRGenerator::visitBreakStmt(SysYParser::BreakStmtContext *ctx) {
BasicBlock* breakBlock = builder.getBreakBlock();
- builder.pushBreakBlock(breakBlock);
+ builder.createUncondBrInst(breakBlock, {});
BasicBlock::conectBlocks(builder.getBasicBlock(), breakBlock);
return std::any();
}
@ -447,6 +463,7 @@ std::any SysYIRGenerator::visitBreakStmt(SysYParser::BreakStmtContext *ctx) {
std::any SysYIRGenerator::visitContinueStmt(SysYParser::ContinueStmtContext *ctx) {
BasicBlock* continueBlock = builder.getContinueBlock();
+ builder.createUncondBrInst(continueBlock, {});
BasicBlock::conectBlocks(builder.getBasicBlock(), continueBlock);
return std::any();
}
@ -456,7 +473,7 @@ std::any SysYIRGenerator::visitReturnStmt(SysYParser::ReturnStmtContext *ctx) {
returnValue = std::any_cast<Value *>(visitExp(ctx->exp()));
}
- Type* funcType = builder.getBasicBlock()->getParent()->getType();
+ Type* funcType = builder.getBasicBlock()->getParent()->getReturnType();
if (returnValue != nullptr && funcType != returnValue->getType()) {
ConstantValue * constValue = dynamic_cast<ConstantValue *>(returnValue);
if (constValue != nullptr) {
@ -474,6 +491,7 @@ std::any SysYIRGenerator::visitReturnStmt(SysYParser::ReturnStmtContext *ctx) {
}
}
builder.createReturnInst(returnValue);
+ HasReturnInst = true;
return std::any();
}

484
src/SysYIROptPre.cpp Normal file

@ -0,0 +1,484 @@
#include "SysYIROptPre.h"
#include <cassert>
#include <list>
#include <map>
#include <memory>
#include <string>
#include <iostream>
#include "IR.h"
#include "IRBuilder.h"
namespace sysy {
/**
* Remove this instruction's uses from its operands so that later analyses are not disturbed.
* instr: the instruction to be deleted
*/
void SysYOptPre::usedelete(Instruction *instr) {
for (auto &use : instr->getOperands()) {
Value* val = use->getValue();
// std::cout << delete << val->getName() << std::endl;
val->removeUse(use);
}
}
// Delete the dead instructions that follow a branch
void SysYOptPre::SysYDelInstAfterBr() {
auto &functions = pModule->getFunctions();
for (auto &function : functions) {
auto basicBlocks = function.second->getBasicBlocks();
for (auto &basicBlock : basicBlocks) {
bool Branch = false;
auto &instructions = basicBlock->getInstructions();
auto Branchiter = instructions.end();
for (auto iter = instructions.begin(); iter != instructions.end(); ++iter) {
if (Branch)
usedelete(iter->get());
else if ((*iter)->isTerminator()){
Branch = true;
Branchiter = iter;
}
}
if (Branchiter != instructions.end()) ++Branchiter;
while (Branchiter != instructions.end())
Branchiter = instructions.erase(Branchiter);
if (Branch) { // update the predecessor/successor relations
auto thelastinstinst = basicBlock->getInstructions().end();
--thelastinstinst;
auto &Successors = basicBlock->getSuccessors();
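// Detach every current successor. removeSuccessor erases the entry, so loop until the list is empty
// instead of reusing an iterator that the erase has invalidated.
while (!Successors.empty()) {
auto succ = Successors.front();
succ->removePredecessor(basicBlock.get());
basicBlock->removeSuccessor(succ);
}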
if (thelastinstinst->get()->isUnconditional()) {
BasicBlock* branchBlock = dynamic_cast<BasicBlock *>(thelastinstinst->get()->getOperand(0));
basicBlock->addSuccessor(branchBlock);
branchBlock->addPredecessor(basicBlock.get());
} else if (thelastinstinst->get()->isConditional()) {
BasicBlock* thenBlock = dynamic_cast<BasicBlock *>(thelastinstinst->get()->getOperand(1));
BasicBlock* elseBlock = dynamic_cast<BasicBlock *>(thelastinstinst->get()->getOperand(2));
basicBlock->addSuccessor(thenBlock);
basicBlock->addSuccessor(elseBlock);
thenBlock->addPredecessor(basicBlock.get());
elseBlock->addPredecessor(basicBlock.get());
}
}
}
}
}
void SysYOptPre::SysYBlockMerge() {
auto &functions = pModule->getFunctions(); //std::map<std::string, std::unique_ptr<Function>>
for (auto &function : functions) {
// auto basicBlocks = function.second->getBasicBlocks();
auto &func = function.second;
for (auto blockiter = func->getBasicBlocks().begin();
blockiter != func->getBasicBlocks().end();) {
if (blockiter->get()->getNumSuccessors() == 1) {
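// If the current block has exactly one successor
// and that successor has exactly one predecessor,
// merge the successor into the current block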
if (((blockiter->get())->getSuccessors()[0])->getNumPredecessors() == 1) {
// std::cout << "merge block: " << blockiter->get()->getName() << std::endl;
BasicBlock* block = blockiter->get();
BasicBlock* nextBlock = blockiter->get()->getSuccessors()[0];
auto nextarguments = nextBlock->getArguments();
// Remove the branch instruction
if (block->getNumInstructions() != 0) {
auto thelastinstinst = block->end();
(--thelastinstinst);
if (thelastinstinst->get()->isUnconditional()) {
usedelete(thelastinstinst->get());
block->getInstructions().erase(thelastinstinst);
} else if (thelastinstinst->get()->isConditional()) {
// Conditional branch: check whether both branch targets are the same (mainly targets identical boolean expressions)
if (thelastinstinst->get()->getOperand(1)->getName() == thelastinstinst->get()->getOperand(2)->getName()) {
usedelete(thelastinstinst->get());
block->getInstructions().erase(thelastinstinst);
}
}
}
// Move the successor's instructions into the current block
// and set their parent pointer to the current block
for (auto institer = nextBlock->begin(); institer != nextBlock->end();) {
institer->get()->setParent(block);
block->getInstructions().emplace_back(institer->release());
institer = nextBlock->getInstructions().erase(institer);
}
// Merge the block arguments
// TODO: is deduplication needed?
for (auto &argm : nextarguments) {
argm->setParent(block);
block->insertArgument(argm);
}
// Update the predecessor/successor relations, similar to relinking tree nodes
block->removeSuccessor(nextBlock);
nextBlock->removePredecessor(block);
std::list<BasicBlock *> succshoulddel;
for (auto &succ : nextBlock->getSuccessors()) {
block->addSuccessor(succ);
succ->replacePredecessor(nextBlock, block);
succshoulddel.push_back(succ);
}
for (auto del : succshoulddel) {
nextBlock->removeSuccessor(del);
}
func->removeBasicBlock(nextBlock);
} else {
blockiter++;
}
} else {
blockiter++;
}
}
}
}
// Remove blocks that have no predecessors (unreachable blocks); also handles IR that is already in SSA form
void SysYOptPre::SysYDelNoPreBLock() {
auto &functions = pModule->getFunctions(); // std::map<std::string, std::unique_ptr<sysy::Function>>
for (auto &function : functions) {
auto &func = function.second;
for (auto &block : func->getBasicBlocks()) {
block->setreachableFalse();
}
// Breadth-first traversal from the entry block to mark every reachable basic block
auto entryBlock = func->getEntryBlock();
entryBlock->setreachableTrue();
std::queue<BasicBlock *> blockqueue;
blockqueue.push(entryBlock);
while (!blockqueue.empty()) {
auto block = blockqueue.front();
blockqueue.pop();
for (auto &succ : block->getSuccessors()) {
if (!succ->getreachable()) {
succ->setreachableTrue();
blockqueue.push(succ);
}
}
}
// Release the uses held by instructions in unreachable blocks
for (auto blockIter = func->getBasicBlocks().begin(); blockIter != func->getBasicBlocks().end();blockIter++) {
if (!blockIter->get()->getreachable())
for (auto &iterInst : blockIter->get()->getInstructions())
usedelete(iterInst.get());
}
for (auto blockIter = func->getBasicBlocks().begin(); blockIter != func->getBasicBlocks().end();) {
if (!blockIter->get()->getreachable()) {
for (auto succblock : blockIter->get()->getSuccessors()) {
int indexphi = 1;
for (auto pred : succblock->getPredecessors()) {
if (pred == blockIter->get()) {
break;
}
indexphi++;
}
for (auto &phiinst : succblock->getInstructions()) {
if (phiinst->getKind() != Instruction::kPhi) {
break;
}
phiinst->removeOperand(indexphi);
}
}
// Remove the unreachable block; advance the iterator first so it is not invalidated
func->removeBasicBlock((blockIter++)->get());
} else {
blockIter++;
}
}
}
}
void SysYOptPre::SysYDelEmptyBlock() {
auto &functions = pModule->getFunctions();
for (auto &function : functions) {
// Collect empty blocks.
// An empty block here is one with no real instructions;
// a block holding only phi instructions and an unconditional branch also counts.
auto basicBlocks = function.second->getBasicBlocks();
std::map<sysy::BasicBlock *, BasicBlock *> EmptyBlocks;
// Map from each empty block to its successor
for (auto &basicBlock : basicBlocks) {
if (basicBlock->getNumInstructions() == 0) {
if (basicBlock->getNumSuccessors() == 1) {
EmptyBlocks[basicBlock.get()] = basicBlock->getSuccessors().front();
}
}
else{
// The block holds only phi instructions and an optional uncondbr: (phi)*(uncondbr)?
// Check that every instruction is either a phi or an unconditional branch
bool onlyPhi = true;
for (auto &inst : basicBlock->getInstructions()) {
if (!inst->isPhi() && !inst->isUnconditional()) {
onlyPhi = false;
break;
}
}
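// Guard against blocks without a successor: front() on an empty list is undefined
if (onlyPhi && basicBlock->getNumSuccessors() == 1)
EmptyBlocks[basicBlock.get()] = basicBlock->getSuccessors().front();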
}
}
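// Update basic-block information and add the instructions that are needed
for (auto &basicBlock : basicBlocks) {
// Turn a block with no instructions into one that holds only a jump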
if (distance(basicBlock->begin(), basicBlock->end()) == 0) {
if (basicBlock->getNumSuccessors() == 0) {
continue;
}
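if (basicBlock->getNumSuccessors() > 1) {
// assert("") never fires because a string literal is a non-null pointer
assert(false && "an empty block must not have more than one successor");
}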
pBuilder->setPosition(basicBlock.get(), basicBlock->end());
pBuilder->createUncondBrInst(basicBlock->getSuccessors()[0], {});
continue;
}
auto thelastinst = basicBlock->getInstructions().end();
--thelastinst;
// Follow the br target through any chain of empty blocks
if (thelastinst->get()->isUnconditional()) {
BasicBlock* OldBrBlock = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0));
BasicBlock *thelastBlockOld = nullptr;
// The chain may contain several empty blocks
while (EmptyBlocks.find(dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0))) !=
EmptyBlocks.end()) {
thelastBlockOld = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0));
thelastinst->get()->replaceOperand(0, EmptyBlocks[thelastBlockOld]);
}
basicBlock->removeSuccessor(OldBrBlock);
OldBrBlock->removePredecessor(basicBlock.get());
basicBlock->addSuccessor(dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0)));
dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0))->addPredecessor(basicBlock.get());
if (thelastBlockOld != nullptr) {
int indexphi = 0;
for (auto &pred : dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0))->getPredecessors()) {
if (pred == thelastBlockOld) {
break;
}
indexphi++;
}
// Update the phi operands:
// remove the operand that corresponds to thelastBlockOld
for (auto &InstInNew : dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0))->getInstructions()) {
if (InstInNew->isPhi()) {
dynamic_cast<PhiInst *>(InstInNew.get())->removeOperand(indexphi + 1);
} else {
break;
}
}
}
} else if (thelastinst->get()->getKind() == Instruction::kCondBr) {
auto OldThenBlock = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1));
auto OldElseBlock = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2));
BasicBlock *thelastBlockOld = nullptr;
while (EmptyBlocks.find(dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1))) !=
EmptyBlocks.end()) {
thelastBlockOld = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1));
thelastinst->get()->replaceOperand(
1, EmptyBlocks[dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1))]);
}
basicBlock->removeSuccessor(OldThenBlock);
OldThenBlock->removePredecessor(basicBlock.get());
// Handle the case where the then and else branches now target the same block
if (dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1)) ==
dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2))) {
auto thebrBlock = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1));
usedelete(thelastinst->get());
thelastinst = basicBlock->getInstructions().erase(thelastinst);
pBuilder->setPosition(basicBlock.get(), basicBlock->end());
pBuilder->createUncondBrInst(thebrBlock, {});
continue;
}
basicBlock->addSuccessor(dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1)));
dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1))->addPredecessor(basicBlock.get());
// auto indexInNew = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0))->getPredecessors().
if (thelastBlockOld != nullptr) {
int indexphi = 0;
for (auto &pred : dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1))->getPredecessors()) {
if (pred == thelastBlockOld) {
break;
}
indexphi++;
}
for (auto &InstInNew : dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1))->getInstructions()) {
if (InstInNew->isPhi()) {
dynamic_cast<PhiInst *>(InstInNew.get())->removeOperand(indexphi + 1);
} else {
break;
}
}
}
thelastBlockOld = nullptr;
while (EmptyBlocks.find(dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2))) !=
EmptyBlocks.end()) {
thelastBlockOld = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2));
thelastinst->get()->replaceOperand(
2, EmptyBlocks[dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2))]);
}
basicBlock->removeSuccessor(OldElseBlock);
OldElseBlock->removePredecessor(basicBlock.get());
// Handle the case where the then and else branches now target the same block
if (dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1)) ==
dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2))) {
auto thebrBlock = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1));
usedelete(thelastinst->get());
thelastinst = basicBlock->getInstructions().erase(thelastinst);
pBuilder->setPosition(basicBlock.get(), basicBlock->end());
pBuilder->createUncondBrInst(thebrBlock, {});
continue;
}
basicBlock->addSuccessor(dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2)));
dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2))->addPredecessor(basicBlock.get());
if (thelastBlockOld != nullptr) {
int indexphi = 0;
for (auto &pred : dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2))->getPredecessors()) {
if (pred == thelastBlockOld) {
break;
}
indexphi++;
}
for (auto &InstInNew : dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2))->getInstructions()) {
if (InstInNew->isPhi()) {
dynamic_cast<PhiInst *>(InstInNew.get())->removeOperand(indexphi + 1);
} else {
break;
}
}
}
} else {
if (basicBlock->getNumSuccessors() == 1) {
pBuilder->setPosition(basicBlock.get(), basicBlock->end());
pBuilder->createUncondBrInst(basicBlock->getSuccessors()[0], {});
auto thelastinst = basicBlock->getInstructions().end();
(--thelastinst);
auto OldBrBlock = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0));
sysy::BasicBlock *thelastBlockOld = nullptr;
while (EmptyBlocks.find(dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0))) !=
EmptyBlocks.end()) {
thelastBlockOld = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0));
thelastinst->get()->replaceOperand(
0, EmptyBlocks[dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0))]);
}
basicBlock->removeSuccessor(OldBrBlock);
OldBrBlock->removePredecessor(basicBlock.get());
basicBlock->addSuccessor(dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0)));
dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0))->addPredecessor(basicBlock.get());
if (thelastBlockOld != nullptr) {
int indexphi = 0;
for (auto &pred : dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0))->getPredecessors()) {
if (pred == thelastBlockOld) {
break;
}
indexphi++;
}
for (auto &InstInNew : dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0))->getInstructions()) {
if (InstInNew->isPhi()) {
dynamic_cast<PhiInst *>(InstInNew.get())->removeOperand(indexphi + 1);
} else {
break;
}
}
}
}
}
}
for (auto iter = function.second->getBasicBlocks().begin(); iter != function.second->getBasicBlocks().end();) {
if (EmptyBlocks.find(iter->get()) != EmptyBlocks.end()) {
// Never remove the entry block
if (iter->get() == function.second->getEntryBlock()) {
++iter;
continue;
}
for (auto &iterInst : iter->get()->getInstructions())
usedelete(iterInst.get());
// Remove the phi operands in each successor that correspond to the block being deleted
for (auto &succ : iter->get()->getSuccessors()) {
int index = 0;
for (auto &pred : succ->getPredecessors()) {
if (pred == iter->get()) {
break;
}
index++;
}
for (auto &instinsucc : succ->getInstructions()) {
if (instinsucc->isPhi()) {
dynamic_cast<PhiInst *>(instinsucc.get())->removeOperand(index);
} else {
break;
}
}
}
function.second->removeBasicBlock((iter++)->get());
} else {
++iter;
}
}
}
}
// If a function has no return instruction, add a default one (mainly for void functions that fall off the end)
void SysYOptPre::SysYAddReturn() {
auto &functions = pModule->getFunctions();
for (auto &function : functions) {
auto &func = function.second;
auto basicBlocks = func->getBasicBlocks();
for (auto &block : basicBlocks) {
if (block->getNumSuccessors() == 0) {
// A block without successors must end in a return instruction; add one if it is missing
if (block->getNumInstructions() == 0) {
pBuilder->setPosition(block.get(), block->end());
pBuilder->createReturnInst();
}
auto thelastinst = block->getInstructions().end();
--thelastinst;
if (thelastinst->get()->getKind() != Instruction::kReturn) {
// std::cout << "Warning: Function " << func->getName() << " has no return instruction, adding default return." << std::endl;
pBuilder->setPosition(block.get(), block->end());
// TODO: should a missing return value in an int or float function be reported as an error?
if (func->getReturnType()->isInt()) {
pBuilder->createReturnInst(ConstantValue::get(0));
} else if (func->getReturnType()->isFloat()) {
pBuilder->createReturnInst(ConstantValue::get(0.0F));
} else {
pBuilder->createReturnInst();
}
}
}
}
}
}
} // namespace sysy
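
The reachability marking in `SysYDelNoPreBLock` is a breadth-first traversal from the entry block. A minimal standalone sketch of that idea follows; it assumes the CFG is given as plain successor lists indexed by block number, and `Cfg` and `markReachable` are hypothetical names that do not exist in this repository.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical stand-in for the CFG: succs[i] lists the successors of block i.
using Cfg = std::vector<std::vector<int>>;

// Breadth-first traversal from `entry`; returns one reachability flag per block.
std::vector<bool> markReachable(const Cfg &succs, int entry) {
  std::vector<bool> reachable(succs.size(), false);
  std::queue<int> worklist;
  reachable[entry] = true;
  worklist.push(entry);
  while (!worklist.empty()) {
    int block = worklist.front();
    worklist.pop();
    for (int succ : succs[block]) {
      if (!reachable[succ]) {  // visit each block at most once
        reachable[succ] = true;
        worklist.push(succ);
      }
    }
  }
  return reachable;
}
```

Blocks whose flag stays false after the traversal are the ones the pass would delete; a block that still has predecessors can be unreachable if all of those predecessors are unreachable too.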

483
src/SysYIRPrinter.cpp Normal file
View File

@ -0,0 +1,483 @@
#include "SysYIRPrinter.h"
#include <cassert>
#include <fstream>
#include <iostream>
#include <string>
#include "IR.h"
namespace sysy {
void SysYPrinter::printIR() {
const auto &functions = pModule->getFunctions();
//TODO: Print target datalayout and triple (minimal required by LLVM)
printGlobalVariable();
for (const auto &iter : functions) {
if (iter.second->getName() == "main") {
printFunction(iter.second.get());
break;
}
}
for (const auto &iter : functions) {
if (iter.second->getName() != "main") {
printFunction(iter.second.get());
}
}
}
std::string SysYPrinter::getTypeString(Type *type) {
if (type->isVoid()) {
return "void";
} else if (type->isInt()) {
return "i32";
} else if (type->isFloat()) {
return "float";
} else if (auto ptrType = dynamic_cast<PointerType*>(type)) {
return getTypeString(ptrType->getBaseType()) + "*";
} else if (auto ptrType = dynamic_cast<FunctionType*>(type)) {
return getTypeString(ptrType->getReturnType());
}
assert(false && "Unsupported type");
return "";
}
std::string SysYPrinter::getValueName(Value *value) {
if (auto global = dynamic_cast<GlobalValue*>(value)) {
return "@" + global->getName();
} else if (auto inst = dynamic_cast<Instruction*>(value)) {
return "%" + inst->getName();
} else if (auto constVal = dynamic_cast<ConstantValue*>(value)) {
if (constVal->isFloat()) {
return std::to_string(constVal->getFloat());
}
return std::to_string(constVal->getInt());
} else if (auto constVar = dynamic_cast<ConstantVariable*>(value)) {
return constVar->getName();
}
assert(false && "Unknown value type");
return "";
}
void SysYPrinter::printType(Type *type) {
std::cout << getTypeString(type);
}
void SysYPrinter::printValue(Value *value) {
std::cout << getValueName(value);
}
void SysYPrinter::printGlobalVariable() {
auto &globals = pModule->getGlobals();
for (const auto &global : globals) {
std::cout << "@" << global->getName() << " = global ";
auto baseType = dynamic_cast<PointerType *>(global->getType())->getBaseType();
printType(baseType);
if (global->getNumDims() > 0) {
// Array type
std::cout << " [";
for (unsigned i = 0; i < global->getNumDims(); i++) {
if (i > 0) std::cout << " x ";
std::cout << getValueName(global->getDim(i));
}
std::cout << "]";
}
std::cout << " ";
if (global->getNumDims() > 0) {
// Array initializer
std::cout << "[";
auto values = global->getInitValues();
auto counterValues = values.getValues();
auto counterNumbers = values.getNumbers();
for (size_t i = 0; i < counterNumbers.size(); i++) {
if (i > 0) std::cout << ", ";
if (baseType->isFloat()) {
std::cout << "float " << dynamic_cast<ConstantValue*>(counterValues[i])->getFloat();
} else {
std::cout << "i32 " << dynamic_cast<ConstantValue*>(counterValues[i])->getInt();
}
}
std::cout << "]";
} else {
// Scalar initializer
if (baseType->isFloat()) {
std::cout << "float " << dynamic_cast<ConstantValue*>(global->getByIndex(0))->getFloat();
} else {
std::cout << "i32 " << dynamic_cast<ConstantValue*>(global->getByIndex(0))->getInt();
}
}
std::cout << ", align 4" << std::endl;
}
}
void SysYPrinter::printFunction(Function *function) {
// Function signature
std::cout << "define ";
printType(function->getReturnType());
std::cout << " @" << function->getName() << "(";
auto entryBlock = function->getEntryBlock();
const auto &args_types = function->getParamTypes();
auto &args = entryBlock->getArguments();
int i = 0;
for (const auto &args_type : args_types) {
if (i > 0) std::cout << ", ";
printType(args_type);
std::cout << " %" << args[i]->getName();
i++;
}
std::cout << ") {" << std::endl;
// Function body
for (const auto &blockIter : function->getBasicBlocks()) {
// Basic block label
BasicBlock* blockPtr = blockIter.get();
if (blockPtr == function->getEntryBlock()) {
std::cout << "entry:" << std::endl;
} else if (!blockPtr->getName().empty()) {
std::cout << blockPtr->getName() << ":" << std::endl;
}
// Instructions
for (const auto &instIter : blockIter->getInstructions()) {
auto inst = instIter.get();
std::cout << " ";
printInst(inst);
}
}
std::cout << "}" << std::endl << std::endl;
}
void SysYPrinter::printInst(Instruction *pInst) {
using Kind = Instruction::Kind;
switch (pInst->getKind()) {
case Kind::kAdd:
case Kind::kSub:
case Kind::kMul:
case Kind::kDiv:
case Kind::kRem:
case Kind::kFAdd:
case Kind::kFSub:
case Kind::kFMul:
case Kind::kFDiv:
case Kind::kICmpEQ:
case Kind::kICmpNE:
case Kind::kICmpLT:
case Kind::kICmpGT:
case Kind::kICmpLE:
case Kind::kICmpGE:
case Kind::kFCmpEQ:
case Kind::kFCmpNE:
case Kind::kFCmpLT:
case Kind::kFCmpGT:
case Kind::kFCmpLE:
case Kind::kFCmpGE:
case Kind::kAnd:
case Kind::kOr: {
auto binInst = dynamic_cast<BinaryInst *>(pInst);
// Print result variable if exists
if (!binInst->getName().empty()) {
std::cout << "%" << binInst->getName() << " = ";
}
// Operation name
switch (pInst->getKind()) {
case Kind::kAdd: std::cout << "add"; break;
case Kind::kSub: std::cout << "sub"; break;
case Kind::kMul: std::cout << "mul"; break;
case Kind::kDiv: std::cout << "sdiv"; break;
case Kind::kRem: std::cout << "srem"; break;
case Kind::kFAdd: std::cout << "fadd"; break;
case Kind::kFSub: std::cout << "fsub"; break;
case Kind::kFMul: std::cout << "fmul"; break;
case Kind::kFDiv: std::cout << "fdiv"; break;
case Kind::kICmpEQ: std::cout << "icmp eq"; break;
case Kind::kICmpNE: std::cout << "icmp ne"; break;
case Kind::kICmpLT: std::cout << "icmp slt"; break;
case Kind::kICmpGT: std::cout << "icmp sgt"; break;
case Kind::kICmpLE: std::cout << "icmp sle"; break;
case Kind::kICmpGE: std::cout << "icmp sge"; break;
case Kind::kFCmpEQ: std::cout << "fcmp oeq"; break;
case Kind::kFCmpNE: std::cout << "fcmp one"; break;
case Kind::kFCmpLT: std::cout << "fcmp olt"; break;
case Kind::kFCmpGT: std::cout << "fcmp ogt"; break;
case Kind::kFCmpLE: std::cout << "fcmp ole"; break;
case Kind::kFCmpGE: std::cout << "fcmp oge"; break;
case Kind::kAnd: std::cout << "and"; break;
case Kind::kOr: std::cout << "or"; break;
default: break;
}
// Types and operands
std::cout << " ";
printType(binInst->getType());
std::cout << " ";
printValue(binInst->getLhs());
std::cout << ", ";
printValue(binInst->getRhs());
std::cout << std::endl;
} break;
case Kind::kNeg:
case Kind::kNot:
case Kind::kFNeg:
case Kind::kFNot:
case Kind::kFtoI:
case Kind::kBitFtoI:
case Kind::kItoF:
case Kind::kBitItoF: {
auto unyInst = dynamic_cast<UnaryInst *>(pInst);
if (!unyInst->getName().empty()) {
std::cout << "%" << unyInst->getName() << " = ";
}
switch (pInst->getKind()) {
case Kind::kNeg: std::cout << "sub "; break;
case Kind::kNot: std::cout << "not "; break;
case Kind::kFNeg: std::cout << "fneg "; break;
case Kind::kFNot: std::cout << "fneg "; break; // FNot not standard, map to fneg
case Kind::kFtoI: std::cout << "fptosi "; break;
case Kind::kBitFtoI: std::cout << "bitcast "; break;
case Kind::kItoF: std::cout << "sitofp "; break;
case Kind::kBitItoF: std::cout << "bitcast "; break;
default: break;
}
printType(unyInst->getType());
std::cout << " ";
// Special handling for negation
if (pInst->getKind() == Kind::kNeg || pInst->getKind() == Kind::kNot) {
std::cout << "i32 0, ";
}
printValue(pInst->getOperand(0));
// For bitcast, need to specify destination type
if (pInst->getKind() == Kind::kBitFtoI || pInst->getKind() == Kind::kBitItoF) {
std::cout << " to ";
printType(unyInst->getType());
}
std::cout << std::endl;
} break;
case Kind::kCall: {
auto callInst = dynamic_cast<CallInst *>(pInst);
auto function = callInst->getCallee();
if (!callInst->getName().empty()) {
std::cout << "%" << callInst->getName() << " = ";
}
std::cout << "call ";
printType(callInst->getType());
std::cout << " @" << function->getName() << "(";
auto params = callInst->getArguments();
bool first = true;
for (auto &param : params) {
if (!first) std::cout << ", ";
first = false;
printType(param->getValue()->getType());
std::cout << " ";
printValue(param->getValue());
}
std::cout << ")" << std::endl;
} break;
case Kind::kCondBr: {
auto condBrInst = dynamic_cast<CondBrInst *>(pInst);
std::cout << "br i1 ";
printValue(condBrInst->getCondition());
std::cout << ", label %" << condBrInst->getThenBlock()->getName();
std::cout << ", label %" << condBrInst->getElseBlock()->getName();
std::cout << std::endl;
} break;
case Kind::kBr: {
auto brInst = dynamic_cast<UncondBrInst *>(pInst);
std::cout << "br label %" << brInst->getBlock()->getName();
std::cout << std::endl;
} break;
case Kind::kReturn: {
auto retInst = dynamic_cast<ReturnInst *>(pInst);
std::cout << "ret ";
if (retInst->getNumOperands() != 0) {
printType(retInst->getOperand(0)->getType());
std::cout << " ";
printValue(retInst->getOperand(0));
} else {
std::cout << "void";
}
std::cout << std::endl;
} break;
case Kind::kAlloca: {
auto allocaInst = dynamic_cast<AllocaInst *>(pInst);
std::cout << "%" << allocaInst->getName() << " = alloca ";
auto baseType = dynamic_cast<PointerType *>(allocaInst->getType())->getBaseType();
printType(baseType);
if (allocaInst->getNumDims() > 0) {
std::cout << ", ";
for (size_t i = 0; i < allocaInst->getNumDims(); i++) {
if (i > 0) std::cout << ", ";
printType(Type::getIntType());
std::cout << " ";
printValue(allocaInst->getDim(i));
}
}
std::cout << ", align 4" << std::endl;
} break;
case Kind::kLoad: {
auto loadInst = dynamic_cast<LoadInst *>(pInst);
std::cout << "%" << loadInst->getName() << " = load ";
printType(loadInst->getType());
std::cout << ", ";
printType(loadInst->getPointer()->getType());
std::cout << " ";
printValue(loadInst->getPointer());
if (loadInst->getNumIndices() > 0) {
std::cout << ", ";
for (size_t i = 0; i < loadInst->getNumIndices(); i++) {
if (i > 0) std::cout << ", ";
printType(Type::getIntType());
std::cout << " ";
printValue(loadInst->getIndex(i));
}
}
std::cout << ", align 4" << std::endl;
} break;
case Kind::kLa: {
auto laInst = dynamic_cast<LaInst *>(pInst);
std::cout << "%" << laInst->getName() << " = getelementptr inbounds ";
auto ptrType = dynamic_cast<PointerType*>(laInst->getPointer()->getType());
printType(ptrType->getBaseType());
std::cout << ", ";
printType(laInst->getPointer()->getType());
std::cout << " ";
printValue(laInst->getPointer());
std::cout << ", ";
for (size_t i = 0; i < laInst->getNumIndices(); i++) {
if (i > 0) std::cout << ", ";
printType(Type::getIntType());
std::cout << " ";
printValue(laInst->getIndex(i));
}
std::cout << std::endl;
} break;
case Kind::kStore: {
auto storeInst = dynamic_cast<StoreInst *>(pInst);
std::cout << "store ";
printType(storeInst->getValue()->getType());
std::cout << " ";
printValue(storeInst->getValue());
std::cout << ", ";
printType(storeInst->getPointer()->getType());
std::cout << " ";
printValue(storeInst->getPointer());
if (storeInst->getNumIndices() > 0) {
std::cout << ", ";
for (size_t i = 0; i < storeInst->getNumIndices(); i++) {
if (i > 0) std::cout << ", ";
printType(Type::getIntType());
std::cout << " ";
printValue(storeInst->getIndex(i));
}
}
std::cout << ", align 4" << std::endl;
} break;
case Kind::kMemset: {
auto memsetInst = dynamic_cast<MemsetInst *>(pInst);
std::cout << "call void @llvm.memset.p0.";
printType(memsetInst->getPointer()->getType());
std::cout << "(";
printType(memsetInst->getPointer()->getType());
std::cout << " ";
printValue(memsetInst->getPointer());
std::cout << ", i8 ";
printValue(memsetInst->getValue());
std::cout << ", i32 ";
printValue(memsetInst->getSize());
std::cout << ", i1 false)" << std::endl;
} break;
case Kind::kPhi: {
auto phiInst = dynamic_cast<PhiInst *>(pInst);
printValue(phiInst->getOperand(0));
std::cout << " = phi ";
printType(phiInst->getType());
for (unsigned i = 1; i < phiInst->getNumOperands(); i++) {
// Operand 0 is the phi result, so the incoming list starts at index 1
std::cout << (i > 1 ? ", " : " ");
std::cout << "[ ";
printValue(phiInst->getOperand(i));
std::cout << " ]";
}
std::cout << std::endl;
} break;
case Kind::kGetSubArray: {
auto getSubArrayInst = dynamic_cast<GetSubArrayInst *>(pInst);
std::cout << "%" << getSubArrayInst->getName() << " = getelementptr inbounds ";
auto ptrType = dynamic_cast<PointerType*>(getSubArrayInst->getFatherArray()->getType());
printType(ptrType->getBaseType());
std::cout << ", ";
printType(getSubArrayInst->getFatherArray()->getType());
std::cout << " ";
printValue(getSubArrayInst->getFatherArray());
std::cout << ", ";
bool firstIndex = true;
for (auto &index : getSubArrayInst->getIndices()) {
if (!firstIndex) std::cout << ", ";
firstIndex = false;
printType(Type::getIntType());
std::cout << " ";
printValue(index->getValue());
}
std::cout << std::endl;
} break;
default:
assert(false && "Unsupported instruction kind");
break;
}
}
} // namespace sysy
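
For comparison with the `kPhi` case above: textual LLVM IR writes each incoming phi entry as a value paired with its predecessor label, for example `%x = phi i32 [ 1, %then ], [ %y, %else ]`. The sketch below shows that layout in isolation. It assumes the predecessor label is available for every incoming value, which the printer above does not show, and `formatPhi` is a hypothetical helper rather than part of `SysYPrinter`.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: each pair is (incoming value text, predecessor block name).
std::string formatPhi(const std::string &result, const std::string &type,
                      const std::vector<std::pair<std::string, std::string>> &incoming) {
  std::string out = "%" + result + " = phi " + type + " ";
  for (std::size_t i = 0; i < incoming.size(); ++i) {
    if (i > 0) out += ", ";  // separator only between entries
    out += "[ " + incoming[i].first + ", %" + incoming[i].second + " ]";
  }
  return out;
}
```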

View File

@ -0,0 +1,39 @@
#pragma once
#include "IR.h"
#include "SysYIRAnalyser.h"
#include "SysYIRPrinter.h"
namespace sysy {
class DeadCodeElimination {
private:
Module *pModule;
ControlFlowAnalysis *pCFA; // control-flow analysis
ActiveVarAnalysis *pAVA; // live-variable analysis
DataFlowAnalysisUtils dataFlowAnalysisUtils; // data-flow analysis helpers
public:
explicit DeadCodeElimination(Module *pModule,
ControlFlowAnalysis *pCFA = nullptr,
ActiveVarAnalysis *pAVA = nullptr)
: pModule(pModule), pCFA(pCFA), pAVA(pAVA), dataFlowAnalysisUtils() {}
// TODO: select which dead-code-elimination passes to run from the `passes` argument
// void runDCEPipeline(const std::vector<std::string>& passes = {
// "dead-store", "redundant-load-store", "dead-load", "dead-alloca", "dead-global"
// });
void runDCEPipeline(); // run dead-code elimination
void eliminateDeadStores(Function* func, bool& changed); // remove dead stores
void eliminateDeadLoads(Function* func, bool& changed); // remove dead loads
void eliminateDeadAllocas(Function* func, bool& changed); // remove dead allocas
void eliminateDeadGlobals(bool& changed); // remove dead global variables
void eliminateDeadIndirectiveAllocas(Function* func, bool& changed); // remove allocas that are only used indirectly (through phi nodes)
void eliminateDeadRedundantLoadStore(Function* func, bool& changed); // remove redundant load/store pairs
bool isGlobal(Value *val);
bool isArr(Value *val);
void usedelete(Instruction *instr);
};
} // namespace sysy
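
Each `eliminate*` method above takes a `bool& changed` flag, which suggests that the pipeline repeats its passes until none of them reports a change. The source does not show the body of `runDCEPipeline`, so the following is only a sketch of that fixed-point pattern under this assumption; `runToFixedPoint` is a hypothetical name.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Run every pass in order, and repeat the whole sequence until a full
// round completes without any pass setting `changed`. Returns the number of rounds.
int runToFixedPoint(const std::vector<std::function<void(bool &)>> &passes) {
  int rounds = 0;
  bool changed = true;
  while (changed) {
    changed = false;
    for (const auto &pass : passes) {
      pass(changed);  // a pass sets `changed` to true when it removed something
    }
    ++rounds;
  }
  return rounds;
}
```

Iterating this way lets one pass expose work for another, for example a removed store can leave an alloca without users. The final round is always a no-op round that confirms nothing is left to do.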

View File

@ -317,7 +317,6 @@ class ConstantValue : public Value {
class Instruction;
class Function;
class Loop;
class BasicBlock;
/*!
@ -327,104 +326,73 @@ class BasicBlock;
* a terminator (branch or return). Besides, `BasicBlock` stores its arguments
* and records its predecessor and successor `BasicBlock`s.
*/
class BasicBlock : public Value {
class BasicBlock : public Value {
friend class Function;
public:
public:
using inst_list = std::list<std::unique_ptr<Instruction>>;
using iterator = inst_list::iterator;
using arg_list = std::vector<AllocaInst *>;
using block_list = std::vector<BasicBlock *>;
using block_set = std::unordered_set<BasicBlock *>;
protected:
protected:
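Function *parent; ///< owning function
inst_list instructions; ///< owned instruction sequence
arg_list arguments; ///< formal parameters after stack space has been allocated
block_list successors; ///< successor list
block_list predecessors; ///< predecessor list
BasicBlock *idom = nullptr; ///< immediate dominator (the unique parent in the dominator tree); nullptr by default
block_list sdoms; ///< children in the dominator tree; there may be several
block_set dominants; ///< set of blocks that dominate this block
block_set dominant_frontiers; ///< dominance frontier
bool reachable = false; ///< whether this block is reachable; unreachable by default
Loop *loopbelong = nullptr; ///< the loop this block belongs to; nullptr by default
int loopdepth = 0; ///< nesting depth of the owning loop; 0 by default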
bool reachable = false;
public:
public:
explicit BasicBlock(Function *parent, const std::string &name = "")
: Value(Type::getLabelType(), name), parent(parent) {}
~BasicBlock() override {
for (auto pre : predecessors) {
pre->removeSuccessor(this);
}
for (auto suc : successors) {
suc->removePredecessor(this);
}
} ///< destructor; also unlinks this block from its predecessors and successors
}
public:
unsigned getNumInstructions() const { return instructions.size(); } ///< number of instructions
unsigned getNumArguments() const { return arguments.size(); } ///< number of formal parameters
unsigned getNumPredecessors() const { return predecessors.size(); } ///< number of predecessors
unsigned getNumSuccessors() const { return successors.size(); } ///< number of successors
Function* getParent() const { return parent; } ///< owning function
void setParent(Function *func) { parent = func; } ///< set the owning function
inst_list& getInstructions() { return instructions; } ///< instruction list
arg_list& getArguments() { return arguments; } ///< formal parameters after stack space has been allocated
const block_list& getPredecessors() const { return predecessors; } ///< predecessor list
block_list& getSuccessors() { return successors; } ///< successor list
block_set& getDominants() { return dominants; }
BasicBlock* getIdom() { return idom; }
block_list& getSdoms() { return sdoms; }
block_set& getDFs() { return dominant_frontiers; }
iterator begin() { return instructions.begin(); } ///< iterator to the first instruction
iterator end() { return instructions.end(); } ///< iterator past the last instruction
iterator terminator() { return std::prev(end()); } ///< last instruction of the block
void insertArgument(AllocaInst *inst) { arguments.push_back(inst); } ///< append a formal parameter after stack space has been allocated
public:
unsigned getNumInstructions() const { return instructions.size(); }
unsigned getNumArguments() const { return arguments.size(); }
unsigned getNumPredecessors() const { return predecessors.size(); }
unsigned getNumSuccessors() const { return successors.size(); }
Function* getParent() const { return parent; }
void setParent(Function *func) { parent = func; }
inst_list& getInstructions() { return instructions; }
arg_list& getArguments() { return arguments; }
const block_list& getPredecessors() const { return predecessors; }
block_list& getSuccessors() { return successors; }
iterator begin() { return instructions.begin(); }
iterator end() { return instructions.end(); }
iterator terminator() { return std::prev(end()); }
void insertArgument(AllocaInst *inst) { arguments.push_back(inst); }
void addPredecessor(BasicBlock *block) {
if (std::find(predecessors.begin(), predecessors.end(), block) == predecessors.end()) {
predecessors.push_back(block);
}
} ///< add a predecessor
}
void addSuccessor(BasicBlock *block) {
if (std::find(successors.begin(), successors.end(), block) == successors.end()) {
successors.push_back(block);
}
} ///< add a successor
}
void addPredecessor(const block_list &blocks) {
for (auto block : blocks) {
addPredecessor(block);
}
} ///< add several predecessors
}
void addSuccessor(const block_list &blocks) {
for (auto block : blocks) {
addSuccessor(block);
}
} ///< add several successors
void setIdom(BasicBlock *block) { idom = block; }
void addSdoms(BasicBlock *block) { sdoms.push_back(block); }
void clearSdoms() { sdoms.clear(); }
// Overload 1: takes a single BasicBlock*
void addDominants(BasicBlock *block) { dominants.emplace(block); }
// Overload 2: takes a block_set
void addDominants(const block_set &blocks) { dominants.insert(blocks.begin(), blocks.end()); }
void setDominants(BasicBlock *block) {
dominants.clear();
addDominants(block);
}
void setDominants(const block_set &doms) {
dominants.clear();
addDominants(doms);
}
void setDFs(const block_set &df) {
dominant_frontiers.clear();
for (auto elem : df) {
dominant_frontiers.emplace(elem);
}
}
void removePredecessor(BasicBlock *block) {
auto iter = std::find(predecessors.begin(), predecessors.end(), block);
@ -433,7 +401,7 @@ class BasicBlock;
} else {
assert(false);
}
} ///< remove a predecessor
}
void removeSuccessor(BasicBlock *block) {
auto iter = std::find(successors.begin(), successors.end(), block);
if (iter != successors.end()) {
@ -441,7 +409,7 @@ class BasicBlock;
} else {
assert(false);
}
} ///< remove a successor
}
void replacePredecessor(BasicBlock *oldBlock, BasicBlock *newBlock) {
for (auto &predecessor : predecessors) {
if (predecessor == oldBlock) {
@ -449,41 +417,16 @@ class BasicBlock;
break;
}
}
} ///< replace a predecessor
// Collect all descendants of this block in the dominator tree (children, grandchildren, and so on), iteratively
block_list getChildren() {
std::queue<BasicBlock *> q;
block_list children;
for (auto sdom : sdoms) {
q.push(sdom);
children.push_back(sdom);
}
while (!q.empty()) {
auto block = q.front();
q.pop();
for (auto sdom : block->sdoms) {
q.push(sdom);
children.push_back(sdom);
}
}
return children;
}
void setreachableTrue() { reachable = true; } ///< mark as reachable
void setreachableFalse() { reachable = false; } ///< mark as unreachable
bool getreachable() { return reachable; } ///< reachability flag
static void conectBlocks(BasicBlock *prev, BasicBlock *next) {
prev->addSuccessor(next);
next->addPredecessor(prev);
} ///< connect two blocks by setting their predecessor/successor links
void setLoop(Loop *loop2set) { loopbelong = loop2set; } ///< set the owning loop
Loop* getLoop() { return loopbelong; } ///< owning loop
void setLoopDepth(int loopdepth2set) { loopdepth = loopdepth2set; } ///< set the loop nesting depth
int getLoopDepth() { return loopdepth; } ///< loop nesting depth of this block
void removeInst(iterator pos) { instructions.erase(pos); } ///< erase an instruction
iterator moveInst(iterator sourcePos, iterator targetPos, BasicBlock *block); ///< move an instruction
}
void removeInst(iterator pos) { instructions.erase(pos); }
iterator moveInst(iterator sourcePos, iterator targetPos, BasicBlock *block);
};
//! User is the abstract base type of `Value` types which use other `Value` as
@ -940,6 +883,10 @@ public:
assert(false);
}
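} ///< evaluate the binary operation for the given instruction kind (template implementation)
static BinaryInst* create(Kind kind, Type *type, Value *lhs, Value *rhs, BasicBlock *parent, const std::string &name = "") {
// The backend has to create address-computation instructions when lowering array accesses,
// so BinaryInst objects must be constructible from outside; hence this public factory method.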
return new BinaryInst(kind, type, lhs, rhs, parent, name);
}
}; // class BinaryInst
//! The return statement
@ -1106,7 +1053,7 @@ public:
return make_range(std::next(operand_begin()), operand_end());
}
Value* getIndex(int index) const { return getOperand(index + 1); }
std::list<Value *> getAncestorIndices() const {
std::list<Value *> getAncestorIndices() const {
std::list<Value *> indices;
for (const auto &index : getIndices()) {
indices.emplace_back(index->getValue());
@ -1198,109 +1145,11 @@ public:
class GlobalValue;
// Loop class
class Loop {
public:
using block_list = std::vector<BasicBlock *>;
using block_set = std::unordered_set<BasicBlock *>;
using Loop_list = std::vector<Loop *>;
protected:
Function *parent; // Owning function
block_list blocksInLoop; // Basic blocks inside the loop
BasicBlock *preheaderBlock = nullptr; // Preheader block
BasicBlock *headerBlock = nullptr; // Loop header
block_list latchBlock; // Latch blocks (sources of back edges)
block_set exitingBlocks; // Exiting blocks
block_set exitBlocks; // Exit target blocks
Loop *parentloop = nullptr; // Parent loop
Loop_list subLoops; // Sub-loops
size_t loopID; // Loop ID
unsigned loopDepth; // Loop depth
Instruction *indCondVar = nullptr; // Loop condition variable
Instruction::Kind IcmpKind; // Comparison kind
Value *indEnd = nullptr; // Loop end value
AllocaInst *IndPhi = nullptr; // Induction variable
ConstantValue *indBegin = nullptr; // Loop start value
ConstantValue *indStep = nullptr; // Loop step
std::set<GlobalValue *> GlobalValuechange; // Global variables modified inside the loop
int StepType = 0; // Loop step type
bool parallelable = false; // Whether the loop can be parallelized
public:
explicit Loop(BasicBlock *header, const std::string &name = "")
: headerBlock(header) {
blocksInLoop.push_back(header);
}
void setloopID() {
static unsigned loopCount = 0;
loopCount = loopCount + 1;
loopID = loopCount;
}
ConstantValue* getindBegin() { return indBegin; } ///< Get the loop start value
ConstantValue* getindStep() { return indStep; } ///< Get the loop step
void setindBegin(ConstantValue *indBegin2set) { indBegin = indBegin2set; } ///< Set the loop start value
void setindStep(ConstantValue *indStep2set) { indStep = indStep2set; } ///< Set the loop step
void setStepType(int StepType2Set) { StepType = StepType2Set; } ///< Set the induction-variable update rule
int getStepType() { return StepType; } ///< Get the induction-variable update rule
size_t getLoopID() { return loopID; }
BasicBlock* getHeader() const { return headerBlock; }
BasicBlock* getPreheaderBlock() const { return preheaderBlock; }
block_list& getLatchBlocks() { return latchBlock; }
block_set& getExitingBlocks() { return exitingBlocks; }
block_set& getExitBlocks() { return exitBlocks; }
Loop* getParentLoop() const { return parentloop; }
void setParentLoop(Loop *parent) { parentloop = parent; }
void addBasicBlock(BasicBlock *bb) { blocksInLoop.push_back(bb); }
void addSubLoop(Loop *loop) { subLoops.push_back(loop); }
void setLoopDepth(unsigned depth) { loopDepth = depth; }
block_list& getBasicBlocks() { return blocksInLoop; }
Loop_list& getSubLoops() { return subLoops; }
unsigned getLoopDepth() const { return loopDepth; }
bool isLoopContainsBasicBlock(BasicBlock *bb) const {
return std::find(blocksInLoop.begin(), blocksInLoop.end(), bb) != blocksInLoop.end();
} ///< Check whether the given block belongs to this loop
void addExitingBlock(BasicBlock *bb) { exitingBlocks.insert(bb); }
void addExitBlock(BasicBlock *bb) { exitBlocks.insert(bb); }
void addLatchBlock(BasicBlock *bb) { latchBlock.push_back(bb); }
void setPreheaderBlock(BasicBlock *bb) { preheaderBlock = bb; }
void setIndexCondInstr(Instruction *instr) { indCondVar = instr; }
void setIcmpKind(Instruction::Kind kind) { IcmpKind = kind; }
Instruction::Kind getIcmpKind() const { return IcmpKind; }
bool isSimpleLoopInvariant(Value *value) ; ///< Check whether the value is a simple loop invariant; a value defined inside the loop is not.
void setIndEnd(Value *value) { indEnd = value; }
void setIndPhi(AllocaInst *phi) { IndPhi = phi; }
Value* getIndEnd() const { return indEnd; }
AllocaInst* getIndPhi() const { return IndPhi; }
Instruction* getIndCondVar() const { return indCondVar; }
void addGlobalValuechange(GlobalValue *globalvaluechange2add) {
GlobalValuechange.insert(globalvaluechange2add);
} ///< Record a global variable that is modified inside the loop
std::set<GlobalValue *>& getGlobalValuechange() {
return GlobalValuechange;
} ///< Get all global variables modified inside the loop
void setParallelable(bool flag) { parallelable = flag; }
bool isParallelable() const { return parallelable; }
};
class Module;
//! Function definition
//! Function definitionclass
class Function : public Value {
friend class Module;
protected:
Function(Module *parent, Type *type, const std::string &name) : Value(type, name), parent(parent) {
blocks.emplace_back(new BasicBlock(this));
@ -1308,9 +1157,6 @@ protected:
public:
using block_list = std::list<std::unique_ptr<BasicBlock>>;
using Loop_list = std::list<std::unique_ptr<Loop>>;
// Function optimization attribute flags
enum FunctionAttribute : uint64_t {
PlaceHolder = 0x0UL,
Pure = 0x1UL << 0,
@ -1322,167 +1168,47 @@ public:
protected:
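Module *parent; ///< Parent module of the function
block_list blocks; ///< Basic blocks contained in the function
Loop_list loops; ///< Loops contained in the function
Loop_list topLoops; ///< Top-level loops contained in the function
std::list<std::unique_ptr<AllocaInst>> indirectAllocas; ///< Indirectly allocated memory introduced by mem2reg in this function
FunctionAttribute attribute = PlaceHolder; ///< Function attributes
std::set<Function *> callees; ///< Set of functions called by this function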
std::unordered_map<BasicBlock *, Loop *> basicblock2Loop;
std::unordered_map<Value *, BasicBlock *> value2AllocBlocks; ///< value -- alloc block mapping
std::unordered_map<Value *, std::unordered_map<BasicBlock *, int>>
value2DefBlocks; //< value -- define blocks mapping
std::unordered_map<Value *, std::unordered_map<BasicBlock *, int>> value2UseBlocks; //< value -- use blocks mapping
public:
public:
static unsigned getcloneIndex() {
static unsigned cloneIndex = 0;
cloneIndex += 1;
return cloneIndex - 1;
}
Function* clone(const std::string &suffix = "_" + std::to_string(getcloneIndex()) + "@") const; ///< Clone the function
Function* clone(const std::string &suffix = "_" + std::to_string(getcloneIndex()) + "@") const;
const std::set<Function *>& getCallees() { return callees; }
void addCallee(Function *callee) { callees.insert(callee); }
void removeCallee(Function *callee) { callees.erase(callee); }
void clearCallees() { callees.clear(); }
std::set<Function *> getCalleesWithNoExternalAndSelf();
FunctionAttribute getAttribute() const { return attribute; } ///< Get the function attributes
FunctionAttribute getAttribute() const { return attribute; }
void setAttribute(FunctionAttribute attr) {
attribute = static_cast<FunctionAttribute>(attribute | attr);
} ///< Set function attributes
void clearAttribute() { attribute = PlaceHolder; } ///< Clear all function attributes, leaving only PlaceHolder
Loop* getLoopOfBasicBlock(BasicBlock *bb) {
return basicblock2Loop.count(bb) != 0 ? basicblock2Loop[bb] : nullptr;
} ///< Get the loop that contains the block
unsigned getLoopDepthByBlock(BasicBlock *basicblock2Check) {
if (getLoopOfBasicBlock(basicblock2Check) != nullptr) {
auto loop = getLoopOfBasicBlock(basicblock2Check);
return loop->getLoopDepth();
}
return static_cast<unsigned>(0);
} ///< Get the loop depth of the loop that contains the block
void addBBToLoop(BasicBlock *bb, Loop *LoopToadd) { basicblock2Loop[bb] = LoopToadd; } ///< Add a block-to-loop mapping
std::unordered_map<BasicBlock *, Loop *>& getBBToLoopRef() {
return basicblock2Loop;
} ///< Get the block-to-loop map
// auto getNewLoopPtr(BasicBlock *header) -> Loop * { return new Loop(header); }
Type* getReturnType() const { return getType()->as<FunctionType>()->getReturnType(); } ///< Get the return type
auto getParamTypes() const { return getType()->as<FunctionType>()->getParamTypes(); } ///< Get the list of formal parameter types
auto getBasicBlocks() { return make_range(blocks); } ///< Get the list of basic blocks
}
void clearAttribute() { attribute = PlaceHolder; }
Type* getReturnType() const { return getType()->as<FunctionType>()->getReturnType(); }
auto getParamTypes() const { return getType()->as<FunctionType>()->getParamTypes(); }
auto getBasicBlocks() { return make_range(blocks); }
block_list& getBasicBlocks_NoRange() { return blocks; }
BasicBlock* getEntryBlock() { return blocks.front().get(); } ///< Get the entry block
BasicBlock* getEntryBlock() { return blocks.front().get(); }
void removeBasicBlock(BasicBlock *blockToRemove) {
auto is_same_ptr = [blockToRemove](const std::unique_ptr<BasicBlock> &ptr) { return ptr.get() == blockToRemove; };
blocks.remove_if(is_same_ptr);
// blocks.erase(std::remove_if(blocks.begin(), blocks.end(), is_same_ptr), blocks.end());
} ///< Remove the block from the function's block list
// auto getBasicBlocks_NoRange() -> block_list & { return blocks; }
}
BasicBlock* addBasicBlock(const std::string &name = "") {
blocks.emplace_back(new BasicBlock(this, name));
return blocks.back().get();
} ///< Add a new basic block
}
BasicBlock* addBasicBlock(BasicBlock *block) {
blocks.emplace_back(block);
return block;
} ///< Append an existing basic block to blocks
}
BasicBlock* addBasicBlockFront(BasicBlock *block) {
blocks.emplace_front(block);
return block;
} // Insert a basic block at the front
/** value -- alloc blocks mapping */
void addValue2AllocBlocks(Value *value, BasicBlock *block) {
value2AllocBlocks[value] = block;
} ///< Add a value -- alloc block mapping
BasicBlock* getAllocBlockByValue(Value *value) {
if (value2AllocBlocks.count(value) > 0) {
return value2AllocBlocks[value];
}
return nullptr;
} ///< Look up the alloc block of a value
std::unordered_map<Value *, BasicBlock *>& getValue2AllocBlocks() {
return value2AllocBlocks;
} ///< Get all value -- alloc block mappings
void removeValue2AllocBlock(Value *value) {
value2AllocBlocks.erase(value);
} ///< Remove a value -- alloc block mapping
/** value -- define blocks mapping */
void addValue2DefBlocks(Value *value, BasicBlock *block) {
++value2DefBlocks[value][block];
} ///< Add a value -- define block mapping
// keep in mind that the return is not a reference.
std::unordered_set<BasicBlock *> getDefBlocksByValue(Value *value) {
std::unordered_set<BasicBlock *> blocks;
if (value2DefBlocks.count(value) > 0) {
for (const auto &pair : value2DefBlocks[value]) {
blocks.insert(pair.first);
}
}
return blocks;
} ///< Look up the define blocks of a value
std::unordered_map<Value *, std::unordered_map<BasicBlock *, int>>& getValue2DefBlocks() {
return value2DefBlocks;
} ///< Get all value -- define blocks mappings
bool removeValue2DefBlock(Value *value, BasicBlock *block) {
bool changed = false;
if (--value2DefBlocks[value][block] == 0) {
value2DefBlocks[value].erase(block);
if (value2DefBlocks[value].empty()) {
value2DefBlocks.erase(value);
changed = true;
}
}
return changed;
} ///< Remove a value -- define block mapping
std::unordered_set<Value *> getValuesOfDefBlock() {
std::unordered_set<Value *> values;
for (const auto &pair : value2DefBlocks) {
values.insert(pair.first);
}
return values;
} ///< Get every value that has a definition
/** value -- use blocks mapping */
void addValue2UseBlocks(Value *value, BasicBlock *block) {
++value2UseBlocks[value][block];
} ///< Add a value -- use block mapping
// keep in mind that the return is not a reference.
std::unordered_set<BasicBlock *> getUseBlocksByValue(Value *value) {
std::unordered_set<BasicBlock *> blocks;
if (value2UseBlocks.count(value) > 0) {
for (const auto &pair : value2UseBlocks[value]) {
blocks.insert(pair.first);
}
}
return blocks;
} ///< Look up the use blocks of a value
std::unordered_map<Value *, std::unordered_map<BasicBlock *, int>>& getValue2UseBlocks() {
return value2UseBlocks;
} ///< Get all value -- use blocks mappings
bool removeValue2UseBlock(Value *value, BasicBlock *block) {
bool changed = false;
if (--value2UseBlocks[value][block] == 0) {
value2UseBlocks[value].erase(block);
if (value2UseBlocks[value].empty()) {
value2UseBlocks.erase(value);
changed = true;
}
}
return changed;
} ///< Remove a value -- use block mapping
void addIndirectAlloca(AllocaInst *alloca) { indirectAllocas.emplace_back(alloca); } ///< Add an indirect alloca
std::list<std::unique_ptr<AllocaInst>>& getIndirectAllocas() {
return indirectAllocas;
} ///< Get the list of indirect allocas
/** loop -- begin */
void addLoop(Loop *loop) { loops.emplace_back(loop); } ///< Add a loop (non-top-level)
void addTopLoop(Loop *loop) { topLoops.emplace_back(loop); } ///< Add a top-level loop
Loop_list& getLoops() { return loops; } ///< Get the loops (non-top-level)
Loop_list& getTopLoops() { return topLoops; } ///< Get the top-level loops
/** loop -- end */
}; // class Function
}
};
//! Global value declared at file scope
class GlobalValue : public User, public LVal {


@ -96,7 +96,7 @@ class IRBuilder {
std::string newName;
if (name.empty()) {
std::stringstream ss;
ss << "%" << tmpIndex;
ss << tmpIndex;
newName = ss.str();
tmpIndex++;
} else {
@ -136,7 +136,7 @@ class IRBuilder {
std::string newName;
if (name.empty()) {
std::stringstream ss;
ss << "%" << tmpIndex;
ss << tmpIndex;
newName = ss.str();
tmpIndex++;
} else {
@ -221,7 +221,7 @@ class IRBuilder {
std::string newName;
if (name.empty() && callee->getReturnType() != Type::getVoidType()) {
std::stringstream ss;
ss << "%" << tmpIndex;
ss << tmpIndex;
newName = ss.str();
tmpIndex++;
} else {
@ -263,12 +263,12 @@ class IRBuilder {
auto inst = new AllocaInst(type, dims, parent, name);
assert(inst);
return inst;
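} ///< Create an alloca instruction without inserting it into the instruction list
} ///< Create an alloca instruction without inserting it into the instruction list [used only for phi instructions]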
LoadInst * createLoadInst(Value *pointer, const std::vector<Value *> &indices = {}, const std::string &name = "") {
std::string newName;
if (name.empty()) {
std::stringstream ss;
ss << "%" << tmpIndex;
ss << tmpIndex;
newName = ss.str();
tmpIndex++;
} else {
@ -284,7 +284,7 @@ class IRBuilder {
std::string newName;
if (name.empty()) {
std::stringstream ss;
ss << "%" << tmpIndex;
ss << tmpIndex;
newName = ss.str();
tmpIndex++;
} else {
@ -315,7 +315,7 @@ class IRBuilder {
auto fatherArrayValue = dynamic_cast<Value *>(fatherArray);
auto childArray = new AllocaInst(fatherArrayValue->getType(), subDims, block, childArrayName);
auto inst = new GetSubArrayInst(fatherArray, childArray, indices, block, name);
auto inst = new GetSubArrayInst(fatherArray, childArray, indices, block, childArrayName);
assert(inst);
block->getInstructions().emplace(position, inst);
return inst;
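
The hunks above drop the literal "%" from automatically generated temporary names, so the builder now stores bare counters such as 0, 1, 2. A minimal sketch of that naming scheme follows. It assumes (the diff does not show this) that the "%" sigil is added back only when the IR is printed; `TempNamer` and `printName` are hypothetical names used for illustration and are not part of IRBuilder.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the builder's temporary counter (not the real IRBuilder).
class TempNamer {
  unsigned tmpIndex = 0;

public:
  // Return the caller's name when one is given; otherwise hand out the next
  // bare counter value ("0", "1", ...) with no "%" prefix.
  std::string next(const std::string &name = "") {
    if (!name.empty()) return name;
    return std::to_string(tmpIndex++);
  }
};

// Hypothetical printer helper: the sigil is attached in exactly one place,
// which avoids emitting "%%0" when both builder and printer add it.
inline std::string printName(const std::string &name) {
  assert(!name.empty() && "only named values are printed with a sigil");
  return "%" + name;
}
```

Keeping the stored name free of the sigil means any lookup keyed on the name does not depend on how the value is eventually printed.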


@ -1,78 +0,0 @@
#pragma once
#include "SysYBaseVisitor.h"
#include "SysYParser.h"
#include "IR.h"
#include "IRBuilder.h"
#include <sstream>
#include <map>
#include <vector>
#include <stack>
class LLVMIRGenerator : public SysYBaseVisitor {
public:
sysy::Module* getIRModule() const { return irModule.get(); }
std::string generateIR(SysYParser::CompUnitContext* unit);
std::string getIR() const { return irStream.str(); }
private:
std::unique_ptr<sysy::Module> irModule; // IR data structure
std::stringstream irStream; // Text output stream
sysy::IRBuilder irBuilder; // IR builder
int tempCounter = 0;
std::string currentVarType;
// std::map<std::string, sysy::Value*> symbolTable;
std::map<std::string, std::pair<std::string, std::string>> symbolTable;
std::map<std::string, std::string> tmpTable;
std::vector<std::string> globalVars;
std::string currentFunction;
std::string currentReturnType;
std::vector<std::string> breakStack;
std::vector<std::string> continueStack;
bool hasReturn = false;
struct LoopLabels {
std::string breakLabel; // Target label for break
std::string continueLabel; // Target label for continue
};
std::stack<LoopLabels> loopStack; // Tracks the break and continue labels of enclosing loops
std::string getNextTemp();
std::string getLLVMType(const std::string&);
sysy::Type* getSysYType(const std::string&);
bool inFunction = false; // Whether we are currently inside a function
// Visitor methods
std::any visitCompUnit(SysYParser::CompUnitContext* ctx);
std::any visitConstDecl(SysYParser::ConstDeclContext* ctx);
std::any visitVarDecl(SysYParser::VarDeclContext* ctx);
std::any visitVarDef(SysYParser::VarDefContext* ctx);
std::any visitFuncDef(SysYParser::FuncDefContext* ctx);
std::any visitBlockStmt(SysYParser::BlockStmtContext* ctx);
// std::any visitStmt(SysYParser::StmtContext* ctx);
std::any visitLValue(SysYParser::LValueContext* ctx);
std::any visitPrimaryExp(SysYParser::PrimaryExpContext* ctx);
std::any visitPrimExp(SysYParser::PrimExpContext* ctx);
std::any visitParenExp(SysYParser::ParenExpContext* ctx);
std::any visitNumber(SysYParser::NumberContext* ctx);
std::any visitString(SysYParser::StringContext* ctx);
std::any visitCall(SysYParser::CallContext *ctx);
std::any visitUnExp(SysYParser::UnExpContext* ctx);
std::any visitMulExp(SysYParser::MulExpContext* ctx);
std::any visitAddExp(SysYParser::AddExpContext* ctx);
std::any visitRelExp(SysYParser::RelExpContext* ctx);
std::any visitEqExp(SysYParser::EqExpContext* ctx);
std::any visitLAndExp(SysYParser::LAndExpContext* ctx);
std::any visitLOrExp(SysYParser::LOrExpContext* ctx);
std::any visitAssignStmt(SysYParser::AssignStmtContext *ctx) override;
std::any visitIfStmt(SysYParser::IfStmtContext *ctx) override;
std::any visitWhileStmt(SysYParser::WhileStmtContext *ctx) override;
std::any visitBreakStmt(SysYParser::BreakStmtContext *ctx) override;
std::any visitContinueStmt(SysYParser::ContinueStmtContext *ctx) override;
std::any visitReturnStmt(SysYParser::ReturnStmtContext *ctx) override;
// Create a binary operation in one place (emits both the data structure and the text)
sysy::Value* createBinaryOp(SysYParser::ExpContext* lhs,
SysYParser::ExpContext* rhs,
sysy::Instruction::Kind opKind);
};


@ -1,99 +0,0 @@
#pragma once
#include "SysYBaseVisitor.h"
#include "SysYParser.h"
#include "IR.h" // Pull in the SysY IR header
#include "IRBuilder.h"
#include <sstream>
#include <map>
#include <vector>
#include <stack>
#include <memory>
class LLVMIRGenerator : public SysYBaseVisitor {
public:
// Generate the IR (text and data structures)
std::string generateIR(SysYParser::CompUnitContext* unit);
// Get the LLVM IR in text form
std::string getIR() const { return irStream.str(); }
// Get the SysY IR data structure
sysy::Module* getModule() const { return module.get(); }
private:
// Text output
std::stringstream irStream;
int tempCounter = 0; // Temporary-variable counter
std::string currentVarType; // Current variable type (used by the text IR)
// Symbol table: maps a variable name to {allocated address/register, type} (text IR)
std::map<std::string, std::pair<std::string, std::string>> symbolTable;
// Temporary table: maps a temporary name to its type (text IR)
std::map<std::string, std::string> tmpTable;
std::vector<std::string> globalVars; // Global variable list (text IR)
// SysY IR data structures
std::unique_ptr<sysy::Module> module; // SysY IR module
// Symbol table: maps a variable name to a SysY IR Value pointer
std::map<std::string, sysy::Value*> irSymbolTable;
// Temporary table: maps a temporary name to a SysY IR Value pointer
std::map<std::string, sysy::Value*> irTmpTable;
// Current context
std::string currentFunction; // Current function name (text IR)
std::string currentReturnType; // Current function return type (text IR)
sysy::Function* currentIRFunction = nullptr; // Current SysY IR function
sysy::BasicBlock* currentIRBlock = nullptr; // Current SysY IR basic block
// Loop control
std::vector<std::string> breakStack; // Stack of break labels (text IR)
std::vector<std::string> continueStack; // Stack of continue labels (text IR)
bool hasReturn = false; // Whether a return statement has been seen (text IR)
struct LoopLabels {
std::string breakLabel; // Target label for break (text IR)
std::string continueLabel; // Target label for continue (text IR)
sysy::BasicBlock* irBreakBlock = nullptr; // Target block for break (SysY IR)
sysy::BasicBlock* irContinueBlock = nullptr; // Target block for continue (SysY IR)
};
std::stack<LoopLabels> loopStack; // Tracks the break and continue labels of enclosing loops
bool inFunction = false; // Whether we are inside a function
// Helper functions (text IR)
std::string getNextTemp(); // Get the next temporary name
std::string getLLVMType(const std::string& type); // Convert a SysY type to an LLVM type
// Helper functions (SysY IR)
sysy::Type* getIRType(const std::string& type); // Convert a SysY type to a SysY IR type
std::string getIRTempName(); // Get a SysY IR temporary name
void setIRPosition(sysy::BasicBlock* block); // Set the current IR insertion point
// Visitor methods
std::any visitCompUnit(SysYParser::CompUnitContext* ctx) override;
std::any visitConstDecl(SysYParser::ConstDeclContext* ctx) override;
std::any visitVarDecl(SysYParser::VarDeclContext* ctx) override;
std::any visitVarDef(SysYParser::VarDefContext* ctx) override;
std::any visitFuncDef(SysYParser::FuncDefContext* ctx) override;
std::any visitBlockStmt(SysYParser::BlockStmtContext* ctx) override;
std::any visitLValue(SysYParser::LValueContext* ctx) override;
// std::any visitPrimaryExp(SysYParser::PrimaryExpContext* ctx) override;
std::any visitPrimExp(SysYParser::PrimExpContext* ctx) override;
std::any visitParenExp(SysYParser::ParenExpContext* ctx) override;
std::any visitNumber(SysYParser::NumberContext* ctx) override;
std::any visitString(SysYParser::StringContext* ctx) override;
std::any visitCall(SysYParser::CallContext* ctx) override;
std::any visitUnExp(SysYParser::UnExpContext* ctx) override;
std::any visitMulExp(SysYParser::MulExpContext* ctx) override;
std::any visitAddExp(SysYParser::AddExpContext* ctx) override;
std::any visitRelExp(SysYParser::RelExpContext* ctx) override;
std::any visitEqExp(SysYParser::EqExpContext* ctx) override;
std::any visitLAndExp(SysYParser::LAndExpContext* ctx) override;
std::any visitLOrExp(SysYParser::LOrExpContext* ctx) override;
std::any visitAssignStmt(SysYParser::AssignStmtContext* ctx) override;
std::any visitIfStmt(SysYParser::IfStmtContext* ctx) override;
std::any visitWhileStmt(SysYParser::WhileStmtContext* ctx) override;
std::any visitBreakStmt(SysYParser::BreakStmtContext* ctx) override;
std::any visitContinueStmt(SysYParser::ContinueStmtContext* ctx) override;
std::any visitReturnStmt(SysYParser::ReturnStmtContext* ctx) override;
};

59
src/include/Mem2Reg.h Normal file

@ -0,0 +1,59 @@
#pragma once
#include <list>
#include <memory>
#include <stack>
#include <unordered_map>
#include <unordered_set>
#include "IR.h"
#include "IRBuilder.h"
#include "SysYIRAnalyser.h"
namespace sysy {
/**
* Core class implementing static single assignment (SSA) construction: mem2reg
*/
class Mem2Reg {
private:
Module *pModule;
IRBuilder *pBuilder;
ControlFlowAnalysis *controlFlowAnalysis; // Control-flow analysis
ActiveVarAnalysis *activeVarAnalysis; // Live-variable analysis
DataFlowAnalysisUtils dataFlowAnalysisUtils;
public:
Mem2Reg(Module *pMoudle, IRBuilder *pBuilder,
ControlFlowAnalysis *pCFA = nullptr, ActiveVarAnalysis *pAVA = nullptr) :
pModule(pMoudle), pBuilder(pBuilder), controlFlowAnalysis(pCFA), activeVarAnalysis(pAVA), dataFlowAnalysisUtils()
{} // Constructor
void mem2regPipeline(); ///< mem2reg
private:
// Phi-node insertion requires computing the IDF (iterated dominance frontier)
std::unordered_set<BasicBlock *> computeIterDf(const std::unordered_set<BasicBlock *> &blocks); ///< Compute the iterated dominance frontier of a set of defining blocks
auto computeValue2Blocks() -> void; ///< Compute the value-to-block mappings (excluding arrays and globals)
auto preOptimize1() -> void; ///< LLVM mem2reg pre-optimization 1: remove allocas and stores that have no loads
auto preOptimize2() -> void; ///< LLVM mem2reg pre-optimization 2: handle variables whose def blocks consist of a single block
auto preOptimize3() -> void; ///< LLVM mem2reg pre-optimization 3: handle variables whose reads and writes all sit in one block
auto insertPhi() -> void; ///< Insert phi nodes on the iterated dominance frontier of every variable
auto rename(BasicBlock *block, std::unordered_map<Value *, int> &count,
std::unordered_map<Value *, std::stack<Instruction *>> &stacks) -> void; ///< Rename within a single block
auto renameAll() -> void; ///< Rename all blocks
// private helper function.
private:
auto getPredIndex(BasicBlock *n, BasicBlock *s) -> int; ///< Get the predecessor index
auto cascade(Instruction *instr, bool &changed, Function *func, BasicBlock *block,
std::list<std::unique_ptr<Instruction>> &instrs) -> void; ///< Eliminate cascading relations
auto isGlobal(Value *val) -> bool; ///< Check whether the value is a global variable
auto isArr(Value *val) -> bool; ///< Check whether the value is an array
auto usedelete(Instruction *instr) -> void; ///< Remove the value-use-user relations of an instruction
};
} // namespace sysy
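
`computeIterDf` is declared above, but its body is not part of this diff. As a hedged illustration only, the usual worklist formulation of the iterated dominance frontier can be sketched as follows. The sketch assumes per-block dominance frontiers have already been computed, and it uses plain integer block ids and hypothetical names (`Block`, `FrontierMap`, `iteratedDominanceFrontier`) instead of the project's `BasicBlock` and analysis classes.

```cpp
#include <cassert>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins: a block is an integer id, and the dominance
// frontier of each block is assumed to be precomputed.
using Block = int;
using BlockSet = std::unordered_set<Block>;
using FrontierMap = std::unordered_map<Block, BlockSet>;

// Iterated dominance frontier of defBlocks: the smallest set that contains
// DF(b) for every b in defBlocks and DF(f) for every f already in the set.
BlockSet iteratedDominanceFrontier(const FrontierMap &df, const BlockSet &defBlocks) {
  BlockSet result;
  std::vector<Block> worklist(defBlocks.begin(), defBlocks.end());
  while (!worklist.empty()) {
    Block b = worklist.back();
    worklist.pop_back();
    auto it = df.find(b);
    if (it == df.end()) continue; // block has an empty frontier
    for (Block f : it->second) {
      assert(f >= 0 && "block ids are non-negative in this sketch");
      // A newly discovered frontier block would hold a phi, which is itself
      // a definition, so its own frontier has to be visited as well.
      if (result.insert(f).second) worklist.push_back(f);
    }
  }
  return result;
}
```

In a mem2reg-style pass the blocks in the returned set are the candidates for phi insertion for the variable whose defining blocks were passed in.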

23
src/include/Reg2Mem.h Normal file

@ -0,0 +1,23 @@
#pragma once
#include "IR.h"
#include "IRBuilder.h"
namespace sysy {
/**
* Reg2Mem (the backend does not translate phi instructions)
*/
class Reg2Mem {
private:
Module *pModule;
IRBuilder *pBuilder;
public:
Reg2Mem(Module *pMoudle, IRBuilder *pBuilder) : pModule(pMoudle), pBuilder(pBuilder) {}
void DeletePhiInst();
// Remove use-def relations, because deleting phi instructions changes them
void usedelete(Instruction *instr);
};
} // namespace sysy
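
The header above only declares `DeletePhiInst()`; how it removes phi nodes is not shown in this diff. One common approach, assumed here purely for illustration, is to demote each phi to a stack slot: allocate the slot, store the incoming value at the end of every predecessor, and replace the phi with a load. The toy types and the textual pseudo-instructions below (`ToyPhi`, `Demoted`, `demotePhi`) are hypothetical and do not use the project's IR classes.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy phi node: a result name plus (predecessor block, incoming value) pairs.
struct ToyPhi {
  std::string result;
  std::vector<std::pair<std::string, std::string>> incoming;
};

// Outcome of demoting one phi, written as printable pseudo-instructions.
struct Demoted {
  std::string alloca;                              // slot allocation
  std::map<std::string, std::string> storeAtEndOf; // predecessor -> store text
  std::string load;                                // takes the place of the phi
};

Demoted demotePhi(const ToyPhi &phi) {
  assert(!phi.incoming.empty() && "a phi needs at least one incoming edge");
  const std::string slot = phi.result + ".slot";
  Demoted out;
  out.alloca = slot + " = alloca";
  // Each predecessor writes its incoming value into the slot before branching.
  for (const auto &edge : phi.incoming) {
    out.storeAtEndOf[edge.first] = "store " + edge.second + ", " + slot;
  }
  // The phi becomes a plain load from the slot.
  out.load = phi.result + " = load " + slot;
  return out;
}
```

Stores produced this way can end up unused, which would be consistent with the commit note that Reg2Mem generates dead stores and relies on dead code elimination to remove them.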


@ -0,0 +1,465 @@
#pragma once
#include "IR.h"
namespace sysy {
// Forward declaration
class Loop;
// Per-basic-block analysis information
class BlockAnalysisInfo {
public:
using block_list = std::vector<BasicBlock*>;
using block_set = std::unordered_set<BasicBlock*>;
protected:
// Dominator-tree data
int domdepth = 0; ///< Depth of the node in the dominator tree
BasicBlock* idom = nullptr; ///< Immediate dominator
block_list sdoms; ///< Children in the dominator tree
block_set dominants; ///< Set of dominators
block_set dominant_frontiers; ///< Dominance frontier
// Loop-analysis fields, to be added later
// Loop* loopbelong = nullptr; ///< Enclosing loop
// int loopdepth = 0; ///< Loop depth
public:
// Getters
const int getDomDepth() const { return domdepth; }
const BasicBlock* getIdom() const { return idom; }
const block_list& getSdoms() const { return sdoms; }
const block_set& getDominants() const { return dominants; }
const block_set& getDomFrontiers() const { return dominant_frontiers; }
// Dominator-tree operations
void setDomDepth(int depth) { domdepth = depth; }
void setIdom(BasicBlock* block) { idom = block; }
void addSdoms(BasicBlock* block) { sdoms.push_back(block); }
void clearSdoms() { sdoms.clear(); }
void removeSdoms(BasicBlock* block) {
sdoms.erase(std::remove(sdoms.begin(), sdoms.end(), block), sdoms.end());
}
void addDominants(BasicBlock* block) { dominants.emplace(block); }
void addDominants(const block_set& blocks) { dominants.insert(blocks.begin(), blocks.end()); }
void setDominants(BasicBlock* block) {
dominants.clear();
addDominants(block);
}
void setDominants(const block_set& doms) {
dominants = doms;
}
void setDomFrontiers(const block_set& df) {
dominant_frontiers = df;
}
// TODO: loop-analysis operations
// Clear all analysis information
void clear() {
domdepth = -1;
idom = nullptr;
sdoms.clear();
dominants.clear();
dominant_frontiers.clear();
// loopbelong = nullptr;
// loopdepth = 0;
}
};
// Per-function analysis information
class FunctionAnalysisInfo {
public:
// Function attributes
enum FunctionAttribute : uint64_t {
PlaceHolder = 0x0UL,
Pure = 0x1UL << 0,
SelfRecursive = 0x1UL << 1,
SideEffect = 0x1UL << 2,
NoPureCauseMemRead = 0x1UL << 3
};
// Data structures
using Loop_list = std::list<std::unique_ptr<Loop>>;
using block_loop_map = std::unordered_map<BasicBlock*, Loop*>;
using value_block_map = std::unordered_map<Value*, BasicBlock*>;
using value_block_count_map = std::unordered_map<Value*, std::unordered_map<BasicBlock*, int>>;
// Analysis data
FunctionAttribute attribute = PlaceHolder; ///< Function attributes
std::set<Function*> callees; ///< Set of called functions
Loop_list loops; ///< All loops
Loop_list topLoops; ///< Top-level loops
// block_loop_map basicblock2Loop; ///< Basic block to loop mapping
std::list<std::unique_ptr<AllocaInst>> indirectAllocas; ///< Indirectly allocated memory
// Value definition/use information
value_block_map value2AllocBlocks; ///< Value to allocation block mapping
value_block_count_map value2DefBlocks; ///< Value to defining blocks mapping
value_block_count_map value2UseBlocks; ///< Value to using blocks mapping
// Function-attribute operations
FunctionAttribute getAttribute() const { return attribute; }
void setAttribute(FunctionAttribute attr) { attribute = static_cast<FunctionAttribute>(attribute | attr); }
void clearAttribute() { attribute = PlaceHolder; }
// Call-relation operations
void addCallee(Function* callee) { callees.insert(callee); }
void removeCallee(Function* callee) { callees.erase(callee); }
void clearCallees() { callees.clear(); }
// Value-to-block lookup operations
BasicBlock* getAllocBlockByValue(Value* value) {
auto it = value2AllocBlocks.find(value);
return it != value2AllocBlocks.end() ? it->second : nullptr;
}
std::unordered_set<BasicBlock *> getDefBlocksByValue(Value *value) {
std::unordered_set<BasicBlock *> blocks;
if (value2DefBlocks.count(value) > 0) {
for (const auto &pair : value2DefBlocks[value]) {
blocks.insert(pair.first);
}
}
return blocks;
}
std::unordered_set<BasicBlock *> getUseBlocksByValue(Value *value) {
std::unordered_set<BasicBlock *> blocks;
if (value2UseBlocks.count(value) > 0) {
for (const auto &pair : value2UseBlocks[value]) {
blocks.insert(pair.first);
}
}
return blocks;
}
// Value definition/use operations
void addValue2AllocBlocks(Value* value, BasicBlock* block) { value2AllocBlocks[value] = block; }
void addValue2DefBlocks(Value* value, BasicBlock* block) { ++value2DefBlocks[value][block]; }
void addValue2UseBlocks(Value* value, BasicBlock* block) { ++value2UseBlocks[value][block]; }
// Accessors for value definition/use information
std::unordered_map<Value *, BasicBlock *>& getValue2AllocBlocks() {
return value2AllocBlocks;
}
std::unordered_map<Value *, std::unordered_map<BasicBlock *, int>>& getValue2DefBlocks() {
return value2DefBlocks;
}
std::unordered_map<Value *, std::unordered_map<BasicBlock *, int>>& getValue2UseBlocks() {
return value2UseBlocks;
}
std::unordered_set<Value *> getValuesOfDefBlock() {
std::unordered_set<Value *> values;
for (const auto &pair : value2DefBlocks) {
values.insert(pair.first);
}
return values;
}
// Removal operations
void removeValue2AllocBlock(Value *value) { value2AllocBlocks.erase(value); }
bool removeValue2DefBlock(Value *value, BasicBlock *block) {
bool changed = false;
if (--value2DefBlocks[value][block] == 0) {
value2DefBlocks[value].erase(block);
if (value2DefBlocks[value].empty()) {
value2DefBlocks.erase(value);
changed = true;
}
}
return changed;
}
bool removeValue2UseBlock(Value *value, BasicBlock *block) {
bool changed = false;
if (--value2UseBlocks[value][block] == 0) {
value2UseBlocks[value].erase(block);
if (value2UseBlocks[value].empty()) {
value2UseBlocks.erase(value);
changed = true;
}
}
return changed;
}
// Indirect-alloca operations
void addIndirectAlloca(AllocaInst* alloca) { indirectAllocas.emplace_back(alloca); }
std::list<std::unique_ptr<AllocaInst>>& getIndirectAllocas() { return indirectAllocas; }
// TODO: loop-analysis operations
// Clear all analysis information
void clear() {
attribute = PlaceHolder;
callees.clear();
loops.clear();
topLoops.clear();
// basicblock2Loop.clear();
indirectAllocas.clear();
value2AllocBlocks.clear();
value2DefBlocks.clear();
value2UseBlocks.clear();
}
};
// Loop class - optimizations not yet implemented
class Loop {
public:
using block_list = std::vector<BasicBlock *>;
using block_set = std::unordered_set<BasicBlock *>;
using Loop_list = std::vector<Loop *>;
protected:
Function *parent; // Owning function
block_list blocksInLoop; // Basic blocks inside the loop
BasicBlock *preheaderBlock = nullptr; // Preheader block
BasicBlock *headerBlock = nullptr; // Loop header
block_list latchBlock; // Latch blocks (sources of back edges)
block_set exitingBlocks; // Exiting blocks
block_set exitBlocks; // Exit target blocks
Loop *parentloop = nullptr; // Parent loop
Loop_list subLoops; // Sub-loops
size_t loopID; // Loop ID
unsigned loopDepth; // Loop depth
Instruction *indCondVar = nullptr; // Loop condition variable
Instruction::Kind IcmpKind; // Comparison kind
Value *indEnd = nullptr; // Loop end value
AllocaInst *IndPhi = nullptr; // Induction variable
ConstantValue *indBegin = nullptr; // Loop start value
ConstantValue *indStep = nullptr; // Loop step
std::set<GlobalValue *> GlobalValuechange; // Global variables modified inside the loop
int StepType = 0; // Loop step type
bool parallelable = false; // Whether the loop can be parallelized
public:
explicit Loop(BasicBlock *header, const std::string &name = "")
: headerBlock(header) {
blocksInLoop.push_back(header);
}
void setloopID() {
static unsigned loopCount = 0;
loopCount = loopCount + 1;
loopID = loopCount;
}
ConstantValue* getindBegin() { return indBegin; }
ConstantValue* getindStep() { return indStep; }
void setindBegin(ConstantValue *indBegin2set) { indBegin = indBegin2set; }
void setindStep(ConstantValue *indStep2set) { indStep = indStep2set; }
void setStepType(int StepType2Set) { StepType = StepType2Set; }
int getStepType() { return StepType; }
size_t getLoopID() { return loopID; }
BasicBlock* getHeader() const { return headerBlock; }
BasicBlock* getPreheaderBlock() const { return preheaderBlock; }
block_list& getLatchBlocks() { return latchBlock; }
block_set& getExitingBlocks() { return exitingBlocks; }
block_set& getExitBlocks() { return exitBlocks; }
Loop* getParentLoop() const { return parentloop; }
void setParentLoop(Loop *parent) { parentloop = parent; }
void addBasicBlock(BasicBlock *bb) { blocksInLoop.push_back(bb); }
void addSubLoop(Loop *loop) { subLoops.push_back(loop); }
void setLoopDepth(unsigned depth) { loopDepth = depth; }
block_list& getBasicBlocks() { return blocksInLoop; }
Loop_list& getSubLoops() { return subLoops; }
unsigned getLoopDepth() const { return loopDepth; }
bool isLoopContainsBasicBlock(BasicBlock *bb) const {
return std::find(blocksInLoop.begin(), blocksInLoop.end(), bb) != blocksInLoop.end();
}
void addExitingBlock(BasicBlock *bb) { exitingBlocks.insert(bb); }
void addExitBlock(BasicBlock *bb) { exitBlocks.insert(bb); }
void addLatchBlock(BasicBlock *bb) { latchBlock.push_back(bb); }
void setPreheaderBlock(BasicBlock *bb) { preheaderBlock = bb; }
void setIndexCondInstr(Instruction *instr) { indCondVar = instr; }
void setIcmpKind(Instruction::Kind kind) { IcmpKind = kind; }
Instruction::Kind getIcmpKind() const { return IcmpKind; }
bool isSimpleLoopInvariant(Value *value) ;
void setIndEnd(Value *value) { indEnd = value; }
void setIndPhi(AllocaInst *phi) { IndPhi = phi; }
Value* getIndEnd() const { return indEnd; }
AllocaInst* getIndPhi() const { return IndPhi; }
Instruction* getIndCondVar() const { return indCondVar; }
void addGlobalValuechange(GlobalValue *globalvaluechange2add) {
GlobalValuechange.insert(globalvaluechange2add);
}
std::set<GlobalValue *>& getGlobalValuechange() {
return GlobalValuechange;
}
void setParallelable(bool flag) { parallelable = flag; }
bool isParallelable() const { return parallelable; }
};
// Control-flow analysis class
class ControlFlowAnalysis {
private:
Module *pModule; ///< Module
std::unordered_map<BasicBlock*, BlockAnalysisInfo*> blockAnalysisInfo; // Per-basic-block analysis information
std::unordered_map<Function*, FunctionAnalysisInfo*> functionAnalysisInfo; // Per-function analysis information
public:
explicit ControlFlowAnalysis(Module *pMoudle) : pModule(pMoudle) {}
// Get the analysis information of a basic block
BlockAnalysisInfo* getBlockAnalysisInfo(BasicBlock *block) {
auto it = blockAnalysisInfo.find(block);
if (it != blockAnalysisInfo.end()) {
return it->second;
}
return nullptr; // Not found
}
FunctionAnalysisInfo* getFunctionAnalysisInfo(Function *func) {
auto it = functionAnalysisInfo.find(func);
if (it != functionAnalysisInfo.end()) {
return it->second;
}
return nullptr; // Not found
}
void init(); // Initialize the analyzer
void computeDomNode(); // Compute dominator sets
void computeDomTree(); // Build the dominator tree
// std::unordered_set<BasicBlock *> computeDomFrontier(BasicBlock *block) ; // Compute the dominance frontier of a single block (deprecated)
void computeDomFrontierAllBlk(); // Compute the dominance frontier of every block
void runControlFlowAnalysis(); // Run control-flow analysis (mainly the dominator tree and dominance frontiers)
void clear(){
for (auto &pair : blockAnalysisInfo) {
delete pair.second; // Free the basic-block analysis information
}
blockAnalysisInfo.clear();
for (auto &pair : functionAnalysisInfo) {
delete pair.second; // Free the function analysis information
}
functionAnalysisInfo.clear();
} // Clear the analysis results
~ControlFlowAnalysis() {
clear(); // Free all analysis information on destruction
}
private:
void intersectOP4Dom(std::unordered_set<BasicBlock *> &dom, const std::unordered_set<BasicBlock *> &other); // Set intersection
BasicBlock* findCommonDominator(BasicBlock *a, BasicBlock *b); // Find the common dominator of two basic blocks
};
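The class above only declares `computeDomNode` and `intersectOP4Dom`; their bodies are not part of this diff. As a rough illustration of the classic iterative scheme that such a pair usually implements, here is a minimal sketch on a toy CFG. The integer node ids, `NodeSet`, `intersectInto` and `computeDominators` are hypothetical names chosen for the example and are not the project's API.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Hypothetical toy CFG: node i has the predecessor list preds[i]; node 0 is the entry.
using NodeSet = std::set<int>;

// In-place intersection: keep in `dom` only the nodes that also appear in `other`.
void intersectInto(NodeSet &dom, const NodeSet &other) {
  for (auto it = dom.begin(); it != dom.end();) {
    if (other.count(*it) == 0) {
      it = dom.erase(it);
    } else {
      ++it;
    }
  }
}

// Iterative dominator sets:
//   Dom(entry) = {entry}
//   Dom(n)     = {n} + intersection of Dom(p) over all predecessors p of n
// repeated until nothing changes. Nodes without predecessors (other than the
// entry) keep the full set, i.e. they are treated as unreachable.
std::vector<NodeSet> computeDominators(const std::vector<std::vector<int>> &preds) {
  const int n = static_cast<int>(preds.size());
  NodeSet all;
  for (int i = 0; i < n; ++i) all.insert(i);
  std::vector<NodeSet> dom(n, all);
  if (n == 0) return dom;
  dom[0] = NodeSet{0};
  bool changed = true;
  while (changed) {
    changed = false;
    for (int b = 1; b < n; ++b) {
      NodeSet next = all;
      for (int p : preds[b]) {
        assert(p >= 0 && p < n);
        intersectInto(next, dom[p]);
      }
      next.insert(b);
      if (next != dom[b]) {
        dom[b] = std::move(next);
        changed = true;
      }
    }
  }
  return dom;
}
```

For a diamond-shaped CFG (0 branches to 1 and 2, both join at 3), the join block is dominated only by the entry and itself, which is where a dominance frontier and hence a phi node would appear.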
// Data-flow analysis base class
// Concrete data-flow analyzers derive from this class.
// Every analyzer performs a different analysis, so each one overrides analyze().
class DataFlowAnalysis {
public:
virtual ~DataFlowAnalysis() = default;
public:
virtual void init(Module *pModule) {} ///< initialize the analyzer
virtual auto analyze(Module *pModule, BasicBlock *block) -> bool { return true; } ///< returns true once the analysis is complete
virtual void clear() {} ///< reset
};
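// Data-flow analysis utilities
// Manages several data-flow analyzers and offers a uniform interface for forward and backward analysis
class DataFlowAnalysisUtils {
private:
std::vector<DataFlowAnalysis *> forwardAnalysisList; ///< forward analyzers
std::vector<DataFlowAnalysis *> backwardAnalysisList; ///< backward analyzers
public:
DataFlowAnalysisUtils() = default;
~DataFlowAnalysisUtils() {
clear(); // empty the analyzer lists on destruction; the analyzers themselves are not deleted
}
// Add analyzers in bulk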
void addAnalyzers(
std::vector<DataFlowAnalysis *> forwardList,
std::vector<DataFlowAnalysis *> backwardList = {})
{
forwardAnalysisList.insert(
forwardAnalysisList.end(),
forwardList.begin(),
forwardList.end());
backwardAnalysisList.insert(
backwardAnalysisList.end(),
backwardList.begin(),
backwardList.end());
}
// Add a single analyzer
void addForwardAnalyzer(DataFlowAnalysis *analyzer) {
forwardAnalysisList.push_back(analyzer);
}
void addBackwardAnalyzer(DataFlowAnalysis *analyzer) {
backwardAnalysisList.push_back(analyzer);
}
// Replace the analyzer lists
void setAnalyzers(
std::vector<DataFlowAnalysis *> forwardList,
std::vector<DataFlowAnalysis *> backwardList)
{
forwardAnalysisList = std::move(forwardList);
backwardAnalysisList = std::move(backwardList);
}
// Empty the lists
void clear() {
forwardAnalysisList.clear();
backwardAnalysisList.clear();
}
// Accessors
const auto& getForwardAnalyzers() const { return forwardAnalysisList; }
const auto& getBackwardAnalyzers() const { return backwardAnalysisList; }
public:
void forwardAnalyze(Module *pModule); ///< run the forward analyses
void backwardAnalyze(Module *pModule); ///< run the backward analyses
};
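`forwardAnalyze` and `backwardAnalyze` are only declared here. Since `analyze` is documented as returning true once an analysis is complete, one plausible driver is a loop that revisits the blocks until every analyzer reports completion. The sketch below shows that idea with stand-in types; `Block`, `Analysis`, `runToFixpoint` and `CountdownAnalysis` are hypothetical and only mirror the shape of the real interface.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the real BasicBlock type.
struct Block {
  int id;
};

// Hypothetical stand-in for DataFlowAnalysis: analyze() returns true once the
// facts for `block` have stopped changing.
struct Analysis {
  virtual ~Analysis() = default;
  virtual bool analyze(Block *block) = 0;
};

// Visit the blocks in the given order, again and again, until every analyzer
// returns true for every block in one full pass. Returns the number of passes.
// A forward analysis would typically pass the blocks in reverse post-order,
// a backward analysis in the opposite order.
int runToFixpoint(const std::vector<Block *> &order,
                  const std::vector<Analysis *> &analyzers) {
  int passes = 0;
  bool stable = false;
  while (!stable) {
    stable = true;
    ++passes;
    for (Block *b : order) {
      for (Analysis *a : analyzers) {
        assert(a != nullptr && b != nullptr);
        if (!a->analyze(b)) stable = false;
      }
    }
  }
  return passes;
}

// Toy analyzer used only to exercise the driver: it reports "not done" for
// its first `visits` calls and "done" afterwards.
struct CountdownAnalysis : Analysis {
  int remaining;
  explicit CountdownAnalysis(int visits) : remaining(visits) {}
  bool analyze(Block *) override {
    if (remaining > 0) {
      --remaining;
      return false;
    }
    return true;
  }
};
```

The driver does not own the analyzers, which matches the raw-pointer lists held by `DataFlowAnalysisUtils`.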
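// Live-variable analysis
// Provides def-use information
// Arrays are not handled as variables, but uses that appear in array dimensions are taken into account
class ActiveVarAnalysis : public DataFlowAnalysis {
private:
std::map<BasicBlock *, std::vector<std::set<User *>>> activeTable; ///< liveness table: the live-variable sets inside each basic block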
public:
ActiveVarAnalysis() = default;
~ActiveVarAnalysis() override = default;
public:
static std::set<User*> getUsedSet(Instruction *inst);
static User* getDefine(Instruction *inst);
public:
void init(Module *pModule) override;
bool analyze(Module *pModule, BasicBlock *block) override;
// Read-only access to the liveness table
const std::map<BasicBlock *, std::vector<std::set<User *>>> &getActiveTable() const;
void clear() override {
activeTable.clear(); // empty the liveness table
}
};
// Analysis manager, to be implemented later
// class AnalysisManager {
// };
} // namespace sysy
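The liveness table in `ActiveVarAnalysis` stores a vector of sets per basic block, which suggests one live set per instruction. A common way to fill such a table is a backward scan over the block. The sketch below assumes that reading and uses string variable names instead of `User *`; `Inst` and `liveBefore` are hypothetical names, not the project's API.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical instruction: defines at most one variable and uses some others.
struct Inst {
  std::string def;             // empty if the instruction defines nothing
  std::set<std::string> uses;  // variables read by the instruction
};

// Backward scan over one block:
//   live-before(i) = (live-after(i) minus def(i)) plus uses(i)
// Returns the live-before set of every instruction, in program order.
std::vector<std::set<std::string>> liveBefore(const std::vector<Inst> &block,
                                              const std::set<std::string> &liveOut) {
  std::vector<std::set<std::string>> result(block.size());
  std::set<std::string> live = liveOut;
  for (std::size_t i = block.size(); i-- > 0;) {
    if (!block[i].def.empty()) live.erase(block[i].def);
    live.insert(block[i].uses.begin(), block[i].uses.end());
    result[i] = live;
  }
  return result;
}
```

For the two-instruction block `c = a + b; d = c + a;` with `d` live on exit, `c` is live only between the two instructions, so a dead-store or dead-definition check can consult the set that follows each instruction.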


@ -62,6 +62,8 @@ private:
public:
SysYIRGenerator() = default;
bool HasReturnInst = false;
public:
Module *get() const { return module.get(); }
IRBuilder *getBuilder(){ return &builder; }


@ -0,0 +1,37 @@
#pragma once
#include "IR.h"
#include "IRBuilder.h"
namespace sysy {
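// Pre-processing of the SysY IR before optimization; it can also be seen as a partial CFG clean-up.
// It mainly deletes useless instructions, merges basic blocks and removes empty blocks.
// These steps could be done while the SysY IR is generated, but to keep IR generation simple
// they run as a pre-processing step after the SysY IR has been generated.
// Phi nodes are handled as well, so the pass can be run again after mem2reg.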
class SysYOptPre {
private:
Module *pModule;
IRBuilder *pBuilder;
public:
SysYOptPre(Module *pModule, IRBuilder *pBuilder) : pModule(pModule), pBuilder(pBuilder) {}
void SysYOptimizateAfterIR(){
SysYDelInstAfterBr();
SysYBlockMerge();
SysYDelNoPreBLock();
SysYDelEmptyBlock();
SysYAddReturn();
}
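void SysYDelInstAfterBr(); // delete the instructions that follow a br
void SysYDelEmptyBlock(); // delete empty blocks
void SysYDelNoPreBLock(); // delete blocks without predecessors
void SysYBlockMerge(); // merge basic blocks (mainly the exit blocks of nested if/while);
// alternatively, IR generation could be changed to use backpatching
void SysYAddReturn(); // add return instructions (mainly for void functions)
void usedelete(Instruction *instr); // remove the uses held by an instruction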
};
} // namespace sysy
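`SysYDelInstAfterBr` is declared without a body in this diff. The idea it names, dropping everything that follows the first terminator of a block, can be sketched on a toy block that stores opcode names as strings. `isTerminator`, `truncateAfterTerminator` and the opcode spellings are assumptions made for the example; in the real IR each erased instruction would also have to release its operand uses, which is presumably what `usedelete` is for.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical opcode spellings for block terminators.
bool isTerminator(const std::string &op) {
  return op == "br" || op == "condbr" || op == "ret";
}

// Erase every instruction after the first terminator of the block.
// Returns how many instructions were removed.
std::size_t truncateAfterTerminator(std::vector<std::string> &block) {
  auto it = std::find_if(block.begin(), block.end(), isTerminator);
  if (it == block.end()) return 0;  // no terminator: nothing to delete
  ++it;                             // keep the terminator itself
  const std::size_t removed = static_cast<std::size_t>(block.end() - it);
  block.erase(it, block.end());
  return removed;
}
```

Running this before block merging keeps every block in the "exactly one terminator, at the end" shape that the later CFG passes rely on.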


@ -0,0 +1,30 @@
#pragma once
#include <string>
#include "IR.h"
namespace sysy {
class SysYPrinter {
private:
Module *pModule;
public:
explicit SysYPrinter(Module *pModule) : pModule(pModule) {}
public:
void printIR();
void printGlobalVariable();
public:
static void printFunction(Function *function);
static void printInst(Instruction *pInst);
static void printType(Type *type);
static void printValue(Value *value);
static std::string getOperandName(Value *operand);
static std::string getTypeString(Type *type);
static std::string getValueName(Value *value);
};
} // namespace sysy


@ -8,9 +8,19 @@ using namespace std;
using namespace antlr4;
// #include "Backend.h"
#include "SysYIRGenerator.h"
#include "SysYIRPrinter.h"
#include "SysYIROptPre.h"
#include "RISCv64Backend.h"
#include "SysYIRAnalyser.h"
#include "DeadCodeElimination.h"
#include "Mem2Reg.h"
#include "Reg2Mem.h"
// #include "LLVMIRGenerator.h"
using namespace sysy;
int DEBUG = 0;
int DEEPDEBUG = 0;
static string argStopAfter;
static string argInputFile;
static bool argFormat = false;
@ -20,7 +30,7 @@ void usage(int code = EXIT_FAILURE) {
"Supported options:\n"
" -h \tprint help message and exit\n"
" -f \tpretty-format the input file\n"
" -s {ast,ir,asm,llvmir}\tstop after generating AST/IR/Assembly\n";
" -s {ast,ir,asm,llvmir,asmd,ird}\tstop after generating AST/IR/Assembly (ird and asmd also enable debug output)\n";
cerr << msg;
exit(code);
}
@ -71,27 +81,69 @@ int main(int argc, char **argv) {
// visit AST to generate IR
if (argStopAfter == "ir") {
SysYIRGenerator generator;
generator.visitCompUnit(moduleAST);
SysYIRGenerator generator;
generator.visitCompUnit(moduleAST);
if (argStopAfter == "ir" || argStopAfter == "ird") {
if (argStopAfter == "ird") {
DEBUG = 1;
}
auto moduleIR = generator.get();
// moduleIR->print(cout);
SysYPrinter printer(moduleIR);
if (DEBUG) {
cout << "=== Original IR ===\n";
printer.printIR();
}
auto builder = generator.getBuilder();
SysYOptPre optPre(moduleIR, builder);
optPre.SysYOptimizateAfterIR();
ControlFlowAnalysis cfa(moduleIR);
cfa.init();
ActiveVarAnalysis ava;
ava.init(moduleIR);
if (DEBUG) {
cout << "=== After CFA & AVA ===\n";
printer.printIR();
}
DeadCodeElimination dce(moduleIR, &cfa, &ava);
dce.runDCEPipeline();
if (DEBUG) {
cout << "=== After 1st DCE ===\n";
printer.printIR();
}
Mem2Reg mem2reg(moduleIR, builder, &cfa, &ava);
mem2reg.mem2regPipeline();
if (DEBUG) {
cout << "=== After Mem2Reg ===\n";
printer.printIR();
}
Reg2Mem reg2mem(moduleIR, builder);
reg2mem.DeletePhiInst();
if (DEBUG) {
cout << "=== After Reg2Mem ===\n";
printer.printIR();
}
dce.runDCEPipeline();
if (DEBUG) {
cout << "=== After 2nd DCE ===\n";
printer.printIR();
}
cout << "=== Final IR ===\n";
printer.printIR();
return EXIT_SUCCESS;
}
// else if (argStopAfter == "llvmir") {
// LLVMIRGenerator llvmirGenerator;
// llvmirGenerator.generateIR(moduleAST); // generate the IR through the public interface
// cout << llvmirGenerator.getIR();
// return EXIT_SUCCESS;
// }
// // generate assembly
// CodeGen codegen(moduleIR);
// string asmCode = codegen.code_gen();
// cout << asmCode << endl;
// if (argStopAfter == "asm")
// return EXIT_SUCCESS;
// generate assembly
if (argStopAfter == "asmd") {
// enable the debug flags before code generation so the backend can see them
DEBUG = 1;
DEEPDEBUG = 1;
}
auto module = generator.get();
sysy::RISCv64CodeGen codegen(module);
string asmCode = codegen.code_gen();
if (argStopAfter == "asm" || argStopAfter == "asmd") {
cout << asmCode << endl;
return EXIT_SUCCESS;
}
return EXIT_SUCCESS;
}
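The `ir` branch of `main` repeats the same "run a pass, then dump the IR if `DEBUG` is set" pattern six times. One way to express that pattern once is a small table-driven runner. The following is only a sketch of that design option, with hypothetical names (`Pass`, `runPipeline`) and plain callbacks standing in for the real passes and for `printer.printIR()`.

```cpp
#include <functional>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// A pass is a display name plus the action to run (hypothetical representation).
using Pass = std::pair<std::string, std::function<void()>>;

// Run the passes in order. When `debug` is set, print a banner and call `dump`
// after each pass, mirroring the "=== After ... ===" blocks in main().
void runPipeline(const std::vector<Pass> &passes, bool debug,
                 const std::function<void()> &dump, std::ostream &os) {
  for (const auto &pass : passes) {
    pass.second();
    if (debug) {
      os << "=== After " << pass.first << " ===\n";
      dump();
    }
  }
}
```

With such a runner, each pass would be registered as a lambda that calls the existing entry point (for example `dce.runDCEPipeline()`), and adding or reordering passes would no longer require copying the debug block.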


@ -1,12 +1,8 @@
//test add
int main(){
int a, b;
float d;
a = 10;
b = 2;
int c = a;
d = 1.1 ;
return a + b + c;
return a + b;
}


@ -5,10 +5,10 @@ int main() {
const int b = 2;
int c;
if (a == b)
c = a + b;
if (a != b)
c = b - a + 20; // 21 <- this
else
c = a * b;
c = a * b + b + b + 10; // 16
return c;
}


@ -7,7 +7,7 @@ int mul(int x, int y) {
int main(){
int a, b;
a = 10;
b = 0;
a = mul(a, b);
return a + b;
b = 3;
a = mul(a, b); //60
return a + b; //66
}

3
test_script/clean.sh Normal file

@ -0,0 +1,3 @@
rm -rf tmp/*
rm -rf *.s *.ll *clang *sysyc
rm -rf *_riscv32


@ -0,0 +1,49 @@
#!/bin/bash
# Input directory
input_dir="./tmp"
# Collect the matching executables under tmp, sorted by their numeric prefix in ascending order
executable_files=$(ls "$input_dir" | grep -E '^[0-9]+_.*' | grep -E '_gcc_riscv32$|_sysyc_riscv32$' | sort -t '_' -k1,1n)
# Return codes keyed by numeric prefix
declare -A gcc_results
declare -A sysyc_results
# Iterate over the matching executables
for file in $executable_files; do
# Extract the numeric prefix and the toolchain suffix.
# The suffix is matched at the end of the name so that test names containing underscores still work.
prefix=$(echo "$file" | cut -d '_' -f 1)
case "$file" in
*_gcc_riscv32) suffix="gcc" ;;
*_sysyc_riscv32) suffix="sysyc" ;;
esac
# Skip the file if both executables of this prefix have already been handled
if [[ ${gcc_results["$prefix"]} && ${sysyc_results["$prefix"]} ]]; then
continue
fi
# Run the executable and capture its return code
echo "Executing: $file"
qemu-riscv32 "$input_dir/$file"
ret_code=$?
# Report the return code explicitly
echo "Return code for $file: $ret_code"
# Store the return code according to the suffix
if [[ "$suffix" == "gcc" ]]; then
gcc_results["$prefix"]=$ret_code
elif [[ "$suffix" == "sysyc" ]]; then
sysyc_results["$prefix"]=$ret_code
fi
# Once both executables of a prefix have run, compare their return codes
if [[ ${gcc_results["$prefix"]} && ${sysyc_results["$prefix"]} ]]; then
gcc_ret=${gcc_results["$prefix"]}
sysyc_ret=${sysyc_results["$prefix"]}
if [[ "$gcc_ret" -ne "$sysyc_ret" ]]; then
echo -e "\e[31mWARNING: Return codes differ for prefix $prefix: _gcc=$gcc_ret, _sysyc=$sysyc_ret\e[0m"
else
echo "Return codes match for prefix $prefix: $gcc_ret"
fi
fi
done

49
test_script/exe.sh Normal file

@ -0,0 +1,49 @@
#!/bin/bash
# Input directory
input_dir="."
# Collect the matching executables in the current directory, sorted by their numeric prefix in ascending order
executable_files=$(ls "$input_dir" | grep -E '^[0-9]+_.*' | grep -E '_clang$|_sysyc$' | sort -t '_' -k1,1n)
# Return codes keyed by numeric prefix
declare -A clang_results
declare -A sysyc_results
# Iterate over the matching executables
for file in $executable_files; do
# Extract the numeric prefix and the toolchain suffix.
# The suffix is matched at the end of the name so that test names containing underscores still work.
prefix=$(echo "$file" | cut -d '_' -f 1)
case "$file" in
*_clang) suffix="clang" ;;
*_sysyc) suffix="sysyc" ;;
esac
# Skip the file if both executables of this prefix have already been handled
if [[ ${clang_results["$prefix"]} && ${sysyc_results["$prefix"]} ]]; then
continue
fi
# Run the executable and capture its return code
echo "Executing: $file"
"./$file"
ret_code=$?
# Report the return code explicitly
echo "Return code for $file: $ret_code"
# Store the return code according to the suffix
if [[ "$suffix" == "clang" ]]; then
clang_results["$prefix"]=$ret_code
elif [[ "$suffix" == "sysyc" ]]; then
sysyc_results["$prefix"]=$ret_code
fi
# Once both executables of a prefix have run, compare their return codes
if [[ ${clang_results["$prefix"]} && ${sysyc_results["$prefix"]} ]]; then
clang_ret=${clang_results["$prefix"]}
sysyc_ret=${sysyc_results["$prefix"]}
if [[ "$clang_ret" -ne "$sysyc_ret" ]]; then
echo -e "\e[31mWARNING: Return codes differ for prefix $prefix: _clang=$clang_ret, _sysyc=$sysyc_ret\e[0m"
else
echo "Return codes match for prefix $prefix: $clang_ret"
fi
fi
done


@ -0,0 +1,57 @@
#!/bin/bash
# Input and output paths
input_dir="../test/"
output_dir="./tmp"
# Do not build executables by default
generate_executable=false
# Parse command-line arguments
while [[ "$#" -gt 0 ]]; do
case $1 in
--executable|-e)
generate_executable=true
shift
;;
*)
echo "Unknown parameter: $1"
exit 1
;;
esac
done
# Make sure the output directory exists
mkdir -p "$output_dir"
# Iterate over every .sy file in the input directory
for sy_file in "$input_dir"*.sy; do
# File name without directory and extension
base_name=$(basename "$sy_file" .sy)
# Output file path
output_file="${output_dir}/${base_name}_gcc_riscv32.s"
# Compile the .sy file to a .s assembly file with gcc
riscv32-unknown-elf-gcc -x c -S "$sy_file" -o "$output_file"
# Check whether compilation succeeded
if [ $? -eq 0 ]; then
echo "Compiled $sy_file -> $output_file"
else
echo "Failed to compile $sy_file"
continue
fi
# With --executable or -e, additionally build an executable
if $generate_executable; then
executable_file="${output_dir}/${base_name}_gcc_riscv32"
riscv32-unknown-elf-gcc "$output_file" -o "$executable_file"
if [ $? -eq 0 ]; then
echo "Generated executable: $executable_file"
else
echo "Failed to generate executable from $output_file"
fi
fi
done

57
test_script/ll.sh Normal file

@ -0,0 +1,57 @@
#!/bin/bash
# Input and output paths
input_dir="../test/"
output_dir="./"
# Do not build executables by default
generate_executable=false
# Parse command-line arguments
while [[ "$#" -gt 0 ]]; do
case $1 in
--executable|-e)
generate_executable=true
shift
;;
*)
echo "Unknown parameter: $1"
exit 1
;;
esac
done
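# Make sure the output directory exists
mkdir -p "$output_dir"
# Iterate over every .sy file in the input directory
for sy_file in "$input_dir"*.sy; do
# File name without directory and extension
base_name=$(basename "$sy_file" .sy)
# Output file path
output_file="${base_name}_clang.ll"
# Compile the .sy file to a .ll file with clang
clang -x c -S -emit-llvm "$sy_file" -o "$output_file"
# Check whether compilation succeeded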
if [ $? -eq 0 ]; then
echo "Compiled $sy_file -> $output_file"
else
echo "Failed to compile $sy_file"
continue
fi
# With --executable or -e, additionally build an executable
if $generate_executable; then
executable_file="${base_name}_clang"
clang "$output_file" -o "$executable_file"
if [ $? -eq 0 ]; then
echo "Generated executable: $executable_file"
else
echo "Failed to generate executable from $output_file"
fi
fi
done


@ -0,0 +1,208 @@
#!/bin/bash
# run_vm_tests.sh - assemble, link and test SysY programs inside a RISC-V virtual machine
# This script is meant to run on a RISC-V 64 machine and depends on gcc.
# Expected directory layout:
# .
# ├── runit.sh
# ├── lib
# │ └── libsysy_riscv.a
# └── testdata
# ├── functional
# └── performance
# Directories relative to the script location
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" &>/dev/null && pwd)"
TMP_DIR="${SCRIPT_DIR}/tmp"
LIB_DIR="${SCRIPT_DIR}/lib"
TESTDATA_DIR="${SCRIPT_DIR}/testdata"
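# Compiler
GCC_NATIVE="gcc" # gcc inside the VM
# Print the help message
show_help() {
echo "Usage: $0 [options]"
echo "Assembles, links and tests previously generated .s assembly files inside a RISC-V virtual machine."
echo "Assumes the environment is already RISC-V 64, so the built programs can be executed directly."
echo ""
echo "Options:"
echo " -c, --clean Remove all generated files under the 'tmp' directory."
echo " -h, --help Show this help message and exit."
echo ""
echo "Steps:"
echo "1. Iterate over all .s assembly files under 'tmp/'."
echo "2. Assemble and link each .s file into an executable with the VM's gcc (linking with -L./lib -lsysy_riscv -static)."
echo "3. Run the resulting executable directly."
echo "4. Depending on the matching testdata/*.out file (whether its last line is an integer), compare the return code, the standard output, or both."
echo "5. If there is no matching .in/.out file, print the executable's return code."
echo "6. Trailing newlines are ignored when comparing output."
}
# Remove temporary files
clean_tmp() {
echo "Cleaning temporary directory: ${TMP_DIR}"
# Remove every file generated by this script and by runit.sh
rm -rf "${TMP_DIR}"/*.s \
"${TMP_DIR}"/*_sysyc_riscv64 \
"${TMP_DIR}"/*_sysyc_riscv64.actual_out \
"${TMP_DIR}"/*_sysyc_riscv64.expected_stdout \
"${TMP_DIR}"/*_sysyc_riscv64.o # in case .o files were produced
echo "Cleanup finished."
}
# Create the temporary directory if it does not exist (runit.sh should already have created it)
mkdir -p "${TMP_DIR}"
# Parse command-line arguments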
while [[ "$#" -gt 0 ]]; do
case "$1" in
-c|--clean)
clean_tmp
exit 0
;;
-h|--help)
show_help
exit 0
;;
*)
echo "Unknown option: $1"
show_help
exit 1
;;
esac
done
echo "SysY in-VM test runner starting..."
echo "Assembly directory: ${TMP_DIR}"
echo "Library directory: ${LIB_DIR}"
echo "Test data directory: ${TESTDATA_DIR}"
echo ""
# Find every .s assembly file under tmp and process each one
find "${TMP_DIR}" -maxdepth 1 -name "*.s" | while IFS= read -r s_file; do
# Recover the original test-case name from the .s file name,
# e.g. functional_21_if_test2_sysyc_riscv64.s -> functional_21_if_test2
base_name_from_s_file=$(basename "$s_file" .s)
original_test_name_underscored=$(echo "$base_name_from_s_file" | sed 's/_sysyc_riscv64$//')
# Turn only the first underscore back into a slash to rebuild the relative path, e.g. functional/21_if_test2
# (assumes test cases sit exactly one directory below testdata, as in the layout above)
original_relative_path=$(echo "$original_test_name_underscored" | sed 's|_|/|')
# Paths of the executable, the input file, the reference output and the actual output
executable_file="${TMP_DIR}/${base_name_from_s_file}"
input_file="${TESTDATA_DIR}/${original_relative_path}.in"
output_reference_file="${TESTDATA_DIR}/${original_relative_path}.out"
output_actual_file="${TMP_DIR}/${base_name_from_s_file}.actual_out"
echo "Processing assembly file: $(basename "$s_file")"
echo " Test case path: ${original_relative_path}"
# Step 1: assemble and link the .s file into an executable with the VM's gcc
# Note: this assumes gcc is available inside the VM; the lib directory is resolved relative to the script location
echo " Assembling and linking with gcc: ${GCC_NATIVE} \"${s_file}\" -o \"${executable_file}\" -L\"${LIB_DIR}\" -lsysy_riscv -static -g"
"${GCC_NATIVE}" "${s_file}" -o "${executable_file}" -L"${LIB_DIR}" -lsysy_riscv -static -g
if [ $? -ne 0 ]; then
echo -e "\e[31mError: GCC failed to assemble/link ${s_file}\e[0m"
continue
fi
echo " Generated executable: ${executable_file}"
# Step 2: run the executable and compare/report the results
# The executable runs natively; qemu-riscv64 is no longer involved
echo " Running: \"${executable_file}\""
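# Check whether a .out file exists
if [ -f "${output_reference_file}" ]; then
# Try to extract the expected return code and the expected standard output from the .out file
# Take the last line of the .out file and strip whitespace
LAST_LINE_TRIMMED=$(tail -n 1 "${output_reference_file}" | tr -d '[:space:]')
# Check whether the last line is a plain integer (an optional sign is allowed)
if [[ "$LAST_LINE_TRIMMED" =~ ^[-+]?[0-9]+$ ]]; then
# Treat the last line as the expected return code
EXPECTED_RETURN_CODE="$LAST_LINE_TRIMMED"
# Create a temporary file holding only the expected standard output (every line except the last)
EXPECTED_STDOUT_FILE="${TMP_DIR}/${base_name_from_s_file}.expected_stdout"
# head -n -1 prints every line except the last; a one-line file yields an empty file
head -n -1 "${output_reference_file}" > "${EXPECTED_STDOUT_FILE}"
echo " The .out file contains both standard output and an expected return code."
echo " Expected return code: ${EXPECTED_RETURN_CODE}"
if [ -s "${EXPECTED_STDOUT_FILE}" ]; then # -s tests that the file is non-empty
echo " Expected standard output file: ${EXPECTED_STDOUT_FILE}"
else
echo " Expected standard output is empty."
fi
# Run the program and capture the actual return code and standard output.
# executable_file is already an absolute path, so it is invoked without a "./" prefix.
if [ -f "${input_file}" ]; then
echo " Using input file: ${input_file}"
"${executable_file}" < "${input_file}" > "${output_actual_file}"
else
"${executable_file}" > "${output_actual_file}"
fi
ACTUAL_RETURN_CODE=$? # capture the exit status
# Compare the actual return code with the expected one
if [ "$ACTUAL_RETURN_CODE" -eq "$EXPECTED_RETURN_CODE" ]; then
echo -e "\e[32m Return code test passed: return code of ${original_relative_path}.sy (${ACTUAL_RETURN_CODE}) matches the expected value (${EXPECTED_RETURN_CODE})\e[0m"
else
echo -e "\e[31m Return code test failed: return code of ${original_relative_path}.sy does not match. Expected: ${EXPECTED_RETURN_CODE}, actual: ${ACTUAL_RETURN_CODE}\e[0m"
fi
# Compare actual and expected standard output, ignoring trailing newlines at the end of the file
if diff -q <(sed ':a;N;$!ba;s/\n*$//' "${output_actual_file}") <(sed ':a;N;$!ba;s/\n*$//' "${EXPECTED_STDOUT_FILE}") >/dev/null 2>&1; then
echo -e "\e[32m Standard output test passed: output matches the reference output of ${original_relative_path}.sy (trailing newlines ignored)\e[0m"
else
echo -e "\e[31m Standard output test failed: output of ${original_relative_path}.sy does not match\e[0m"
echo " Diff (may include trailing-newline differences):"
diff "${output_actual_file}" "${EXPECTED_STDOUT_FILE}" # show the raw diff for debugging
fi
else
# The last line is not a plain integer: treat the whole .out file as expected standard output
echo " The .out file is a plain standard-output reference. Comparing against: ${output_reference_file}"
# Run the program and redirect its output to a temporary file
if [ -f "${input_file}" ]; then
echo " Using input file: ${input_file}"
"${executable_file}" < "${input_file}" > "${output_actual_file}"
else
"${executable_file}" > "${output_actual_file}"
fi
EXEC_STATUS=$? # capture the exit status
if [ $EXEC_STATUS -ne 0 ]; then
echo -e "\e[33mWarning: the executable for ${original_relative_path}.sy exited with non-zero status ${EXEC_STATUS} (output-only comparison mode). Check the program logic or whether it is expected to return this status.\e[0m"
fi
# Compare actual and reference output, ignoring trailing newlines at the end of the file
if diff -q <(sed ':a;N;$!ba;s/\n*$//' "${output_actual_file}") <(sed ':a;N;$!ba;s/\n*$//' "${output_reference_file}") >/dev/null 2>&1; then
echo -e "\e[32m Passed: output matches the reference output of ${original_relative_path}.sy (trailing newlines ignored)\e[0m"
else
echo -e "\e[31m Failed: output of ${original_relative_path}.sy does not match\e[0m"
echo " Diff (may include trailing-newline differences):"
diff "${output_actual_file}" "${output_reference_file}" # show the raw diff for debugging
fi
fi
elif [ -f "${input_file}" ]; then
# Only a .in file exists: run with the input and report the exit code (no reference output)
echo " Using input file: ${input_file}"
echo " No .out file to compare against. Running and reporting the return code."
"${executable_file}" < "${input_file}"
EXEC_STATUS=$?
echo " Return code of ${original_relative_path}.sy: ${EXEC_STATUS}"
else
# Neither a .in nor a .out file exists: just run and report the exit code
echo " No .in or .out file found. Running and reporting the return code."
"${executable_file}"
EXEC_STATUS=$?
echo " Return code of ${original_relative_path}.sy: ${EXEC_STATUS}"
fi
echo "" # blank line between test cases for readability
done
echo "Script finished."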

223
test_script/runit.sh Normal file

@ -0,0 +1,223 @@
#!/bin/bash
# runit.sh - compile and test SysY programs
# This script should live in mysysy/test_script/
# Directories relative to the script location
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" &>/dev/null && pwd)"
TESTDATA_DIR="${SCRIPT_DIR}/../testdata"
BUILD_BIN_DIR="${SCRIPT_DIR}/../build/bin"
LIB_DIR="${SCRIPT_DIR}/../lib"
# TMP_DIR="${SCRIPT_DIR}/tmp"
TMP_DIR="/home/ladev987/paraComp/debug/share_folder/tmp"
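# Compiler and emulator
SYSYC="${BUILD_BIN_DIR}/sysyc"
GCC_RISCV64="riscv64-linux-gnu-gcc"
QEMU_RISCV64="qemu-riscv64"
# Flag that controls whether executables are built and run
EXECUTE_MODE=false
# Print the help message
show_help() {
echo "Usage: $0 [options]"
echo "Compiles .sy files and optionally runs them as tests."
echo ""
echo "Options:"
echo " -e, --executable Build executables, run them and compare the output (if .in/.out files exist)."
echo " If the last line of the .out file is an integer, it is treated as the expected return code and the rest as the expected standard output."
echo " If the last line of the .out file is not an integer, the whole .out file is treated as the expected standard output."
echo " Trailing newlines are ignored when comparing output."
echo " If no .in/.out files exist, the return code is printed."
echo " -c, --clean Remove all generated files under the 'tmp' directory."
echo " -h, --help Show this help message and exit."
echo ""
echo "Build steps:"
echo "1. Run sysyc to compile .sy to .s (RISC-V assembly)."
echo "2. Run riscv64-linux-gnu-gcc to build an executable from the .s file, linking with -L../lib/ -lsysy_riscv -static."
echo "3. Run the executable with qemu-riscv64."
echo "4. Depending on the .out file (whether its last line is an integer), compare the return code, the standard output, or both."
echo "5. If there are no .in/.out files, print the executable's return code."
}
# Remove temporary files
clean_tmp() {
echo "Cleaning temporary directory: ${TMP_DIR}"
rm -rf "${TMP_DIR}"/*
# If needed, other specific files can be removed too, following the clean.sh example
# rm -rf "${SCRIPT_DIR}"/*.s "${SCRIPT_DIR}"/*.ll "${SCRIPT_DIR}"/*clang "${SCRIPT_DIR}"/*sysyc
# rm -rf "${SCRIPT_DIR}"/*_riscv64
}
# Create the temporary directory if it does not exist
mkdir -p "${TMP_DIR}"
# Parse command-line arguments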
while [[ "$#" -gt 0 ]]; do
case "$1" in
-e|--executable)
EXECUTE_MODE=true
shift
;;
-c|--clean)
clean_tmp
exit 0
;;
-h|--help)
show_help
exit 0
;;
*)
echo "Unknown option: $1"
show_help
exit 1
;;
esac
done
echo "SysY test runner starting..."
echo "Input directory: ${TESTDATA_DIR}"
echo "Temporary directory: ${TMP_DIR}"
echo "Execute mode enabled: ${EXECUTE_MODE}"
echo ""
# Find every .sy file under testdata (including subdirectories) and process each one
find "${TESTDATA_DIR}" -name "*.sy" | while IFS= read -r sy_file; do
# Path of the .sy file relative to testdata, without the extension (e.g. 21_if_test2)
# This also covers files in subdirectories (e.g. functional/21_if_test2.sy)
relative_path_no_ext=$(realpath --relative-to="${TESTDATA_DIR}" "${sy_file%.*}")
# Replace slashes with underscores for output file names, to avoid collisions while keeping the structure
output_base_name=$(echo "${relative_path_no_ext}" | tr '/' '_')
# Paths of the assembly file, the executable, the input file and the output files
assembly_file="${TMP_DIR}/${output_base_name}_sysyc_riscv64.s"
executable_file="${TMP_DIR}/${output_base_name}_sysyc_riscv64"
input_file="${sy_file%.*}.in"
output_reference_file="${sy_file%.*}.out"
output_actual_file="${TMP_DIR}/${output_base_name}_sysyc_riscv64.actual_out"
echo "Processing: $(basename "$sy_file")"
echo " SY file: ${sy_file}"
# Step 1: compile .sy to .s with sysyc
echo " Compiling with sysyc: ${SYSYC} -s asm \"${sy_file}\" > \"${assembly_file}\""
"${SYSYC}" -s asm "${sy_file}" > "${assembly_file}"
if [ $? -ne 0 ]; then
echo -e "\e[31mError: SysY compilation of ${sy_file} failed\e[0m"
continue
fi
echo " Generated assembly file: ${assembly_file}"
# Only build and run the executable when EXECUTE_MODE is true
if ${EXECUTE_MODE}; then
# Step 2: build an executable from the .s file with riscv64-linux-gnu-gcc
echo " Compiling with gcc: ${GCC_RISCV64} \"${assembly_file}\" -o \"${executable_file}\" -L\"${LIB_DIR}\" -lsysy_riscv -static"
"${GCC_RISCV64}" "${assembly_file}" -o "${executable_file}" -L"${LIB_DIR}" -lsysy_riscv -static
if [ $? -ne 0 ]; then
echo -e "\e[31mError: GCC failed to compile ${assembly_file}\e[0m"
continue
fi
echo " Generated executable: ${executable_file}"
# Steps 3, 4, 5: run the executable and compare/report the results
echo " Running: ${QEMU_RISCV64} \"${executable_file}\""
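# Check whether a .out file exists
if [ -f "${output_reference_file}" ]; then
# Try to extract the expected return code and the expected standard output from the .out file
# Take the last line of the .out file and strip whitespace
LAST_LINE_TRIMMED=$(tail -n 1 "${output_reference_file}" | tr -d '[:space:]')
# Check whether the last line is a plain integer (an optional sign is allowed)
if [[ "$LAST_LINE_TRIMMED" =~ ^[-+]?[0-9]+$ ]]; then
# Treat the last line as the expected return code
EXPECTED_RETURN_CODE="$LAST_LINE_TRIMMED"
# Create a temporary file holding only the expected standard output (every line except the last)
EXPECTED_STDOUT_FILE="${TMP_DIR}/${output_base_name}_sysyc_riscv64.expected_stdout"
# head -n -1 prints every line except the last; a one-line file yields an empty file
head -n -1 "${output_reference_file}" > "${EXPECTED_STDOUT_FILE}"
echo " The .out file contains both standard output and an expected return code."
echo " Expected return code: ${EXPECTED_RETURN_CODE}"
if [ -s "${EXPECTED_STDOUT_FILE}" ]; then # -s tests that the file is non-empty
echo " Expected standard output file: ${EXPECTED_STDOUT_FILE}"
else
echo " Expected standard output is empty."
fi
# Run the program and capture the actual return code and standard output
if [ -f "${input_file}" ]; then
echo " Using input file: ${input_file}"
"${QEMU_RISCV64}" "${executable_file}" < "${input_file}" > "${output_actual_file}"
else
"${QEMU_RISCV64}" "${executable_file}" > "${output_actual_file}"
fi
ACTUAL_RETURN_CODE=$? # capture the exit status
# Compare the actual return code with the expected one
if [ "$ACTUAL_RETURN_CODE" -eq "$EXPECTED_RETURN_CODE" ]; then
echo -e "\e[32m Return code test passed: return code of ${sy_file} (${ACTUAL_RETURN_CODE}) matches the expected value (${EXPECTED_RETURN_CODE})\e[0m"
else
echo -e "\e[31m Return code test failed: return code of ${sy_file} does not match. Expected: ${EXPECTED_RETURN_CODE}, actual: ${ACTUAL_RETURN_CODE}\e[0m"
fi
# Compare actual and expected standard output, ignoring trailing newlines at the end of the file
# sed strips all trailing newlines from each file before diff compares them
if diff -q <(sed ':a;N;$!ba;s/\n*$//' "${output_actual_file}") <(sed ':a;N;$!ba;s/\n*$//' "${EXPECTED_STDOUT_FILE}") >/dev/null 2>&1; then
echo -e "\e[32m Standard output test passed: output matches the reference output of ${sy_file} (trailing newlines ignored)\e[0m"
else
echo -e "\e[31m Standard output test failed: output of ${sy_file} does not match\e[0m"
echo " Diff (may include trailing-newline differences):"
diff "${output_actual_file}" "${EXPECTED_STDOUT_FILE}" # show the raw diff for debugging
fi
else
# The last line is not a plain integer: treat the whole .out file as expected standard output
echo " The .out file is a plain standard-output reference. Comparing against: ${output_reference_file}"
# Run the executable, with the input file if there is one, and redirect its output to a temporary file
if [ -f "${input_file}" ]; then
echo " Using input file: ${input_file}"
"${QEMU_RISCV64}" "${executable_file}" < "${input_file}" > "${output_actual_file}"
else
"${QEMU_RISCV64}" "${executable_file}" > "${output_actual_file}"
fi
EXEC_STATUS=$? # capture the exit status
if [ $EXEC_STATUS -ne 0 ]; then
echo -e "\e[33mWarning: the executable for ${sy_file} exited with non-zero status ${EXEC_STATUS} (output-only comparison mode). Check the program logic or whether it is expected to return this status.\e[0m"
fi
# Compare actual and reference output, ignoring trailing newlines at the end of the file
if diff -q <(sed ':a;N;$!ba;s/\n*$//' "${output_actual_file}") <(sed ':a;N;$!ba;s/\n*$//' "${output_reference_file}") >/dev/null 2>&1; then
echo -e "\e[32m Passed: output matches the reference output of ${sy_file} (trailing newlines ignored)\e[0m"
else
echo -e "\e[31m Failed: output of ${sy_file} does not match\e[0m"
echo " Diff (may include trailing-newline differences):"
diff "${output_actual_file}" "${output_reference_file}" # show the raw diff for debugging
fi
fi
elif [ -f "${input_file}" ]; then
# Only a .in file exists: run with the input and report the exit code (no reference output)
echo " Using input file: ${input_file}"
echo " No .out file to compare against. Running and reporting the return code."
"${QEMU_RISCV64}" "${executable_file}" < "${input_file}"
EXEC_STATUS=$?
echo " Return code of ${sy_file}: ${EXEC_STATUS}"
else
# Neither a .in nor a .out file exists: just run and report the exit code
echo " No .in or .out file found. Running and reporting the return code."
"${QEMU_RISCV64}" "${executable_file}"
EXEC_STATUS=$?
echo " Return code of ${sy_file}: ${EXEC_STATUS}"
fi
else
echo " Execute mode is off. Only the assembly file was generated."
fi
echo "" # blank line between test cases for readability
done
echo "Script finished."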


@ -0,0 +1,57 @@
#!/bin/bash
# Input and output paths
input_dir="../test/"
output_dir="./tmp"
# Do not build executables by default
generate_executable=false
# Parse command-line arguments
while [[ "$#" -gt 0 ]]; do
case $1 in
--executable|-e)
generate_executable=true
shift
;;
*)
echo "Unknown parameter: $1"
exit 1
;;
esac
done
# Make sure the output directory exists
mkdir -p "$output_dir"
# Iterate over every .sy file in the input directory
for sy_file in "$input_dir"*.sy; do
# File name without directory and extension
base_name=$(basename "$sy_file" .sy)
# Output file path
output_file="${output_dir}/${base_name}_sysyc_riscv32.s"
# Compile the .sy file to a .s file with sysyc
../build/bin/sysyc -s asm "$sy_file" > "$output_file"
# Check whether compilation succeeded
if [ $? -eq 0 ]; then
echo "Compiled $sy_file -> $output_file"
else
echo "Failed to compile $sy_file"
continue
fi
# With --executable or -e, additionally build an executable
if $generate_executable; then
executable_file="${output_dir}/${base_name}_sysyc_riscv32"
riscv32-unknown-elf-gcc "$output_file" -o "$executable_file"
if [ $? -eq 0 ]; then
echo "Generated executable: $executable_file"
else
echo "Failed to generate executable from $output_file"
fi
fi
done

57
test_script/sysyll.sh Normal file

@ -0,0 +1,57 @@
#!/bin/bash
# Input and output paths
input_dir="../test/"
output_dir="./"
# Do not build executables by default
generate_executable=false
# Parse command-line arguments
while [[ "$#" -gt 0 ]]; do
case $1 in
--executable|-e)
generate_executable=true
shift
;;
*)
echo "Unknown parameter: $1"
exit 1
;;
esac
done
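# Make sure the output directory exists
mkdir -p "$output_dir"
# Iterate over every .sy file in the input directory
for sy_file in "$input_dir"*.sy; do
# File name without directory and extension
base_name=$(basename "$sy_file" .sy)
# Output file path
output_file="${base_name}_sysyc.ll"
# Compile the .sy file to a .ll file with sysyc
../build/bin/sysyc -s ir "$sy_file" > "$output_file"
# Check whether compilation succeeded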
if [ $? -eq 0 ]; then
echo "Compiled $sy_file -> $output_file"
else
echo "Failed to compile $sy_file"
continue
fi
# With --executable or -e, additionally build an executable
if $generate_executable; then
executable_file="${base_name}_sysyc"
clang "$output_file" -o "$executable_file"
if [ $? -eq 0 ]; then
echo "Generated executable: $executable_file"
else
echo "Failed to generate executable from $output_file"
fi
fi
done

3
test_script/wrapper.sh Normal file

@ -0,0 +1,3 @@
bash ./gcc-riscv32.sh -e
bash ./sysy-riscv32.sh -e
bash ./exe-riscv32.sh