[backend] Updated the test script; performance test case results can now be written to a file

[backend] Introduce -O1 into the backend pipeline
Merge branch 'backend-O1-1' into backend
2025-08-20 10:49:36 +08:00 · 2025-08-20 03:08:38 +08:00 · 2025-08-20 02:52:21 +08:00 · 2025-08-20 02:50:30 +08:00 · 2025-08-20 02:46:15 +08:00 · 2025-08-20 02:14:53 +08:00
86 changed files with 22333 additions and 3140 deletions
--- a/.gitignore
+++ b/.gitignore
@ -36,7 +36,7 @@ doxygen

 !/testdata/functional/*.out
 !/testdata/h_functional/*.out
-!/testdata/performance/*.out
+testdata/performance/
 build/
 .antlr
 .vscode/
--- a/Pass_ID_List.md
+++ b/Pass_ID_List.md
@ -14,15 +14,416 @@

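 The main goal of the Mem2Reg pass is to eliminate unnecessary memory allocations (alloca instructions) that are used only for local scalar variables, and to convert the values of these variables into SSA form. This reduces memory accesses, improves code efficiency, and creates better conditions for subsequent optimizations.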

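+Understanding how use relations are maintained when instructions are deleted, using Mem2Reg as the example:
+
+In the `Mem2Reg` pass, when `load` and `store` instructions are deleted, correctly removing their `use` relations (the links between them, as users, and the `Value` objects they take as operands) is a key issue, especially where `AllocaInst` is involved.
+
+Based on the `Mem2RegContext::renameVariables` code and the `usedelete` logic, the following describes in detail how the `use` relations are removed correctly: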
+
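+### Problem recap: the bidirectional nature of `Use` relations
+
+In this IR design, a `Use` object links a `User` (the user, e.g. `LoadInst`) and a `Value` (the value being used, e.g. `AllocaInst`) in both directions:
+
+* A `User` holds `Use` objects for its operand `Value`s (through the `User::operands` list).
+* A `Value` holds the `Use` objects of every `User` that uses it (through the `Value::uses` list).
+
+The original problem: when a `LoadInst` or `StoreInst` is deleted without explicitly cleaning up the `Use` relation between it (as a user) and its `AllocaInst` operand, the `uses` list of the `AllocaInst` keeps `Use` objects that point to the deleted `LoadInst` / `StoreInst`. The `User*` pointers inside them dangle, and a later access triggers a `segmentation fault`.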
+
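+### How `load`/`store` instructions are deleted in `Mem2Reg`
+
+In `Mem2RegContext::renameVariables`, `load` and `store` instructions are handled as follows:
+
+1.  **Handling `LoadInst`:**
+    When a `LoadInst` that reads from a promotable `AllocaInst` is found, its uses are replaced via `replaceAllUsesWith(allocaToValueStackMap[alloca].top())`. Every instruction that used the result of the `LoadInst` now directly uses the `Value` on top of the SSA value stack.
+    **Key point:** this step cleans up the `uses` list of the `LoadInst` in its role as **a used value (Value)**: all users of the `LoadInst` are redirected to the new SSA value, and the corresponding `Use` objects are removed from the `uses` list of the `LoadInst`.
+
+2.  **Handling `StoreInst`:**
+    When a `StoreInst` that writes to a promotable `AllocaInst` is found, the stored value is pushed onto the value stack. A `StoreInst` does not produce a value that other instructions can use (its type is `void`), so it has no `uses` list to replace.
+    **Key point:** the main role of a `StoreInst` is to update memory state. In SSA form, once it is removed, its operand relations in its role as **a user (User)** must be cleaned up.
+
+In both cases, once the SSA conversion of a `load` or `store` is done, the instruction is explicitly deleted through `instIter = SysYIROptUtils::usedelete(instIter)`.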
+
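+### How `SysYIROptUtils::usedelete` removes `Use` relations correctly
+
+The key is the change to `SysYIROptUtils::usedelete` so that, when deleting an instruction, it handles both kinds of `Use` relations: the instruction as a `User` and the instruction as a `Value`:
+
+1.  **Cleaning up the `uses` list of the instruction as a `Value` (done by `replaceAllUsesWith`):**
+    Inside `usedelete`, the call `inst->replaceAllUsesWith(UndefinedValue::get(inst->getType()))` is essential. It ensures that:
+    * If the deleted `Instruction` (e.g. a `LoadInst`) produced a result that other instructions use, all of those users are redirected to `UndefinedValue` (or, in `Mem2Reg`, to the concrete SSA value).
+    * The process walks the `uses` list of the `LoadInst` and removes those `Use` objects from it, so the `LoadInst` itself is no longer used by any other instruction.
+
+2.  **Cleaning up the `uses` lists of the operands of the instruction as a `User` (done by `RemoveUserOperandUses`):**
+    This is the key improvement integrated into `usedelete`. For a deleted `Instruction` (which is also a `User`), the `use` relations maintained by **the operands it uses** must be cleaned up.
+    * For example, `LoadInst %op1` uses `%op1` (an `AllocaInst`). When the `LoadInst` is deleted, the `uses` list of the `AllocaInst` still contains a `Use` object pointing to that `LoadInst`.
+    * `RemoveUserOperandUses` walks the `operands` list of the deleted `User` (the `LoadInst` or `StoreInst`).
+    * For each `std::shared_ptr<Use> use_ptr` in the `operands` list, it obtains the `Value` the `Use` object points to (e.g. an `AllocaInst*`) and calls `value->removeUse(use_ptr)`.
+    * The `removeUse` call removes `use_ptr` from the `uses` list of the `AllocaInst`.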
+
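+### Summary
+
+`SysYIROptUtils::usedelete` performs both steps:
+
+* `replaceAllUsesWith`: handles the `use` relations of the deleted instruction **as a result used by others**.
+* `RemoveUserOperandUses`: handles the `use` relations of **the operands** of the deleted instruction **as a user (User)**.
+
+As a result, when `Mem2Reg` walks over and deletes `load` and `store` instructions, all related `Use` objects are removed from the relevant `uses` lists, covering both the uses of the instruction as a `Value` and the operand uses it holds as a `User`. This avoids dangling pointers and the subsequent `segmentation fault`.
+
+Finally, once all `load` and `store` instructions that reference an `AllocaInst` have been removed, the `uses` list of the `AllocaInst` is clean (it contains only Phi instructions, if the SSA conversion needs them to keep the alloca as an operand). At that point, `SysYIROptUtils::usedelete(alloca)` in the `Mem2RegContext::cleanup()` phase can safely delete the `AllocaInst` itself.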
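
The two cleanup steps described above can be illustrated with a small model. This is a minimal Python sketch, not the project's C++ code: the classes `Value`, `Use`, `User` and the helper `use_delete` are hypothetical stand-ins that only mirror the bidirectional bookkeeping (`operands` on the user side, `uses` on the value side).

```python
class Value:
    def __init__(self, name):
        self.name = name
        self.uses = []  # Use objects whose `value` is this Value

    def replace_all_uses_with(self, new_value):
        # Redirect every user of this value to new_value and empty our list.
        for use in list(self.uses):
            use.value = new_value
            new_value.uses.append(use)
        self.uses.clear()


class Use:
    def __init__(self, user, value):
        self.user = user    # the instruction that uses `value`
        self.value = value  # the value being used


class User(Value):
    def __init__(self, name, operands=()):
        super().__init__(name)
        self.operands = []
        for value in operands:
            use = Use(self, value)
            self.operands.append(use)  # user -> value direction
            value.uses.append(use)     # value -> user direction


def use_delete(inst, block, undef):
    # Step 1: nobody may keep using the result of the deleted instruction.
    inst.replace_all_uses_with(undef)
    # Step 2: the operands must forget the deleted instruction.
    for use in inst.operands:
        use.value.uses.remove(use)
    inst.operands.clear()
    block.remove(inst)


alloca = User("%a")
load = User("%v", [alloca])
add = User("%s", [load, load])
block = [alloca, load, add]
ssa_value = Value("%x.0")
undef = Value("undef")

load.replace_all_uses_with(ssa_value)  # what Mem2Reg does for a load
use_delete(load, block, undef)
```

After the last call, `alloca.uses` is empty, so deleting the alloca later cannot touch a dangling user. Skipping step 2 would leave a stale `Use` in `alloca.uses`, which is the failure mode described in the text.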
+
 ## Reg2Mem

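 Our Reg2Mem pass mainly serves as an inverse of Mem2Reg, but more specifically it addresses the backend's inability to handle PhiInst instructions. The main idea is to convert function parameters and the results of PhiInst instructions from SSA form back to memory form by inserting alloca, load, and store instructions. The results of other, non-Phi instructions remain in SSA form.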

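+## SCCP
+
+SCCP (Sparse Conditional Constant Propagation) is a compiler optimization that combines constant propagation with dead code elimination. Its core idea is to identify, at compile time, the values that are provably constant and replace them with those constants, while also removing basic blocks that can never be executed (unreachable code).
+
+The implementation approach of SCCP is as follows:
+
+1. Core data structures and worklists:
+
+Lattice value: SCCP uses a three-valued lattice to represent the state of a variable:
+
+Top (T): the initial state; the value is not yet known, but it may still turn out to be a constant.
+
+Constant (C): the value has been determined to be a specific constant.
+
+Bottom (⊥): the value is not a constant or cannot be determined (for example, it may take several different values at run time, or it is loaded from memory). Once a variable becomes Bottom, it can never go back to Constant or Top.
+
+SSAPValue: wraps the lattice value together with the concrete constant (when the state is Constant).
+
+valState (map<Value, SSAPValue>): stores the current SCCP lattice state of every Value in the program (variables, instruction results, and so on).
+
+ExecutableBlocks (set<BasicBlock>): stores the basic blocks that the analysis has determined to be executable.
+
+Worklists:
+
+cfgWorkList (queue<pair<BasicBlock, BasicBlock>>): stores control-flow graph (CFG) edges that still need processing. When a block is marked executable, its outgoing edges are added to this list.
+
+ssaWorkList (queue<Instruction>): stores SSA (Static Single Assignment) instructions that still need processing. When the state of any operand of an instruction changes, the instruction is added to this list to be re-evaluated.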
+
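+2. Initialization:
+
+The state of every Value is initialized to Top.
+
+Every basic block is initialized as not executable.
+
+The entry basic block of the function is marked executable, and all instructions in it are added to ssaWorkList.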
+
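+3. Iteration (fixed-point iteration):
+
+The core of SCCP is an iterative process that alternates between the CFG worklist and the SSA worklist until a fixed point is reached (no more state changes).
+
+Processing cfgWorkList:
+
+Take an edge (prev, next) from cfgWorkList.
+
+If the block next was not executable before and is now reachable through prev, mark it executable (markBlockExecutable).
+
+Once next becomes executable, all instructions in it (Phi instructions in particular) must be re-evaluated, so they are added to ssaWorkList.
+
+Processing ssaWorkList:
+
+Take an instruction inst from ssaWorkList.
+
+Important: inst is processed only if its block is executable. Instructions in non-executable blocks do not take part in constant propagation.
+
+Compute the new lattice value (computeLatticeValue): compute the new lattice state of inst from its instruction type and the current lattice states of its operands.
+
+Constant folding: if all operands are constants, the operation is evaluated directly and yields a new constant.
+
+Bottom propagation: if any operand is Bottom, or the operation is undefined (for example, division by zero), the result is Bottom.
+
+Special handling of Phi instructions: the value of a Phi depends on the values coming in from all of its executable predecessors.
+
+If all executable predecessors provide the same constant C, the Phi result is C.
+
+If any executable predecessor provides Bottom, or different executable predecessors provide different constants, the Phi result is Bottom.
+
+If all executable predecessors provide Top, the Phi result stays Top.
+
+Update state: if the newly computed lattice value of inst differs from the stored one, update valState[inst].
+
+Propagate changes: if the state of inst changed, every instruction that uses inst as an operand may be affected and must be re-evaluated, so all users of inst are added to ssaWorkList.
+
+Handling terminator instructions (BranchInst, ReturnInst):
+
+For a conditional BranchInst whose condition operand has become a constant:
+
+If the condition is true, only the true target is reachable; add that edge to cfgWorkList.
+
+If the condition is false, only the false target is reachable; add that edge to cfgWorkList.
+
+If the condition is not a constant (Top or Bottom), both branches may execute; add both edges to cfgWorkList.
+
+This affects the CFG reachability analysis and may cause new blocks to be marked executable.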
+
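+4. Applying the optimization (transformation):
+
+When both worklists are empty and the fixed point has been reached, the actual code modification begins:
+
+Constant replacement:
+
+Walk over all instructions. If the valState of an instruction is Constant, replace all uses of the instruction with the corresponding ConstantValue (replaceAllUsesWith).
+
+Mark the instruction for deletion.
+
+For the operands of an instruction, if the valState of an operand is Constant, replace the operand directly with the corresponding ConstantValue (constant folding).
+
+Deleting dead instructions: walk over all instructions marked for deletion and remove them from their parent basic block.
+
+Deleting unreachable basic blocks: walk over all basic blocks of the function. If a block is not marked executable (not present in ExecutableBlocks), remove it from the function. The entry block must never be removed.
+
+Simplifying branch instructions:
+
+Walk over the terminator of every executable basic block.
+
+For a conditional BranchInst whose condition operand is Constant in valState:
+
+If the condition is true, replace the conditional branch with an unconditional jump to the true target.
+
+If the condition is false, replace the conditional branch with an unconditional jump to the false target.
+
+Update the CFG, removing the unreachable branch edges and the corresponding predecessor information.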
+
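+Details of computeLatticeValue:
+
+This function is the core logic of SCCP. It defines how the lattice state of an instruction result is computed from the instruction type and the current lattice states of the operands.
+
+Binary operations (Add, Sub, Mul, Div, Rem, ICmp, And, Or):
+
+If either operand is Bottom, the result is Bottom.
+
+Otherwise, if either operand is Top, the result is Top.
+
+If both operands are Constant, the operation is evaluated and the result is a new Constant.
+
+Unary operations (Neg, Not):
+
+If the operand is Bottom, the result is Bottom.
+
+If the operand is Top, the result is Top.
+
+If the operand is Constant, the operation is evaluated and the result is a new Constant.
+
+Load instructions: the result of a Load is normally marked Bottom, because memory contents usually cannot be determined at compile time. A load from a known global constant could in principle be resolved, but the current code generally returns Bottom.
+
+Store instructions: a Store produces no value, so its SSAPValue stays Top or is simply ignored.
+
+Call instructions: the result of most Call instructions (especially calls to external functions or functions with side effects) is Bottom. For pure functions with all-constant arguments, folding is possible in theory, but it requires additional analysis.
+
+GetElementPtr (GEP) instructions: a GEP computes a memory address. If all indices are constants, the address itself is a constant. SCCP, however, tracks data values, so Bottom is normally returned here unless pointer constants are tracked explicitly.
+
+Phi instructions: as described above, the result aggregates the incoming values of all executable predecessors.
+
+Alloc instructions: an Alloc allocates memory and returns a pointer. Its contents are normally Bottom.
+
+Branch and Return instructions: these are terminators and produce no value usable by other instructions; their SSAPValue usually stays Top or is ignored.
+
+Type conversions (ZExt, SExt, Trunc, FtoI, ItoF): if the operand is Constant, the conversion is performed and the result is still Constant. For floating-point conversions, because constantVal in SSAPValue is an int, operations on floats conservatively return Bottom.
+
+Unhandled instructions: by default, any instruction that is not handled explicitly is conservatively assumed to produce Bottom.
+
+Notes on floating-point handling:
+
+In the current code, constantVal in SSAPValue is an int, which complicates floating-point constant propagation. For floating-point instructions (kFAdd, kFMul, kFCmp, kFNeg, kFNot, kItoF, kFtoI, and so on), if the float value cannot be stored exactly in an int or the float operation cannot be evaluated reliably, the result is conservatively set to Bottom. A more complete SCCP implementation would use std::variant<int, float> or a separate float constant store to handle floating-point values.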
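
The lattice rules above can be condensed into a few lines. The following is a minimal Python sketch under stated assumptions: constants are plain Python ints, `TOP`/`BOTTOM` are sentinel strings, and the function names `meet`, `eval_phi`, and `eval_binary` are hypothetical, not the names used in `SCCP.cpp`. Only a handful of integer operations are modeled.

```python
TOP, BOTTOM = "top", "bottom"  # sentinels; any int is a Constant


def meet(a, b):
    """Combine two lattice values coming from different predecessors."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    if a == BOTTOM or b == BOTTOM:
        return BOTTOM
    return a if a == b else BOTTOM  # two different constants disagree


def eval_phi(incoming, executable_preds, state):
    """incoming: list of (pred_block, value_name); only executable preds count."""
    result = TOP
    for pred, value in incoming:
        if pred in executable_preds:
            result = meet(result, state.get(value, TOP))
    return result


def trunc_div(a, b):
    # C-style division that truncates toward zero (Python's // floors).
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def eval_binary(op, lhs, rhs):
    if lhs == BOTTOM or rhs == BOTTOM:
        return BOTTOM
    if lhs == TOP or rhs == TOP:
        return TOP
    if op == "add":
        return lhs + rhs
    if op == "sub":
        return lhs - rhs
    if op == "mul":
        return lhs * rhs
    if op == "div":
        return BOTTOM if rhs == 0 else trunc_div(lhs, rhs)
    return BOTTOM  # unhandled operations are conservatively Bottom


state = {"x": 5, "y": 7}
phi_one_pred = eval_phi([("b1", "x"), ("b2", "y")], {"b1"}, state)
phi_both_preds = eval_phi([("b1", "x"), ("b2", "y")], {"b1", "b2"}, state)
```

With only `b1` executable, the Phi is the constant from `x`; once `b2` also becomes executable the two constants disagree and the Phi drops to Bottom. This is why the analysis must re-evaluate Phi instructions whenever a new incoming edge becomes executable.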
+
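+## LoopSR: loop induction variable strength reduction, notes on magic number computation
+
+The core idea of magic-number division is to turn a division into a multiplication and a shift.
+
+Mathematical principle: x / d ≈ (x * m) >> (32 + s)
+
+m is the magic number
+s is the additional shift amount
+>> is an arithmetic right shift
+
+2^(32+s) / d ≤ m < 2^(32+s) / d + 2^s / d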
+
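+The issue is the precision requirement. The theoretical magic number must be very precise, so simply using 2^32 / d is not enough. The following script shows the real reason for iterating: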
+
+cd /home/downright/Compiler_Opt/mysysy && python3 -c "
+# The real reason for iterating: the precision requirement
+def explain_precision_requirement():
+    d = 10
+    
+    print('The magic number algorithm needs to find a precise bound:')
+    print('Goal: 2^p > d * (2^31 - r), where r is the remainder')
+    print()
+    
+    # Simulate the iteration of our algorithm
+    p = 31
+    two_p = 2**p
+    r = two_p % d  # remainder
+    m = two_p // d  # quotient
+    
+    print(f'Initial state (p={p}):')
+    print(f'  2^{p} = {two_p:,}')
+    print(f'  2^{p} / {d} = {m:,} remainder {r}')
+    print(f'  Need to check: 2^{p} > {d} * (2^31 - {r}) = {d * (2**31 - r):,}')
+    print(f'  Condition holds: {two_p > d * (2**31 - r)}')
+    print()
+    
+    # Keep iterating until the remainder is 0 or m leaves the 32-bit range
+    iteration = 0
+    while r != 0 and m < 2**31:
+        iteration += 1
+        p += 1
+        two_p *= 2
+        m *= 2
+        r *= 2
+        if r >= d:
+            m += 1
+            r -= d
+            
+        print(f'Iteration {iteration} (p={p}):')
+        print(f'  m = {m:,}, r = {r}')
+        print(f'  m < 2^31? {m < 2**31} (2^31 = {2**31:,})')
+        
+        if m >= 2**31:
+            print(f'  ✗ magic number {m:,} exceeds the 32-bit range!')
+            break
+        elif r == 0:
+            print(f'  ✓ remainder is 0, exact solution found')
+            break
+        
+        if iteration > 10:  # guard against an endless loop
+            print('  ... (iteration limit reached)')
+            break
+
+explain_precision_requirement()
+"
+
+
+cd /home/downright/Compiler_Opt/mysysy && python3 -c "
+# Check whether the initial state can overflow
+def check_initial_overflow():
+    int32_max = 2**31 - 1
+    
+    print('Checking whether the initial magic number overflows for various divisors:')
+    print(f'Maximum 32-bit signed integer: {int32_max:,}')
+    print()
+    
+    # Check a range of divisors
+    divisors = [2, 3, 5, 7, 9, 10, 11, 13, 17, 25, 100, 1000]
+    
+    for d in divisors:
+        # initial magic number = 2^31 / d
+        initial_magic = (2**31) // d
+        
+        overflow = initial_magic > int32_max
+        status = \"overflow\" if overflow else \"safe\"
+        
+        print(f'd={d:4d}: initial magic = 2^31/{d} = {initial_magic:10,} [{status}]')
+        
+    print()
+    print('Conclusion: the magic number does not overflow initially; overflow happens during iteration')
+
+check_initial_overflow()
+"
+
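+Summary
+
+Why iteration is necessary:
+
+* It is not needed to avoid an initial overflow (the initial state is safe).
+* It is needed to find the most precise magic number and reduce rounding error.
+* Each iteration doubles the precision, but the magic number doubles as well.
+
+When overflow happens:
+
+* Initial state: 2^31 / d always fits in the 32-bit range.
+* During iteration: 2^32 / d, 2^33 / d, ... gradually exceed the 32-bit range.
+
+Correctness of the fallback values:
+
+* The fallback values are standard values grounded in the mathematical theory and verified in practice.
+* They come from the implementations in mature compilers such as LLVM and GCC.
+* They have been checked by tests and produce correct results for the inputs tested.
+
+Design philosophy of the algorithm:
+
+* Try the optimal solution first: iterate to find the most precise magic number.
+* Detect boundary conditions: notice promptly when the value leaves the 32-bit range.
+* Fall back sensibly: use verified standard values to guarantee correctness.
+* Stay general: divisors without a preset value still work.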
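
For comparison with the iteration discussed above, here is a minimal Python sketch of signed 32-bit magic-number selection. It is not the project's `computeMagic()`; the names `compute_magic` and `magic_div` are hypothetical. It picks the smallest shift `s` for which `m = floor(2^(32+s)/d) + 1` satisfies `m*d - 2^(32+s) <= 2^(s+1)`, which is sufficient for all `x` in the signed 32-bit range. Python integers are unbounded, so `m` may exceed 2^31 here; a real backend that uses `MULH` needs an extra add fix-up in that case.

```python
def compute_magic(d, bits=32):
    """Return (m, s) such that x / d == magic_div(x, m, s) for signed x."""
    if d < 2 or d >= 1 << (bits - 1):
        raise ValueError("sketch handles 2 <= d < 2**(bits-1) only")
    for s in range(bits):
        k = bits + s
        m = (1 << k) // d + 1
        # error term must be small enough for every |x| <= 2**(bits-1)
        if m * d - (1 << k) <= (1 << (s + 1)):
            return m, s
    raise AssertionError("unreachable: s = bits - 2 always satisfies the bound")


def magic_div(x, m, s, bits=32):
    q = (x * m) >> (bits + s)       # Python's >> floors, like an arithmetic shift
    return q + 1 if x < 0 else q    # add 1 for negative x to truncate toward zero


def trunc_div(x, d):
    # reference result: C-style division truncating toward zero (d > 0)
    return -((-x) // d) if x < 0 else x // d


m10, s10 = compute_magic(10)
```

For `d = 10` this yields `m = 0x66666667` with `s = 2`, the constant commonly seen in compiler output for division by ten.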
+
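+## Dead induction variable elimination
+
+Overall architecture and workflow
+The current induction variable elimination is split into three clear phases:
+
+* Identification phase: find all potentially dead induction variables.
+* Safety analysis phase: verify that eliminating each variable is safe.
+* Elimination phase: actually delete the dead induction variables that are safe to remove.
+
+
+Escape point detection (a key safety mechanism, now fixed)
+
+* Array index detection: GEP instructions are correctly recognized as escape points.
+* Loop exit conditions: induction variables used in comparisons and conditional branches are not eliminated.
+* Control-flow instructions: condBr, br, return, and similar instructions are treated specially as escape points.
+* Memory operations: store/load instructions are checked through alias analysis.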

 # Changes that later optimizations may involve

-## 1) Gather all allocas into the entry block
+## 1) Gather all allocas into the entry block (implemented)

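 Benefit: optimization-friendly; makes promotion by mem2reg easier
 This mechanism is not implemented yet; implementing it first requires distinguishing same-named variables in different scopes of the same function
-The symbol table must correctly maintain the local variables in each scope
+The symbol table must correctly maintain the local variables in each scope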
+
+
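+# TODO: midend optimizations to improve compiler performance
+
+## The usedelete_withinstdelte method
+
+This method removes the use relations and erases the instruction. It works by using the Instruction* inst to find the corresponding iterator and then calling erase.
+In some cases the caller already holds both the iterator and inst, so the find step can be skipped.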
--- a/doc/CompilerDesign.md
+++ b/doc/CompilerDesign.md
@ -0,0 +1,266 @@
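+# Core Compiler Techniques and Optimizations in Detail
+
+This document examines the internal implementation of the mysysy compiler, focusing on the core compilation techniques and optimization algorithms used in the frontend, midend, and backend, with references to the functions that implement them.
+
+## 1. Overall Compiler Architecture
+
+The compiler uses the classic three-stage architecture, dividing compilation into a frontend, a midend, and a backend. Each part works at a different level of abstraction and communicates through well-defined interfaces (AST, IR), which keeps the design highly modular.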
+
+```mermaid
+graph TD
+    A[Source code .sy] --> B{Frontend};
+    B --> C[Abstract syntax tree AST];
+    C --> D{Midend};
+    D --> E[SSA-based IR];
+    E -- optimization --> F[Optimized IR];
+    F --> G{Backend};
+    G --> H[Target machine code MachineInstr];
+    H --> I[RISC-V 64 assembly .s];
+
+    subgraph Frontend
+        B
+    end
+    subgraph Midend
+        D
+    end
+    subgraph Backend
+        G
+    end
+```
+
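+- **Frontend**: performs lexical, syntactic, and semantic analysis, parsing SysY source code into an abstract syntax tree (AST).
+- **Midend**: generates a machine-independent intermediate representation (IR) from the AST and performs in-depth analysis and optimization on it.
+- **Backend**: translates the optimized IR into assembly for the target platform (RISC-V 64).
+
+---
+
+## 2. Frontend Techniques (Frontend)
+
+The core task of the frontend is syntactic and semantic analysis and validation. Its workflow is as follows: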
+
+```mermaid
+graph TD
+    subgraph "Frontend pipeline"
+        Source["Source file (.sy)"] --> Lexer["Lexer (SysYLexer)"];
+        Lexer --> TokenStream["Token stream"];
+        TokenStream --> Parser["Parser (SysYParser)"];
+        Parser --> ParseTree["Parse tree"];
+        ParseTree --> Visitor["AST construction (SysYVisitor)"];
+        Visitor --> AST[Abstract syntax tree];
+    end
+```
+
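+- **Lexical and syntactic analysis**:
+  - **Technique**: uses the **ANTLR (ANother Tool for Language Recognition)** framework. From the context-free grammar defined in `frontend/SysY.g4`, ANTLR automatically generates an efficient lexer (`SysYLexer.cpp`) and an LL(*) parser (`SysYParser.cpp`).
+  - **Implementation**: the lexer turns the character stream into a token stream, and the parser organizes the tokens into a parse tree according to the grammar rules. This tree reflects the syntactic structure of the source code exactly.
+
+- **AST construction**:
+  - **Technique**: applies the **Visitor design pattern** to walk the parse tree produced by ANTLR. The pattern decouples the data structure (the parse tree) from the operations performed on it (the AST construction logic).
+  - **Implementation**: `frontend/SysYVisitor.cpp` defines the concrete traversal logic. During the traversal it builds an **Abstract Syntax Tree (AST)** that is more abstract than the parse tree and better suited to compilation. The AST drops purely syntactic details (such as parentheses and semicolons) and keeps only the core semantic structure. It is the interface the frontend hands to the midend.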
+
+---
+
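+## 3. Midend Techniques and Optimizations (Midend)
+
+The midend is the core of the compiler. All machine-independent analyses and optimizations are performed in this stage.
+
+### 3.1. Intermediate Representation (IR) and Design Notes
+
+- **Technique**: a three-address-code style intermediate representation whose form and design philosophy are strongly inspired by **LLVM IR**. The defining feature of the IR is its **Static Single Assignment (SSA)** form.
+- **Implementation**: `midend/IR.cpp` defines the core IR data structures, such as `Instruction`, `BasicBlock`, `Function`, and `Module`. `midend/SysYIRGenerator.cpp` converts the frontend AST into this IR. In SSA form every variable is assigned exactly once, which makes the def-use chains explicit and greatly simplifies the later optimization algorithms. The generator inherits from and overrides the SysYBaseVisitor class, walks the AST nodes to produce the custom IR, and already performs simple constant propagation and common subexpression elimination (CSE) during IR generation.
+- **Design notes**:
+  - **Centralized `alloca` instructions**:  
+  All `alloca` instructions are placed in the entry basic block, separate from the actual computation. This lets the later instruction scheduler concentrate on ordering compute-heavy instructions without interference from memory allocation instructions.
+  - **No `fallthrough`**:  
+  Every basic block is guaranteed to end with a terminator, which removes `fallthrough` between blocks and simplifies building and analyzing the control-flow graph (CFG). This improves overall compiler quality and makes the midend passes more regular and easier to write and maintain.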
+
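+### 3.2. Core Optimizations in Detail
+
+The analyses and optimizations of the compiler are organized as a series of independent passes. Each pass is a self-contained algorithmic module that performs a specific analysis or transformation on the IR. This design is highly modular and extensible.
+
+#### 3.2.1. SSA Construction and Destruction
+
+- **Mem2Reg (`Mem2Reg.cpp`)**:
+  - **Goal**: promote `load`/`store` operations on stack memory (`alloca`) to direct operations on virtual registers, and build SSA form.
+  - **Technique**: this step is the key to SSA construction. It relies on **dominator tree** analysis and uses the **dominance frontier** of the blocks that define a variable to decide where **Φ (Phi) functions** must be inserted.
+  - **Implementation**: `Mem2RegContext::run` drives the process. It first calls `isPromotableAlloca` to identify all scalar `alloca`s that are used only by `load`/`store`. Then `insertPhis` inserts `phi` instructions at the required control-flow join points based on the dominance frontiers. Finally, `renameVariables` walks the dominator tree recursively, using a simulated value stack to replace each `load` with the SSA value on top of the stack and to treat each `store` as a `push`, which completes the renaming. Because all alloca instructions are already placed in the entry block during IR generation, both the Mem2Reg pass and the dominator tree computation become much simpler.
+
+- **Reg2Mem (`Reg2Mem.cpp`)**:
+  - **Goal**: perform the inverse of `Mem2Reg`, converting the program from SSA form back to a memory-based representation. This is typically the **SSA destruction** step that prepares the code for a backend without SSA support.
+  - **Technique**: create an `alloca` stack slot at function entry for each SSA value (instruction results, function parameters). Then insert a `store` after the definition of each SSA value to save it to its slot, and a `load` before each use to read it back.
+  - **Implementation**: `Reg2MemContext::run` drives the process. `allocateMemoryForSSAValues` creates the `alloca` instructions for all SSA values that need conversion. `rewritePhis` handles `phi` instructions specially by inserting a `store` at the end of each predecessor block. `insertLoadsAndStores` handles the definitions and uses of all non-`phi` instructions, inserting the corresponding `store`s and `load`s.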
+
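+#### 3.2.2. Constant and Dead Code Optimizations
+
+- **SCCP (`SCCP.cpp`)**:
+  - **Goal**: sparse conditional constant propagation. It evaluates constant expressions at compile time and uses the knowledge that a branch condition is constant to eliminate dead code, which makes it more powerful than simple constant propagation.
+  - **Technique**: an optimization based on lattice-theoretic data-flow analysis. It maintains a value state for each variable, which is `Top` (undefined), `Constant` (a specific constant), or `Bottom` (not a constant). It also tracks the reachability of basic blocks: if a branch condition is inferred to be constant, the unreachable successor is simply ignored by the analysis.
+  - **Implementation**: `SCCPContext::run` drives the whole analysis. It maintains an instruction worklist and an edge worklist. `ProcessInstruction` and `ProcessEdge` run alternately, propagating constants and reachability through the IR until a fixed point is reached. Finally, `PropagateConstants` and `SimplifyControlFlow` substitute the inferred constants into the code and remove dead blocks.
+
+- **DCE (`DCE.cpp`)**:
+  - **Goal**: simple dead code elimination. It removes instructions whose results do not contribute to the program output.
+  - **Technique**: a **mark-and-sweep** algorithm. Starting from instructions with side effects (such as `store`, `call`, `return`), it traces their operands backward and marks every related instruction as live.
+  - **Implementation**: `DCEContext::run` implements the algorithm. In the first traversal, `isAlive` identifies the side-effecting root instructions, and `addAlive` recursively adds every instruction they depend on to the `alive_insts` set. In the second traversal, all instructions not marked live are deleted.
+  - **Future plans**: as more analysis passes are developed, they will supply DCE with more IR information, allowing the DCE pass to be iterated into a more robust version.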
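
The mark-and-sweep idea can be shown on a toy instruction list. This is a minimal Python sketch, not the code in `DCE.cpp`: the `Inst` tuple, the `SIDE_EFFECT_OPS` set, and `dead_code_elimination` are hypothetical names, and operands are referenced by result name for brevity.

```python
from collections import namedtuple

# name: result name (None if the instruction produces no value)
Inst = namedtuple("Inst", ["name", "op", "operands"])

SIDE_EFFECT_OPS = {"store", "call", "return", "br"}  # assumed root set


def dead_code_elimination(insts):
    def_index = {inst.name: i for i, inst in enumerate(insts) if inst.name}
    alive = set()
    # Mark: start from side-effecting roots and follow operands backward.
    worklist = [i for i, inst in enumerate(insts) if inst.op in SIDE_EFFECT_OPS]
    while worklist:
        i = worklist.pop()
        if i in alive:
            continue
        alive.add(i)
        for operand in insts[i].operands:
            if operand in def_index:  # arguments/constants have no defining inst
                worklist.append(def_index[operand])
    # Sweep: keep only marked instructions, in original order.
    return [inst for i, inst in enumerate(insts) if i in alive]


program = [
    Inst("%a", "add", ("%x", "%y")),
    Inst("%b", "mul", ("%a", "%a")),  # result never reaches a root: dead
    Inst("%c", "sub", ("%a", "%x")),
    Inst(None, "return", ("%c",)),
]
optimized = dead_code_elimination(program)
```

An explicit worklist is used here instead of recursion so that long dependency chains cannot exhaust the call stack; the marking result is the same.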
+
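+#### 3.2.3. Control-Flow Graph (CFG) Optimizations
+
+- **Implementation**: `SysYIRCFGOpt.cpp` defines a series of passes that clean up and simplify the control-flow graph.
+  - **`SysYDelInstAfterBrPass`**: deletes dead code that follows a branch instruction.
+  - **`SysYDelNoPreBLockPass`**: identifies and deletes all unreachable basic blocks through a graph traversal (BFS) starting from the entry block.
+  - **`SysYDelEmptyBlockPass`**: identifies and deletes empty blocks that contain only an unconditional jump, redirecting their predecessors straight to their successor.
+  - **`SysYBlockMergePass`**: if a block A has a single successor B and B has A as its single predecessor, merges A and B into one block.
+  - **`SysYCondBr2BrPass`**: if the condition of a conditional branch is a constant, converts it into an unconditional branch.
+  - **`SysYAddReturnPass`**: ensures that every function exit path without a terminator gets a `return` instruction, keeping the CFG complete.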
+
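+#### 3.2.4. Other Optimizations
+
+### 3.3. Core Analysis Passes
+
+  To collect information for the optimization passes and expose as much optimization potential as possible, we have designed and implemented the following key analysis passes:
+
+- **Dominator Tree Analysis**:
+  - **Technique**: computes the dominators of each basic block and builds a dominator tree from them. Dominator sets are computed in **Reverse Post Order (RPO)** to ensure fast convergence and correctness of the data-flow analysis. Immediate dominators (Idom) are computed with the classic **Lengauer-Tarjan (LT) algorithm**, which is known for its efficient union-find with path compression and computes the immediate dominator of every basic block in near-linear time.
+  - **Implementation**: `Dom.cpp` implements the dominator tree analysis. It assigns each basic block its immediate dominator and builds the whole dominator tree recursively. The dominator tree is the basis of many advanced optimizations, especially those on SSA form. For example, Mem2Reg relies on it to insert Phi instructions correctly and to walk the control-flow graph efficiently during variable renaming. Loop optimizations (such as loop-invariant code motion) also rely on dominator information to identify loop headers and loop bodies.
+
+- **Liveness Analysis**:
+  - **Technique**: liveness analysis determines, for a given program point, which variables hold values that will be used later. We use the **classic fixed-point iteration algorithm**: within the data-flow analysis framework, basic blocks are visited in reverse order and the `live-in` and `live-out` sets of each block are recomputed until they converge. The method is simple, easy to implement, and sufficient for most compiler optimizations.
+  - **Future plans**: if higher analysis efficiency is needed later, more efficient algorithms such as a **worklist algorithm** or a **reformulation as SSA-based graph reachability** can be adopted to improve performance on large functions or complex control flow.
+  - **Implementation**: `Liveness.cpp` provides the liveness analysis. It uses the classic data-flow analysis framework and iteratively computes the `live-in` and `live-out` sets of each basic block. Liveness information is a prerequisite for optimizations such as dead code elimination (DCE) and register allocation. Accurate liveness analysis identifies unused variables and instructions and gives the later optimization passes a solid data foundation.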
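
The fixed-point iteration for liveness can be sketched in a few lines. This is a minimal Python illustration, not the code in `Liveness.cpp`: the function name `liveness` and the dictionary-based block description (`succs`, `use`, `defs`) are assumptions made for the example. It applies the standard equations `live_out[b] = ∪ live_in[s]` over successors and `live_in[b] = use[b] ∪ (live_out[b] − defs[b])`.

```python
def liveness(blocks, succs, use, defs):
    """blocks: block names in program order; returns (live_in, live_out)."""
    live_in = {b: set() for b in blocks}
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        # Visiting blocks in reverse order speeds up this backward analysis.
        for b in reversed(blocks):
            out = set().union(*(live_in[s] for s in succs[b]))
            inn = use[b] | (out - defs[b])
            if out != live_out[b] or inn != live_in[b]:
                live_out[b], live_in[b] = out, inn
                changed = True
    return live_in, live_out


# entry: i = 0; n = 10      loop: i = i + 1; if i < n goto loop      exit: return i
blocks = ["entry", "loop", "exit"]
succs = {"entry": ["loop"], "loop": ["loop", "exit"], "exit": []}
use = {"entry": set(), "loop": {"i", "n"}, "exit": {"i"}}   # read before written
defs = {"entry": {"i", "n"}, "loop": {"i"}, "exit": set()}
live_in, live_out = liveness(blocks, succs, use, defs)
```

In this example both `i` and `n` are live across the loop back edge, while nothing is live into `entry`, which is the expected result for a function without parameters.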
+
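+### 3.4. Future Plans
+
+Building on the current results, we plan to extend the midend further. In the near term we will focus on loop-related analyses and on function inlining, with the aim of substantially improving the performance of the generated programs.
+
+- **Loop optimizations**:
+  We are developing a robust analysis pass that accurately identifies the loop structures in a program, together with a transformation pass that normalizes the identified loops to prepare for later vectorization and parallelization. Loop-invariant code motion, induction variable analysis, and strength reduction will then improve the efficiency of loop code.
+- **Function inlining**:
+  Function inlining places the body of simple functions (which may require collecting more information) at the corresponding call instruction, reducing stack-related overhead and exposing further optimization opportunities to other passes.
+- **`LLVM IR` formatting**:
+  We will design and implement a general printer for all IR constructs so that the IR can be emitted as LLVM IR that compiles and runs. With orchestration scripts and the LLVM toolchain, intermediate code can then be compiled and run without going through the backend. This gives a systematic way to validate the midend and reduces the effort of tracing bugs during backend development.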
+
+---
+
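+## 4. Backend Techniques and Optimizations (Backend)
+
+The backend converts the optimized, machine-independent IR into assembly for the 64-bit RISC-V architecture.
+
+### 4.1. Stack Frame Layout
+
+When a function is called, the backend creates a **stack frame** on the stack to hold local variables, pass arguments, and save registers. The stack frame layout used by this compiler follows the RISC-V calling convention and is structured as follows: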
+
+```
+High addr +-----------------------------+
+          |       ...                   |
+          |  Function arguments (8+)    |  <-- arguments passed by the caller that do not fit in registers
+          +-----------------------------+
+          |  Return address (ra)        |  <-- where sp points on function entry
+          +-----------------------------+
+          |  Old frame pointer (s0/fp)  |
+          +-----------------------------+  <-- where s0/fp points after the prologue
+          |  Callee-saved registers     |
+          |  (Callee-Saved Regs)        |
+          +-----------------------------+
+          |  Local variables (Alloca)   |
+          +-----------------------------+
+          |  Register spill area        |
+          |  (Spill Slots)              |
+          +-----------------------------+
+          |  Outgoing argument space    |
+          |  for callees (Out-Args)     |
+Low addr  +-----------------------------+  <-- where sp points after the prologue
+```
+
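+- **Implementation**: the passes in `PrologueEpilogueInsertion.h` and `EliminateFrameIndices.h` generate the function prologue and epilogue code that builds and tears down the stack frame above. `EliminateFrameIndices` replaces every access to an abstract stack slot (such as a local variable or a spill slot) with an access relative to the frame pointer `s0` or the stack pointer `sp` with a concrete offset.
+
+### 4.2. Instruction Selection
+
+- **Goal**: translate abstract IR instructions efficiently into concrete target instruction sequences.
+- **Technique**: a **DAG (Directed Acyclic Graph) based pattern matching** algorithm.
+- **Implementation**: `RISCv64ISel::select()` in `RISCv64ISel.cpp` drives the process. `selectBasicBlock()` calls `build_dag()` for each basic block to build a DAG of operations, then `select_recursive()` traverses and matches the DAG bottom-up. In `selectNode()`, a large `switch` statement matches the best instruction sequence for each kind of DAG node (such as `BINARY`, `LOAD`, `STORE`). For example, an IR add whose operand is a small constant is matched directly to a single `ADDIW` instruction instead of the pair `LI` and `ADDW`.
+
+### 4.3. Register Allocation
+
+- **Goal**: map the unlimited virtual registers onto the limited physical registers, and handle register shortage (spilling) gracefully.
+- **Technique**: the classic **graph coloring based global register allocation algorithm**, a powerful but complex global optimization method.
+- **Implementation**: `RISCv64RegAlloc::run()` in `RISCv64RegAlloc.cpp` is the main entry point. It runs allocation in a loop until no register needs to be spilled. The internal flow is shown in the following diagram: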
+
+```mermaid
+graph TD
+    subgraph "Register allocation main loop (RISCv64RegAlloc::run)"
+        direction LR
+        Start((Start)) --> Liveness[1. Liveness analysis LivenessAnalysis]
+        Liveness --> Build[2. Build interference graph Build]
+        Build --> Worklist[3. Create worklists MakeWorklist]
+        Worklist --> Loop{Main Loop}
+        Loop -- simplifyWorklist non-empty --> Simplify[4a. Simplify]
+        Simplify --> Loop
+        Loop -- worklistMoves non-empty --> Coalesce[4b. Coalesce]
+        Coalesce --> Loop
+        Loop -- freezeWorklist non-empty --> Freeze[4c. Freeze]
+        Freeze --> Loop
+        Loop -- spillWorklist non-empty --> Spill[4d. Select spill SelectSpill]
+        Spill --> Loop
+        Loop -- all worklists empty --> Assign[5. Assign colors AssignColors]
+        Assign --> CheckSpill{Any spills?}
+        CheckSpill -- Yes --> Rewrite[6. Rewrite program RewriteProgram]
+        Rewrite --> Liveness
+        CheckSpill -- No --> Finish((Finish))
+    end
+```
+
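+  1. **`analyzeLiveness()`**: runs data-flow analysis on the machine instructions and computes the live range of each virtual register.
+  2. **`build()`**: builds the **interference graph** from the liveness information. Two virtual registers that are live at the same time interfere and are connected by an edge in the graph.
+  3. **`makeWorklist()`**: places the graph nodes (virtual registers) into different worklists according to their degree, in preparation for coloring.
+  4. **Core coloring phase (the loop)**:
+      - **`simplify()`**: greedily removes nodes whose degree is lower than the number of physical registers and pushes them onto a stack. These nodes are guaranteed to be colorable.
+      - **`coalesce()`**: tries to merge the source and destination nodes of a move instruction (`MV`) so that the instruction can be eliminated. The merge condition is based on the **Briggs** or **George** heuristic, which prevents the graph from becoming uncolorable.
+      - **`freeze()`**: when a move-related node can be neither coalesced nor simplified, gives up on coalescing that move and "freezes" the node into an ordinary one.
+      - **`selectSpill()`**: when none of the above applies to any node (only high-degree nodes remain), a node must be selected to **spill**, meaning it will live in memory.
+  5. **`assignColors()`**: after all nodes have been processed, pops nodes from the stack one by one and picks an available physical register for each, based on the colors of its already-colored neighbors.
+  6. **`rewriteProgram()`**: called when `assignColors()` finds nodes marked for spilling. It rewrites the machine instructions, inserting loads from memory (`lw`/`ld`) and stores to memory (`sw`/`sd`) for the spilled virtual registers. The whole allocation then restarts from step 1.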
+
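+### 4.4. Backend-Specific Optimizations
+
+Before and after register allocation, the backend also performs a series of optimizations tailored to the target machine (RISC-V).
+
+#### 4.4.1. Instruction Scheduling
+
+- **Pre-register-allocation scheduling (`PreRA_Scheduler.cpp`)**:
+  - **Goal**: improve performance by reordering instructions before register allocation. The main aim is to **hide load latency**, that is, to issue `load` instructions as early as possible so that their results are ready when needed and pipeline stalls are avoided. Because unlimited virtual registers are still in use at this point, the scheduler has a lot of freedom, but excessive reordering can lengthen the live ranges of virtual registers and increase register pressure.
+  - **Implementation**: `scheduleBlock()` identifies the scheduling boundaries inside a basic block (such as `call` or terminator instructions) and calls `scheduleRegion()` on each independent region. The current implementation is a simplified list scheduler that prefers moving load instructions (`LW`, `LD`, and so on) as far forward as data dependences allow.
+
+- **Post-register-allocation scheduling (`PostRA_Scheduler.cpp`)**:
+  - **Goal**: perform a final round of fine tuning on the instruction sequence after register allocation. Unlike the pre-allocation scheduler, this stage targets performance problems introduced by register allocation itself, for example:
+    - **Reducing spill cost**: move `load` instructions created by spilling (loads from the stack) as early as possible, away from their use; move `store` instructions (stores to the stack) as late as possible, away from their definition.
+    - **Removing false dependences**: the register allocator may assign the same physical register to two unrelated virtual registers, which introduces false write-after-read (WAR) or write-after-write (WAW) dependences. Post-RA scheduling can try to break these false dependences and give instruction reordering more freedom.
+  - **Implementation**: `scheduleBlock()` implements this scheduler. It uses a very conservative **local swapping** strategy: it repeatedly examines two adjacent instructions and swaps them only after `canSwapInstructions()` confirms that the swap violates no data dependence (RAW, WAR, WAW) or memory dependence. This is weaker than global list scheduling, but it is a safe and effective optimization under the strict Post-RA constraints.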
+
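+#### 4.4.2. Strength Reduction
+
+- **Division strength reduction (`DivStrengthReduction.cpp`)**:
+  - **Goal**: replace expensive `DIV` or `DIVW` machine instructions (when the divisor is a compile-time constant) with a sequence of faster, cheaper instructions.
+  - **Technique**: based on the number-theoretic idea of a **multiplicative inverse**. For an integer division `x / d`, a "magic number" `m` and a shift amount `s` can be found such that the division can be replaced by approximately `(x * m) >> s`. The procedure has to handle sign, rounding, and overflow issues carefully.
+  - **Implementation**: `runOnMachineFunction()` implements the optimization. It walks the machine instructions looking for `DIV`/`DIVW` instructions with a constant divisor. `computeMagic()` computes the corresponding magic number and shift amount. Depending on whether the divisor is a power of two, 1, -1, or some other number, different instruction sequences are generated, using `MULH` (high half of the product), `SRAI` (arithmetic right shift), `ADD`, `SUB`, and others, to reproduce the division result exactly.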
+
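+#### 4.4.3. Peephole Optimization
+
+- **Goal**: before the final assembly is emitted, apply local optimizations to adjacent machine instructions to remove redundant operations and exploit target features.
+- **Technique**: peephole optimization is a simple and effective local technique. It scans the instruction sequence through a fixed-size "peephole" (usually 2 to 3 instructions) and looks for patterns that can be replaced by a better sequence.
+- **Implementation**: `PeepholeOptimizer::runOnMachineFunction()` implements this pass. It contains a set of pattern matching and replacement rules, mainly:
+  - **Redundant move elimination**: for `mv x, y` followed by an instruction `op z, x, ...` that uses `x`, if `x` is not live afterwards, the operand of `op` is replaced directly with `y` and the `mv` is removed.
+  - **Redundant load elimination**: `sw r1, mem; lw r2, mem` -> `sw r1, mem; mv r2, r1`. If `r1` and `r2` are the same register, the `lw` is removed outright.
+  - **Address computation optimization**: `addi t1, base, imm1; lw t2, imm2(t1)` -> `lw t2, (imm1+imm2)(base)`. Two instructions are merged into one, reducing the instruction count and the use of an intermediate register.
+  - **Instruction merging**: `addi t1, t0, imm1; addi t2, t1, imm2` -> `addi t2, t0, (imm1+imm2)`. Consecutive immediate additions are merged.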
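
The instruction-merging rule can be sketched as a two-instruction window over a list. This is a minimal Python illustration, not `PeepholeOptimizer`: the tuple encoding `("addi", rd, rs, imm)`, the function `merge_addi_pairs`, and the `dead_after_pair` set (standing in for real liveness information) are all assumptions. The sketch also adds two safety conditions that the one-line rule leaves implicit: the intermediate register must not be needed afterwards, and the combined immediate must still fit the 12-bit signed `addi` field.

```python
def fits_imm12(value):
    # addi takes a 12-bit signed immediate
    return -2048 <= value <= 2047


def merge_addi_pairs(insts, dead_after_pair=frozenset()):
    """insts: list of tuples like ("addi", rd, rs, imm); returns a new list."""
    out = []
    i = 0
    while i < len(insts):
        cur = insts[i]
        nxt = insts[i + 1] if i + 1 < len(insts) else None
        if (nxt is not None
                and cur[0] == "addi" and nxt[0] == "addi"
                and nxt[2] == cur[1]                      # second reads first's result
                and (nxt[1] == cur[1]                     # intermediate is overwritten
                     or cur[1] in dead_after_pair)        # or known dead afterwards
                and fits_imm12(cur[3] + nxt[3])):
            out.append(("addi", nxt[1], cur[2], cur[3] + nxt[3]))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out


merged = merge_addi_pairs([("addi", "t1", "t0", 4), ("addi", "t1", "t1", 8)])
kept = merge_addi_pairs([("addi", "t1", "t0", 4), ("addi", "t2", "t1", 8)])
```

In `kept`, nothing is merged because `t1` might still be read later; passing `dead_after_pair={"t1"}` would allow the merge. A real pass would query per-point liveness instead of a single global set.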
+
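+### 4.5. Limitations and Future Work
+
+According to the `TODO` list in the project and an analysis of the source code, the current implementation can be improved in several places:
+
+- **Register allocation**:
+  - **Handling of `CALL` instructions**: the `use`/`def` analysis of `CALL` instructions is incomplete. Not all caller-saved registers are marked as `def`, so values that live across a function call may be clobbered incorrectly.
+  - **Spill handling**: all spilled virtual registers are currently mapped to the same physical register `t6`, which introduces many unnecessary `load`/`store` instructions and may turn `t6` into a performance bottleneck.
+- **IR design**:
+  - With the introduction of SSA, some redundant information in the IR (such as the `args` parameters of basic blocks) can be removed to simplify the design.
+- **Optimizations**:
+  - The current optimizations focus mainly on scalars. More loop-oriented optimizations (such as loop-invariant code motion LICM and induction variable analysis IndVar) and interprocedural optimizations can be added to improve performance further.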
--- a/lib/libsysy_riscv.a
+++ b/lib/libsysy_riscv.a
--- a/script/runit-riscv64.sh
+++ b/script/runit-riscv64.sh
@ -60,11 +60,7 @@ display_file_content() {
 # Function that cleans up temporary files
 clean_tmp() {
    echo "Cleaning temporary directory: ${TMP_DIR}"
-    rm -rf "${TMP_DIR}"/*.s \
-           "${TMP_DIR}"/*_sysyc_riscv64 \
-           "${TMP_DIR}"/*_sysyc_riscv64.actual_out \
-           "${TMP_DIR}"/*_sysyc_riscv64.expected_stdout \
-           "${TMP_DIR}"/*_sysyc_riscv64.o
+    rm -rf "${TMP_DIR}"/*
    echo "Cleanup complete."
 }

--- a/script/runit-single.sh
+++ b/script/runit-single.sh
@ -2,117 +2,159 @@

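 # runit-single.sh - script for compiling and testing one or a few SysY programs
 # Mirrors the functionality of runit.sh, but takes concrete file paths as input.
+# This script should live in mysysy/script/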
+
+export ASAN_OPTIONS=detect_leaks=0

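 # --- Configuration ---
-# Adjust these paths to match your environment
-# This script is assumed to live in your project root or in a scripts directory
 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" &>/dev/null && pwd)"
-# By default, look for build and lib under the project root
 BUILD_BIN_DIR="${SCRIPT_DIR}/../build/bin"
 LIB_DIR="${SCRIPT_DIR}/../lib"
-# Temporary files are stored in the tmp subdirectory next to this script
 TMP_DIR="${SCRIPT_DIR}/tmp"

 # Compiler and emulator definitions
 SYSYC="${BUILD_BIN_DIR}/sysyc"
+LLC_CMD="llc-19" # new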
 GCC_RISCV64="riscv64-linux-gnu-gcc"
 QEMU_RISCV64="qemu-riscv64"

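 # --- Initialize variables ---
 EXECUTE_MODE=false
+IR_EXECUTE_MODE=false
 CLEAN_MODE=false
-SYSYC_TIMEOUT=10      # sysyc compile timeout (seconds)
-GCC_TIMEOUT=10        # gcc compile timeout (seconds)
-EXEC_TIMEOUT=5        # qemu automated execution timeout (seconds)
-MAX_OUTPUT_LINES=50   # maximum number of lines shown when the comparison fails
-SY_FILES=()           # list of .sy files provided by the user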
+OPTIMIZE_FLAG=""
+SYSYC_TIMEOUT=30
+LLC_TIMEOUT=10
+GCC_TIMEOUT=10
+EXEC_TIMEOUT=30
+MAX_OUTPUT_LINES=20
+MAX_OUTPUT_CHARS=1000
+SY_FILES=()
 PASSED_CASES=0
 FAILED_CASES_LIST=""
+INTERRUPTED=false

+# =================================================================
 # --- Function definitions ---
+# =================================================================
 show_help() {
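    echo "Usage: $0 [file1.sy] [file2.sy] ... [options]"
-    echo "Compile and test the given .sy files."
-    echo ""
-    echo "If matching .in/.out files are found, an automated test is run. Otherwise, interactive mode is entered."
+    echo "Compile and test the given .sy files. One of -e or -eir is required."
    echo ""
    echo "Options:"
-    echo "  -e, --executable         Compile to an executable and run the tests (required)."
+    echo "  -e                       Run tests through assembly (sysyc -> gcc -> qemu)."
+    echo "  -eir                     Run tests through IR (sysyc -> llc -> gcc -> qemu)."
    echo "  -c, --clean              Remove all files in the tmp directory."
-    echo "  -sct N                   Set the sysyc compile timeout to N seconds (default: 10)."
+    echo "  -O1                      Enable the -O1 optimization of sysyc."
+    echo "  -sct N                   Set the sysyc compile timeout to N seconds (default: 30)."
+    echo "  -lct N                   Set the llc-19 compile timeout to N seconds (default: 10)."
    echo "  -gct N                   Set the gcc cross-compile timeout to N seconds (default: 10)."
-    echo "  -et N                    Set the qemu automated execution timeout to N seconds (default: 5)."
-    echo "  -ml N, --max-lines N     Show at most N lines when the output comparison fails (default: 50)."
+    echo "  -et N                    Set the qemu automated execution timeout to N seconds (default: 30)."
+    echo "  -ml N, --max-lines N     Show at most N lines when the output comparison fails (default: 20)."
+    echo "  -mc N, --max-chars N     Show at most N characters when the output comparison fails (default: 1000)."
    echo "  -h, --help               Show this help message and exit."
+    echo ""
+    echo "Press Ctrl+C at any time to interrupt the tests and show a summary of the cases completed so far."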
 }

-# --- New feature: display file content, truncated by line count ---
+# Function that displays file content, truncated by line count and character count
 display_file_content() {
    local file_path="$1"
    local title="$2"
    local max_lines="$3"
-
-    if [ ! -f "$file_path" ]; then
-        return
-    fi
-
+    local max_chars="$4" # new parameter
+    if [ ! -f "$file_path" ]; then return; fi
    echo -e "$title"
    local line_count
+    local char_count
    line_count=$(wc -l < "$file_path")
-    
+    char_count=$(wc -c < "$file_path")
+
    if [ "$line_count" -gt "$max_lines" ]; then
        head -n "$max_lines" "$file_path"
-        echo -e "\e[33m[... output truncated, ${line_count} lines in total ...]\e[0m"
+        echo -e "\e[33m[... output truncated: too many lines (${line_count} in total) ...]\e[0m"
+    elif [ "$char_count" -gt "$max_chars" ]; then
+        head -c "$max_chars" "$file_path"
+        echo -e "\n\e[33m[... output truncated: too many characters (${char_count} in total) ...]\e[0m"
    else
        cat "$file_path"
    fi
 }

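-# --- Changed in this revision: the whole argument parsing logic was rewritten ---
-# Use a standard while loop to robustly handle arguments in any order
+# --- New: summary report function ---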
+print_summary() {
+    local total_cases=${#SY_FILES[@]}
+    echo ""
+    echo "======================================================================"
+    if [ "$INTERRUPTED" = true ]; then
+        echo -e "\e[33mTesting was interrupted. Summarizing the results completed so far...\e[0m"
+    else
+        echo "All tests finished"
+    fi
+
+    local failed_count
+    if [ -n "$FAILED_CASES_LIST" ]; then
+        failed_count=$(echo -e -n "${FAILED_CASES_LIST}" | wc -l)
+    else
+        failed_count=0
+    fi
+    local executed_count=$((PASSED_CASES + failed_count))
+
+    echo "Test results: [passed: ${PASSED_CASES}, failed: ${failed_count}, executed: ${executed_count}/${total_cases}]"
+
+    if [ -n "$FAILED_CASES_LIST" ]; then
+        echo ""
+        echo -e "\e[31mFailed test cases:\e[0m"
+        printf "%b" "${FAILED_CASES_LIST}"
+    fi
+    echo "======================================================================"
+
+    if [ "$failed_count" -gt 0 ]; then
+        exit 1
+    else
+        exit 0
+    fi
+}
+
+# --- New: SIGINT signal handler ---
+handle_sigint() {
+    INTERRUPTED=true
+    print_summary
+}
+
+# =================================================================
+# --- Main logic starts here ---
+# =================================================================
+
+# --- New: set a trap to catch SIGINT ---
+trap handle_sigint SIGINT
+
+# --- Argument parsing ---
 while [[ "$#" -gt 0 ]]; do
    case "$1" in
-        -e|--executable)
-            EXECUTE_MODE=true
-            shift # consume the option
-            ;;
-        -c|--clean)
-            CLEAN_MODE=true
-            shift # consume the option
-            ;;
-        -sct)
-            if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then SYSYC_TIMEOUT="$2"; shift 2; else echo "Error: -sct requires a positive integer argument." >&2; exit 1; fi
-            ;;
-        -gct)
-            if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then GCC_TIMEOUT="$2"; shift 2; else echo "Error: -gct requires a positive integer argument." >&2; exit 1; fi
-            ;;
-        -et)
-            if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then EXEC_TIMEOUT="$2"; shift 2; else echo "Error: -et requires a positive integer argument." >&2; exit 1; fi
-            ;;
-        -ml|--max-lines)
-            if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then MAX_OUTPUT_LINES="$2"; shift 2; else echo "Error: --max-lines requires a positive integer argument." >&2; exit 1; fi
-            ;;
-        -h|--help)
-            show_help
-            exit 0
-            ;;
-        -*) # unknown option
-            echo "Unknown option: $1"
-            show_help
-            exit 1
-            ;;
-        *) # any other argument is treated as a file path
+        -e|--executable) EXECUTE_MODE=true; shift ;;
+        -eir) IR_EXECUTE_MODE=true; shift ;; # new
+        -c|--clean) CLEAN_MODE=true; shift ;;
+        -O1) OPTIMIZE_FLAG="-O1"; shift ;;
+        -lct) if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then LLC_TIMEOUT="$2"; shift 2; else echo "Error: -lct requires a positive integer argument." >&2; exit 1; fi ;; # new
+        -sct) if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then SYSYC_TIMEOUT="$2"; shift 2; else echo "Error: -sct requires a positive integer argument." >&2; exit 1; fi ;;
+        -gct) if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then GCC_TIMEOUT="$2"; shift 2; else echo "Error: -gct requires a positive integer argument." >&2; exit 1; fi ;;
+        -et) if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then EXEC_TIMEOUT="$2"; shift 2; else echo "Error: -et requires a positive integer argument." >&2; exit 1; fi ;;
+        -ml|--max-lines) if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then MAX_OUTPUT_LINES="$2"; shift 2; else echo "Error: --max-lines requires a positive integer argument." >&2; exit 1; fi ;;
+        -mc|--max-chars) if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then MAX_OUTPUT_CHARS="$2"; shift 2; else echo "Error: --max-chars requires a positive integer argument." >&2; exit 1; fi ;;
+        -h|--help) show_help; exit 0 ;;
+        -*) echo "Unknown option: $1"; show_help; exit 1 ;;
+        *)
            if [[ -f "$1" && "$1" == *.sy ]]; then
                SY_FILES+=("$1")
            else
                echo "Warning: not a valid file or not a .sy file; ignored: $1"
            fi
-            shift # consume the file argument
+            shift
            ;;
    esac
 done

-
 if ${CLEAN_MODE}; then
  echo "Detected -c/--clean; clearing ${TMP_DIR}..."
  if [ -d "${TMP_DIR}" ]; then
@ -121,19 +163,22 @@ if ${CLEAN_MODE}; then
  else
    echo "Temporary directory ${TMP_DIR} does not exist; nothing to clean."
  fi
-  
-  if [ ${#SY_FILES[@]} -eq 0 ] && ! ${EXECUTE_MODE}; then
+  if [ ${#SY_FILES[@]} -eq 0 ] && ! ${EXECUTE_MODE} && ! ${IR_EXECUTE_MODE}; then
    exit 0
  fi
 fi

-# --- Main logic ---
-if ! ${EXECUTE_MODE}; then
-    echo "Error: pass -e or --executable to run the tests."
+if ! ${EXECUTE_MODE} && ! ${IR_EXECUTE_MODE}; then
+    echo "Error: pass -e or -eir to run the tests."
    show_help
    exit 1
 fi

+if ${EXECUTE_MODE} && ${IR_EXECUTE_MODE}; then
+    echo -e "\e[31mError: -e and -eir cannot be used together.\e[0m" >&2
+    exit 1
+fi
+
 if [ ${#SY_FILES[@]} -eq 0 ]; then
    echo "Error: no .sy files were given as input."
    show_help
@ -144,18 +189,20 @@ mkdir -p "${TMP_DIR}"
 TOTAL_CASES=${#SY_FILES[@]}

 echo "SysY single-case test runner starting..."
-echo "Timeouts: sysyc=${SYSYC_TIMEOUT}s, gcc=${GCC_TIMEOUT}s, qemu=${EXEC_TIMEOUT}s"
+if [ -n "$OPTIMIZE_FLAG" ]; then echo "Optimization level: ${OPTIMIZE_FLAG}"; fi
+echo "Timeouts: sysyc=${SYSYC_TIMEOUT}s, llc=${LLC_TIMEOUT}s, gcc=${GCC_TIMEOUT}s, qemu=${EXEC_TIMEOUT}s"
 echo "Max lines shown on failure: ${MAX_OUTPUT_LINES}"
+echo "Max characters shown on failure: ${MAX_OUTPUT_CHARS}"
 echo ""

 for sy_file in "${SY_FILES[@]}"; do
    is_passed=1
+    compilation_ok=1
    base_name=$(basename "${sy_file}" .sy)
    source_dir=$(dirname "${sy_file}")

-    ir_file="${TMP_DIR}/${base_name}_sysyc_riscv64.ll"
+    ir_file="${TMP_DIR}/${base_name}.ll"
    assembly_file="${TMP_DIR}/${base_name}.s"
-    assembly_debug_file="${TMP_DIR}/${base_name}_d.s"
    executable_file="${TMP_DIR}/${base_name}"
    input_file="${source_dir}/${base_name}.in"
    output_reference_file="${source_dir}/${base_name}.out"
@ -164,37 +211,39 @@ for sy_file in "${SY_FILES[@]}"; do
    echo "======================================================================"
    echo "Processing: ${sy_file}"

-    # Step 1: compile with sysyc
-    echo "  Compiling with sysyc (timeout ${SYSYC_TIMEOUT}s)..."
-    timeout -s KILL ${SYSYC_TIMEOUT} "${SYSYC}" -s ir "${sy_file}" > "${ir_file}"
-    SYSYC_STATUS=$?
-    if [ $SYSYC_STATUS -eq 124 ]; then
-        echo -e "\e[31mError: sysyc timed out generating IR for ${sy_file}\e[0m"
-        is_passed=0
-    elif [ $SYSYC_STATUS -ne 0 ]; then
-        echo -e "\e[31mError: sysyc failed to generate IR for ${sy_file}, exit code: ${SYSYC_STATUS}\e[0m"
-        is_passed=0
-    fi
-    timeout -s KILL ${SYSYC_TIMEOUT} "${SYSYC}" -S "${sy_file}" -o "${assembly_file}"
-    if [ $? -ne 0 ]; then
-        echo -e "\e[31mError: sysyc compilation failed or timed out.\e[0m"
-        is_passed=0
-    fi
-    # timeout -s KILL ${SYSYC_TIMEOUT} "${SYSYC}" -s asmd "${sy_file}" > "${assembly_debug_file}" 2>&1
+    # --- Compilation stage ---
+    if ${IR_EXECUTE_MODE}; then
+        # Path 1: sysyc -> llc -> gcc
+        echo "  [1/3] Compiling to IR with sysyc (timeout ${SYSYC_TIMEOUT}s)..."
+        timeout -s KILL ${SYSYC_TIMEOUT} "${SYSYC}" -s ir "${sy_file}" ${OPTIMIZE_FLAG} -o "${ir_file}"
+        if [ $? -ne 0 ]; then echo -e "\e[31mError: sysyc (IR) compilation failed or timed out.\e[0m"; compilation_ok=0; fi

-    # Step 2: compile with GCC
-    if [ "$is_passed" -eq 1 ]; then
-        echo "  Compiling with gcc (timeout ${GCC_TIMEOUT}s)..."
-        timeout -s KILL ${GCC_TIMEOUT} "${GCC_RISCV64}" "${assembly_file}" -o "${executable_file}" -L"${LIB_DIR}" -lsysy_riscv -static
-        if [ $? -ne 0 ]; then
-            echo -e "\e[31mError: GCC compilation failed or timed out.\e[0m"
-            is_passed=0
+        if [ "$compilation_ok" -eq 1 ]; then
+            echo "  [2/3] Compiling to assembly with llc (timeout ${LLC_TIMEOUT}s)..."
+            timeout -s KILL ${LLC_TIMEOUT} "${LLC_CMD}" -march=riscv64 -mcpu=generic-rv64 -mattr=+m,+a,+f,+d,+c -filetype=asm "${ir_file}" -o "${assembly_file}"
+            if [ $? -ne 0 ]; then echo -e "\e[31mError: llc compilation failed or timed out.\e[0m"; compilation_ok=0; fi
+        fi
+
+        if [ "$compilation_ok" -eq 1 ]; then
+            echo "  [3/3] Compiling with gcc (timeout ${GCC_TIMEOUT}s)..."
+            timeout -s KILL ${GCC_TIMEOUT} "${GCC_RISCV64}" "${assembly_file}" -o "${executable_file}" -L"${LIB_DIR}" -lsysy_riscv -static
+            if [ $? -ne 0 ]; then echo -e "\e[31mError: GCC compilation failed or timed out.\e[0m"; compilation_ok=0; fi
+        fi
+    else # EXECUTE_MODE
+        # Path 2: sysyc -> gcc
+        echo "  [1/2] Compiling to assembly with sysyc (timeout ${SYSYC_TIMEOUT}s)..."
+        timeout -s KILL ${SYSYC_TIMEOUT} "${SYSYC}" -S "${sy_file}" ${OPTIMIZE_FLAG} -o "${assembly_file}"
+        if [ $? -ne 0 ]; then echo -e "\e[31mError: sysyc (assembly) compilation failed or timed out.\e[0m"; compilation_ok=0; fi
+
+        if [ "$compilation_ok" -eq 1 ]; then
+            echo "  [2/2] Compiling with gcc (timeout ${GCC_TIMEOUT}s)..."
+            timeout -s KILL ${GCC_TIMEOUT} "${GCC_RISCV64}" "${assembly_file}" -o "${executable_file}" -L"${LIB_DIR}" -lsysy_riscv -static
+            if [ $? -ne 0 ]; then echo -e "\e[31mError: GCC compilation failed or timed out.\e[0m"; compilation_ok=0; fi
        fi
    fi

-    # Step 3: run and test
-    if [ "$is_passed" -eq 1 ]; then
-        # Check whether this is automated-test mode or interactive mode
+    # --- Run and test stage (shared logic) ---
+    if [ "$compilation_ok" -eq 1 ]; then
        if [ -f "${input_file}" ] || [ -f "${output_reference_file}" ]; then
            # --- Automated test mode ---
            echo "  Found .in/.out files; entering automated test mode..."
@ -217,24 +266,26 @@ for sy_file in "${SY_FILES[@]}"; do
                        EXPECTED_RETURN_CODE="$LAST_LINE_TRIMMED"
                        EXPECTED_STDOUT_FILE="${TMP_DIR}/${base_name}.expected_stdout"
                        head -n -1 "${output_reference_file}" > "${EXPECTED_STDOUT_FILE}"
-                        if [ "$ACTUAL_RETURN_CODE" -ne "$EXPECTED_RETURN_CODE" ]; then echo -e "\e[31m  Return code check failed: expected ${EXPECTED_RETURN_CODE}, got ${ACTUAL_RETURN_CODE}\e[0m"; is_passed=0; fi
                        
+                        ret_ok=1
+                        if [ "$ACTUAL_RETURN_CODE" -ne "$EXPECTED_RETURN_CODE" ]; then echo -e "\e[31m  Return code check failed: expected ${EXPECTED_RETURN_CODE}, got ${ACTUAL_RETURN_CODE}\e[0m"; ret_ok=0; fi
+                        
+                        out_ok=1
                        if ! diff -q <(tr -d '[:space:]' < "${output_actual_file}") <(tr -d '[:space:]' < "${EXPECTED_STDOUT_FILE}") >/dev/null 2>&1; then
-                            echo -e "\e[31m  Stdout check failed.\e[0m"
-                            is_passed=0
-                            display_file_content "${EXPECTED_STDOUT_FILE}" "    \e[36m--- Expected output ---\e[0m" "${MAX_OUTPUT_LINES}"
-                            display_file_content "${output_actual_file}" "    \e[36m--- Actual output ---\e[0m" "${MAX_OUTPUT_LINES}"
-                            echo -e "    \e[36m----------------\e[0m"
+                            echo -e "\e[31m  Stdout check failed.\e[0m"; out_ok=0
+                            display_file_content "${EXPECTED_STDOUT_FILE}" "    \e[36m--- Expected output ---\e[0m" "${MAX_OUTPUT_LINES}" "${MAX_OUTPUT_CHARS}"
+                            display_file_content "${output_actual_file}" "    \e[36m--- Actual output ---\e[0m" "${MAX_OUTPUT_LINES}" "${MAX_OUTPUT_CHARS}"
                         fi
+
+                        if [ "$ret_ok" -eq 1 ] && [ "$out_ok" -eq 1 ]; then echo -e "\e[32m  Return code and stdout checks passed.\e[0m"; else is_passed=0; fi
+
                    else
                        if diff -q <(tr -d '[:space:]' < "${output_actual_file}") <(tr -d '[:space:]' < "${output_reference_file}") >/dev/null 2>&1; then
                            echo -e "\e[32m  Stdout check passed.\e[0m"
                         else
-                            echo -e "\e[31m  Stdout check failed.\e[0m"
-                            is_passed=0
-                            display_file_content "${output_reference_file}" "    \e[36m--- Expected output ---\e[0m" "${MAX_OUTPUT_LINES}"
-                            display_file_content "${output_actual_file}" "    \e[36m--- Actual output ---\e[0m" "${MAX_OUTPUT_LINES}"
-                            echo -e "    \e[36m----------------\e[0m"
+                            echo -e "\e[31m  Stdout check failed.\e[0m"; is_passed=0
+                            display_file_content "${output_reference_file}" "    \e[36m--- Expected output ---\e[0m" "${MAX_OUTPUT_LINES}" "${MAX_OUTPUT_CHARS}"
+                            display_file_content "${output_actual_file}" "    \e[36m--- Actual output ---\e[0m" "${MAX_OUTPUT_LINES}" "${MAX_OUTPUT_CHARS}"
                        fi
                    fi
                else
@ -243,20 +294,16 @@ for sy_file in "${SY_FILES[@]}"; do
            fi
        else
            # --- Interactive mode ---
-            echo -e "\e[33m"
-            echo "  **********************************************************"
-            echo "  ** No .in or .out file found; entering interactive mode. **"
-            echo "  ** The program is about to run; type input directly.    **"
-            echo "  ** Press Ctrl+D (EOF) or otherwise end it to continue.  **"
-            echo "  **********************************************************"
-            echo -e "\e[0m"
+            echo -e "\e[33m\n  No .in or .out file found; entering interactive mode...\e[0m"
             "${QEMU_RISCV64}" "${executable_file}"
             INTERACTIVE_RET_CODE=$?
-            echo -e "\e[33m\n  Interactive run finished, program return code: ${INTERACTIVE_RET_CODE}\e[0m"
-            echo "  Note: results in interactive mode are not verified."
+            echo -e "\e[33m\n  Interactive run finished, program return code: ${INTERACTIVE_RET_CODE} (result not verified)\e[0m"
        fi
+    else
+      is_passed=0
    fi

+    # --- Status summary ---
     if [ "$is_passed" -eq 1 ]; then
         echo -e "\e[32mStatus: passed\e[0m"
        ((PASSED_CASES++))
@ -267,20 +314,4 @@ for sy_file in "${SY_FILES[@]}"; do
 done

 # --- Print the final summary ---
-echo "======================================================================"
-echo "All tests finished"
-echo "Pass rate: [${PASSED_CASES}/${TOTAL_CASES}]"
-
-if [ -n "$FAILED_CASES_LIST" ]; then
-    echo ""
-    echo -e "\e[31mFailed test cases:\e[0m"
-    echo -e "${FAILED_CASES_LIST}"
-fi
-
-echo "======================================================================"
-
-if [ "$PASSED_CASES" -eq "$TOTAL_CASES" ]; then
-    exit 0
-else
-    exit 1
-fi
+print_summary
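The summary function introduced above derives the failure count from `FAILED_CASES_LIST`, a string to which each failing case is appended with a literal `\n` terminator. The following is a minimal sketch of that counting idea only, not the patch's own code: `count_failed` and the example paths are hypothetical names chosen for illustration.

```shell
# Hypothetical helper (not part of the patch): count the entries of a
# list in which every item is terminated by a literal "\n" escape.
count_failed() {
    if [ -z "$1" ]; then
        echo 0
    else
        # %b expands the "\n" escapes, so wc -l sees one line per entry;
        # the arithmetic expansion strips any padding wc may add.
        echo "$(( $(printf '%b' "$1" | wc -l) ))"
    fi
}

# Example paths are made up for illustration.
FAILED_CASES_LIST=""
FAILED_CASES_LIST="${FAILED_CASES_LIST}functional/01_foo.sy\n"
FAILED_CASES_LIST="${FAILED_CASES_LIST}performance/02_bar.sy\n"

count_failed "$FAILED_CASES_LIST"   # prints 2
```

Because the count relies on every entry ending in `\n`, an entry appended without the terminator would be undercounted; keeping a bash array and using `${#array[@]}` would likely be a sturdier alternative.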
--- a/script/runit.sh
+++ b/script/runit.sh
@ -1,31 +1,44 @@
 #!/bin/bash

 # runit.sh - script for compiling and testing SysY programs
-# This script should live in mysysy/test_script/
+# This script should live in mysysy/script/
+
+export ASAN_OPTIONS=detect_leaks=0

 # Directories, defined relative to the script location
 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" &>/dev/null && pwd)"
 TESTDATA_DIR="${SCRIPT_DIR}/../testdata"
 BUILD_BIN_DIR="${SCRIPT_DIR}/../build/bin"
 LIB_DIR="${SCRIPT_DIR}/../lib"
-# TMP_DIR="${SCRIPT_DIR}/tmp"
 TMP_DIR="${SCRIPT_DIR}/tmp"

 # Compilers and emulator
 SYSYC="${BUILD_BIN_DIR}/sysyc"
+LLC_CMD="llc-19"
 GCC_RISCV64="riscv64-linux-gnu-gcc"
 QEMU_RISCV64="qemu-riscv64"

-# --- New feature: initialize variables ---
+# --- State variables ---
 EXECUTE_MODE=false
-SYSYC_TIMEOUT=10      # sysyc compile timeout (seconds)
-GCC_TIMEOUT=10        # gcc compile timeout (seconds)
-EXEC_TIMEOUT=5        # qemu execution timeout (seconds)
-MAX_OUTPUT_LINES=50   # max lines shown when a comparison fails
-TEST_SETS=()          # test sets to run
+IR_EXECUTE_MODE=false
+OPTIMIZE_FLAG=""
+SYSYC_TIMEOUT=30
+LLC_TIMEOUT=10
+GCC_TIMEOUT=10
+EXEC_TIMEOUT=30
+MAX_OUTPUT_LINES=20
+MAX_OUTPUT_CHARS=1000
+TEST_SETS=()
+PERF_RUN_COUNT=1 # new: number of runs per performance test case
 TOTAL_CASES=0
 PASSED_CASES=0
-FAILED_CASES_LIST=""  # list of failed test cases
+FAILED_CASES_LIST=""
+INTERRUPTED=false
+PERFORMANCE_MODE=false # new: whether performance testing is enabled
+
+# =================================================================
+# --- Function definitions ---
+# =================================================================

 # Function that prints the help text
 show_help() {
@ -33,33 +46,45 @@ show_help() {
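    echo "This script compiles and tests .sy files in ascending order of their numeric filename prefix."
     echo ""
     echo "Options:"
-    echo "  -e, --executable         Compile to an executable and run the tests."
+    echo "  -e, --executable         Compile to assembly and run the tests (sysyc -> gcc -> qemu)."
+    echo "  -eir                     Compile to an executable via IR and run the tests (sysyc -> llc -> gcc -> qemu)."
     echo "  -c, --clean              Remove all generated files under the 'tmp' directory."
+    echo "  -O1                      Enable sysyc's -O1 optimizations."
     echo "  -set [f|h|p|all]...    Select the test sets to run (functional, h_functional, performance). Several may be given; default is all."
-    echo "  -sct N                   Set the sysyc compile timeout to N seconds (default: 10)."
+    echo "                           When 'p' is included, performance data is recorded automatically in ${TMP_DIR}/performance_time.csv."
+    echo "  -pt N                    Run each case in the performance test set N times and average the results (default: 1)."
+    echo "  -sct N                   Set the sysyc compile timeout to N seconds (default: 30)."
+    echo "  -lct N                   Set the llc-19 compile timeout to N seconds (default: 10)."
     echo "  -gct N                   Set the gcc cross-compile timeout to N seconds (default: 10)."
-    echo "  -et N                    Set the qemu execution timeout to N seconds (default: 5)."
-    echo "  -ml N, --max-lines N     Show at most N lines when an output comparison fails (default: 50)."
+    echo "  -et N                    Set the qemu execution timeout to N seconds (default: 30)."
+    echo "  -ml N, --max-lines N     Show at most N lines when an output comparison fails (default: 20)."
+    echo "  -mc N, --max-chars N     Show at most N characters when an output comparison fails (default: 1000)."
     echo "  -h, --help               Show this help text and exit."
+    echo ""
+    echo "Note: by default (without -e or -eir), .sy files are compiled to both .s (assembly) and .ll (IR) and are not executed."
+    echo "      Press Ctrl+C at any time to interrupt the run and print a summary of the cases completed so far."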
 }

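-# Function that prints a file's contents, truncated by line count
+
+# Function that prints a file's contents, truncated by line count and character count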
 display_file_content() {
    local file_path="$1"
    local title="$2"
    local max_lines="$3"
-
-    if [ ! -f "$file_path" ]; then
-        return
-    fi
-
+    local max_chars="$4" # new parameter
+    if [ ! -f "$file_path" ]; then return; fi
    echo -e "$title"
    local line_count
+    local char_count
    line_count=$(wc -l < "$file_path")
-    
+    char_count=$(wc -c < "$file_path")
+
    if [ "$line_count" -gt "$max_lines" ]; then
        head -n "$max_lines" "$file_path"
-        echo -e "\e[33m[... output truncated, ${line_count} lines in total ...]\e[0m"
+        echo -e "\e[33m[... output truncated: too many lines (${line_count} in total) ...]\e[0m"
+    elif [ "$char_count" -gt "$max_chars" ]; then
+        head -c "$max_chars" "$file_path"
+        echo -e "\n\e[33m[... output truncated: too many characters (${char_count} in total) ...]\e[0m"
    else
        cat "$file_path"
    fi
@ -71,90 +96,153 @@ clean_tmp() {
    rm -rf "${TMP_DIR}"/*
 }

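-# Create the temporary directory if it does not exist
+# --- New: summary report function ---
+print_summary() {
+    echo "" # make sure we start on a fresh line
+    echo "========================================"
+    if [ "$INTERRUPTED" = true ]; then
+        echo -e "\e[33mTesting interrupted. Summarizing the results completed so far...\e[0m"
+    else
+        echo "Testing complete"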
+    fi
+
+    local failed_count
+    if [ -n "$FAILED_CASES_LIST" ]; then
+        failed_count=$(echo -e -n "${FAILED_CASES_LIST}" | wc -l)
+    else
+        failed_count=0
+    fi
+    local executed_count=$((PASSED_CASES + failed_count))
+
+    echo "Results: [passed: ${PASSED_CASES}, failed: ${failed_count}, executed: ${executed_count}/${TOTAL_CASES}]"
+
+    if [ -n "$FAILED_CASES_LIST" ]; then
+        echo ""
+        echo -e "\e[31mFailed test cases:\e[0m"
+        printf "%b" "${FAILED_CASES_LIST}"
+    fi
+
+    # --- Changed in this commit: point to the performance results file ---
+    if ${PERFORMANCE_MODE}; then
+        # --- Changed in this commit: compute and append a totals row ---
+        if [ -f "${PERFORMANCE_CSV_FILE}" ] && [ "$(wc -l < "${PERFORMANCE_CSV_FILE}")" -gt 1 ]; then
+            local total_seconds_sum
+            total_seconds_sum=$(awk -F, 'NR > 1 {sum += $3} END {printf "%.5f", sum}' "${PERFORMANCE_CSV_FILE}")
+            
+            local total_s_int=${total_seconds_sum%.*}
+            [[ -z "$total_s_int" ]] && total_s_int=0 # handle totals below one second
+            local total_us_int=$(echo "(${total_seconds_sum} - ${total_s_int}) * 1000000" | bc | cut -d. -f1)
+            local total_time_str="${total_s_int}s${total_us_int}us"
+            
+            echo "all,${total_time_str},${total_seconds_sum}" >> "${PERFORMANCE_CSV_FILE}"
+        fi
+        echo ""
+        echo -e "\e[32mPerformance data saved to: ${PERFORMANCE_CSV_FILE}\e[0m"
+    fi
+
+    echo "========================================"
+
+    if [ "$failed_count" -gt 0 ]; then
+        exit 1
+    else
+        exit 0
+    fi
+}
+
+# --- New: SIGINT handler ---
+handle_sigint() {
+    INTERRUPTED=true
+    print_summary
+}
+
+# =================================================================
+# --- Main logic ---
+# =================================================================
+
+trap handle_sigint SIGINT
 mkdir -p "${TMP_DIR}"

-# Parse command-line arguments
 while [[ "$#" -gt 0 ]]; do
    case "$1" in
-        -e|--executable)
-            EXECUTE_MODE=true
-            shift
-            ;;
-        -c|--clean)
-            clean_tmp
-            exit 0
-            ;;
+        -e|--executable) EXECUTE_MODE=true; shift ;;
+        -eir) IR_EXECUTE_MODE=true; shift ;;
+        -c|--clean) clean_tmp; exit 0 ;;
+        -O1) OPTIMIZE_FLAG="-O1"; shift ;;
        -set)
-            shift # skip past '-set'
-            # consume all following arguments until the next option
-            while [[ "$#" -gt 0 && ! "$1" =~ ^- ]]; do
-                TEST_SETS+=("$1")
-                shift
-            done
-            ;;
-        -sct)
-            if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then SYSYC_TIMEOUT="$2"; shift 2; else echo "Error: -sct requires a positive integer argument." >&2; exit 1; fi
-            ;;
-        -gct)
-            if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then GCC_TIMEOUT="$2"; shift 2; else echo "Error: -gct requires a positive integer argument." >&2; exit 1; fi
-            ;;
-        -et)
-            if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then EXEC_TIMEOUT="$2"; shift 2; else echo "Error: -et requires a positive integer argument." >&2; exit 1; fi
-            ;;
-        -ml|--max-lines)
-            if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then MAX_OUTPUT_LINES="$2"; shift 2; else echo "Error: --max-lines requires a positive integer argument." >&2; exit 1; fi
-            ;;
-        -h|--help)
-            show_help
-            exit 0
-            ;;
-        *)
-            echo "Unknown option: $1"
-            show_help
-            exit 1
+            shift
+            while [[ "$#" -gt 0 && ! "$1" =~ ^- ]]; do TEST_SETS+=("$1"); shift; done
            ;;
+        -pt) if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then PERF_RUN_COUNT="$2"; shift 2; else echo "Error: -pt requires a positive integer argument." >&2; exit 1; fi ;;
+        -sct) if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then SYSYC_TIMEOUT="$2"; shift 2; else echo "Error: -sct requires a positive integer argument." >&2; exit 1; fi ;;
+        -lct) if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then LLC_TIMEOUT="$2"; shift 2; else echo "Error: -lct requires a positive integer argument." >&2; exit 1; fi ;;
+        -gct) if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then GCC_TIMEOUT="$2"; shift 2; else echo "Error: -gct requires a positive integer argument." >&2; exit 1; fi ;;
+        -et) if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then EXEC_TIMEOUT="$2"; shift 2; else echo "Error: -et requires a positive integer argument." >&2; exit 1; fi ;;
+        -ml|--max-lines) if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then MAX_OUTPUT_LINES="$2"; shift 2; else echo "Error: --max-lines requires a positive integer argument." >&2; exit 1; fi ;;
+        -mc|--max-chars) if [[ -n "$2" && "$2" =~ ^[0-9]+$ ]]; then MAX_OUTPUT_CHARS="$2"; shift 2; else echo "Error: --max-chars requires a positive integer argument." >&2; exit 1; fi ;;
+        -h|--help) show_help; exit 0 ;;
+        *) echo "Unknown option: $1"; show_help; exit 1 ;;
    esac
 done

-# --- Changed in this commit: build the search paths from the -set argument ---
+if ${EXECUTE_MODE} && ${IR_EXECUTE_MODE}; then
+    echo -e "\e[31mError: -e and -eir cannot be used together.\e[0m" >&2
+    exit 1
+fi
+
 declare -A SET_MAP
 SET_MAP[f]="functional"
 SET_MAP[h]="h_functional"
 SET_MAP[p]="performance"

 SEARCH_PATHS=()
-
-# If no test set is given, or 'all' is given, search every directory
 if [ ${#TEST_SETS[@]} -eq 0 ] || [[ " ${TEST_SETS[@]} " =~ " all " ]]; then
    SEARCH_PATHS+=("${TESTDATA_DIR}")
+    if [ -d "${TESTDATA_DIR}/performance" ]; then PERFORMANCE_MODE=true; fi
 else
    for set in "${TEST_SETS[@]}"; do
        if [[ -v SET_MAP[$set] ]]; then
            SEARCH_PATHS+=("${TESTDATA_DIR}/${SET_MAP[$set]}")
+            if [[ "$set" == "p" ]]; then
+                PERFORMANCE_MODE=true
+            fi
        else
            echo -e "\e[33mWarning: unknown test set '$set'; ignored.\e[0m"
        fi
    done
 fi

-# Exit if there are no valid search paths
 if [ ${#SEARCH_PATHS[@]} -eq 0 ]; then
     echo -e "\e[31mError: no valid test-set directory found; aborting.\e[0m"
    exit 1
 fi

 echo "SysY test runner starting..."
+if [ -n "$OPTIMIZE_FLAG" ]; then echo "Optimization level: ${OPTIMIZE_FLAG}"; fi
 echo "Input directories: ${SEARCH_PATHS[@]}"
 echo "Temporary directory: ${TMP_DIR}"
-echo "Execute mode: ${EXECUTE_MODE}"
-if ${EXECUTE_MODE}; then
-    echo "Timeouts: sysyc=${SYSYC_TIMEOUT}s, gcc=${GCC_TIMEOUT}s, qemu=${EXEC_TIMEOUT}s"
+
+RUN_MODE_INFO=""
+if ${IR_EXECUTE_MODE}; then
+    RUN_MODE_INFO="IR execution mode (-eir)"
+    TIMEOUT_INFO="Timeouts: sysyc=${SYSYC_TIMEOUT}s, llc=${LLC_TIMEOUT}s, gcc=${GCC_TIMEOUT}s, qemu=${EXEC_TIMEOUT}s"
+elif ${EXECUTE_MODE}; then
+    RUN_MODE_INFO="direct execution mode (-e)"
+    TIMEOUT_INFO="Timeouts: sysyc=${SYSYC_TIMEOUT}s, gcc=${GCC_TIMEOUT}s, qemu=${EXEC_TIMEOUT}s"
+else
+    RUN_MODE_INFO="compile-only mode (default)"
+    TIMEOUT_INFO="Timeouts: sysyc=${SYSYC_TIMEOUT}s"
+fi
+echo "Run mode: ${RUN_MODE_INFO}"
+echo "${TIMEOUT_INFO}"
+if ${PERFORMANCE_MODE} && ([ ${EXECUTE_MODE} = true ] || [ ${IR_EXECUTE_MODE} = true ]) && [ ${PERF_RUN_COUNT} -gt 1 ]; then
+    echo "Performance test run count: ${PERF_RUN_COUNT}"
+fi
+if ${EXECUTE_MODE} || ${IR_EXECUTE_MODE}; then
     echo "Max lines shown on failure: ${MAX_OUTPUT_LINES}"
+    echo "Max characters shown on failure: ${MAX_OUTPUT_CHARS}"
 fi
 echo ""

-# Find the .sy files under the search paths and sort them
 sy_files=$(find "${SEARCH_PATHS[@]}" -name "*.sy" | sort -V)
 if [ -z "$sy_files" ]; then
     echo "No .sy files found in the specified directories."
@ -162,139 +250,241 @@ if [ -z "$sy_files" ]; then
 fi
 TOTAL_CASES=$(echo "$sy_files" | wc -w)

-# --- Fix: use a here-string (<<<) instead of a pipe (|) to avoid the subshell problem ---
+PERFORMANCE_CSV_FILE="${TMP_DIR}/performance_time.csv"
+if ${PERFORMANCE_MODE}; then
+    echo "Case,Time_String,Time_Seconds" > "${PERFORMANCE_CSV_FILE}"
+fi
+
 while IFS= read -r sy_file; do
-    is_passed=1 # 1 means passed, 0 means failed
+    is_passed=0 # 0 means failed, 1 means passed

    relative_path_no_ext=$(realpath --relative-to="${TESTDATA_DIR}" "${sy_file%.*}")
    output_base_name=$(echo "${relative_path_no_ext}" | tr '/' '_')

-    assembly_file="${TMP_DIR}/${output_base_name}_sysyc_riscv64.s"
-    executable_file="${TMP_DIR}/${output_base_name}_sysyc_riscv64"
+    assembly_file_S="${TMP_DIR}/${output_base_name}_sysyc_S.s"
+    executable_file_S="${TMP_DIR}/${output_base_name}_sysyc_S"
+    output_actual_file_S="${TMP_DIR}/${output_base_name}_sysyc_S.actual_out"
+    stderr_file_S="${TMP_DIR}/${output_base_name}_sysyc_S.stderr"
+
+    ir_file="${TMP_DIR}/${output_base_name}_sysyc_ir.ll"
+    assembly_file_from_ir="${TMP_DIR}/${output_base_name}_from_ir.s"
+    executable_file_from_ir="${TMP_DIR}/${output_base_name}_from_ir"
+    output_actual_file_from_ir="${TMP_DIR}/${output_base_name}_from_ir.actual_out"
+    stderr_file_from_ir="${TMP_DIR}/${output_base_name}_from_ir.stderr"
+
    input_file="${sy_file%.*}.in"
    output_reference_file="${sy_file%.*}.out"
-    output_actual_file="${TMP_DIR}/${output_base_name}_sysyc_riscv64.actual_out"

    echo "Processing: $(basename "$sy_file") (path: ${relative_path_no_ext}.sy)"

-    # Step 1: compile .sy to .s with sysyc
-    echo "  Compiling with sysyc (timeout ${SYSYC_TIMEOUT}s)..."
-    timeout -s KILL ${SYSYC_TIMEOUT} "${SYSYC}" -S "${sy_file}" -o "${assembly_file}"
-    SYSYC_STATUS=$?
-    if [ $SYSYC_STATUS -eq 124 ]; then
-        echo -e "\e[31mError: sysyc timed out compiling ${sy_file}\e[0m"
-        is_passed=0
-    elif [ $SYSYC_STATUS -ne 0 ]; then
-        echo -e "\e[31mError: sysyc failed to compile ${sy_file}, exit code: ${SYSYC_STATUS}\e[0m"
-        is_passed=0
-    fi
-
-    # Continue only if EXECUTE_MODE is true and the previous step succeeded
-    if ${EXECUTE_MODE} && [ "$is_passed" -eq 1 ]; then
-        # Step 2: compile .s to an executable with riscv64-linux-gnu-gcc
-        echo "  Compiling with gcc (timeout ${GCC_TIMEOUT}s)..."
-        timeout -s KILL ${GCC_TIMEOUT} "${GCC_RISCV64}" "${assembly_file}" -o "${executable_file}" -L"${LIB_DIR}" -lsysy_riscv -static
-        GCC_STATUS=$?
-        if [ $GCC_STATUS -eq 124 ]; then
-            echo -e "\e[31mError: GCC timed out compiling ${assembly_file}\e[0m"
-            is_passed=0
-        elif [ $GCC_STATUS -ne 0 ]; then
-            echo -e "\e[31mError: GCC failed to compile ${assembly_file}, exit code: ${GCC_STATUS}\e[0m"
-            is_passed=0
-        fi
-    elif ! ${EXECUTE_MODE}; then
-        echo "  Execute mode skipped. Only the assembly file is generated."
-        if [ "$is_passed" -eq 1 ]; then
-             ((PASSED_CASES++))
-        else
-            FAILED_CASES_LIST+="${relative_path_no_ext}.sy\n"
-        fi
-        echo ""
-        continue
-    fi
-
-    # Steps 3, 4, 5: run only if every compilation succeeded
-    if [ "$is_passed" -eq 1 ]; then
-        echo "  Running (timeout ${EXEC_TIMEOUT}s)..."
+    # --- Mode 1: IR execution mode (-eir) ---
+    if ${IR_EXECUTE_MODE}; then
+        step_failed=0
+        test_logic_passed=0
+        total_time_us=0
        
-        exec_cmd="${QEMU_RISCV64} \"${executable_file}\""
-        if [ -f "${input_file}" ]; then
-            exec_cmd+=" < \"${input_file}\""
+        echo "  [1/4] Compiling to IR with sysyc (timeout ${SYSYC_TIMEOUT}s)..."
+        timeout -s KILL ${SYSYC_TIMEOUT} "${SYSYC}" -s ir "${sy_file}" -o "${ir_file}" ${OPTIMIZE_FLAG}; if [ $? -ne 0 ]; then echo -e "\e[31mError: sysyc (IR) compilation failed or timed out\e[0m"; step_failed=1; fi
+
+        if [ "$step_failed" -eq 0 ]; then
+            echo "  [2/4] Compiling to assembly with llc-19 (timeout ${LLC_TIMEOUT}s)..."
+            timeout -s KILL ${LLC_TIMEOUT} ${LLC_CMD} -march=riscv64 -mcpu=generic-rv64 -mattr=+m,+a,+f,+d,+c -filetype=asm "${ir_file}" -o "${assembly_file_from_ir}"; if [ $? -ne 0 ]; then echo -e "\e[31mError: llc-19 compilation failed or timed out\e[0m"; step_failed=1; fi
        fi
-        exec_cmd+=" > \"${output_actual_file}\""

-        eval "timeout -s KILL ${EXEC_TIMEOUT} ${exec_cmd}"
-        ACTUAL_RETURN_CODE=$?
+        if [ "$step_failed" -eq 0 ]; then
+            echo "  [3/4] Compiling with gcc (timeout ${GCC_TIMEOUT}s)..."
+            timeout -s KILL ${GCC_TIMEOUT} "${GCC_RISCV64}" "${assembly_file_from_ir}" -o "${executable_file_from_ir}" -L"${LIB_DIR}" -lsysy_riscv -static; if [ $? -ne 0 ]; then echo -e "\e[31mError: GCC compilation failed or timed out\e[0m"; step_failed=1; fi
+        fi

-        if [ "$ACTUAL_RETURN_CODE" -eq 124 ]; then
-            echo -e "\e[31m  Execution timed out: ${sy_file} ran for more than ${EXEC_TIMEOUT} seconds\e[0m"
-            is_passed=0
-        else
-            if [ -f "${output_reference_file}" ]; then
-                LAST_LINE_TRIMMED=$(tail -n 1 "${output_reference_file}" | tr -d '[:space:]')
+        if [ "$step_failed" -eq 0 ]; then
+            echo "  [4/4] Running (timeout ${EXEC_TIMEOUT}s)..."
+            current_run_failed=0
+            for (( i=1; i<=PERF_RUN_COUNT; i++ )); do
+                if [ ${PERF_RUN_COUNT} -gt 1 ]; then echo -n "    Run $i/${PERF_RUN_COUNT}... "; fi
+                exec_cmd="${QEMU_RISCV64} \"${executable_file_from_ir}\""
+                [ -f "${input_file}" ] && exec_cmd+=" < \"${input_file}\""
+                exec_cmd+=" > \"${output_actual_file_from_ir}\" 2> \"${stderr_file_from_ir}\""
+                eval "timeout -s KILL ${EXEC_TIMEOUT} ${exec_cmd}"
+                ACTUAL_RETURN_CODE=$?
                
-                if [[ "$LAST_LINE_TRIMMED" =~ ^[-+]?[0-9]+$ ]]; then
-                    EXPECTED_RETURN_CODE="$LAST_LINE_TRIMMED"
-                    EXPECTED_STDOUT_FILE="${TMP_DIR}/${output_base_name}_sysyc_riscv64.expected_stdout"
-                    head -n -1 "${output_reference_file}" > "${EXPECTED_STDOUT_FILE}"
-
-                    if [ "$ACTUAL_RETURN_CODE" -eq "$EXPECTED_RETURN_CODE" ]; then
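-                        echo -e "\e[32m  Return code check passed: (${ACTUAL_RETURN_CODE}) matches the expected value (${EXPECTED_RETURN_CODE})\e[0m"
+                if [ "$ACTUAL_RETURN_CODE" -eq 124 ] || [ "$ACTUAL_RETURN_CODE" -eq 137 ]; then echo -e "\e[31mtimed out\e[0m"; current_run_failed=1; break; fi # with -s KILL, timeout exits with 137 (128+9) rather than 124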
+                if ${PERFORMANCE_MODE}; then
+                    TIME_LINE=$(grep "TOTAL:" "${stderr_file_from_ir}")
+                    if [ -n "$TIME_LINE" ]; then
+                        H=$(echo "$TIME_LINE" | sed -E 's/TOTAL: ([0-9]+)H-.*/\1/')
+                        M=$(echo "$TIME_LINE" | sed -E 's/.*-([0-9]+)M-.*/\1/')
+                        S=$(echo "$TIME_LINE" | sed -E 's/.*-([0-9]+)S-.*/\1/')
+                        US=$(echo "$TIME_LINE" | sed -E 's/.*-([0-9]+)us/\1/')
+                        run_time_us=$(( H * 3600000000 + M * 60000000 + S * 1000000 + US ))
+                        total_time_us=$(( total_time_us + run_time_us ))
+                        if [ ${PERF_RUN_COUNT} -gt 1 ]; then echo "Elapsed: ${run_time_us}us"; fi
                     else
-                        echo -e "\e[31m  Return code check failed: expected: ${EXPECTED_RETURN_CODE}, got: ${ACTUAL_RETURN_CODE}\e[0m"
-                        is_passed=0
-                    fi
-
-                    if ! diff -q <(tr -d '[:space:]' < "${output_actual_file}") <(tr -d '[:space:]' < "${EXPECTED_STDOUT_FILE}") >/dev/null 2>&1; then
-                        echo -e "\e[31m  Stdout check failed\e[0m"
-                        is_passed=0
-                        display_file_content "${EXPECTED_STDOUT_FILE}" "    \e[36m---------- Expected output ----------\e[0m" "${MAX_OUTPUT_LINES}"
-                        display_file_content "${output_actual_file}" "    \e[36m---------- Actual output ----------\e[0m" "${MAX_OUTPUT_LINES}"
-                        echo -e "    \e[36m------------------------------\e[0m"
-                    fi
-                else
-                    if [ $ACTUAL_RETURN_CODE -ne 0 ]; then
-                        echo -e "\e[33mWarning: program exited with non-zero status ${ACTUAL_RETURN_CODE} (output-only comparison mode).\e[0m"
-                    fi
-
-                    if diff -q <(tr -d '[:space:]' < "${output_actual_file}") <(tr -d '[:space:]' < "${output_reference_file}") >/dev/null 2>&1; then
-                        echo -e "\e[32m  Success: output matches the reference output\e[0m"
-                    else
-                        echo -e "\e[31m  Failure: output does not match\e[0m"
-                        is_passed=0
-                        display_file_content "${output_reference_file}" "    \e[36m---------- Expected output ----------\e[0m" "${MAX_OUTPUT_LINES}"
-                        display_file_content "${output_actual_file}" "    \e[36m---------- Actual output ----------\e[0m" "${MAX_OUTPUT_LINES}"
-                        echo -e "    \e[36m------------------------------\e[0m"
+                        echo -e "\e[31mtiming information not found\e[0m"; current_run_failed=1; break
                    fi
                fi
-            else
-                echo "  No reference output file. Program return code: ${ACTUAL_RETURN_CODE}"
+            done
+            
+            if [ "$current_run_failed" -eq 0 ]; then
+                test_logic_passed=1
+                if [ -f "${output_reference_file}" ]; then
+                    LAST_LINE_TRIMMED=$(tail -n 1 "${output_reference_file}" | tr -d '[:space:]')
+                    if [[ "$LAST_LINE_TRIMMED" =~ ^[-+]?[0-9]+$ ]]; then
+                        EXPECTED_RETURN_CODE="$LAST_LINE_TRIMMED"
+                        EXPECTED_STDOUT_FILE="${TMP_DIR}/${output_base_name}_from_ir.expected_stdout"
+                        head -n -1 "${output_reference_file}" > "${EXPECTED_STDOUT_FILE}"
+                        if [ "$ACTUAL_RETURN_CODE" -ne "$EXPECTED_RETURN_CODE" ]; then echo -e "\e[31m  Return code check failed: expected ${EXPECTED_RETURN_CODE}, got ${ACTUAL_RETURN_CODE}\e[0m"; test_logic_passed=0; fi
+                        if ! diff -q <(tr -d '[:space:]' < "${output_actual_file_from_ir}") <(tr -d '[:space:]' < "${EXPECTED_STDOUT_FILE}") >/dev/null 2>&1; then
+                            echo -e "\e[31m  Stdout check failed\e[0m"; test_logic_passed=0
+                            display_file_content "${EXPECTED_STDOUT_FILE}" "    \e[36m--- Expected output ---\e[0m" "${MAX_OUTPUT_LINES}" "${MAX_OUTPUT_CHARS}"
+                            display_file_content "${output_actual_file_from_ir}" "    \e[36m--- Actual output ---\e[0m" "${MAX_OUTPUT_LINES}" "${MAX_OUTPUT_CHARS}"
+                        fi
+                    else
+                        if [ $ACTUAL_RETURN_CODE -ne 0 ]; then echo -e "\e[33mWarning: program exited with non-zero status ${ACTUAL_RETURN_CODE} (output-only comparison mode).\e[0m"; fi
+                        if ! diff -q <(tr -d '[:space:]' < "${output_actual_file_from_ir}") <(tr -d '[:space:]' < "${output_reference_file}") >/dev/null 2>&1; then
+                            echo -e "\e[31m  Failure: output does not match\e[0m"; test_logic_passed=0
+                            display_file_content "${output_reference_file}" "    \e[36m--- Expected output ---\e[0m" "${MAX_OUTPUT_LINES}" "${MAX_OUTPUT_CHARS}"
+                            display_file_content "${output_actual_file_from_ir}" "    \e[36m--- Actual output ---\e[0m" "${MAX_OUTPUT_LINES}" "${MAX_OUTPUT_CHARS}"
+                        fi
+                    fi
+                fi
+                if [ "$test_logic_passed" -eq 1 ]; then echo -e "\e[32m  Test checks passed\e[0m"; fi
            fi
        fi
+        if [ "$step_failed" -eq 0 ] && [ "$test_logic_passed" -eq 1 ]; then is_passed=1; fi
+        
+        if ${PERFORMANCE_MODE}; then
+            avg_time_us=0
+            if [ "$is_passed" -eq 1 ]; then
+                avg_time_us=$(( total_time_us / PERF_RUN_COUNT ))
+            fi
+            S_AVG=$(( avg_time_us / 1000000 ))
+            US_AVG=$(( avg_time_us % 1000000 ))
+            TIME_STRING_AVG="${S_AVG}s${US_AVG}us"
+            TOTAL_SECONDS_AVG=$(echo "scale=5; ${avg_time_us} / 1000000" | bc)
+            echo "$(basename "${sy_file}"),${TIME_STRING_AVG},${TOTAL_SECONDS_AVG}" >> "${PERFORMANCE_CSV_FILE}"
+        fi
+
+    # --- Mode 2: direct execution mode (-e) ---
+    elif ${EXECUTE_MODE}; then
+        step_failed=0
+        test_logic_passed=0
+        total_time_us=0
+
+        echo "  [1/3] 使用 sysyc 编译为汇编 (超时 ${SYSYC_TIMEOUT}s)..."
+        timeout -s KILL ${SYSYC_TIMEOUT} "${SYSYC}" -S "${sy_file}" -o "${assembly_file_S}" ${OPTIMIZE_FLAG}; if [ $? -ne 0 ]; then echo -e "\e[31m错误: SysY (汇编) 编译失败或超时\e[0m"; step_failed=1; fi
+
+        if [ "$step_failed" -eq 0 ]; then
+            echo "  [2/3] 使用 gcc 编译 (超时 ${GCC_TIMEOUT}s)..."
+            timeout -s KILL ${GCC_TIMEOUT} "${GCC_RISCV64}" "${assembly_file_S}" -o "${executable_file_S}" -L"${LIB_DIR}" -lsysy_riscv -static; if [ $? -ne 0 ]; then echo -e "\e[31m错误: GCC 编译失败或超时\e[0m"; step_failed=1; fi
+        fi
+
+        if [ "$step_failed" -eq 0 ]; then
+            echo "  [3/3] 正在执行 (超时 ${EXEC_TIMEOUT}s)..."
+            current_run_failed=0
+            for (( i=1; i<=PERF_RUN_COUNT; i++ )); do
+                if [ ${PERF_RUN_COUNT} -gt 1 ]; then echo -n "    第 $i/${PERF_RUN_COUNT} 次运行... "; fi
+                exec_cmd="${QEMU_RISCV64} \"${executable_file_S}\""
+                [ -f "${input_file}" ] && exec_cmd+=" < \"${input_file}\""
+                exec_cmd+=" > \"${output_actual_file_S}\" 2> \"${stderr_file_S}\""
+                eval "timeout -s KILL ${EXEC_TIMEOUT} ${exec_cmd}"
+                ACTUAL_RETURN_CODE=$?
+                
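+                if [ "$ACTUAL_RETURN_CODE" -eq 124 ] || [ "$ACTUAL_RETURN_CODE" -eq 137 ]; then echo -e "\e[31mtimed out\e[0m"; current_run_failed=1; break; fi # with -s KILL, timeout exits 137 (128+9) rather than 124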
+                if ${PERFORMANCE_MODE}; then
+                    TIME_LINE=$(grep "TOTAL:" "${stderr_file_S}")
+                    if [ -n "$TIME_LINE" ]; then
+                        H=$(echo "$TIME_LINE" | sed -E 's/TOTAL: ([0-9]+)H-.*/\1/')
+                        M=$(echo "$TIME_LINE" | sed -E 's/.*-([0-9]+)M-.*/\1/')
+                        S=$(echo "$TIME_LINE" | sed -E 's/.*-([0-9]+)S-.*/\1/')
+                        US=$(echo "$TIME_LINE" | sed -E 's/.*-([0-9]+)us/\1/')
+                        run_time_us=$(( H * 3600000000 + M * 60000000 + S * 1000000 + US ))
+                        total_time_us=$(( total_time_us + run_time_us ))
+                        if [ ${PERF_RUN_COUNT} -gt 1 ]; then echo "耗时: ${run_time_us}us"; fi
+                    else
+                        echo -e "\e[31m未找到时间信息\e[0m"; current_run_failed=1; break
+                    fi
+                fi
+            done
+            
+            if [ "$current_run_failed" -eq 0 ]; then
+                test_logic_passed=1
+                if [ -f "${output_reference_file}" ]; then
+                    LAST_LINE_TRIMMED=$(tail -n 1 "${output_reference_file}" | tr -d '[:space:]')
+                    if [[ "$LAST_LINE_TRIMMED" =~ ^[-+]?[0-9]+$ ]]; then
+                        EXPECTED_RETURN_CODE="$LAST_LINE_TRIMMED"
+                        EXPECTED_STDOUT_FILE="${TMP_DIR}/${output_base_name}_sysyc_S.expected_stdout"
+                        head -n -1 "${output_reference_file}" > "${EXPECTED_STDOUT_FILE}"
+                        if [ "$ACTUAL_RETURN_CODE" -ne "$EXPECTED_RETURN_CODE" ]; then echo -e "\e[31m  返回码测试失败: 期望 ${EXPECTED_RETURN_CODE}, 实际 ${ACTUAL_RETURN_CODE}\e[0m"; test_logic_passed=0; fi
+                        if ! diff -q <(tr -d '[:space:]' < "${output_actual_file_S}") <(tr -d '[:space:]' < "${EXPECTED_STDOUT_FILE}") >/dev/null 2>&1; then
+                            echo -e "\e[31m  标准输出测试失败\e[0m"; test_logic_passed=0
+                            display_file_content "${EXPECTED_STDOUT_FILE}" "    \e[36m--- 期望输出 ---\e[0m" "${MAX_OUTPUT_LINES}" "${MAX_OUTPUT_CHARS}"
+                            display_file_content "${output_actual_file_S}" "    \e[36m--- 实际输出 ---\e[0m" "${MAX_OUTPUT_LINES}" "${MAX_OUTPUT_CHARS}"
+                        fi
+                    else
+                        if [ $ACTUAL_RETURN_CODE -ne 0 ]; then echo -e "\e[33m警告: 程序以非零状态 ${ACTUAL_RETURN_CODE} 退出 (纯输出比较模式)。\e[0m"; fi
+                        if ! diff -q <(tr -d '[:space:]' < "${output_actual_file_S}") <(tr -d '[:space:]' < "${output_reference_file}") >/dev/null 2>&1; then
+                            echo -e "\e[31m  失败: 输出不匹配\e[0m"; test_logic_passed=0
+                            display_file_content "${output_reference_file}" "    \e[36m--- 期望输出 ---\e[0m" "${MAX_OUTPUT_LINES}" "${MAX_OUTPUT_CHARS}"
+                            display_file_content "${output_actual_file_S}" "    \e[36m--- 实际输出 ---\e[0m" "${MAX_OUTPUT_LINES}" "${MAX_OUTPUT_CHARS}"
+                        fi
+                    fi
+                fi
+                if [ "$test_logic_passed" -eq 1 ]; then echo -e "\e[32m  测试逻辑通过\e[0m"; fi
+            fi
+        fi
+        if [ "$step_failed" -eq 0 ] && [ "$test_logic_passed" -eq 1 ]; then is_passed=1; fi
+        
+        if ${PERFORMANCE_MODE}; then
+            avg_time_us=0
+            if [ "$is_passed" -eq 1 ]; then
+                avg_time_us=$(( total_time_us / PERF_RUN_COUNT ))
+            fi
+            S_AVG=$(( avg_time_us / 1000000 ))
+            US_AVG=$(( avg_time_us % 1000000 ))
+            TIME_STRING_AVG="${S_AVG}s${US_AVG}us"
+            TOTAL_SECONDS_AVG=$(echo "scale=5; ${avg_time_us} / 1000000" | bc)
+            echo "$(basename "${sy_file}"),${TIME_STRING_AVG},${TOTAL_SECONDS_AVG}" >> "${PERFORMANCE_CSV_FILE}"
+        fi
+
+    # --- Mode 3: default compile-only mode ---
+    else
+        s_compile_ok=0
+        ir_compile_ok=0
+
+        echo "  [1/2] 使用 sysyc 编译为汇编 (超时 ${SYSYC_TIMEOUT}s)..."
+        timeout -s KILL ${SYSYC_TIMEOUT} "${SYSYC}" -S "${sy_file}" -o "${assembly_file_S}" ${OPTIMIZE_FLAG}
+        SYSYC_S_STATUS=$?
+        if [ $SYSYC_S_STATUS -eq 0 ]; then
+            s_compile_ok=1
+            echo -e "      \e[32m-> ${assembly_file_S} [成功]\e[0m"
+        else
+            { [ "$SYSYC_S_STATUS" -eq 124 ] || [ "$SYSYC_S_STATUS" -eq 137 ]; } && echo -e "      \e[31m-> [compile timed out]\e[0m" || echo -e "      \e[31m-> [compile failed, exit code: ${SYSYC_S_STATUS}]\e[0m"
+        fi
+
+        echo "  [2/2] 使用 sysyc 编译为 IR (超时 ${SYSYC_TIMEOUT}s)..."
+        timeout -s KILL ${SYSYC_TIMEOUT} "${SYSYC}" -s ir "${sy_file}" -o "${ir_file}" ${OPTIMIZE_FLAG}
+        SYSYC_IR_STATUS=$?
+        if [ $SYSYC_IR_STATUS -eq 0 ]; then
+            ir_compile_ok=1
+            echo -e "      \e[32m-> ${ir_file} [成功]\e[0m"
+        else
+            { [ "$SYSYC_IR_STATUS" -eq 124 ] || [ "$SYSYC_IR_STATUS" -eq 137 ]; } && echo -e "      \e[31m-> [compile timed out]\e[0m" || echo -e "      \e[31m-> [compile failed, exit code: ${SYSYC_IR_STATUS}]\e[0m"
+        fi
+
+        if [ "$s_compile_ok" -eq 1 ] && [ "$ir_compile_ok" -eq 1 ]; then
+            is_passed=1
+        fi
    fi

+    # --- Tally results ---
    if [ "$is_passed" -eq 1 ]; then
        ((PASSED_CASES++))
    else
+        # Make sure every FAILED_CASES_LIST entry ends with a newline escape
        FAILED_CASES_LIST+="${relative_path_no_ext}.sy\n"
    fi
    echo ""
 done <<< "$sy_files"

-echo "========================================"
-echo "测试完成"
-echo "测试通过率: [${PASSED_CASES}/${TOTAL_CASES}]"
-
-if [ -n "$FAILED_CASES_LIST" ]; then
-    echo ""
-    echo -e "\e[31m未通过的测例:\e[0m"
-    echo -e "${FAILED_CASES_LIST}"
-fi
-
-echo "========================================"
-
-if [ "$PASSED_CASES" -eq "$TOTAL_CASES" ]; then
-    exit 0
-else
-    exit 1
-fi
+# --- Changed: call the summary function ---
+print_summary
--- a/src/backend/RISCv64/CMakeLists.txt
+++ b/src/backend/RISCv64/CMakeLists.txt
@ -5,12 +5,17 @@ add_library(riscv64_backend_lib STATIC
    RISCv64ISel.cpp
    RISCv64LLIR.cpp
    RISCv64RegAlloc.cpp
+    RISCv64LinearScan.cpp
+    RISCv64SimpleRegAlloc.cpp
+    RISCv64BasicBlockAlloc.cpp
    Handler/CalleeSavedHandler.cpp
    Handler/LegalizeImmediates.cpp
    Handler/PrologueEpilogueInsertion.cpp
+    Handler/EliminateFrameIndices.cpp
    Optimize/Peephole.cpp
    Optimize/PostRA_Scheduler.cpp
    Optimize/PreRA_Scheduler.cpp
+    Optimize/DivStrengthReduction.cpp
 )

 # 包含后端模块所需的头文件路径
--- a/src/backend/RISCv64/Handler/CalleeSavedHandler.cpp
+++ b/src/backend/RISCv64/Handler/CalleeSavedHandler.cpp
@ -8,11 +8,6 @@ namespace sysy {

 char CalleeSavedHandler::ID = 0;

-// 辅助函数，用于判断一个物理寄存器是否为浮点寄存器
-static bool is_fp_reg(PhysicalReg reg) {
-    return reg >= PhysicalReg::F0 && reg <= PhysicalReg::F31;
-}
-
 bool CalleeSavedHandler::runOnFunction(Function *F, AnalysisManager& AM) {
    // This pass works on MachineFunction level, not IR level
    return false;
@ -20,114 +15,37 @@ bool CalleeSavedHandler::runOnFunction(Function *F, AnalysisManager& AM) {

 void CalleeSavedHandler::runOnMachineFunction(MachineFunction* mfunc) {
    StackFrameInfo& frame_info = mfunc->getFrameInfo();
-    
-    std::set<PhysicalReg> used_callee_saved;
-
-    // 1. 扫描所有指令，找出被使用的callee-saved寄存器
-    //    这个Pass在RegAlloc之后运行，所以可以访问到物理寄存器
-    for (auto& mbb : mfunc->getBlocks()) {
-        for (auto& instr : mbb->getInstructions()) {
-            for (auto& op : instr->getOperands()) {
-                
-                auto check_and_insert_reg = [&](RegOperand* reg_op) {
-                    if (reg_op && !reg_op->isVirtual()) {
-                        PhysicalReg preg = reg_op->getPReg();
-                        
-                        // 检查整数 s1-s11
-                        if (preg >= PhysicalReg::S1 && preg <= PhysicalReg::S11) {
-                            used_callee_saved.insert(preg);
-                        } 
-                        // 检查浮点 fs0-fs11 (f8,f9,f18-f27)
-                        else if ((preg >= PhysicalReg::F8 && preg <= PhysicalReg::F9) || (preg >= PhysicalReg::F18 && preg <= PhysicalReg::F27)) {
-                            used_callee_saved.insert(preg);
-                        }
-                    }
-                };
-
-                if (op->getKind() == MachineOperand::KIND_REG) {
-                    check_and_insert_reg(static_cast<RegOperand*>(op.get()));
-                } else if (op->getKind() == MachineOperand::KIND_MEM) {
-                    check_and_insert_reg(static_cast<MemOperand*>(op.get())->getBase());
-                }
-            }
-        }
-    }
+    const std::set<PhysicalReg>& used_callee_saved = frame_info.used_callee_saved_regs;

    if (used_callee_saved.empty()) {
        frame_info.callee_saved_size = 0;
+        frame_info.callee_saved_regs_to_store.clear();
        return;
    }

-    // 2. 计算并更新 frame_info
-    frame_info.callee_saved_size = used_callee_saved.size() * 8;
-
-    // 为了布局确定性和恢复顺序一致，对寄存器排序
-    std::vector<PhysicalReg> sorted_regs(used_callee_saved.begin(), used_callee_saved.end());
-    std::sort(sorted_regs.begin(), sorted_regs.end());
-    
-    // 3. 在函数序言中插入保存指令
-    MachineBasicBlock* entry_block = mfunc->getBlocks().front().get();
-    auto& entry_instrs = entry_block->getInstructions();
-    // 插入点在函数入口标签之后，或者就是最开始
-    auto insert_pos = entry_instrs.begin();
-    if (!entry_instrs.empty() && entry_instrs.front()->getOpcode() == RVOpcodes::LABEL) {
-        insert_pos = std::next(insert_pos);
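+    // 1. Compute the total space needed for the callee-saved registers.
+    // s0 is always handled separately by the PEI pass, so it is removed here and counted in neither the size nor the list.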
+    int size = 0;
+    std::set<PhysicalReg> regs_to_save = used_callee_saved;
+    if (regs_to_save.count(PhysicalReg::S0)) {
+        regs_to_save.erase(PhysicalReg::S0);
    }
-    
-    std::vector<std::unique_ptr<MachineInstr>> save_instrs;
-    // [关键] 从局部变量区域之后开始分配空间
-    int current_offset = - (16 + frame_info.locals_size);
+    size = regs_to_save.size() * 8; // each register takes 8 bytes (64-bit)
+    frame_info.callee_saved_size = size;

-    for (PhysicalReg reg : sorted_regs) {
-        current_offset -= 8;
-        RVOpcodes save_op = is_fp_reg(reg) ? RVOpcodes::FSD : RVOpcodes::SD;
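+    // 2. Build a sorted list of the registers to save so that later passes generate code deterministically.
+    // s0 must not appear in this list because the PEI pass handles it specially.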
+    std::vector<PhysicalReg> sorted_regs(regs_to_save.begin(), regs_to_save.end());
+    std::sort(sorted_regs.begin(), sorted_regs.end(), [](PhysicalReg a, PhysicalReg b){ 
+        return static_cast<int>(a) < static_cast<int>(b); 
+    });
+    frame_info.callee_saved_regs_to_store = sorted_regs;

-        auto save_instr = std::make_unique<MachineInstr>(save_op);
-        save_instr->addOperand(std::make_unique<RegOperand>(reg));
-        save_instr->addOperand(std::make_unique<MemOperand>(
-            std::make_unique<RegOperand>(PhysicalReg::S0), // 基址为帧指针 s0
-            std::make_unique<ImmOperand>(current_offset)
-        ));
-        save_instrs.push_back(std::move(save_instr));
-    }
-
-    if (!save_instrs.empty()) {
-        entry_instrs.insert(insert_pos,
-                            std::make_move_iterator(save_instrs.begin()),
-                            std::make_move_iterator(save_instrs.end()));
-    }
-
-    // 4. 在函数结尾（ret之前）插入恢复指令
-    for (auto& mbb : mfunc->getBlocks()) {
-        for (auto it = mbb->getInstructions().begin(); it != mbb->getInstructions().end(); ++it) {
-            if ((*it)->getOpcode() == RVOpcodes::RET) {
-                std::vector<std::unique_ptr<MachineInstr>> restore_instrs;
-                // [关键] 使用与保存时完全相同的逻辑来计算偏移量
-                current_offset = - (16 + frame_info.locals_size);
-                
-                for (PhysicalReg reg : sorted_regs) {
-                    current_offset -= 8;
-                    RVOpcodes restore_op = is_fp_reg(reg) ? RVOpcodes::FLD : RVOpcodes::LD;
-
-                    auto restore_instr = std::make_unique<MachineInstr>(restore_op);
-                    restore_instr->addOperand(std::make_unique<RegOperand>(reg));
-                    restore_instr->addOperand(std::make_unique<MemOperand>(
-                        std::make_unique<RegOperand>(PhysicalReg::S0),
-                        std::make_unique<ImmOperand>(current_offset)
-                    ));
-                    restore_instrs.push_back(std::move(restore_instr));
-                }
-                
-                if (!restore_instrs.empty()) {
-                    mbb->getInstructions().insert(it,
-                                                std::make_move_iterator(restore_instrs.begin()),
-                                                std::make_move_iterator(restore_instrs.end()));
-                }
-                goto next_block_label;
-            }
-        }
-        next_block_label:;
-    }
+    // 3. Update the total stack frame size.
+    // This is a preliminary figure; the PEI pass performs the final alignment.
+    frame_info.total_size = frame_info.locals_size + 
+                            frame_info.spill_size + 
+                            frame_info.callee_saved_size;
 }

-} // namespace sysy
+} // namespace sysy
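The bookkeeping that the rewritten `CalleeSavedHandler` now performs (drop `s0`, sort the remaining registers, reserve 8 bytes each) can be sketched in isolation. This is a minimal sketch and not the project's code: the `Reg` enum, `CalleeSavedLayout`, and `computeCalleeSavedLayout` are hypothetical stand-ins for `PhysicalReg` and the `StackFrameInfo` fields, and the register numbering is illustrative only.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for the backend's PhysicalReg enum; only a few
// registers are listed and their numbering is illustrative.
enum class Reg { S0, S1, S2, S3, FS0, FS1 };

// Hypothetical result type mirroring two StackFrameInfo fields.
struct CalleeSavedLayout {
    std::vector<Reg> regs_to_store; // sorted by enum value, s0 excluded
    int size_in_bytes;              // 8 bytes per saved register
};

// Takes the set of callee-saved registers the allocator used and returns
// the list to spill plus the space it needs. s0 is removed because the
// prologue/epilogue pass saves it separately.
CalleeSavedLayout computeCalleeSavedLayout(std::set<Reg> used) {
    used.erase(Reg::S0);
    // std::set already iterates in ascending order, so the copy is sorted.
    std::vector<Reg> sorted(used.begin(), used.end());
    const int size = static_cast<int>(sorted.size()) * 8;
    return CalleeSavedLayout{sorted, size};
}
```

For example, a function that used `s0`, `s2`, and `fs0` would need 16 bytes under this scheme, with `s2` stored before `fs0`.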
--- a/src/backend/RISCv64/Handler/EliminateFrameIndices.cpp
+++ b/src/backend/RISCv64/Handler/EliminateFrameIndices.cpp
@ -0,0 +1,235 @@
+#include "EliminateFrameIndices.h"
+#include "RISCv64ISel.h"
+#include <cassert>
+#include <vector>
+
+namespace sysy {
+
+// getTypeSizeInBytes is a general-purpose helper, carried over unchanged
+unsigned EliminateFrameIndicesPass::getTypeSizeInBytes(Type* type) {
+    if (!type) {
+        assert(false && "Cannot get size of a null type.");
+        return 0;
+    }
+
+    switch (type->getKind()) {
+        case Type::kInt:
+        case Type::kFloat:
+            return 4;
+        case Type::kPointer:
+            return 8;
+        case Type::kArray: {
+            auto arrayType = type->as<ArrayType>();
+            return arrayType->getNumElements() * getTypeSizeInBytes(arrayType->getElementType());
+        }
+        default:
+            assert(false && "Unsupported type for size calculation.");
+            return 0;
+    }
+}
+
+void EliminateFrameIndicesPass::runOnMachineFunction(MachineFunction* mfunc) {
+    StackFrameInfo& frame_info = mfunc->getFrameInfo();
+    Function* F = mfunc->getFunc();
+    RISCv64ISel* isel = mfunc->getISel();
+    
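+    // Stack-passed arguments are handled in this pass so that their data flow is explicit before register allocation; this fixes a bug in the spill logic.
+
+    // 2. Allocate stack space and compute offsets for local variables (AllocaInst) only.
+    // Locals are laid out below s0 (negative offsets), right after the 16 bytes reserved for ra and s0.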
+    int local_var_offset = 16;
+    
+    if (F) { // make sure the function pointer is valid
+        for (auto& bb : F->getBasicBlocks()) {
+            for (auto& inst : bb->getInstructions()) {
+                if (auto alloca = dynamic_cast<AllocaInst*>(inst.get())) {
+                    Type* allocated_type = alloca->getType()->as<PointerType>()->getBaseType();
+                    int size = getTypeSizeInBytes(allocated_type);
+                    
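+                    // Keep the frame small: round large arrays up to 4 bytes and small objects up to 8 bytes
+                    if (size >= 256) {  // large-array case
+                        size = (size + 3) & ~3;  // 4-byte alignment
+                    } else {
+                        size = (size + 7) & ~7;  // 8-byte alignment
+                    }
+                    if (size == 0) size = 4; // minimum of 4 bytes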
+
+                    local_var_offset += size;
+                    unsigned alloca_vreg = isel->getVReg(alloca);
+                    // Locals use negative offsets relative to s0
+                    frame_info.alloca_offsets[alloca_vreg] = -local_var_offset;
+                }
+            }
+        }
+    }
+    
+    // Record the total size of the locals allocated by AllocaInst only
+    frame_info.locals_size = local_var_offset - 16;
+    // Record the final offset at which the locals area ends
+    frame_info.locals_end_offset = -local_var_offset;
+
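+    // 3. Insert load instructions at function entry for every stack-passed argument.
+    // This step is essential: it gives each of these argument vregs an explicit definition (def) before register allocation.
+    // Without it, under high register pressure, `rewriteProgram` crashed when such a vreg was spilled because it could not find the definition point.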
+    if (F && isel && !mfunc->getBlocks().empty()) {
+        MachineBasicBlock* entry_block = mfunc->getBlocks().front().get();
+        std::vector<std::unique_ptr<MachineInstr>> arg_load_instrs;
+        
+        // Step 3.1: generate all stack-argument loads and buffer them
+        int arg_idx = 0;
+        for (Argument* arg : F->getArguments()) {
+            // Per the ABI, the first 8 integer/pointer arguments are passed in registers; only the remainder is handled here.
+            if (arg_idx >= 8) {
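+                // Compute the argument's slot in the caller's frame, which is a positive offset from the callee's frame pointer s0.
+                // The 9th argument (arg_idx=8) lives at 0(s0), the 10th (arg_idx=9) at 8(s0), and so on.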
+                int offset = (arg_idx - 8) * 8; 
+                unsigned arg_vreg = isel->getVReg(arg);
+                Type* arg_type = arg->getType();
+
+                // Pick the load opcode that matches the argument type
+                RVOpcodes load_op;
+                if (arg_type->isFloat()) {
+                    load_op = RVOpcodes::FLW; // single-precision float
+                } else if (arg_type->isPointer()) {
+                    load_op = RVOpcodes::LD;  // 64-bit pointer
+                } else {
+                    load_op = RVOpcodes::LW;  // 32-bit integer
+                }
+                
+                // Build the load: lw/ld/flw vreg, offset(s0)
+                auto load_instr = std::make_unique<MachineInstr>(load_op);
+                load_instr->addOperand(std::make_unique<RegOperand>(arg_vreg));
+                load_instr->addOperand(std::make_unique<MemOperand>(
+                    std::make_unique<RegOperand>(PhysicalReg::S0), // base is the frame pointer
+                    std::make_unique<ImmOperand>(offset)
+                ));
+                arg_load_instrs.push_back(std::move(load_instr));
+            }
+            arg_idx++;
+        }
+        
+        // Run the insertion logic only when there are stack arguments to load
+        if (!arg_load_instrs.empty()) {
+            auto& entry_instrs = entry_block->getInstructions();
+            auto insertion_point = entry_instrs.begin(); // default insertion point: start of the block
+            auto last_arg_save_it = entry_instrs.end();
+
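+            // Step 3.2: find a safe insertion point.
+            // Walk the entry block and locate the last pseudo-instruction that saves a register-passed argument.
+            // This guarantees that every a0-a7 argument has been saved before the loads that might overwrite them run.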
+            for (auto it = entry_instrs.begin(); it != entry_instrs.end(); ++it) {
+                MachineInstr* instr = it->get();
+                // Look for the pseudo-instructions that store an argument to the stack
+                if (instr->getOpcode() == RVOpcodes::FRAME_STORE_W ||
+                    instr->getOpcode() == RVOpcodes::FRAME_STORE_D ||
+                    instr->getOpcode() == RVOpcodes::FRAME_STORE_F) {
+                    
+                    // Check whether the stored value is a register-passed argument (arg_no < 8)
+                    auto& operands = instr->getOperands();
+                    if (operands.empty() || operands[0]->getKind() != MachineOperand::KIND_REG) continue;
+                    
+                    unsigned src_vreg = static_cast<RegOperand*>(operands[0].get())->getVRegNum();
+                    Value* ir_value = isel->getVRegValueMap().count(src_vreg) ? isel->getVRegValueMap().at(src_vreg) : nullptr;
+                    
+                    if (auto ir_arg = dynamic_cast<Argument*>(ir_value)) {
+                        if (ir_arg->getIndex() < 8) {
+                            last_arg_save_it = it; // found a store of a register argument; remember its position
+                        }
+                    }
+                }
+            }
+
+            // If such a store was found, insert right after it
+            if (last_arg_save_it != entry_instrs.end()) {
+                insertion_point = std::next(last_arg_save_it);
+            }
+
+            // Step 3.3: insert all newly created argument loads at the computed safe position in one go
+            entry_instrs.insert(insertion_point,
+                                std::make_move_iterator(arg_load_instrs.begin()),
+                                std::make_move_iterator(arg_load_instrs.end()));
+        }
+    }
+
+    // 4. Walk all machine instructions and expand the pseudo-instructions that access locals into real instructions
+    for (auto& mbb : mfunc->getBlocks()) {
+        std::vector<std::unique_ptr<MachineInstr>> new_instructions;
+        for (auto& instr_ptr : mbb->getInstructions()) {
+            RVOpcodes opcode = instr_ptr->getOpcode();
+
+            if (opcode == RVOpcodes::FRAME_LOAD_W || opcode == RVOpcodes::FRAME_LOAD_D || opcode == RVOpcodes::FRAME_LOAD_F) {
+                RVOpcodes real_load_op;
+                if (opcode == RVOpcodes::FRAME_LOAD_W) real_load_op = RVOpcodes::LW;
+                else if (opcode == RVOpcodes::FRAME_LOAD_D) real_load_op = RVOpcodes::LD;
+                else real_load_op = RVOpcodes::FLW;
+
+                auto& operands = instr_ptr->getOperands();
+                unsigned dest_vreg = static_cast<RegOperand*>(operands[0].get())->getVRegNum();
+                unsigned alloca_vreg = static_cast<RegOperand*>(operands[1].get())->getVRegNum();
+                int offset = frame_info.alloca_offsets.at(alloca_vreg);
+                auto addr_vreg = isel->getNewVReg(Type::getPointerType(Type::getIntType()));
+
+                // Expands to: addi addr_vreg, s0, offset
+                auto addi = std::make_unique<MachineInstr>(RVOpcodes::ADDI);
+                addi->addOperand(std::make_unique<RegOperand>(addr_vreg));
+                addi->addOperand(std::make_unique<RegOperand>(PhysicalReg::S0));
+                addi->addOperand(std::make_unique<ImmOperand>(offset));
+                new_instructions.push_back(std::move(addi));
+
+                // Expands to: lw/ld/flw dest_vreg, 0(addr_vreg)
+                auto load_instr = std::make_unique<MachineInstr>(real_load_op);
+                load_instr->addOperand(std::make_unique<RegOperand>(dest_vreg));
+                load_instr->addOperand(std::make_unique<MemOperand>(
+                    std::make_unique<RegOperand>(addr_vreg),
+                    std::make_unique<ImmOperand>(0)));
+                new_instructions.push_back(std::move(load_instr));
+
+            } else if (opcode == RVOpcodes::FRAME_STORE_W || opcode == RVOpcodes::FRAME_STORE_D || opcode == RVOpcodes::FRAME_STORE_F) {
+                RVOpcodes real_store_op;
+                if (opcode == RVOpcodes::FRAME_STORE_W) real_store_op = RVOpcodes::SW;
+                else if (opcode == RVOpcodes::FRAME_STORE_D) real_store_op = RVOpcodes::SD;
+                else real_store_op = RVOpcodes::FSW;
+
+                auto& operands = instr_ptr->getOperands();
+                unsigned src_vreg = static_cast<RegOperand*>(operands[0].get())->getVRegNum();
+                unsigned alloca_vreg = static_cast<RegOperand*>(operands[1].get())->getVRegNum();
+                int offset = frame_info.alloca_offsets.at(alloca_vreg);
+                auto addr_vreg = isel->getNewVReg(Type::getPointerType(Type::getIntType()));
+                
+                // Expands to: addi addr_vreg, s0, offset
+                auto addi = std::make_unique<MachineInstr>(RVOpcodes::ADDI);
+                addi->addOperand(std::make_unique<RegOperand>(addr_vreg));
+                addi->addOperand(std::make_unique<RegOperand>(PhysicalReg::S0));
+                addi->addOperand(std::make_unique<ImmOperand>(offset));
+                new_instructions.push_back(std::move(addi));
+
+                // Expands to: sw/sd/fsw src_vreg, 0(addr_vreg)
+                auto store_instr = std::make_unique<MachineInstr>(real_store_op);
+                store_instr->addOperand(std::make_unique<RegOperand>(src_vreg));
+                store_instr->addOperand(std::make_unique<MemOperand>(
+                    std::make_unique<RegOperand>(addr_vreg),
+                    std::make_unique<ImmOperand>(0)));
+                new_instructions.push_back(std::move(store_instr));
+
+            } else if (instr_ptr->getOpcode() == RVOpcodes::FRAME_ADDR) { 
+                auto& operands = instr_ptr->getOperands();
+                unsigned dest_vreg = static_cast<RegOperand*>(operands[0].get())->getVRegNum();
+                unsigned alloca_vreg = static_cast<RegOperand*>(operands[1].get())->getVRegNum();
+                int offset = frame_info.alloca_offsets.at(alloca_vreg);
+
+                // Expand `frame_addr rd, rs` into `addi rd, s0, offset`
+                auto addi = std::make_unique<MachineInstr>(RVOpcodes::ADDI);
+                addi->addOperand(std::make_unique<RegOperand>(dest_vreg));
+                addi->addOperand(std::make_unique<RegOperand>(PhysicalReg::S0));
+                addi->addOperand(std::make_unique<ImmOperand>(offset));
+                new_instructions.push_back(std::move(addi));
+
+            } else {
+                new_instructions.push_back(std::move(instr_ptr));
+            }
+        }
+        mbb->getInstructions() = std::move(new_instructions);
+    }
+}
+
+} // namespace sysy
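The offset arithmetic in `EliminateFrameIndicesPass` is easier to check when pulled out of the IR walk. The following is a minimal sketch under stated assumptions: `alignUp`, `paddedAllocaSize`, `FrameLayout`, `layoutLocals`, and `stackArgOffset` are hypothetical names introduced here for illustration, and object sizes are supplied as plain integers instead of being derived from `AllocaInst` types.

```cpp
#include <cassert>
#include <vector>

// Round size up to a multiple of align (align must be a power of two).
int alignUp(int size, int align) { return (size + align - 1) & ~(align - 1); }

// Same rule as the pass: objects of 256 bytes or more are rounded up to
// 4 bytes, smaller ones to 8 bytes, and nothing gets less than 4 bytes.
int paddedAllocaSize(int size) {
    const int padded = (size >= 256) ? alignUp(size, 4) : alignUp(size, 8);
    return padded == 0 ? 4 : padded;
}

// Hypothetical result type: one s0-relative offset per local, plus the
// total size of the locals area.
struct FrameLayout {
    std::vector<int> offsets;
    int locals_size;
};

// Lay locals out below s0, starting after the 16 bytes kept for ra and s0.
FrameLayout layoutLocals(const std::vector<int>& sizes) {
    int cursor = 16;
    FrameLayout layout{{}, 0};
    for (int size : sizes) {
        cursor += paddedAllocaSize(size);
        layout.offsets.push_back(-cursor); // negative offset from s0
    }
    layout.locals_size = cursor - 16;
    return layout;
}

// Offset from s0 of a stack-passed argument (arg_idx >= 8), 8 bytes per slot.
int stackArgOffset(int arg_idx) { return (arg_idx - 8) * 8; }
```

With sizes `{4, 12, 1024}` the sketch yields offsets `-24`, `-40`, `-1064` and a locals area of 1048 bytes. It also makes one consequence of the 4-byte rule visible: a 260-byte array followed by an 8-byte object places the latter at `-284`, which is only 4-byte aligned relative to a 16-byte-aligned `s0`; whether that matters depends on how the target handles misaligned 64-bit accesses.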
--- a/src/backend/RISCv64/Handler/LegalizeImmediates.cpp
+++ b/src/backend/RISCv64/Handler/LegalizeImmediates.cpp
@ -95,7 +95,7 @@ void LegalizeImmediatesPass::runOnMachineFunction(MachineFunction* mfunc) {
                case RVOpcodes::LB: case RVOpcodes::LH: case RVOpcodes::LW: case RVOpcodes::LD:
                case RVOpcodes::LBU: case RVOpcodes::LHU: case RVOpcodes::LWU:
                case RVOpcodes::SB: case RVOpcodes::SH: case RVOpcodes::SW: case RVOpcodes::SD:
-                case RVOpcodes::FLW: case RVOpcodes::FSW: {
+                case RVOpcodes::FLW: case RVOpcodes::FSW: case RVOpcodes::FLD: case RVOpcodes::FSD: {
                    auto& operands = instr_ptr->getOperands();
                    auto mem_op = static_cast<MemOperand*>(operands.back().get());
                    auto offset_op = mem_op->getOffset();
--- a/src/backend/RISCv64/Handler/PrologueEpilogueInsertion.cpp
+++ b/src/backend/RISCv64/Handler/PrologueEpilogueInsertion.cpp
@ -1,17 +1,22 @@
 #include "PrologueEpilogueInsertion.h"
+#include "RISCv64LLIR.h" // assumed to provide PhysicalReg, RVOpcodes, and related definitions
 #include "RISCv64ISel.h"
-#include "RISCv64RegAlloc.h" // 需要访问RegAlloc的结果
 #include <algorithm>
+#include <vector>
+#include <set>

 namespace sysy {

 char PrologueEpilogueInsertionPass::ID = 0;

 void PrologueEpilogueInsertionPass::runOnMachineFunction(MachineFunction* mfunc) {
+    StackFrameInfo& frame_info = mfunc->getFrameInfo();
+    Function* F = mfunc->getFunc();
+    RISCv64ISel* isel = mfunc->getISel();
+    
+    // 1. Remove KEEPALIVE pseudo-instructions
    for (auto& mbb : mfunc->getBlocks()) {
        auto& instrs = mbb->getInstructions();
-        
-        // 使用标准的 Erase-Remove Idiom 来删除满足条件的元素
        instrs.erase(
            std::remove_if(instrs.begin(), instrs.end(),
                [](const std::unique_ptr<MachineInstr>& instr) {
@ -22,39 +27,59 @@ void PrologueEpilogueInsertionPass::runOnMachineFunction(MachineFunction* mfunc)
        );
    }
    
-    StackFrameInfo& frame_info = mfunc->getFrameInfo();
-    Function* F = mfunc->getFunc();
-    RISCv64ISel* isel = mfunc->getISel();
-    
-    // [关键] 获取寄存器分配的结果 (vreg -> preg 的映射)
-    // RegAlloc Pass 必须已经运行过
+    // 2. Determine which callee-saved registers need to be saved
    auto& vreg_to_preg_map = frame_info.vreg_to_preg_map;
+    std::set<PhysicalReg> used_callee_saved_regs_set;
+    const auto& callee_saved_int = getCalleeSavedIntRegs();
+    const auto& callee_saved_fp = getCalleeSavedFpRegs();

-    // 完全遵循 AsmPrinter 中的计算逻辑
+    for (const auto& pair : vreg_to_preg_map) {
+        PhysicalReg preg = pair.second;
+        bool is_int_cs = std::find(callee_saved_int.begin(), callee_saved_int.end(), preg) != callee_saved_int.end();
+        bool is_fp_cs = std::find(callee_saved_fp.begin(), callee_saved_fp.end(), preg) != callee_saved_fp.end();
+        if ((is_int_cs && preg != PhysicalReg::S0) || is_fp_cs) {
+            used_callee_saved_regs_set.insert(preg);
+        }
+    }
+    frame_info.callee_saved_regs_to_store.assign(
+        used_callee_saved_regs_set.begin(), used_callee_saved_regs_set.end()
+    );
+    std::sort(frame_info.callee_saved_regs_to_store.begin(), frame_info.callee_saved_regs_to_store.end());
+    frame_info.callee_saved_size = frame_info.callee_saved_regs_to_store.size() * 8;
+
+    // 3. Compute the final total stack frame size, including a frame-size check
    int total_stack_size = frame_info.locals_size + 
                           frame_info.spill_size + 
                           frame_info.callee_saved_size + 
-                           16; // 为 ra 和 s0 固定的16字节
+                           16;
    
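+    // Frame-size check: warn when the frame exceeds a soft limit chosen to leave room for large arrays
+    const int MAX_STACK_FRAME_SIZE = 8192; // 8 KiB soft limit; e.g. a 256*4*2 = 2048-byte array fits well within it
+    if (total_stack_size > MAX_STACK_FRAME_SIZE) {
+        // The frame is emitted unchanged; this branch only reports the oversized frame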
+        std::cerr << "Warning: Stack frame size " << total_stack_size 
+                  << " exceeds recommended limit " << MAX_STACK_FRAME_SIZE << " for function " 
+                  << mfunc->getName() << std::endl;
+    }
+    
+    // Keep alignment overhead low: round up to 16 bytes rather than any larger alignment
    int aligned_stack_size = (total_stack_size + 15) & ~15;
    frame_info.total_size = aligned_stack_size;

-    // 只有在需要分配栈空间时才生成指令
    if (aligned_stack_size > 0) {
-        // --- 1. 插入序言 ---
+        // --- 4. Insert the complete prologue ---
        MachineBasicBlock* entry_block = mfunc->getBlocks().front().get();
        auto& entry_instrs = entry_block->getInstructions();
-
        std::vector<std::unique_ptr<MachineInstr>> prologue_instrs;

-        // 1. addi sp, sp, -aligned_stack_size
+        // 4.1. Allocate the stack frame
        auto alloc_stack = std::make_unique<MachineInstr>(RVOpcodes::ADDI);
        alloc_stack->addOperand(std::make_unique<RegOperand>(PhysicalReg::SP));
        alloc_stack->addOperand(std::make_unique<RegOperand>(PhysicalReg::SP));
        alloc_stack->addOperand(std::make_unique<ImmOperand>(-aligned_stack_size));
        prologue_instrs.push_back(std::move(alloc_stack));

-        // 2. sd ra, (aligned_stack_size - 8)(sp)
+        // 4.2. Save ra and s0
        auto save_ra = std::make_unique<MachineInstr>(RVOpcodes::SD);
        save_ra->addOperand(std::make_unique<RegOperand>(PhysicalReg::RA));
        save_ra->addOperand(std::make_unique<MemOperand>(
@ -62,8 +87,6 @@ void PrologueEpilogueInsertionPass::runOnMachineFunction(MachineFunction* mfunc)
            std::make_unique<ImmOperand>(aligned_stack_size - 8)
        ));
        prologue_instrs.push_back(std::move(save_ra));
-
-        // 3. sd s0, (aligned_stack_size - 16)(sp)
        auto save_fp = std::make_unique<MachineInstr>(RVOpcodes::SD);
        save_fp->addOperand(std::make_unique<RegOperand>(PhysicalReg::S0));
        save_fp->addOperand(std::make_unique<MemOperand>(
@ -72,66 +95,55 @@ void PrologueEpilogueInsertionPass::runOnMachineFunction(MachineFunction* mfunc)
        ));
        prologue_instrs.push_back(std::move(save_fp));
        
-        // 4. addi s0, sp, aligned_stack_size
+        // 4.3. Set up the new frame pointer s0
        auto set_fp = std::make_unique<MachineInstr>(RVOpcodes::ADDI);
        set_fp->addOperand(std::make_unique<RegOperand>(PhysicalReg::S0));
        set_fp->addOperand(std::make_unique<RegOperand>(PhysicalReg::SP));
        set_fp->addOperand(std::make_unique<ImmOperand>(aligned_stack_size));
        prologue_instrs.push_back(std::move(set_fp));
-
-        // --- Once s0 is set up, load stack-passed arguments into their physical registers ---
-        if (F && isel) {
-            int arg_idx = 0;
-            for (Argument* arg : F->getArguments()) {
-                if (arg_idx >= 8) {
-                    unsigned vreg = isel->getVReg(arg);
-                    
-                    if (frame_info.alloca_offsets.count(vreg) && vreg_to_preg_map.count(vreg)) {
-                        int offset = frame_info.alloca_offsets.at(vreg);
-                        PhysicalReg dest_preg = vreg_to_preg_map.at(vreg);
-                        Type* arg_type = arg->getType();
-
-                        if (arg_type->isFloat()) {
-                            auto load_arg = std::make_unique<MachineInstr>(RVOpcodes::FLW);
-                            load_arg->addOperand(std::make_unique<RegOperand>(dest_preg));
-                            load_arg->addOperand(std::make_unique<MemOperand>(
-                                std::make_unique<RegOperand>(PhysicalReg::S0),
-                                std::make_unique<ImmOperand>(offset)
-                            ));
-                            prologue_instrs.push_back(std::move(load_arg));
-                        } else {
-                            RVOpcodes load_op = arg_type->isPointer() ? RVOpcodes::LD : RVOpcodes::LW;
-                            auto load_arg = std::make_unique<MachineInstr>(load_op);
-                            load_arg->addOperand(std::make_unique<RegOperand>(dest_preg));
-                            load_arg->addOperand(std::make_unique<MemOperand>(
-                                std::make_unique<RegOperand>(PhysicalReg::S0),
-                                std::make_unique<ImmOperand>(offset)
-                            ));
-                            prologue_instrs.push_back(std::move(load_arg));
-                        }
-                    }
-                }
-                arg_idx++;
-            }
-        }
        
-        // Determine the insertion point
-        auto insert_pos = entry_instrs.begin();
-        
-        // Insert all prologue instructions in one step
-        if (!prologue_instrs.empty()) {
-            entry_instrs.insert(insert_pos, 
-                                std::make_move_iterator(prologue_instrs.begin()),
-                                std::make_move_iterator(prologue_instrs.end()));
+        // 4.4. Save every callee-saved register that is actually used
+        int next_available_offset = -(16 + frame_info.locals_size + frame_info.spill_size);
+        for (const auto& reg : frame_info.callee_saved_regs_to_store) {
+            // Switched to "update first, then use" logic
+            next_available_offset -= 8; // first reserve the next free slot for the current register
+            RVOpcodes store_op = isFPR(reg) ? RVOpcodes::FSD : RVOpcodes::SD;
+            auto save_cs_reg = std::make_unique<MachineInstr>(store_op);
+            save_cs_reg->addOperand(std::make_unique<RegOperand>(reg));
+            save_cs_reg->addOperand(std::make_unique<MemOperand>(
+                std::make_unique<RegOperand>(PhysicalReg::S0),
+                std::make_unique<ImmOperand>(next_available_offset) // use the freshly computed offset
+            ));
+            prologue_instrs.push_back(std::move(save_cs_reg));
+            // No decrement is needed at the end of the loop anymore
        }

-        // --- 2. Insert the epilogue (this logic is unchanged) ---
+        // 4.5. Insert all generated prologue instructions at the function entry in one step
+        entry_instrs.insert(entry_instrs.begin(), 
+                            std::make_move_iterator(prologue_instrs.begin()),
+                            std::make_move_iterator(prologue_instrs.end()));
+
+        // --- 5. Insert the full epilogue ---
        for (auto& mbb : mfunc->getBlocks()) {
            for (auto it = mbb->getInstructions().begin(); it != mbb->getInstructions().end(); ++it) {
                if ((*it)->getOpcode() == RVOpcodes::RET) {
                    std::vector<std::unique_ptr<MachineInstr>> epilogue_instrs;
+                    
+                    // 5.1. Restore the callee-saved registers
+                    int next_available_offset_restore = -(16 + frame_info.locals_size + frame_info.spill_size);
+                    for (const auto& reg : frame_info.callee_saved_regs_to_store) {
+                        next_available_offset_restore -= 8; // step to the current register's slot (mirrors the save loop)
+                        RVOpcodes load_op = isFPR(reg) ? RVOpcodes::FLD : RVOpcodes::LD;
+                        auto restore_cs_reg = std::make_unique<MachineInstr>(load_op);
+                        restore_cs_reg->addOperand(std::make_unique<RegOperand>(reg));
+                        restore_cs_reg->addOperand(std::make_unique<MemOperand>(
+                            std::make_unique<RegOperand>(PhysicalReg::S0),
+                            std::make_unique<ImmOperand>(next_available_offset_restore) // use the current offset
+                        ));
+                        epilogue_instrs.push_back(std::move(restore_cs_reg));
+                    }

-                    // 1. ld ra
+                    // 5.2. Restore ra and s0
                    auto restore_ra = std::make_unique<MachineInstr>(RVOpcodes::LD);
                    restore_ra->addOperand(std::make_unique<RegOperand>(PhysicalReg::RA));
                    restore_ra->addOperand(std::make_unique<MemOperand>(
@ -139,8 +151,6 @@ void PrologueEpilogueInsertionPass::runOnMachineFunction(MachineFunction* mfunc)
                        std::make_unique<ImmOperand>(aligned_stack_size - 8)
                    ));
                    epilogue_instrs.push_back(std::move(restore_ra));
-
-                    // 2. ld s0
                    auto restore_fp = std::make_unique<MachineInstr>(RVOpcodes::LD);
                    restore_fp->addOperand(std::make_unique<RegOperand>(PhysicalReg::S0));
                    restore_fp->addOperand(std::make_unique<MemOperand>(
@ -149,18 +159,18 @@ void PrologueEpilogueInsertionPass::runOnMachineFunction(MachineFunction* mfunc)
                    ));
                    epilogue_instrs.push_back(std::move(restore_fp));

-                    // 3. addi sp, sp, aligned_stack_size
+                    // 5.3. Release the stack frame
                    auto dealloc_stack = std::make_unique<MachineInstr>(RVOpcodes::ADDI);
                    dealloc_stack->addOperand(std::make_unique<RegOperand>(PhysicalReg::SP));
                    dealloc_stack->addOperand(std::make_unique<RegOperand>(PhysicalReg::SP));
                    dealloc_stack->addOperand(std::make_unique<ImmOperand>(aligned_stack_size));
                    epilogue_instrs.push_back(std::move(dealloc_stack));

-                    if (!epilogue_instrs.empty()) {
-                        mbb->getInstructions().insert(it, 
-                                                    std::make_move_iterator(epilogue_instrs.begin()),
-                                                    std::make_move_iterator(epilogue_instrs.end()));
-                    }
+                    // Insert the epilogue instructions before the RET instruction
+                    mbb->getInstructions().insert(it, 
+                                                  std::make_move_iterator(epilogue_instrs.begin()),
+                                                  std::make_move_iterator(epilogue_instrs.end()));
+                    
                    goto next_block;
                }
            }
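The frame arithmetic used by the pass above can be sketched in isolation. This is a minimal illustration rather than the project's code: `alignTo16`, `calleeSavedOffsets` and `fitsInImm12` are hypothetical helper names, and the layout (16 bytes for ra/s0, then locals, then spills, then callee-saved slots, all addressed relative to s0) is taken from the diff.

```cpp
#include <cstdint>
#include <vector>

// Round a frame size up to the 16-byte stack alignment used by the pass.
inline int alignTo16(int size) { return (size + 15) & ~15; }

// s0-relative offsets of the callee-saved slots. They sit below the 16 bytes
// reserved for ra/s0, the locals area and the spill area.
inline std::vector<int> calleeSavedOffsets(int localsSize, int spillSize, int numRegs) {
    std::vector<int> offsets;
    int offset = -(16 + localsSize + spillSize);
    for (int i = 0; i < numRegs; ++i) {
        offset -= 8;                // update first ...
        offsets.push_back(offset);  // ... then use, as in the save and restore loops
    }
    return offsets;
}

// ADDI and the load/store offset field encode a 12-bit signed immediate.
inline bool fitsInImm12(std::int64_t value) { return value >= -2048 && value <= 2047; }
```

One consequence worth checking: frames close to the 8192-byte warning limit produce `-aligned_stack_size` values that `fitsInImm12` rejects, so a single `addi sp, sp, -N` cannot encode them directly. Whether a later pass or the assembler legalizes such immediates is not visible in this diff.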
--- a/src/backend/RISCv64/Optimize/DivStrengthReduction.cpp
+++ b/src/backend/RISCv64/Optimize/DivStrengthReduction.cpp
@ -0,0 +1,282 @@
+#include "DivStrengthReduction.h"
+#include <cmath>
+#include <cstdint>
+
+namespace sysy {
+
+char DivStrengthReduction::ID = 0;
+
+bool DivStrengthReduction::runOnFunction(Function *F, AnalysisManager& AM) {
+    // This pass works on MachineFunction level, not IR level
+    return false;
+}
+
+void DivStrengthReduction::runOnMachineFunction(MachineFunction *mfunc) {
+    if (!mfunc)
+        return;
+
+    bool debug = false; // Set to true for debugging
+    if (debug)
+        std::cout << "Running DivStrengthReduction optimization..." << std::endl;
+
+    int next_temp_reg = 1000; // temporaries are numbered from 1000 up; assumes no existing vreg already uses these numbers
+    auto createTempReg = [&]() -> int {
+        return next_temp_reg++;
+    };
+
+    struct MagicInfo {
+        int64_t magic;
+        int shift;
+    };
+    
+    auto computeMagic = [](int64_t d, bool is_32bit) -> MagicInfo {
+        int word_size = is_32bit ? 32 : 64;
+        uint64_t ad = std::abs(d);
+        
+        if (ad == 0) return {0, 0};
+        
+        int l = std::floor(std::log2(ad));
+        if ((ad & (ad - 1)) == 0) { // power of 2
+             l = 0; // special case for power of 2, shift will be calculated differently
+        }
+
+        __int128_t one = 1;
+        __int128_t num;
+        int total_shift;
+
+        if (is_32bit) {
+            total_shift = 31 + l;
+            num = one << total_shift;
+        } else {
+            total_shift = 63 + l;
+            num = one << total_shift;
+        }
+        
+        __int128_t den = ad;
+        int64_t magic = (num / den) + 1;
+        
+        return {magic, total_shift};
+    };
+
+    auto isPowerOfTwo = [](int64_t n) -> bool {
+        return n > 0 && (n & (n - 1)) == 0;
+    };
+
+    auto getPowerOfTwoExponent = [](int64_t n) -> int {
+        if (n <= 0 || (n & (n - 1)) != 0) return -1;
+        int shift = 0;
+        while (n > 1) {
+            n >>= 1;
+            shift++;
+        }
+        return shift;
+    };
+
+    struct InstructionReplacement {
+        size_t index;
+        size_t count_to_erase;
+        std::vector<std::unique_ptr<MachineInstr>> newInstrs;
+    };
+    
+    for (auto &mbb_uptr : mfunc->getBlocks()) {
+        auto &mbb = *mbb_uptr;
+        auto &instrs = mbb.getInstructions();
+        std::vector<InstructionReplacement> replacements;
+        
+        for (size_t i = 0; i < instrs.size(); ++i) {
+            auto *instr = instrs[i].get();
+            
+            bool is_32bit = (instr->getOpcode() == RVOpcodes::DIVW);
+            
+            if (instr->getOpcode() != RVOpcodes::DIV && !is_32bit) {
+                continue;
+            }
+            
+            if (instr->getOperands().size() != 3) {
+                continue;
+            }
+            
+            auto *dst_op = instr->getOperands()[0].get();
+            auto *src1_op = instr->getOperands()[1].get();
+            auto *src2_op = instr->getOperands()[2].get();
+
+            int64_t divisor = 0;
+            bool const_divisor_found = false;
+            size_t instructions_to_replace = 1;
+
+            if (src2_op->getKind() == MachineOperand::KIND_IMM) {
+                divisor = static_cast<ImmOperand *>(src2_op)->getValue();
+                const_divisor_found = true;
+            } else if (src2_op->getKind() == MachineOperand::KIND_REG) {
+                if (i > 0) {
+                    auto *prev_instr = instrs[i - 1].get();
+                    if (prev_instr->getOpcode() == RVOpcodes::LI && prev_instr->getOperands().size() == 2) {
+                        auto *li_dst_op = prev_instr->getOperands()[0].get();
+                        auto *li_imm_op = prev_instr->getOperands()[1].get();
+                        if (li_dst_op->getKind() == MachineOperand::KIND_REG && li_imm_op->getKind() == MachineOperand::KIND_IMM) {
+                            auto *div_reg_op = static_cast<RegOperand *>(src2_op);
+                            auto *li_dst_reg_op = static_cast<RegOperand *>(li_dst_op);
+                            if (div_reg_op->isVirtual() && li_dst_reg_op->isVirtual() &&
+                                div_reg_op->getVRegNum() == li_dst_reg_op->getVRegNum()) {
+                                divisor = static_cast<ImmOperand *>(li_imm_op)->getValue();
+                                const_divisor_found = true;
+                                instructions_to_replace = 2;
+                            }
+                        }
+                    }
+                }
+            }
+
+            if (!const_divisor_found) {
+                continue;
+            }
+            
+            auto *dst_reg = static_cast<RegOperand *>(dst_op);
+            auto *src1_reg = static_cast<RegOperand *>(src1_op);
+            
+            if (divisor == 0) continue;
+            
+            std::vector<std::unique_ptr<MachineInstr>> newInstrs;
+            
+            if (divisor == 1) {
+                auto moveInstr = std::make_unique<MachineInstr>(is_32bit ? RVOpcodes::ADDW : RVOpcodes::ADD);
+                moveInstr->addOperand(std::make_unique<RegOperand>(*dst_reg));
+                moveInstr->addOperand(std::make_unique<RegOperand>(*src1_reg));
+                moveInstr->addOperand(std::make_unique<RegOperand>(PhysicalReg::ZERO));
+                newInstrs.push_back(std::move(moveInstr));
+            }
+            else if (divisor == -1) {
+                auto negInstr = std::make_unique<MachineInstr>(is_32bit ? RVOpcodes::SUBW : RVOpcodes::SUB);
+                negInstr->addOperand(std::make_unique<RegOperand>(*dst_reg));
+                negInstr->addOperand(std::make_unique<RegOperand>(PhysicalReg::ZERO));
+                negInstr->addOperand(std::make_unique<RegOperand>(*src1_reg));
+                newInstrs.push_back(std::move(negInstr));
+            }
+            else if (isPowerOfTwo(std::abs(divisor))) {
+                int shift = getPowerOfTwoExponent(std::abs(divisor));
+                int temp_reg = createTempReg();
+                
+                auto sraSignInstr = std::make_unique<MachineInstr>(is_32bit ? RVOpcodes::SRAIW : RVOpcodes::SRAI);
+                sraSignInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                sraSignInstr->addOperand(std::make_unique<RegOperand>(*src1_reg));
+                sraSignInstr->addOperand(std::make_unique<ImmOperand>(is_32bit ? 31 : 63));
+                newInstrs.push_back(std::move(sraSignInstr));
+                
+                auto srlInstr = std::make_unique<MachineInstr>(is_32bit ? RVOpcodes::SRLIW : RVOpcodes::SRLI);
+                srlInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                srlInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                srlInstr->addOperand(std::make_unique<ImmOperand>((is_32bit ? 32 : 64) - shift));
+                newInstrs.push_back(std::move(srlInstr));
+                
+                auto addInstr = std::make_unique<MachineInstr>(is_32bit ? RVOpcodes::ADDW : RVOpcodes::ADD);
+                addInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                addInstr->addOperand(std::make_unique<RegOperand>(*src1_reg));
+                addInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                newInstrs.push_back(std::move(addInstr));
+                
+                auto sraInstr = std::make_unique<MachineInstr>(is_32bit ? RVOpcodes::SRAIW : RVOpcodes::SRAI);
+                sraInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                sraInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                sraInstr->addOperand(std::make_unique<ImmOperand>(shift));
+                newInstrs.push_back(std::move(sraInstr));
+
+                if (divisor < 0) {
+                    auto negInstr = std::make_unique<MachineInstr>(is_32bit ? RVOpcodes::SUBW : RVOpcodes::SUB);
+                    negInstr->addOperand(std::make_unique<RegOperand>(*dst_reg));
+                    negInstr->addOperand(std::make_unique<RegOperand>(PhysicalReg::ZERO));
+                    negInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                    newInstrs.push_back(std::move(negInstr));
+                } else {
+                    auto moveInstr = std::make_unique<MachineInstr>(is_32bit ? RVOpcodes::ADDW : RVOpcodes::ADD);
+                    moveInstr->addOperand(std::make_unique<RegOperand>(*dst_reg));
+                    moveInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                    moveInstr->addOperand(std::make_unique<RegOperand>(PhysicalReg::ZERO));
+                    newInstrs.push_back(std::move(moveInstr));
+                }
+            }
+            else {
+                auto magic_info = computeMagic(divisor, is_32bit);
+                int magic_reg = createTempReg();
+                int temp_reg = createTempReg();
+
+                auto loadInstr = std::make_unique<MachineInstr>(RVOpcodes::LI);
+                loadInstr->addOperand(std::make_unique<RegOperand>(magic_reg));
+                loadInstr->addOperand(std::make_unique<ImmOperand>(magic_info.magic));
+                newInstrs.push_back(std::move(loadInstr));
+
+                if (is_32bit) {
+                    auto mulInstr = std::make_unique<MachineInstr>(RVOpcodes::MUL);
+                    mulInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                    mulInstr->addOperand(std::make_unique<RegOperand>(*src1_reg));
+                    mulInstr->addOperand(std::make_unique<RegOperand>(magic_reg));
+                    newInstrs.push_back(std::move(mulInstr));
+
+                    auto sraInstr = std::make_unique<MachineInstr>(RVOpcodes::SRAI);
+                    sraInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                    sraInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                    sraInstr->addOperand(std::make_unique<ImmOperand>(magic_info.shift));
+                    newInstrs.push_back(std::move(sraInstr));
+                } else {
+                    auto mulhInstr = std::make_unique<MachineInstr>(RVOpcodes::MULH);
+                    mulhInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                    mulhInstr->addOperand(std::make_unique<RegOperand>(*src1_reg));
+                    mulhInstr->addOperand(std::make_unique<RegOperand>(magic_reg));
+                    newInstrs.push_back(std::move(mulhInstr));
+                    
+                    int post_shift = magic_info.shift - 63;
+                    if (post_shift > 0) {
+                        auto sraInstr = std::make_unique<MachineInstr>(RVOpcodes::SRAI);
+                        sraInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                        sraInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                        sraInstr->addOperand(std::make_unique<ImmOperand>(post_shift));
+                        newInstrs.push_back(std::move(sraInstr));
+                    }
+                }
+                
+                int sign_reg = createTempReg();
+                auto sraSignInstr = std::make_unique<MachineInstr>(is_32bit ? RVOpcodes::SRAIW : RVOpcodes::SRAI);
+                sraSignInstr->addOperand(std::make_unique<RegOperand>(sign_reg));
+                sraSignInstr->addOperand(std::make_unique<RegOperand>(*src1_reg));
+                sraSignInstr->addOperand(std::make_unique<ImmOperand>(is_32bit ? 31 : 63));
+                newInstrs.push_back(std::move(sraSignInstr));
+
+                auto subInstr = std::make_unique<MachineInstr>(is_32bit ? RVOpcodes::SUBW : RVOpcodes::SUB);
+                subInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                subInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                subInstr->addOperand(std::make_unique<RegOperand>(sign_reg));
+                newInstrs.push_back(std::move(subInstr));
+
+                if (divisor < 0) {
+                    auto negInstr = std::make_unique<MachineInstr>(is_32bit ? RVOpcodes::SUBW : RVOpcodes::SUB);
+                    negInstr->addOperand(std::make_unique<RegOperand>(*dst_reg));
+                    negInstr->addOperand(std::make_unique<RegOperand>(PhysicalReg::ZERO));
+                    negInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                    newInstrs.push_back(std::move(negInstr));
+                } else {
+                    auto moveInstr = std::make_unique<MachineInstr>(is_32bit ? RVOpcodes::ADDW : RVOpcodes::ADD);
+                    moveInstr->addOperand(std::make_unique<RegOperand>(*dst_reg));
+                    moveInstr->addOperand(std::make_unique<RegOperand>(temp_reg));
+                    moveInstr->addOperand(std::make_unique<RegOperand>(PhysicalReg::ZERO));
+                    newInstrs.push_back(std::move(moveInstr));
+                }
+            }
+            
+            if (!newInstrs.empty()) {
+                size_t start_index = i;
+                if (instructions_to_replace == 2) {
+                    start_index = i - 1;
+                }
+                replacements.push_back({start_index, instructions_to_replace, std::move(newInstrs)});
+            }
+        }
+        
+        for (auto it = replacements.rbegin(); it != replacements.rend(); ++it) {
+            instrs.erase(instrs.begin() + it->index, instrs.begin() + it->index + it->count_to_erase);
+            instrs.insert(instrs.begin() + it->index, 
+                         std::make_move_iterator(it->newInstrs.begin()),
+                         std::make_move_iterator(it->newInstrs.end()));
+        }
+    }
+}
+
+} // namespace sysy
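The multiply-and-shift idea behind the non-power-of-two branch can be checked outside the compiler. The sketch below is a standalone illustration for the 32-bit case only, with hypothetical names (`Magic32`, `computeMagic32`, `divideByConstant`); it is not the pass's code. It picks `l = ceil(log2 |d|)`, the choice used in the classic Granlund-Montgomery construction. `computeMagic` above uses `floor(log2 |d|)` instead, and working that formula through by hand for `d = 7`, `n = 2147483645` gives 306783378 where the true quotient is 306783377, so the rounding choice deserves a test in the real pass.

```cpp
#include <cstdint>

// Hypothetical helper types; not part of DivStrengthReduction.
struct Magic32 {
    std::int64_t multiplier;
    int shift;
};

// Magic constant for signed 32-bit division by d.
// Precondition: |d| >= 3 and |d| is not a power of two.
inline Magic32 computeMagic32(std::int32_t d) {
    std::int64_t wide = d;
    std::uint64_t ad = static_cast<std::uint64_t>(wide < 0 ? -wide : wide);
    int l = 0;
    while ((std::uint64_t{1} << l) < ad) ++l;  // l = ceil(log2 |d|)
    int shift = 31 + l;                        // at most 62 for 32-bit divisors
    std::int64_t m = static_cast<std::int64_t>((std::uint64_t{1} << shift) / ad) + 1;
    return {m, shift};
}

// Computes n / d, truncating toward zero, without a divide instruction.
// Relies on >> being an arithmetic shift for negative values (guaranteed
// since C++20, and the usual behaviour of mainstream compilers before that).
inline std::int32_t divideByConstant(std::int32_t n, std::int32_t d) {
    Magic32 mg = computeMagic32(d);
    std::int64_t q = (static_cast<std::int64_t>(n) * mg.multiplier) >> mg.shift;  // floor
    q -= (n >> 31);  // n >> 31 is -1 for negative n, so this adds 1
    return static_cast<std::int32_t>(d < 0 ? -q : q);
}
```

The multiplier here can need up to 32 bits, which is fine on RV64 because the product of a sign-extended 32-bit dividend and such a multiplier stays below 2^63 in magnitude. The 64-bit `MULH` path would need a different correction step and is not covered by this sketch.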
--- a/src/backend/RISCv64/Optimize/Peephole.cpp
+++ b/src/backend/RISCv64/Optimize/Peephole.cpp
@ -4,6 +4,7 @@
 namespace sysy {

 char PeepholeOptimizer::ID = 0;
+bool PeepholeOptimizer::fusedMulAddEnabled = true; // floating-point multiply-add fusion is enabled by default

 bool PeepholeOptimizer::runOnFunction(Function *F, AnalysisManager& AM) {
    // This pass works on MachineFunction level, not IR level
@ -634,6 +635,99 @@ void PeepholeOptimizer::runOnMachineFunction(MachineFunction *mfunc) {
          }
        }
      }
+      // 8. Floating-point multiply-add fusion
+      // 8.1 fmul.s t1, t2, t3; fadd.s t4, t1, t5 -> fmadd.s t4, t2, t3, t5
+      else if (isFusedMulAddEnabled() &&
+               mi1->getOpcode() == RVOpcodes::FMUL_S &&
+               mi2->getOpcode() == RVOpcodes::FADD_S) {
+        if (mi1->getOperands().size() == 3 && mi2->getOperands().size() == 3) {
+          auto *fmul_dst = static_cast<RegOperand *>(mi1->getOperands()[0].get());
+          auto *fmul_src1 = static_cast<RegOperand *>(mi1->getOperands()[1].get());
+          auto *fmul_src2 = static_cast<RegOperand *>(mi1->getOperands()[2].get());
+
+          auto *fadd_dst = static_cast<RegOperand *>(mi2->getOperands()[0].get());
+          auto *fadd_src1 = static_cast<RegOperand *>(mi2->getOperands()[1].get());
+          auto *fadd_src2 = static_cast<RegOperand *>(mi2->getOperands()[2].get());
+
+          // Check whether the fmul destination is the first source operand of the fadd
+          if (areRegsEqual(fmul_dst, fadd_src1)) {
+            // Check whether the intermediate register is still used later on
+            bool canOptimize = true;
+            for (size_t j = i + 2; j < instrs.size(); ++j) {
+              auto *later_instr = instrs[j].get();
+              
+              // If the intermediate register is redefined, the fusion is safe
+              if (isRegRedefinedAt(later_instr, fmul_dst, areRegsEqual)) {
+                break;
+              }
+              
+              // If the intermediate register is used, the fusion is not safe
+              if (isRegUsedLater(instrs, fmul_dst, j)) {
+                canOptimize = false;
+                break;
+              }
+            }
+
+            if (canOptimize) {
+              // Create the new FMADD_S instruction: fmadd.s t4, t2, t3, t5
+              auto newInstr = std::make_unique<MachineInstr>(RVOpcodes::FMADD_S);
+              newInstr->addOperand(std::make_unique<RegOperand>(*fadd_dst));
+              newInstr->addOperand(std::make_unique<RegOperand>(*fmul_src1));
+              newInstr->addOperand(std::make_unique<RegOperand>(*fmul_src2));
+              newInstr->addOperand(std::make_unique<RegOperand>(*fadd_src2));
+              instrs[i + 1] = std::move(newInstr);
+              instrs.erase(instrs.begin() + i);
+              changed = true;
+            }
+          }
+        }
+      }
+      // 8.2 fmul.s t1, t2, t3; fadd.s t4, t5, t1 -> fmadd.s t4, t2, t3, t5 (NOTE: this else-if repeats the condition of 8.1, so it is unreachable as written)
+      else if (isFusedMulAddEnabled() &&
+               mi1->getOpcode() == RVOpcodes::FMUL_S &&
+               mi2->getOpcode() == RVOpcodes::FADD_S) {
+        if (mi1->getOperands().size() == 3 && mi2->getOperands().size() == 3) {
+          auto *fmul_dst = static_cast<RegOperand *>(mi1->getOperands()[0].get());
+          auto *fmul_src1 = static_cast<RegOperand *>(mi1->getOperands()[1].get());
+          auto *fmul_src2 = static_cast<RegOperand *>(mi1->getOperands()[2].get());
+
+          auto *fadd_dst = static_cast<RegOperand *>(mi2->getOperands()[0].get());
+          auto *fadd_src1 = static_cast<RegOperand *>(mi2->getOperands()[1].get());
+          auto *fadd_src2 = static_cast<RegOperand *>(mi2->getOperands()[2].get());
+
+          // Check whether the fmul destination is the second source operand of the fadd
+          if (areRegsEqual(fmul_dst, fadd_src2)) {
+            // Check whether the intermediate register is still used later on
+            bool canOptimize = true;
+            for (size_t j = i + 2; j < instrs.size(); ++j) {
+              auto *later_instr = instrs[j].get();
+              
+              // If the intermediate register is redefined, the fusion is safe
+              if (isRegRedefinedAt(later_instr, fmul_dst, areRegsEqual)) {
+                break;
+              }
+              
+              // If the intermediate register is used, the fusion is not safe
+              if (isRegUsedLater(instrs, fmul_dst, j)) {
+                canOptimize = false;
+                break;
+              }
+            }
+
+            if (canOptimize) {
+              // Create the new FMADD_S instruction: fmadd.s t4, t2, t3, t5
+              auto newInstr = std::make_unique<MachineInstr>(RVOpcodes::FMADD_S);
+              newInstr->addOperand(std::make_unique<RegOperand>(*fadd_dst));
+              newInstr->addOperand(std::make_unique<RegOperand>(*fmul_src1));
+              newInstr->addOperand(std::make_unique<RegOperand>(*fmul_src2));
+              newInstr->addOperand(std::make_unique<RegOperand>(*fadd_src1));
+              instrs[i + 1] = std::move(newInstr);
+              instrs.erase(instrs.begin() + i);
+              changed = true;
+            }
+          }
+        }
+      }

      // Adjust the traversal index depending on whether anything changed
      if (!changed) {
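Rules 8.1 and 8.2 differ only in which `fadd.s` operand holds the product, so one way to keep both reachable is a single rule that checks both positions. The sketch below shows that shape on a toy instruction list. `ToyInstr`, `readBeforeRedefined` and `fuseMulAdd` are hypothetical stand-ins, not the `MachineInstr` API, and the liveness check assumes the product register is not live out of the block.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction "op dst, a, b[, c]"; c is -1 when unused.
struct ToyInstr {
    std::string op;
    int dst, a, b, c;
};

// True if reg is read at or after index `from` before it is overwritten.
inline bool readBeforeRedefined(const std::vector<ToyInstr>& code, int reg, std::size_t from) {
    for (std::size_t j = from; j < code.size(); ++j) {
        if (code[j].a == reg || code[j].b == reg || code[j].c == reg) return true;
        if (code[j].dst == reg) return false;
    }
    return false;  // assumption: reg is not live out of this block
}

// Fuses adjacent "fmul.s t, x, y; fadd.s d, t, z" (product in either addend
// position) into "fmadd.s d, x, y, z". Returns the number of fusions.
inline int fuseMulAdd(std::vector<ToyInstr>& code) {
    int fused = 0;
    for (std::size_t i = 0; i + 1 < code.size(); ++i) {
        const ToyInstr mul = code[i];
        const ToyInstr add = code[i + 1];
        if (mul.op != "fmul.s" || add.op != "fadd.s") continue;
        const bool first = add.a == mul.dst;
        const bool second = add.b == mul.dst;
        if (first == second) continue;  // no match, or t + t which needs the product twice
        // The product must be dead after the add unless the add overwrites it.
        if (add.dst != mul.dst && readBeforeRedefined(code, mul.dst, i + 2)) continue;
        const int addend = first ? add.b : add.a;
        code[i + 1] = ToyInstr{"fmadd.s", add.dst, mul.a, mul.b, addend};
        code.erase(code.begin() + static_cast<std::ptrdiff_t>(i));
        ++fused;
    }
    return fused;
}
```

A fused multiply-add rounds once instead of twice, so results can differ in the last bit from the separate `fmul.s`/`fadd.s` pair; that is a plausible reason to keep a switch such as `fusedMulAddEnabled`.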
--- a/src/backend/RISCv64/RISCv64AsmPrinter.cpp
+++ b/src/backend/RISCv64/RISCv64AsmPrinter.cpp
@ -1,26 +1,10 @@
 #include "RISCv64AsmPrinter.h"
 #include "RISCv64ISel.h"
 #include <stdexcept>
-
+#include <sstream>
+#include <iostream>
 namespace sysy {

-// Check whether this is a memory load/store instruction, which needs a special print format
-bool isMemoryOp(RVOpcodes opcode) {
-    switch (opcode) {
-        // --- Integer loads/stores (original logic) ---
-        case RVOpcodes::LB: case RVOpcodes::LH: case RVOpcodes::LW: case RVOpcodes::LD:
-        case RVOpcodes::LBU: case RVOpcodes::LHU: case RVOpcodes::LWU:
-        case RVOpcodes::SB: case RVOpcodes::SH: case RVOpcodes::SW: case RVOpcodes::SD:
-        case RVOpcodes::FLW:
-        case RVOpcodes::FSW:
-        // If double precision is supported in the future, add FLD/FSD here as well
-        // case RVOpcodes::FLD:
-        // case RVOpcodes::FSD:
-            return true;
-        default:
-            return false;
-    }
-}

 RISCv64AsmPrinter::RISCv64AsmPrinter(MachineFunction* mfunc) : MFunc(mfunc) {}

@ -60,7 +44,7 @@ void RISCv64AsmPrinter::printInstruction(MachineInstr* instr, bool debug) {
        case RVOpcodes::ADD:   *OS << "add ";   break; case RVOpcodes::ADDI:  *OS << "addi ";  break;
        case RVOpcodes::ADDW:  *OS << "addw ";  break; case RVOpcodes::ADDIW: *OS << "addiw "; break;
        case RVOpcodes::SUB:   *OS << "sub ";   break; case RVOpcodes::SUBW:  *OS << "subw ";  break;
-        case RVOpcodes::MUL:   *OS << "mul ";   break; case RVOpcodes::MULW:  *OS << "mulw ";  break;
+        case RVOpcodes::MUL:   *OS << "mul ";   break; case RVOpcodes::MULW:  *OS << "mulw ";  break; case RVOpcodes::MULH:  *OS << "mulh ";  break;
        case RVOpcodes::DIV:   *OS << "div ";   break; case RVOpcodes::DIVW:  *OS << "divw ";  break;
        case RVOpcodes::REM:   *OS << "rem ";   break; case RVOpcodes::REMW:  *OS << "remw ";  break;
        case RVOpcodes::XOR:   *OS << "xor ";   break; case RVOpcodes::XORI:  *OS << "xori ";  break;
@ -81,7 +65,7 @@ void RISCv64AsmPrinter::printInstruction(MachineInstr* instr, bool debug) {
        case RVOpcodes::SB:    *OS << "sb ";    break; case RVOpcodes::LD:    *OS << "ld ";    break;
        case RVOpcodes::SD:    *OS << "sd ";    break; case RVOpcodes::FLW:   *OS << "flw ";   break;
        case RVOpcodes::FSW:   *OS << "fsw ";   break; case RVOpcodes::FLD:   *OS << "fld ";   break;
-        case RVOpcodes::FSD:   *OS << "fsd ";   break;
+        case RVOpcodes::FSD:   *OS << "fsd ";   break; 
        case RVOpcodes::J:     *OS << "j ";     break; case RVOpcodes::JAL:   *OS << "jal ";   break;
        case RVOpcodes::JALR:  *OS << "jalr ";  break; case RVOpcodes::RET:   *OS << "ret";    break;
        case RVOpcodes::BEQ:   *OS << "beq ";   break; case RVOpcodes::BNE:   *OS << "bne ";   break;
@ -95,16 +79,19 @@ void RISCv64AsmPrinter::printInstruction(MachineInstr* instr, bool debug) {
        case RVOpcodes::FSUB_S:  *OS << "fsub.s ";  break;
        case RVOpcodes::FMUL_S:  *OS << "fmul.s ";  break;
        case RVOpcodes::FDIV_S:  *OS << "fdiv.s ";  break;
+        case RVOpcodes::FMADD_S: *OS << "fmadd.s "; break;
        case RVOpcodes::FNEG_S:  *OS << "fneg.s ";  break;
        case RVOpcodes::FEQ_S:   *OS << "feq.s ";   break;
        case RVOpcodes::FLT_S:   *OS << "flt.s ";   break;
        case RVOpcodes::FLE_S:   *OS << "fle.s ";   break;
        case RVOpcodes::FCVT_S_W: *OS << "fcvt.s.w "; break;
        case RVOpcodes::FCVT_W_S: *OS << "fcvt.w.s "; break;
+        case RVOpcodes::FCVT_W_S_RTZ: *OS << "fcvt.w.s "; break;
        case RVOpcodes::FMV_S:    *OS << "fmv.s ";    break;
        case RVOpcodes::FMV_W_X:  *OS << "fmv.w.x ";  break;
        case RVOpcodes::FMV_X_W:  *OS << "fmv.x.w ";  break;
-        case RVOpcodes::CALL: { // [core change] add special handling for the CALL instruction
+        case RVOpcodes::FSRMI:   *OS << "fsrmi ";   break;
+        case RVOpcodes::CALL: { // special handling for the CALL instruction
            *OS << "call ";
            // Walk all operands, looking for and printing only the function-name label
            for (const auto& op : instr->getOperands()) {
@ -198,7 +185,7 @@ void RISCv64AsmPrinter::printOperand(MachineOperand* op) {
    }
 }

-std::string RISCv64AsmPrinter::regToString(PhysicalReg reg) {
+std::string RISCv64AsmPrinter::regToString(PhysicalReg reg) const {
    switch (reg) {
        case PhysicalReg::ZERO: return "x0";  case PhysicalReg::RA: return "ra";
        case PhysicalReg::SP: return "sp";    case PhysicalReg::GP: return "gp";
@ -236,4 +223,30 @@ std::string RISCv64AsmPrinter::regToString(PhysicalReg reg) {
    }
 }

+std::string RISCv64AsmPrinter::formatInstr(const MachineInstr* instr) {
+    if (!instr) return "(null instr)";
+    
+    // Use a stringstream as a temporary output target
+    std::stringstream ss;
+    
+    // Key step: temporarily point the class member 'OS' at our stringstream
+    std::ostream* old_os = this->OS;
+    this->OS = &ss;
+    
+    // Fix: call the correct internal printing function, printInstruction
+    printInstruction(const_cast<MachineInstr*>(instr), false);
+    
+    // Restore the previous ostream pointer
+    this->OS = old_os;
+    
+    // Take the stringstream contents and trim trailing whitespace
+    std::string result = ss.str();
+    size_t endpos = result.find_last_not_of(" \t\n\r");
+    if (std::string::npos != endpos) {
+        result = result.substr(0, endpos + 1);
+    }
+    
+    return result;
+}
+
 } // namespace sysy
--- a/src/backend/RISCv64/RISCv64Backend.cpp
+++ b/src/backend/RISCv64/RISCv64Backend.cpp
@ -1,10 +1,18 @@
 #include "RISCv64Backend.h"
 #include "RISCv64ISel.h"
 #include "RISCv64RegAlloc.h"
+#include "RISCv64LinearScan.h"
+#include "RISCv64SimpleRegAlloc.h"
+#include "RISCv64BasicBlockAlloc.h"
 #include "RISCv64AsmPrinter.h"
 #include "RISCv64Passes.h"
 #include <sstream>
-
+#include <future>
+#include <chrono>
+#include <atomic>
+#include <memory>
+#include <thread>
+#include <iostream>
 namespace sysy {

 // Top-level entry point
@ -12,6 +20,39 @@ std::string RISCv64CodeGen::code_gen() {
    return module_gen();
 }

+unsigned RISCv64CodeGen::getTypeSizeInBytes(Type* type) {
+    if (!type) {
+        assert(false && "Cannot get size of a null type.");
+        return 0;
+    }
+
+    switch (type->getKind()) {
+        // In SysY, the basic types int and float both occupy 4 bytes
+        case Type::kInt:
+        case Type::kFloat:
+            return 4;
+
+        // Pointer types occupy 8 bytes on 64-bit RISC-V
+        // SysY has no 'int*' syntax, but array variables are themselves pointer-typed at the IR level
+        case Type::kPointer:
+            return 8;
+
+        // Total size of an array type = number of elements * size of one element
+        case Type::kArray: {
+            auto arrayType = type->as<ArrayType>();
+            // Recurse to compute the element size
+            return arrayType->getNumElements() * getTypeSizeInBytes(arrayType->getElementType());
+        }
+
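+        // Other types such as Void and Label take no stack space, or should not appear here
+        default:
+            // An assertion on unhandled types would help debugging; it is currently disabled
+            // assert(false && "Unsupported type for size calculation.");
+            return 0; // Returning 0 is reasonable for types such as Label or Void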
+    }
+}
+
+
 void printInitializer(std::stringstream& ss, const ValueCounter& init_values) {
    for (size_t i = 0; i < init_values.getValues().size(); ++i) {
        auto val = init_values.getValues()[i];
@ -39,18 +80,36 @@ std::string RISCv64CodeGen::module_gen() {

    for (const auto& global_ptr : module->getGlobals()) {
        GlobalValue* global = global_ptr.get();
+        
+        // Use more robust logic to decide whether this is a large zero-initialized array
+        bool is_all_zeros = true;
        const auto& init_values = global->getInitValues();
        
-        // Determine whether this is a large zero-initialized array so it can go in the .bss section
-        bool is_large_zero_array = false;
-        if (init_values.getValues().size() == 1) {
-            if (auto const_val = dynamic_cast<ConstantValue*>(init_values.getValues()[0])) {
-                if (const_val->isInt() && const_val->getInt() == 0 && init_values.getNumbers()[0] > 16) {
-                    is_large_zero_array = true;
+        // Check whether all initial values are zero
+        if (init_values.getValues().empty()) {
+            // If the ValueCounter is empty, the GlobalValue constructor guarantees zero initialization
+            is_all_zeros = true;
+        } else {
+            for (auto val : init_values.getValues()) {
+                if (auto const_val = dynamic_cast<ConstantValue*>(val)) {
+                    if (!const_val->isZero()) {
+                        is_all_zeros = false;
+                        break;
+                    }
+                } else {
+                    // If an initial value is not a constant (e.g. the address of another global), do not treat it as pure zero initialization
+                    is_all_zeros = false;
+                    break;
                }
            }
        }

+        // Use getTypeSizeInBytes to check whether the total size exceeds the threshold (16 ints = 64 bytes)
+        Type* allocated_type = global->getType()->as<PointerType>()->getBaseType();
+        unsigned total_size = getTypeSizeInBytes(allocated_type);
+        
+        bool is_large_zero_array = is_all_zeros && (total_size > 64);
+
        if (is_large_zero_array) {
            bss_globals.push_back(global);
        } else {
@ -58,12 +117,12 @@ std::string RISCv64CodeGen::module_gen() {
        }
    }

-    // --- Step 2: emit the .bss section (unchanged) ---
+    // --- Step 2: emit the .bss section ---
    if (!bss_globals.empty()) {
        ss << ".bss\n";
        for (GlobalValue* global : bss_globals) {
-            unsigned count = global->getInitValues().getNumbers()[0];
-            unsigned total_size = count * 4; // Assume every element is 4 bytes
+            Type* allocated_type = global->getType()->as<PointerType>()->getBaseType();
+            unsigned total_size = getTypeSizeInBytes(allocated_type);

            ss << "    .align 3\n";
            ss << ".globl " << global->getName() << "\n";
@ -74,33 +133,67 @@ std::string RISCv64CodeGen::module_gen() {
        }
    }
    
-    // --- [Changed] Step 3: emit the .data section ---
-    // We need to check whether both data_globals and the constant list are empty
+    // --- Step 3: emit the .data section ---
    if (!data_globals.empty() || !module->getConsts().empty()) {
        ss << ".data\n";

-        // a. First handle ordinary global variables (GlobalValue)
+        // a. Handle ordinary global variables (GlobalValue)
        for (GlobalValue* global : data_globals) {
+            Type* allocated_type = global->getType()->as<PointerType>()->getBaseType();
+            unsigned total_size = getTypeSizeInBytes(allocated_type);
+            
+            ss << "    .align 3\n";
            ss << ".globl " << global->getName() << "\n";
+            ss << ".type " << global->getName() << ", @object\n";
+            ss << ".size " << global->getName() << ", " << total_size << "\n";
            ss << global->getName() << ":\n";
-            printInitializer(ss, global->getInitValues());
+            bool is_all_zeros = true;
+            const auto& init_values = global->getInitValues();
+            if (init_values.getValues().empty()) {
+                is_all_zeros = true;
+            } else {
+                for (auto val : init_values.getValues()) {
+                    if (auto const_val = dynamic_cast<ConstantValue*>(val)) {
+                        if (!const_val->isZero()) {
+                            is_all_zeros = false;
+                            break;
+                        }
+                    } else {
+                        is_all_zeros = false;
+                        break;
+                    }
+                }
+            }
+            if (is_all_zeros) {
+                ss << "    .zero " << total_size << "\n";
+            } else {
+                // For variables with non-zero initial values, keep the original printing logic.
+                printInitializer(ss, global->getInitValues());
+            }
        }

-        // b. [New] Then handle global constants (ConstantVariable)
+        // b. Handle global constants (ConstantVariable)
        for (const auto& const_ptr : module->getConsts()) {
            ConstantVariable* cnst = const_ptr.get();
+            Type* allocated_type = cnst->getType()->as<PointerType>()->getBaseType();
+            unsigned total_size = getTypeSizeInBytes(allocated_type);
+
+            ss << "    .align 3\n";
            ss << ".globl " << cnst->getName() << "\n";
+            ss << ".type " << cnst->getName() << ", @object\n";
+            ss << ".size " << cnst->getName() << ", " << total_size << "\n";
            ss << cnst->getName() << ":\n";
            printInitializer(ss, cnst->getInitValues());
        }
    }

-    // --- The logic for functions (.text section) is unchanged ---
+    // --- Step 4: handle functions (.text section) ---
    if (!module->getFunctions().empty()) {
        ss << ".text\n";
        for (const auto& func_pair : module->getFunctions()) {
-            if (func_pair.second.get()) {
+            if (func_pair.second.get() && !func_pair.second->getBasicBlocks().empty()) {
                ss << function_gen(func_pair.second.get());
+                if (DEBUG) std::cerr << "Function: " << func_pair.first << " generated.\n";
            }
        }
    }
@ -108,42 +201,159 @@ std::string RISCv64CodeGen::module_gen() {
 }

 std::string RISCv64CodeGen::function_gen(Function* func) {
-    // === Complete backend pipeline ===
-
    // Stage 1: instruction selection (sysy::IR -> LLIR with virtual registers)
    RISCv64ISel isel;
    std::unique_ptr<MachineFunction> mfunc = isel.runOnFunction(func);
-
    // First debug dump
-    std::stringstream ss1;
-    RISCv64AsmPrinter printer1(mfunc.get());
-    printer1.run(ss1, true);
+    std::stringstream ss_after_isel;
+    RISCv64AsmPrinter printer_isel(mfunc.get());
+    printer_isel.run(ss_after_isel, true);
+    // DEBUG = 1;
+    if (DEBUG) {
+        std::cerr << "====== Intermediate Representation after Instruction Selection ======\n" 
+        << ss_after_isel.str();
+    }
+    // DEBUG = 0;
+    // Stage 2: eliminate frame indices (expand pseudo-instructions, compute local-variable offsets)
+    EliminateFrameIndicesPass efi_pass;
+    efi_pass.runOnMachineFunction(mfunc.get());

-    // Stage 2: instruction scheduling
-    PreRA_Scheduler scheduler;
-    scheduler.runOnMachineFunction(mfunc.get());
+    if (DEBUG) {
+        std::cerr << "====== stack info after eliminate frame indices  ======\n";
+        mfunc->dumpStackFrameInfo(std::cerr);
+        std::stringstream ss_after_eli;
+        printer_isel.run(ss_after_eli, true);
+        std::cerr << "====== LLIR after eliminate frame indices ======\n" 
+        << ss_after_eli.str();
+    }
+
+    // Stage 2.1: division strength reduction
+    DivStrengthReduction div_strength_reduction;
+    div_strength_reduction.runOnMachineFunction(mfunc.get());
+
+    // // Stage 2.2: instruction scheduling
+    // PreRA_Scheduler scheduler;
+    // scheduler.runOnMachineFunction(mfunc.get());

    // Stage 3: physical register allocation
-    RISCv64RegAlloc reg_alloc(mfunc.get());
-    reg_alloc.run();
+    bool allocation_succeeded = false;
+
+    // Try iterated graph coloring (IRC)
+    if (!irc_failed) {
+        if (DEBUG) std::cerr << "Attempting Register Allocation with Iterated Register Coloring (IRC)...\n";        
+        RISCv64RegAlloc irc_alloc(mfunc.get());
+        auto stop_flag = std::make_shared<std::atomic<bool>>(false);
+        auto future = std::async(std::launch::async, &RISCv64RegAlloc::run, &irc_alloc, stop_flag);
+        std::future_status status = future.wait_for(std::chrono::seconds(25));
+        bool success_irc = false;
+        if (status == std::future_status::ready) {
+            try {
+                if (future.get()) {
+                    success_irc = true;
+                } else {
+                    std::cerr << "Warning: IRC explicitly returned failure for function '" << func->getName() << "'.\n";
+                }
+            } catch (const std::exception& e) {
+                std::cerr << "Error: IRC allocation threw an exception: " << e.what() << std::endl;
+            }
+        } else if (status == std::future_status::timeout) {
+            std::cerr << "Warning: IRC allocation timed out after 25 seconds. Requesting cancellation...\n";            
+            stop_flag->store(true);
+            try {
+                future.get();
+            } catch (const std::exception& e) {
+                std::cerr << "Exception occurred during IRC thread shutdown after timeout: " << e.what() << std::endl;
+            }
+        }
+
+        if (success_irc) {
+            allocation_succeeded = true;
+            if (DEBUG) std::cerr << "IRC allocation succeeded.\n";
+        } else {
+            std::cerr << "Info: Blacklisting IRC for subsequent functions and falling back.\n";
+            irc_failed = true;
+        }
+    }
+    
+    // Try simple graph coloring (SGC)
+    if (!allocation_succeeded) {
+        // If we are falling back from a failed IRC attempt, rebuild a clean mfunc and ISel
+        RISCv64ISel isel_for_sgc;
+        if (irc_failed) {
+            if (DEBUG) std::cerr << "Info: Resetting MachineFunction for SGC attempt.\n";
+            mfunc = isel_for_sgc.runOnFunction(func);
+            EliminateFrameIndicesPass efi_pass_for_sgc;
+            efi_pass_for_sgc.runOnMachineFunction(mfunc.get());
+        }
+
+        if (DEBUG) std::cerr << "Attempting Register Allocation with Simple Graph Coloring (SGC)...\n";
+
+        bool sgc_completed_in_time = false;
+        {
+            RISCv64SimpleRegAlloc sgc_alloc(mfunc.get());
+            auto future = std::async(std::launch::async, &RISCv64SimpleRegAlloc::run, &sgc_alloc);
+            std::future_status status = future.wait_for(std::chrono::seconds(25));
+
+            if (status == std::future_status::ready) {
+                try {
+                    future.get(); // Rethrows any exception raised by the allocator thread
+                    sgc_completed_in_time = true;
+                    if (DEBUG) std::cerr << "SGC allocation completed successfully within the time limit.\n";
+                } catch (const std::exception& e) {
+                    std::cerr << "Error: SGC allocation threw an exception: " << e.what() << std::endl;
+                }
+            }
+        }
+
+        if (sgc_completed_in_time) {
+            allocation_succeeded = true;
+        } else {
+            std::cerr << "Warning: SGC allocation timed out or failed for function '" << func->getName() 
+                    << "'. Falling back.\n";
+        }
+    }
+
+    // If both attempts fail, use the basic block allocator (BBA)
+    if (!allocation_succeeded) {
+        // Prepare a clean mfunc and ISel for BBA
+        std::cerr << "Info: Resetting MachineFunction for BBA fallback.\n";
+        RISCv64ISel isel_for_bba;
+        mfunc = isel_for_bba.runOnFunction(func);
+        EliminateFrameIndicesPass efi_pass_for_bba;
+        efi_pass_for_bba.runOnMachineFunction(mfunc.get());
+
+        std::cerr << "Info: Using Basic Block Allocator as final fallback.\n";
+        RISCv64BasicBlockAlloc bb_alloc(mfunc.get());
+        bb_alloc.run();
+    }
+    
+    if (DEBUG) {
+        std::cerr << "====== stack info after reg alloc ======\n";
+        mfunc->dumpStackFrameInfo(std::cerr);
+    }

    // Stage 3.1: handle callee-saved registers
    CalleeSavedHandler callee_handler;
    callee_handler.runOnMachineFunction(mfunc.get());

+    if (DEBUG) {
+        std::cerr << "====== stack info after callee handler ======\n";
+        mfunc->dumpStackFrameInfo(std::cerr);
+    }
+
    // Stage 4: peephole optimization
    PeepholeOptimizer peephole;
    peephole.runOnMachineFunction(mfunc.get());

-    // Stage 5: local instruction scheduling
-    PostRA_Scheduler local_scheduler;
-    local_scheduler.runOnMachineFunction(mfunc.get());
+    // // Stage 5: local instruction scheduling
+    // PostRA_Scheduler local_scheduler;
+    // local_scheduler.runOnMachineFunction(mfunc.get());

    // Stage 3.2: insert prologue and epilogue
    PrologueEpilogueInsertionPass pei_pass;
    pei_pass.runOnMachineFunction(mfunc.get());

-    // Stage 3.3: clean up the large immediates produced
+    // Stage 3.3: legalize large immediates
    LegalizeImmediatesPass legalizer;
    legalizer.runOnMachineFunction(mfunc.get());

@ -151,7 +361,7 @@ std::string RISCv64CodeGen::function_gen(Function* func) {
    std::stringstream ss;
    RISCv64AsmPrinter printer(mfunc.get());
    printer.run(ss);
-    if (DEBUG) ss << "\n" << ss1.str(); // Also include the instruction-selection output in the final output
+
    return ss.str();
 }

--- a/src/backend/RISCv64/RISCv64BasicBlockAlloc.cpp
+++ b/src/backend/RISCv64/RISCv64BasicBlockAlloc.cpp
@ -0,0 +1,267 @@
+#include "RISCv64BasicBlockAlloc.h"
+#include "RISCv64Info.h"
+#include "RISCv64AsmPrinter.h"
+#include <iostream>
+#include <algorithm>
+
+// External debug-level control variables
+extern int DEBUG;
+extern int DEEPDEBUG;
+
+namespace sysy {
+
+// The definition of getInstrUseDef was moved here because it is a global helper function
+void getInstrUseDef(const MachineInstr* instr, std::set<unsigned>& use, std::set<unsigned>& def) {
+    auto opcode = instr->getOpcode();
+    const auto& operands = instr->getOperands();
+    
+    auto get_vreg_id_if_virtual = [&](const MachineOperand* op, std::set<unsigned>& s) {
+        if (op->getKind() == MachineOperand::KIND_REG) {
+            auto reg_op = static_cast<const RegOperand*>(op);
+            if (reg_op->isVirtual()) s.insert(reg_op->getVRegNum());
+        } else if (op->getKind() == MachineOperand::KIND_MEM) {
+            auto mem_op = static_cast<const MemOperand*>(op);
+            auto reg_op = mem_op->getBase();
+            if (reg_op->isVirtual()) s.insert(reg_op->getVRegNum());
+        }
+    };
+
+    if (op_info.count(opcode)) {
+        const auto& info = op_info.at(opcode);
+        for (int idx : info.first) if (idx < operands.size()) get_vreg_id_if_virtual(operands[idx].get(), def);
+        for (int idx : info.second) if (idx < operands.size()) get_vreg_id_if_virtual(operands[idx].get(), use);
+        // The base register of a memory operand is always a use
+        for (const auto& op : operands) if (op->getKind() == MachineOperand::KIND_MEM) get_vreg_id_if_virtual(op.get(), use);
+    } else if (opcode == RVOpcodes::CALL) {
+        if (!operands.empty() && operands[0]->getKind() == MachineOperand::KIND_REG) get_vreg_id_if_virtual(operands[0].get(), def);
+        for (size_t i = 1; i < operands.size(); ++i) if (operands[i]->getKind() == MachineOperand::KIND_REG) get_vreg_id_if_virtual(operands[i].get(), use);
+    }
+}
+
+
+RISCv64BasicBlockAlloc::RISCv64BasicBlockAlloc(MachineFunction* mfunc)
+    : MFunc(mfunc), ISel(mfunc->getISel()) {
+    // Initialize the temporary register pools
+    int_temps = {PhysicalReg::T0, PhysicalReg::T1, PhysicalReg::T2, PhysicalReg::T3, PhysicalReg::T6};
+    fp_temps = {PhysicalReg::F0, PhysicalReg::F1, PhysicalReg::F2, PhysicalReg::F3, PhysicalReg::F4};
+    int_temp_idx = 0;
+    fp_temp_idx = 0;
+
+    // Build the ABI register mapping
+    if (MFunc->getFunc()) {
+        int int_arg_idx = 0;
+        int fp_arg_idx = 0;
+        for (Argument* arg : MFunc->getFunc()->getArguments()) {
+            unsigned arg_vreg = ISel->getVReg(arg);
+            if (arg->getType()->isFloat()) {
+                if (fp_arg_idx < 8) {
+                    auto preg = static_cast<PhysicalReg>(static_cast<int>(PhysicalReg::F10) + fp_arg_idx++);
+                    abi_vreg_map[arg_vreg] = preg;
+                }
+            } else {
+                if (int_arg_idx < 8) {
+                    auto preg = static_cast<PhysicalReg>(static_cast<int>(PhysicalReg::A0) + int_arg_idx++);
+                    abi_vreg_map[arg_vreg] = preg;
+                }
+            }
+        }
+    }
+}
+
+void RISCv64BasicBlockAlloc::run() {
+    if (DEBUG) std::cerr << "===== [BB-Alloc] Running Stateful Greedy Allocator for function: " << MFunc->getName() << " =====\n";
+    
+    computeLiveness();
+    assignStackSlotsForAllVRegs();
+
+    for (auto& mbb : MFunc->getBlocks()) {
+        processBasicBlock(mbb.get());
+    }
+    
+    // Merge the ABI register mapping (e.g. function arguments) into the final result
+    MFunc->getFrameInfo().vreg_to_preg_map.insert(this->abi_vreg_map.begin(), this->abi_vreg_map.end());
+}
+
+PhysicalReg RISCv64BasicBlockAlloc::getNextIntTemp() {
+    PhysicalReg reg = int_temps[int_temp_idx];
+    int_temp_idx = (int_temp_idx + 1) % int_temps.size();
+    return reg;
+}
+
+PhysicalReg RISCv64BasicBlockAlloc::getNextFpTemp() {
+    PhysicalReg reg = fp_temps[fp_temp_idx];
+    fp_temp_idx = (fp_temp_idx + 1) % fp_temps.size();
+    return reg;
+}
+
+void RISCv64BasicBlockAlloc::computeLiveness() {
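+    // This step is required to determine which variables must be written back to the stack at the end of a block
+    // Left empty for now to stay focused; a working liveness analysis must populate the live_out map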
+}
+
+void RISCv64BasicBlockAlloc::assignStackSlotsForAllVRegs() {
+    if (DEBUG) std::cerr << "[BB-Alloc] Assigning stack slots for all vregs.\n";
+    StackFrameInfo& frame_info = MFunc->getFrameInfo();
+    int current_offset = frame_info.locals_end_offset;
+    const auto& vreg_type_map = ISel->getVRegTypeMap();
+
+    for (unsigned vreg = 1; vreg < ISel->getVRegCounter(); ++vreg) {
+        if (this->abi_vreg_map.count(vreg) || frame_info.alloca_offsets.count(vreg) || frame_info.spill_offsets.count(vreg)) {
+            continue;
+        }
+
+        Type* type = vreg_type_map.count(vreg) ? vreg_type_map.at(vreg) : Type::getIntType();
+        int size = type->isPointer() ? 8 : 4;
+        
+        current_offset -= size;
+        current_offset &= -size; // Align to size
+        
+        frame_info.spill_offsets[vreg] = current_offset;
+    }
+    frame_info.spill_size = -(current_offset - frame_info.locals_end_offset);
+}
+
+void RISCv64BasicBlockAlloc::processBasicBlock(MachineBasicBlock* mbb) {
+    if (DEEPDEBUG) std::cerr << "  [BB-Alloc] Processing block " << mbb->getName() << "\n";
+    
+    vreg_to_preg.clear();
+    preg_to_vreg.clear();
+    dirty_pregs.clear();
+    
+    auto& instrs = mbb->getInstructions();
+    std::vector<std::unique_ptr<MachineInstr>> new_instrs;
+    const auto& vreg_type_map = ISel->getVRegTypeMap();
+
+    for (auto& instr_ptr : instrs) {
+        std::set<unsigned> use_vregs, def_vregs;
+        getInstrUseDef(instr_ptr.get(), use_vregs, def_vregs);
+
+        std::map<unsigned, PhysicalReg> current_instr_map;
+
+        // 1. Ensure every use operand is in a physical register
+        for (unsigned vreg : use_vregs) {
+            current_instr_map[vreg] = ensureInReg(vreg, new_instrs);
+        }
+
+        // 2. Allocate a physical register for every def operand
+        for (unsigned vreg : def_vregs) {
+            current_instr_map[vreg] = allocReg(vreg, new_instrs);
+        }
+
+        // 3. Rewrite the instruction, replacing each vreg with its preg
+        for (const auto& pair : current_instr_map) {
+            instr_ptr->replaceVRegWithPReg(pair.first, pair.second);
+        }
+        
+        new_instrs.push_back(std::move(instr_ptr));
+    }
+
+    // 4. At the end of the block, write back every modified vreg that is live-out in successor blocks
+    StackFrameInfo& frame_info = MFunc->getFrameInfo(); // **Fix: obtain a reference to frame_info**
+    const auto& lo = live_out[mbb];
+    for(auto const& [preg, vreg] : preg_to_vreg) {
+        // **Fix: simplified logic; this fallback allocator always writes back dirty registers**
+        if (dirty_pregs.count(preg)) {
+             if (!frame_info.spill_offsets.count(vreg)) continue;
+            Type* type = vreg_type_map.at(vreg);
+            RVOpcodes store_op = type->isFloat() ? RVOpcodes::FSW : (type->isPointer() ? RVOpcodes::SD : RVOpcodes::SW);
+            auto store = std::make_unique<MachineInstr>(store_op);
+            store->addOperand(std::make_unique<RegOperand>(preg));
+            store->addOperand(std::make_unique<MemOperand>(
+                std::make_unique<RegOperand>(PhysicalReg::S0),
+                std::make_unique<ImmOperand>(frame_info.spill_offsets.at(vreg))
+            ));
+            new_instrs.push_back(std::move(store));
+        }
+    }
+
+    instrs = std::move(new_instrs);
+}
+
+PhysicalReg RISCv64BasicBlockAlloc::ensureInReg(unsigned vreg, std::vector<std::unique_ptr<MachineInstr>>& new_instrs) {
+    if (abi_vreg_map.count(vreg)) {
+        return abi_vreg_map.at(vreg);
+    }
+    if (vreg_to_preg.count(vreg)) {
+        return vreg_to_preg.at(vreg);
+    }
+    
+    PhysicalReg preg = allocReg(vreg, new_instrs);
+    
+    const auto& vreg_type_map = ISel->getVRegTypeMap();
+    Type* type = vreg_type_map.count(vreg) ? vreg_type_map.at(vreg) : Type::getIntType();
+    RVOpcodes load_op = type->isFloat() ? RVOpcodes::FLW : (type->isPointer() ? RVOpcodes::LD : RVOpcodes::LW);
+    
+    auto load = std::make_unique<MachineInstr>(load_op);
+    load->addOperand(std::make_unique<RegOperand>(preg));
+    load->addOperand(std::make_unique<MemOperand>(
+        std::make_unique<RegOperand>(PhysicalReg::S0),
+        std::make_unique<ImmOperand>(MFunc->getFrameInfo().spill_offsets.at(vreg))
+    ));
+    new_instrs.push_back(std::move(load));
+    
+    dirty_pregs.erase(preg);
+    
+    return preg;
+}
+
+PhysicalReg RISCv64BasicBlockAlloc::allocReg(unsigned vreg, std::vector<std::unique_ptr<MachineInstr>>& new_instrs) {
+    if (abi_vreg_map.count(vreg)) {
+        dirty_pregs.insert(abi_vreg_map.at(vreg)); // If an argument is redefined, mark it dirty as well
+        return abi_vreg_map.at(vreg);
+    }
+
+    bool is_fp = ISel->getVRegTypeMap().at(vreg)->isFloat();
+    PhysicalReg preg = findFreeReg(is_fp);
+    if (preg == PhysicalReg::INVALID) {
+        preg = spillReg(is_fp, new_instrs);
+    }
+
+    if (preg_to_vreg.count(preg)) {
+        vreg_to_preg.erase(preg_to_vreg.at(preg));
+    }
+    vreg_to_preg[vreg] = preg;
+    preg_to_vreg[preg] = vreg;
+    dirty_pregs.insert(preg);
+    
+    return preg;
+}
+
+PhysicalReg RISCv64BasicBlockAlloc::findFreeReg(bool is_fp) {
+    // **Fix: use the correct member variable names int_temps and fp_temps**
+    const auto& regs = is_fp ? fp_temps : int_temps;
+    for (PhysicalReg preg : regs) {
+        if (!preg_to_vreg.count(preg)) {
+            return preg;
+        }
+    }
+    return PhysicalReg::INVALID;
+}
+
+PhysicalReg RISCv64BasicBlockAlloc::spillReg(bool is_fp, std::vector<std::unique_ptr<MachineInstr>>& new_instrs) {
+    // **Fix**: call the member functions through this->
+    PhysicalReg preg_to_spill = is_fp ? this->getNextFpTemp() : this->getNextIntTemp();
+    
+    if (preg_to_vreg.count(preg_to_spill)) {
+        unsigned victim_vreg = preg_to_vreg.at(preg_to_spill);
+        if (dirty_pregs.count(preg_to_spill)) {
+            const auto& vreg_type_map = ISel->getVRegTypeMap();
+            Type* type = vreg_type_map.count(victim_vreg) ? vreg_type_map.at(victim_vreg) : Type::getIntType();
+            RVOpcodes store_op = type->isFloat() ? RVOpcodes::FSW : (type->isPointer() ? RVOpcodes::SD : RVOpcodes::SW);
+            auto store = std::make_unique<MachineInstr>(store_op);
+            store->addOperand(std::make_unique<RegOperand>(preg_to_spill));
+            store->addOperand(std::make_unique<MemOperand>(
+                std::make_unique<RegOperand>(PhysicalReg::S0),
+                std::make_unique<ImmOperand>(MFunc->getFrameInfo().spill_offsets.at(victim_vreg))
+            ));
+            new_instrs.push_back(std::move(store));
+        }
+        vreg_to_preg.erase(victim_vreg);
+        dirty_pregs.erase(preg_to_spill);
+    }
+    
+    preg_to_vreg.erase(preg_to_spill);
+    return preg_to_spill;
+}
+
+} // namespace sysy
--- a/src/backend/RISCv64/RISCv64ISel.cpp
+++ b/src/backend/RISCv64/RISCv64ISel.cpp
@ -1,9 +1,10 @@
 #include "RISCv64ISel.h"
+#include "IR.h"  // For GlobalValue
 #include <stdexcept>
 #include <set>
 #include <functional>
-#include <cmath> // For std::fabs
-#include <limits> // For std::numeric_limits
+#include <cmath>
+#include <limits>
 #include <iostream>

 namespace sysy {
@ -102,6 +103,81 @@ void RISCv64ISel::select() {
        }
    }

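+    // Save the argument registers only when the function meets one of the conditions below (a finer-grained filter):
+    // 1. The function contains a call instruction (non-leaf function): the argument registers (a0-a7)
+    //    are caller-saved and a call may clobber them, so they must be saved.
+    // 2. The function contains an alloca instruction (needs stack allocation).
+    // 3. The function's instruction count exceeds a threshold (45 in the check below), which marks it
+    //    as a complex leaf function, so its arguments are saved to be safe.
+    // Simple leaf functions (such as min) can skip this step as an optimization.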
+    auto shouldSaveArgs = [](Function* func) {
+        if (!func) return false;
+        int instruction_count = 0;
+        for (const auto& bb : func->getBasicBlocks()) {
+            for (const auto& inst : bb->getInstructions()) {
+                if (dynamic_cast<CallInst*>(inst.get()) || dynamic_cast<AllocaInst*>(inst.get())) {
+                    return true; // Found a call or alloca: return true immediately
+                }
+                instruction_count++;
+            }
+        }
+        // No call or alloca was found, so fall back to the instruction count
+        return instruction_count > 45;
+    };
+
+    if (optLevel > 0 && shouldSaveArgs(F)) {
+        if (F && !F->getBasicBlocks().empty()) {
+            // Locate the first MachineBasicBlock, i.e. the function entry
+            BasicBlock* first_ir_block = F->getBasicBlocks_NoRange().front().get();
+            CurMBB = bb_map.at(first_ir_block);
+
+            int int_arg_idx = 0;
+            int fp_arg_idx = 0;
+
+            for (Argument* arg : F->getArguments()) {
+                Type* arg_type = arg->getType();
+
+                // --- Handle integer/pointer arguments ---
+                if (!arg_type->isFloat() && int_arg_idx < 8) {
+                    // 1. Get the argument's original vreg, which will be precolored to a0-a7
+                    unsigned original_vreg = getVReg(arg);
+
+                    // 2. Create a new, safe vreg to hold the argument's value
+                    unsigned saved_vreg = getNewVReg(arg_type);
+
+                    // 3. Emit the instruction mv saved_vreg, original_vreg
+                    auto mv = std::make_unique<MachineInstr>(RVOpcodes::MV);
+                    mv->addOperand(std::make_unique<RegOperand>(saved_vreg));
+                    mv->addOperand(std::make_unique<RegOperand>(original_vreg));
+                    CurMBB->addInstruction(std::move(mv));
+
+                    MFunc->addProtectedArgumentVReg(saved_vreg);
+                    // 4. [Key step] Update the vreg map so that arg points to the new, safe vreg.
+                    //    Every later getVReg(arg) call for this argument then returns saved_vreg,
+                    //    so the code in the function body uses the saved value.
+                    vreg_map[arg] = saved_vreg;
+                    int_arg_idx++;
+                }
+                // --- Handle floating-point arguments ---
+                else if (arg_type->isFloat() && fp_arg_idx < 8) {
+                    unsigned original_vreg = getVReg(arg);
+                    unsigned saved_vreg = getNewVReg(arg_type);
+
+                    // For floating-point values, use the fmv.s instruction
+                    auto fmv = std::make_unique<MachineInstr>(RVOpcodes::FMV_S);
+                    fmv->addOperand(std::make_unique<RegOperand>(saved_vreg));
+                    fmv->addOperand(std::make_unique<RegOperand>(original_vreg));
+                    CurMBB->addInstruction(std::move(fmv));
+
+                    MFunc->addProtectedArgumentVReg(saved_vreg);
+                    vreg_map[arg] = saved_vreg;
+                    fp_arg_idx++;
+                }
+                // Arguments passed on the stack need no handling
+            }
+        }
+    }
+
    // Walk the basic blocks and perform instruction selection
    for (const auto& bb_ptr : F->getBasicBlocks()) {
        selectBasicBlock(bb_ptr.get());
@ -167,33 +243,6 @@ void RISCv64ISel::selectBasicBlock(BasicBlock* bb) {
            select_recursive(node_to_select);
        }
    }
-
-    if (CurMBB == MFunc->getBlocks().front().get()) { // Entry block only
-        auto keepalive = std::make_unique<MachineInstr>(RVOpcodes::PSEUDO_KEEPALIVE);
-        for (Argument* arg : F->getArguments()) {
-            keepalive->addOperand(std::make_unique<RegOperand>(getVReg(arg)));
-        }
-
-        auto& instrs = CurMBB->getInstructions();
-        auto insert_pos = instrs.end();
-
-        // Key step: check whether the basic block ends with a terminator instruction
-        if (!instrs.empty()) {
-            RVOpcodes last_op = instrs.back()->getOpcode();
-            // The condition was extended to cover all possible terminator instructions
-            if (last_op == RVOpcodes::J || last_op == RVOpcodes::RET ||
-                last_op == RVOpcodes::BEQ || last_op == RVOpcodes::BNE ||
-                last_op == RVOpcodes::BLT || last_op == RVOpcodes::BGE ||
-                last_op == RVOpcodes::BLTU || last_op == RVOpcodes::BGEU)
-            {
-                // If so, the insertion point is just before that terminator
-                insert_pos = std::prev(instrs.end());
-            }
-        }
-        
-        // Insert the pseudo-instruction at the computed position
-        instrs.insert(insert_pos, std::move(keepalive));
-    }
 }

 // Core function: select and emit a MachineInstr for a DAG node (complete version, fixed and enhanced)
@ -209,8 +258,12 @@ void RISCv64ISel::selectNode(DAGNode* node) {
        case DAGNode::CONSTANT:
        case DAGNode::ALLOCA_ADDR:
            if (node->value) {
-                // Just make sure it has an associated virtual register; no code is generated.
-                getVReg(node->value);
+                // GlobalValue objects (global variables) should not get virtual registers
+                // since they represent memory addresses, not register-allocated values
+                if (dynamic_cast<GlobalValue*>(node->value) == nullptr) {
+                    // Just make sure it has an associated virtual register; no code is generated.
+                    getVReg(node->value);
+                }
            }
            break;
        
@ -402,7 +455,7 @@ void RISCv64ISel::selectNode(DAGNode* node) {
                Value* base = nullptr;
                Value* offset = nullptr;

-                // [Changed] Extend the base-address check so it recognizes AllocaInst or GlobalValue
+                // Extend the base-address check so it recognizes AllocaInst or GlobalValue
                if (dynamic_cast<AllocaInst*>(lhs) || dynamic_cast<GlobalValue*>(lhs)) {
                    base = lhs;
                    offset = rhs;
@ -421,7 +474,7 @@ void RISCv64ISel::selectNode(DAGNode* node) {
                        CurMBB->addInstruction(std::move(li));
                    }
                    
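-                    // 2. [Changed] Emit different instructions to obtain the base address, depending on its kind
+                    // 2. Emit different instructions to obtain the base address, depending on its kind
                    auto base_addr_vreg = getNewVReg(Type::getIntType()); // Create a new temporary vreg to hold the base address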

                    // Case 1: the base address is a local stack variable
@ -452,7 +505,7 @@ void RISCv64ISel::selectNode(DAGNode* node) {
                }
            }

-            // [V2 advantage] Load constant operands on demand inside the BINARY node.
+            // Load constant operands on demand inside the BINARY node.
            auto load_val_if_const = [&](Value* val) {
                if (auto c = dynamic_cast<ConstantValue*>(val)) {
                    if (DEBUG) {
@ -483,7 +536,7 @@ void RISCv64ISel::selectNode(DAGNode* node) {
            auto dest_vreg = getVReg(bin);
            auto lhs_vreg = getVReg(lhs);

-            // [V2 advantage] Fused ADDIW optimization.
+            // Fused ADDIW optimization.
            if (rhs_is_imm_opt) {
                auto rhs_const = dynamic_cast<ConstantValue*>(rhs);
                auto instr = std::make_unique<MachineInstr>(RVOpcodes::ADDIW);
@ -523,6 +576,14 @@ void RISCv64ISel::selectNode(DAGNode* node) {
                    CurMBB->addInstruction(std::move(instr));
                    break;
                }
+                case BinaryInst::kMulh: {
+                    auto instr = std::make_unique<MachineInstr>(RVOpcodes::MULH);
+                    instr->addOperand(std::make_unique<RegOperand>(dest_vreg));
+                    instr->addOperand(std::make_unique<RegOperand>(lhs_vreg));
+                    instr->addOperand(std::make_unique<RegOperand>(rhs_vreg));
+                    CurMBB->addInstruction(std::move(instr));
+                    break;
+                }
                case Instruction::kDiv: {
                    auto instr = std::make_unique<MachineInstr>(RVOpcodes::DIVW);
                    instr->addOperand(std::make_unique<RegOperand>(dest_vreg));
@ -539,6 +600,31 @@ void RISCv64ISel::selectNode(DAGNode* node) {
                    CurMBB->addInstruction(std::move(instr));
                    break;
                }
+                case Instruction::kSra: {
+                    auto rhs_const = dynamic_cast<ConstantInteger*>(rhs); // NOTE: assumes rhs is a constant shift amount; rhs_const is null otherwise
+                    auto instr = std::make_unique<MachineInstr>(RVOpcodes::SRAIW);
+                    instr->addOperand(std::make_unique<RegOperand>(dest_vreg));
+                    instr->addOperand(std::make_unique<RegOperand>(lhs_vreg));
+                    instr->addOperand(std::make_unique<ImmOperand>(rhs_const->getInt()));
+                    CurMBB->addInstruction(std::move(instr));
+                    break;
+                }
+                case BinaryInst::kSll: { // logical shift left
+                    auto instr = std::make_unique<MachineInstr>(RVOpcodes::SLLW);
+                    instr->addOperand(std::make_unique<RegOperand>(dest_vreg));
+                    instr->addOperand(std::make_unique<RegOperand>(lhs_vreg));
+                    instr->addOperand(std::make_unique<RegOperand>(rhs_vreg));
+                    CurMBB->addInstruction(std::move(instr));
+                    break;
+                }
+                case BinaryInst::kSrl: { // logical shift right
+                    auto instr = std::make_unique<MachineInstr>(RVOpcodes::SRLW);
+                    instr->addOperand(std::make_unique<RegOperand>(dest_vreg));
+                    instr->addOperand(std::make_unique<RegOperand>(lhs_vreg));
+                    instr->addOperand(std::make_unique<RegOperand>(rhs_vreg));
+                    CurMBB->addInstruction(std::move(instr));
+                    break;
+                }
                 case BinaryInst::kICmpEQ: { // equal (a == b) -> (subw; seqz)
                    auto sub = std::make_unique<MachineInstr>(RVOpcodes::SUBW);
                    sub->addOperand(std::make_unique<RegOperand>(dest_vreg));
@ -595,7 +681,7 @@ void RISCv64ISel::selectNode(DAGNode* node) {
                    CurMBB->addInstruction(std::move(xori));
                    break;
                }
-                 case BinaryInst::kICmpGE: { // greater than or equal (a >= b) -> !(a < b) -> (slt; xori)
+                case BinaryInst::kICmpGE: { // greater than or equal (a >= b) -> !(a < b) -> (slt; xori)
                    auto slt = std::make_unique<MachineInstr>(RVOpcodes::SLT);
                    slt->addOperand(std::make_unique<RegOperand>(dest_vreg));
                    slt->addOperand(std::make_unique<RegOperand>(lhs_vreg));
@ -609,6 +695,22 @@ void RISCv64ISel::selectNode(DAGNode* node) {
                    CurMBB->addInstruction(std::move(xori));
                    break;
                }
+                case BinaryInst::kAnd: {
+                    auto instr = std::make_unique<MachineInstr>(RVOpcodes::AND);
+                    instr->addOperand(std::make_unique<RegOperand>(dest_vreg));
+                    instr->addOperand(std::make_unique<RegOperand>(lhs_vreg));
+                    instr->addOperand(std::make_unique<RegOperand>(rhs_vreg));
+                    CurMBB->addInstruction(std::move(instr));
+                    break;
+                }
+                case BinaryInst::kOr: {
+                    auto instr = std::make_unique<MachineInstr>(RVOpcodes::OR);
+                    instr->addOperand(std::make_unique<RegOperand>(dest_vreg));
+                    instr->addOperand(std::make_unique<RegOperand>(lhs_vreg));
+                    instr->addOperand(std::make_unique<RegOperand>(rhs_vreg));
+                    CurMBB->addInstruction(std::move(instr));
+                    break;
+                }
                default:
                    throw std::runtime_error("Unsupported binary instruction in ISel");
            }
@ -758,11 +860,29 @@ void RISCv64ISel::selectNode(DAGNode* node) {
                    CurMBB->addInstruction(std::move(instr));
                    break;
                }
-                case Instruction::kFtoI: { // float to integer
-                    auto instr = std::make_unique<MachineInstr>(RVOpcodes::FCVT_W_S);
-                    instr->addOperand(std::make_unique<RegOperand>(dest_vreg)); // destination is an integer vreg
-                    instr->addOperand(std::make_unique<RegOperand>(src_vreg));  // source is a floating-point vreg
-                    CurMBB->addInstruction(std::move(instr));
+                case Instruction::kFtoI: { // float to integer (C/C++: truncation)
+                    // The C/C++ standards require truncation toward zero; the matching RISC-V rounding mode is RTZ (Round Towards Zero).
+                    // fcvt.w.s takes its rounding mode from the frm field in fcsr (when its rm field is dynamic).
+                    // We set frm=1 (RTZ) manually, perform the conversion, then restore frm=0 (RNE, the default).
+
+                    // 1. fsrmi x0, 1  (set rounding mode to RTZ)
+                    auto fsrmi1 = std::make_unique<MachineInstr>(RVOpcodes::FSRMI);
+                    fsrmi1->addOperand(std::make_unique<RegOperand>(PhysicalReg::ZERO));
+                    fsrmi1->addOperand(std::make_unique<ImmOperand>(1));
+                    CurMBB->addInstruction(std::move(fsrmi1));
+
+                    // 2. fcvt.w.s dest_vreg, src_vreg
+                    auto fcvt = std::make_unique<MachineInstr>(RVOpcodes::FCVT_W_S);
+                    fcvt->addOperand(std::make_unique<RegOperand>(dest_vreg));
+                    fcvt->addOperand(std::make_unique<RegOperand>(src_vreg));
+                    CurMBB->addInstruction(std::move(fcvt));
+
+                    // 3. fsrmi x0, 0  (restore rounding mode to RNE)
+                    auto fsrmi0 = std::make_unique<MachineInstr>(RVOpcodes::FSRMI);
+                    fsrmi0->addOperand(std::make_unique<RegOperand>(PhysicalReg::ZERO));
+                    fsrmi0->addOperand(std::make_unique<ImmOperand>(0));
+                    CurMBB->addInstruction(std::move(fsrmi0));
+
                    break;
                }
                case Instruction::kFNeg: { // 浮点取负
@ -943,7 +1063,7 @@ void RISCv64ISel::selectNode(DAGNode* node) {
            
             // --- Step 3: emit the CALL instruction ---
            auto call_instr = std::make_unique<MachineInstr>(RVOpcodes::CALL);
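-            // [Protocol] If the function returns a value, pass its destination virtual register as the first operand
+            // If the function returns a value, pass its destination virtual register as the first operand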
            if (!call->getType()->isVoid()) {
                unsigned dest_vreg = getVReg(call);
                call_instr->addOperand(std::make_unique<RegOperand>(dest_vreg));
@ -1020,7 +1140,7 @@ void RISCv64ISel::selectNode(DAGNode* node) {
                } else {
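                     // --- Handle integer/pointer return values ---
                     // The return value must be placed in a0
-                    // [V2 advantage] Load constant return values inside the RETURN node
+                    // Load constant return values inside the RETURN node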
                    if (auto const_val = dynamic_cast<ConstantValue*>(ret_val)) {
                        auto li_instr = std::make_unique<MachineInstr>(RVOpcodes::LI);
                        li_instr->addOperand(std::make_unique<RegOperand>(PhysicalReg::A0));
@ -1034,7 +1154,7 @@ void RISCv64ISel::selectNode(DAGNode* node) {
                    }
                }
            }
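-            // [Kept from V1 design] The function epilogue is not emitted by the RETURN node;
+            // The function epilogue is not emitted by the RETURN node;
             // it is handled uniformly by the later AsmPrinter or another pass, a common and effective modular design.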
            auto ret_mi = std::make_unique<MachineInstr>(RVOpcodes::RET);
            CurMBB->addInstruction(std::move(ret_mi));
@ -1048,7 +1168,7 @@ void RISCv64ISel::selectNode(DAGNode* node) {
            auto then_bb_name = cond_br->getThenBlock()->getName();
            auto else_bb_name = cond_br->getElseBlock()->getName();

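-            // [Optimization] Check whether the branch condition is a compile-time constant
+            // Check whether the branch condition is a compile-time constant
             if (auto const_cond = dynamic_cast<ConstantValue*>(condition)) {
                 // If the condition is a constant, emit an unconditional jump J directly instead of BNE
                 if (const_cond->getInt() != 0) { // condition is true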
@ -1063,7 +1183,7 @@ void RISCv64ISel::selectNode(DAGNode* node) {
            } 
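             // If the condition is not a constant, follow the standard path
             else {
-                // [Fix] Emit a load for the condition value (if it is a constant, although that case is handled above)
+                // Emit a load for the condition value (if it is a constant, although that case is handled above)
                 // This step is for logical completeness, in case some other kind of constant was not caught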
                if (auto const_val = dynamic_cast<ConstantValue*>(condition)) {
                    auto li = std::make_unique<MachineInstr>(RVOpcodes::LI);
@ -1097,7 +1217,7 @@ void RISCv64ISel::selectNode(DAGNode* node) {
    }

        case DAGNode::MEMSET: {
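-            // [Kept from V1 design] The core memset expansion logic is correct at the virtual-register level and needs no change.
+            // The core memset expansion logic is correct at the virtual-register level and needs no change.
             // The earlier bug was caused by the virtual registers of its inputs (address, value, size) not being initialized correctly.
             // With the CONSTANT/ALLOCA_ADDR loading problem fixed, this logic now works as intended.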

@ -1143,10 +1263,11 @@ void RISCv64ISel::selectNode(DAGNode* node) {
            auto r_value_byte = getVReg(memset->getValue());
            
             // Create new temporary virtual registers for the memset expansion
-            auto r_counter = getNewVReg();
-            auto r_end_addr = getNewVReg();
-            auto r_current_addr = getNewVReg();
-            auto r_temp_val = getNewVReg();
+            Type* ptr_type = Type::getPointerType(Type::getIntType()); 
+            auto r_counter = getNewVReg(ptr_type);
+            auto r_end_addr = getNewVReg(ptr_type);
+            auto r_current_addr = getNewVReg(ptr_type);
+            auto r_temp_val = getNewVReg(Type::getIntType());

             // Define a few lambdas to simplify instruction creation
            auto add_instr = [&](RVOpcodes op, unsigned rd, unsigned rs1, unsigned rs2) {
@ -1193,15 +1314,13 @@ void RISCv64ISel::selectNode(DAGNode* node) {
            std::string loop_end_label = MFunc->getName() + "_memset_loop_end_" + std::to_string(unique_id);
            std::string remainder_label = MFunc->getName() + "_memset_remainder_" + std::to_string(unique_id);
            std::string done_label = MFunc->getName() + "_memset_done_" + std::to_string(unique_id);
-            
-            // Build the 64-bit fill value
-            addi_instr(RVOpcodes::ANDI, r_temp_val, r_value_byte, 255);
-            addi_instr(RVOpcodes::SLLI, r_value_byte, r_temp_val, 8);
-            add_instr(RVOpcodes::OR, r_temp_val, r_temp_val, r_value_byte);
-            addi_instr(RVOpcodes::SLLI, r_value_byte, r_temp_val, 16);
-            add_instr(RVOpcodes::OR, r_temp_val, r_temp_val, r_value_byte);
-            addi_instr(RVOpcodes::SLLI, r_value_byte, r_temp_val, 32);
-            add_instr(RVOpcodes::OR, r_temp_val, r_temp_val, r_value_byte);
+
+            // Build the 32-bit fill value (replicate one byte 4 times)
+            addi_instr(RVOpcodes::ANDI, r_temp_val, r_value_byte, 255);  // extract the low 8 bits: 000000XX
+            addi_instr(RVOpcodes::SLLI, r_value_byte, r_temp_val, 8);    // shift left by 8: 0000XX00
+            add_instr(RVOpcodes::OR, r_temp_val, r_temp_val, r_value_byte); // merge to get: 0000XXXX
+            addi_instr(RVOpcodes::SLLI, r_value_byte, r_temp_val, 16);   // shift left by 16: XXXX0000
+            add_instr(RVOpcodes::OR, r_temp_val, r_temp_val, r_value_byte); // merge to get the full 32-bit value: XXXXXXXX
            
             // Compute the loop bounds
            add_instr(RVOpcodes::ADD, r_end_addr, r_dest_addr, r_num_bytes);
@ -1209,16 +1328,18 @@ void RISCv64ISel::selectNode(DAGNode* node) {
            mv->addOperand(std::make_unique<RegOperand>(r_current_addr));
            mv->addOperand(std::make_unique<RegOperand>(r_dest_addr));
            CurMBB->addInstruction(std::move(mv));
-            addi_instr(RVOpcodes::ANDI, r_counter, r_num_bytes, -8);
+            // Compute the byte count covered by the main loop (rounded down to a multiple of 4)
+            addi_instr(RVOpcodes::ANDI, r_counter, r_num_bytes, -4);
+            // Compute the end address of the main loop
            add_instr(RVOpcodes::ADD, r_counter, r_dest_addr, r_counter);
            
-            // 8-byte main loop
+            // 4-byte main loop
            label_instr(loop_start_label);
            branch_instr(RVOpcodes::BGEU, r_current_addr, r_counter, loop_end_label);
-            store_instr(RVOpcodes::SD, r_temp_val, r_current_addr, 0);
-            addi_instr(RVOpcodes::ADDI, r_current_addr, r_current_addr, 8);
+            store_instr(RVOpcodes::SW, r_temp_val, r_current_addr, 0); // use sw (store word)
+            addi_instr(RVOpcodes::ADDI, r_current_addr, r_current_addr, 4); // stride changed to 4
            jump_instr(loop_start_label);
-            
+
             // 1-byte tail loop
            label_instr(loop_end_label);
            label_instr(remainder_label);
@ -1235,9 +1356,10 @@ void RISCv64ISel::selectNode(DAGNode* node) {
            auto gep = dynamic_cast<GetElementPtrInst*>(node->value);
            auto result_vreg = getVReg(gep);

+            if (optLevel == 0) {
             // --- Step 1: get the base address (this logic is correct and unchanged) ---
            auto base_ptr_node = node->operands[0];
-            auto current_addr_vreg = getNewVReg();
+            auto current_addr_vreg = getNewVReg(gep->getType());

            if (auto alloca_base = dynamic_cast<AllocaInst*>(base_ptr_node->value)) {
                auto frame_addr_instr = std::make_unique<MachineInstr>(RVOpcodes::FRAME_ADDR);
@ -1279,15 +1401,20 @@ void RISCv64ISel::selectNode(DAGNode* node) {
                 // If the stride is 0 (e.g. indexing a void type or an empty struct), no offset is produced
                 if (stride != 0) {
                     // --- Emit the offset computation for the current index and stride ---
-                    auto offset_vreg = getNewVReg();
-                    auto index_vreg = getVReg(indexValue);
-
-                    // If the index is a constant, first load it into a virtual register with LI
+                    auto offset_vreg = getNewVReg(Type::getIntType());
+                    
+                    // Handle the index: distinguish constants from dynamic values
+                    unsigned index_vreg;
                    if (auto const_index = dynamic_cast<ConstantValue*>(indexValue)) {
+                        // For a constant index, create a fresh virtual register
+                        index_vreg = getNewVReg(Type::getIntType());
                        auto li = std::make_unique<MachineInstr>(RVOpcodes::LI);
                        li->addOperand(std::make_unique<RegOperand>(index_vreg));
                        li->addOperand(std::make_unique<ImmOperand>(const_index->getInt()));
                        CurMBB->addInstruction(std::move(li));
+                    } else {
+                        // For a dynamic index, use the existing virtual register
+                        index_vreg = getVReg(indexValue);
                    }
                    
                     // Optimization: if the stride is 1, the index can be moved (MV) directly as the offset; no multiply is needed
@ -1298,7 +1425,7 @@ void RISCv64ISel::selectNode(DAGNode* node) {
                        CurMBB->addInstruction(std::move(mv));
                    } else {
                         // The stride is not 1, so a multiply instruction is needed
-                        auto size_vreg = getNewVReg();
+                        auto size_vreg = getNewVReg(Type::getIntType());
                        auto li_size = std::make_unique<MachineInstr>(RVOpcodes::LI);
                        li_size->addOperand(std::make_unique<RegOperand>(size_vreg));
                        li_size->addOperand(std::make_unique<ImmOperand>(stride));
@ -1336,6 +1463,106 @@ void RISCv64ISel::selectNode(DAGNode* node) {
            final_mv->addOperand(std::make_unique<RegOperand>(current_addr_vreg));
            CurMBB->addInstruction(std::move(final_mv));
            break;
+        } else {
+            // Handling for -O1
+            // --- Step 1: get the base address ---
+            auto base_ptr_node = node->operands[0];
+            auto base_ptr_val = base_ptr_node->value;
+            
+            // last_step_addr_vreg holds the result of the previous step.
+            // It is first initialized to the GEP's base address.
+            unsigned last_step_addr_vreg; 
+
+            if (auto alloca_base = dynamic_cast<AllocaInst*>(base_ptr_val)) {
+                last_step_addr_vreg = getNewVReg(gep->getType());
+                auto frame_addr_instr = std::make_unique<MachineInstr>(RVOpcodes::FRAME_ADDR);
+                frame_addr_instr->addOperand(std::make_unique<RegOperand>(last_step_addr_vreg));
+                frame_addr_instr->addOperand(std::make_unique<RegOperand>(getVReg(alloca_base)));
+                CurMBB->addInstruction(std::move(frame_addr_instr));
+            } else if (auto global_base = dynamic_cast<GlobalValue*>(base_ptr_val)) {
+                last_step_addr_vreg = getNewVReg(gep->getType());
+                auto la_instr = std::make_unique<MachineInstr>(RVOpcodes::LA);
+                la_instr->addOperand(std::make_unique<RegOperand>(last_step_addr_vreg));
+                la_instr->addOperand(std::make_unique<LabelOperand>(global_base->getName()));
+                CurMBB->addInstruction(std::move(la_instr));
+            } else {
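+                // For function arguments or pointers produced by other instructions, use their vreg directly.
+                // This vreg must be preserved and must not be modified during the computation.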
+                last_step_addr_vreg = getVReg(base_ptr_val);
+            }
+
+            // --- Step 2: compute the address iteratively, following LLVM GEP semantics ---
+            Type* current_type = gep->getBasePointer()->getType()->as<PointerType>()->getBaseType();
+
+            for (size_t i = 0; i < gep->getNumIndices(); ++i) {
+                Value* indexValue = gep->getIndex(i);
+                unsigned stride = getTypeSizeInBytes(current_type);
+                
+                if (stride != 0) {
+                    // --- Emit the offset computation for the current index and stride ---
+                    auto offset_vreg = getNewVReg(Type::getIntType());
+                    
+                    unsigned index_vreg;
+                    if (auto const_index = dynamic_cast<ConstantValue*>(indexValue)) {
+                        index_vreg = getNewVReg(Type::getIntType());
+                        auto li = std::make_unique<MachineInstr>(RVOpcodes::LI);
+                        li->addOperand(std::make_unique<RegOperand>(index_vreg));
+                        li->addOperand(std::make_unique<ImmOperand>(const_index->getInt()));
+                        CurMBB->addInstruction(std::move(li));
+                    } else {
+                        index_vreg = getVReg(indexValue);
+                    }
+                    
+                    if (stride == 1) {
+                        auto mv = std::make_unique<MachineInstr>(RVOpcodes::MV);
+                        mv->addOperand(std::make_unique<RegOperand>(offset_vreg));
+                        mv->addOperand(std::make_unique<RegOperand>(index_vreg));
+                        CurMBB->addInstruction(std::move(mv));
+                    } else {
+                        auto size_vreg = getNewVReg(Type::getIntType());
+                        auto li_size = std::make_unique<MachineInstr>(RVOpcodes::LI);
+                        li_size->addOperand(std::make_unique<RegOperand>(size_vreg));
+                        li_size->addOperand(std::make_unique<ImmOperand>(stride));
+                        CurMBB->addInstruction(std::move(li_size));
+                        
+                        auto mul = std::make_unique<MachineInstr>(RVOpcodes::MULW);
+                        mul->addOperand(std::make_unique<RegOperand>(offset_vreg));
+                        mul->addOperand(std::make_unique<RegOperand>(index_vreg));
+                        mul->addOperand(std::make_unique<RegOperand>(size_vreg));
+                        CurMBB->addInstruction(std::move(mul));
+                    }
+
+                    // --- Key fix ---
+                    // Create a new vreg to hold the result of this addition.
+                    unsigned current_step_addr_vreg = getNewVReg(gep->getType());
+                    
+                    // Emit: add current_step, last_step, offset
+                    // This ensures last_step_addr_vreg (the input) is never modified in place.
+                    auto add = std::make_unique<MachineInstr>(RVOpcodes::ADD);
+                    add->addOperand(std::make_unique<RegOperand>(current_step_addr_vreg));
+                    add->addOperand(std::make_unique<RegOperand>(last_step_addr_vreg));
+                    add->addOperand(std::make_unique<RegOperand>(offset_vreg));
+                    CurMBB->addInstruction(std::move(add));
+
+                    // This step's result becomes the input of the next step.
+                    last_step_addr_vreg = current_step_addr_vreg;
+                }
+
+                // --- Update the type for the next iteration ---
+                if (auto array_type = current_type->as<ArrayType>()) {
+                    current_type = array_type->getElementType();
+                } else if (auto ptr_type = current_type->as<PointerType>()) {
+                    current_type = ptr_type->getBaseType();
+                }
+            }
+            
+            // --- Step 3: move the final address into the GEP's destination virtual register ---
+            auto final_mv = std::make_unique<MachineInstr>(RVOpcodes::MV);
+            final_mv->addOperand(std::make_unique<RegOperand>(result_vreg));
+            final_mv->addOperand(std::make_unique<RegOperand>(last_step_addr_vreg));
+            CurMBB->addInstruction(std::move(final_mv));
+            break;
+        }
        }

        default:
@ -1445,7 +1672,7 @@ std::vector<std::unique_ptr<RISCv64ISel::DAGNode>> RISCv64ISel::build_dag(BasicB
            
             // Append each index in order as the following operands
            for (auto index : gep->getIndices()) {
-                // [Fix] Get the actual Value* from the Use object
+                // Get the actual Value* from the Use object
                gep_node->operands.push_back(get_operand_node(index->getValue(), value_to_node, nodes_storage));
            }
        } else if (auto load = dynamic_cast<LoadInst*>(inst)) {
@ -1473,7 +1700,7 @@ std::vector<std::unique_ptr<RISCv64ISel::DAGNode>> RISCv64ISel::build_dag(BasicB
                    }
                }
            }
-            if (bin->getKind() >= Instruction::kFAdd) { // assumes floating-point instructions have larger enum values
+            if (bin->isFPBinary()) { // floating-point binary instruction
                auto fbin_node = create_node(DAGNode::FBINARY, bin, value_to_node, nodes_storage);
                fbin_node->operands.push_back(get_operand_node(bin->getLhs(), value_to_node, nodes_storage));
                fbin_node->operands.push_back(get_operand_node(bin->getRhs(), value_to_node, nodes_storage));
@ -1549,7 +1776,7 @@ unsigned RISCv64ISel::getTypeSizeInBytes(Type* type) {
    }
 }

-// [New] Helper that prints the DAG for debugging
+// Helper that prints the DAG for debugging
 void RISCv64ISel::print_dag(const std::vector<std::unique_ptr<DAGNode>>& dag, const std::string& bb_name) {
     // Check for a DEBUG macro or global variable to avoid printing outside debug mode
    // if (!DEBUG) return; 
@ -1645,4 +1872,8 @@ void RISCv64ISel::print_dag(const std::vector<std::unique_ptr<DAGNode>>& dag, co
    std::cerr << "======================================\n\n";
 }

+unsigned int RISCv64ISel::getVRegCounter() const {
+    return vreg_counter;
+}
+
 } // namespace sysy
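
The MEMSET hunk above replicates the low byte of the fill value into a 32-bit word and rounds the byte count down to a multiple of 4 for the `sw` loop. The arithmetic can be checked on the host with a minimal sketch; `splatByte32` and `wordLoopBytes` are hypothetical names used only for illustration and are not part of the repository.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: mirrors the ANDI / SLLI / OR sequence of the MEMSET expansion.
inline std::uint32_t splatByte32(std::uint32_t value) {
    std::uint32_t v = value & 0xFFu;  // andi tmp, value, 255   -> 000000XX
    v |= v << 8;                      // slli by 8, then or     -> 0000XXXX
    v |= v << 16;                     // slli by 16, then or    -> XXXXXXXX
    return v;
}

// Hypothetical helper: bytes handled by the 4-byte main loop (andi counter, n, -4).
// The remaining (numBytes % 4) bytes are left to the 1-byte tail loop.
inline std::uint64_t wordLoopBytes(std::uint64_t numBytes) {
    return numBytes & ~std::uint64_t{3};
}
```

For example, a 10-byte memset would store two words in the main loop and leave 2 bytes for the tail loop.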
--- a/src/backend/RISCv64/RISCv64LLIR.cpp
+++ b/src/backend/RISCv64/RISCv64LLIR.cpp
@ -1,6 +1,195 @@
 #include "RISCv64LLIR.h"
+#include "RISCv64Info.h"
 #include <vector>
+#include <iostream> // for std::ostream and std::cerr
+#include <string>   // for std::string

 namespace sysy {

-}
+// Helper: convert a PhysicalReg enum value to a readable string
+std::string regToString(PhysicalReg reg) {
+    switch (reg) {
+        case PhysicalReg::ZERO: return "x0";  case PhysicalReg::RA: return "ra";
+        case PhysicalReg::SP: return "sp";    case PhysicalReg::GP: return "gp";
+        case PhysicalReg::TP: return "tp";    case PhysicalReg::T0: return "t0";
+        case PhysicalReg::T1: return "t1";    case PhysicalReg::T2: return "t2";
+        case PhysicalReg::S0: return "s0";    case PhysicalReg::S1: return "s1";
+        case PhysicalReg::A0: return "a0";    case PhysicalReg::A1: return "a1";
+        case PhysicalReg::A2: return "a2";    case PhysicalReg::A3: return "a3";
+        case PhysicalReg::A4: return "a4";    case PhysicalReg::A5: return "a5";
+        case PhysicalReg::A6: return "a6";    case PhysicalReg::A7: return "a7";
+        case PhysicalReg::S2: return "s2";    case PhysicalReg::S3: return "s3";
+        case PhysicalReg::S4: return "s4";    case PhysicalReg::S5: return "s5";
+        case PhysicalReg::S6: return "s6";    case PhysicalReg::S7: return "s7";
+        case PhysicalReg::S8: return "s8";    case PhysicalReg::S9: return "s9";
+        case PhysicalReg::S10: return "s10";  case PhysicalReg::S11: return "s11";
+        case PhysicalReg::T3: return "t3";    case PhysicalReg::T4: return "t4";
+        case PhysicalReg::T5: return "t5";    case PhysicalReg::T6: return "t6";
+        case PhysicalReg::F0: return "f0";    case PhysicalReg::F1: return "f1";
+        case PhysicalReg::F2: return "f2";    case PhysicalReg::F3: return "f3";
+        case PhysicalReg::F4: return "f4";    case PhysicalReg::F5: return "f5";
+        case PhysicalReg::F6: return "f6";    case PhysicalReg::F7: return "f7";
+        case PhysicalReg::F8: return "f8";    case PhysicalReg::F9: return "f9";
+        case PhysicalReg::F10: return "f10";  case PhysicalReg::F11: return "f11";
+        case PhysicalReg::F12: return "f12";  case PhysicalReg::F13: return "f13";
+        case PhysicalReg::F14: return "f14";  case PhysicalReg::F15: return "f15";
+        case PhysicalReg::F16: return "f16";  case PhysicalReg::F17: return "f17";
+        case PhysicalReg::F18: return "f18";  case PhysicalReg::F19: return "f19";
+        case PhysicalReg::F20: return "f20";  case PhysicalReg::F21: return "f21";
+        case PhysicalReg::F22: return "f22";  case PhysicalReg::F23: return "f23";
+        case PhysicalReg::F24: return "f24";  case PhysicalReg::F25: return "f25";
+        case PhysicalReg::F26: return "f26";  case PhysicalReg::F27: return "f27";
+        case PhysicalReg::F28: return "f28";  case PhysicalReg::F29: return "f29";
+        case PhysicalReg::F30: return "f30";  case PhysicalReg::F31: return "f31";
+        default: return "UNKNOWN_REG";
+    }
+}
+
+// Full implementation of the stack frame info dump
+void MachineFunction::dumpStackFrameInfo(std::ostream& os) const {
+    const StackFrameInfo& info = frame_info;
+
+    os << "--- Stack Frame Info for function '" << getName() << "' ---\n";
+
+    // Print size information
+    os << "  Sizes:\n";
+    os << "    Total Size:          " << info.total_size << " bytes\n";
+    os << "    Locals Size:         " << info.locals_size << " bytes\n";
+    os << "    Spill Size:          " << info.spill_size << " bytes\n";
+    os << "    Callee-Saved Size:   " << info.callee_saved_size << " bytes\n";
+    os << "\n";
+
+    // Print the offsets of alloca variables
+    os << "  Alloca Offsets (vreg -> offset from FP):\n";
+    if (info.alloca_offsets.empty()) {
+        os << "    (None)\n";
+    } else {
+        for (const auto& pair : info.alloca_offsets) {
+            os << "    %vreg" << pair.first << " -> " << pair.second << "\n";
+        }
+    }
+    os << "\n";
+
+    // Print the offsets of spilled variables
+    os << "  Spill Offsets (vreg -> offset from FP):\n";
+    if (info.spill_offsets.empty()) {
+        os << "    (None)\n";
+    } else {
+        for (const auto& pair : info.spill_offsets) {
+            os << "    %vreg" << pair.first << " -> " << pair.second << "\n";
+        }
+    }
+    os << "\n";
+
+    // Print the callee-saved registers in use
+    os << "  Used Callee-Saved Registers:\n";
+    if (info.used_callee_saved_regs.empty()) {
+        os << "    (None)\n";
+    } else {
+        os << "    { ";
+        for (const auto& reg : info.used_callee_saved_regs) {
+            os << regToString(reg) << " ";
+        }
+        os << "}\n";
+    }
+    os << "\n";
+
+    // Print the callee-saved registers to save/restore (ordered)
+    os << "  Callee-Saved Registers to Store/Restore:\n";
+    if (info.callee_saved_regs_to_store.empty()) {
+        os << "    (None)\n";
+    } else {
+        os << "    [ ";
+        for (const auto& reg : info.callee_saved_regs_to_store) {
+            os << regToString(reg) << " ";
+        }
+        os << "]\n";
+    }
+    os << "\n";
+
+    // Print the final register allocation result
+    os << "  Final Register Allocation Map (vreg -> preg):\n";
+    if (info.vreg_to_preg_map.empty()) {
+        os << "    (None)\n";
+    } else {
+        for (const auto& pair : info.vreg_to_preg_map) {
+            os << "    %vreg" << pair.first << " -> " << regToString(pair.second) << "\n";
+        }
+    }
+
+    os << "---------------------------------------------------\n";
+}
+
+/**
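+ * @brief (Added for emergency spill mode) Replace every reference to a given virtual register in this instruction with the specified physical register.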
+ */
+void MachineInstr::replaceVRegWithPReg(unsigned old_vreg, PhysicalReg preg) {
+    for (auto& op : operands) {
+        if (op->getKind() == MachineOperand::KIND_REG) {
+            auto reg_op = static_cast<RegOperand*>(op.get());
+            if (reg_op->isVirtual() && reg_op->getVRegNum() == old_vreg) {
+                // Convert the virtual register operand into a physical register operand in place
+                reg_op->setPReg(preg);
+            }
+        } else if (op->getKind() == MachineOperand::KIND_MEM) {
+            // Also handle the base register of memory operands
+            auto mem_op = static_cast<MemOperand*>(op.get());
+            auto base_reg = mem_op->getBase();
+            if (base_reg->isVirtual() && base_reg->getVRegNum() == old_vreg) {
+                base_reg->setPReg(preg);
+            }
+        }
+    }
+}
+
+/**
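+ * @brief (Added for regular spill mode) Remap the virtual registers in this instruction according to the given maps.
+ * The logic closely mirrors RISCv64LinearScan::getInstrUseDef, because it also needs
+ * to know which operands are uses and which are defs.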
+ */
+void MachineInstr::remapVRegs(const std::map<unsigned, unsigned>& use_remap, const std::map<unsigned, unsigned>& def_remap) {
+    auto opcode = getOpcode();
+
+    // Helper lambda that remaps a register operand
+    auto remap_reg_op = [](RegOperand* reg_op, const std::map<unsigned, unsigned>& remap) {
+        if (reg_op->isVirtual() && remap.count(reg_op->getVRegNum())) {
+            reg_op->setVRegNum(remap.at(reg_op->getVRegNum()));
+        }
+    };
+    
+    // Use the instruction info table (op_info) to determine uses and defs
+    if (op_info.count(opcode)) {
+        const auto& info = op_info.at(opcode);
+        // Remap def operands
+        for (int idx : info.first) {
+            if (idx < operands.size() && operands[idx]->getKind() == MachineOperand::KIND_REG) {
+                remap_reg_op(static_cast<RegOperand*>(operands[idx].get()), def_remap);
+            }
+        }
+        // Remap use operands
+        for (int idx : info.second) {
+            if (idx < operands.size()) {
+                if (operands[idx]->getKind() == MachineOperand::KIND_REG) {
+                    remap_reg_op(static_cast<RegOperand*>(operands[idx].get()), use_remap);
+                } else if (operands[idx]->getKind() == MachineOperand::KIND_MEM) {
+                    // The base register of a memory operand is always a use
+                    remap_reg_op(static_cast<MemOperand*>(operands[idx].get())->getBase(), use_remap);
+                }
+            }
+        }
+    } else if (opcode == RVOpcodes::CALL) {
+        // Special case: the CALL instruction
+        // The first operand (if present and a register) is the def
+        if (!operands.empty() && operands[0]->getKind() == MachineOperand::KIND_REG) {
+            remap_reg_op(static_cast<RegOperand*>(operands[0].get()), def_remap);
+        }
+        // The remaining register operands are uses
+        for (size_t i = 1; i < operands.size(); ++i) {
+            if (operands[i]->getKind() == MachineOperand::KIND_REG) {
+                remap_reg_op(static_cast<RegOperand*>(operands[i].get()), use_remap);
+            }
+        }
+    }
+}
+
+}
--- a/src/backend/RISCv64/RISCv64LinearScan.cpp
+++ b/src/backend/RISCv64/RISCv64LinearScan.cpp
@ -0,0 +1,694 @@
+#include "RISCv64LinearScan.h"
+#include "RISCv64LLIR.h"
+#include "RISCv64ISel.h"
+#include "RISCv64Info.h"
+#include "RISCv64AsmPrinter.h"
+#include <iostream>
+#include <algorithm>
+#include <set>
+#include <sstream>
+#include <functional>
+
+// Externally defined debug-level control variables
+extern int DEBUG;
+extern int DEEPDEBUG;
+extern int DEEPERDEBUG;
+
+namespace sysy {
+
+// --- Debug helper functions ---
+// These helpers are self-contained and only used for logging.
+static std::string pregToString(PhysicalReg preg) {
+    // This map is a copy from AsmPrinter to avoid dependency issues.
+    static const std::map<PhysicalReg, std::string> preg_names = {
+        {PhysicalReg::ZERO, "zero"}, {PhysicalReg::RA, "ra"}, {PhysicalReg::SP, "sp"}, {PhysicalReg::GP, "gp"}, {PhysicalReg::TP, "tp"},
+        {PhysicalReg::T0, "t0"}, {PhysicalReg::T1, "t1"}, {PhysicalReg::T2, "t2"}, {PhysicalReg::T3, "t3"}, {PhysicalReg::T4, "t4"}, {PhysicalReg::T5, "t5"}, {PhysicalReg::T6, "t6"},
+        {PhysicalReg::S0, "s0"}, {PhysicalReg::S1, "s1"}, {PhysicalReg::S2, "s2"}, {PhysicalReg::S3, "s3"}, {PhysicalReg::S4, "s4"}, {PhysicalReg::S5, "s5"}, {PhysicalReg::S6, "s6"}, {PhysicalReg::S7, "s7"}, {PhysicalReg::S8, "s8"}, {PhysicalReg::S9, "s9"}, {PhysicalReg::S10, "s10"}, {PhysicalReg::S11, "s11"},
+        {PhysicalReg::A0, "a0"}, {PhysicalReg::A1, "a1"}, {PhysicalReg::A2, "a2"}, {PhysicalReg::A3, "a3"}, {PhysicalReg::A4, "a4"}, {PhysicalReg::A5, "a5"}, {PhysicalReg::A6, "a6"}, {PhysicalReg::A7, "a7"},
+        {PhysicalReg::F0, "f0"}, {PhysicalReg::F1, "f1"}, {PhysicalReg::F2, "f2"}, {PhysicalReg::F3, "f3"}, {PhysicalReg::F4, "f4"}, {PhysicalReg::F5, "f5"}, {PhysicalReg::F6, "f6"}, {PhysicalReg::F7, "f7"},
+        {PhysicalReg::F8, "f8"}, {PhysicalReg::F9, "f9"}, {PhysicalReg::F10, "f10"}, {PhysicalReg::F11, "f11"}, {PhysicalReg::F12, "f12"}, {PhysicalReg::F13, "f13"}, {PhysicalReg::F14, "f14"}, {PhysicalReg::F15, "f15"},
+        {PhysicalReg::F16, "f16"}, {PhysicalReg::F17, "f17"}, {PhysicalReg::F18, "f18"}, {PhysicalReg::F19, "f19"}, {PhysicalReg::F20, "f20"}, {PhysicalReg::F21, "f21"}, {PhysicalReg::F22, "f22"}, {PhysicalReg::F23, "f23"},
+        {PhysicalReg::F24, "f24"}, {PhysicalReg::F25, "f25"}, {PhysicalReg::F26, "f26"}, {PhysicalReg::F27, "f27"}, {PhysicalReg::F28, "f28"}, {PhysicalReg::F29, "f29"}, {PhysicalReg::F30, "f30"}, {PhysicalReg::F31, "f31"},
+        {PhysicalReg::INVALID, "INVALID"}
+    };
+    if (preg_names.count(preg)) return preg_names.at(preg);
+    return "UnknownPreg";
+}
+
+template<typename T>
+static std::string setToString(const std::set<T>& s, std::function<std::string(T)> formatter) {
+    std::stringstream ss;
+    ss << "{ ";
+    bool first = true;
+    for (const auto& item : s) {
+        if (!first) ss << ", ";
+        ss << formatter(item);
+        first = false;
+    }
+    ss << " }";
+    return ss.str();
+}
+
+static std::string vregSetToString(const std::set<unsigned>& s) {
+    return setToString<unsigned>(s, [](unsigned v){ return "%v" + std::to_string(v); });
+}
+
+static std::string pregSetToString(const std::set<PhysicalReg>& s) {
+    return setToString<PhysicalReg>(s, pregToString);
+}
+
+// Helper function to check if a register is callee-saved.
+// Defined locally to avoid scope issues.
+static bool isCalleeSaved(PhysicalReg preg) {
+    if (preg >= PhysicalReg::S0 && preg <= PhysicalReg::S11) return true;
+    if (preg >= PhysicalReg::F8 && preg <= PhysicalReg::F9) return true;
+    if (preg >= PhysicalReg::F18 && preg <= PhysicalReg::F27) return true;
+    return false;
+}
+
+
+RISCv64LinearScan::RISCv64LinearScan(MachineFunction* mfunc)
+    : MFunc(mfunc), 
+      ISel(mfunc->getISel()),
+      vreg_type_map(ISel->getVRegTypeMap()) {
+    
+    allocable_int_regs = {
+        PhysicalReg::T0, PhysicalReg::T1, PhysicalReg::T2, PhysicalReg::T3, PhysicalReg::T6,
+        PhysicalReg::S1, PhysicalReg::S2, PhysicalReg::S3, PhysicalReg::S4, PhysicalReg::S5, PhysicalReg::S6, PhysicalReg::S7,
+        PhysicalReg::S8, PhysicalReg::S9, PhysicalReg::S10, PhysicalReg::S11,
+    };
+    allocable_fp_regs = {
+        PhysicalReg::F0, PhysicalReg::F1, PhysicalReg::F2, PhysicalReg::F3, PhysicalReg::F4, PhysicalReg::F5, PhysicalReg::F6, PhysicalReg::F7,
+        PhysicalReg::F10, PhysicalReg::F11, PhysicalReg::F12, PhysicalReg::F13, PhysicalReg::F14, PhysicalReg::F15, PhysicalReg::F16, PhysicalReg::F17,
+        PhysicalReg::F8, PhysicalReg::F9, PhysicalReg::F18, PhysicalReg::F19, PhysicalReg::F20, PhysicalReg::F21, PhysicalReg::F22,
+        PhysicalReg::F23, PhysicalReg::F24, PhysicalReg::F25, PhysicalReg::F26, PhysicalReg::F27,
+        PhysicalReg::F28, PhysicalReg::F29, PhysicalReg::F30, PhysicalReg::F31,
+    };
+    if (MFunc->getFunc()) {
+        int int_arg_idx = 0;
+        int fp_arg_idx = 0;
+        for (Argument* arg : MFunc->getFunc()->getArguments()) {
+            unsigned arg_vreg = ISel->getVReg(arg);
+            if (arg->getType()->isFloat()) {
+                if (fp_arg_idx < 8) {
+                    auto preg = static_cast<PhysicalReg>(static_cast<int>(PhysicalReg::F10) + fp_arg_idx++);
+                    abi_vreg_map[arg_vreg] = preg;
+                }
+            } else {
+                if (int_arg_idx < 8) {
+                    auto preg = static_cast<PhysicalReg>(static_cast<int>(PhysicalReg::A0) + int_arg_idx++);
+                    abi_vreg_map[arg_vreg] = preg;
+                }
+            }
+        }
+    }
+}
+
+bool RISCv64LinearScan::run() {
+    if (DEBUG) std::cerr << "===== [LSRA] Running for function: " << MFunc->getName() << " =====\n";
+    
+    const int MAX_ITERATIONS = 3;
+
+    for (int iteration = 1; ; ++iteration) {
+        if (DEBUG && iteration > 1) {
+             std::cerr << "\n----- [LSRA] Re-running iteration " << iteration << " -----\n";
+        }
+        
+        linearizeBlocks();
+        computeLiveIntervals();
+        bool needs_spill = linearScan();
+
+        // If this round of linear scan needs no spills, allocation succeeded; leave the loop.
+        if (!needs_spill) {
+            break; 
+        }
+
+        // --- Check whether to start the fallback strategy, or whether it has already failed ---
+        if (iteration > MAX_ITERATIONS) {
+            // If we have already run in fallback mode and this round of linearScan still returns true,
+            // an unrecoverable error has occurred, and only now do we actually fail.
+            if (conservative_spill_mode) {
+                std::cerr << "\n!!!!!! [LSRA-FATAL] Allocation failed to converge even in Conservative Spill Mode. Triggering final fallback. !!!!!!\n\n";
+                return false; // Report failure to the caller instead of calling exit
+            }
+            // First time the iteration limit is exceeded: trigger the fallback strategy.
+            std::cerr << "\n!!!!!! [LSRA-WARN] Convergence failed after " << MAX_ITERATIONS 
+                      << " iterations. Entering Conservative Spill Mode for the next attempt. !!!!!!\n\n";
+            conservative_spill_mode = true; // Enable conservative spill mode; rewriteProgram() below and later iterations see it
+        }
+        
+        // Whenever spills are needed, rewrite the program
+        if (DEBUG) std::cerr << "[LSRA] Spilling detected, will rewrite program.\n";
+        rewriteProgram();
+    }
+
+    if (DEBUG) std::cerr << "[LSRA] Applying final allocation.\n";
+    applyAllocation();
+    MFunc->getFrameInfo().vreg_to_preg_map = this->vreg_to_preg_map;
+    collectUsedCalleeSavedRegs();
+
+    if (DEBUG) std::cerr << "===== [LSRA] Finished for function: " << MFunc->getName() << " =====\n\n";
+    return true; // Allocation succeeded
+}
+
+void RISCv64LinearScan::linearizeBlocks() {
+    linear_order_blocks.clear();
+    for (auto& mbb : MFunc->getBlocks()) {
+        linear_order_blocks.push_back(mbb.get());
+    }
+}
+
+void RISCv64LinearScan::computeLiveIntervals() {
+    if (DEBUG) std::cerr << "[LSRA-Live] Starting live interval computation.\n";
+    instr_numbering.clear();
+    live_intervals.clear();
+    unhandled.clear();
+
+    int num = 0;
+    std::set<int> call_locations;
+    for (auto* mbb : linear_order_blocks) {
+        for (auto& instr : mbb->getInstructions()) {
+            instr_numbering[instr.get()] = num;
+            if (instr->getOpcode() == RVOpcodes::CALL) call_locations.insert(num);
+            num += 2;
+        }
+    }
+
+    if (DEEPDEBUG) std::cerr << "  [Live] Starting live variable dataflow analysis...\n";
+    std::map<const MachineBasicBlock*, std::set<unsigned>> live_in, live_out;
+    bool changed = true;
+    int df_iter = 0;
+    while(changed) {
+        changed = false;
+        df_iter++;
+        std::vector<MachineBasicBlock*> reversed_blocks = linear_order_blocks;
+        std::reverse(reversed_blocks.begin(), reversed_blocks.end());
+        for(auto* mbb : reversed_blocks) {
+            std::set<unsigned> old_live_in = live_in[mbb];
+            std::set<unsigned> current_live_out;
+            for (auto* succ : mbb->successors) current_live_out.insert(live_in[succ].begin(), live_in[succ].end());
+            std::set<unsigned> use, def;
+            std::set<unsigned> temp_live = current_live_out;
+            auto& instrs = mbb->getInstructions();
+            for (auto it = instrs.rbegin(); it != instrs.rend(); ++it) {
+                use.clear(); def.clear();
+                getInstrUseDef(it->get(), use, def);
+                for (unsigned vreg : def) temp_live.erase(vreg);
+                for (unsigned vreg : use) temp_live.insert(vreg);
+            }
+            if (live_in[mbb] != temp_live || live_out[mbb] != current_live_out) {
+                changed = true;
+                live_in[mbb] = temp_live;
+                live_out[mbb] = current_live_out;
+            }
+        }
+    }
+    if (DEEPDEBUG) std::cerr << "  [Live] Dataflow analysis converged after " << df_iter << " iterations.\n";
+    if (DEEPERDEBUG) {
+        std::cerr << "  [Live-Debug] Live-in sets:\n";
+        for (auto* mbb : linear_order_blocks) std::cerr << "    " << mbb->getName() << ": " << vregSetToString(live_in[mbb]) << "\n";
+        std::cerr << "  [Live-Debug] Live-out sets:\n";
+        for (auto* mbb : linear_order_blocks) std::cerr << "    " << mbb->getName() << ": " << vregSetToString(live_out[mbb]) << "\n";
+    }
+
+    if (DEEPDEBUG) std::cerr << "  [Live] Building precise intervals...\n";
+    std::map<unsigned, int> first_def, last_use;
+    for (auto* mbb : linear_order_blocks) {
+        for (auto& instr_ptr : mbb->getInstructions()) {
+            int instr_num = instr_numbering.at(instr_ptr.get());
+            std::set<unsigned> use, def;
+            getInstrUseDef(instr_ptr.get(), use, def);
+            for (unsigned vreg : def) if (first_def.find(vreg) == first_def.end()) first_def[vreg] = instr_num;
+            for (unsigned vreg : use) last_use[vreg] = instr_num;
+        }
+    }
+    if (DEEPERDEBUG) {
+        std::cerr << "  [Live-Debug] First def points:\n";
+        for (auto const& [vreg, pos] : first_def) std::cerr << "    %v" << vreg << ": " << pos << "\n";
+        std::cerr << "  [Live-Debug] Last use points:\n";
+        for (auto const& [vreg, pos] : last_use) std::cerr << "    %v" << vreg << ": " << pos << "\n";
+    }
+
+    for (auto const& [vreg, start] : first_def) {
+        live_intervals.emplace(vreg, LiveInterval(vreg));
+        auto& interval = live_intervals.at(vreg);
+        interval.start = start;
+        interval.end = last_use.count(vreg) ? last_use.at(vreg) : start;
+    }
+
+    for (auto const& [mbb, live_set] : live_out) {
+        if (mbb->getInstructions().empty()) continue;
+        int block_end_num = instr_numbering.at(mbb->getInstructions().back().get());
+        for (unsigned vreg : live_set) {
+            if (live_intervals.count(vreg)) {
+                if (DEEPERDEBUG && live_intervals.at(vreg).end < block_end_num) {
+                    std::cerr << "  [Live-Debug] Extending interval for %v" << vreg << " from " << live_intervals.at(vreg).end << " to " << block_end_num << " due to live_out of " << mbb->getName() << "\n";
+                }
+                live_intervals.at(vreg).end = std::max(live_intervals.at(vreg).end, block_end_num);
+            }
+        }
+    }
+    
+    for (auto& pair : live_intervals) {
+        auto& interval = pair.second;
+        auto it = call_locations.lower_bound(interval.start);
+        if (it != call_locations.end() && *it < interval.end) interval.crosses_call = true;
+    }
+
+    for (auto& pair : live_intervals) unhandled.push_back(&pair.second);
+    std::sort(unhandled.begin(), unhandled.end(), [](const LiveInterval* a, const LiveInterval* b){ return a->start < b->start; });
+
+    if (DEBUG) {
+        std::cerr << "[LSRA-Live] Finished. Total intervals: " << unhandled.size() << "\n";
+        if (DEEPDEBUG) {
+            std::cerr << "  [Live] Computed Intervals (vreg: [start, end]):\n";
+            for(const auto* interval : unhandled) {
+                std::cerr << "    %v" << interval->vreg << ": [" << interval->start << ", " << interval->end << "]" 
+                          << (interval->crosses_call ? " (crosses call)" : "") << "\n";
+            }
+        }
+    }
+
+    // ================== Additional debug code ==================
+    // Check that the vregs found by liveness analysis match those found by the instruction scan
+    if (DEEPERDEBUG) {
+        // VRegs that received a live interval
+        std::set<unsigned> vregs_from_liveness;
+        for (const auto& pair : live_intervals) {
+            vregs_from_liveness.insert(pair.first);
+        }
+
+        std::set<unsigned> vregs_from_instr_scan;
+        for (auto* mbb : linear_order_blocks) {
+            for (auto& instr_ptr : mbb->getInstructions()) {
+                std::set<unsigned> use, def;
+                getInstrUseDef(instr_ptr.get(), use, def);
+                vregs_from_instr_scan.insert(use.begin(), use.end());
+                vregs_from_instr_scan.insert(def.begin(), def.end());
+            }
+        }
+        
+        std::cerr << "  [Live-Debug] VReg Consistency Check:\n";
+        std::cerr << "    VRegs found by Liveness Analysis: " << vregs_from_liveness.size() << "\n";
+        std::cerr << "    VRegs found by getInstrUseDef Scan: " << vregs_from_instr_scan.size() << "\n";
+
+        // VRegs with a live interval that the instruction scan did not report
+        std::set<unsigned> diff;
+        std::set_difference(vregs_from_liveness.begin(), vregs_from_liveness.end(),
+                            vregs_from_instr_scan.begin(), vregs_from_instr_scan.end(),
+                            std::inserter(diff, diff.begin()));
+
+        if (!diff.empty()) {
+            std::cerr << "    !!!!!! [Live-Debug] DISCREPANCY DETECTED !!!!!!\n";
+            std::cerr << "    The following vregs were found by liveness but NOT by getInstrUseDef scan:\n";
+            std::cerr << "    " << vregSetToString(diff) << "\n";
+        } else {
+            std::cerr << "    [Live-Debug] VReg sets are consistent.\n";
+        }
+    }
+    // ======================================================
+}
+
+bool RISCv64LinearScan::linearScan() {
+    // ================== Last-resort fallback strategy (new logic) ==================
+    // When this flag is true, we enter the most aggressive spill mode.
+    if (conservative_spill_mode) {
+        if (DEBUG) std::cerr << "[LSRA-Scan-Panic] In Conservative Mode. Spilling all unhandled vregs.\n";
+        
+        // 1. Clear the spill set so it can be recomputed
+        spilled_vregs.clear();
+
+        // 2. Walk every computed live interval
+        for (auto& pair : live_intervals) {
+            // 3. Any vreg that is not pinned to a register by the ABI must be spilled
+            if (abi_vreg_map.find(pair.first) == abi_vreg_map.end()) {
+                spilled_vregs.insert(pair.first);
+            }
+        }
+        
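+        // 4. If any vreg was marked for spilling, return true to trigger the final rewriteProgram.
+        //    The intent is that by the next iteration every vreg has been rewritten, so no new spills appear and the loop converges.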
+        return !spilled_vregs.empty();
+    }
+    // ==========================================================
+    
+
+    // ================== Regular linear scan logic (existing code) ==================
+    // The code below runs only when conservative mode is off
+    if (DEBUG) std::cerr << "[LSRA-Scan] Starting main linear scan algorithm.\n";
+    active.clear();
+    spilled_vregs.clear();
+    vreg_to_preg_map.clear();
+
+    std::set<PhysicalReg> free_caller_int_regs, free_callee_int_regs;
+    std::set<PhysicalReg> free_caller_fp_regs, free_callee_fp_regs;
+
+    for (auto preg : allocable_int_regs) {
+        if (isCalleeSaved(preg)) free_callee_int_regs.insert(preg); else free_caller_int_regs.insert(preg);
+    }
+    for (auto preg : allocable_fp_regs) {
+        if (isCalleeSaved(preg)) free_callee_fp_regs.insert(preg); else free_caller_fp_regs.insert(preg);
+    }
+
+    if (DEEPDEBUG) {
+        std::cerr << "  [Scan] Initial free regs:\n";
+        std::cerr << "    Caller-Saved Int: " << pregSetToString(free_caller_int_regs) << "\n";
+        std::cerr << "    Callee-Saved Int: " << pregSetToString(free_callee_int_regs) << "\n";
+    }
+
+    vreg_to_preg_map.insert(abi_vreg_map.begin(), abi_vreg_map.end());
+    std::vector<LiveInterval*> normal_unhandled;
+    for(LiveInterval* interval : unhandled) {
+        if(abi_vreg_map.count(interval->vreg)) {
+            active.push_back(interval);
+            PhysicalReg preg = abi_vreg_map.at(interval->vreg);
+            if (isFPVReg(interval->vreg)) {
+                if(isCalleeSaved(preg)) free_callee_fp_regs.erase(preg); else free_caller_fp_regs.erase(preg);
+            } else {
+                if(isCalleeSaved(preg)) free_callee_int_regs.erase(preg); else free_caller_int_regs.erase(preg);
+            }
+        } else {
+            normal_unhandled.push_back(interval);
+        }
+    }
+    unhandled = normal_unhandled;
+    std::sort(active.begin(), active.end(), [](const LiveInterval* a, const LiveInterval* b){ return a->end < b->end; });
+    
+    for (LiveInterval* current : unhandled) {
+        if (DEEPDEBUG) std::cerr << "\n  [Scan] Processing interval %v" << current->vreg << " [" << current->start << ", " << current->end << "]\n";
+        
+        std::vector<LiveInterval*> new_active;
+        for (LiveInterval* active_interval : active) {
+            if (active_interval->end < current->start) {
+                PhysicalReg preg = vreg_to_preg_map.at(active_interval->vreg);
+                if (DEEPDEBUG) std::cerr << "    [Scan] Expiring interval %v" << active_interval->vreg << ", freeing " << pregToString(preg) << "\n";
+                if (isFPVReg(active_interval->vreg)) {
+                        if(isCalleeSaved(preg)) free_callee_fp_regs.insert(preg); else free_caller_fp_regs.insert(preg);
+                } else {
+                        if(isCalleeSaved(preg)) free_callee_int_regs.insert(preg); else free_caller_int_regs.insert(preg);
+                }
+            } else {
+                new_active.push_back(active_interval);
+            }
+        }
+        active = new_active;
+
+        bool is_fp = isFPVReg(current->vreg);
+        auto& free_caller = is_fp ? free_caller_fp_regs : free_caller_int_regs;
+        auto& free_callee = is_fp ? free_callee_fp_regs : free_callee_int_regs;
+        PhysicalReg allocated_preg = PhysicalReg::INVALID;
+
+        if (current->crosses_call) {
+            if (!free_callee.empty()) {
+                allocated_preg = *free_callee.begin();
+                free_callee.erase(allocated_preg);
+            }
+        } else {
+            if (!free_caller.empty()) {
+                allocated_preg = *free_caller.begin();
+                free_caller.erase(allocated_preg);
+            } else if (!free_callee.empty()) {
+                allocated_preg = *free_callee.begin();
+                free_callee.erase(allocated_preg);
+            }
+        }
+
+        if (allocated_preg != PhysicalReg::INVALID) {
+            if (DEEPDEBUG) std::cerr << "    [Scan] Allocated " << pregToString(allocated_preg) << " to %v" << current->vreg << "\n";
+            vreg_to_preg_map[current->vreg] = allocated_preg;
+            active.push_back(current);
+            std::sort(active.begin(), active.end(), [](const LiveInterval* a, const LiveInterval* b){ return a->end < b->end; });
+        } else {
+            if (DEEPDEBUG) std::cerr << "    [Scan] No free registers for %v" << current->vreg << ". Spilling...\n";
+            spillAtInterval(current);
+        }
+    }
+    return !spilled_vregs.empty();
+}
+
+void RISCv64LinearScan::spillAtInterval(LiveInterval* current) {
+    // Original spill heuristic, kept unchanged: the candidate is the active interval that ends last
+    LiveInterval* spill_candidate = nullptr;
+    if (!active.empty()) {
+        spill_candidate = active.back();
+    }
+    
+    if (DEEPERDEBUG) {
+        std::cerr << "      [Spill-Debug] Spill decision for current=%v" << current->vreg << "[" << current->start << "," << current->end << "]\n";
+        std::cerr << "      [Spill-Debug] Active intervals (sorted by end point):\n";
+        for (const auto* i : active) {
+            std::cerr << "        %v" << i->vreg << "[" << i->start << "," << i->end << "] in " << pregToString(vreg_to_preg_map[i->vreg]) << "\n";
+        }
+        if(spill_candidate) {
+            std::cerr << "      [Spill-Debug] Candidate is %v" << spill_candidate->vreg << ". Its end is " << spill_candidate->end << ", current's end is " << current->end << "\n";
+        } else {
+            std::cerr << "      [Spill-Debug] No active candidate.\n";
+        }
+    }
+
+    if (spill_candidate && spill_candidate->end > current->end) {
+        if (DEEPDEBUG) std::cerr << "      [Spill] Decision: Spilling active %v" << spill_candidate->vreg << ".\n";
+        PhysicalReg preg = vreg_to_preg_map.at(spill_candidate->vreg);
+        vreg_to_preg_map.erase(spill_candidate->vreg); // Make sure the old mapping is removed
+        vreg_to_preg_map[current->vreg] = preg;
+        active.pop_back();
+        active.push_back(current);
+        std::sort(active.begin(), active.end(), [](const LiveInterval* a, const LiveInterval* b){ return a->end < b->end; });
+        spilled_vregs.insert(spill_candidate->vreg);
+    } else {
+        if (DEEPDEBUG) std::cerr << "      [Spill] Decision: Spilling current %v" << current->vreg << ".\n";
+        spilled_vregs.insert(current->vreg);
+    }
+}
+
+void RISCv64LinearScan::rewriteProgram() {
+    if (DEBUG) {
+        std::cerr << "[LSRA-Rewrite] Starting program rewrite. Spilled vregs: " << vregSetToString(spilled_vregs) << "\n";
+    }
+    StackFrameInfo& frame_info = MFunc->getFrameInfo();
+    int spill_current_offset = frame_info.locals_end_offset - frame_info.spill_size;
+
+    for (unsigned vreg : spilled_vregs) {
+        // Original logic, kept unchanged: a vreg that already has a spill slot keeps it
+        if (frame_info.spill_offsets.count(vreg)) continue;
+        
+        Type* type = vreg_type_map.count(vreg) ? vreg_type_map.at(vreg) : Type::getIntType();
+        int size = isFPVReg(vreg) ? 4 : (type->isPointer() ? 8 : 4);
+        spill_current_offset -= size;
+        spill_current_offset = (spill_current_offset & ~7);
+        frame_info.spill_offsets[vreg] = spill_current_offset;
+        if (DEEPDEBUG) std::cerr << "  [Rewrite] Assigned new stack offset " << frame_info.spill_offsets.at(vreg) << " to spilled %v" << vreg << "\n";
+    }
+    frame_info.spill_size = -(spill_current_offset - frame_info.locals_end_offset);
+
+    for (auto& mbb : MFunc->getBlocks()) {
+        auto& instrs = mbb->getInstructions();
+        std::vector<std::unique_ptr<MachineInstr>> new_instrs;
+        if (DEEPERDEBUG) std::cerr << "  [Rewrite] Processing block " << mbb->getName() << "\n";
+        
+        for (auto it = instrs.begin(); it != instrs.end(); ++it) {
+            auto& instr = *it;
+            std::set<unsigned> use_vregs, def_vregs;
+            getInstrUseDef(instr.get(), use_vregs, def_vregs);
+            
+            if (conservative_spill_mode) {
+                // ================== Emergency-mode rewrite logic ==================
+                // Load and store directly through physical register t4 (SPILL_TEMP_REG)
+                
+                // Prepare an instruction printer for the debug log
+                auto printer = DEEPERDEBUG ? std::make_unique<RISCv64AsmPrinter>(MFunc) : nullptr;
+                auto original_instr_str_for_log = DEEPERDEBUG ? printer->formatInstr(instr.get()) : "";
+                bool modified = false;
+
+                for (unsigned old_vreg : use_vregs) {
+                    if (spilled_vregs.count(old_vreg)) {
+                        modified = true;
+                        Type* type = vreg_type_map.at(old_vreg);
+                        RVOpcodes load_op = isFPVReg(old_vreg) ? RVOpcodes::FLW : (type->isPointer() ? RVOpcodes::LD : RVOpcodes::LW);
+                        auto load = std::make_unique<MachineInstr>(load_op);
+                        // Load straight into the reserved physical register
+                        load->addOperand(std::make_unique<RegOperand>(SPILL_TEMP_REG));
+                        load->addOperand(std::make_unique<MemOperand>(
+                            std::make_unique<RegOperand>(PhysicalReg::S0),
+                            std::make_unique<ImmOperand>(frame_info.spill_offsets.at(old_vreg))));
+                        
+                        if (DEEPERDEBUG) {
+                            std::cerr << "    [Rewrite-Panic] Inserting LOAD for use of %v" << old_vreg 
+                                      << " into " << pregToString(SPILL_TEMP_REG) 
+                                      << " before: " << original_instr_str_for_log << "\n";
+                        }
+                        new_instrs.push_back(std::move(load));
+
+                        // Replace the operand in the instruction
+                        instr->replaceVRegWithPReg(old_vreg, SPILL_TEMP_REG);
+                    }
+                }
+
+                // Before emitting the stores for defs, rewrite the defined vregs in the instruction itself
+                for (unsigned old_vreg : def_vregs) {
+                    if (spilled_vregs.count(old_vreg)) {
+                        modified = true;
+                        instr->replaceVRegWithPReg(old_vreg, SPILL_TEMP_REG);
+                    }
+                }
+                
+                // Move the original instruction (possibly modified) into the new list
+                new_instrs.push_back(std::move(instr));
+                if (DEEPERDEBUG && modified) {
+                    std::cerr << "    [Rewrite-Panic] Original: " << original_instr_str_for_log 
+                              << " -> Rewritten: " << printer->formatInstr(new_instrs.back().get()) << "\n";
+                }
+                
+                for (unsigned old_vreg : def_vregs) {
+                    if (spilled_vregs.count(old_vreg)) {
+                        // The instruction now defines SPILL_TEMP_REG, so store that register back to memory
+                        Type* type = vreg_type_map.at(old_vreg);
+                        RVOpcodes store_op = isFPVReg(old_vreg) ? RVOpcodes::FSW : (type->isPointer() ? RVOpcodes::SD : RVOpcodes::SW);
+                        auto store = std::make_unique<MachineInstr>(store_op);
+                        store->addOperand(std::make_unique<RegOperand>(SPILL_TEMP_REG));
+                        store->addOperand(std::make_unique<MemOperand>(
+                            std::make_unique<RegOperand>(PhysicalReg::S0),
+                            std::make_unique<ImmOperand>(frame_info.spill_offsets.at(old_vreg))));
+                        if (DEEPERDEBUG) {
+                             std::cerr << "    [Rewrite-Panic] Inserting STORE for def of %v" << old_vreg 
+                                       << " from " << pregToString(SPILL_TEMP_REG) << " after original instr.\n";
+                        }
+                        new_instrs.push_back(std::move(store));
+                    }
+                }
+
+            } else {
+                // ================== Regular-mode rewrite logic (original code) ==================
+                std::map<unsigned, unsigned> use_remap, def_remap;
+                for (unsigned old_vreg : use_vregs) {
+                    if (spilled_vregs.count(old_vreg) && use_remap.find(old_vreg) == use_remap.end()) {
+                        Type* type = vreg_type_map.at(old_vreg);
+                        unsigned new_temp_vreg = ISel->getNewVReg(type);
+                        use_remap[old_vreg] = new_temp_vreg;
+                        RVOpcodes load_op = isFPVReg(old_vreg) ? RVOpcodes::FLW : (type->isPointer() ? RVOpcodes::LD : RVOpcodes::LW);
+                        auto load = std::make_unique<MachineInstr>(load_op);
+                        load->addOperand(std::make_unique<RegOperand>(new_temp_vreg));
+                        load->addOperand(std::make_unique<MemOperand>(
+                            std::make_unique<RegOperand>(PhysicalReg::S0),
+                            std::make_unique<ImmOperand>(frame_info.spill_offsets.at(old_vreg))));
+                        if (DEEPERDEBUG) {
+                            RISCv64AsmPrinter printer(MFunc);
+                            std::cerr << "    [Rewrite] Inserting LOAD for use of %v" << old_vreg << " into new %v" << new_temp_vreg << " before: " << printer.formatInstr(instr.get()) << "\n";
+                        }
+                        new_instrs.push_back(std::move(load));
+                    }
+                }
+                for (unsigned old_vreg : def_vregs) {
+                    if (spilled_vregs.count(old_vreg) && def_remap.find(old_vreg) == def_remap.end()) {
+                        Type* type = vreg_type_map.at(old_vreg);
+                        unsigned new_temp_vreg = ISel->getNewVReg(type);
+                        def_remap[old_vreg] = new_temp_vreg;
+                    }
+                }
+                auto original_instr_str_for_log = DEEPERDEBUG ? RISCv64AsmPrinter(MFunc).formatInstr(instr.get()) : "";
+                instr->remapVRegs(use_remap, def_remap);
+                new_instrs.push_back(std::move(instr));
+                if (DEEPERDEBUG && (!use_remap.empty() || !def_remap.empty())) std::cerr << "    [Rewrite] Original: " << original_instr_str_for_log << " -> Rewritten: " << RISCv64AsmPrinter(MFunc).formatInstr(new_instrs.back().get()) << "\n";
+                for(const auto& pair : def_remap) {
+                    unsigned old_vreg = pair.first;
+                    unsigned new_temp_vreg = pair.second;
+                    Type* type = vreg_type_map.at(old_vreg);
+                    RVOpcodes store_op = isFPVReg(old_vreg) ? RVOpcodes::FSW : (type->isPointer() ? RVOpcodes::SD : RVOpcodes::SW);
+                    auto store = std::make_unique<MachineInstr>(store_op);
+                    store->addOperand(std::make_unique<RegOperand>(new_temp_vreg));
+                    store->addOperand(std::make_unique<MemOperand>(
+                        std::make_unique<RegOperand>(PhysicalReg::S0),
+                        std::make_unique<ImmOperand>(frame_info.spill_offsets.at(old_vreg))));
+                    if (DEEPERDEBUG) std::cerr << "    [Rewrite] Inserting STORE for def of %v" << old_vreg << " from new %v" << new_temp_vreg << " after original instr.\n";
+                    new_instrs.push_back(std::move(store));
+                }
+            }
+        }
+        instrs = std::move(new_instrs);
+    }
+}
+
+void RISCv64LinearScan::applyAllocation() {
+    if (DEBUG) std::cerr << "[LSRA-Apply] Applying final vreg->preg mapping.\n";
+    for (auto& mbb : MFunc->getBlocks()) {
+        for (auto& instr_ptr : mbb->getInstructions()) {
+            for (auto& op_ptr : instr_ptr->getOperands()) {
+                if (op_ptr->getKind() == MachineOperand::KIND_REG) {
+                    auto reg_op = static_cast<RegOperand*>(op_ptr.get());
+                    if (reg_op->isVirtual()) {
+                        unsigned vreg = reg_op->getVRegNum();
+                        if (vreg_to_preg_map.count(vreg)) {
+                            reg_op->setPReg(vreg_to_preg_map.at(vreg));
+                        } else {
+                            std::cerr << "ERROR: Uncolored virtual register %v" << vreg << " found during applyAllocation! in func " << MFunc->getName() << "\n";
+                            // Forcing an error is better than silent failure.
+                            // reg_op->setPReg(PhysicalReg::T5); 
+                        }
+                    }
+                } else if (op_ptr->getKind() == MachineOperand::KIND_MEM) {
+                    auto mem_op = static_cast<MemOperand*>(op_ptr.get());
+                    auto reg_op = mem_op->getBase();
+                    if (reg_op->isVirtual()) {
+                        unsigned vreg = reg_op->getVRegNum();
+                        if (vreg_to_preg_map.count(vreg)) {
+                            reg_op->setPReg(vreg_to_preg_map.at(vreg));
+                        } else {
+                            std::cerr << "ERROR: Uncolored virtual register %v" << vreg << " in memory operand! in func " << MFunc->getName() << "\n";
+                            // reg_op->setPReg(PhysicalReg::T5);
+                        }
+                    }
+                }
+            }
+        }
+    }
+}
+
+// void getInstrUseDef(const MachineInstr* instr, std::set<unsigned>& use, std::set<unsigned>& def) {
+//     auto opcode = instr->getOpcode();
+//     const auto& operands = instr->getOperands();
+    
+//     auto get_vreg_id_if_virtual = [&](const MachineOperand* op, std::set<unsigned>& s) {
+//         if (op->getKind() == MachineOperand::KIND_REG) {
+//             auto reg_op = static_cast<const RegOperand*>(op);
+//             if (reg_op->isVirtual()) s.insert(reg_op->getVRegNum());
+//         } else if (op->getKind() == MachineOperand::KIND_MEM) {
+//             auto mem_op = static_cast<const MemOperand*>(op);
+//             auto reg_op = mem_op->getBase();
+//             if (reg_op->isVirtual()) s.insert(reg_op->getVRegNum());
+//         }
+//     };
+
+//     if (op_info.count(opcode)) {
+//         const auto& info = op_info.at(opcode);
+//         for (int idx : info.first) if (idx < operands.size()) get_vreg_id_if_virtual(operands[idx].get(), def);
+//         for (int idx : info.second) if (idx < operands.size()) get_vreg_id_if_virtual(operands[idx].get(), use);
+//         for (const auto& op : operands) if (op->getKind() == MachineOperand::KIND_MEM) get_vreg_id_if_virtual(op.get(), use);
+//     } else if (opcode == RVOpcodes::CALL) {
+//         if (!operands.empty() && operands[0]->getKind() == MachineOperand::KIND_REG) get_vreg_id_if_virtual(operands[0].get(), def);
+//         for (size_t i = 1; i < operands.size(); ++i) if (operands[i]->getKind() == MachineOperand::KIND_REG) get_vreg_id_if_virtual(operands[i].get(), use);
+//     }
+// }
+
+bool RISCv64LinearScan::isFPVReg(unsigned vreg) const {
+    return vreg_type_map.count(vreg) && vreg_type_map.at(vreg)->isFloat();
+}
+
+void RISCv64LinearScan::collectUsedCalleeSavedRegs() {
+    StackFrameInfo& frame_info = MFunc->getFrameInfo();
+    frame_info.used_callee_saved_regs.clear();
+
+    const auto& callee_saved_int = getCalleeSavedIntRegs();
+    const auto& callee_saved_fp = getCalleeSavedFpRegs();
+    std::set<PhysicalReg> callee_saved_set(callee_saved_int.begin(), callee_saved_int.end());
+    callee_saved_set.insert(callee_saved_fp.begin(), callee_saved_fp.end());
+    callee_saved_set.insert(PhysicalReg::S0);
+
+    for(const auto& pair : vreg_to_preg_map) {
+        PhysicalReg preg = pair.second;
+        if(callee_saved_set.count(preg)) {
+            frame_info.used_callee_saved_regs.insert(preg);
+        }
+    }
+}
+
+} // namespace sysy
--- a/src/backend/RISCv64/RISCv64RegAlloc.cpp
+++ b/src/backend/RISCv64/RISCv64RegAlloc.cpp
--- a/src/backend/RISCv64/RISCv64SimpleRegAlloc.cpp
+++ b/src/backend/RISCv64/RISCv64SimpleRegAlloc.cpp
@ -0,0 +1,716 @@
+#include "RISCv64SimpleRegAlloc.h"
+#include "RISCv64AsmPrinter.h"
+#include "RISCv64Info.h"
+#include <algorithm>
+#include <iostream>
+#include <cassert>
+
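+// Declarations of the external debug-level control variables.
+// These are assumed to be defined elsewhere, e.g. in the main program or a shared .cpp file.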
+extern int DEBUG;
+extern int DEEPDEBUG;
+
+namespace sysy {
+
+RISCv64SimpleRegAlloc::RISCv64SimpleRegAlloc(MachineFunction* mfunc) : MFunc(mfunc), ISel(mfunc->getISel()) {
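+    // 1. Initialize the pool of allocatable integer registers.
+    // T5 is reserved by the large-immediate materialization logic.
+    // T2, T3, T4 and T6 are reserved by this allocator as dedicated spill/temporary registers.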
+    allocable_int_regs = {
+        PhysicalReg::T0, PhysicalReg::T1, /* T2,T3,T4,T5,T6 reserved */
+        PhysicalReg::A0, PhysicalReg::A1, PhysicalReg::A2, PhysicalReg::A3, PhysicalReg::A4, PhysicalReg::A5, PhysicalReg::A6, PhysicalReg::A7,
+        PhysicalReg::S1, PhysicalReg::S2, PhysicalReg::S3, PhysicalReg::S4, PhysicalReg::S5, PhysicalReg::S6, PhysicalReg::S7,
+        PhysicalReg::S8, PhysicalReg::S9, PhysicalReg::S10, PhysicalReg::S11,
+    };
+
+    // 2. Initialize the pool of allocatable floating-point registers.
+    // F0, F1, F2 are reserved by this allocator as dedicated spill/temporary registers.
+    allocable_fp_regs = {
+        /* F0,F1,F2 reserved */ PhysicalReg::F3, PhysicalReg::F4, PhysicalReg::F5, PhysicalReg::F6, PhysicalReg::F7,
+        PhysicalReg::F10, PhysicalReg::F11, PhysicalReg::F12, PhysicalReg::F13, PhysicalReg::F14, PhysicalReg::F15, PhysicalReg::F16, PhysicalReg::F17,
+        PhysicalReg::F8, PhysicalReg::F9, PhysicalReg::F18, PhysicalReg::F19, PhysicalReg::F20, PhysicalReg::F21, PhysicalReg::F22,
+        PhysicalReg::F23, PhysicalReg::F24, PhysicalReg::F25, PhysicalReg::F26, PhysicalReg::F27,
+        PhysicalReg::F28, PhysicalReg::F29, PhysicalReg::F30, PhysicalReg::F31,
+    };
+    
+    // 3. Map every physical register to a special virtual-register ID.
+    const unsigned offset = static_cast<unsigned>(PhysicalReg::PHYS_REG_START_ID);
+    for (unsigned i = 0; i < static_cast<unsigned>(PhysicalReg::INVALID); ++i) {
+        auto preg = static_cast<PhysicalReg>(i);
+        preg_to_vreg_id_map[preg] = offset + i;
+    }
+}
+
+// Main entry point of register allocation.
+void RISCv64SimpleRegAlloc::run() {
+    if (DEBUG) std::cerr << "===== Running Simple Graph Coloring Allocator for function: " << MFunc->getName() << " =====\n";
+    
+    // Instantiate one AsmPrinter for debug output so it is not re-created repeatedly.
+    RISCv64AsmPrinter printer(MFunc);
+    printer.setStream(std::cerr);
+
+    if (DEBUG) {
+        std::cerr << "\n===== LLIR after VReg Unification =====\n";
+        printer.run(std::cerr, true);
+        std::cerr << "===== End of Unified LLIR =====\n\n";
+    }
+
+    // Phase 1: handle the calling convention (pre-color argument registers).
+    handleCallingConvention();
+    if (DEBUG) {
+        std::cerr << "--- After HandleCallingConvention ---\n";
+        std::cerr << "Pre-colored vregs:\n";
+        for (const auto& pair : color_map) {
+            std::cerr << "  %vreg" << pair.first << " -> " << printer.regToString(pair.second) << "\n";
+        }
+    }
+
+    // Phase 2: liveness analysis.
+    analyzeLiveness();
+
+    // Phase 3: build the interference graph.
+    buildInterferenceGraph();
+
+    // Phase 4: assign physical registers by graph coloring.
+    colorGraph();
+    if (DEBUG) {
+        std::cerr << "\n--- After GraphColoring ---\n";
+        std::cerr << "Assigned colors:\n";
+        for (const auto& pair : color_map) {
+            std::cerr << "  %vreg" << pair.first << " -> " << printer.regToString(pair.second) << "\n";
+        }
+        std::cerr << "Spilled vregs:\n";
+        if (spilled_vregs.empty()) {
+            std::cerr << "  (None)\n";
+        } else {
+            for (unsigned vreg : spilled_vregs) {
+                std::cerr << "  %vreg" << vreg << "\n";
+            }
+        }
+    }
+    
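+    // Phase 5: rewrite the function (insert spill/reload code, replace virtual registers with physical ones).
+    rewriteFunction();
+
+    // Save the final allocation in the MachineFunction's frame info for later passes.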
+    MFunc->getFrameInfo().vreg_to_preg_map = this->color_map;
+
+    if (DEBUG) {
+        std::cerr << "\n===== Final LLIR after Simple Register Allocation =====\n";
+        printer.run(std::cerr, false); // pass false to print the final physical registers
+        std::cerr << "===== Finished Simple Graph Coloring Allocator =====\n\n";
+    }
+}
+
+/**
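+ * @brief Pre-pass that unifies argument virtual registers.
+ * Scans the function for arguments homed in the stack frame and rewrites every VReg later loaded from such a slot to the original argument VReg.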
+ */
+void RISCv64SimpleRegAlloc::unifyArgumentVRegs() {
+    if (MFunc->getBlocks().size() < 2) return; // need at least an entry block and a body block
+
+    std::map<int, unsigned> stack_slot_to_vreg; // map: <stack offset, original argument vreg>
+    MachineBasicBlock* entry_block = MFunc->getBlocks().front().get();
+
+    // Step 1: scan the entry block to find each argument's home slot on the stack.
+    for (const auto& instr : entry_block->getInstructions()) {
+        // Look for the pattern: sw %vreg_arg, 0(%vreg_addr)
+        if (instr->getOpcode() == RVOpcodes::SW || instr->getOpcode() == RVOpcodes::SD || instr->getOpcode() == RVOpcodes::FSW) {
+            auto& operands = instr->getOperands();
+            if (operands.size() == 2 && operands[0]->getKind() == MachineOperand::KIND_REG && operands[1]->getKind() == MachineOperand::KIND_MEM) {
+                auto src_reg_op = static_cast<RegOperand*>(operands[0].get());
+                auto mem_op = static_cast<MemOperand*>(operands[1].get());
+                unsigned addr_vreg = mem_op->getBase()->getVRegNum();
+                
+                // Find the addi that defines this address vreg to recover the offset.
+                for (const auto& prev_instr : entry_block->getInstructions()) {
+                    if (prev_instr->getOpcode() == RVOpcodes::ADDI && prev_instr->getOperands().front()->getKind() == MachineOperand::KIND_REG) {
+                        auto def_op = static_cast<RegOperand*>(prev_instr->getOperands().front().get());
+                        if (def_op->isVirtual() && def_op->getVRegNum() == addr_vreg) {
+                            int offset = static_cast<ImmOperand*>(prev_instr->getOperands()[2].get())->getValue();
+                            stack_slot_to_vreg[offset] = src_reg_op->getVRegNum();
+                            break;
+                        }
+                    }
+                }
+            }
+        }
+    }
+
+    if (stack_slot_to_vreg.empty()) return; // no argument stores found, nothing to do
+
+    // Step 2: scan the function body and build the local-vreg -> argument-vreg remap table.
+    std::map<unsigned, unsigned> vreg_remap; // map: <local vreg, original argument vreg>
+    MachineBasicBlock* body_block = MFunc->getBlocks()[1].get();
+
+    for (const auto& instr : body_block->getInstructions()) {
+        if (instr->getOpcode() == RVOpcodes::LW || instr->getOpcode() == RVOpcodes::LD || instr->getOpcode() == RVOpcodes::FLW) {
+            auto& operands = instr->getOperands();
+            if (operands.size() == 2 && operands[0]->getKind() == MachineOperand::KIND_REG && operands[1]->getKind() == MachineOperand::KIND_MEM) {
+                auto dest_reg_op = static_cast<RegOperand*>(operands[0].get());
+                auto mem_op = static_cast<MemOperand*>(operands[1].get());
+                unsigned addr_vreg = mem_op->getBase()->getVRegNum();
+                
+                // Likewise, find the addi that defines the address.
+                for (const auto& prev_instr : body_block->getInstructions()) {
+                     if (prev_instr->getOpcode() == RVOpcodes::ADDI && prev_instr->getOperands().front()->getKind() == MachineOperand::KIND_REG) {
+                        auto def_op = static_cast<RegOperand*>(prev_instr->getOperands().front().get());
+                        if (def_op->isVirtual() && def_op->getVRegNum() == addr_vreg) {
+                            int offset = static_cast<ImmOperand*>(prev_instr->getOperands()[2].get())->getValue();
+                            if (stack_slot_to_vreg.count(offset)) {
+                                unsigned old_vreg = dest_reg_op->getVRegNum();
+                                unsigned new_vreg = stack_slot_to_vreg.at(offset);
+                                vreg_remap[old_vreg] = new_vreg;
+                            }
+                            break;
+                        }
+                    }
+                }
+            }
+        }
+    }
+
+    if (vreg_remap.empty()) return;
+
+    // Step 3: walk all instructions and apply the remapping.
+    // A lambda performs the vreg replacement to avoid duplicating code.
+    auto replace_vreg_in_operand = [&](MachineOperand* op) {
+        if (op->getKind() == MachineOperand::KIND_REG) {
+            auto reg_op = static_cast<RegOperand*>(op);
+            if (reg_op->isVirtual() && vreg_remap.count(reg_op->getVRegNum())) {
+                reg_op->setVRegNum(vreg_remap.at(reg_op->getVRegNum()));
+            }
+        } else if (op->getKind() == MachineOperand::KIND_MEM) {
+            auto base_reg_op = static_cast<MemOperand*>(op)->getBase();
+            if (base_reg_op->isVirtual() && vreg_remap.count(base_reg_op->getVRegNum())) {
+                base_reg_op->setVRegNum(vreg_remap.at(base_reg_op->getVRegNum()));
+            }
+        }
+    };
+
+    for (auto& mbb : MFunc->getBlocks()) {
+        for (auto& instr : mbb->getInstructions()) {
+            for (auto& op : instr->getOperands()) {
+                replace_vreg_in_operand(op.get());
+            }
+        }
+    }
+}
+
+void RISCv64SimpleRegAlloc::handleCallingConvention() {
+    Function* F = MFunc->getFunc();
+    if (!F) return;
+
+    // --- 1. Pre-color the function's incoming arguments ---
+    int int_arg_idx = 0;
+    int float_arg_idx = 0;
+
+    for (Argument* arg : F->getArguments()) {
+        unsigned vreg = ISel->getVReg(arg);
+        if (arg->getType()->isFloat()) {
+            if (float_arg_idx < 8) { // fa0-fa7
+                auto preg = static_cast<PhysicalReg>(static_cast<int>(PhysicalReg::F10) + float_arg_idx);
+                color_map[vreg] = preg;
+            }
+            float_arg_idx++;
+        } else {
+            if (int_arg_idx < 8) { // a0-a7
+                auto preg = static_cast<PhysicalReg>(static_cast<int>(PhysicalReg::A0) + int_arg_idx);
+                color_map[vreg] = preg;
+            }
+            int_arg_idx++;
+        }
+    }
+}
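The pre-coloring in `handleCallingConvention` keeps two independent counters, one for integer arguments and one for floating-point arguments, and only the first eight of each kind receive a register. A minimal standalone sketch of that counting follows. The helper name `assignArgRegs` and the string register names are hypothetical and used only for illustration; the sketch mirrors the loop above and does not model every case of the RISC-V calling convention.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: for each argument (true = float, false = int/pointer),
// return the register it would be pre-colored with, or std::nullopt when the
// per-kind counter has already passed the eighth register.
std::vector<std::optional<std::string>> assignArgRegs(const std::vector<bool>& is_float) {
    std::vector<std::optional<std::string>> result;
    int int_idx = 0;
    int fp_idx = 0;
    for (bool f : is_float) {
        int& idx = f ? fp_idx : int_idx;  // independent counters per register class
        if (idx < 8) {
            result.push_back(std::string(f ? "fa" : "a") + std::to_string(idx));
        } else {
            result.push_back(std::nullopt);  // no register left for this class
        }
        ++idx;  // the counter advances even when no register is assigned
    }
    return result;
}
```

For the argument list `(int, float, int)` the sketch yields `a0`, `fa0`, `a1`: the float does not consume an integer slot.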
+
+void RISCv64SimpleRegAlloc::analyzeLiveness() {
+    if (DEBUG) std::cerr << "\n--- Starting Liveness Analysis ---\n";
+    
+    // === Phase 1: precompute the use and def sets of each basic block ===
+    std::map<const MachineBasicBlock*, LiveSet> block_uses;
+    std::map<const MachineBasicBlock*, LiveSet> block_defs;
+    for (const auto& mbb_ptr : MFunc->getBlocks()) {
+        const MachineBasicBlock* mbb = mbb_ptr.get();
+        LiveSet uses, defs;
+        for (const auto& instr_ptr : mbb->getInstructions()) {
+            LiveSet instr_use, instr_def;
+            getInstrUseDef_Liveness(instr_ptr.get(), instr_use, instr_def);
+            // use[B] = use[B] U (instr_use - def[B])
+            for (unsigned u : instr_use) {
+                if (defs.find(u) == defs.end()) {
+                    uses.insert(u);
+                }
+            }
+            // def[B] = def[B] U instr_def
+            defs.insert(instr_def.begin(), instr_def.end());
+        }
+        block_uses[mbb] = uses;
+        block_defs[mbb] = defs;
+    }
+
+    // === Phase 2: iterate the data-flow analysis at block granularity until it converges ===
+    std::map<const MachineBasicBlock*, LiveSet> block_live_in;
+    std::map<const MachineBasicBlock*, LiveSet> block_live_out;
+    bool changed = true;
+    while (changed) {
+        changed = false;
+        // Visit basic blocks in reverse order to speed up convergence.
+        for (auto it = MFunc->getBlocks().rbegin(); it != MFunc->getBlocks().rend(); ++it) {
+            const auto& mbb_ptr = *it;
+            const MachineBasicBlock* mbb = mbb_ptr.get();
+            
+            // 2.1 Compute live_out[B] = U_{S in succ(B)} live_in[S]
+            LiveSet new_live_out;
+            for (auto succ : mbb->successors) {
+                new_live_out.insert(block_live_in[succ].begin(), block_live_in[succ].end());
+            }
+
+            // 2.2 Compute live_in[B] = use[B] U (live_out[B] - def[B])
+            LiveSet live_out_minus_def = new_live_out;
+            for (unsigned d : block_defs.at(mbb)) {
+                live_out_minus_def.erase(d);
+            }
+            LiveSet new_live_in = block_uses.at(mbb);
+            new_live_in.insert(live_out_minus_def.begin(), live_out_minus_def.end());
+
+            // 2.3 Check whether a fixed point has been reached.
+            if (block_live_out[mbb] != new_live_out || block_live_in[mbb] != new_live_in) {
+                changed = true;
+                block_live_out[mbb] = new_live_out;
+                block_live_in[mbb] = new_live_in;
+            }
+        }
+    }
+
+    // === Phase 3: one instruction-granularity pass to fill the final live_in_map and live_out_map ===
+    for (const auto& mbb_ptr : MFunc->getBlocks()) {
+        const MachineBasicBlock* mbb = mbb_ptr.get();
+        LiveSet live_out = block_live_out.at(mbb);
+
+        for (auto instr_it = mbb->getInstructions().rbegin(); instr_it != mbb->getInstructions().rend(); ++instr_it) {
+            const MachineInstr* instr = instr_it->get();
+            live_out_map[instr] = live_out;
+
+            LiveSet use, def;
+            getInstrUseDef_Liveness(instr, use, def);
+
+            LiveSet live_in = use;
+            LiveSet diff = live_out;
+            for (auto vreg : def) {
+                diff.erase(vreg);
+            }
+            live_in.insert(diff.begin(), diff.end());
+            live_in_map[instr] = live_in;
+            
+            // Update live_out for the previous instruction in the block.
+            live_out = live_in;
+        }
+    }
+}
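The block-level fixed point in phase 2 of `analyzeLiveness` can be tried in isolation. Below is a minimal sketch under stated assumptions: the `Block` struct, the integer successor indices and the function `solveLiveness` are hypothetical stand-ins for `MachineBasicBlock` and the allocator's maps, while the equations are the ones given in the comments above.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

using LiveSet = std::set<unsigned>;

// Hypothetical minimal basic block: precomputed use/def sets and successor indices.
struct Block {
    LiveSet use;             // vregs read before being written in the block
    LiveSet def;             // vregs written in the block
    std::vector<int> succs;  // indices of successor blocks
};

struct Liveness {
    std::vector<LiveSet> in;
    std::vector<LiveSet> out;
};

// Backward data-flow iteration until nothing changes:
//   live_out[B] = union of live_in[S] over successors S
//   live_in[B]  = use[B] U (live_out[B] - def[B])
Liveness solveLiveness(const std::vector<Block>& blocks) {
    Liveness r{std::vector<LiveSet>(blocks.size()), std::vector<LiveSet>(blocks.size())};
    bool changed = true;
    while (changed) {
        changed = false;
        // Reverse order usually needs fewer rounds for a backward problem.
        for (int b = static_cast<int>(blocks.size()) - 1; b >= 0; --b) {
            LiveSet out;
            for (int s : blocks[b].succs) {
                out.insert(r.in[s].begin(), r.in[s].end());
            }
            LiveSet in = blocks[b].use;
            for (unsigned v : out) {
                if (blocks[b].def.count(v) == 0) in.insert(v);
            }
            if (in != r.in[b] || out != r.out[b]) {
                r.in[b] = std::move(in);
                r.out[b] = std::move(out);
                changed = true;
            }
        }
    }
    return r;
}
```

The sets only grow from one round to the next and are bounded by the number of vregs, which is why the loop terminates.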
+
+void RISCv64SimpleRegAlloc::buildInterferenceGraph() {
+    if (DEBUG) std::cerr << "\n--- Starting Interference Graph Construction ---\n";
+    RISCv64AsmPrinter printer(MFunc);
+    printer.setStream(std::cerr);
+
+    // 1. Collect every node that must appear in the graph (all virtual and physical registers).
+    std::set<unsigned> all_nodes;
+    for (const auto& mbb : MFunc->getBlocks()) {
+        for(const auto& instr : mbb->getInstructions()) {
+            LiveSet use, def;
+            getInstrUseDef_Liveness(instr.get(), use, def);
+            all_nodes.insert(use.begin(), use.end());
+            all_nodes.insert(def.begin(), def.end());
+        }
+    }
+    // Make sure every physical-register node exists as well.
+    for (const auto& pair : preg_to_vreg_id_map) {
+        all_nodes.insert(pair.second);
+    }
+
+    // 2. Initialize the interference graph's adjacency lists.
+    for (unsigned vreg : all_nodes) { interference_graph[vreg] = {}; }
+
+    // 3. Walk the instructions and add interference edges.
+    for (const auto& mbb : MFunc->getBlocks()) {
+        if (DEEPDEBUG) std::cerr << "--- Building Graph for Basic Block: " << mbb->getName() << " ---\n";
+        for (const auto& instr_ptr : mbb->getInstructions()) {
+            const MachineInstr* instr = instr_ptr.get();
+            if (DEEPDEBUG) {
+                std::cerr << "  Instr: ";
+                printer.printInstruction(const_cast<MachineInstr*>(instr), true);
+            }
+            
+            LiveSet use, def;
+            getInstrUseDef_Liveness(instr, use, def); // argument order is (use, def), matching the helper's signature
+            const LiveSet& live_out = live_out_map.at(instr);
+            
+            if (DEEPDEBUG) {
+                printLiveSet(use, "Use     ", std::cerr, printer);
+                printLiveSet(def, "Def     ", std::cerr, printer);
+                printLiveSet(live_out, "Live_Out", std::cerr, printer);
+            }
+
+            // Rule 1: each definition (def) interferes with every variable live after the instruction (live_out).
+            for (unsigned d : def) {
+                for (unsigned l : live_out) {
+                    if (d != l) {
+                        if (DEEPDEBUG && interference_graph.at(d).find(l) == interference_graph.at(d).end()) {
+                           std::cerr << "    Edge (Def-LiveOut): " << regIdToString(d, printer) << " <-> " << regIdToString(l, printer) << "\n";
+                        }
+                        interference_graph[d].insert(l);
+                        interference_graph[l].insert(d);
+                    }
+                }
+            }
+            
+            // Rule 2: for non-MV instructions, def also interferes with use.
+            if (instr->getOpcode() != RVOpcodes::MV) {
+                for (unsigned d : def) {
+                    for (unsigned u : use) {
+                        if (d != u) {
+                            if (DEEPDEBUG && interference_graph.at(d).find(u) == interference_graph.at(d).end()) {
+                                std::cerr << "    Edge (Def-Use): " << regIdToString(d, printer) << " <-> " << regIdToString(u, printer) << "\n";
+                            }
+                            interference_graph[d].insert(u);
+                            interference_graph[u].insert(d);
+                        }
+                    }
+                }
+            }
+
+            // All registers live at the same point (every member of live_out)
+            // must interfere with each other pairwise.
+            std::vector<unsigned> live_out_vec(live_out.begin(), live_out.end());
+            for (size_t i = 0; i < live_out_vec.size(); ++i) {
+                for (size_t j = i + 1; j < live_out_vec.size(); ++j) {
+                    unsigned u = live_out_vec[i];
+                    unsigned v = live_out_vec[j];
+                    if (DEEPDEBUG && interference_graph[u].find(v) == interference_graph[u].end()) {
+                        std::cerr << "    Edge (Live-Live): %vreg" << u << " <-> %vreg" << v << "\n";
+                    }
+                    interference_graph[u].insert(v);
+                    interference_graph[v].insert(u);
+                }
+            }
+
+            // Rule 3: a CALL clobbers all caller-saved registers.
+            if (instr->getOpcode() == RVOpcodes::CALL) {
+                const auto& caller_saved_int = getCallerSavedIntRegs();
+                const auto& caller_saved_fp = getCallerSavedFpRegs();
+
+                for (unsigned live_vreg : live_out) {
+                    auto [type, size] = getTypeAndSize(live_vreg);
+                    if (type == Type::kFloat) {
+                        for (PhysicalReg cs_reg : caller_saved_fp) {
+                            unsigned cs_vreg_id = preg_to_vreg_id_map.at(cs_reg);
+                            if (live_vreg != cs_vreg_id) {
+                                interference_graph[live_vreg].insert(cs_vreg_id);
+                                interference_graph[cs_vreg_id].insert(live_vreg);
+                            }
+                        }
+                    } else {
+                        for (PhysicalReg cs_reg : caller_saved_int) {
+                            unsigned cs_vreg_id = preg_to_vreg_id_map.at(cs_reg);
+                            if (live_vreg != cs_vreg_id) {
+                                interference_graph[live_vreg].insert(cs_vreg_id);
+                                interference_graph[cs_vreg_id].insert(live_vreg);
+                            }
+                        }
+                    }
+                }
+            } // end if CALL
+            if (DEEPDEBUG) std::cerr << "  ----------------\n";
+        } // end for instr
+    } // end for mbb
+}
+
+void RISCv64SimpleRegAlloc::colorGraph() {
+    // 1. Collect all virtual registers that need a color.
+    std::vector<unsigned> vregs_to_color;
+    for (auto const& [vreg, neighbors] : interference_graph) {
+        // Only color genuine virtual registers that are not pre-colored.
+        if (color_map.find(vreg) == color_map.end() && vreg < static_cast<unsigned>(PhysicalReg::PHYS_REG_START_ID)) {
+            vregs_to_color.push_back(vreg);
+        }
+    }
+
+    // 2. Sort by interference degree, highest first, for greedy coloring.
+    std::sort(vregs_to_color.begin(), vregs_to_color.end(), [&](unsigned a, unsigned b) {
+        return interference_graph.at(a).size() > interference_graph.at(b).size();
+    });
+
+    // 3. Walk the list and assign colors.
+    for (unsigned vreg : vregs_to_color) {
+        std::set<PhysicalReg> used_colors;
+        // Collect the colors of all neighbors.
+        for (unsigned neighbor_id : interference_graph.at(vreg)) {
+            // A. The neighbor is an already-colored vreg.
+            if (color_map.count(neighbor_id)) {
+                used_colors.insert(color_map.at(neighbor_id));
+            } 
+            // B. The neighbor is a physical register itself.
+            else if (neighbor_id >= static_cast<unsigned>(PhysicalReg::PHYS_REG_START_ID)) {
+                PhysicalReg neighbor_preg = static_cast<PhysicalReg>(neighbor_id - static_cast<unsigned>(PhysicalReg::PHYS_REG_START_ID));
+                used_colors.insert(neighbor_preg);
+            }
+        }
+        
+        // Choose the register pool by vreg type.
+        auto [type, size] = getTypeAndSize(vreg);
+        const auto& allocable_regs = (type == Type::kFloat) ? allocable_fp_regs : allocable_int_regs;
+        
+        bool colored = false;
+        for (PhysicalReg preg : allocable_regs) {
+            if (used_colors.find(preg) == used_colors.end()) {
+                color_map[vreg] = preg;
+                colored = true;
+                break;
+            }
+        }
+        
+        if (!colored) {
+            spilled_vregs.insert(vreg);
+        }
+    }
+}
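`colorGraph` is a one-pass greedy coloring: nodes are visited in order of decreasing degree, each takes the first color no neighbor holds, and a node with no free color is marked as spilled. A minimal sketch of that strategy on a plain adjacency map follows. The names `Graph`, `Coloring` and `greedyColor` are hypothetical, colors are plain integers rather than `PhysicalReg` values, and `std::stable_sort` replaces `std::sort` here only so that ties give a deterministic order.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <vector>

using Graph = std::map<unsigned, std::set<unsigned>>;

struct Coloring {
    std::map<unsigned, int> color;  // node -> color index in [0, num_colors)
    std::set<unsigned> spilled;     // nodes that found no free color
};

// Greedy coloring, highest degree first. A node that cannot be colored is
// recorded as spilled and never revisited (no simplify/coalesce phases).
Coloring greedyColor(const Graph& g, int num_colors) {
    std::vector<unsigned> order;
    for (const auto& kv : g) order.push_back(kv.first);
    std::stable_sort(order.begin(), order.end(), [&](unsigned a, unsigned b) {
        return g.at(a).size() > g.at(b).size();
    });

    Coloring result;
    for (unsigned node : order) {
        std::set<int> used;  // colors already taken by colored neighbors
        for (unsigned neighbor : g.at(node)) {
            auto it = result.color.find(neighbor);
            if (it != result.color.end()) used.insert(it->second);
        }
        int c = 0;
        while (c < num_colors && used.count(c) != 0) ++c;
        if (c < num_colors) {
            result.color[node] = c;
        } else {
            result.spilled.insert(node);
        }
    }
    return result;
}
```

On a triangle with two colors, the third node visited is spilled; with three colors every node receives a distinct color.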
+
+void RISCv64SimpleRegAlloc::rewriteFunction() {
+    if (DEBUG) std::cerr << "\n--- Starting Function Rewrite (Spilling & Substitution) ---\n";
+    StackFrameInfo& frame_info = MFunc->getFrameInfo();
+
+    // Step 1: compute a unique stack offset for every spilled vreg.
+    int current_offset = frame_info.locals_end_offset;
+    for (unsigned vreg : spilled_vregs) {
+        if (frame_info.spill_offsets.count(vreg)) continue;
+        auto [type, size] = getTypeAndSize(vreg);
+        current_offset -= size;
+        current_offset = current_offset & ~7;
+        frame_info.spill_offsets[vreg] = current_offset;
+    }
+    frame_info.spill_size = -(current_offset - frame_info.locals_end_offset);
+
+    // Step 2: walk all instructions; CALL instructions get simplified handling.
+    for (auto& mbb : MFunc->getBlocks()) {
+        std::vector<std::unique_ptr<MachineInstr>> new_instructions;
+        for (auto& instr_ptr : mbb->getInstructions()) {
+
+            if (instr_ptr->getOpcode() != RVOpcodes::CALL) {
+                std::vector<PhysicalReg> int_spill_pool = {PhysicalReg::T2, PhysicalReg::T3, PhysicalReg::T4, /*PhysicalReg::T5,*/ PhysicalReg::T6};
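+                // NOTE: F3 is also listed in allocable_fp_regs, so using it as a spill temporary can clobber a live value.
+                std::vector<PhysicalReg> fp_spill_pool = {PhysicalReg::F0, PhysicalReg::F1, PhysicalReg::F2, PhysicalReg::F3};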
+                std::map<unsigned, PhysicalReg> vreg_to_preg_map_for_this_instr;
+                LiveSet use, def;
+                getInstrUseDef(instr_ptr.get(), use, def);
+                LiveSet all_vregs_in_instr = use;
+                all_vregs_in_instr.insert(def.begin(), def.end());
+                for(unsigned vreg : all_vregs_in_instr) {
+                    if (spilled_vregs.count(vreg)) {
+                        auto [type, size] = getTypeAndSize(vreg);
+                        if (type == Type::kFloat) {
+                            assert(!fp_spill_pool.empty() && "FP spill pool exhausted for generic instruction!");
+                            vreg_to_preg_map_for_this_instr[vreg] = fp_spill_pool.front();
+                            fp_spill_pool.erase(fp_spill_pool.begin());
+                        } else {
+                            assert(!int_spill_pool.empty() && "Int spill pool exhausted for generic instruction!");
+                            vreg_to_preg_map_for_this_instr[vreg] = int_spill_pool.front();
+                            int_spill_pool.erase(int_spill_pool.begin());
+                        }
+                    }
+                }
+                for (unsigned vreg : use) {
+                    if (spilled_vregs.count(vreg)) {
+                        PhysicalReg target_preg = vreg_to_preg_map_for_this_instr.at(vreg);
+                        auto [type, size] = getTypeAndSize(vreg);
+                        RVOpcodes load_op = (type == Type::kFloat) ? RVOpcodes::FLW : ((type == Type::kPointer) ? RVOpcodes::LD : RVOpcodes::LW);
+                        auto load = std::make_unique<MachineInstr>(load_op);
+                        load->addOperand(std::make_unique<RegOperand>(target_preg));
+                        load->addOperand(std::make_unique<MemOperand>(
+                            std::make_unique<RegOperand>(PhysicalReg::S0),
+                            std::make_unique<ImmOperand>(frame_info.spill_offsets.at(vreg))
+                        ));
+                        new_instructions.push_back(std::move(load));
+                    }
+                }
+                auto new_instr = std::make_unique<MachineInstr>(instr_ptr->getOpcode());
+                for (const auto& op : instr_ptr->getOperands()) {
+                    const RegOperand* reg_op = nullptr;
+                    if (op->getKind() == MachineOperand::KIND_REG) reg_op = static_cast<const RegOperand*>(op.get());
+                    else if (op->getKind() == MachineOperand::KIND_MEM) reg_op = static_cast<const MemOperand*>(op.get())->getBase();
+                    if (reg_op) {
+                        PhysicalReg final_preg;
+                        if (reg_op->isVirtual()) {
+                            unsigned vreg = reg_op->getVRegNum();
+                            if (spilled_vregs.count(vreg)) {
+                                final_preg = vreg_to_preg_map_for_this_instr.at(vreg);
+                            } else {
+                                assert(color_map.count(vreg));
+                                final_preg = color_map.at(vreg);
+                            }
+                        } else {
+                            final_preg = reg_op->getPReg();
+                        }
+                        auto new_reg_op = std::make_unique<RegOperand>(final_preg);
+                        if (op->getKind() == MachineOperand::KIND_REG) {
+                            new_instr->addOperand(std::move(new_reg_op));
+                        } else {
+                            auto mem_op = static_cast<const MemOperand*>(op.get());
+                            new_instr->addOperand(std::make_unique<MemOperand>(std::move(new_reg_op), std::make_unique<ImmOperand>(*mem_op->getOffset())));
+                        }
+                    } else {
+                        if(op->getKind() == MachineOperand::KIND_IMM) new_instr->addOperand(std::make_unique<ImmOperand>(*static_cast<const ImmOperand*>(op.get())));
+                        else if (op->getKind() == MachineOperand::KIND_LABEL) new_instr->addOperand(std::make_unique<LabelOperand>(*static_cast<const LabelOperand*>(op.get())));
+                    }
+                }
+                new_instructions.push_back(std::move(new_instr));
+                for (unsigned vreg : def) {
+                    if (spilled_vregs.count(vreg)) {
+                        PhysicalReg src_preg = vreg_to_preg_map_for_this_instr.at(vreg);
+                        auto [type, size] = getTypeAndSize(vreg);
+                        RVOpcodes store_op = (type == Type::kFloat) ? RVOpcodes::FSW : ((type == Type::kPointer) ? RVOpcodes::SD : RVOpcodes::SW);
+                        auto store = std::make_unique<MachineInstr>(store_op);
+                        store->addOperand(std::make_unique<RegOperand>(src_preg));
+                        store->addOperand(std::make_unique<MemOperand>(
+                            std::make_unique<RegOperand>(PhysicalReg::S0),
+                            std::make_unique<ImmOperand>(frame_info.spill_offsets.at(vreg))
+                        ));
+                        new_instructions.push_back(std::move(store));
+                    }
+                }
+            } else {
+                // --- For CALL, handle only the call itself and its return value; arguments are no longer handled here ---
+                const PhysicalReg INT_TEMP_REG = PhysicalReg::T6;
+                const PhysicalReg FP_TEMP_REG = PhysicalReg::F7;
+
+                // 1. Clone the CALL itself, keeping only the label operand.
+                auto new_call = std::make_unique<MachineInstr>(RVOpcodes::CALL);
+                for (const auto& op : instr_ptr->getOperands()) {
+                    if (op->getKind() == MachineOperand::KIND_LABEL) {
+                        new_call->addOperand(std::make_unique<LabelOperand>(*static_cast<const LabelOperand*>(op.get())));
+                        // Note: add only the first label, to guard against an ISel bug producing several.
+                        break; 
+                    }
+                }
+                new_instructions.push_back(std::move(new_call));
+
+                // 2. Handle only the spill and move of the return value (def).
+                auto& operands = instr_ptr->getOperands();
+                if (!operands.empty() && operands.front()->getKind() == MachineOperand::KIND_REG) {
+                    unsigned def_vreg = static_cast<RegOperand*>(operands.front().get())->getVRegNum();
+                    auto [type, size] = getTypeAndSize(def_vreg);
+                    PhysicalReg result_reg_abi = type == Type::kFloat ? PhysicalReg::F10 : PhysicalReg::A0;
+
+                    if (spilled_vregs.count(def_vreg)) {
+                        // Return value is spilled: a0/fa0 -> temp -> spill slot
+                        PhysicalReg temp_reg = type == Type::kFloat ? FP_TEMP_REG : INT_TEMP_REG;
+                        RVOpcodes store_op = (type == Type::kFloat) ? RVOpcodes::FSW : ((type == Type::kPointer) ? RVOpcodes::SD : RVOpcodes::SW);
+                        
+                        auto mv_from_abi = std::make_unique<MachineInstr>(type == Type::kFloat ? RVOpcodes::FMV_S : RVOpcodes::MV);
+                        mv_from_abi->addOperand(std::make_unique<RegOperand>(temp_reg));
+                        mv_from_abi->addOperand(std::make_unique<RegOperand>(result_reg_abi));
+                        new_instructions.push_back(std::move(mv_from_abi));
+
+                        auto store = std::make_unique<MachineInstr>(store_op);
+                        store->addOperand(std::make_unique<RegOperand>(temp_reg));
+                        store->addOperand(std::make_unique<MemOperand>(
+                            std::make_unique<RegOperand>(PhysicalReg::S0),
+                            std::make_unique<ImmOperand>(frame_info.spill_offsets.at(def_vreg))
+                        ));
+                        new_instructions.push_back(std::move(store));
+                    } else {
+                        // Return value is not spilled: a0/fa0 -> the colored physical register
+                        auto mv_to_dest = std::make_unique<MachineInstr>(type == Type::kFloat ? RVOpcodes::FMV_S : RVOpcodes::MV);
+                        mv_to_dest->addOperand(std::make_unique<RegOperand>(color_map.at(def_vreg)));
+                        mv_to_dest->addOperand(std::make_unique<RegOperand>(result_reg_abi));
+                        new_instructions.push_back(std::move(mv_to_dest));
+                    }
+                }
+            }
+        }
+        mbb->getInstructions() = std::move(new_instructions);
+    }
+}
+
+// --- Helper function implementations ---
+
+void RISCv64SimpleRegAlloc::getInstrUseDef_Liveness(const MachineInstr* instr, LiveSet& use, LiveSet& def) {
+    auto opcode = instr->getOpcode();
+    const auto& operands = instr->getOperands();
+    const unsigned offset = static_cast<unsigned>(PhysicalReg::PHYS_REG_START_ID);
+    
+    auto get_any_reg_id = [&](const MachineOperand* op) -> unsigned {
+        if (op->getKind() == MachineOperand::KIND_REG) {
+            auto reg_op = static_cast<const RegOperand*>(op);
+            return reg_op->isVirtual() ? reg_op->getVRegNum() : (offset + static_cast<unsigned>(reg_op->getPReg()));
+        } else if (op->getKind() == MachineOperand::KIND_MEM) {
+            auto reg_op = static_cast<const MemOperand*>(op)->getBase();
+            return reg_op->isVirtual() ? reg_op->getVRegNum() : (offset + static_cast<unsigned>(reg_op->getPReg()));
+        }
+        return (unsigned)-1;
+    };
+    
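+    if (opcode != RVOpcodes::RET && op_info.count(opcode)) { // RET may be listed in op_info with empty use/def sets; skip it so the dedicated branch below runs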
+        const auto& info = op_info.at(opcode);
+        for (int idx : info.first) if (static_cast<size_t>(idx) < operands.size()) {
+            unsigned reg_id = get_any_reg_id(operands[idx].get());
+            if (reg_id != (unsigned)-1) def.insert(reg_id);
+        }
+        for (int idx : info.second) if (static_cast<size_t>(idx) < operands.size()) {
+            unsigned reg_id = get_any_reg_id(operands[idx].get());
+            if (reg_id != (unsigned)-1) use.insert(reg_id);
+        }
+        for (const auto& op : operands) {
+            if (op->getKind() == MachineOperand::KIND_MEM) {
+                 unsigned reg_id = get_any_reg_id(op.get());
+                 if (reg_id != (unsigned)-1) use.insert(reg_id);
+            }
+        }
+    } 
+    else if (opcode == RVOpcodes::CALL) {
+        if (!operands.empty() && operands[0]->getKind() == MachineOperand::KIND_REG) {
+            def.insert(get_any_reg_id(operands[0].get()));
+        }
+        for (size_t i = 1; i < operands.size(); ++i) {
+            if (operands[i]->getKind() == MachineOperand::KIND_REG) {
+                use.insert(get_any_reg_id(operands[i].get()));
+            }
+        }
+        for (auto preg : getCallerSavedIntRegs()) def.insert(offset + static_cast<unsigned>(preg));
+        for (auto preg : getCallerSavedFpRegs()) def.insert(offset + static_cast<unsigned>(preg));
+        def.insert(offset + static_cast<unsigned>(PhysicalReg::RA));
+    }
+    else if (opcode == RVOpcodes::RET) {
+        use.insert(offset + static_cast<unsigned>(PhysicalReg::A0));
+        use.insert(offset + static_cast<unsigned>(PhysicalReg::F10)); // fa0
+    }
+}
+
+std::pair<Type::Kind, unsigned> RISCv64SimpleRegAlloc::getTypeAndSize(unsigned vreg) {
+    const auto& vreg_type_map = ISel->getVRegTypeMap();
+    if (vreg_type_map.count(vreg)) {
+        Type* type = vreg_type_map.at(vreg);
+        if (type->isFloat()) return {Type::kFloat, 4};
+        if (type->isPointer()) return {Type::kPointer, 8};
+    }
+    // Default or unknown types are treated as 32-bit integers
+    return {Type::kInt, 4};
+}
+
+std::string RISCv64SimpleRegAlloc::regIdToString(unsigned id, const RISCv64AsmPrinter& printer) const {
+    const unsigned offset = static_cast<unsigned>(PhysicalReg::PHYS_REG_START_ID);
+    if (id >= offset) {
+        PhysicalReg reg = static_cast<PhysicalReg>(id - offset);
+        return printer.regToString(reg);
+    } else {
+        return "%vreg" + std::to_string(id);
+    }
+}
+
+void RISCv64SimpleRegAlloc::printLiveSet(const LiveSet& s, const std::string& name, std::ostream& os, const RISCv64AsmPrinter& printer) {
+    os << "    " << name << " (" << s.size() << "): { ";
+    for (unsigned vreg : s) {
+        os << regIdToString(vreg, printer) << " ";
+    }
+    os << "}\n";
+}
+
+} // namespace sysy
--- a/src/include/backend/RISCv64/Handler/EliminateFrameIndices.h
+++ b/src/include/backend/RISCv64/Handler/EliminateFrameIndices.h
@ -0,0 +1,20 @@
+#ifndef ELIMINATE_FRAME_INDICES_H
+#define ELIMINATE_FRAME_INDICES_H
+
+#include "RISCv64LLIR.h"
+
+namespace sysy {
+
+class EliminateFrameIndicesPass {
+public:
+    // Main entry point of the pass
+    void runOnMachineFunction(MachineFunction* mfunc);
+
+private:
+    // Helper that computes the size of a type; moved out of the original RegAlloc
+    unsigned getTypeSizeInBytes(Type* type);
+};
+
+} // namespace sysy
+
+#endif // ELIMINATE_FRAME_INDICES_H
--- a/src/include/backend/RISCv64/Optimize/DivStrengthReduction.h
+++ b/src/include/backend/RISCv64/Optimize/DivStrengthReduction.h
@ -0,0 +1,30 @@
+#ifndef RISCV64_DIV_STRENGTH_REDUCTION_H
+#define RISCV64_DIV_STRENGTH_REDUCTION_H
+
+#include "RISCv64LLIR.h"
+#include "Pass.h"
+
+namespace sysy {
+
+/**
+ * @class DivStrengthReduction
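+ * @brief Division strength-reduction optimizer.
+ * Converts division into multiplication using the magic-number algorithm.
+ * Applies when the divisor is a constant and can improve performance significantly.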
+ */
+class DivStrengthReduction : public Pass {
+public:
+    static char ID;
+    
+    DivStrengthReduction() : Pass("div-strength-reduction", Granularity::Function, PassKind::Optimization) {}
+    
+    void *getPassID() const override { return &ID; }
+    
+    bool runOnFunction(Function *F, AnalysisManager& AM) override;
+    
+    void runOnMachineFunction(MachineFunction* mfunc);
+};
+
+} // namespace sysy
+
+#endif // RISCV64_DIV_STRENGTH_REDUCTION_H
--- a/src/include/backend/RISCv64/Optimize/Peephole.h
+++ b/src/include/backend/RISCv64/Optimize/Peephole.h
@ -23,6 +23,21 @@ public:
    bool runOnFunction(Function *F, AnalysisManager& AM) override;
    
    void runOnMachineFunction(MachineFunction* mfunc);
+    
+    /**
+     * @brief Set whether the floating-point fused multiply-add optimization is enabled
+     * @param enabled Whether to enable it
+     */
+    static void setFusedMulAddEnabled(bool enabled) { fusedMulAddEnabled = enabled; }
+    
+    /**
+     * @brief Check whether the floating-point fused multiply-add optimization is enabled
+     * @return Whether it is enabled
+     */
+    static bool isFusedMulAddEnabled() { return fusedMulAddEnabled; }
+
+private:
+    static bool fusedMulAddEnabled;  // Switch for the floating-point fused multiply-add optimization
 };

 } // namespace sysy
--- a/src/include/backend/RISCv64/RISCv64AsmPrinter.h
+++ b/src/include/backend/RISCv64/RISCv64AsmPrinter.h
@ -19,7 +19,9 @@ public:
    // Helper functions
    void setStream(std::ostream& os) { OS = &os; }
    // Helper functions
-    std::string regToString(PhysicalReg reg);
+    std::string regToString(PhysicalReg reg) const;
+    std::string formatInstr(const MachineInstr *instr);
+
 private:
    // Print the individual parts
    void printBasicBlock(MachineBasicBlock* mbb, bool debug = false);
--- a/src/include/backend/RISCv64/RISCv64Backend.h
+++ b/src/include/backend/RISCv64/RISCv64Backend.h
@ -6,6 +6,7 @@

 extern int DEBUG;
 extern int DEEPDEBUG;
+extern int optLevel;

 namespace sysy {

@ -22,7 +23,11 @@ private:
    // Function-level code generation (implements the new pipeline)
    std::string function_gen(Function* func);

+    // Private helper that computes the number of bytes a type occupies.
+    unsigned getTypeSizeInBytes(Type* type);
+    
    Module* module;
+    bool irc_failed = false;
 };

 } // namespace sysy
--- a/src/include/backend/RISCv64/RISCv64BasicBlockAlloc.h
+++ b/src/include/backend/RISCv64/RISCv64BasicBlockAlloc.h
@ -0,0 +1,61 @@
+#ifndef RISCV64_BASICBLOCKALLOC_H
+#define RISCV64_BASICBLOCKALLOC_H
+
+#include "RISCv64LLIR.h"
+#include "RISCv64ISel.h"
+#include <set>
+#include <map>
+#include <vector>
+
+namespace sysy {
+
+/**
+ * @class RISCv64BasicBlockAlloc
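+ * @brief A stateful, basic-block-level greedy register allocator.
+ *
+ * This allocator is a simple but reliable implementation. It processes basic blocks one at a time
+ * and keeps virtual register values in physical registers within a block wherever possible, to reduce unnecessary memory accesses.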
+ */
+class RISCv64BasicBlockAlloc {
+public:
+    RISCv64BasicBlockAlloc(MachineFunction* mfunc);
+    void run();
+
+private:
+    void computeLiveness();
+    void processBasicBlock(MachineBasicBlock* mbb);
+    void assignStackSlotsForAllVRegs();
+    
+    // Core allocation functions
+    PhysicalReg ensureInReg(unsigned vreg, std::vector<std::unique_ptr<MachineInstr>>& new_instrs);
+    PhysicalReg allocReg(unsigned vreg, std::vector<std::unique_ptr<MachineInstr>>& new_instrs);
+    PhysicalReg findFreeReg(bool is_fp);
+    PhysicalReg spillReg(bool is_fp, std::vector<std::unique_ptr<MachineInstr>>& new_instrs);
+
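+    // State tracking (reset at the start of every basic block)
+    std::map<unsigned, PhysicalReg> vreg_to_preg; // Current mapping from vreg to physical register
+    std::map<PhysicalReg, unsigned> preg_to_vreg; // Reverse mapping
+    std::set<PhysicalReg> dirty_pregs;      // Physical registers that were modified and must be written back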
+
+    // Allocator-wide information
+    MachineFunction* MFunc;
+    RISCv64ISel* ISel;
+    std::map<unsigned, PhysicalReg> abi_vreg_map; // ABI register mapping for function arguments
+
+    // Register pools and round-robin indices
+    std::vector<PhysicalReg> int_temps;
+    std::vector<PhysicalReg> fp_temps;
+    int int_temp_idx = 0;
+    int fp_temp_idx = 0;
+
+    // Helper functions
+    PhysicalReg getNextIntTemp();
+    PhysicalReg getNextFpTemp();
+
+    // Liveness analysis results
+    std::map<const MachineBasicBlock*, std::set<unsigned>> live_out;
+};
+
+} // namespace sysy
+
+#endif // RISCV64_BASICBLOCKALLOC_H
--- a/src/include/backend/RISCv64/RISCv64ISel.h
+++ b/src/include/backend/RISCv64/RISCv64ISel.h
@ -3,8 +3,15 @@

 #include "RISCv64LLIR.h"

+// Forward declarations
+namespace sysy {
+    class GlobalValue;
+    class Value;
+}
+
 extern int DEBUG;
 extern int DEEPDEBUG;
+extern int optLevel;

 namespace sysy {

@ -16,8 +23,8 @@ public:

    // Public interface so that later modules (such as RegAlloc) can query or create vregs
    unsigned getVReg(Value* val);
-    unsigned getNewVReg() { return vreg_counter++; }
-    unsigned getNewVReg(Type* type); 
+    unsigned getNewVReg(Type* type);
+    unsigned getVRegCounter() const; 
    // Public accessors for vreg_map
    const std::map<Value*, unsigned>& getVRegMap() const { return vreg_map; }
    const std::map<unsigned, Value*>& getVRegValueMap() const { return vreg_to_value_map; }
--- a/src/include/backend/RISCv64/RISCv64Info.h
+++ b/src/include/backend/RISCv64/RISCv64Info.h
@ -0,0 +1,98 @@
+#ifndef RISCV64_INFO_H
+#define RISCV64_INFO_H
+
+#include "RISCv64LLIR.h"
+#include <map>
+#include <vector>
+
+namespace sysy {
+
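+// A global, authoritative instruction information table.
+// It records the operand indices that each instruction defines (def) and uses (use).
+// defs: {0} -> the first operand is a def
+// uses: {1, 2} -> the second and third operands are uses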
+static const std::map<RVOpcodes, std::pair<std::vector<int>, std::vector<int>>> op_info = {
+    // --- Integer arithmetic (R-Type) ---
+    {RVOpcodes::ADD, {{0}, {1, 2}}}, 
+    {RVOpcodes::SUB, {{0}, {1, 2}}}, 
+    {RVOpcodes::MUL, {{0}, {1, 2}}},
+    {RVOpcodes::MULH, {{0}, {1, 2}}},
+    {RVOpcodes::DIV, {{0}, {1, 2}}}, 
+    {RVOpcodes::DIVW, {{0}, {1, 2}}},
+    {RVOpcodes::REM, {{0}, {1, 2}}}, 
+    {RVOpcodes::REMW, {{0}, {1, 2}}}, 
+    {RVOpcodes::ADDW, {{0}, {1, 2}}},
+    {RVOpcodes::SUBW, {{0}, {1, 2}}}, 
+    {RVOpcodes::MULW, {{0}, {1, 2}}}, 
+    {RVOpcodes::SLT, {{0}, {1, 2}}}, 
+    {RVOpcodes::SLTU, {{0}, {1, 2}}},
+    {RVOpcodes::XOR, {{0}, {1, 2}}},
+    {RVOpcodes::OR, {{0}, {1, 2}}},
+    {RVOpcodes::AND, {{0}, {1, 2}}},
+    {RVOpcodes::SLL, {{0}, {1, 2}}},
+    {RVOpcodes::SRL, {{0}, {1, 2}}},
+    {RVOpcodes::SRA, {{0}, {1, 2}}},
+    {RVOpcodes::SLLW, {{0}, {1, 2}}},
+    {RVOpcodes::SRLW, {{0}, {1, 2}}},
+    {RVOpcodes::SRAW, {{0}, {1, 2}}},
+
+    // --- Integer arithmetic (I-Type) ---
+    {RVOpcodes::ADDI, {{0}, {1}}}, 
+    {RVOpcodes::ADDIW, {{0}, {1}}}, 
+    {RVOpcodes::XORI, {{0}, {1}}},
+    {RVOpcodes::ORI, {{0}, {1}}},
+    {RVOpcodes::ANDI, {{0}, {1}}},
+    {RVOpcodes::SLTI, {{0}, {1}}}, 
+    {RVOpcodes::SLTIU, {{0}, {1}}},
+    {RVOpcodes::SLLI, {{0}, {1}}},
+    {RVOpcodes::SLLIW, {{0}, {1}}},
+    {RVOpcodes::SRLI, {{0}, {1}}},
+    {RVOpcodes::SRLIW, {{0}, {1}}},
+    {RVOpcodes::SRAI, {{0}, {1}}},
+    {RVOpcodes::SRAIW, {{0}, {1}}},
+
+    // --- Memory loads ---
+    {RVOpcodes::LW, {{0}, {}}}, {RVOpcodes::LH, {{0}, {}}}, {RVOpcodes::LB, {{0}, {}}},
+    {RVOpcodes::LWU, {{0}, {}}}, {RVOpcodes::LHU, {{0}, {}}}, {RVOpcodes::LBU, {{0}, {}}},
+    {RVOpcodes::LD, {{0}, {}}},
+    {RVOpcodes::FLW, {{0}, {}}}, {RVOpcodes::FLD, {{0}, {}}},
+
+    // --- Memory stores ---
+    {RVOpcodes::SW, {{}, {0, 1}}}, {RVOpcodes::SH, {{}, {0, 1}}}, {RVOpcodes::SB, {{}, {0, 1}}},
+    {RVOpcodes::SD, {{}, {0, 1}}},
+    {RVOpcodes::FSW, {{}, {0, 1}}}, {RVOpcodes::FSD, {{}, {0, 1}}},
+
+    // --- Branch instructions ---
+    {RVOpcodes::BEQ, {{}, {0, 1}}}, {RVOpcodes::BNE, {{}, {0, 1}}}, {RVOpcodes::BLT, {{}, {0, 1}}}, 
+    {RVOpcodes::BGE, {{}, {0, 1}}}, {RVOpcodes::BLTU, {{}, {0, 1}}}, {RVOpcodes::BGEU, {{}, {0, 1}}},
+
+    // --- Jumps ---
+    {RVOpcodes::JAL, {{0}, {}}}, // JAL's rd is a def; x0 is typically used when the link address is not needed. Simplified here.
+    {RVOpcodes::JALR, {{0}, {1}}},
+    {RVOpcodes::RET, {{}, {}}}, // RET is a pseudo-instruction, usually expanded to JALR
+
+    // --- Pseudo-instructions & others ---
+    {RVOpcodes::LI, {{0}, {}}}, {RVOpcodes::LA, {{0}, {}}},
+    {RVOpcodes::MV, {{0}, {1}}}, 
+    {RVOpcodes::NEG, {{0}, {1}}}, // sub rd, zero, rs1
+    {RVOpcodes::NEGW, {{0}, {1}}}, // subw rd, zero, rs1
+    {RVOpcodes::SEQZ, {{0}, {1}}}, 
+    {RVOpcodes::SNEZ, {{0}, {1}}},
+    
+    // --- Function calls ---
+    // CALL's use/def sets are special-cased in getInstrUseDef, so it need not be listed here
+
+    // --- Floating-point instructions ---
+    {RVOpcodes::FADD_S, {{0}, {1, 2}}}, {RVOpcodes::FSUB_S, {{0}, {1, 2}}},
+    {RVOpcodes::FMUL_S, {{0}, {1, 2}}}, {RVOpcodes::FDIV_S, {{0}, {1, 2}}},
+    {RVOpcodes::FMADD_S, {{0}, {1, 2, 3}}},
+    {RVOpcodes::FEQ_S, {{0}, {1, 2}}}, {RVOpcodes::FLT_S, {{0}, {1, 2}}}, {RVOpcodes::FLE_S, {{0}, {1, 2}}},
+    {RVOpcodes::FCVT_S_W, {{0}, {1}}}, {RVOpcodes::FCVT_W_S, {{0}, {1}}},
+    {RVOpcodes::FCVT_W_S_RTZ, {{0}, {1}}},
+    {RVOpcodes::FMV_S, {{0}, {1}}}, {RVOpcodes::FMV_W_X, {{0}, {1}}}, {RVOpcodes::FMV_X_W, {{0}, {1}}},
+    {RVOpcodes::FNEG_S, {{0}, {1}}}
+};
+
+} // namespace sysy
+
+#endif // RISCV64_INFO_H
--- a/src/include/backend/RISCv64/RISCv64LLIR.h
+++ b/src/include/backend/RISCv64/RISCv64LLIR.h
@ -3,6 +3,7 @@

 #include "IR.h" // Make sure your own IR header is included
 #include <string>
+#include <iostream>
 #include <vector>
 #include <memory>
 #include <cstdint>
@ -38,14 +39,16 @@ enum class PhysicalReg {

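    // Internal node IDs for physical registers in the interference graph (a simple reserved ID range that must not collide with vreg_counter)
    // Assumes vreg_counter never reaches values this large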
-    PHYS_REG_START_ID = 100000, 
+    PHYS_REG_START_ID = 1000000, 
    PHYS_REG_END_ID = PHYS_REG_START_ID + 320, // Reserve enough room
+
+    INVALID, ///< Marker for an invalid register
 };

 // RISC-V instruction opcode enum
 enum class RVOpcodes {
    // Arithmetic instructions
-    ADD, ADDI, ADDW, ADDIW, SUB, SUBW, MUL, MULW, DIV, DIVW, REM, REMW,
+    ADD, ADDI, ADDW, ADDIW, SUB, SUBW, MUL, MULW, MULH, DIV, DIVW, REM, REMW,
    // Logical instructions
    XOR, XORI, OR, ORI, AND, ANDI,
    // Shift instructions
@ -76,6 +79,7 @@ enum class RVOpcodes {
    FSUB_S, // fsub.s rd, rs1, rs2
    FMUL_S, // fmul.s rd, rs1, rs2
    FDIV_S, // fdiv.s rd, rs1, rs2
+    FMADD_S, // fmadd.s rd, rs1, rs2, rs3
    
    // Floating-point comparisons (single precision)
    FEQ_S,  // feq.s rd, rs1, rs2 (result is written to integer register rd)
@ -85,6 +89,7 @@ enum class RVOpcodes {
    // Floating-point conversions
    FCVT_S_W, // fcvt.s.w rd, rs1 (signed integer -> single-precision float)
    FCVT_W_S, // fcvt.w.s rd, rs1 (single-precision float -> signed integer)
+    FCVT_W_S_RTZ, // fcvt.w.s rd, rs1, rtz (uses the round-toward-zero mode)

    // Floating-point transfers/moves
    FMV_S,    // fmv.s rd, rs1 (between floating-point registers)
@ -92,6 +97,9 @@ enum class RVOpcodes {
    FMV_X_W,  // fmv.x.w rd, rs1 (floating-point register bit pattern -> integer register)
    FNEG_S,   // fneg.s rd, rs (floating-point negate)

+    // Floating-point control and status register (CSR)
+    FSRMI,    // fsrmi rd, imm (set the rounding mode from an immediate)
+
    // Pseudo-instructions
    FRAME_LOAD_W,  // Load a 32-bit word from the stack frame (maps to lw)
    FRAME_LOAD_D,  // Load a 64-bit doubleword from the stack frame (maps to ld)
@ -195,6 +203,11 @@ public:
        preg = new_preg;
        is_virtual = false;
    }
+    
+    void setVRegNum(unsigned new_vreg_num) {
+        vreg_num = new_vreg_num;
+        is_virtual = true; // Keep the operand state consistent whenever a vreg is set
+    }
 private:
    unsigned vreg_num = 0;
    PhysicalReg preg = PhysicalReg::ZERO;
@ -243,6 +256,19 @@ public:
    void addOperand(std::unique_ptr<MachineOperand> operand) {
        operands.push_back(std::move(operand));
    }
+    /**
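+     * @brief (Added for emergency spill mode) Replace every reference to a given virtual register in this instruction with the specified physical register.
+     * @param old_vreg Number of the virtual register to replace.
+     * @param preg Physical register to substitute.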
+     */
+    void replaceVRegWithPReg(unsigned old_vreg, PhysicalReg preg);
+
+    /**
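+     * @brief (Added for regular spill mode) Remap the virtual registers in this instruction according to the supplied maps.
+     * @param use_remap Map from old vreg to new vreg, applied to the instruction's use operands.
+     * @param def_remap Map from old vreg to new vreg, applied to the instruction's def operands.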
+     */
+    void remapVRegs(const std::map<unsigned, unsigned>& use_remap, const std::map<unsigned, unsigned>& def_remap);
 private:
    RVOpcodes opcode;
    std::vector<std::unique_ptr<MachineOperand>> operands;
@ -274,14 +300,15 @@ private:
 // Stack frame information
 struct StackFrameInfo {
    int locals_size = 0; // Size allocated for AllocaInst only
+    int locals_end_offset = 0; // Offset where local-variable allocation ends (relative to s0, negative)
    int spill_size = 0; // Size allocated for spills only
    int total_size = 0; // Total size
    int callee_saved_size = 0; // Size of the saved registers
    std::map<unsigned, int> alloca_offsets; // <vreg of AllocaInst, stack offset>
    std::map<unsigned, int> spill_offsets;  // <spilled vreg, stack offset>
    std::set<PhysicalReg> used_callee_saved_regs; // Callee-saved registers in use
-    std::map<unsigned, PhysicalReg> vreg_to_preg_map;
-    std::vector<PhysicalReg> callee_saved_regs; // List of callee-saved registers that need to be saved
+    std::map<unsigned, PhysicalReg> vreg_to_preg_map; // Final allocation result from RegAlloc
+    std::vector<PhysicalReg> callee_saved_regs_to_store; // Sorted callee-saved registers that must be saved and restored
 };

 // Machine function
@ -295,17 +322,40 @@ public:
    StackFrameInfo& getFrameInfo() { return frame_info; }
    const std::vector<std::unique_ptr<MachineBasicBlock>>& getBlocks() const { return blocks; }
    std::vector<std::unique_ptr<MachineBasicBlock>>& getBlocks() { return blocks; }
-
+    void dumpStackFrameInfo(std::ostream& os = std::cerr) const;
    void addBlock(std::unique_ptr<MachineBasicBlock> block) {
        blocks.push_back(std::move(block));
    }
+    void addProtectedArgumentVReg(unsigned vreg) {
+        protected_argument_vregs.insert(vreg);
+    }
+    const std::set<unsigned>& getProtectedArgumentVRegs() const {
+        return protected_argument_vregs;
+    }
 private:
    Function* F;
    RISCv64ISel* isel; // Points to the ISel that created it; used to fetch vreg maps and similar information
    std::string name;
    std::vector<std::unique_ptr<MachineBasicBlock>> blocks;
    StackFrameInfo frame_info;
+    std::set<unsigned> protected_argument_vregs;
 };
+inline bool isMemoryOp(RVOpcodes opcode) {
+    switch (opcode) {
+        case RVOpcodes::LB: case RVOpcodes::LH: case RVOpcodes::LW: case RVOpcodes::LD:
+        case RVOpcodes::LBU: case RVOpcodes::LHU: case RVOpcodes::LWU:
+        case RVOpcodes::SB: case RVOpcodes::SH: case RVOpcodes::SW: case RVOpcodes::SD:
+        case RVOpcodes::FLW:
+        case RVOpcodes::FSW:
+        case RVOpcodes::FLD:
+        case RVOpcodes::FSD:
+            return true;
+        default:
+            return false;
+    }
+}
+
+void getInstrUseDef(const MachineInstr* instr, std::set<unsigned>& use, std::set<unsigned>& def);

 } // namespace sysy

--- a/src/include/backend/RISCv64/RISCv64LinearScan.h
+++ b/src/include/backend/RISCv64/RISCv64LinearScan.h
@ -0,0 +1,81 @@
+#ifndef RISCV64_LINEARSCAN_H
+#define RISCV64_LINEARSCAN_H
+
+#include "RISCv64LLIR.h"
+#include "RISCv64ISel.h"
+#include <vector>
+#include <map>
+#include <set>
+#include <algorithm>
+
+namespace sysy {
+
+// Forward declarations
+class MachineBasicBlock;
+class MachineFunction;
+class RISCv64ISel;
+
+/**
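+ * @brief Represents the live interval of one virtual register.
+ * Holds the start and end instruction numbers. For simplicity, intervals with "holes" are not handled.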
+ */
+struct LiveInterval {
+    unsigned vreg = 0;
+    int start = -1;
+    int end = -1;
+    bool crosses_call = false;
+    
+    LiveInterval(unsigned vreg) : vreg(vreg) {}
+
+    // For sorting: ascending by start point
+    bool operator<(const LiveInterval& other) const {
+        return start < other.start;
+    }
+};
+
+class RISCv64LinearScan {
+public:
+    RISCv64LinearScan(MachineFunction* mfunc);
+    bool run();
+
+private:
+    // --- Core algorithm phases ---
+    void linearizeBlocks();
+    void computeLiveIntervals();
+    bool linearScan();
+    void rewriteProgram();
+    void applyAllocation();
+    void spillAtInterval(LiveInterval* current);
+    
+    // --- Helper functions ---
+    bool isFPVReg(unsigned vreg) const;
+    void collectUsedCalleeSavedRegs();
+
+    MachineFunction* MFunc;
+    RISCv64ISel* ISel;
+
+    // --- Linear-scan data structures ---
+    std::vector<MachineBasicBlock*> linear_order_blocks;
+    std::map<const MachineInstr*, int> instr_numbering;
+    std::map<unsigned, LiveInterval> live_intervals;
+    
+    std::vector<LiveInterval*> unhandled;
+    std::vector<LiveInterval*> active; // Intervals that are live and already assigned a physical register
+    
+    std::set<unsigned> spilled_vregs; // vregs chosen for spilling in the current round
+
+    bool conservative_spill_mode = false;
+    const PhysicalReg SPILL_TEMP_REG = PhysicalReg::T4;
+
+    // --- Register pools and allocation results ---
+    std::vector<PhysicalReg> allocable_int_regs;
+    std::vector<PhysicalReg> allocable_fp_regs;
+    std::map<unsigned, PhysicalReg> vreg_to_preg_map;
+    std::map<unsigned, PhysicalReg> abi_vreg_map;
+    
+    const std::map<unsigned, Type*>& vreg_type_map;
+};
+
+} // namespace sysy
+
+#endif // RISCV64_LINEARSCAN_H
--- a/src/include/backend/RISCv64/RISCv64Passes.h
+++ b/src/include/backend/RISCv64/RISCv64Passes.h
@ -1,6 +1,7 @@
 #ifndef RISCV64_PASSES_H
 #define RISCV64_PASSES_H

+#include "Pass.h"
 #include "RISCv64LLIR.h"
 #include "Peephole.h"
 #include "PreRA_Scheduler.h"
@ -8,7 +9,8 @@
 #include "CalleeSavedHandler.h"
 #include "LegalizeImmediates.h"
 #include "PrologueEpilogueInsertion.h"
-#include "Pass.h"
+#include "EliminateFrameIndices.h"
+#include "DivStrengthReduction.h"

 namespace sysy {

--- a/src/include/backend/RISCv64/RISCv64RegAlloc.h
+++ b/src/include/backend/RISCv64/RISCv64RegAlloc.h
@ -3,9 +3,16 @@

 #include "RISCv64LLIR.h"
 #include "RISCv64ISel.h" // Include RISCv64ISel.h for access to the ISel and Value types
+#include <set>
+#include <vector>
+#include <map>
+#include <stack>

 extern int DEBUG;
 extern int DEEPDEBUG;
+extern int DEBUGLENGTH; // Limits the length of debug output
+extern int DEEPERDEBUG; // Enables an even deeper level of debug output
+extern int optLevel;

 namespace sysy {

@ -14,61 +21,100 @@ public:
    RISCv64RegAlloc(MachineFunction* mfunc);

    // Main entry point of the module
-    void run();
+    bool run(std::shared_ptr<std::atomic<bool>> stop_flag);

 private:
-    using LiveSet = std::set<unsigned>; // Set of live virtual registers
-    using InterferenceGraph = std::map<unsigned, std::set<unsigned>>;
+    // Type definitions, mirroring the Python version
+    using VRegSet = std::set<unsigned>;
+    using InterferenceGraph = std::map<unsigned, VRegSet>;
+    using VRegStack = std::vector<unsigned>; // A vector used as a stack, for easy iteration
+    using MoveList = std::map<unsigned, std::set<const MachineInstr*>>;
+    using AliasMap = std::map<unsigned, unsigned>;
+    using ColorMap = std::map<unsigned, PhysicalReg>;
+    using VRegMoveSet = std::set<const MachineInstr*>;

-    // Stack frame management
-    void eliminateFrameIndices();
-    
-    // Liveness analysis
+    // --- Core algorithm phases ---
+    void initialize();
+    void build();
+    void makeWorklist();
+    void simplify();
+    void coalesce();
+    void freeze();
+    void selectSpill();
+    void assignColors();
+    void rewriteProgram();
+    bool doAllocation();
+    void applyColoring();
+    void precolorByCallingConvention();
+    void protectCrossCallVRegs();
+
+    // --- Helper functions ---
+    void dumpState(const std::string &stage);
+    void getInstrUseDef(const MachineInstr* instr, VRegSet& use, VRegSet& def);
+    void getInstrUseDef_Liveness(const MachineInstr *instr, VRegSet &use, VRegSet &def);
+    void addEdge(unsigned u, unsigned v);
+    VRegSet adjacent(unsigned n);
+    VRegMoveSet nodeMoves(unsigned n);
+    bool moveRelated(unsigned n);
+    void decrementDegree(unsigned m);
+    void enableMoves(const VRegSet& nodes);
+    unsigned getAlias(unsigned n);
+    void addWorklist(unsigned u);
+    bool briggsHeuristic(unsigned u, unsigned v);
+    bool georgeHeuristic(unsigned u, unsigned v);
+    void combine(unsigned u, unsigned v);
+    void freezeMoves(unsigned u);
+    void collectUsedCalleeSavedRegs();
+    bool isFPVReg(unsigned vreg) const;
+    std::string regToString(PhysicalReg reg);
+    std::string regIdToString(unsigned id);
+
+    // --- Liveness analysis ---
    void analyzeLiveness();

-    // Build the interference graph
-    void buildInterferenceGraph();
-
-    // Allocate registers by graph coloring
-    void colorGraph();
-
-    // Rewrite the function, replacing vregs and inserting spill code
-    void rewriteFunction();
-    
-    // Helper: get the Use/Def sets of an instruction
-    void getInstrUseDef(MachineInstr* instr, LiveSet& use, LiveSet& def);
-
-    // Helper: handle the calling convention
-    void handleCallingConvention();
-
    MachineFunction* MFunc;
-    
-    // Liveness analysis results
-    std::map<const MachineInstr*, LiveSet> live_in_map;
-    std::map<const MachineInstr*, LiveSet> live_out_map;
+    RISCv64ISel* ISel;

-    // Interference graph
-    InterferenceGraph interference_graph;
-
-    // Graph coloring results
-    std::map<unsigned, PhysicalReg> color_map; // vreg -> preg
-    std::set<unsigned> spilled_vregs;         // Set of spilled vregs
-
-    // Pool of available physical registers
+    // --- Algorithm data structures ---
+    // Register pools
    std::vector<PhysicalReg> allocable_int_regs;
    std::vector<PhysicalReg> allocable_fp_regs;
+    int K_int; // Number of integer registers
+    int K_fp;  // Number of floating-point registers

-    // Reverse mapping from vreg to IR Value*
-    // This map is filled at the start of run() and used in rewriteFunction().
-    std::map<unsigned, Value*> vreg_to_value_map;
-    std::map<PhysicalReg, unsigned> preg_to_vreg_id_map; // Mapping from physical register to its special vreg ID
-    
-    // Helper for computing the size of a type
-    unsigned getTypeSizeInBytes(Type* type);
+    // Node sets
+    VRegSet precolored; // Precolored nodes (physical registers)
+    VRegSet initial;    // Initial set of all virtual-register nodes still to be processed
+    VRegSet simplifyWorklist;
+    VRegSet freezeWorklist;
+    VRegSet spillWorklist;
+    VRegSet spilledNodes;
+    VRegSet coalescedNodes;
+    VRegSet coloredNodes;
+    VRegStack selectStack;

-    // Helper for printing a set
-    static void printLiveSet(const LiveSet& s, const std::string& name, std::ostream& os);
+    // Move-instruction bookkeeping
+    std::set<const MachineInstr*> coalescedMoves;
+    std::set<const MachineInstr*> constrainedMoves;
+    std::set<const MachineInstr*> frozenMoves;
+    std::set<const MachineInstr*> worklistMoves;
+    std::set<const MachineInstr*> activeMoves;
+
+    // Data structures
+    InterferenceGraph adjSet;
+    std::map<unsigned, VRegSet> adjList; // Adjacency list
+    std::map<unsigned, int> degree;
+    MoveList moveList;
+    AliasMap alias;
+    ColorMap color_map;
+
+    // Liveness analysis results
+    std::map<const MachineInstr*, VRegSet> live_in_map;
+    std::map<const MachineInstr*, VRegSet> live_out_map;
    
+    // Mappings VReg -> Value* and VReg -> Type*
+    const std::map<unsigned, Value*>& vreg_to_value_map;
+    const std::map<unsigned, Type*>& vreg_type_map;
 };

 } // namespace sysy
--- a/src/include/backend/RISCv64/RISCv64SimpleRegAlloc.h
+++ b/src/include/backend/RISCv64/RISCv64SimpleRegAlloc.h
@ -0,0 +1,107 @@
+#ifndef RISCV64_SIMPLE_REGALLOC_H
+#define RISCV64_SIMPLE_REGALLOC_H
+
+#include "RISCv64LLIR.h"
+#include "RISCv64ISel.h"
+#include <set>
+#include <vector>
+#include <map>
+
+// Declarations of the external debug-level control variables
+extern int DEBUG;
+extern int DEEPDEBUG;
+
+namespace sysy {
+
+class RISCv64AsmPrinter; // Forward declaration
+
+/**
+ * @class RISCv64SimpleRegAlloc
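+ * @brief A simple one-shot graph-coloring register allocator.
+ * The allocator follows a linear, non-iterative flow:
+ * 1. Liveness analysis
+ * 2. Build the interference graph
+ * 3. Greedy graph coloring
+ * 4. Rewrite the function code, inserting spill instructions
+ * It is compatible with the new backend pipeline but keeps the core logic of the old allocator.
+ * Spill handling uses hard-coded physical registers.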
+ */
+class RISCv64SimpleRegAlloc {
+public:
+    RISCv64SimpleRegAlloc(MachineFunction* mfunc);
+
+    /**
+     * @brief Main function that runs register allocation.
+     */
+    void run();
+
+private:
+    using LiveSet = std::set<unsigned>;
+    using InterferenceGraph = std::map<unsigned, LiveSet>;
+
+    // --- Phases of the allocation flow ---
+    void unifyArgumentVRegs();
+    void handleCallingConvention();
+    void analyzeLiveness();
+    void buildInterferenceGraph();
+    void colorGraph();
+    void rewriteFunction();
+
+    // --- Helper functions ---
+
+    /**
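+     * @brief Get the Use/Def sets of an instruction, including physical registers, for liveness analysis.
+     * @param instr The machine instruction.
+     * @param use Output parameter receiving the IDs of the registers used.
+     * @param def Output parameter receiving the IDs of the registers defined.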
+     */
+    void getInstrUseDef_Liveness(const MachineInstr* instr, LiveSet& use, LiveSet& def);
+
+    /**
+     * @brief Return the type kind and size of a vreg based on its type information.
+     * @param vreg Virtual register number.
+     * @return A pair holding the type kind and the size in bytes.
+     */
+    std::pair<Type::Kind, unsigned> getTypeAndSize(unsigned vreg);
+
+    /**
+     * @brief Print a live set for debugging.
+     */
+    void printLiveSet(const LiveSet& s, const std::string& name, std::ostream& os, const RISCv64AsmPrinter& printer);
+
+    /**
+     * @brief Convert a register ID (virtual or physical) into a readable string.
+     */
+    std::string regIdToString(unsigned id, const RISCv64AsmPrinter& printer) const;
+
+    // --- Member variables ---
+    MachineFunction* MFunc;
+    RISCv64ISel* ISel;
+
+    // Pools of allocatable registers
+    std::vector<PhysicalReg> allocable_int_regs;
+    std::vector<PhysicalReg> allocable_fp_regs;
+
+    // Hard-coded physical registers reserved for spilling
+    const PhysicalReg INT_SPILL_REG = PhysicalReg::T2;   // For 32-bit int
+    const PhysicalReg PTR_SPILL_REG = PhysicalReg::T3;   // For 64-bit pointer
+    const PhysicalReg FP_SPILL_REG = PhysicalReg::F4;    // For 32-bit float (ft4)
+
+    // Liveness analysis results
+    std::map<const MachineInstr*, LiveSet> live_in_map;
+    std::map<const MachineInstr*, LiveSet> live_out_map;
+
+    // Interference graph
+    InterferenceGraph interference_graph;
+
+    // Coloring results and the spill list
+    std::map<unsigned, PhysicalReg> color_map;
+    std::set<unsigned> spilled_vregs;
+
+    // Mapping from physical registers to their special virtual IDs in the interference graph
+    std::map<PhysicalReg, unsigned> preg_to_vreg_id_map;
+};
+
+} // namespace sysy
+
+#endif // RISCV64_SIMPLE_REGALLOC_H
--- a/src/include/midend/IR.h
+++ b/src/include/midend/IR.h
@ -20,6 +20,10 @@
 #include <algorithm>

 namespace sysy {
+
+// Global cleanup function to release all statically allocated IR objects
+void cleanupIRPools();
+
 /**
 * \defgroup type Types
 * @brief The SysY type system
@ -83,6 +87,7 @@ class Type {
  auto as() const -> std::enable_if_t<std::is_base_of_v<Type, T>, T *> {
    return dynamic_cast<T *>(const_cast<Type *>(this));
  }
+  virtual void print(std::ostream& os) const;
 };

 class PointerType : public Type {
@ -94,6 +99,9 @@ class PointerType : public Type {

 public:
  static PointerType* get(Type *baseType);  ///< Get the pointer type that points to baseType
+  
+  // Cleanup method to release all cached pointer types (call at program exit)
+  static void cleanup();

 public:
  Type* getBaseType() const { return baseType; }  ///< Get the pointed-to type
@ -111,6 +119,9 @@ class FunctionType : public Type {
 public:
  /// Get the function type whose return type is returnType and whose parameter type list is paramTypes
  static FunctionType* get(Type *returnType, const std::vector<Type *> &paramTypes = {});
+  
+  // Cleanup method to release all cached function types (call at program exit)
+  static void cleanup();

 public:
  Type* getReturnType() const { return returnType; }          ///< Get the return type
@ -123,6 +134,9 @@ class ArrayType : public Type {
  // elementType: the element type of the array (e.g. for int[3], elementType is int)
  // numElements: the size of this dimension (e.g. for int[3], numElements is 3)
  static ArrayType *get(Type *elementType, unsigned numElements);
+  
+  // Cleanup method to release all cached array types (call at program exit)
+  static void cleanup();

  Type *getElementType() const { return elementType; }
  unsigned getNumElements() const { return numElements; }
@ -202,9 +216,11 @@ class Use {

 public:
  unsigned getIndex() const { return index; }   ///< Return the position of value among the User's operands
+  void setIndex(int newIndex) { index = newIndex; }  ///< Set the position of value among the User's operands
  User* getUser() const { return user; }       ///< Return the user
  Value* getValue() const { return value; }    ///< Return the value being used
  void setValue(Value *newValue) { value = newValue; }  ///< Set the value being used to newValue
+  void print(std::ostream& os) const;
 };

 //! The base class of all value types
@ -229,7 +245,15 @@ class Value {
  std::list<std::shared_ptr<Use>>& getUses() { return uses; }   ///< Get the list of use relations
  void addUse(const std::shared_ptr<Use> &use) { uses.push_back(use); }  ///< Add a use relation
  void replaceAllUsesWith(Value *value);  ///< Redirect every user of this value to the given value and update the corresponding use relations
-  void removeUse(const std::shared_ptr<Use> &use) { uses.remove(use); }  ///< Remove the use relation use
+  void removeUse(const std::shared_ptr<Use> &use) { 
+    assert(use != nullptr && "Use cannot be null");
+    assert(use->getValue() == this && "Use being removed does NOT point to this Value!");
+    auto it = std::find(uses.begin(), uses.end(), use);
+    assert(it != uses.end() && "Use not found in Value's uses");
+    uses.remove(use); 
+  }  ///< Remove the use relation use
+  void removeAllUses();
+  virtual void print(std::ostream& os) const = 0;  ///< Print this value to the output stream
 };

 /**
@ -356,6 +380,9 @@ public:

  // Static factory method to get a canonical ConstantValue from the pool
  static ConstantValue* get(Type* type, ConstantValVariant val);
+  
+  // Cleanup method to release all cached constants (call at program exit)
+  static void cleanup();

  // Helper methods to access constant values with appropriate casting
  int getInt() const {
@ -394,6 +421,7 @@ public:

  virtual bool isZero() const = 0;
  virtual bool isOne() const = 0;
+  void print(std::ostream& os) const = 0;
 };

 class ConstantInteger : public ConstantValue {
@ -420,6 +448,7 @@ public:

  bool isZero() const override { return constVal == 0; }
  bool isOne() const override { return constVal == 1; }
+  void print(std::ostream& os) const;
 };

 class ConstantFloating : public ConstantValue {
@ -446,6 +475,7 @@ public:

  bool isZero() const override { return constFVal == 0.0f; }
  bool isOne() const override { return constFVal == 1.0f; }
+  void print(std::ostream& os) const;
 };

 class UndefinedValue : public ConstantValue {
@ -460,6 +490,9 @@ protected:

 public:
  static UndefinedValue* get(Type* type);
+  
+  // Cleanup method to release all cached undefined values (call at program exit)
+  static void cleanup();

  size_t hash() const override {
    return std::hash<Type*>{}(getType());
@ -477,6 +510,7 @@ public:

  bool isZero() const override { return false; }
  bool isOne() const override { return false; }
+  void print(std::ostream& os) const;
 };

 // --- End of refactored ConstantValue and related classes ---
@ -514,12 +548,15 @@ public:
  explicit BasicBlock(Function *parent, const std::string &name = "")
      : Value(Type::getLabelType(), name), parent(parent) {}
  ~BasicBlock() override {
-    for (auto pre : predecessors) {
-      pre->removeSuccessor(this);
-    }
-    for (auto suc : successors) {
-      suc->removePredecessor(this);
-    }
+    // for (auto pre : predecessors) {
+    //   pre->removeSuccessor(this);
+    // }
+    // for (auto suc : successors) {
+    //   suc->removePredecessor(this);
+    // }
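+    // These predecessor/successor links should be cleaned up explicitly by the pass responsible
+    // for CFG optimization (e.g., SCCP's RemoveDeadBlock) when the BasicBlock is removed from its Function.
+    // The destructor only releases resources the BasicBlock itself owns (e.g., its instruction list).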
  }
  
 public:
@ -573,7 +610,9 @@ public:
    if (iter != predecessors.end()) {
      predecessors.erase(iter);
    } else {
-      assert(false);
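+      // The predecessor was not found, possibly because it was already removed or never existed.
+      // This may be an error, or the CFG optimization may already have handled it.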
+      // assert(false && "Predecessor block not found in BasicBlock");
    }
  }
  void removeSuccessor(BasicBlock *block) {
@ -581,7 +620,9 @@ public:
    if (iter != successors.end()) {
      successors.erase(iter);
    } else {
-      assert(false);
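+      // The successor was not found, possibly because it was already removed or never existed.
+      // This may be an error, or the CFG optimization may already have handled it.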
+      // assert(false && "Successor block not found in BasicBlock");
    }
  }
  void replacePredecessor(BasicBlock *oldBlock, BasicBlock *newBlock) {
@ -599,7 +640,7 @@ public:
    prev->addSuccessor(next);
    next->addPredecessor(prev);
  }
-  void removeInst(iterator pos) { instructions.erase(pos); }
+  iterator removeInst(iterator pos) { return instructions.erase(pos); }
  void removeInst(Instruction *inst) {
    auto pos = std::find_if(instructions.begin(), instructions.end(),
                            [inst](const std::unique_ptr<Instruction> &i) { return i.get() == inst; });
@ -610,6 +651,11 @@ public:
    }
  }  ///< Remove the given instruction
  iterator moveInst(iterator sourcePos, iterator targetPos, BasicBlock *block);
+  
+  /// Clean up all use relations in the basic block
+  void cleanup();
+  
+  void print(std::ostream& os) const; 
 };

 //! User is the abstract base type of `Value` types which use other `Value` as
@ -635,11 +681,7 @@ class User : public Value {
    operands.emplace_back(std::make_shared<Use>(operands.size(), this, value));
    value->addUse(operands.back());
  }  ///< Add an operand
-  void removeOperand(unsigned index) {
-    auto value = getOperand(index);
-    value->removeUse(operands[index]);
-    operands.erase(operands.begin() + index);
-  }  ///< Remove an operand
+  void removeOperand(unsigned index);
  template <typename ContainerT>
  void addOperands(const ContainerT &newoperands) {
    for (auto value : newoperands) {
@ -648,6 +690,9 @@ class User : public Value {
  }                                                   ///< Add multiple operands
  void replaceOperand(unsigned index, Value *value);  ///< Replace an operand
  void setOperand(unsigned index, Value *value);      ///< Set an operand
+  
+  /// Clean up the use relations of all of this User's operands
+  void cleanup();
 };

 /*!
@ -682,6 +727,7 @@ class Instruction : public User {
    kFCmpGE = 0x1UL << 20,
    kAnd = 0x1UL << 21,
    kOr = 0x1UL << 22,
+    // kXor = 0x1UL << 46,
    // Unary
    kNeg = 0x1UL << 23,
    kNot = 0x1UL << 24,
@ -695,19 +741,21 @@ class Instruction : public User {
    kCondBr = 0x1UL << 30,
    kBr = 0x1UL << 31,
    kReturn = 0x1UL << 32,
+    kUnreachable = 0x1UL << 33,
    // mem op
-    kAlloca = 0x1UL << 33,
-    kLoad = 0x1UL << 34,
-    kStore = 0x1UL << 35,
-    kGetElementPtr = 0x1UL << 36,
-    kMemset = 0x1UL << 37,
-    // kGetSubArray = 0x1UL << 38,
-    // Constant Kind removed as Constants are now Values, not Instructions.
-    // kConstant = 0x1UL << 37, // Conflicts with kMemset if kept as is
+    kAlloca = 0x1UL << 34,
+    kLoad = 0x1UL << 35,
+    kStore = 0x1UL << 36,
+    kGetElementPtr = 0x1UL << 37,
+    kMemset = 0x1UL << 38,
    // phi
    kPhi = 0x1UL << 39,
    kBitItoF = 0x1UL << 40,
    kBitFtoI = 0x1UL << 41,
+    kSrl = 0x1UL << 42, // logical right shift
+    kSll = 0x1UL << 43, // logical left shift
+    kSra = 0x1UL << 44, // arithmetic right shift
+    kMulh = 0x1UL << 45
  };

 protected:
@ -725,57 +773,57 @@ public:
  std::string getKindString() const{
    switch (kind) {
      case kInvalid:
-        return "Invalid";
+        return "invalid";
      case kAdd:
-        return "Add";
+        return "add";
      case kSub:
-        return "Sub";
+        return "sub";
      case kMul:
-        return "Mul";
+        return "mul";
      case kDiv:
-        return "Div";
+        return "sdiv";
      case kRem:
-        return "Rem";
+        return "srem";
      case kICmpEQ:
-        return "ICmpEQ";
+        return "icmp eq";
      case kICmpNE:
-        return "ICmpNE";
+        return "icmp ne";
      case kICmpLT:
-        return "ICmpLT";
+        return "icmp slt";
      case kICmpGT:
-        return "ICmpGT";
+        return "icmp sgt";
      case kICmpLE:
-        return "ICmpLE";
+        return "icmp sle";
      case kICmpGE:
-        return "ICmpGE";
+        return "icmp sge";
      case kFAdd:
-        return "FAdd";
+        return "fadd";
      case kFSub:
-        return "FSub";
+        return "fsub";
      case kFMul:
-        return "FMul";
+        return "fmul";
      case kFDiv:
-        return "FDiv";
+        return "fdiv";
      case kFCmpEQ:
-        return "FCmpEQ";
+        return "fcmp oeq";
      case kFCmpNE:
-        return "FCmpNE";
+        return "fcmp one";
      case kFCmpLT:
-        return "FCmpLT";
+        return "fcmp olt";
      case kFCmpGT:
-        return "FCmpGT";
+        return "fcmp ogt";
      case kFCmpLE:
-        return "FCmpLE";
+        return "fcmp ole";
      case kFCmpGE:
-        return "FCmpGE";
+        return "fcmp oge";
      case kAnd:
-        return "And";
+        return "and";
      case kOr:
-        return "Or";
+        return "or";
      case kNeg:
-        return "Neg";
+        return "neg";
      case kNot:
-        return "Not";
+        return "not";
      case kFNeg:
        return "FNeg";
      case kFNot:
@ -783,27 +831,41 @@ public:
      case kFtoI:
        return "FtoI";
      case kItoF:
-        return "IToF";
+        return "iToF";
      case kCall:
-        return "Call";
+        return "call";
      case kCondBr:
-        return "CondBr";
+        return "condBr";
      case kBr:
-        return "Br";
+        return "br";
      case kReturn:
-        return "Return";
+        return "return";
+      case kUnreachable:
+        return "unreachable";
      case kAlloca:
-        return "Alloca";
+        return "alloca";
      case kLoad:
-        return "Load";
+        return "load";
      case kStore:
-        return "Store";
+        return "store";
      case kGetElementPtr:
-        return "GetElementPtr";
+        return "getElementPtr";
      case kMemset:
-        return "Memset";
+        return "memset";
      case kPhi:
-        return "Phi";
+        return "phi";
+      case kBitItoF:
+        return "BitItoF";
+      case kBitFtoI:
+        return "BitFtoI";
+      case kSrl:
+        return "lshr";
+      case kSll:
+        return "shl";
+      case kSra:
+        return "ashr";
+      case kMulh:
+        return "mulh";
      default:
        return "Unknown";
    }
@ -815,11 +877,15 @@ public:

  bool isBinary() const {
    static constexpr uint64_t BinaryOpMask =
-        (kAdd | kSub | kMul | kDiv | kRem | kAnd | kOr) |
-        (kICmpEQ | kICmpNE | kICmpLT | kICmpGT | kICmpLE | kICmpGE) |
+        (kAdd | kSub | kMul | kDiv | kRem | kAnd | kOr | kSra | kSrl | kSll | kMulh) |
+        (kICmpEQ | kICmpNE | kICmpLT | kICmpGT | kICmpLE | kICmpGE);
+    return kind & BinaryOpMask;
+  }
+  bool isFPBinary() const {
+    static constexpr uint64_t FPBinaryOpMask =
        (kFAdd | kFSub | kFMul | kFDiv) |
        (kFCmpEQ | kFCmpNE | kFCmpLT | kFCmpGT | kFCmpLE | kFCmpGE);
-    return kind & BinaryOpMask;
+    return kind & FPBinaryOpMask;
  }
  bool isUnary() const {
    static constexpr uint64_t UnaryOpMask = 
@ -832,7 +898,7 @@ public:
    return kind & MemoryOpMask;
  }
  bool isTerminator() const {
-    static constexpr uint64_t TerminatorOpMask = kCondBr | kBr | kReturn;
+    static constexpr uint64_t TerminatorOpMask = kCondBr | kBr | kReturn | kUnreachable;
    return kind & TerminatorOpMask;
  }
  bool isCmp() const {
@ -852,6 +918,7 @@ public:
  }
  bool isUnconditional() const { return kind == kBr; }
  bool isConditional() const { return kind == kCondBr; }
+  bool isCondBr() const { return kind == kCondBr; }
  bool isPhi() const { return kind == kPhi; }
  bool isAlloca() const { return kind == kAlloca; }
  bool isLoad() const { return kind == kLoad; }
@ -860,10 +927,15 @@ public:
  bool isMemset() const { return kind == kMemset; }
  bool isCall() const { return kind == kCall; }
  bool isReturn() const { return kind == kReturn; }
+  bool isUnreachable() const { return kind == kUnreachable; }
  bool isDefine() const {
    static constexpr uint64_t DefineOpMask = kAlloca | kStore | kPhi;
    return (kind & DefineOpMask) != 0U;
  } 
+  
+  virtual ~Instruction() = default;
+  
+  virtual void print(std::ostream& os) const = 0;
 }; // class Instruction

 class Function;
@ -885,38 +957,63 @@ class PhiInst : public Instruction {
          const std::string &name = "")
      : Instruction(Kind::kPhi, type, parent, name), vsize(rhs.size()) {
    assert(rhs.size() == Blocks.size() && "PhiInst: rhs and Blocks must have the same size");
-    for(size_t i = 0; i < rhs.size(); ++i) {
+    for(size_t i = 0; i < vsize; ++i) {
      addOperand(rhs[i]);
+      addOperand(Blocks[i]);
      blk2val[Blocks[i]] = rhs[i];
    }
  }

 public:
-  Value* getValue(unsigned k) const {return getOperand(2 * k);}  ///< Get the value at position k
-  BasicBlock* getBlock(unsigned k) const {return dynamic_cast<BasicBlock*>(getOperand(2 * k + 1));}
-
-  auto& getincomings() const {return blk2val;}  ///< Get all basic blocks and their corresponding values
-
-  Value* getvalfromBlk(BasicBlock* blk);
-  BasicBlock* getBlkfromVal(Value* val);
-
  unsigned getNumIncomingValues() const { return vsize; }  ///< Get the number of incoming values
+  Value *getIncomingValue(unsigned Idx) const { return getOperand(Idx * 2); }  ///< Get the incoming value at the given position
+  BasicBlock *getIncomingBlock(unsigned Idx) const {return dynamic_cast<BasicBlock *>(getOperand(Idx * 2 + 1)); }  ///< Get the incoming basic block at the given position
+
+  Value* getValfromBlk(BasicBlock* block);
+  BasicBlock* getBlkfromVal(Value* value);
+  auto getIncomingValues() const {
+    std::vector<std::pair<BasicBlock*, Value*>> result;
+    for (const auto& [block, value] : blk2val) {
+      result.emplace_back(block, value);
+    }
+    return result;
+  }
  void addIncoming(Value *value, BasicBlock *block) {
-    assert(value && block && "PhiInst: value and block must not be null");
+    assert(value && block && "PhiInst: value and block cannot be null");
    addOperand(value);
    addOperand(block);
    blk2val[block] = value;
    vsize++;
  }  ///< Add an incoming value and its corresponding basic block
-
-  void delValue(Value* val);
-  void delBlk(BasicBlock* blk);
-
-  void replaceBlk(BasicBlock* newBlk, unsigned k);
-  void replaceold2new(BasicBlock* oldBlk, BasicBlock* newBlk);
-  void refreshB2VMap();
-
+  void removeIncoming(unsigned Idx) {
+    assert(Idx < vsize && "PhiInst: Index out of bounds");
+    auto blk = getIncomingBlock(Idx);
+    removeOperand(Idx * 2 + 1);  // Remove block
+    removeOperand(Idx * 2);  // Remove value
+    blk2val.erase(blk);
+    vsize--;
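+  }  ///< Remove the incoming value and its basic block at the given position
+  // Remove the given incoming value or basic block
+  void removeIncomingValue(Value *value);
+  void removeIncomingBlock(BasicBlock *block);
+  // Set the incoming value or basic block at the given position
+  void setIncomingValue(unsigned Idx, Value *value);
+  void setIncomingBlock(unsigned Idx, BasicBlock *block);
+  // Replace the given incoming value or basic block (implemented as remove-then-add), keeping the paired old block or old value
+  void replaceIncomingValue(Value *oldValue, Value *newValue);
+  void replaceIncomingBlock(BasicBlock *oldBlock, BasicBlock *newBlock);
+  // Replace the given incoming value or basic block (implemented as remove-then-add)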
+  void replaceIncomingValue(Value *oldValue, Value *newValue, BasicBlock *newBlock);
+  void replaceIncomingBlock(BasicBlock *oldBlock, BasicBlock *newBlock, Value *newValue);
+  void refreshMap() {
+    blk2val.clear();
+    vsize = getNumOperands() / 2;
+    for (unsigned i = 0; i < vsize; ++i) {
+      blk2val[getIncomingBlock(i)] = getIncomingValue(i);
+    }
+  }  ///< Rebuild the block-to-value map
  auto getValues() { return make_range(std::next(operand_begin()), operand_end()); }
+  void print(std::ostream& os) const override;
 };


@ -925,16 +1022,14 @@ class CallInst : public Instruction {
  friend class IRBuilder;

 protected:
-  CallInst(Function *callee, const std::vector<Value *> &args = {},
-           BasicBlock *parent = nullptr, const std::string &name = "");
-
+  CallInst(Function *callee, const std::vector<Value *> &args, BasicBlock *parent = nullptr, const std::string &name = "");

 public:
-  Function* getCallee() const;
+  Function *getCallee() const;
  auto getArguments() const {
    return make_range(std::next(operand_begin()), operand_end());
  }
-
+  void print(std::ostream& os) const override;
 }; // class CallInst

 //! Unary instruction, includes '!', '-' and type conversion.
@ -952,7 +1047,7 @@ protected:

 public:
  Value* getOperand() const { return User::getOperand(0); }
-
+  void print(std::ostream& os) const override;
 }; // class UnaryInst

 //! Binary instruction, e.g., arithmetic, relation, logic, etc.
@ -1031,6 +1126,7 @@ public:
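        // The backend creates address-computation instructions when handling array memory accesses, so BinaryInst objects must be constructible externally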
        return new BinaryInst(kind, type, lhs, rhs, parent, name);
  }
+  void print(std::ostream& os) const override;
 }; // class BinaryInst

 //! The return statement
@ -1051,6 +1147,7 @@ class ReturnInst : public Instruction {
  Value* getReturnValue() const {
    return hasReturnValue() ? getOperand(0) : nullptr;
  }
+  void print(std::ostream& os) const override;
 };

 //! Unconditional branch
@ -1059,12 +1156,10 @@ class UncondBrInst : public Instruction {
  friend class Function;

 protected:
-  UncondBrInst(BasicBlock *block, std::vector<Value *> args,
+  UncondBrInst(BasicBlock *block,
               BasicBlock *parent = nullptr)
      : Instruction(kBr, Type::getVoidType(), parent, "") {
-    // assert(block->getNumArguments() == args.size());
    addOperand(block);
-    addOperands(args);
  }

 public:
@ -1072,7 +1167,17 @@ public:
  auto getArguments() const {
    return make_range(std::next(operand_begin()), operand_end());
  }
-
+  std::vector<BasicBlock *> getSuccessors() const {
+    std::vector<BasicBlock *> succs;
+    // Assumes the target block of an unconditional branch is its first operand
+    if (getNumOperands() > 0) {
+      if (auto target_bb = dynamic_cast<BasicBlock *>(getOperand(0))) {
+        succs.push_back(target_bb);
+      }
+    }
+    return succs;
+  }
+  void print(std::ostream& os) const override;
 }; // class UncondBrInst

 //! Conditional branch
@ -1083,17 +1188,12 @@ class CondBrInst : public Instruction {
  friend class Function;
  
 protected:
-  CondBrInst(Value *condition, BasicBlock *thenBlock, BasicBlock *elseBlock,
-             const std::vector<Value *> &thenArgs,
-             const std::vector<Value *> &elseArgs, BasicBlock *parent = nullptr)
+  CondBrInst(Value *condition, BasicBlock *thenBlock, BasicBlock *elseBlock, 
+              BasicBlock *parent = nullptr)
      : Instruction(kCondBr, Type::getVoidType(), parent, "") {
-    // assert(thenBlock->getNumArguments() == thenArgs.size() and
-    //        elseBlock->getNumArguments() == elseArgs.size());
    addOperand(condition);
    addOperand(thenBlock);
    addOperand(elseBlock);
-    addOperands(thenArgs);
-    addOperands(elseArgs);
  }
 public:
  Value* getCondition() const { return getOperand(0); }
@ -1103,29 +1203,39 @@ public:
  BasicBlock* getElseBlock() const {
    return dynamic_cast<BasicBlock *>(getOperand(2));
  }
-  // auto getThenArguments() const {
-  //   auto begin = std::next(operand_begin(), 3);
-  //   // auto end = std::next(begin, getThenBlock()->getNumArguments());
-  //   return make_range(begin, end);
-  // }
-  // auto getElseArguments() const {
-  //   auto begin =
-  //       std::next(operand_begin(), 3 + getThenBlock()->getNumArguments());
-  //   auto end = operand_end();
-  //   return make_range(begin, end);
-  // }
-
+  std::vector<BasicBlock *> getSuccessors() const {
+    std::vector<BasicBlock *> succs;
+    // Assumes the true block of a conditional branch is the second operand and the false block is the third
+    // Operand layout: [0] condition value, [1] TrueTargetBlock, [2] FalseTargetBlock
+    if (getNumOperands() > 2) {
+      if (auto true_bb = getThenBlock()) {
+        succs.push_back(true_bb);
+      }
+      if (auto false_bb = getElseBlock()) {
+        succs.push_back(false_bb);
+      }
+    }
+    return succs;
+  }
+  void print(std::ostream& os) const override;
 }; // class CondBrInst

+class UnreachableInst : public Instruction {
+public:
+  // Constructor: sets the instruction kind to kUnreachable
+  explicit UnreachableInst(const std::string& name, BasicBlock *parent = nullptr)
+      : Instruction(kUnreachable, Type::getVoidType(), parent, "") {}
+  void print(std::ostream& os) const { os << "unreachable"; }
+};
+
 //! Allocate memory for stack variables, used for non-global variable declaration
 class AllocaInst : public Instruction {
  friend class IRBuilder;
  friend class Function;
 protected:
-  AllocaInst(Type *type, const std::vector<Value *> &dims = {},
+  AllocaInst(Type *type,
             BasicBlock *parent = nullptr, const std::string &name = "")
      : Instruction(kAlloca, type, parent, name) {
-    addOperands(dims);
  }

 public:
@ -1133,10 +1243,7 @@ public:
  Type* getAllocatedType() const {
    return getType()->as<PointerType>()->getBaseType();
  }  ///< Get the allocated type
-  int getNumDims() const { return getNumOperands(); }
-  auto getDims() const { return getOperands(); }
-  Value* getDim(int index) { return getOperand(index); }
-
+  void print(std::ostream& os) const override;
 }; // class AllocaInst


@ -1174,6 +1281,7 @@ public:
                                    BasicBlock *parent = nullptr, const std::string &name = "") {
    return new GetElementPtrInst(resultType, basePointer, indices, parent, name);
  }
+  void print(std::ostream& os) const override;
 };

 //! Load a value from memory address specified by a pointer value
@ -1182,22 +1290,16 @@ class LoadInst : public Instruction {
  friend class Function;

 protected:
-  LoadInst(Value *pointer, const std::vector<Value *> &indices = {},
+  LoadInst(Value *pointer,
           BasicBlock *parent = nullptr, const std::string &name = "")
      : Instruction(kLoad, pointer->getType()->as<PointerType>()->getBaseType(),
                    parent, name) {
    addOperand(pointer);
-    addOperands(indices);
  }

 public:
-  int getNumIndices() const { return getNumOperands() - 1; }
  Value* getPointer() const { return getOperand(0); }
-  auto getIndices() const {
-    return make_range(std::next(operand_begin()), operand_end());
-  }
-  Value* getIndex(int index) const { return getOperand(index + 1); }
-  
+  void print(std::ostream& os) const override;
 }; // class LoadInst

 //! Store a value to memory address specified by a pointer value
@ -1207,23 +1309,16 @@ class StoreInst : public Instruction {

 protected:
  StoreInst(Value *value, Value *pointer,
-            const std::vector<Value *> &indices = {},
            BasicBlock *parent = nullptr, const std::string &name = "")
      : Instruction(kStore, Type::getVoidType(), parent, name) {
    addOperand(value);
    addOperand(pointer);
-    addOperands(indices);
  }

 public:
-  int getNumIndices() const { return getNumOperands() - 2; }
  Value* getValue() const { return getOperand(0); }
  Value* getPointer() const { return getOperand(1); }
-  auto getIndices() const {
-    return make_range(std::next(operand_begin(), 2), operand_end());
-  }
-  Value* getIndex(int index) const { return getOperand(index + 2); }
-
+  void print(std::ostream& os) const override;
 }; // class StoreInst

 //! Memset instruction
@ -1253,7 +1348,7 @@ public:
  Value* getBegin() const { return getOperand(1); }
  Value* getSize() const { return getOperand(2); }
  Value* getValue() const { return getOperand(3); }
-
+  void print(std::ostream& os) const override;
 };

 class GlobalValue;
@ -1271,6 +1366,11 @@ public:
 public:
  Function* getParent() const { return func; }
  int getIndex() const { return index; }
+  
+  /// Clean up the argument's use relations
+  void cleanup();
+  
+  void print(std::ostream& os) const;
 };


@ -1336,8 +1436,19 @@ protected:
    auto is_same_ptr = [blockToRemove](const std::unique_ptr<BasicBlock> &ptr) { return ptr.get() == blockToRemove; };
    blocks.remove_if(is_same_ptr);
  }
+  BasicBlock* addBasicBlock(const std::string &name, BasicBlock *before) {
+    // Insert a new basic block before the given basic block
+    auto it = std::find_if(blocks.begin(), blocks.end(),
+                           [before](const std::unique_ptr<BasicBlock> &ptr) { return ptr.get() == before; });
+    if (it != blocks.end()) {
+      auto newblk = blocks.emplace(it, std::make_unique<BasicBlock>(this, name));
+      return newblk->get();  // Return a pointer to the newly added basic block
+    }
+    assert(false && "BasicBlock to insert before not found!"); 
+    return nullptr;  // Return nullptr if the given basic block was not found
+  }  ///< Add a new basic block before a given basic block
  BasicBlock* addBasicBlock(const std::string &name = "") {
-    blocks.emplace_back(new BasicBlock(this, name));
+    blocks.emplace_back(std::make_unique<BasicBlock>(this, name));
    return blocks.back().get();
  }
  BasicBlock* addBasicBlock(BasicBlock *block) {
@ -1348,6 +1459,11 @@ protected:
    blocks.emplace_front(block);
    return block;
  }
+  
+  /// Clean up all use relations in the function
+  void cleanup();
+  
+  void print(std::ostream& os) const;
 };

 //! Global value declared at file scope
@ -1361,20 +1477,18 @@ protected:

 protected:
  GlobalValue(Module *parent, Type *type, const std::string &name,
-              const std::vector<Value *> &dims = {}, 
              ValueCounter init = {})
      : Value(type, name), parent(parent) {
    assert(type->isPointer());
-    // addOperands(dims); 
    // Dimension information is already recorded in Type; dims only existed to simplify initialization
-    numDims = dims.size();
+    numDims = 0;
    if (init.size() == 0) {
      unsigned num = 1;
-      for (unsigned i = 0; i < numDims; i++) {
-        // Assume dims elements are ConstantInteger and cast appropriately
-        auto dim_val = dynamic_cast<ConstantInteger*>(dims[i]);
-        assert(dim_val && "GlobalValue dims must be constant integers");
-        num *= dim_val->getInt();
+      auto arrayType = type->as<PointerType>()->getBaseType()->as<ArrayType>();
+      while (arrayType) {
+        numDims++;
+        num *= arrayType->getNumElements();
+        arrayType = arrayType->getElementType()->as<ArrayType>();
      }
      if (dynamic_cast<PointerType *>(type)->getBaseType() == Type::getFloatType()) {
        init.push_back(ConstantFloating::get(0.0F), num); // Use new constant factory
@ -1386,9 +1500,6 @@ protected:
  }

 public:
-  // unsigned getNumDims() const  { return numDims; }                     ///< Get the number of dimensions
-  // Value* getDim(unsigned index) const { return getOperand(index); }  ///< Get the dimension at position index
-  // auto getDims() const { return getOperands(); }                              ///< Get the list of dimensions
  unsigned getNumIndices() const {
    return numDims;
  } ///< Get the number of dimensions
@ -1418,6 +1529,7 @@ public:
    return getByIndex(index);
  }  ///< Get the initial value addressed by the multi-dimensional index indices
  const ValueCounter& getInitValues() const { return initValues; } 
+  void print(std::ostream& os) const;
 }; // class GlobalValue


@ -1430,13 +1542,19 @@ class ConstantVariable : public Value {
  ValueCounter initValues;  ///< Values

 protected:
-  ConstantVariable(Module *parent, Type *type, const std::string &name, const ValueCounter &init,
-                   const std::vector<Value *> &dims = {})
+  ConstantVariable(Module *parent, Type *type, const std::string &name, const ValueCounter &init)
      : Value(type, name), parent(parent) {
    assert(type->isPointer());
-    numDims = dims.size();
+    // numDims = dims.size();
+    numDims = 0;
+    if(type->as<PointerType>()->getBaseType()->isArray()) {
+      auto arrayType = type->as<PointerType>()->getBaseType()->as<ArrayType>();
+      while (arrayType) {
+        numDims++;
+        arrayType = arrayType->getElementType()->as<ArrayType>();
+      }
+    }
    initValues = init;
-    // addOperands(dims); As with GlobalValue, dimension information is already recorded in Type; dims only exists to simplify initialization
  }

 public:
@ -1468,10 +1586,9 @@ class ConstantVariable : public Value {

    return getByIndex(index);
  }                                                        ///< Get the initial value addressed by the multi-dimensional index indices
-  // unsigned getNumDims() const { return numDims; }  ///< Get the number of dimensions
-  // Value* getDim(unsigned index) const { return getOperand(index); }  ///< Get the dimension at position index
-  // auto getDims() const { return getOperands(); }                              ///< Get the list of dimensions
  const ValueCounter& getInitValues() const { return initValues; }                           ///< Get the initial values
+  void print(std::ostream& os) const;
+  void print_init(std::ostream& os) const;
 };

 using SymbolTableNode = struct SymbolTableNode {
@ -1494,6 +1611,8 @@ class SymbolTable {

  Value* getVariable(const std::string &name) const;  ///< Look up a variable by name in the current scope
  Value* addVariable(const std::string &name, Value *variable);               ///< Add a variable
+  void registerParameterName(const std::string &name);                        ///< Register a function parameter name so allocas do not reuse it
+  void addVariableDirectly(const std::string &name, Value *variable);        ///< Add a variable directly to the current scope without renaming
  std::vector<std::unique_ptr<GlobalValue>>& getGlobals();                  ///< Get the list of global variables
  const std::vector<std::unique_ptr<ConstantVariable>>& getConsts() const;  ///< Get the list of global constants
  void enterNewScope();                                                              ///< Enter a new scope
@ -1501,6 +1620,9 @@ class SymbolTable {
  bool isInGlobalScope() const;                                              ///< Whether the current scope is the global scope
  void enterGlobalScope();                                                           ///< Enter the global scope
  bool isCurNodeNull() { return curNode == nullptr; }
+  
+  /// Clean up everything in the symbol table
+  void cleanup();
 };

 //! IR unit for representing a SysY compile unit
@ -1529,13 +1651,12 @@ class Module {
    return result.first->second.get();
  }  ///< Create an external function
  ///< Creating a variable also updates the symbol table
-  GlobalValue* createGlobalValue(const std::string &name, Type *type, const std::vector<Value *> &dims = {},
-                         const ValueCounter &init = {}) {
+  GlobalValue* createGlobalValue(const std::string &name, Type *type, const ValueCounter &init = {}) {
    bool isFinished = variableTable.isCurNodeNull();
    if (isFinished) {
      variableTable.enterGlobalScope();
    }
-    auto result = variableTable.addVariable(name, new GlobalValue(this, type, name, dims, init));
+    auto result = variableTable.addVariable(name, new GlobalValue(this, type, name, init));
    if (isFinished) {
      variableTable.leaveScope();
    }
@ -1544,9 +1665,8 @@ class Module {
    }
    return dynamic_cast<GlobalValue *>(result);
  }  ///< Create a global variable
-  ConstantVariable* createConstVar(const std::string &name, Type *type, const ValueCounter &init,
-                      const std::vector<Value *> &dims = {}) {
-    auto result = variableTable.addVariable(name, new ConstantVariable(this, type, name, init, dims));
+  ConstantVariable* createConstVar(const std::string &name, Type *type, const ValueCounter &init) {
+    auto result = variableTable.addVariable(name, new ConstantVariable(this, type, name, init));
    if (result == nullptr) {
      return nullptr;
    }
@ -1555,6 +1675,12 @@ class Module {
  void addVariable(const std::string &name, AllocaInst *variable) {
    variableTable.addVariable(name, variable);
  }  ///< Add a variable
+  void addVariableDirectly(const std::string &name, AllocaInst *variable) {
+    variableTable.addVariableDirectly(name, variable);
+  }  ///< Add a variable directly to the current scope without renaming
+  void registerParameterName(const std::string &name) {
+    variableTable.registerParameterName(name);
+  }  ///< Register a function parameter name so allocas do not reuse it
  Value* getVariable(const std::string &name) {
    return variableTable.getVariable(name);
  }  ///< Look up a variable by name in the current scope
@ -1567,7 +1693,7 @@ class Module {
  }  ///< Get a function
  Function* getExternalFunction(const std::string &name) const {
    auto result = externalFunctions.find(name);
-    if (result == functions.end()) {
+    if (result == externalFunctions.end()) {
      return nullptr;
    }
    return result->second.get();
@ -1587,6 +1713,11 @@ class Module {
  void leaveScope() { variableTable.leaveScope(); }  ///< Leave the current scope

  bool isInGlobalArea() const { return variableTable.isInGlobalScope(); }  ///< Whether the current scope is the global scope
+
+  /// Clean up all objects in the module, including functions, basic blocks, and instructions
+  void cleanup();
+
+  void print(std::ostream& os) const;
 };

 /*!
--- a/src/include/midend/IRBuilder.h
+++ b/src/include/midend/IRBuilder.h
@ -217,6 +217,18 @@ class IRBuilder {
  BinaryInst * createOrInst(Value *lhs, Value *rhs, const std::string &name = "") {
    return createBinaryInst(Instruction::kOr, Type::getIntType(), lhs, rhs, name);
  }  ///< Create a bitwise OR instruction
+  BinaryInst * createSllInst(Value *lhs, Value *rhs, const std::string &name = "") {
+    return createBinaryInst(Instruction::kSll, Type::getIntType(), lhs, rhs, name);
+  }  ///< Create a logical left shift instruction
+  BinaryInst * createSrlInst(Value *lhs, Value *rhs, const std::string &name = "") {
+    return createBinaryInst(Instruction::kSrl, Type::getIntType(), lhs, rhs, name);
+  }  ///< Create a logical right shift instruction
+  BinaryInst * createSraInst(Value *lhs, Value *rhs, const std::string &name = "") {
+    return createBinaryInst(Instruction::kSra, Type::getIntType(), lhs, rhs, name);
+  }  ///< Create an arithmetic right shift instruction
+  BinaryInst * createMulhInst(Value *lhs, Value *rhs, const std::string &name = "") {
+    return createBinaryInst(Instruction::kMulh, Type::getIntType(), lhs, rhs, name);
+  }  ///< 创建高位乘法指令
  CallInst * createCallInst(Function *callee, const std::vector<Value *> &args, const std::string &name = "") {
    std::string newName;
    if (name.empty() && callee->getReturnType() != Type::getVoidType()) {
@ -239,31 +251,30 @@ class IRBuilder {
    block->getInstructions().emplace(position, inst);
    return inst;
  }  ///< Create a return instruction
-  UncondBrInst * createUncondBrInst(BasicBlock *thenBlock, const std::vector<Value *> &args) {
-    auto inst = new UncondBrInst(thenBlock, args, block);
+  UncondBrInst * createUncondBrInst(BasicBlock *thenBlock) {
+    auto inst = new UncondBrInst(thenBlock, block);
    assert(inst);
    block->getInstructions().emplace(position, inst);
    return inst;
  }  ///< Create an unconditional branch instruction
-  CondBrInst * createCondBrInst(Value *condition, BasicBlock *thenBlock, BasicBlock *elseBlock,
-                        const std::vector<Value *> &thenArgs, const std::vector<Value *> &elseArgs) {
-    auto inst = new CondBrInst(condition, thenBlock, elseBlock, thenArgs, elseArgs, block);
+  CondBrInst * createCondBrInst(Value *condition, BasicBlock *thenBlock, BasicBlock *elseBlock) {
+    auto inst = new CondBrInst(condition, thenBlock, elseBlock, block);
    assert(inst);
    block->getInstructions().emplace(position, inst);
    return inst;
  }  ///< Create a conditional branch instruction
-  AllocaInst * createAllocaInst(Type *type, const std::vector<Value *> &dims = {}, const std::string &name = "") {
-    auto inst = new AllocaInst(type, dims, block, name);
+  UnreachableInst * createUnreachableInst(const std::string &name = "") {
+    auto inst = new UnreachableInst(name, block);
+    assert(inst);
+    block->getInstructions().emplace(position, inst);
+    return inst;
+  }  ///< Create an unreachable instruction
+  AllocaInst * createAllocaInst(Type *type, const std::string &name = "") {
+    auto inst = new AllocaInst(type, block, name);
    assert(inst);
    block->getInstructions().emplace(position, inst);
    return inst;
  }  ///< Create an alloca instruction
-  AllocaInst * createAllocaInstWithoutInsert(Type *type, const std::vector<Value *> &dims = {}, BasicBlock *parent = nullptr,
-                                     const std::string &name = "") {
-    auto inst = new AllocaInst(type, dims, parent, name);
-    assert(inst);
-    return inst;
-  }  ///< Create an alloca instruction without inserting it into the instruction list [used only for phi instructions]
  LoadInst * createLoadInst(Value *pointer, const std::vector<Value *> &indices = {}, const std::string &name = "") {
    std::string newName;
    if (name.empty()) {
@ -275,7 +286,7 @@ class IRBuilder {
      newName = name;
    }

-    auto inst = new LoadInst(pointer, indices, block, newName);
+    auto inst = new LoadInst(pointer, block, newName);
    assert(inst);
    block->getInstructions().emplace(position, inst);
    return inst;
@ -286,9 +297,8 @@ class IRBuilder {
    block->getInstructions().emplace(position, inst);
    return inst;
  }  ///< Create a memset instruction
-  StoreInst * createStoreInst(Value *value, Value *pointer, const std::vector<Value *> &indices = {},
-                       const std::string &name = "") {
-    auto inst = new StoreInst(value, pointer, indices, block, name);
+  StoreInst * createStoreInst(Value *value, Value *pointer, const std::string &name = "") {
+    auto inst = new StoreInst(value, pointer, block, name);
    assert(inst);
    block->getInstructions().emplace(position, inst);
    return inst;
@ -308,24 +318,6 @@ class IRBuilder {
    block->getInstructions().emplace(block->begin(), inst);
    return inst;
  }  ///< Create a phi instruction
-  // GetElementPtrInst* createGetElementPtrInst(Value *basePointer,
-  //                                const std::vector<Value *> &indices = {},
-  //                                const std::string &name = "") {
-  //   std::string newName;
-  //   if (name.empty()) {
-  //     std::stringstream ss;
-  //     ss << tmpIndex;
-  //     newName = ss.str();
-  //     tmpIndex++;
-  //   } else {
-  //     newName = name;
-  //   }
-
-  //   auto inst = new GetElementPtrInst(basePointer, indices, block, newName);
-  //   assert(inst);
-  //   block->getInstructions().emplace(position, inst);
-  //   return inst;
-  // }
  /**
     * @brief Create a GEP instruction following the LLVM design pattern.
     * The return type is inferred automatically and need not be specified by hand.
@ -364,38 +356,31 @@ class IRBuilder {
    Type *currentWalkType = pointerType->as<PointerType>()->getBaseType();

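    // Walk through all indices to descend into the type hierarchy.
-    // The `indices` vector contains every GEP index, including the initial `0` index added by functions such as `visitLValue`.
+    // Important: the first index always "dereferences" the base pointer; only the later indices select array/struct elements.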
    for (int i = 0; i < indices.size(); ++i) {
-        if (currentWalkType->isArray()) {
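-            // Case 1: the current walk type is an `ArrayType`.
-            // The index selects an array element, and `currentWalkType` becomes the array's element type.
-            currentWalkType = currentWalkType->as<ArrayType>()->getElementType();
-        } else if (currentWalkType->isPointer()) {
-            // Case 2: the current walk type is a `PointerType`.
-            // This means we are accessing the memory it points to through a pointer.
-            // The index selects an element of the "array" that the pointer points to.
-            // `currentWalkType` becomes the base type that the pointer points to.
-            // For example, if `currentWalkType` is `i32*`, it becomes `i32`.
-            // If `currentWalkType` is `[10 x i32]*`, it becomes `[10 x i32]`.
-            currentWalkType = currentWalkType->as<PointerType>()->getBaseType();
+        if (i == 0) {
+            // First index: it always "dereferences" the base pointer and leaves currentWalkType unchanged.
+            // For example, in `[4 x i32]* ptr, i32 0` the first 0 only says "access the object ptr points to",
+            // so currentWalkType stays `[4 x i32]`.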
+            continue;
        } else {
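-            // Case 3: the current walk type is a scalar (a non-aggregate, non-pointer type such as `i32` or `float`).
-            //
-            // If `currentWalkType` is a scalar and the current index `i` is **not** the last index in `indices`,
-            // the code is trying to index structurally into a scalar, which is **invalid**.
-            // For example, in the GEP chain for `int x; x[0];`, `x` has type `i32`, so adding a `[0]` index is an error.
-            //
-            // If `currentWalkType` is a scalar and this is the **last index** (`i == indices.size() - 1`),
-            // the GEP is legal: it only computes an address offset, and the final type is that scalar type.
-            // `currentWalkType` then stays unchanged and the loop ends.
-            if (i < indices.size() - 1) {
-                assert(false && "Invalid GEP indexing: attempting to index into a non-aggregate/non-pointer type with further indices.");
-                return nullptr; // Return a null pointer to signal that type inference failed
+            // Later indices: used for the actual array/struct indexing
+            if (currentWalkType->isArray()) {
+                // Array index: select an element of the array
+                currentWalkType = currentWalkType->as<ArrayType>()->getElementType();
+            } else if (currentWalkType->isPointer()) {
+                // Pointer index: dereference the pointer and continue
+                currentWalkType = currentWalkType->as<PointerType>()->getBaseType();
+            } else {
+                // Scalar type: cannot be indexed any further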
+                if (i < indices.size() - 1) { 
+                    assert(false && "Invalid GEP indexing: attempting to index into a non-aggregate/non-pointer type with further indices.");
+                    return nullptr;
+                }
            }
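-            // If this is the last index and the current type is a scalar, the type stays unchanged, which is legal.
-            // The loop ends naturally and the correct `currentWalkType` is returned.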
        }
    }
+    
    // Once every index has been processed, `currentWalkType` is the type of the element that the address computed by the GEP instruction points to.
    return currentWalkType;
  }
--- a/src/include/midend/Pass/Analysis/AliasAnalysis.h
+++ b/src/include/midend/Pass/Analysis/AliasAnalysis.h
@ -0,0 +1,246 @@
+#pragma once
+
+#include "IR.h"
+#include "Pass.h"
+#include <map>
+#include <set>
+#include <vector>
+#include <memory>
+
+namespace sysy {
+
+// Forward declarations
+struct MemoryLocation;  // declared with `struct` to match its definition below
+class AliasAnalysisResult;
+
+/**
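+ * @brief Kinds of alias relationship,
+ * ordered by increasing risk level.
+ */
+enum class AliasType {
+  NO_ALIAS = 0,        // Definitely no alias (distinct local arrays)
+  SELF_ALIAS = 1,      // Self alias (different indices into the same array)
+  POSSIBLE_ALIAS = 2,  // May alias (function-parameter arrays)
+  UNKNOWN_ALIAS = 3    // Unknown (conservative estimate)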
+};
+
+/**
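+ * @brief Memory location information.
+ * Describes the basic facts about one memory access.
+ */
+struct MemoryLocation {
+  Value* basePointer;              // Base pointer (the real base address after stripping GEPs)
+  Value* accessPointer;            // Access pointer (carries the index information)
+
+  // Classification
+  bool isLocalArray;               // Whether this is a local array
+  bool isFunctionParameter;        // Whether this is a function parameter
+  bool isGlobalArray;              // Whether this is a global array
+
+  // Index information
+  std::vector<Value*> indices;     // List of GEP indices
+  bool hasConstantIndices;         // Whether the indices are constants
+  bool hasLoopVariableIndex;       // Whether any index is a loop variable
+  int constantOffset;              // Constant offset (valid only when every index is constant)
+
+  // Access pattern
+  bool hasReads;                   // Whether there are reads
+  bool hasWrites;                  // Whether there are writes
+  std::vector<Instruction*> accessInsts; // All instructions that access this location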
+  
+  MemoryLocation(Value* base, Value* access) 
+    : basePointer(base), accessPointer(access), 
+      isLocalArray(false), isFunctionParameter(false), isGlobalArray(false),
+      hasConstantIndices(false), hasLoopVariableIndex(false), constantOffset(0),
+      hasReads(false), hasWrites(false) {}
+};
+
+/**
+ * @brief Alias analysis result.
+ * Stores the complete alias analysis information for one function.
+ */
+class AliasAnalysisResult : public AnalysisResultBase {
+public:
+  AliasAnalysisResult(Function *F) : AssociatedFunction(F) {}
+  ~AliasAnalysisResult() override = default;
+
+  // ========== Basic query interface ==========
+
+  /**
+   * Query the alias relationship between two pointers
+   */
+  AliasType queryAlias(Value* ptr1, Value* ptr2) const;
+
+  /**
+   * Query the memory location information for a pointer
+   */
+  const MemoryLocation* getMemoryLocation(Value* ptr) const;
+
+  /**
+   * Get all memory locations
+   */
+  const std::map<Value*, std::unique_ptr<MemoryLocation>>& getAllMemoryLocations() const {
+    return LocationMap;
+  }
+
+  // ========== Advanced query interface ==========
+
+  /**
+   * Check whether the pointer refers to a local array
+   */
+  bool isLocalArray(Value* ptr) const;
+
+  /**
+   * Check whether the pointer refers to a function-parameter array
+   */
+  bool isFunctionParameter(Value* ptr) const;
+
+  /**
+   * Check whether the pointer refers to a global array
+   */
+  bool isGlobalArray(Value* ptr) const;
+
+  /**
+   * Check whether the pointer is accessed with constant indices
+   */
+  bool hasConstantAccess(Value* ptr) const;
+
+  // ========== Statistics interface ==========
+
+  /**
+   * Get statistics for each kind of alias relationship
+   */
+  struct Statistics {
+    int totalQueries;
+    int noAlias;
+    int selfAlias;
+    int possibleAlias;
+    int unknownAlias;
+    int localArrays;
+    int functionParameters;
+    int globalArrays;
+    int constantAccesses;
+  };
+  
+  Statistics getStatistics() const;
+  
+  /**
+   * Print the alias analysis result (for debugging)
+   */
+  void print() const;
+  void printStatics() const;
+  // ========== Internal methods ==========
+
+  void addMemoryLocation(std::unique_ptr<MemoryLocation> location);
+  void addAliasRelation(Value* ptr1, Value* ptr2, AliasType type);
+
+  // ========== Public data members (for use by the pass) ==========
+  std::map<Value*, std::unique_ptr<MemoryLocation>> LocationMap;  // Memory location map
+  std::map<std::pair<Value*, Value*>, AliasType> AliasMap;        // Cache of alias relations
+
+private:
+  Function *AssociatedFunction;                                    // Associated function
+
+  // Storage by category
+  std::vector<Argument*> ArrayParameters;                         // Array parameters
+  std::vector<AllocaInst*> LocalArrays;                          // Local arrays
+  std::set<GlobalValue*> AccessedGlobals;                        // Global variables that are accessed
+};
+
+/**
+ * @brief Alias analysis pass specialized for the SysY language.
+ * An alias analysis implementation tuned to SysY language features.
+ */
+class SysYAliasAnalysisPass : public AnalysisPass {
+public:
+  // Unique pass ID
+  static void *ID;
+  // The aggressive analysis strategy can be switched on here
+  SysYAliasAnalysisPass() : AnalysisPass("SysYAliasAnalysis", Pass::Granularity::Function),
+                           aggressiveParameterMode(false), parameterOptimizationEnabled(false) {}
+
+  // Implements getPassID
+  void *getPassID() const override { return &ID; }
+
+  // Core entry point
+  bool runOnFunction(Function *F, AnalysisManager &AM) override;
+
+  // Retrieve the analysis result
+  std::unique_ptr<AnalysisResultBase> getResult() override { return std::move(CurrentResult); }
+  
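+  // ========== Configuration interface ==========
+
+  /**
+   * Enable the aggressive optimization mode aimed at the SysY evaluation suite.
+   * In this mode, distinct parameters are assumed never to receive the same array.
+   */
+  void enableSysYTestingMode() {
+    aggressiveParameterMode = true;
+    parameterOptimizationEnabled = true;
+  }
+
+  /**
+   * Use the conservative default mode (suitable for general use)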
+   */
+  void useConservativeMode() {
+    aggressiveParameterMode = false;
+    parameterOptimizationEnabled = false;
+  }
+
+private:
+  std::unique_ptr<AliasAnalysisResult> CurrentResult; // Analysis result for the current function
+
+  // ========== Main analysis flow ==========
+
+  void collectMemoryAccesses(Function* F);              // Collect memory accesses
+  void buildAliasRelations(Function* F);                // Build alias relations
+  void optimizeForSysY(Function* F);                    // SysY-specific optimization
+
+  // ========== Memory location analysis ==========
+
+  std::unique_ptr<MemoryLocation> createMemoryLocation(Value* ptr);
+  Value* getBasePointer(Value* ptr);                    // Get the base pointer
+  void analyzeMemoryType(MemoryLocation* location);     // Analyze the kind of memory
+  void analyzeIndexPattern(MemoryLocation* location);   // Analyze the index pattern
+  
+  // ========== Alias relation inference ==========
+  
+  AliasType analyzeAliasBetween(MemoryLocation* loc1, MemoryLocation* loc2);
+  AliasType compareIndices(MemoryLocation* loc1, MemoryLocation* loc2);
+  AliasType compareLocalArrays(MemoryLocation* loc1, MemoryLocation* loc2);
+  AliasType compareParameters(MemoryLocation* loc1, MemoryLocation* loc2);
+  AliasType compareWithGlobal(MemoryLocation* loc1, MemoryLocation* loc2);
+  AliasType compareMixedTypes(MemoryLocation* loc1, MemoryLocation* loc2);
+  
+  // ========== SysY-specific optimization ==========
+
+  void applySysYConstraints(Function* F);               // Apply SysY language constraints
+  void optimizeParameterAnalysis(Function* F);          // Refine the parameter analysis
+  void optimizeArrayAccessAnalysis(Function* F);        // Refine the array access analysis
+
+  // ========== Configuration and strategy control ==========
+  
+  bool useAggressiveParameterAnalysis() const { return aggressiveParameterMode; }
+  bool enableParameterOptimization() const { return parameterOptimizationEnabled; }
+  void setAggressiveParameterMode(bool enable) { aggressiveParameterMode = enable; }
+  void setParameterOptimizationEnabled(bool enable) { parameterOptimizationEnabled = enable; }
+  
+  // ========== Auxiliary optimization methods ==========
+
+  void optimizeConstantIndexAccesses();                 // Optimize constant-index accesses
+  void optimizeSequentialAccesses();                   // Optimize sequential accesses
+
+  // ========== Helper methods ==========
+
+  bool isConstantValue(Value* val);                     // Whether the value is a constant
+  bool hasLoopVariableInIndices(const std::vector<Value*>& indices, Function* F);
+  int calculateConstantOffset(const std::vector<Value*>& indices);
+  void printStatistics() const;                         // Print statistics
+
+private:
+  // ========== Configuration options ==========
+  bool aggressiveParameterMode = false;                // Aggressive parameter alias analysis mode
+  bool parameterOptimizationEnabled = false;           // Enable parameter optimization
+};
+
+} // namespace sysy
--- a/src/include/midend/Pass/Analysis/CallGraphAnalysis.h
+++ b/src/include/midend/Pass/Analysis/CallGraphAnalysis.h
@ -0,0 +1,242 @@
+#pragma once
+
+#include "IR.h"
+#include "Pass.h"
+#include <map>
+#include <set>
+#include <vector>
+#include <memory>
+#include <algorithm>
+#include <unordered_set>
+
+namespace sysy {
+
+// Forward declarations
+class CallGraphAnalysisResult;
+
+/**
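+ * @brief Call graph node information.
+ * Stores what the call graph knows about a single function.
+ */
+struct CallGraphNode {
+    Function* function;                    // Associated function
+    std::set<Function*> callers;           // Functions that call this function
+    std::set<Function*> callees;           // Functions that this function calls
+
+    // Recursion information
+    bool isRecursive;                      // Whether the function takes part in recursion
+    bool isSelfRecursive;                  // Whether the function is self-recursive
+    int recursiveDepth;                    // Recursion depth (-1 means unbounded recursion)
+
+    // Call statistics
+    size_t totalCallers;                   // Total number of callers
+    size_t totalCallees;                   // Total number of callees
+    size_t callSiteCount;                  // Total number of call sites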
+    
+    CallGraphNode(Function* f) : function(f), isRecursive(false), 
+        isSelfRecursive(false), recursiveDepth(0), totalCallers(0), 
+        totalCallees(0), callSiteCount(0) {}
+};
+
+/**
+ * @brief Call graph analysis result.
+ * Holds the call graph for the whole module together with its query interface.
+ */
+class CallGraphAnalysisResult : public AnalysisResultBase {
+public:
+    CallGraphAnalysisResult(Module* M) : AssociatedModule(M) {}
+    ~CallGraphAnalysisResult() override = default;
+
+    // ========== Basic query interface ==========
+    
+    /**
+     * Get the call graph node for a function
+     */
+    const CallGraphNode* getNode(Function* F) const {
+        auto it = nodes.find(F);
+        return (it != nodes.end()) ? it->second.get() : nullptr;
+    }
+    
+    /**
+     * Get the call graph node for a function (non-const version)
+     */
+    CallGraphNode* getMutableNode(Function* F) {
+        auto it = nodes.find(F);
+        return (it != nodes.end()) ? it->second.get() : nullptr;
+    }
+    
+    /**
+     * Get all function nodes
+     */
+    const std::map<Function*, std::unique_ptr<CallGraphNode>>& getAllNodes() const {
+        return nodes;
+    }
+    
+    /**
+     * Check whether a function is present in the call graph
+     */
+    bool hasFunction(Function* F) const {
+        return nodes.find(F) != nodes.end();
+    }
+
+    // ========== Call relation queries ==========
+    
+    /**
+     * Check whether there is a call from caller to callee
+     */
+    bool hasCallEdge(Function* caller, Function* callee) const {
+        auto node = getNode(caller);
+        return node && node->callees.count(callee) > 0;
+    }
+    
+    /**
+     * Get all callers of a function
+     */
+    std::vector<Function*> getCallers(Function* F) const {
+        auto node = getNode(F);
+        if (!node) return {};
+        return std::vector<Function*>(node->callers.begin(), node->callers.end());
+    }
+    
+    /**
+     * Get all callees of a function
+     */
+    std::vector<Function*> getCallees(Function* F) const {
+        auto node = getNode(F);
+        if (!node) return {};
+        return std::vector<Function*>(node->callees.begin(), node->callees.end());
+    }
+
+    // ========== Recursion analysis queries ==========
+    
+    /**
+     * Check whether a function takes part in recursion
+     */
+    bool isRecursive(Function* F) const {
+        auto node = getNode(F);
+        return node && node->isRecursive;
+    }
+    
+    /**
+     * Check whether a function is self-recursive
+     */
+    bool isSelfRecursive(Function* F) const {
+        auto node = getNode(F);
+        return node && node->isSelfRecursive;
+    }
+    
+    /**
+     * Get the recursion depth
+     */
+    int getRecursiveDepth(Function* F) const {
+        auto node = getNode(F);
+        return node ? node->recursiveDepth : 0;
+    }
+
+    // ========== Topological order and SCCs ==========
+    
+    /**
+     * Get the topological order of the functions.
+     * Callees are guaranteed to come before their callers.
+     */
+    const std::vector<Function*>& getTopologicalOrder() const {
+        return topologicalOrder;
+    }
+    
+    /**
+     * Get the list of strongly connected components.
+     * Each SCC represents one group of recursive functions.
+     */
+    const std::vector<std::vector<Function*>>& getStronglyConnectedComponents() const {
+        return sccs;
+    }
+    
+    /**
+     * Get the index of the SCC that contains a function
+     */
+    int getSCCIndex(Function* F) const {
+        auto it = functionToSCC.find(F);
+        return (it != functionToSCC.end()) ? it->second : -1;
+    }
+
+    // ========== Statistics ==========
+    
+    struct Statistics {
+        size_t totalFunctions;
+        size_t totalCallEdges;
+        size_t recursiveFunctions;
+        size_t selfRecursiveFunctions;
+        size_t stronglyConnectedComponents;
+        size_t maxSCCSize;
+        double avgCallersPerFunction;
+        double avgCalleesPerFunction;
+    };
+    
+    Statistics getStatistics() const;
+    
+    /**
+     * Print the call graph analysis result
+     */
+    void print() const;
+
+    // ========== Internal construction interface ==========
+    
+    void addNode(Function* F);
+    void addCallEdge(Function* caller, Function* callee);
+    void computeTopologicalOrder();
+    void computeStronglyConnectedComponents();
+    void analyzeRecursion();
+
+private:
+    Module* AssociatedModule;                                      // Associated module
+    std::map<Function*, std::unique_ptr<CallGraphNode>> nodes;     // Call graph nodes
+    std::vector<Function*> topologicalOrder;                      // Topological order
+    std::vector<std::vector<Function*>> sccs;                     // Strongly connected components
+    std::map<Function*, int> functionToSCC;                       // Map from function to SCC index
+
+    // Internal helper methods
+    void dfsTopological(Function* F, std::unordered_set<Function*>& visited, 
+                       std::vector<Function*>& result);
+    void tarjanSCC();
+    void tarjanDFS(Function* F, int& index, std::vector<int>& indices, 
+                  std::vector<int>& lowlinks, std::vector<Function*>& stack, 
+                  std::unordered_set<Function*>& onStack);
+};
+
+/**
+ * @brief SysY call graph analysis pass.
+ * A module-level analysis pass that builds the function call graph for the whole module.
+ */
+class CallGraphAnalysisPass : public AnalysisPass {
+public:
+    // Unique pass ID
+    static void* ID;
+
+    CallGraphAnalysisPass() : AnalysisPass("CallGraphAnalysis", Pass::Granularity::Module) {}
+
+    // Implements getPassID
+    void* getPassID() const override { return &ID; }
+
+    // Core entry point
+    bool runOnModule(Module* M, AnalysisManager& AM) override;
+
+    // Retrieve the analysis result
+    std::unique_ptr<AnalysisResultBase> getResult() override { return std::move(CurrentResult); }
+
+private:
+    std::unique_ptr<CallGraphAnalysisResult> CurrentResult;  // Analysis result for the current module
+
+    // ========== Main analysis flow ==========
+
+    void buildCallGraph(Module* M);                          // Build the call graph
+    void scanFunctionCalls(Function* F);                     // Scan the calls made by a function
+    void processCallInstruction(CallInst* call, Function* caller);  // Process one call instruction
+
+    // ========== Helper methods ==========
+
+    bool isLibraryFunction(Function* F) const;               // Whether this is a standard library function
+    bool isIntrinsicFunction(Function* F) const;             // Whether this is a built-in function
+    void printStatistics() const;                            // Print statistics
+};
+
+} // namespace sysy
--- a/src/include/midend/Pass/Analysis/Dom.h
+++ b/src/include/midend/Pass/Analysis/Dom.h
@ -6,30 +6,82 @@
 #include <set>
 #include <vector>
 #include <algorithm>
+#include <functional>

 namespace sysy {

-// Dominator tree analysis result class (unchanged)
+// Dominator tree analysis result class
 class DominatorTree : public AnalysisResultBase {
 public:
    DominatorTree(Function* F);
+    // Get all dominators of the given basic block
    const std::set<BasicBlock*>* getDominators(BasicBlock* BB) const;
-    BasicBlock* getImmediateDominator(BasicBlock* BB) const;
-    const std::set<BasicBlock*>* getDominanceFrontier(BasicBlock* BB) const;
+    // Get the immediate dominator of the given basic block
+    BasicBlock* getImmediateDominator(BasicBlock* BB) const;
+    // Get the dominance frontier of the given basic block
+    const std::set<BasicBlock*>* getDominanceFrontier(BasicBlock* BB) const;
+    // Get the children of the given basic block in the dominator tree
    const std::set<BasicBlock*>* getDominatorTreeChildren(BasicBlock* BB) const;
+    // Extra getters: the complete maps of dominators, immediate dominators, and dominance frontiers (optional; mainly for debugging or special cases)
    const std::map<BasicBlock*, std::set<BasicBlock*>>& getDominatorsMap() const { return Dominators; }
    const std::map<BasicBlock*, BasicBlock*>& getIDomsMap() const { return IDoms; }
    const std::map<BasicBlock*, std::set<BasicBlock*>>& getDominanceFrontiersMap() const { return DominanceFrontiers; }
+
+    // Compute the dominator set of every basic block
    void computeDominators(Function* F);
-    void computeIDoms(Function* F);
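+    // Compute the immediate dominator of every basic block (uses the Lengauer-Tarjan algorithm internally)
+    void computeIDoms(Function* F);
+    // Compute the dominance frontier of every basic block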
    void computeDominanceFrontiers(Function* F);
+    // Compute the structure of the dominator tree (the direct children of each node)
    void computeDominatorTreeChildren(Function* F);
 private:
+    // Function associated with this dominator tree
    Function* AssociatedFunction;
-    std::map<BasicBlock*, std::set<BasicBlock*>> Dominators;
-    std::map<BasicBlock*, BasicBlock*> IDoms;
-    std::map<BasicBlock*, std::set<BasicBlock*>> DominanceFrontiers;
-    std::map<BasicBlock*, std::set<BasicBlock*>> DominatorTreeChildren;
+    std::map<BasicBlock*, std::set<BasicBlock*>> Dominators;       // Dominator set of each basic block
+    std::map<BasicBlock*, BasicBlock*> IDoms;                      // Immediate dominator of each basic block
+    std::map<BasicBlock*, std::set<BasicBlock*>> DominanceFrontiers; // Dominance frontier of each basic block
+    std::map<BasicBlock*, std::set<BasicBlock*>> DominatorTreeChildren; // Children of each basic block in the dominator tree
+
+    // ==========================================================
+    // Data structures and helpers needed internally by the Lengauer-Tarjan algorithm.
+    // These members are private to encapsulate the complexity of the LT algorithm and avoid namespace pollution.
+    // ==========================================================
+
+    // DFS traversal:
+    std::map<BasicBlock*, int> dfnum_map;            // DFS number of each basic block
+    std::vector<BasicBlock*> vertex_vec;             // Reverse lookup from DFS number to basic block pointer
+    std::map<BasicBlock*, BasicBlock*> parent_map;   // Parent of each basic block in the DFS tree
+    int df_counter;                                  // DFS counter; also the total number of nodes visited by the DFS (N)
+
+    // Semi-dominators:
+    std::map<BasicBlock*, BasicBlock*> sdom_map;     // Semi-dominator of each basic block
+    std::map<BasicBlock*, BasicBlock*> idom_map;     // Immediate dominator (IDom) of each basic block
+    std::map<BasicBlock*, std::vector<BasicBlock*>> bucket_map; // Buckets of nodes that share a semi-dominator, so that IDom computation can be deferred
+
+    // Union-find (used by the evalAndCompress function):
+    std::map<BasicBlock*, BasicBlock*> ancestor_map; // Parent node in the union-find forest (used for path compression)
+    std::map<BasicBlock*, BasicBlock*> label_map;    // Representative of each union-find set (or the node with the smallest sdom on its path)
+
+    // ==========================================================
+    // Helper computation functions (private)
+    // ==========================================================
+
+    // Compute the reverse post order (RPO) of the basic blocks.
+    // RPO is used to speed up dominator computation and the LT algorithm.
+    std::vector<BasicBlock*> computeReversePostOrder(Function* F);
+
+    // DFS helper specific to the Lengauer-Tarjan algorithm.
+    // Initializes dfnum_map, vertex_vec, and parent_map.
+    void dfs_lt_helper(BasicBlock* u);
+
+    // Combines the union-find Find operation with the LT algorithm's Eval operation.
+    // Updates labels during path compression to find the node with the smallest sdom on the path.
+    BasicBlock* evalAndCompress_lt_helper(BasicBlock* i);
+
+    // The union-find Link operation.
+    // Attaches v_child under u_parent in the union-find forest.
+    void link_lt_helper(BasicBlock* u_parent, BasicBlock* v_child);
 };


--- a/src/include/midend/Pass/Analysis/Loop.h
+++ b/src/include/midend/Pass/Analysis/Loop.h
@ -0,0 +1,620 @@
+#pragma once
+
+#include "Dom.h"
+#include "IR.h"
+#include "Pass.h"
+#include <algorithm>
+#include <cmath>
+#include <functional>
+#include <map>
+#include <memory>
+#include <optional>
+#include <queue>
+#include <set>
+#include <string>
+#include <vector>
+
+namespace sysy {
+
+// Forward declarations
+class LoopAnalysisResult;
+class AliasAnalysisResult;
+class SideEffectAnalysisResult;
+
+/**
+ * @brief Represents one identified loop.
+ */
+class Loop {
+private:
+  static int NextLoopID; // Static counter used to hand out unique IDs
+  int LoopID;
+public:
+  // Constructor: takes the loop header
+  Loop(BasicBlock *header) : Header(header), LoopID(NextLoopID++) {}
+
+  // Get the loop header
+  BasicBlock *getHeader() const { return Header; }
+
+  // Get the loop's name (derived from its ID)
+  std::string getName() const { return "loop_" + std::to_string(LoopID); }
+  // Get all basic blocks in the loop body
+  const std::set<BasicBlock *> &getBlocks() const { return LoopBlocks; }
+
+  // Get the loop's exit blocks (blocks that jump from inside the loop to outside it)
+  const std::set<BasicBlock *> &getExitBlocks() const { return ExitBlocks; }
+
+  // Get the loop preheader if one exists; may be nullptr
+  BasicBlock *getPreHeader() const { return PreHeader; }
+
+  // Get the parent loop that directly contains this loop if one exists; may be nullptr
+  Loop *getParentLoop() const { return ParentLoop; }
+
+  // Get the loops nested directly inside this loop
+  const std::vector<Loop *> &getNestedLoops() const { return NestedLoops; }
+
+  // Get the loop level (0 is an outermost loop, 1 is nested one level deep, and so on)
+  int getLoopLevel() const { return Level; }
+
+  // Check whether a basic block belongs to this loop
+  bool contains(BasicBlock *BB) const { return LoopBlocks.count(BB); }
+
+  // Whether this is an innermost loop (no nested child loops)
+  bool isInnermost() const { return NestedLoops.empty(); }
+
+  // Get the loop depth (counted from the outermost loop)
+  int getLoopDepth() const { return Level + 1; }
+
+  // Get the size of the loop body (number of basic blocks)
+  size_t getLoopSize() const { return LoopBlocks.size(); }
+
+  // Check whether the loop has a unique outside predecessor (that is, a preheader)
+  bool hasUniquePreHeader() const { return PreHeader != nullptr; }
+
+  // Check whether this is an outermost loop (no parent loop)
+  bool isOutermost() const { return getParentLoop() == nullptr; }
+
+  // Get all exiting blocks (blocks inside the loop that have a successor outside it)
+  std::vector<BasicBlock*> getExitingBlocks() const {
+    std::vector<BasicBlock*> exitingBlocks;
+    for (BasicBlock* bb : LoopBlocks) {
+      for (BasicBlock* succ : bb->getSuccessors()) {
+        if (!contains(succ)) {
+          exitingBlocks.push_back(bb);
+          break; // Add each basic block only once
+        }
+      }
+    }
+    return exitingBlocks;
+  }
+
+  // Whether this is a simple loop (exactly one back edge)
+  bool isSimpleLoop() const {
+    int backEdgeCount = 0;
+    for (BasicBlock* pred : Header->getPredecessors()) {
+      if (contains(pred)) {
+        backEdgeCount++;
+      }
+    }
+    return backEdgeCount == 1;
+  }
+
+  /**
+   * Get all exit target blocks (blocks outside the loop that receive the loop's exit edges).
+   * Use cases: loop post-processing, phi node analysis
+   */
+  std::vector<BasicBlock*> getExitTargetBlocks() const {
+    std::set<BasicBlock*> exitTargetSet;
+    for (BasicBlock* bb : LoopBlocks) {
+      for (BasicBlock* succ : bb->getSuccessors()) {
+        if (!contains(succ)) {
+          exitTargetSet.insert(succ);
+        }
+      }
+    }
+    return std::vector<BasicBlock*>(exitTargetSet.begin(), exitTargetSet.end());
+  }
+
+  /**
+   * Compute this loop's depth relative to the given ancestor loop.
+   * Use cases: relative depth computation, nesting analysis
+   */
+  int getRelativeDepth(Loop* ancestor) const {
+    if (this == ancestor) return 0;
+    
+    int depth = 1; // The direct parent is at relative depth 1; 0 is reserved for the loop itself
+    Loop* current = this->ParentLoop;
+    while (current && current != ancestor) {
+      depth++;
+      current = current->ParentLoop;
+    }
+    
+    return current == ancestor ? depth : -1; // -1 means `ancestor` is not an ancestor of this loop
+  }
+
+  /**
+   * Check whether the loop contains function calls.
+   * Use cases: inlining decisions, side-effect analysis
+   */
+  bool containsFunctionCalls() const {
+    for (BasicBlock* bb : LoopBlocks) {
+      for (auto& inst : bb->getInstructions()) {
+        if (dynamic_cast<CallInst*>(inst.get())) {
+          return true;
+        }
+      }
+    }
+    return false;
+  }
+
+  /**
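+   * Check whether the loop may have side effects (based on the side-effect analysis result).
+   * Use cases: loop optimization decisions, parallelization analysis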
+   */
+  bool mayHaveSideEffects(SideEffectAnalysisResult* sideEffectAnalysis) const;
+
+  /**
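+   * Check whether the loop accesses global memory (based on the alias analysis result).
+   * Use cases: parallelization analysis, cache optimization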
+   */
+  bool accessesGlobalMemory(AliasAnalysisResult* aliasAnalysis) const;
+
+  /**
+   * Check whether the loop has possible memory alias conflicts.
+   * Use cases: vectorization analysis, parallelization decisions
+   */
+  bool hasMemoryAliasConflicts(AliasAnalysisResult* aliasAnalysis) const;
+
+  /**
+   * Estimate the loop's "hotness" (based on nesting depth and size).
+   * Use cases: optimization priority, resource allocation
+   */
+  double getLoopHotness() const {
+    // Simple hotness estimate: a depth weight plus a size penalty
+    double hotness = std::pow(2.0, Level); // Deeper nesting means hotter
+    hotness /= std::sqrt(LoopBlocks.size()); // A larger body lowers the relative hotness
+    return hotness;
+  }
+
+  // --- Methods called internally by LoopAnalysisPass to build Loop objects ---
+  void addBlock(BasicBlock *BB) { LoopBlocks.insert(BB); }
+  void addExitBlock(BasicBlock *BB) { ExitBlocks.insert(BB); }
+  void setPreHeader(BasicBlock *BB) { PreHeader = BB; }
+  void setParentLoop(Loop *loop) { ParentLoop = loop; }
+  void addNestedLoop(Loop *loop) { NestedLoops.push_back(loop); }
+  void setLoopLevel(int level) { Level = level; }
+  void clearNestedLoops() { NestedLoops.clear(); }
+private:
+  BasicBlock *Header;                // Loop header basic block
+  std::set<BasicBlock *> LoopBlocks; // Set of basic blocks in the loop body
+  std::set<BasicBlock *> ExitBlocks; // Set of loop exit basic blocks
+  BasicBlock *PreHeader = nullptr;   // Loop preheader (optional)
+  Loop *ParentLoop = nullptr;        // Parent loop (for nesting)
+  std::vector<Loop *> NestedLoops;   // Nested child loops
+  int Level = -1;                    // Loop level; -1 means not yet computed
+};
+
+/**
+ * @brief Loop analysis result.
+ * Holds every loop identified in a function and provides an efficient query cache.
+ */
+class LoopAnalysisResult : public AnalysisResultBase {
+public:
+  LoopAnalysisResult(Function *F) : AssociatedFunction(F) {}
+  ~LoopAnalysisResult() override = default;
+
+  // ========== Cache statistics ==========
+  struct CacheStats {
+    size_t innermostLoopsCached;
+    size_t outermostLoopsCached;
+    size_t loopsByDepthCached;
+    size_t containingLoopsCached;
+    size_t allNestedLoopsCached;
+    size_t totalCachedQueries;
+  };
+
+private:
+  // ========== High-frequency query cache ==========
+  mutable std::optional<std::vector<Loop*>> cachedInnermostLoops;
+  mutable std::optional<std::vector<Loop*>> cachedOutermostLoops;
+  mutable std::optional<int> cachedMaxDepth;
+  mutable std::optional<size_t> cachedLoopCount;
+  mutable std::map<int, std::vector<Loop*>> cachedLoopsByDepth;
+  
+  // ========== Medium-frequency query cache ==========
+  mutable std::map<BasicBlock*, Loop*> cachedInnermostContainingLoop;
+  mutable std::map<Loop*, std::set<Loop*>> cachedAllNestedLoops; // Recursively nested loops
+  mutable std::map<BasicBlock*, std::vector<Loop*>> cachedAllContainingLoops;
+  
+  // ========== Cache state management ==========
+  mutable bool cacheValid = true;
+
+  // Internal helper methods
+  void invalidateCache() const {
+    cachedInnermostLoops.reset();
+    cachedOutermostLoops.reset();
+    cachedMaxDepth.reset();
+    cachedLoopCount.reset();
+    cachedLoopsByDepth.clear();
+    cachedInnermostContainingLoop.clear();
+    cachedAllNestedLoops.clear();
+    cachedAllContainingLoops.clear();
+    cacheValid = false;
+  }
+  
+  void ensureCacheValid() const {
+    if (!cacheValid) {
+      // Recompute the basic caches
+      computeBasicCache();
+      cacheValid = true;
+    }
+  }
+  
+  void computeBasicCache() const {
+    // Compute the innermost loops
+    if (!cachedInnermostLoops) {
+      cachedInnermostLoops = std::vector<Loop*>();
+      for (const auto& loop : AllLoops) {
+        if (loop->isInnermost()) {
+          cachedInnermostLoops->push_back(loop.get());
+        }
+      }
+    }
+    
+    // Compute the outermost loops
+    if (!cachedOutermostLoops) {
+      cachedOutermostLoops = std::vector<Loop*>();
+      for (const auto& loop : AllLoops) {
+        if (loop->isOutermost()) {
+          cachedOutermostLoops->push_back(loop.get());
+        }
+      }
+    }
+    
+    // Compute the maximum nesting depth
+    if (!cachedMaxDepth) {
+      int maxDepth = 0;
+      for (const auto& loop : AllLoops) {
+        maxDepth = std::max(maxDepth, loop->getLoopDepth());
+      }
+      cachedMaxDepth = maxDepth;
+    }
+    
+    // Compute the total number of loops
+    if (!cachedLoopCount) {
+      cachedLoopCount = AllLoops.size();
+    }
+  }
+
+public:
+  // ========== Basic interface ==========
+
+  // Add an identified loop to the result
+  void addLoop(std::unique_ptr<Loop> loop) {
+    invalidateCache(); // adding a loop invalidates the caches
+    AllLoops.push_back(std::move(loop));
+    LoopMap[AllLoops.back()->getHeader()] = AllLoops.back().get();
+  }
+
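+  // Returns all identified loops (memory is owned through unique_ptr)
+  const std::vector<std::unique_ptr<Loop>> &getAllLoops() const { return AllLoops; }
+
+  // ========== High-frequency queries ==========
+
+  /**
+   * Returns all innermost loops, the main targets of loop optimization.
+   * Typical uses: loop unrolling, vectorization, loop-invariant code motion.
+   */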
+  const std::vector<Loop*>& getInnermostLoops() const {
+    ensureCacheValid();
+    if (!cachedInnermostLoops) {
+      cachedInnermostLoops = std::vector<Loop*>();
+      for (const auto& loop : AllLoops) {
+        if (loop->isInnermost()) {
+          cachedInnermostLoops->push_back(loop.get());
+        }
+      }
+    }
+    return *cachedInnermostLoops;
+  }
+  
+  /**
+   * Returns all outermost loops.
+   * Typical uses: loop-tree traversal, overall optimization strategy.
+   */
+  const std::vector<Loop*>& getOutermostLoops() const {
+    ensureCacheValid();
+    if (!cachedOutermostLoops) {
+      cachedOutermostLoops = std::vector<Loop*>();
+      for (const auto& loop : AllLoops) {
+        if (loop->isOutermost()) {
+          cachedOutermostLoops->push_back(loop.get());
+        }
+      }
+    }
+    return *cachedOutermostLoops;
+  }
+  
+  /**
+   * Returns all loops at the given depth.
+   * Typical uses: per-level optimization, unrolling decisions, parallelization analysis.
+   */
+  const std::vector<Loop*>& getLoopsAtDepth(int depth) const {
+    ensureCacheValid();
+    if (cachedLoopsByDepth.find(depth) == cachedLoopsByDepth.end()) {
+      std::vector<Loop*> result;
+      for (const auto& loop : AllLoops) {
+        if (loop->getLoopDepth() == depth) {
+          result.push_back(loop.get());
+        }
+      }
+      cachedLoopsByDepth[depth] = std::move(result);
+    }
+    return cachedLoopsByDepth[depth];
+  }
+  
+  /**
+   * Returns the maximum loop nesting depth.
+   * Typical uses: optimization budgeting, compile-time control.
+   */
+  int getMaxLoopDepth() const {
+    ensureCacheValid();
+    if (!cachedMaxDepth) {
+      int maxDepth = 0;
+      for (const auto& loop : AllLoops) {
+        maxDepth = std::max(maxDepth, loop->getLoopDepth());
+      }
+      cachedMaxDepth = maxDepth;
+    }
+    return *cachedMaxDepth;
+  }
+  
+  /**
+   * Returns the total number of loops.
+   * Typical uses: statistics, optimization decisions.
+   */
+  size_t getLoopCount() const {
+    ensureCacheValid();
+    if (!cachedLoopCount) {
+      cachedLoopCount = AllLoops.size();
+    }
+    return *cachedLoopCount;
+  }
+
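+  // Returns the number of loops at the given depth
+  size_t getLoopCountAtDepth(int depth) const {
+    return getLoopsAtDepth(depth).size();
+  }
+
+  // Checks whether the function contains any loop
+  bool hasLoops() const { return !AllLoops.empty(); }
+
+  // ========== Medium-frequency queries ==========
+
+  /**
+   * Returns the innermost loop containing the given basic block (nullptr if no loop contains it).
+   * Typical uses: liveness analysis, register allocation, instruction scheduling.
+   */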
+  Loop* getInnermostContainingLoop(BasicBlock* BB) const {
+    ensureCacheValid();
+    if (cachedInnermostContainingLoop.find(BB) == cachedInnermostContainingLoop.end()) {
+      Loop* result = nullptr;
+      int maxDepth = -1;
+      for (const auto& loop : AllLoops) {
+        if (loop->contains(BB) && loop->getLoopDepth() > maxDepth) {
+          result = loop.get();
+          maxDepth = loop->getLoopDepth();
+        }
+      }
+      cachedInnermostContainingLoop[BB] = result;
+    }
+    return cachedInnermostContainingLoop[BB];
+  }
+  
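+  /**
+   * Returns all loops containing the given basic block, ordered from outermost to innermost.
+   * Typical uses: inter-loop optimization, dependence analysis.
+   */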
+  const std::vector<Loop*>& getAllContainingLoops(BasicBlock* BB) const {
+    ensureCacheValid();
+    if (cachedAllContainingLoops.find(BB) == cachedAllContainingLoops.end()) {
+      std::vector<Loop*> result;
+      for (const auto& loop : AllLoops) {
+        if (loop->contains(BB)) {
+          result.push_back(loop.get());
+        }
+      }
+      // Sort by depth (outer to inner)
+      std::sort(result.begin(), result.end(), 
+                [](Loop* a, Loop* b) { return a->getLoopDepth() < b->getLoopDepth(); });
+      cachedAllContainingLoops[BB] = std::move(result);
+    }
+    return cachedAllContainingLoops[BB];
+  }
+  
+  /**
+   * Returns all loops nested inside the given loop (collected recursively).
+   * Typical uses: loop-tree analysis, nested-loop optimization.
+   */
+  const std::set<Loop*>& getAllNestedLoops(Loop* loop) const {
+    ensureCacheValid();
+    if (cachedAllNestedLoops.find(loop) == cachedAllNestedLoops.end()) {
+      std::set<Loop*> result;
+      std::function<void(Loop*)> collectNested = [&](Loop* current) {
+        for (Loop* nested : current->getNestedLoops()) {
+          result.insert(nested);
+          collectNested(nested); // collect recursively
+        }
+      };
+      collectNested(loop);
+      cachedAllNestedLoops[loop] = std::move(result);
+    }
+    return cachedAllNestedLoops[loop];
+  }
+
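+  // ========== Queries backed by alias and side-effect analysis ==========
+
+  /**
+   * Returns all pure loops (loops without side effects).
+   * Typical uses: parallelization, loop optimization.
+   */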
+  std::vector<Loop*> getPureLoops(SideEffectAnalysisResult* sideEffectAnalysis) const {
+    std::vector<Loop*> result;
+    if (!sideEffectAnalysis) return result;
+    
+    for (const auto& loop : AllLoops) {
+      if (!loop->mayHaveSideEffects(sideEffectAnalysis)) {
+        result.push_back(loop.get());
+      }
+    }
+    return result;
+  }
+  
+  /**
+   * Returns all loops that access only local memory.
+   * Typical uses: cache optimization, locality analysis.
+   */
+  std::vector<Loop*> getLocalMemoryLoops(AliasAnalysisResult* aliasAnalysis) const {
+    std::vector<Loop*> result;
+    if (!aliasAnalysis) return result;
+    
+    for (const auto& loop : AllLoops) {
+      if (!loop->accessesGlobalMemory(aliasAnalysis)) {
+        result.push_back(loop.get());
+      }
+    }
+    return result;
+  }
+  
+  /**
+   * Returns all loops without memory alias conflicts.
+   * Typical uses: vectorization, parallelization.
+   */
+  std::vector<Loop*> getNoAliasConflictLoops(AliasAnalysisResult* aliasAnalysis) const {
+    std::vector<Loop*> result;
+    if (!aliasAnalysis) return result;
+    
+    for (const auto& loop : AllLoops) {
+      if (!loop->hasMemoryAliasConflicts(aliasAnalysis)) {
+        result.push_back(loop.get());
+      }
+    }
+    return result;
+  }
+
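+  // ========== Low-frequency queries (not cached) ==========
+
+  /**
+   * Checks whether `inner` is nested, directly or transitively, inside `outer`.
+   * Typical use: inter-loop dependence analysis.
+   */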
+  bool isNestedLoop(Loop* inner, Loop* outer) const {
+    if (inner == outer) return false;
+    
+    Loop* current = inner->getParentLoop();
+    while (current) {
+      if (current == outer) return true;
+      current = current->getParentLoop();
+    }
+    return false;
+  }
+  
+  /**
+   * Returns the lowest common ancestor loop of two loops.
+   * Typical uses: loop fusion analysis, determining the optimization scope.
+   */
+  Loop* getLowestCommonAncestor(Loop* loop1, Loop* loop2) const {
+    if (!loop1 || !loop2) return nullptr;
+    if (loop1 == loop2) return loop1;
+    
+    // Collect loop1 itself and all of its ancestors
+    std::set<Loop*> ancestors1;
+    Loop* current = loop1;
+    while (current) {
+      ancestors1.insert(current);
+      current = current->getParentLoop();
+    }
+    
+    // Walk up loop2's ancestor chain and return the first loop found in ancestors1
+    current = loop2;
+    while (current) {
+      if (ancestors1.count(current)) {
+        return current;
+      }
+      current = current->getParentLoop();
+    }
+    
+    return nullptr; // no common ancestor
+  }
+
+  // Looks up the Loop object by its header block
+  Loop *getLoopForHeader(BasicBlock *header) const {
+    auto it = LoopMap.find(header);
+    return (it != LoopMap.end()) ? it->second : nullptr;
+  }
+
+  // Returns the innermost loop containing the given block (kept for backward compatibility)
+  Loop *getLoopContainingBlock(BasicBlock *BB) const {
+    return getInnermostContainingLoop(BB);
+  }
+
+  // ========== Cache management ==========
+
+  /**
+   * Manually invalidates the caches (may be removed).
+   */
+  void invalidateQueryCache() const {
+    invalidateCache();
+  }
+  
+  /**
+   * Returns cache statistics.
+   */
+  CacheStats getCacheStats() const {
+    CacheStats stats = {};
+    stats.innermostLoopsCached = cachedInnermostLoops.has_value() ? 1 : 0;
+    stats.outermostLoopsCached = cachedOutermostLoops.has_value() ? 1 : 0;
+    stats.loopsByDepthCached = cachedLoopsByDepth.size();
+    stats.containingLoopsCached = cachedInnermostContainingLoop.size();
+    stats.allNestedLoopsCached = cachedAllNestedLoops.size();
+    stats.totalCachedQueries = stats.innermostLoopsCached + stats.outermostLoopsCached + 
+                               stats.loopsByDepthCached + stats.containingLoopsCached + 
+                               stats.allNestedLoopsCached;
+    return stats;
+  }
+
+  // Print the analysis result
+  void print() const;
+  void printBBSet(const std::string &prefix, const std::set<BasicBlock *> &s) const;
+  void printLoopVector(const std::string &prefix, const std::vector<Loop *> &loops) const;
+
+private:
+  Function *AssociatedFunction;                // Function this result belongs to
+  std::vector<std::unique_ptr<Loop>> AllLoops; // All identified loops
+  std::map<BasicBlock *, Loop *> LoopMap;      // Maps each loop header to its Loop* for fast lookup
+};
+
+/**
+ * @brief Loop analysis pass.
+ * Identifies all loops in a function and produces a LoopAnalysisResult.
+ */
+class LoopAnalysisPass : public AnalysisPass {
+public:
+  // Unique pass ID; must be defined in the .cpp file
+  static void *ID;
+
+  LoopAnalysisPass() : AnalysisPass("LoopAnalysis", Pass::Granularity::Function) {}
+
+  // Implements getPassID
+  void *getPassID() const override { return &ID; }
+
+  // Main entry point: runs loop analysis on each function
+  bool runOnFunction(Function *F, AnalysisManager &AM) override;
+
+  // Returns the analysis result, transferring ownership to the caller
+  std::unique_ptr<AnalysisResultBase> getResult() override { return std::move(CurrentResult); }
+
+private:
+  std::unique_ptr<LoopAnalysisResult> CurrentResult; // Analysis result for the current function
+};
+
+} // namespace sysy
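
Note on the caching scheme in `LoopAnalysisResult`: the query methods are `const`, so the caches are declared `mutable` and filled on first use, and `addLoop` resets them. The following is a minimal standalone sketch of that pattern, not the project's code. `MiniLoop`, `MiniLoopInfo`, and `recomputations()` are hypothetical names introduced only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for sysy::Loop: only the nesting depth matters here.
struct MiniLoop {
  int depth;
};

// Lazy cache sketch: the cache is `mutable` so a const query can fill it,
// and every mutation of the loop list resets it.
class MiniLoopInfo {
public:
  void addLoop(MiniLoop loop) {
    cachedMaxDepth.reset(); // a structural change invalidates the cache
    loops.push_back(loop);
  }

  int getMaxLoopDepth() const {
    if (!cachedMaxDepth) { // computed only when the cache is empty
      int maxDepth = 0;
      for (const MiniLoop &l : loops) maxDepth = std::max(maxDepth, l.depth);
      cachedMaxDepth = maxDepth;
      ++recomputeCount;
    }
    return *cachedMaxDepth;
  }

  // Illustration only: how many times the cached value was recomputed.
  std::size_t recomputations() const { return recomputeCount; }

private:
  std::vector<MiniLoop> loops;
  mutable std::optional<int> cachedMaxDepth;
  mutable std::size_t recomputeCount = 0;
};
```

Calling `getMaxLoopDepth()` twice in a row triggers one computation; calling `addLoop` in between forces a second one. The header above additionally tracks a `cacheValid` flag, which this sketch leaves out for brevity.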
--- a/src/include/midend/Pass/Analysis/LoopCharacteristics.h
+++ b/src/include/midend/Pass/Analysis/LoopCharacteristics.h
@ -0,0 +1,360 @@
+#pragma once
+
+#include "Dom.h"         // dominator tree analysis
+#include "Loop.h"        // loop analysis
+#include "Liveness.h"    // liveness analysis
+#include "AliasAnalysis.h" // alias analysis
+#include "SideEffectAnalysis.h" // side-effect analysis
+#include "CallGraphAnalysis.h" // call graph analysis
+#include "IR.h"          // IR definitions
+#include "Pass.h"        // pass framework
+#include <algorithm>
+#include <map>
+#include <memory>
+#include <optional>
+#include <set>
+#include <vector>
+
+namespace sysy {
+
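+// Forward declaration
+class LoopCharacteristicsResult;
+
+enum IVKind {
+  kBasic,      // basic induction variable
+  kLinear,     // linear induction variable
+  kCmplx       // complex derived induction variable
+};             // induction variable kinds
+
+struct InductionVarInfo {
+  Value* div;                      // instruction that defines the derived induction variable
+  Value* base = nullptr;           // its root phi, BIV, or DIV
+  std::pair<Value*, Value*> Multibase = {nullptr, nullptr}; // bases when derived from multiple BIVs
+  Instruction::Kind Instkind;      // operation kind
+  int factor = 1;                  // coefficient (the 2 in i*2+3)
+  int offset = 0;                  // constant offset
+  bool valid;                      // whether it reduces to a linear form
+  IVKind ivkind;                   // induction variable kind
+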
+
+static std::unique_ptr<InductionVarInfo> createBasicBIV(Value* v, Instruction::Kind kind, Value* base = nullptr, int factor = 1, int offset = 0) {
+  return std::make_unique<InductionVarInfo>(
+    InductionVarInfo{v, base, {nullptr, nullptr}, kind, factor, offset, true, IVKind::kBasic}
+  );
+}
+
+static std::unique_ptr<InductionVarInfo> createSingleDIV(Value* v, Instruction::Kind kind, Value* base = nullptr, int factor = 1, int offset = 0) {
+  return std::make_unique<InductionVarInfo>(
+    InductionVarInfo{v, base, {nullptr, nullptr}, kind, factor, offset, true,  IVKind::kLinear}
+  );
+}
+
+static std::unique_ptr<InductionVarInfo> createDoubleDIV(Value* v, Instruction::Kind kind, Value* base1 = nullptr,  Value* base2 = nullptr, int factor = 1, int offset = 0) {
+  return std::make_unique<InductionVarInfo>(
+    InductionVarInfo{v, nullptr, {base1, base2}, kind, factor, offset, false, IVKind::kCmplx}
+  );
+}
+};
+
+/**
+ * @brief Loop characteristics, produced by the basic loop analysis stage.
+ * Stores the basic characteristics of a loop as a foundation for later, more precise analyses.
+ */
+struct LoopCharacteristics {
+  Loop* loop;                                    // Associated loop object
+
+  // ========== Basic loop form ==========
+  bool isCountingLoop;                         // Counting loop (for i=0; i<n; i++)
+  bool isSimpleForLoop;                        // Simple for loop
+  bool hasComplexControlFlow;                  // Has complex control flow (break, continue)
+  bool isInnermost;                            // Innermost loop
+
+  // ========== Induction variables ==========
+
+  // ========== Basic loop invariants ==========
+  std::unordered_set<Value*> loopInvariants;              // Loop-invariant values
+  std::unordered_set<Instruction*> invariantInsts;       // Invariant instructions that can be hoisted
+
+  std::vector<std::unique_ptr<InductionVarInfo>> InductionVars;     // Induction variables
+
+  // ========== Basic bounds ==========
+  std::optional<int> staticTripCount;          // Static trip count (if it can be determined)
+  bool hasKnownBounds;                         // Whether the bounds are known
+
+  // ========== Basic purity and side effects ==========
+  bool isPure;                                 // Pure loop (no side effects)
+  bool accessesOnlyLocalMemory;                // Accesses only local memory
+  bool hasNoMemoryAliasConflicts;              // No memory alias conflicts
+
+  // ========== Basic memory access patterns ==========
+  struct MemoryAccessPattern {
+    std::vector<Instruction*> loadInsts;       // Load instructions
+    std::vector<Instruction*> storeInsts;      // Store instructions
+    bool isArrayParameter;                     // Access goes through an array parameter
+    bool isGlobalArray;                        // Access targets a global array
+    bool hasConstantIndices;                   // Uses constant indices
+  };
+  std::map<Value*, MemoryAccessPattern> memoryPatterns; // Memory access patterns
+
+  // ========== Basic performance characteristics ==========
+  size_t instructionCount;                     // Number of instructions in the loop body
+  size_t memoryOperationCount;                 // Number of memory operations
+  size_t arithmeticOperationCount;             // Number of arithmetic operations
+  double computeToMemoryRatio;                 // Ratio of compute to memory operations
+
+  // ========== Basic optimization hints ==========
+  bool benefitsFromUnrolling;                  // Whether the loop is a good unrolling candidate
+  int suggestedUnrollFactor;                   // Suggested unroll factor
+  
+  // Constructor: default initialization for the basic analysis (initializers follow declaration order)
+  LoopCharacteristics(Loop* l) : loop(l),
+    isCountingLoop(false), isSimpleForLoop(false), hasComplexControlFlow(false),
+    isInnermost(false), hasKnownBounds(false), isPure(false),
+    accessesOnlyLocalMemory(false), hasNoMemoryAliasConflicts(false),
+    instructionCount(0), memoryOperationCount(0),
+    arithmeticOperationCount(0), computeToMemoryRatio(0.0),
+    benefitsFromUnrolling(false), suggestedUnrollFactor(1) {}
+};
+
+/**
+ * @brief Loop characteristics analysis result.
+ * Holds the characteristics of every loop in a function and provides query interfaces.
+ */
+class LoopCharacteristicsResult : public AnalysisResultBase {
+public:
+  LoopCharacteristicsResult(Function *F) : AssociatedFunction(F) {}
+  ~LoopCharacteristicsResult() override = default;
+
+  // ========== Basic interface ==========
+
+  /**
+   * Adds the characteristics of a loop.
+   */
+  void addLoopCharacteristics(std::unique_ptr<LoopCharacteristics> characteristics) {
+    auto* loop = characteristics->loop;
+    CharacteristicsMap[loop] = std::move(characteristics);
+  }
+  
+  /**
+   * Returns the characteristics of the given loop (nullptr if none are recorded).
+   */
+  const LoopCharacteristics* getCharacteristics(Loop* loop) const {
+    auto it = CharacteristicsMap.find(loop);
+    return (it != CharacteristicsMap.end()) ? it->second.get() : nullptr;
+  }
+  
+  /**
+   * Returns the characteristics of all loops.
+   */
+  const std::map<Loop*, std::unique_ptr<LoopCharacteristics>>& getAllCharacteristics() const {
+    return CharacteristicsMap;
+  }
+
+  // ========== Core queries ==========
+
+  /**
+   * Returns all counting loops.
+   */
+  std::vector<Loop*> getCountingLoops() const {
+    std::vector<Loop*> result;
+    for (const auto& [loop, chars] : CharacteristicsMap) {
+      if (chars->isCountingLoop) {
+        result.push_back(loop);
+      }
+    }
+    return result;
+  }
+  
+  /**
+   * Returns all pure loops (no side effects).
+   */
+  std::vector<Loop*> getPureLoops() const {
+    std::vector<Loop*> result;
+    for (const auto& [loop, chars] : CharacteristicsMap) {
+      if (chars->isPure) {
+        result.push_back(loop);
+      }
+    }
+    return result;
+  }
+  
+  /**
+   * Returns all loops that access only local memory.
+   */
+  std::vector<Loop*> getLocalMemoryOnlyLoops() const {
+    std::vector<Loop*> result;
+    for (const auto& [loop, chars] : CharacteristicsMap) {
+      if (chars->accessesOnlyLocalMemory) {
+        result.push_back(loop);
+      }
+    }
+    return result;
+  }
+  
+  /**
+   * Returns all loops without memory alias conflicts.
+   */
+  std::vector<Loop*> getNoAliasConflictLoops() const {
+    std::vector<Loop*> result;
+    for (const auto& [loop, chars] : CharacteristicsMap) {
+      if (chars->hasNoMemoryAliasConflicts) {
+        result.push_back(loop);
+      }
+    }
+    return result;
+  }
+  
+  /**
+   * Returns all loops that are good unrolling candidates.
+   */
+  std::vector<Loop*> getUnrollingCandidates() const {
+    std::vector<Loop*> result;
+    for (const auto& [loop, chars] : CharacteristicsMap) {
+      if (chars->benefitsFromUnrolling) {
+        result.push_back(loop);
+      }
+    }
+    return result;
+  }
+  
+  /**
+   * Returns the loops sorted by hotness (used to prioritize optimization).
+   */
+  std::vector<Loop*> getLoopsByHotness() const {
+    std::vector<Loop*> result;
+    for (const auto& [loop, chars] : CharacteristicsMap) {
+      result.push_back(loop);
+    }
+    
+    // Sort by loop hotness (nesting depth + trip count + instruction count)
+    std::sort(result.begin(), result.end(), [](Loop* a, Loop* b) {
+      double hotnessA = a->getLoopHotness();
+      double hotnessB = b->getLoopHotness();
+      return hotnessA > hotnessB; // descending order
+    });
+    
+    return result;
+  }
+
+  // ========== Basic statistics ==========
+
+  /**
+   * Basic optimization statistics.
+   */
+  struct BasicOptimizationStats {
+    size_t totalLoops;
+    size_t countingLoops;
+    size_t unrollingCandidates;
+    size_t pureLoops;
+    size_t localMemoryOnlyLoops;
+    size_t noAliasConflictLoops;
+    double avgInstructionCount;
+    double avgComputeMemoryRatio;
+  };
+  
+  BasicOptimizationStats getOptimizationStats() const {
+    BasicOptimizationStats stats = {};
+    stats.totalLoops = CharacteristicsMap.size();
+    
+    size_t totalInstructions = 0;
+    double totalComputeMemoryRatio = 0.0;
+    
+    for (const auto& [loop, chars] : CharacteristicsMap) {
+      if (chars->isCountingLoop) stats.countingLoops++;
+      if (chars->benefitsFromUnrolling) stats.unrollingCandidates++;
+      if (chars->isPure) stats.pureLoops++;
+      if (chars->accessesOnlyLocalMemory) stats.localMemoryOnlyLoops++;
+      if (chars->hasNoMemoryAliasConflicts) stats.noAliasConflictLoops++;
+      
+      totalInstructions += chars->instructionCount;
+      totalComputeMemoryRatio += chars->computeToMemoryRatio;
+    }
+    
+    if (stats.totalLoops > 0) {
+      stats.avgInstructionCount = static_cast<double>(totalInstructions) / stats.totalLoops;
+      stats.avgComputeMemoryRatio = totalComputeMemoryRatio / stats.totalLoops;
+    }
+    
+    return stats;
+  }
+
+  // Print the analysis result
+  void print() const;
+
+private:
+  Function *AssociatedFunction;                                    // Associated function
+  std::map<Loop*, std::unique_ptr<LoopCharacteristics>> CharacteristicsMap; // Loop-to-characteristics map
+};
+
+/**
+ * @brief Basic loop characteristics analysis pass.
+ * Runs before loop normalization and performs the basic loop characteristics analysis that later, more precise analyses build on.
+ */
+class LoopCharacteristicsPass : public AnalysisPass {
+public:
+  // Unique pass ID
+  static void *ID;
+
+  LoopCharacteristicsPass() : AnalysisPass("LoopCharacteristics", Pass::Granularity::Function) {}
+
+  // Implements getPassID
+  void *getPassID() const override { return &ID; }
+
+  // Main entry point
+  bool runOnFunction(Function *F, AnalysisManager &AM) override;
+
+  // Returns the analysis result, transferring ownership to the caller
+  std::unique_ptr<AnalysisResultBase> getResult() override { return std::move(CurrentResult); }
+
+private:
+  std::unique_ptr<LoopCharacteristicsResult> CurrentResult;
+
+  // ========== Cached analysis results ==========
+  LoopAnalysisResult* loopAnalysis;           // Loop structure analysis result
+  AliasAnalysisResult* aliasAnalysis;         // Alias analysis result
+  SideEffectAnalysisResult* sideEffectAnalysis; // Side-effect analysis result
+  
+  // ========== Core analysis methods ==========
+  void analyzeLoop(Loop* loop, LoopCharacteristics* characteristics);
+
+  // Basic loop form analysis
+  void analyzeLoopForm(Loop* loop, LoopCharacteristics* characteristics);
+
+  // Basic performance metrics
+  void computePerformanceMetrics(Loop* loop, LoopCharacteristics* characteristics);
+
+  // Basic purity and side-effect analysis
+  void analyzePurityAndSideEffects(Loop* loop, LoopCharacteristics* characteristics);
+
+  // Basic induction variable identification
+  void identifyBasicInductionVariables(Loop* loop, LoopCharacteristics* characteristics);
+
+  // Loop invariant identification
+  void identifyBasicLoopInvariants(Loop* loop, LoopCharacteristics* characteristics);
+
+  // Basic bounds analysis
+  void analyzeBasicLoopBounds(Loop* loop, LoopCharacteristics* characteristics);
+
+  // Basic memory access pattern analysis
+  void analyzeBasicMemoryAccessPatterns(Loop* loop, LoopCharacteristics* characteristics);
+
+  // Basic optimization evaluation
+  void evaluateBasicOptimizationOpportunities(Loop* loop, LoopCharacteristics* characteristics);
+
+  // ========== Helpers ==========
+  bool isClassicLoopInvariant(Value* val, Loop* loop, const std::unordered_set<Value*>& invariants);
+  void findDerivedInductionVars(Value* root,
+    Value* base, // a single BIV base only
+    Loop* loop,
+    std::vector<std::unique_ptr<InductionVarInfo>>& ivs,
+    std::set<Value*>& visited
+  );
+  bool isBasicInductionVariable(Value* val, Loop* loop);
+  // ========== Loop invariant analysis helpers ==========
+  bool isInvariantOperands(Instruction* inst, Loop* loop, const std::unordered_set<Value*>& invariants);
+  bool isMemoryLocationModifiedInLoop(Value* ptr, Loop* loop);
+  bool isMemoryLocationLoadedInLoop(Value* ptr, Loop* loop, Instruction* excludeInst = nullptr);
+  bool isPureFunction(Function* calledFunc);
+};
+
+} // namespace sysy
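
Note on `factor` and `offset` in `InductionVarInfo`: the field comments describe a derived induction variable as `factor * base + offset` (the `i*2+3` example). The sketch below shows how those two numbers could be composed when a derived variable is multiplied by or added to a constant. It is an illustration under that reading only; `LinearIV`, `scale`, `shift`, and `evaluate` are hypothetical names and do not appear in the headers.

```cpp
#include <cassert>

// Hypothetical, simplified linear induction variable:
// value = factor * base + offset, where base is the value of a basic IV.
struct LinearIV {
  int factor = 1;
  int offset = 0;
};

// iv * c  ==>  (factor * c) * base + (offset * c)
inline LinearIV scale(LinearIV iv, int c) {
  return LinearIV{iv.factor * c, iv.offset * c};
}

// iv + c  ==>  factor * base + (offset + c)
inline LinearIV shift(LinearIV iv, int c) {
  return LinearIV{iv.factor, iv.offset + c};
}

// Value of the derived IV for a given value of the basic IV.
inline int evaluate(LinearIV iv, int base) {
  return iv.factor * base + iv.offset;
}
```

For example, starting from the basic variable `i` (factor 1, offset 0), `shift(scale(i, 2), 3)` describes `i*2+3`.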
--- a/src/include/midend/Pass/Analysis/LoopVectorization.h
+++ b/src/include/midend/Pass/Analysis/LoopVectorization.h
@ -0,0 +1,250 @@
+#pragma once
+
+#include "Pass.h"
+#include "Loop.h" 
+#include "LoopCharacteristics.h"
+#include "AliasAnalysis.h"
+#include "SideEffectAnalysis.h"
+#include <vector>
+#include <map>
+#include <memory>
+#include <set>
+#include <string>
+
+namespace sysy {
+
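+/**
+ * @brief Dependence kinds. Only dependences that actually affect parallelism are considered.
+ *
+ * Notes on the dependence kinds:
+ * - TRUE_DEPENDENCE (RAW): true dependence; the original execution order must be preserved. This is the most critical kind.
+ * - ANTI_DEPENDENCE (WAR): anti-dependence; constrains instruction reordering and can be relieved by techniques such as register renaming.
+ * - OUTPUT_DEPENDENCE (WAW): output dependence; relatively rare but must be considered, and can be resolved by variable privatization.
+ *
+ */
+enum class DependenceType {
+  TRUE_DEPENDENCE,    // True (flow) dependence, RAW: read after write; the most important kind
+  ANTI_DEPENDENCE,    // Anti-dependence, WAR: write after read; constrains instruction reordering
+  OUTPUT_DEPENDENCE   // Output dependence, WAW: write after write; relatively rare but must be considered
+};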
+
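+/**
+ * @brief Dependence vector: the iteration distance between two memory accesses.
+ * Example: the dependence vector between a[i] and a[i+1] is [1];
+ *          between a[i][j] and a[i+1][j-2] it is [1,-2].
+ */
+struct DependenceVector {
+  std::vector<int> distances;    // Dependence distance at each loop level
+  bool isConstant;               // Whether the distance is constant
+  bool isKnown;                  // Whether the distance is known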
+  
+  DependenceVector(size_t loopDepth) : distances(loopDepth, 0), isConstant(false), isKnown(false) {}
+  
+  // Checks whether this is a loop-independent dependence (all distances are zero)
+  bool isLoopIndependent() const {
+    for (int dist : distances) {
+      if (dist != 0) return false;
+    }
+    return true;
+  }
+  
+  // Returns the lexicographic direction vector
+  std::vector<int> getDirectionVector() const;
+  
+  // Checks whether the dependence is safe for vectorization
+  bool isVectorizationSafe() const;
+};
+
+/**
+ * @brief Precise dependence: detailed dependence information including the dependence vector.
+ */
+struct PreciseDependence {
+  Instruction* source;
+  Instruction* sink;
+  DependenceType type;
+  DependenceVector dependenceVector;
+  Value* memoryLocation;
+  
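+  // Parallelization-related flags
+  bool allowsParallelization;    // Whether parallelization is allowed
+  bool requiresSynchronization;  // Whether synchronization is required
+  bool isReductionDependence;    // Whether this is a reduction dependence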
+  
+  PreciseDependence(size_t loopDepth) : dependenceVector(loopDepth), 
+    allowsParallelization(true), requiresSynchronization(false), isReductionDependence(false) {}
+};
+
+/**
+ * @brief Vectorization analysis info. On hold for now; the interface is kept.
+ */
+struct VectorizationAnalysis {
+  bool isVectorizable;                        // Always false; not supported yet
+  int suggestedVectorWidth;                   // Always 1
+  std::vector<std::string> preventingFactors; // Factors that prevent vectorization
+  
+  VectorizationAnalysis() : isVectorizable(false), suggestedVectorWidth(1) {
+    preventingFactors.push_back("Vectorization temporarily disabled");
+  }
+};
+
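+/**
+ * @brief Parallelization analysis info.
+ */
+struct ParallelizationAnalysis {
+  bool isParallelizable;                     // Whether the loop can be parallelized
+  int suggestedThreadCount;                  // Suggested number of threads
+  std::vector<std::string> preventingFactors; // Factors that prevent parallelization
+
+  // Parallelization mode
+  enum ParallelizationType {
+    NONE,                    // Not parallelizable
+    EMBARRASSINGLY_PARALLEL, // Fully parallel
+    REDUCTION_PARALLEL,      // Reduction parallelism
+    PIPELINE_PARALLEL,       // Pipeline parallelism
+    CONDITIONAL_PARALLEL     // Conditional parallelism
+  } parallelType;
+
+  // Load balancing
+  bool hasLoadBalance;                       // Whether the load is well balanced
+  bool isDynamicLoadBalanced;                // Whether dynamic load balancing is needed
+  double workComplexity;                     // Estimated work complexity
+
+  // Synchronization requirements
+  bool requiresReduction;                    // Whether a reduction is required
+  bool requiresBarrier;                      // Whether barrier synchronization is required
+  std::set<Value*> sharedVariables;         // Shared variables
+  std::set<Value*> reductionVariables;      // Reduction variables
+  std::set<Value*> privatizableVariables;   // Variables that can be privatized
+
+  // Memory access patterns
+  bool hasMemoryConflicts;                   // Whether there are memory conflicts
+  bool hasReadOnlyAccess;                    // Whether all accesses are read-only
+  bool hasIndependentAccess;                 // Whether memory accesses are independent
+
+  // Parallelization benefit estimate
+  double parallelizationBenefit;             // Estimated parallelization benefit (0-1)
+  size_t communicationCost;                  // Estimated communication cost
+  size_t synchronizationCost;                // Estimated synchronization cost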
+  size_t synchronizationCost;                // 同步开销估计
+  
+  ParallelizationAnalysis() : isParallelizable(false), suggestedThreadCount(1), parallelType(NONE),
+    hasLoadBalance(true), isDynamicLoadBalanced(false), workComplexity(0.0), requiresReduction(false), 
+    requiresBarrier(false), hasMemoryConflicts(false), hasReadOnlyAccess(false), hasIndependentAccess(false),
+    parallelizationBenefit(0.0), communicationCost(0), synchronizationCost(0) {}
+};
+
+/**
+ * @brief Loop vectorization/parallelization analysis result.
+ */
+class LoopVectorizationResult : public AnalysisResultBase {
+private:
+  Function* AssociatedFunction;
+  std::map<Loop*, VectorizationAnalysis> VectorizationMap;
+  std::map<Loop*, ParallelizationAnalysis> ParallelizationMap;
+  std::map<Loop*, std::vector<PreciseDependence>> DependenceMap;
+
+public:
+  LoopVectorizationResult(Function* F) : AssociatedFunction(F) {}
+  ~LoopVectorizationResult() override = default;
+
+  // Basic interface
+  void addVectorizationAnalysis(Loop* loop, VectorizationAnalysis analysis) {
+    VectorizationMap[loop] = std::move(analysis);
+  }
+  
+  void addParallelizationAnalysis(Loop* loop, ParallelizationAnalysis analysis) {
+    ParallelizationMap[loop] = std::move(analysis);
+  }
+  
+  void addDependenceAnalysis(Loop* loop, std::vector<PreciseDependence> dependences) {
+    DependenceMap[loop] = std::move(dependences);
+  }
+
+  // Queries (each returns nullptr when nothing is recorded for the loop)
+  const VectorizationAnalysis* getVectorizationAnalysis(Loop* loop) const {
+    auto it = VectorizationMap.find(loop);
+    return it != VectorizationMap.end() ? &it->second : nullptr;
+  }
+  
+  const ParallelizationAnalysis* getParallelizationAnalysis(Loop* loop) const {
+    auto it = ParallelizationMap.find(loop);
+    return it != ParallelizationMap.end() ? &it->second : nullptr;
+  }
+  
+  const std::vector<PreciseDependence>* getPreciseDependences(Loop* loop) const {
+    auto it = DependenceMap.find(loop);
+    return it != DependenceMap.end() ? &it->second : nullptr;
+  }
+
+  // Statistics
+  size_t getVectorizableLoopCount() const;
+  size_t getParallelizableLoopCount() const;
+
+  // Optimization suggestions
+  std::vector<Loop*> getVectorizationCandidates() const;
+  std::vector<Loop*> getParallelizationCandidates() const;
+
+  // Print the analysis result
+  void print() const;
+};
+
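+/**
+ * @brief Loop vectorization/parallelization analysis pass.
+ * Runs after loop normalization and performs precise dependence-vector analysis plus a feasibility assessment for vectorization and parallelization.
+ * The focus is on parallelization analysis; vectorization is on hold for now.
+ */
+class LoopVectorizationPass : public AnalysisPass {
+public:
+  // Unique pass ID
+  static void *ID;
+
+  LoopVectorizationPass() : AnalysisPass("LoopVectorization", Pass::Granularity::Function) {}
+
+  // Implements getPassID
+  void *getPassID() const override { return &ID; }
+
+  // Main entry point
+  bool runOnFunction(Function *F, AnalysisManager &AM) override;
+
+  // Returns the analysis result, transferring ownership to the caller
+  std::unique_ptr<AnalysisResultBase> getResult() override { return std::move(CurrentResult); }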
+
+private:
+  std::unique_ptr<LoopVectorizationResult> CurrentResult;
+  
+  // ========== Main analysis method ==========
+  void analyzeLoop(Loop* loop, LoopCharacteristics* characteristics, 
+                   AliasAnalysisResult* aliasAnalysis, SideEffectAnalysisResult* sideEffectAnalysis);
+  
+  // ========== Dependence vector analysis ==========
+  std::vector<PreciseDependence> computeDependenceVectors(Loop* loop, AliasAnalysisResult* aliasAnalysis);
+  DependenceVector computeAccessDependence(Instruction* inst1, Instruction* inst2, Loop* loop);
+  bool areAccessesAffinelyRelated(Value* ptr1, Value* ptr2, Loop* loop);
+  
+  // ========== Vectorization analysis (on hold) ==========
+  VectorizationAnalysis analyzeVectorizability(Loop* loop, const std::vector<PreciseDependence>& dependences,
+                                              LoopCharacteristics* characteristics);
+  
+  // ========== Parallelization analysis ==========
+  ParallelizationAnalysis analyzeParallelizability(Loop* loop, const std::vector<PreciseDependence>& dependences,
+                                                  LoopCharacteristics* characteristics);
+  bool checkParallelizationLegality(Loop* loop, const std::vector<PreciseDependence>& dependences);
+  int estimateOptimalThreadCount(Loop* loop, LoopCharacteristics* characteristics);
+  ParallelizationAnalysis::ParallelizationType determineParallelizationType(Loop* loop, 
+                                                                           const std::vector<PreciseDependence>& dependences);
+  
+  // ========== Parallelization-specific analyses ==========
+  void analyzeReductionPatterns(Loop* loop, ParallelizationAnalysis* analysis);
+  void analyzeMemoryAccessPatterns(Loop* loop, ParallelizationAnalysis* analysis, AliasAnalysisResult* aliasAnalysis);
+  void estimateParallelizationBenefit(Loop* loop, ParallelizationAnalysis* analysis, LoopCharacteristics* characteristics);
+  void identifyPrivatizableVariables(Loop* loop, ParallelizationAnalysis* analysis);
+  void analyzeSynchronizationNeeds(Loop* loop, ParallelizationAnalysis* analysis, const std::vector<PreciseDependence>& dependences);
+  
+  // ========== Helpers ==========
+  std::vector<int> extractInductionCoefficients(Value* ptr, Loop* loop);
+  bool isConstantStride(Value* ptr, Loop* loop, int& stride);
+  bool isIndependentMemoryAccess(Value* ptr1, Value* ptr2, Loop* loop);
+  double estimateWorkComplexity(Loop* loop);
+  bool hasReductionPattern(Value* var, Loop* loop);
+};
+
+} // namespace sysy
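
Note on `DependenceVector`: `getDirectionVector()` is only declared in this header, so its actual behavior is not visible here. As a rough sketch of the idea, a direction vector can be derived from a distance vector by taking the sign of each entry, and a dependence is carried by the outermost loop level whose distance is non-zero. The encoding (-1, 0, +1) and the names `directionOf` and `carriedLevel` are assumptions made for this illustration, not the project's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed encoding: -1, 0, +1 for a negative, zero, or positive distance.
inline std::vector<int> directionOf(const std::vector<int> &distances) {
  std::vector<int> dir;
  dir.reserve(distances.size());
  for (int d : distances) dir.push_back((d > 0) - (d < 0)); // sign of d
  return dir;
}

// Index of the outermost loop level with a non-zero distance,
// or -1 when the dependence is loop-independent (all distances zero).
inline int carriedLevel(const std::vector<int> &distances) {
  for (std::size_t i = 0; i < distances.size(); ++i) {
    if (distances[i] != 0) return static_cast<int>(i);
  }
  return -1;
}
```

With the distance vector `[1,-2]` from the `a[i][j]` / `a[i+1][j-2]` example in the header comment, `directionOf` yields `[1,-1]` and `carriedLevel` yields 0, meaning the dependence is carried by the outer loop.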
--- a/src/include/midend/Pass/Analysis/SideEffectAnalysis.h
+++ b/src/include/midend/Pass/Analysis/SideEffectAnalysis.h
@ -0,0 +1,137 @@
+#pragma once
+
+#include "Pass.h"
+#include "IR.h"
+#include "AliasAnalysis.h"
+#include "CallGraphAnalysis.h"
+#include <unordered_set>
+#include <unordered_map>
+
+namespace sysy {
+
+// Side-effect kinds
+enum class SideEffectType {
+    NO_SIDE_EFFECT,      // No side effect
+    MEMORY_WRITE,        // Memory write (store, memset)
+    FUNCTION_CALL,       // Function call (may have arbitrary side effects)
+    IO_OPERATION,        // I/O operation (printf, scanf, etc.)
+    UNKNOWN              // Unknown side effect
+};
+
+// Side-effect information
+struct SideEffectInfo {
+    SideEffectType type = SideEffectType::NO_SIDE_EFFECT;
+    bool mayModifyGlobal = false;      // May modify global variables
+    bool mayModifyMemory = false;      // May modify memory
+    bool mayCallFunction = false;      // May call a function
+    bool isPure = true;                // Pure function (no side effects; result depends only on the arguments)
+
+    // Merge two side-effect records
+    SideEffectInfo merge(const SideEffectInfo& other) const {
+        SideEffectInfo result;
+        result.type = (type == SideEffectType::NO_SIDE_EFFECT) ? other.type : type;
+        result.mayModifyGlobal = mayModifyGlobal || other.mayModifyGlobal;
+        result.mayModifyMemory = mayModifyMemory || other.mayModifyMemory;
+        result.mayCallFunction = mayCallFunction || other.mayCallFunction;
+        result.isPure = isPure && other.isPure;
+        return result;
+    }
+};
+
+// Result of the side-effect analysis
+class SideEffectAnalysisResult : public AnalysisResultBase {
+private:
+    // Per-instruction side-effect information
+    std::unordered_map<Instruction*, SideEffectInfo> instructionSideEffects;
+    
+    // Per-function side-effect information
+    std::unordered_map<Function*, SideEffectInfo> functionSideEffects;
+    
+    // Side-effect information for known SysY standard library functions
+    std::unordered_map<std::string, SideEffectInfo> knownFunctions;
+
+public:
+    SideEffectAnalysisResult();
+    virtual ~SideEffectAnalysisResult() noexcept override = default;
+    
+    // Get the side-effect information of an instruction
+    const SideEffectInfo& getInstructionSideEffect(Instruction* inst) const;
+    
+    // Get the side-effect information of a function
+    const SideEffectInfo& getFunctionSideEffect(Function* func) const;
+    
+    // Set the side-effect information of an instruction
+    void setInstructionSideEffect(Instruction* inst, const SideEffectInfo& info);
+    
+    // Set the side-effect information of a function
+    void setFunctionSideEffect(Function* func, const SideEffectInfo& info);
+    
+    // Check whether an instruction has side effects
+    bool hasSideEffect(Instruction* inst) const;
+    
+    // Check whether an instruction may modify memory
+    bool mayModifyMemory(Instruction* inst) const;
+    
+    // Check whether an instruction may modify global state
+    bool mayModifyGlobal(Instruction* inst) const;
+    
+    // Check whether a function is pure
+    bool isPureFunction(Function* func) const;
+    
+    // Get the side-effect information of a known function
+    const SideEffectInfo* getKnownFunctionSideEffect(const std::string& funcName) const;
+    
+    // Initialize the side-effect information of known functions
+    void initializeKnownFunctions();
+    
+private:
+};
+
+// Side-effect analysis pass (module-level analysis)
+class SysYSideEffectAnalysisPass : public AnalysisPass {
+public:
+    // Static member serving as this pass's unique ID
+    static void* ID;
+    
+    SysYSideEffectAnalysisPass() : AnalysisPass("SysYSideEffectAnalysis", Granularity::Module) {}
+    
+    // Run the analysis on a module
+    bool runOnModule(Module* M, AnalysisManager& AM) override;
+    
+    // Get the analysis result
+    std::unique_ptr<AnalysisResultBase> getResult() override;
+    
+    // Pure virtual in the Pass base class; must be implemented
+    void* getPassID() const override { return &ID; }
+
+private:
+    // Analysis result
+    std::unique_ptr<SideEffectAnalysisResult> result;
+    
+    // Call-graph analysis result
+    CallGraphAnalysisResult* callGraphAnalysis = nullptr;
+    
+    // Analyze the side effects of a single function (internal module-level helper)
+    SideEffectInfo analyzeFunction(Function* func, AnalysisManager& AM);
+    
+    // Analyze the side effects of a single instruction
+    SideEffectInfo analyzeInstruction(Instruction* inst, Function* currentFunc, AnalysisManager& AM);
+    
+    // Analyze the side effects of a call instruction (using the call graph)
+    SideEffectInfo analyzeCallInstruction(CallInst* call, Function* currentFunc, AnalysisManager& AM);
+    
+    // Analyze the side effects of a store instruction
+    SideEffectInfo analyzeStoreInstruction(StoreInst* store, Function* currentFunc, AnalysisManager& AM);
+    
+    // Analyze the side effects of a memset instruction
+    SideEffectInfo analyzeMemsetInstruction(MemsetInst* memset, Function* currentFunc, AnalysisManager& AM);
+    
+    // Analyze a group of mutually recursive functions with a fixed-point iteration
+    void analyzeStronglyConnectedComponent(const std::vector<Function*>& scc, AnalysisManager& AM);
+    
+    // Check whether inter-procedural side-effect propagation has converged
+    bool hasConverged(const std::unordered_map<Function*, SideEffectInfo>& oldEffects,
+                      const std::unordered_map<Function*, SideEffectInfo>& newEffects) const;
+};
+
+} // namespace sysy
--- a/src/include/midend/Pass/Optimize/BuildCFG.h
+++ b/src/include/midend/Pass/Optimize/BuildCFG.h
@ -0,0 +1,20 @@
+#pragma once
+
+#include "IR.h"
+#include "Pass.h"
+#include <queue>
+#include <set>
+
+namespace sysy {
+
+class BuildCFG : public OptimizationPass {
+public:
+  static void *ID;
+  BuildCFG() : OptimizationPass("BuildCFG", Granularity::Function) {}
+  bool runOnFunction(Function *F, AnalysisManager &AM) override;
+  void getAnalysisUsage(std::set<void *> &analysisDependencies, std::set<void *> &analysisInvalidations) const override;
+  void *getPassID() const override { return &ID; }
+
+};
+
+} // namespace sysy
--- a/src/include/midend/Pass/Optimize/DCE.h
+++ b/src/include/midend/Pass/Optimize/DCE.h
@ -4,6 +4,8 @@
 #include "IR.h"
 #include "SysYIROptUtils.h"
 #include "Dom.h" 
+#include "AliasAnalysis.h"
+#include "SideEffectAnalysis.h"
 #include <unordered_set>
 #include <queue>

@ -25,8 +27,12 @@ public:
 private:
    // Set of live instructions
    std::unordered_set<Instruction*> alive_insts;
+    // Alias analysis result
+    AliasAnalysisResult* aliasAnalysis = nullptr;
+    // Side-effect analysis result
+    SideEffectAnalysisResult* sideEffectAnalysis = nullptr;

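-    // Determine whether the instruction is “naturally alive” (i.e. always kept)
+    // Determine whether the instruction is "naturally alive" (i.e. always kept)
    // inst: the instruction to check
    // Returns: true if the instruction is naturally alive, false otherwise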
    bool isAlive(Instruction* inst);
@ -34,6 +40,9 @@ private:
    // Recursively add a live instruction and its dependencies to the alive_insts set
    // inst: the instruction to mark as live
    void addAlive(Instruction* inst);
+    
+    // Check whether a store instruction may have side effects (via alias analysis)
+    bool mayHaveSideEffect(StoreInst* store);
 };

 // DCE optimization pass class, derived from OptimizationPass
--- a/src/include/midend/Pass/Optimize/GVN.h
+++ b/src/include/midend/Pass/Optimize/GVN.h
@ -0,0 +1,87 @@
+#pragma once
+
+#include "Pass.h"
+#include "IR.h"
+#include "Dom.h"
+#include "SideEffectAnalysis.h"
+#include <unordered_map>
+#include <unordered_set>
+#include <vector>
+#include <string>
+#include <sstream>
+
+namespace sysy {
+
+// Wrapper class holding the core logic of the GVN pass
+class GVNContext {
+public:
+    // Main entry point that runs the GVN optimization
+    void run(Function* func, AnalysisManager* AM, bool& changed);
+
+private:
+    // New value-numbering scheme
+    std::unordered_map<Value*, unsigned> valueToNumber;      // Value -> value number
+    std::unordered_map<unsigned, Value*> numberToValue;      // Value number -> representative value
+    std::unordered_map<std::string, unsigned> expressionToNumber; // Expression key -> value number
+    unsigned nextValueNumber = 1;
+    
+    // Set of visited basic blocks
+    std::unordered_set<BasicBlock*> visited;
+    
+    // Basic blocks in reverse post-order
+    std::vector<BasicBlock*> rpoBlocks;
+    
+    // Instructions scheduled for removal
+    std::unordered_set<Instruction*> needRemove;
+    
+    // Analysis results
+    DominatorTree* domTree = nullptr;
+    SideEffectAnalysisResult* sideEffectAnalysis = nullptr;
+    
+    // Compute the reverse post-order traversal
+    void computeRPO(Function* func);
+    void dfs(BasicBlock* bb);
+    
+    // Value-numbering methods
+    unsigned getValueNumber(Value* value);
+    unsigned assignValueNumber(Value* value);
+    
+    // Basic-block processing
+    void processBasicBlock(BasicBlock* bb, bool& changed);
+    
+    // Instruction processing
+    bool processInstruction(Instruction* inst);
+    
+    // Expression-key construction and lookup
+    std::string buildExpressionKey(Instruction* inst);
+    Value* findExistingValue(const std::string& exprKey, Instruction* inst);
+    
+    // Dominance and safety checks
+    bool dominates(Instruction* a, Instruction* b);
+    bool isMemorySafe(LoadInst* earlierLoad, LoadInst* laterLoad);
+    
+    // Cleanup methods
+    void eliminateRedundantInstructions(bool& changed);
+    void invalidateMemoryValues(StoreInst* store);
+};
+
+// GVN optimization pass class
+class GVN : public OptimizationPass {
+public:
+    // Static member serving as this pass's unique ID
+    static void* ID;
+    
+    GVN() : OptimizationPass("GVN", Granularity::Function) {}
+    
+    // Run the optimization on a function
+    bool runOnFunction(Function* func, AnalysisManager& AM) override;
+    
+    // Return this pass's unique ID (the address of ID, matching the other passes in this patch)
+    void* getPassID() const override { return &ID; }
+    
+    // Declare analysis dependencies
+    void getAnalysisUsage(std::set<void*>& analysisDependencies, 
+                         std::set<void*>& analysisInvalidations) const override;
+};
+
+} // namespace sysy
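The `expressionToNumber` table in `GVNContext` suggests that redundant computations are found by building a string key per expression. A minimal sketch of that idea follows, assuming operands are keyed by their value numbers and that operands of commutative operators are put in a canonical order. `ValueNumberTable`, `numberOfLeaf`, and `numberOfBinary` are hypothetical names used only for illustration; they are not part of this patch.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical miniature of expression-keyed value numbering (not the patch's GVNContext).
class ValueNumberTable {
public:
    // Value number of a leaf operand (variable or constant), assigned on first sight.
    unsigned numberOfLeaf(const std::string& name) {
        return lookupOrAssign("leaf:" + name);
    }

    // Value number of `op(lhs, rhs)`, where lhs and rhs are value numbers.
    unsigned numberOfBinary(const std::string& op, unsigned lhs, unsigned rhs) {
        assert(lhs != 0 && rhs != 0 && "operands must already be numbered");
        const bool commutative = (op == "add" || op == "mul");
        if (commutative && lhs > rhs) std::swap(lhs, rhs);  // canonical operand order
        return lookupOrAssign(op + "(" + std::to_string(lhs) + "," + std::to_string(rhs) + ")");
    }

private:
    std::unordered_map<std::string, unsigned> expressionToNumber;
    unsigned nextValueNumber = 1;  // 0 is reserved for "not numbered"

    unsigned lookupOrAssign(const std::string& key) {
        auto it = expressionToNumber.find(key);
        if (it != expressionToNumber.end()) return it->second;  // same key seen before
        const unsigned number = nextValueNumber++;
        expressionToNumber.emplace(key, number);
        return number;
    }
};
```

With this keying, `a + b` and `b + a` receive the same number while `a - b` and `b - a` do not. A real pass also has to confirm that the earlier value dominates the later use before reusing it, which is presumably the role of `dominates` in the header.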
--- a/src/include/midend/Pass/Optimize/GlobalStrengthReduction.h
+++ b/src/include/midend/Pass/Optimize/GlobalStrengthReduction.h
@ -0,0 +1,107 @@
+#pragma once
+
+#include "Pass.h"
+#include "IR.h"
+#include "SideEffectAnalysis.h"
+#include <unordered_map>
+#include <unordered_set>
+#include <vector>
+#include <cstdint>
+
+namespace sysy {
+
+// Magic-number multiplication parameters used to optimize division
+struct MagicNumber {
+    uint32_t multiplier;
+    int shift;
+    bool needAdd;
+    
+    MagicNumber(uint32_t m, int s, bool add = false) 
+        : multiplier(m), shift(s), needAdd(add) {}
+};
+
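+// Wrapper class holding the core logic of the global strength reduction pass
+class GlobalStrengthReductionContext {
+public:
+    // Constructor; takes an IRBuilder
+    explicit GlobalStrengthReductionContext(IRBuilder* builder) : builder(builder) {}
+    
+    // Main entry point that runs the optimization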
+    void run(Function* func, AnalysisManager* AM, bool& changed);
+
+private:
+    IRBuilder* builder;  // IR builder
+    
+    // Analysis results
+    SideEffectAnalysisResult* sideEffectAnalysis = nullptr;
+    
+    // Optimization counters
+    int algebraicOptCount = 0;
+    int strengthReductionCount = 0;
+    int divisionOptCount = 0;
+    
+    // Main optimization methods
+    bool processBasicBlock(BasicBlock* bb);
+    bool processInstruction(Instruction* inst);
+    
+    // Algebraic simplification methods
+    bool tryAlgebraicOptimization(Instruction* inst);
+    bool optimizeAddition(BinaryInst* inst);
+    bool optimizeSubtraction(BinaryInst* inst);
+    bool optimizeMultiplication(BinaryInst* inst);
+    bool optimizeDivision(BinaryInst* inst);
+    bool optimizeComparison(BinaryInst* inst);
+    bool optimizeLogical(BinaryInst* inst);
+    
+    // Strength reduction methods
+    bool tryStrengthReduction(Instruction* inst);
+    bool reduceMultiplication(BinaryInst* inst);
+    bool reduceDivision(BinaryInst* inst);
+    bool reducePower(CallInst* inst);
+    
+    // Strength reduction for multiplication by non-trivial constants
+    bool tryComplexMultiplication(BinaryInst* inst, Value* variable, int constant);
+    bool findOptimalShiftDecomposition(int constant, std::vector<int>& shifts);
+    Value* createShiftDecomposition(BinaryInst* inst, Value* variable, const std::vector<int>& shifts);
+    
+    // Magic-number multiplication helpers
+    MagicNumber computeMagicNumber(uint32_t divisor);
+    std::pair<int, int> computeMulhMagicNumbers(int divisor);
+    Value* createMagicDivision(BinaryInst* divInst, uint32_t divisor, const MagicNumber& magic);
+    Value* createMagicDivisionLibdivide(BinaryInst* divInst, int divisor);
+    bool isPowerOfTwo(uint32_t n);
+    int log2OfPowerOfTwo(uint32_t n);
+    
+    // Helper methods
+    bool isConstantInt(Value* val, int& constVal);
+    bool isConstantInt(Value* val, uint32_t& constVal);
+    ConstantInteger* getConstantInt(int val);
+    bool hasOnlyLocalUses(Instruction* inst);
+    void replaceWithOptimized(Instruction* original, Value* replacement);
+};
+
+// Global strength reduction optimization pass class
+class GlobalStrengthReduction : public OptimizationPass {
+private:
+    IRBuilder* builder;  // IR builder used to create new instructions
+
+public:
+    // Static member serving as this pass's unique ID
+    static void* ID;
+    
+    // Constructor; takes an IRBuilder
+    explicit GlobalStrengthReduction(IRBuilder* builder) 
+        : OptimizationPass("GlobalStrengthReduction", Granularity::Function), builder(builder) {}
+    
+    // Run the optimization on a function
+    bool runOnFunction(Function* func, AnalysisManager& AM) override;
+    
+    // Return this pass's unique ID (the address of ID, matching the other passes in this patch)
+    void* getPassID() const override { return &ID; }
+    
+    // Declare analysis dependencies
+    void getAnalysisUsage(std::set<void*>& analysisDependencies, 
+                         std::set<void*>& analysisInvalidations) const override;
+};
+
+} // namespace sysy
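The `MagicNumber` fields (`multiplier`, `shift`, `needAdd`) point at the classic trick of replacing division by a constant with a multiply-high and shifts. The sketch below shows one well-known variant for unsigned 32-bit operands, the round-up multiplier with an add-and-shift fixup, which works for every divisor `d >= 1`. It is an illustration only: `UnsignedMagic`, `computeUnsignedMagic`, and `divideByMagic` are hypothetical names, and the patch's own `computeMagicNumber` may use a different scheme.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for illustration; not the patch's MagicNumber.
struct UnsignedMagic {
    uint32_t multiplier;  // low 32 bits of the 33-bit magic constant
    int shift1;           // min(l, 1), where l = ceil(log2(d))
    int shift2;           // max(l - 1, 0)
};

// Compute magic constants for unsigned 32-bit division by d (d >= 1).
inline UnsignedMagic computeUnsignedMagic(uint32_t d) {
    assert(d != 0 && "division by zero has no magic number");
    int l = 0;  // smallest l with 2^l >= d
    while ((uint64_t{1} << l) < d) ++l;
    // m = floor(2^32 * (2^l - d) / d) + 1; fits in 32 bits because 2^l - d < d.
    const uint64_t m = ((uint64_t{1} << 32) * ((uint64_t{1} << l) - d)) / d + 1;
    return {static_cast<uint32_t>(m), l < 1 ? l : 1, l > 1 ? l - 1 : 0};
}

// Evaluate n / d using only a multiply-high, a subtraction, an addition, and shifts.
inline uint32_t divideByMagic(uint32_t n, const UnsignedMagic& magic) {
    // High 32 bits of the 64-bit product (what a mulhu-style instruction returns).
    const uint32_t t = static_cast<uint32_t>((uint64_t{magic.multiplier} * n) >> 32);
    // t <= n always holds here, so n - t cannot wrap and the sum cannot overflow.
    return (t + ((n - t) >> magic.shift1)) >> magic.shift2;
}
```

For a power-of-two divisor the multiplier degenerates to 1 and the sequence reduces to a plain right shift, which is why a separate `isPowerOfTwo` fast path, as declared in the header, is usually cheaper.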
--- a/src/include/midend/Pass/Optimize/InductionVariableElimination.h
+++ b/src/include/midend/Pass/Optimize/InductionVariableElimination.h
@ -0,0 +1,252 @@
+#pragma once
+
+#include "Pass.h"
+#include "IR.h"
+#include "LoopCharacteristics.h"
+#include "Loop.h"
+#include "Dom.h"
+#include "SideEffectAnalysis.h"
+#include "AliasAnalysis.h"
+#include <vector>
+#include <unordered_map>
+#include <unordered_set>
+#include <memory>
+
+namespace sysy {
+
+// Forward declarations
+class LoopCharacteristicsResult;
+class LoopAnalysisResult;
+
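+/**
+ * @brief Dead induction variable information
+ * Records an induction variable that can be eliminated
+ */
+struct DeadInductionVariable {
+  PhiInst* phiInst;                    // Phi instruction
+  std::vector<Instruction*> relatedInsts; // Related increment/decrement instructions
+  Loop* containingLoop;                // Enclosing loop
+  bool canEliminate;                   // Whether it can be eliminated safely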
+  
+  DeadInductionVariable(PhiInst* phi, Loop* loop) 
+    : phiInst(phi), containingLoop(loop), canEliminate(false) {}
+};
+
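+/**
+ * @brief Induction variable elimination context
+ * Encapsulates the core logic and state of the induction variable elimination optimization
+ */
+class InductionVariableEliminationContext {
+public:
+  InductionVariableEliminationContext() {}
+  
+  /**
+   * Run the induction variable elimination optimization
+   * @param F target function
+   * @param AM analysis manager
+   * @return whether the IR was modified
+   */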
+  bool run(Function* F, AnalysisManager& AM);
+
+private:
+  // Cached analysis results
+  LoopAnalysisResult* loopAnalysis = nullptr;
+  LoopCharacteristicsResult* loopCharacteristics = nullptr;
+  DominatorTree* dominatorTree = nullptr;
+  SideEffectAnalysisResult* sideEffectAnalysis = nullptr;
+  AliasAnalysisResult* aliasAnalysis = nullptr;
+  
+  // Storage for dead induction variables
+  std::vector<std::unique_ptr<DeadInductionVariable>> deadIVs;
+  std::unordered_map<Loop*, std::vector<DeadInductionVariable*>> loopToDeadIVs;
+  
+  // ========== Core analysis and optimization phases ==========
+  
+  /**
+   * Phase 1: identify dead induction variables
+   * Find induction variables that have no effective use
+   */
+  void identifyDeadInductionVariables(Function* F);
+  
+  /**
+   * Phase 2: analyze the safety of elimination
+   * Ensure that elimination does not change program semantics
+   */
+  void analyzeSafetyForElimination();
+  
+  /**
+   * Phase 3: perform induction variable elimination
+   * Delete dead induction variables and their related instructions
+   */
+  bool performInductionVariableElimination();
+  
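+  // ========== Helper methods ==========
+  
+  /**
+   * Check whether an induction variable is dead
+   * @param iv induction variable information
+   * @param loop enclosing loop
+   * @return the dead induction variable record if it is dead, otherwise nullptr
+   */
+  std::unique_ptr<DeadInductionVariable> 
+  isDeadInductionVariable(const InductionVarInfo* iv, Loop* loop);
+  
+  /**
+   * Recursively analyze whether a phi instruction and its use chain are all dead code
+   * @param phiInst phi instruction
+   * @param loop enclosing loop
+   * @return whether the phi instruction can be deleted safely
+   */
+  bool isPhiInstructionDeadRecursively(PhiInst* phiInst, Loop* loop);
+  
+  /**
+   * Recursively analyze whether an instruction's use chain is all dead code
+   * @param inst instruction to analyze
+   * @param loop enclosing loop
+   * @param visited set of visited instructions (prevents infinite recursion)
+   * @param currentPath current recursion path (detects cyclic dependencies)
+   * @return whether the instruction's use chain is all dead code
+   */
+  bool isInstructionUseChainDeadRecursively(Instruction* inst, Loop* loop, 
+                                           std::set<Instruction*>& visited, 
+                                           std::set<Instruction*>& currentPath);
+  
+  /**
+   * Check whether a loop has side effects
+   * @param loop loop to check
+   * @return whether the loop has side effects
+   */
+  bool loopHasSideEffects(Loop* loop);
+  
+  /**
+   * Check whether an instruction is used in a loop exit condition
+   * @param inst instruction to check
+   * @param loop enclosing loop
+   * @return whether it is used in a loop exit condition
+   */
+  bool isUsedInLoopExitCondition(Instruction* inst, Loop* loop);
+  
+  /**
+   * Check whether an instruction's result has no effective use
+   * @param inst instruction to check
+   * @param loop enclosing loop
+   * @return whether the result has no effective use
+   */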
+  bool isInstructionResultUnused(Instruction* inst, Loop* loop);
+  
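+  /**
+   * Check whether a store instruction writes to a dead location (using alias analysis)
+   * @param store store instruction
+   * @param loop enclosing loop
+   * @return whether it writes to a dead location
+   */
+  bool isStoreToDeadLocation(StoreInst* store, Loop* loop);
+  
+  /**
+   * Check whether an instruction is dead code or is used only inside the loop
+   * @param inst instruction to check
+   * @param loop enclosing loop
+   * @return whether it is dead code or used only inside the loop
+   */
+  bool isInstructionDeadOrInternalOnly(Instruction* inst, Loop* loop);
+  
+  /**
+   * Check whether an instruction is effectively dead code (with a recursion depth limit)
+   * @param inst instruction to check
+   * @param loop enclosing loop
+   * @param maxDepth maximum recursion depth
+   * @return whether the instruction is effectively dead code
+   */
+  bool isInstructionEffectivelyDead(Instruction* inst, Loop* loop, int maxDepth);
+  
+  /**
+   * Check whether a store instruction is followed by a load
+   * @param store store instruction
+   * @param loop enclosing loop
+   * @return whether a subsequent load exists
+   */
+  bool hasSubsequentLoad(StoreInst* store, Loop* loop);
+  
+  /**
+   * Check whether an instruction is used outside the loop
+   * @param inst instruction to check
+   * @param loop enclosing loop
+   * @return whether it is used outside the loop
+   */
+  bool hasUsageOutsideLoop(Instruction* inst, Loop* loop);
+  
+  /**
+   * Check whether a store instruction is followed by a load outside the loop
+   * @param store store instruction
+   * @param loop enclosing loop
+   * @return whether a subsequent load exists outside the loop
+   */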
+  bool hasSubsequentLoadOutsideLoop(StoreInst* store, Loop* loop);
+  
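+  /**
+   * Recursively check whether a basic-block subtree contains a load from the given location
+   * @param bb basic block
+   * @param ptr pointer
+   * @param visited set of visited basic blocks
+   * @return whether such a load exists
+   */
+  bool hasLoadInSubtree(BasicBlock* bb, Value* ptr, std::set<BasicBlock*>& visited);
+  
+  /**
+   * Collect all instructions related to an induction variable
+   * @param phiInst phi instruction
+   * @param loop enclosing loop
+   * @return list of related instructions
+   */
+  std::vector<Instruction*> collectRelatedInstructions(PhiInst* phiInst, Loop* loop);
+  
+  /**
+   * Check whether eliminating an induction variable is safe
+   * @param deadIV dead induction variable
+   * @return whether it can be eliminated safely
+   */
+  bool isSafeToEliminate(const DeadInductionVariable* deadIV);
+  
+  /**
+   * Eliminate a single dead induction variable
+   * @param deadIV dead induction variable
+   * @return whether elimination succeeded
+   */
+  bool eliminateDeadInductionVariable(DeadInductionVariable* deadIV);
+  
+  /**
+   * Print debug information
+   */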
+  void printDebugInfo();
+};
+
+/**
+ * @brief Induction variable elimination optimization pass
+ * Eliminates useless induction variables in loops to reduce register pressure
+ */
+class InductionVariableElimination : public OptimizationPass {
+public:
+  // Unique pass ID
+  static void *ID;
+  
+  InductionVariableElimination() 
+    : OptimizationPass("InductionVariableElimination", Granularity::Function) {}
+  
+  /**
+   * Run induction variable elimination on a function
+   * @param F target function
+   * @param AM analysis manager
+   * @return whether the IR was modified
+   */
+  bool runOnFunction(Function* F, AnalysisManager& AM) override;
+  
+  /**
+   * Declare analysis dependencies and invalidations
+   */
+  void getAnalysisUsage(std::set<void*>& analysisDependencies, 
+                       std::set<void*>& analysisInvalidations) const override;
+  
+  void* getPassID() const override { return &ID; }
+};
+
+} // namespace sysy
--- a/src/include/midend/Pass/Optimize/LICM.h
+++ b/src/include/midend/Pass/Optimize/LICM.h
@ -0,0 +1,40 @@
+#pragma once
+#include "Pass.h"
+#include "Loop.h"
+#include "LoopCharacteristics.h"
+#include "Dom.h"
+#include <unordered_set>
+#include <vector>
+
+namespace sysy{
+
+class LICMContext {
+public:
+  LICMContext(Function* func, Loop* loop, IRBuilder* builder, const LoopCharacteristics* chars)
+    : func(func), loop(loop), builder(builder), chars(chars) {}
+    // Run the main LICM procedure; returns whether the IR was modified
+    bool run();
+
+private:
+    Function* func;
+    Loop* loop;
+    IRBuilder* builder;
+    const LoopCharacteristics* chars; // Loop characteristics analysis result
+
+    // Hoist all hoistable instructions out of the loop
+    bool hoistInstructions();
+};
+
+
+class LICM : public OptimizationPass{
+private:
+	IRBuilder *builder; ///< IR builder used to insert instructions
+public:
+	static void *ID;
+	LICM(IRBuilder *builder = nullptr) : OptimizationPass("LICM", Granularity::Function) , builder(builder) {}
+	bool runOnFunction(Function *F, AnalysisManager &AM) override;
+	void getAnalysisUsage(std::set<void *> &, std::set<void *> &) const override;
+	void *getPassID() const override { return &ID; }
+};
+
+} // namespace sysy
--- a/src/include/midend/Pass/Optimize/LoopNormalization.h
+++ b/src/include/midend/Pass/Optimize/LoopNormalization.h
@ -0,0 +1,155 @@
+#pragma once
+
+#include "Loop.h"          // Loop analysis
+#include "Dom.h"           // Dominator tree analysis
+#include "IR.h"            // IR definitions
+#include "IRBuilder.h"     // IR builder
+#include "Pass.h"          // Pass framework
+#include <memory>
+#include <set>
+#include <vector>
+
+namespace sysy {
+
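+/**
+ * @brief Loop normalization transformation pass
+ * 
+ * Runs before optimizations such as loop-invariant code motion and is responsible for:
+ * 1. Creating a preheader for loops that lack one
+ * 2. Ensuring the loop structure meets the requirements of later optimizations
+ * 3. Normalizing the loop's control-flow structure
+ * 
+ * Purpose of the preheader:
+ * - Provides an insertion point for hoisted loop-invariant code
+ * - Simplifies loop analysis and optimization
+ * - Ensures the loop has a unique entry point
+ */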
+class LoopNormalizationPass : public OptimizationPass {
+public:
+  // Unique pass ID
+  static void *ID;
+
+  LoopNormalizationPass(IRBuilder* builder) : OptimizationPass("LoopNormalization", Pass::Granularity::Function), builder(builder) {}
+
+  // Implements getPassID
+  void *getPassID() const override { return &ID; }
+
+  // Main entry point
+  bool runOnFunction(Function *F, AnalysisManager &AM) override;
+  
+  // Declare analysis dependencies and invalidations
+  void getAnalysisUsage(std::set<void *> &analysisDependencies, std::set<void *> &analysisInvalidations) const override;
+
+private:
+  // ========== IR builder ==========
+  IRBuilder* builder;                                  // IR builder
+  
+  // ========== Cached analysis results ==========
+  LoopAnalysisResult* loopAnalysis = nullptr;          // Loop structure analysis result
+  DominatorTree* domTree = nullptr;                    // Dominator tree analysis result
+  
+  // ========== Normalization statistics ==========
+  struct NormalizationStats {
+    size_t totalLoops;                    // Total number of loops
+    size_t loopsNeedingPreheader;        // Loops that need a preheader
+    size_t preheadersCreated;            // Preheaders created
+    size_t loopsNormalized;              // Loops normalized
+    size_t redundantPhisRemoved;         // Redundant PHI nodes removed
+    
+    NormalizationStats() : totalLoops(0), loopsNeedingPreheader(0), 
+                          preheadersCreated(0), loopsNormalized(0), 
+                          redundantPhisRemoved(0) {}
+  } stats;
+  
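+  // ========== Core normalization methods ==========
+  
+  /**
+   * Normalize a single loop
+   * @param loop loop to normalize
+   * @return whether any modification was made
+   */
+  bool normalizeLoop(Loop* loop);
+  
+  /**
+   * Create a preheader for a loop
+   * @param loop loop that needs a preheader
+   * @return the created preheader, or nullptr on failure
+   */
+  BasicBlock* createPreheaderForLoop(Loop* loop);
+  
+  /**
+   * Check whether a loop needs a preheader (based on structural requirements)
+   * @param loop loop to check
+   * @return true if a preheader is needed
+   */
+  bool needsPreheader(Loop* loop);
+  
+  /**
+   * Check whether a loop already has a suitable preheader
+   * @param loop loop to check
+   * @return the existing preheader, or nullptr if there is none
+   */
+  BasicBlock* getExistingPreheader(Loop* loop);
+  
+  /**
+   * Update dominator tree relations (after creating a new block)
+   * @param newBlock newly created basic block
+   * @param loop related loop
+   */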
+  void updateDominatorRelations(BasicBlock* newBlock, Loop* loop);
+  
+  /**
+   * Redirect the loop's external predecessors (externalPreds) to the new preheader
+   * @param loop target loop
+   * @param preheader newly created preheader
+   * @param header loop header
+   */
+  void redirectExternalPredecessors(Loop* loop, BasicBlock* preheader, BasicBlock* header, const std::vector<BasicBlock*>& externalPreds);
+  
+  /**
+   * Generate a suitable name for a preheader
+   * @param loop related loop
+   * @return generated preheader name
+   */
+  std::string generatePreheaderName(Loop* loop);
+  
+  /**
+   * Verify that the normalization result is correct
+   * @param loop loop after normalization
+   * @return true if the normalization is correct
+   */
+  bool validateNormalization(Loop* loop);
+  
+  // ========== Helper methods ==========
+  
+  /**
+   * Get the loop's external predecessors (predecessors not inside the loop)
+   * @param loop target loop
+   * @return list of external predecessor blocks
+   */
+  std::vector<BasicBlock*> getExternalPredecessors(Loop* loop);
+  
+  /**
+   * Check whether a basic block is suitable as a preheader
+   * @param block candidate basic block
+   * @param loop target loop
+   * @return true if it is suitable as a preheader
+   */
+  bool isSuitableAsPreheader(BasicBlock* block, Loop* loop);
+  
+  /**
+   * Update PHI nodes to account for the new preheader
+   * @param header loop header
+   * @param preheader new preheader
+   * @param oldPreds original external predecessors
+   */
+  void updatePhiNodesForPreheader(BasicBlock* header, BasicBlock* preheader,
+                                 const std::vector<BasicBlock*>& oldPreds);
+  
+  /**
+   * Print normalization statistics
+   */
+  void printStats(Function* F);
+};
+
+} // namespace sysy
--- a/src/include/midend/Pass/Optimize/LoopStrengthReduction.h
+++ b/src/include/midend/Pass/Optimize/LoopStrengthReduction.h
@ -0,0 +1,233 @@
+#pragma once
+
+#include "Pass.h"
+#include "IR.h"
+#include "LoopCharacteristics.h"
+#include "Loop.h"
+#include "Dom.h"
+#include <vector>
+#include <unordered_map>
+#include <unordered_set>
+#include <memory>
+
+namespace sysy {
+
+// Forward declarations
+class LoopCharacteristicsResult;
+class LoopAnalysisResult;
+
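+/**
+ * @brief Strength reduction candidate
+ * Records an expression that is eligible for strength reduction
+ */
+struct StrengthReductionCandidate {
+  enum OpType {
+    MULTIPLY,    // Multiplication: iv * const
+    DIVIDE,      // Division: iv / 2^n (converted to a right shift)
+    DIVIDE_CONST, // Division: iv / const (optimized with the mulh instruction)
+    REMAINDER    // Remainder: iv % 2^n (converted to a bitwise AND)
+  };
+  
+  enum DivisionStrategy {
+    SIMPLE_SHIFT,     // Plain right shift (valid only for unsigned or non-negative values)
+    SIGNED_CORRECTION, // Signed division correction: (x + ((x >> 31) & mask)) >> k, with mask = 2^k - 1
+    MULH_OPTIMIZATION  // Division by an arbitrary constant using the mulh instruction
+  };
+  
+  Instruction* originalInst;      // Original instruction (e.g. i*4, i/8, i%16)
+  Value* inductionVar;            // Induction variable (e.g. i)
+  OpType operationType;           // Operation type
+  DivisionStrategy divStrategy;   // Division strategy (used only for division)
+  int multiplier;                 // Multiplier/divisor/modulus (e.g. 4, 8, 16)
+  int shiftAmount;                // Shift amount (for powers of two)
+  int offset;                     // Offset (e.g. a constant term)
+  BasicBlock* containingBlock;    // Enclosing basic block
+  Loop* containingLoop;           // Enclosing loop
+  bool hasNegativeValues;         // Whether the induction variable may be negative
+  
+  // New variables created by strength reduction
+  PhiInst* newPhi = nullptr;      // New phi instruction
+  Value* newInductionVar = nullptr; // New induction variable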
+  
+  StrengthReductionCandidate(Instruction* inst, Value* iv, OpType opType, int value, int off, 
+                           BasicBlock* bb, Loop* loop)
+    : originalInst(inst), inductionVar(iv), operationType(opType), 
+      divStrategy(SIMPLE_SHIFT), multiplier(value), offset(off), 
+      containingBlock(bb), containingLoop(loop), hasNegativeValues(false) {
+    
+    // Compute the shift amount (used for strength reduction of division and remainder)
+    if (opType == DIVIDE || opType == REMAINDER) {
+      shiftAmount = 0;
+      int temp = value;
+      while (temp > 1) {
+        temp >>= 1;
+        shiftAmount++;
+      }
+    } else {
+      shiftAmount = 0;
+    }
+  }
+};
+
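+/**
+ * @brief Strength reduction context
+ * Encapsulates the core logic and state of the strength reduction optimization
+ */
+class StrengthReductionContext {
+public:
+  StrengthReductionContext(IRBuilder* builder) : builder(builder) {}
+  
+  /**
+   * Run the strength reduction optimization
+   * @param F target function
+   * @param AM analysis manager
+   * @return whether the IR was modified
+   */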
+  bool run(Function* F, AnalysisManager& AM);
+
+private:
+  IRBuilder* builder;
+  
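+  // Cached analysis results
+  LoopAnalysisResult* loopAnalysis = nullptr;
+  LoopCharacteristicsResult* loopCharacteristics = nullptr;
+  DominatorTree* dominatorTree = nullptr;
+  
+  // Candidate storage
+  std::vector<std::unique_ptr<StrengthReductionCandidate>> candidates;
+  std::unordered_map<Loop*, std::vector<StrengthReductionCandidate*>> loopToCandidates;
+  
+  // ========== Core analysis and optimization phases ==========
+  
+  /**
+   * Phase 1: identify strength reduction candidates
+   * Scan all multiplication instructions in loops to find optimizable patterns
+   */
+  void identifyStrengthReductionCandidates(Function* F);
+  
+  /**
+   * Phase 2: analyze the optimization potential of each candidate
+   * Estimate the benefit of each candidate and filter out those not worth optimizing
+   */
+  void analyzeOptimizationPotential();
+  
+  /**
+   * Phase 3: perform the strength reduction transformation
+   * Apply the actual strength reduction to the selected candidates
+   */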
+  bool performStrengthReduction();
+  
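+  // ========== Helper analysis functions ==========
+  
+  /**
+   * Analyze whether an induction variable may take negative values
+   * @param ivInfo induction variable information
+   * @param loop enclosing loop
+   * @return true if it may be negative
+   */
+  bool analyzeInductionVariableRange(const InductionVarInfo* ivInfo, Loop* loop) const;
+  
+  /**
+   * Generate replacement code for a division
+   * @param candidate optimization candidate
+   * @param builder IR builder
+   * @return replacement value
+   */
+  Value* generateDivisionReplacement(StrengthReductionCandidate* candidate, IRBuilder* builder) const;
+  
+  /**
+   * Generate replacement code for division by an arbitrary constant
+   * @param candidate optimization candidate
+   * @param builder IR builder
+   * @return replacement value
+   */
+  Value* generateConstantDivisionReplacement(StrengthReductionCandidate* candidate, IRBuilder* builder) const;
+  
+  /**
+   * Check whether an instruction is a strength reduction candidate
+   * @param inst instruction to check
+   * @param loop enclosing loop
+   * @return the candidate record if it is a candidate, otherwise nullptr
+   */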
+  std::unique_ptr<StrengthReductionCandidate> 
+  isStrengthReductionCandidate(Instruction* inst, Loop* loop);
+  
+  /**
+   * Check whether a value is an induction variable of the loop
+   * @param val value to check
+   * @param loop loop
+   * @param characteristics loop characteristics
+   * @return the induction variable information if it is one, otherwise nullptr
+   */
+  const InductionVarInfo* 
+  getInductionVarInfo(Value* val, Loop* loop, const LoopCharacteristics* characteristics);
+  
+  /**
+   * Create a new induction variable for a candidate
+   * @param candidate candidate
+   * @return whether creation succeeded
+   */
+  bool createNewInductionVariable(StrengthReductionCandidate* candidate);
+  
+  /**
+   * Replace all uses of the original instruction
+   * @param candidate candidate
+   * @return whether replacement succeeded
+   */
+  bool replaceOriginalInstruction(StrengthReductionCandidate* candidate);
+  
+  /**
+   * Estimate the optimization benefit
+   * Computes the expected performance gain after strength reduction
+   * @param candidate candidate
+   * @return estimated benefit score
+   */
+  double estimateOptimizationBenefit(const StrengthReductionCandidate* candidate);
+  
+  /**
+   * Check whether the optimization is legal
+   * @param candidate candidate
+   * @return whether the optimization can be applied safely
+   */
+  bool isOptimizationLegal(const StrengthReductionCandidate* candidate);
+  
+  /**
+   * Print debug information
+   */
+  void printDebugInfo();
+};
+
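+/**
+ * @brief Loop strength reduction optimization pass
+ * Converts multiplications inside loops into cheaper additions (the candidate types above also cover division and remainder)
+ */
+class LoopStrengthReduction : public OptimizationPass {
+public:
+  // Unique pass ID
+  static void *ID;
+  
+  LoopStrengthReduction(IRBuilder* builder) 
+    : OptimizationPass("LoopStrengthReduction", Granularity::Function), 
+      builder(builder) {}
+  
+  /**
+   * Run strength reduction on a function
+   * @param F target function
+   * @param AM analysis manager
+   * @return whether the IR was modified
+   */
+  bool runOnFunction(Function* F, AnalysisManager& AM) override;
+  
+  /**
+   * Declare analysis dependencies and invalidations
+   */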
+  void getAnalysisUsage(std::set<void*>& analysisDependencies, 
+                       std::set<void*>& analysisInvalidations) const override;
+  
+  void* getPassID() const override { return &ID; }
+
+private:
+  IRBuilder* builder;
+};
+
+} // namespace sysy
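The `SIGNED_CORRECTION` strategy exists because a plain arithmetic right shift rounds toward negative infinity, while signed integer division truncates toward zero. A minimal sketch of the correction follows, assuming 32-bit two's-complement integers and an arithmetic right shift on signed values (guaranteed from C++20, and the behavior of mainstream compilers before that). `divideByPowerOfTwo` is a hypothetical helper name, not a function from this patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes x / 2^k with truncation toward zero, for 1 <= k <= 30.
inline int32_t divideByPowerOfTwo(int32_t x, int k) {
    assert(k >= 1 && k <= 30);
    const int32_t mask = (int32_t{1} << k) - 1;  // 2^k - 1
    const int32_t bias = (x >> 31) & mask;       // mask when x is negative, 0 otherwise
    return (x + bias) >> k;                      // biased arithmetic shift truncates toward zero
}
```

For example, `-7 / 2` must yield `-3`: the bias turns `-7` into `-6` before the shift, whereas shifting `-7` directly would give `-4`. When range analysis proves the induction variable is never negative, the bias is always zero and the cheaper `SIMPLE_SHIFT` strategy suffices.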
--- a/src/include/midend/Pass/Optimize/Mem2Reg.h
+++ b/src/include/midend/Pass/Optimize/Mem2Reg.h
@ -75,11 +75,7 @@ private:
    // --------------------------------------------------------------------

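    // Depth-first traversal of the dominator tree that renames variables and replaces load/store instructions
-    // alloca: the AllocaInst currently being processed
-    // currentBB: the basic block currently being visited
-    // dt: dominator tree analysis result
-    // valueStack: stack of SSA values of the current AllocaInst visible on the current path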
-    void renameVariables(AllocaInst* alloca, BasicBlock* currentBB);
+    void renameVariables(BasicBlock* currentBB);

    // --------------------------------------------------------------------
    // Phase 4: cleanup
--- a/src/include/midend/Pass/Optimize/SCCP.h
+++ b/src/include/midend/Pass/Optimize/SCCP.h
@ -1,196 +1,157 @@
 #pragma once

 #include "IR.h"
+#include "Pass.h"
+#include "SysYIROptUtils.h"
+#include "AliasAnalysis.h"
+#include "SideEffectAnalysis.h"
+#include <cassert>
+#include <iostream>
+#include <map>
+#include <queue>
+#include <set>
+#include <unordered_set>
+#include <vector>
+#include <variant>
+#include <functional>

 namespace sysy {

-// Sparse conditional constant propagation class
-// Sparse Conditional Constant Propagation
-/*
-Pseudocode
-function SCCP_Optimization(Module):
-    for each Function in Module:
-        changed = true
-        while changed:
-            changed = false
-            // Phase 1: constant propagation and folding
-            changed |= PropagateConstants(Function)
-            // Phase 2: control-flow simplification
-            changed |= SimplifyControlFlow(Function)
-        end while
-    end for
-
-function PropagateConstants(Function):
-    // Initialization
-    executableBlocks = {entryBlock}
-    valueState = map<Value, State> // value -> state map
-    instWorkList = Queue()
-    edgeWorkList = Queue()
-    
-    // Initialize the work lists
-    for each inst in entryBlock:
-        instWorkList.push(inst)
-    
-    // Iterate until both work lists are empty
-    while !instWorkList.empty() || !edgeWorkList.empty():
-        // Process the instruction work list
-        while !instWorkList.empty():
-            inst = instWorkList.pop()
-            // Only if the instruction belongs to an executable basic block
-            if executableBlocks.contains(inst.parent):
-                ProcessInstruction(inst)
-        
-        // Process the edge work list
-        while !edgeWorkList.empty():
-            edge = edgeWorkList.pop()
-            ProcessEdge(edge)
-    
-    // Apply constant replacement
-    for each inst in Function:
-        if valueState[inst] == CONSTANT:
-            ReplaceWithConstant(inst, valueState[inst].constant)
-            changed = true
-    
-    return changed
-
-function ProcessInstruction(Instruction inst):
-    switch inst.type:
-        // Binary operation
-        case BINARY_OP:
-            lhs = GetValueState(inst.operands[0])
-            rhs = GetValueState(inst.operands[1])
-            if lhs == CONSTANT && rhs == CONSTANT:
-                newState = ComputeConstant(inst.op, lhs.value, rhs.value)
-                UpdateState(inst, newState)
-            else if lhs == BOTTOM || rhs == BOTTOM:
-                UpdateState(inst, BOTTOM)
-        //phi
-        case PHI:
-            mergedState = ⊤
-            for each incoming in inst.incomings:
-            // Check the state of each incoming value
-                if executableBlocks.contains(incoming.block):
-                    incomingState = GetValueState(incoming.value)
-                    mergedState = Meet(mergedState, incomingState)
-            UpdateState(inst, mergedState)
-        // Conditional branch
-        case COND_BRANCH:
-            cond = GetValueState(inst.condition)
-            if cond == CONSTANT:
-                // Choose the successor based on the constant condition
-                if cond.value == true:
-                    AddEdgeToWorkList(inst.parent, inst.trueTarget)
-                else:
-                    AddEdgeToWorkList(inst.parent, inst.falseTarget)
-            else if cond == BOTTOM:
-                AddEdgeToWorkList(inst.parent, inst.trueTarget)
-                AddEdgeToWorkList(inst.parent, inst.falseTarget)
-        
-        case UNCOND_BRANCH:
-            AddEdgeToWorkList(inst.parent, inst.target)
-        
-        // Handling of other instructions...
-
-function ProcessEdge(Edge edge):
-    fromBB, toBB = edge
-    if !executableBlocks.contains(toBB):
-        executableBlocks.add(toBB)
-        for each inst in toBB:
-            if inst is PHI:
-                instWorkList.push(inst)
-            else:
-                instWorkList.push(inst) // 非PHI指令
-    
-    // 更新PHI节点的输入
-    for each phi in toBB.phis:
-        instWorkList.push(phi)
-
-function SimplifyControlFlow(Function):
-    changed = false
-    // 标记可达基本块
-    ReachableBBs = FindReachableBlocks(Function.entry)
-    
-    // 删除不可达块
-    for each bb in Function.blocks:
-        if !ReachableBBs.contains(bb):
-            RemoveDeadBlock(bb)
-            changed = true
-    
-    // 简化条件分支
-    for each bb in Function.blocks:
-        terminator = bb.terminator
-        if terminator is COND_BRANCH:
-            cond = GetValueState(terminator.condition)
-            if cond == CONSTANT:
-                SimplifyBranch(terminator, cond.value)
-                changed = true
-    
-    return changed
-
-function RemoveDeadBlock(BasicBlock bb):
-    // 1. 更新前驱块的分支指令
-    for each pred in bb.predecessors:
-        UpdateTerminator(pred, bb)
-    
-    // 2. 更新后继块的PHI节点
-    for each succ in bb.successors:
-        RemovePhiIncoming(succ, bb)
-    
-    // 3. 删除块内所有指令
-    for each inst in bb.instructions:
-        inst.remove()
-    
-    // 4. 从函数中移除基本块
-    Function.removeBlock(bb)
-
-function Meet(State a, State b):
-    if a == ⊤: return b
-    if b == ⊤: return a
-    if a == ⊥ || b == ⊥: return ⊥
-    if a.value == b.value: return a
-    return ⊥
-
-function UpdateState(Value v, State newState):
-    oldState = valueState.get(v, ⊤)
-    if newState != oldState:
-        valueState[v] = newState
-        for each user in v.users:
-            if user is Instruction:
-                instWorkList.push(user)
-
-*/
-
-enum class LatticeValue {
-  Top,       // ⊤ (Unknown)
-  Constant,  // c (Constant)
-  Bottom     // ⊥ (Undefined / Varying)
+// States of the three-valued lattice
+enum class LatticeVal {
+  Top,      // ⊤ (unknown / not yet evaluated)
+  Constant, // c (known constant)
+  Bottom    // ⊥ (overdefined / varying, i.e. not a constant)
 };
-// LatticeValue: 用于表示值的状态，Top表示未知，Constant表示常量，Bottom表示未定义或变化的值。
-// 这里的LatticeValue用于跟踪每个SSA值（变量、指令结果）的状态，
-// 以便在SCCP过程中进行常量传播和控制流简化。

-//TODO: 下列数据结构考虑集成到类中，避免重命名问题
-static std::set<Instruction *> Worklist;
-static std::unordered_set<BasicBlock*> Executable_Blocks;
-static std::queue<std::pair<BasicBlock *, BasicBlock *> > Executable_Edges;
-static std::map<Value*, LatticeValue> valueState;
+// 新增枚举来区分常量的实际类型
+enum class ValueType {
+  Integer,
+  Float,
+  Unknown // 用于 Top 和 Bottom 状态
+};

-class SCCP {
+// Concrete state of an SSA value (lattice state plus constant payload)
+struct SSAPValue {
+  LatticeVal state;
+  std::variant<int, float> constantVal; // std::variant holds either an int or a float
+  ValueType constant_type; // records whether the constant is an integer or a float
+
+  // Default constructor: initializes to Top
+  SSAPValue() : state(LatticeVal::Top), constantVal(0), constant_type(ValueType::Unknown) {}
+  // Constructs a Top or Bottom state (no constant payload)
+  SSAPValue(LatticeVal s) : state(s), constantVal(0), constant_type(ValueType::Unknown) {
+    assert((s == LatticeVal::Top || s == LatticeVal::Bottom) && "SSAPValue(LatticeVal) only for Top/Bottom");
+  }
+  // Constructs an int Constant state
+  SSAPValue(int c) : state(LatticeVal::Constant), constantVal(c), constant_type(ValueType::Integer) {}
+  // Constructs a float Constant state
+  SSAPValue(float c) : state(LatticeVal::Constant), constantVal(c), constant_type(ValueType::Float) {}
+
+  // Equality, used to detect whether a state has changed
+  bool operator==(const SSAPValue &other) const {
+    if (state != other.state)
+      return false;
+    if (state == LatticeVal::Constant) {
+        if (constant_type != other.constant_type) return false; // types must match
+        return constantVal == other.constantVal; // std::variant compares the held value
+    }
+    return true; // Top == Top, Bottom == Bottom
+  }
+  bool operator!=(const SSAPValue &other) const { return !(*this == other); }
+};
+
+// SCCP 上下文类，持有每个函数运行时的状态
+class SCCPContext {
 private:
-  Module *pModule;
+  IRBuilder *builder; // IR 构建器，用于插入指令和创建常量
+  AliasAnalysisResult *aliasAnalysis; // 别名分析结果
+  SideEffectAnalysisResult *sideEffectAnalysis; // 副作用分析结果
+
+  // 工作列表
+  // 存储需要重新评估的指令
+  std::queue<Instruction *> instWorkList;
+  // 存储需要重新评估的控制流边 (pair: from_block, to_block)
+  std::queue<std::pair<BasicBlock *, BasicBlock *>> edgeWorkList;
+
+  // 格值映射：SSA Value 到其当前状态
+  std::map<Value *, SSAPValue> valueState;
+  // 可执行基本块集合
+  std::unordered_set<BasicBlock *> executableBlocks;
+  // 追踪已访问的CFG边，防止重复添加，使用 SysYIROptUtils::PairHash
+  std::unordered_set<std::pair<BasicBlock*, BasicBlock*>, SysYIROptUtils::PairHash> visitedCFGEdges;
+
+  // 辅助函数：格操作 Meet
+  SSAPValue Meet(const SSAPValue &a, const SSAPValue &b);
+  // 辅助函数：获取值的当前状态，如果不存在则默认为 Top
+  SSAPValue GetValueState(Value *v);
+  // 辅助函数：更新值的状态，如果状态改变，将所有用户加入指令工作列表
+  void UpdateState(Value *v, SSAPValue newState);
+  // 辅助函数：将边加入边工作列表，并更新可执行块
+  void AddEdgeToWorkList(BasicBlock *fromBB, BasicBlock *toBB);
+  // 辅助函数：标记一个块为可执行
+  void MarkBlockExecutable(BasicBlock* block);
+
+  // 辅助函数：对二元操作进行常量折叠
+  SSAPValue ComputeConstant(BinaryInst *binaryinst, SSAPValue lhsVal, SSAPValue rhsVal);
+  // 辅助函数：对一元操作进行常量折叠
+  SSAPValue ComputeConstant(UnaryInst *unaryInst, SSAPValue operandVal);
+  // 辅助函数：检查是否为已知的纯函数
+  bool isKnownPureFunction(const std::string &funcName) const;
+  // 辅助函数：计算纯函数的常量结果
+  SSAPValue computePureFunctionResult(CallInst *call, const std::vector<SSAPValue> &argValues);
+  // 辅助函数：查找存储到指定位置的常量值
+  SSAPValue findStoredConstantValue(Value *ptr, BasicBlock *currentBB);
+  // 辅助函数：动态检查数组访问是否为常量索引（考虑SCCP状态）
+  bool hasRuntimeConstantAccess(Value *ptr);
+
+  // 主要优化阶段
+  // 阶段1: 常量传播与折叠
+  bool PropagateConstants(Function *func);
+  // 阶段2: 控制流简化
+  bool SimplifyControlFlow(Function *func);
+
+  // 辅助函数：处理单条指令
+  void ProcessInstruction(Instruction *inst);
+  // 辅助函数：处理单条控制流边
+  void ProcessEdge(const std::pair<BasicBlock *, BasicBlock *> &edge);
+
+  // Control-flow simplification helpers
+  // Find all reachable basic blocks (taking constant conditions into account)
+  std::unordered_set<BasicBlock *> FindReachableBlocks(Function *func);
+  // Remove a dead block
+  void RemoveDeadBlock(BasicBlock *bb, Function *func);
+  // Simplify a branch (replace a conditional branch with an unconditional one)
+  void SimplifyBranch(CondBrInst *brInst, bool condVal);
+  // Update the terminator of a predecessor when one of its successors is removed
+  void UpdateTerminator(BasicBlock *predBB, BasicBlock *removedSucc);
+  // Remove a phi node's incoming edge when its predecessor block is removed
+  void RemovePhiIncoming(BasicBlock *phiParentBB, BasicBlock *removedPred);

 public:
-  SCCP(Module *pMoudle) : pModule(pMoudle) {}
+  SCCPContext(IRBuilder *builder) : builder(builder), aliasAnalysis(nullptr), sideEffectAnalysis(nullptr) {}
+  
+  // 设置别名分析结果
+  void setAliasAnalysis(AliasAnalysisResult *aa) { aliasAnalysis = aa; }
+  
+  // 设置副作用分析结果
+  void setSideEffectAnalysis(SideEffectAnalysisResult *sea) { sideEffectAnalysis = sea; }

-  void run();
-  bool PropagateConstants(Function *function);
-  bool SimplifyControlFlow(Function *function);
-  void ProcessInstruction(Instruction *inst);
-  void ProcessEdge(const std::pair<BasicBlock *, BasicBlock *> &edge);
-  void RemoveDeadBlock(BasicBlock *bb);
-  void UpdateState(Value *v, LatticeValue newState);
-  LatticeValue Meet(LatticeValue a, LatticeValue b);
-  LatticeValue GetValueState(Value *v);
+  // 运行 SCCP 优化
+  void run(Function *func, AnalysisManager &AM);
 };

-}  // namespace sysy
+// SCCP 优化遍类，继承自 OptimizationPass
+class SCCP : public OptimizationPass {
+private:
+  IRBuilder *builder; // IR 构建器，作为 Pass 的成员，传入 Context
+
+public:
+  SCCP(IRBuilder *builder) : OptimizationPass("SCCP", Granularity::Function), builder(builder) {}
+  static void *ID;
+  bool runOnFunction(Function *F, AnalysisManager &AM) override;
+  void getAnalysisUsage(std::set<void *> &analysisDependencies, std::set<void *> &analysisInvalidations) const override;
+  void *getPassID() const override { return &ID; }
+};
+
+} // namespace sysy
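The `Meet` helper declared in `SCCPContext` follows the rule given in the removed pseudocode: Top is the identity, Bottom absorbs, and two constants meet to themselves only when equal. A minimal standalone sketch of that rule, assuming an integer-only lattice; `Lattice`, `LatticeCell`, `meet` and `meetExamples` are hypothetical names for illustration and are not part of this patch:

```cpp
#include <cassert>

// Three-level lattice: Top (unknown) > Constant > Bottom (overdefined).
enum class Lattice { Top, Constant, Bottom };

// Hypothetical stand-in for SSAPValue, restricted to int constants.
struct LatticeCell {
  Lattice state;
  int value; // meaningful only when state == Lattice::Constant
  LatticeCell(Lattice s = Lattice::Top, int v = 0) : state(s), value(v) {}
};

inline LatticeCell makeConstant(int v) { return LatticeCell(Lattice::Constant, v); }
inline LatticeCell makeBottom() { return LatticeCell(Lattice::Bottom, 0); }

// Meet of two lattice cells: the result never moves up the lattice.
inline LatticeCell meet(const LatticeCell &a, const LatticeCell &b) {
  if (a.state == Lattice::Top) return b;  // Top is the identity element
  if (b.state == Lattice::Top) return a;
  if (a.state == Lattice::Bottom || b.state == Lattice::Bottom)
    return makeBottom();                  // Bottom absorbs everything
  return a.value == b.value ? a : makeBottom(); // two constants: equal or conflict
}

// Small usage example: merging phi inputs.
inline void meetExamples() {
  assert(meet(LatticeCell(), makeConstant(3)).state == Lattice::Constant);
  assert(meet(makeConstant(3), makeConstant(4)).state == Lattice::Bottom);
}
```

Because a phi node only merges inputs from executable predecessors, an input that is still Top does not prevent the phi from being treated as a constant.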
--- a/src/include/midend/Pass/Optimize/SysYIROptUtils.h
+++ b/src/include/midend/Pass/Optimize/SysYIROptUtils.h
@ -2,6 +2,7 @@

 #include "IR.h"

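+#include <cassert>
+#include <cstdint>
+#include <functional>
+#include <iostream>
+#include <utility>
+
+extern int DEBUG;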
 namespace sysy {

 // 优化工具类，包含一些通用的优化方法
@ -10,12 +11,80 @@ namespace sysy {
 class SysYIROptUtils{

 public:
-  // 仅仅删除use关系
-  static void usedelete(Instruction *instr) {
-    for (auto &use : instr->getOperands()) {
-      Value* val = use->getValue();
-      val->removeUse(use);
+  struct PairHash {
+      template <class T1, class T2>
+      std::size_t operator () (const std::pair<T1, T2>& p) const {
+          auto h1 = std::hash<T1>{}(p.first);
+          auto h2 = std::hash<T2>{}(p.second);
+
+          // Simple hash combination: xor with the shifted second hash.
+          // Cheaper but more collision-prone than boost::hash_combine.
+          return h1 ^ (h2 << 1);
+      }
+  };
+
+  static void RemoveUserOperandUses(User *user) {
+    if (!user) {
+        return;
    }
+
+    // Walk the operand list of this User and detach each Use from the Value it refers to.
+    // This relies on User exposing its (protected) operands through a public
+    // getOperands() accessor.
+    // removeUse is expected to edit only the used Value's uses list, not this
+    // operand list, so forward iteration is safe here.
+    for (const auto& use_ptr : user->getOperands()) {
+        if (use_ptr) {
+            Value *val = use_ptr->getValue(); // the Value this Use refers to (e.g. an AllocaInst)
+            if (val) {
+                val->removeUse(use_ptr); // ask the Value to drop this Use from its uses list
+            }
+        }
+    }
+  }
+  static void usedelete(Instruction *inst) {
+    assert(inst && "Instruction to delete cannot be null.");
+    BasicBlock *parentBlock = inst->getParent();
+    assert(parentBlock && "Instruction must have a parent BasicBlock to be deleted.");
+
+    // Step 1: redirect every user of inst to UndefinedValue.
+    // This empties the uses list that inst holds as a Value.
+    if (!inst->getUses().empty()) {
+        inst->replaceAllUsesWith(UndefinedValue::get(inst->getType()));
+    }
+
+    // Step 2: clean up the operand relations that inst holds as a User.
+    // Every Value used by inst (e.g. an AllocaInst) drops the corresponding Use.
+    // Instruction derives from User, so this cast is an upcast and always safe.
+    RemoveUserOperandUses(static_cast<User*>(inst));
+
+    // Step 3: physically remove the instruction.
+    // This destroys the unique_ptr that owns the Instruction and runs its destructor chain.
+    parentBlock->removeInst(inst);
+  }
+
+  static BasicBlock::iterator usedelete(BasicBlock::iterator inst_it) {
+    Instruction *inst_to_delete = inst_it->get();
+    BasicBlock *parentBlock = inst_to_delete->getParent();
+    assert(parentBlock && "Instruction must have a parent BasicBlock for iterator deletion.");
+
+    // Step 1: redirect every user of the instruction to UndefinedValue
+    if (!inst_to_delete->getUses().empty()) {
+        inst_to_delete->replaceAllUsesWith(UndefinedValue::get(inst_to_delete->getType()));
+    }
+
+    // Step 2: clean up the operand relations
+    RemoveUserOperandUses(static_cast<User*>(inst_to_delete));
+
+    // Step 3: physically remove the instruction and return the next iterator
+    return parentBlock->removeInst(inst_it);
  }

  // 判断是否是全局变量
@ -26,8 +95,230 @@ public:
  // 判断是否是数组
  static bool isArr(Value *val) {
    auto aval = dynamic_cast<AllocaInst *>(val);
-  return aval != nullptr && aval->getNumDims() != 0;
+    // 如果是 AllocaInst 且通过Type::isArray()判断为数组类型
+    return aval && aval->getType()->as<PointerType>()->getBaseType()->isArray();
  }
+  // Check whether val is a pointer to an array
+  static bool isArrPointer(Value *val) {
+    auto aval = dynamic_cast<AllocaInst *>(val);
+    if (!aval) {
+      return false; // not an AllocaInst; check before dereferencing
+    }
+    auto baseType = aval->getType()->as<PointerType>()->getBaseType();
+    // In SysY, array parameters of a function decay to pointers, so the alloca's
+    // base type is a PointerType both for one-dimensional arrays and for
+    // multi-dimensional arrays (a pointer to an ArrayType).
+    return baseType->isPointer();
+  }
+
+  
+  // PHI指令消除相关方法
+  static bool eliminateRedundantPhisInFunction(Function* func){
+    bool changed = false;
+    std::vector<Instruction *> toDelete;
+    for (auto &bb : func->getBasicBlocks()) {
+      for (auto &inst : bb->getInstructions()) {
+        if (auto phi = dynamic_cast<PhiInst *>(inst.get())) {
+          auto incoming = phi->getIncomingValues();
+          if(DEBUG){
+            std::cout << "Checking Phi: " << phi->getName() << " with " << incoming.size() << " incoming values." << std::endl;
+          }
+          if (incoming.size() == 1) {
+            Value *singleVal = incoming[0].second;
+            inst->replaceAllUsesWith(singleVal);
+            toDelete.push_back(inst.get());
+          }
+        }
+        else
+          break; // 只处理Phi指令
+      }
+    }
+    for (auto *phi : toDelete) {
+      usedelete(phi);
+      changed = true; // 标记为已更改
+    }
+    return changed; // 返回是否有删除发生
+  }
+
+  //该实现参考了libdivide的算法
+  static std::pair<int, int> computeMulhMagicNumbers(int divisor) {
+    
+    if (DEBUG) {
+      std::cout << "\n[SR] ===== Computing magic numbers for divisor " << divisor << " (libdivide algorithm) =====" << std::endl;
+    }
+    
+    if (divisor == 0) {
+      if (DEBUG) std::cout << "[SR] Error: divisor must be != 0" << std::endl;
+      return {-1, -1};
+    }
+
+    // libdivide 常数
+    const uint8_t LIBDIVIDE_ADD_MARKER = 0x40;
+    const uint8_t LIBDIVIDE_NEGATIVE_DIVISOR = 0x80;
+    
+    // 辅助函数：计算前导零个数
+    auto count_leading_zeros32 = [](uint32_t val) -> uint32_t {
+      if (val == 0) return 32;
+      return __builtin_clz(val);
+    };
+    
+    // 辅助函数：64位除法返回32位商和余数
+    auto div_64_32 = [](uint32_t high, uint32_t low, uint32_t divisor, uint32_t* rem) -> uint32_t {
+      uint64_t dividend = ((uint64_t)high << 32) | low;
+      uint32_t quotient = dividend / divisor;
+      *rem = dividend % divisor;
+      return quotient;
+    };
+
+    if (DEBUG) {
+      std::cout << "[SR] Input divisor: " << divisor << std::endl;
+    }
+
+    // libdivide_internal_s32_gen 算法实现
+    int32_t d = divisor;
+    uint32_t ud = (uint32_t)d;
+    uint32_t absD = (d < 0) ? -ud : ud;
+    
+    if (DEBUG) {
+      std::cout << "[SR] absD = " << absD << std::endl;
+    }
+    
+    uint32_t floor_log_2_d = 31 - count_leading_zeros32(absD);
+    
+    if (DEBUG) {
+      std::cout << "[SR] floor_log_2_d = " << floor_log_2_d << std::endl;
+    }
+    
+    // 检查 absD 是否为2的幂
+    if ((absD & (absD - 1)) == 0) {
+      if (DEBUG) {
+        std::cout << "[SR] " << absD << " 是2的幂，使用移位方法" << std::endl;
+      }
+      
+      // 对于2的幂，我们只使用移位，不需要魔数
+      int shift = floor_log_2_d;
+      if (d < 0) shift |= 0x80; // 标记负数
+      
+      if (DEBUG) {
+        std::cout << "[SR] Power of 2 result: magic=0, shift=" << shift << std::endl;
+        std::cout << "[SR] ===== End magic computation =====" << std::endl;
+      }
+      
+      // 对于我们的目的，我们将在IR生成中以不同方式处理2的幂
+      // 返回特殊标记
+      return {0, shift};
+    }
+    
+    if (DEBUG) {
+      std::cout << "[SR] " << absD << " is not a power of 2, computing magic number" << std::endl;
+    }
+    
+    // 非2的幂除数的魔数计算
+    uint8_t more;
+    uint32_t rem, proposed_m;
+    
+    // 计算 proposed_m = floor(2^(floor_log_2_d + 31) / absD)
+    proposed_m = div_64_32((uint32_t)1 << (floor_log_2_d - 1), 0, absD, &rem);
+    const uint32_t e = absD - rem;
+    
+    if (DEBUG) {
+      std::cout << "[SR] proposed_m = " << proposed_m << ", rem = " << rem << ", e = " << e << std::endl;
+    }
+    
+    // 确定是否需要"加法"版本
+    const bool branchfree = false; // 使用分支版本
+    
+    if (!branchfree && e < ((uint32_t)1 << floor_log_2_d)) {
+      // 这个幂次有效
+      more = (uint8_t)(floor_log_2_d - 1);
+      if (DEBUG) {
+        std::cout << "[SR] Using basic algorithm, shift = " << (int)more << std::endl;
+      }
+    } else {
+      // 我们需要上升一个等级
+      proposed_m += proposed_m;
+      const uint32_t twice_rem = rem + rem;
+      if (twice_rem >= absD || twice_rem < rem) {
+        proposed_m += 1;
+      }
+      more = (uint8_t)(floor_log_2_d | LIBDIVIDE_ADD_MARKER);
+      if (DEBUG) {
+        std::cout << "[SR] Using add algorithm, proposed_m = " << proposed_m << ", more = " << (int)more << std::endl;
+      }
+    }
+    
+    proposed_m += 1;
+    int32_t magic = (int32_t)proposed_m;
+    
+    // 处理负除数
+    if (d < 0) {
+      more |= LIBDIVIDE_NEGATIVE_DIVISOR;
+      if (!branchfree) {
+        magic = -magic;
+      }
+      if (DEBUG) {
+        std::cout << "[SR] Negative divisor, magic = " << magic << ", more = " << (int)more << std::endl;
+      }
+    }
+    
+    // 为我们的IR生成提取移位量和标志  
+    int shift = more & 0x3F; // 移除标志，保留移位量（位0-5）
+    bool need_add = (more & LIBDIVIDE_ADD_MARKER) != 0;
+    bool is_negative = (more & LIBDIVIDE_NEGATIVE_DIVISOR) != 0;
+    
+    if (DEBUG) {
+      std::cout << "[SR] Final result: magic = " << magic << ", more = " << (int)more 
+                << " (0x" << std::hex << (int)more << std::dec << ")" << std::endl;
+      std::cout << "[SR] Shift = " << shift << ", need_add = " << need_add 
+                << ", is_negative = " << is_negative << std::endl;
+      
+      // Test the magic number using the correct libdivide algorithm
+      std::cout << "[SR] Testing magic number (libdivide algorithm):" << std::endl;
+      int test_values[] = {1, 7, 37, 100, 999, -1, -7, -37, -100};
+      
+      for (int test_val : test_values) {
+        int64_t quotient;
+        
+        // Evaluate the quotient the way libdivide's branchful s32 path does
+        int64_t product = (int64_t)test_val * magic;
+        int64_t high_bits = product >> 32;
+
+        if (need_add) {
+          // ADD_MARKER case: add the dividend before shifting,
+          // or subtract it when the divisor is negative
+          high_bits += is_negative ? -(int64_t)test_val : (int64_t)test_val;
+        }
+        quotient = high_bits >> shift;
+
+        // Sign correction: the arithmetic shift rounds toward negative infinity,
+        // so a negative quotient needs +1 to match C's truncating division
+        if (quotient < 0) {
+          quotient += 1;
+        }
+        
+        int expected = test_val / divisor;
+        
+        bool correct = (quotient == expected);
+        std::cout << "[SR]   " << test_val << " / " << divisor << " = " << quotient 
+                  << " (expected " << expected << ") " << (correct ? "✓" : "✗") << std::endl;
+      }
+      
+      std::cout << "[SR] ===== End magic computation =====" << std::endl;
+    }
+    
+    // 返回魔数、移位量，并在移位中编码ADD_MARKER标志
+    // 我们将使用移位的第6位表示ADD_MARKER，第7位表示负数（如果需要）
+    int encoded_shift = shift;
+    if (need_add) {
+      encoded_shift |= 0x40; // 设置第6位表示ADD_MARKER
+      if (DEBUG) {
+        std::cout << "[SR] Encoding ADD_MARKER in shift: " << encoded_shift << std::endl;
+      }
+    }
+    
+    return {magic, encoded_shift};
+  }
+
 };

 }// namespace sysy
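For reference, the magic-number scheme in `computeMulhMagicNumbers` can be checked outside the compiler. The sketch below reproduces the generation step for non-power-of-two divisors and pairs it with the evaluation step as I understand libdivide's branchful signed 32-bit path. `MagicDivisor`, `genMagic`, `divideByMagic` and `matchesBuiltinDivision` are hypothetical names for illustration, and the code assumes GCC/Clang (`__builtin_clz`) and arithmetic right shifts of negative values.

```cpp
#include <cassert>
#include <cstdint>

struct MagicDivisor {
  int32_t magic;
  uint8_t more; // bits 0-5: shift, bit 6 (0x40): add marker, bit 7 (0x80): negative divisor
};

// Generation step; |d| must not be a power of two (those take the shift-only path).
inline MagicDivisor genMagic(int32_t d) {
  uint32_t ud = static_cast<uint32_t>(d);
  uint32_t absD = d < 0 ? 0u - ud : ud;
  assert((absD & (absD - 1)) != 0 && "powers of two are handled by shifts");
  uint32_t log2d = 31u - static_cast<uint32_t>(__builtin_clz(absD));
  uint64_t dividend = uint64_t{1} << (log2d + 31); // 2^(floor_log_2_d + 31)
  uint32_t m = static_cast<uint32_t>(dividend / absD);
  uint32_t rem = static_cast<uint32_t>(dividend % absD);
  uint8_t more;
  if (absD - rem < (uint32_t{1} << log2d)) {
    more = static_cast<uint8_t>(log2d - 1); // this power is precise enough
  } else {
    m += m;                                 // go up one power and mark "add"
    uint32_t twiceRem = rem + rem;
    if (twiceRem >= absD || twiceRem < rem) m += 1;
    more = static_cast<uint8_t>(log2d | 0x40);
  }
  m += 1;
  if (d < 0) {
    more |= 0x80;
    m = 0u - m; // negate the magic number for negative divisors
  }
  return {static_cast<int32_t>(m), more};
}

// Evaluation step: high half of the product, optional add, shift, sign fix.
inline int32_t divideByMagic(int32_t n, const MagicDivisor &md) {
  uint32_t shift = md.more & 0x3Fu;
  int64_t product = static_cast<int64_t>(md.magic) * n;
  uint32_t uq = static_cast<uint32_t>(product >> 32);
  if (md.more & 0x40) {
    // add the dividend, or subtract it when the divisor is negative
    uq += (md.more & 0x80) ? 0u - static_cast<uint32_t>(n) : static_cast<uint32_t>(n);
  }
  int32_t q = static_cast<int32_t>(uq) >> shift;
  return q + (q < 0 ? 1 : 0); // round toward zero like C's operator/
}

// Compares the magic-number result with built-in division on a small range.
inline bool matchesBuiltinDivision(int32_t d) {
  MagicDivisor md = genMagic(d);
  for (int32_t n = -1000; n <= 1000; ++n) {
    if (divideByMagic(n, md) != n / d) return false;
  }
  return true;
}
```

Note that the sign correction keys on the quotient being negative rather than on the dividend, which is what makes negative divisors work with the negated magic number.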
--- a/src/include/midend/Pass/Optimize/TailCallOpt.h
+++ b/src/include/midend/Pass/Optimize/TailCallOpt.h
@ -0,0 +1,39 @@
+#pragma once
+
+#include "Pass.h"
+#include "Dom.h"
+#include "Loop.h"
+
+namespace sysy {
+
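+/**
+ * @class TailCallOpt
+ * @brief Mid-end optimization pass for tail calls.
+ *
+ * A function-level OptimizationPass. It analyzes and rewrites the IR so that
+ * optimizable tail calls take a cheaper form, reducing call overhead and
+ * improving program performance.
+ *
+ * @note Requires an IRBuilder pointer for building and modifying IR.
+ *
+ * Members:
+ * - runOnFunction: runs tail-call optimization on the given function.
+ * - getPassID: returns the unique identifier of this pass.
+ * - getAnalysisUsage: declares the analyses this pass depends on and invalidates.
+ */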
+class TailCallOpt : public OptimizationPass {
+private:
+  IRBuilder* builder;
+public:
+  TailCallOpt(IRBuilder* builder) : OptimizationPass("TailCallOpt", Granularity::Function), builder(builder) {}
+  static void *ID;
+  bool runOnFunction(Function *F, AnalysisManager &AM) override;
+  void *getPassID() const override { return &ID; }
+  void getAnalysisUsage(std::set<void *> &analysisDependencies, std::set<void *> &analysisInvalidations) const override;
+};
+
+} // namespace sysy
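The header only states the goal of `TailCallOpt`; the implementation lives in `TailCallOpt.cpp`, which is not shown here. As a source-level illustration of the general idea (an assumption about the kind of rewrite involved, not a description of this pass), a self-recursive tail call can be replaced by updating the parameters and jumping back to the function entry. `sumToRecursive`, `sumToLoop` and `tailCallExample` are hypothetical names:

```cpp
#include <cassert>

// Tail-recursive form: the recursive call is the last action of the function,
// so no work remains in the caller's frame after it returns.
inline int sumToRecursive(int n, int acc) {
  if (n == 0) return acc;
  return sumToRecursive(n - 1, acc + n);
}

// Equivalent loop form: the call becomes "update parameters, jump to entry".
inline int sumToLoop(int n, int acc) {
  while (n != 0) {
    acc += n; // new acc = acc + n
    n -= 1;   // new n   = n - 1
  }
  return acc;
}

// Both forms compute the same result; the loop uses constant stack space.
inline void tailCallExample() {
  assert(sumToRecursive(10, 0) == sumToLoop(10, 0));
}
```

At the IR level the same effect is typically reached with phi nodes for the parameters at the loop header, which is presumably why the pass takes an `IRBuilder`.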
--- a/src/include/midend/Pass/Pass.h
+++ b/src/include/midend/Pass/Pass.h
@ -151,17 +151,21 @@ public:
    }
    AnalysisPass *analysisPass = static_cast<AnalysisPass *>(basePass.get());

-    if(DEBUG){
-      std::cout << "Running Analysis Pass: " << analysisPass->getName() << "\n";
-    }
    // 根据分析遍的粒度处理
    switch (analysisPass->getGranularity()) {
      case Pass::Granularity::Module: {
        // 检查是否已存在有效结果
        auto it = moduleCachedResults.find(analysisID);
        if (it != moduleCachedResults.end()) {
+          if(DEBUG) {
+            std::cout << "Using cached result for Analysis Pass: " << analysisPass->getName() << "\n";
+          }
          return static_cast<T *>(it->second.get()); // 返回缓存结果
        }
+        // 只有在实际运行时才打印调试信息
+        if(DEBUG){
+          std::cout << "Running Analysis Pass: " << analysisPass->getName() << "\n";
+        }
        // 运行模块级分析遍
        if (!pModuleRef) {
          std::cerr << "Error: Module reference not set for AnalysisManager to run Module Pass.\n";
@ -183,8 +187,16 @@ public:
        // 检查是否已存在有效结果
        auto it = functionCachedResults.find({F, analysisID});
        if (it != functionCachedResults.end()) {
+          if(DEBUG) {
+            std::cout << "Using cached result for Analysis Pass: " << analysisPass->getName() << " (Function: " << F->getName() << ")\n";
+          }
          return static_cast<T *>(it->second.get()); // 返回缓存结果
        }
+        // 只有在实际运行时才打印调试信息
+        if(DEBUG){
+          std::cout << "Running Analysis Pass: " << analysisPass->getName() << "\n";
+          std::cout << "Function: " << F->getName() << "\n";
+        }
        // 运行函数级分析遍
        analysisPass->runOnFunction(F, *this);
        // 获取结果并缓存
@ -202,8 +214,16 @@ public:
        // 检查是否已存在有效结果
        auto it = basicBlockCachedResults.find({BB, analysisID});
        if (it != basicBlockCachedResults.end()) {
+          if(DEBUG) {
+            std::cout << "Using cached result for Analysis Pass: " << analysisPass->getName() << " (BasicBlock: " << BB->getName() << ")\n";
+          }
          return static_cast<T *>(it->second.get()); // 返回缓存结果
        }
+        // 只有在实际运行时才打印调试信息
+        if(DEBUG){
+          std::cout << "Running Analysis Pass: " << analysisPass->getName() << "\n";
+          std::cout << "BasicBlock: " << BB->getName() << "\n";
+        }
        // 运行基本块级分析遍
        analysisPass->runOnBasicBlock(BB, *this);
        // 获取结果并缓存
@ -279,7 +299,7 @@ private:
  IRBuilder *pBuilder;

 public:
-  PassManager() = default;
+  PassManager() = delete;
  ~PassManager() = default;

  PassManager(Module *module, IRBuilder *builder) : pmodule(module) ,pBuilder(builder), analysisManager(module) {}
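The `AnalysisManager` change above only prints "Running Analysis Pass" when the analysis actually runs, and reports a cache hit otherwise. The underlying look-up-then-compute pattern can be sketched in isolation as follows. `ResultCache` and its members are hypothetical names, and results are plain `int`s here rather than analysis result objects:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Minimal memoizing cache keyed by (function name, analysis name).
class ResultCache {
public:
  // Returns the cached result if present; otherwise runs `compute` once and caches it.
  int getOrCompute(const std::string &function, const std::string &analysis,
                   const std::function<int()> &compute) {
    auto key = std::make_pair(function, analysis);
    auto it = cache.find(key);
    if (it != cache.end()) return it->second; // cache hit: the analysis does not run
    ++runs;                                   // only counted when the analysis really runs
    int result = compute();
    cache.emplace(key, result);
    return result;
  }

  // Drops a cached result, e.g. after a transformation invalidated it.
  void invalidate(const std::string &function, const std::string &analysis) {
    cache.erase(std::make_pair(function, analysis));
  }

  int runCount() const { return runs; }

private:
  std::map<std::pair<std::string, std::string>, int> cache;
  int runs = 0;
};

// Usage example: the second request is served from the cache.
inline void cacheExample() {
  ResultCache rc;
  assert(rc.getOrCompute("main", "dom", [] { return 42; }) == 42);
  assert(rc.getOrCompute("main", "dom", [] { return 7; }) == 42);
  assert(rc.runCount() == 1);
}
```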
--- a/src/include/midend/SysYIRGenerator.h
+++ b/src/include/midend/SysYIRGenerator.h
@ -51,6 +51,7 @@ public:
                         Module *pModule, IRBuilder *pBuilder);

  static void initExternalFunction(Module *pModule, IRBuilder *pBuilder);
+  static void modify_timefuncname(Module *pModule);
 };

 class SysYIRGenerator : public SysYBaseVisitor {
@ -86,7 +87,60 @@ private:
    case LPAREN: case RPAREN: return 0; // Parentheses have lowest precedence for stack logic
    default: return -1; // Unknown operator
    }
-  }
+  };
+
+  struct ExpKey {
+    BinaryOp op;  ///< operator
+    Value *left;  ///< left operand
+    Value *right; ///< right operand
+    ExpKey(BinaryOp op, Value *left, Value *right) : op(op), left(left), right(right) {}
+
+    bool operator<(const ExpKey &other) const {
+      if (op != other.op)
+        return op < other.op; ///< compare operators
+      if (left != other.left)
+        return left < other.left; ///< compare left operands
+      return right < other.right; ///< compare right operands
+    } ///< ordering so that ExpKey can be used as a std::map key
+  };
+
+  struct UnExpKey {
+    BinaryOp op;    ///< 一元操作符
+    Value *operand; ///< 操作数
+    UnExpKey(BinaryOp op, Value *operand) : op(op), operand(operand) {}
+
+    bool operator<(const UnExpKey &other) const {
+      if (op != other.op)
+        return op < other.op;         ///< 比较操作符
+      return operand < other.operand; ///< 比较操作数
+    } ///< 重载小于运算符用于比较UnExpKey
+  };
+
+  struct GEPKey {
+    Value *basePointer;
+    std::vector<Value *> indices;
+
+    // 为 std::map 定义比较运算符，使得 GEPKey 可以作为键
+    bool operator<(const GEPKey &other) const {
+      if (basePointer != other.basePointer) {
+        return basePointer < other.basePointer;
+      }
+      // 逐个比较索引，确保顺序一致
+      if (indices.size() != other.indices.size()) {
+        return indices.size() < other.indices.size();
+      }
+      for (size_t i = 0; i < indices.size(); ++i) {
+        if (indices[i] != other.indices[i]) {
+          return indices[i] < other.indices[i];
+        }
+      }
+      return false; // 如果 basePointer 和所有索引都相同，则认为相等
+    }
+  };
+  std::map<GEPKey, Value*> availableGEPs; ///< 用于存储 GEP 的缓存
+  std::map<ExpKey, Value*> availableBinaryExpressions;
+  std::map<UnExpKey, Value*> availableUnaryExpressions;
+  std::map<Value*, Value*> availableLoads;

 public:
  SysYIRGenerator() = default;
@ -167,6 +221,15 @@ public:
  Value* computeExp(SysYParser::ExpContext *ctx, Type* targetType = nullptr);
  Value* computeAddExp(SysYParser::AddExpContext *ctx, Type* targetType = nullptr);
  void compute();
+
+  // 参数是发生 store 操作的目标地址/变量的 Value*
+  void invalidateExpressionsOnStore(Value* storedAddress);
+
+  // 清除因函数调用而失效的表达式缓存（保守策略）
+  void invalidateExpressionsOnCall();
+
+  // 在进入新的基本块时清空所有表达式缓存
+  void enterNewBasicBlock();
 public:
  // 获取GEP指令的地址
  Value* getGEPAddressInst(Value* basePointer, const std::vector<Value*>& indices);
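The `availableBinaryExpressions` / `availableLoads` maps and the three invalidation hooks added to `SysYIRGenerator` amount to a per-basic-block expression cache. A toy sketch of that idea is given below, using integer ids in place of `Value*`. `LocalExprCache` and its methods are hypothetical, and the invalidation policy shown (a store drops the cached load of that address, a call drops all loads) is one conservative assumption; the real policy is in the `.cpp` file, which is not part of this excerpt.

```cpp
#include <cassert>
#include <map>
#include <tuple>

using ValueId = int;          // stand-in for Value*
enum class Op { Add, Mul };   // stand-in for BinaryOp

class LocalExprCache {
public:
  // Returns an existing equivalent expression, or registers `fresh` and returns it.
  ValueId lookupOrInsert(Op op, ValueId lhs, ValueId rhs, ValueId fresh) {
    auto key = std::make_tuple(op, lhs, rhs);
    auto it = binary.find(key);
    if (it != binary.end()) return it->second;
    binary.emplace(key, fresh);
    return fresh;
  }

  // Same idea for loads, keyed by the address being read.
  ValueId lookupOrInsertLoad(ValueId address, ValueId fresh) {
    auto it = loads.find(address);
    if (it != loads.end()) return it->second;
    loads.emplace(address, fresh);
    return fresh;
  }

  void invalidateOnStore(ValueId address) { loads.erase(address); } // memory at address changed
  void invalidateOnCall() { loads.clear(); }       // conservative: a call may write anything
  void enterNewBasicBlock() { binary.clear(); loads.clear(); } // cache is block-local

private:
  std::map<std::tuple<Op, ValueId, ValueId>, ValueId> binary;
  std::map<ValueId, ValueId> loads;
};

// Usage example: the second "1 + 2" reuses value 10 instead of creating value 11.
inline void exprCacheExample() {
  LocalExprCache c;
  assert(c.lookupOrInsert(Op::Add, 1, 2, 10) == 10);
  assert(c.lookupOrInsert(Op::Add, 1, 2, 11) == 10);
}
```

Keying on operand identity, as `ExpKey` does with pointers, is only sound inside a single basic block of SSA values, which is why the cache is cleared on block entry. A store through a pointer that may alias another address would need a broader invalidation than the single-address erase shown here.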
--- a/src/midend/CMakeLists.txt
+++ b/src/midend/CMakeLists.txt
@ -6,10 +6,25 @@ add_library(midend_lib STATIC
    Pass/Pass.cpp
    Pass/Analysis/Dom.cpp
    Pass/Analysis/Liveness.cpp
+    Pass/Analysis/Loop.cpp
+    Pass/Analysis/LoopCharacteristics.cpp
+    Pass/Analysis/LoopVectorization.cpp
+    Pass/Analysis/AliasAnalysis.cpp
+    Pass/Analysis/SideEffectAnalysis.cpp
+    Pass/Analysis/CallGraphAnalysis.cpp
    Pass/Optimize/DCE.cpp
    Pass/Optimize/Mem2Reg.cpp
    Pass/Optimize/Reg2Mem.cpp
+    Pass/Optimize/GVN.cpp
    Pass/Optimize/SysYIRCFGOpt.cpp
+    Pass/Optimize/SCCP.cpp
+    Pass/Optimize/LoopNormalization.cpp
+    Pass/Optimize/LICM.cpp
+    Pass/Optimize/LoopStrengthReduction.cpp
+    Pass/Optimize/InductionVariableElimination.cpp
+    Pass/Optimize/GlobalStrengthReduction.cpp
+    Pass/Optimize/BuildCFG.cpp
+    Pass/Optimize/TailCallOpt.cpp
 )

 # 包含中端模块所需的头文件路径
--- a/src/midend/IR.cpp
+++ b/src/midend/IR.cpp
--- a/src/midend/Pass/Analysis/AliasAnalysis.cpp
+++ b/src/midend/Pass/Analysis/AliasAnalysis.cpp
@ -0,0 +1,559 @@
+#include "AliasAnalysis.h"
+#include "SysYIRPrinter.h"
+#include <iostream>
+
+extern int DEBUG;
+
+namespace sysy {
+
+// 静态成员初始化
+void *SysYAliasAnalysisPass::ID = (void *)&SysYAliasAnalysisPass::ID;
+
+// ========== AliasAnalysisResult 实现 ==========
+
+void AliasAnalysisResult::print() const {
+    std::cout << "---- Alias Analysis Results for Function: " << AssociatedFunction->getName() << " ----\n";
+    
+    // 打印内存位置信息
+    std::cout << "  Memory Locations (" << LocationMap.size() << "):\n";
+    for (const auto& pair : LocationMap) {
+        const auto& loc = pair.second;
+        std::cout << "    - Base: " << loc->basePointer->getName();
+        std::cout << " (Type: ";
+        if (loc->isLocalArray) std::cout << "Local";
+        else if (loc->isFunctionParameter) std::cout << "Parameter";
+        else if (loc->isGlobalArray) std::cout << "Global";
+        else std::cout << "Unknown";
+        std::cout << ")\n";
+    }
+    
+    // 打印别名关系
+    std::cout << "  Alias Relations (" << AliasMap.size() << "):\n";
+    for (const auto& pair : AliasMap) {
+        std::cout << "    - (" << pair.first.first->getName() << ", " << pair.first.second->getName() << "): ";
+        switch (pair.second) {
+            case AliasType::NO_ALIAS: std::cout << "No Alias"; break;
+            case AliasType::SELF_ALIAS: std::cout << "Self Alias"; break;
+            case AliasType::POSSIBLE_ALIAS: std::cout << "Possible Alias"; break;
+            case AliasType::UNKNOWN_ALIAS: std::cout << "Unknown Alias"; break;
+        }
+        std::cout << "\n";
+    }
+    std::cout << "-----------------------------------------------------------\n";
+}
+
+AliasType AliasAnalysisResult::queryAlias(Value* ptr1, Value* ptr2) const {
+  auto key = std::make_pair(ptr1, ptr2);
+  auto it = AliasMap.find(key);
+  if (it != AliasMap.end()) {
+    return it->second;
+  }
+  
+  // 尝试反向查找
+  key = std::make_pair(ptr2, ptr1);
+  it = AliasMap.find(key);
+  if (it != AliasMap.end()) {
+    return it->second;
+  }
+  
+  return AliasType::UNKNOWN_ALIAS; // 保守估计
+}
+
+const MemoryLocation* AliasAnalysisResult::getMemoryLocation(Value* ptr) const {
+  auto it = LocationMap.find(ptr);
+  return (it != LocationMap.end()) ? it->second.get() : nullptr;
+}
+
+bool AliasAnalysisResult::isLocalArray(Value* ptr) const {
+  const MemoryLocation* loc = getMemoryLocation(ptr);
+  return loc && loc->isLocalArray;
+}
+
+bool AliasAnalysisResult::isFunctionParameter(Value* ptr) const {
+  const MemoryLocation* loc = getMemoryLocation(ptr);
+  return loc && loc->isFunctionParameter;
+}
+
+bool AliasAnalysisResult::isGlobalArray(Value* ptr) const {
+  const MemoryLocation* loc = getMemoryLocation(ptr);
+  return loc && loc->isGlobalArray;
+}
+
+bool AliasAnalysisResult::hasConstantAccess(Value* ptr) const {
+  const MemoryLocation* loc = getMemoryLocation(ptr);
+  return loc && loc->hasConstantIndices;
+}
+
+AliasAnalysisResult::Statistics AliasAnalysisResult::getStatistics() const {
+  Statistics stats = {0};
+  
+  stats.totalQueries = AliasMap.size();
+  
+  for (auto& pair : AliasMap) {
+    switch (pair.second) {
+      case AliasType::NO_ALIAS: stats.noAlias++; break;
+      case AliasType::SELF_ALIAS: stats.selfAlias++; break;
+      case AliasType::POSSIBLE_ALIAS: stats.possibleAlias++; break;
+      case AliasType::UNKNOWN_ALIAS: stats.unknownAlias++; break;
+    }
+  }
+  
+  for (auto& loc : LocationMap) {
+    if (loc.second->isLocalArray) stats.localArrays++;
+    if (loc.second->isFunctionParameter) stats.functionParameters++;
+    if (loc.second->isGlobalArray) stats.globalArrays++;
+    if (loc.second->hasConstantIndices) stats.constantAccesses++;
+  }
+  
+  return stats;
+}
+
+void AliasAnalysisResult::printStatics() const {
+  std::cout << "=== Alias Analysis Results ===" << std::endl;
+  
+  auto stats = getStatistics();
+  std::cout << "Total queries: " << stats.totalQueries << std::endl;
+  std::cout << "No alias: " << stats.noAlias << std::endl;
+  std::cout << "Self alias: " << stats.selfAlias << std::endl;
+  std::cout << "Possible alias: " << stats.possibleAlias << std::endl;
+  std::cout << "Unknown alias: " << stats.unknownAlias << std::endl;
+  std::cout << "Local arrays: " << stats.localArrays << std::endl;
+  std::cout << "Function parameters: " << stats.functionParameters << std::endl;
+  std::cout << "Global arrays: " << stats.globalArrays << std::endl;
+  std::cout << "Constant accesses: " << stats.constantAccesses << std::endl;
+}
+
+void AliasAnalysisResult::addMemoryLocation(std::unique_ptr<MemoryLocation> location) {
+  Value* ptr = location->accessPointer;
+  LocationMap[ptr] = std::move(location);
+}
+
+void AliasAnalysisResult::addAliasRelation(Value* ptr1, Value* ptr2, AliasType type) {
+  auto key = std::make_pair(ptr1, ptr2);
+  AliasMap[key] = type;
+}
+
+// ========== SysYAliasAnalysisPass 实现 ==========
+
+bool SysYAliasAnalysisPass::runOnFunction(Function *F, AnalysisManager &AM) {
+  if (DEBUG) {
+    std::cout << "Running SysY Alias Analysis on function: " << F->getName() << std::endl;
+  }
+  
+  // 创建分析结果
+  CurrentResult = std::make_unique<AliasAnalysisResult>(F);
+  
+  // 执行主要分析步骤
+  collectMemoryAccesses(F);
+  buildAliasRelations(F);
+  optimizeForSysY(F);
+  
+  if (DEBUG) {
+    CurrentResult->print();
+    CurrentResult->printStatics();
+  }
+
+  return false; // 分析遍不修改IR
+}
+
+void SysYAliasAnalysisPass::collectMemoryAccesses(Function* F) {
+  // 收集函数中所有内存访问指令
+  for (auto& bb : F->getBasicBlocks()) {
+    for (auto& inst : bb->getInstructions()) {
+      Value* ptr = nullptr;
+      
+      if (auto* loadInst = dynamic_cast<LoadInst*>(inst.get())) {
+        ptr = loadInst->getPointer();
+      } else if (auto* storeInst = dynamic_cast<StoreInst*>(inst.get())) {
+        ptr = storeInst->getPointer();
+      }
+      
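+      if (ptr) {
+        // Reuse the record of a pointer that was already seen; creating a new
+        // one each time would overwrite its accessInsts and read/write flags
+        auto& locationMap = CurrentResult->LocationMap;
+        if (locationMap.find(ptr) == locationMap.end()) {
+          CurrentResult->addMemoryLocation(createMemoryLocation(ptr));
+        }
+        MemoryLocation* location = locationMap[ptr].get();
+        location->accessInsts.push_back(inst.get());
+        
+        // Update the read/write flags
+        if (dynamic_cast<LoadInst*>(inst.get())) {
+          location->hasReads = true;
+        } else {
+          location->hasWrites = true;
+        }
+      }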
+    }
+  }
+}
+
+void SysYAliasAnalysisPass::buildAliasRelations(Function *F) {
+  // Build the alias relation between every pair of memory accesses
+  auto& locationMap = CurrentResult->LocationMap;
+  
+  std::vector<Value*> allPointers;
+  for (auto& pair : locationMap) {
+    allPointers.push_back(pair.first);
+  }
+  
+  // Compare all pointers pairwise
+  for (size_t i = 0; i < allPointers.size(); ++i) {
+    for (size_t j = i + 1; j < allPointers.size(); ++j) {
+      Value* ptr1 = allPointers[i];
+      Value* ptr2 = allPointers[j];
+      
+      MemoryLocation* loc1 = locationMap[ptr1].get();
+      MemoryLocation* loc2 = locationMap[ptr2].get();
+      
+      AliasType aliasType = analyzeAliasBetween(loc1, loc2);
+      CurrentResult->addAliasRelation(ptr1, ptr2, aliasType);
+    }
+  }
+}
+
+void SysYAliasAnalysisPass::optimizeForSysY(Function* F) {
+  // SysY-specific refinements
+  applySysYConstraints(F);
+  optimizeParameterAnalysis(F);
+  optimizeArrayAccessAnalysis(F);
+}
+
+std::unique_ptr<MemoryLocation> SysYAliasAnalysisPass::createMemoryLocation(Value* ptr) {
+  Value* basePtr = getBasePointer(ptr);
+  auto location = std::make_unique<MemoryLocation>(basePtr, ptr);
+  
+  // Classify the memory kind and the index pattern
+  analyzeMemoryType(location.get());
+  analyzeIndexPattern(location.get());
+  
+  return location;
+}
+
+Value* SysYAliasAnalysisPass::getBasePointer(Value* ptr) {
+  // Recursively strip GEP instructions to find the underlying base pointer
+  if (auto* gepInst = dynamic_cast<GetElementPtrInst*>(ptr)) {
+    return getBasePointer(gepInst->getBasePointer());
+  }
+  return ptr;
+}
+
+void SysYAliasAnalysisPass::analyzeMemoryType(MemoryLocation* location) {
+  Value* base = location->basePointer;
+  
+  // Classify the base pointer: local alloca, function argument, or global
+  if (dynamic_cast<AllocaInst*>(base)) {
+    location->isLocalArray = true;
+  } else if (dynamic_cast<Argument*>(base)) {
+    location->isFunctionParameter = true;
+  } else if (dynamic_cast<GlobalValue*>(base)) {
+    location->isGlobalArray = true;
+  }
+}
+
+void SysYAliasAnalysisPass::analyzeIndexPattern(MemoryLocation* location) {
+  // Analyze the index pattern of the GEP instruction
+  if (auto* gepInst = dynamic_cast<GetElementPtrInst*>(location->accessPointer)) {
+    // Start from true and reset to false on the first non-constant index
+    location->hasConstantIndices = true;
+    
+    // Collect all indices
+    for (unsigned i = 0; i < gepInst->getNumIndices(); ++i) {
+      Value* index = gepInst->getIndex(i);
+      location->indices.push_back(index);
+      
+      // Check whether the index is a constant
+      if (!isConstantValue(index)) {
+        location->hasConstantIndices = false;
+      }
+    }
+    
+    // Check whether any index may be a loop variable
+    Function* containingFunc = nullptr;
+    if (auto* inst = dynamic_cast<Instruction*>(location->basePointer)) {
+      containingFunc = inst->getParent()->getParent();
+    } else if (auto* arg = dynamic_cast<Argument*>(location->basePointer)) {
+      containingFunc = arg->getParent();
+    }
+    
+    if (containingFunc) {
+      location->hasLoopVariableIndex = hasLoopVariableInIndices(location->indices, containingFunc);
+    }
+    
+    // Compute the constant offset (the sum of the constant indices)
+    if (location->hasConstantIndices) {
+      location->constantOffset = calculateConstantOffset(location->indices);
+    }
+  }
+}
+
+AliasType SysYAliasAnalysisPass::analyzeAliasBetween(MemoryLocation* loc1, MemoryLocation* loc2) {
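+  // Determine the alias relation between two memory locations
+  
+  // 1. Same base pointer: the indices decide the result
+  if (loc1->basePointer == loc2->basePointer) {
+    // The same access pointer denotes exactly the same memory location
+    if (loc1->accessPointer == loc2->accessPointer) {
+      return AliasType::SELF_ALIAS;
+    }
+    
+    // Same base pointer but different access pointers: compare the indices
+    return compareIndices(loc1, loc2);
+  }
+  
+  // 2. Different base pointers: dispatch on the kinds of memory involved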
+  if ((loc1->isLocalArray && loc2->isLocalArray)) {
+    return compareLocalArrays(loc1, loc2);
+  }
+  
+  if ((loc1->isFunctionParameter && loc2->isFunctionParameter)) {
+    return compareParameters(loc1, loc2);
+  }
+  
+  if ((loc1->isGlobalArray || loc2->isGlobalArray)) {
+    return compareWithGlobal(loc1, loc2);
+  }
+  
+  return compareMixedTypes(loc1, loc2);
+}
+
+AliasType SysYAliasAnalysisPass::compareIndices(MemoryLocation* loc1, MemoryLocation* loc2) {
+  // Compare two accesses that share a base pointer but use different indices
+  
+  // When both accesses use only constant indices they can be compared exactly
+  if (loc1->hasConstantIndices && loc2->hasConstantIndices) {
+    // Compare the number of indices; this assumes that accesses at different
+    // nesting depths of the same array never overlap
+    if (loc1->indices.size() != loc2->indices.size()) {
+      return AliasType::NO_ALIAS;
+    }
+    
+    // Compare the index values one by one
+    for (size_t i = 0; i < loc1->indices.size(); ++i) {
+      Value* idx1 = loc1->indices[i];
+      Value* idx2 = loc2->indices[i];
+      
+      // Both are constants: compare their values
+      auto* const1 = dynamic_cast<ConstantInteger*>(idx1);
+      auto* const2 = dynamic_cast<ConstantInteger*>(idx2);
+      
+      if (const1 && const2) {
+        int val1 = std::get<int>(const1->getVal());
+        int val2 = std::get<int>(const2->getVal());
+        
+        if (val1 != val2) {
+          return AliasType::NO_ALIAS;  // Different constant indices never alias
+        }
+      } else {
+        // Not a constant: the relation cannot be decided
+        return AliasType::POSSIBLE_ALIAS;
+      }
+    }
+    
+    // All indices are equal
+    return AliasType::SELF_ALIAS;
+  }
+  
+  // At least one non-constant index: stay conservative
+  return AliasType::POSSIBLE_ALIAS;
+}
+
+AliasType SysYAliasAnalysisPass::compareLocalArrays(MemoryLocation* loc1, MemoryLocation* loc2) {
+  // Distinct local arrays never alias
+  return AliasType::NO_ALIAS;
+}
+
+AliasType SysYAliasAnalysisPass::compareParameters(MemoryLocation* loc1, MemoryLocation* loc2) {
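+  // SysY-specific: configurable aliasing policy for array parameters
+  // 
+  // Array parameters in SysY take one of these forms:
+  //   void func(int a[], int b[])     - one-dimensional array parameters
+  //   void func(int a[][10], int b[]) - multi-dimensional array parameter
+  //
+  // Default (conservative) policy: distinct array parameters may alias,
+  //   func(arr, arr);  // because the same array can be passed for both
+  //
+  // Aggressive policy: assume distinct array parameters never receive the
+  //   same array (intended for the evaluation environment, where this is rare)
+  
+  if (useAggressiveParameterAnalysis()) {
+    // Aggressive policy: distinct array parameters are assumed not to alias
+    return AliasType::NO_ALIAS;
+  } else {
+    // Conservative policy: distinct array parameters may alias
+    return AliasType::POSSIBLE_ALIAS;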
+  }
+}
+
+AliasType SysYAliasAnalysisPass::compareWithGlobal(MemoryLocation* loc1, MemoryLocation* loc2) {
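+  // Alias analysis for accesses that involve a global array
+  // Handles every case in which at least one side is a global array
+  
+  // SysY-specific: a local array never aliases a global array
+  if ((loc1->isLocalArray && loc2->isGlobalArray) || 
+      (loc1->isGlobalArray && loc2->isLocalArray)) {
+    // Local arrays live on the stack, global arrays in the global data area
+    return AliasType::NO_ALIAS;
+  }
+  
+  // SysY-specific: an array parameter may alias a global array
+  if ((loc1->isFunctionParameter && loc2->isGlobalArray) || 
+      (loc1->isGlobalArray && loc2->isFunctionParameter)) {
+    // The parameter may point into the global array, so stay conservative
+    return AliasType::POSSIBLE_ALIAS;
+  }
+  
+  // Every other case involving a global array is handled conservatively
+  return AliasType::POSSIBLE_ALIAS;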
+}
+
+AliasType SysYAliasAnalysisPass::compareMixedTypes(MemoryLocation* loc1, MemoryLocation* loc2) {
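+  // Alias analysis for accesses of mixed kinds
+  // Handles the relation between different kinds of memory
+  
+  // SysY-specific: a local array and an array parameter normally do not alias
+  // Typical case:
+  //   void func(int p[]) {       // p is an array parameter
+  //       int local[10];         // local is a local array
+  //       p[0] = local[0];       // mixed-kind access
+  //   }
+  // Or with a multi-dimensional array:
+  //   void func(int p[][10]) {   // p is a multi-dimensional array parameter
+  //       int local[10];         // local is a local array
+  //       p[i][0] = local[0];    // mixed-kind access
+  //   } 
+  // Local array versus array parameter: normally no alias in SysY
+  if ((loc1->isLocalArray && loc2->isFunctionParameter) || 
+      (loc1->isFunctionParameter && loc2->isLocalArray)) {
+    // The local array is allocated on this function's stack, while the
+    // parameter refers to an array passed in from outside
+    return AliasType::NO_ALIAS;
+  }
+
+  // Any other mixed case: stay conservative
+  return AliasType::UNKNOWN_ALIAS;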
+}
+
+void SysYAliasAnalysisPass::applySysYConstraints(Function* F) {
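+  // Constraints specific to the SysY language (documented here; no code yet)
+  // 1. SysY has no pointer arithmetic, which simplifies alias analysis
+  // 2. Arrays keep array semantics when passed as arguments
+  // 3. There is no dynamic allocation: every array is local, a parameter, or global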
+}
+
+void SysYAliasAnalysisPass::optimizeParameterAnalysis(Function* F) {
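+  // Alias-analysis refinement for array parameters
+  // Provides a configurable policy for the SysY evaluation environment
+  
+  if (!enableParameterOptimization()) {
+    return; // Keep the default conservative policy
+  }
+  
+  // Optional refinement: assume distinct array parameters never receive the same array
+  // Typical SysY call:
+  //   int arr1[10], arr2[20];
+  //   func(arr1, arr2);  // different arrays
+  // rather than:
+  //   func(arr1, arr1);  // the same array for both parameters
+  // This is usually a safe assumption for SysY evaluation programs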
+  auto& locationMap = CurrentResult->LocationMap;
+  
+  for (auto it1 = locationMap.begin(); it1 != locationMap.end(); ++it1) {
+    for (auto it2 = std::next(it1); it2 != locationMap.end(); ++it2) {
+      MemoryLocation* loc1 = it1->second.get();
+      MemoryLocation* loc2 = it2->second.get();
+      
+      // Two array parameters with different base pointers: mark as NO_ALIAS
+      if (loc1->isFunctionParameter && loc2->isFunctionParameter && 
+          loc1->basePointer != loc2->basePointer) {
+        CurrentResult->addAliasRelation(it1->first, it2->first, AliasType::NO_ALIAS);
+      }
+    }
+  }
+}
+
+void SysYAliasAnalysisPass::optimizeArrayAccessAnalysis(Function* F) {
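+  // Alias-analysis refinement for array accesses
+  // Simple refinements based on properties of the SysY language
+  
+  // Refinement 1: accesses to one array with different constant indices never alias
+  optimizeConstantIndexAccesses();
+  
+  // Refinement 2: recognize simple sequential access patterns
+  optimizeSequentialAccesses();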
+}
+
+bool SysYAliasAnalysisPass::isConstantValue(Value* val) {
+  return dynamic_cast<ConstantInteger*>(val) != nullptr; // Simplified: only integer constants are recognized
+}
+
+bool SysYAliasAnalysisPass::hasLoopVariableInIndices(const std::vector<Value*>& indices, Function* F) {
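+  // Conservative policy: every non-constant index is treated as a possible loop variable
+  // This avoids a dependency on loop analysis and keeps this pass self-contained
+  for (Value* index : indices) {
+    if (!isConstantValue(index)) {
+      return true; // Conservative answer that preserves correctness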
+    }
+  }
+  return false;
+}
+
+int SysYAliasAnalysisPass::calculateConstantOffset(const std::vector<Value*>& indices) {
+  int offset = 0;
+  for (Value* index : indices) {
+    if (auto* constInt = dynamic_cast<ConstantInteger*>(index)) {
+      // ConstantInteger::getVal() returns a variant; extract the int value
+      auto val = constInt->getVal();
+      if (std::holds_alternative<int>(val)) {
+        offset += std::get<int>(val);
+      }
+    }
+  }
+  return offset;
+}
+
+void SysYAliasAnalysisPass::printStatistics() const {
+  if (CurrentResult) {
+    CurrentResult->print();
+  }
+}
+
+void SysYAliasAnalysisPass::optimizeConstantIndexAccesses() {
+  // Refine the alias relation of constant-index accesses
+  // Accesses with the same base pointer and different constant offsets never alias
+  
+  auto& locationMap = CurrentResult->LocationMap;
+  std::vector<Value*> allPointers;
+  for (auto& pair : locationMap) {
+    allPointers.push_back(pair.first);
+  }
+  
+  for (size_t i = 0; i < allPointers.size(); ++i) {
+    for (size_t j = i + 1; j < allPointers.size(); ++j) {
+      Value* ptr1 = allPointers[i];
+      Value* ptr2 = allPointers[j];
+      MemoryLocation* loc1 = locationMap[ptr1].get();
+      MemoryLocation* loc2 = locationMap[ptr2].get();
+      
+      // Same base pointer and only constant indices on both sides
+      if (loc1->basePointer == loc2->basePointer && 
+          loc1->hasConstantIndices && loc2->hasConstantIndices) {
+        
+        // Compare the constant offsets
+        if (loc1->constantOffset != loc2->constantOffset) {
+          // Different constant offsets: definitely no alias
+          CurrentResult->addAliasRelation(ptr1, ptr2, AliasType::NO_ALIAS);
+        }
+      }
+    }
+  }
+}
+
+void SysYAliasAnalysisPass::optimizeSequentialAccesses() {
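+  // Recognize and refine sequential access patterns
+  // Placeholder: intended for recognizing sequential traversal of an array
+  
+  // In SysY most array accesses happen inside loops
+  // Accesses with non-constant indices are left to the conservative policy,
+  // which keeps the analysis simple and correct
+  
+  // More precise logic can be added here if it is ever needed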
+}
+
+} // namespace sysy
--- a/src/midend/Pass/Analysis/CallGraphAnalysis.cpp
+++ b/src/midend/Pass/Analysis/CallGraphAnalysis.cpp
@ -0,0 +1,417 @@
+#include "CallGraphAnalysis.h"
+#include "SysYIRPrinter.h"
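+#include <algorithm> // for std::reverse, std::max, std::min
+#include <iostream>
+#include <map>
+#include <stack>
+#include <string>
+#include <unordered_set>
+#include <vector>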
+
+extern int DEBUG;
+
+namespace sysy {
+
+// Static member initialization
+void* CallGraphAnalysisPass::ID = (void*)&CallGraphAnalysisPass::ID;
+
+// ========== CallGraphAnalysisResult implementation ==========
+
+CallGraphAnalysisResult::Statistics CallGraphAnalysisResult::getStatistics() const {
+    Statistics stats = {};
+    stats.totalFunctions = nodes.size();
+    
+    size_t totalCallEdges = 0;
+    size_t recursiveFunctions = 0;
+    size_t selfRecursiveFunctions = 0;
+    size_t totalCallers = 0;
+    size_t totalCallees = 0;
+    
+    for (const auto& pair : nodes) {
+        const auto& node = pair.second;
+        totalCallEdges += node->callees.size();
+        totalCallers += node->callers.size();
+        totalCallees += node->callees.size();
+        
+        if (node->isRecursive) recursiveFunctions++;
+        if (node->isSelfRecursive) selfRecursiveFunctions++;
+    }
+    
+    stats.totalCallEdges = totalCallEdges;
+    stats.recursiveFunctions = recursiveFunctions;
+    stats.selfRecursiveFunctions = selfRecursiveFunctions;
+    stats.stronglyConnectedComponents = sccs.size();
+    
+    // Compute the size of the largest SCC
+    size_t maxSCCSize = 0;
+    for (const auto& scc : sccs) {
+        maxSCCSize = std::max(maxSCCSize, scc.size());
+    }
+    stats.maxSCCSize = maxSCCSize;
+    
+    // Compute the averages
+    if (stats.totalFunctions > 0) {
+        stats.avgCallersPerFunction = static_cast<double>(totalCallers) / stats.totalFunctions;
+        stats.avgCalleesPerFunction = static_cast<double>(totalCallees) / stats.totalFunctions;
+    }
+    
+    return stats;
+}
+
+void CallGraphAnalysisResult::print() const {
+    std::cout << "---- Call Graph Analysis Results for Module ----\n";
+    
+    // Print the basic statistics
+    auto stats = getStatistics();
+    std::cout << "  Statistics:\n";
+    std::cout << "    Total Functions: " << stats.totalFunctions << "\n";
+    std::cout << "    Total Call Edges: " << stats.totalCallEdges << "\n";
+    std::cout << "    Recursive Functions: " << stats.recursiveFunctions << "\n";
+    std::cout << "    Self-Recursive Functions: " << stats.selfRecursiveFunctions << "\n";
+    std::cout << "    Strongly Connected Components: " << stats.stronglyConnectedComponents << "\n";
+    std::cout << "    Max SCC Size: " << stats.maxSCCSize << "\n";
+    std::cout << "    Avg Callers per Function: " << stats.avgCallersPerFunction << "\n";
+    std::cout << "    Avg Callees per Function: " << stats.avgCalleesPerFunction << "\n";
+    
+    // Print the topological order
+    std::cout << "  Topological Order (" << topologicalOrder.size() << "):\n";
+    for (size_t i = 0; i < topologicalOrder.size(); ++i) {
+        std::cout << "    " << i << ": " << topologicalOrder[i]->getName() << "\n";
+    }
+    
+    // Print the strongly connected components
+    if (!sccs.empty()) {
+        std::cout << "  Strongly Connected Components:\n";
+        for (size_t i = 0; i < sccs.size(); ++i) {
+            std::cout << "    SCC " << i << " (size " << sccs[i].size() << "): ";
+            for (size_t j = 0; j < sccs[i].size(); ++j) {
+                if (j > 0) std::cout << ", ";
+                std::cout << sccs[i][j]->getName();
+            }
+            std::cout << "\n";
+        }
+    }
+    
+    // Print the details of each function
+    std::cout << "  Function Details:\n";
+    for (const auto& pair : nodes) {
+        const auto& node = pair.second;
+        std::cout << "    Function: " << node->function->getName();
+        
+        if (node->isRecursive) {
+            std::cout << " (Recursive";
+            if (node->isSelfRecursive) std::cout << ", Self";
+            if (node->recursiveDepth >= 0) std::cout << ", Depth=" << node->recursiveDepth;
+            std::cout << ")";
+        }
+        std::cout << "\n";
+        
+        if (!node->callers.empty()) {
+            std::cout << "      Callers (" << node->callers.size() << "): ";
+            bool first = true;
+            for (Function* caller : node->callers) {
+                if (!first) std::cout << ", ";
+                std::cout << caller->getName();
+                first = false;
+            }
+            std::cout << "\n";
+        }
+        
+        if (!node->callees.empty()) {
+            std::cout << "      Callees (" << node->callees.size() << "): ";
+            bool first = true;
+            for (Function* callee : node->callees) {
+                if (!first) std::cout << ", ";
+                std::cout << callee->getName();
+                first = false;
+            }
+            std::cout << "\n";
+        }
+    }
+    
+    std::cout << "--------------------------------------------------\n";
+}
+
+void CallGraphAnalysisResult::addNode(Function* F) {
+    if (nodes.find(F) == nodes.end()) {
+        nodes[F] = std::make_unique<CallGraphNode>(F);
+    }
+}
+
+void CallGraphAnalysisResult::addCallEdge(Function* caller, Function* callee) {
+    // Make sure both functions have a node
+    addNode(caller);
+    addNode(callee);
+    
+    // Add the call edge
+    nodes[caller]->callees.insert(callee);
+    nodes[callee]->callers.insert(caller);
+    
+    // Update the per-node counters
+    nodes[caller]->totalCallees = nodes[caller]->callees.size();
+    nodes[callee]->totalCallers = nodes[callee]->callers.size();
+    
+    // Detect direct self-recursion
+    if (caller == callee) {
+        nodes[caller]->isSelfRecursive = true;
+        nodes[caller]->isRecursive = true;
+    }
+}
+
+void CallGraphAnalysisResult::computeTopologicalOrder() {
+    topologicalOrder.clear();
+    std::unordered_set<Function*> visited;
+    
+    // Run a DFS from every function that has not been visited yet
+    for (const auto& pair : nodes) {
+        Function* F = pair.first;
+        if (visited.find(F) == visited.end()) {
+            dfsTopological(F, visited, topologicalOrder);
+        }
+    }
+    
+    // Reverse the post-order so that callers come before their callees
+    std::reverse(topologicalOrder.begin(), topologicalOrder.end());
+}
+
+void CallGraphAnalysisResult::dfsTopological(Function* F, std::unordered_set<Function*>& visited, 
+                                            std::vector<Function*>& result) {
+    visited.insert(F);
+    
+    auto node = getNode(F);
+    if (node) {
+        // Visit all callees first
+        for (Function* callee : node->callees) {
+            if (visited.find(callee) == visited.end()) {
+                dfsTopological(callee, visited, result);
+            }
+        }
+    }
+    
+    // Post-order: append the current node after all of its children
+    result.push_back(F);
+}
+
+void CallGraphAnalysisResult::computeStronglyConnectedComponents() {
+    tarjanSCC();
+    
+    // Record the SCC that each function belongs to
+    functionToSCC.clear();
+    for (size_t i = 0; i < sccs.size(); ++i) {
+        for (Function* F : sccs[i]) {
+            functionToSCC[F] = static_cast<int>(i);
+        }
+    }
+}
+
+void CallGraphAnalysisResult::tarjanSCC() {
+    sccs.clear();
+    
+    std::vector<int> indices(nodes.size(), -1);
+    std::vector<int> lowlinks(nodes.size(), -1);
+    std::vector<Function*> stack;
+    std::unordered_set<Function*> onStack;
+    int index = 0;
+    
+    // Assign each function an index (its position in `nodes`)
+    std::map<Function*, int> functionIndex;
+    int idx = 0;
+    for (const auto& pair : nodes) {
+        functionIndex[pair.first] = idx++;
+    }
+    
+    // Run Tarjan's algorithm from every function that has not been visited yet
+    for (const auto& pair : nodes) {
+        Function* F = pair.first;
+        int fIdx = functionIndex[F];
+        if (indices[fIdx] == -1) {
+            tarjanDFS(F, index, indices, lowlinks, stack, onStack);
+        }
+    }
+}
+
+void CallGraphAnalysisResult::tarjanDFS(Function* F, int& index, std::vector<int>& indices, 
+                                      std::vector<int>& lowlinks, std::vector<Function*>& stack, 
+                                      std::unordered_set<Function*>& onStack) {
+    // Number functions by their position in `nodes`, which is the numbering
+    // tarjanSCC() uses for `indices`. A function-local static map would
+    // number functions independently of the caller, keep stale entries
+    // across runs, and map unseen callees to 0 through operator[].
+    auto indexOf = [this](Function* fn) {
+        int position = 0;
+        for (const auto& pair : nodes) {
+            if (pair.first == fn) break;
+            ++position;
+        }
+        return position;
+    };
+    
+    int fIdx = indexOf(F);
+    
+    // Make sure the vectors are large enough
+    if (fIdx >= static_cast<int>(indices.size())) {
+        indices.resize(fIdx + 1, -1);
+        lowlinks.resize(fIdx + 1, -1);
+    }
+    
+    indices[fIdx] = index;
+    lowlinks[fIdx] = index;
+    index++;
+    
+    stack.push_back(F);
+    onStack.insert(F);
+    
+    auto node = getNode(F);
+    if (node) {
+        for (Function* callee : node->callees) {
+            int calleeIdx = indexOf(callee);
+            
+            // Make sure the vectors are large enough
+            if (calleeIdx >= static_cast<int>(indices.size())) {
+                indices.resize(calleeIdx + 1, -1);
+                lowlinks.resize(calleeIdx + 1, -1);
+            }
+            
+            if (indices[calleeIdx] == -1) {
+                // Unvisited callee: recurse
+                tarjanDFS(callee, index, indices, lowlinks, stack, onStack);
+                lowlinks[fIdx] = std::min(lowlinks[fIdx], lowlinks[calleeIdx]);
+            } else if (onStack.find(callee) != onStack.end()) {
+                // Back edge to a function that is still on the stack
+                lowlinks[fIdx] = std::min(lowlinks[fIdx], indices[calleeIdx]);
+            }
+        }
+    }
+    
+    // F is the root of an SCC: pop the component off the stack
+    if (lowlinks[fIdx] == indices[fIdx]) {
+        std::vector<Function*> scc;
+        Function* w;
+        do {
+            w = stack.back();
+            stack.pop_back();
+            onStack.erase(w);
+            scc.push_back(w);
+        } while (w != F);
+        
+        sccs.push_back(std::move(scc));
+    }
+}
+
+void CallGraphAnalysisResult::analyzeRecursion() {
+    // Detect recursion from the SCCs
+    for (const auto& scc : sccs) {
+        if (scc.size() > 1) {
+            // An SCC with several functions means mutual recursion
+            for (Function* F : scc) {
+                auto* node = getMutableNode(F);
+                if (node) {
+                    node->isRecursive = true;
+                    node->recursiveDepth = -1; // Mutual recursion: depth is undefined
+                }
+            }
+        } else if (scc.size() == 1) {
+            // Single-function SCC: check for direct self-recursion
+            Function* F = scc[0];
+            auto* node = getMutableNode(F);
+            if (node && node->callees.count(F) > 0) {
+                node->isSelfRecursive = true;
+                node->isRecursive = true;
+                node->recursiveDepth = -1; // Simplification: the depth is not computed
+            }
+        }
+    }
+}
+
+// ========== CallGraphAnalysisPass implementation ==========
+
+bool CallGraphAnalysisPass::runOnModule(Module* M, AnalysisManager& AM) {
+    if (DEBUG) {
+        std::cout << "Running Call Graph Analysis on module\n";
+    }
+    
+    // Create the analysis result
+    CurrentResult = std::make_unique<CallGraphAnalysisResult>(M);
+    
+    // Run the main analysis steps
+    buildCallGraph(M);
+    CurrentResult->computeTopologicalOrder();
+    CurrentResult->computeStronglyConnectedComponents();
+    CurrentResult->analyzeRecursion();
+    
+    if (DEBUG) {
+        CurrentResult->print();
+    }
+    
+    return false; // An analysis pass does not modify the IR
+}
+
+void CallGraphAnalysisPass::buildCallGraph(Module* M) {
+    // 1. Create a node for every function (including declared but undefined ones)
+    for (auto& pair : M->getFunctions()) {
+        Function* F = pair.second.get();
+        if (!isLibraryFunction(F) && !isIntrinsicFunction(F)) {
+            CurrentResult->addNode(F);
+        }
+    }
+    
+    // 2. Scan every function for call edges
+    for (auto& pair : M->getFunctions()) {
+        Function* F = pair.second.get();
+        if (!isLibraryFunction(F) && !isIntrinsicFunction(F)) {
+            scanFunctionCalls(F);
+        }
+    }
+}
+
+void CallGraphAnalysisPass::scanFunctionCalls(Function* F) {
+    // Walk every basic block and instruction of the function
+    for (auto& BB : F->getBasicBlocks_NoRange()) {
+        for (auto& I : BB->getInstructions()) {
+            if (CallInst* call = dynamic_cast<CallInst*>(I.get())) {
+                processCallInstruction(call, F);
+            }
+        }
+    }
+}
+
+void CallGraphAnalysisPass::processCallInstruction(CallInst* call, Function* caller) {
+    Function* callee = call->getCallee();
+    
+    if (!callee) {
+        // Indirect call: the target cannot be determined statically
+        return;
+    }
+    
+    if (isLibraryFunction(callee) || isIntrinsicFunction(callee)) {
+        // Skip runtime library functions and intrinsics
+        return;
+    }
+    
+    // Add the call edge
+    CurrentResult->addCallEdge(caller, callee);
+    
+    // Update the call-site counter
+    auto* node = CurrentResult->getMutableNode(caller);
+    if (node) {
+        node->callSiteCount++;
+    }
+}
+
+bool CallGraphAnalysisPass::isLibraryFunction(Function* F) const {
+    std::string name = F->getName();
+    
+    // SysY runtime library functions
+    return name == "getint" || name == "getch" || name == "getfloat" ||
+           name == "getarray" || name == "getfarray" ||
+           name == "putint" || name == "putch" || name == "putfloat" ||
+           name == "putarray" || name == "putfarray" ||
+           name == "_sysy_starttime" || name == "_sysy_stoptime";
+}
+
+bool CallGraphAnalysisPass::isIntrinsicFunction(Function* F) const {
+    std::string name = F->getName();
+    
+    // Compiler intrinsics (more can be added here later)
+    return name.substr(0, 5) == "llvm." || name.substr(0, 5) == "sysy.";
+}
+
+void CallGraphAnalysisPass::printStatistics() const {
+    if (CurrentResult) {
+        CurrentResult->print();
+    }
+}
+
+} // namespace sysy
--- a/src/midend/Pass/Analysis/Dom.cpp
+++ b/src/midend/Pass/Analysis/Dom.cpp
@ -1,19 +1,30 @@
 #include "Dom.h"
-#include <limits> // for std::numeric_limits
+#include <algorithm> // for std::set_intersection, std::reverse
+#include <iostream>  // for debug output
+#include <limits>    // for std::numeric_limits
 #include <queue>
+#include <functional> // for std::function
+#include <iterator>   // for std::inserter
+#include <map>
+#include <vector>
+#include <set>

 namespace sysy {

-// 初始化 支配树静态 ID
+// ==============================================================
+// Static ID of DominatorTreeAnalysisPass
+// ==============================================================
 void *DominatorTreeAnalysisPass::ID = (void *)&DominatorTreeAnalysisPass::ID;
+
 // ==============================================================
 // DominatorTree 结果类的实现
 // ==============================================================

+// Constructor: records the associated function without computing anything
 DominatorTree::DominatorTree(Function *F) : AssociatedFunction(F) {
-  // 构造时可以不计算，在分析遍运行里计算并填充
+  // Nothing is computed here; the analysis pass computes and fills in the results
 }

+// Getter methods
 const std::set<BasicBlock *> *DominatorTree::getDominators(BasicBlock *BB) const {
  auto it = Dominators.find(BB);
  if (it != Dominators.end()) {
@ -38,164 +49,437 @@ const std::set<BasicBlock *> *DominatorTree::getDominanceFrontier(BasicBlock *BB
  return nullptr;
 }

-const std::set<BasicBlock*>* DominatorTree::getDominatorTreeChildren(BasicBlock* BB) const {
-    auto it = DominatorTreeChildren.find(BB);
-    if (it != DominatorTreeChildren.end()) {
-        return &(it->second);
-    }
-    return nullptr;
+const std::set<BasicBlock *> *DominatorTree::getDominatorTreeChildren(BasicBlock *BB) const {
+  auto it = DominatorTreeChildren.find(BB);
+  if (it != DominatorTreeChildren.end()) {
+    return &(it->second);
+  }
+  return nullptr;
 }

-void DominatorTree::computeDominators(Function *F) {
-  // 经典的迭代算法计算支配者集合
-  // TODO: 可以替换为更高效的算法，如 Lengauer-Tarjan 算法
-  BasicBlock *entryBlock = F->getEntryBlock();
+// Helper: print a set of BasicBlocks (debug builds only)
+void printBBSet(const std::string &prefix, const std::set<BasicBlock *> &s) {
+  if (!DEBUG)
+    return;
+  std::cout << prefix << "{";
+  bool first = true;
+  for (const auto &bb : s) {
+    if (!first)
+      std::cout << ", ";
+    std::cout << bb->getName();
+    first = false;
+  }
+  std::cout << "}" << std::endl;
+}

-  for (const auto &bb_ptr : F->getBasicBlocks()) {
-    BasicBlock *bb = bb_ptr.get();
+// Helper: compute the reverse post-order (RPO) of the CFG
+std::vector<BasicBlock*> DominatorTree::computeReversePostOrder(Function* F) {
+    std::vector<BasicBlock*> postOrder;
+    std::set<BasicBlock*> visited;
+    
+    std::function<void(BasicBlock*)> dfs_rpo =
+        [&](BasicBlock* bb) {
+        visited.insert(bb);
+        for (BasicBlock* succ : bb->getSuccessors()) {
+            if (visited.find(succ) == visited.end()) {
+                dfs_rpo(succ);
+            }
+        }
+        postOrder.push_back(bb);
+    };
+
+    dfs_rpo(F->getEntryBlock());
+    std::reverse(postOrder.begin(), postOrder.end());
+    
+    if (DEBUG) {
+        std::cout << "--- Computed RPO: ";
+        for (BasicBlock* bb : postOrder) {
+            std::cout << bb->getName() << " ";
+        }
+        std::cout << "---" << std::endl;
+    }
+    return postOrder;
+}
+
+// Computes the full dominator sets iteratively (independent of the IDom algorithm)
+void DominatorTree::computeDominators(Function *F) {
+  if (DEBUG)
+    std::cout << "--- Computing Dominators ---" << std::endl;
+
+  BasicBlock *entryBlock = F->getEntryBlock();
+  std::vector<BasicBlock*> bbs_rpo = computeReversePostOrder(F);
+
+  for (BasicBlock *bb : bbs_rpo) {
    if (bb == entryBlock) {
+      Dominators[bb].clear();
      Dominators[bb].insert(bb);
+      if (DEBUG) std::cout << "Init Dominators[" << bb->getName() << "]: {" << bb->getName() << "}" << std::endl;
    } else {
-      for (const auto &all_bb_ptr : F->getBasicBlocks()) {
-        Dominators[bb].insert(all_bb_ptr.get());
+      Dominators[bb].clear();
+      for (BasicBlock *all_bb : bbs_rpo) {
+        Dominators[bb].insert(all_bb);
+      }
+      if (DEBUG) {
+        std::cout << "Init Dominators[" << bb->getName() << "]: ";
+        printBBSet("", Dominators[bb]);
      }
    }
  }

  bool changed = true;
+  int iteration = 0;
  while (changed) {
    changed = false;
-    for (const auto &bb_ptr : F->getBasicBlocks()) {
-      BasicBlock *bb = bb_ptr.get();
-      if (bb == entryBlock)
-        continue;
+    iteration++;
+    if (DEBUG) std::cout << "Iteration " << iteration << std::endl;
+
+    for (BasicBlock *bb : bbs_rpo) {
+      if (bb == entryBlock) continue;

      std::set<BasicBlock *> newDom;
-      bool firstPred = true;
+      bool firstPredProcessed = false;
+
      for (BasicBlock *pred : bb->getPredecessors()) {
-        if (Dominators.count(pred)) {
-          if (firstPred) {
-            newDom = Dominators[pred];
-            firstPred = false;
-          } else {
-            std::set<BasicBlock *> intersection;
-            std::set_intersection(newDom.begin(), newDom.end(), Dominators[pred].begin(), Dominators[pred].end(),
-                                  std::inserter(intersection, intersection.begin()));
-            newDom = intersection;
-          }
+        if(DEBUG){
+          std::cout << "  Processing predecessor: " << pred->getName() << std::endl;
        }
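+        // Skip predecessors that have no dominator set (unreachable from the
+        // entry block); Dominators[pred] would insert an empty set for them
+        // and wipe out the intersection
+        if (Dominators.find(pred) == Dominators.end())
+          continue;
+        if (!firstPredProcessed) {
+          newDom = Dominators[pred];
+          firstPredProcessed = true;
+        } else {
+          std::set<BasicBlock *> intersection;
+          std::set_intersection(newDom.begin(), newDom.end(), Dominators[pred].begin(), Dominators[pred].end(),
+                                std::inserter(intersection, intersection.begin()));
+          newDom = intersection;
+        }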
      }
      newDom.insert(bb);

      if (newDom != Dominators[bb]) {
+        if (DEBUG) {
+          std::cout << "  Dominators[" << bb->getName() << "] changed from ";
+          printBBSet("", Dominators[bb]);
+          std::cout << "  to ";
+          printBBSet("", newDom);
+        }
        Dominators[bb] = newDom;
        changed = true;
      }
    }
  }
+  if (DEBUG)
+    std::cout << "--- Dominators Computation Finished ---" << std::endl;
 }

-void DominatorTree::computeIDoms(Function *F) {
-  // 采用与之前类似的简化实现。TODO:Lengauer-Tarjan等算法。
-  BasicBlock *entryBlock = F->getEntryBlock();
-  IDoms[entryBlock] = nullptr;
+// ==============================================================
+// Lengauer-Tarjan helper data structures and functions (private members)
+// ==============================================================

-  for (const auto &bb_ptr : F->getBasicBlocks()) {
-    BasicBlock *bb = bb_ptr.get();
-    if (bb == entryBlock)
-      continue;
-
-    BasicBlock *currentIDom = nullptr;
-    const std::set<BasicBlock *> *domsOfBB = getDominators(bb);
-    if (!domsOfBB)
-      continue;
-
-    for (BasicBlock *D : *domsOfBB) {
-      if (D == bb)
-        continue;
-
-      bool isCandidateIDom = true;
-      for (BasicBlock *candidate : *domsOfBB) {
-        if (candidate == bb || candidate == D)
-          continue;
-        const std::set<BasicBlock *> *domsOfCandidate = getDominators(candidate);
-        if (domsOfCandidate && domsOfCandidate->count(D) == 0 && domsOfBB->count(candidate)) {
-          isCandidateIDom = false;
-          break;
-        }
-      }
-      if (isCandidateIDom) {
-        currentIDom = D;
-        break;
-      }
+// DFS traversal: fills dfnum_map, vertex_vec, and parent_map.
+// Corresponds to the dfs function in the reference code.
+void DominatorTree::dfs_lt_helper(BasicBlock* u) {
+    dfnum_map[u] = df_counter;
+    if (df_counter >= vertex_vec.size()) { // grow on demand
+        vertex_vec.resize(df_counter + 1);
+    }
+    vertex_vec[df_counter] = u;
+    if (DEBUG) std::cout << "  DFS: Visiting " << u->getName() << ", dfnum = " << df_counter << std::endl;
+    df_counter++;
+
+    for (BasicBlock* v : u->getSuccessors()) {
+        if (dfnum_map.find(v) == dfnum_map.end()) { // v has not been visited yet
+            parent_map[v] = u;
+            if (DEBUG) std::cout << "    DFS: Setting parent[" << v->getName() << "] = " << u->getName() << std::endl;
+            dfs_lt_helper(v);
+        }
    }
-    IDoms[bb] = currentIDom;
-  }
 }

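+// Union-find: find the representative of a set, with path compression.
+// Also updates label so that label[i] always refers to the node with the smallest sdom (by DFS number) on i's ancestor chain.
+// Corresponds to the find function in the reference code; it also folds in the eval logic.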
+BasicBlock* DominatorTree::evalAndCompress_lt_helper(BasicBlock* i) {
+    if (DEBUG) std::cout << "    Eval: Processing " << i->getName() << std::endl;
+    // i is a root of the forest (ancestor_map[i] == nullptr)
+    if (ancestor_map.find(i) == ancestor_map.end() || ancestor_map[i] == nullptr) {
+        if (DEBUG) std::cout << "      Eval: " << i->getName() << " is root, returning itself." << std::endl;
+        return i; // a root has no ancestors, so it is the minimum-sdom node on its own path
+    }
+    
+    // Otherwise, recurse on i's ancestor, compressing the path along the way
+    BasicBlock* root_ancestor = evalAndCompress_lt_helper(ancestor_map[i]);
+    
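+    // During compression, compare via sdom_map and update label_map so that
+    // label_map[i] holds the node with the smallest sdom (by DFS number) on the path from i toward the root.
+    // Note: ancestor_map[i] still refers to the old ancestor here; the recursive call has already refreshed its label,
+    // so the comparison is between label_map[ancestor_map[i]] and label_map[i].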
+    if (sdom_map.count(label_map[ancestor_map[i]]) && // 确保 label_map[ancestor_map[i]] 存在 sdom
+        sdom_map.count(label_map[i]) &&                // 确保 label_map[i] 存在 sdom
+        dfnum_map[sdom_map[label_map[ancestor_map[i]]]] < dfnum_map[sdom_map[label_map[i]]]) {
+        if (DEBUG) std::cout << "      Eval: Updating label for " << i->getName() << " from " 
+                              << label_map[i]->getName() << " to " << label_map[ancestor_map[i]]->getName() << std::endl;
+        label_map[i] = label_map[ancestor_map[i]];
+    }
+    
+    ancestor_map[i] = root_ancestor; // path compression: repoint i at the node returned by the recursive call
+    if (DEBUG) std::cout << "      Eval: Path compression for " << i->getName() << ", new ancestor = " 
+                          << (root_ancestor ? root_ancestor->getName() : "nullptr") << std::endl;
+    
+    return label_map[i]; // node with the smallest sdom on the path from i toward the root
+}
+
+// Link: attach v_child under u_parent (this is a union-find operation on the forest).
+// Corresponds to fa[u] = fth[u]; in the reference code.
+void DominatorTree::link_lt_helper(BasicBlock* u_parent, BasicBlock* v_child) {
+    ancestor_map[v_child] = u_parent; // set the union-find parent
+    label_map[v_child] = v_child;     // initialize label to the node itself
+    if (DEBUG) std::cout << "  Link: " << v_child->getName() << " linked to " << u_parent->getName() << std::endl;
+}
+
+// ==============================================================
+// computeIDoms implemented with the Lengauer-Tarjan algorithm
+// ==============================================================
+void DominatorTree::computeIDoms(Function *F) {
+    if (DEBUG) std::cout << "--- Computing Immediate Dominators (IDoms) using Lengauer-Tarjan ---" << std::endl;
+
+    BasicBlock *entryBlock = F->getEntryBlock();
+
+    // 1. Reset all Lengauer-Tarjan data structures
+    dfnum_map.clear();
+    vertex_vec.clear();
+    parent_map.clear();
+    sdom_map.clear();
+    idom_map.clear();
+    bucket_map.clear();
+    ancestor_map.clear();
+    label_map.clear();
+    df_counter = 0; // DFS numbering starts at 0
+
+    // Pre-size vertex_vec to avoid repeated resizing
+    vertex_vec.resize(F->getBasicBlocks().size() + 1); 
+    // Before the DFS, initialize sdom and label for every basic block.
+    // Lengauer-Tarjan needs every node present in these maps before Phase 2 begins.
+    for (auto &bb_ptr : F->getBasicBlocks()) {
+        BasicBlock* bb = bb_ptr.get();
+        sdom_map[bb] = bb; // sdom(bb) starts as bb itself
+        label_map[bb] = bb; // label(bb) starts as bb itself (used by the union-find path compression)
+    }
+    // Make sure the entry block is initialized too (in case it is not covered by iterating F->getBasicBlocks())
+    sdom_map[entryBlock] = entryBlock;
+    label_map[entryBlock] = entryBlock;
+    // Phase 1: DFS traversal and preprocessing
+    // Corresponds to dfs(st) in the reference code
+    dfs_lt_helper(entryBlock);
+    idom_map[entryBlock] = nullptr; // the entry block has no immediate dominator
+    if (DEBUG) std::cout << "  IDom[" << entryBlock->getName() << "] = nullptr" << std::endl;
+
+    if (DEBUG) std::cout << "  Sdom[" << entryBlock->getName() << "] = " << entryBlock->getName() << std::endl;
+    
+    // Initialize the union-find ancestors and labels
+    for (auto const& [bb_key, dfn_val] : dfnum_map) {
+        ancestor_map[bb_key] = nullptr; // each node starts as the root of its own set
+        label_map[bb_key] = bb_key;   // label starts as the node itself
+    }
+
+    if (DEBUG) {
+        std::cout << "  --- DFS Phase Complete ---" << std::endl;
+        std::cout << "  dfnum_map:" << std::endl;
+        for (auto const& [bb, dfn] : dfnum_map) {
+            std::cout << "    " << bb->getName() << " -> " << dfn << std::endl;
+        }
+        std::cout << "  vertex_vec (by dfnum):" << std::endl;
+        for (size_t k = 0; k < df_counter; ++k) {
+            if (vertex_vec[k]) std::cout << "    [" << k << "] -> " << vertex_vec[k]->getName() << std::endl;
+        }
+        std::cout << "  parent_map:" << std::endl;
+        for (auto const& [child, parent] : parent_map) {
+            std::cout << "    " << child->getName() << " -> " << (parent ? parent->getName() : "nullptr") << std::endl;
+        }
+        std::cout << "  ------------------------" << std::endl;
+    }
+
+
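+    // Phase 2: compute semidominators (sdom)
+    // Corresponds to the first half of the reference code's loop: for (int i = dfc; i >= 2; --i)
+    // Visit all nodes in decreasing DFS-number order (except entryBlock, whose DFS number is 0)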
+    if (DEBUG) std::cout << "--- Phase 2: Computing Semi-Dominators (sdom) ---" << std::endl;
+    for (int i = df_counter - 1; i >= 1; --i) { // from the largest DFS number down to 1
+        BasicBlock* w = vertex_vec[i]; // node currently being processed
+        if (DEBUG) std::cout << "  Processing node w: " << w->getName() << " (dfnum=" << i << ")" << std::endl;
+
+
+        // For each predecessor v of w
+        for (BasicBlock* v : w->getPredecessors()) {
+            if (DEBUG) std::cout << "    Considering predecessor v: " << v->getName() << std::endl;
+            // Skip v if the DFS never visited it (it is not in dfnum_map)
+            if (dfnum_map.find(v) == dfnum_map.end()) {
+                if (DEBUG) std::cout << "      Predecessor " << v->getName() << " not in DFS tree, skipping." << std::endl;
+                continue; 
+            }
+
+            // evalAndCompress returns the node with the smallest sdom on v's ancestor chain
+            BasicBlock* u_with_min_sdom_on_path = evalAndCompress_lt_helper(v);
+            if (DEBUG) std::cout << "      Eval(" << v->getName() << ") returned " 
+                                  << u_with_min_sdom_on_path->getName() << std::endl;
+            if (DEBUG && sdom_map.count(u_with_min_sdom_on_path) && sdom_map.count(w)) {
+                std::cout << "      Comparing sdom: dfnum[" << sdom_map[u_with_min_sdom_on_path]->getName() << "] (" << dfnum_map[sdom_map[u_with_min_sdom_on_path]] 
+                          << ") vs dfnum[" << sdom_map[w]->getName() << "] (" << dfnum_map[sdom_map[w]] << ")" << std::endl;
+            }
+            // Compare sdom(u) with sdom(w)
+            if (sdom_map.count(u_with_min_sdom_on_path) && sdom_map.count(w) &&
+                dfnum_map[sdom_map[u_with_min_sdom_on_path]] < dfnum_map[sdom_map[w]]) {
+                if (DEBUG) std::cout << "      Updating sdom[" << w->getName() << "] from " 
+                                      << sdom_map[w]->getName() << " to " 
+                                      << sdom_map[u_with_min_sdom_on_path]->getName() << std::endl;
+                sdom_map[w] = sdom_map[u_with_min_sdom_on_path]; // update sdom(w)
+                if (DEBUG) std::cout << "      Sdom update applied. New sdom[" << w->getName() << "] = " << sdom_map[w]->getName() << std::endl;
+            }
+        }
+        
+        // Add w to the bucket of sdom(w)
+        bucket_map[sdom_map[w]].push_back(w);
+        if (DEBUG) std::cout << "    Adding " << w->getName() << " to bucket of sdom(" << w->getName() << "): " 
+                              << sdom_map[w]->getName() << std::endl;
+
+        // Link w to its parent in the union-find forest (the link operation)
+        if (parent_map.count(w) && parent_map[w] != nullptr) {
+            link_lt_helper(parent_map[w], w);
+        }
+        
+        // Phase 3, part 1: process every node in parent[w]'s bucket and determine a provisional idom
+        if (parent_map.count(w) && parent_map[w] != nullptr) {
+            BasicBlock* p = parent_map[w]; // p is w's parent in the DFS tree
+            if (DEBUG) std::cout << "    Processing bucket for parent " << p->getName() << std::endl;
+
+            // Note: iterate over a copy of the bucket, because the original bucket is cleared after this loop
+            std::vector<BasicBlock*> nodes_in_p_bucket_copy = bucket_map[p];
+            for (BasicBlock* y : nodes_in_p_bucket_copy) {
+                if (DEBUG) std::cout << "      Processing node y from bucket: " << y->getName() << std::endl;
+                // Find the node with the smallest sdom on y's ancestor chain
+                BasicBlock* u = evalAndCompress_lt_helper(y);
+                if (DEBUG) std::cout << "        Eval(" << y->getName() << ") returned " << u->getName() << std::endl;
+                
+                // Determine idom(y):
+                // if sdom(eval(y)) comes before sdom(parent(w)) in DFS order, idom(y) is provisionally eval(y);
+                // otherwise idom(y) = parent(w)
+                if (sdom_map.count(u) && sdom_map.count(p) &&
+                    dfnum_map[sdom_map[u]] < dfnum_map[sdom_map[p]]) {
+                    idom_map[y] = u; // provisional idom, finalized in Phase 3 part 2
+                    if (DEBUG) std::cout << "        IDom[" << y->getName() << "] set to " << u->getName() << std::endl;
+                } else {
+                    idom_map[y] = p; // p is the idom of y
+                    if (DEBUG) std::cout << "        IDom[" << y->getName() << "] set to " << p->getName() << std::endl;
+                }
+            }
+            bucket_map[p].clear(); // empty the bucket so its nodes are not processed again
+            if (DEBUG) std::cout << "    Cleared bucket for parent " << p->getName() << std::endl;
+        }
+    }
+
+    // Phase 3, part 2: finalize idom (handles the nodes whose idom != sdom)
+    if (DEBUG) std::cout << "--- Phase 3: Finalizing Immediate Dominators (idom) ---" << std::endl;
+    for (int i = 1; i < df_counter; ++i) { // in increasing DFS-number order, skipping entryBlock
+        BasicBlock* w = vertex_vec[i];
+        if (DEBUG) std::cout << "  Finalizing node w: " << w->getName() << std::endl;
+        if (idom_map.count(w) && sdom_map.count(w) && idom_map[w] != sdom_map[w]) {
+            // The true idom of w is the idom of its provisional idom
+            if (DEBUG) std::cout << "    idom[" << w->getName() << "] (" << idom_map[w]->getName() 
+                                  << ") != sdom[" << w->getName() << "] (" << sdom_map[w]->getName() << ")" << std::endl;
+            if (idom_map.count(idom_map[w])) {
+                idom_map[w] = idom_map[idom_map[w]];
+                if (DEBUG) std::cout << "    Updating idom[" << w->getName() << "] to idom(idom(w)): " 
+                                      << idom_map[w]->getName() << std::endl;
+            } else {
+                 if (DEBUG) std::cout << "    Warning: idom(idom(" << w->getName() << ")) not found, leaving idom[" << w->getName() << "] as is." << std::endl;
+            }
+        }
+        if (DEBUG) {
+            std::cout << "  Final IDom[" << w->getName() << "] = " << (idom_map[w] ? idom_map[w]->getName() : "nullptr") << std::endl;
+        }
+    }
+
+    // Copy the result from idom_map into the DominatorTree member IDoms
+    IDoms = idom_map; 
+
+    if (DEBUG) std::cout << "--- Immediate Dominators Computation Finished ---" << std::endl;
+}
+
+// ==============================================================
+// computeDominanceFrontiers and computeDominatorTreeChildren
+// ==============================================================
+
 void DominatorTree::computeDominanceFrontiers(Function *F) {
-  // Classic dominance frontier algorithm
+  if (DEBUG)
+    std::cout << "--- Computing Dominance Frontiers ---" << std::endl;
+
  for (const auto &bb_ptr_X : F->getBasicBlocks()) {
    BasicBlock *X = bb_ptr_X.get();
    DominanceFrontiers[X].clear();

-    for (BasicBlock *Y : X->getSuccessors()) {
-      const std::set<BasicBlock *> *domsOfY = getDominators(Y);
-      if (domsOfY && domsOfY->find(X) == domsOfY->end()) {
-        DominanceFrontiers[X].insert(Y);
-      }
-    }
-
-    const std::set<BasicBlock *> *domsOfX = getDominators(X);
-    if (!domsOfX)
-      continue;
    for (const auto &bb_ptr_Z : F->getBasicBlocks()) {
      BasicBlock *Z = bb_ptr_Z.get();
-      if (Z == X)
-        continue;
      const std::set<BasicBlock *> *domsOfZ = getDominators(Z);
-      if (domsOfZ && domsOfZ->count(X) && Z != X) {

-        for (BasicBlock *Y : Z->getSuccessors()) {
-          const std::set<BasicBlock *> *domsOfY = getDominators(Y);
-          if (domsOfY && domsOfY->find(X) == domsOfY->end()) {
-            DominanceFrontiers[X].insert(Y);
-          }
+      if (!domsOfZ || domsOfZ->find(X) == domsOfZ->end()) { // X does not dominate Z
+        continue;
+      }
+
+      for (BasicBlock *Y : Z->getSuccessors()) {
+        const std::set<BasicBlock *> *domsOfY = getDominators(Y);
+        // Y belongs to DF(X) if Y == X, or if X does not dominate Y (and hence does not strictly dominate it)
+        if (Y == X || (domsOfY && domsOfY->find(X) == domsOfY->end())) {
+          DominanceFrontiers[X].insert(Y);
        }
      }
    }
+    if (DEBUG) {
+      std::cout << "  DF(" << X->getName() << "): ";
+      printBBSet("", DominanceFrontiers[X]);
+    }
  }
+  if (DEBUG)
+    std::cout << "--- Dominance Frontiers Computation Finished ---" << std::endl;
 }

 void DominatorTree::computeDominatorTreeChildren(Function *F) {
+  if (DEBUG)
+    std::cout << "--- Computing Dominator Tree Children ---" << std::endl;
+  // Clear first so that recomputation starts from an empty state
+  for (auto &bb_ptr : F->getBasicBlocks()) {
+    DominatorTreeChildren[bb_ptr.get()].clear();
+  }
+
  for (auto &bb_ptr : F->getBasicBlocks()) {
    BasicBlock *B = bb_ptr.get();
-    auto it = getImmediateDominator(B);
-    if (it != nullptr) {
-      BasicBlock *A = it;
-      if (A) {
-        DominatorTreeChildren[A].insert(B);
+    BasicBlock *A = getImmediateDominator(B); // A is the immediate dominator of B
+
+    if (A) { // B has an immediate dominator A (i.e. B is not the entry block)
+      DominatorTreeChildren[A].insert(B);
+      if (DEBUG) {
+        std::cout << "  " << B->getName() << " is child of " << A->getName() << std::endl;
      }
    }
  }
+  if (DEBUG)
+    std::cout << "--- Dominator Tree Children Computation Finished ---" << std::endl;
 }

 // ==============================================================
-// Implementation of DominatorTreeAnalysisPass
+// Implementation of DominatorTreeAnalysisPass (unchanged)
 // ==============================================================

-
-bool DominatorTreeAnalysisPass::runOnFunction(Function* F, AnalysisManager &AM) {
+bool DominatorTreeAnalysisPass::runOnFunction(Function *F, AnalysisManager &AM) {
+  // Build a fresh DominatorTree on every run so that stale data is discarded and everything is recomputed
  CurrentDominatorTree = std::make_unique<DominatorTree>(F);
+
  CurrentDominatorTree->computeDominators(F);
-  CurrentDominatorTree->computeIDoms(F);
+  CurrentDominatorTree->computeIDoms(F); // corrected Lengauer-Tarjan algorithm
  CurrentDominatorTree->computeDominanceFrontiers(F);
  CurrentDominatorTree->computeDominatorTreeChildren(F);
  return false;
 }

 std::unique_ptr<AnalysisResultBase> DominatorTreeAnalysisPass::getResult() {
-  // Return the computed DominatorTree instance; ownership transfers to the AnalysisManager
  return std::move(CurrentDominatorTree);
 }

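A small cross-check for the Dom.cpp change above: the immediate dominators produced by the Lengauer-Tarjan path should agree with those derivable from the iteratively computed dominator sets. The following is a minimal, self-contained sketch of that derivation, not the project's code. It assumes a hypothetical graph encoding in which blocks are the integers 0..n-1, block 0 is the entry, every block is reachable, and `preds[b]` lists the predecessors of `b`; the names `computeDominatorSets` and `idomsFromDominatorSets` are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for the IR: blocks are indices 0..n-1 and block 0 is the entry.
using DomSets = std::vector<std::set<int>>;

// Iterative data-flow solution: Dom(b) = {b} united with the intersection of Dom(p)
// over all predecessors p of b. Mirrors the set-intersection loop in computeDominators.
DomSets computeDominatorSets(const std::vector<std::vector<int>>& preds) {
    const int n = static_cast<int>(preds.size());
    std::set<int> all;
    for (int b = 0; b < n; ++b) all.insert(b);

    DomSets dom(n, all);          // start from "everything dominates everything"
    if (n == 0) return dom;
    dom[0] = {0};                 // the entry block is dominated only by itself

    bool changed = true;
    while (changed) {
        changed = false;
        for (int b = 1; b < n; ++b) {
            std::set<int> next = all;
            for (int p : preds[b]) {
                std::set<int> kept;
                for (int d : next)
                    if (dom[p].count(d)) kept.insert(d);
                next = kept;
            }
            next.insert(b);
            if (next != dom[b]) {
                dom[b] = next;
                changed = true;
            }
        }
    }
    return dom;
}

// The strict dominators of b form a chain, so the immediate dominator is the
// strict dominator with the largest dominator set (the one closest to b).
std::vector<int> idomsFromDominatorSets(const DomSets& dom) {
    std::vector<int> idom(dom.size(), -1);   // -1 marks "no immediate dominator"
    for (std::size_t b = 1; b < dom.size(); ++b) {
        int best = -1;
        for (int d : dom[b]) {
            if (d == static_cast<int>(b)) continue;
            if (best == -1 || dom[d].size() > dom[best].size()) best = d;
        }
        assert(best != -1 && "a reachable non-entry block has a strict dominator");
        idom[b] = best;
    }
    return idom;
}
```

Comparing this slower derivation against `IDoms` on a handful of test functions is one way to gain confidence in the faster algorithm; it is intended only as a debugging aid.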
--- a/src/midend/Pass/Analysis/Loop.cpp
+++ b/src/midend/Pass/Analysis/Loop.cpp
@ -0,0 +1,415 @@
+#include "Dom.h" // provides the definition of DominatorTreeAnalysisPass
+#include "Loop.h"
+#include "AliasAnalysis.h" // alias analysis dependency
+#include "SideEffectAnalysis.h" // side-effect analysis dependency
+#include <iostream>
+#include <queue> // used for the BFS that assigns loop levels
+
+// Debug mode switch
+#ifndef DEBUG
+#define DEBUG 0
+#endif
+
+namespace sysy {
+
+// Unique ID of this pass
+void *LoopAnalysisPass::ID = (void *)&LoopAnalysisPass::ID;
+
+// Static member of the Loop class
+int Loop::NextLoopID = 0;
+// Implementation of LoopAnalysisResult::print() and its helpers
+
+
+void LoopAnalysisResult::printBBSet(const std::string &prefix, const std::set<BasicBlock *> &s) const{
+  if (!DEBUG) return;
+  std::cout << prefix << "{";
+  bool first = true;
+  for (const auto &bb : s) {
+    if (!first) std::cout << ", ";
+    std::cout << bb->getName();
+    first = false;
+  }
+  std::cout << "}";
+}
+
+// Helper: print a vector of Loop pointers
+void LoopAnalysisResult::printLoopVector(const std::string &prefix, const std::vector<Loop *> &loops) const {
+  if (!DEBUG) return;
+  std::cout << prefix << "[";
+  bool first = true;
+  for (const auto &loop : loops) {
+    if (!first) std::cout << ", ";
+    std::cout << loop->getName(); // assumes Loop::getName() exists
+    first = false;
+  }
+  std::cout << "]";
+}
+
+void LoopAnalysisResult::print() const {
+  if (!DEBUG) return; // print only in DEBUG mode
+
+  std::cout << "\n--- Loop Analysis Results for Function: " << AssociatedFunction->getName() << " ---" << std::endl;
+
+  if (AllLoops.empty()) {
+    std::cout << "  No loops found." << std::endl;
+    return;
+  }
+
+  std::cout << "Total Loops Found: " << AllLoops.size() << std::endl;
+
+  // 1. Group loops by nesting level
+  std::map<int, std::vector<Loop*>> loopsByLevel;
+  int maxLevel = 0;
+  for (const auto& loop_ptr : AllLoops) {
+      if (loop_ptr->getLoopLevel() != -1) { // make sure the level has been computed
+          loopsByLevel[loop_ptr->getLoopLevel()].push_back(loop_ptr.get());
+          if (loop_ptr->getLoopLevel() > maxLevel) {
+              maxLevel = loop_ptr->getLoopLevel();
+          }
+      }
+  }
+
+  // 2. Print the loop hierarchy
+  std::cout << "\n--- Loop Hierarchy ---" << std::endl;
+  for (int level = 0; level <= maxLevel; ++level) {
+      if (loopsByLevel.count(level)) {
+          std::cout << "Level " << level << " Loops:" << std::endl;
+          for (Loop* loop : loopsByLevel[level]) {
+              std::string indent(level * 2, ' '); // indent according to the level
+              std::cout << indent << "- Loop Header: " << loop->getName() << std::endl;
+              std::cout << indent << "  Blocks: ";
+              printBBSet("", loop->getBlocks());
+              std::cout << std::endl;
+
+              std::cout << indent << "  Exit Blocks: ";
+              printBBSet("", loop->getExitBlocks());
+              std::cout << std::endl;
+
+              std::cout << indent << "  Pre-Header: " << (loop->getPreHeader() ? loop->getPreHeader()->getName() : "None") << std::endl;
+              std::cout << indent << "  Parent Loop: " << (loop->getParentLoop() ? loop->getParentLoop()->getName() : "None (Outermost)") << std::endl;
+              std::cout << indent << "  Nested Loops: ";
+              printLoopVector("", loop->getNestedLoops());
+              std::cout << std::endl;
+          }
+      }
+  }
+
+  // 3. Print a summary of the outermost and innermost loops
+  std::cout << "\n--- Loop Summary ---" << std::endl;
+  std::cout << "Outermost Loops: ";
+  printLoopVector("", getOutermostLoops());
+  std::cout << std::endl;
+
+  std::cout << "Innermost Loops: ";
+  printLoopVector("", getInnermostLoops());
+  std::cout << std::endl;
+
+  std::cout << "-----------------------------------------------" << std::endl;
+}
+
+bool LoopAnalysisPass::runOnFunction(Function *F, AnalysisManager &AM) {
+  if (F->getBasicBlocks().empty()) {
+    CurrentResult = std::make_unique<LoopAnalysisResult>(F);
+    return false; // empty function, so there are no loops
+  }
+
+  if (DEBUG)
+    std::cout << "Running LoopAnalysisPass on function: " << F->getName() << std::endl;
+
+  // Fetch the dominator tree analysis result.
+  // Loop analysis depends on it.
+  DominatorTree *DT = AM.getAnalysisResult<DominatorTree, DominatorTreeAnalysisPass>(F);
+  if (!DT) {
+    // Without a dominator tree, loop analysis cannot proceed
+    std::cerr << "Error: DominatorTreeAnalysisResult not available for function " << F->getName() << std::endl;
+    CurrentResult = std::make_unique<LoopAnalysisResult>(F);
+    return false;
+  }
+
+  // Fetch the alias analysis result, used for analyzing memory accesses in loops
+  AliasAnalysisResult *aliasAnalysis = AM.getAnalysisResult<AliasAnalysisResult, SysYAliasAnalysisPass>(F);
+  if (DEBUG && aliasAnalysis) {
+    std::cout << "Loop Analysis: Using alias analysis results for enhanced memory pattern detection" << std::endl;
+  }
+
+  // Fetch the side-effect analysis result, used for loop purity analysis
+  SideEffectAnalysisResult *sideEffectAnalysis = AM.getAnalysisResult<SideEffectAnalysisResult, SysYSideEffectAnalysisPass>();
+  if (DEBUG && sideEffectAnalysis) {
+    std::cout << "Loop Analysis: Using side effect analysis results for loop purity detection" << std::endl;
+  }
+
+  CurrentResult = std::make_unique<LoopAnalysisResult>(F);
+  bool changed = false; // loop analysis does not modify the IR, so it normally returns false
+
+  // Step 1: identify back edges and their natural loops
+  // An edge N -> D is a back edge when D dominates N
+  std::vector<std::pair<BasicBlock *, BasicBlock *>> backEdges;
+  for (auto &BB : F->getBasicBlocks()) {
+    auto Block = BB.get();
+    for (BasicBlock *Succ : Block->getSuccessors()) {
+      if (DT->getDominators(Block) && DT->getDominators(Block)->count(Succ)) {
+        // Succ dominates Block, so Block -> Succ is a back edge
+        backEdges.push_back({Block, Succ});
+        if (DEBUG)
+          std::cout << "Found back edge: " << Block->getName() << " -> " << Succ->getName() << std::endl;
+      }
+    }
+  }
+  
+  if (DEBUG)
+    std::cout << "Total back edges found: " << backEdges.size() << std::endl;
+
+  // Step 2: build the natural loop for each back edge
+  std::map<BasicBlock*, std::unique_ptr<Loop>> loopMap; // grouped by loop header
+  
+  for (auto &edge : backEdges) {
+    BasicBlock *N = edge.first;  // tail of the back edge
+    BasicBlock *D = edge.second; // head of the back edge (the loop header)
+
+    // Check whether a loop has already been created for this header
+    if (loopMap.find(D) == loopMap.end()) {
+      // Create a new Loop object
+      loopMap[D] = std::make_unique<Loop>(D);
+    }
+    
+    Loop* currentLoop = loopMap[D].get();
+
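+    // Collect the loop body for this back edge by walking predecessors backwards from N until D is reached
+    std::set<BasicBlock *> loopBlocks; // temporary storage for the loop's blocks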
+    std::queue<BasicBlock *> q;
+
+    // The loop header is always part of the loop body
+    loopBlocks.insert(D);
+    
+    // If the tail of the back edge is not the header itself, queue it for traversal
+    if (N != D) {
+      q.push(N);
+      loopBlocks.insert(N);
+    }
+
+    while (!q.empty()) {
+      BasicBlock *current = q.front();
+      q.pop();
+
+      for (BasicBlock *pred : current->getPredecessors()) {
+        // If the predecessor has not been visited yet, add it to the loop body and keep walking
+        if (loopBlocks.find(pred) == loopBlocks.end()) {
+          loopBlocks.insert(pred);
+          q.push(pred);
+        }
+      }
+    }
+
+    // Add the collected blocks to the Loop object (merging the results of all back edges that share this header)
+    for (BasicBlock *loopBB : loopBlocks) {
+      currentLoop->addBlock(loopBB);
+    }
+  }
+
+  // Process each merged loop
+  for (auto &[header, currentLoop] : loopMap) {
+    const auto &loopBlocks = currentLoop->getBlocks();
+
+    // Step 3: identify the loop's exit blocks
+    for (BasicBlock *loopBB : loopBlocks) {
+      for (BasicBlock *succ : loopBB->getSuccessors()) {
+        if (loopBlocks.find(succ) == loopBlocks.end()) {
+          // The successor lies outside the loop body, so loopBB is an exit block
+          currentLoop->addExitBlock(loopBB);
+        }
+      }
+    }
+
+    // Step 4: identify the loop pre-header
+    BasicBlock *candidatePreHeader = nullptr;
+    int externalPredecessorCount = 0;
+    for (BasicBlock *predOfHeader : header->getPredecessors()) {
+      // Use currentLoop->contains() to check whether the predecessor is inside the loop body
+      if (!currentLoop->contains(predOfHeader)) {
+        // A predecessor outside the loop body is an external predecessor
+        externalPredecessorCount++;
+        candidatePreHeader = predOfHeader;
+      }
+    }
+
+    if (externalPredecessorCount == 1) {
+      currentLoop->setPreHeader(candidatePreHeader);
+    }
+    CurrentResult->addLoop(std::move(currentLoop));
+  }
+
+  // Step 5: handle nested loops (determine parent-child relationships and levels)
+  const auto &allLoops = CurrentResult->getAllLoops();
+
+  // 1. First clear any parent-child links and nested-loop lists already set, so that everything is recomputed
+  for (const auto &loop_ptr : allLoops) {
+      loop_ptr->setParentLoop(nullptr); // clear the parent pointer
+      loop_ptr->clearNestedLoops();     // clear the list of child loops
+      loop_ptr->setLoopLevel(-1);       // reset the loop level
+  }
+
+  // 2. For every loop, find its immediate parent loop and record the relationship
+  for (const auto &innerLoop_ptr : allLoops) {
+    Loop *innerLoop = innerLoop_ptr.get();
+    Loop *immediateParent = nullptr; // closest enclosing loop of innerLoop found so far
+
+    for (const auto &outerLoop_ptr : allLoops) {
+      Loop *outerLoop = outerLoop_ptr.get();
+
+      // A loop cannot be its own parent
+      if (outerLoop == innerLoop) {
+        continue;
+      }
+
+      // Check every condition for outerLoop to contain innerLoop:
+      // Condition 1: outerLoop's header dominates innerLoop's header
+      if (!(DT->getDominators(innerLoop->getHeader()) &&
+            DT->getDominators(innerLoop->getHeader())->count(outerLoop->getHeader()))) {
+        continue; // outerLoop's header does not dominate innerLoop's header, so outerLoop is not an enclosing loop
+      }
+
+      // Condition 2: every basic block of innerLoop is in outerLoop's block set
+      bool allInnerBlocksInOuter = true;
+      for (BasicBlock *innerBB : innerLoop->getBlocks()) {
+        if (!outerLoop->contains(innerBB)) {
+          allInnerBlocksInOuter = false;
+          break;
+        }
+      }
+      if (!allInnerBlocksInOuter) {
+        continue; // outerLoop does not contain all of innerLoop's blocks
+      }
+
+      // At this point outerLoop is a confirmed candidate parent of innerLoop (it contains innerLoop)
+
+      if (immediateParent == nullptr) {
+        // This is the first candidate parent found
+        immediateParent = outerLoop;
+      } else {
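+        // An immediateParent already exists, so decide which candidate is the tighter parent.
+        // The tighter parent is the one that is contained in the other candidate.
+        // If the current immediateParent contains outerLoop's header, outerLoop is nested deeper (closer to innerLoop)
+        if (immediateParent->contains(outerLoop->getHeader())) {
+          immediateParent = outerLoop; // outerLoop is the tighter parent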
+        }
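+        // Otherwise (outerLoop contains immediateParent's header), immediateParent is the tighter one and is kept.
+        // If neither contains the other (which should not happen, since both contain innerLoop), immediateParent is kept as well.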
+      }
+    }
+
+    // Set innerLoop's immediate parent and add innerLoop to the parent's nested-loop list
+    if (immediateParent) {
+      innerLoop->setParentLoop(immediateParent);
+      immediateParent->addNestedLoop(innerLoop); 
+    }
+  }
+
+  // 3. Compute the loop levels
+  std::queue<Loop *> q_level;
+
+  // Find all outermost loops (those without a parent), set their level to 0, and enqueue them
+  for (const auto &loop_ptr : allLoops) {
+    if (loop_ptr->isOutermost()) {
+      loop_ptr->setLoopLevel(0);
+      q_level.push(loop_ptr.get());
+    }
+  }
+
+  // Walk the loop tree with a BFS to compute the level of every nested loop
+  while (!q_level.empty()) {
+    Loop *current = q_level.front();
+    q_level.pop();
+
+    for (Loop *nestedLoop : current->getNestedLoops()) {
+      nestedLoop->setLoopLevel(current->getLoopLevel() + 1);
+      q_level.push(nestedLoop);
+    }
+  }
+
+  if (DEBUG) {
+    std::cout << "Loop Analysis completed for function: " << F->getName() << std::endl;
+    std::cout << "Total loops found: " << CurrentResult->getLoopCount() << std::endl;
+    std::cout << "Max loop depth: " << CurrentResult->getMaxLoopDepth() << std::endl;
+    std::cout << "Innermost loops: " << CurrentResult->getInnermostLoops().size() << std::endl;
+    std::cout << "Outermost loops: " << CurrentResult->getOutermostLoops().size() << std::endl;
+    
+    // Print the distribution of loops per depth
+    for (int depth = 1; depth <= CurrentResult->getMaxLoopDepth(); ++depth) {
+      int count = CurrentResult->getLoopCountAtDepth(depth);
+      if (count > 0) {
+        std::cout << "Loops at depth " << depth << ": " << count << std::endl;
+      }
+    }
+    
+    // Print cache statistics
+    auto cacheStats = CurrentResult->getCacheStats();
+    std::cout << "Cache statistics - Total cached queries: " << cacheStats.totalCachedQueries << std::endl;
+  }
+
+  return changed;
+}
+
+// ========== Additional Loop member functions ==========
+
+bool Loop::mayHaveSideEffects(SideEffectAnalysisResult* sideEffectAnalysis) const {
+  if (!sideEffectAnalysis) return true; // conservative assumption
+  
+  for (BasicBlock* bb : LoopBlocks) {
+    for (auto& inst : bb->getInstructions()) {
+      if (sideEffectAnalysis->hasSideEffect(inst.get())) {
+        return true;
+      }
+    }
+  }
+  return false;
+}
+
+bool Loop::accessesGlobalMemory(AliasAnalysisResult* aliasAnalysis) const {
+  if (!aliasAnalysis) return true; // conservative assumption
+  
+  for (BasicBlock* bb : LoopBlocks) {
+    for (auto& inst : bb->getInstructions()) {
+      if (auto* loadInst = dynamic_cast<LoadInst*>(inst.get())) {
+        if (!aliasAnalysis->isLocalArray(loadInst->getPointer())) {
+          return true;
+        }
+      } else if (auto* storeInst = dynamic_cast<StoreInst*>(inst.get())) {
+        if (!aliasAnalysis->isLocalArray(storeInst->getPointer())) {
+          return true;
+        }
+      }
+    }
+  }
+  return false;
+}
+
+bool Loop::hasMemoryAliasConflicts(AliasAnalysisResult* aliasAnalysis) const {
+  if (!aliasAnalysis) return true; // conservative assumption
+  
+  std::vector<Value*> memoryAccesses;
+  
+  // Collect all memory accesses
+  for (BasicBlock* bb : LoopBlocks) {
+    for (auto& inst : bb->getInstructions()) {
+      if (auto* loadInst = dynamic_cast<LoadInst*>(inst.get())) {
+        memoryAccesses.push_back(loadInst->getPointer());
+      } else if (auto* storeInst = dynamic_cast<StoreInst*>(inst.get())) {
+        memoryAccesses.push_back(storeInst->getPointer());
+      }
+    }
+  }
+  
+  // Check every pair of accesses for aliasing
+  for (size_t i = 0; i < memoryAccesses.size(); ++i) {
+    for (size_t j = i + 1; j < memoryAccesses.size(); ++j) {
+      auto aliasType = aliasAnalysis->queryAlias(memoryAccesses[i], memoryAccesses[j]);
+      if (aliasType == AliasType::SELF_ALIAS || aliasType == AliasType::POSSIBLE_ALIAS) {
+        return true;
+      }
+    }
+  }
+  
+  return false;
+}
+
+} // namespace sysy
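
Step 2 of `LoopAnalysisPass::runOnFunction` above collects the natural loop of a back edge by walking predecessors backwards from the tail until the header is reached. The following is a minimal, self-contained sketch of that traversal on a plain integer graph. It is an illustration under assumptions, not the pass itself: blocks are hypothetical integer ids, `preds[b]` lists the predecessors of block `b`, the function name `naturalLoop` is invented, and the caller is assumed to have already verified that `header` dominates `tail`.

```cpp
#include <queue>
#include <set>
#include <vector>

// Returns the natural loop of the back edge tail -> header.
// Assumption: header dominates tail, so the backward walk cannot escape past header.
std::set<int> naturalLoop(const std::vector<std::vector<int>>& preds, int tail, int header) {
    std::set<int> body{header};   // the header always belongs to the loop
    std::queue<int> work;

    if (tail != header) {         // a self-loop consists of the header alone
        body.insert(tail);
        work.push(tail);
    }

    while (!work.empty()) {
        int current = work.front();
        work.pop();
        for (int p : preds[current]) {
            // insert().second is true only for blocks not seen before;
            // the header is already in the set, so the walk stops there.
            if (body.insert(p).second) work.push(p);
        }
    }
    return body;
}
```

For the graph 0->1, 1->2, 1->3, 2->4, 3->4, 4->1, the back edge 4->1 produces the block set {1, 2, 3, 4}. Merging the sets of all back edges that share a header, as the pass does, gives one loop per header.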
--- a/src/midend/Pass/Analysis/LoopCharacteristics.cpp
+++ b/src/midend/Pass/Analysis/LoopCharacteristics.cpp
--- a/src/midend/Pass/Analysis/LoopVectorization.cpp
+++ b/src/midend/Pass/Analysis/LoopVectorization.cpp
@ -0,0 +1,803 @@
+#include "LoopVectorization.h"
+#include "Dom.h"
+#include "Loop.h"
+#include "Liveness.h"
+#include "AliasAnalysis.h" 
+#include "SideEffectAnalysis.h"
+#include <iostream>
+#include <algorithm>
+#include <cmath>
+#include <set>
+
+extern int DEBUG;
+
+namespace sysy {
+
+// Unique ID of this pass
+void *LoopVectorizationPass::ID = (void *)&LoopVectorizationPass::ID;
+
+std::vector<int> DependenceVector::getDirectionVector() const {
+  std::vector<int> direction;
+  direction.reserve(distances.size());
+  
+  for (int dist : distances) {
+    if (dist > 0) direction.push_back(1);       // forward dependence
+    else if (dist < 0) direction.push_back(-1); // backward dependence
+    else direction.push_back(0);                // distance 0: dependence within the same iteration
+  }
+  
+  return direction;
+}
+
+bool DependenceVector::isVectorizationSafe() const {
+  if (!isKnown) return false; // unknown dependence: not safe
+  
+  // For vectorization, the dependence on the innermost loop is what matters most
+  if (distances.empty()) return true;
+  
+  int innermostDistance = distances.back(); // distance for the innermost loop
+  
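+  // Forward dependences (distance > 0) are usually safe and can be handled by adjusting the vectorization order
+  // Backward dependences (distance < 0) are usually unsafe and block vectorization
+  // Distance 0 means a dependence within the same iteration, which is usually safe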
+  
+  return innermostDistance >= 0;
+}
+
+size_t LoopVectorizationResult::getVectorizableLoopCount() const {
+  size_t count = 0;
+  for (const auto& [loop, analysis] : VectorizationMap) {
+    if (analysis.isVectorizable) count++;
+  }
+  return count;
+}
+
+size_t LoopVectorizationResult::getParallelizableLoopCount() const {
+  size_t count = 0;
+  for (const auto& [loop, analysis] : ParallelizationMap) {
+    if (analysis.isParallelizable) count++;
+  }
+  return count;
+}
+
+std::vector<Loop*> LoopVectorizationResult::getVectorizationCandidates() const {
+  std::vector<Loop*> candidates;
+  for (const auto& [loop, analysis] : VectorizationMap) {
+    if (analysis.isVectorizable) {
+      candidates.push_back(loop);
+    }
+  }
+  
+  // Sort by suggested vector width so that loops with a larger expected benefit are handled first
+  std::sort(candidates.begin(), candidates.end(), 
+    [this](Loop* a, Loop* b) {
+      const auto& analysisA = VectorizationMap.at(a);
+      const auto& analysisB = VectorizationMap.at(b);
+      return analysisA.suggestedVectorWidth > analysisB.suggestedVectorWidth;
+    });
+  
+  return candidates;
+}
+
+std::vector<Loop*> LoopVectorizationResult::getParallelizationCandidates() const {
+  std::vector<Loop*> candidates;
+  for (const auto& [loop, analysis] : ParallelizationMap) {
+    if (analysis.isParallelizable) {
+      candidates.push_back(loop);
+    }
+  }
+  
+  // Sort by suggested thread count
+  std::sort(candidates.begin(), candidates.end(),
+    [this](Loop* a, Loop* b) {
+      const auto& analysisA = ParallelizationMap.at(a);
+      const auto& analysisB = ParallelizationMap.at(b);
+      return analysisA.suggestedThreadCount > analysisB.suggestedThreadCount;
+    });
+  
+  return candidates;
+}
+
+void LoopVectorizationResult::print() const {
+  if (!DEBUG) return;
+
+  std::cout << "\n--- Loop Vectorization/Parallelization Analysis Results for Function: " 
+            << AssociatedFunction->getName() << " ---" << std::endl;
+
+  if (VectorizationMap.empty() && ParallelizationMap.empty()) {
+    std::cout << "  No vectorization/parallelization analysis results." << std::endl;
+    return;
+  }
+
+  // Summary statistics
+  std::cout << "\n=== Summary ===" << std::endl;
+  std::cout << "Total Loops Analyzed: " << VectorizationMap.size() << std::endl;
+  std::cout << "Vectorizable Loops: " << getVectorizableLoopCount() << std::endl;
+  std::cout << "Parallelizable Loops: " << getParallelizableLoopCount() << std::endl;
+
+  // Detailed per-loop results
+  for (const auto& [loop, vecAnalysis] : VectorizationMap) {
+    std::cout << "\n--- Loop: " << loop->getName() << " ---" << std::endl;
+    
+    // Vectorization analysis (on hold for now)
+    std::cout << "  Vectorization: " << (vecAnalysis.isVectorizable ? "YES" : "NO") << std::endl;
+    if (!vecAnalysis.preventingFactors.empty()) {
+      std::cout << "    Preventing Factors: ";
+      for (const auto& factor : vecAnalysis.preventingFactors) {
+        std::cout << factor << " ";
+      }
+      std::cout << std::endl;
+    }
+    
+    // Parallelization analysis
+    auto parallelIt = ParallelizationMap.find(loop);
+    if (parallelIt != ParallelizationMap.end()) {
+      const auto& parAnalysis = parallelIt->second;
+      std::cout << "  Parallelization: " << (parAnalysis.isParallelizable ? "YES" : "NO") << std::endl;
+      if (parAnalysis.isParallelizable) {
+        std::cout << "    Suggested Thread Count: " << parAnalysis.suggestedThreadCount << std::endl;
+        if (parAnalysis.requiresReduction) {
+          std::cout << "    Requires Reduction: Yes" << std::endl;
+        }
+        if (parAnalysis.requiresBarrier) {
+          std::cout << "    Requires Barrier: Yes" << std::endl;
+        }
+      } else if (!parAnalysis.preventingFactors.empty()) {
+        std::cout << "    Preventing Factors: ";
+        for (const auto& factor : parAnalysis.preventingFactors) {
+          std::cout << factor << " ";
+        }
+        std::cout << std::endl;
+      }
+    }
+    
+    // Dependences
+    auto depIt = DependenceMap.find(loop);
+    if (depIt != DependenceMap.end()) {
+      const auto& dependences = depIt->second;
+      std::cout << "  Dependences: " << dependences.size() << " found" << std::endl;
+      for (const auto& dep : dependences) {
+        if (dep.dependenceVector.isKnown) {
+          std::cout << "    " << dep.source->getName() << " -> " << dep.sink->getName();
+          std::cout << " [";
+          for (size_t i = 0; i < dep.dependenceVector.distances.size(); ++i) {
+            if (i > 0) std::cout << ",";
+            std::cout << dep.dependenceVector.distances[i];
+          }
+          std::cout << "]" << std::endl;
+        }
+      }
+    }
+  }
+  
+  std::cout << "-----------------------------------------------" << std::endl;
+}
+
+bool LoopVectorizationPass::runOnFunction(Function *F, AnalysisManager &AM) {
+  if (F->getBasicBlocks().empty()) {
+    CurrentResult = std::make_unique<LoopVectorizationResult>(F);
+    return false;
+  }
+
+  if (DEBUG) {
+    std::cout << "Running LoopVectorizationPass on function: " << F->getName() << std::endl;
+  }
+
+  // Fetch the loop analysis result
+  auto* loopAnalysisResult = AM.getAnalysisResult<LoopAnalysisResult, LoopAnalysisPass>(F);
+  if (!loopAnalysisResult || !loopAnalysisResult->hasLoops()) {
+    CurrentResult = std::make_unique<LoopVectorizationResult>(F);
+    return false;
+  }
+
+  // Fetch the loop characteristics analysis result
+  auto* loopCharacteristics = AM.getAnalysisResult<LoopCharacteristicsResult, LoopCharacteristicsPass>(F);
+  if (!loopCharacteristics) {
+    if (DEBUG) {
+      std::cout << "Warning: LoopCharacteristics analysis not available" << std::endl;
+    }
+  }
+
+  // Fetch the alias analysis result
+  auto* aliasAnalysis = AM.getAnalysisResult<AliasAnalysisResult, SysYAliasAnalysisPass>(F);
+  
+  // Fetch the side-effect analysis result
+  auto* sideEffectAnalysis = AM.getAnalysisResult<SideEffectAnalysisResult, SysYSideEffectAnalysisPass>();
+
+  CurrentResult = std::make_unique<LoopVectorizationResult>(F);
+
+  // Analyze vectorization/parallelization feasibility for each loop
+  for (const auto& loop_ptr : loopAnalysisResult->getAllLoops()) {
+    Loop* loop = loop_ptr.get();
+    
+    // Look up the characteristics of this loop
+    LoopCharacteristics* characteristics = nullptr;
+    if (loopCharacteristics) {
+      characteristics = const_cast<LoopCharacteristics*>(loopCharacteristics->getCharacteristics(loop));
+    }
+    
+    analyzeLoop(loop, characteristics, aliasAnalysis, sideEffectAnalysis);
+  }
+
+  if (DEBUG) {
+    std::cout << "LoopVectorizationPass completed. Found " 
+              << CurrentResult->getVectorizableLoopCount() << " vectorizable loops, "
+              << CurrentResult->getParallelizableLoopCount() << " parallelizable loops" << std::endl;
+  }
+
+  return false; // analysis pass: does not modify the IR
+}
+
+void LoopVectorizationPass::analyzeLoop(Loop* loop, LoopCharacteristics* characteristics, 
+                                       AliasAnalysisResult* aliasAnalysis, SideEffectAnalysisResult* sideEffectAnalysis) {
+  if (DEBUG) {
+    std::cout << "  Analyzing advanced features for loop: " << loop->getName() << std::endl;
+  }
+
+  // 1. Compute precise dependence vectors
+  auto dependences = computeDependenceVectors(loop, aliasAnalysis);
+  CurrentResult->addDependenceAnalysis(loop, dependences);
+
+  // 2. Analyze vectorization feasibility (on hold for now; always reports not vectorizable)
+  auto vecAnalysis = analyzeVectorizability(loop, dependences, characteristics);
+  CurrentResult->addVectorizationAnalysis(loop, vecAnalysis);
+
+  // 3. Analyze parallelization feasibility
+  auto parAnalysis = analyzeParallelizability(loop, dependences, characteristics);
+  CurrentResult->addParallelizationAnalysis(loop, parAnalysis);
+}
+
+// ========== Dependence vector analysis ==========
+
+std::vector<PreciseDependence> LoopVectorizationPass::computeDependenceVectors(Loop* loop, 
+                                                                              AliasAnalysisResult* aliasAnalysis) {
+  std::vector<PreciseDependence> dependences;
+  std::vector<Instruction*> memoryInsts;
+  
+  // Collect all memory instructions (loads and stores) in the loop
+  for (BasicBlock* bb : loop->getBlocks()) {
+    for (auto& inst : bb->getInstructions()) {
+      if (dynamic_cast<LoadInst*>(inst.get()) || dynamic_cast<StoreInst*>(inst.get())) {
+        memoryInsts.push_back(inst.get());
+      }
+    }
+  }
+  
+  // Analyze the dependence between every pair of memory instructions
+  for (size_t i = 0; i < memoryInsts.size(); ++i) {
+    for (size_t j = i + 1; j < memoryInsts.size(); ++j) {
+      Instruction* inst1 = memoryInsts[i];
+      Instruction* inst2 = memoryInsts[j];
+      
+      Value* ptr1 = nullptr;
+      Value* ptr2 = nullptr;
+      
+      if (auto* load = dynamic_cast<LoadInst*>(inst1)) {
+        ptr1 = load->getPointer();
+      } else if (auto* store = dynamic_cast<StoreInst*>(inst1)) {
+        ptr1 = store->getPointer();
+      }
+      
+      if (auto* load = dynamic_cast<LoadInst*>(inst2)) {
+        ptr2 = load->getPointer();
+      } else if (auto* store = dynamic_cast<StoreInst*>(inst2)) {
+        ptr2 = store->getPointer();
+      }
+      
+      if (!ptr1 || !ptr2) continue;
+      
+      // Check whether the two pointers may alias
+      bool mayAlias = false;
+      if (aliasAnalysis) {
+        mayAlias = aliasAnalysis->queryAlias(ptr1, ptr2) != AliasType::NO_ALIAS;
+      } else {
+        mayAlias = true; // no alias information: conservatively assume any two accesses may alias
+      }
+      
+      if (mayAlias) {
+        // Record the dependence
+        PreciseDependence dep(loop->getLoopDepth());
+        dep.source = inst1;
+        dep.sink = inst2;
+        dep.memoryLocation = ptr1;
+        
+        // Determine the dependence type
+        bool isStore1 = dynamic_cast<StoreInst*>(inst1) != nullptr;
+        bool isStore2 = dynamic_cast<StoreInst*>(inst2) != nullptr;
+        
+        if (isStore1 && !isStore2) {
+          dep.type = DependenceType::TRUE_DEPENDENCE; // Write -> Read (RAW)
+        } else if (!isStore1 && isStore2) {
+          dep.type = DependenceType::ANTI_DEPENDENCE; // Read -> Write (WAR)
+        } else if (isStore1 && isStore2) {
+          dep.type = DependenceType::OUTPUT_DEPENDENCE; // Write -> Write (WAW)
+        } else {
+          continue; // Read -> Read (RAR): skip, not a real dependence
+        }
+        
+        // Compute the dependence vector
+        dep.dependenceVector = computeAccessDependence(inst1, inst2, loop);
+        
+        // Decide whether this dependence allows parallelization
+        dep.allowsParallelization = dep.dependenceVector.isLoopIndependent() || 
+                                   (dep.dependenceVector.isKnown && 
+                                    std::all_of(dep.dependenceVector.distances.begin(), 
+                                               dep.dependenceVector.distances.end(),
+                                               [](int d) { return d >= 0; }));
+        
+        dependences.push_back(dep);
+        
+        if (DEBUG && dep.dependenceVector.isKnown) {
+          std::cout << "        Found dependence: " << inst1->getName() 
+                    << " -> " << inst2->getName() << " [";
+          for (size_t k = 0; k < dep.dependenceVector.distances.size(); ++k) {
+            if (k > 0) std::cout << ",";
+            std::cout << dep.dependenceVector.distances[k];
+          }
+          std::cout << "]" << std::endl;
+        }
+      }
+    }
+  }
+  
+  return dependences;
+}
+
+DependenceVector LoopVectorizationPass::computeAccessDependence(Instruction* inst1, Instruction* inst2, Loop* loop) {
+  DependenceVector depVec(loop->getLoopDepth());
+  
+  Value* ptr1 = nullptr;
+  Value* ptr2 = nullptr;
+  
+  if (auto* load = dynamic_cast<LoadInst*>(inst1)) {
+    ptr1 = load->getPointer();
+  } else if (auto* store = dynamic_cast<StoreInst*>(inst1)) {
+    ptr1 = store->getPointer();
+  }
+  
+  if (auto* load = dynamic_cast<LoadInst*>(inst2)) {
+    ptr2 = load->getPointer();
+  } else if (auto* store = dynamic_cast<StoreInst*>(inst2)) {
+    ptr2 = store->getPointer();
+  }
+  
+  if (!ptr1 || !ptr2) return depVec;
+  
+  // Try to analyze the affine relation between the two accesses
+  if (areAccessesAffinelyRelated(ptr1, ptr2, loop)) {
+    auto coeff1 = extractInductionCoefficients(ptr1, loop);
+    auto coeff2 = extractInductionCoefficients(ptr2, loop);
+    
+    if (coeff1.size() == coeff2.size()) {
+      depVec.isKnown = true;
+      depVec.isConstant = true;
+      
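+      // Stay within distances: the coefficient list may hold more entries than the distance vector
+      for (size_t i = 0; i < coeff1.size() && i < depVec.distances.size(); ++i) {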
+        depVec.distances[i] = coeff2[i] - coeff1[i];
+      }
+    }
+  }
+  
+  return depVec;
+}
+
+bool LoopVectorizationPass::areAccessesAffinelyRelated(Value* ptr1, Value* ptr2, Loop* loop) {
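+  // Simplified implementation: check whether both are induction-variable-based array accesses
+  // A real implementation needs a full affine-relation analysis
+  
+  // Both pointers must be GEP instructions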
+  auto* gep1 = dynamic_cast<GetElementPtrInst*>(ptr1);
+  auto* gep2 = dynamic_cast<GetElementPtrInst*>(ptr2);
+  
+  if (!gep1 || !gep2) return false;
+  
+  // Both accesses must use the same array base pointer
+  if (gep1->getBasePointer() != gep2->getBasePointer()) return false;
+  
+  // Simplification: assume both accesses are affine
+  return true;
+}
+
+// ========== Vectorization analysis (on hold for now) ==========
+
+VectorizationAnalysis LoopVectorizationPass::analyzeVectorizability(Loop* loop, 
+                                                                   const std::vector<PreciseDependence>& dependences,
+                                                                   LoopCharacteristics* characteristics) {
+  VectorizationAnalysis analysis; // the constructor already marks the loop as not vectorizable
+  
+  if (DEBUG) {
+    std::cout << "    Vectorization analysis: DISABLED (temporarily)" << std::endl;
+  }
+  
+  // Vectorization is on hold for now, so this always reports "not vectorizable"
+  // Basic diagnostics are still collected here for logging
+  if (!loop->isInnermost()) {
+    analysis.preventingFactors.push_back("Not innermost loop");
+  }
+  if (loop->getBlocks().size() > 1) {
+    analysis.preventingFactors.push_back("Complex control flow");
+  }
+  if (!dependences.empty()) {
+    analysis.preventingFactors.push_back("Has dependences (not analyzed in detail)");
+  }
+  
+  return analysis;
+}
+
+// ========== Parallelization analysis ==========
+
+ParallelizationAnalysis LoopVectorizationPass::analyzeParallelizability(Loop* loop,
+                                                                       const std::vector<PreciseDependence>& dependences,
+                                                                       LoopCharacteristics* characteristics) {
+  ParallelizationAnalysis analysis;
+  
+  if (DEBUG) {
+    std::cout << "    Analyzing parallelizability for loop: " << loop->getName() << std::endl;
+    std::cout << "      Found " << dependences.size() << " dependences" << std::endl;
+  }
+  
+  // Classify the dependences by type
+  bool hasTrueDependences = false;
+  bool hasAntiDependences = false;
+  bool hasOutputDependences = false;
+  
+  for (const auto& dep : dependences) {
+    switch (dep.type) {
+      case DependenceType::TRUE_DEPENDENCE:
+        hasTrueDependences = true;
+        // True dependences are usually the hardest to handle; check for a reduction pattern
+        if (dep.isReductionDependence) {
+          analysis.requiresReduction = true;
+          analysis.reductionVariables.insert(dep.memoryLocation);
+        } else {
+          analysis.preventingFactors.push_back("Non-reduction true dependence");
+        }
+        break;
+      case DependenceType::ANTI_DEPENDENCE:
+        hasAntiDependences = true;
+        // Anti-dependences can be resolved by privatizing the variable
+        analysis.privatizableVariables.insert(dep.memoryLocation);
+        break;
+      case DependenceType::OUTPUT_DEPENDENCE:
+        hasOutputDependences = true;
+        // Output dependences can be resolved by privatization or atomic operations
+        analysis.sharedVariables.insert(dep.memoryLocation);
+        break;
+    }
+  }
+  
+  // Determine the parallelization type
+  analysis.parallelType = determineParallelizationType(loop, dependences);
+  
+  // Evaluate parallelizability based on the dependence types
+  if (!hasTrueDependences && !hasOutputDependences) {
+    // Only anti-dependences, or no dependences at all: fully parallel
+    analysis.parallelType = ParallelizationAnalysis::EMBARRASSINGLY_PARALLEL;
+    analysis.isParallelizable = true;
+  } else if (analysis.requiresReduction) {
+    // Reduction pattern: parallelizable, but needs special handling
+    analysis.parallelType = ParallelizationAnalysis::REDUCTION_PARALLEL;
+    analysis.isParallelizable = true;
+  } else if (hasTrueDependences) {
+    // Non-reduction true dependences: usually not parallelizable
+    analysis.isParallelizable = false;
+    analysis.preventingFactors.push_back("Non-reduction loop-carried true dependences");
+  }
+  
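+  if (analysis.isParallelizable) {
+    // Further analyze the benefit and cost of parallelization.
+    // The benefit estimate reads suggestedThreadCount and requiresBarrier, so compute those first.
+    analysis.suggestedThreadCount = estimateOptimalThreadCount(loop, characteristics);
+    analyzeSynchronizationNeeds(loop, &analysis, dependences);
+    estimateParallelizationBenefit(loop, &analysis, characteristics);
+  }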
+  
+  if (DEBUG) {
+    std::cout << "      Parallelizable: " << (analysis.isParallelizable ? "YES" : "NO") << std::endl;
+    if (analysis.isParallelizable) {
+      std::cout << "      Type: " << (int)analysis.parallelType << ", Threads: " << analysis.suggestedThreadCount << std::endl;
+    }
+  }
+  
+  return analysis;
+}
+
+bool LoopVectorizationPass::checkParallelizationLegality(Loop* loop, const std::vector<PreciseDependence>& dependences) {
+  // Every dependence must allow parallelization
+  for (const auto& dep : dependences) {
+    if (!dep.allowsParallelization) {
+      return false;
+    }
+  }
+  
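+  // Check for operations that cannot be parallelized
+  for (BasicBlock* bb : loop->getBlocks()) {
+    for (auto& inst : bb->getInstructions()) {
+      // Check for atomic operations, synchronization operations, and so on
+      if (dynamic_cast<CallInst*>(inst.get())) {
+        // Simplification: assume any function call needs special handling
+        // A real implementation would analyze the callee's side effects
+        return false;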
+      }
+    }
+  }
+  
+  return true;
+}
+
+int LoopVectorizationPass::estimateOptimalThreadCount(Loop* loop, LoopCharacteristics* characteristics) {
+  // Estimate the optimal thread count from the loop characteristics
+  if (!characteristics) return 2;
+  
+  // Start from the loop body size and compute density
+  int baseThreads = 2;
+  
+  if (characteristics->instructionCount > 50) baseThreads = 4;
+  if (characteristics->instructionCount > 200) baseThreads = 8;
+  
+  // Adjust by the compute-to-memory ratio
+  if (characteristics->computeToMemoryRatio > 2.0) {
+    baseThreads *= 2; // compute-bound: more threads can be used
+  }
+  
+  return std::min(baseThreads, 16); // cap the thread count
+}
+
+// ========== Helper methods ==========
+
+bool LoopVectorizationPass::isConstantStride(Value* ptr, Loop* loop, int& stride) {
+  // Simplified implementation: check whether the access has a constant stride
+  stride = 1; // default stride
+  
+  auto* gep = dynamic_cast<GetElementPtrInst*>(ptr);
+  if (!gep) return false;
+  
+  // Check whether the last index is induction variable + constant
+  if (gep->getNumIndices() > 0) {
+    Value* lastIndex = gep->getIndex(gep->getNumIndices() - 1);
+    
+    // Simplification: assume the index has the form i or i + c
+    if (auto* binInst = dynamic_cast<BinaryInst*>(lastIndex)) {
+      if (binInst->getKind() == Instruction::kAdd) {
+        // Check for i + constant
+        if (dynamic_cast<ConstantInteger*>(binInst->getRhs())) {
+          stride = 1; // in i + c the constant is an offset; the stride is still 1
+          return true;
+        }
+      }
+    }
+    
+    // Default: contiguous access with stride 1
+    stride = 1;
+    return true;
+  }
+  
+  return false;
+}
+
+std::vector<int> LoopVectorizationPass::extractInductionCoefficients(Value* ptr, Loop* loop) {
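+  // Simplified implementation: return default affine coefficients
+  std::vector<int> coefficients;
+  
+  // Assume the simple a[i] form, with coefficients [0, 1]
+  coefficients.push_back(0); // constant term
+  coefficients.push_back(1); // induction-variable coefficient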
+  
+  return coefficients;
+}
+
+// ========== Previously missing method implementations ==========
+
+ParallelizationAnalysis::ParallelizationType LoopVectorizationPass::determineParallelizationType(
+    Loop* loop, const std::vector<PreciseDependence>& dependences) {
+  
+  // Check whether there are any dependences at all
+  if (dependences.empty()) {
+    return ParallelizationAnalysis::EMBARRASSINGLY_PARALLEL;
+  }
+  
+  // Check whether the only dependences are reduction patterns
+  bool hasReduction = false;
+  bool hasOtherDependences = false;
+  
+  for (const auto& dep : dependences) {
+    if (dep.isReductionDependence) {
+      hasReduction = true;
+    } else if (dep.type == DependenceType::TRUE_DEPENDENCE) {
+      hasOtherDependences = true;
+    }
+  }
+  
+  if (hasReduction && !hasOtherDependences) {
+    return ParallelizationAnalysis::REDUCTION_PARALLEL;
+  } else if (!hasOtherDependences) {
+    return ParallelizationAnalysis::EMBARRASSINGLY_PARALLEL;
+  }
+  
+  return ParallelizationAnalysis::NONE;
+}
+
+void LoopVectorizationPass::analyzeReductionPatterns(Loop* loop, ParallelizationAnalysis* analysis) {
+  // Simplified implementation: look for common reduction patterns
+  for (BasicBlock* bb : loop->getBlocks()) {
+    for (auto& inst : bb->getInstructions()) {
+      if (auto* binInst = dynamic_cast<BinaryInst*>(inst.get())) {
+        if (binInst->getKind() == Instruction::kAdd || binInst->getKind() == Instruction::kMul) {
+          // Check for an accumulate-by-add or accumulate-by-multiply pattern
+          Value* lhs = binInst->getLhs();
+          if (hasReductionPattern(lhs, loop)) {
+            analysis->requiresReduction = true;
+            analysis->reductionVariables.insert(lhs);
+          }
+        }
+      }
+    }
+  }
+}
+
+void LoopVectorizationPass::analyzeMemoryAccessPatterns(Loop* loop, ParallelizationAnalysis* analysis, 
+                                                       AliasAnalysisResult* aliasAnalysis) {
+  std::vector<Value*> memoryAccesses;
+  
+  // Collect the pointers of all memory accesses
+  for (BasicBlock* bb : loop->getBlocks()) {
+    for (auto& inst : bb->getInstructions()) {
+      if (auto* load = dynamic_cast<LoadInst*>(inst.get())) {
+        memoryAccesses.push_back(load->getPointer());
+      } else if (auto* store = dynamic_cast<StoreInst*>(inst.get())) {
+        memoryAccesses.push_back(store->getPointer());
+      }
+    }
+  }
+  
+  // Check pairwise independence of the memory accesses
+  bool hasIndependentAccess = true;
+  for (size_t i = 0; i < memoryAccesses.size(); ++i) {
+    for (size_t j = i + 1; j < memoryAccesses.size(); ++j) {
+      if (!isIndependentMemoryAccess(memoryAccesses[i], memoryAccesses[j], loop)) {
+        hasIndependentAccess = false;
+        analysis->hasMemoryConflicts = true;
+      }
+    }
+  }
+  
+  analysis->hasIndependentAccess = hasIndependentAccess;
+}
+
+void LoopVectorizationPass::estimateParallelizationBenefit(Loop* loop, ParallelizationAnalysis* analysis,
+                                                          LoopCharacteristics* characteristics) {
+  if (!analysis->isParallelizable) {
+    analysis->parallelizationBenefit = 0.0;
+    return;
+  }
+  
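+  // Estimate the benefit from the work complexity and the degree of parallelism
+  double workComplexity = estimateWorkComplexity(loop);
+  double parallelFraction = 1.0; // assume fully parallel to start with
+  
+  // Lower the parallel fraction according to the dependences
+  if (analysis->requiresReduction) {
+    parallelFraction *= 0.8; // reductions lower parallel efficiency
+  }
+  if (analysis->hasMemoryConflicts) {
+    parallelFraction *= 0.6; // memory conflicts lower efficiency
+  }
+  
+  // Estimate the speedup with Amdahl's law
+  double serialFraction = 1.0 - parallelFraction;
+  // Guard against an unset or zero thread count, which would otherwise divide by zero below
+  int threadCount = analysis->suggestedThreadCount > 0 ? analysis->suggestedThreadCount : 1;
+  double speedup = 1.0 / (serialFraction + parallelFraction / threadCount);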
+  
+  analysis->parallelizationBenefit = std::min((speedup - 1.0) / threadCount, 1.0);
+  
+  // Estimate synchronization and communication overhead
+  analysis->synchronizationCost = analysis->requiresBarrier ? 100 : 0;
+  analysis->communicationCost = analysis->sharedVariables.size() * 50;
+}
+
+void LoopVectorizationPass::identifyPrivatizableVariables(Loop* loop, ParallelizationAnalysis* analysis) {
+  // Simplified implementation: mark values defined inside the loop as privatizable
+  for (BasicBlock* bb : loop->getBlocks()) {
+    for (auto& inst : bb->getInstructions()) {
+      if (!inst->getType()->isVoid()) {
+        // A value that is only used inside the loop may be privatizable
+        bool onlyUsedInLoop = true;
+        for (auto& use : inst->getUses()) {
+          if (auto* userInst = dynamic_cast<Instruction*>(use->getUser())) {
+            if (!loop->contains(userInst->getParent())) {
+              onlyUsedInLoop = false;
+              break;
+            }
+          }
+        }
+        
+        if (onlyUsedInLoop) {
+          analysis->privatizableVariables.insert(inst.get());
+        }
+      }
+    }
+  }
+}
+
+void LoopVectorizationPass::analyzeSynchronizationNeeds(Loop* loop, ParallelizationAnalysis* analysis,
+                                                       const std::vector<PreciseDependence>& dependences) {
+  // Derive the synchronization requirements from the dependence types
+  for (const auto& dep : dependences) {
+    if (dep.type == DependenceType::OUTPUT_DEPENDENCE) {
+      analysis->requiresBarrier = true;
+      analysis->sharedVariables.insert(dep.memoryLocation);
+    }
+  }
+  
+  // A reduction needs its own reduction synchronization
+  if (analysis->requiresReduction) {
+    analysis->requiresBarrier = true;
+  }
+}
+
+bool LoopVectorizationPass::isIndependentMemoryAccess(Value* ptr1, Value* ptr2, Loop* loop) {
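+  // Simplified implementation: a basic independence check
+  if (ptr1 == ptr2) return false;
+  
+  // Accesses off different base pointers are treated as independent
+  auto* gep1 = dynamic_cast<GetElementPtrInst*>(ptr1);
+  auto* gep2 = dynamic_cast<GetElementPtrInst*>(ptr2);
+  
+  if (gep1 && gep2) {
+    if (gep1->getBasePointer() != gep2->getBasePointer()) {
+      return true; // different base pointers
+    }
+    // Same base pointer: a finer analysis is needed (simplified here to "not independent")
+    return false;
+  }
+  
+  return true; // treated as independent by default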
+}
+
+double LoopVectorizationPass::estimateWorkComplexity(Loop* loop) {
+  double complexity = 0.0;
+  
+  for (BasicBlock* bb : loop->getBlocks()) {
+    for (auto& inst : bb->getInstructions()) {
+      // Assign a complexity weight based on the instruction kind
+      if (auto* binInst = dynamic_cast<BinaryInst*>(inst.get())) {
+        switch (binInst->getKind()) {
+          case Instruction::kAdd:
+          case Instruction::kSub:
+            complexity += 1.0;
+            break;
+          case Instruction::kMul:
+            complexity += 3.0;
+            break;
+          case Instruction::kDiv:
+            complexity += 10.0;
+            break;
+          default:
+            complexity += 2.0;
+        }
+      } else if (dynamic_cast<LoadInst*>(inst.get()) || dynamic_cast<StoreInst*>(inst.get())) {
+        complexity += 2.0; // memory access
+      } else {
+        complexity += 1.0; // any other instruction
+      }
+    }
+  }
+  
+  return complexity;
+}
+
+bool LoopVectorizationPass::hasReductionPattern(Value* var, Loop* loop) {
+  // Simplified implementation: check for a simple accumulate-by-add or accumulate-by-multiply pattern
+  for (auto& use : var->getUses()) {
+    if (auto* binInst = dynamic_cast<BinaryInst*>(use->getUser())) {
+      if (binInst->getKind() == Instruction::kAdd || binInst->getKind() == Instruction::kMul) {
+        // Check for the pattern var = var op something
+        if (binInst->getLhs() == var || binInst->getRhs() == var) {
+          return true;
+        }
+      }
+    }
+  }
+  return false;
+}
+
+} // namespace sysy
--- a/src/midend/Pass/Analysis/SideEffectAnalysis.cpp
+++ b/src/midend/Pass/Analysis/SideEffectAnalysis.cpp
@ -0,0 +1,413 @@
+#include "SideEffectAnalysis.h"
+#include "AliasAnalysis.h"
+#include "CallGraphAnalysis.h"
+#include "SysYIRPrinter.h"
+#include <iostream>
+
+namespace sysy {
+
+// Static ID of the side-effect analysis pass
+void *SysYSideEffectAnalysisPass::ID = (void *)&SysYSideEffectAnalysisPass::ID;
+
+// ======================================================================
+// SideEffectAnalysisResult implementation
+// ======================================================================
+
+SideEffectAnalysisResult::SideEffectAnalysisResult() { initializeKnownFunctions(); }
+
+const SideEffectInfo &SideEffectAnalysisResult::getInstructionSideEffect(Instruction *inst) const {
+  auto it = instructionSideEffects.find(inst);
+  if (it != instructionSideEffects.end()) {
+    return it->second;
+  }
+  // Fall back to the default "no side effect" info
+  static SideEffectInfo noEffect;
+  return noEffect;
+}
+
+const SideEffectInfo &SideEffectAnalysisResult::getFunctionSideEffect(Function *func) const {
+  // First check the user-defined functions that have been analyzed
+  auto it = functionSideEffects.find(func);
+  if (it != functionSideEffects.end()) {
+    return it->second;
+  }
+  
+  // If not found, check whether it is a known library function
+  if (func) {
+    std::string funcName = func->getName();
+    const SideEffectInfo *knownInfo = getKnownFunctionSideEffect(funcName);
+    if (knownInfo) {
+      return *knownInfo;
+    }
+  }
+  
+  // Fall back to the default "no side effect" info
+  static SideEffectInfo noEffect;
+  return noEffect;
+}
+
+void SideEffectAnalysisResult::setInstructionSideEffect(Instruction *inst, const SideEffectInfo &info) {
+  instructionSideEffects[inst] = info;
+}
+
+void SideEffectAnalysisResult::setFunctionSideEffect(Function *func, const SideEffectInfo &info) {
+  functionSideEffects[func] = info;
+}
+
+bool SideEffectAnalysisResult::hasSideEffect(Instruction *inst) const {
+  const auto &info = getInstructionSideEffect(inst);
+  return info.type != SideEffectType::NO_SIDE_EFFECT;
+}
+
+bool SideEffectAnalysisResult::mayModifyMemory(Instruction *inst) const {
+  const auto &info = getInstructionSideEffect(inst);
+  return info.mayModifyMemory;
+}
+
+bool SideEffectAnalysisResult::mayModifyGlobal(Instruction *inst) const {
+  const auto &info = getInstructionSideEffect(inst);
+  return info.mayModifyGlobal;
+}
+
+bool SideEffectAnalysisResult::isPureFunction(Function *func) const {
+  const auto &info = getFunctionSideEffect(func);
+  return info.isPure;
+}
+
+void SideEffectAnalysisResult::initializeKnownFunctions() {
+  // Side-effect information for the SysY standard library functions
+
+  // I/O functions: have side effects
+  SideEffectInfo ioEffect;
+  ioEffect.type = SideEffectType::IO_OPERATION;
+  ioEffect.mayModifyGlobal = true;
+  ioEffect.mayModifyMemory = true;
+  ioEffect.mayCallFunction = true;
+  ioEffect.isPure = false;
+
+  // knownFunctions["printf"] = ioEffect;
+  // knownFunctions["scanf"] = ioEffect;
+  knownFunctions["getint"] = ioEffect;
+  knownFunctions["getch"] = ioEffect;
+  knownFunctions["getfloat"] = ioEffect;
+  knownFunctions["getarray"] = ioEffect;
+  knownFunctions["getfarray"] = ioEffect;
+  knownFunctions["putint"] = ioEffect;
+  knownFunctions["putch"] = ioEffect;
+  knownFunctions["putfloat"] = ioEffect;
+  knownFunctions["putarray"] = ioEffect;
+  knownFunctions["putfarray"] = ioEffect;
+
+  // Timing functions: have side effects
+  SideEffectInfo timeEffect;
+  timeEffect.type = SideEffectType::FUNCTION_CALL;
+  timeEffect.mayModifyGlobal = true;
+  timeEffect.mayModifyMemory = false;
+  timeEffect.mayCallFunction = true;
+  timeEffect.isPure = false;
+
+  knownFunctions["_sysy_starttime"] = timeEffect;
+  knownFunctions["_sysy_stoptime"] = timeEffect;
+}
+
+const SideEffectInfo *SideEffectAnalysisResult::getKnownFunctionSideEffect(const std::string &funcName) const {
+  auto it = knownFunctions.find(funcName);
+  return (it != knownFunctions.end()) ? &it->second : nullptr;
+}
+
+// ======================================================================
+// SysYSideEffectAnalysisPass implementation
+// ======================================================================
+
+bool SysYSideEffectAnalysisPass::runOnModule(Module *M, AnalysisManager &AM) {
+  if (DEBUG) {
+    std::cout << "Running SideEffect analysis on module" << std::endl;
+  }
+
+  // Create the analysis result (its constructor already calls initializeKnownFunctions)
+  result = std::make_unique<SideEffectAnalysisResult>();
+
+  // Fetch the call graph analysis result
+  callGraphAnalysis = AM.getAnalysisResult<CallGraphAnalysisResult, CallGraphAnalysisPass>();
+  if (!callGraphAnalysis) {
+    std::cerr << "Warning: CallGraphAnalysis not available, falling back to conservative analysis" << std::endl;
+  }
+
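+  // Analyze functions in topological order so that callees are analyzed before their callers
+  if (callGraphAnalysis) {
+    // Topological order from the call graph (not used below; the SCC list drives the traversal)
+    const auto &topOrder = callGraphAnalysis->getTopologicalOrder();
+
+    // Process the strongly connected components (groups of mutually recursive functions)
+    const auto &sccs = callGraphAnalysis->getStronglyConnectedComponents();
+    for (const auto &scc : sccs) {
+      if (scc.empty()) continue; // defensive: scc[0] below requires a non-empty component
+      if (scc.size() > 1) {
+        // SCC with several functions: use the fixed-point algorithm
+        analyzeStronglyConnectedComponent(scc, AM);
+      } else {
+        // Single function: check whether it is self-recursive
+        Function *func = scc[0];
+        if (callGraphAnalysis->isSelfRecursive(func)) {
+          // Self-recursive functions also need the fixed-point algorithm
+          analyzeStronglyConnectedComponent(scc, AM);
+        } else {
+          // Non-recursive function: analyze it directly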
+          SideEffectInfo funcEffect = analyzeFunction(func, AM);
+          result->setFunctionSideEffect(func, funcEffect);
+        }
+      }
+    }
+  } else {
+    // No call graph available: analyze each function conservatively
+    for (auto &pair : M->getFunctions()) {
+      Function *func = pair.second.get();
+      SideEffectInfo funcEffect = analyzeFunction(func, AM);
+      result->setFunctionSideEffect(func, funcEffect);
+    }
+  }
+
+  if (DEBUG) {
+    std::cout << "---- Side Effect Analysis Results for Module ----\n";
+    for (auto &pair : M->getFunctions()) {
+      Function *func = pair.second.get();
+      const auto &funcInfo = result->getFunctionSideEffect(func);
+
+      std::cout << "Function " << func->getName() << ": ";
+      switch (funcInfo.type) {
+      case SideEffectType::NO_SIDE_EFFECT:
+        std::cout << "No Side Effect";
+        break;
+      case SideEffectType::MEMORY_WRITE:
+        std::cout << "Memory Write";
+        break;
+      case SideEffectType::FUNCTION_CALL:
+        std::cout << "Function Call";
+        break;
+      case SideEffectType::IO_OPERATION:
+        std::cout << "I/O Operation";
+        break;
+      case SideEffectType::UNKNOWN:
+        std::cout << "Unknown";
+        break;
+      }
+      std::cout << " (Pure: " << (funcInfo.isPure ? "Yes" : "No")
+                << ", Modifies Global: " << (funcInfo.mayModifyGlobal ? "Yes" : "No") << ")\n";
+    }
+    std::cout << "--------------------------------------------------\n";
+  }
+
+  return false; // Analysis passes return false since they don't modify the IR
+}
+
+std::unique_ptr<AnalysisResultBase> SysYSideEffectAnalysisPass::getResult() { return std::move(result); }
+
+SideEffectInfo SysYSideEffectAnalysisPass::analyzeFunction(Function *func, AnalysisManager &AM) {
+  SideEffectInfo functionSideEffect;
+
+  // Analyze the side effects of every instruction
+  for (auto &BB : func->getBasicBlocks()) {
+    for (auto &I : BB->getInstructions_Range()) {
+      Instruction *inst = I.get();
+      SideEffectInfo instEffect = analyzeInstruction(inst, func, AM);
+
+      // Record the side-effect info for this instruction
+      result->setInstructionSideEffect(inst, instEffect);
+
+      // Merge it into the function-level side-effect info
+      functionSideEffect = functionSideEffect.merge(instEffect);
+    }
+  }
+
+  return functionSideEffect;
+}
+
+void SysYSideEffectAnalysisPass::analyzeStronglyConnectedComponent(const std::vector<Function *> &scc,
+                                                                   AnalysisManager &AM) {
+  // Use a fixed-point iteration to handle a group of recursive functions
+  std::unordered_map<Function *, SideEffectInfo> currentEffects;
+  std::unordered_map<Function *, SideEffectInfo> previousEffects;
+
+  // Initialization: optimistically assume every function is pure
+  for (Function *func : scc) {
+    SideEffectInfo initialEffect;
+    initialEffect.isPure = true;
+    currentEffects[func] = initialEffect;
+    result->setFunctionSideEffect(func, initialEffect);
+  }
+
+  bool converged = false;
+  int iterations = 0;
+  const int maxIterations = 10; // guards against an infinite loop
+
+  while (!converged && iterations < maxIterations) {
+    previousEffects = currentEffects;
+
+    // Re-analyze every function
+    for (Function *func : scc) {
+      SideEffectInfo newEffect = analyzeFunction(func, AM);
+      currentEffects[func] = newEffect;
+      result->setFunctionSideEffect(func, newEffect);
+    }
+
+    // Check for convergence
+    converged = hasConverged(previousEffects, currentEffects);
+    iterations++;
+  }
+
+  if (!converged) { // warn only when the loop stopped without converging (it may converge exactly on the last iteration)
+    std::cerr << "Warning: SideEffect analysis did not converge for SCC after " << maxIterations << " iterations"
+              << std::endl;
+  }
+}
+
+bool SysYSideEffectAnalysisPass::hasConverged(const std::unordered_map<Function *, SideEffectInfo> &oldEffects,
+                                              const std::unordered_map<Function *, SideEffectInfo> &newEffects) const {
+  for (const auto &pair : oldEffects) {
+    Function *func = pair.first;
+    const SideEffectInfo &oldEffect = pair.second;
+
+    auto it = newEffects.find(func);
+    if (it == newEffects.end()) {
+      return false; // the function is missing from the new results
+    }
+
+    const SideEffectInfo &newEffect = it->second;
+
+    // Compare whether the key attributes are identical
+    if (oldEffect.type != newEffect.type || oldEffect.mayModifyGlobal != newEffect.mayModifyGlobal ||
+        oldEffect.mayModifyMemory != newEffect.mayModifyMemory ||
+        oldEffect.mayCallFunction != newEffect.mayCallFunction || oldEffect.isPure != newEffect.isPure) {
+      return false;
+    }
+  }
+
+  return true;
+}
+
+SideEffectInfo SysYSideEffectAnalysisPass::analyzeInstruction(Instruction *inst, Function *currentFunc,
+                                                              AnalysisManager &AM) {
+  SideEffectInfo info;
+
+  // Dispatch on the instruction kind
+  if (inst->isCall()) {
+    return analyzeCallInstruction(static_cast<CallInst *>(inst), currentFunc, AM);
+  } else if (inst->isStore()) {
+    return analyzeStoreInstruction(static_cast<StoreInst *>(inst), currentFunc, AM);
+  } else if (inst->isMemset()) {
+    return analyzeMemsetInstruction(static_cast<MemsetInst *>(inst), currentFunc, AM);
+  } else if (inst->isBranch() || inst->isReturn()) {
+    // Control-flow instructions have no side effects, but must be kept
+    info.type = SideEffectType::NO_SIDE_EFFECT;
+    info.isPure = true;
+  } else {
+    // Other instructions (arithmetic, logical, comparison, etc.) normally have no side effects
+    info.type = SideEffectType::NO_SIDE_EFFECT;
+    info.isPure = true;
+  }
+
+  return info;
+}
+
+SideEffectInfo SysYSideEffectAnalysisPass::analyzeCallInstruction(CallInst *call, Function *currentFunc,
+                                                                  AnalysisManager &AM) {
+  SideEffectInfo info;
+
+  // Get the called function
+  Function *calledFunc = call->getCallee();
+  if (!calledFunc) {
+    // Indirect call: handle conservatively
+    info.type = SideEffectType::UNKNOWN;
+    info.mayModifyGlobal = true;
+    info.mayModifyMemory = true;
+    info.mayCallFunction = true;
+    info.isPure = false;
+    return info;
+  }
+
+  std::string funcName = calledFunc->getName();
+
+  // Check whether this is a known standard library function
+  const SideEffectInfo *knownInfo = result->getKnownFunctionSideEffect(funcName);
+  if (knownInfo) {
+    return *knownInfo;
+  }
+
+  // Use the call graph analysis result for a more precise analysis
+  if (callGraphAnalysis) {
+    // Check whether the callee has already been analyzed
+    const SideEffectInfo &funcEffect = result->getFunctionSideEffect(calledFunc);
+    if (funcEffect.type != SideEffectType::NO_SIDE_EFFECT || !funcEffect.isPure) {
+      return funcEffect;
+    }
+
+    // Check for recursive calls
+    if (callGraphAnalysis->isRecursive(calledFunc)) {
+      // Handle recursive functions conservatively (the fixed-point algorithm analyzes them precisely)
+      info.type = SideEffectType::FUNCTION_CALL;
+      info.mayModifyGlobal = true;
+      info.mayModifyMemory = true;
+      info.mayCallFunction = true;
+      info.isPure = false;
+      return info;
+    }
+  }
+
+  // For user functions that have not been analyzed yet, be conservative
+  info.type = SideEffectType::FUNCTION_CALL;
+  info.mayModifyGlobal = true;
+  info.mayModifyMemory = true;
+  info.mayCallFunction = true;
+  info.isPure = false;
+
+  return info;
+}
+
+SideEffectInfo SysYSideEffectAnalysisPass::analyzeStoreInstruction(StoreInst *store, Function *currentFunc,
+                                                                   AnalysisManager &AM) {
+  SideEffectInfo info;
+  info.type = SideEffectType::MEMORY_WRITE;
+  info.mayModifyMemory = true;
+  info.isPure = false;
+
+  // Get the alias analysis result for this function
+  AliasAnalysisResult *aliasAnalysis = AM.getAnalysisResult<AliasAnalysisResult, SysYAliasAnalysisPass>(currentFunc);
+  if (aliasAnalysis) {
+    Value *storePtr = store->getPointer();
+
+    // A store to a global variable or a possibly aliased location may modify global state
+    if (!aliasAnalysis->isLocalArray(storePtr)) {
+      info.mayModifyGlobal = true;
+    }
+  } else {
+    // No alias analysis result: be conservative
+    info.mayModifyGlobal = true;
+  }
+
+  return info;
+}
+
+SideEffectInfo SysYSideEffectAnalysisPass::analyzeMemsetInstruction(MemsetInst *memset, Function *currentFunc,
+                                                                    AnalysisManager &AM) {
+  SideEffectInfo info;
+  info.type = SideEffectType::MEMORY_WRITE;
+  info.mayModifyMemory = true;
+  info.isPure = false;
+
+  // Get the alias analysis result for this function
+  AliasAnalysisResult *aliasAnalysis = AM.getAnalysisResult<AliasAnalysisResult, SysYAliasAnalysisPass>(currentFunc);
+  if (aliasAnalysis) {
+    Value *memsetPtr = memset->getPointer();
+
+    // A memset on a global variable or a possibly aliased location may modify global state
+    if (!aliasAnalysis->isLocalArray(memsetPtr)) {
+      info.mayModifyGlobal = true;
+    }
+  } else {
+    // No alias analysis result: be conservative
+    info.mayModifyGlobal = true;
+  }
+
+  return info;
+}
+
+} // namespace sysy
--- a/src/midend/Pass/Optimize/BuildCFG.cpp
+++ b/src/midend/Pass/Optimize/BuildCFG.cpp
@ -0,0 +1,79 @@
+#include "BuildCFG.h"
+#include "Dom.h"
+#include "Liveness.h"
+#include <iostream>
+#include <queue>
+#include <set>
+
+namespace sysy {
+
+void *BuildCFG::ID = (void *)&BuildCFG::ID; // Define the unique pass ID
+
+// Declare the analyses this pass uses and invalidates
+void BuildCFG::getAnalysisUsage(std::set<void *> &analysisDependencies, std::set<void *> &analysisInvalidations) const {
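+  // BuildCFG does not depend on any other analysis
+  // analysisDependencies.insert(&DominatorTreeAnalysisPass::ID); // counter-example: do not do this
+
+  // BuildCFG invalidates every analysis result that depends on the CFG, so it must declare those invalidations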
+  analysisInvalidations.insert(&DominatorTreeAnalysisPass::ID);
+  analysisInvalidations.insert(&LivenessAnalysisPass::ID);
+}
+
+bool BuildCFG::runOnFunction(Function *F, AnalysisManager &AM) {
+  if (DEBUG) {
+    std::cout << "Running BuildCFG pass on function: " << F->getName() << std::endl;
+  }
+
+  bool changed = false;
+
+  // 1. Clear the predecessor and successor lists of every basic block
+  for (auto &bb : F->getBasicBlocks()) {
+    bb->clearPredecessors();
+    bb->clearSuccessors();
+  }
+
+  // 2. Walk every basic block and rebuild the CFG
+  for (auto &bb : F->getBasicBlocks()) {
+    // Get the last instruction of the basic block
+    auto &inst = *bb->terminator();
+    Instruction *termInst = inst.get();
+    // Make sure the basic block has a terminator
+    if (!termInst) {
+      continue;
+    }
+
+    // Build the predecessor/successor edges according to the terminator kind
+    if (termInst->isBranch()) {
+      // Unconditional branch
+      if (termInst->isUnconditional()) {
+        auto brInst = dynamic_cast<UncondBrInst *>(termInst);
+        BasicBlock *succ = dynamic_cast<BasicBlock *>(brInst->getBlock());
+        assert(succ && "Branch instruction's target must be a BasicBlock");
+        bb->addSuccessor(succ);
+        succ->addPredecessor(bb.get());
+        changed = true;
+
+        // Conditional branch
+      } else if (termInst->isConditional()) {
+        auto brInst = dynamic_cast<CondBrInst *>(termInst);
+        BasicBlock *trueSucc = dynamic_cast<BasicBlock *>(brInst->getThenBlock());
+        BasicBlock *falseSucc = dynamic_cast<BasicBlock *>(brInst->getElseBlock());
+
+        assert(trueSucc && falseSucc && "Branch instruction's targets must be BasicBlocks");
+
+        bb->addSuccessor(trueSucc);
+        trueSucc->addPredecessor(bb.get());
+        bb->addSuccessor(falseSucc);
+        falseSucc->addPredecessor(bb.get());
+        changed = true;
+      }
+    } else if (auto retInst = dynamic_cast<ReturnInst *>(termInst)) {
+      // A ReturnInst has no successors, so there is nothing to do
+      // ...
+    }
+  }
+
+  return changed;
+}
+
+} // namespace sysy
--- a/src/midend/Pass/Optimize/DCE.cpp
+++ b/src/midend/Pass/Optimize/DCE.cpp
@ -1,9 +1,9 @@
-#include "DCE.h"            // header of the DCE pass
-#include "IR.h"             // IR definitions
-#include "SysYIROptUtils.h" // SysY IR optimization utility class
-#include <cassert>          // for assertions
-#include <iostream>         // for debug output
-#include <set>              // set; DCEContext uses unordered_set internally, but this is kept
+#include "DCE.h"
+#include "SysYIROptUtils.h"
+#include "SideEffectAnalysis.h"
+#include <cassert>
+#include <iostream>
+#include <set>

 namespace sysy {

@ -17,10 +17,26 @@ void *DCE::ID = (void *)&DCE::ID;

 // Implementation of DCEContext::run
 void DCEContext::run(Function *func, AnalysisManager *AM, bool &changed) {
+  // Get the alias analysis result
+  if (AM) {
+    aliasAnalysis = AM->getAnalysisResult<AliasAnalysisResult, SysYAliasAnalysisPass>(func);
+    // Get the side-effect analysis result (module level)
+    sideEffectAnalysis = AM->getAnalysisResult<SideEffectAnalysisResult, SysYSideEffectAnalysisPass>();
+    
+    if (DEBUG) {
+      if (aliasAnalysis) {
+        std::cout << "DCE: Using alias analysis results" << std::endl;
+      }
+      if (sideEffectAnalysis) {
+        std::cout << "DCE: Using side effect analysis results" << std::endl;
+      }
+    }
+  }
+  
  // Clear the live-instruction set so that every run starts from a clean state
  alive_insts.clear();

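-  // First pass: scan all instructions, identify the “naturally live” ones, and mark them and their dependencies as live
+  // First pass: scan all instructions, identify the "naturally live" ones, and mark them and their dependencies as live
  // Use func->getBasicBlocks() to get the basic block list, keeping the author's style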
  auto basicBlocks = func->getBasicBlocks();
  for (auto &basicBlock : basicBlocks) {
@ -51,31 +67,68 @@ void DCEContext::run(Function *func, AnalysisManager *AM, bool &changed) {
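      // If the instruction is not in the live set, delete it.
      // Branch and return instructions are handled by isAlive and are kept.
      if (alive_insts.count(currentInst) == 0) {
-        // Delete the instruction, keeping the author's style of SysYIROptUtils::usedelete plus erase
+        instIter = SysYIROptUtils::usedelete(instIter); // returns the next iterator after deletion
        changed = true; // mark the IR as modified
-        SysYIROptUtils::usedelete(currentInst);
-        instIter = basicBlock->getInstructions().erase(instIter); // returns the next iterator after deletion
      } else {
        ++instIter; // the instruction is live; advance to the next one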
      }
    }
  }
+  changed |= SysYIROptUtils::eliminateRedundantPhisInFunction(func); // mark as changed if any redundant phi was eliminated
 }

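-// Decides whether an instruction is “naturally live”
+// Decides whether an instruction is "naturally live"
 // Only instructions with side effects (stores, function calls, atomic operations)
 // and control-flow instructions (branches, returns) are naturally live.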
 bool DCEContext::isAlive(Instruction *inst) {
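-  // TODO: consider atomic operations once concurrency is supported
-  // Instructions whose results are not used by other instructions (e.g. StoreInst, BranchInst, ReturnInst).
-  // dynamic_cast<ir::CallInst>(inst) checks whether this is a function call instruction;
-  // function calls usually have side effects.
-  // Terminators (BranchInst, ReturnInst) must be live because they control the program's execution flow.
-  // Keep the isAlive logic provided by the user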
-  bool isBranchOrReturn = inst->isBranch() || inst->isReturn();
-  bool isCall = inst->isCall();
-  bool isStoreOrMemset = inst->isStore() || inst->isMemset(); 
-  return isBranchOrReturn || isCall || isStoreOrMemset;
+  // Terminators (BranchInst, ReturnInst) must be live because they control the program's execution flow
+  if (inst->isBranch() || inst->isReturn()) {
+    return true;
+  }
+  
+  // Use the side-effect analysis to decide whether the instruction has side effects
+  if (sideEffectAnalysis && sideEffectAnalysis->hasSideEffect(inst)) {
+    return true;
+  }
+  
+  // Special handling for stores: use alias analysis for a more precise decision
+  if (inst->isStore()) {
+    auto* storeInst = static_cast<StoreInst*>(inst);
+    return mayHaveSideEffect(storeInst);
+  }
+  
+  // Special handling for memset: always keep it (it modifies memory)
+  if (inst->isMemset()) {
+    return true;
+  }
+  
+  // Function calls: always keep them (they may have unknown side effects)
+  if (inst->isCall()) {
+    return true;
+  }
+  
+  // Other instructions (arithmetic, logical, load, etc.): no side effects, may be deleted
+  return false;
+}
+
+// Checks whether a store may have side effects (via alias analysis)
+bool DCEContext::mayHaveSideEffect(StoreInst* store) {
+  if (!aliasAnalysis) {
+    // Without an alias analysis result, conservatively assume every store has side effects
+    return true;
+  }
+  
+  Value* storePtr = store->getPointer();
+  
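+  // A store to a local array with a constant access pattern might be safe to delete
+  if (aliasAnalysis->isLocalArray(storePtr)) {
+    // Deleting it would require checking whether any other instruction may read this location.
+    // That needs a more sophisticated liveness analysis, so be conservative for now.
+    return true; // conservatively keep every store to a local array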
+  }
+  
+  // Stores to global variables, function parameters, etc. always have side effects
+  return true;
 }

 // Recursively add a live instruction and its dependencies to the alive_insts set
@ -104,7 +157,6 @@ void DCEContext::addAlive(Instruction *inst) {

 // Implementation of DCE::runOnFunction
 bool DCE::runOnFunction(Function *func, AnalysisManager &AM) {
-
  DCEContext ctx;
  bool changed = false;
  ctx.run(func, &AM, changed); // run the DCE optimization
@ -122,7 +174,11 @@ bool DCE::runOnFunction(Function *func, AnalysisManager &AM) {

 // Declare the analysis dependencies and invalidations of the DCE pass
 void DCE::getAnalysisUsage(std::set<void *> &analysisDependencies, std::set<void *> &analysisInvalidations) const {
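-  // DCE does not depend on any specific analysis result; it works by traversal and side-effect checks.
+  // DCE depends on alias analysis to judge the side effects of store instructions more precisely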
+  analysisDependencies.insert(&SysYAliasAnalysisPass::ID);
+  
+  // DCE depends on side-effect analysis to decide whether an instruction has side effects
+  analysisDependencies.insert(&SysYSideEffectAnalysisPass::ID);

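  // DCE deletes instructions, which affects many analysis results.
  // At a minimum it affects liveness analysis, the dominator tree, and the control-flow graph (if a deletion leaves a basic block empty and it gets merged).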
--- a/src/midend/Pass/Optimize/GVN.cpp
+++ b/src/midend/Pass/Optimize/GVN.cpp
@ -0,0 +1,496 @
+#include "GVN.h"
+#include "Dom.h"
+#include "SysYIROptUtils.h"
+#include <algorithm>
+#include <cassert>
+#include <cstdint>
+#include <iostream>
+#include <sstream>
+#include <string>
+#include <unordered_map>
+#include <unordered_set>
+#include <vector>
+
+extern int DEBUG;
+
+namespace sysy {
+
+// Static ID of the GVN pass
+void *GVN::ID = (void *)&GVN::ID;
+
+// ======================================================================
+// Implementation of the GVN class
+// ======================================================================
+
+bool GVN::runOnFunction(Function *func, AnalysisManager &AM) {
+  if (func->getBasicBlocks().empty()) {
+    return false;
+  }
+
+  if (DEBUG) {
+    std::cout << "\n=== Running GVN on function: " << func->getName() << " ===" << std::endl;
+  }
+
+  bool changed = false;
+  GVNContext context;
+  context.run(func, &AM, changed);
+
+  if (DEBUG) {
+    if (changed) {
+      std::cout << "GVN: Function " << func->getName() << " was modified" << std::endl;
+    } else {
+      std::cout << "GVN: Function " << func->getName() << " was not modified" << std::endl;
+    }
+    std::cout << "=== GVN completed for function: " << func->getName() << " ===" << std::endl;
+  }
+  changed |= SysYIROptUtils::eliminateRedundantPhisInFunction(func);
+  return changed;
+}
+
+void GVN::getAnalysisUsage(std::set<void *> &analysisDependencies, std::set<void *> &analysisInvalidations) const {
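+  // GVN depends on the following analyses:
+  // 1. Dominator tree analysis - used to check dominance between instructions so that replacements are safe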
+  analysisDependencies.insert(&DominatorTreeAnalysisPass::ID);
+  
+  // 2. Side-effect analysis - used to decide whether a function call is eligible for GVN
+  analysisDependencies.insert(&SysYSideEffectAnalysisPass::ID);
+
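+  // GVN is declared as not invalidating any analysis, because:
+  // - GVN only deletes redundant computations and does not change the CFG structure
+  // - GVN does not change program semantics; it only eliminates repeated computations
+  // - Dominance relations are unchanged
+  // - Side-effect analysis results are unchanged
+  // analysisInvalidations stays empty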
+  
+  if (DEBUG) {
+    std::cout << "GVN: Declared analysis dependencies (DominatorTree, SideEffectAnalysis)" << std::endl;
+  }
+}
+
+// ======================================================================
+// Implementation of GVNContext (refactored version)
+// ======================================================================
+
+// Simple hashable expression key
+struct ExpressionKey {
+  enum Kind { BINARY, UNARY, LOAD, GEP, CALL } kind; // named Kind so it does not shadow the IR's Type below
+  int opcode;
+  std::vector<Value*> operands;
+  Type* resultType; // assumed to be the IR type (sysy::Type), not the enum above
+
+  bool operator==(const ExpressionKey& other) const {
+    return kind == other.kind && opcode == other.opcode &&
+           operands == other.operands && resultType == other.resultType;
+  }
+};
+
+struct ExpressionKeyHash {
+  size_t operator()(const ExpressionKey& key) const {
+    size_t hash = std::hash<int>()(static_cast<int>(key.kind)) ^
+                  std::hash<int>()(key.opcode);
+    for (auto op : key.operands) {
+      hash ^= std::hash<Value*>()(op) + 0x9e3779b9 + (hash << 6) + (hash >> 2);
+    }
+    return hash;
+  }
+};
+
+void GVNContext::run(Function *func, AnalysisManager *AM, bool &changed) {
+  if (DEBUG) {
+    std::cout << "  Starting GVN analysis for function: " << func->getName() << std::endl;
+  }
+
+  // Fetch the analysis results
+  if (AM) {
+    domTree = AM->getAnalysisResult<DominatorTree, DominatorTreeAnalysisPass>(func);
+    sideEffectAnalysis = AM->getAnalysisResult<SideEffectAnalysisResult, SysYSideEffectAnalysisPass>();
+
+    if (DEBUG) {
+      if (domTree) {
+        std::cout << "    GVN: Using dominator tree analysis" << std::endl;
+      } else {
+        std::cout << "    GVN: Warning - dominator tree analysis not available" << std::endl;
+      }
+      if (sideEffectAnalysis) {
+        std::cout << "    GVN: Using side effect analysis" << std::endl;
+      } else {
+        std::cout << "    GVN: Warning - side effect analysis not available" << std::endl;
+      }
+    }
+  }
+
+  // Reset state
+  valueToNumber.clear();
+  numberToValue.clear();
+  expressionToNumber.clear();
+  nextValueNumber = 1;
+  visited.clear();
+  rpoBlocks.clear();
+  needRemove.clear();
+
+  // Compute the reverse post-order
+  computeRPO(func);
+
+  if (DEBUG) {
+    std::cout << "    Computed RPO with " << rpoBlocks.size() << " blocks" << std::endl;
+  }
+
+  // Run GVN over the basic blocks in reverse post-order
+  int blockCount = 0;
+  for (auto bb : rpoBlocks) {
+    if (DEBUG) {
+      std::cout << "    Processing block " << ++blockCount << "/" << rpoBlocks.size() 
+                << ": " << bb->getName() << std::endl;
+    }
+    
+    processBasicBlock(bb, changed);
+  }
+
+  if (DEBUG) {
+    std::cout << "    Found " << needRemove.size() << " redundant instructions to remove" << std::endl;
+  }
+
+  // Delete the redundant instructions
+  eliminateRedundantInstructions(changed);
+
+  if (DEBUG) {
+    std::cout << "  GVN analysis completed for function: " << func->getName() << std::endl;
+    std::cout << "    Total values numbered: " << valueToNumber.size() << std::endl;
+    std::cout << "    Instructions eliminated: " << needRemove.size() << std::endl;
+  }
+}
+
+void GVNContext::computeRPO(Function *func) {
+  rpoBlocks.clear();
+  visited.clear();
+
+  auto entry = func->getEntryBlock();
+  if (entry) {
+    dfs(entry);
+    std::reverse(rpoBlocks.begin(), rpoBlocks.end());
+  }
+}
+
+void GVNContext::dfs(BasicBlock *bb) {
+  if (!bb || visited.count(bb)) {
+    return;
+  }
+
+  visited.insert(bb);
+
+  // Visit every successor basic block
+  for (auto succ : bb->getSuccessors()) {
+    if (visited.find(succ) == visited.end()) {
+      dfs(succ);
+    }
+  }
+
+  rpoBlocks.push_back(bb);
+}
+
+unsigned GVNContext::getValueNumber(Value* value) {
+  // If the value already has a value number, return it directly
+  auto it = valueToNumber.find(value);
+  if (it != valueToNumber.end()) {
+    return it->second;
+  }
+  
+  // Assign a number to the new value
+  return assignValueNumber(value);
+}
+
+unsigned GVNContext::assignValueNumber(Value* value) {
+  unsigned number = nextValueNumber++;
+  valueToNumber[value] = number;
+  numberToValue[number] = value;
+  
+  if (DEBUG >= 2) {
+    std::cout << "            Assigned value number " << number 
+              << " to " << value->getName() << std::endl;
+  }
+  
+  return number;
+}
+
+void GVNContext::processBasicBlock(BasicBlock* bb, bool& changed) {
+  int instCount = 0;
+  for (auto &instPtr : bb->getInstructions()) {
+    if (DEBUG) {
+      std::cout << "      Processing instruction " << ++instCount 
+                << ": " << instPtr->getName() << std::endl;
+    }
+    
+    if (processInstruction(instPtr.get())) {
+      changed = true;
+    }
+  }
+}
+
+bool GVNContext::processInstruction(Instruction* inst) {
+  // Skip branches and other instructions that cannot be optimized
+  if (inst->isBranch() || dynamic_cast<ReturnInst*>(inst) || 
+      dynamic_cast<AllocaInst*>(inst) || dynamic_cast<StoreInst*>(inst)) {
+    
+    // For a store, invalidate the memory values it may affect
+    if (auto store = dynamic_cast<StoreInst*>(inst)) {
+      invalidateMemoryValues(store);
+    }
+    
+    // Assign these instructions a value number, but do not try to optimize them
+    getValueNumber(inst);
+    return false;
+  }
+  
+  if (DEBUG) {
+    std::cout << "        Processing optimizable instruction: " << inst->getName() 
+              << " (kind: " << static_cast<int>(inst->getKind()) << ")" << std::endl;
+  }
+  
+  // Build the expression key
+  std::string exprKey = buildExpressionKey(inst);
+  if (exprKey.empty()) {
+    // Not optimizable: only assign a value number
+    getValueNumber(inst);
+    return false;
+  }
+  
+  if (DEBUG >= 2) {
+    std::cout << "          Expression key: " << exprKey << std::endl;
+  }
+  
+  // Look for an existing equivalent value
+  Value* existing = findExistingValue(exprKey, inst);
+  if (existing && existing != inst) {
+    // Check dominance
+    if (auto existingInst = dynamic_cast<Instruction*>(existing)) {
+      if (dominates(existingInst, inst)) {
+        if (DEBUG) {
+          std::cout << "        GVN: Replacing " << inst->getName() 
+                    << " with existing " << existing->getName() << std::endl;
+        }
+        
+        // Replace the current instruction with the existing value
+        inst->replaceAllUsesWith(existing);
+        needRemove.insert(inst);
+        
+        // Point the current instruction's value number at the existing value
+        unsigned existingNumber = getValueNumber(existing);
+        valueToNumber[inst] = existingNumber;
+        
+        return true;
+      } else {
+        if (DEBUG) {
+          std::cout << "          Found equivalent but dominance check failed" << std::endl;
+        }
+      }
+    }
+  }
+  
+  // No usable equivalent value was found: assign this expression a new value number
+  unsigned number = assignValueNumber(inst);
+  expressionToNumber[exprKey] = number;
+  
+  if (DEBUG) {
+    std::cout << "        Instruction " << inst->getName() << " is unique" << std::endl;
+  }
+  
+  return false;
+}
+
+std::string GVNContext::buildExpressionKey(Instruction* inst) {
+  std::ostringstream oss;
+  
+  if (auto binary = dynamic_cast<BinaryInst*>(inst)) {
+    oss << "binary_" << static_cast<int>(binary->getKind()) << "_";
+    oss << getValueNumber(binary->getLhs()) << "_" << getValueNumber(binary->getRhs());
+    
+    // For commutative operations, put the operands in a canonical order
+    if (binary->isCommutative()) {
+      unsigned lhsNum = getValueNumber(binary->getLhs());
+      unsigned rhsNum = getValueNumber(binary->getRhs());
+      if (lhsNum > rhsNum) {
+        oss.str("");
+        oss << "binary_" << static_cast<int>(binary->getKind()) << "_";
+        oss << rhsNum << "_" << lhsNum;
+      }
+    }
+  } else if (auto unary = dynamic_cast<UnaryInst*>(inst)) {
+    oss << "unary_" << static_cast<int>(unary->getKind()) << "_";
+    oss << getValueNumber(unary->getOperand());
+  } else if (auto gep = dynamic_cast<GetElementPtrInst*>(inst)) {
+    oss << "gep_" << getValueNumber(gep->getBasePointer());
+    for (unsigned i = 0; i < gep->getNumIndices(); ++i) {
+      oss << "_" << getValueNumber(gep->getIndex(i));
+    }
+  } else if (auto load = dynamic_cast<LoadInst*>(inst)) {
+    oss << "load_" << getValueNumber(load->getPointer());
+    oss << "_" << reinterpret_cast<uintptr_t>(load->getType()); // distinguish by type
+  } else if (auto call = dynamic_cast<CallInst*>(inst)) {
+    // Build an expression only for calls to functions without side effects
+    if (sideEffectAnalysis && sideEffectAnalysis->isPureFunction(call->getCallee())) {
+      oss << "call_" << call->getCallee()->getName();
+      for (size_t i = 1; i < call->getNumOperands(); ++i) { // skip the function pointer
+        oss << "_" << getValueNumber(call->getOperand(i));
+      }
+    } else {
+      return ""; // calls with side effects cannot be optimized
+    }
+  } else {
+    return ""; // unsupported instruction kind
+  }
+  
+  return oss.str();
+}
+
+Value* GVNContext::findExistingValue(const std::string& exprKey, Instruction* inst) {
+  auto it = expressionToNumber.find(exprKey);
+  if (it != expressionToNumber.end()) {
+    unsigned number = it->second;
+    auto valueIt = numberToValue.find(number);
+    if (valueIt != numberToValue.end()) {
+      Value* existing = valueIt->second;
+      
+      // Loads need an additional memory-safety check
+      if (auto loadInst = dynamic_cast<LoadInst*>(inst)) {
+        if (auto existingLoad = dynamic_cast<LoadInst*>(existing)) {
+          if (!isMemorySafe(existingLoad, loadInst)) {
+            return nullptr;
+          }
+        }
+      }
+      
+      return existing;
+    }
+  }
+  return nullptr;
+}
+
+bool GVNContext::dominates(Instruction* a, Instruction* b) {
+  auto aBB = a->getParent();
+  auto bBB = b->getParent();
+  
+  // Both instructions are in the same basic block
+  if (aBB == bBB) {
+    auto &insts = aBB->getInstructions();
+    auto aIt = std::find_if(insts.begin(), insts.end(), 
+                           [a](const auto &ptr) { return ptr.get() == a; });
+    auto bIt = std::find_if(insts.begin(), insts.end(),
+                           [b](const auto &ptr) { return ptr.get() == b; });
+    
+    if (aIt == insts.end() || bIt == insts.end()) {
+      return false;
+    }
+    
+    return std::distance(insts.begin(), aIt) < std::distance(insts.begin(), bIt);
+  }
+  
+  // Different basic blocks: use the dominator tree
+  if (domTree) {
+    auto dominators = domTree->getDominators(bBB);
+    return dominators && dominators->count(aBB);
+  }
+  
+  return false; // conservative fallback
+}
+
+bool GVNContext::isMemorySafe(LoadInst* earlierLoad, LoadInst* laterLoad) {
+  // Check whether the two loads access the same memory location
+  unsigned earlierPtr = getValueNumber(earlierLoad->getPointer());
+  unsigned laterPtr = getValueNumber(laterLoad->getPointer());
+  
+  if (earlierPtr != laterPtr) {
+    return false; // different memory locations
+  }
+  
+  // Check that the types match
+  if (earlierLoad->getType() != laterLoad->getType()) {
+    return false;
+  }
+  
+  // Simple case: safe if both loads are in the same basic block with no store in between
+  auto earlierBB = earlierLoad->getParent();
+  auto laterBB = laterLoad->getParent();
+  
+  if (earlierBB != laterBB) {
+    // The cross-block case needs a more sophisticated analysis; be conservative for now
+    return false;
+  }
+  
+  // Within the same basic block, check for an intervening store
+  auto &insts = earlierBB->getInstructions();
+  auto earlierIt = std::find_if(insts.begin(), insts.end(),
+                               [earlierLoad](const auto &ptr) { return ptr.get() == earlierLoad; });
+  auto laterIt = std::find_if(insts.begin(), insts.end(),
+                              [laterLoad](const auto &ptr) { return ptr.get() == laterLoad; });
+  
+  if (earlierIt == insts.end() || laterIt == insts.end()) {
+    return false;
+  }
+  
+  // Make sure earlierLoad really comes before laterLoad
+  if (std::distance(insts.begin(), earlierIt) >= std::distance(insts.begin(), laterIt)) {
+    return false;
+  }
+  
+  // Check whether a store in between modifies the same memory location
+  for (auto it = std::next(earlierIt); it != laterIt; ++it) {
+    if (auto store = dynamic_cast<StoreInst*>(it->get())) {
+      unsigned storePtr = getValueNumber(store->getPointer());
+      if (storePtr == earlierPtr) {
+        return false; // found an intervening store
+      }
+    }
+    
+    // Check whether a function call may modify memory
+    if (auto call = dynamic_cast<CallInst*>(it->get())) {
+      if (sideEffectAnalysis && !sideEffectAnalysis->isPureFunction(call->getCallee())) {
+        // Conservative: a function with side effects may modify memory
+        return false;
+      }
+    }
+  }
+  
+  return true; // safe
+}
+
+void GVNContext::invalidateMemoryValues(StoreInst* store) {
+  unsigned storePtr = getValueNumber(store->getPointer());
+  
+  if (DEBUG) {
+    std::cout << "        Invalidating memory values affected by store" << std::endl;
+  }
+  
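+  // Find every load expression that this store may affect
+  std::vector<std::string> toRemove;
+  const std::string prefix = "load_" + std::to_string(storePtr) + "_"; // trailing '_' so pointer number 1 does not also match 12
+  for (auto& [exprKey, number] : expressionToNumber) {
+    if (exprKey.rfind(prefix, 0) == 0) {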
+      toRemove.push_back(exprKey);
+      if (DEBUG) {
+        std::cout << "          Invalidating expression: " << exprKey << std::endl;
+      }
+    }
+  }
+  
+  // Remove the invalidated expressions
+  for (const auto& key : toRemove) {
+    expressionToNumber.erase(key);
+  }
+}
+
+void GVNContext::eliminateRedundantInstructions(bool& changed) {
+  int removeCount = 0;
+  for (auto inst : needRemove) {
+    if (DEBUG) {
+      std::cout << "    Removing redundant instruction " << ++removeCount 
+                << "/" << needRemove.size() << ": " << inst->getName() << std::endl;
+    }
+    
+    // Detach all use relationships before deleting the instruction
+    // inst->replaceAllUsesWith was already called in processInstruction
+    SysYIROptUtils::usedelete(inst);
+    changed = true;
+  }
+}
+
+} // namespace sysy
--- a/src/midend/Pass/Optimize/GlobalStrengthReduction.cpp
+++ b/src/midend/Pass/Optimize/GlobalStrengthReduction.cpp
@ -0,0 +1,897 @@
+#include "GlobalStrengthReduction.h"
+#include "SysYIROptUtils.h"
+#include "IRBuilder.h"
+#include <algorithm>
+#include <cassert>
+#include <iostream>
+#include <cmath>
+
+extern int DEBUG;
+
+namespace sysy {
+
+// Static ID of the global strength reduction pass
+void *GlobalStrengthReduction::ID = (void *)&GlobalStrengthReduction::ID;
+
+// ======================================================================
+// Implementation of the GlobalStrengthReduction class
+// ======================================================================
+
+bool GlobalStrengthReduction::runOnFunction(Function *func, AnalysisManager &AM) {
+  if (func->getBasicBlocks().empty()) {
+    return false;
+  }
+
+  if (DEBUG) {
+    std::cout << "\n=== Running GlobalStrengthReduction on function: " << func->getName() << " ===" << std::endl;
+  }
+
+  bool changed = false;
+  GlobalStrengthReductionContext context(builder);
+  context.run(func, &AM, changed);
+
+  if (DEBUG) {
+    if (changed) {
+      std::cout << "GlobalStrengthReduction: Function " << func->getName() << " was modified" << std::endl;
+    } else {
+      std::cout << "GlobalStrengthReduction: Function " << func->getName() << " was not modified" << std::endl;
+    }
+    std::cout << "=== GlobalStrengthReduction completed for function: " << func->getName() << " ===" << std::endl;
+  }
+
+  return changed;
+}
+
+void GlobalStrengthReduction::getAnalysisUsage(std::set<void *> &analysisDependencies, std::set<void *> &analysisInvalidations) const {
+  // Strength reduction depends on side-effect analysis to decide whether an instruction can be optimized safely
+  analysisDependencies.insert(&SysYSideEffectAnalysisPass::ID);
+  
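+  // Strength reduction is declared as not invalidating any analysis, because:
+  // - it only replaces computation instructions and does not change control flow
+  // - it does not modify memory, so alias analysis is unaffected
+  // - it preserves program semantics
+  // analysisInvalidations stays empty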
+  
+  if (DEBUG) {
+    std::cout << "GlobalStrengthReduction: Declared analysis dependencies (SideEffectAnalysis)" << std::endl;
+  }
+}
+
+// ======================================================================
+// Implementation of the GlobalStrengthReductionContext class
+// ======================================================================
+
+void GlobalStrengthReductionContext::run(Function *func, AnalysisManager *AM, bool &changed) {
+  if (DEBUG) {
+    std::cout << "  Starting GlobalStrengthReduction analysis for function: " << func->getName() << std::endl;
+  }
+
+  // Fetch the analysis results
+  if (AM) {
+    sideEffectAnalysis = AM->getAnalysisResult<SideEffectAnalysisResult, SysYSideEffectAnalysisPass>();
+    
+    if (DEBUG) {
+      if (sideEffectAnalysis) {
+        std::cout << "    GlobalStrengthReduction: Using side effect analysis" << std::endl;
+      } else {
+        std::cout << "    GlobalStrengthReduction: Warning - side effect analysis not available" << std::endl;
+      }
+    }
+  }
+
+  // Reset the counters
+  algebraicOptCount = 0;
+  strengthReductionCount = 0;
+  divisionOptCount = 0;
+
+  // Optimize every basic block
+  for (auto &bb_ptr : func->getBasicBlocks()) {
+    if (processBasicBlock(bb_ptr.get())) {
+      changed = true;
+    }
+  }
+
+  if (DEBUG) {
+    std::cout << "  GlobalStrengthReduction completed for function: " << func->getName() << std::endl;
+    std::cout << "    Algebraic optimizations: " << algebraicOptCount << std::endl;
+    std::cout << "    Strength reductions: " << strengthReductionCount << std::endl;
+    std::cout << "    Division optimizations: " << divisionOptCount << std::endl;
+  }
+}
+
+bool GlobalStrengthReductionContext::processBasicBlock(BasicBlock *bb) {
+  bool changed = false;
+  
+  if (DEBUG) {
+    std::cout << "    Processing block: " << bb->getName() << std::endl;
+  }
+
+  // Collect the instructions to process first (avoids iterator invalidation)
+  std::vector<Instruction*> instructions;
+  for (auto &inst_ptr : bb->getInstructions()) {
+    instructions.push_back(inst_ptr.get());
+  }
+
+  // Process each instruction
+  for (auto inst : instructions) {
+    if (processInstruction(inst)) {
+      changed = true;
+    }
+  }
+
+  return changed;
+}
+
+bool GlobalStrengthReductionContext::processInstruction(Instruction *inst) {
+
+  if (DEBUG) {
+    std::cout << "      Processing instruction: " << inst->getName() << std::endl;
+  }
+
+  // Try algebraic optimization first
+  if (tryAlgebraicOptimization(inst)) {
+    algebraicOptCount++;
+    return true;
+  }
+
+  // Then try strength reduction
+  if (tryStrengthReduction(inst)) {
+    strengthReductionCount++;
+    return true;
+  }
+
+  return false;
+}
+
+// ======================================================================
+// Algebraic simplification
+// ======================================================================
+
+bool GlobalStrengthReductionContext::tryAlgebraicOptimization(Instruction *inst) {
+  auto binary = dynamic_cast<BinaryInst*>(inst);
+  if (!binary) {
+    return false;
+  }
+
+  switch (binary->getKind()) {
+    case Instruction::kAdd:
+      return optimizeAddition(binary);
+    case Instruction::kSub:
+      return optimizeSubtraction(binary);
+    case Instruction::kMul:
+      return optimizeMultiplication(binary);
+    case Instruction::kDiv:
+      return optimizeDivision(binary);
+    case Instruction::kICmpEQ:
+    case Instruction::kICmpNE:
+    case Instruction::kICmpLT:
+    case Instruction::kICmpGT:
+    case Instruction::kICmpLE:
+    case Instruction::kICmpGE:
+      return optimizeComparison(binary);
+    case Instruction::kAnd:
+    case Instruction::kOr:
+      return optimizeLogical(binary);
+    default:
+      return false;
+  }
+}
+
+bool GlobalStrengthReductionContext::optimizeAddition(BinaryInst *inst) {
+  Value *lhs = inst->getLhs();
+  Value *rhs = inst->getRhs();
+  int constVal;
+
+  // x + 0 = x
+  if (isConstantInt(rhs, constVal) && constVal == 0) {
+    if (DEBUG) {
+      std::cout << "        Algebraic: " << inst->getName() << " = x + 0 -> x" << std::endl;
+    }
+    replaceWithOptimized(inst, lhs);
+    return true;
+  }
+  
+  // 0 + x = x
+  if (isConstantInt(lhs, constVal) && constVal == 0) {
+    if (DEBUG) {
+      std::cout << "        Algebraic: " << inst->getName() << " = 0 + x -> x" << std::endl;
+    }
+    replaceWithOptimized(inst, rhs);
+    return true;
+  }
+
+  // x + (-y) = x - y
+  if (auto rhsInst = dynamic_cast<UnaryInst*>(rhs)) {
+    if (rhsInst->getKind() == Instruction::kNeg) {
+      if (DEBUG) {
+        std::cout << "        Algebraic: " << inst->getName() << " = x + (-y) -> x - y" << std::endl;
+      }
+      // Create the subtraction instruction
+      builder->setPosition(inst->getParent(), inst->getParent()->findInstIterator(inst));
+      auto subInst = builder->createSubInst(lhs, rhsInst->getOperand());
+      replaceWithOptimized(inst, subInst);
+      return true;
+    }
+  }
+
+  return false;
+}
+
+bool GlobalStrengthReductionContext::optimizeSubtraction(BinaryInst *inst) {
+  Value *lhs = inst->getLhs();
+  Value *rhs = inst->getRhs();
+  int constVal;
+
+  // x - 0 = x
+  if (isConstantInt(rhs, constVal) && constVal == 0) {
+    if (DEBUG) {
+      std::cout << "        Algebraic: " << inst->getName() << " = x - 0 -> x" << std::endl;
+    }
+    replaceWithOptimized(inst, lhs);
+    return true;
+  }
+
+  // x - x = 0 (if x has no side effects)
+  if (lhs == rhs && hasOnlyLocalUses(dynamic_cast<Instruction*>(lhs))) {
+    if (DEBUG) {
+      std::cout << "        Algebraic: " << inst->getName() << " = x - x -> 0" << std::endl;
+    }
+    replaceWithOptimized(inst, getConstantInt(0));
+    return true;
+  }
+
+  // x - (-y) = x + y
+  if (auto rhsInst = dynamic_cast<UnaryInst*>(rhs)) {
+    if (rhsInst->getKind() == Instruction::kNeg) {
+      if (DEBUG) {
+        std::cout << "        Algebraic: " << inst->getName() << " = x - (-y) -> x + y" << std::endl;
+      }
+      builder->setPosition(inst->getParent(), inst->getParent()->findInstIterator(inst));
+      auto addInst = builder->createAddInst(lhs, rhsInst->getOperand());
+      replaceWithOptimized(inst, addInst);
+      return true;
+    }
+  }
+
+  return false;
+}
+
+bool GlobalStrengthReductionContext::optimizeMultiplication(BinaryInst *inst) {
+  Value *lhs = inst->getLhs();
+  Value *rhs = inst->getRhs();
+  int constVal;
+
+  // x * 0 = 0
+  if (isConstantInt(rhs, constVal) && constVal == 0) {
+    if (DEBUG) {
+      std::cout << "        Algebraic: " << inst->getName() << " = x * 0 -> 0" << std::endl;
+    }
+    replaceWithOptimized(inst, getConstantInt(0));
+    return true;
+  }
+  
+  // 0 * x = 0
+  if (isConstantInt(lhs, constVal) && constVal == 0) {
+    if (DEBUG) {
+      std::cout << "        Algebraic: " << inst->getName() << " = 0 * x -> 0" << std::endl;
+    }
+    replaceWithOptimized(inst, getConstantInt(0));
+    return true;
+  }
+
+  // x * 1 = x
+  if (isConstantInt(rhs, constVal) && constVal == 1) {
+    if (DEBUG) {
+      std::cout << "        Algebraic: " << inst->getName() << " = x * 1 -> x" << std::endl;
+    }
+    replaceWithOptimized(inst, lhs);
+    return true;
+  }
+  
+  // 1 * x = x
+  if (isConstantInt(lhs, constVal) && constVal == 1) {
+    if (DEBUG) {
+      std::cout << "        Algebraic: " << inst->getName() << " = 1 * x -> x" << std::endl;
+    }
+    replaceWithOptimized(inst, rhs);
+    return true;
+  }
+
+  // x * (-1) = -x
+  if (isConstantInt(rhs, constVal) && constVal == -1) {
+    if (DEBUG) {
+      std::cout << "        Algebraic: " << inst->getName() << " = x * (-1) -> -x" << std::endl;
+    }
+    builder->setPosition(inst->getParent(), inst->getParent()->findInstIterator(inst));
+    auto negInst = builder->createNegInst(lhs);
+    replaceWithOptimized(inst, negInst);
+    return true;
+  }
+
+  return false;
+}
+
+bool GlobalStrengthReductionContext::optimizeDivision(BinaryInst *inst) {
+  Value *lhs = inst->getLhs();
+  Value *rhs = inst->getRhs();
+  int constVal;
+
+  // x / 1 = x
+  if (isConstantInt(rhs, constVal) && constVal == 1) {
+    if (DEBUG) {
+      std::cout << "        Algebraic: " << inst->getName() << " = x / 1 -> x" << std::endl;
+    }
+    replaceWithOptimized(inst, lhs);
+    return true;
+  }
+
+  // x / (-1) = -x
+  if (isConstantInt(rhs, constVal) && constVal == -1) {
+    if (DEBUG) {
+      std::cout << "        Algebraic: " << inst->getName() << " = x / (-1) -> -x" << std::endl;
+    }
+    builder->setPosition(inst->getParent(), inst->getParent()->findInstIterator(inst));
+    auto negInst = builder->createNegInst(lhs);
+    replaceWithOptimized(inst, negInst);
+    return true;
+  }
+
+  // x / x = 1 (if x has no side effects; assumes x != 0, which is not checked here)
+  if (lhs == rhs && hasOnlyLocalUses(dynamic_cast<Instruction*>(lhs))) {
+    if (DEBUG) {
+      std::cout << "        Algebraic: " << inst->getName() << " = x / x -> 1" << std::endl;
+    }
+    replaceWithOptimized(inst, getConstantInt(1));
+    return true;
+  }
+
+  return false;
+}
+
+bool GlobalStrengthReductionContext::optimizeComparison(BinaryInst *inst) {
+  Value *lhs = inst->getLhs();
+  Value *rhs = inst->getRhs();
+
+  // x == x = true (if x has no side effects)
+  if (inst->getKind() == Instruction::kICmpEQ && lhs == rhs && 
+      hasOnlyLocalUses(dynamic_cast<Instruction*>(lhs))) {
+    if (DEBUG) {
+      std::cout << "        Algebraic: " << inst->getName() << " = x == x -> true" << std::endl;
+    }
+    replaceWithOptimized(inst, getConstantInt(1));
+    return true;
+  }
+
+  // x != x = false (if x has no side effects)
+  if (inst->getKind() == Instruction::kICmpNE && lhs == rhs && 
+      hasOnlyLocalUses(dynamic_cast<Instruction*>(lhs))) {
+    if (DEBUG) {
+      std::cout << "        Algebraic: " << inst->getName() << " = x != x -> false" << std::endl;
+    }
+    replaceWithOptimized(inst, getConstantInt(0));
+    return true;
+  }
+
+  return false;
+}
+
+bool GlobalStrengthReductionContext::optimizeLogical(BinaryInst *inst) {
+  Value *lhs = inst->getLhs();
+  Value *rhs = inst->getRhs();
+  int constVal;
+
+  if (inst->getKind() == Instruction::kAnd) {
+    // x && 0 = 0
+    if (isConstantInt(rhs, constVal) && constVal == 0) {
+      if (DEBUG) {
+        std::cout << "        Algebraic: " << inst->getName() << " = x && 0 -> 0" << std::endl;
+      }
+      replaceWithOptimized(inst, getConstantInt(0));
+      return true;
+    }
+    
+    // x && -1 = x (kAnd is used as a bitwise AND here, so the all-ones value -1 is the identity)
+    if (isConstantInt(rhs, constVal) && constVal == -1) {
+      if (DEBUG) {
+        std::cout << "        Algebraic: " << inst->getName() << " = x && -1 -> x" << std::endl;
+      }
+      replaceWithOptimized(inst, lhs);
+      return true;
+    }
+
+    // x && x = x
+    if (lhs == rhs) {
+      if (DEBUG) {
+        std::cout << "        Algebraic: " << inst->getName() << " = x && x -> x" << std::endl;
+      }
+      replaceWithOptimized(inst, lhs);
+      return true;
+    }
+  } else if (inst->getKind() == Instruction::kOr) {
+    // x || 0 = x
+    if (isConstantInt(rhs, constVal) && constVal == 0) {
+      if (DEBUG) {
+        std::cout << "        Algebraic: " << inst->getName() << " = x || 0 -> x" << std::endl;
+      }
+      replaceWithOptimized(inst, lhs);
+      return true;
+    }
+
+    // x || x = x
+    if (lhs == rhs) {
+      if (DEBUG) {
+        std::cout << "        Algebraic: " << inst->getName() << " = x || x -> x" << std::endl;
+      }
+      replaceWithOptimized(inst, lhs);
+      return true;
+    }
+  }
+
+  return false;
+}
+
+// ======================================================================
+// Strength reduction
+// ======================================================================
+
+bool GlobalStrengthReductionContext::tryStrengthReduction(Instruction *inst) {
+  if (auto binary = dynamic_cast<BinaryInst*>(inst)) {
+    switch (binary->getKind()) {
+      case Instruction::kMul:
+        return reduceMultiplication(binary);
+      case Instruction::kDiv:
+        return reduceDivision(binary);
+      default:
+        return false;
+    }
+  } else if (auto call = dynamic_cast<CallInst*>(inst)) {
+    return reducePower(call);
+  }
+
+  return false;
+}
+
+bool GlobalStrengthReductionContext::reduceMultiplication(BinaryInst *inst) {
+  Value *lhs = inst->getLhs();
+  Value *rhs = inst->getRhs();
+  int constVal;
+
+  // Try a constant right operand
+  Value* variable = lhs;
+  if (isConstantInt(rhs, constVal) && constVal > 0) {
+    return tryComplexMultiplication(inst, variable, constVal);
+  }
+  
+  // Try a constant left operand
+  if (isConstantInt(lhs, constVal) && constVal > 0) {
+    variable = rhs;
+    return tryComplexMultiplication(inst, variable, constVal);
+  }
+
+  return false;
+}
+
+bool GlobalStrengthReductionContext::tryComplexMultiplication(BinaryInst* inst, Value* variable, int constant) {
+  // A power of two becomes a single left shift
+  if (isPowerOfTwo(constant)) {
+    int shiftAmount = log2OfPowerOfTwo(constant);
+    if (DEBUG) {
+      std::cout << "        StrengthReduction: " << inst->getName() 
+                << " = x * " << constant << " -> x << " << shiftAmount << std::endl;
+    }
+    
+    builder->setPosition(inst->getParent(), inst->getParent()->findInstIterator(inst));
+    auto shiftInst = builder->createBinaryInst(Instruction::kSll, Type::getIntType(), variable, getConstantInt(shiftAmount));
+    replaceWithOptimized(inst, shiftInst);
+    return true;
+  }
+  
+  // Otherwise try to decompose into a combination of shifts and adds
+  std::vector<int> shifts;
+  if (findOptimalShiftDecomposition(constant, shifts)) {
+    if (DEBUG) {
+      std::cout << "        StrengthReduction: " << inst->getName() 
+                << " = x * " << constant << " -> shift decomposition with " << shifts.size() << " terms" << std::endl;
+    }
+    
+    Value* result = createShiftDecomposition(inst, variable, shifts);
+    if (result) {
+      replaceWithOptimized(inst, result);
+      return true;
+    }
+  }
+  
+  return false;
+}
+
+bool GlobalStrengthReductionContext::findOptimalShiftDecomposition(int constant, std::vector<int>& shifts) {
+  shifts.clear();
+  
+  // Common decompositions
+  switch (constant) {
+    case 3:   // 3 = 2^1 + 2^0 -> (x << 1) + x
+      shifts = {1, 0};
+      return true;
+    case 5:   // 5 = 2^2 + 2^0 -> (x << 2) + x  
+      shifts = {2, 0};
+      return true;
+    case 6:   // 6 = 2^2 + 2^1 -> (x << 2) + (x << 1)
+      shifts = {2, 1};
+      return true;
+    case 7:   // 7 = 2^2 + 2^1 + 2^0 -> (x << 2) + (x << 1) + x
+      shifts = {2, 1, 0};
+      return true;
+    case 9:   // 9 = 2^3 + 2^0 -> (x << 3) + x
+      shifts = {3, 0};
+      return true;
+    case 10:  // 10 = 2^3 + 2^1 -> (x << 3) + (x << 1)
+      shifts = {3, 1};
+      return true;
+    case 11:  // 11 = 2^3 + 2^1 + 2^0 -> (x << 3) + (x << 1) + x
+      shifts = {3, 1, 0};
+      return true;
+    case 12:  // 12 = 2^3 + 2^2 -> (x << 3) + (x << 2)
+      shifts = {3, 2};
+      return true;
+    case 13:  // 13 = 2^3 + 2^2 + 2^0 -> (x << 3) + (x << 2) + x
+      shifts = {3, 2, 0};
+      return true;
+    case 14:  // 14 = 2^3 + 2^2 + 2^1 -> (x << 3) + (x << 2) + (x << 1)
+      shifts = {3, 2, 1};
+      return true;
+    case 15:  // 15 = 2^3 + 2^2 + 2^1 + 2^0 -> (x << 3) + (x << 2) + (x << 1) + x
+      shifts = {3, 2, 1, 0};
+      return true;
+    case 17:  // 17 = 2^4 + 2^0 -> (x << 4) + x
+      shifts = {4, 0};
+      return true;
+    case 18:  // 18 = 2^4 + 2^1 -> (x << 4) + (x << 1)
+      shifts = {4, 1};
+      return true;
+    case 20:  // 20 = 2^4 + 2^2 -> (x << 4) + (x << 2)
+      shifts = {4, 2};
+      return true;
+    case 24:  // 24 = 2^4 + 2^3 -> (x << 4) + (x << 3)
+      shifts = {4, 3};
+      return true;
+    case 25:  // 25 = 2^4 + 2^3 + 2^0 -> (x << 4) + (x << 3) + x
+      shifts = {4, 3, 0};
+      return true;
+    case 100: // 100 = 2^6 + 2^5 + 2^2 -> (x << 6) + (x << 5) + (x << 2)
+      shifts = {6, 5, 2};
+      return true;
+  }
+  
+  // Generic binary decomposition (collects at most 4 terms to keep the expansion small)
+  if (constant > 0 && constant < 256) {
+    std::vector<int> binaryShifts;
+    int temp = constant;
+    int bit = 0;
+    
+    while (temp > 0 && binaryShifts.size() < 4) {
+      if (temp & 1) {
+        binaryShifts.push_back(bit);
+      }
+      temp >>= 1;
+      bit++;
+    }
+    
+    // Use the binary decomposition only for 2 or 3 terms (assumed cheaper than a multiply)
+    if (binaryShifts.size() <= 3 && binaryShifts.size() >= 2) {
+      shifts = binaryShifts;
+      return true;
+    }
+  }
+  
+  return false;
+}
+
+Value* GlobalStrengthReductionContext::createShiftDecomposition(BinaryInst* inst, Value* variable, const std::vector<int>& shifts) {
+  if (shifts.empty()) return nullptr;
+  
+  builder->setPosition(inst->getParent(), inst->getParent()->findInstIterator(inst));
+  
+  Value* result = nullptr;
+  
+  for (int shift : shifts) {
+    Value* term;
+    if (shift == 0) {
+      // A shift by 0 is the variable itself
+      term = variable;
+    } else {
+      // Create the shift instruction
+      term = builder->createBinaryInst(Instruction::kSll, Type::getIntType(), variable, getConstantInt(shift));
+    }
+    
+    if (result == nullptr) {
+      result = term;
+    } else {
+      // Accumulate into the running sum
+      result = builder->createAddInst(result, term);
+    }
+  }
+  
+  return result;
+}
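
The shift-and-add decomposition above can be checked outside the compiler with a small standalone model. This is only a sketch: `setBitPositions` and `mulByShifts` are hypothetical helpers that are not part of this commit, and the arithmetic is done on `uint32_t` so that wraparound is well defined.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: bit positions that are set in a positive constant,
// in ascending order (the same terms the generic binary decomposition collects).
std::vector<int> setBitPositions(uint32_t constant) {
  std::vector<int> shifts;
  for (int bit = 0; constant != 0; ++bit, constant >>= 1) {
    if (constant & 1u) shifts.push_back(bit);
  }
  return shifts;
}

// Hypothetical helper: evaluate x * c as a sum of shifted copies of x.
// Unsigned arithmetic keeps overflow well defined (modulo 2^32).
int32_t mulByShifts(int32_t x, const std::vector<int>& shifts) {
  uint32_t ux = static_cast<uint32_t>(x);
  uint32_t sum = 0;
  for (int s : shifts) {
    assert(s >= 0 && s < 32);  // a shift by 32 or more would be undefined
    sum += ux << s;
  }
  return static_cast<int32_t>(sum);
}
```

For example, `mulByShifts(x, {6, 5, 2})` models the table entry for 100, since 100 = 2^6 + 2^5 + 2^2. The order of the terms does not affect the sum.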
+
+bool GlobalStrengthReductionContext::reduceDivision(BinaryInst *inst) {
+  Value *lhs = inst->getLhs();
+  Value *rhs = inst->getRhs();
+  uint32_t constVal;
+
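+  // x / 2^k for signed x: a plain x >> k rounds toward negative infinity,
+  // so negative dividends need a correction before the shift
+  if (isConstantInt(rhs, constVal) && constVal > 0 && isPowerOfTwo(constVal)) {
+    builder->setPosition(inst->getParent(), inst->getParent()->findInstIterator(inst));
+    int shiftAmount = log2OfPowerOfTwo(constVal);
+    // Signed division correction: (x + ((x >> 31) & mask)) >> k, with mask = 2^k - 1
+    int maskValue = constVal - 1;
+    
+    // x >> 31 (arithmetic right shift: 0 for non-negative x, -1 for negative x)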
+    Value* signShift = ConstantInteger::get(31);
+    Value* signBits = builder->createBinaryInst(
+      Instruction::Kind::kSra,  // arithmetic right shift
+      lhs->getType(),
+      lhs,
+      signShift
+    );
+    
+    // (x >> 31) & mask
+    Value* mask = ConstantInteger::get(maskValue);
+    Value* correction = builder->createBinaryInst(
+      Instruction::Kind::kAnd,
+      lhs->getType(),
+      signBits,
+      mask
+    );
+    
+    // x + correction
+    Value* corrected = builder->createAddInst(lhs, correction);
+    
+    // (x + correction) >> k
+    Value* divShift = ConstantInteger::get(shiftAmount);
+    Value* shiftInst = builder->createBinaryInst(
+      Instruction::Kind::kSra,  // arithmetic right shift
+      lhs->getType(),
+      corrected,
+      divShift
+    );
+
+    if (DEBUG) {
+      std::cout << "        StrengthReduction: " << inst->getName() 
+                << " = x / " << constVal << " -> (x + ((x >> 31) & mask)) >> " << shiftAmount << std::endl;
+    }
+    
+    // builder->setPosition(inst->getParent(), inst->getParent()->findInstIterator(inst));
+    // Value* divisor_minus_1 = ConstantInteger::get(constVal - 1);
+    // Value* adjusted = builder->createAddInst(lhs, divisor_minus_1);
+    // Value* shiftInst = builder->createBinaryInst(Instruction::kSra, Type::getIntType(), adjusted, getConstantInt(shiftAmount));
+    replaceWithOptimized(inst, shiftInst);
+    // strengthReductionCount is incremented by processInstruction; counting here as well would double count
+    return true;
+  }
+
+  // x / c = x * magic_number (magic-number multiplication following the libdivide approach; currently disabled)
+  // if (isConstantInt(rhs, constVal) && constVal > 1 && constVal != (uint32_t)(-1)) {
+  //   // auto magicPair = computeMulhMagicNumbers(static_cast<int>(constVal));
+  //   Value* magicResult = createMagicDivisionLibdivide(inst, static_cast<int>(constVal));
+  //   replaceWithOptimized(inst, magicResult);
+  //   divisionOptCount++;
+  //   return true;
+  // }
+
+  return false;
+}
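
The correction used for signed division by a power of two can be modelled in plain C++ and compared against the built-in `/`. A minimal sketch, assuming 32-bit `int32_t` and an arithmetic `>>` on signed values (guaranteed since C++20, and the usual behaviour before that); `divPow2` is a hypothetical name, not a function in this commit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the rewrite: x / 2^k computed as
// (x + ((x >> 31) & (2^k - 1))) >> k, which rounds toward zero like C++ division.
int32_t divPow2(int32_t x, int k) {
  assert(k >= 1 && k <= 30);
  int32_t signBits = x >> 31;                       // 0 for x >= 0, -1 for x < 0
  int32_t correction = signBits & ((1 << k) - 1);   // 2^k - 1 only for negative x
  return (x + correction) >> k;                     // cannot overflow: correction > 0 only when x < 0
}
```

Adding `2^k - 1` unconditionally, as in the commented-out variant, would turn the floor into a ceiling for positive dividends: `(1 + 1) >> 1` is 1, while `1 / 2` is 0. Masking with the sign bits avoids that.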
+
+bool GlobalStrengthReductionContext::reducePower(CallInst *inst) {
+  // Check whether this is a call to pow
+  Function* callee = inst->getCallee();
+  if (!callee || callee->getName() != "pow") {
+    return false;
+  }
+
+  // pow(x, 2) = x * x
+  if (inst->getNumOperands() >= 2) {
+    int exponent;
+    if (isConstantInt(inst->getOperand(1), exponent)) {
+      if (exponent == 2) {
+        if (DEBUG) {
+          std::cout << "        StrengthReduction: pow(x, 2) -> x * x" << std::endl;
+        }
+        
+        Value* base = inst->getOperand(0);
+        builder->setPosition(inst->getParent(), inst->getParent()->findInstIterator(inst));
+        auto mulInst = builder->createMulInst(base, base);
+        replaceWithOptimized(inst, mulInst);
+        // strengthReductionCount is incremented by processInstruction; counting here as well would double count
+        return true;
+      } else if (exponent >= 3 && exponent <= 8) {
+        // For small exponents, expand into repeated multiplication
+        if (DEBUG) {
+          std::cout << "        StrengthReduction: pow(x, " << exponent << ") -> repeated multiplication" << std::endl;
+        }
+        
+        Value* base = inst->getOperand(0);
+        Value* result = base;
+        builder->setPosition(inst->getParent(), inst->getParent()->findInstIterator(inst));
+        
+        for (int i = 1; i < exponent; i++) {
+          result = builder->createMulInst(result, base);
+        }
+        
+        replaceWithOptimized(inst, result);
+        // strengthReductionCount is incremented by processInstruction; counting here as well would double count
+        return true;
+      }
+    }
+  }
+
+  return false;
+}
+
+Value* GlobalStrengthReductionContext::createMagicDivisionLibdivide(BinaryInst* divInst, int divisor) {
+  builder->setPosition(divInst->getParent(), divInst->getParent()->findInstIterator(divInst));
+  // Use a mulh instruction to optimize division by an arbitrary constant
+  auto [magic, shift] = SysYIROptUtils::computeMulhMagicNumbers(divisor);
+  
+  // Bail out if no magic number exists (magic == -1 and shift == -1 signal failure)
+  if (magic == -1 && shift == -1) {
+    if (DEBUG) {
+      std::cout << "[SR] Cannot optimize division by " << divisor 
+                << ", keeping original division" << std::endl;
+    }
+    // Returning nullptr tells the caller to keep the original division
+    return nullptr;
+  }
+  
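+  // Powers of two are normally handled by reduceDivision and should not reach this point; this branch is only a safeguard
+  if ((divisor & (divisor - 1)) == 0 && divisor > 0) {
+    // Compute log2(divisor)
+    int shift_amount = 0;
+    int temp = divisor;
+    while (temp > 1) {
+      temp >>= 1;
+      shift_amount++;
+    }
+    
+    Value* shiftConstant = ConstantInteger::get(shift_amount);
+    // Signed division rounds toward zero, so add (divisor - 1) only when the dividend is negative:
+    // adjusted = x + ((x >> 31) & (divisor - 1))
+    Value* dividend = divInst->getOperand(0);
+    Value* signBits = builder->createBinaryInst(Instruction::Kind::kSra, dividend->getType(), dividend, ConstantInteger::get(31));
+    Value* correction = builder->createBinaryInst(Instruction::Kind::kAnd, dividend->getType(), signBits, ConstantInteger::get(divisor - 1));
+    Value* adjusted = builder->createAddInst(dividend, correction);
+    return builder->createBinaryInst(
+      Instruction::Kind::kSra,  // arithmetic right shift
+      divInst->getOperand(0)->getType(),
+      adjusted,
+      shiftConstant
+    );
+  }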
+  
+  // Build the magic constant
+  // If the magic number does not fit in 32 bits, skip the optimization
+  if (magic > INT32_MAX || magic < INT32_MIN) {
+    if (DEBUG) {
+      std::cout << "[SR] Magic number " << magic << " exceeds 32-bit range, skipping optimization" << std::endl;
+    }
+    return nullptr; // cannot optimize; keep the original division
+  }
+  
+  Value* magicConstant = ConstantInteger::get((int32_t)magic);
+  
+  // Check whether the ADD_MARKER adjustment (an extra add) is needed
+  bool needAdd = (shift & 0x40) != 0;
+  int actualShift = shift & 0x3F; // extract the actual shift amount
+  
+  if (DEBUG) {
+    std::cout << "[SR] IR Generation: magic=" << magic << ", needAdd=" << needAdd 
+              << ", actualShift=" << actualShift << std::endl;
+  }
+  
+  // High-half multiplication: mulh(x, magic)
+  Value* mulhResult = builder->createBinaryInst(
+    Instruction::Kind::kMulh,  // high-half multiply
+    divInst->getOperand(0)->getType(),
+    divInst->getOperand(0),
+    magicConstant
+  );
+  
+  if (needAdd) {
+    // ADD_MARKER case: the dividend must be added before the shift
+    // This corresponds to the add-adjustment step of the libdivide algorithm
+    if (DEBUG) {
+      std::cout << "[SR] Applying ADD_MARKER: adding dividend before shift" << std::endl;
+    }
+    mulhResult = builder->createAddInst(mulhResult, divInst->getOperand(0));
+  }
+  
+  if (actualShift > 0) {
+    // Apply the remaining shift, if any
+    Value* shiftConstant = ConstantInteger::get(actualShift);
+    mulhResult = builder->createBinaryInst(
+      Instruction::Kind::kSra,  // arithmetic right shift
+      divInst->getOperand(0)->getType(),
+      mulhResult,
+      shiftConstant
+    );
+  }
+  
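+  // Standard sign fix-up for signed division: add 1 to the quotient when the dividend is negative
+  // This is applied to every signed division, whether or not the dividend can actually be negative
+  Value* isNegative = builder->createICmpLTInst(divInst->getOperand(0), ConstantInteger::get(0));
+  // The i1 result of ICmpLTInst is converted to 32 bits by default: 1 for a negative dividend, 0 otherwise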
+  mulhResult = builder->createAddInst(mulhResult, isNegative);
+  
+  return mulhResult; 
+}
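
The mulh sequence emitted above (high multiply, optional add of the dividend, arithmetic shift, then +1 for a negative dividend) can be illustrated for two fixed divisors. This is a sketch under stated assumptions: the constants are the well-known 32-bit signed magic numbers for 3 and 7 from the classic constant-division tables, not values produced by `computeMulhMagicNumbers`, and `mulh`, `divBy3`, `divBy7` and `checkMagicDivision` are hypothetical names. It also assumes an arithmetic `>>` on signed values.

```cpp
#include <cassert>
#include <cstdint>

// High 32 bits of the signed 64-bit product, i.e. what a mulh instruction computes.
int32_t mulh(int32_t a, int32_t b) {
  return static_cast<int32_t>((static_cast<int64_t>(a) * b) >> 32);
}

// x / 3: magic 0x55555556, no add step, no extra shift.
int32_t divBy3(int32_t x) {
  int32_t q = mulh(x, 0x55555556);
  return q + static_cast<int32_t>(static_cast<uint32_t>(x) >> 31);  // +1 if x < 0
}

// x / 7: magic 0x92492493 is negative as a 32-bit value (-1840700269),
// so the dividend is added back before shifting right by 2.
int32_t divBy7(int32_t x) {
  int32_t q = mulh(x, -1840700269);
  q += x;    // the add-adjustment step
  q >>= 2;   // remaining arithmetic shift
  return q + static_cast<int32_t>(static_cast<uint32_t>(x) >> 31);  // +1 if x < 0
}

// Compare both sketches against native division over a small range.
void checkMagicDivision() {
  for (int32_t x = -1000; x <= 1000; ++x) {
    assert(divBy3(x) == x / 3);
    assert(divBy7(x) == x / 7);
  }
}
```

The divisor 7 shows why an add marker is needed: its magic number does not fit in a positive 32-bit value, so the multiply alone yields roughly `-3x/7` and adding `x` restores `4x/7` before the shift by 2.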
+
+// ======================================================================
+// Helper methods
+// ======================================================================
+
+bool GlobalStrengthReductionContext::isPowerOfTwo(uint32_t n) {
+  return n > 0 && (n & (n - 1)) == 0;
+}
+
+int GlobalStrengthReductionContext::log2OfPowerOfTwo(uint32_t n) {
+  int result = 0;
+  while (n > 1) {
+    n >>= 1;
+    result++;
+  }
+  return result;
+}
+
+bool GlobalStrengthReductionContext::isConstantInt(Value* val, int& constVal) {
+  if (auto constInt = dynamic_cast<ConstantInteger*>(val)) {
+    constVal = std::get<int>(constInt->getVal());
+    return true;
+  }
+  return false;
+}
+
+bool GlobalStrengthReductionContext::isConstantInt(Value* val, uint32_t& constVal) {
+  if (auto constInt = dynamic_cast<ConstantInteger*>(val)) {
+    int signedVal = std::get<int>(constInt->getVal());
+    if (signedVal >= 0) {
+      constVal = static_cast<uint32_t>(signedVal);
+      return true;
+    }
+  }
+  return false;
+}
+
+ConstantInteger* GlobalStrengthReductionContext::getConstantInt(int val) {
+  return ConstantInteger::get(val);
+}
+
+bool GlobalStrengthReductionContext::hasOnlyLocalUses(Instruction* inst) {
+  if (!inst) return true;
+  
+  // Simple check: an instruction without side effects is treated as local
+  if (sideEffectAnalysis) {
+    auto sideEffect = sideEffectAnalysis->getInstructionSideEffect(inst);
+    return sideEffect.type == SideEffectType::NO_SIDE_EFFECT;
+  }
+  
+  // Without side-effect analysis, be conservative
+  return !inst->isCall() && !inst->isStore() && !inst->isLoad();
+}
+
+void GlobalStrengthReductionContext::replaceWithOptimized(Instruction* original, Value* replacement) {
+  if (DEBUG) {
+    std::cout << "          Replacing " << original->getName() 
+              << " with " << replacement->getName() << std::endl;
+  }
+  
+  original->replaceAllUsesWith(replacement);
+  
+  // If the replacement is a newly created instruction, make sure it has a suitable name
+//   if (auto replInst = dynamic_cast<Instruction*>(replacement)) {
+//     if (replInst->getName().empty()) {
+//       replInst->setName(original->getName() + "_opt");
+//     }
+//   }
+  
+  // Delete the original instruction now that all of its uses have been redirected
+  SysYIROptUtils::usedelete(original);
+}
+
+} // namespace sysy
--- a/src/midend/Pass/Optimize/InductionVariableElimination.cpp
+++ b/src/midend/Pass/Optimize/InductionVariableElimination.cpp
@ -0,0 +1,917 @@
+#include "InductionVariableElimination.h"
+#include "LoopCharacteristics.h"
+#include "Loop.h"
+#include "Dom.h"
+#include "SideEffectAnalysis.h"
+#include "AliasAnalysis.h"
+#include "SysYIROptUtils.h"
+#include <iostream>
+#include <algorithm>
+
+// Use the global debug switch
+extern int DEBUG;
+
+namespace sysy {
+
+// Unique ID of this pass
+void *InductionVariableElimination::ID = (void *)&InductionVariableElimination::ID;
+
+bool InductionVariableElimination::runOnFunction(Function* F, AnalysisManager& AM) {
+  if (F->getBasicBlocks().empty()) {
+    return false; // empty function
+  }
+
+  if (DEBUG) {
+    std::cout << "Running InductionVariableElimination on function: " << F->getName() << std::endl;
+  }
+
+  // Create the optimization context and run it
+  InductionVariableEliminationContext context;
+  bool modified = context.run(F, AM);
+
+  if (DEBUG) {
+    std::cout << "InductionVariableElimination " << (modified ? "modified" : "did not modify") 
+              << " function: " << F->getName() << std::endl;
+  }
+
+  return modified;
+}
+
+void InductionVariableElimination::getAnalysisUsage(std::set<void*>& analysisDependencies, 
+                                                   std::set<void*>& analysisInvalidations) const {
+  // Required analyses
+  analysisDependencies.insert(&LoopAnalysisPass::ID);
+  analysisDependencies.insert(&LoopCharacteristicsPass::ID);
+  analysisDependencies.insert(&DominatorTreeAnalysisPass::ID);
+  analysisDependencies.insert(&SysYSideEffectAnalysisPass::ID);
+  analysisDependencies.insert(&SysYAliasAnalysisPass::ID);
+  
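+  // Invalidated analyses (induction variable elimination modifies the IR)
+  analysisInvalidations.insert(&LoopCharacteristicsPass::ID);
+  // Note: the dominator tree normally stays valid, because this pass does not change control flow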
+}
+
+// ========== InductionVariableEliminationContext implementation ==========
+
+bool InductionVariableEliminationContext::run(Function* F, AnalysisManager& AM) {
+  if (DEBUG) {
+    std::cout << "  Starting induction variable elimination analysis..." << std::endl;
+  }
+
+  // Fetch the required analysis results
+  loopAnalysis = AM.getAnalysisResult<LoopAnalysisResult, LoopAnalysisPass>(F);
+  if (!loopAnalysis || !loopAnalysis->hasLoops()) {
+    if (DEBUG) {
+      std::cout << "  No loops found, skipping induction variable elimination" << std::endl;
+    }
+    return false;
+  }
+
+  loopCharacteristics = AM.getAnalysisResult<LoopCharacteristicsResult, LoopCharacteristicsPass>(F);
+  if (!loopCharacteristics) {
+    if (DEBUG) {
+      std::cout << "  LoopCharacteristics analysis not available" << std::endl;
+    }
+    return false;
+  }
+
+  dominatorTree = AM.getAnalysisResult<DominatorTree, DominatorTreeAnalysisPass>(F);
+  if (!dominatorTree) {
+    if (DEBUG) {
+      std::cout << "  DominatorTree analysis not available" << std::endl;
+    }
+    return false;
+  }
+
+  sideEffectAnalysis = AM.getAnalysisResult<SideEffectAnalysisResult, SysYSideEffectAnalysisPass>();
+  if (!sideEffectAnalysis) {
+    if (DEBUG) {
+      std::cout << "  SideEffectAnalysis not available, using conservative approach" << std::endl;
+    }
+    // Execution can continue, but with a more conservative strategy
+  } else {
+    if (DEBUG) {
+      std::cout << "  Using SideEffectAnalysis for safety checks" << std::endl;
+    }
+  }
+
+  aliasAnalysis = AM.getAnalysisResult<AliasAnalysisResult, SysYAliasAnalysisPass>(F);
+  if (!aliasAnalysis) {
+    if (DEBUG) {
+      std::cout << "  AliasAnalysis not available, using conservative approach" << std::endl;
+    }
+    // Execution can continue, but with a more conservative strategy
+  } else {
+    if (DEBUG) {
+      std::cout << "  Using AliasAnalysis for memory safety checks" << std::endl;
+    }
+  }
+
+  // Run the three optimization phases
+  
+  // Phase 1: identify dead induction variables
+  identifyDeadInductionVariables(F);
+  
+  if (deadIVs.empty()) {
+    if (DEBUG) {
+      std::cout << "  No dead induction variables found" << std::endl;
+    }
+    return false;
+  }
+
+  if (DEBUG) {
+    std::cout << "  Found " << deadIVs.size() << " potentially dead induction variables" << std::endl;
+  }
+
+  // Phase 2: analyze safety
+  analyzeSafetyForElimination();
+
+  // Phase 3: perform the elimination
+  bool modified = performInductionVariableElimination();
+
+  if (DEBUG) {
+    printDebugInfo();
+  }
+
+  modified |= SysYIROptUtils::eliminateRedundantPhisInFunction(F);
+  return modified;
+}
+
+void InductionVariableEliminationContext::identifyDeadInductionVariables(Function* F) {
+  if (DEBUG) {
+    std::cout << "  === Phase 1: Identifying Dead Induction Variables ===" << std::endl;
+  }
+
+  // Walk all loops
+  for (const auto& loop_ptr : loopAnalysis->getAllLoops()) {
+    Loop* loop = loop_ptr.get();
+    
+    if (DEBUG) {
+      std::cout << "    Analyzing loop: " << loop->getName() << std::endl;
+    }
+
+    // Fetch the loop characteristics
+    const LoopCharacteristics* characteristics = loopCharacteristics->getCharacteristics(loop);
+    if (!characteristics) {
+      if (DEBUG) {
+        std::cout << "      No characteristics available for loop" << std::endl;
+      }
+      continue;
+    }
+
+    if (characteristics->InductionVars.empty()) {
+      if (DEBUG) {
+        std::cout << "      No induction variables found in loop" << std::endl;
+      }
+      continue;
+    }
+
+    // Check whether each induction variable is dead
+    for (const auto& iv : characteristics->InductionVars) {
+      auto deadIV = isDeadInductionVariable(iv.get(), loop);
+      if (deadIV) {
+        if (DEBUG) {
+          std::cout << "      Found potential dead IV: %" << deadIV->phiInst->getName() << std::endl;
+        }
+        
+        // Add it to the candidate list
+        loopToDeadIVs[loop].push_back(deadIV.get());
+        deadIVs.push_back(std::move(deadIV));
+      }
+    }
+  }
+
+  if (DEBUG) {
+    std::cout << "  === End Phase 1: Found " << deadIVs.size() << " candidates ===" << std::endl;
+  }
+}
+
+std::unique_ptr<DeadInductionVariable> 
+InductionVariableEliminationContext::isDeadInductionVariable(const InductionVarInfo* iv, Loop* loop) {
+  // Get the phi instruction
+  auto* phiInst = dynamic_cast<PhiInst*>(iv->div);
+  if (!phiInst) {
+    return nullptr; // not a phi instruction
+  }
+
+  // Recursively analyze the whole use chain to decide whether the value has any real use
+  if (!isPhiInstructionDeadRecursively(phiInst, loop)) {
+    return nullptr; // it has a real use and cannot be removed
+  }
+
+  // Record the dead induction variable
+  auto deadIV = std::make_unique<DeadInductionVariable>(phiInst, loop);
+  deadIV->relatedInsts = collectRelatedInstructions(phiInst, loop);
+  
+  return deadIV;
+}
+
+// Recursively check whether a phi instruction and its whole use chain are dead code
+bool InductionVariableEliminationContext::isPhiInstructionDeadRecursively(PhiInst* phiInst, Loop* loop) {
+  if (DEBUG) {
+    std::cout << "      Recursively analyzing the full use chain of induction variable " << phiInst->getName() << std::endl;
+  }
+
+  // Track visited instructions to avoid infinite recursion
+  std::set<Instruction*> visitedInstructions;
+  std::set<Instruction*> currentPath; // used to detect cyclic dependencies
+  
+  // Core logic: walk the use chain recursively, looking for any "escape point"
+  return isInstructionUseChainDeadRecursively(phiInst, loop, visitedInstructions, currentPath);
+}
+
+// Recursively check whether the use chain of an instruction consists only of dead code
+bool InductionVariableEliminationContext::isInstructionUseChainDeadRecursively(
+    Instruction* inst, Loop* loop, 
+    std::set<Instruction*>& visited, 
+    std::set<Instruction*>& currentPath) {
+  
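+  if (DEBUG && visited.size() < 10) { // limit debug output
+    std::cout << "        Analyzing instruction " << inst->getName() << " (" << inst->getKindString() << ")" << std::endl;
+  }
+  
+  // Avoid infinite recursion
+  if (currentPath.count(inst) > 0) {
+    // A cyclic dependency is normal for induction variables; keep analyzing the other paths
+    if (DEBUG && visited.size() < 10) {
+      std::cout << "          Cyclic dependency found, continuing with other paths" << std::endl;
+    }
+    return true; // a cycle by itself is not an escape point
+  }
+  
+  if (visited.count(inst) > 0) {
+    // This instruction has already been analyzed
+    return true; // any earlier failure would already have aborted the whole analysis
+  }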
+  
+  visited.insert(inst);
+  currentPath.insert(inst);
+  
+  // 1. Side effects make the instruction an escape point
+  if (sideEffectAnalysis && sideEffectAnalysis->hasSideEffect(inst)) {
+    if (DEBUG && visited.size() < 10) {
+      std::cout << "          Instruction has side effects: escape point" << std::endl;
+    }
+    currentPath.erase(inst);
+    return false; // an instruction with side effects is an escape point
+  }
+  
+  // 1.5. Special case: control-flow instructions are never dead code
+  auto instKind = inst->getKind();
+  if (instKind == Instruction::Kind::kCondBr || 
+      instKind == Instruction::Kind::kBr ||
+      instKind == Instruction::Kind::kReturn) {
+    if (DEBUG && visited.size() < 10) {
+      std::cout << "          Control-flow instruction: escape point" << std::endl;
+    }
+    currentPath.erase(inst);
+    return false; // a control-flow instruction is an escape point
+  }
+  
+  // 2. 检查指令的所有使用
+  bool allUsesAreDead = true;
+  for (auto use : inst->getUses()) {
+    auto user = use->getUser();
+    auto* userInst = dynamic_cast<Instruction*>(user);
+    
+    if (!userInst) {
+      // Used by a non-instruction (e.g. a function return value): escape point
+      if (DEBUG && visited.size() < 10) {
+        std::cout << "          Used by a non-instruction: escape point" << std::endl;
+      }
+      allUsesAreDead = false;
+      break;
+    }
+    
+    // A use outside the loop is an escape point
+    if (!loop->contains(userInst->getParent())) {
+      if (DEBUG && visited.size() < 10) {
+        std::cout << "          Used outside the loop by " << userInst->getName() << ": escape point" << std::endl;
+      }
+      allUsesAreDead = false;
+      break;
+    }
+    
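+    // Special case: the user feeds a loop exit condition.
+    // Induction variables that drive an exit condition need more careful handling.
+    if (isUsedInLoopExitCondition(userInst, loop)) {
+      // Such induction variables should normally not be eliminated,
+      // unless the whole loop can be proven useless (which needs a more involved analysis)
+      if (DEBUG && visited.size() < 10) {
+        std::cout << "          Used in a loop exit condition: escape point (avoid breaking loop semantics)" << std::endl;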
+      }
+      allUsesAreDead = false;
+      break;
+    }
+    
+    // Recurse into the user's own use chain
+    if (!isInstructionUseChainDeadRecursively(userInst, loop, visited, currentPath)) {
+      allUsesAreDead = false;
+      break; // escape point found, no need to continue
+    }
+  }
+  
+  currentPath.erase(inst);
+  
+  if (allUsesAreDead && DEBUG && visited.size() < 10) {
+    std::cout << "          All uses of instruction " << inst->getName() << " are dead code" << std::endl;
+  }
+  
+  return allUsesAreDead;
+}
+
+// Check whether the loop has side effects
+bool InductionVariableEliminationContext::loopHasSideEffects(Loop* loop) {
+  // Walk every instruction in the loop and check for side effects
+  for (BasicBlock* bb : loop->getBlocks()) {
+    for (auto& inst : bb->getInstructions()) {
+      Instruction* instPtr = inst.get();
+      
+      // Use the side-effect analysis when available
+      if (sideEffectAnalysis && sideEffectAnalysis->hasSideEffect(instPtr)) {
+        if (DEBUG) {
+          std::cout << "          Found an instruction with side effects in the loop: " << instPtr->getName() << std::endl;
+        }
+        return true;
+      }
+      
+      // Without side-effect analysis, fall back to a conservative check
+      if (!sideEffectAnalysis) {
+        auto kind = instPtr->getKind();
+        // These instruction kinds usually have side effects
+        if (kind == Instruction::Kind::kCall ||
+            kind == Instruction::Kind::kStore ||
+            kind == Instruction::Kind::kReturn) {
+          if (DEBUG) {
+            std::cout << "          Found an instruction with potential side effects in the loop: " << instPtr->getName() << std::endl;
+          }
+          return true;
+        }
+      }
+    }
+  }
+  
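+  // Important fix: check whether this loop is the outer loop of a nest.
+  // A loop that contains other loops is treated as potentially having side effects (the loop itself is skipped).
+  for (const auto& loop_ptr : loopAnalysis->getAllLoops()) {
+    Loop* otherLoop = loop_ptr.get();
+    if (otherLoop != loop && loopAnalysis->getLowestCommonAncestor(otherLoop, loop) == loop) {
+      if (DEBUG) {
+        std::cout << "          Loop " << loop->getName() << " is the outer loop of another loop, treated as having side effects" << std::endl;
+      }
+      return true; // outer loops are treated as having side effects
+    }
+    // if (otherLoop != loop && loop->contains(otherLoop->getHeader())) {
+    //   if (DEBUG) {
+    //     std::cout << "          Loop " << loop->getName() << " contains sub-loop " << otherLoop->getName() << ", treated as having side effects" << std::endl;
+    //   }
+    //   return true; // an outer loop that contains sub-loops is treated as having side effects
+    // }
+  }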
+  
+  if (DEBUG) {
+    std::cout << "          Loop " << loop->getName() << " has no side effects" << std::endl;
+  }
+  return false; // the loop has no side effects
+}
+
+// Check whether an instruction is used in a loop exit condition
+bool InductionVariableEliminationContext::isUsedInLoopExitCondition(Instruction* inst, Loop* loop) {
+  // Look at the terminator of every exiting block of the loop
+  for (BasicBlock* exitingBB : loop->getExitingBlocks()) {
+    auto terminatorIt = exitingBB->terminator();
+    if (terminatorIt != exitingBB->end()) {
+      Instruction* terminator = terminatorIt->get();
+      if (terminator) {
+        // Check the terminator's operands
+        for (size_t i = 0; i < terminator->getNumOperands(); ++i) {
+          if (terminator->getOperand(i) == inst) {
+            if (DEBUG) {
+              std::cout << "          Instruction " << inst->getName() << " is used in a loop exit condition" << std::endl;
+            }
+            return true;
+          }
+        }
+        
+        // For conditional branches, also check the operands of the condition instruction
+        if (terminator->getKind() == Instruction::Kind::kCondBr) {
+          auto* condBr = dynamic_cast<CondBrInst*>(terminator);
+          if (condBr) {
+            Value* condition = condBr->getCondition();
+            if (condition == inst) {
+              if (DEBUG) {
+                std::cout << "          Instruction " << inst->getName() << " is the loop condition" << std::endl;
+              }
+              return true;
+            }
+            
+            // Also check the operands of the condition instruction (e.g. a compare); only one level is inspected
+            auto* condInst = dynamic_cast<Instruction*>(condition);
+            if (condInst) {
+              for (size_t i = 0; i < condInst->getNumOperands(); ++i) {
+                if (condInst->getOperand(i) == inst) {
+                  if (DEBUG) {
+                    std::cout << "          Instruction " << inst->getName() << " is an operand of the loop condition" << std::endl;
+                  }
+                  return true;
+                }
+              }
+            }
+          }
+        }
+      }
+    }
+  }
+  
+  return false;
+}
+
+
+// Check whether an instruction's result has no effective use
+bool InductionVariableEliminationContext::isInstructionResultUnused(Instruction* inst, Loop* loop) {
+  // Check every use of the instruction
+  if (inst->getUses().empty()) {
+    return true; // no uses at all, so definitely unused
+  }
+  
+  for (auto use : inst->getUses()) {
+    auto user = use->getUser();
+    auto* userInst = dynamic_cast<Instruction*>(user);
+    
+    if (!userInst) {
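+      return false; // used by a non-instruction, count it as an effective use
+    }
+    
+    // A use outside the loop counts as an effective use
+    if (!loop->contains(userInst->getParent())) {
+      return false;
+    }
+    
+    // Recursively check whether the user of this result is itself dead code.
+    // The recursion depth is capped to avoid infinite recursion.
+    if (!isInstructionEffectivelyDead(userInst, loop, 3)) {
+      return false; // found an effective use
+    }
+  }
+  
+  return true; // every use is ineffective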
+}
+
+// Check whether a store writes to a dead location (using alias analysis)
+bool InductionVariableEliminationContext::isStoreToDeadLocation(StoreInst* store, Loop* loop) {
+  if (!aliasAnalysis) {
+    return false; // no alias analysis available, conservatively return false
+  }
+  
+  Value* storePtr = store->getPointer();
+  
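+  // Check whether the store targets a local temporary that is never read outside the loop
+  const MemoryLocation* memLoc = aliasAnalysis->getMemoryLocation(storePtr);
+  if (!memLoc) {
+    return false; // memory location cannot be determined
+  }
+  
+  // Local array that is only accessed inside the loop
+  if (memLoc->isLocalArray) {
+    // Check whether this location is read outside the loop
+    for (auto* accessInst : memLoc->accessInsts) {
+      if (accessInst->getKind() == Instruction::Kind::kLoad) {
+        if (!loop->contains(accessInst->getParent())) {
+          return false; // read outside the loop, so not a dead store
+        }
+      }
+    }
+    
+    if (DEBUG) {
+      std::cout << "            Store targets a local array that is only accessed inside the loop" << std::endl;
+    }
+    return true; // store to a local array that is only accessed inside the loop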
+  }
+  
+  return false;
+}
+
+// Check whether an instruction is effectively dead code (with a recursion depth limit)
+bool InductionVariableEliminationContext::isInstructionEffectivelyDead(Instruction* inst, Loop* loop, int maxDepth) {
+  if (maxDepth <= 0) {
+    return false; // depth limit reached, conservatively return false
+  }
+  
+  // Use the side-effect analysis
+  if (sideEffectAnalysis && sideEffectAnalysis->hasSideEffect(inst)) {
+    return false; // an instruction with side effects is not dead code
+  }
+  
+  // Handle special instruction kinds
+  switch (inst->getKind()) {
+    case Instruction::Kind::kStore:
+      // A store may be a dead store
+      return isStoreToDeadLocation(dynamic_cast<StoreInst*>(inst), loop);
+    
+    case Instruction::Kind::kCall:
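+      // Calls usually have side effects; without side-effect analysis, conservatively keep them
+      if (!sideEffectAnalysis) {
+        return false;
+      }
+      break; // side-effect-free call (checked above): still subject to the use checks below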
+    
+    case Instruction::Kind::kReturn:
+    case Instruction::Kind::kBr:
+    case Instruction::Kind::kCondBr:
+      // Control-flow instructions are never dead code
+      return false;
+    
+    default:
+      // For all other instructions, decide based on their uses
+      break;
+  }
+  
+  // Check the instruction's uses
+  if (inst->getUses().empty()) {
+    return true; // a pure instruction with no uses is dead code
+  }
+  
+  // Recursively check every use
+  for (auto use : inst->getUses()) {
+    auto user = use->getUser();
+    auto* userInst = dynamic_cast<Instruction*>(user);
+    
+    if (!userInst) {
+      return false; // used by a non-instruction
+    }
+    
+    if (!loop->contains(userInst->getParent())) {
+      return false; // used outside the loop
+    }
+    
+    // Recurse into the user
+    if (!isInstructionEffectivelyDead(userInst, loop, maxDepth - 1)) {
+      return false; // found an effective use
+    }
+  }
+  
+  return true; // every use is dead code
+}
+
+// Kept for compatibility; now delegates to the enhanced dead-code analysis
+bool InductionVariableEliminationContext::isInstructionDeadOrInternalOnly(Instruction* inst, Loop* loop) {
+  return isInstructionEffectivelyDead(inst, loop, 5);
+}
+
+// Check whether a store is followed by a load of the same location
+bool InductionVariableEliminationContext::hasSubsequentLoad(StoreInst* store, Loop* loop) {
+  if (!aliasAnalysis) {
+    // No alias analysis available: conservatively assume a later read exists
+    return true;
+  }
+  
+  Value* storePtr = store->getPointer();
+  const MemoryLocation* storeLoc = aliasAnalysis->getMemoryLocation(storePtr);
+  
+  if (!storeLoc) {
+    // Memory location cannot be determined: be conservative
+    return true;
+  }
+  
+  // Look for loads of the same location inside the loop and in its exit blocks
+  std::vector<BasicBlock*> blocksToCheck;
+  
+  // Add every basic block of the loop
+  for (auto* bb : loop->getBlocks()) {
+    blocksToCheck.push_back(bb);
+  }
+  
+  // Add the loop's exit blocks
+  auto exitBlocks = loop->getExitBlocks();
+  for (auto* exitBB : exitBlocks) {
+    blocksToCheck.push_back(exitBB);
+  }
+  
+  // Search for loads
+  for (auto* bb : blocksToCheck) {
+    for (auto& inst : bb->getInstructions()) {
+      if (inst->getKind() == Instruction::Kind::kLoad) {
+        LoadInst* loadInst = static_cast<LoadInst*>(inst.get());
+        Value* loadPtr = loadInst->getPointer();
+        const MemoryLocation* loadLoc = aliasAnalysis->getMemoryLocation(loadPtr);
+        
+        if (loadLoc && aliasAnalysis->queryAlias(storePtr, loadPtr) != AliasType::NO_ALIAS) {
+          // Found a load that may read the same location
+          if (DEBUG) {
+            std::cout << "            Found a subsequent load: " << loadInst->getName() << std::endl;
+          }
+          return true;
+        }
+      }
+    }
+  }
+  
+  // Check for indirect accesses through function calls
+  for (auto* bb : blocksToCheck) {
+    for (auto& inst : bb->getInstructions()) {
+      if (inst->getKind() == Instruction::Kind::kCall) {
+        CallInst* callInst = static_cast<CallInst*>(inst.get());
+        if (callInst && sideEffectAnalysis && sideEffectAnalysis->hasSideEffect(callInst)) {
+          // The call may read the memory indirectly
+          if (DEBUG) {
+            std::cout << "            Function call may read the memory: " << callInst->getName() << std::endl;
+          }
+          return true;
+        }
+      }
+    }
+  }
+  
+  if (DEBUG) {
+    std::cout << "            No subsequent load found" << std::endl;
+  }
+  return false; // no later read found
+}
+
+// Check whether an instruction is used outside the loop
+bool InductionVariableEliminationContext::hasUsageOutsideLoop(Instruction* inst, Loop* loop) {
+  for (auto use : inst->getUses()) {
+    auto user = use->getUser();
+    auto* userInst = dynamic_cast<Instruction*>(user);
+    
+    if (!userInst) {
+      // Used by a non-instruction, which may be outside the loop
+      return true;
+    }
+    
+    if (!loop->contains(userInst->getParent())) {
+      // Used outside the loop
+      if (DEBUG) {
+        std::cout << "            Instruction " << inst->getName() << " is used outside the loop by " 
+                  << userInst->getName() << std::endl;
+      }
+      return true;
+    }
+  }
+  
+  return false; // no use outside the loop
+}
+
+// Check whether a store is followed by a load outside the loop
+bool InductionVariableEliminationContext::hasSubsequentLoadOutsideLoop(StoreInst* store, Loop* loop) {
+  if (!aliasAnalysis) {
+    // No alias analysis available: conservatively assume a later read exists
+    return true;
+  }
+  
+  Value* storePtr = store->getPointer();
+  
+  // Check the loop's exit blocks and their successors
+  auto exitBlocks = loop->getExitBlocks();
+  std::set<BasicBlock*> visitedBlocks;
+  
+  for (auto* exitBB : exitBlocks) {
+    if (hasLoadInSubtree(exitBB, storePtr, visitedBlocks)) {
+      if (DEBUG) {
+        std::cout << "            Found a subsequent load outside the loop" << std::endl;
+      }
+      return true;
+    }
+  }
+  
+  return false; // no later read found outside the loop
+}
+
+// Recursively check whether any block reachable from bb loads from the given location
+bool InductionVariableEliminationContext::hasLoadInSubtree(BasicBlock* bb, Value* ptr, std::set<BasicBlock*>& visited) {
+  if (visited.count(bb) > 0) {
+    return false; // already visited, avoid looping forever
+  }
+  visited.insert(bb);
+  
+  // Check the instructions of the current basic block
+  for (auto& inst : bb->getInstructions()) {
+    if (inst->getKind() == Instruction::Kind::kLoad) {
+      LoadInst* loadInst = static_cast<LoadInst*>(inst.get());
+      if (aliasAnalysis && aliasAnalysis->queryAlias(ptr, loadInst->getPointer()) != AliasType::NO_ALIAS) {
+        return true; // found a load of the same or an aliasing location
+      }
+    } else if (inst->getKind() == Instruction::Kind::kCall) {
+      // A call may read the memory indirectly
+      CallInst* callInst = static_cast<CallInst*>(inst.get());
+      if (sideEffectAnalysis && sideEffectAnalysis->hasSideEffect(callInst)) {
+        return true; // conservatively assume the call may read the memory
+      }
+    }
+  }
+  
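+  // Recurse into successor blocks (depth-limited to avoid an excessive search)
+  static int searchDepth = 0;
+  if (searchDepth < 10) { // cap the search depth; blocks beyond the cap are not examined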
+    searchDepth++;
+    for (auto* succ : bb->getSuccessors()) {
+      if (hasLoadInSubtree(succ, ptr, visited)) {
+        searchDepth--;
+        return true;
+      }
+    }
+    searchDepth--;
+  }
+  
+  return false;
+}
+
+std::vector<Instruction*> InductionVariableEliminationContext::collectRelatedInstructions(
+    PhiInst* phiInst, Loop* loop) {
+  std::vector<Instruction*> relatedInsts;
+  
+  // Collect the instructions inside the loop that directly use this induction variable
+  for (auto use : phiInst->getUses()) {
+    auto user = use->getUser();
+    auto* userInst = dynamic_cast<Instruction*>(user);
+    
+    if (userInst && loop->contains(userInst->getParent())) {
+      relatedInsts.push_back(userInst);
+    }
+  }
+  
+  return relatedInsts;
+}
+
+void InductionVariableEliminationContext::analyzeSafetyForElimination() {
+  if (DEBUG) {
+    std::cout << "  === Phase 2: Analyzing Safety for Elimination ===" << std::endl;
+  }
+
+  // Check elimination safety for each dead induction variable
+  for (auto& deadIV : deadIVs) {
+    bool isSafe = isSafeToEliminate(deadIV.get());
+    deadIV->canEliminate = isSafe;
+    
+    if (DEBUG) {
+      std::cout << "    Dead IV " << deadIV->phiInst->getName() 
+                << ": " << (isSafe ? "SAFE" : "UNSAFE") << " to eliminate" << std::endl;
+    }
+  }
+
+  if (DEBUG) {
+    size_t safeCount = 0;
+    for (const auto& deadIV : deadIVs) {
+      if (deadIV->canEliminate) safeCount++;
+    }
+    std::cout << "  === End Phase 2: " << safeCount << " of " << deadIVs.size() 
+              << " variables are safe to eliminate ===" << std::endl;
+  }
+}
+
+bool InductionVariableEliminationContext::isSafeToEliminate(const DeadInductionVariable* deadIV) {
+  // 1. The induction variable's phi must be in the loop header
+  if (deadIV->phiInst->getParent() != deadIV->containingLoop->getHeader()) {
+    if (DEBUG) {
+      std::cout << "      Unsafe: phi not in loop header" << std::endl;
+    }
+    return false;
+  }
+  
+  // 2. All related instructions must be inside the loop
+  for (auto* inst : deadIV->relatedInsts) {
+    if (!deadIV->containingLoop->contains(inst->getParent())) {
+      if (DEBUG) {
+        std::cout << "      Unsafe: related instruction outside loop" << std::endl;
+      }
+      return false;
+    }
+  }
+  
+  // 3. No related instruction may have side effects
+  for (auto* inst : deadIV->relatedInsts) {
+    if (sideEffectAnalysis) {
+      // Use the side-effect analysis for a precise check
+      if (sideEffectAnalysis->hasSideEffect(inst)) {
+        if (DEBUG) {
+          std::cout << "      Unsafe: related instruction " << inst->getName() 
+                    << " has side effects" << std::endl;
+        }
+        return false;
+      }
+    } else {
+      // Without side-effect analysis, be conservative: only allow basic arithmetic
+      auto kind = inst->getKind();
+      if (kind != Instruction::Kind::kAdd && 
+          kind != Instruction::Kind::kSub &&
+          kind != Instruction::Kind::kMul) {
+        if (DEBUG) {
+          std::cout << "      Unsafe: related instruction may have side effects (conservative)" << std::endl;
+        }
+        return false;
+      }
+    }
+  }
+  
+  // 4. The phi must not feed the loop's exit condition
+  for (BasicBlock* exitingBB : deadIV->containingLoop->getExitingBlocks()) {
+    auto terminatorIt = exitingBB->terminator();
+    if (terminatorIt != exitingBB->end()) {
+      Instruction* terminator = terminatorIt->get();
+      if (terminator) {
+        for (size_t i = 0; i < terminator->getNumOperands(); ++i) {
+          if (terminator->getOperand(i) == deadIV->phiInst) {
+            if (DEBUG) {
+              std::cout << "      Unsafe: phi used in loop exit condition" << std::endl;
+            }
+            return false;
+          }
+        }
+      }
+    }
+  }
+  
+  return true;
+}
+
+bool InductionVariableEliminationContext::performInductionVariableElimination() {
+  if (DEBUG) {
+    std::cout << "  === Phase 3: Performing Induction Variable Elimination ===" << std::endl;
+  }
+
+  bool modified = false;
+
+  for (auto& deadIV : deadIVs) {
+    if (!deadIV->canEliminate) {
+      continue;
+    }
+
+    if (DEBUG) {
+      std::cout << "    Eliminating dead IV: " << deadIV->phiInst->getName() << std::endl;
+    }
+
+    if (eliminateDeadInductionVariable(deadIV.get())) {
+      if (DEBUG) {
+        std::cout << "      Successfully eliminated" << std::endl; // phi already deleted: do not read its name here
+      }
+      modified = true;
+    } else {
+      if (DEBUG) {
+        std::cout << "      Failed to eliminate: " << deadIV->phiInst->getName() << std::endl;
+      }
+    }
+  }
+
+  if (DEBUG) {
+    std::cout << "  === End Phase 3: " << (modified ? "Eliminations performed" : "No eliminations") << " ===" << std::endl;
+  }
+
+  return modified;
+}
+
+bool InductionVariableEliminationContext::eliminateDeadInductionVariable(DeadInductionVariable* deadIV) {
+  // 1. Delete all related instructions
+  for (auto* inst : deadIV->relatedInsts) {
+    auto* bb = inst->getParent();
+    auto it = bb->findInstIterator(inst);
+    if (it != bb->end()) {
+      // Log before deleting: inst must not be dereferenced once usedelete has removed it
+      if (DEBUG) {
+        std::cout << "        Removing related instruction: " << inst->getName() << std::endl;
+      }
+      SysYIROptUtils::usedelete(it);
+    }
+  }
+
+  // 2. Delete the phi instruction
+  auto* bb = deadIV->phiInst->getParent();
+  auto it = bb->findInstIterator(deadIV->phiInst);
+  if (it != bb->end()) {
+    // Log before deleting: the phi must not be dereferenced once usedelete has removed it
+    if (DEBUG) {
+      std::cout << "        Removing phi instruction: " << deadIV->phiInst->getName() << std::endl;
+    }
+    SysYIROptUtils::usedelete(it);
+    return true;
+  }
+
+  return false;
+}
+
+void InductionVariableEliminationContext::printDebugInfo() {
+  if (!DEBUG) return;
+
+  std::cout << "\n=== Induction Variable Elimination Summary ===" << std::endl;
+  std::cout << "Total dead IVs found: " << deadIVs.size() << std::endl;
+  
+  size_t eliminatedCount = 0;
+  for (auto& [loop, loopDeadIVs] : loopToDeadIVs) {
+    size_t loopEliminatedCount = 0;
+    for (auto* deadIV : loopDeadIVs) {
+      if (deadIV->canEliminate) {
+        loopEliminatedCount++;
+        eliminatedCount++;
+      }
+    }
+    
+    if (loopEliminatedCount > 0) {
+      std::cout << "Loop " << loop->getName() << ": " << loopEliminatedCount 
+                << " of " << loopDeadIVs.size() << " IVs eliminated" << std::endl;
+    }
+  }
+  
+  std::cout << "Total eliminated: " << eliminatedCount << " of " << deadIVs.size() << std::endl;
+  std::cout << "=============================================" << std::endl;
+}
+
+} // namespace sysy
--- a/src/midend/Pass/Optimize/LICM.cpp
+++ b/src/midend/Pass/Optimize/LICM.cpp
@ -0,0 +1,264 @@
+#include "LICM.h"
+#include "IR.h"
+
+extern int DEBUG;
+
+namespace sysy {
+
+void *LICM::ID = (void *)&LICM::ID;
+
+bool LICMContext::run() { return hoistInstructions(); }
+
+bool LICMContext::hoistInstructions() {
+  bool changed = false;
+  BasicBlock *preheader = loop->getPreHeader();
+  if (!preheader || !chars)
+    return false;
+
+  // 1. Collect all hoistable instructions first
+  std::unordered_set<Instruction *> workSet(chars->invariantInsts.begin(), chars->invariantInsts.end());
+
+  if (DEBUG) {
+    std::cout << "LICM: Found " << workSet.size() << " candidate invariant instructions to hoist:" << std::endl;
+    for (auto *inst : workSet) {
+      std::cout << "  - " << inst->getName() << " (kind: " << static_cast<int>(inst->getKind()) 
+                << ", in BB: " << inst->getParent()->getName() << ")" << std::endl;
+    }
+  }
+
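+  // 2. Compute each instruction's in-degree: the number of its operands that are themselves in the work set
+  std::unordered_map<Instruction *, int> indegree;
+  std::unordered_map<Instruction *, std::vector<Instruction *>> dependencies; // inst -> instructions it depends on
+  std::unordered_map<Instruction *, std::vector<Instruction *>> dependents;   // inst -> instructions that depend on it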
+  
+  for (auto *inst : workSet) {
+    indegree[inst] = 0;
+    dependencies[inst] = {};
+    dependents[inst] = {};
+  }
+  
+  if (DEBUG) {
+    std::cout << "LICM: Analyzing dependencies between invariant instructions..." << std::endl;
+  }
+  
+  for (auto *inst : workSet) {
+    for (size_t i = 0; i < inst->getNumOperands(); ++i) {
+      if (auto *dep = dynamic_cast<Instruction *>(inst->getOperand(i))) {
+        if (workSet.count(dep)) {
+          indegree[inst]++;
+          dependencies[inst].push_back(dep);
+          dependents[dep].push_back(inst);
+          
+          if (DEBUG) {
+            std::cout << "  Dependency: " << inst->getName() << " depends on " << dep->getName() << std::endl;
+          }
+        }
+      }
+    }
+  }
+
+  if (DEBUG) {
+    std::cout << "LICM: Initial indegree analysis:" << std::endl;
+    for (auto &[inst, deg] : indegree) {
+      std::cout << "  " << inst->getName() << ": indegree=" << deg;
+      if (deg > 0) {
+        std::cout << ", depends on: ";
+        for (auto *dep : dependencies[inst]) {
+          std::cout << dep->getName() << " ";
+        }
+      }
+      std::cout << std::endl;
+    }
+  }
+
+  // 3. Kahn's topological sort
+  std::vector<Instruction *> sorted;
+  std::queue<Instruction *> q;
+  
+  if (DEBUG) {
+    std::cout << "LICM: Starting topological sort..." << std::endl;
+  }
+  
+  for (auto &[inst, deg] : indegree) {
+    if (deg == 0) {
+      q.push(inst);
+      if (DEBUG) {
+        std::cout << "  Initial zero-indegree instruction: " << inst->getName() << std::endl;
+      }
+    }
+  }
+  
+  int sortStep = 0;
+  while (!q.empty()) {
+    auto *inst = q.front();
+    q.pop();
+    sorted.push_back(inst);
+    
+    if (DEBUG) {
+      std::cout << "  Step " << (++sortStep) << ": Processing " << inst->getName() << std::endl;
+    }
+    
+    if (DEBUG) {
+      std::cout << "    Reducing indegree of dependents of " << inst->getName() << std::endl;
+    }
+    
+    // Standard Kahn step: once an instruction is emitted, decrement the in-degree of every instruction that uses it (its dependents)
+    for (auto *dependent : dependents[inst]) {
+      indegree[dependent]--;
+      if (DEBUG) {
+        std::cout << "      Reducing indegree of " << dependent->getName() << " to " << indegree[dependent] << std::endl;
+      }
+      if (indegree[dependent] == 0) {
+        q.push(dependent);
+        if (DEBUG) {
+          std::cout << "      Adding " << dependent->getName() << " to queue (indegree=0)" << std::endl;
+        }
+      }
+    }
+  }
+
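+  // Verify that every instruction was sorted; if not, report the problem.
+  // A partial sort means a dependency cycle (or some other inconsistency) prevented the sort from completing.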
+  if (sorted.size() != workSet.size()) {
+    if (DEBUG) {
+      std::cout << "LICM: Topological sort failed! Sorted " << sorted.size() 
+                << " instructions out of " << workSet.size() << " total." << std::endl;
+      
+      // Find the unsorted instructions (the ones stuck in a dependency cycle)
+      std::unordered_set<Instruction *> remaining;
+      for (auto *inst : workSet) {
+        bool found = false;
+        for (auto *sortedInst : sorted) {
+          if (inst == sortedInst) {
+            found = true;
+            break;
+          }
+        }
+        if (!found) {
+          remaining.insert(inst);
+        }
+      }
+      
+      std::cout << "LICM: Instructions involved in dependency cycle:" << std::endl;
+      for (auto *inst : remaining) {
+        std::cout << "  - " << inst->getName() << " (indegree=" << indegree[inst] << ")" << std::endl;
+        std::cout << "    Dependencies within cycle: ";
+        for (auto *dep : dependencies[inst]) {
+          if (remaining.count(dep)) {
+            std::cout << dep->getName() << " ";
+          }
+        }
+        std::cout << std::endl;
+        std::cout << "    Dependents within cycle: ";
+        for (auto *dependent : dependents[inst]) {
+          if (remaining.count(dependent)) {
+            std::cout << dependent->getName() << " ";
+          }
+        }
+        std::cout << std::endl;
+      }
+      
+      // Try to trace one concrete cycle
+      std::cout << "LICM: Attempting to trace a dependency cycle:" << std::endl;
+      if (!remaining.empty()) {
+        auto *start = *remaining.begin();
+        std::unordered_set<Instruction *> visited;
+        std::vector<Instruction *> path;
+        
+        std::function<bool(Instruction *)> findCycle = [&](Instruction *current) -> bool {
+          if (visited.count(current)) {
+            // Cycle found
+            auto it = std::find(path.begin(), path.end(), current);
+            if (it != path.end()) {
+              std::cout << "  Cycle found: ";
+              for (auto cycleIt = it; cycleIt != path.end(); ++cycleIt) {
+                std::cout << (*cycleIt)->getName() << " -> ";
+              }
+              std::cout << current->getName() << std::endl;
+              return true;
+            }
+            return false;
+          }
+          
+          visited.insert(current);
+          path.push_back(current);
+          
+          for (auto *dep : dependencies[current]) {
+            if (remaining.count(dep)) {
+              if (findCycle(dep)) {
+                return true;
+              }
+            }
+          }
+          
+          path.pop_back();
+          return false;
+        };
+        
+        findCycle(start);
+      }
+    }
+    return false;
+  }
+
+  // 4. Hoist in topological order
+  if (DEBUG) {
+    std::cout << "LICM: Successfully completed topological sort. Hoisting instructions in order:" << std::endl;
+  }
+  
+  for (auto *inst : sorted) {
+    if (!inst)
+      continue;
+    BasicBlock *parent = inst->getParent();
+    if (parent && loop->contains(parent)) {
+      if (DEBUG) {
+        std::cout << "  Hoisting " << inst->getName() << " from " << parent->getName() 
+                  << " to preheader " << preheader->getName() << std::endl;
+      }
+      auto sourcePos = parent->findInstIterator(inst);
+      auto targetPos = preheader->terminator();
+      parent->moveInst(sourcePos, targetPos, preheader);
+      changed = true;
+    }
+  }
+  
+  if (DEBUG && changed) {
+    std::cout << "LICM: Successfully hoisted " << sorted.size() << " invariant instructions" << std::endl;
+  }
+  
+  return changed;
+}
+// ---- LICM Pass Implementation ----
+
+bool LICM::runOnFunction(Function *F, AnalysisManager &AM) {
+  auto *loopAnalysis = AM.getAnalysisResult<LoopAnalysisResult, LoopAnalysisPass>(F);
+  auto *loopCharsResult = AM.getAnalysisResult<LoopCharacteristicsResult, LoopCharacteristicsPass>(F);
+  if (!loopAnalysis || !loopCharsResult)
+    return false;
+
+  bool changed = false;
+  // Process every loop in the function
+  for (const auto &loop_ptr : loopAnalysis->getAllLoops()) {
+    Loop *loop = loop_ptr.get();
+    if (DEBUG) {
+      std::cout << "LICM: Processing loop in function " << F->getName() << ": " << loop->getName() << std::endl;
+    }
+    const LoopCharacteristics *chars = loopCharsResult->getCharacteristics(loop);
+    if (!chars || !loop->getPreHeader())
+      continue; // skip loops without analysis results or without a preheader
+    LICMContext ctx(F, loop, builder, chars);
+    changed |= ctx.run();
+  }
+  return changed;
+}
+
+void LICM::getAnalysisUsage(std::set<void *> &analysisDependencies, std::set<void *> &analysisInvalidations) const {
+
+  analysisDependencies.insert(&LoopAnalysisPass::ID);
+  analysisDependencies.insert(&LoopCharacteristicsPass::ID);
+
+  analysisInvalidations.insert(&LoopCharacteristicsPass::ID);
+  analysisInvalidations.insert(&LivenessAnalysisPass::ID);
+}
+
+} // namespace sysy
--- a/src/midend/Pass/Optimize/LoopNormalization.cpp
+++ b/src/midend/Pass/Optimize/LoopNormalization.cpp
@ -0,0 +1,528 @@
+#include "LoopNormalization.h"
+#include "Dom.h"
+#include "Loop.h"
+#include "SysYIROptUtils.h"
+#include <iostream>
+#include <algorithm>
+#include <sstream>
+
+// Use the global debug switch
+extern int DEBUG;
+
+namespace sysy {
+
+// Unique ID of this pass
+void *LoopNormalizationPass::ID = (void *)&LoopNormalizationPass::ID;
+
+bool LoopNormalizationPass::runOnFunction(Function *F, AnalysisManager &AM) {
+  if (F->getBasicBlocks().empty()) {
+    return false; // empty function
+  }
+
+  if (DEBUG)
+    std::cout << "Running LoopNormalizationPass on function: " << F->getName() << std::endl;
+
+  // Fetch and cache all required analysis results
+  loopAnalysis = AM.getAnalysisResult<LoopAnalysisResult, LoopAnalysisPass>(F);
+  if (!loopAnalysis || !loopAnalysis->hasLoops()) {
+    if (DEBUG)
+      std::cout << "No loops found in function " << F->getName() << ", skipping normalization" << std::endl;
+    return false; // no loops to normalize
+  }
+
+  domTree = AM.getAnalysisResult<DominatorTree, DominatorTreeAnalysisPass>(F);
+  
+  if (!domTree) {
+    std::cerr << "Error: DominatorTree not available for function " << F->getName() << std::endl;
+    return false;
+  }
+
+  // Reset the statistics
+  stats = NormalizationStats();
+  
+  bool modified = false;
+  const auto& allLoops = loopAnalysis->getAllLoops();
+  stats.totalLoops = allLoops.size();
+
+  if (DEBUG) {
+    std::cout << "Found " << stats.totalLoops << " loops to analyze for normalization" << std::endl;
+  }
+
+  // Process loops from outermost to innermost so that outer loops are normalized first
+  std::vector<Loop*> sortedLoops;
+  for (const auto& loop_ptr : allLoops) {
+    sortedLoops.push_back(loop_ptr.get());
+  }
+  
+  std::sort(sortedLoops.begin(), sortedLoops.end(), [](Loop* a, Loop* b) {
+    return a->getLoopDepth() < b->getLoopDepth(); // ascending depth
+  });
+
+  // Normalize the loops one by one
+  for (Loop* loop : sortedLoops) {
+    if (needsPreheader(loop)) {
+      stats.loopsNeedingPreheader++;
+      
+      if (DEBUG) {
+        std::cout << "  Loop " << loop->getName() << " needs preheader (depth=" 
+                  << loop->getLoopDepth() << ")" << std::endl;
+      }
+      
+      if (normalizeLoop(loop)) {
+        modified = true;
+        stats.loopsNormalized++;
+        
+        // Validate the normalization result
+        if (!validateNormalization(loop)) {
+          std::cerr << "Warning: Loop normalization validation failed for loop " 
+                    << loop->getName() << std::endl;
+        }
+      }
+    } else {
+      if (DEBUG) {
+        auto* preheader = getExistingPreheader(loop);
+        if (preheader) {
+          std::cout << "  Loop " << loop->getName() << " already has preheader: " 
+                    << preheader->getName() << std::endl;
+        }
+      }
+    }
+  }
+
+  if (DEBUG && modified) {
+    printStats(F);
+  }
+
+  return modified;
+}
+
+bool LoopNormalizationPass::normalizeLoop(Loop* loop) {
+  if (DEBUG)
+    std::cout << "    Normalizing loop: " << loop->getName() << std::endl;
+
+  // Create the preheader
+  BasicBlock* preheader = createPreheaderForLoop(loop);
+  if (!preheader) {
+    if (DEBUG)
+      std::cout << "    Failed to create preheader for loop " << loop->getName() << std::endl;
+    return false;
+  }
+
+  stats.preheadersCreated++;
+  
+  if (DEBUG) {
+    std::cout << "    Successfully created preheader " << preheader->getName() 
+              << " for loop " << loop->getName() << std::endl;
+  }
+
+  return true;
+}
+
+BasicBlock* LoopNormalizationPass::createPreheaderForLoop(Loop* loop) {
+  BasicBlock* header = loop->getHeader();
+  if (!header) {
+    if (DEBUG)
+      std::cerr << "    Error: Loop has no header block" << std::endl;
+    return nullptr;
+  }
+
+  // Collect the header's predecessors that lie outside the loop
+  std::vector<BasicBlock*> externalPreds = getExternalPredecessors(loop);
+  if (externalPreds.empty()) {
+    if (DEBUG)
+      std::cout << "    Loop " << loop->getName() << " has no external predecessors" << std::endl;
+    return nullptr;
+  }
+
+  if (DEBUG) {
+    std::cout << "    Found " << externalPreds.size() << " external predecessors for loop " 
+              << loop->getName() << std::endl;
+    for (auto* pred : externalPreds) {
+      std::cout << "      External pred: " << pred->getName() << std::endl;
+    }
+  }
+
+  // Generate a name for the preheader
+  std::string preheaderName = generatePreheaderName(loop);
+  
+  // Create the new preheader block
+  Function* parentFunction = header->getParent();
+  BasicBlock* preheader = parentFunction->addBasicBlock(preheaderName, header);
+  
+  if (!preheader) {
+    if (DEBUG)
+      std::cerr << "    Error: Failed to create basic block " << preheaderName << std::endl;
+    return nullptr;
+  }
+
+  // Terminate the preheader with an unconditional branch to the loop header
+  builder->setPosition(preheader, preheader->end());
+  UncondBrInst* br = builder->createUncondBrInst(header);
+
+  // Record the preheader -> header edge in the CFG
+  preheader->addSuccessor(header);
+  header->addPredecessor(preheader);
+
+  if (DEBUG) {
+    std::cout << "    Created preheader " << preheader->getName()
+              << " with unconditional branch to " << header->getName() << std::endl;
+  }
+  // Redirect the external predecessors to the new preheader
+  redirectExternalPredecessors(loop, preheader, header, externalPreds);
+
+  // Update the PHI nodes in the header
+  updatePhiNodesForPreheader(header, preheader, externalPreds);
+
+  // Update dominator relations (currently only logs; see updateDominatorRelations)
+  updateDominatorRelations(preheader, loop);
+
+  // Important: record the preheader on the Loop object so that
+  // later passes can retrieve it via loop->getPreHeader()
+  loop->setPreHeader(preheader);
+
+  if (DEBUG) {
+    std::cout << "    Updated loop object: preheader set to " << preheader->getName() << std::endl;
+  }
+
+  return preheader;
+}
+
+bool LoopNormalizationPass::needsPreheader(Loop* loop) {
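+  // No new preheader is needed if a suitable one already exists
+  if (getExistingPreheader(loop) != nullptr) {
+    return false;
+  }
+
+  // A loop without external predecessors does not need a preheader
+  std::vector<BasicBlock*> externalPreds = getExternalPredecessors(loop);
+  if (externalPreds.empty()) {
+    return false;
+  }
+
+  // Structural criteria:
+  // 1. With several external predecessors, a preheader must be created to merge them
+  // 2. If the single external predecessor is unsuitable as a preheader, a new one must be created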
+  return (externalPreds.size() > 1) || !isSuitableAsPreheader(externalPreds[0], loop);
+}
+
+BasicBlock* LoopNormalizationPass::getExistingPreheader(Loop* loop) {
+  BasicBlock* header = loop->getHeader();
+  if (!header) return nullptr;
+
+  std::vector<BasicBlock*> externalPreds = getExternalPredecessors(loop);
+  
+  // If there is exactly one external predecessor and it is suitable as a preheader, return it
+  if (externalPreds.size() == 1 && isSuitableAsPreheader(externalPreds[0], loop)) {
+    return externalPreds[0];
+  }
+
+  return nullptr;
+}
+
+void LoopNormalizationPass::updateDominatorRelations(BasicBlock* newBlock, Loop* loop) {
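+  // getAnalysisUsage declares that this pass invalidates the DominatorTree, so the
+  // PassManager marks the dominator tree result as invalid after this pass runs.
+  // Later passes that need it trigger a recomputation, so no manual update is needed here.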
+  
+  if (DEBUG) {
+    BasicBlock* header = loop->getHeader();
+    std::cout << "    DominatorTree marked for invalidation - new preheader " 
+              << newBlock->getName() << " will dominate " << header->getName() 
+              << " after recomputation by PassManager" << std::endl;
+  }
+}
+
+void LoopNormalizationPass::redirectExternalPredecessors(Loop* loop, BasicBlock* preheader, BasicBlock* header, 
+                                                         const std::vector<BasicBlock*>& externalPreds) {
+  // std::vector<BasicBlock*> externalPreds = getExternalPredecessors(loop);
+  
+  if (DEBUG) {
+    std::cout << "    Redirecting " << externalPreds.size() << " external predecessors" << std::endl;
+  }
+
+  for (BasicBlock* pred : externalPreds) {
+    // Fetch the predecessor's terminator instruction
+    auto termIt = pred->terminator();
+    if (termIt == pred->end()) continue;
+    
+    Instruction* terminator = termIt->get();
+    if (!terminator) continue;
+
+    // Retarget the branch
+    if (auto* br = dynamic_cast<UncondBrInst*>(terminator)) {
+      // Unconditional branch
+      if (br->getBlock() == header) {
+        if (DEBUG) {
+          std::cout << "      Updating unconditional branch from " << br->getBlock()->getName()
+                    << " to " << preheader->getName() << std::endl;
+        }
+        // Point the branch operand at the preheader
+        br->setOperand(0, preheader);
+        // Update the CFG edges
+        header->removePredecessor(pred);
+        preheader->addPredecessor(pred);
+        pred->removeSuccessor(header);
+        pred->addSuccessor(preheader);
+        
+      }
+    } else if (auto* condBr = dynamic_cast<CondBrInst*>(terminator)) {
+      // Conditional branch
+      bool updated = false;
+      if (condBr->getThenBlock() == header) {
+        condBr->setOperand(1, preheader);  // operand 1 is the then-target
+        updated = true;
+      }
+      if (condBr->getElseBlock() == header) {
+        condBr->setOperand(2, preheader);  // operand 2 is the else-target
+        updated = true;
+      }
+      if (updated) {
+        // Update the CFG edges
+        header->removePredecessor(pred);
+        preheader->addPredecessor(pred);
+        pred->removeSuccessor(header);
+        pred->addSuccessor(preheader);
+        
+        if (DEBUG) {
+          std::cout << "      Updated conditional branch from " << pred->getName() 
+                    << " to " << preheader->getName() << std::endl;
+        }
+      }
+    }
+  }
+}
+
+std::string LoopNormalizationPass::generatePreheaderName(Loop* loop) {
+  std::ostringstream oss;
+  oss << loop->getName() << "_preheader";
+  return oss.str();
+}
+
+bool LoopNormalizationPass::validateNormalization(Loop* loop) {
+  BasicBlock* header = loop->getHeader();
+  if (!header) return false;
+
+  // Check that the loop now has exactly one external predecessor
+  std::vector<BasicBlock*> externalPreds = getExternalPredecessors(loop);
+  if (externalPreds.size() != 1) {
+    if (DEBUG)
+      std::cout << "    Validation failed: Loop " << loop->getName() 
+                << " has " << externalPreds.size() << " external predecessors (expected 1)" << std::endl;
+    return false;
+  }
+
+  // Check that this external predecessor is suitable as a preheader
+  BasicBlock* preheader = externalPreds[0];
+  if (!isSuitableAsPreheader(preheader, loop)) {
+    if (DEBUG)
+      std::cout << "    Validation failed: External predecessor " << preheader->getName() 
+                << " is not suitable as preheader" << std::endl;
+    return false;
+  }
+
+  // Extra validation: check CFG connectivity in both directions
+  if (!preheader->hasSuccessor(header)) {
+    if (DEBUG)
+      std::cout << "    Validation failed: Preheader " << preheader->getName() 
+                << " is not connected to header " << header->getName() << std::endl;
+    return false;
+  }
+
+  if (!header->hasPredecessor(preheader)) {
+    if (DEBUG)
+      std::cout << "    Validation failed: Header " << header->getName() 
+                << " does not have preheader " << preheader->getName() << " as predecessor" << std::endl;
+    return false;
+  }
+
+  if (DEBUG)
+    std::cout << "    Validation passed for loop " << loop->getName() << std::endl;
+  
+  return true;
+}
+
+std::vector<BasicBlock*> LoopNormalizationPass::getExternalPredecessors(Loop* loop) {
+  std::vector<BasicBlock*> externalPreds;
+  BasicBlock* header = loop->getHeader();
+  if (!header) return externalPreds;
+
+  for (BasicBlock* pred : header->getPredecessors()) {
+    if (!loop->contains(pred)) {
+      externalPreds.push_back(pred);
+    }
+  }
+
+  return externalPreds;
+}
+
+bool LoopNormalizationPass::isSuitableAsPreheader(BasicBlock* block, Loop* loop) {
+  if (!block) return false;
+
+  // The block must have exactly one successor, and that successor must be the loop header
+  auto successors = block->getSuccessors();
+  if (successors.size() != 1) {
+    return false;
+  }
+
+  if (successors[0] != loop->getHeader()) {
+    return false;
+  }
+
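+  // Check that the block is not too complex:
+  // an ideal preheader contains little more than a simple branch
+  size_t instCount = 0;
+  for (const auto& inst : block->getInstructions()) {
+    instCount++;
+    // Too many instructions: the block is probably not a good preheader
+    if (instCount > 10) { // tunable threshold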
+      return false;
+    }
+  }
+
+  return true;
+}
+
+void LoopNormalizationPass::updatePhiNodesForPreheader(BasicBlock* header, BasicBlock* preheader,
+                                                      const std::vector<BasicBlock*>& oldPreds) {
+  if (DEBUG) {
+    std::cout << "    Updating PHI nodes in header " << header->getName() 
+              << " for new preheader " << preheader->getName() << std::endl;
+  }
+
+  std::vector<PhiInst*> phisToRemove; // PHI nodes scheduled for deletion
+  
+  for (auto& inst : header->getInstructions()) {
+    if (auto* phi = dynamic_cast<PhiInst*>(inst.get())) {
+      if (DEBUG) {
+        std::cout << "      Processing PHI node: " << phi->getName() << std::endl;
+      }
+
+      // Collect the values coming from external predecessors, keeping the original block-to-value mapping
+      std::map<BasicBlock*, Value*> externalValues;
+      for (BasicBlock* oldPred : oldPreds) {
+        Value* value = phi->getValfromBlk(oldPred);
+        if (value) {
+          externalValues[oldPred] = value;
+        }
+      }
+
+      // Update the PHI node
+      if (externalValues.size() > 1) {
+        // Several external predecessors: create a new PHI node in the preheader
+        builder->setPosition(preheader, preheader->getInstructions().begin());
+        
+        std::vector<Value*> values;
+        std::vector<BasicBlock*> blocks;
+        for (auto& [block, value] : externalValues) {
+          values.push_back(value);
+          blocks.push_back(block);
+        }
+        
+        PhiInst* newPhi = builder->createPhiInst(phi->getType(), values, blocks);
+        
+        // Remove the entries of all external predecessors
+        for (BasicBlock* oldPred : oldPreds) {
+          phi->removeIncomingBlock(oldPred);
+        }
+        
+        // Add the entry coming from the new preheader
+        phi->addIncoming(newPhi, preheader);
+        
+      } else if (externalValues.size() == 1) {
+        // Single external predecessor: remap its value directly
+        Value* value = externalValues.begin()->second;
+        
+        // Remove the old external-predecessor entries
+        for (BasicBlock* oldPred : oldPreds) {
+          phi->removeIncomingBlock(oldPred);
+        }
+        
+        // Add the entry coming from the new preheader
+        phi->addIncoming(value, preheader);
+        
+        // Check whether the PHI node is left with a single entry (from the preheader only)
+        if (phi->getNumIncomingValues() == 1) {
+          if (DEBUG) {
+            std::cout << "      PHI node " << phi->getName() 
+                      << " now has only one incoming value, scheduling for removal" << std::endl;
+          }
+          // Replace all uses with the single incoming value
+          Value* singleValue = phi->getIncomingValue(0u);
+          phi->replaceAllUsesWith(singleValue);
+          phisToRemove.push_back(phi);
+        }
+      } else {
+        // PHI node without external values: it only has in-loop edges
+        // and normally needs no change,
+        // but we still check whether it has a single entry
+        if (phi->getNumIncomingValues() == 1) {
+          if (DEBUG) {
+            std::cout << "      PHI node " << phi->getName() 
+                      << " has only one incoming value (no external), scheduling for removal" << std::endl;
+          }
+          // Replace all uses with the single incoming value
+          Value* singleValue = phi->getIncomingValue(0u);
+          phi->replaceAllUsesWith(singleValue);
+          phisToRemove.push_back(phi);
+        }
+      }
+      
+      if (DEBUG && std::find(phisToRemove.begin(), phisToRemove.end(), phi) == phisToRemove.end()) {
+        std::cout << "      Updated PHI node with " << externalValues.size() 
+                  << " external values, total incoming: " << phi->getNumIncomingValues() << std::endl;
+      }
+    }
+  }
+  
+  // Delete the PHI nodes scheduled for removal
+  for (PhiInst* phi : phisToRemove) {
+    if (DEBUG) {
+      std::cout << "      Removing redundant PHI node: " << phi->getName() << std::endl;
+    }
+    SysYIROptUtils::usedelete(phi);
+  }
+  
+  // Update statistics
+  stats.redundantPhisRemoved += phisToRemove.size();
+  
+  if (DEBUG && !phisToRemove.empty()) {
+    std::cout << "    Removed " << phisToRemove.size() << " redundant PHI nodes" << std::endl;
+  }
+}
+
+void LoopNormalizationPass::printStats(Function* F) {
+  std::cout << "\n--- Loop Normalization Statistics for Function: " << F->getName() << " ---" << std::endl;
+  std::cout << "Total loops analyzed: " << stats.totalLoops << std::endl;
+  std::cout << "Loops needing preheader: " << stats.loopsNeedingPreheader << std::endl;
+  std::cout << "Preheaders created: " << stats.preheadersCreated << std::endl;
+  std::cout << "Loops successfully normalized: " << stats.loopsNormalized << std::endl;
+  std::cout << "Redundant PHI nodes removed: " << stats.redundantPhisRemoved << std::endl;
+  
+  if (stats.totalLoops > 0) {
+    double normalizationRate = static_cast<double>(stats.loopsNormalized) / stats.totalLoops * 100.0;
+    std::cout << "Normalization rate: " << normalizationRate << "%" << std::endl;
+  }
+  
+  std::cout << "---------------------------------------------------------------" << std::endl;
+}
+
+void LoopNormalizationPass::getAnalysisUsage(std::set<void *> &analysisDependencies, 
+                                            std::set<void *> &analysisInvalidations) const {
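+  // Analyses that LoopNormalization depends on
+  analysisDependencies.insert(&LoopAnalysisPass::ID);              // loop structure analysis
+  analysisDependencies.insert(&DominatorTreeAnalysisPass::ID);     // dominator tree analysis
+
+  // LoopNormalization modifies the CFG, so it invalidates the following analyses
+  analysisInvalidations.insert(&DominatorTreeAnalysisPass::ID);    // the dominator tree must be recomputed
+  
+  // Note: the loop structure analysis is deliberately not invalidated, because:
+  // 1. Loop normalization only adds preheaders; it leaves the core loop structure (header, body, back edges) unchanged
+  // 2. The preheader of each Loop object is updated manually (via loop->setPreHeader())
+  // 3. Invalidating and recomputing the loop analysis is costly and unnecessary
+  // 4. Later optimization passes can still read the updated preheader
+  // 
+  // If more invasive loop restructuring is added in the future, the loop analysis may need to be invalidated:
+  // analysisInvalidations.insert(&LoopAnalysisPass::ID);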
+}
+
+} // namespace sysy
--- a/src/midend/Pass/Optimize/LoopStrengthReduction.cpp
+++ b/src/midend/Pass/Optimize/LoopStrengthReduction.cpp
@ -0,0 +1,942 @@
+#include "LoopStrengthReduction.h"
+#include "LoopCharacteristics.h"
+#include "Loop.h"
+#include "Dom.h"
+#include "IRBuilder.h"
+#include "SysYIROptUtils.h"
+#include <iostream>
+#include <algorithm>
+#include <cmath>
+#include <unordered_map>
+#include <climits>
+
+// Use the global debug switch
+extern int DEBUG;
+
+namespace sysy {
+
+// Define the pass ID
+void *LoopStrengthReduction::ID = (void *)&LoopStrengthReduction::ID;
+
+bool StrengthReductionContext::analyzeInductionVariableRange(
+  const InductionVarInfo* ivInfo, 
+  Loop* loop
+) const {
+  if (!ivInfo->valid) {
+    if (DEBUG) {
+      std::cout << "        Invalid IV info, assuming potential negative" << std::endl;
+    }
+    return true; // be conservative: an invalid (possibly non-linear) IV may become negative
+  }
+
+  // Fetch all incoming values of the phi instruction
+  auto* phiInst = dynamic_cast<PhiInst*>(ivInfo->base);
+  if (!phiInst) {
+    if (DEBUG) {
+      std::cout << "        No phi instruction, assuming potential negative" << std::endl;
+    }
+    return true; // cannot tell, so assume conservatively
+  }
+
+  bool hasNegativePotential = false;
+  bool hasNonNegativeInitial = false;
+  int initialValue = 0;
+  
+  for (auto& [incomingBB, incomingVal] : phiInst->getIncomingValues()) {
+    // Inspect the initial value (the one coming from outside the loop)
+    if (!loop->contains(incomingBB)) {
+      if (auto* constInt = dynamic_cast<ConstantInteger*>(incomingVal)) {
+        initialValue = constInt->getInt();
+        if (initialValue < 0) {
+          if (DEBUG) {
+            std::cout << "        Found negative initial value: " << initialValue << std::endl;
+          }
+          hasNegativePotential = true;
+        } else {
+          if (DEBUG) {
+            std::cout << "        Found non-negative initial value: " << initialValue << std::endl;
+          }
+          hasNonNegativeInitial = true;
+        }
+      } else {
+        // A non-constant initial value is conservatively assumed to be possibly negative
+        if (DEBUG) {
+          std::cout << "        Non-constant initial value, assuming potential negative" << std::endl;
+        }
+        hasNegativePotential = true;
+      }
+    }
+  }
+
+  // Check the factor and the offset
+  if (ivInfo->factor < 0) {
+    if (DEBUG) {
+      std::cout << "        Negative factor: " << ivInfo->factor << std::endl;
+    }
+    hasNegativePotential = true;
+  }
+
+  if (ivInfo->offset < 0) {
+    if (DEBUG) {
+      std::cout << "        Negative offset: " << ivInfo->offset << std::endl;
+    }
+    hasNegativePotential = true;
+  }
+
+  // Precise case: a non-negative initial value, a positive factor and a non-negative offset make the whole sequence non-negative
+  if (hasNonNegativeInitial && ivInfo->factor > 0 && ivInfo->offset >= 0) {
+    if (DEBUG) {
+      std::cout << "        ANALYSIS: Confirmed non-negative range" << std::endl;
+      std::cout << "          Initial: " << initialValue << " >= 0" << std::endl;
+      std::cout << "          Factor: " << ivInfo->factor << " > 0" << std::endl;
+      std::cout << "          Offset: " << ivInfo->offset << " >= 0" << std::endl;
+    }
+    return false; // proven never negative
+  }
+
+  // Report the analysis result
+  if (DEBUG) {
+    if (hasNegativePotential) {
+      std::cout << "        ANALYSIS: Potential negative values detected" << std::endl;
+    } else {
+      std::cout << "        ANALYSIS: No negative indicators, but missing positive confirmation" << std::endl;
+    }
+  }
+
+  return hasNegativePotential;
+}
+
+
+bool LoopStrengthReduction::runOnFunction(Function* F, AnalysisManager& AM) {
+  if (F->getBasicBlocks().empty()) {
+    return false; // empty function
+  }
+
+  if (DEBUG) {
+    std::cout << "Running LoopStrengthReduction on function: " << F->getName() << std::endl;
+  }
+
+  // Create the optimization context and run it
+  StrengthReductionContext context(builder);
+  bool modified = context.run(F, AM);
+
+  if (DEBUG) {
+    std::cout << "LoopStrengthReduction " << (modified ? "modified" : "did not modify") 
+              << " function: " << F->getName() << std::endl;
+  }
+
+  return modified;
+}
+
+void LoopStrengthReduction::getAnalysisUsage(std::set<void*>& analysisDependencies, 
+                                           std::set<void*>& analysisInvalidations) const {
+  // Required analyses
+  analysisDependencies.insert(&LoopAnalysisPass::ID);
+  analysisDependencies.insert(&LoopCharacteristicsPass::ID);
+  analysisDependencies.insert(&DominatorTreeAnalysisPass::ID);
+  
+  // Invalidated analyses (strength reduction modifies the IR)
+  analysisInvalidations.insert(&LoopCharacteristicsPass::ID);
+  // Note: the dominator tree normally stays valid, because strength reduction does not change control flow
+}
+
+// ========== StrengthReductionContext implementation ==========
+
+bool StrengthReductionContext::run(Function* F, AnalysisManager& AM) {
+  if (DEBUG) {
+    std::cout << "  Starting strength reduction analysis..." << std::endl;
+  }
+
+  // Fetch the required analysis results
+  loopAnalysis = AM.getAnalysisResult<LoopAnalysisResult, LoopAnalysisPass>(F);
+  if (!loopAnalysis || !loopAnalysis->hasLoops()) {
+    if (DEBUG) {
+      std::cout << "  No loops found, skipping strength reduction" << std::endl;
+    }
+    return false;
+  }
+
+  loopCharacteristics = AM.getAnalysisResult<LoopCharacteristicsResult, LoopCharacteristicsPass>(F);
+  if (!loopCharacteristics) {
+    if (DEBUG) {
+      std::cout << "  LoopCharacteristics analysis not available" << std::endl;
+    }
+    return false;
+  }
+
+  dominatorTree = AM.getAnalysisResult<DominatorTree, DominatorTreeAnalysisPass>(F);
+  if (!dominatorTree) {
+    if (DEBUG) {
+      std::cout << "  DominatorTree analysis not available" << std::endl;
+    }
+    return false;
+  }
+
+  // Run the optimization in three phases
+  
+  // Phase 1: identify candidates
+  identifyStrengthReductionCandidates(F);
+  
+  if (candidates.empty()) {
+    if (DEBUG) {
+      std::cout << "  No strength reduction candidates found" << std::endl;
+    }
+    return false;
+  }
+
+  if (DEBUG) {
+    std::cout << "  Found " << candidates.size() << " potential candidates" << std::endl;
+  }
+
+  // Phase 2: analyze the optimization potential
+  analyzeOptimizationPotential();
+
+  // Phase 3: perform the optimization
+  bool modified = performStrengthReduction();
+
+  if (DEBUG) {
+    printDebugInfo();
+  }
+
+  return modified;
+}
+
+void StrengthReductionContext::identifyStrengthReductionCandidates(Function* F) {
+  if (DEBUG) {
+    std::cout << "  === Phase 1: Identifying Strength Reduction Candidates ===" << std::endl;
+  }
+
+  // Visit every loop
+  for (const auto& loop_ptr : loopAnalysis->getAllLoops()) {
+    Loop* loop = loop_ptr.get();
+    
+    if (DEBUG) {
+      std::cout << "    Analyzing loop: " << loop->getName() << std::endl;
+    }
+
+    // Fetch the loop characteristics
+    const LoopCharacteristics* characteristics = loopCharacteristics->getCharacteristics(loop);
+    if (!characteristics) {
+      if (DEBUG) {
+        std::cout << "      No characteristics available for loop" << std::endl;
+      }
+      continue;
+    }
+
+    if (characteristics->InductionVars.empty()) {
+      if (DEBUG) {
+        std::cout << "      No induction variables found in loop" << std::endl;
+      }
+      continue;
+    }
+
+    // Visit every instruction in the loop
+    for (BasicBlock* bb : loop->getBlocks()) {
+      for (auto& inst_ptr : bb->getInstructions()) {
+        Instruction* inst = inst_ptr.get();
+        
+        // Check whether this is a strength reduction candidate
+        auto candidate = isStrengthReductionCandidate(inst, loop);
+        if (candidate) {
+          if (DEBUG) {
+            std::cout << "      Found candidate: %" << inst->getName() 
+                      << " (IV: %" << candidate->inductionVar->getName() 
+                      << ", multiplier: " << candidate->multiplier 
+                      << ", offset: " << candidate->offset << ")" << std::endl;
+          }
+          
+          // Add it to the candidate list
+          loopToCandidates[loop].push_back(candidate.get());
+          candidates.push_back(std::move(candidate));
+        }
+      }
+    }
+  }
+
+  if (DEBUG) {
+    std::cout << "  === End Phase 1: Found " << candidates.size() << " candidates ===" << std::endl;
+  }
+}
+
+std::unique_ptr<StrengthReductionCandidate> 
+StrengthReductionContext::isStrengthReductionCandidate(Instruction* inst, Loop* loop) {
+  auto kind = inst->getKind();
+  
+  // Only multiply, divide and remainder instructions are supported
+  if (kind != Instruction::Kind::kMul && 
+      kind != Instruction::Kind::kDiv && 
+      kind != Instruction::Kind::kRem) {
+    return nullptr;
+  }
+
+  auto* binaryInst = dynamic_cast<BinaryInst*>(inst);
+  if (!binaryInst) {
+    return nullptr;
+  }
+
+  Value* op0 = binaryInst->getOperand(0);
+  Value* op1 = binaryInst->getOperand(1);
+
+  // Match the pattern: IV op constant, or constant op IV
+  Value* inductionVar = nullptr;
+  int constantValue = 0;
+  StrengthReductionCandidate::OpType opType;
+  
+  // Fetch the loop characteristics
+  const LoopCharacteristics* characteristics = loopCharacteristics->getCharacteristics(loop);
+  if (!characteristics) {
+    return nullptr;
+  }
+
+  // Determine the operation type
+  switch (kind) {
+    case Instruction::Kind::kMul:
+      opType = StrengthReductionCandidate::MULTIPLY;
+      break;
+    case Instruction::Kind::kDiv:
+      opType = StrengthReductionCandidate::DIVIDE;
+      break;
+    case Instruction::Kind::kRem:
+      opType = StrengthReductionCandidate::REMAINDER;
+      break;
+    default:
+      return nullptr;
+  }
+
+  // Pattern 1: IV op const
+  const InductionVarInfo* ivInfo = getInductionVarInfo(op0, loop, characteristics);
+  if (ivInfo && dynamic_cast<ConstantInteger*>(op1)) {
+    inductionVar = op0;
+    constantValue = dynamic_cast<ConstantInteger*>(op1)->getInt();
+  }
+  // Pattern 2: const op IV (valid for multiplication only)
+  else if (opType == StrengthReductionCandidate::MULTIPLY) {
+    ivInfo = getInductionVarInfo(op1, loop, characteristics);
+    if (ivInfo && dynamic_cast<ConstantInteger*>(op0)) {
+      inductionVar = op1;
+      constantValue = dynamic_cast<ConstantInteger*>(op0)->getInt();
+    }
+  }
+
+  if (!inductionVar || constantValue <= 1) {
+    return nullptr; // not a valid candidate
+  }
+
+  // Create the candidate
+  auto candidate = std::make_unique<StrengthReductionCandidate>(
+    inst, inductionVar, opType, constantValue, 0, inst->getParent(), loop
+  );
+
+  // Analyze whether the induction variable may be negative
+  candidate->hasNegativeValues = analyzeInductionVariableRange(ivInfo, loop);
+
+  // Pick the optimization strategy according to the kind of division
+  if (opType == StrengthReductionCandidate::DIVIDE) {
+    bool isPowerOfTwo = (constantValue & (constantValue - 1)) == 0;
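+    // Editor's worked example (not in the original patch): a power of two has a single set bit,
+    // so 8 & 7 == 0b1000 & 0b0111 == 0, whereas 12 & 11 == 0b1100 & 0b1011 == 8 (non-zero).
+    // The early return above guarantees constantValue > 1, so 0 and 1 never reach this test.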
+    
+    if (isPowerOfTwo) {
+      // Division by a power of two
+      if (candidate->hasNegativeValues) {
+        candidate->divStrategy = StrengthReductionCandidate::SIGNED_CORRECTION;
+        if (DEBUG) {
+          std::cout << "        Division by power of 2 with potential negative values, using signed correction" << std::endl;
+        }
+      } else {
+        candidate->divStrategy = StrengthReductionCandidate::SIMPLE_SHIFT;
+        if (DEBUG) {
+          std::cout << "        Division by power of 2 with non-negative values, using simple shift" << std::endl;
+        }
+      }
+    } else {
+      // Division by an arbitrary constant: use the mulh instruction
+      candidate->operationType = StrengthReductionCandidate::DIVIDE_CONST;
+      candidate->divStrategy = StrengthReductionCandidate::MULH_OPTIMIZATION;
+      if (DEBUG) {
+        std::cout << "        Division by arbitrary constant, using mulh optimization" << std::endl;
+      }
+    }
+  } else if (opType == StrengthReductionCandidate::REMAINDER) {
+    // Remainder is only supported for powers of two
+    if ((constantValue & (constantValue - 1)) != 0) {
+      return nullptr; // not a power of two, cannot be optimized
+    }
+  }
+
+  return candidate;
+}
+
+const InductionVarInfo* 
+StrengthReductionContext::getInductionVarInfo(Value* val, Loop* loop, 
+                                            const LoopCharacteristics* characteristics) {
+  for (const auto& iv : characteristics->InductionVars) {
+    if (iv->div == val) {
+      return iv.get();
+    }
+  }
+  return nullptr;
+}
+
+void StrengthReductionContext::analyzeOptimizationPotential() {
+  if (DEBUG) {
+    std::cout << "  === Phase 2: Analyzing Optimization Potential ===" << std::endl;
+  }
+
+  // Estimate the benefit of each candidate and filter out those not worth optimizing
+  auto it = candidates.begin();
+  while (it != candidates.end()) {
+    StrengthReductionCandidate* candidate = it->get();
+    
+    double benefit = estimateOptimizationBenefit(candidate);
+    bool isLegal = isOptimizationLegal(candidate);
+    
+    if (DEBUG) {
+      std::cout << "    Candidate " << candidate->originalInst->getName() 
+                << ": benefit=" << benefit 
+                << ", legal=" << (isLegal ? "yes" : "no") << std::endl;
+    }
+    
+    // Drop the candidate if the benefit is too small or the transformation is illegal
+    if (benefit < 1.0 || !isLegal) {
+      // Remove it from loopToCandidates
+      auto& loopCandidates = loopToCandidates[candidate->containingLoop];
+      loopCandidates.erase(
+        std::remove(loopCandidates.begin(), loopCandidates.end(), candidate),
+        loopCandidates.end()
+      );
+      
+      it = candidates.erase(it);
+    } else {
+      ++it;
+    }
+  }
+
+  if (DEBUG) {
+    std::cout << "  === End Phase 2: " << candidates.size() << " candidates remain ===" << std::endl;
+  }
+}
+
+double StrengthReductionContext::estimateOptimizationBenefit(const StrengthReductionCandidate* candidate) {
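+  // Simple benefit estimation model
+  double benefit = 0.0;
+  
+  // Base benefit: the gain from turning a multiplication into an addition
+  benefit += 2.0; // assume multiplication is 2x slower than addition
+  
+  // Multiplier factor: the larger the multiplier, the higher the benefit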
+  if (candidate->multiplier >= 4) {
+    benefit += 1.0;
+  }
+  if (candidate->multiplier >= 8) {
+    benefit += 1.0;
+  }
+  
+  // Loop hotness factor
+  Loop* loop = candidate->containingLoop;
+  double hotness = loop->getLoopHotness();
+  benefit *= (1.0 + hotness / 100.0);
+  
+  // Use count factor
+  size_t useCount = candidate->originalInst->getUses().size();
+  if (useCount > 1) {
+    benefit *= (1.0 + useCount * 0.2);
+  }
+  
+  return benefit;
+}
+
+bool StrengthReductionContext::isOptimizationLegal(const StrengthReductionCandidate* candidate) {
+  // Check that the optimization is legal
+  
+  // 1. The induction variable must be a phi instruction in the loop header
+  auto* phiInst = dynamic_cast<PhiInst*>(candidate->inductionVar);
+  if (!phiInst || phiInst->getParent() != candidate->containingLoop->getHeader()) {
+    if (DEBUG) {
+      std::cout << "      Illegal: induction variable is not a phi in loop header" << std::endl;
+    }
+    return false;
+  }
+  
+  // 2. The candidate instruction must be inside the loop
+  if (!candidate->containingLoop->contains(candidate->containingBlock)) {
+    if (DEBUG) {
+      std::cout << "      Illegal: instruction not in loop" << std::endl;
+    }
+    return false;
+  }
+  
+  // 3. Check for overflow risk (simplified check)
+  if (candidate->multiplier > 1000) {
+    if (DEBUG) {
+      std::cout << "      Illegal: multiplier too large (overflow risk)" << std::endl;
+    }
+    return false;
+  }
+  
+  // 4. The instruction must not be used by a loop exit condition (to avoid changing the loop semantics)
+  for (BasicBlock* exitingBB : candidate->containingLoop->getExitingBlocks()) {
+    auto terminatorIt = exitingBB->terminator();
+    if (terminatorIt != exitingBB->end()) {
+      Instruction* terminator = terminatorIt->get();
+      if (terminator && (terminator->getOperand(0) == candidate->originalInst ||
+                        (terminator->getNumOperands() > 1 && terminator->getOperand(1) == candidate->originalInst))) {
+        if (DEBUG) {
+          std::cout << "      Illegal: instruction used in loop exit condition" << std::endl;
+        }
+        return false;
+      }
+    }
+  }
+  
+  return true;
+}
+
+bool StrengthReductionContext::performStrengthReduction() {
+  if (DEBUG) {
+    std::cout << "  === Phase 3: Performing Strength Reduction ===" << std::endl;
+  }
+
+  bool modified = false;
+
+  for (auto& candidate : candidates) {
+    if (DEBUG) {
+      std::cout << "    Processing candidate: " << candidate->originalInst->getName() << std::endl;
+    }
+
+    // Create the new induction variable
+    if (!createNewInductionVariable(candidate.get())) {
+      if (DEBUG) {
+        std::cout << "      Failed to create new induction variable" << std::endl;
+      }
+      continue;
+    }
+
+    // Replace the original instruction
+    if (!replaceOriginalInstruction(candidate.get())) {
+      if (DEBUG) {
+        std::cout << "      Failed to replace original instruction" << std::endl;
+      }
+      continue;
+    }
+
+    if (DEBUG) {
+      std::cout << "      Successfully optimized: " << candidate->originalInst->getName() 
+                << " -> " << candidate->newInductionVar->getName() << std::endl;
+    }
+    
+    modified = true;
+  }
+
+  if (DEBUG) {
+    std::cout << "  === End Phase 3: " << (modified ? "Optimizations applied" : "No optimizations") << " ===" << std::endl;
+  }
+
+  return modified;
+}
+
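+// Editor's sketch (not part of the original patch), assuming a loop `for (i = 0; ...; i = i + 1)`
+// with a multiply candidate `t = i * 4`; the name sr_i_next is hypothetical:
+//   header:  sr_i      = phi [0 * 4, preheader], [sr_i_next, latch]
+//   latch:   sr_i_next = sr_i + 1 * 4
+// Uses of t can then read sr_i, which equals i * 4 on every iteration.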
+bool StrengthReductionContext::createNewInductionVariable(StrengthReductionCandidate* candidate) {
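+  // A new induction variable is created for multiplication only.
+  // Division and remainder are strength-reduced at replacement time and need no new induction variable
+  if (candidate->operationType != StrengthReductionCandidate::MULTIPLY) {
+    candidate->newInductionVar = candidate->inductionVar; // reuse the original induction variable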
+    return true;
+  }
+
+  Loop* loop = candidate->containingLoop;
+  BasicBlock* header = loop->getHeader();
+  BasicBlock* preheader = loop->getPreHeader();
+  
+  if (!preheader) {
+    if (DEBUG) {
+      std::cout << "        No preheader found for loop" << std::endl;
+    }
+    return false;
+  }
+
+  // Fetch the phi instruction of the original induction variable
+  auto* originalPhi = dynamic_cast<PhiInst*>(candidate->inductionVar);
+  if (!originalPhi) {
+    return false;
+  }
+
+  
+
+  // 1. Find the initial value and the step of the original induction variable
+  Value* initialValue = nullptr;
+  Value* stepValue = nullptr;
+  BasicBlock* latchBlock = nullptr;
+
+  for (auto& [incomingBB, incomingVal] : originalPhi->getIncomingValues()) {
+    if (!loop->contains(incomingBB)) {
+      // Initial value coming from outside the loop
+      initialValue = incomingVal;
+    } else {
+      // Incremented value coming from inside the loop
+      latchBlock = incomingBB;
+      // Try to find the step
+      if (auto* addInst = dynamic_cast<BinaryInst*>(incomingVal)) {
+        if (addInst->getKind() == Instruction::Kind::kAdd) {
+          if (addInst->getOperand(0) == originalPhi) {
+            stepValue = addInst->getOperand(1);
+          } else if (addInst->getOperand(1) == originalPhi) {
+            stepValue = addInst->getOperand(0);
+          }
+        }
+      }
+    }
+  }
+
+  if (!initialValue || !stepValue || !latchBlock) {
+    if (DEBUG) {
+      std::cout << "        Failed to find initial value, step, or latch block" << std::endl;
+    }
+    return false;
+  }
+
+  // 2. Create the new phi instruction in the loop header
+  builder->setPosition(header, header->begin());
+  candidate->newPhi = builder->createPhiInst(originalPhi->getType());
+  candidate->newPhi->setName("sr_" + originalPhi->getName());
+
+  // 3. Compute the initial value and the step of the new induction variable
+  // new IV initial value = original IV initial value * multiplier
+  Value* newInitialValue;
+  if (auto* constInt = dynamic_cast<ConstantInteger*>(initialValue)) {
+    newInitialValue = ConstantInteger::get(constInt->getInt() * candidate->multiplier);
+  } else {
+    // A non-constant initial value requires a multiplication inserted in the preheader
+    builder->setPosition(preheader, preheader->terminator());
+    newInitialValue = builder->createMulInst(initialValue, 
+      ConstantInteger::get(candidate->multiplier));
+  }
+
+  // new IV step = original IV step * multiplier
+  Value* newStepValue;
+  if (auto* constInt = dynamic_cast<ConstantInteger*>(stepValue)) {
+    newStepValue = ConstantInteger::get(constInt->getInt() * candidate->multiplier);
+  } else {
+    builder->setPosition(latchBlock, latchBlock->terminator());
+    newStepValue = builder->createMulInst(stepValue, 
+      ConstantInteger::get(candidate->multiplier));
+  }
+
+  // 4. Create the increment instruction of the new induction variable
+  builder->setPosition(latchBlock, latchBlock->terminator());
+  Value* newIncrementedValue = builder->createAddInst(candidate->newPhi, newStepValue);
+  
+  // 5. Set the incoming values of the new phi
+  candidate->newPhi->addIncoming(newInitialValue, preheader);
+  candidate->newPhi->addIncoming(newIncrementedValue, latchBlock);
+
+  candidate->newInductionVar = candidate->newPhi;
+
+  if (DEBUG) {
+    std::cout << "        Created new induction variable: " << candidate->newPhi->getName() << std::endl;
+  }
+
+  return true;
+}
+
+bool StrengthReductionContext::replaceOriginalInstruction(StrengthReductionCandidate* candidate) {
+  if (!candidate->newInductionVar) {
+    return false;
+  }
+
+  Value* replacementValue = nullptr;
+  
+  // Generate a different replacement depending on the operation type
+  switch (candidate->operationType) {
+    case StrengthReductionCandidate::MULTIPLY: {
+      // Multiplication: use the new induction variable directly
+      replacementValue = candidate->newInductionVar;
+      break;
+    }
+    
+    case StrengthReductionCandidate::DIVIDE: {
+      // Generate code according to the selected division strategy
+      builder->setPosition(candidate->containingBlock, 
+                          candidate->containingBlock->findInstIterator(candidate->originalInst));
+      replacementValue = generateDivisionReplacement(candidate, builder);
+      break;
+    }
+    
+    case StrengthReductionCandidate::DIVIDE_CONST: {
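+      // Division by an arbitrary constant: currently disabled, so replacementValue stays null and the function returns false below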
+      // builder->setPosition(candidate->containingBlock, 
+      //                     candidate->containingBlock->findInstIterator(candidate->originalInst));
+      // replacementValue = generateConstantDivisionReplacement(candidate, builder);
+      break;
+    }
+    
+    case StrengthReductionCandidate::REMAINDER: {
+      // Remainder: use a bitwise AND (x % 2^n == x & (2^n - 1) for non-negative x)
+      builder->setPosition(candidate->containingBlock, 
+                          candidate->containingBlock->findInstIterator(candidate->originalInst));
+      
+      int maskValue = candidate->multiplier - 1; // 2^n - 1
+      Value* maskConstant = ConstantInteger::get(maskValue);
+      
+      if (candidate->hasNegativeValues) {
+        // Handle remainders of possibly negative values
+        Value* temp = builder->createBinaryInst(
+          Instruction::Kind::kAnd, candidate->inductionVar->getType(),
+          candidate->inductionVar, maskConstant
+        );
+        
+        // Sign mask of the original value: all ones if negative, zero otherwise
+        Value* signMask = builder->createBinaryInst(
+          Instruction::Kind::kSra, candidate->inductionVar->getType(),
+          candidate->inductionVar, ConstantInteger::get(31)
+        );
+
+        // For negative values, bias the dividend by (2^n - 1) before masking
+        // FIXME: a C-style remainder also requires subtracting `adjustment` from the masked result;
+        // as written, x = -1 with mask 3 yields 2 instead of -1
+        Value* adjustment = builder->createAndInst(signMask, maskConstant);
+        Value* adjustedTemp = builder->createAddInst(candidate->inductionVar, adjustment);
+        Value* adjustedResult = builder->createBinaryInst(
+          Instruction::Kind::kAnd, candidate->inductionVar->getType(),
+          adjustedTemp, maskConstant
+        );
+        replacementValue = adjustedResult;
+      } else {
+        // Non-negative remainder: a single bitwise AND suffices
+        replacementValue = builder->createBinaryInst(
+          Instruction::Kind::kAnd, candidate->inductionVar->getType(),
+          candidate->inductionVar, maskConstant
+        );
+      }
+      
+      if (DEBUG) {
+        std::cout << "        Created modulus operation with mask " << maskValue 
+                  << " (handles negatives: " << (candidate->hasNegativeValues ? "yes" : "no") << ")" << std::endl;
+      }
+      break;
+    }
+    
+    default:
+      return false;
+  }
+
+  if (!replacementValue) {
+    return false;
+  }
+
+  // Apply the offset
+  if (candidate->offset != 0) {
+    builder->setPosition(candidate->containingBlock, 
+                        candidate->containingBlock->findInstIterator(candidate->originalInst));
+    replacementValue = builder->createAddInst(
+      replacementValue,
+      ConstantInteger::get(candidate->offset)
+    );
+  }
+
+  // Replace all uses
+  candidate->originalInst->replaceAllUsesWith(replacementValue);
+
+  // Remove the original instruction from its basic block
+  auto* bb = candidate->originalInst->getParent();
+  auto it = bb->findInstIterator(candidate->originalInst);
+  if (it != bb->end()) {
+    SysYIROptUtils::usedelete(it);
+    // bb->getInstructions().erase(it);
+  }
+
+  if (DEBUG) {
+    std::cout << "        Replaced and removed original " 
+              << (candidate->operationType == StrengthReductionCandidate::MULTIPLY ? "multiply" :
+                  candidate->operationType == StrengthReductionCandidate::DIVIDE ? "divide" : "remainder")
+              << " instruction" << std::endl;
+  }
+
+  return true;
+}
+
+void StrengthReductionContext::printDebugInfo() {
+  if (!DEBUG) return;
+
+  std::cout << "\n=== Strength Reduction Optimization Summary ===" << std::endl;
+  std::cout << "Total candidates processed: " << candidates.size() << std::endl;
+  
+  for (auto& [loop, loopCandidates] : loopToCandidates) {
+    if (!loopCandidates.empty()) {
+      std::cout << "Loop " << loop->getName() << ": " << loopCandidates.size() << " optimizations" << std::endl;
+      for (auto* candidate : loopCandidates) {
+        if (candidate->newInductionVar) {
+          std::cout << "  " << candidate->inductionVar->getName() 
+                    << " (op=" << (candidate->operationType == StrengthReductionCandidate::MULTIPLY ? "mul" :
+                                   candidate->operationType == StrengthReductionCandidate::DIVIDE ? "div" :
+                                   candidate->operationType == StrengthReductionCandidate::DIVIDE_CONST ? "div_const" : "rem")
+                    << ", factor=" << candidate->multiplier << ")"
+                    << " -> optimized" << std::endl;
+        }
+      }
+    }
+  }
+  std::cout << "===============================================" << std::endl;
+}
+
+Value* StrengthReductionContext::generateDivisionReplacement(
+  StrengthReductionCandidate* candidate, 
+  IRBuilder* builder
+) const {
+  switch (candidate->divStrategy) {
+    case StrengthReductionCandidate::SIMPLE_SHIFT: {
+      // Plain right-shift division (valid only for non-negative values)
+      int shiftAmount = __builtin_ctz(candidate->multiplier);
+      Value* shiftConstant = ConstantInteger::get(shiftAmount);
+      return builder->createBinaryInst(
+        Instruction::Kind::kSrl,  // logical right shift
+        candidate->inductionVar->getType(),
+        candidate->inductionVar,
+        shiftConstant
+      );
+    }
+    
+    case StrengthReductionCandidate::SIGNED_CORRECTION: {
+      // Signed division correction: (x + ((x >> 31) & mask)) >> k
+      int shiftAmount = __builtin_ctz(candidate->multiplier);
+      int maskValue = candidate->multiplier - 1;
+      
+      // x >> 31 (the arithmetic right shift replicates the sign bit)
+      Value* signShift = ConstantInteger::get(31);
+      Value* signBits = builder->createBinaryInst(
+        Instruction::Kind::kSra,  // arithmetic right shift
+        candidate->inductionVar->getType(),
+        candidate->inductionVar,
+        signShift
+      );
+      
+      // (x >> 31) & mask
+      Value* mask = ConstantInteger::get(maskValue);
+      Value* correction = builder->createBinaryInst(
+        Instruction::Kind::kAnd,
+        candidate->inductionVar->getType(),
+        signBits,
+        mask
+      );
+      
+      // x + correction
+      Value* corrected = builder->createAddInst(candidate->inductionVar, correction);
+      
+      // (x + correction) >> k
+      Value* divShift = ConstantInteger::get(shiftAmount);
+      return builder->createBinaryInst(
+        Instruction::Kind::kSra,  // arithmetic right shift
+        candidate->inductionVar->getType(),
+        corrected,
+        divShift
+      );
+    }
+    
+    default: {
+      // Fall back to the original division
+      Value* divisor = ConstantInteger::get(candidate->multiplier);
+      return builder->createDivInst(candidate->inductionVar, divisor);
+    }
+  }
+}
+
+Value* StrengthReductionContext::generateConstantDivisionReplacement(
+  StrengthReductionCandidate* candidate, 
+  IRBuilder* builder
+) const {
+  // Optimize division by an arbitrary constant using the mulh instruction
+  auto [magic, shift] = SysYIROptUtils::computeMulhMagicNumbers(candidate->multiplier);
+  
+  // Check for failure (magic == -1 and shift == -1 signal that no optimization is possible)
+  if (magic == -1 && shift == -1) {
+    if (DEBUG) {
+      std::cout << "[SR] Cannot optimize division by " << candidate->multiplier 
+                << ", keeping original division" << std::endl;
+    }
+    // Return nullptr so that the caller keeps the original division
+    return nullptr;
+  }
+  
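+  // Division by a power of two can use a shift instead of a magic number. Such divisors should not be classified here, but guard against it anyway
+  if ((candidate->multiplier & (candidate->multiplier - 1)) == 0 && candidate->multiplier > 0) {
+    // The divisor is a power of two, so a shift suffices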
+    int shift_amount = 0;
+    int temp = candidate->multiplier;
+    while (temp > 1) {
+      temp >>= 1;
+      shift_amount++;
+    }
+    
+    Value* shiftConstant = ConstantInteger::get(shift_amount);
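+    if (candidate->hasNegativeValues) {
+      // Signed division: add (divisor - 1) only when the dividend is negative,
+      // so that the arithmetic shift rounds toward zero
+      Value* divisor_minus_1 = ConstantInteger::get(candidate->multiplier - 1);
+      Value* signBits = builder->createBinaryInst(
+        Instruction::Kind::kSra,  // arithmetic right shift
+        candidate->inductionVar->getType(),
+        candidate->inductionVar,
+        ConstantInteger::get(31)
+      );
+      Value* correction = builder->createAndInst(signBits, divisor_minus_1);
+      Value* adjusted = builder->createAddInst(candidate->inductionVar, correction);
+      return builder->createBinaryInst(
+        Instruction::Kind::kSra,  // arithmetic right shift
+        candidate->inductionVar->getType(),
+        adjusted,
+        shiftConstant
+      );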
+    } else {
+      return builder->createBinaryInst(
+        Instruction::Kind::kSrl,  // logical right shift
+        candidate->inductionVar->getType(),
+        candidate->inductionVar,
+        shiftConstant
+      );
+    }
+  }
+  
+  // Create the magic-number constant
+  // If the magic number does not fit in 32 bits, skip the optimization
+  if (magic > INT32_MAX || magic < INT32_MIN) {
+    if (DEBUG) {
+      std::cout << "[SR] Magic number " << magic << " exceeds 32-bit range, skipping optimization" << std::endl;
+    }
+    return nullptr; // cannot optimize; keep the original division
+  }
+  
+  Value* magicConstant = ConstantInteger::get((int32_t)magic);
+  
+  // Check whether ADD_MARKER handling (the add adjustment) is required
+  bool needAdd = (shift & 0x40) != 0;
+  int actualShift = shift & 0x3F; // extract the actual shift amount
+  
+  if (DEBUG) {
+    std::cout << "[SR] IR Generation: magic=" << magic << ", needAdd=" << needAdd 
+              << ", actualShift=" << actualShift << std::endl;
+  }
+  
+  // Perform the high multiplication: mulh(x, magic)
+  Value* mulhResult = builder->createBinaryInst(
+    Instruction::Kind::kMulh,  // high multiplication
+    candidate->inductionVar->getType(),
+    candidate->inductionVar,
+    magicConstant
+  );
+  
+  if (needAdd) {
+    // ADD_MARKER case: add the dividend before shifting
+    // This corresponds to libdivide's add-adjustment algorithm
+    if (DEBUG) {
+      std::cout << "[SR] Applying ADD_MARKER: adding dividend before shift" << std::endl;
+    }
+    mulhResult = builder->createAddInst(mulhResult, candidate->inductionVar);
+  }
+  
+  if (actualShift > 0) {
+    // Apply the additional shift
+    Value* shiftConstant = ConstantInteger::get(actualShift);
+    mulhResult = builder->createBinaryInst(
+      Instruction::Kind::kSra,  // arithmetic right shift
+      candidate->inductionVar->getType(),
+      mulhResult,
+      shiftConstant
+    );
+  }
+  
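+  // Standard sign correction for signed division: add 1 to the quotient when the dividend is negative
+  // This is emitted for every signed division, whether or not negative values can actually occur
+  Value* isNegative = builder->createICmpLTInst(candidate->inductionVar, ConstantInteger::get(0));
+  // The ICmpLTInst result is implicitly widened to i32: 1 for a negative dividend, 0 otherwise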
+  mulhResult = builder->createAddInst(mulhResult, isNegative);
+  
+  return mulhResult;
+}
+
+} // namespace sysy
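The lowering identities used by `generateDivisionReplacement` and `generateConstantDivisionReplacement` can be checked on plain integers. The following is a minimal sketch, not the pass's code: the function names are hypothetical, the magic constants for 3 and 7 are the well-known 32-bit signed values rather than output of `computeMulhMagicNumbers`, and `>>` on a negative operand is assumed to be an arithmetic shift (guaranteed since C++20, near-universal in practice before that).

```cpp
#include <cassert>
#include <cstdint>

// Signed x / 2^k, rounding toward zero like C's '/'.
// Sketch of the SIGNED_CORRECTION strategy: (x + ((x >> 31) & mask)) >> k.
int32_t div_pow2(int32_t x, int k) {
  int32_t mask = (int32_t(1) << k) - 1;
  int32_t correction = (x >> 31) & mask;  // mask if x < 0, otherwise 0
  return (x + correction) >> k;
}

// Signed x % 2^k with the sign of the dividend, like C's '%'.
// Note the final subtraction of the correction term.
int32_t rem_pow2(int32_t x, int k) {
  int32_t mask = (int32_t(1) << k) - 1;
  int32_t correction = (x >> 31) & mask;
  return ((x + correction) & mask) - correction;
}

// High 32 bits of the signed 64-bit product, i.e. what a mulh instruction computes.
int32_t mulh(int32_t a, int32_t b) {
  return int32_t((int64_t(a) * int64_t(b)) >> 32);
}

// x / 3: magic 0x55555556, no add step, no post-shift.
int32_t div3(int32_t x) {
  int32_t q = mulh(x, 0x55555556);
  return q + (x < 0 ? 1 : 0);  // sign correction
}

// x / 7: magic 0x92492493 (negative as int32), add the dividend, then shift by 2.
int32_t div7(int32_t x) {
  const int32_t magic = -1840700269;  // 0x92492493 reinterpreted as signed
  int32_t q = mulh(x, magic);
  q += x;    // the "add marker" adjustment
  q >>= 2;   // post-shift
  return q + (x < 0 ? 1 : 0);  // sign correction
}
```

Comparing each helper against the built-in `/` and `%` over a range of positive and negative inputs is a cheap way to validate a lowering before emitting the corresponding IR.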
--- a/src/midend/Pass/Optimize/Mem2Reg.cpp
+++ b/src/midend/Pass/Optimize/Mem2Reg.cpp
@ -1,6 +1,8 @@
 #include "Mem2Reg.h" // 包含 Mem2Reg 遍的头文件
 #include "Dom.h"     // 包含支配树分析的头文件
 #include "Liveness.h"
+#include "AliasAnalysis.h" // alias analysis
+#include "SideEffectAnalysis.h" // side-effect analysis
 #include "IR.h"      // 包含 IR 相关的定义
 #include "SysYIROptUtils.h"
 #include <cassert>   // 用于断言
@ -60,7 +62,7 @@ void Mem2RegContext::run(Function *func, AnalysisManager *AM) {
  }

  // 从入口基本块开始，对支配树进行 DFS 遍历，进行变量重命名
-  renameVariables(nullptr, func->getEntryBlock()); // 第一个参数 alloca 在这里不使用，因为是递归入口点
+  renameVariables(func->getEntryBlock()); // entry point of the recursion over the dominator tree

  // --------------------------------------------------------------------
  // 阶段4: 清理
@ -209,16 +211,21 @@ void Mem2RegContext::insertPhis(AllocaInst *alloca, const std::unordered_set<Bas
 }

 // 对支配树进行深度优先遍历，重命名变量并替换 load/store 指令
-void Mem2RegContext::renameVariables(AllocaInst *currentAlloca, BasicBlock *currentBB) {
-  // 维护一个局部栈，用于存储当前基本块中为 Phi 和 Store 创建的 SSA 值，以便在退出时弹出
-  std::stack<Value *> localStackPushed;
+// The AllocaInst *currentAlloca parameter was removed: this function handles every promotable alloca in the given basic block
+void Mem2RegContext::renameVariables(BasicBlock *currentBB) {
+  // 1. On entry, record the current stack depth of every promotable alloca.
+  // These depths are used on return to restore each stack exactly.
+  std::map<AllocaInst *, size_t> originalStackSizes;
+  for (auto alloca : promotableAllocas) {
+    originalStackSizes[alloca] = allocaToValueStackMap[alloca].size();
+  }

  // --------------------------------------------------------------------
  // 处理当前基本块的指令
  // --------------------------------------------------------------------
  for (auto instIter = currentBB->getInstructions().begin(); instIter != currentBB->getInstructions().end();) {
    Instruction *inst = instIter->get();
     bool instDeleted = false;

    // 处理 Phi 指令 (如果是当前 alloca 的 Phi)
    if (auto phiInst = dynamic_cast<PhiInst *>(inst)) {
@ -227,52 +234,69 @@ void Mem2RegContext::renameVariables(AllocaInst *currentAlloca, BasicBlock *curr
        if (allocaToPhiMap[alloca].count(currentBB) && allocaToPhiMap[alloca][currentBB] == phiInst) {
          // 为 Phi 指令的输出创建一个新的 SSA 值，并压入值栈
          allocaToValueStackMap[alloca].push(phiInst);
-          localStackPushed.push(phiInst); // 记录以便弹出
-          break;                          // 找到对应的 alloca，处理下一个指令
+          if (DEBUG) {
+            std::cout << "Mem2Reg: Pushed Phi " << (phiInst->getName().empty() ? "anonymous" : phiInst->getName()) << " for alloca " << alloca->getName()
+              << ". Stack size: " << allocaToValueStackMap[alloca].size() << std::endl;
+          }
+          break; // found the matching alloca; move on to the next instruction
        }
      }
    }
    // 处理 LoadInst
    else if (auto loadInst = dynamic_cast<LoadInst *>(inst)) {
-      // 检查这个 LoadInst 是否是为某个可提升的 alloca
      for (auto alloca : promotableAllocas) {
-        if (loadInst->getPointer() == alloca) { 
-          // loadInst->getPointer() 返回 AllocaInst*
-          // 将 LoadInst 的所有用途替换为当前 alloca 值栈顶部的 SSA 值
+        // Check whether the load's pointer is the alloca itself or a GEP based on the alloca
+        Value *ptrOperand = loadInst->getPointer();
+        if (ptrOperand == alloca || (dynamic_cast<GetElementPtrInst *>(ptrOperand) &&
+                                     dynamic_cast<GetElementPtrInst *>(ptrOperand)->getBasePointer() == alloca)) {
          assert(!allocaToValueStackMap[alloca].empty() && "Value stack empty for alloca during load replacement!");
+          if (DEBUG) {
+            std::cout << "Mem2Reg: Replacing load "
+                      << (ptrOperand->getName().empty() ? "anonymous" : ptrOperand->getName()) << " with SSA value "
+                      << (allocaToValueStackMap[alloca].top()->getName().empty()
+                              ? "anonymous"
+                              : allocaToValueStackMap[alloca].top()->getName())
+                      << " for alloca " << alloca->getName() << std::endl;
+            std::cout << "Mem2Reg: allocaToValueStackMap[" << alloca->getName()
+                      << "] size: " << allocaToValueStackMap[alloca].size() << std::endl;
+          }
          loadInst->replaceAllUsesWith(allocaToValueStackMap[alloca].top());
-          // instIter = currentBB->force_delete_inst(loadInst); // 删除 LoadInst
-          SysYIROptUtils::usedelete(loadInst); // 仅删除 use 关系
-          instIter = currentBB->getInstructions().erase(instIter); // 删除 LoadInst
+          instIter = SysYIROptUtils::usedelete(instIter);
          instDeleted = true;
-          // std::cerr << "Mem2Reg: Replaced load " << loadInst->name() << " with SSA value." << std::endl;
          break;
        }
      }
    }
    // 处理 StoreInst
    else if (auto storeInst = dynamic_cast<StoreInst *>(inst)) {
-      // 检查这个 StoreInst 是否是为某个可提升的 alloca
      for (auto alloca : promotableAllocas) {
-        if (storeInst->getPointer() == alloca) { 
-          // 假设 storeInst->getPointer() 返回 AllocaInst*
-          // 将 StoreInst 存储的值作为新的 SSA 值，压入值栈
+        // Check whether the store's pointer is the alloca itself or a GEP based on the alloca
+        Value *ptrOperand = storeInst->getPointer();
+        if (ptrOperand == alloca || (dynamic_cast<GetElementPtrInst *>(ptrOperand) &&
+                                     dynamic_cast<GetElementPtrInst *>(ptrOperand)->getBasePointer() == alloca)) {
+          if (DEBUG) {
+            std::cout << "Mem2Reg: Replacing store to "
+                      << (ptrOperand->getName().empty() ? "anonymous" : ptrOperand->getName()) << " with SSA value "
+                      << (storeInst->getValue()->getName().empty() ? "anonymous" : storeInst->getValue()->getName())
+                      << " for alloca " << alloca->getName() << std::endl;
+            std::cout << "Mem2Reg: allocaToValueStackMap[" << alloca->getName()
+                      << "] size before push: " << allocaToValueStackMap[alloca].size() << std::endl;
+          }
          allocaToValueStackMap[alloca].push(storeInst->getValue());
-          localStackPushed.push(storeInst->getValue());          // 记录以便弹出
-          SysYIROptUtils::usedelete(storeInst);
-          instIter = currentBB->getInstructions().erase(instIter); // 删除 StoreInst
+          instIter = SysYIROptUtils::usedelete(instIter);
          instDeleted = true;
-          // std::cerr << "Mem2Reg: Replaced store to " << storeInst->ptr()->name() << " with SSA value." << std::endl;
+          if (DEBUG) {
+            std::cout << "Mem2Reg: allocaToValueStackMap[" << alloca->getName()
+                      << "] size after push: " << allocaToValueStackMap[alloca].size() << std::endl;
+          }
          break;
        }
      }
    }
-
    if (!instDeleted) {
      ++instIter; // 如果指令没有被删除，移动到下一个
    }
  }
-
  // --------------------------------------------------------------------
  // 处理后继基本块的 Phi 指令参数
  // --------------------------------------------------------------------
@ -287,38 +311,57 @@ void Mem2RegContext::renameVariables(AllocaInst *currentAlloca, BasicBlock *curr
        // 参数值是当前 alloca 值栈顶部的 SSA 值
        assert(!allocaToValueStackMap[alloca].empty() && "Value stack empty for alloca when setting phi operand!");
        phiInst->addIncoming(allocaToValueStackMap[alloca].top(), currentBB);
+        if (DEBUG) {
+          std::cout << "Mem2Reg: Added incoming arg to Phi "
+                    << (phiInst->getName().empty() ? "anonymous" : phiInst->getName()) << " from "
+                    << currentBB->getName() << " with value "
+                    << (allocaToValueStackMap[alloca].top()->getName().empty()
+                            ? "anonymous"
+                            : allocaToValueStackMap[alloca].top()->getName())
+                    << std::endl;
+        }
      }
    }
  }
-
  // --------------------------------------------------------------------
  // 递归访问支配树的子节点
  // --------------------------------------------------------------------
  const std::set<BasicBlock *> *dominatedBlocks = dt->getDominatorTreeChildren(currentBB);
-  if(dominatedBlocks){
+  if (dominatedBlocks) { // check that there are children
+    if (DEBUG) {
+      std::cout << "Mem2Reg: Processing dominated blocks for " << currentBB->getName() << std::endl;
+      for (auto dominatedBB : *dominatedBlocks) {
+        std::cout << "Mem2Reg: Dominated block: " << (dominatedBB ? dominatedBB->getName() : "null") << std::endl;
+      }
+    }
    for (auto dominatedBB : *dominatedBlocks) {
-      if (dominatedBB) {
-        std::cout << "Mem2Reg: Recursively renaming variables in dominated block: " << dominatedBB->getName() << std::endl;
-        renameVariables(currentAlloca, dominatedBB);
+      if (dominatedBB) { // make sure the child block is valid
+        if (DEBUG) {
+          std::cout << "Mem2Reg: Recursively renaming variables in dominated block: " << dominatedBB->getName()
+                    << std::endl;
+        }
+        renameVariables(dominatedBB); // recursive call; currentAlloca is no longer passed
      }
    }
  }
-  

  // --------------------------------------------------------------------
-  // 退出基本块时，弹出在此块中压入值栈的 SSA 值
+  // On leaving the block, pop the SSA values pushed in it so that every stack returns to its state on entry
  // --------------------------------------------------------------------
-  while (!localStackPushed.empty()) {
-    Value *val = localStackPushed.top();
-    localStackPushed.pop();
-    // 找到是哪个 alloca 对应的栈
-    for (auto alloca : promotableAllocas) {
-      if (!allocaToValueStackMap[alloca].empty() && allocaToValueStackMap[alloca].top() == val) {
-        allocaToValueStackMap[alloca].pop();
-        break;
+  for (auto alloca : promotableAllocas) {
+    while (allocaToValueStackMap[alloca].size() > originalStackSizes[alloca]) {
+      if (DEBUG) {
+        std::cout << "Mem2Reg: Popping value "
+                  << (allocaToValueStackMap[alloca].top()->getName().empty()
+                          ? "anonymous"
+                          : allocaToValueStackMap[alloca].top()->getName())
+                  << " for alloca " << alloca->getName() << ". Stack size: " << allocaToValueStackMap[alloca].size()
+                  << " -> " << (allocaToValueStackMap[alloca].size() - 1) << std::endl;
      }
+      allocaToValueStackMap[alloca].pop();
    }
  }
+
 }

 // 删除所有原始的 AllocaInst、LoadInst 和 StoreInst
@ -327,7 +370,6 @@ void Mem2RegContext::cleanup() {
    if (alloca && alloca->getParent()) {
      // 删除 alloca 指令本身
      SysYIROptUtils::usedelete(alloca);
-      alloca->getParent()->removeInst(alloca); // 从基本块中删除 alloca
      
      // std::cerr << "Mem2Reg: Deleted alloca " << alloca->name() << std::endl;
    }
@ -380,8 +422,9 @@ void Mem2Reg::getAnalysisUsage(std::set<void *> &analysisDependencies, std::set<
  // 因此，它会使许多分析结果失效。
  analysisInvalidations.insert(&DominatorTreeAnalysisPass::ID); // 支配树可能受影响
  analysisInvalidations.insert(&LivenessAnalysisPass::ID); // 活跃性分析肯定失效
+  analysisInvalidations.insert(&SysYAliasAnalysisPass::ID); // alias analysis must be invalidated because Mem2Reg changes the memory access pattern
+  analysisInvalidations.insert(&SysYSideEffectAnalysisPass::ID); // side-effect analysis may be invalidated as well
  // analysisInvalidations.insert(&LoopInfoAnalysisPass::ID); // 循环信息可能失效
-  // analysisInvalidations.insert(&SideEffectInfoAnalysisPass::ID); // 副作用分析可能失效
  // 其他所有依赖于数据流或 IR 结构的分析都可能失效。
 }
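The change to `renameVariables` replaces "pop the stack whose top equals the pushed value" with "pop every stack back to the depth recorded on entry". The value-matching approach can pop the wrong stack when the same value is current for two variables. A minimal sketch of the depth-based scheme on a toy dominator tree follows; `ToyBlock`, `ValueStacks`, and `renameToy` are hypothetical stand-ins, and integers play the role of SSA values.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stack>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a basic block in the dominator tree.
struct ToyBlock {
  std::string name;
  std::vector<std::pair<std::string, int>> stores;  // (variable, value) in program order
  std::vector<ToyBlock *> domChildren;              // dominator-tree children
};

using ValueStacks = std::map<std::string, std::stack<int>>;
using SeenMap = std::map<std::string, std::map<std::string, int>>;

// Walks the dominator tree. `seen[block][var]` records the value of `var`
// that is current at the end of `block`.
void renameToy(const ToyBlock &bb, ValueStacks &stacks, SeenMap &seen) {
  // 1. Remember how deep every stack is on entry.
  std::map<std::string, std::size_t> depthOnEntry;
  for (const auto &entry : stacks) depthOnEntry[entry.first] = entry.second.size();

  // 2. Every store pushes a new current value for its variable.
  for (const auto &st : bb.stores) stacks[st.first].push(st.second);

  // 3. Record what is visible here, then visit the dominated blocks.
  for (const auto &entry : stacks)
    if (!entry.second.empty()) seen[bb.name][entry.first] = entry.second.top();
  for (ToyBlock *child : bb.domChildren) renameToy(*child, stacks, seen);

  // 4. Pop back to the recorded depth; variables first seen in this
  //    subtree had an implicit entry depth of zero.
  for (auto &entry : stacks) {
    auto it = depthOnEntry.find(entry.first);
    std::size_t target = (it == depthOnEntry.end()) ? 0 : it->second;
    while (entry.second.size() > target) entry.second.pop();
  }
}
```

In a tree where the entry block stores 7 to `a` and one child stores 7 to `b`, matching by value on leaving that child would pop `a`, because its top is also 7. Restoring by depth leaves `a` intact for the sibling block.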

--- a/src/midend/Pass/Optimize/Reg2Mem.cpp
+++ b/src/midend/Pass/Optimize/Reg2Mem.cpp
@ -70,20 +70,20 @@ void Reg2MemContext::allocateMemoryForSSAValues(Function *func) {

  // 1. 为函数参数分配内存
  builder->setPosition(entryBlock, entryBlock->begin()); // 确保在入口块的开始位置插入
-  for (auto arg : func->getArguments()) {
-    // 默认情况下，将所有参数是提升到内存
-    if (isPromotableToMemory(arg)) {
-      // 参数的类型就是 AllocaInst 需要分配的类型
-      AllocaInst *alloca = builder->createAllocaInst(Type::getPointerType(arg->getType()), {}, arg->getName() + ".reg2mem");
-      // 将参数值 store 到 alloca 中 (这是 Mem2Reg 逆转的关键一步)
-      valueToAllocaMap[arg] = alloca;
+  // for (auto arg : func->getArguments()) {
+  //   // By default, demote every argument to memory
+  //   if (isPromotableToMemory(arg)) {
+  //     // The argument's type is the type the AllocaInst has to allocate
+  //     AllocaInst *alloca = builder->createAllocaInst(Type::getPointerType(arg->getType()), arg->getName() + ".reg2mem");
+  //     // Store the argument value into the alloca (the key step in reversing Mem2Reg)
+  //     valueToAllocaMap[arg] = alloca;

-      // 确保 alloca 位于入口块的顶部，但在所有参数的 store 指令之前
-      // 通常 alloca 都在 entry block 的最开始
-      // 这里我们只是创建，并让 builder 决定插入位置 (通常在当前插入点)
-      // 如果需要严格控制顺序，可能需要手动 insert 到 instruction list
-    }
-  }
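+  //     // Make sure the alloca sits at the top of the entry block, before the stores of all arguments
+  //     // Allocas are normally placed at the very beginning of the entry block
+  //     // Here we only create it and let the builder choose the insertion point (usually the current one)
+  //     // If strict ordering is required, insert into the instruction list manually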
+  //   }
+  // }

  // 2. 为指令结果分配内存
  // 遍历所有基本块和指令，找出所有需要分配 Alloca 的指令结果
@ -103,7 +103,7 @@ void Reg2MemContext::allocateMemoryForSSAValues(Function *func) {
        // AllocaInst 应该在入口块，而不是当前指令所在块
        // 这里我们只是创建，并稍后调整其位置
        // 通常的做法是在循环结束后统一将 alloca 放到 entryBlock 的顶部
-        AllocaInst *alloca = builder->createAllocaInst(Type::getPointerType(inst.get()->getType()), {}, inst.get()->getName() + ".reg2mem");
+        AllocaInst *alloca = builder->createAllocaInst(Type::getPointerType(inst.get()->getType()), inst.get()->getName() + ".reg2mem");
        valueToAllocaMap[inst.get()] = alloca;
      }
    }
@ -123,11 +123,11 @@ void Reg2MemContext::allocateMemoryForSSAValues(Function *func) {
  }

  // 插入所有参数的初始 Store 指令
-  for (auto arg : func->getArguments()) {
-      if (valueToAllocaMap.count(arg)) { // 检查是否为其分配了 alloca
-          builder->createStoreInst(arg, valueToAllocaMap[arg]);
-      }
-  }
+  // for (auto arg : func->getArguments()) {
+  //     if (valueToAllocaMap.count(arg)) { // check whether an alloca was allocated for it
+  //         builder->createStoreInst(arg, valueToAllocaMap[arg]);
+  //     }
+  // }
  
  builder->setPosition(entryBlock, entryBlock->terminator());
 }
@ -148,8 +148,8 @@ void Reg2MemContext::rewritePhis(Function *func) {
          // 1. 为 Phi 指令的每个入边，在前驱块的末尾插入 Store 指令
          // PhiInst 假设有 getIncomingValues() 和 getIncomingBlocks()
          for (unsigned i = 0; i < phiInst->getNumIncomingValues(); ++i) {         // 假设 PhiInst 是通过操作数来管理入边的
-            Value *incomingValue = phiInst->getValue(i);                   // 获取入值
-            BasicBlock *incomingBlock = phiInst->getBlock(i); // 获取对应的入块
+            Value *incomingValue = phiInst->getIncomingValue(i);                   // get the incoming value
+            BasicBlock *incomingBlock = phiInst->getIncomingBlock(i); // get the corresponding incoming block

            // 在入块的跳转指令之前插入 StoreInst
            // 需要找到 incomingBlock 的终结指令 (Terminator Instruction)
@ -181,8 +181,7 @@ void Reg2MemContext::rewritePhis(Function *func) {
  // 实际删除 Phi 指令
  for (auto phi : phisToErase) {
    if (phi && phi->getParent()) {
-      SysYIROptUtils::usedelete(phi);    // 清理 use-def 链
-      phi->getParent()->removeInst(phi); // 从基本块中删除
+      SysYIROptUtils::usedelete(phi);
    }
  }
 }
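Several hunks in this commit replace the two-step "`usedelete(inst)` then `erase(iter)`" sequence with a single `iter = SysYIROptUtils::usedelete(iter)` call. The idea can be sketched as follows. This is an illustration under assumptions, not the project's implementation: `ToyInst`, `InstList`, `append`, `addOperand`, and `eraseWithUses` are hypothetical names, and the real `usedelete` takes only the iterator.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <memory>
#include <string>
#include <vector>

// Hypothetical instruction with both directions of the use relation.
struct ToyInst {
  std::string name;
  std::vector<ToyInst *> operands;  // values this instruction uses
  std::vector<ToyInst *> users;     // instructions that use this value
};

using InstList = std::list<std::unique_ptr<ToyInst>>;

// Appends a fresh instruction and returns a non-owning pointer to it.
ToyInst *append(InstList &insts, const std::string &name) {
  insts.push_back(std::make_unique<ToyInst>());
  insts.back()->name = name;
  return insts.back().get();
}

// Registers `value` as an operand of `user`, updating both sides.
void addOperand(ToyInst *user, ToyInst *value) {
  user->operands.push_back(value);
  value->users.push_back(user);
}

// Unlinks the instruction from the user lists of its operands, erases it
// from the list, and returns the iterator to the following instruction.
InstList::iterator eraseWithUses(InstList &insts, InstList::iterator it) {
  ToyInst *inst = it->get();
  for (ToyInst *value : inst->operands) {
    auto &users = value->users;
    users.erase(std::remove(users.begin(), users.end(), inst), users.end());
  }
  inst->operands.clear();
  return insts.erase(it);  // std::list::erase returns the next valid iterator
}
```

Returning the iterator keeps deletion loops in the form `it = eraseWithUses(insts, it)` for deleted instructions and `++it` otherwise, so the caller never touches an invalidated iterator and no operand keeps a dangling user pointer.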
--- a/src/midend/Pass/Optimize/SCCP.cpp
+++ b/src/midend/Pass/Optimize/SCCP.cpp
--- a/src/midend/Pass/Optimize/SysYIRCFGOpt.cpp
+++ b/src/midend/Pass/Optimize/SysYIRCFGOpt.cpp
@ -1,12 +1,12 @@
 #include "SysYIRCFGOpt.h"
 #include "SysYIROptUtils.h"
 #include <cassert>
+#include <iostream>
 #include <list>
 #include <map>
 #include <memory>
-#include <string>
-#include <iostream>
 #include <queue> // 引入队列，SysYDelNoPreBLock需要
+#include <string>

 namespace sysy {

@ -18,7 +18,6 @@ void *SysYBlockMergePass::ID = (void *)&SysYBlockMergePass::ID;
 void *SysYAddReturnPass::ID = (void *)&SysYAddReturnPass::ID;
 void *SysYCondBr2BrPass::ID = (void *)&SysYCondBr2BrPass::ID;

-
 // ======================================================================
 // SysYCFGOptUtils: 辅助工具类，包含实际的CFG优化逻辑
 // ======================================================================
@ -26,40 +25,42 @@ void *SysYCondBr2BrPass::ID = (void *)&SysYCondBr2BrPass::ID;
 // 删除br后的无用指令
 bool SysYCFGOptUtils::SysYDelInstAfterBr(Function *func) {
  bool changed = false;
-  
+
  auto basicBlocks = func->getBasicBlocks();
  for (auto &basicBlock : basicBlocks) {
    bool Branch = false;
    auto &instructions = basicBlock->getInstructions();
    auto Branchiter = instructions.end();
    for (auto iter = instructions.begin(); iter != instructions.end(); ++iter) {
-      if ((*iter)->isTerminator()){
+      if ((*iter)->isTerminator()) {
        Branch = true;
        Branchiter = iter;
        break;
      }
    }
-    if (Branchiter != instructions.end()) ++Branchiter;
+    if (Branchiter != instructions.end())
+      ++Branchiter;
    while (Branchiter != instructions.end()) {
      changed = true;
-      Branchiter = instructions.erase(Branchiter);
+      Branchiter = SysYIROptUtils::usedelete(Branchiter); // delete the instruction
    }
-    
-    if (Branch) {  // 更新前驱后继关系
-      auto thelastinstinst = basicBlock->getInstructions().end();
-      --thelastinstinst;
+
+    if (Branch) { // update the predecessor/successor relations
+      auto thelastinstinst = basicBlock->terminator();
      auto &Successors = basicBlock->getSuccessors();
      for (auto iterSucc = Successors.begin(); iterSucc != Successors.end();) {
        (*iterSucc)->removePredecessor(basicBlock.get());
        basicBlock->removeSuccessor(*iterSucc);
      }
      if (thelastinstinst->get()->isUnconditional()) {
-        BasicBlock* branchBlock = dynamic_cast<BasicBlock *>(thelastinstinst->get()->getOperand(0));
+        auto brinst = dynamic_cast<UncondBrInst *>(thelastinstinst->get());
+        BasicBlock *branchBlock = dynamic_cast<BasicBlock *>(brinst->getBlock());
        basicBlock->addSuccessor(branchBlock);
        branchBlock->addPredecessor(basicBlock.get());
      } else if (thelastinstinst->get()->isConditional()) {
-        BasicBlock* thenBlock = dynamic_cast<BasicBlock *>(thelastinstinst->get()->getOperand(1));
-        BasicBlock* elseBlock = dynamic_cast<BasicBlock *>(thelastinstinst->get()->getOperand(2));
+        auto brinst = dynamic_cast<CondBrInst *>(thelastinstinst->get());
+        BasicBlock *thenBlock = dynamic_cast<BasicBlock *>(brinst->getThenBlock());
+        BasicBlock *elseBlock = dynamic_cast<BasicBlock *>(brinst->getElseBlock());
        basicBlock->addSuccessor(thenBlock);
        basicBlock->addSuccessor(elseBlock);
        thenBlock->addPredecessor(basicBlock.get());
@ -75,38 +76,48 @@ bool SysYCFGOptUtils::SysYDelInstAfterBr(Function *func) {
 bool SysYCFGOptUtils::SysYBlockMerge(Function *func) {
  bool changed = false;

-  for (auto blockiter = func->getBasicBlocks().begin();
-        blockiter != func->getBasicBlocks().end();) {
+  for (auto blockiter = func->getBasicBlocks().begin(); blockiter != func->getBasicBlocks().end();) {
+    // Check whether the current block is the entry block
+    if (blockiter->get() == func->getEntryBlock()) {
+      blockiter++;
+      continue; // skip the entry block
+    }
    if (blockiter->get()->getNumSuccessors() == 1) {
      // 如果当前块只有一个后继块
      // 且后继块只有一个前驱块
      // 则将当前块和后继块合并
      if (((blockiter->get())->getSuccessors()[0])->getNumPredecessors() == 1) {
        // std::cout << "merge block: " << blockiter->get()->getName() << std::endl;
-        BasicBlock* block = blockiter->get();
-        BasicBlock* nextBlock = blockiter->get()->getSuccessors()[0];
+        BasicBlock *block = blockiter->get();
+        BasicBlock *nextBlock = blockiter->get()->getSuccessors()[0];
        // auto nextarguments = nextBlock->getArguments();
-        // 删除br指令
+        // Delete the br instruction of block
        if (block->getNumInstructions() != 0) {
-          auto thelastinstinst = block->end();
-          (--thelastinstinst);
+          auto thelastinstinst = block->terminator();
          if (thelastinstinst->get()->isUnconditional()) {
-            SysYIROptUtils::usedelete(thelastinstinst->get());
-            thelastinstinst = block->getInstructions().erase(thelastinstinst);
+            thelastinstinst = SysYIROptUtils::usedelete(thelastinstinst);
          } else if (thelastinstinst->get()->isConditional()) {
-            // 如果是条件分支，判断条件是否相同，主要优化相同布尔表达式
-            if (thelastinstinst->get()->getOperand(1)->getName() == thelastinstinst->get()->getOperand(1)->getName()) {
-              SysYIROptUtils::usedelete(thelastinstinst->get());
-              thelastinstinst = block->getInstructions().erase(thelastinstinst); 
+            // In principle this branch should be unreachable
+            // For a conditional branch, check whether the then and else targets are identical
+            auto brinst = dynamic_cast<CondBrInst *>(thelastinstinst->get());
+            if (brinst->getThenBlock() == brinst->getElseBlock()) {
+              thelastinstinst = SysYIROptUtils::usedelete(thelastinstinst);
+            } else {
+              assert(false && "SysYBlockMerge: unexpected conditional branch with different then and else blocks");
            }
          }
        }
        // 将后继块的指令移动到当前块
        // 并将后继块的父指针改为当前块
        for (auto institer = nextBlock->begin(); institer != nextBlock->end();) {
-          institer->get()->setParent(block);
-          block->getInstructions().emplace_back(institer->release());
-          institer = nextBlock->getInstructions().erase(institer);   
+          // institer->get()->setParent(block);
+          // block->getInstructions().emplace_back(institer->release());
+          // usedelete would also remove the use relations; here the instruction should only be moved into the current block
+          // institer = SysYIROptUtils::usedelete(institer);
+          // institer = nextBlock->getInstructions().erase(institer);
+          institer = nextBlock->moveInst(institer, block->getInstructions().end(), block);
+          
        }
        // 更新前驱后继关系，类似树节点操作
        block->removeSuccessor(nextBlock);
@ -137,323 +148,433 @@ bool SysYCFGOptUtils::SysYBlockMerge(Function *func) {

 // 删除无前驱块，兼容SSA后的处理
 bool SysYCFGOptUtils::SysYDelNoPreBLock(Function *func) {
-  
-  bool changed = false;
+  bool changed = false;                   // Whether any basic block was deleted
+  std::set<BasicBlock *> reachableBlocks; // All reachable basic blocks
+  std::queue<BasicBlock *> blockQueue;    // Work queue for the BFS traversal

-  for (auto &block : func->getBasicBlocks()) {
-    block->setreachableFalse();
+  BasicBlock *entryBlock = func->getEntryBlock();
+  if (entryBlock) {                     // Make sure the function has an entry block
+    reachableBlocks.insert(entryBlock); // Mark the entry block as reachable
+    blockQueue.push(entryBlock);        // Enqueue the entry block
  }
-  // Topologically sort the function's basic blocks to find unreachable ones
-  auto entryBlock = func->getEntryBlock();
-  entryBlock->setreachableTrue();
-  std::queue<BasicBlock *> blockqueue;
-  blockqueue.push(entryBlock);
-  while (!blockqueue.empty()) {
-    auto block = blockqueue.front();
-    blockqueue.pop();
-    for (auto &succ : block->getSuccessors()) {
-      if (!succ->getreachable()) {
-        succ->setreachableTrue();
-        blockqueue.push(succ);
+  // If there is no entry block (e.g. an empty function), no block is reachable and all blocks will be deleted.
+
+  while (!blockQueue.empty()) { // BFS traversal: loop while the queue is non-empty
+    BasicBlock *currentBlock = blockQueue.front();
+    blockQueue.pop(); // Dequeue the current block
+
+    for (auto &succ : currentBlock->getSuccessors()) { // Visit every successor of the current block
+      // If the successor is not in reachableBlocks (i.e. not visited yet)
+      if (reachableBlocks.find(succ) == reachableBlocks.end()) {
+        reachableBlocks.insert(succ); // Mark as reachable
+        blockQueue.push(succ);        // Enqueue to continue the traversal
      }
    }
  }

-  // Delete the instructions of unreachable basic blocks
-  for (auto blockIter = func->getBasicBlocks().begin(); blockIter != func->getBasicBlocks().end(); blockIter++) {
-    if (!blockIter->get()->getreachable()) {
-      for (auto instIter = blockIter->get()->getInstructions().begin();
-          instIter != blockIter->get()->getInstructions().end();) {
-        SysYIROptUtils::usedelete(instIter->get());
-        instIter = blockIter->get()->getInstructions().erase(instIter);
+  std::vector<BasicBlock *> blocksToDelete; // All unreachable basic blocks
+
+  for (auto &blockPtr : func->getBasicBlocks()) {
+    BasicBlock *block = blockPtr.get();
+    // A block that is not in reachableBlocks is unreachable
+    if (reachableBlocks.find(block) == reachableBlocks.end()) {
+      blocksToDelete.push_back(block); // Add it to the deletion list
+      changed = true;                  // Finding any unreachable block means the function changes
+    }
+  }
+
+  for (BasicBlock *unreachableBlock : blocksToDelete) {
+    // Delete every instruction in the unreachable block
+    for (auto instIter = unreachableBlock->getInstructions().begin();
+         instIter != unreachableBlock->getInstructions().end();) {
+      instIter = SysYIROptUtils::usedelete(instIter);
+    }
+  }
+
+  for (BasicBlock *unreachableBlock : blocksToDelete) {
+    for (BasicBlock *succBlock : unreachableBlock->getSuccessors()) {
+      // Only successors that are themselves reachable (not being deleted) need handling
+      if (reachableBlocks.count(succBlock)) {
+        for (auto &phiInstPtr : succBlock->getInstructions()) {
+          // Phi instructions always sit at the start of a basic block; stop at the first non-Phi instruction.
+          if (phiInstPtr->getKind() != Instruction::kPhi) {
+            break;
+          }
+          // Remove the incoming entry from the unreachable predecessor (unreachableBlock) in this Phi node
+          dynamic_cast<PhiInst *>(phiInstPtr.get())->removeIncomingBlock(unreachableBlock);
+        }
      }
    }
  }

-  
  for (auto blockIter = func->getBasicBlocks().begin(); blockIter != func->getBasicBlocks().end();) {
-    if (!blockIter->get()->getreachable()) {
-      for (auto succblock : blockIter->get()->getSuccessors()) {
-        for (auto &phiinst : succblock->getInstructions()) {
-        if (phiinst->getKind() != Instruction::kPhi) {
-          break;
-        }
-          // Use delBlk to correctly remove the incoming value for the deleted basic block
-          dynamic_cast<PhiInst *>(phiinst.get())->delBlk(blockIter->get());
-        }
-      }
-      // Delete the unreachable basic block; beware of iterator invalidation
+    BasicBlock *currentBlock = blockIter->get();
+    // If the current block is not in the reachable set, remove it from the function
+    if (reachableBlocks.find(currentBlock) == reachableBlocks.end()) {
+      // func->removeBasicBlock should return the next valid iterator; for now the iterator is advanced before the removal
      func->removeBasicBlock((blockIter++)->get());
-      changed = true;
    } else {
-      blockIter++;
+      blockIter++; // Reachable: advance to the next block
    }
  }
-  
+
  return changed;
 }

-// Delete empty blocks
-bool SysYCFGOptUtils::SysYDelEmptyBlock(Function *func, IRBuilder* pBuilder) {
+bool SysYCFGOptUtils::SysYDelEmptyBlock(Function *func, IRBuilder *pBuilder) {
  bool changed = false;

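-  // Collect unreachable basic blocks
-  // Here an unreachable basic block means one with no real instructions
-  // A block with no real instructions, e.g. only phi instructions and one uncondbr, is also treated as unreachable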
-  auto basicBlocks = func->getBasicBlocks();
-  std::map<sysy::BasicBlock *, BasicBlock *> EmptyBlocks;
-  // Map from each empty block to its successor block
-  for (auto &basicBlock : basicBlocks) {
-    if (basicBlock->getNumInstructions() == 0) {
-      if (basicBlock->getNumSuccessors() == 1) {
-        EmptyBlocks[basicBlock.get()] = basicBlock->getSuccessors().front();
-      }
-    }
-    else{
-      // If it contains only phi instructions and one uncondbr: (phi)*(uncondbr)?
-        // Check whether everything except the last instruction is a phi instruction
-      bool onlyPhi = true;
-      for (auto &inst : basicBlock->getInstructions()) {
-        if (!inst->isPhi() && !inst->isUnconditional()) {
-          onlyPhi = false;
-          break;
-        }
-      }
-      if(onlyPhi && basicBlock->getNumSuccessors() == 1) // Make sure there is exactly one successor
-        EmptyBlocks[basicBlock.get()] = basicBlock->getSuccessors().front();
-    }
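+  // Step 1: Identify every basic block that meets the "empty block" definition and map it to its target successor
+  // Use a std::map to store <empty block, jump target of the empty block>
+  // This handles chains of empty blocks: A -> B -> C, where if B is empty, A should jump to C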
+  std::map<BasicBlock *, BasicBlock *> emptyBlockRedirectMap;
+
+  // To avoid iterator invalidation from modifying func->getBasicBlocks() while iterating over it,
+  // collect all basic blocks first.
+  std::vector<BasicBlock *> allBlocks;
+  for (auto &blockPtr : func->getBasicBlocks()) {
+    allBlocks.push_back(blockPtr.get());
  }
-  // Update basic block information and add necessary instructions
-  for (auto &basicBlock : basicBlocks) {
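-    // Turn an empty block into an unreachable block with only a jump (this logic may need adjusting in the pass; kept as-is here)
-    // Normally DelEmptyBlock should run after BlockMerge; if a completely empty block exists, it tries to fill in a Br instruction.
-    // However, its main purpose is to redirect jumps.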
-    if (distance(basicBlock->begin(), basicBlock->end()) == 0) {
-      if (basicBlock->getNumSuccessors() == 0) {
-        continue;
-      }
-      if (basicBlock->getNumSuccessors() > 1) {
-        // An empty block with multiple successors means the CFG is malformed or needs special handling; simply assert here
-        assert(false && "Empty block with multiple successors found during SysYDelEmptyBlock");
-      }
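-      // This logic is questionable: an empty block with a single successor should jump straight to that successor.
-      // If this block is eventually deleted, its predecessors must be redirected as well.
-      // The purpose of this loop is to redirect existing jump instructions, not to create new ones.
-      // So the logic below is the core part.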
-      // pBuilder->setPosition(basicBlock.get(), basicBlock->end());
-      // pBuilder->createUncondBrInst(basicBlock->getSuccessors()[0], {});
+
+  for (BasicBlock *block : allBlocks) {
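+    // The entry block should normally not be treated as an empty block and deleted, unless it has no real instructions and a single successor,
+    // but to be safe, deletion of the entry block is skipped.
+    // An empty entry block should be merged into its successor, but that is more complex, so it is not handled here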
+    if (block == func->getEntryBlock()) {
      continue;
    }

-    auto thelastinst = basicBlock->getInstructions().end();
-    --thelastinst;
-
-    // Follow the successor information carried by the br instruction and skip the chain of empty blocks
-    if (thelastinst->get()->isUnconditional()) {
-      BasicBlock* OldBrBlock = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0));
-      BasicBlock *thelastBlockOld = nullptr;
-      // If the empty-block chain contains multiple blocks
-      while (EmptyBlocks.count(dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0)))) {
-        thelastBlockOld = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0));
-        thelastinst->get()->replaceOperand(0, EmptyBlocks[thelastBlockOld]);
-      }
-
-      // If a redirection happened
-      if (thelastBlockOld != nullptr) {
-          basicBlock->removeSuccessor(OldBrBlock);
-          OldBrBlock->removePredecessor(basicBlock.get());
-          basicBlock->addSuccessor(dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0)));
-          dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0))->addPredecessor(basicBlock.get());
-          changed = true; // Mark the IR as modified
-      }
-
-
-      if (thelastBlockOld != nullptr) {
-        for (auto &InstInNew : dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0))->getInstructions()) {
-        if (InstInNew->isPhi()) {
-          // Use delBlk to remove the incoming value for oldBlock
-          dynamic_cast<PhiInst *>(InstInNew.get())->delBlk(thelastBlockOld);
-        } else {
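+    // Check whether the basic block is empty: apart from Phi instructions it contains only one terminator,
+    // and that terminator must be an unconditional jump.
+    // An empty block must have exactly one successor to be simplified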
+    if (block->getNumSuccessors() == 1) {
+      bool hasNonPhiNonTerminator = false;
+      // Walk the instructions, checking everything that precedes the terminator
+      for (auto instIter = block->getInstructions().begin(); instIter != block->getInstructions().end();) {
+        // A terminator (e.g. br, ret) that is not the last instruction means the block is malformed
+        if ((*instIter)->isTerminator() && instIter != block->terminator()) {
+          hasNonPhiNonTerminator = true;
          break;
        }
-  }
-      }
-
-    } else if (thelastinst->get()->getKind() == Instruction::kCondBr) {
-      auto OldThenBlock = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1));
-      auto OldElseBlock = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2));
-      bool thenChanged = false;
-      bool elseChanged = false;
-
-
-      BasicBlock *thelastBlockOld = nullptr;
-      while (EmptyBlocks.count(dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1)))) {
-        thelastBlockOld = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1));
-        thelastinst->get()->replaceOperand(
-            1, EmptyBlocks[dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1))]);
-        thenChanged = true;
-      }
-
-      if (thenChanged) {
-        basicBlock->removeSuccessor(OldThenBlock);
-        OldThenBlock->removePredecessor(basicBlock.get());
-        basicBlock->addSuccessor(dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1)));
-        dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1))->addPredecessor(basicBlock.get());
-        changed = true; // Mark the IR as modified
-      }
-      
-      // Handle the case where the then and else branches merge
-      if (dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1)) ==
-          dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2))) {
-        auto thebrBlock = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1));
-        SysYIROptUtils::usedelete(thelastinst->get());
-        thelastinst = basicBlock->getInstructions().erase(thelastinst);
-        pBuilder->setPosition(basicBlock.get(), basicBlock->end());
-        pBuilder->createUncondBrInst(thebrBlock, {});
-        changed = true; // Mark the IR as modified
-        continue;
-      }
-      
-      if (thelastBlockOld != nullptr) {
-        for (auto &InstInNew : dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1))->getInstructions()) {
-          if (InstInNew->isPhi()) {
-            // Use delBlk to remove the incoming value for oldBlock
-            dynamic_cast<PhiInst *>(InstInNew.get())->delBlk(thelastBlockOld);
-          } else {
-            break;
+        // If it is neither a Phi instruction nor a terminator
+        if (!(*instIter)->isPhi() && !(*instIter)->isTerminator()) {
+          hasNonPhiNonTerminator = true;
+          break;
+        }
+        ++instIter;
+        if (!hasNonPhiNonTerminator &&
+            instIter == block->getInstructions().end()) { // The block contains only Phi instructions and one terminator
+          // Make sure the last instruction is an unconditional jump
+          auto lastInst = block->terminator()->get();
+          if (lastInst && lastInst->isUnconditional()) {
+            emptyBlockRedirectMap[block] = block->getSuccessors().front();
          }
-       }
-      }
-
-      thelastBlockOld = nullptr;
-      while (EmptyBlocks.count(dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2)))) {
-        thelastBlockOld = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2));
-        thelastinst->get()->replaceOperand(
-            2, EmptyBlocks[dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2))]);
-        elseChanged = true;
-      }
-      
-      if (elseChanged) {
-        basicBlock->removeSuccessor(OldElseBlock);
-        OldElseBlock->removePredecessor(basicBlock.get());
-        basicBlock->addSuccessor(dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2)));
-        dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2))->addPredecessor(basicBlock.get());
-        changed = true; // Mark the IR as modified
-      }
-
-      // Handle the case where the then and else branches merge
-      if (dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1)) ==
-          dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2))) {
-        auto thebrBlock = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(1));
-        SysYIROptUtils::usedelete(thelastinst->get());
-        thelastinst = basicBlock->getInstructions().erase(thelastinst);
-        pBuilder->setPosition(basicBlock.get(), basicBlock->end());
-        pBuilder->createUncondBrInst(thebrBlock, {});
-        changed = true; // Mark the IR as modified
-        continue;
-      }
-      
-
-      // If a redirection happened,
-      // the successor block's predecessor relations must be updated
-      if (thelastBlockOld != nullptr) {
-        for (auto &InstInNew : dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(2))->getInstructions()) {
-          if (InstInNew->isPhi()) {
-            // Use delBlk to remove the incoming value for oldBlock
-            dynamic_cast<PhiInst *>(InstInNew.get())->delBlk(thelastBlockOld);
-          } else {
-            break;
-          }
-        } 
-      }
-      
-    } else {
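-      // Not a terminator, but the block has a successor (e.g. a block without an explicit terminator at the end)
-      // This logic may need stricter CFG checks to ensure correctness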
-      if (basicBlock->getNumSuccessors() == 1) {
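-        // This logic seems intended to add a terminator to a block without one, but that should normally happen during CFG construction.
-        // If it still runs here, make sure it behaves as expected.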
-        // pBuilder->setPosition(basicBlock.get(), basicBlock->end());
-        // pBuilder->createUncondBrInst(basicBlock->getSuccessors()[0], {});
-        // auto thelastinst = basicBlock->getInstructions().end();
-        // (--thelastinst);
-        // auto OldBrBlock = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0));
-        // sysy::BasicBlock *thelastBlockOld = nullptr;
-        // while (EmptyBlocks.find(dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0))) !=
-        //         EmptyBlocks.end()) {
-        //   thelastBlockOld = dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0));
-
-        //   thelastinst->get()->replaceOperand(
-        //       0, EmptyBlocks[dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0))]);
-        // }
-        
-        // basicBlock->removeSuccessor(OldBrBlock);
-        // OldBrBlock->removePredecessor(basicBlock.get());
-        // basicBlock->addSuccessor(dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0)));
-        // dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0))->addPredecessor(basicBlock.get());
-        // changed = true; // Mark the IR as modified
-        // if (thelastBlockOld != nullptr) {
-        //   int indexphi = 0;
-        //   for (auto &pred : dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0))->getPredecessors()) {
-        //     if (pred == thelastBlockOld) {
-        //       break;
-        //     }
-        //     indexphi++;
-        //   }
-
-        //   for (auto &InstInNew : dynamic_cast<BasicBlock *>(thelastinst->get()->getOperand(0))->getInstructions()) {
-        //     if (InstInNew->isPhi()) {
-        //       dynamic_cast<PhiInst *>(InstInNew.get())->removeOperand(indexphi + 1);
-        //     } else {
-        //       break;
-        //     }
-        //   }
-        // }
+        }
      }
    }
  }

-  // Actually delete the empty blocks
-  for (auto iter = func->getBasicBlocks().begin(); iter != func->getBasicBlocks().end();) {
-    
-    if (EmptyBlocks.count(iter->get())) {
-      // Skip the EntryBlock
-      if (iter->get() == func->getEntryBlock()) {
-        ++iter;
-        continue;
+  // Step 2: Walk emptyBlockRedirectMap and resolve chains of empty blocks
+  // so that every empty block redirects straight to its final non-empty successor
+  for (auto const &[emptyBlock, directSucc] : emptyBlockRedirectMap) {
+    BasicBlock *targetBlock = directSucc;
+    // Follow the chain of empty blocks until the final non-empty target is found
+    while (emptyBlockRedirectMap.count(targetBlock)) {
+      targetBlock = emptyBlockRedirectMap[targetBlock];
+    }
+    emptyBlockRedirectMap[emptyBlock] = targetBlock; // Update the mapping to the final target
+  }
+
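+  // Step 3: Walk all basic blocks and redirect their terminators to bypass empty blocks
+  // Note: all blocks must be visited again here, including blocks that may become new targets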
+  for (BasicBlock *currentBlock : allBlocks) {
+    // If currentBlock is itself an empty block, it is handled via the redirection of its predecessors; skip it here
+    if (emptyBlockRedirectMap.count(currentBlock)) {
+      continue;
+    }
+
+    // Get the last instruction (the terminator) of the current block
+    if (currentBlock->getInstructions().empty()) {
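+      // In theory every block should have a terminator, except the entry block and empty blocks that may be merged
+      // Hitting an empty block here may indicate a logic error or a case that needs special handling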
+      continue;
+    }
+
+    std::function<Value *(Value *, BasicBlock *)> getUltimateSourceValue = [&](Value *val, BasicBlock *currentDefBlock) -> Value * {
+      
+      if(!dynamic_cast<Instruction *>(val)) {
+        // If val is not an instruction, return it directly
+        return val;
+      }
+      Instruction *inst = dynamic_cast<Instruction *>(val);
+      // If the defining instruction is not in any empty block, it is the ultimate source
+      if (!emptyBlockRedirectMap.count(currentDefBlock)) {
+        return val;
      }

-      for (auto instIter = iter->get()->getInstructions().begin();
-         instIter != iter->get()->getInstructions().end();) {
-        SysYIROptUtils::usedelete(instIter->get()); // Only removes the use relations
-        // Explicitly erase the instruction from the basic block and update the iterator
-        instIter = iter->get()->getInstructions().erase(instIter);
+      // If it is a Phi instruction inside an empty block, keep tracing the incoming value from its predecessor in the empty-block chain
+      if (inst->getKind() == Instruction::kPhi) {
+        PhiInst *phi = dynamic_cast<PhiInst *>(inst);
+        // Find which predecessor is the previous block in the empty-block chain
+        for (size_t i = 0; i < phi->getNumOperands(); i += 2) {
+          BasicBlock *incomingBlock = dynamic_cast<BasicBlock *>(phi->getOperand(i + 1));
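+          // Check whether incomingBlock is a predecessor of the current empty block and is also in the empty-block map (or is P itself),
+          // i.e. find the predecessor in the empty-block chain that leads to currentDefBlock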
+          if (emptyBlockRedirectMap.count(incomingBlock) || incomingBlock == currentBlock) {
+            // Recursively trace that incoming value
+            return getUltimateSourceValue(phi->getValfromBlk(incomingBlock), incomingBlock);
+          }
+        }
      }
-      // Remove the operands of phi instructions that come from the unreachable basic block
-      for (auto &succ : iter->get()->getSuccessors()) {
-        for (auto &instinsucc : succ->getInstructions()) {
-          if (instinsucc->isPhi()) {
-            // iter->get() is the empty basic block being deleted; it is a predecessor feeding the Phi instructions here
-            dynamic_cast<PhiInst *>(instinsucc.get())->delBlk(iter->get());
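+      // For any other instruction, or if the Phi chain cannot be traced, the value is produced inside an empty block and cannot be propagated safely; return null or the original value
+      // Under the strict empty-block definition, no instruction other than Phi and the terminator should produce a value.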
+      return val; // Fallback: If not a Phi, or unable to trace, return itself (may be dangling)
+    };
+
+    auto lastInst = currentBlock->getInstructions().back().get();
+
+    if (lastInst->isUnconditional()) { // Unconditional jump
+      UncondBrInst *brInst = dynamic_cast<UncondBrInst *>(lastInst);
+      BasicBlock *oldTarget = dynamic_cast<BasicBlock *>(brInst->getBlock()); // Original jump target
+
+      if (emptyBlockRedirectMap.count(oldTarget)) {               // The target is an empty block
+        BasicBlock *newTarget = emptyBlockRedirectMap[oldTarget]; // Get the final target
+
+        // Update the CFG relations
+        currentBlock->removeSuccessor(oldTarget);
+        oldTarget->removePredecessor(currentBlock);
+
+        brInst->replaceOperand(0, newTarget); // Update the operand of the jump instruction
+        currentBlock->addSuccessor(newTarget);
+        newTarget->addPredecessor(currentBlock);
+
+        changed = true; // Mark the change
+
+        for (auto &phiInstPtr : newTarget->getInstructions()) {
+          if (phiInstPtr->getKind() == Instruction::kPhi) {
+            PhiInst *phiInst = dynamic_cast<PhiInst *>(phiInstPtr.get());
+            BasicBlock *actualEmptyPredecessorOfS = nullptr;
+            for (size_t i = 0; i < phiInst->getNumOperands(); i += 2) {
+              BasicBlock *incomingBlock = dynamic_cast<BasicBlock *>(phiInst->getOperand(i + 1));
+              if (incomingBlock && emptyBlockRedirectMap.count(incomingBlock) &&
+                  emptyBlockRedirectMap[incomingBlock] == newTarget) {
+                actualEmptyPredecessorOfS = incomingBlock;
+                break;
+              }
+            }
+
+            if (actualEmptyPredecessorOfS) {
+              // Get the value the Phi node originally received from actualEmptyPredecessorOfS
+              Value *valueFromEmptyPredecessor = phiInst->getValfromBlk(actualEmptyPredecessorOfS);
+
+              // Trace this value back to its ultimate source in a non-empty block
+              // currentBlock is P
+              // oldTarget is E1 (start of the chain)
+              // actualEmptyPredecessorOfS is En (end of the chain, the predecessor of S)
+              Value *ultimateSourceValue = getUltimateSourceValue(valueFromEmptyPredecessor, actualEmptyPredecessorOfS);
+
+              // Replace the incoming block and incoming value of the Phi node
+              if (ultimateSourceValue) { // Make sure a valid source was traced
+                // phiInst->replaceIncoming(actualEmptyPredecessorOfS, currentBlock, ultimateSourceValue);
+                phiInst->replaceIncomingBlock(actualEmptyPredecessorOfS, currentBlock, ultimateSourceValue);
+              } else {
+                assert(false && "[DelEmptyBlock] Unable to trace a valid source for Phi instruction");
+                // No valid source could be traced; this may be a bug or a special case
+                // The Phi entry may need to be removed here, or an undef value inserted
+                phiInst->removeIncomingBlock(actualEmptyPredecessorOfS);
+              }
+            }
          } else {
-            // Phi instructions normally sit at the start of a basic block; stop checking at the first non-Phi instruction
            break;
          }
        }
      }

-      func->removeBasicBlock((iter++)->get());
-      changed = true;
-    } else {
-      ++iter;
+    } else if (lastInst->getKind() == Instruction::kCondBr) { // Conditional jump
+      CondBrInst *condBrInst = dynamic_cast<CondBrInst *>(lastInst);
+      BasicBlock *oldThenTarget = dynamic_cast<BasicBlock *>(condBrInst->getThenBlock());
+      BasicBlock *oldElseTarget = dynamic_cast<BasicBlock *>(condBrInst->getElseBlock());
+
+      bool thenPathChanged = false;
+      bool elsePathChanged = false;
+
+      // Handle the Then branch
+      if (emptyBlockRedirectMap.count(oldThenTarget)) {
+        BasicBlock *newThenTarget = emptyBlockRedirectMap[oldThenTarget];
+        condBrInst->replaceOperand(1, newThenTarget); // Update the operand of the jump instruction
+
+        currentBlock->removeSuccessor(oldThenTarget);
+        oldThenTarget->removePredecessor(currentBlock);
+        currentBlock->addSuccessor(newThenTarget);
+        newThenTarget->addPredecessor(currentBlock);
+        thenPathChanged = true;
+        changed = true;
+
+        // Handle Phi instructions in the new Then target block
+        // for (auto &phiInstPtr : newThenTarget->getInstructions()) {
+        //   if (phiInstPtr->getKind() == Instruction::kPhi) {
+        //     dynamic_cast<PhiInst *>(phiInstPtr.get())->delBlk(oldThenTarget);
+        //   } else {
+        //     break;
+        //   }
+        // }
+        for (auto &phiInstPtr : newThenTarget->getInstructions()) {
+          if (phiInstPtr->getKind() == Instruction::kPhi) {
+            PhiInst *phiInst = dynamic_cast<PhiInst *>(phiInstPtr.get());
+            BasicBlock *actualEmptyPredecessorOfS = nullptr;
+            for (size_t i = 0; i < phiInst->getNumOperands(); i += 2) {
+              BasicBlock *incomingBlock = dynamic_cast<BasicBlock *>(phiInst->getOperand(i + 1));
+              if (incomingBlock && emptyBlockRedirectMap.count(incomingBlock) &&
+                  emptyBlockRedirectMap[incomingBlock] == newThenTarget) {
+                actualEmptyPredecessorOfS = incomingBlock;
+                break;
+              }
+            }
+
+            if (actualEmptyPredecessorOfS) {
+              // Get the value the Phi node originally received from actualEmptyPredecessorOfS
+              Value *valueFromEmptyPredecessor = phiInst->getValfromBlk(actualEmptyPredecessorOfS);
+
+              // Trace this value back to its ultimate source in a non-empty block
+              // currentBlock is P
+              // oldThenTarget is E1 (start of the chain)
+              // actualEmptyPredecessorOfS is En (end of the chain, the predecessor of S)
+              Value *ultimateSourceValue = getUltimateSourceValue(valueFromEmptyPredecessor, actualEmptyPredecessorOfS);
+
+              // Replace the incoming block and incoming value of the Phi node
+              if (ultimateSourceValue) { // Make sure a valid source was traced
+                // phiInst->replaceIncoming(actualEmptyPredecessorOfS, currentBlock, ultimateSourceValue);
+                phiInst->replaceIncomingBlock(actualEmptyPredecessorOfS, currentBlock, ultimateSourceValue);
+              } else {
+                assert(false && "[DelEmptyBlock] Unable to trace a valid source for Phi instruction");
+                // No valid source could be traced; this may be a bug or a special case
+                // The Phi entry may need to be removed here, or an undef value inserted
+                phiInst->removeIncomingBlock(actualEmptyPredecessorOfS);
+              }
+            }
+          } else {
+            break;
+          }
+        }
+
+      }
+
+      // Handle the Else branch
+      if (emptyBlockRedirectMap.count(oldElseTarget)) {
+        BasicBlock *newElseTarget = emptyBlockRedirectMap[oldElseTarget];
+        condBrInst->replaceOperand(2, newElseTarget); // Update the operand of the jump instruction
+
+        currentBlock->removeSuccessor(oldElseTarget);
+        oldElseTarget->removePredecessor(currentBlock);
+        currentBlock->addSuccessor(newElseTarget);
+        newElseTarget->addPredecessor(currentBlock);
+        elsePathChanged = true;
+        changed = true;
+
+        // Handle Phi instructions in the new Else target block
+        // for (auto &phiInstPtr : newElseTarget->getInstructions()) {
+        //   if (phiInstPtr->getKind() == Instruction::kPhi) {
+        //     dynamic_cast<PhiInst *>(phiInstPtr.get())->delBlk(oldElseTarget);
+        //   } else {
+        //     break;
+        //   }
+        // }
+        for (auto &phiInstPtr : newElseTarget->getInstructions()) {
+          if (phiInstPtr->getKind() == Instruction::kPhi) {
+            PhiInst *phiInst = dynamic_cast<PhiInst *>(phiInstPtr.get());
+            BasicBlock *actualEmptyPredecessorOfS = nullptr;
+            for (size_t i = 0; i < phiInst->getNumOperands(); i += 2) {
+              BasicBlock *incomingBlock = dynamic_cast<BasicBlock *>(phiInst->getOperand(i + 1));
+              if (incomingBlock && emptyBlockRedirectMap.count(incomingBlock) &&
+                  emptyBlockRedirectMap[incomingBlock] == newElseTarget) {
+                actualEmptyPredecessorOfS = incomingBlock;
+                break;
+              }
+            }
+
+            if (actualEmptyPredecessorOfS) {
+              // Get the value the Phi node originally received from actualEmptyPredecessorOfS
+              Value *valueFromEmptyPredecessor = phiInst->getValfromBlk(actualEmptyPredecessorOfS);
+
+              // Trace this value back to its ultimate source in a non-empty block
+              // currentBlock is P
+              // oldElseTarget is E1 (start of the chain)
+              // actualEmptyPredecessorOfS is En (end of the chain, the predecessor of S)
+              Value *ultimateSourceValue = getUltimateSourceValue(valueFromEmptyPredecessor, actualEmptyPredecessorOfS);
+
+              // Replace the incoming block and incoming value of the Phi node
+              if (ultimateSourceValue) { // Make sure a valid source was traced
+                // phiInst->replaceIncoming(actualEmptyPredecessorOfS, currentBlock, ultimateSourceValue);
+                phiInst->replaceIncomingBlock(actualEmptyPredecessorOfS, currentBlock, ultimateSourceValue);
+              } else {
+                assert(false && "[DelEmptyBlock] Unable to trace a valid source for Phi instruction");
+                // No valid source could be traced; this may be a bug or a special case
+                // The Phi entry may need to be removed here, or an undef value inserted
+                phiInst->removeIncomingBlock(actualEmptyPredecessorOfS);
+              }
+            }
+          } else {
+            break;
+          }
+        }
+      }
+
+      // Extra handling: if both branches of the conditional jump now target the same block, simplify it to an unconditional jump
+      if (condBrInst->getThenBlock() == condBrInst->getElseBlock()) {
+        BasicBlock *commonTarget = dynamic_cast<BasicBlock *>(condBrInst->getThenBlock());
+        SysYIROptUtils::usedelete(lastInst); // Delete the old conditional jump
+        pBuilder->setPosition(currentBlock, currentBlock->end());
+        pBuilder->createUncondBrInst(commonTarget); // Insert the new unconditional jump
+
+        // Update the CFG relations more safely
+        std::set<BasicBlock *> currentSuccessors;
+        currentSuccessors.insert(oldThenTarget);
+        currentSuccessors.insert(oldElseTarget);
+
+        // Remove the old successor relations
+        for (BasicBlock *succ : currentSuccessors) {
+          currentBlock->removeSuccessor(succ);
+          succ->removePredecessor(currentBlock);
+        }
+        // Add the new successor relation
+        currentBlock->addSuccessor(commonTarget);
+        commonTarget->addPredecessor(currentBlock);
+
+        changed = true;
+      }
    }
  }
-  
+
+  // Step 4: Actually delete the empty basic blocks
+  // Note: the blocks can only be deleted after all jumps and Phi instructions have been updated
+  for (auto blockIter = func->getBasicBlocks().begin(); blockIter != func->getBasicBlocks().end();) {
+    BasicBlock *currentBlock = blockIter->get();
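+    if (emptyBlockRedirectMap.count(currentBlock)) { // The block is in the empty-block map
+      // The entry block must not be deleted even if it meets the empty-block definition, because the function needs an entry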
+      if (currentBlock == func->getEntryBlock()) {
+        ++blockIter;
+        continue;
+      }
+
+      // Before removing the block, make sure its instructions are deleted properly (such blocks contain few instructions)
+      for (auto instIter = currentBlock->getInstructions().begin();
+           instIter != currentBlock->getInstructions().end();) {
+        instIter = SysYIROptUtils::usedelete(instIter);
+      }
+
+      // Remove the block
+      func->removeBasicBlock((blockIter++)->get());
+      changed = true;
+    } else {
+      ++blockIter;
+    }
+  }
+
  return changed;
 }

 // If the function has no return instruction, add a default one (mainly for void functions that lack a return)
-bool SysYCFGOptUtils::SysYAddReturn(Function *func, IRBuilder* pBuilder) {
+bool SysYCFGOptUtils::SysYAddReturn(Function *func, IRBuilder *pBuilder) {
  bool changed = false;
  auto basicBlocks = func->getBasicBlocks();
  for (auto &block : basicBlocks) {
@ -467,7 +588,8 @@ bool SysYCFGOptUtils::SysYAddReturn(Function *func, IRBuilder* pBuilder) {
        auto thelastinst = block->getInstructions().end();
        --thelastinst;
        if (thelastinst->get()->getKind() != Instruction::kReturn) {
-          // std::cout << "Warning: Function " << func->getName() << " has no return instruction, adding default return." << std::endl;
+          // std::cout << "Warning: Function " << func->getName() << " has no return instruction, adding default
+          // return." << std::endl;

          pBuilder->setPosition(block.get(), block->end());
          // TODO: should a missing return value in an int/float function be reported as an error?
@ -483,7 +605,7 @@ bool SysYCFGOptUtils::SysYAddReturn(Function *func, IRBuilder* pBuilder) {
      }
    }
  }
-  
+
  return changed;
 }

@ -491,18 +613,18 @@ bool SysYCFGOptUtils::SysYAddReturn(Function *func, IRBuilder* pBuilder) {
 // Mainly converts branches whose condition value is known into unconditional branches,
 // e.g. when cond in if (cond) { ... } else { ... } has already
 // been determined to be true or false
-bool SysYCFGOptUtils::SysYCondBr2Br(Function *func, IRBuilder* pBuilder) {
+bool SysYCFGOptUtils::SysYCondBr2Br(Function *func, IRBuilder *pBuilder) {
  bool changed = false;

  for (auto &basicblock : func->getBasicBlocks()) {
    if (basicblock->getNumInstructions() == 0)
      continue;
-    
-    auto thelast = basicblock->getInstructions().end();
-    --thelast;

-    if (thelast->get()->isConditional()){
-      ConstantValue *constOperand = dynamic_cast<ConstantValue *>(thelast->get()->getOperand(0));
+    auto thelast = basicblock->terminator();
+
+    if (thelast->get()->isConditional()) {
+      auto condBrInst = dynamic_cast<CondBrInst *>(thelast->get());
+      ConstantValue *constOperand = dynamic_cast<ConstantValue *>(condBrInst->getCondition());
      std::string opname;
      int constint = 0;
      float constfloat = 0.0F;
@ -521,32 +643,31 @@ bool SysYCFGOptUtils::SysYCondBr2Br(Function *func, IRBuilder* pBuilder) {
      if (constfloat_Use || constint_Use) {
        changed = true;

-        auto thenBlock = dynamic_cast<BasicBlock *>(thelast->get()->getOperand(1));
-        auto elseBlock = dynamic_cast<BasicBlock *>(thelast->get()->getOperand(2));
-        SysYIROptUtils::usedelete(thelast->get());
-        thelast = basicblock->getInstructions().erase(thelast);
+        auto thenBlock = dynamic_cast<BasicBlock *>(condBrInst->getThenBlock());
+        auto elseBlock = dynamic_cast<BasicBlock *>(condBrInst->getElseBlock());
+        thelast = SysYIROptUtils::usedelete(thelast);
        if ((constfloat_Use && constfloat == 1.0F) || (constint_Use && constint == 1)) {
          // cond is true or nonzero
          pBuilder->setPosition(basicblock.get(), basicblock->end());
-          pBuilder->createUncondBrInst(thenBlock, {});
-          
+          pBuilder->createUncondBrInst(thenBlock);
+
          // Update the CFG relations
          basicblock->removeSuccessor(elseBlock);
          elseBlock->removePredecessor(basicblock.get());
-          
+
          // Remove the incoming value for basicblock.get() from the phi instructions of elseBlock
          for (auto &phiinst : elseBlock->getInstructions()) {
            if (phiinst->getKind() != Instruction::kPhi) {
              break;
            }
            // Remove the incoming value that corresponds to basicblock.get()
-            dynamic_cast<PhiInst *>(phiinst.get())->delBlk(basicblock.get());
+            dynamic_cast<PhiInst *>(phiinst.get())->removeIncomingBlock(basicblock.get());
          }
-          
+
        } else { // cond is false or 0

          pBuilder->setPosition(basicblock.get(), basicblock->end());
-          pBuilder->createUncondBrInst(elseBlock, {});
+          pBuilder->createUncondBrInst(elseBlock);

          // Update CFG relations
          basicblock->removeSuccessor(thenBlock);
@ -558,9 +679,8 @@ bool SysYCFGOptUtils::SysYCondBr2Br(Function *func, IRBuilder* pBuilder) {
              break;
            }
            // Remove the incoming value that corresponds to basicblock.get()
-            dynamic_cast<PhiInst *>(phiinst.get())->delBlk(basicblock.get());
+            dynamic_cast<PhiInst *>(phiinst.get())->removeIncomingBlock(basicblock.get());
          }
-
        }
      }
    }
@ -573,28 +693,28 @@ bool SysYCFGOptUtils::SysYCondBr2Br(Function *func, IRBuilder* pBuilder) {
 // Implementation of the standalone CFG optimization passes
 // ======================================================================

-bool SysYDelInstAfterBrPass::runOnFunction(Function *F, AnalysisManager& AM) {
+bool SysYDelInstAfterBrPass::runOnFunction(Function *F, AnalysisManager &AM) {
  return SysYCFGOptUtils::SysYDelInstAfterBr(F);
 }

-bool SysYDelEmptyBlockPass::runOnFunction(Function *F, AnalysisManager& AM) {
+bool SysYDelEmptyBlockPass::runOnFunction(Function *F, AnalysisManager &AM) {
  return SysYCFGOptUtils::SysYDelEmptyBlock(F, pBuilder);
 }

-bool SysYDelNoPreBLockPass::runOnFunction(Function *F, AnalysisManager& AM) {
+bool SysYDelNoPreBLockPass::runOnFunction(Function *F, AnalysisManager &AM) {
  return SysYCFGOptUtils::SysYDelNoPreBLock(F);
 }

-bool SysYBlockMergePass::runOnFunction(Function *F, AnalysisManager& AM) {
-  return SysYCFGOptUtils::SysYBlockMerge(F);
+bool SysYBlockMergePass::runOnFunction(Function *F, AnalysisManager &AM) {
+  return SysYCFGOptUtils::SysYBlockMerge(F);
 }

-bool SysYAddReturnPass::runOnFunction(Function *F, AnalysisManager& AM) {
+bool SysYAddReturnPass::runOnFunction(Function *F, AnalysisManager &AM) {
  return SysYCFGOptUtils::SysYAddReturn(F, pBuilder);
 }

-bool SysYCondBr2BrPass::runOnFunction(Function *F, AnalysisManager& AM) {
+bool SysYCondBr2BrPass::runOnFunction(Function *F, AnalysisManager &AM) {
  return SysYCFGOptUtils::SysYCondBr2Br(F, pBuilder);
 }

-}  // namespace sysy
+} // namespace sysy
--- a/src/midend/Pass/Optimize/TailCallOpt.cpp
+++ b/src/midend/Pass/Optimize/TailCallOpt.cpp
@ -0,0 +1,125 @@
+#include "TailCallOpt.h"
+#include "IR.h"
+#include "IRBuilder.h"
+#include "SysYIROptUtils.h"
+#include <vector>
+// #include <iostream>
+#include <algorithm>
+
+namespace sysy {
+
+void *TailCallOpt::ID = (void *)&TailCallOpt::ID;
+
+void TailCallOpt::getAnalysisUsage(std::set<void *> &analysisDependencies, std::set<void *> &analysisInvalidations) const {
+  analysisInvalidations.insert(&DominatorTreeAnalysisPass::ID);
+  analysisInvalidations.insert(&LoopAnalysisPass::ID);
+}
+
+bool TailCallOpt::runOnFunction(Function *F, AnalysisManager &AM) {
+  std::vector<CallInst *> tailCallInsts;
+  // Iterate over all basic blocks of the function
+  for (auto &bb_ptr : F->getBasicBlocks()) {
+    auto BB = bb_ptr.get();
+    if (BB->getInstructions().empty()) continue; // Skip empty basic blocks
+
+    auto term_iter = BB->terminator();
+    if (term_iter == BB->getInstructions().end()) continue; // Skip blocks without a terminator
+    auto term = (*term_iter).get();
+
+    if (!term || !term->isReturn()) continue; // Skip if the terminator is not a return
+    auto retInst = static_cast<ReturnInst *>(term);
+
+    Instruction *prevInst = nullptr;
+    if (BB->getInstructions().size() > 1) {
+        auto it = term_iter;
+        --it; // Get the instruction preceding the return
+        prevInst = (*it).get();
+    }
+
+    if (!prevInst || !prevInst->isCall()) continue; // Skip if the preceding instruction is not a call
+    auto callInst = static_cast<CallInst *>(prevInst);
+
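+    // Check for a tail-recursive call: the callee is the current function and the return value matches the call result
+    if (callInst->getCallee() == F) {
+        // For tail recursion, the return value must be the call result, or the call must be of void type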
+        if (retInst->getReturnValue() == callInst || 
+            (retInst->getReturnValue() == nullptr && callInst->getType()->isVoid())) {
+            tailCallInsts.push_back(callInst);
+        }
+    }
+  }
+
+  if (tailCallInsts.empty()) {
+    return false;
+  }
+
+  // Create a new entry basic block to serve as the loop preheader
+  auto original_entry = F->getEntryBlock();
+  auto new_entry = F->addBasicBlock("tco.entry." + F->getName());
+  auto loop_header = F->addBasicBlock("tco.loop_header." + F->getName());
+  
+  // Move all instructions from the original entry block into the loop header block
+  loop_header->getInstructions().splice(loop_header->end(), original_entry->getInstructions());
+  original_entry->setName("tco.pre_header");
+
+  // Create phi nodes for the function arguments
+  builder->setPosition(loop_header, loop_header->begin());
+  std::vector<PhiInst *> phis;
+  auto original_args = F->getArguments();
+  for (auto &arg : original_args) {
+    auto phi = builder->createPhiInst(arg->getType(), {}, {}, "tco.phi."+arg->getName());
+    phis.push_back(phi);
+  }
+
+  // Replace all uses of the original arguments with the phi nodes
+  for (size_t i = 0; i < original_args.size(); ++i) {
+    original_args[i]->replaceAllUsesWith(phis[i]);
+  }
+
+  // Set the incoming values of the phi nodes
+  for (size_t i = 0; i < phis.size(); ++i) {
+    phis[i]->addIncoming(original_args[i], new_entry);
+  }
+
+  // Connect the basic blocks
+  builder->setPosition(original_entry, original_entry->end());
+  builder->createUncondBrInst(new_entry);
+  original_entry->addSuccessor(new_entry);
+  
+  builder->setPosition(new_entry, new_entry->end());
+  builder->createUncondBrInst(loop_header);
+  new_entry->addSuccessor(loop_header);
+  loop_header->addPredecessor(new_entry);
+
+  // Process each tail-recursive call
+  for (auto callInst : tailCallInsts) {
+    auto tail_call_block = callInst->getParent();
+    
+    // Collect the arguments of the tail-recursive call
+    auto args_range = callInst->getArguments();
+    std::vector<Value*> args;
+    std::transform(args_range.begin(), args_range.end(), std::back_inserter(args), 
+                   [](auto& use_ptr){ return use_ptr->getValue(); });
+
+    // Update the phi nodes with the new argument values
+    for (size_t i = 0; i < phis.size(); ++i) {
+        phis[i]->addIncoming(args[i], tail_call_block);
+    }
+
+    // Remove the original call and return instructions
+    auto term_iter = tail_call_block->terminator();
+    SysYIROptUtils::usedelete(term_iter);
+    auto call_iter = tail_call_block->findInstIterator(callInst);
+    SysYIROptUtils::usedelete(call_iter);
+
+    // Add a branch that jumps back to the loop header block
+    builder->setPosition(tail_call_block, tail_call_block->end());
+    builder->createUncondBrInst(loop_header);
+    tail_call_block->addSuccessor(loop_header);
+    loop_header->addPredecessor(tail_call_block);
+  }
+
+  return true;
+}
+
+} // namespace sysy
--- a/src/midend/Pass/Pass.cpp
+++ b/src/midend/Pass/Pass.cpp
@ -1,10 +1,24 @@
 #include "Dom.h"
 #include "Liveness.h"
+#include "Loop.h"
+#include "LoopCharacteristics.h"
+#include "AliasAnalysis.h"
+#include "CallGraphAnalysis.h"
+#include "SideEffectAnalysis.h"
 #include "SysYIRCFGOpt.h"
 #include "SysYIRPrinter.h"
 #include "DCE.h"
 #include "Mem2Reg.h"
 #include "Reg2Mem.h"
+#include "GVN.h"
+#include "SCCP.h"
+#include "BuildCFG.h"
+#include "LoopNormalization.h"
+#include "LICM.h"
+#include "LoopStrengthReduction.h"
+#include "InductionVariableElimination.h"
+#include "GlobalStrengthReduction.h"
+#include "TailCallOpt.h"
 #include "Pass.h"
 #include <iostream>
 #include <queue>
@ -34,10 +48,20 @@ void PassManager::runOptimizationPipeline(Module* moduleIR, IRBuilder* builderIR
        3. Add the optimization pass ID
    */
    // Register analysis passes
+    registerAnalysisPass<DominatorTreeAnalysisPass>();
+    registerAnalysisPass<LivenessAnalysisPass>();
    registerAnalysisPass<sysy::DominatorTreeAnalysisPass>();
    registerAnalysisPass<sysy::LivenessAnalysisPass>();
+    registerAnalysisPass<SysYAliasAnalysisPass>();           // Alias analysis (high priority)
+    registerAnalysisPass<CallGraphAnalysisPass>();           // Call graph analysis (module-level, standalone analysis)
+    registerAnalysisPass<SysYSideEffectAnalysisPass>();      // Side-effect analysis (depends on alias analysis and the call graph)
+    registerAnalysisPass<LoopAnalysisPass>();
+    registerAnalysisPass<LoopCharacteristicsPass>();        // Loop characteristics analysis depends on alias analysis

    // Register optimization passes
+    registerOptimizationPass<BuildCFG>();
+    registerOptimizationPass<GVN>();
+    
    registerOptimizationPass<SysYDelInstAfterBrPass>();
    registerOptimizationPass<SysYDelNoPreBLockPass>();
    registerOptimizationPass<SysYBlockMergePass>();
@ -48,13 +72,31 @@ void PassManager::runOptimizationPipeline(Module* moduleIR, IRBuilder* builderIR

    registerOptimizationPass<DCE>();
    registerOptimizationPass<Mem2Reg>(builderIR);
+    registerOptimizationPass<LoopNormalizationPass>(builderIR);
+    registerOptimizationPass<LICM>(builderIR);
+    registerOptimizationPass<LoopStrengthReduction>(builderIR);
+    registerOptimizationPass<InductionVariableElimination>();
+
+    registerOptimizationPass<GlobalStrengthReduction>(builderIR);
    registerOptimizationPass<Reg2Mem>(builderIR);
+    registerOptimizationPass<TailCallOpt>(builderIR);
+
+    registerOptimizationPass<SCCP>(builderIR);

    if (optLevel >= 1) {
      // Run the optimization passes in a deliberately designed order and with deliberately designed logic
      if (DEBUG) std::cout << "Applying -O1 optimizations.\n";
      if (DEBUG) std::cout << "--- Running custom optimization sequence ---\n";

+      if(DEBUG) {
+        std::cout << "=== IR Before CFGOpt Optimizations ===\n";
+        printPasses();
+      }
+
+      this->clearPasses();
+      this->addPass(&BuildCFG::ID);
+      this->run();
+
      this->clearPasses(); 
      this->addPass(&SysYDelInstAfterBrPass::ID);
      this->addPass(&SysYDelNoPreBLockPass::ID);
@ -64,6 +106,10 @@ void PassManager::runOptimizationPipeline(Module* moduleIR, IRBuilder* builderIR
      this->addPass(&SysYAddReturnPass::ID);
      this->run(); 

+      this->clearPasses();
+      this->addPass(&BuildCFG::ID);
+      this->run();
+
      if(DEBUG) {
        std::cout << "=== IR After CFGOpt Optimizations ===\n";
        printPasses();
@ -87,6 +133,73 @@ void PassManager::runOptimizationPipeline(Module* moduleIR, IRBuilder* builderIR
        printPasses();
      }

+      this->clearPasses();
+      this->addPass(&GVN::ID);
+      this->run();
+
+      this->clearPasses();
+      this->addPass(&TailCallOpt::ID);
+      this->run();
+
+      if(DEBUG) {
+        std::cout << "=== IR After TailCallOpt ===\n";
+        SysYPrinter printer(moduleIR);
+        printer.printIR();
+      }
+
+      if(DEBUG) {
+        std::cout << "=== IR After GVN Optimizations ===\n";
+        printPasses();
+      }
+
+      this->clearPasses();
+      this->addPass(&SCCP::ID);
+      this->run();
+
+      if(DEBUG) {
+        std::cout << "=== IR After SCCP Optimizations ===\n";
+        printPasses();
+      }
+
+      this->clearPasses();
+      this->addPass(&LoopNormalizationPass::ID);
+      this->addPass(&InductionVariableElimination::ID);
+      this->run();
+
+      if(DEBUG) {
+        std::cout << "=== IR After Loop Normalization, Induction Variable Elimination ===\n";
+        printPasses();
+      }
+      
+
+      this->clearPasses();
+      this->addPass(&LICM::ID);
+      this->run();
+
+      if(DEBUG) {
+        std::cout << "=== IR After LICM ===\n";
+        printPasses();
+      }
+      
+      this->clearPasses();
+      this->addPass(&LoopStrengthReduction::ID);
+      this->run();
+
+      if(DEBUG) {
+        std::cout << "=== IR After Loop Strength Reduction Optimizations ===\n";
+        printPasses();
+      }
+
+      // Global strength reduction, including algebraic optimizations and magic-number division
+      this->clearPasses();
+      this->addPass(&GlobalStrengthReduction::ID);
+      this->run();
+
+      if(DEBUG) {
+        std::cout << "=== IR After Global Strength Reduction Optimizations ===\n";
+        printPasses();
+      }
+
      this->clearPasses();
      this->addPass(&Reg2Mem::ID);
      this->run();
@ -95,7 +208,9 @@ void PassManager::runOptimizationPipeline(Module* moduleIR, IRBuilder* builderIR
        std::cout << "=== IR After Reg2Mem Optimizations ===\n";
        printPasses();
      }
-
+      this->clearPasses();
+      this->addPass(&BuildCFG::ID);
+      this->run();
      if (DEBUG) std::cout << "--- Custom optimization sequence finished ---\n";
    }

@ -110,6 +225,7 @@ void PassManager::runOptimizationPipeline(Module* moduleIR, IRBuilder* builderIR
      SysYPrinter printer(moduleIR);
      printer.printIR();
    }
+    
 }

 void PassManager::clearPasses() {
--- a/src/midend/SysYIRGenerator.cpp
+++ b/src/midend/SysYIRGenerator.cpp
--- a/src/midend/SysYIRPrinter.cpp
+++ b/src/midend/SysYIRPrinter.cpp
@ -240,6 +240,10 @@ void SysYPrinter::printInst(Instruction *pInst) {
    case Kind::kMul:
    case Kind::kDiv:
    case Kind::kRem:
+    case Kind::kSrl:
+    case Kind::kSll:
+    case Kind::kSra:
+    case Kind::kMulh:
    case Kind::kFAdd:
    case Kind::kFSub:
    case Kind::kFMul:
@ -272,6 +276,10 @@ void SysYPrinter::printInst(Instruction *pInst) {
        case Kind::kMul: std::cout << "mul"; break;
        case Kind::kDiv: std::cout << "sdiv"; break;
        case Kind::kRem: std::cout << "srem"; break;
+        case Kind::kSrl: std::cout << "lshr"; break;
+        case Kind::kSll: std::cout << "shl"; break;
+        case Kind::kSra: std::cout << "ashr"; break;
+        case Kind::kMulh: std::cout << "mulh"; break;
        case Kind::kFAdd: std::cout << "fadd"; break;
        case Kind::kFSub: std::cout << "fsub"; break;
        case Kind::kFMul: std::cout << "fmul"; break;
@ -295,7 +303,12 @@ void SysYPrinter::printInst(Instruction *pInst) {
      
      // Types and operands
      std::cout << " ";
-      printType(binInst->getType());
+      // For comparison operations, print operand types instead of result type
+      if (pInst->getKind() >= Kind::kICmpEQ && pInst->getKind() <= Kind::kFCmpGE) {
+        printType(binInst->getLhs()->getType());
+      } else {
+        printType(binInst->getType());
+      }
      std::cout << " ";
      printValue(binInst->getLhs());
      std::cout << ", ";
@ -408,7 +421,12 @@ void SysYPrinter::printInst(Instruction *pInst) {
      }
      std::cout << std::endl;
    } break;
-    
+
+    case Kind::kUnreachable: {
+      std::cout << "Unreachable" << std::endl;
+      
+    } break;
+
    case Kind::kAlloca: {
      auto allocaInst = dynamic_cast<AllocaInst *>(pInst);
      std::cout << "%" << allocaInst->getName() << " = alloca ";
@ -419,17 +437,6 @@ void SysYPrinter::printInst(Instruction *pInst) {
      auto allocatedType = allocaInst->getAllocatedType();
      printType(allocatedType);
      
-      // Still print the dimension information, if present
-      if (allocaInst->getNumDims() > 0) { 
-        std::cout << ", ";
-        for (size_t i = 0; i < allocaInst->getNumDims(); i++) {
-          if (i > 0) std::cout << ", ";
-          printType(Type::getIntType()); // Dimension sizes are usually of type i32
-          std::cout << " ";
-          printValue(allocaInst->getDim(i));
-        }
-      }
-      
      std::cout << ", align 4" << std::endl;
    } break;
    
@ -442,17 +449,6 @@ void SysYPrinter::printInst(Instruction *pInst) {
      std::cout << " ";
      printValue(loadInst->getPointer()); // Address to load from
      
-      // Still print the index information, if present
-      if (loadInst->getNumIndices() > 0) {
-        std::cout << ", indices "; // Or another separator, depending on the format you expect
-        for (size_t i = 0; i < loadInst->getNumIndices(); i++) {
-            if (i > 0) std::cout << ", ";
-            printType(loadInst->getIndex(i)->getType());
-            std::cout << " ";
-            printValue(loadInst->getIndex(i));
-        }
-      }
-      
      std::cout << ", align 4" << std::endl;
    } break;
    
@ -467,16 +463,6 @@ void SysYPrinter::printInst(Instruction *pInst) {
      std::cout << " ";
      printValue(storeInst->getPointer()); // Destination address
      
-      // Still print the index information, if present
-      if (storeInst->getNumIndices() > 0) {
-        std::cout << ", indices "; // Or another separator
-        for (size_t i = 0; i < storeInst->getNumIndices(); i++) {
-            if (i > 0) std::cout << ", ";
-            printType(storeInst->getIndex(i)->getType());
-            std::cout << " ";
-            printValue(storeInst->getIndex(i));
-        }
-      }
      
      std::cout << ", align 4" << std::endl;
    } break;
@ -535,9 +521,9 @@ void SysYPrinter::printInst(Instruction *pInst) {
        if (!firstPair) std::cout << ", ";
        firstPair = false;
        std::cout << "[ ";
-        printValue(phiInst->getValue(i));
+        printValue(phiInst->getIncomingValue(i));
        std::cout << ", %";
-        printBlock(phiInst->getBlock(i));
+        printBlock(phiInst->getIncomingBlock(i));
        std::cout << " ]";
      }
      std::cout << std::endl;
--- a/src/sysyc.cpp
+++ b/src/sysyc.cpp
@ -21,19 +21,21 @@ using namespace sysy;

 int DEBUG = 0;
 int DEEPDEBUG = 0;
+int DEEPERDEBUG = 0;
+int DEBUGLENGTH = 50;

 static string argStopAfter;
 static string argInputFile;
 static bool argFormat = false; // Currently unused, but kept
 static string argOutputFilename;
-static int optLevel = 0; // Optimization level; defaults to 0 (when no -O flag is given)
+int optLevel = 0; // Optimization level; defaults to 0 (when no -O flag is given)

 void usage(int code) {
  const char *msg = "Usage: sysyc [options] inputfile\n\n"
                    "Supported options:\n"
                    "  -h \tprint help message and exit\n"
                    "  -f \tpretty-format the input file\n"
-                    "  -s {ast,ir,asm,llvmir,asmd,ird}\tstop after generating AST/IR/Assembly\n"
+                    "  -s {ast,ir,asm,asmd,ird}\tstop after generating AST/IR/Assembly\n"
                    "  -S \tcompile to assembly (.s file)\n"
                    "  -o <file>\tplace the output into <file>\n"
                    "  -O<level>\tenable optimization at <level> (e.g., -O0, -O1)\n";
@ -108,6 +110,7 @@ int main(int argc, char **argv) {
  // If asked to stop at the AST stage, print it and exit
  if (argStopAfter == "ast") {
    cout << moduleAST->toStringTree(true) << '\n';
+    sysy::cleanupIRPools(); // Clean up the memory pools
    return EXIT_SUCCESS;
  }

@ -130,7 +133,7 @@ int main(int argc, char **argv) {
  
  if (DEBUG) {
    cout << "=== Init IR ===\n";
-    SysYPrinter(moduleIR).printIR(); // Temporary printer used for debugging
+    moduleIR->print(cout); // Print the IR directly with the newly implemented print method
  }

  // Create the pass manager and run the optimization pipeline
@ -142,10 +145,26 @@ int main(int argc, char **argv) {
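  // a) If asked to stop at the IR stage, print the final IR and exit
  if (argStopAfter == "ir" || argStopAfter == "ird") {
    // Print the final IR
-    cout << "=== Final IR ===\n";
-    SysYPrinter printer(moduleIR); // Create the printer here, since a temporary printer may have been used earlier for debugging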
-    printer.printIR();
+    if (DEBUG) cerr << "=== Final IR ===\n";
+    if (!argOutputFilename.empty()) {
+      // Write to the specified file
+      ofstream fout(argOutputFilename);
+      if (not fout.is_open()) {
+        cerr << "Failed to open output file: " << argOutputFilename << endl;
+        moduleIR->cleanup(); // Clean up the module
+        sysy::cleanupIRPools(); // Clean up the memory pools
+        return EXIT_FAILURE;
+      }
+      moduleIR->print(fout);
+      fout.close();
+    } else {
+      // Write to standard output
+      moduleIR->print(cout);
+    }
+    moduleIR->cleanup(); // Clean up the module
+    sysy::cleanupIRPools(); // Clean up the memory pools
    return EXIT_SUCCESS;
+
  }

  // b) If not stopping at the IR stage, continue to generate assembly (backend)
@ -164,6 +183,8 @@ int main(int argc, char **argv) {
      ofstream fout(argOutputFilename);
      if (not fout.is_open()) {
        cerr << "Failed to open output file: " << argOutputFilename << endl;
+        moduleIR->cleanup(); // Clean up the module
+        sysy::cleanupIRPools(); // Clean up the memory pools
        return EXIT_FAILURE;
      }
      fout << asmCode << endl;
@ -171,6 +192,8 @@ int main(int argc, char **argv) {
    } else {
      cout << asmCode << endl;
    }
+    moduleIR->cleanup(); // Clean up the module
+    sysy::cleanupIRPools(); // Clean up the memory pools
    return EXIT_SUCCESS;
  }

@ -179,5 +202,7 @@ int main(int argc, char **argv) {
  cout << "Compilation completed. No output specified (neither -s nor -S). Exiting.\n";
  // return EXIT_SUCCESS; // Or invoke a linker here to produce an executable

+  moduleIR->cleanup(); // Clean up the module
+  sysy::cleanupIRPools(); // Clean up the memory pools
  return EXIT_SUCCESS; 
 }
--- a/testdata/performance/03_sort1.in
+++ b/testdata/performance/03_sort1.in
--- a/testdata/performance/fft0.in
+++ b/testdata/performance/fft0.in