// cs188/bayesian/main.typ, 2026-01-01 16:26:01 +08:00

#import "@preview/cetz:0.3.3"
#set text(font: ("Noto Sans CJK SC", "Noto Serif CJK SC"), lang: "zh")
#set page(paper: "a4", margin: (x: 2cm, y: 2cm))
#set heading(numbering: "1.")
#let answer(body) = block(
fill: luma(240),
stroke: (left: 2pt + blue),
inset: 10pt,
radius: 2pt,
width: 100%,
body
)
#align(center)[
#text(size: 18pt, weight: "bold")[Chapters 13-14: Probability and Bayesian Networks, Exercise Solutions]
#v(1em)
// *Generated:* #datetime.today().display()
]
= Problem 1
*Problem:* Given $P(a) = 0.3$, $P(b | a) = 0.2$, and $P(c | a) = 0.5$, compare $P(a and b)$ with $P(a and c)$.
#answer[
By the product rule, $P(x and y) = P(x)P(y|x)$:
$ P(a and b) &= P(a) dot P(b | a) = 0.3 times 0.2 = 0.06 \
P(a and c) &= P(a) dot P(c | a) = 0.3 times 0.5 = 0.15 $
Since $0.06 < 0.15$, we have *$P(a and b) < P(a and c)$*.
*Answer: B*
]
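The comparison can be checked numerically. The following is a minimal Python sketch of the product rule; the variable names are illustrative and not part of the exercise.

```python
# Product rule: P(x and y) = P(x) * P(y | x)
p_a = 0.3
p_b_given_a = 0.2
p_c_given_a = 0.5

p_a_and_b = p_a * p_b_given_a
p_a_and_c = p_a * p_c_given_a

print(round(p_a_and_b, 2), round(p_a_and_c, 2))  # 0.06 0.15
print(p_a_and_b < p_a_and_c)                      # True
```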
= Problem 2
*Problem:* Given $P(a or b)=0.7$, $P(a)=0.4$, and $P(b)=0.5$, find $P(a and b)$.
#answer[
By the addition rule (inclusion-exclusion):
$ P(a or b) = P(a) + P(b) - P(a and b) $
Substituting the given values:
$ 0.7 = 0.4 + 0.5 - P(a and b) \
P(a and b) = 0.9 - 0.7 = 0.2 $
*Answer: 0.2*
]
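A quick numeric check of the rearranged addition rule, written as a sketch with illustrative variable names:

```python
# Inclusion-exclusion rearranged: P(a and b) = P(a) + P(b) - P(a or b)
p_a_or_b = 0.7
p_a = 0.4
p_b = 0.5

p_a_and_b = p_a + p_b - p_a_or_b

# Round to hide floating-point noise in the subtraction.
print(round(p_a_and_b, 2))  # 0.2
```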
= Problem 3
*Data:* The full joint distribution over (Cavity, Toothache, Catch) is given below, where T abbreviates Toothache.
#table(
columns: (auto, auto, auto, auto, auto),
inset: 8pt,
align: center,
[], [T, Catch], [T, $not$ Catch], [$not$ T, Catch], [$not$ T, $not$ Catch],
[Cavity], [0.108], [0.012], [0.072], [0.008],
[$not$ Cavity], [0.016], [0.064], [0.144], [0.576]
)
#answer[
*1. Compute $P("Toothache" or not "Cavity")$*
Use the complement rule: $P(A) = 1 - P(not A)$.
The complement of the event is $not "Toothache" and "Cavity"$,
which corresponds to the table entries (Cavity, $not$ T, Catch) and (Cavity, $not$ T, $not$ Catch).
$ P(not T and "Cav") = 0.072 + 0.008 = 0.080 $
$ P(T or not "Cav") = 1 - 0.080 = 0.92 $
*2. Compute $P(not "toothache" or "catch")$*
The complement is $"toothache" and not "catch"$,
which corresponds to the table column (T, $not$ Catch), i.e. column 2.
$ P(T and not "Catch") = 0.012 ("Cav") + 0.064 (not "Cav") = 0.076 $
$ P(not T or "Catch") = 1 - 0.076 = 0.924 $
*3. Compute $P("Toothache" or not "Cavity" | not "catch")$ and $P(not "catch" | "Toothache" or not "Cavity")$*
Let $A = "Toothache" or not "Cavity"$ and $B = not "catch"$.
- *Compute $P(B)$*: sum both $not$ Catch columns (columns 2 and 4).
  $P(B) = (0.012 + 0.064) + (0.008 + 0.576) = 0.076 + 0.584 = 0.66$
- *Compute $P(A and B)$*: sum the $not$ Catch entries that satisfy $T or not "Cav"$.
  - (Cav, T, $not$ C): 0.012 (satisfies T)
  - ($not$ Cav, T, $not$ C): 0.064 (satisfies T)
  - ($not$ Cav, $not$ T, $not$ C): 0.576 (satisfies $not$ Cav)
  - (Cav, $not$ T, $not$ C): 0.008 (excluded: neither T nor $not$ Cav holds)
  Sum $= 0.012 + 0.064 + 0.576 = 0.652$
Result 1: $P(A | B) = 0.652 / 0.66 approx 0.9879$
Result 2: $P(B | A) = P(A and B) / P(A) = 0.652 / 0.92 approx 0.7087$ (where $P(A) = 0.92$ comes from part 1)
*4. Determine whether Cavity and Toothache are independent*
$ P("Cav") &= 0.108+0.012+0.072+0.008 = 0.2 \
P("Toothache") &= 0.108+0.012+0.016+0.064 = 0.2 \
P("Cav" and "T") &= 0.108 + 0.012 = 0.12 $
Check: $P("Cav") times P("T") = 0.2 times 0.2 = 0.04$
Since $0.12 != 0.04$, the two are *not independent*.
]
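All four parts can be reproduced by summing entries of the joint table. The sketch below assumes a dictionary keyed by (cavity, toothache, catch) truth values; the helper `prob` and the event lambdas are illustrative names, not part of the exercise.

```python
# Full joint distribution, keyed by (cavity, toothache, catch).
joint = {
    (True, True, True): 0.108,
    (True, True, False): 0.012,
    (True, False, True): 0.072,
    (True, False, False): 0.008,
    (False, True, True): 0.016,
    (False, True, False): 0.064,
    (False, False, True): 0.144,
    (False, False, False): 0.576,
}

def prob(event):
    """Sum the joint entries for which event(cavity, toothache, catch) holds."""
    return sum(p for world, p in joint.items() if event(*world))

# Parts 1 and 2: disjunctions, summed directly instead of via the complement.
p_a = prob(lambda cav, t, c: t or not cav)
p_part2 = prob(lambda cav, t, c: (not t) or c)
print(round(p_a, 3), round(p_part2, 3))  # 0.92 0.924

# Part 3: conditional probabilities in both directions.
p_b = prob(lambda cav, t, c: not c)
p_ab = prob(lambda cav, t, c: (t or not cav) and not c)
print(round(p_ab / p_b, 4), round(p_ab / p_a, 4))  # 0.9879 0.7087

# Part 4: compare the joint with the product of the marginals.
p_cav = prob(lambda cav, t, c: cav)
p_t = prob(lambda cav, t, c: t)
p_cav_t = prob(lambda cav, t, c: cav and t)
print(round(p_cav_t, 3), round(p_cav * p_t, 3))  # 0.12 0.04
```

Summing the event directly and using the complement give the same value, which is a useful cross-check on parts 1 and 2.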
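= Problem 4
*Problem:* $X$ and $Y$ are conditionally independent given $Z$ (that is, $X perp Y | Z$), and $Y$ and $W$ are conditionally independent given $Z$ (that is, $Y perp W | Z$). Are $X$ and $W$ necessarily conditionally independent given $Z$?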
#answer[
*Conclusion: not necessarily.* Conditional independence is not transitive.
*Counterexample:*
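Let $Z$ be a fair coin flip (0 or 1).
Let $X$ be a second fair coin flip, independent of $Z$.
Let $W = X$ ($W$ is an exact copy of $X$).
Let $Y$ be a random variable independent of $X$, $W$, and $Z$ (for example, a die roll).
1. *Check $X perp Y | Z$*: $Y$ is independent of $(X, Z)$ jointly, so $P(X, Y | Z) = P(X | Z) P(Y | Z)$ and the condition holds.
2. *Check $Y perp W | Z$*: the same argument applies with $W$ in place of $X$, so the condition holds.
3. *Check $X perp W | Z$*:
Given $Z$, $X$ and $W$ are still random and always equal. For instance, $P(X=1, W=1 | Z) = 0.5$, whereas $P(X=1 | Z) P(W=1 | Z) = 0.25$, so the condition fails.
The intuition: $X$ and $W$ may cause one another or share a hidden common cause while each remains independent of $Y$. Knowing only that each is conditionally independent of $Y$ says nothing about the link between $X$ and $W$.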
]
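One way to make the failure of transitivity concrete is to enumerate a small joint distribution. The sketch below assumes a hypothetical construction chosen for illustration: $Z$ and $X$ are independent fair coins, $W$ is a copy of $X$, and $Y$ is an independent fair die. The helpers `marginal` and `cond_indep` are illustrative names.

```python
from fractions import Fraction
from itertools import product

NAMES = ("x", "y", "w", "z")

# Hypothetical construction: Z and X are independent fair coins,
# W copies X, and Y is a fair die independent of everything else.
joint = {}
for z, x, y in product((0, 1), (0, 1), range(1, 7)):
    w = x
    joint[(x, y, w, z)] = Fraction(1, 2) * Fraction(1, 2) * Fraction(1, 6)

def marginal(fixed):
    """Probability of a partial assignment such as {"x": 1, "z": 0}."""
    return sum(p for world, p in joint.items()
               if all(world[NAMES.index(k)] == v for k, v in fixed.items()))

def cond_indep(a, b, given="z"):
    """True if P(a, b | given) == P(a | given) * P(b | given) for all values."""
    def values(name):
        return sorted({world[NAMES.index(name)] for world in joint})
    for va, vb, vg in product(values(a), values(b), values(given)):
        p_g = marginal({given: vg})
        p_ab = marginal({a: va, b: vb, given: vg}) / p_g
        p_a = marginal({a: va, given: vg}) / p_g
        p_b = marginal({b: vb, given: vg}) / p_g
        if p_ab != p_a * p_b:
            return False
    return True

print(cond_indep("x", "y"))  # True
print(cond_indep("y", "w"))  # True
print(cond_indep("x", "w"))  # False
```

Exact fractions are used so that the equality test is not affected by floating-point rounding.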
= Problem 5
*Problem:* Testing for a rare disease, with prior $P(D)=0.001$.
Test A: $P(+|D)=0.95, P(+|not D)=0.05$.
Test B: $P(+|D)=0.90, P(+|not D)=0.10$.
#answer[
*a. Test A alone is positive; find $P(D|A+)$*
Apply Bayes' rule:
$ P(D|A+) = (P(A+|D)P(D)) / P(A+) $
$ P(A+) &= P(A+|D)P(D) + P(A+|not D)P(not D) \
&= 0.95 times 0.001 + 0.05 times 0.999 \
&= 0.00095 + 0.04995 = 0.0509 $
$ P(D|A+) &= 0.00095 / 0.0509 approx 0.0187 (1.87%) $
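*b. Tests A and B are both positive; find $P(D|A+, B+)$*
Assuming the two tests are conditionally independent given the disease status (the naive Bayes assumption):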
$ P(A+, B+ | D) = 0.95 times 0.90 = 0.855 $
$ P(A+, B+ | not D) = 0.05 times 0.10 = 0.005 $
$ P(D | A+, B+) &= (P(A+, B+ | D)P(D)) / (P(A+, B+ | D)P(D) + P(A+, B+ | not D)P(not D)) \
&= (0.855 times 0.001) / (0.855 times 0.001 + 0.005 times 0.999) \
&= 0.000855 / (0.000855 + 0.004995) \
&= 0.000855 / 0.00585 approx 0.1462 (14.62%) $
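*c. Explanation*
By adding a second, conditionally independent test, the combined procedure sharply lowers the *false positive rate* (from 0.05 to $0.05 times 0.10 = 0.005$). The true positive rate also drops slightly (from 0.95 to $0.95 times 0.90 = 0.855$), but the false positive term, which dominates the denominator because it is weighted by $P(not D) = 0.999$, shrinks tenfold, so the posterior rises substantially (from about 1.87% to about 14.62%).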
]
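Parts a and b follow the same pattern, so they can be sketched with one small function. The function name `posterior` and its argument layout are assumptions made for illustration; it relies on the same conditional-independence assumption as part b.

```python
def posterior(prior, tests):
    """Posterior P(D | all tests positive).

    tests is a list of (P(+ | D), P(+ | not D)) pairs, assumed
    conditionally independent given the disease status.
    """
    weight_d = prior          # accumulates P(D) * product of P(+ | D)
    weight_not_d = 1 - prior  # accumulates P(not D) * product of P(+ | not D)
    for true_pos, false_pos in tests:
        weight_d *= true_pos
        weight_not_d *= false_pos
    return weight_d / (weight_d + weight_not_d)

test_a = (0.95, 0.05)
test_b = (0.90, 0.10)

p_after_a = posterior(0.001, [test_a])
p_after_ab = posterior(0.001, [test_a, test_b])
print(round(p_after_a, 4), round(p_after_ab, 4))  # 0.0187 0.1462
```

Even with two positive results the posterior stays well below one half, because the prior of 0.001 is so small.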
= Problem 6
*Scenario:* Biking (B) depends on the weather (W) and on staying up late (S).
$P(W="Sun")=0.7, P(S="Yes")=0.4$
The CPT is given.
#answer[
*a. Bayesian network and CPT*
Structure: $W -> B <- S$ (a v-structure)
#cetz.canvas({
import cetz.draw: *
let r = 0.8
// Nodes
circle((0, 0), radius: r, name: "B")
content("B", "B (Bike)")
circle((-2, 3), radius: r, name: "W")
content("W", "W (Weather)")
circle((2, 3), radius: r, name: "S")
content("S", "S (Late)")
// Edges
line("W", "B", mark: (end: ">"))
line("S", "B", mark: (end: ">"))
})
*Conditional probability table (CPT) for B:*
#table(
columns: 3,
align: center,
[W (Weather)], [S (Up late)], [P(B=Yes | W, S)],
[Sun], [No], [0.9],
[Sun], [Yes], [0.6],
[Rain], [No], [0.2],
[Rain], [Yes], [0.1]
)
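*b. Inference: given that the person bikes (B=Yes), find the distribution over the weather*
We need $P(W | B="Yes")$. By Bayes' rule, $P(W|B) = (P(B|W)P(W)) / P(B)$.
First compute the marginal probability $P(B="Yes")$.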
$ P(B=y) = sum_(w, s) P(B=y|w,s)P(w)P(s) $
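Because W and S are independent, $P(w, s) = P(w)P(s)$, with $P(W="Rain") = 0.3$ and $P(S="No") = 0.6$ as the complements of the given values.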
1. $W="Sun", S="No"$: $0.9 times 0.7 times 0.6 = 0.378$
2. $W="Sun", S="Yes"$: $0.6 times 0.7 times 0.4 = 0.168$
3. $W="Rain", S="No"$: $0.2 times 0.3 times 0.6 = 0.036$
4. $W="Rain", S="Yes"$: $0.1 times 0.3 times 0.4 = 0.012$
$ P(B="Yes") = 0.378 + 0.168 + 0.036 + 0.012 = 0.594 $
Compute $P(W="Sun" | B="Yes")$:
The numerator is the sum over all cases with W=Sun (term 1 + term 2):
$ P(B=y, W="Sun") = 0.378 + 0.168 = 0.546 $
$ P(W="Sun" | B="Yes") = 0.546 / 0.594 approx 0.919 $
Similarly, $P(W="Rain" | B="Yes") = (0.036 + 0.012) / 0.594 approx 0.081$.
*Conclusion: the day was most likely sunny.*
]
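The posterior over the weather can be reproduced by enumerating the four (W, S) cases. This is a minimal sketch; the dictionary names are illustrative, and the values are those of the CPT and priors in the problem.

```python
# Priors for the two independent parents.
p_w = {"Sun": 0.7, "Rain": 0.3}
p_s = {"Yes": 0.4, "No": 0.6}

# CPT: P(B = Yes | W, S).
p_b_yes = {
    ("Sun", "No"): 0.9,
    ("Sun", "Yes"): 0.6,
    ("Rain", "No"): 0.2,
    ("Rain", "Yes"): 0.1,
}

# Joint P(B = Yes, w, s) for every parent combination.
joint_b_yes = {
    (w, s): p_b_yes[(w, s)] * p_w[w] * p_s[s]
    for w in p_w for s in p_s
}

# Marginal P(B = Yes), then the posterior over W by summing out S.
p_b = sum(joint_b_yes.values())
post_w = {
    w: sum(joint_b_yes[(w, s)] for s in p_s) / p_b
    for w in p_w
}

print(round(p_b, 3))                               # 0.594
print({w: round(p, 3) for w, p in post_w.items()})  # {'Sun': 0.919, 'Rain': 0.081}
```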
= Problem 7
*Smart health assistant:* Variables S (smoking), G (gene), L (lung cancer), C (cough), X (chest X-ray).
Edges: $S->L, G->L, S->C, L->C, L->X$.
#answer[
*Question 1: Bayesian network diagram*
#align(center)[
#cetz.canvas({
import cetz.draw: *
// Manually positioning nodes for clarity
let r = 0.5
circle((0, 0), radius: r, name: "L")
content("L", "L")
circle((-2, 2), radius: r, name: "S")
content("S", "S")
circle((2, 2), radius: r, name: "G")
content("G", "G")
circle((-1, -2), radius: r, name: "C")
content("C", "C")
circle((1, -2), radius: r, name: "X")
content("X", "X")
line("S", "L", mark: (end: ">"))
line("G", "L", mark: (end: ">"))
line("L", "X", mark: (end: ">"))
line("L", "C", mark: (end: ">"))
// Direct edge S -> C
line("S", "C", mark: (end: ">"))
})
]
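*Question 2: probabilistic inference, $P(L="Yes" | C="No", X="Yes")$*
Let $alpha$ be the normalization constant, so that $P(L | C=n, X=y) = alpha P(L, C=n, X=y)$. We need $P(L=y, C=n, X=y)$ and $P(L=n, C=n, X=y)$.
Factorization: $P(S,G,L,C,X) = P(S)P(G)P(L|S,G)P(C|L,S)P(X|L)$
Summing out S and G: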
$ P(L, C, X) = P(X|L) sum_S sum_G P(C|L,S) P(L|S,G) P(S) P(G) $
*Case 1: L = Yes* (with $C=n, X=y$)
Factor: $P(X=y|L=y) = 0.9$
Inner sum $Sigma_(L=y)$:
- $S=y, G=h$: $P(C=n|L=y,S=y)P(L=y|S=y,G=h)P(S=y)P(G=h) = 0.2 times 0.6 times 0.3 times 0.2 = 0.0072$
- $S=y, G=l$: $0.2 times 0.3 times 0.3 times 0.8 = 0.0144$
- $S=n, G=h$: $P(C=n|L=y,S=n)P(L=y|S=n,G=h)P(S=n)P(G=h) = 0.4 times 0.4 times 0.7 times 0.2 = 0.0224$
- $S=n, G=l$: $0.4 times 0.1 times 0.7 times 0.8 = 0.0224$
$Sigma_(L=y) = 0.0072 + 0.0144 + 0.0224 + 0.0224 = 0.0664$
$P(L=y, C=n, X=y) = 0.9 times 0.0664 = 0.05976$
*Case 2: L = No* (with $C=n, X=y$)
Factor: $P(X=y|L=n) = 0.1$
Inner sum $Sigma_(L=n)$:
- $S=y, G=h$: $P(C=n|L=n,S=y)P(L=n|S=y,G=h)P(S=y)P(G=h) = 0.7 times 0.4 times 0.3 times 0.2 = 0.0168$
- $S=y, G=l$: $0.7 times 0.7 times 0.3 times 0.8 = 0.1176$
- $S=n, G=h$: $P(C=n|L=n,S=n)P(L=n|S=n,G=h)P(S=n)P(G=h) = 0.9 times 0.6 times 0.7 times 0.2 = 0.0756$
- $S=n, G=l$: $0.9 times 0.9 times 0.7 times 0.8 = 0.4536$
$Sigma_(L=n) = 0.0168 + 0.1176 + 0.0756 + 0.4536 = 0.6636$
$P(L=n, C=n, X=y) = 0.1 times 0.6636 = 0.06636$
*Final normalization:*
$ P(L="Yes" | ...) &= 0.05976 / (0.05976 + 0.06636) \
&= 0.05976 / 0.12612 approx 0.4738 $
*Answer: 47.38%*
]
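The enumeration in Question 2 can be sketched in a few lines. The CPT values below are read off the terms of the worked sums, since the problem's tables are not reproduced here; the dictionary layout, the keys `"h"` and `"l"` for the two gene values, and the function name `unnormalized` are assumptions made for illustration.

```python
from itertools import product

# Priors, keyed by whether the person smokes and by gene value.
p_s = {True: 0.3, False: 0.7}
p_g = {"h": 0.2, "l": 0.8}

# P(L = yes | S, G), keyed by (s, g).
p_l_yes = {(True, "h"): 0.6, (True, "l"): 0.3,
           (False, "h"): 0.4, (False, "l"): 0.1}

# P(C = no | L, S), keyed by (l, s).
p_c_no = {(True, True): 0.2, (True, False): 0.4,
          (False, True): 0.7, (False, False): 0.9}

# P(X = yes | L), keyed by l.
p_x_yes = {True: 0.9, False: 0.1}

def unnormalized(l):
    """P(L = l, C = no, X = yes), summing out S and G."""
    total = 0.0
    for s, g in product(p_s, p_g):
        p_l = p_l_yes[(s, g)] if l else 1 - p_l_yes[(s, g)]
        total += p_c_no[(l, s)] * p_l * p_s[s] * p_g[g]
    return p_x_yes[l] * total

weight_yes = unnormalized(True)
weight_no = unnormalized(False)
p_l_given_evidence = weight_yes / (weight_yes + weight_no)

print(round(weight_yes, 5), round(weight_no, 5))  # 0.05976 0.06636
print(round(p_l_given_evidence, 4))               # 0.4738
```

Dividing by the sum of the two weights plays the role of the normalization constant, so it never has to be computed separately.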