Note

For a PINN, the loss is usually built from two basic terms, a physics loss and a boundary loss:

$$\mathcal{L}(\theta) = \mathcal{L}_p(\theta) + \mathcal{L}_b(\theta),$$

The two terms take the following form:

$$\mathcal{L}_p(\theta) = \frac{1}{N_p} \sum_{i}^{N_p} | \mathcal{D}[NN(x_i; \theta); \lambda] - f(x_i) |^2,$$ $$\mathcal{L}_b(\theta) = \sum_{k} \frac{1}{N_{bk}} \sum_{j}^{N_{bk}} | \mathcal{B}_k[NN(x_{kj}; \theta)] - g_k(x_{kj}) |^2,$$
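As a rough illustration, the two terms might be assembled like this in PyTorch; `pde_residual`, `boundary_residual`, and `model` are hypothetical placeholders, not part of the reference implementation further down:

```python
import torch

def soft_constraint_loss(model, x_pde, x_bnd, pde_residual, boundary_residual):
    """Hypothetical sketch: physics loss plus boundary loss (soft constraints)."""
    # Physics term: mean squared PDE residual over the collocation points
    loss_p = torch.mean(pde_residual(model, x_pde) ** 2)
    # Boundary term: mean squared mismatch of the boundary operator
    loss_b = torch.mean(boundary_residual(model, x_bnd) ** 2)
    # The two terms are simply summed here; in practice a relative weight is often needed
    return loss_p + loss_b
```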

This formulation is known as a soft constraint. In practice the two terms often conflict during training, and tuning their relative weights is tedious work. An alternative is to build the boundary conditions directly into the approximate solution, so that they are satisfied by construction. For the wave equation, for example:

$$\left[ \nabla^2 - \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] u(x, t) = f(x, t),$$ $$u(x, 0) = g_1(x),$$ $$\frac{\partial u}{\partial t}(x, 0) = g_2(x),$$

we can assume an approximate solution of the form:

$$\hat{u}(x, t; \theta) = g_1(x) + t\,g_2(x) + t^2\,NN(x, t; \theta)$$

In this way, at $t=0$ the solution is forced to equal $g_1(x)$, and after differentiating with respect to $t$ it is also forced to satisfy $g_2(x)$ at $t=0$. The loss then only needs to contain the physics loss $\mathcal{L}_p(\theta)$, and training becomes a simple unconstrained optimization problem.
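As a sketch of how such an ansatz could be coded (here `g1`, `g2`, and `model` are hypothetical callables, not part of the 1D example below):

```python
import torch

def u_hat_wave(x, t, model, g1, g2):
    """Hard-constraint ansatz for the wave equation's initial conditions.

    By construction, the output equals g1(x) at t = 0, and its time
    derivative at t = 0 equals g2(x).
    """
    nn_out = model(torch.cat([x, t], dim=-1))  # NN(x, t; theta)
    return g1(x) + t * g2(x) + t**2 * nn_out
```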

Consider a one-dimensional example:
$$\frac{du}{dx} = \cos(\omega x),$$ $$u(0) = 0,$$
This equation has the exact solution:
$$u(x) = \frac{1}{\omega} \sin(\omega x)$$
We design the approximate solution as:
$$\hat{u}(x; \theta) = \tanh(\omega x) NN(x; \theta)$$
At $x=0$ we have $\tanh(\omega x)=0$, and away from the boundary $x=0$ its value tends to $\pm 1$, so the network does not have to spend much capacity compensating for this factor. The loss function used by the PINN is then:
$$\mathcal{L}(\theta) = \mathcal{L}_p(\theta) = \frac{1}{N_p} \sum_{i}^{N_p} \left| \frac{d}{dx} \hat{u}(x_i; \theta) - \cos(\omega x_i) \right|^2$$
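For reference, the derivative entering this loss follows from the product rule applied to the ansatz:

$$\frac{d\hat{u}}{dx} = \omega\,\mathrm{sech}^2(\omega x)\,NN(x; \theta) + \tanh(\omega x)\,\frac{d\,NN(x; \theta)}{dx},$$

which is exactly what automatic differentiation computes in the `dudx` function of the code below.
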
We first train the low-frequency case $\omega = 1$, using a fully connected network with 2 hidden layers, 16 neurons per layer, and tanh activations, with 200 training points uniformly distributed over $x \in [-2\pi, 2\pi]$. Before $x$ enters the network it is normalized to $[-1, 1]$, and the network output is multiplied by $\frac{1}{\omega}$ to match the amplitude of the exact solution.
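These settings correspond to the parameters at the top of the full script further down (the script is shown configured for the high-frequency case; change them as follows to reproduce this run):

```python
omega = 1        # low-frequency case
units = 16       # 16 neurons per hidden layer
depth = 2        # 2 hidden layers
train_num = 200  # 200 uniformly spaced training points on [-2*pi, 2*pi]
```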

In the high-frequency case $\omega = 15$, the number of training points is increased to 3000 and three different network configurations are used; the resulting predictions and loss decay are shown in the figure below.
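The three configurations (matching the result files loaded by the comparison script at the end of this post) are:

```python
# Network configurations used for the omega = 15 runs
configs = [
    {"units": 16,  "depth": 2},   # 2 layers, 16 hidden units
    {"units": 64,  "depth": 4},   # 4 layers, 64 hidden units
    {"units": 128, "depth": 5},   # 5 layers, 128 hidden units
]
```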

Although the PINN performs well at low frequency, it struggles badly at high frequency: learning the high-frequency solution requires far more parameters, converges slowly, and still reaches poor accuracy. For this problem, scaling to higher frequency is equivalent to scaling to a larger domain: since $x$ is normalized to $[-1, 1]$ before entering the network, keeping $\omega = 1$ while enlarging the domain by a factor of 15 and renormalizing presents the optimizer with exactly the same problem as setting $\omega = 15$ directly (substituting $y = \omega x$ turns $\frac{du}{dx} = \cos(\omega x)$ on $[-2\pi, 2\pi]$ into $\frac{du}{dy} = \frac{1}{\omega}\cos(y)$ on $[-2\pi\omega, 2\pi\omega]$). This is what leads to FBPINNs. The figure below compares the $L1$ errors of the four cases.

Code

Full code implementation:

```python
import numpy as np
import matplotlib.pyplot as plt

import torch
import torch.nn as nn

# torch.manual_seed(0)
# np.random.seed(0)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

omega = 15
units = 128
depth = 5
train_num = 3000

class Net(nn.Module):
    def __init__(self, in_dim=1, out_dim=1, units=128, depth=5):
        super().__init__()

        layers = []
        layers.append(nn.Linear(in_dim, units))
        layers.append(nn.Tanh())

        for _ in range(depth - 1):
            layers.append(nn.Linear(units, units))
            layers.append(nn.Tanh())

        layers.append(nn.Linear(units, out_dim))
        self.net = nn.Sequential(*layers)

    def forward(self, x_norm):
        return self.net(x_norm)


model = Net(in_dim=1, out_dim=1, units=units, depth=depth).to(device)

xmin = -2*np.pi
xmax = 2*np.pi

# Training points
x = np.linspace(xmin, xmax, train_num).reshape(-1, 1)
x = torch.tensor(x, dtype=torch.float32, requires_grad=True).to(device)

# 5000 evaluation points for the L1 error (no grad needed)
x_eval = torch.linspace(xmin, xmax, 5000).reshape(-1, 1).to(device)

def normalize_x(x):
    return 2.0 * (x - xmin) / (xmax - xmin) - 1.0

def u_hat(x):
    x_norm = normalize_x(x)
    nn_out = model(x_norm)
    nn_out = (1.0/omega) * nn_out
    return torch.tanh(omega * x) * nn_out

def dudx(x):
    x.requires_grad_(True)
    u = u_hat(x)
    du_dx = torch.autograd.grad(
        u, x,
        grad_outputs=torch.ones_like(u),
        create_graph=True
    )[0]
    return du_dx

# Exact solution
def u_exact(x):
    return (1.0/omega) * torch.sin(omega * x)

# --------------------------
# Training loop
# --------------------------
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_history = []
l1_history = []  # L1 error recorded at every epoch

epochs = 10000
for epoch in range(epochs):
    optimizer.zero_grad()

    pred = dudx(x)
    target = torch.cos(omega * x)
    loss = torch.mean((pred - target)**2)
    loss.backward()
    optimizer.step()

    loss_history.append(loss.item())

    # Evaluate the L1 error on the 5000 evaluation points every epoch
    with torch.no_grad():
        u_pred_eval = u_hat(x_eval)    # PINN prediction
        u_true_eval = u_exact(x_eval)  # exact solution
        # L1 = mean absolute error; for a relative L1, divide by mean(|u_true_eval|)
        l1 = torch.mean(torch.abs(u_pred_eval - u_true_eval)).item()
        l1_history.append(l1)

    if epoch % 500 == 0:
        print(f"Iter {epoch}/{epochs}, Loss = {loss.item():.6e}, L1 = {l1:.6e}")

# Save the L1 curve after training so that several runs can be plotted together later
l1_array = np.array(l1_history)
np.savetxt(f"L1_omega{omega}_units{units}_depth{depth}.txt", l1_array)

# --------------------------
# Visualize the solution & convergence curves
# --------------------------
x_test = torch.linspace(xmin, xmax, 1000).reshape(-1,1).to(device)
u_pinn = u_hat(x_test).detach().cpu().numpy()
u_real = u_exact(x_test).detach().cpu().numpy()
x_test_np = x_test.detach().cpu().numpy()

plt.figure(figsize=(8,4))
plt.plot(x_test_np, u_real, label="Exact solution", linewidth=2)
plt.plot(x_test_np, u_pinn, '--', label="PINN", linewidth=2)
plt.legend()
plt.title(rf"PINN ($\omega = {omega}$, depth={depth}, units={units})")
plt.show()

plt.figure(figsize=(6,4))
plt.plot(loss_history)
plt.yscale("log")
plt.xlabel("Iteration")
plt.ylabel("MSE (log scale)")
plt.title("Physics loss convergence")
plt.show()

plt.figure(figsize=(6,4))
plt.plot(l1_history)
plt.yscale("log")
plt.xlabel("Iteration")
plt.ylabel("L1 error (log scale)")
plt.title("L1 convergence (5000 eval points)")
plt.show()
```

Note the three key functions, repeated below. `u_hat` builds our assumed approximate solution, with normalization handled inside it: although the network receives `x_norm` as input, the derivative in `dudx` is taken with respect to `x`, so by the chain rule the PDE itself is left unchanged. A quick sanity check of this is sketched after the excerpt.

```python
def normalize_x(x):
    return 2.0 * (x - xmin) / (xmax - xmin) - 1.0

def u_hat(x):
    x_norm = normalize_x(x)
    nn_out = model(x_norm)
    nn_out = (1.0/omega) * nn_out
    return torch.tanh(omega * x) * nn_out

def dudx(x):
    x.requires_grad_(True)
    u = u_hat(x)
    du_dx = torch.autograd.grad(
        u, x,
        grad_outputs=torch.ones_like(u),
        create_graph=True
    )[0]
    return du_dx
```
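If in doubt, the chain-rule handling can be verified by comparing `dudx` against a simple central finite difference; this is a quick sketch assuming the full script above has already defined `u_hat`, `dudx`, `xmin`, `xmax`, and `device`:

```python
# Sanity check: autograd derivative vs. central finite difference
x_check = torch.linspace(xmin, xmax, 200).reshape(-1, 1).to(device)
eps = 1e-4

du_autograd = dudx(x_check.clone())
with torch.no_grad():
    du_fd = (u_hat(x_check + eps) - u_hat(x_check - eps)) / (2 * eps)

print("max |autograd - finite difference|:",
      (du_autograd - du_fd).abs().max().item())
```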

Error plot comparison:

```python
import numpy as np
import matplotlib.pyplot as plt

#-----------------------------
# Result files
#-----------------------------
files = [
    "L1_omega1_units16_depth2.txt",
    "L1_omega15_units16_depth2.txt",
    "L1_omega15_units64_depth4.txt",
    "L1_omega15_units128_depth5.txt",
]

#-----------------------------
# Label for each curve (same order as the files)
#-----------------------------
labels = [
    "PINN ($\\omega = 1$, 2 layers, 16 hidden units)",
    "PINN ($\\omega = 15$, 2 layers, 16 hidden units)",
    "PINN ($\\omega = 15$, 4 layers, 64 hidden units)",
    "PINN ($\\omega = 15$, 5 layers, 128 hidden units)",
]

# Custom colors (optional)
colors = ["red", "purple", "brown", "gray"]

plt.figure(figsize=(10,4))

#-----------------------------
# Load and plot each curve
#-----------------------------
for f, lab, c in zip(files, labels, colors):
    l1 = np.loadtxt(f)
    plt.plot(l1, label=lab, linewidth=2, color=c)

#-----------------------------
# Figure formatting
#-----------------------------
plt.yscale("log")
plt.xlabel("Iteration")
plt.ylabel("L1 error (log)")
plt.legend(
    fontsize=10,
    loc='center left',       # anchor the legend by its left-center point
    bbox_to_anchor=(1, 0.5)  # (x, y); x=1 places it just outside the right edge of the axes
)

plt.title("L1 convergence comparison")
plt.show()
```

Reference

Moseley, B., Markham, A., Nissen-Meyer, T. Finite basis physics-informed neural networks (FBPINNs): a scalable domain decomposition approach for solving differential equations. Advances in Computational Mathematics (2023).