A Summary of NUMA Architecture

Posted by Kird on 2020-06-11

Background

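At work I came across DPDK. Making use of the NUMA architecture is one of the tuning points for improving DPDK performance, and I had also picked up some NUMA-related operations knowledge during day-to-day maintenance, so I wrote this summary for my own reference. It is mainly based on the talk "Performance Challenges under the NUMA Architecture" (《NUMA架构下的性能挑战》) and on other good articles found online.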

From SMP to NUMA

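The CPU is like a highway: it keeps loading data from memory, computing on it, and writing it back to memory. The speed at which data can be read and written back is a key constraint on system performance. As process technology advanced, transistor density on the chip kept rising, and the power wall forced CPUs to move from a single core to multiple cores.
From single core to SMP: all cores access memory over a shared bus. Any process can be scheduled onto any core, which gives good load balancing.
From SMP to NUMA: in an SMP system, scaling the number of cores is limited by the memory bus. Non-uniform memory access (NUMA) addresses this problem. Each group of cores (a node) has its own local memory, so local accesses no longer compete on a single bus, at the cost of slower access to memory attached to other nodes.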
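To make "non-uniform" concrete, Linux exposes the relative access cost between nodes as a distance vector in sysfs. The following is a minimal sketch, assuming the usual sysfs layout `/sys/devices/system/node/nodeN/distance`; the function name `print_numa_distances` and its optional directory argument are illustrative only.

```shell
#!/bin/bash
# Print the NUMA distance vector of every node.
# $1 (optional): sysfs node directory; defaults to the usual Linux location.
print_numa_distances() {
    local root="${1:-/sys/devices/system/node}"
    local dir
    for dir in "$root"/node[0-9]*; do
        # Skip when the glob matched nothing or the file is absent.
        [ -r "$dir/distance" ] || continue
        printf '%s: %s\n' "${dir##*/}" "$(cat "$dir/distance")"
    done
}

print_numa_distances
```

In each vector, the entry for the node itself is normally 10 (local access), and larger values indicate proportionally more expensive remote access.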

Performance Tuning under NUMA

Using numactl to control NUMA policy
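As a sketch of what this can look like in practice, the helper below builds a numactl command line that binds both the CPUs and the memory allocations of a program to one node, using numactl's `--cpunodebind` and `--membind` options. The wrapper name `numa_bind_cmd` and the program `./app` are hypothetical; the helper only prints the command so that it can be inspected before being run.

```shell
#!/bin/bash
# Build a numactl command that binds a program's CPUs and memory to one node.
# Usage: numa_bind_cmd NODE COMMAND [ARGS...]
numa_bind_cmd() {
    local node="$1"
    shift
    # --cpunodebind restricts execution to the CPUs of the node,
    # --membind restricts memory allocation to the memory of the node.
    echo "numactl --cpunodebind=$node --membind=$node $*"
}

# Dry run: print the command instead of executing it (./app is a placeholder).
numa_bind_cmd 0 ./app --workers 4
# prints: numactl --cpunodebind=0 --membind=0 ./app --workers 4
```

Where numactl is installed, `numactl --hardware` lists the nodes with their CPUs and memory sizes, and `numactl --show` prints the policy of the current shell.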

Using CPU affinity to improve performance

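CPU affinity is the tendency of a process to keep running on a given core for as long as possible instead of being migrated to another processor.
Nginx: the NIC interrupts and the worker processes can be bound to the same NUMA node.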
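One way to pin a process to the CPUs of a node is `taskset -c` with the node's CPU list taken from sysfs. This is a minimal sketch under assumptions: the helper names `node_cpulist` and `pin_to_node_cmd` and the program `./worker` are made up for illustration, and the command is only printed, not executed.

```shell
#!/bin/bash
# Print the CPU list of a NUMA node, e.g. "0-5,12-17".
# $1: node number; $2 (optional): sysfs node directory.
node_cpulist() {
    local node="$1" root="${2:-/sys/devices/system/node}"
    cat "$root/node$node/cpulist" 2>/dev/null
}

# Print a taskset command that would start COMMAND on the CPUs of NODE.
# Usage: pin_to_node_cmd NODE SYSFS_NODE_DIR COMMAND [ARGS...]
pin_to_node_cmd() {
    local node="$1" root="$2" cpus
    shift 2
    cpus=$(node_cpulist "$node" "$root") || return 1
    [ -n "$cpus" ] || return 1      # a memory-only node has no CPUs
    echo "taskset -c $cpus $*"
}

# Dry run for node 0; prints a notice when sysfs has no such node.
pin_to_node_cmd 0 /sys/devices/system/node ./worker \
    || echo "node0 cpulist not available"
```

For the interrupt side, the kernel exposes `/proc/irq/<irq>/smp_affinity_list`, which accepts the same CPU-list syntax. Which IRQ numbers belong to the NIC is machine-specific and has to be looked up in `/proc/interrupts`.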

Performance Challenges under NUMA

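Because memory access is non-uniform under NUMA, the first rule is to reduce memory copies across NUMA nodes. There are other performance challenges as well, such as locks (avoid them where possible) and false sharing.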
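False sharing happens when independent variables written by different cores sit in the same cache line, so every write invalidates the line on the other core. The arithmetic can be sketched as follows; the function name `same_cache_line` is hypothetical, and the 64-byte default line size is an assumption that should be checked on the target CPU.

```shell
#!/bin/bash
# Report whether two byte offsets fall into the same cache line.
# Usage: same_cache_line OFFSET_A OFFSET_B [LINE_SIZE]
same_cache_line() {
    local a="$1" b="$2" line="${3:-64}"   # 64 bytes is an assumed default
    if [ $(( a / line )) -eq $(( b / line )) ]; then
        echo "shared"
    else
        echo "separate"
    fi
}

# Two 8-byte counters placed back to back share one line:
same_cache_line 0 8      # prints: shared
# Padding the second counter out to the next line separates them:
same_cache_line 0 64     # prints: separate
```

On glibc systems `getconf LEVEL1_DCACHE_LINESIZE` usually reports the line size, although it may return 0 or nothing on some platforms.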

Common Operations

Enabling NUMA

This requires settings in the BIOS, kernel boot parameters, and so on.
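The boot-parameter side can be checked from a running system. A minimal sketch, assuming an x86_64 Linux kernel where `numa=off` on the kernel command line disables NUMA; the function name `numa_boot_state` is made up for illustration. BIOS option names vary by vendor and are not covered here.

```shell
#!/bin/bash
# Decide from a kernel command line whether NUMA was disabled at boot.
# Usage: numa_boot_state "CMDLINE"
numa_boot_state() {
    case " $1 " in
        *" numa=off "*) echo "disabled" ;;
        *)              echo "not disabled" ;;
    esac
}

# Check the running kernel (the file may be unreadable in restricted containers).
if [ -r /proc/cmdline ]; then
    numa_boot_state "$(cat /proc/cmdline)"
fi

# Automatic NUMA balancing: 0 = off, non-zero = on (file absent if unsupported).
if [ -r /proc/sys/kernel/numa_balancing ]; then
    echo "numa_balancing=$(cat /proc/sys/kernel/numa_balancing)"
fi
```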

Inspecting a server's NUMA layout

Check with lscpu:

[root@0xfe ~]# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
Stepping: 2
CPU MHz: 2399.906
BogoMIPS: 4794.58
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 15360K
NUMA node0 CPU(s): 0-5,12-17
NUMA node1 CPU(s): 6-11,18-23

NUMA is enabled, as the output shows:

  • NUMA node0 CPU(s): 0-5,12-17
  • NUMA node1 CPU(s): 6-11,18-23
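The per-node CPU lists can also be extracted and counted by a script. The sketch below parses lscpu-style text from stdin; the function names `count_cpus_in_list` and `summarize_lscpu_numa` are hypothetical, and the parsing assumes the `NUMA nodeN CPU(s):` line format shown above.

```shell
#!/bin/bash
# Count the CPUs in a Linux CPU list such as "0-5,12-17".
count_cpus_in_list() {
    local list="$1" total=0 part first last
    local IFS=','
    for part in $list; do
        first="${part%-*}"      # "0-5" -> 0, "7" -> 7
        last="${part#*-}"       # "0-5" -> 5, "7" -> 7
        total=$(( total + last - first + 1 ))
    done
    echo "$total"
}

# Print "<node> <cpu list> <cpu count>" for each NUMA node line on stdin.
summarize_lscpu_numa() {
    local name list
    while IFS=: read -r name list; do
        name="${name#"${name%%[![:space:]]*}"}"   # trim leading whitespace
        case "$name" in
            "NUMA node"[0-9]*" CPU(s)")
                list="${list//[[:space:]]/}"
                echo "${name%% CPU*} $list $(count_cpus_in_list "$list")"
                ;;
        esac
    done
}

# Sample input taken from the lscpu output above.
printf 'NUMA node0 CPU(s): 0-5,12-17\nNUMA node1 CPU(s): 6-11,18-23\n' \
    | summarize_lscpu_numa
```

On a live system the same function can be fed with `lscpu | summarize_lscpu_numa`.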

Check with dmesg:

[root@0xfe ~]# grep -i NUMA /var/log/dmesg
[ 0.000000] NUMA: Initialized distance table, cnt=2
[ 0.000000] NUMA: Node 0 [mem 0x00000000-0x7fffffff] + [mem 0x100000000-0x107fffffff] -> [mem 0x00000000-0x107fffffff]
[ 0.000000] Enabling automatic NUMA balancing. Configure with numa_balancing= or the kernel.numa_balancing sysctl
[ 0.786438] pci_bus 0000:00: on NUMA node 0
[ 0.789870] pci_bus 0000:80: on NUMA node 1
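The `pci_bus ... on NUMA node` lines matter when placing network workloads, because a NIC is attached to one node. A minimal sketch for looking up the node of an interface, assuming the sysfs attribute `/sys/class/net/<iface>/device/numa_node`; the function name `nic_numa_node` is made up for illustration.

```shell
#!/bin/bash
# Print the NUMA node that a network interface's PCI device is attached to.
# Usage: nic_numa_node IFACE [SYSFS_NET_DIR]
nic_numa_node() {
    local iface="$1" root="${2:-/sys/class/net}"
    local file="$root/$iface/device/numa_node"
    if [ -r "$file" ]; then
        cat "$file"      # -1 means the platform reported no node for the device
    else
        echo "unknown"   # e.g. virtual interfaces such as lo have no PCI device
    fi
}

nic_numa_node lo
```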

Here is a shell script that prints the CPU topology:

#!/bin/bash

# Print a simple CPU topology table based on /proc/cpuinfo

# Number of logical processors
function get_nr_processor()
{
    grep -c '^processor' /proc/cpuinfo
}

# Number of physical sockets (distinct "physical id" values)
function get_nr_socket()
{
    grep 'physical id' /proc/cpuinfo | awk -F: '{
        print $2 | "sort -un"}' | wc -l
}

# Logical processors per socket
function get_nr_siblings()
{
    grep 'siblings' /proc/cpuinfo | awk -F: '{
        print $2 | "sort -un"}'
}

# Physical cores per socket
function get_nr_cores_of_socket()
{
    grep 'cpu cores' /proc/cpuinfo | awk -F: '{
        print $2 | "sort -un"}'
}

echo '===== CPU Topology Table ====='
echo

echo '+--------------+---------+-----------+'
echo '| Processor ID | Core ID | Socket ID |'
echo '+--------------+---------+-----------+'

# Each processor block in /proc/cpuinfo ends with an empty line;
# print one table row when that empty line is reached.
while read -r line; do
    if [ -z "$line" ]; then
        printf '| %-12s | %-7s | %-9s |\n' "$p_id" "$c_id" "$s_id"
        echo '+--------------+---------+-----------+'
        continue
    fi

    if echo "$line" | grep -q "^processor"; then
        p_id=$(echo "$line" | awk -F: '{print $2}' | tr -d ' ')
    fi

    if echo "$line" | grep -q "^core id"; then
        c_id=$(echo "$line" | awk -F: '{print $2}' | tr -d ' ')
    fi

    if echo "$line" | grep -q "^physical id"; then
        s_id=$(echo "$line" | awk -F: '{print $2}' | tr -d ' ')
    fi
done < /proc/cpuinfo

echo

# Group the logical processors by socket
awk -F: '{
    if ($1 ~ /processor/) {
        gsub(/ /,"",$2);
        p_id=$2;
    } else if ($1 ~ /physical id/){
        gsub(/ /,"",$2);
        s_id=$2;
        arr[s_id]=arr[s_id] " " p_id
    }
}
END{
    for (i in arr)
        printf "Socket %s:%s\n", i, arr[i];
}' /proc/cpuinfo

echo
echo '===== CPU Info Summary ====='
echo

nr_processor=$(get_nr_processor)
echo "Logical processors: $nr_processor"

nr_socket=$(get_nr_socket)
echo "Physical socket: $nr_socket"

nr_siblings=$(get_nr_siblings)
echo "Siblings in one socket: $nr_siblings"

nr_cores=$(get_nr_cores_of_socket)
echo "Cores in one socket: $nr_cores"

nr_cores=$(( nr_cores * nr_socket ))
echo "Cores in total: $nr_cores"

# More logical processors than physical cores means Hyper-Threading is on
if [ "$nr_cores" -eq "$nr_processor" ]; then
    echo "Hyper-Threading: off"
else
    echo "Hyper-Threading: on"
fi

echo
echo '===== END ====='

For the machine whose CPU information is shown above, the output is:

===== CPU Topology Table =====

+--------------+---------+-----------+
| Processor ID | Core ID | Socket ID |
+--------------+---------+-----------+
| 0 | 0 | 0 |
+--------------+---------+-----------+
| 1 | 1 | 0 |
+--------------+---------+-----------+
| 2 | 2 | 0 |
+--------------+---------+-----------+
| 3 | 3 | 0 |
+--------------+---------+-----------+
| 4 | 4 | 0 |
+--------------+---------+-----------+
| 5 | 5 | 0 |
+--------------+---------+-----------+
| 6 | 0 | 1 |
+--------------+---------+-----------+
| 7 | 1 | 1 |
+--------------+---------+-----------+
| 8 | 2 | 1 |
+--------------+---------+-----------+
| 9 | 3 | 1 |
+--------------+---------+-----------+
| 10 | 4 | 1 |
+--------------+---------+-----------+
| 11 | 5 | 1 |
+--------------+---------+-----------+
| 12 | 0 | 0 |
+--------------+---------+-----------+
| 13 | 1 | 0 |
+--------------+---------+-----------+
| 14 | 2 | 0 |
+--------------+---------+-----------+
| 15 | 3 | 0 |
+--------------+---------+-----------+
| 16 | 4 | 0 |
+--------------+---------+-----------+
| 17 | 5 | 0 |
+--------------+---------+-----------+
| 18 | 0 | 1 |
+--------------+---------+-----------+
| 19 | 1 | 1 |
+--------------+---------+-----------+
| 20 | 2 | 1 |
+--------------+---------+-----------+
| 21 | 3 | 1 |
+--------------+---------+-----------+
| 22 | 4 | 1 |
+--------------+---------+-----------+
| 23 | 5 | 1 |
+--------------+---------+-----------+

Socket 0: 0 1 2 3 4 5 12 13 14 15 16 17
Socket 1: 6 7 8 9 10 11 18 19 20 21 22 23

===== CPU Info Summary =====

Logical processors: 24
Physical socket: 2
Siblings in one socket: 12
Cores in one socket: 6
Cores in total: 12
Hyper-Threading: on

===== END =====

Combined with the NUMA information from lscpu, we can see that on this machine the CPU cores of each socket form one NUMA node.
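The socket-to-node relationship can also be read directly from sysfs rather than inferred by comparing two outputs. This is a sketch under assumptions: it relies on `topology/physical_package_id` and on the `nodeN` entry that NUMA-aware kernels place in each CPU directory, and the function name `map_cpu_socket_node` is hypothetical.

```shell
#!/bin/bash
# Print "cpuN socket=S node=M" for every CPU found under the sysfs CPU directory.
# $1 (optional): sysfs CPU directory; defaults to the usual Linux location.
map_cpu_socket_node() {
    local root="${1:-/sys/devices/system/cpu}"
    local cpu socket node link
    for cpu in "$root"/cpu[0-9]*; do
        [ -r "$cpu/topology/physical_package_id" ] || continue
        socket=$(cat "$cpu/topology/physical_package_id")
        node="?"
        # On NUMA-aware kernels each CPU directory contains a nodeM entry.
        for link in "$cpu"/node[0-9]*; do
            [ -e "$link" ] && node="${link##*/node}"
        done
        echo "${cpu##*/} socket=$socket node=$node"
    done
}

map_cpu_socket_node
```

The rows come out in glob order (cpu0, cpu1, cpu10, ...), not numeric order. A one-to-one mapping between sockets and nodes is not guaranteed on every platform, so it is worth checking per machine.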

Viewing NUMA node information with top

Inside top, the following keys are available:
Press 1: show per-CPU information
Press 2: show per-NUMA-node information

top - 16:08:52 up 218 days,  2:00,  2 users,  load average: 0.00, 0.02, 0.05
Tasks: 285 total, 1 running, 284 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Node0 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Node1 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 13173974+total, 10199949+free, 6190968 used, 23549276 buff/cache
KiB Swap: 12582908 total, 12582908 free, 0 used. 12067773+avail Mem

Press 3, then enter a node number: show per-CPU information for that node

top - 16:09:31 up 218 days,  2:01,  3 users,  load average: 0.00, 0.02, 0.05
Tasks: 288 total, 1 running, 287 sleeping, 0 stopped, 0 zombie
%Node0 : 0.3 us, 0.2 sy, 0.0 ni, 99.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu0 : 2.1 us, 1.5 sy, 0.0 ni, 96.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 0.2 us, 0.2 sy, 0.0 ni, 99.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 0.2 us, 0.2 sy, 0.0 ni, 99.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu4 : 0.4 us, 0.2 sy, 0.0 ni, 99.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu12 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu13 : 0.4 us, 0.6 sy, 0.0 ni, 99.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu14 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu15 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu16 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu17 : 0.0 us, 0.2 sy, 0.0 ni, 99.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 13173974+total, 10198974+free, 6196900 used, 23553100 buff/cache
KiB Swap: 12582908 total, 12582908 free, 0 used. 12067175+avail Mem

Viewing memory allocation with numastat

[root@0xfe numa]# numastat
node0 node1
numa_hit 80797145898 64906038913
numa_miss 0 16481110447
numa_foreign 16481110447 0
interleave_hit 33465 33912
local_node 80796961297 64906040940
other_node 184601 16481108420
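In this output, numa_hit counts pages allocated on the node that was intended, while numa_miss counts pages that landed on a node although another node was preferred. A minimal sketch that turns the counters into a per-node miss percentage; the function name `numa_miss_ratio` is hypothetical, and the input is assumed to be numastat's default table with a `node0 node1 ...` header line.

```shell
#!/bin/bash
# Read numastat output on stdin and print, per node:
#   numa_miss / (numa_hit + numa_miss) * 100
numa_miss_ratio() {
    awk '
        $1 ~ /^node[0-9]+$/ { n = NF; for (i = 1; i <= n; i++) node[i] = $i; next }
        $1 == "numa_hit"    { for (i = 1; i <= n; i++) hit[i]  = $(i + 1) }
        $1 == "numa_miss"   { for (i = 1; i <= n; i++) miss[i] = $(i + 1) }
        END {
            for (i = 1; i <= n; i++) {
                total = hit[i] + miss[i]
                pct = (total > 0) ? 100 * miss[i] / total : 0
                printf "%s %.2f%%\n", node[i], pct
            }
        }'
}

# Counters copied from the numastat output above.
numa_miss_ratio <<'EOF'
node0 node1
numa_hit 80797145898 64906038913
numa_miss 0 16481110447
EOF
# prints:
# node0 0.00%
# node1 20.25%
```

For the figures above, about a fifth of the pages placed on node1 were originally meant for another node, which matches the numa_foreign count reported for node0.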

