Enabling RDMA on the Intel E810-XXV

zphj1987 2024-03-06

Operating system environment

Operating system

    [root@lab102 ~]
    CentOS Linux release 7.7.1908 (Core)
    [root@lab102 ~]
    Linux lab102 3.10.0-1062.el7.x86_64

Check the PCI devices

    [root@lab102 ~]
    81:00.0 Ethernet controller: Intel Corporation Device 159b (rev 02)
    81:00.1 Ethernet controller: Intel Corporation Device 159b (rev 02)

Update the PCI ID database
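
The command for this step is not shown. A minimal sketch, assuming the `update-pciids` utility from pciutils is installed and the host can reach the network: refresh the database only when `lspci` still reports a generic `Device <id>` name. The helper names `has_generic_name` and `refresh_pci_ids_if_needed` are illustrative, not from the original post.

```shell
# Return 0 if an lspci line still shows a generic "Device <hex id>" name,
# which suggests the local pci.ids database does not know the device yet.
has_generic_name() {
    printf '%s\n' "$1" | grep -Eq 'Device [0-9a-f]{4}( |$)'
}

# Refresh pci.ids (needs root and network access) only when some Ethernet
# controller is still unnamed, then list the controllers again.
refresh_pci_ids_if_needed() {
    if lspci | grep -i 'ethernet' | grep -Eq 'Device [0-9a-f]{4}( |$)'; then
        update-pciids
    fi
    lspci | grep -i 'ethernet'
}
```

Running `refresh_pci_ids_if_needed` as root should leave the E810 ports listed with their full product name.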

Check the devices again

    [root@lab102 ~]
    81:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
    81:00.1 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
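
Driver packages

This host runs the CentOS 7.7 kernel; 7.9 should work as well, but 7.7 is the example used here. Three packages need to be downloaded: ice, irdma, and rdma-core.

The three must be installed in that order, and the rdma-core source has to be patched with a file shipped in the irdma package.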

Download the driver packages

    wget https://downloadmirror.intel.com/812404/ice-1.13.7.tar.gz
    wget https://downloadmirror.intel.com/812530/irdma-1.13.43.tgz
    wget https://github.com/linux-rdma/rdma-core/releases/download/v46.0/rdma-core-46.0.tar.gz

Once downloaded, put them all in one directory:

    [root@lab102 rdma]
    /root/rdma
    [root@lab102 rdma]
    total 3500
    -rw-r--r--. 1 root root 1298302 Dec 28 11:42 ice-1.13.7.tar.gz
    -rw-r--r--. 1 root root  342440 Dec 30 07:36 irdma-1.13.43.tgz
    -rw-r--r--. 1 root root 1940926 Mar  5 18:32 rdma-core-46.0.tar.gz
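
Since the later steps depend on all three archives being present, a quick sanity check before building can save a failed run. This is only a sketch; `check_archives` is a hypothetical helper, and the file names are the ones from the listing above.

```shell
# Report which of the expected archives are missing or empty in directory $1
# (default: current directory). Returns non-zero if any is missing.
check_archives() {
    local dir="${1:-.}" missing=0 f
    for f in ice-1.13.7.tar.gz irdma-1.13.43.tgz rdma-core-46.0.tar.gz; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $f"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_archives /root/rdma && echo "all archives present"`.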

Install the kernel-devel package and dependencies

    [root@lab102 rdma]
    [root@lab102 rdma]
    [root@lab102 rdma]
    [root@lab102 rdma]

Make sure the kernel-devel package matches the running kernel version exactly.
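
The exact packages installed here are not shown. As a sketch of the version-matching point only: the kernel-devel package name can be derived from `uname -r`, and the presence of the build tree under `/lib/modules` confirms the match. Both helper names are illustrative. On an older minor release such as 7.7, the matching package may have to come from a CentOS vault repository rather than the default mirrors; that is an assumption about the environment.

```shell
# Name of the kernel-devel package for kernel release $1
# (default: the running kernel).
kernel_devel_pkg() {
    echo "kernel-devel-${1:-$(uname -r)}"
}

# Return 0 if the build tree that out-of-tree drivers compile against
# exists for kernel release $1 (default: the running kernel).
has_kernel_build_tree() {
    [ -e "/lib/modules/${1:-$(uname -r)}/build/Makefile" ]
}
```

A typical use would be `yum install -y "$(kernel_devel_pkg)"` followed by `has_kernel_build_tree || echo "kernel-devel does not match the running kernel"`.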

Install the ice driver

    [root@lab102 rdma]
    [root@lab102 rdma]
    [root@lab102 ice-1.13.7]
    [root@lab102 ice-1.13.7]
    [root@lab102 ice-1.13.7]
    [root@lab102 ice-1.13.7]
    [root@lab102 ice-1.13.7]
    [root@lab102 ice-1.13.7]
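
The commands behind the prompts above are not shown, so the following is not the author's exact sequence. It is a sketch of the usual out-of-tree build, assuming the tarball unpacks to `ice-<version>/` with the sources under `src/`. `run` and `install_ice` are hypothetical helpers; setting `DRY_RUN=1` prints the commands without executing them.

```shell
# Print each command; execute it only when DRY_RUN is not set to 1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Build, install, and reload the ice driver for version $1.
install_ice() {
    local version="$1"
    run tar zxf "ice-${version}.tar.gz" || return 1
    run make -C "ice-${version}/src" install || return 1
    # Reload so the freshly built module replaces the one already loaded.
    # Caution: this drops any connection that runs over an ice interface.
    run rmmod ice
    run modprobe ice
}
```

`DRY_RUN=1 install_ice 1.13.7` is a safe way to review the steps before running them as root.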

Check the device

    [root@lab102 ice-1.13.7]
    driver: ice
    version: 1.13.7
    firmware-version: 4.30 0x8001af27 1.3429.0
    expansion-rom-version:
    bus-info: 0000:81:00.0
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: yes
    supports-register-dump: yes
    supports-priv-flags: yes
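
The output above has the format printed by `ethtool -i`. A small sketch for checking it in a script: `ethtool_field` is a hypothetical helper that pulls one field out of that output. The interface name `enp129s0f0` in the usage note is an assumption derived from bus-info 0000:81:00.0.

```shell
# Print the value of one field ("driver", "version", ...) from
# `ethtool -i` output read on stdin.
ethtool_field() {
    awk -F': ' -v key="$1" '$1 == key { print $2; exit }'
}
```

For example, `[ "$(ethtool -i enp129s0f0 | ethtool_field driver)" = "ice" ]` succeeds once the new driver is bound to the port.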
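
At this point the NIC driver is loaded. If the card is only needed as a plain Ethernet NIC, this is all that is required; for RDMA, continue with the steps below.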

Install the irdma driver

    [root@lab102 rdma]
    [root@lab102 irdma-1.13.43]

Once these two commands finish, the driver is installed.
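
The two commands are not shown. A plausible sketch, assuming the irdma tarball ships a top-level `build.sh` that builds and installs the module (check the README inside the tarball to confirm): `run_in` and `install_irdma` are hypothetical helpers, and `DRY_RUN=1` only prints the steps.

```shell
# Print a command and the directory it runs in; execute it only when
# DRY_RUN is not set to 1.
run_in() {
    local dir="$1"
    shift
    echo "+ (cd $dir && $*)"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        (cd "$dir" && "$@")
    fi
}

# Unpack irdma version $1 and run its build script.
install_irdma() {
    local version="$1"
    run_in . tar zxf "irdma-${version}.tgz" || return 1
    run_in "irdma-${version}" ./build.sh
}
```

`DRY_RUN=1 install_irdma 1.13.43` prints the two steps without touching the system.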

Install rdma-core

    [root@lab102 rdma]
    [root@lab102 rdma]
    [root@lab102 rdma-core-46.0]
    [root@lab102 rdma-core-46.0]
    [root@lab102 rdma]
    [root@lab102 rdma]
    [root@lab102 rdma]
    [root@lab102 rdma]
    [root@lab102 rdma-core-46.0]
    [root@lab102 rdma-core-46.0]
    [root@lab102 x86_64]
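
The commands for this step are not shown either. The part most worth illustrating is the patch step mentioned earlier: rdma-core is patched with a file from the irdma package before it is built. The sketch below assumes GNU `patch`; `apply_irdma_patch` is a hypothetical helper, the patch file path must be looked up in the extracted irdma tree, and the default strip level of 2 is an assumption that may need adjusting.

```shell
# Apply patch file $2 to source tree $1 with strip level $3 (default 2).
# A --dry-run pass goes first so a mismatch leaves the tree untouched.
apply_irdma_patch() {
    local src_dir="$1" patch_file="$2" strip="${3:-2}"
    [ -d "$src_dir" ] || { echo "no such source tree: $src_dir" >&2; return 1; }
    [ -f "$patch_file" ] || { echo "no such patch file: $patch_file" >&2; return 1; }
    patch -d "$src_dir" -p"$strip" --dry-run < "$patch_file" >/dev/null || return 1
    patch -d "$src_dir" -p"$strip" < "$patch_file"
}
```

After patching, the tree is built and installed in the usual rdma-core way; the last prompt above sits in an `x86_64` directory, which is consistent with an RPM build, though the post does not show it.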

Load the RDMA driver

    [root@lab102 ~]
    [root@lab102 ~]

By default the driver does not come up in RoCE mode; the parameter appended to the load command is what enables it.
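
The parameter itself is not visible above. In Intel's out-of-tree irdma driver the module parameter that switches the ports from the default iWARP mode to RoCEv2 is `roce_ena`, so the sketch below assumes that is the one meant. The helper names and the file name `/etc/modprobe.d/irdma.conf` are illustrative.

```shell
# Reload irdma with RoCEv2 enabled for the current boot.
reload_irdma_roce() {
    rmmod irdma 2>/dev/null   # ignore the error if it was not loaded
    modprobe irdma roce_ena=1
}

# Persist the option so every later load of irdma also uses RoCEv2.
# $1 overrides the config file path.
persist_roce_option() {
    local conf_path="${1:-/etc/modprobe.d/irdma.conf}"
    printf 'options irdma roce_ena=1\n' > "$conf_path"
}
```

Run `reload_irdma_roce` as root for an immediate switch, and `persist_roce_option` to keep the setting across reboots.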

Check the status

    [root@lab102 ~]
        device                 node GUID
        ------              ----------------
        rdmap129s0f0        6eb311fffe21e748
        rdmap129s0f1        6eb311fffe21e749
    [root@lab102 ~]
    Infiniband device 'rdmap129s0f0' port 1 status:
            default gid:     fe80:0000:0000:0000:6eb3:11ff:fe21:e748
            base lid:        0x1
            sm lid:          0x0
            state:           1: DOWN
            phys state:      3: Disabled
            rate:            100 Gb/sec (4X EDR)
            link_layer:      Ethernet

    Infiniband device 'rdmap129s0f1' port 1 status:
            default gid:     fe80:0000:0000:0000:6eb3:11ff:fe21:e749
            base lid:        0x1
            sm lid:          0x0
            state:           4: ACTIVE
            phys state:      5: LinkUp
            rate:            10 Gb/sec (1X FDR10)
            link_layer:      Ethernet

Output like this means the driver loaded correctly. Next, assign an IP address in the usual way to the network interface behind rdmap129s0f1.
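
To find which network interface sits behind an RDMA device, one option is to follow sysfs from the InfiniBand class entry to its parent device. This is a sketch under the assumption that the driver exposes the usual `device/net` directory there; `netdev_of_rdma` is a hypothetical helper, and its second argument exists only so the sysfs root can be swapped out for testing.

```shell
# List the network interface(s) behind RDMA device $1 by following sysfs.
# $2 overrides the sysfs root (default /sys).
netdev_of_rdma() {
    local rdma_dev="$1" sysfs_root="${2:-/sys}"
    local net_dir="$sysfs_root/class/infiniband/$rdma_dev/device/net"
    [ -d "$net_dir" ] || { echo "no netdev directory for $rdma_dev" >&2; return 1; }
    ls "$net_dir"
}
```

On the host in this post, `netdev_of_rdma rdmap129s0f1` would be expected to name the interface whose MAC address matches the node GUID ending in e749.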

Configure the IP

    enp129s0f1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 192.167.19.102  netmask 255.255.0.0  broadcast 192.167.255.255
            inet6 fe80::85f4:6e55:d58a:fcbd  prefixlen 64  scopeid 0x20<link>
            ether 6c:b3:11:21:e7:49  txqueuelen 1000  (Ethernet)
            RX packets 20562  bytes 1280394 (1.2 MiB)
            RX errors 0  dropped 6  overruns 0  frame 0
            TX packets 32  bytes 2908 (2.8 KiB)
            TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
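
Do not put this address on the same subnet as another local NIC. If you do, traffic follows the default route instead of this interface and the RDMA protocol fails to come up.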
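
The subnet warning can be checked mechanically. The sketch below reads the one-line-per-address output of `ip -o -4 addr show` and prints any other interface whose subnet overlaps the address planned for the RDMA port. All three function names are illustrative.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<EOF
$1
EOF
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Network number of address $1 under prefix length $2.
network_of() {
    local mask=0
    [ "$2" -gt 0 ] && mask=$(( (4294967295 << (32 - $2)) & 4294967295 ))
    echo $(( $(ip_to_int "$1") & mask ))
}

# Read `ip -o -4 addr show` lines on stdin; print every interface other
# than $1 (and lo) whose subnet overlaps the CIDR given in $2.
overlapping_ifaces() {
    local self="$1" ip="${2%/*}" prefix="${2#*/}"
    local idx ifname family addr rest p
    while read -r idx ifname family addr rest; do
        [ "$ifname" = "$self" ] && continue
        [ "$ifname" = "lo" ] && continue
        case "$addr" in */*) ;; *) continue ;; esac
        # Two subnets overlap exactly when they agree under the shorter prefix.
        p="${addr#*/}"
        [ "$p" -gt "$prefix" ] && p="$prefix"
        if [ "$(network_of "${addr%/*}" "$p")" = "$(network_of "$ip" "$p")" ]; then
            echo "$ifname"
        fi
    done
}
```

For the address shown above, `ip -o -4 addr show | overlapping_ifaces enp129s0f1 192.167.19.102/16` should print nothing on a correctly configured host.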

Check RDMA communication

Two machines were prepared. With both configured as above, check connectivity between them.

Run on the server

    [root@lab102 ~]

    ************************************
    * Waiting for client to connect... *
    ************************************
    ---------------------------------------------------------------------------------------
                        RDMA_Write BW Test
     Dual-port       : OFF          Device         : rdmap129s0f1
     Number of qps   : 1            Transport type : IB
     Connection type : RC           Using SRQ      : OFF
     CQ Moderation   : 100
     Mtu             : 1024[B]
     Link type       : Ethernet
     GID index       : 2
     Max inline data : 0[B]
     rdma_cm QPs     : OFF
     Data ex. method : Ethernet
    ---------------------------------------------------------------------------------------
     local address: LID 0x01 QPN 0x0004 PSN 0xfaaaae RKey 0xe9a9f75 VAddr 0x007f0ec3619000
     GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:167:19:102
     remote address: LID 0x01 QPN 0x000a PSN 0x5a11ee RKey 0xde636e65 VAddr 0x007fa3cfaca000
     GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:167:19:103
    ---------------------------------------------------------------------------------------
     65536      5000             1103.34            1103.34            0.017653
    ---------------------------------------------------------------------------------------

Run on the client

    [root@lab103 ~]
    ---------------------------------------------------------------------------------------
                        RDMA_Write BW Test
     Dual-port       : OFF          Device         : rocep129s0f0
     Number of qps   : 1            Transport type : IB
     Connection type : RC           Using SRQ      : OFF
     TX depth        : 128
     CQ Moderation   : 100
     Mtu             : 1024[B]
     Link type       : Ethernet
     GID index       : 1
     Max inline data : 0[B]
     rdma_cm QPs     : OFF
     Data ex. method : Ethernet
    ---------------------------------------------------------------------------------------
     local address: LID 0x01 QPN 0x000a PSN 0x5a11ee RKey 0xde636e65 VAddr 0x007fa3cfaca000
     GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:167:19:103
     remote address: LID 0x01 QPN 0x0004 PSN 0xfaaaae RKey 0xe9a9f75 VAddr 0x007f0ec3619000
     GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:167:19:102
    ---------------------------------------------------------------------------------------
     65536      5000             1103.34            1103.34            0.017653
    ---------------------------------------------------------------------------------------
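
The output above is from the perftest tool `ib_write_bw`. The exact invocations are not shown; a typical pair would be `ib_write_bw -d rdmap129s0f1` on the server and `ib_write_bw -d rocep129s0f0 192.167.19.102` on the client, which is an assumption. The result row reads as message size in bytes, iterations, peak bandwidth, average bandwidth, and message rate. The figures are consistent with MB meaning 2^20 bytes (0.017653 Mpps × 65536 B ≈ 1103 MB/sec), so the sketch below converts the average column to Gb/s on that basis. `avg_bw_gbps` is a hypothetical helper.

```shell
# Read perftest output on stdin, pick the five-column numeric result row,
# and print the average bandwidth (column 4, in 2^20 bytes/sec) as Gb/s.
avg_bw_gbps() {
    awk 'NF == 5 && $1 ~ /^[0-9]+$/ {
        printf "%.2f\n", $4 * 1048576 * 8 / 1000000000
    }'
}
```

Piping the client output through `avg_bw_gbps` gives roughly 9.26 Gb/s for the run above, which is in line with the 10 Gb/sec link rate reported for the active port.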
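
At this point the RoCE driver and functionality are enabled and the two hosts can communicate over RDMA. Any further software configuration can be done as needed.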