author:headsen chen
date: 2019-01-18 10:22:20
notice: written by headsen chen himself. Copying is not permitted and may carry legal liability.
Environment: CentOS 6.8, 64-bit, kernel 2.6.32
1, Configure the NIC:
Install the new card in the machine and connect the optical fibers. Both fibers are connected, and they are connected in reversed (crossed) fashion. Once the link is up, the fiber indicator lights up.
2, Install the same way as for software RDMA: build the kernel and the user-space components, then reboot into the new 4.7 kernel.
3, Install the driver:
The normal installation method (for the 2.6 kernel):
# /mnt/mlnx-en-4.4-2.0.7.0-rhel6.8-x86_64/install
Here the following method must be used instead, because the machine runs the new 4.7 kernel:
# tar fx mlnx-en-4.4-2.0.7.0-rhel6.8-x86_64.tgz
# cd mlnx-en-4.4-2.0.7.0-rhel6.8-x86_64
# ./install --add-kernel-support --skip-repo
Logs dir: /tmp/mlnx-en.28728.logs
General log file: /tmp/mlnx-en.28728.logs/general.log
Verifying KMP rpms compatibility with target kernel...
This program will install the mlnx-en package on your machine.
Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.
Those packages are removed due to conflicts with mlnx-en, do not reinstall them.
Do you want to continue?[y/N]:y
rpm --nosignature -e --allmatches --nodeps rdma rdma
Starting mlnx-en-4.4-2.0.7.0 installation ...
Installing mlnx-en-utils 4.4 RPM
Preparing...        ##################################################
mlnx-en-utils       ##################################################
Installing kmod-mlnx-en 4.4 RPM
Preparing...        ##################################################
kmod-mlnx-en        ##################################################
Installing mlnx-en-sources 4.4 RPM
Preparing...        ##################################################
mlnx-en-sources     ##################################################
Installing mlnx-en-doc 4.4 RPM
Preparing...        ##################################################
mlnx-en-doc         ##################################################
Installing user level RPMs:
Preparing...        ##################################################
ofed-scripts        ##################################################
Preparing...        ##################################################
mstflint            ##################################################
Device (83:00.0):
    83:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
    Link Width: x8
    PCI Link Speed: 8GT/s
Device (83:00.1):
    83:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
    Link Width: x8
    PCI Link Speed: 8GT/s
Installation finished successfully.
Preparing...        ########################################### [100%]
   1:mlnx-fw-updater ########################################### [100%]
Updated /usr/share/hwdata/pci.ids
Attempting to perform Firmware update...
Querying Mellanox devices firmware ...
Device #1:
----------
  Device Type:      ConnectX4LX
  Part Number:      MCX4121A-XCA_Ax
  Description:      ConnectX-4 Lx EN network interface card; 10GbE dual-port SFP28; PCIe3.0 x8; ROHS R6
  PSID:             MT_2420110004
  PCI Device Name:  83:00.0
  Base MAC:         ec0d9ad2fd68
  Versions:         Current        Available
     FW             14.20.1010     14.23.1020
     PXE            3.5.0210       3.5.0504
     UEFI           N/A            14.16.0017
  Status:           Update required
---------
Found 1 device(s) requiring firmware update...
Device #1: Updating FW ... Done
Restart needed for updates to take effect.
Log File: /tmp/mlnx-en.28728.logs/fw_update.log
Configuring /etc/security/limits.conf.
To load the new driver, run:
/etc/init.d/mlnx-en.d restart
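The choice in step 3 between the plain install and the rebuild with --add-kernel-support can be sketched as a small shell helper. This is only an illustration: the function name mlnx_install_flags is hypothetical, and it assumes that any 2.6.32 release is the stock CentOS 6.8 kernel the package was built for, while anything else (such as the 4.7 kernel used here) needs the modules rebuilt.

```shell
#!/bin/sh
# Hypothetical helper: print the installer flags to use for a kernel release.
# Assumption: a 2.6.32* release is the stock CentOS 6.8 kernel, so the plain
# ./install is enough; any other kernel gets the flags used in this article.
mlnx_install_flags() {
    case "$1" in
        2.6.32*) echo "" ;;
        *)       echo "--add-kernel-support --skip-repo" ;;
    esac
}

# Example use (left as a comment so nothing is installed when this is sourced):
#   cd mlnx-en-4.4-2.0.7.0-rhel6.8-x86_64
#   ./install $(mlnx_install_flags "$(uname -r)")
```

On the machine described here, uname -r reports a 4.7 kernel, so the helper prints the same two flags that were typed by hand above.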
4, Restart the service:
/etc/init.d/mlnx-en.d restart
5, Install MLNX_OFED_LINUX-4.4
Unlike the software RDMA setup, there is no need to start rxe_cfg here.
# yum -y install libmml tcl tk libmnl
# tar fx MLNX_OFED_LINUX-4.4-2.0.7.0-rhel6.8-x86_64.tgz
# cd MLNX_OFED_LINUX-4.4-2.0.7.0-rhel6.8-x86_64
# ./mlnxofedinstall --add-kernel-support --skip-repo
# /etc/init.d/openibd restart    # best run from the management card (out-of-band console); run from an xshell session it may cause the NIC to lose its IP
# /etc/init.d/network restart
# chkconfig openibd on
# ibv_devices
The following output indicates success:
# ibv_devices
    device              node GUID
    ------              ----------------
    mlx5_1              ec0d9a0300d2fc99
    mlx5_0              ec0d9a0300d2fc98
If this step does not succeed (sometimes it works even without starting rxe_cfg):
# rxe_cfg start    (and bind the eth4 NIC)
# ibv_devices
    device              node GUID
    ------              ----------------
    rxe0                ee0d9afffed2fd68
# ibv_devinfo rxe0
hca_id: rxe0
        transport:              InfiniBand (0)
        fw_ver:                 0.0.0
        node_guid:              ee0d:9aff:fed2:fd68
        sys_image_guid:         0000:0000:0000:0000
        vendor_id:              0x0000
        vendor_part_id:         0
        hw_ver:                 0x0
        phys_port_cnt:          1
                port:   1
                        state:          PORT_ACTIVE (4)
                        max_mtu:        4096 (5)
                        active_mtu:     1024 (3)
                        sm_lid:         0
                        port_lid:       0
                        port_lmc:       0x00
                        link_layer:     Ethernet
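The manual check of the ibv_devices and ibv_devinfo output can be scripted. The sketch below is an assumption-laden illustration, not part of the Mellanox tooling: the function names rdma_device_names, rdma_mode and port_is_active are hypothetical, and the parsing assumes the output layout shown in this article (a "device / node GUID" header, a dashed line, then one device per line).

```shell
#!/bin/sh
# Print the device names found in `ibv_devices` output read from stdin,
# skipping the header line and the dashed separator line.
rdma_device_names() {
    awk 'NF >= 2 && $1 != "device" && $1 !~ /^-+$/ { print $1 }'
}

# Classify `ibv_devices` output read from stdin:
#   hardware  at least one mlx5_* device is present
#   software  only an rxe* (software RDMA) device is present
#   none      no device is listed
rdma_mode() {
    names=$(rdma_device_names)
    case "$names" in
        *mlx5_*) echo hardware ;;
        *rxe*)   echo software ;;
        *)       echo none ;;
    esac
}

# Succeed if `ibv_devinfo` output read from stdin reports an active port.
port_is_active() {
    grep -q 'state:[[:space:]]*PORT_ACTIVE'
}

# Example use on a real host (left as comments):
#   ibv_devices | rdma_mode
#   ibv_devinfo rxe0 | port_is_active && echo "port is up"
```

With the output shown above, the first ibv_devices listing would be classified as hardware and the rxe0 listing as software.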
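6, Test with the rping command:
Start the server side:
[root@bj01-prd-hadoop499.vivo.lan:/root]# rping -s -a 10.20.15.23 -v -C 10
Start the client side:
The client is installed the same way as the server. The command to start it is:
# rping -c -a 10.20.15.23 -v -C 10
The following output then appears on both sides, which shows that the installation succeeded: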
# rping -s -a 10.20.15.23 -v -C 10
server ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqr
server ping data: rdma-ping-1: BCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrs
server ping data: rdma-ping-2: CDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrst
server ping data: rdma-ping-3: DEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstu
server ping data: rdma-ping-4: EFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuv
server ping data: rdma-ping-5: FGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvw
server ping data: rdma-ping-6: GHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwx
server ping data: rdma-ping-7: HIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxy
server ping data: rdma-ping-8: IJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz
server ping data: rdma-ping-9: JKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyzA
server DISCONNECT EVENT...
wait for RDMA_READ_ADV state 10

# rping -c -a 10.20.15.23 -v -C 10
ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqr
ping data: rdma-ping-1: BCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrs
ping data: rdma-ping-2: CDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrst
ping data: rdma-ping-3: DEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstu
ping data: rdma-ping-4: EFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuv
ping data: rdma-ping-5: FGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvw
ping data: rdma-ping-6: GHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwx
ping data: rdma-ping-7: HIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxy
ping data: rdma-ping-8: IJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz
ping data: rdma-ping-9: JKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyzA
client DISCONNECT EVENT...
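Rather than reading the rping output by eye, the number of "ping data" lines can be compared with the count passed to -C. The following is a minimal sketch under that assumption; rping_ok is a hypothetical name, and the check relies only on the verbose output format shown above.

```shell
#!/bin/sh
# Succeed if the rping output on stdin contains exactly $1 "ping data" lines,
# i.e. one line per iteration requested with -C.
rping_ok() {
    expected=$1
    # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
    # keeps that case from aborting scripts that run with "set -e".
    count=$(grep -c 'ping data: rdma-ping-' || true)
    [ "$count" -eq "$expected" ]
}

# Example use on the client (left as a comment):
#   rping -c -a 10.20.15.23 -v -C 10 | rping_ok 10 && echo "rping test passed"
```

The same function works on the server side, because the server lines ("server ping data: rdma-ping-N: ...") contain the same substring.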
------------------------------------------------------
Test with udaddy. The following output indicates success:
Server:
[root@bj01-prd-hadoop499.vivo.lan:/mnt/MLNX_OFED_LINUX-4.4-2.0.7.0-rhel6.8-x86_64]# udaddy
udaddy: starting server
receiving data transfers
sending replies
data transfers complete
test complete
return status 0
Client:
[root@bj01-prd-hadoop500.vivo.lan:/root]# udaddy -s 10.20.15.23
udaddy: starting client
udaddy: connecting
initiating data transfers
receiving data transfers
data transfers complete
test complete
return status 0
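The udaddy result can be checked the same way. The sketch below assumes the output format shown above, where a successful run prints "test complete" followed by "return status 0"; the function name udaddy_ok is hypothetical.

```shell
#!/bin/sh
# Succeed if the udaddy output on stdin shows a completed test whose final
# status line is "return status 0".
udaddy_ok() {
    out=$(cat)
    printf '%s\n' "$out" | grep -q '^test complete$' || return 1
    printf '%s\n' "$out" | grep -q '^return status 0$'
}

# Example use on the client (left as a comment):
#   udaddy -s 10.20.15.23 | udaddy_ok && echo "udaddy test passed"
```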