Hardware Preparation

  • 4 × Raspberry Pi 4 Model B (quad-core / 8 GB RAM / 256 GB SD card)
  • Operating system: CentOS Linux release 7.7.1908 (AltArch)

Node layout:

  • 192.168.0.123  master   management node, runs slurmctld
  • 192.168.0.124  worker1  compute node, runs slurmd
  • 192.168.0.125  worker2  compute node, runs slurmd
  • 192.168.0.126  worker3  compute node, runs slurmd

Other prerequisites:

  • Install the NIS service as described in the previous article
  • Install the NFS service as described in the previous article

The Slurm Scheduler

Slurm is an open-source, highly scalable cluster management tool and job scheduling system for Linux clusters of all sizes. Its main components and architecture are outlined below.

[Slurm software architecture diagram]

Slurm has a centralized management daemon, "slurmctld", which monitors resources and jobs.

Each compute node runs a "slurmd" daemon, which waits for work, executes the job, returns the result, and then waits for the next job.

"slurmdbd" is optional and records job accounting information for one or more Slurm-managed clusters in a single database.

For details, see the following link:

https://slurm.schedmd.com/overview.html

Download the Software

Package   File                    Download URL
GMP       gmp-6.1.0.tar.bz2       http://gcc.gnu.org/pub/gcc/infrastructure/
MPFR      mpfr-3.1.4.tar.bz2      http://gcc.gnu.org/pub/gcc/infrastructure/
MPC       mpc-1.0.3.tar.gz        http://gcc.gnu.org/pub/gcc/infrastructure/
GCC       gcc-9.1.0.tar.xz        https://ftp.gnu.org/gnu/gcc/gcc-9.1.0/
OpenMPI   openmpi-4.0.1.tar.gz    https://www.open-mpi.org/software/ompi/v4.0/
MPICH     mpich-3.3.2.tar.gz      https://www.mpich.org/static/downloads/3.3.2/
munge     munge-0.5.13.tar.xz     https://github.com/dun/munge/releases/tag/munge-0.5.13
slurm     slurm-20.02.7.tar.bz2   https://www.schedmd.com/downloads.php

Place all downloaded files in /opt/install/.
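For convenience, the packages can be fetched into /opt/install/ with wget. This is only a sketch: the direct file URLs below (in particular the munge release asset and the SchedMD download link) are assumed from the pages in the table above and should be verified before use.

mkdir -p /opt/install && cd /opt/install
# GCC prerequisites and GCC itself
wget http://gcc.gnu.org/pub/gcc/infrastructure/gmp-6.1.0.tar.bz2
wget http://gcc.gnu.org/pub/gcc/infrastructure/mpfr-3.1.4.tar.bz2
wget http://gcc.gnu.org/pub/gcc/infrastructure/mpc-1.0.3.tar.gz
wget https://ftp.gnu.org/gnu/gcc/gcc-9.1.0/gcc-9.1.0.tar.xz
# MPI implementations
wget https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.1.tar.gz
wget https://www.mpich.org/static/downloads/3.3.2/mpich-3.3.2.tar.gz
# Authentication and scheduler (direct links assumed; check the release pages listed above)
wget https://github.com/dun/munge/releases/download/munge-0.5.13/munge-0.5.13.tar.xz
wget https://download.schedmd.com/slurm/slurm-20.02.7.tar.bz2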

Compile and Install GCC

Install GMP

[root@master install]# cd  /opt/install/
[root@master install]# tar -vxf gmp-6.1.0.tar.bz2
[root@master install]# cd gmp-6.1.0/
[root@master install]# ./configure --prefix=/opt/gmp/
[root@master install]# make
[root@master install]# make install
export LD_LIBRARY_PATH=/opt/gmp/lib:$LD_LIBRARY_PATH

Install MPFR

[root@master install]# cd  /opt/install/
[root@master install]# tar -xvf mpfr-3.1.4.tar.bz2
[root@master install]# cd mpfr-3.1.4/
[root@master install]# ./configure --prefix=/opt/mpfr --with-gmp=/opt/gmp
[root@master install]# make
[root@master install]# make install
export LD_LIBRARY_PATH=/opt/mpfr/lib:$LD_LIBRARY_PATH

Install MPC

[root@master install]# cd  /opt/install/
[root@master install]# tar -zvxf mpc-1.0.3.tar.gz
[root@master install]# cd mpc-1.0.3/
[root@master install]# ./configure --prefix=/opt/mpc --with-gmp=/opt/gmp --with-mpfr=/opt/mpfr
[root@master install]# make
[root@master install]# make install
export LD_LIBRARY_PATH=/opt/mpc/lib:$LD_LIBRARY_PATH

Install GCC

[root@master install]# cd  /opt/install/
[root@master install]# tar -vxf gcc-9.1.0.tar.xz
[root@master install]# cd gcc-9.1.0/
[root@master install]# mkdir obj
[root@master install]# cd obj
[root@master install]# ../configure --disable-multilib --enable-languages="c,c++,fortran" --prefix=/opt/gcc --disable-static --enable-shared --with-gmp=/opt/gmp --with-mpfr=/opt/mpfr --with-mpc=/opt/mpc
[root@master install]# make
[root@master install]# make install

export PATH=/opt/gcc/bin:$PATH
export LD_LIBRARY_PATH=/opt/gcc/lib64:$LD_LIBRARY_PATH

Make the settings permanent:

vim  /etc/profile
export PATH=$PATH:/opt/mpich/bin
export LD_LIBRARY_PATH=/opt/gmp/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/opt/mpfr/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/opt/gcc/lib64:$LD_LIBRARY_PATH
export PATH=/opt/gcc/bin:$PATH
source /etc/profile

Verify:

[root@master install]# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/gcc/libexec/gcc/aarch64-unknown-linux-gnu/9.1.0/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with: ../configure --disable-multilib --enable-languages=c,c++,fortran --prefix=/opt/gcc --disable-static --enable-shared --with-gmp=/opt/gmp --with-mpfr=/opt/mpfr --with-mpc=/opt/mpc
Thread model: posix
gcc version 9.1.0 (GCC) 
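As an extra sanity check, a trivial program can be compiled and run with the new toolchain; this is just a quick sketch (the /tmp paths are arbitrary):

[root@master install]# which gcc
/opt/gcc/bin/gcc
[root@master install]# cat > /tmp/hello.c << 'EOF'
#include <stdio.h>
int main(void) { printf("hello from the new toolchain\n"); return 0; }
EOF
[root@master install]# gcc /tmp/hello.c -o /tmp/hello && /tmp/hello
hello from the new toolchain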

Compile and Install OpenMPI

OpenMPI is a high-performance message-passing library. It was originally created by merging technology and resources from several earlier projects (FT-MPI, LA-MPI, LAM/MPI, and PACX-MPI), and it is an open-source implementation of the MPI-2 standard developed and maintained jointly by research institutions and companies. This lets OpenMPI draw on expertise, technology, and resources from across the HPC community to build a first-class MPI library for system and software vendors, application developers, and researchers.

https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.1.tar.gz

[root@master install]# yum install numactl-devel-* systemd-devel-*
[root@master install]# cd /opt/install/
[root@master install]# tar -xvf openmpi-4.0.1.tar.gz
[root@master install]# cd openmpi-4.0.1
[root@master install]# ./configure --prefix=/opt/openmpi --enable-pretty-print-stacktrace --enable-orterun-prefix-by-default --with-knem=/opt/knem-1.1.3.90mlnx1/ --with-hcoll=/opt/mellanox/hcoll/ --with-cma --with-ucx --enable-mpi1-compatibility CC=gcc CXX=g++ FC=gfortran

Notes:
--with-ucx: uses the system-provided library in "/usr/lib64/ucx".
--with-knem and --with-hcoll require the Mellanox (InfiniBand NIC) driver to be installed first.
See:
https://support.huaweicloud.com/instg-kunpenghpcs/kunpenghpcenv_03_0012.html

[root@master install]# make -j 16
[root@master install]# make install
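Before moving on, the freshly built MPI stack can be smoke-tested on the master node. A minimal sketch, assuming OpenMPI was installed to /opt/openmpi as configured above (the test file name is arbitrary):

export PATH=/opt/openmpi/bin:$PATH
export LD_LIBRARY_PATH=/opt/openmpi/lib:$LD_LIBRARY_PATH

cat > /tmp/mpi_hello.c << 'EOF'
#include <mpi.h>
#include <stdio.h>
int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF
mpicc /tmp/mpi_hello.c -o /tmp/mpi_hello
# --allow-run-as-root is needed because these steps are run as root
mpirun --allow-run-as-root -n 4 /tmp/mpi_hello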

Compile and Install Slurm

(1) Install dependency packages

[root@master install]# yum install -y rpm-build rpmdevtools bzip2-devel openssl-devel zlib-devel readline-devel pam-devel perl-DBI perl-ExtUtils-MakeMaker mariadb*

(2) Build Munge

[root@master install]# cd /opt/install/
[root@master install]# rpmbuild -tb --clean munge-0.5.13.tar.xz
[root@master install]# ls /root/rpmbuild/RPMS/aarch64/ | grep munge
[root@master install]# mkdir -p /opt/mungepkg/
[root@master install]# cp -f /root/rpmbuild/RPMS/aarch64/munge* /opt/mungepkg/
[root@master install]# cd /opt/mungepkg
[root@master install]# yum install -y munge-*

(3) Build Slurm

[root@master install]# cd /opt/install/
[root@master install]# rpmbuild -tb --clean slurm-20.02.7.tar.bz2
[root@master install]# ls /root/rpmbuild/RPMS/aarch64/ | grep slurm
[root@master install]# mkdir -p /opt/slurmpkg/
[root@master install]# cp -f /root/rpmbuild/RPMS/aarch64/slurm* /opt/slurmpkg/

(4) Install Munge

Run the following command on worker1, worker2, and worker3 to mount the master node's "/opt" directory (shown here on worker1):
[root@worker1 ~]# mount master:/opt /opt
Run the following commands on worker1, worker2, and worker3 to install the munge packages:
[root@worker1 ~]# cd /opt/mungepkg
[root@worker1 ~]# yum install -y munge*

Run the following commands on master, worker1, worker2, and worker3 to set the permissions of the munge directories:
[root@master install]# chmod -Rf 700 /etc/munge
[root@master install]# chmod -Rf 711 /var/lib/munge
[root@master install]# chmod -Rf 700 /var/log/munge
[root@master install]# chmod -Rf 0755 /var/run/munge

Run the following commands on the master node to install and start the ntpd service:
[root@master install]# yum install -y ntp
[root@master install]# systemctl start ntpd

Run the following command on worker1, worker2, and worker3 to synchronize the system time with the master node:
[root@worker1 ~]# ntpdate master

On the master node, copy "/etc/munge/munge.key" to worker1, worker2, and worker3:
[root@master install]# scp /etc/munge/munge.key worker1:/etc/munge/
[root@master install]# scp /etc/munge/munge.key worker2:/etc/munge/
[root@master install]# scp /etc/munge/munge.key worker3:/etc/munge/

On worker1, worker2, and worker3, change the ownership of "/etc/munge/munge.key":
[root@worker1 ~]# chown munge.munge /etc/munge/munge.key

Start and enable munge on master, worker1, worker2, and worker3:
[root@master install]# systemctl start munge
[root@master install]# systemctl enable munge
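To confirm that munge works and that the key is identical on all nodes, a credential generated on the master can be decoded locally and on a worker (a quick check, not part of the original steps):

[root@master install]# munge -n | unmunge              # decode locally; should report STATUS: Success
[root@master install]# munge -n | ssh worker1 unmunge  # decode on a worker; Success means the keys match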

(5) Install and configure Slurm

Run the following commands on master, worker1, worker2, and worker3 to install the slurm packages:
[root@master install]# cd /opt/slurmpkg
[root@master install]# yum install -y slurm*

Check whether the slurm user has already been created on every node.
If it exists, the following command will show it:
[root@master install]# grep "slurm" /etc/group
slurm:x:202:

If it has not been created, run the following commands on master, worker1, worker2, and worker3 to create the slurm user:
[root@master install]# groupadd -g 202 slurm
[root@master install]# useradd -u 202 -g 202 slurm
Run the following commands on master, worker1, worker2, and worker3 to create the "/var/spool/slurm/ssl", "/var/spool/slurm/ctld", "/var/spool/slurm/d", and "/var/log/slurm" directories ("/var/spool/slurm/ctld" is the StateSaveLocation used in slurm.conf below):
[root@master install]# mkdir -p /var/spool/slurm/ssl
[root@master install]# mkdir -p /var/spool/slurm/ctld
[root@master install]# mkdir -p /var/spool/slurm/d
[root@master install]# mkdir -p /var/log/slurm

Run the following command on master, worker1, worker2, and worker3 to set the ownership of these directories:
[root@master install]# chown -R slurm.slurm /var/spool/slurm

Edit "/etc/slurm/slurm.conf" on the master node (the full configuration used here is listed below).
Then, on the master node, copy "/etc/slurm/slurm.conf" to worker1, worker2, and worker3:
[root@master install]# scp /etc/slurm/slurm.conf worker1:/etc/slurm
[root@master install]# scp /etc/slurm/slurm.conf worker2:/etc/slurm
[root@master install]# scp /etc/slurm/slurm.conf worker3:/etc/slurm

Run the following commands on the master node to start the "slurmctld" service:
[root@master install]# systemctl start slurmctld
[root@master install]# systemctl enable slurmctld

Run the following commands on worker1, worker2, and worker3 to start the "slurmd" service:
[root@worker1 ~]# systemctl start slurmd
[root@worker1 ~]# systemctl enable slurmd
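If either service fails to start, the quickest things to check are the service status, the log files configured in slurm.conf, and whether a worker can reach the controller, for example (a debugging aid, not part of the original steps):

[root@master install]# systemctl status slurmctld
[root@master install]# tail -n 50 /var/log/slurmctld.log
[root@worker1 ~]# systemctl status slurmd
[root@worker1 ~]# tail -n 50 /var/log/slurmd.log
[root@worker1 ~]# scontrol ping        # should report that the primary controller (master) is UP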

The master node and the worker nodes use the same configuration.

The configuration is as follows (/etc/slurm/slurm.conf):

#
# Example slurm.conf file. Please run configurator.html
# (in doc/html) to build a configuration file customized
# for your environment.
#
#
# slurm.conf file generated by configurator.html.
#
# See the slurm.conf man page for more information.
#
ClusterName=YYPAN
ControlMachine=master
ControlAddr=192.168.0.123
#BackupController=
#BackupAddr=
#
SlurmUser=slurm
#SlurmdUser=root
SlurmctldPort=6817
SlurmdPort=6818
AuthType=auth/munge
#JobCredentialPrivateKey=
#JobCredentialPublicCertificate=
StateSaveLocation=/var/spool/slurm/ctld
SlurmdSpoolDir=/var/spool/slurm/d
SwitchType=switch/none
MpiDefault=none
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
ProctrackType=proctrack/pgid
#PluginDir=
#FirstJobId=
ReturnToService=0
#MaxJobCount=
#PlugStackConfig=
#PropagatePrioProcess=
#PropagateResourceLimits=
#PropagateResourceLimitsExcept=
#Prolog=
#Epilog=
#SrunProlog=
#SrunEpilog=
#TaskProlog=
#TaskEpilog=
#TaskPlugin=
#TrackWCKey=no
#TreeWidth=50
#TmpFS=
#UsePAM=
#
# TIMERS
SlurmctldTimeout=300
SlurmdTimeout=300
InactiveLimit=0
MinJobAge=300
KillWait=30
Waittime=0
#
# SCHEDULING
SchedulerType=sched/backfill
#SchedulerAuth=
SelectType=select/cons_tres
SelectTypeParameters=CR_Core
#PriorityType=priority/multifactor
#PriorityDecayHalfLife=14-0
#PriorityUsageResetPeriod=14-0
#PriorityWeightFairshare=100000
#PriorityWeightAge=1000
#PriorityWeightPartition=10000
#PriorityWeightJobSize=1000
#PriorityMaxAge=1-0
#
# LOGGING
SlurmctldDebug=info
SlurmctldLogFile=/var/log/slurmctld.log
SlurmdDebug=info
SlurmdLogFile=/var/log/slurmd.log
JobCompType=jobcomp/none
#JobCompLoc=
#
# ACCOUNTING
#JobAcctGatherType=jobacct_gather/linux
#JobAcctGatherFrequency=30
#
#AccountingStorageType=accounting_storage/slurmdbd
#AccountingStorageHost=
#AccountingStorageLoc=
#AccountingStoragePass=
#AccountingStorageUser=
#
# COMPUTE NODES
NodeName=worker[1-3] Procs=4 State=UNKNOWN
PartitionName=Compute Nodes=ALL Default=YES MaxTime=INFINITE State=UP
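The hardware line in the COMPUTE NODES section can be cross-checked against what each node actually detects: slurmd -C prints the node definition Slurm sees on that host, and scontrol show config dumps the configuration the running daemons have loaded (verification only):

[root@worker1 ~]# slurmd -C            # prints the detected node line, e.g. "NodeName=worker1 CPUs=4 ..."
[root@master ~]# scontrol show config | grep -i -E "ClusterName|SelectType"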

Test It

View the cluster node list:

[root@master ~]# sinfo 
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST 
Compute*     up   infinite      3   idle worker[1-3] 

Fix the node state (if a node is reported as down or drained):

[root@master ~]# scontrol update nodename=worker1 state=resume
[root@master ~]# scontrol update nodename=worker2 state=resume
[root@master ~]# scontrol update nodename=worker3 state=resume

Run a quick test:

[root@master test]# srun -n12 -l hostname
 0: worker1
 4: worker2
 8: worker3
 1: worker1
 2: worker1
 3: worker1
 5: worker2
 6: worker2
 7: worker2
 9: worker3
10: worker3
11: worker3

[root@master test]# srun -n12 -l sleep 50
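While the sleep job above is still running, it can be watched from another shell (illustrative commands only; replace <jobid> with the id printed by squeue):

[root@master test]# squeue -l                    # shows the job as RUNNING and the worker nodes it occupies
[root@master test]# scontrol show job <jobid>    # detailed view of a single job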

Batch-Submit the cpi Parallel Computation

cd /opt/cpitest
Submit the job with different numbers of parallel cores:
sbatch -J cpi001 -n 1 cpi.sh
sbatch -J cpi002 -n 2 cpi.sh
sbatch -J cpi004 -n 4 cpi.sh
sbatch -J cpi008 -n 8 cpi.sh
View the logs:
tail -f cpi001.log
tail -f cpi002.log
tail -f cpi004.log
tail -f cpi008.log
  • Script cpi.sh
#!/bin/bash

echo "0. Compute Parameters"
num_cores=$SLURM_NPROCS

echo "1. Find Compute hostlist"

srun -n $num_cores hostname > ./hostlist

echo "2. Run cpi exe"
export PATH=$PATH:/opt/mpich/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib:/opt/mpich/lib
mpirun -n $num_cores -hostfile ./hostlist cpi > $SLURM_JOB_NAME.log
  • Submit the script
[root@master cpitest]# sbatch -J cpi004 -n 4 cpi.sh

Here cpi004 is the job name (user-defined),
4 is the number of cores to run on,
and cpi.sh is the job script.

Comparing runs with different numbers of cores, the parallel speedup is clearly visible.

Cores  Runtime (s)
1      10.688118
2      5.343434
3      3.566243
4      2.674257
8      1.498770
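A quick check on the scaling: speedup is the single-core runtime divided by the n-core runtime, i.e. 10.688 / 5.343 ≈ 2.0 on 2 cores, 10.688 / 3.566 ≈ 3.0 on 3 cores, 10.688 / 2.674 ≈ 4.0 on 4 cores, and 10.688 / 1.499 ≈ 7.1 on 8 cores. The single-node runs scale almost linearly; the 8-core job has to span two Raspberry Pi nodes, so parallel efficiency drops slightly to roughly 89% (7.1 / 8), which is expected once inter-node communication is involved.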

Test FFT Parallel Computation

https://github.com/BradenHarrelson/FastFourierTransform

[root@master ~]# 7za x FastFourierTransform-master.zip
[root@master ~]# cd FastFourierTransform-master
[root@master FastFourierTransform-master]# mv Timer.h timer.h   # fix the header file name so it matches the #include in the source
[root@master FastFourierTransform-master]# mpicc -I . -o fftmpi FFT_Parallel.c -lm

Create a script, run.sh, to run the parallel program manually:

#!/bin/bash
export num_cores=4
srun -n $num_cores hostname > ./hostlist
mpirun -n $num_cores -hostfile ./hostlist ./fftmpi

Run the script:

[root@master FastFourierTransform-master]# ./run.sh

Check CPU usage on a compute node:

[root@master FastFourierTransform-master]# ssh -t worker1 top

View the output log:

[root@master FastFourierTransform-master]# tail -f ParallelVersionOutput.txt
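The same FFT run can also be submitted through Slurm instead of being launched by hand, reusing the pattern of cpi.sh above. A minimal sketch (the wrapper name fft.sh is made up here):

#!/bin/bash
# fft.sh - submit with: sbatch -J fft004 -n 4 fft.sh
num_cores=$SLURM_NPROCS
srun -n $num_cores hostname > ./hostlist
export PATH=$PATH:/opt/mpich/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib:/opt/mpich/lib
mpirun -n $num_cores -hostfile ./hostlist ./fftmpi > $SLURM_JOB_NAME.log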

References: