====== Install Slurm ======

===== What is a High-Performance Computer? =====

A high-performance computer (HPC system) is a tool used by computational scientists and engineers to tackle problems that require more computing resources or time than they can obtain on the personal computers available to them.

An HPC system typically consists of:

  * Computers connected by some type of network (Ethernet, InfiniBand, etc.).
  * Each of these computers is often referred to as a node.
  * There are several different types of nodes, specialized for different purposes.
  * Head nodes (front-end and/or login): where you log in to interact with the HPC system.
  * Compute nodes (CPU, GPU): where the real computing is done. Access to these resources is controlled by a scheduler or batch system.

===== HPC architecture =====
{{ :hpc:hpc-architecture.png?400 |}}

===== Nodes =====

Each node on an HPC system is essentially an individual computer (physical or virtual):
{{ :hpc:hpc-architecture-nodes.png?400 |}}

===== Parts of an HPC system =====

  * Servers:
    - Master server: master1, [master2 (backup)]
    - Login server
    - Accounting server: MySQL server
    - Nodes: physical and/or VM
  * Storage: CEPH, Lustre, etc.
  * Scheduler: Slurm, Torque/PBS, etc.
  * Applications: monitoring, software control, etc.
  * Users database: OpenLDAP, AD, etc.

===== Storage and file systems =====

  * Lustre: a parallel distributed file system.
  * Spectrum Scale: a scalable high-performance data management solution.
  * BeeGFS: developed for I/O-intensive HPC applications.
  * OrangeFS: a parallel distributed file system that runs completely in user space.
  * Ceph: offers file-, block-, and object-based storage on a single distributed cluster.
  * GlusterFS: has a client-server model but does not need a dedicated metadata server.

===== Scheduler =====

In order to share these large systems among many users, it is common to allocate subsets of the compute nodes to tasks (or jobs), based on requests from users. These jobs may take a long time to complete, so they come and go over time. To manage the sharing of the compute nodes among all of the jobs, HPC systems use a batch system or scheduler.

The batch system usually has commands for submitting jobs, inquiring about their status, and modifying them. The HPC center defines the priorities of different jobs for execution on the compute nodes, while ensuring that the compute nodes are not overloaded.

A typical HPC workflow could look something like this (a sample job script follows the list):

  * Transfer input datasets to the HPC system (via the login nodes)
  * Create a job submission script to perform your computation (on the login nodes)
  * Submit your job submission script to the scheduler (on the login nodes)
  * The scheduler runs your computation (on the compute nodes)
  * Analyze the results of your computation (on the login or compute nodes, or transfer the data for analysis elsewhere)
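
As an illustration, a minimal Slurm job script might look like the following; the partition name "debug" is only a placeholder that depends on how your cluster is configured:

<code>
#!/bin/bash
#SBATCH --job-name=test-job       # name shown by squeue
#SBATCH --partition=debug         # placeholder partition name
#SBATCH --nodes=1                 # request one node
#SBATCH --ntasks=1                # run a single task
#SBATCH --time=00:10:00           # walltime limit (HH:MM:SS)

hostname                          # the actual computation goes here
</code>

It would be submitted from a login node with: sbatch job.sh
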
===== Slurm =====

  * An open-source, highly scalable cluster management and job scheduling system.
  * Requires no kernel modifications for its operation.
  * Relatively self-contained.

==== Slurm has three key functions ====

  * It allocates exclusive and/or non-exclusive access to resources (nodes) to users for some duration of time so they can perform work.
  * It provides a framework for starting, executing, and monitoring work on the set of allocated nodes.
  * It arbitrates contention for resources by managing a queue of pending work.

==== Slurm architecture ====
{{ :hpc:hpc-slurm-archi.png?400 |}}

==== Commands ====

  * scontrol: the administrative tool used to view and/or modify Slurm state.
  * sinfo: reports the state of partitions and nodes managed by Slurm.
  * squeue: reports the state of jobs or job steps.
  * scancel: cancels a pending or running job or job step.
  * sacct: reports accounting information about jobs or job steps.
  * srun: submits a job for execution or initiates job steps in real time.
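
Some typical invocations (the job ID 1234 is just a placeholder):

<code>
sinfo                      # list partitions and node states
squeue -u $USER            # show your own pending/running jobs
srun -N1 hostname          # run a one-node command interactively
scancel 1234               # cancel job 1234
sacct -j 1234              # accounting information for job 1234
</code>
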
==== Applications ====

  * Lifecycle management tool: Foreman or a PXE server [PXE-DNS-DHCP-TFTP]
  * Monitoring services: Nagios, Icinga, etc.
  * Monitoring resources (memory, CPU load, etc.): Ganglia
  * SSH server: OpenSSH
  * Configuration management: Puppet, Ansible, etc.
  * Managing (scientific) software: EasyBuild
  * Environment module system: Lmod
  * Central users database: OpenLDAP, Active Directory, etc.

==== Slurm installation ====

  - Operating system: CentOS 7 or 8
  - Define the name of every server (hostname): master, node01 .. nodeN (see the sketch below)
  - Install a database server (MariaDB)
  - Create global users (munge user). Slurm and Munge require consistent UIDs and GIDs across every node in the cluster.
  - Install Munge
  - Install Slurm
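
For step 2, each host can be named with hostnamectl and made resolvable through /etc/hosts; the IP addresses below are only examples to adapt to your network:

<code>
hostnamectl set-hostname master   # run the matching command on each node (node01, node02, ...)

# Example /etc/hosts entries (same file on every node):
cat >> /etc/hosts <<EOF
10.10.2.242  master
10.10.2.1    node01
10.10.2.2    node02
EOF
</code>
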
=== Create the global user and group for Munge ===

On all the nodes, before you install Slurm or Munge, you need to create the users and groups with the same UID and GID:

<code>
export MUNGEUSER=991
groupadd -g $MUNGEUSER munge
useradd -m -c "MUNGE Uid 'N' Gid Emporium" -d /var/lib/munge -u $MUNGEUSER -g munge -s /sbin/nologin munge
</code>
<code>
export SLURMUSER=992
groupadd -g $SLURMUSER slurm
useradd -m -c "SLURM workload manager" -d /var/lib/slurm -u $SLURMUSER -g slurm -s /bin/bash slurm
</code>

=== Install Slurm dependencies ===

On every node we need to install a few dependencies:

<code>
yum install openssl openssl-devel pam-devel numactl numactl-devel hwloc hwloc-devel lua lua-devel readline-devel rrdtool-devel ncurses-devel man2html libibmad libibumad perl-ExtUtils-MakeMaker gcc -y
</code>

=== Install Munge ===

Munge must be installed on every node.

First, get the latest EPEL repository:

<code>
yum install epel-release
yum update
</code>

On CentOS 8, we also need to edit the file /etc/yum.repos.d/CentOS-PowerTools.repo and enable the repository by changing:

<code>
enabled=0
</code>

to enabled=1.
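
Alternatively, assuming the dnf-plugins-core package is installed and the repository ID on your release is PowerTools, the repository can be enabled from the command line:

<code>
dnf config-manager --set-enabled PowerTools
</code>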

Then update the repository database:

<code>
yum update
</code>

After that, we can install Munge:

<code>
yum install munge munge-libs munge-devel -y
</code>

On the master server, we need to create the Munge key and copy it to all the other servers.

<code>
yum install rng-tools -y
rngd -r /dev/urandom
</code>

Create the Munge key:

<code>
/usr/sbin/create-munge-key -r
dd if=/dev/urandom bs=1 count=1024 > /etc/munge/munge.key
chown munge: /etc/munge/munge.key
chmod 400 /etc/munge/munge.key
</code>

Copy the Munge key to the other servers:

<code>
scp /etc/munge/munge.key root@node01:/etc/munge
scp /etc/munge/munge.key root@node02:/etc/munge
.
.
.
scp /etc/munge/munge.key root@nodeN:/etc/munge
</code>

On every node we need to correct the permissions, then enable and start the Munge service:

<code>
chown -R munge: /etc/munge/ /var/log/munge/
chmod 0700 /etc/munge/ /var/log/munge/
</code>
<code>
systemctl enable munge
systemctl start munge
</code>

To test Munge, we can validate a credential locally and on another node:

<code>
munge -n                                     # generate a credential locally
munge -n | unmunge                           # decode it on the same machine
munge -n | ssh node01.cluster.test unmunge   # decode it on a remote node
remunge                                      # run a quick benchmark
</code>

=== Create Slurm rpm packages ===

On the master server, download the latest version of Slurm. At the time of writing, the latest version is 19.05.5:

<code>
cd /tmp
wget https://download.schedmd.com/slurm/slurm-19.05.5.tar.bz2
yum install rpm-build
rpmbuild -ta slurm-19.05.5.tar.bz2
</code>

Copy the resulting Slurm rpm files from the master to the other servers, or to a shared folder, for installation:

<code>
cd ~/rpmbuild/RPMS/x86_64
cp slurm*.rpm /fns/shared_folder
</code>

== Install Slurm on the master, compute, and login nodes ==

The slurm-torque package could perhaps be omitted, but it does contain a useful /usr/bin/mpiexec wrapper script.

Before installing Slurm, we need to disable SELinux:

<code>
nano /etc/selinux/config
</code>

Change SELINUX=enforcing to SELINUX=disabled.
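
The config file change only applies at the next boot; to stop enforcement immediately in the current session, SELinux can also be switched to permissive mode:

<code>
setenforce 0   # permissive until reboot; the file edit makes it permanent
</code>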

Then install the Slurm packages:

<code>
cd ~/rpmbuild/RPMS/x86_64
export VER=19.05.5-1
yum install slurm-$VER*rpm slurm-devel-$VER*rpm slurm-perlapi-$VER*rpm slurm-torque-$VER*rpm slurm-example-configs-$VER*rpm
</code>

Explicitly enable the slurmctld service on the master:

<code>
systemctl enable slurmctld
</code>

Only if the database service will run on the master node, install the database service RPM:

<code>
cd ~/rpmbuild/RPMS/x86_64
export VER=19.05.5-1
yum install slurm-slurmdbd-$VER*rpm
</code>

If you have a dedicated database server, install these packages on that server instead:

<code>
export VER=19.05.5-1
yum install slurm-$VER*rpm slurm-devel-$VER*rpm slurm-slurmdbd-$VER*rpm
</code>

Explicitly enable the service:

<code>
systemctl enable slurmdbd
</code>

== Compute nodes ==

On compute nodes, install the slurm-slurmd package; you may additionally install the slurm-pam_slurm RPM package to prevent rogue users from logging in:

<code>
export VER=19.05.5-1
yum install slurm-slurmd-$VER*rpm slurm-pam_slurm-$VER*rpm
systemctl enable slurmd
</code>

==== Slurm configuration ====

Slurm provides an example file located at /etc/slurm/slurm.conf.example. You can copy this file to /etc/slurm/slurm.conf:

<code>
cp /etc/slurm/slurm.conf.example /etc/slurm/slurm.conf
</code>

Slurm also has a web-based [[https://slurm.schedmd.com/configurator.html|configuration tool]] which can be used to build a simple configuration file, which can then be manually edited for more complex configurations.

After that, we need to edit /etc/slurm/slurm.conf and make some modifications:

<code>
vi /etc/slurm/slurm.conf
</code>

It is important to change the parameters ClusterName and ControlMachine:
<code>
ClusterName=vlir-test
ControlMachine=10.10.2.242
SlurmUser=slurm
SlurmctldPort=6817
SlurmdPort=6818
AuthType=auth/munge
StateSaveLocation=/var/spool/slurm/ctld
SlurmdSpoolDir=/var/spool/slurm/d
SwitchType=switch/none
MpiDefault=none
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
ProctrackType=proctrack/pgid
ReturnToService=0
</code>
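
The snippet above defines no compute nodes or partitions; a minimal sketch might look like the following, where the hostnames, CPU count, and partition name are placeholders to adapt:

<code>
NodeName=node[01-02] CPUs=8 State=UNKNOWN
PartitionName=debug Nodes=node[01-02] Default=YES MaxTime=INFINITE State=UP
</code>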

If the /var/spool/slurm directory does not exist, you need to create it:

<code>
mkdir /var/spool/slurm
chown slurm.slurm -R /var/spool/slurm
</code>

==== Slurm logging ====

The Slurm logfile directory is left undefined in the RPMs, since you have to define it in slurm.conf. See SlurmdLogFile and SlurmctldLogFile on the slurm.conf page, and LogFile on the slurmdbd.conf page.

Check your logging configuration with:

<code>
grep -i logfile /etc/slurm/slurm.conf
</code>
<code>
SlurmctldLogFile=/var/log/slurm/slurmctld.log
SlurmdLogFile=/var/log/slurm/slurmd.log
</code>

<code>
scontrol show config | grep -i logfile
</code>
<code>
SlurmctldLogFile        = /var/log/slurm/slurmctld.log
SlurmdLogFile           = /var/log/slurm/slurmd.log
SlurmSchedLogFile       = /var/log/slurm/slurmsched.log
</code>

If log files are configured, you have to create the log file directory manually:

<code>
mkdir /var/log/slurm
chown slurm.slurm /var/log/slurm
</code>

Study the configuration information in the [[https://slurm.schedmd.com/quickstart_admin.html|Quick Start Administrator Guide]].

===== Home users =====

For the users' home folders, you can use the server's local disk or mount remote storage. Either way, it is recommended to create a dedicated folder to hold the users' data. In this example we create the folder /home/CLUSTER, and inside it a folder for every user:

<code>
mkdir /home/CLUSTER
</code>
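
If the home directories live on remote storage, they could for instance be mounted over NFS on every node; the server name and export path below are placeholders:

<code>
# /etc/fstab entry on each node (hypothetical NFS server and export path):
storage01:/export/CLUSTER  /home/CLUSTER  nfs  defaults  0 0
</code>
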
===== Creating users =====

You can create every user manually, or you can use an external user database such as Active Directory, OpenLDAP, MySQL, etc.
For this example we are going to create the users manually on every server, as sketched below.
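
A minimal sketch of adding one user by hand; the username and UID are placeholders, and the same UID/GID must be used on every node:

<code>
export CUSER=alice   # hypothetical username
export CUID=2001     # pick a UID/GID that is identical on all nodes
groupadd -g $CUID $CUSER
useradd -m -u $CUID -g $CUID -d /home/CLUSTER/$CUSER $CUSER
</code>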
  