hpc:slurm-setup

Differences

This shows you the differences between two versions of the page.

hpc:slurm-setup [2020/01/29 14:10]
miriel@uclv
hpc:slurm-setup [2020/04/10 17:38] (current)
Line 260: Line 260:
 <code>
 systemctl enable slurmdbd
 +</code>
 +
 +Create the spool directory and log files that slurmctld needs on the server, owned by the slurm user.
 +
 +<code>
 +mkdir /var/spool/slurmctld
 +chown slurm: /var/spool/slurmctld
 +chmod 755 /var/spool/slurmctld
 +touch /var/log/slurmctld.log
 +chown slurm: /var/log/slurmctld.log
 +touch /var/log/slurm_jobacct.log /var/log/slurm_jobcomp.log
 +chown slurm: /var/log/slurm_jobacct.log /var/log/slurm_jobcomp.log
 +
 </code>
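The same steps can be wrapped in a small helper that is safe to run more than once. This is a minimal sketch, not part of Slurm: the function name ''prepare_slurm_paths'' is hypothetical, and it assumes the owner is given in the form that chown accepts (for example ''slurm:'').

```shell
# Hypothetical helper, not shipped with Slurm.
# Usage: prepare_slurm_paths OWNER DIR [LOGFILE...]
# OWNER is passed to chown unchanged, e.g. "slurm:" (user plus login group).
prepare_slurm_paths() {
    owner=$1
    dir=$2
    shift 2
    # mkdir -p succeeds when the directory already exists, so re-runs are safe
    mkdir -p "$dir" || return 1
    chmod 755 "$dir" || return 1
    chown "$owner" "$dir" || return 1
    for log in "$@"; do
        # touch creates a missing file and leaves existing content untouched
        touch "$log" || return 1
        chown "$owner" "$log" || return 1
    done
}
```

On the server it could be called as root, for example: ''prepare_slurm_paths slurm: /var/spool/slurmctld /var/log/slurmctld.log /var/log/slurm_jobacct.log /var/log/slurm_jobcomp.log''.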
  
 == Compute nodes ==
-On Compute nodes you may additionally install the slurm-pam_slurm RPM package to prevent rogue users from logging in:
+On compute nodes, install the slurm-slurmd RPM package; you may additionally install the slurm-pam_slurm package to prevent rogue users from logging in:
 
 <code>
 export VER=19.05.5-1
-yum install slurm-pam_slurm-$VER*rpm
+yum install slurm-slurmd-$VER*rpm slurm-pam_slurm-$VER*rpm
 systemctl enable slurmd
 +</code>
 +
 +Create the spool directory and log file that slurmd needs on every compute node, owned by the slurm user.
 +
 +<code>
 +mkdir /var/spool/slurmd
 +chown slurm: /var/spool/slurmd
 +chmod 755 /var/spool/slurmd
 +touch /var/log/slurmd.log
 +chown slurm: /var/log/slurmd.log
 </code>
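Before starting slurmd it can help to confirm that these paths exist and belong to the expected account. The following is a minimal sketch under that assumption; ''check_owned_by'' is a hypothetical name, not a Slurm tool, and it relies only on the standard ''find -user'' test.

```shell
# Hypothetical check, not a Slurm tool.
# Usage: check_owned_by USER PATH...
# Prints one line per path and returns non-zero if any path is missing
# or is owned by a different account.
check_owned_by() {
    user=$1
    shift
    rc=0
    for target in "$@"; do
        if [ ! -e "$target" ]; then
            echo "missing: $target"
            rc=1
        # -prune stops find from descending; -user prints the path only on a match
        elif [ -z "$(find "$target" -prune -user "$user" -print)" ]; then
            echo "wrong owner: $target"
            rc=1
        else
            echo "ok: $target"
        fi
    done
    return $rc
}
```

On a compute node this could be run as ''check_owned_by slurm /var/spool/slurmd /var/log/slurmd.log''.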
  
Line 276: Line 299:
  
 <code>
-cp /etc/slurm/slurm.conf.example /etc/slurm/
+cp /etc/slurm/slurm.conf.example /etc/slurm/slurm.conf
 </code>
  
 It also has a web-based [[https://slurm.schedmd.com/configurator.html|configuration tool]] that can build a simple configuration file, which can then be edited manually for more complex configurations.
  
-After that we need
+After that we need to edit /etc/slurm/slurm.conf and make some modifications.
 + 
 +<​code>​ 
 +vi /​etc/​slurm/​slurm.conf 
 +</​code>​ 
 +The most important parameters to change are ClusterName and ControlMachine:
 +<​code>​ 
 +ClusterName=vlir-test 
 +ControlMachine=10.10.2.242 
 +SlurmUser=slurm 
 +SlurmctldPort=6817 
 +SlurmdPort=6818 
 +AuthType=auth/​munge 
 +StateSaveLocation=/​var/​spool/​slurm/​ctld 
 +SlurmdSpoolDir=/​var/​spool/​slurm/​d 
 +SwitchType=switch/​none 
 +MpiDefault=none 
 +SlurmctldPidFile=/​var/​run/​slurmctld.pid 
 +SlurmdPidFile=/​var/​run/​slurmd.pid 
 +ProctrackType=proctrack/​pgid 
 +ReturnToService=0 
 + 
 +</​code>​ 
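The two parameters can also be set without opening an editor, which is convenient when the same change has to be repeated on several machines. The sketch below is one way to do it with awk; ''set_conf_key'' is a hypothetical helper, and it assumes each parameter sits on its own line as ''Key=value'' with exactly the spelling used above.

```shell
# Hypothetical helper, not part of Slurm.
# Usage: set_conf_key FILE KEY VALUE
# Replaces an existing uncommented "KEY=..." line, or appends one if none exists.
set_conf_key() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    awk -v k="$key" -v v="$value" '
        BEGIN { FS = "=" }
        $1 == k { print k "=" v; found = 1; next }   # rewrite the matching line
        { print }                                     # copy everything else
        END { if (!found) print k "=" v }             # append when the key was absent
    ' "$file" > "$tmp" && cat "$tmp" > "$file"        # cat keeps owner and mode of FILE
    rc=$?
    rm -f "$tmp"
    return $rc
}
```

With the values from this example: ''set_conf_key /etc/slurm/slurm.conf ClusterName vlir-test'' followed by ''set_conf_key /etc/slurm/slurm.conf ControlMachine 10.10.2.242''. Commented lines such as ''#ControlMachine=...'' are left alone, so the new setting is appended at the end of the file in that case.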
 + 
 +If the /var/spool/slurm directory does not exist, you need to create it.
 + 
 +<​code>​ 
 +mkdir /​var/​spool/​slurm 
 +chown -R slurm:slurm /var/spool/slurm
 +</​code>​ 
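The directories that Slurm actually uses are the ones named by StateSaveLocation and SlurmdSpoolDir in slurm.conf, so one option is to read them from the file instead of typing the paths twice. This is a sketch under the assumption that each setting is written as ''Key=value'' at the start of a line; ''conf_value'' and ''make_spool_dirs'' are hypothetical names.

```shell
# Hypothetical helpers, not part of Slurm.
# conf_value FILE KEY: print the value of the last uncommented "KEY=value" line,
# dropping any trailing whitespace or "# comment".
conf_value() {
    sed -n "s/^$2=\([^#[:space:]]*\).*/\1/p" "$1" | tail -n 1
}

# make_spool_dirs FILE OWNER: create both spool directories named in FILE
# and hand them to OWNER (any form chown accepts, e.g. "slurm:slurm").
make_spool_dirs() {
    conf=$1
    owner=$2
    for key in StateSaveLocation SlurmdSpoolDir; do
        dir=$(conf_value "$conf" "$key")
        [ -n "$dir" ] || continue          # setting absent: nothing to create
        mkdir -p "$dir" || return 1
        chown "$owner" "$dir" || return 1
    done
}
```

Run as root, ''make_spool_dirs /etc/slurm/slurm.conf slurm:slurm'' would create /var/spool/slurm/ctld and /var/spool/slurm/d for the configuration shown above.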
 + 
 +==== Slurm logging ==== 
 + 
 + 
 +The RPMs do not define a Slurm log file directory; you have to set it yourself in slurm.conf. See SlurmdLogFile and SlurmctldLogFile in the slurm.conf man page, and LogFile in the slurmdbd.conf man page.
 + 
 +Check your logging configuration with: 
 +<​code>​ 
 +grep -i logfile /​etc/​slurm/​slurm.conf 
 +</​code>​ 
 +<​code>​ 
 +SlurmctldLogFile=/​var/​log/​slurm/​slurmctld.log 
 +SlurmdLogFile=/​var/​log/​slurm/​slurmd.log 
 +</​code>​ 
 + 
 +<​code>​ 
 +scontrol show config | grep -i logfile 
 +</​code>​ 
 +<​code>​ 
 +SlurmctldLogFile ​       = /​var/​log/​slurm/​slurmctld.log 
 +SlurmdLogFile ​          = /​var/​log/​slurm/​slurmd.log 
 +SlurmSchedLogFile ​      = /​var/​log/​slurm/​slurmsched.log 
 + 
 +</​code>​ 
 + 
 +If log files are configured, you have to create the log file directory manually: 
 + 
 +<​code>​ 
 +mkdir /​var/​log/​slurm 
 +chown slurm:slurm /var/log/slurm
 +</​code>​
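Which directory has to exist depends on the log file settings, so it can be derived from slurm.conf rather than assumed. A minimal sketch, assuming the settings are uncommented ''...LogFile=path'' lines with that exact capitalisation; ''log_dirs'' is a hypothetical name.

```shell
# Hypothetical helper, not part of Slurm.
# log_dirs FILE: print each distinct directory that holds a configured log file.
log_dirs() {
    # keep only the value of lines such as SlurmctldLogFile=... or SlurmdLogFile=...
    sed -n 's/^[A-Za-z]*LogFile=\([^#[:space:]]*\).*/\1/p' "$1" |
    while read -r file; do
        [ -n "$file" ] || continue   # skip settings with an empty value
        dirname "$file"
    done | sort -u
}
```

The output can then drive the directory creation, for example: ''for d in $(log_dirs /etc/slurm/slurm.conf); do mkdir -p "$d" && chown slurm:slurm "$d"; done''.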
  
  
 Study the configuration information in the [[https://slurm.schedmd.com/quickstart_admin.html|Quick Start Administrator Guide]].
 +
 +===== Home Users =====
 +User home directories can live on the server's local disk or on mounted remote storage. Either way, it is recommended to create one parent folder to hold them. In this example we create the folder /home/CLUSTER and place each user's folder inside it.
 +
 +<​code>​
 +mkdir /​home/​CLUSTER
 +</​code>​
 +
 +===== Creating users =====
 +
 +You can create every user manually, or you can use an external user database such as Active Directory, OpenLDAP or MySQL.
 +For this example we are going to create the users manually on every server.
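When users are created by hand on every server, each account needs the same UID on all of them, since Slurm expects a uniform user name space across the cluster. One way to keep that consistent is to generate the commands from a single list. This is only a sketch: the ''name:uid'' input format and the function name ''gen_useradd'' are assumptions made for illustration, and the home directories follow the /home/CLUSTER layout from the previous section.

```shell
# Hypothetical helper, not part of Slurm.
# gen_useradd BASE: read "name:uid" lines on stdin and print one useradd
# command per user, with the home directory placed under BASE.
# Nothing is executed here; the output is meant to be reviewed first.
gen_useradd() {
    base=$1
    while IFS=: read -r name uid; do
        [ -n "$name" ] || continue   # skip blank lines
        # -m creates the home directory if missing, -u fixes the UID, -d sets the home path
        printf 'useradd -m -u %s -d %s/%s %s\n' "$uid" "$base" "$name" "$name"
    done
}
```

For example, ''gen_useradd /home/CLUSTER < users.txt'' prints the commands; after checking them, the same output can be piped to ''sh'' as root on each server.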
 +
  
hpc/slurm-setup.1580307035.txt.gz · Last modified: 2020/04/10 17:38 (external edit)