I have access to an HPC system. Let's say I have three nodes available. The details of each node are as follows:
scontrol show node
Arch=x86_64 CoresPerSocket=10
CPUAlloc=20 CPUTot=20 CPULoad=22.67
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
.
.
RealMemory=91000 AllocMem=0 FreeMem=77291 Sockets=2 Boards=1
State=ALLOCATED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=cpu_normal_q
BootTime=2023-10-20T12:56:13 SlurmdStartTime=2023-10-20T12:57:43
CfgTRES=cpu=20,mem=91000M,billing=20
AllocTRES=cpu=20
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
I have an R script that uses the doParallel package to perform parallel computing.
library(doParallel)
library(Matrix)

# Set the number of cores to be used
num_cores <- detectCores()

# Initialize a parallel backend using doParallel
cl <- makeCluster(num_cores)

# Register the cluster for parallel processing
registerDoParallel(cl)

# Get the number of cores being utilized
cores_utilized <- getDoParWorkers()

# Function to perform matrix multiplication and inversion
matrix_mult_inv <- function() {
  # Generate a random 100 x 100 matrix
  mat <- matrix(rnorm(10000), nrow = 100)
  # Perform matrix multiplication
  result <- mat %*% mat
  # Compute the inverse of the result matrix
  inv_result <- solve(result)
  return(inv_result)
}

# Record the start time
start_time <- Sys.time()

# Perform the matrix multiplication and inversion in parallel,
# writing each iteration's result to its own file
result <- foreach(i = 1:300, .combine = cbind) %dopar% {
  write.table(matrix_mult_inv(), paste0("iteration_", i, ".txt"))
}

# Record the end time
end_time <- Sys.time()

# Print the number of cores being utilized
print(paste("Number of cores being utilized:", cores_utilized))

# Print the time taken to run all the iterations
print(paste("Time taken:", end_time - start_time))

# Stop the parallel backend
stopCluster(cl)
The code performs 300 iterations; within each iteration, a matrix multiplication and an inversion are done. The output is the total time taken to run the 300 iterations and the number of cores utilized.
My goal is to run this code in the HPC environment such that 20 cores from each node are utilized simultaneously, giving me 60 cores in total. Is it possible to do that?
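One approach I am considering (a minimal sketch, not something I have working yet): as far as I understand, parallel::makeCluster() also accepts a character vector of hostnames and starts one PSOCK worker per element, so repeating each node name 20 times should in principle give 60 workers. The hostnames below ("node01" etc.) are placeholders for my actual node names, and this approach would assume passwordless SSH from the master node to the others:

```r
# Placeholder hostnames (assumption) -- inside a Slurm allocation these could
# instead be obtained with:
#   hosts <- system("scontrol show hostnames $SLURM_JOB_NODELIST", intern = TRUE)
hosts <- c("node01", "node02", "node03")

# makeCluster() starts one PSOCK worker per element of the spec vector,
# so repeat each hostname 20 times to get 20 workers per node (60 total)
workers <- rep(hosts, each = 20)
length(workers)  # 60

# These lines can only run on the cluster itself (workers are launched
# over SSH), so they are shown here but not executed:
# cl <- parallel::makeCluster(workers, type = "PSOCK")
# doParallel::registerDoParallel(cl)
# ...
# parallel::stopCluster(cl)
```

I have not been able to verify whether this actually spreads the workers across the nodes on my system.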
I have looked into parSapply from the snow package as well, but I think ultimately it comes down to the makeCluster function. I tried

cl <- makeCluster(num_nodes, type = "SOCK", explicit = TRUE,
                  outfile = "", nodes = c(#3 specific node names input here#),
                  cpus = cores_per_node)

but this utilized only 3 cores in total.
I have also tried sbatch with --ntasks, --cpus-per-task and --nodes, but the job always runs on a single node utilizing only 20 cores. I am also looking into OpenMPI in case that helps, but so far I have not been able to make use of it.
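For reference, this is roughly the kind of batch script I have been submitting (a sketch: the partition name is taken from the scontrol output above, and my_script.R is a placeholder for my actual script name):

```shell
#!/bin/bash
#SBATCH --job-name=matinv
#SBATCH --partition=cpu_normal_q   # partition from the scontrol output above
#SBATCH --nodes=3                  # request all three nodes
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=20

# The allocated node names are visible inside the job via:
#   scontrol show hostnames "$SLURM_JOB_NODELIST"

# Rscript itself only runs on the first allocated node, which I suspect
# is why I only ever see 20 cores being used:
Rscript my_script.R
```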