Questions tagged [hpc]
The hpc tag has no usage guidance.
42 questions
0 votes · 1 answer · 371 views
Loading a module in a bash script on an HPC cluster
I am submitting the following bash script with qsub on a standard university cluster:
#!/bin/bash
#$ -cwd # Set the working directory for the job to the current directory
#$ -pe smp 1 # ...
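The truncated script above can be fleshed out into a minimal sketch. In a non-interactive batch shell the `module` command is often not defined until the modules profile is sourced; the paths, module name, and executable below are assumptions, so check your cluster's documentation.

```shell
#!/bin/bash
#$ -cwd            # Set the working directory for the job to the current directory
#$ -pe smp 1       # Request one slot in the shared-memory parallel environment

# Hypothetical sketch: make `module` available to a non-interactive shell,
# then load an assumed module. The profile path varies by site.
source /etc/profile.d/modules.sh   # common location; may differ on your cluster
module load example/1.0            # placeholder module name
./my_program                       # placeholder executable
```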
0 votes · 0 answers · 91 views
Dynamically checking and allocating SLURM nodes within a python script
I have a computationally expensive simulation function that I am looking to distribute across a multi-node cluster. The code looks something like this:
input_tasks = [input_0, input_1, ..., input_n]
for i ...
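For a loop over independent inputs like the one above, one common alternative to allocating nodes from inside Python is a SLURM job array, where the scheduler launches one task per input. This is a hedged sketch, not the asker's setup: the array range, resource lines, and `simulate.py` are all placeholders.

```shell
#!/bin/bash
#SBATCH --array=0-9      # one array task per input (placeholder range)
#SBATCH --nodes=1        # each task fits on a single node
#SBATCH --ntasks=1

# Hypothetical sketch: each array task processes one input, selected by
# $SLURM_ARRAY_TASK_ID. simulate.py is a placeholder for the user's code.
python simulate.py --input-index "$SLURM_ARRAY_TASK_ID"
```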
0 votes · 1 answer · 79 views
On which computer are mounted files executed?
This could be a few questions, but I feel they share the same underlying issue. On which computer are mounted executable files executed? And does this change with sshfs, NFS, or symbolic links (ln)? For example, if you have a storage ...
0 votes · 0 answers · 70 views
How can I port-forward from an interactive transient PBS session to the head node, and from the head node to my local machine?
The short of it: how can I port-forward from an interactive transient PBS session to the head node, and then from the head node to my local machine? In other words, a chain of port forwards across three machines.
More ...
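A chain like this is typically built from two `ssh -L` forwards, one per hop. The sketch below is a generic pattern, not the asker's configuration: every hostname, username, and port number is a placeholder.

```shell
# Hypothetical two-hop forward sketch (all names and ports are placeholders).

# Hop 1 — on the head node, once the PBS job is running on compute-node-01,
# forward the head node's port 8888 to the same port inside the session:
ssh -L 8888:localhost:8888 compute-node-01

# Hop 2 — on your local machine, forward your port 8888 to the head node's:
ssh -L 8888:localhost:8888 user@head-node.example.edu

# Now localhost:8888 on the local machine reaches port 8888 in the PBS session.
```

If the head node can resolve the compute node directly, the two hops can sometimes be collapsed into one command from the local machine, e.g. `ssh -L 8888:compute-node-01:8888 user@head-node.example.edu`.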
0 votes · 0 answers · 228 views
Why does 'module load intel' fail to load libfftw3?
I am on a cluster (CentOS) trying to run a program (pw.x) that requires loading three modules: intel, impi, and quantum-espresso. Now I am getting an error saying
pw.x: error while loading shared ...
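A useful first step for "error while loading shared libraries" is to check which library is unresolved and whether any loaded module puts it on `LD_LIBRARY_PATH`. The sketch below assumes the module names from the question; the `fftw` module name is a guess, since some sites ship FFTW as its own module rather than inside the intel module.

```shell
# Hypothetical debugging sketch (module names are assumptions):
module load intel impi quantum-espresso
ldd "$(which pw.x)" | grep 'not found'   # list unresolved shared libraries
module show intel                        # inspect what the module actually sets
echo "$LD_LIBRARY_PATH"                  # is any FFTW directory on the path?

# If libfftw3 lives in a separate module on your site, it may need loading too:
module avail fftw                        # check what FFTW modules exist
```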
1 vote · 0 answers · 122 views
Network management switch and data transfer switch in an HPC cluster
Shown in the image is a validated design from Dell for an HPC cluster with AI workloads. We are using this reference in our HPC design. The image is taken from page 12 of this Dell white paper: HPC ...
0 votes · 0 answers · 1k views
Windows can't find SSHFS
When mapping a network drive on Windows 10, after installing SSHFS-Win and WinFsp, I get the error message "Windows can't find ssfhs". Any idea why this is happening? Here are the steps that ...
0 votes · 1 answer · 134 views
Why can non-root-installed software work across the whole cluster?
I recently installed a new python3 and another Python package locally in my account folder on a cluster with a dozen nodes (each node with several cores).
I originally thought that I can python3 ...
5 votes · 1 answer · 273 views
Do writes and reads contend with each other for an SSD? [closed]
I have a disk I/O intensive application where I am reading and writing a lot concurrently.
For a spinning disk, it makes sense that there is contention because the read/write head has to move around a lot, but ...
0 votes · 1 answer · 812 views
passwd: Authentication token manipulation error using ssh and public key
I'm using ssh with a public key (saved on my local computer) to connect to an HPC cluster as follows:
$ ssh -i ~/narvi_key/xenial-narvi-key [email protected]
Enter passphrase for key '/home/xenial/...
1 vote · 0 answers · 2k views
How to make a host file in SLURM with $SLURM_JOB_NODELIST
I have access to an HPC cluster with 40 cores on each node. I have a batch file to run a total of 35 codes, which are in separate folders. Each is an OpenMP code that requires 4 cores, so how do I ...
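A common pattern is to expand `$SLURM_JOB_NODELIST` into a one-host-per-line file from inside the batch script. On a real cluster `scontrol show hostnames` handles compressed lists like `node[01-04]`; the `tr` fallback below only handles plain comma-separated lists and exists so the sketch runs outside SLURM. The file name and default node list are placeholders.

```shell
#!/bin/bash
# Hypothetical sketch: build a hostfile from $SLURM_JOB_NODELIST.
make_hostfile() {
    local out="$1"
    if command -v scontrol >/dev/null 2>&1; then
        # Expands compressed node lists such as node[01-04,07]
        scontrol show hostnames "$SLURM_JOB_NODELIST" > "$out"
    else
        # Fallback for plain comma-separated lists (demo outside SLURM)
        echo "$SLURM_JOB_NODELIST" | tr ',' '\n' > "$out"
    fi
}

# Placeholder default so the sketch is runnable outside a job
SLURM_JOB_NODELIST="${SLURM_JOB_NODELIST:-node1,node2,node3}"
make_hostfile hostfile.txt
cat hostfile.txt
```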
0 votes · 1 answer · 1k views
Getting error while using Rsync on Cygwin
I am using rsync on Cygwin under Windows 7 to transfer data to an HPC cluster. I type the following command to transfer a folder from my computer to the HPC remote server.
rsync -rzv /cygdrive/C/...
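Two details often trip up rsync under Cygwin: the drive letter in `/cygdrive` paths is lowercase (`/cygdrive/c`, not `/cygdrive/C`), and the destination needs the `user@host:path` form. A hedged sketch, with all user, host, and path names as placeholders:

```shell
# Hypothetical sketch: lowercase Cygwin drive path, explicit remote target.
# Everything after -rzv is a placeholder; substitute your own paths.
rsync -rzv --progress /cygdrive/c/Users/me/myfolder/ user@hpc.example.edu:/home/user/myfolder/
```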
1 vote · 1 answer · 1k views
slurmd: Invalid job credential
I'm having some problems with a test configuration of Slurm on my laptop. I'm trying to run four slurmd instances on one machine, which is also the machine that slurmctld runs on. I have a local ...
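"Invalid job credential" from slurmd commonly points at MUNGE: the daemons must share the same key, the munge service must be running, and clocks must agree. A hedged checklist, assuming the default MUNGE paths and systemd (both are assumptions):

```shell
# Hypothetical checks for credential errors (default paths assumed):
munge -n | unmunge                 # a local round trip should report Success
sudo md5sum /etc/munge/munge.key   # must match on every host running a daemon
date                               # clock skew also invalidates credentials
systemctl status munge             # the munge daemon must be active
```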
1 vote · 0 answers · 58 views
HPC Visualization node GPU choice
On most HPC visualization nodes, we find NVIDIA Tesla cards; the P100 comes to mind.
I'm not sure I follow why. NVIDIA Tesla cards are designed for compute according to NVIDIA's documentation, not for ...
0 votes · 1 answer · 1k views
mpirun tcp_peer_send_blocking: send() to socket X failed: Broken pipe (32)
I am using an HPC cluster composed of a master and four slaves (master, slave1, slave2, slave3, and slave4).
I am trying to run a script on the cluster:
mpirun -report-uri - -host master,slave1,slave2,slave3,slave4 --map-by ...
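One way to make a long `-host` list more manageable is an Open MPI hostfile with explicit slot counts. This sketch reuses the machine names from the question, but the slot counts and program name are placeholders; the `mpirun` call is guarded so the hostfile part runs anywhere.

```shell
# Hypothetical sketch (Open MPI hostfile syntax; slot counts are placeholders):
cat > hostfile <<'EOF'
master slots=4
slave1 slots=4
slave2 slots=4
slave3 slots=4
slave4 slots=4
EOF

# "Broken pipe" during startup usually means one host is unreachable over TCP:
# verify passwordless ssh to each slave and that firewalls allow the MPI ports.
if command -v mpirun >/dev/null 2>&1; then
    mpirun --hostfile hostfile --map-by node ./my_program  # placeholder binary
fi
```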