bwa mem hangs after a few thousand reads


I am trying to align a bunch of paired sample fastq files using bwa mem.

My original command was:

bwa mem -t 8 hg38.fa sample_read1.fq.gz sample_read2.fq.gz > sample_paired.sam

I am running this on an HPC cluster.

These files have approx. 25 million reads, so I initially anticipated that they might take a little bit of time; however, I noticed that the program seemed to hang: the sam file that I am redirecting the output to never contained anything other than the header lines, even after hours of waiting.

Noticing this, I decided to first test whether the most bare bones bwa mem functionality was working, and so I took just the first 1000 lines off one of the fastq files and tested whether it was working with:

bwa mem hg38.fa test_1000lines.fq.gz > test_output.sam

So this works: it creates a proper sam file almost instantaneously. However, I kept testing this behavior by increasing the number of lines I took off the fastq file from 1000 to 2000, 3000, etc., and eventually realized that any fastq file with more than approximately 8400-8500 lines will make the program hang.
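For anyone reproducing this, truncated test files can be cut by record count instead of raw line count, so that no 4-line FASTQ record is split in half. This is a minimal sketch, assuming gzip-compressed input with strict 4-line records (no wrapped sequence lines); the function name fastq_head is hypothetical and not part of bwa.

```shell
# fastq_head IN.fq.gz N OUT.fq.gz
# Write the first N records (4 lines each) of a gzipped FASTQ to OUT.fq.gz.
# Assumption: every record is exactly 4 lines (header, sequence, '+', quality).
fastq_head() {
    local in=$1 n=$2 out=$3
    # Decompress, keep the first N*4 lines, recompress.
    gzip -dc "$in" | head -n "$((n * 4))" | gzip > "$out"
}
```

For example, `fastq_head sample_read1.fq.gz 2100 test_8400lines.fq.gz` would keep 2100 records, i.e. 8400 lines.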

If I run the following with a fastq file containing 8400 lines, the program will run and the sam file will be created in a few seconds:

bwa mem hg38.fa test_8400lines.fq.gz > test_output8400.sam

and this is what bwa prints to the terminal (its log goes to stderr, since stdout is redirected to the sam file):

[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 2100 sequences (313733 bp)...
[M::mem_process_seqs] Processed 2100 reads in 1.051 CPU sec, 1.054 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem supporting_files/hg38.fa test_8-4k.fq.gz
[main] Real time: 4.231 sec; CPU: 4.178 sec

However, if I run the following with just 100 more lines in the fastq file:

bwa mem hg38.fa test_8500lines.fq.gz > test_output8500.sam

The command will hang (at least, that is what it appears to be doing) and all I will see on the terminal is:

[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 2125 sequences (317468 bp)...

followed by a blinking cursor.

I've monitored the CPU and MEM usage with the "top" command while I was conducting these tests, and while the memory never goes above something like 15%, the CPU usage always goes to 100% when this hanging behavior happens. When the .sam files were created properly and the program didn't hang, the CPU usage was (for the very short amount of time it took for the program to run) like 60-70%.

Whether the CPU usage going to 100% is causing the hanging, or the hanging is causing the CPU usage to go to 100%, I'm not sure.
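One way to make the "hang" observable in a script, rather than by watching a cursor, is to put a wall-clock limit on the command. This is a minimal sketch, assuming GNU coreutils timeout is available (it exits with status 124 when the limit is hit); the wrapper name run_with_limit is hypothetical.

```shell
# run_with_limit SECONDS CMD [ARGS...]
# Run CMD under a wall-clock limit and report on stderr if the limit was hit.
# Assumption: GNU coreutils `timeout`, which returns 124 on timeout.
run_with_limit() {
    local limit=$1
    shift
    local status=0
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s: $*" >&2
    fi
    return "$status"
}
```

For example, `run_with_limit 300 bwa mem hg38.fa test_8500lines.fq.gz > test_output8500.sam` would give up after five minutes instead of running indefinitely. A timeout only shows that the run did not finish in time; it cannot tell a true infinite loop from a very slow alignment.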

I'm completely out of ideas what the issue is here. Any ideas?

EDIT: Although I originally suspected this to be an issue with my fastq file, I observed the stalling behavior with a different fastq file from a completely different source and ruled that out. It turns out the problem is something related to the reference index, since when I use a different set of reference indices (the hs38DH set) instead of the hg38 set that I had been using, the problem disappears.
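Since the index is the suspect, one cheap thing to rule out is an incomplete index build. This is a minimal sketch, assuming the index was built with bwa index from the 0.7.x series, which writes five files next to the FASTA (.amb, .ann, .bwt, .pac, .sa); the function name check_bwa_index is hypothetical. Passing this check does not prove the index is valid, and nothing in this question establishes that a missing file was the actual cause.

```shell
# check_bwa_index PREFIX
# Return 0 if all five bwa index files for PREFIX exist and are non-empty,
# otherwise list the offenders on stderr and return 1.
check_bwa_index() {
    local prefix=$1 ext missing=0
    for ext in amb ann bwt pac sa; do
        if [ ! -s "${prefix}.${ext}" ]; then
            echo "missing or empty: ${prefix}.${ext}" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_bwa_index hg38.fa` checks hg38.fa.amb, hg38.fa.ann, and so on.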

edited Dec 4, 2023 at 20:27

Ram RS


asked Nov 29, 2023 at 1:42

padakpatek


The problem with this question is there's no diagnosis. top doesn't diagnose the cluster - it diagnoses the head node - where the job usually shouldn't be running by convention (cluster depending). RAM allocation between the head node and the worker nodes isn't always the same thing. The correct command is htop. What's the output of that? -t is multi-threading. htop will describe the CPU and RAM allocation across every CPU in the cluster - not just one node. Are you running a queue system and an in-house cluster (via jobs) or a cloud and partitions with/without job submission?
– M__ ♦
Commented Nov 29, 2023 at 10:34
Does this always happen or is it specific to these particular reads only?
– terdon ♦
Commented Nov 29, 2023 at 11:28
@terdon So what I've been able to confirm since I posted this question is that there are multiple locations in this fastq file where this hanging behavior occurs (for example, if I skip over the first 50k lines and take out a chunk of 10k lines and run bwa mem, then I will once again observe stalling behavior from those 10k lines). It looks like this problem between line 8400 and 8500 just happens to be the first occurrence of the problem. I can also confirm that my other sample fastq files also exhibit this behavior, although at different locations in the file.
– padakpatek
Commented Nov 29, 2023 at 16:11
Please edit that into your question. Also tell us what species you are sequencing, how much of it, if your target includes many repetitive or low-complexity regions etc. It really sounds like you're hitting some specific issue but I cannot be sure and maybe if you add more detail someone more knowledgeable than I can figure it out. Even better, try and get the smallest file that reproduces the issue and upload it somewhere we can see it (assuming you are allowed to).
– terdon ♦
Commented Nov 29, 2023 at 16:13
Cross-posted here: bwa mem hangs after a few thousand reads
– Timur Shtatland
Commented Nov 29, 2023 at 18:14
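On the suggestion above to find the smallest file that reproduces the issue: the manual "add 100 more lines" search can be automated as a binary search over the number of leading records. This is a minimal sketch under stated assumptions: the input is an uncompressed FASTQ with strict 4-line records, the whole file is known to fail, and failure is monotonic (once a prefix fails, every longer prefix fails too). The function name first_bad_record is hypothetical, and the probe command is supplied by the caller.

```shell
# first_bad_record FASTQ CMD [ARGS...]
# Print the smallest number of leading records for which CMD fails.
# CMD is invoked with the path of a temporary FASTQ prefix as its last argument
# and must exit 0 on success, non-zero on failure (e.g. a timeout).
first_bad_record() {
    local fq=$1
    shift
    local total lo hi mid tmp
    total=$(( $(wc -l < "$fq") / 4 ))
    tmp=$(mktemp)
    lo=0          # largest prefix assumed to pass (empty input)
    hi=$total     # smallest prefix assumed to fail (the whole file)
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        head -n "$((mid * 4))" "$fq" > "$tmp"
        if "$@" "$tmp"; then
            lo=$mid
        else
            hi=$mid
        fi
    done
    rm -f "$tmp"
    echo "$hi"
}
```

A probe for this question could be a small function such as `probe() { timeout 60 bwa mem hg38.fa "$1" > /dev/null 2>&1; }`, called as `first_bad_record reads.fq probe`; the 60-second limit is an arbitrary choice. Since the stalling reportedly occurs at several places in the file, this only locates the first one.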
