Questions tagged [assembly]
The process of reconstructing the original sequence from the reads generated from it during a sequencing experiment. Can refer to genome assembly, in which case the original sequence is a genome, or transcriptome assembly, in which case the original sequences are RNA transcripts.
151
questions
1
vote
0
answers
28
views
Assessing the quality of an assembly
I am trying to run a script that assesses the quality of a de novo transcriptome assembly using a tool called Transrate. To install the tool I followed the prompts in https://bioconda....
1
vote
2
answers
63
views
Where to find the homopolymer regions BED file for the HG002 genome?
This question was also asked on Biostars
I am doing an experiment in which I am trying to analyze the errors in homopolymer regions between the polished HG002 reference genome and a hifiasm assembly ...
2
votes
0
answers
48
views
1
vote
1
answer
37
views
Hybrid assembly versus polishing for HiFi and Illumina reads
I have to carry out an assembly project using HiFi reads, for which I already have Illumina reads, and I am wondering whether hybrid assembly or polishing would be the better option for this ...
5
votes
2
answers
94
views
Multiple genome assemblies of the same bacterial species
I have some RNA-seq data containing reads from the "host" as well as from several bacterial species. In this experimental context, I am interested in the host-associated reads and the ...
2
votes
2
answers
68
views
Longstitch error make: command: Command not found *** No rule to make target
I installed Longstitch and ran the test script with no issues. The output files matched the expected output files. But when I am now trying to run Longstitch on my own data I am getting this error.
<...
1
vote
1
answer
60
views
MMSeqs taxonomy running for over a day
I've been trying to run mmseqs2 on a few metagenomic assemblies and despite my best efforts in reading the wiki and playing with parameters, the process is taking over a day.
In their paper they claim ...
1
vote
0
answers
145
views
bwa mem hangs after a few thousand reads
I am trying to align a bunch of paired sample fastq files using bwa mem.
My original command was:
...
1
vote
0
answers
29
views
Determining fragment mean and fragment stdev for MaSuRCA config file
Similar to this unanswered question on Biostars, I am using MaSuRCA for the first time and want to know how other MaSuRCA users are determining fragment mean and fragment stdev. My understanding is ...
0
votes
1
answer
105
views
Why do we delete scaffolds shorter than 500 - 1000 bp from the assembled genome?
After genome assembly, some protocols call for removing scaffolds shorter than 500 or 1000 bp (some papers use one threshold, others use the other). Is this simply to remove the ...
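For readers unfamiliar with the step being asked about, here is a minimal sketch of what "removing scaffolds shorter than N bp" can look like in practice. It is only an illustration, not the method of any particular protocol: the helper names `parse_fasta` and `filter_by_length`, the example records, and the default threshold of 500 are all assumptions made for this sketch.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines (hypothetical helper)."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(
    records: Iterable[Tuple[str, str]], min_len: int = 500
) -> List[Tuple[str, str]]:
    """Keep only records whose sequence has at least min_len bases (assumed cutoff)."""
    return [(name, seq) for name, seq in records if len(seq) >= min_len]


# Made-up example data: one 800 bp scaffold and one 200 bp scaffold.
example = [">scaffold_1", "ACGT" * 200, ">scaffold_2", "ACGT" * 50]
kept = filter_by_length(parse_fasta(example), min_len=500)
```

With the made-up input above, only `scaffold_1` passes the 500 bp cutoff. For a real assembly the same idea applies to lines read from the FASTA file, with the threshold chosen according to the protocol being followed.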
1
vote
1
answer
236
views
Interpreting GFA graph visualized in Bandage
I assembled Nanopore-sequenced reads with Flye and visualized the GFA graph in Bandage, but I don't really know how to interpret the result. For context, this is a yeast (DNA) genome. My goal is to ...
1
vote
1
answer
29
views
How important are the homozygous variants that get unnecessarily deleted using liftover?
I'm referring to the text described here:
These tools [NCBI remap, CrossMap] operate only on the sites present in an input VCF, and return the representation of those sites in a new genome assembly. ...
3
votes
1
answer
68
views
compare fasta sequences in pairs and collect metrics
I have 96 fasta files (A1, A2, A3...) from one plasmid assembly pipeline, and I have another 96 fasta files (B1, B2, B3 ...) from another plasmid assembly pipeline.
I would like to compare pair ...
1
vote
0
answers
51
views
Why do X-ray (but not cryo-EM) structures of the ribosome in PDB have 2 assemblies?
When I started programming against PDB, I felt a mixture of confusion and frustration at the fact that certain cif files contain two actual structures aka ...
1
vote
0
answers
15
views
DNASTAR viral-host integration assembly keeps failing
I have two NGS files from an NGS company corresponding to the sequencing data from a tumor sample as follows:
TB_7710391_R1.FASTQ.gz
TB_7710391_R2.FASTQ.gz
I have downloaded the genome for MCPyV as ...