Questions tagged [assembly]
The process of reconstructing the original sequence from the reads generated from it during a sequencing experiment. Can refer to genome assembly, in which case the original sequence is a genome, or transcriptome assembly, in which case the original sequences are RNA transcripts.
152
questions
1
vote
1
answer
24
views
Why is rnaSPAdes giving three FASTA output files instead of only one?
I'm running rnaSPAdes for de novo transcriptome assembly in the Galaxy workflow manager. Instead of giving only one output of ...
1
vote
0
answers
28
views
Assessing the quality of an assembly
I am trying to run a script that assesses the quality of a de novo transcriptome assembly using a tool called Transrate. To install the tool I followed the prompts in https://bioconda....
1
vote
2
answers
66
views
Where to find the homopolymer regions BED file for the HG002 genome?
This question was also asked on Biostars
I am doing an experiment where I am trying to analyze the errors in the homopolymer regions between the polished reference HG002 genome and hifiasm assembly ...
2
votes
0
answers
49
views
1
vote
1
answer
40
views
Hybrid assembly versus polishing for HiFi and Illumina reads
I have to carry out an assembly project using HiFi reads, for which I already have Illumina reads, and I am wondering whether hybrid assembly or polishing would be the best option for this ...
5
votes
2
answers
95
views
Multiple genome assemblies of the same bacterial species
I have some RNA-seq data where there are reads from the "host" as well as from several bacterial species. In this experimental context, I am interested in the host-associated reads and the ...
2
votes
2
answers
68
views
Longstitch error make: command: Command not found *** No rule to make target
I installed Longstitch and ran the test script with no issues. The output files matched the expected output files. But when I am now trying to run Longstitch on my own data I am getting this error.
<...
1
vote
1
answer
63
views
MMSeqs taxonomy running for over a day
I've been trying to run mmseqs2 on a few metagenomic assemblies and despite my best efforts in reading the wiki and playing with parameters, the process is taking over a day.
In their paper they claim ...
1
vote
0
answers
149
views
bwa mem hangs after a few thousand reads
I am trying to align a bunch of paired sample fastq files using bwa mem.
My original command was:
...
1
vote
0
answers
30
views
Determining fragment mean and fragment stdev for MaSuRCA config file
Similar to this unanswered question on Biostars, I am using MaSuRCA for the first time and want to know how other MaSuRCA users are determining fragment mean and fragment stdev. My understanding is ...
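One way to get at these two numbers, offered only as a sketch and not as the MaSuRCA authors' recommended procedure, is to map a subset of the paired reads to a draft or related reference and summarize the observed template lengths. The example below assumes such an alignment already exists in SAM text form; the function name `template_lengths` and the toy records are hypothetical. It relies on two facts from the SAM format: column 9 (TLEN) holds the signed template length, and flag bit 0x2 marks properly paired reads. Keeping only positive TLEN values counts each pair once.

```python
import statistics


def template_lengths(sam_lines):
    """Yield one positive template length per properly paired read pair."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        tlen = int(fields[8])
        # 0x2 = properly paired; tlen > 0 selects the leftmost mate only
        if flag & 0x2 and tlen > 0:
            yield tlen


# Toy SAM records (made-up data, for illustration only)
sam = [
    "@HD\tVN:1.6",
    "r1\t99\tchr1\t100\t60\t4M\t=\t300\t204\tACGT\tFFFF",
    "r1\t147\tchr1\t300\t60\t4M\t=\t100\t-204\tACGT\tFFFF",
    "r2\t99\tchr1\t500\t60\t4M\t=\t692\t196\tACGT\tFFFF",
    "r2\t147\tchr1\t692\t60\t4M\t=\t500\t-196\tACGT\tFFFF",
]

lengths = list(template_lengths(sam))
fragment_mean = statistics.mean(lengths)
fragment_stdev = statistics.stdev(lengths)  # sample standard deviation
print(fragment_mean, fragment_stdev)
```

On real data it may be worth trimming extreme outliers before computing the standard deviation, since a handful of chimeric or mis-mapped pairs can inflate it considerably.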
0
votes
1
answer
116
views
Why do we delete scaffolds shorter than 500 - 1000 bp from the assembled genome?
After genome assembly, some protocols call for the removal of scaffolds shorter than 500 or 1000 bp (papers differ on which threshold they use). Is this simply to remove the ...
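For readers unfamiliar with the filtering step the question refers to, here is a minimal sketch of it in plain Python. The helper names `read_fasta` and `filter_by_length` and the toy scaffolds are hypothetical, and the 500 bp default is just one of the two thresholds mentioned in the question, not a recommendation.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_len=500):
    """Keep only records whose sequence is at least min_len bases long."""
    return [(h, s) for h, s in records if len(s) >= min_len]


# Toy input: one 800 bp scaffold and one 200 bp scaffold
example = [">scaffold_1", "ACGT" * 200, ">scaffold_2", "ACGT" * 50]
kept = filter_by_length(read_fasta(example), min_len=500)
print([h for h, _ in kept])
```

With a real assembly, `read_fasta` can be given an open file handle instead of a list, since it only iterates over lines.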
1
vote
1
answer
251
views
Interpreting GFA graph visualized in Bandage
I assembled Nanopore sequenced reads with Flye and visualized the GFA graph in Bandage but I don't really know how to interpret the result. For context, this is a yeast (DNA) genome. My goal is to ...
1
vote
1
answer
29
views
How important are the homozygous variants that get unnecessarily deleted using liftover?
I'm referring to the text described here:
These tools [NCBI remap, CrossMap] operate only on the sites present in an input VCF, and return the representation of those sites in a new genome assembly. ...
3
votes
1
answer
69
views
compare fasta sequences in pairs and collect metrics
I have 96 fasta files (A1, A2, A3...) from one plasmid assembly pipeline, and I have another 96 fasta files (B1, B2, B3 ...) from another plasmid assembly pipeline.
I would like to compare pair ...
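A rough sketch of the pairing logic described here, assuming each sample's sequence has already been loaded into a dictionary keyed by sample number (so key "1" stands for A1 in one dictionary and B1 in the other). The function names and the choice of metrics are hypothetical; the mismatch count is only computed when the two sequences have equal length, because anything else needs a real aligner.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty one)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def pair_metrics(seq_a, seq_b):
    """Collect simple comparison metrics for one A/B pair."""
    a, b = seq_a.upper(), seq_b.upper()
    same_length = len(a) == len(b)
    return {
        "len_a": len(a),
        "len_b": len(b),
        "length_diff": len(b) - len(a),
        "gc_a": gc_fraction(a),
        "gc_b": gc_fraction(b),
        "identical": a == b,
        # Position-wise mismatches only make sense without an alignment
        # when both sequences have the same length.
        "mismatches": sum(x != y for x, y in zip(a, b)) if same_length else None,
    }


def compare_pairs(seqs_a, seqs_b):
    """Compare sequences that share a sample key in both dictionaries."""
    shared = sorted(set(seqs_a) & set(seqs_b))
    return {key: pair_metrics(seqs_a[key], seqs_b[key]) for key in shared}


# Toy data standing in for A1/A2 and B1/B2
pipeline_a = {"1": "ACGTACGT", "2": "GGGG"}
pipeline_b = {"1": "ACGTACGA", "2": "GGGGCC"}
results = compare_pairs(pipeline_a, pipeline_b)
for key, metrics in results.items():
    print(key, metrics["length_diff"], metrics["mismatches"])
```

Plasmid assemblies are circular and may start at different positions or be reverse-complemented, so pairs that differ in this simple comparison are not necessarily different plasmids.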
1
vote
0
answers
51
views
Why do XRAY (but not CryoEM) structures of ribosome in PDB have 2 assemblies?
When I started programming against PDB I had a mixture of confusion & frustration with the fact that certain cif files contain two actual structures aka ...
1
vote
0
answers
15
views
DNASTAR viral-host integration assembly keeps failing
I have two NGS files from an NGS company corresponding to the sequencing data from a tumor sample as follows:
TB_7710391_R1.FASTQ.gz
TB_7710391_R2.FASTQ.gz
I have downloaded the genome for MCPyV as ...
1
vote
0
answers
97
views
RagTag patch error--"Tuple index out of range"
This question was also asked on GitHub
I'm trying to correct a long-read assembly with a short-read scaffold; I'm hoping to fill in the short gaps in the scaffold with the matching long-read sections. ...
2
votes
1
answer
78
views
Velvet Optimizer automatically changes to hash-length 31
I'm trying to use Velvet Optimizer for a de novo assembly; I set my hash-lengths to be between 55 and 69
...
2
votes
1
answer
40
views
How to find specific types of assemblies for specific species using entrez tools?
Task: Trying to specifically find transcriptomes and associated cDNA data for a list of species.
I can use this ...
1
vote
1
answer
61
views
How to subset an SRA file for a single chromosome?
I used prefetch to get the PacBio reads of chicken from the SRA database. I want to align these reads against a reference genome, but not all the reads. I am only interested in a particular region on ...
2
votes
2
answers
233
views
Improving prokaryotic assembly with other contig/scaffold-level data?
I have what at first sight appears to be a high-quality MAG (~10 pieces, high completion%) that I built from a hybrid assembly (Illumina + Nanopore data) from a cyanobacterium.
Workflow:
Quality ...
2
votes
1
answer
78
views
Sanger sequencing annotation error
I am a student in a cancer lab. Working with Sanger sequencing is new to me. While analyzing a report we found an insertion that has not been reported in any databases so far. We were working on checking if the ...
4
votes
1
answer
1k
views
How to solve Nextflow error: "Trace file already exists"?
When trying to run epi2me-labs/wf-artic, I get the following error:
...
2
votes
2
answers
69
views
Calling isoforms from long read data generated from partially degraded RNA
What would be the best tool to call isoforms from long-read data generated from partially degraded RNA? By mistake we processed some samples with poor-quality RNA to generate long reads. Now we are ...
2
votes
1
answer
38
views
Using very closely related strains to increase coverage for short read, de novo assembly
Do you think it's possible to combine short read Illumina libraries (WGS) from multiple closely related eukaryotic microbial strains (e.g. libraries from a re-sequencing study, >99% ITS1 sequence) ...
1
vote
1
answer
40
views
How is an X chromosome encoded into a fasta string?
A human "X" chromosome has a centromere and two "identical" chromatids. If the chromatids are not identical, this fact is not assembled, correct?
The fasta string for a chromosome ...
4
votes
1
answer
66
views
How to promote assemblies into genomes in NCBI?
Note: I've never submitted an assembly/genome to NCBI, so excuse me if my perspective is flawed.
I'm working with Drosophila subobscura (spring fruit fly).
I see here https://www.ncbi.nlm.nih.gov/data-...
1
vote
3
answers
98
views
How can I assemble my genome from raw files?
I've had my whole genome sequenced (at 30x average coverage) by a lab, and they have provided the raw files to me (BAM, FASTQ, and VCF).
How can I assemble it?
And does assembly provide any further ...
2
votes
1
answer
126
views
Need BAM file for Pilon
I just ran an assembly on yeast genomes using Flye and I want to polish those assemblies with Pilon but it requires a sorted BAM file.
How do I make a BAM file of the resulting assembled.fasta?
2
votes
1
answer
82
views
What is the best way to process yeast genomes?
I have obtained several hundred raw, unassembled yeast genomes from NCBI and I am looking for advice on how to process the genomes for downstream analysis.
I have a reference genome (S288C) to use for ...