All Questions
15 questions
2 votes, 0 answers, 49 views
1 vote, 0 answers, 15 views
DNASTAR viral-host integration assembly keeps failing
I have two NGS files from an NGS company corresponding to the sequencing data from a tumor sample as follows:
TB_7710391_R1.FASTQ.gz
TB_7710391_R2.FASTQ.gz
I have downloaded the genome for MCPyV as ...
2 votes, 1 answer, 78 views
Sanger sequencing annotation error
I am a student in a cancer lab, and working with Sanger sequencing is new to me. While analyzing a report, we found an insertion that has not been reported in any database so far. We were working on checking if the ...
0 votes, 1 answer, 218 views
Why must a maximal non-branching path be a contig?
The following is from Bioinformatics Algorithms:
Fortunately, we can derive contigs from the de Bruijn graph. A path in a graph is called non-branching if in(v) = out(v) = 1 for each intermediate ...
1 vote, 1 answer, 118 views
Quast duplication ratio and mismatches percent
When analyzing Quast results it seems that it doesn't calculate mismatches and indels in a useful way if the "Duplication ratio" is over 1.
For example, that's what I get for an assembly ...
1 vote, 2 answers, 108 views
Bacterial DNA at the tail of transcriptome reads. What does that mean?
I am assembling a transcriptome obtained from the Internet. The transcriptome was extracted from a human cancer tissue that had been previously grafted into a mouse. I have detected that many ...
2 votes, 1 answer, 260 views
What is the meaning of these misaligned reads in a sequencing run?
I am analyzing some SARS-CoV-2 sequencing runs and often find read alignments like the one in the image.
...
2 votes, 1 answer, 237 views
Coverage required
I came across a problem during an exercise in a book and I don't really know how to solve it. I feel like something's missing.
"coverage, c = $NL/G$ (N=number of reads, L=read length, G=genome ...
3 votes, 2 answers, 346 views
Separation of mixed plasmid DNA sequences post whole-plasmid sequencing
Imagine a DNA sample containing a mixture of different intact plasmids. These samples are sequenced using either MiSeq or HiSeq sequencing. Would it be possible to assemble these plasmids post-sequencing ...
2 votes, 0 answers, 44 views
Calculating alignment/mapping time
I am trying to assemble a plant genome with Velvet on AWS resources. The plant genome is huge (> 10 times the human genome) and coverage is around 30x. We are planning a de novo assembly with Velvet (...
2 votes, 1 answer, 224 views
E. coli Sequencing & Analysis
I have been given the task of assembling a 'new' E. coli genome and analysing the genes present etc.
The E. coli is a new strain, and has been run on a NextSeq 500 in high-output mode with ...
2 votes, 0 answers, 76 views
Assembling sequence data generated by RADseq [closed]
I'm working on a project where I have to assemble sequences generated by RADseq. In the end, I hope to compare two species of woodpeckers in Sri Lanka using SNPs.
I tried to assemble it using ...
19 votes, 3 answers, 699 views
How to deal with heterozygosity during polishing of genome assembly based on long reads?
All the long-read sequencing platforms are based on single-molecule sequencing, which causes higher per-base error rates. For this reason, a polishing step was added to genome assembly pipelines - ...
35 votes, 2 answers, 3k views
Why do some assemblers require an odd-length k-mer for the construction of de Bruijn graphs?
Why do some assemblers like SOAPdenovo2 or Velvet require an odd-length k-mer size for the construction of the de Bruijn graph, while other assemblers like ABySS are fine with even-length k-mers?
14 votes, 3 answers, 530 views
How to make a distinction between the "classical" de Bruijn graph and the one described in NGS papers?
In Computer Science a De Bruijn graph has (1) m^n vertices representing all possible sequences of length n over ...