0
$\begingroup$

After genome assembly, some protocols call for removing scaffolds shorter than 500 or 1000 bp (some papers use one threshold, others use the other). Is this simply to remove artefacts and make later analysis smoother by dropping extraneous data? Also, how do we decide where the size cutoff should be?

$\endgroup$

1 Answer

2
$\begingroup$

I recognise my answer is late, but I hope it helps. The following has no empirical proof; it is anecdotal, from working with bacterial genomes:

I'm not sure what kind of genome you're working with, but in my work with bacteria we often delete scaffolds shorter than 1000 to 3000 bp because these are considered "noise". Essentially, the scaffolds are too small to yield much useful information, as they may carry only a single gene (or a handful of small genes). If you've combined long-read data with your short reads, those small scaffolds are often just artifacts of sequencing errors.
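
For what it's worth, the filtering step itself is simple. Below is a minimal sketch in plain Python, assuming the assembly is an ordinary FASTA file; the function names (`read_fasta`, `filter_by_length`) and the default `min_len=1000` are just illustrative choices of mine, not part of any particular protocol or tool.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(
    records: Iterable[Tuple[str, str]], min_len: int = 1000
) -> List[Tuple[str, str]]:
    """Keep only scaffolds whose sequence is at least min_len bp long."""
    return [(h, s) for h, s in records if len(s) >= min_len]


# Toy example: three scaffolds of 1200, 8 and 1000 bp.
fasta = [">a", "ACGT" * 300, ">b", "ACGT", "ACGT", ">c", "A" * 1000]
kept = filter_by_length(read_fasta(fasta), min_len=1000)
print([h for h, _ in kept])  # ['a', 'c']
```

For a real assembly you would pass an open file handle to `read_fasta` instead of the toy list and write the kept records back out.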

$\endgroup$
2
  • 1
    $\begingroup$ Oh, and for the cutoffs: I work with metagenomes, so I tend to base my cutoff on my assembly N50. Say the N50 is 2100 bp: I will round that to 2000 and remove any scaffolds shorter than that. It seems to help me keep the majority of the data whilst minimizing noise. Each case will be different though :) $\endgroup$
    – Rainman
    Commented Oct 6, 2023 at 19:20
  • $\begingroup$ Agree with what @rainman has said. Depending on the organism, you might want to go lower, e.g. if the genome is tiny you might drop the cutoff. It's also worth noting that there are scaffolding tools that help bridge the gaps between contigs. $\endgroup$
    – Ammar
    Commented Oct 9, 2023 at 0:36
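
The N50-based cutoff described in the comments can be sketched as follows. This is only an illustration: N50 is computed in the usual way (the length of the scaffold at which the cumulative sum of lengths, sorted longest first, reaches half the total assembly size), while rounding to the nearest 1000 bp is my assumption about what "round that to 2000" means. The function names are hypothetical.

```python
from typing import List


def n50(lengths: List[int]) -> int:
    """Return the N50 of a list of scaffold lengths."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # reached half of the assembly size
            return length
    return 0  # unreachable for non-empty input


def cutoff_from_n50(lengths: List[int], step: int = 1000) -> int:
    """Round the N50 to the nearest multiple of step (assumed rule)."""
    return (n50(lengths) + step // 2) // step * step


lengths = [3000, 2100, 2100, 900, 500, 400]
print(n50(lengths))              # 2100
print(cutoff_from_n50(lengths))  # 2000
print([n for n in lengths if n >= cutoff_from_n50(lengths)])  # [3000, 2100, 2100]
```

Note that with this rounding rule an N50 below 500 bp gives a cutoff of 0, i.e. nothing is removed, so for very fragmented assemblies you would want to pick the threshold by hand.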
