1
$\begingroup$

I assembled Nanopore reads with Flye and visualized the resulting GFA graph in Bandage, but I don't really know how to interpret the result. For context, this is a yeast genome (DNA sequencing). My goal is to identify repetitive regions, the ploidy level, and the overall genome structure of the strain. So far I understand that the bubbles are repetitive areas, but I mainly want to know how to interpret a graph like this.

[Image: Bandage visualization of the Flye assembly graph]

I'm new to interpreting these graphs and couldn't find a guide online. If someone has resources they'd be willing to share, or could help interpret this graph, I would really appreciate it!

$\endgroup$
2
  • $\begingroup$ Please edit your question and add some detail. To be honest, I don't have a clue what this is, I've never seen this sort of thing before, so your question might be clearer to someone who knows more, but adding some detail can't hurt. For example, what organism did you sequence? What kind of sequencing (RNA? DNA?). Also, why did you plot the GFA? What question are you asking of your data? $\endgroup$
    – terdon
    Commented Aug 10, 2023 at 17:17
  • 1
    $\begingroup$ I've used Bandage to visualise transcriptome assemblies, but I'm not sure how to answer your question/s about repeats, ploidy, and structure sorry. If you don't get an answer here, I suggest reaching out to the author - Ryan Wick (email: [email protected] / mastodon: @[email protected] / github: github.com/rrwick). Good luck! $\endgroup$ Commented Aug 14, 2023 at 5:39

1 Answer

1
$\begingroup$

I would suggest making a dot plot of your assembly against a reference assembly from the same or a closely related yeast.
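To make the dot plot idea concrete, here is a minimal sketch of what such a plot is built from: shared k-mers between two sequences, recorded as coordinate pairs. The function names, the toy sequences, and the choice of `k` are illustrative assumptions, not part of any particular tool. For whole genomes you would normally use a dedicated aligner and plotting tool rather than this naive approach.

```python
from collections import defaultdict

# Translation table for complementing DNA; N is left as N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmer_matches(query, target, k=5):
    """Return (forward, reverse) lists of (query_pos, target_pos) k-mer hits.

    Forward hits are identical k-mers; reverse hits are k-mers whose
    reverse complement occurs in the target.
    """
    # Index every k-mer start position in the target.
    index = defaultdict(list)
    for j in range(len(target) - k + 1):
        index[target[j:j + k]].append(j)

    forward, reverse = [], []
    for i in range(len(query) - k + 1):
        kmer = query[i:i + k]
        forward.extend((i, j) for j in index.get(kmer, []))
        reverse.extend((i, j) for j in index.get(revcomp(kmer), []))
    return forward, reverse


# Toy example (made-up sequences): the query is a substring of the target.
target = "TTGGGACCTTAGAA"
query = "GGGACCTTAG"
forward, reverse = kmer_matches(query, target, k=5)
print("forward hits:", forward)
print("reverse hits:", reverse)
```

Plotting the forward and reverse pairs as two scatter series gives the dot plot. Collinear regions show up as diagonal lines, inversions as anti-diagonals, and repeats as several parallel lines at the same query or target position.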

I'd also suggest running QUAST against the yeast reference to get a finer-grained comparison.

Bandage is more of an assembly QC tool, useful for spotting problems with the assembly. It will not tell you anything about ploidy or similar properties.

Off the top of my head, I find it worrisome that 90% of your sequence is in a single scaffold/contig. Presumably the genome has more than a single chromosome, so separate chromosomes appear to have ended up in one sequence. It does look like the repeats are causing problems for the assembler.
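To check a figure like that 90% directly, rather than reading it off the picture, you can total the segment lengths in the GFA. The sketch below is a minimal example under stated assumptions: it reads only `S` lines, takes the length from an optional `LN:i:` tag when present and from the sequence otherwise, and uses a made-up in-memory graph instead of a real file. The function names are hypothetical.

```python
def segment_lengths(gfa_lines):
    """Map segment name -> length for every S line in GFA text."""
    lengths = {}
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] != "S" or len(fields) < 3:
            continue  # skip headers, links, paths, and malformed lines
        name, seq = fields[1], fields[2]
        # Prefer the optional LN:i: tag; fall back to the sequence itself.
        length = None
        for tag in fields[3:]:
            if tag.startswith("LN:i:"):
                length = int(tag[5:])
                break
        if length is None:
            length = 0 if seq == "*" else len(seq)
        lengths[name] = length
    return lengths


def largest_fraction(lengths):
    """Return (name, fraction of total length) for the longest segment."""
    total = sum(lengths.values())
    if total == 0:
        raise ValueError("no segment lengths found")
    name = max(lengths, key=lengths.get)
    return name, lengths[name] / total


# Toy graph (made-up data); for a real run, pass an open file handle instead,
# e.g. open("assembly_graph.gfa"), where the path is an assumption.
toy_gfa = [
    "H\tVN:Z:1.0",
    "S\tedge_1\tACGTACGTAC",
    "S\tedge_2\t*\tLN:i:90",
    "L\tedge_1\t+\tedge_2\t+\t0M",
]
lengths = segment_lengths(toy_gfa)
name, fraction = largest_fraction(lengths)
print(f"{name} holds {fraction:.0%} of the total segment length")
```

Note that segments in an assembly graph are graph edges, not necessarily final contigs, so this is only a rough check. If one segment dominates the total for a genome expected to have many chromosomes, that is the kind of result worth following up with the reference comparison.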

I'd suggest using hifiasm instead and seeing if it makes different decisions.

$\endgroup$
