$\begingroup$

I have 96 FASTA files (A1, A2, A3, ...) from one plasmid assembly pipeline and another 96 FASTA files (B1, B2, B3, ...) from a second plasmid assembly pipeline. I would like to compare them pairwise (A1 to B1, A2 to B2, A3 to B3, etc.) and collect some metrics to see how the two assembly methods differ.

I am thinking of running blastn -query A1 -subject B1, but this produces a lot of visual alignment output rather than detailed alignment metrics.

Thank you for your suggestions.

$\endgroup$
  • $\begingroup$ OK, I think I will just stay with blastn and run: blastn -query A1 -subject B1 | grep -E "Score|Identities|Strand" > res_A1_B1 That gives a report like this in res_A1_B1: Score = 1.048e+04 bits (5674), Expect = 0.0 Identities = 5674/5674 (100%), Gaps = 0/5674 (0%) Strand=Plus/Minus I can then collect these reports into tables. $\endgroup$
    – cautree
    Commented Apr 12, 2023 at 11:26

1 Answer

$\begingroup$

It's easy: use output format 6 (tabular), which gives you a table that's easy to parse:

blastn -db plasmid2.db -query plasmid1.fasta -evalue 1e-6 -outfmt "6 pident evalue bitscore"

I don't know which fields you prefer; they are set in the quoted part of -outfmt:

... "6 pident evalue bitscore" ...

Replace pident, evalue and bitscore above with the fields you want. You might want to add staxid or a similar identifier to tell you what each hit is. A selection of the available fields:

staxid ssciname qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qcovs sstrand
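
For the pairwise case in the question, something along these lines might work. It is only a sketch: the file names (A1.fasta ... A96.fasta, B1.fasta ... B96.fasta), the function name compare_pairs and the output file name are assumptions, and it uses -subject (as in the question) so that no database has to be built. Each hit line is prefixed with a pair label so all 96 comparisons end up in one table.

```shell
# Sketch only: assumes files named A1.fasta..A96.fasta and B1.fasta..B96.fasta
# in the current directory; adjust the names to match your pipelines.
compare_pairs() (
    # The body runs in a subshell, so these variables stay local.
    n_pairs="$1"    # number of pairs, e.g. 96
    out="$2"        # combined output table (hypothetical name)
    fields="qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qcovs sstrand"

    # Header: a pair label followed by the selected BLAST columns.
    printf 'pair\t%s\n' "$(printf '%s' "$fields" | tr ' ' '\t')" > "$out"

    i=1
    while [ "$i" -le "$n_pairs" ]; do
        a="A${i}.fasta"
        b="B${i}.fasta"
        if [ -f "$a" ] && [ -f "$b" ]; then
            # -subject compares two FASTA files directly, no database needed.
            blastn -query "$a" -subject "$b" -evalue 1e-6 -outfmt "6 ${fields}" |
                awk -v pair="A${i}_B${i}" '{ print pair "\t" $0 }' >> "$out"
        else
            # Report a missing file and carry on with the next pair.
            echo "skipping pair ${i}: ${a} or ${b} not found" >&2
        fi
        i=$((i + 1))
    done
)
```

Calling compare_pairs 96 all_pairs.tsv would then write one tab-separated table with a header row, which is easy to load into R, pandas or a spreadsheet.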

You might want to make the command a bit fancier:

blastn -db plasmid2.db -query plasmid1.fasta -evalue 1e-7 -outfmt "6 pident evalue bitscore" | sort | uniq -c | sort -n >> myplasmid.txt
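
As an alternative to counting identical rows, the hits could be collapsed into one summary line per query/subject pair. The sketch below assumes the input is the default 12-column table produced by a plain -outfmt 6 (qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore); the function name summarize_hits is hypothetical. It reports the number of hits, the total aligned length, the length-weighted mean identity and the best bitscore.

```shell
# Sketch: summarize default "-outfmt 6" output per query/subject pair.
# Reads the files given as arguments, or stdin if none are given.
summarize_hits() {
    awk -F '\t' '
        {
            key = $1 "\t" $2              # qseqid and sseqid
            n[key]++                      # number of hits (HSPs)
            aln[key] += $4                # total alignment length
            ident[key] += $3 * $4         # identity weighted by length
            if (!(key in best) || $12 + 0 > best[key])
                best[key] = $12 + 0       # highest bitscore seen so far
        }
        END {
            for (key in n)
                printf "%s\t%d\t%d\t%.2f\t%.1f\n",
                    key, n[key], aln[key], ident[key] / aln[key], best[key]
        }
    ' "$@" | sort
}
```

For example, blastn -query A1.fasta -subject B1.fasta -outfmt 6 | summarize_hits would print columns qseqid, sseqid, hit count, aligned length, mean identity and best bitscore. Note that overlapping hits are counted more than once in the aligned length, so treat it as a rough measure rather than true coverage.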
$\endgroup$
