Questions tagged [reads]

Reads are the sequences output by a sequencing machine after the raw signal (e.g. light, electricity) is converted into bases by a basecaller.

6 votes
1 answer
84 views

How many false positive duplicates are marked using just the position of first unclipped base?

In the popular Picard MarkDuplicates tool, a read is marked as a duplicate if it has the same position as another read starting from their first unclipped base in ...
Ricky
  • 63
1 vote
0 answers
30 views

Determining fragment mean and fragment stdev for MaSuRCA config file

Similar to this unanswered question on Biostars, I am using MaSuRCA for the first time and want to know how other MaSuRCA users are determining fragment mean and fragment stdev. My understanding is ...
juliadouglasf
2 votes
1 answer
763 views

If fastp output is not a good measure of FASTQ correctness, what is?

At the beginning of my pipeline, I just fed the paired reads (2 files) into fastp with the default options, and assumed it would do a good job preparing the reads for the next step: alignment. But I ...
gl00ten
  • 249
1 vote
1 answer
759 views

Comparison of fastq files reads

My goal is to compare reads from two different fastq files on a Linux machine. The following are the comparisons to perform: How many common reads are between the two fastq files? How many reads are ...
utr
  • 11
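
For the "how many common reads" part of the question above, one possible approach is to compare the two files as sets of sequences. This is only a minimal sketch under stated assumptions: each FASTQ record is taken to span exactly four lines, "common" is taken to mean an identical sequence string (not an identical read name), and duplicate sequences within a file collapse to one. The function names and the toy records are hypothetical and not from the original post.

```python
from itertools import islice


def read_sequences(lines):
    """Collect the set of distinct sequences from 4-line FASTQ records."""
    seqs = set()
    it = iter(lines)
    while True:
        record = list(islice(it, 4))  # header, sequence, '+', qualities
        if len(record) < 4:
            break  # end of input (or a truncated trailing record)
        seqs.add(record[1].strip())
    return seqs


def compare_fastq(lines_a, lines_b):
    """Count distinct sequences per input and those shared by both."""
    a = read_sequences(lines_a)
    b = read_sequences(lines_b)
    return {"distinct_a": len(a), "distinct_b": len(b), "common": len(a & b)}


# Toy in-memory records (hypothetical data), one list item per line.
fq_a = ["@r1", "ACGT", "+", "IIII", "@r2", "TTTT", "+", "IIII"]
fq_b = ["@x1", "ACGT", "+", "IIII", "@x2", "GGGG", "+", "IIII"]
result = compare_fastq(fq_a, fq_b)
print(result)
```

With real data, open file handles can be passed instead of lists, for example `compare_fastq(open("a.fastq"), open("b.fastq"))`, where both file names are placeholders. Holding every sequence in memory may not scale to very large files.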
2 votes
1 answer
253 views

Filtering paired-end reads with sambamba: avoid discarding reads on the minus strand

I have a BAM file (DNA, shallow whole genome sequencing at ~1X) where I want to filter reads (using sambamba) to keep only those which have a template length > 20 and mapping quality > 20, ...
Einar
  • 131
1 vote
0 answers
294 views

Extract read names and the associated nucleotides on specific positions from a BAM file (in R)

Let's assume I have a BAM file and several positions that I would like to examine more closely in this alignment. My goal is to find out whether these positions are on the same reads and which ...
wejt
  • 11
2 votes
1 answer
53 views

Connection between Detected Genes and The Read Counts

I have been trying to understand Seurat for analysing scRNA-seq data. It occurs to me that the main data is organised in the Seurat object with rows as genes and columns as the cells, and the ...
MK Huda
  • 163
0 votes
1 answer
123 views

Combining read counts from three separate GEO studies

I want to do differential expression analysis with DESeq2. I have three read count files downloaded from GEO (small RNA-seq based) where the number of miRNAs and their IDs are nearly the same. These studies ...
Megha
  • 395
2 votes
1 answer
1k views

What is the right way of calculating a Phred score by hand?

I am trying to calculate mean Phred scores for my sequencing data, but I do not feel very comfortable about it. There are actually two ways of calculating. (I just use an existing sample) giving: 3 reads ...
CoDa
  • 45
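
The excerpt above is truncated, so the two calculations the asker has in mind are not shown. A common pair, assumed here, is (1) the arithmetic mean of the Phred scores themselves and (2) converting each score to an error probability with p = 10^(-Q/10), averaging the probabilities, and converting back. The sketch below contrasts the two on hypothetical quality values; the function names are illustrative only.

```python
import math


def phred_to_prob(q):
    """Error probability for a Phred score: p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def prob_to_phred(p):
    """Phred score for an error probability: Q = -10 * log10(p)."""
    return -10 * math.log10(p)


def mean_phred_naive(quals):
    """Arithmetic mean taken directly on the Phred (log) scale."""
    return sum(quals) / len(quals)


def mean_phred_via_prob(quals):
    """Average the error probabilities, then convert back to Phred."""
    mean_p = sum(phred_to_prob(q) for q in quals) / len(quals)
    return prob_to_phred(mean_p)


quals = [10, 20, 30]  # hypothetical scores, not the asker's data
naive = mean_phred_naive(quals)
via_prob = mean_phred_via_prob(quals)
print(f"mean of scores: {naive:.2f}")
print(f"mean via error probabilities: {via_prob:.2f}")
```

Because Phred scores are logarithmic, the two results differ: averaging on the probability scale is dominated by the lowest-quality bases, so it gives the lower value here.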
0 votes
1 answer
55 views

Modeling number of reads mapped to a gene

I am looking for a probability distribution for the number of reads mapped to a particular gene in metagenomic sequencing (NGS, shotgun, likely Illumina). Naively one could model it via a binomial (or ...
Roger V.
  • 381
1 vote
1 answer
1k views

What is "unmapped read segments" in the output of samtools idxstats?

samtools idxstats produces a four column output (see here) ...
Roger V.
  • 381
1 vote
1 answer
209 views

How to extract reads with INDELs > a given size?

I'm trying to modify this https://www.biostars.org/p/253774/ to get reads with deletions > 20bp. I think this gives reads with exactly 20bp dels: ...
Liam McIntyre
1 vote
0 answers
48 views

How to control/normalize for number of reads when calling SNPs using RNA-Seq?

I used the GATK pipeline to call SNPs on males and females using RNA-Seq data. But the males have a higher read count (~43-46M reads) than the females (~40-42M reads). This causes SNP counts to be ...
Balan
  • 86
2 votes
1 answer
68 views

If a gene is expressed at a level of 1/1200 compared to the average gene, how is probability 50:50 that we have a read mapped to it?

I am reading a book about RNAseq analysis and it says "To calculate the probability that a read will map to a specific gene, we can assume an average gene size of 4000 nt (100 M nt divided by 25,...
Bio314
  • 21
0 votes
2 answers
381 views

Does rRNA depletion protocol give higher number of mapped reads in Intronic regions?

Recently, I downloaded a publicly available dataset consisting of 350 tumor samples. I see the following information in the published paper. They used Ribo Zero Gold and rRNA was depleted. Strand ...
maven
  • 1