All Questions
Tagged with perl bioinformatics
13 questions
8 votes, 5 answers, 3k views
Removing rows containing NA in every column
I have a tab-delimited file which looks like this:
gene v1 v2 v3 v4
g1 NA NA NA NA
g2 NA NA 2 3
g3 NA NA NA NA
g4 1 2 3 2
The number of fields in every line is fixed and ...
2 votes, 2 answers, 191 views
compare columns from two different files and print records from the first file that do not match the second file
I would like to compare columns in file1 against file2: where column 2 of file1 does not match column 1 or 2 of file2, print the record from file1.
file1.
cat test.head20.R2.fastq.tab
@...
1 vote, 1 answer, 13k views
mkdir command not found?
I think I messed up the PATH. I was installing BioPerl and tried to change the path. The correct code should be:
$ export PATH=/usr/local/ActivePerl-5.26/bin:$PATH
$ export PATH=/usr/local/ActivePerl-5....
0 votes, 2 answers, 167 views
conditional replacement of rows with a number
I have a big file containing 27 columns and nearly 6 million rows. The following is a small example of my file:
head data
0.65 0.722222 1.0 0.75 0
0.35 0.277778 0.0 0.25 0
0 ...
2 votes, 1 answer, 154 views
sum each pair of rows and replace them with another value if the sum is less than a specific value
I have a tab-delimited genotype matrix with 2 million rows and 12 columns. Columns are individuals and rows are SNPs. I have 2 rows per SNP for each individual; one is the number of ...
-2 votes, 1 answer, 84 views
Remove the gap pattern from files [closed]
I want to remove gaps (-). If a run of more than 10 continuous gaps is found in all >Tem sequences at the same position, then remove all of those gaps and also remove the sequences or gaps from the query at the same position which are in ...
1 vote, 2 answers, 380 views
intersection between 2 files (values in file 1 which fall in range of values in file 2)
I have a file named snp_data containing SNP (Single-Nucleotide Polymorphism) chromosome data. This is a 3-column, whitespace-delimited file which has the following format:
user@host:~$ cat ...
3 votes, 2 answers, 738 views
Counting a specific consecutive character with its occurrence position and length
I have a sequence file and want to count runs of the consecutive character "N", along with each run's position of occurrence and length.
Say I have a file named mySequence.fasta like this:
>sequence-1
...
-1 votes, 3 answers, 103 views
Finding the different possible combinations
File A has rows of genes:
A,B,C,D,E
P,Q,R
G,D,V,K
L,Q,X,I,U,G and so on.
Taking each row at a time, how can one get the following type of output:
For the first row:
A,B,C
B,C,D
...
3 votes, 2 answers, 121 views
Organizing three dimensional data with awk/sed/perl
I have this file (a sparse matrix):
PC.354 OTU1 6
PC.354 OTU2 1
PC.356 OTU0 4
PC.356 OTU2 7
PC.356 OTU3 3
I want an output like this (dense matrix, classic .biom table):
OTU_ID PC.354 PC.355 PC....
1 vote, 1 answer, 477 views
Retrieving fasta sequences using bed file information from locally installed file
I have a .bed file containing around 30000 rows, for which I have retrieved the sequences using the fetch-sequences module of the rsat tool (http://rsat.ulb.ac.be/rsat/help.fetch-sequences.html#usage). [...
-2 votes, 3 answers, 355 views
Deleting lines which match a particular identifier from another file
I have 2 files. File 1 has an identifier (e.g. D7MHBF:11:1449:1988), and every new entry starts with @. It has a few more fields, which are not important in our analysis.
File 2 consists of a column of ...
1 vote, 1 answer, 244 views
How do I reinstall the bioperl modules on Ubuntu?
I am trying to learn bioinformatics the hard way. I have no background in Linux, Ubuntu, bash, Perl, Python, etc. I'm trying to use several programs, mostly the bioperl modules, that have been ...