All Questions

8 votes
5 answers
3k views

Removing rows containing NA in every column

I have a tab delimited file which looks like this: gene v1 v2 v3 v4 g1 NA NA NA NA g2 NA NA 2 3 g3 NA NA NA NA g4 1 2 3 2 The number of fields in every line is fixed and ...
user3138373
  • 2,559
2 votes
2 answers
191 views

Compare columns from two different files and print the records from the first file that do not match the second file

I would like to compare columns between file one and file two. Records from file1 should be printed when column 2 of file1 does not match column 1 or 2 of file2. file1. cat test.head20.R2.fastq.tab @...
RKK
  • 77
1 vote
1 answer
13k views

mkdir command not found?

I think I messed up the PATH. I was installing Bioperl and tried to change the PATH. The correct code should be: $ export PATH=/usr/local/ActivePerl-5.26/bin:$PATH $ export PATH=/usr/local/ActivePerl-5....
rongrong
0 votes
2 answers
167 views

conditional replacement of rows with a number

I have a big file containing 27 columns and nearly 6 million rows. The following is a little example of my file head data 0.65 0.722222 1.0 0.75 0 0.35 0.277778 0.0 0.25 0 0 ...
Anna1364
  • 1,036
2 votes
1 answer
154 views

Compute the sum of each pair of rows and replace them with another value if the sum is less than a specific value

I have a genotype matrix (tab-delimited), with 2 million rows and 12 columns. Columns are individuals and rows are SNPs. I have 2 rows per SNP for each individual, one is the number of ...
Anna1364
  • 1,036
-2 votes
1 answer
84 views

Remove the gap pattern from files [closed]

I want to remove gaps (-). If a run of more than 10 consecutive gaps is found in all >Tem sequences at the same position, then remove all of those gaps and also remove the sequence or gaps from the query at the same position which are in ...
s kumar
  • 13
1 vote
2 answers
380 views

Intersection between 2 files (values in file 1 which fall in the range of values in file 2)

I have a file named snp_data containing SNP (Single-Nucleotide Polymorphism) chromosome data. This is a 3-column, white-space delimited CSV file which has the following format: user@host:~$ cat ...
Anna1364
  • 1,036
3 votes
2 answers
738 views

Counting a specific consecutive character with its occurrence position and length

I have a sequence file and want to count runs of the consecutive character "N", with each run's position of occurrence and length. Say I have a file named mySequence.fasta like this: >sequence-1 ...
Budding-bioinformatician
-1 votes
3 answers
103 views

Finding the different possible combinations

File A has rows of genes: A,B,C,D,E P,Q,R G,D,V,K L,Q,X,I,U,G and so on. Taking each row at a time, how can one get the following type of output: For the first row: A,B,C B,C,D ...
docmart
3 votes
2 answers
121 views

Organizing three dimensional data with awk/sed/perl

I have this file (a sparse matrix): PC.354 OTU1 6 PC.354 OTU2 1 PC.356 OTU0 4 PC.356 OTU2 7 PC.356 OTU3 3 I want an output like this (dense matrix -classic .biom table): OTU_ID PC.354 PC.355 PC....
Lika
  • 33
1 vote
1 answer
477 views

Retrieving fasta sequences using bed file information from locally installed file

I have a .bed file containing around 30000 rows for which I have the sequences retrieved using fetch-sequences module of the rsat tool (http://rsat.ulb.ac.be/rsat/help.fetch-sequences.html#usage). [...
biobudhan
  • 547
-2 votes
3 answers
355 views

Deleting lines which match a particular identifier from another file

I have 2 files. File 1 has an identifier (e.g. D7MHBF:11:1449:1988) and every new entry starts with @. It has a few more fields, which are not important in our analysis. File 2 consists of a column of ...
Srilakshmi
1 vote
1 answer
244 views

How do I reinstall the bioperl modules on Ubuntu?

I am trying to learn bioinformatics the hard way. I have no background in Linux, Ubuntu, bash, Perl, Python, etc. I'm trying to use several programs, mostly the bioperl modules, that have been ...
Susanne
  • 11