How can I convert the following from a column:
1
2
3
4
5
6
.
.
.
.
98
99
100
To a row separated by a comma:
1,2,3,4,5,6,....,98,99,100
I am using Linux.
You may use the paste command as follows:
paste -sd, file.txt
By default, paste "write[s] lines consisting of the sequentially corresponding lines from each file, separated by tabs, to standard output" (quoted from the manual). The -d option sets an alternative output delimiter, and the -s option makes it paste the files one at a time instead of in parallel. With a single file, that means all of its lines are joined into one output line.
If the file comes from Windows, run dos2unix on it first. Also note that your real data might contain more than just numbers, and that can affect the tools we use. Please ensure that the example you give is as close to your real data as possible.
Here are a few ways (although Loïc's paste approach is probably the best):
tr
$ tr '\n' , < file
1,2,3,4,5,6,.,.,.,.,98,99,100,$
This will replace all newline characters with a comma, but that means that the final, trailing newline will also be changed, so you want to add that one back (note that this assumes GNU sed, but that's usually what you have on a Linux machine):
$ tr '\n' , < file | sed 's/,$/\n/'
1,2,3,4,5,6,.,.,.,.,98,99,100
Perl & sed
$ perl -pe 's/\n/,/' file | sed 's/,$/\n/'
1,2,3,4,5,6,.,.,.,.,98,99,100
Perl alone
$ perl -lne 'push @l,$_; END{print join ",",@l; }' file
1,2,3,4,5,6,.,.,.,.,98,99,100
Or:
$ perl -0777 -pe 's/\n/,/g; s/,$/\n/;' file
1,2,3,4,5,6,.,.,.,.,98,99,100
The -0777 tells perl to "slurp" the file, reading the whole thing into memory. Then, we replace all newlines with commas, and replace the final comma with a newline. The -pe means "print every line of the input file after applying the script given by -e".
Note that both of these approaches store the entire file in memory, so they might not be appropriate for very large files.
awk & sed
$ awk -v ORS="," '1' file | sed 's/,$/\n/'
1,2,3,4,5,6,.,.,.,.,98,99,100
Pure awk
(with thanks to Ed Morton who provided this approach in a comment)
$ awk '{printf "%s%s", sep, $0; sep=","} END{print ""}' file
1,2,3,4,5,6,.,.,.,.,98,99,100
The 1 is just shorthand for "print". In awk, the default action when a condition evaluates to true is to print the current line, so 1, which is always true, is often used.
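To see that shorthand in isolation: awk '1' with no other rules simply prints every input line, like cat.

```shell
# The condition 1 is always true and there is no action block,
# so awk falls back to its default action: print the current line.
printf 'a\nb\nc\n' | awk '1'
```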
You could also use perl -0777 -pe 's/\n(?!\z)/,/g;', with a negative lookahead to avoid replacing the newline at the end. The equivalent fix for the trailing comma in sed would be sed 's/,$//;G'. Btw, I'm missing the sed-only solution sed 'H;1h;$!d;x;s/\n/,/g'.
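For anyone puzzled by that hold-space one-liner, here is one reading of it, command by command (GNU sed):

```shell
# H         append the current line to the hold space (after a newline)
# 1h        but for line 1, overwrite the hold space (avoids a leading blank)
# $!d       on every line except the last, delete and start the next cycle
# x         on the last line, swap the accumulated hold space into the pattern space
# s/\n/,/g  replace the embedded newlines with commas, then auto-print
seq 5 | sed 'H;1h;$!d;x;s/\n/,/g'
```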
Using GNU datamash
:
$ datamash -t , transpose <file
1,2,3,4,5,6,.,.,.,.,98,99,100
This would also correctly transpose your data if it were made up of multiple comma-delimited columns (i.e., if it were a simple CSV format):
$ cat f
1,1,1,1
2,2,2,2
3,3,3,3
4,4,4,4
5,5,5,5
6,6,6,6
.,.,.,.
.,.,.,.
.,.,.,.
.,.,.,.
98,98,98,98
99,99,99,99
100,100,100,100
$ datamash -t , transpose <f
1,2,3,4,5,6,.,.,.,.,98,99,100
1,2,3,4,5,6,.,.,.,.,98,99,100
1,2,3,4,5,6,.,.,.,.,98,99,100
1,2,3,4,5,6,.,.,.,.,98,99,100
datamash collapse 1 would give the same output.
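For reference, the collapse operation looks like this (a sketch assuming GNU datamash is installed; collapse joins every value of the given field into one comma-separated list):

```shell
# collapse 1 gathers all values in field 1 into a single comma-separated row
seq 5 | datamash collapse 1
```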
Using Raku (formerly known as Perl6)
~$ raku -e 'lines.join(",").put;' file
Raku has a workhorse lines routine, which strips off line terminators by default, returning a Seq, a lightweight iterable data structure in Raku. If you need a comma-separated row, simply join on commas (otherwise you'll get output with elements separated by a single space).
Sample Input:
1
2
3
4
5
6
Sample Output:
1,2,3,4,5,6
In Raku, put and say will add the newline terminator back for you (say is intended for debugging and "human-readable" output rather than production scripts). Raku also has print, which does not add a newline at the end.
The default in Raku is basically equivalent to Perl's -l command-line flag, where newlines are auto-chomped upon input. Thus by using print for output you can come up with simple code in the same vein as the "Perl & sed" answer by @terdon (note that below, the ~ tilde is used for string concatenation, but you could just as easily write print "$_," instead):
~$ raku -ne 'print $_ ~ ",";' file.txt | raku -pe 's/\,$//;'
1,2,3,4,5,6
A sed solution, for completeness' sake:
% cat =commas
#!/bin/bash
# Other ways of doing it:
#sed -n 'H;${g;s/\n//;s//,/g;p}'
#sed -n '1h;1!H;${g;s/\n/,/g;p}'
sed ':b;N;$!bb;s/\n/,/g'
I think I'm going to start using Loïc's paste solution though.
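In case that last sed one-liner looks opaque, here is how I read it, annotated (GNU sed syntax for the ;-separated label):

```shell
# :b        define a label named b
# N         append the next input line to the pattern space
# $!bb      if this is not the last line, branch back to label b
# s/\n/,/g  once everything is slurped, turn the newlines into commas
seq 5 | sed ':b;N;$!bb;s/\n/,/g'
```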
Instead of that sed solution, you could just use sed -z 's/\n/,/g;s/,$//' file (GNU sed).
But printf 'foo,\0bar\nbaz\n' | sed -z 's/\n/,/g;s/,$//' turns into foo\0bar,baz, which isn't even newline-terminated. You can make assumptions that the data won't have this or won't have that, but I'd rather consider lines to be newline-terminated bytes of anything, and the output of line-oriented pipelines should be newline-terminated as well, unless there's a good reason for it to lack the newline.
A bash solution would be the two-liner
mapfile -t nums < file
( IFS=,; echo "${nums[*]}" )
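Spelled out as a self-contained script (mapfile needs bash 4+; the subshell keeps the IFS change from leaking into the rest of the script):

```shell
#!/bin/bash
# mapfile -t reads each input line into an array element, stripping newlines.
mapfile -t nums < <(seq 5)
# "${nums[*]}" joins the elements with the first character of IFS.
( IFS=,; echo "${nums[*]}" )
```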
Here's a solution (using Perl) that avoids reading the whole input file into memory at once:
perl -pe 'print "," if $. > 1; chomp unless eof'
The -p switch makes Perl wrap the program code (given directly afterwards as an argument to the -e switch) in a loop that reads lines from standard input (or from the files given as further command-line arguments, if any) into the "default input variable" $_ and prints them out after the code has executed, then repeats.
The print "," if $. > 1 command prints a comma (which will appear before the automatically printed output line) if (and only if) the number of lines read so far ($.) is greater than one. Meanwhile, chomp unless eof removes the trailing newline (if any) from the input line before it's printed, unless the end of the input (eof) has been reached. Thus, together, these two commands add a comma before all input lines except the first one and remove the trailing line break from all but the last line.
Paste is, to me, the best suited for your example (and datamash if you really need to "transpose" a file that could have multiple columns).
However, here is another way, if you do not have many elements:
xargs <file | tr ' ' ','
# xargs will output all elements on the same line, separated with a space,
# and tr will then change those separators into ","
# A big caveat of the xargs approach (often forgotten...):
# if you have a lot (thousands?) of elements,
# xargs may output several lines, and this will not work properly.
# (xargs tries to fit the elements on the same line until it reaches the length
# of a command line, and will output on another line to fit the next elements.)
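You can demonstrate the caveat by artificially shrinking the allowed command-line length with xargs's -s option:

```shell
# Small input: everything fits on one line, so the trick works.
seq 6 | xargs | tr ' ' ','      # -> 1,2,3,4,5,6
# Force a tiny limit: xargs now emits several lines, breaking the trick.
seq 100 | xargs -s 30 | wc -l   # more than one line
```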
You can use the tr command. Convert all newline characters to commas as follows (the input file is shown at the end):
$ cat ab.txt | tr '\n' ','
However, this also converts the final newline, leaving a trailing comma:
$ cat ab.txt | tr '\n' ','
1,2,3,4,5,6,7,8,9,10,11,
To remove the last comma, cut the output to 11 fields (columns):
$ cat ab.txt | tr '\n' ',' | cut -d "," -f1-11
1,2,3,4,5,6,7,8,9,10,11
Note: -d stands for delimiter here. It tells cut that the fields of its input are separated by commas.
ab.txt
:
1
2
3
4
5
6
7
8
9
10
11
tr was already covered in unix.stackexchange.com/a/754372/70524.