Sorry in advance if this is too basic; I am a novice. I have two CSV files (file1 and file2). For each line of file1, I want to look up column 6 in column 1 of file2 and, if there is a match, append the remaining columns of the matching file2 line to the end of the file1 line, writing the result to a new file. Below are what my files look like and what I would like the output to look like. I tried the following, but it doesn't work: the output is empty.
sort -k6 file1 > file1_sorted
sort -k1 file2 > file2_sorted
join -1 6 -2 1 -o 1.1,1.2,1.3,1.4,1.5,1.6,2.1,2.2,2.3,2.4,2.5,2.6,2.7,2.8,2.9,2.10,2.11,2.12,2.13 file1_sorted file2_sorted > file3
This could be because the join is wrong, because file1 is not sorted properly, or both. file2 looks okay, but file1 seems to have some empty spaces after the last column, and when I try to pick column 6 of file1 via
awk '{print $6}' file1 > test
it doesn't work, and I have no idea why. I tried removing tabs using sed etc., but no luck. Please help!
File 1 (5052 lines, 6 columns)
rs28595482,1,1,1953576,ENSG00000187730,GABRD
rs2376805,1,1,1956362,ENSG00000187730,GABRD
rs2229110,1,1,1957037,ENSG00000187730,GABRD
rs3820007,1,1,1957299,ENSG00000187730,GABRD
rs28409373,1,1,1959978,ENSG00000187730,GABRD
rs2376803,1,1,1967954,NA,GABRD
rs11582799,1,1,7832026,ENSG00000269925,VAMP3
File 2 (344 lines, 13 columns)
GABRD,16,0,0,gaba,0,0,oxt,0,0,0,0,0
ABCG1,21,0,0,0,0,cort,0,0,0,0,0,0
VAMP3,0,0,0,0,0,0,0,0,0,0,0,0
ADAMTS2,0,0,0,0,0,0,0,0,0,0,0,0
ADAMTSL1,9,0,0,0,0,0,oxt,0,0,0,0,rest
ADCY7,16,0,0,0,0,cort,0,0,0,0,0,0
What I expect to get after some magic (file 3)
rs28595482,1,1,1953576,ENSG00000187730,GABRD,16,0,0,gaba,0,0,oxt,0,0,0,0,0
rs2376805,1,1,1956362,ENSG00000187730,GABRD,16,0,0,gaba,0,0,oxt,0,0,0,0,0
rs2229110,1,1,1957037,ENSG00000187730,GABRD,16,0,0,gaba,0,0,oxt,0,0,0,0,0
rs3820007,1,1,1957299,ENSG00000187730,GABRD,16,0,0,gaba,0,0,oxt,0,0,0,0,0
rs28409373,1,1,1959978,ENSG00000187730,GABRD,16,0,0,gaba,0,0,oxt,0,0,0,0,0
rs2376803,1,1,1967954,NA,GABRD,16,0,0,gaba,0,0,oxt,0,0,0,0,0
rs11582799,1,1,7832026,ENSG00000269925,VAMP3,0,0,0,0,0,0,0,0,0,0,0,0
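For reference, here is a minimal sketch of one way the lookup described above might be done. Note that both join and awk split fields on whitespace by default, so comma-separated files need join -t, or awk -F, to see column 6 at all. The sketch uses awk instead of sort and join, so no sorting is needed and the order of file1 is kept. The function name merge_by_gene is hypothetical, and the sketch assumes the stray characters at the end of file1 lines are spaces, tabs, or carriage returns, which I have not confirmed.

```shell
# Hypothetical helper (name chosen for illustration): merge_by_gene FILE1 FILE2
# Appends the non-key columns of FILE2 to every FILE1 line whose column 6
# equals column 1 of a FILE2 line. Unmatched FILE1 lines are skipped.
merge_by_gene() {
    awk -F, -v OFS=, '
        # Assumption: trailing junk is spaces, tabs, or carriage returns.
        { sub(/[ \t\r]+$/, "") }

        # First file given to awk (FILE2): store the rest of the line by key.
        NR == FNR {
            key = $1
            sub(/^[^,]*,/, "")      # drop the key and its comma
            rest[key] = $0
            next
        }

        # Second file (FILE1): print the line plus the stored columns.
        $6 in rest { print $0, rest[$6] }
    ' "$2" "$1"
}
```

It would be called as merge_by_gene file1 file2 > file3. If lines of file1 without a match in file2 should also be kept, the last rule would need to be changed to print them unmodified.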