I have a csv file that get's SFTP'd to a RHEL box I have setup. It then gets transformed a little before another process SFTP's in and grabs it sending it off to be imported to a DB of a 3rd party.
I need to have a shell script open file one, take the id & code fields.. search file two for them.. grab the pay field from file 2 and insert into the pay field of file one.
Example:
File1.csv
501,101,
505,101,
507,103,
File2.csv
501,101,5.50
505,101,6.25
507,103,1.25
I want the 5.50 in File2 to be inserted into File1 in the field after 101. Obviously search through File1 line by line and repeat.
Doing some searching, I've come across the join command which seems like it could work but I have 2 common fields I have to join against and I looked at the documentation and it seems that only 1 common field is supported.
Something like:
join -t, -0 1.1 1.2 2.3 -1 1 -2 1
That takes care of the first lookup.. but I need to match both columns 1 & 2 in order to get my desired result.
UPDATE
So in doing some more digging I've been playing with Join & Awk together to make this work. Feel like I am close but just missing something as my output right now is blank.
My actual datafile is laid out differently than my basic example above. Here is a better representation:
File1.csv
Column1,Column2,Column3,Column4,Column5,Column6,Column7
File2.csv
Column1,Column2,Column3,Column4,Column5
I want to set the value of Column6 in File1 to the Value of Column4 in File2 should there be a match of Columns 1,4,7 in File1 and Columns 2,5,3 in File2
My attempt thus far is:
join -t, -j1 -0 1.1 1.2 1.3 1.4 1.5 2.6 1.7 <(<file1.csv awk -F ","
'{print $1"-"$4"-"$7","$0}' | sort -k1,1) <(<file2.csv awk -F "," '{print ($2"-"$5"-"$3","$0}' | sort -k1,1) > out.csv
I still want to include the entries that don't have a match. If I include
-a1
it sort of accomplishes this but the field i am matching to is blank. Instead I want it to just retain the original data.
join
command (that allegedly takes care of the first lookup) does not seem valid. Do you meanjoin -t, -o '1.1 1.2 2.3' -1 1 -2 1 File1.csv File2.csv
? (2) I think the example files are not really a good example because for them the first lookup is enough; or evencat File2.csv
is enough. (3) If you replace the first comma in each line with_
then you will have just one common field. In the result replace back. Can you take it from here?