I am trying to distinguish active accounts from dead ones and was wondering whether diff, together with grep, sed, or a regex, can be used instead of writing a long program.

File1 (usernames)                  File2 (emails)
janedoe                            [email protected]
johndoe                            [email protected]

Each file contains about 1000 items, and I need to do this frequently, once a week or so.

Task
- Check whether the usernames from File1 exist in File2. In the sample data above, they do.
- If they do exist, comment them out in File1.

In the past, I have used diff to compare files and regex to ignore lines. Unfortunately, I cannot work out how to ignore, or consider only, part of the string in the emails (since the @ and anything after it is not comparable).

Any assistance would be appreciated. :)

1 Answer

You can achieve that using any scripting language that has hashes, dictionaries, associative arrays, or whatever it calls the feature.

A very simple approach would be something like this:

$> cat File1
johndoe
janedoe
nosuchkid
$> cat File2
[email protected]
[email protected]
$> awk -F'@' 'FILENAME=="File2" { emails[$1]=$0; next}; { print ($1 in emails) ? $1 : "# "$1}' File2 File1
johndoe
janedoe
# nosuchkid

As you can probably see, this does not modify the input files; it only writes to stdout.
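
For readers not used to awk, the one-liner above can be spelled out over several lines. This is only a sketch of the same idea: the function name mark_missing and the array name seen are made up for illustration, and FILENAME == ARGV[1] is used here so that the email file does not have to be named inside the awk program.

```shell
# Hypothetical helper: mark_missing EMAILFILE USERFILE
# Prints USERFILE to stdout, putting "# " in front of every name that has
# no matching email. Neither input file is modified.
mark_missing() {
    awk -F'@' '
        # First operand (the email file): remember the part before the "@".
        FILENAME == ARGV[1] { seen[$1] = 1; next }

        # Second operand (the username file): a line without "@" is a single
        # field, so $1 is the whole name. Print it as-is if an email was
        # seen for it, commented out otherwise.
        { print (($1 in seen) ? $1 : "# " $1) }
    ' "$1" "$2"
}
```

Called as mark_missing File2 File1, it should print the same three lines as the one-liner. The question asks for the names that do exist to be commented out; swapping the two results of the conditional expression would give that marking instead.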

EDIT: Redirecting the output into a new file and then renaming that file to File1 makes the changes appear in the original file (making a backup of the original file first is always a good idea):

$> awk -F'@' 'FILENAME=="File2" { emails[$1]=$0; next}; { print ($1 in emails) ? $1 : "# "$1}' File2 File1 > File1.tmp && cp File1 File1.old && mv File1.tmp File1
$> cat File1
johndoe
janedoe
# nosuchkid
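
The replace-and-rename step can also be written so that it stops when awk fails, which keeps File1 from being overwritten with a partial result. A minimal sketch, assuming a POSIX shell; the function name update_userfile is hypothetical, and the .tmp and .old suffixes simply follow the command above.

```shell
# Hypothetical helper: update_userfile EMAILFILE USERFILE
# Rewrites USERFILE in place and keeps the previous version as USERFILE.old.
update_userfile() {
    # Write the marked list to a temporary file next to the original.
    if awk -F'@' '
            FILENAME == ARGV[1] { seen[$1] = 1; next }
            { print (($1 in seen) ? $1 : "# " $1) }
        ' "$1" "$2" > "$2.tmp"
    then
        # awk succeeded: back up the original, then move the result into place.
        cp -p "$2" "$2.old" && mv "$2.tmp" "$2"
    else
        # awk failed (for example, a missing input file): discard the result.
        rm -f "$2.tmp"
        return 1
    fi
}
```

With the sample files, update_userfile File2 File1 should leave the marked list in File1 and the untouched original in File1.old.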

EDIT2: Let's hard-code a little less and take the filenames from variables:

$> export PERSONFILE=File1 EMAILFILE=File2; awk -F'@' 'FILENAME==ENVIRON["EMAILFILE"] { emails[$1]=$0; next}; { print ($1 in emails) ? $1 : "# "$1}' "$EMAILFILE" "$PERSONFILE" > "$PERSONFILE.tmp" && cp "$PERSONFILE" "$PERSONFILE.old" && mv "$PERSONFILE.tmp" "$PERSONFILE"
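
If the same File1 is fed back in on the next weekly run, lines that were already commented out would get a second "# " in front. One possible way around that, assuming File1 contains no comments other than the ones written by this command, is to strip the marker before the lookup. The function name mark_accounts and its two positional arguments are made up for this sketch.

```shell
# Hypothetical helper: mark_accounts EMAILFILE USERFILE
# Like the command above, but safe to run again on its own output.
mark_accounts() {
    awk -F'@' '
        # Email file: remember the part before the "@".
        FILENAME == ARGV[1] { seen[$1] = 1; next }

        # Username file: drop a "# " marker left by an earlier run, then
        # decide again whether the name needs one.
        {
            name = $0
            sub(/^#[ \t]*/, "", name)
            if (name == "")
                print $0            # blank or marker-only line: keep as-is
            else
                print ((name in seen) ? name : "# " name)
        }
    ' "$1" "$2"
}
```

Because the marker is re-evaluated on every run, an account whose email shows up later is uncommented again.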
  • But awk cannot write the output to the input file, which is a necessity. Commented Mar 3, 2016 at 16:20
  • Nothing stops you from redirecting the output to a new file and renaming the new file to File1. Actually, that is what any program does internally when working on text files that have no dedicated byte for flag purposes.
    – user556625
    Commented Mar 3, 2016 at 16:23
  • Can you provide a solution using your answer? I am not familiar with awk and its properties. Commented Mar 3, 2016 at 16:24
  • Okay, did so. Actually, redirection and renaming require nothing beyond what the basic OS provides.
    – user556625
    Commented Mar 3, 2016 at 16:31
  • Can I assign the variables File1 and File2 to actual filenames, or do I have to type the filenames in the command? Commented Mar 3, 2016 at 16:38
