I have a situation where I need to compare two strings with each other in a foreach loop that potentially runs over millions of rows of data. Each string always contains between 1 and 12 different parameters, but they arrive as a single comma-separated string. The strings aren't necessarily sorted either, so they could look like:
"Parameter1, Parameter2, Parameter3, Parameter4"
"Parameter2, Parameter1, Parameter4, Parameter3"
"Parameter5, Parameter1, Parameter6, Parameter2"
etc.
I need to compare two of these and check whether they both contain the same parameters. My current approach is to split the strings by comma into arrays, sort the arrays, re-join them into strings, and then compare the strings, like:
$Array1 = $String1 -split ", " | Sort-Object
$CompareString1 = $Array1 -join ", "
$Array2 = $String2 -split ", " | Sort-Object
$CompareString2 = $Array2 -join ", "
if ($CompareString1 -eq $CompareString2) {
    # do stuff
}
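For reference, the split, sort, and re-join steps above can be sketched as a small function. This is only a sketch: `Get-NormalizedParameterString` is a hypothetical name, and it assumes that trimming whitespace around each comma is acceptable. Note that `Sort-Object` and `-eq` are both case-insensitive by default.

```powershell
# Hypothetical helper (not from the original code): returns a canonical,
# sorted form of a comma-separated parameter string.
function Get-NormalizedParameterString {
    param([string]$ParameterString)

    # Split on commas (ignoring surrounding whitespace), sort, and re-join.
    $parts = $ParameterString.Trim() -split '\s*,\s*' | Sort-Object
    return ($parts -join ', ')
}

$String1 = 'Parameter1, Parameter2, Parameter3, Parameter4'
$String2 = 'Parameter2, Parameter1, Parameter4, Parameter3'
$String3 = 'Parameter5, Parameter1, Parameter6, Parameter2'

# Same parameters in a different order: the normalized strings are equal.
$same12 = (Get-NormalizedParameterString $String1) -eq (Get-NormalizedParameterString $String2)
# Different parameters: the normalized strings differ.
$same13 = (Get-NormalizedParameterString $String1) -eq (Get-NormalizedParameterString $String3)
```

Here `$same12` is `$true` and `$same13` is `$false`.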
However, this got me thinking that I could instead compare the arrays as is:
$Array1 = $String1 -split ", " | Sort-Object
$Array2 = $String2 -split ", " | Sort-Object
if ($Array1 -eq $Array2) {
    # do stuff
}
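A quick illustration of why this is unlikely to behave as a whole-array equality test: when the left-hand side of `-eq` is an array, PowerShell uses the operator as a filter and returns the matching elements instead of a Boolean. The sample values below are made up for the demonstration.

```powershell
# Sample data for illustration only.
$Array1 = 'Parameter1', 'Parameter2', 'Parameter3'

# With an array on the left, -eq filters: the result is the set of
# elements equal to the right-hand value, not $true or $false.
$filtered = $Array1 -eq 'Parameter1'

$isArray    = $filtered -is [array]
$matchCount = $filtered.Count
```

Here `$isArray` is `$true` and `$matchCount` is `1`, so the `if` condition above would be testing a filtered collection and not an element-by-element comparison of the two arrays.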
But then that got me thinking that comparing arrays as is would probably (?) be resource-intensive, so maybe I should instead use the Compare-Object cmdlet native to PowerShell:
$Array1 = $String1 -split ", " | Sort-Object
$Array2 = $String2 -split ", " | Sort-Object
if (-not (Compare-Object -ReferenceObject $Array1 -DifferenceObject $Array2)) {
    # Compare-Object returned no differences; do stuff
}
But that feels like I'm overcomplicating things and potentially adding overhead to my code.
Are any of the above methods optimal for comparing the two strings, or are there other, better alternatives?
EDIT: Comparing arrays directly does not seem to be feasible in PowerShell, so it is down to either Compare-Object or rebuilding them into strings. In my tests the difference between the two looks negligible. However, a friend suggested that I first compare the lengths of the two strings: if the lengths differ, the strings cannot contain the same parameters, so I can skip the validation and move on to the next item. That seems to save about 25% of the time. As in:
if ($ParameterString1.Length -ne $ParameterString2.Length) {
    continue
}
else {
    # continue with the actual comparison
}
Not sure if there are more optimisations possible here though...
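Put together, the length pre-check and the sort-and-join comparison could look roughly like the sketch below. The `$rows` collection and its property names are hypothetical stand-ins for the real data, and the length shortcut assumes both strings always use the same `", "` separator (otherwise two equivalent strings could differ in length).

```powershell
# Hypothetical sample rows; the real data source is not shown here.
$rows = @(
    [pscustomobject]@{ ParameterString1 = 'Parameter1, Parameter2'; ParameterString2 = 'Parameter2, Parameter1' }
    [pscustomobject]@{ ParameterString1 = 'Parameter1, Parameter2'; ParameterString2 = 'Parameter1' }
    [pscustomobject]@{ ParameterString1 = 'Parameter1, Parameter3'; ParameterString2 = 'Parameter2, Parameter1' }
)

$matchCount = 0
foreach ($row in $rows) {
    # Cheap pre-check: different lengths cannot hold the same parameters.
    if ($row.ParameterString1.Length -ne $row.ParameterString2.Length) {
        continue
    }

    # Same length, so fall back to the sort-and-join comparison.
    $sorted1 = ($row.ParameterString1 -split ', ' | Sort-Object) -join ', '
    $sorted2 = ($row.ParameterString2 -split ', ' | Sort-Object) -join ', '

    if ($sorted1 -eq $sorted2) {
        $matchCount++   # placeholder for "do stuff"
    }
}
```

With these three sample rows, only the first pair matches, so `$matchCount` ends up as `1`. The second row is skipped by the length check, and the third has equal lengths but different parameters.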