2

If I run the following:

Measure-Command -Expression {gci -Path C:\ -Recurse -ea SilentlyContinue | where Extension -eq ".txt"}
Measure-Command -Expression {gci -Path C:\ -Filter *.txt -Recurse -ea SilentlyContinue}

The second expression is always faster than the first one. I'm guessing it's because it doesn't have to use the pipeline.

I thought that maybe, in the pipeline method, PowerShell recursed my drive and passed a collection of objects to the where clause, which would then have to iterate through the items again. But I ruled that out, because if you run the first expression you can see it return output as it is recursing. So why is the pipeline method slower?

2 Answers

5

Using Where-Object is always slower than using the built-in parameters of the left-hand-side command. You first bring ALL objects to your shell and only then start filtering them (client-side filtering).

The -Filter parameter works faster because it operates at the provider level (server-side filtering): objects are checked as they are accessed, and you get back only the ones that match your criteria.
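To make the difference concrete, here is a minimal sketch that runs both forms against a small throwaway directory instead of the whole C:\ drive. The directory name, file names, and variable names (`$root`, `$sub`, `$viaWhere`, `$viaFilter`) are made up for illustration, and timings will vary from machine to machine.

```powershell
# Build a small throwaway tree (hypothetical names) so the comparison is safe to run.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("filter-demo-" + [guid]::NewGuid())
$sub  = Join-Path $root 'sub'
New-Item -ItemType Directory -Path $sub -Force | Out-Null
1..50 | ForEach-Object {
    Set-Content -Path (Join-Path $root "note$_.txt") -Value 'text'
    Set-Content -Path (Join-Path $sub "data$_.log") -Value 'log'
}

# Client-side: every item reaches the pipeline and Where-Object discards the non-matches.
$viaWhere = Get-ChildItem -Path $root -Recurse -ErrorAction SilentlyContinue |
    Where-Object { $_.Extension -eq '.txt' }

# Provider-side: the FileSystem provider applies the pattern and returns only matches.
$viaFilter = Get-ChildItem -Path $root -Filter *.txt -Recurse -ErrorAction SilentlyContinue

"Where-Object: $(@($viaWhere).Count) files, -Filter: $(@($viaFilter).Count) files"

# Clean up the throwaway tree.
Remove-Item -Path $root -Recurse -Force
```

On this tree both forms should return the same 50 .txt files. Keep in mind that -Filter uses the provider's wildcard matching rather than an exact string comparison on Extension, so the two are not guaranteed to be interchangeable in every case. Wrapping each form in Measure-Command on a larger tree should show the gap described above.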

4
  • When you say that Where-Object is client-side filtering and -Filter is server-side, do you mean that the -Filter parameter is a feature of the FileSystem provider and Where-Object is a feature of PowerShell? Also, if that is correct, does this mean that when using Where-Object, PowerShell enumerates every object on my drive and then passes a collection which is iterated through by the where clause?
    – user1462199
    Commented Aug 27, 2012 at 9:12
  • Yes, the Filter parameter is a feature of the FileSystem provider. Where-Object is a filtering cmdlet. In the first command you ask gci to enumerate ALL files on your C drive and you do the filtering using Where-Object. In the second command you ask for files of a certain extension; all files are still enumerated, but you get back only the ones that you really need. Sometimes there is no way but to use the Where-Object cmdlet. I always check the parameters of a command to find out whether they offer any built-in filtering before I resort to Where-Object.
    – Shay Levy
    Commented Aug 27, 2012 at 9:31
  • In addition, client-side filtering using Where-Object has a performance penalty; the degradation becomes noticeable once you retrieve objects from remote computers.
    – Shay Levy
    Commented Aug 27, 2012 at 9:33
  • I see what you are saying: using Where-Object against a remote machine would return every file on the C:\ drive of the remote machine, when in reality all the filtering could be done on the remote machine using -Filter. Thanks for clearing this up.
    – user1462199
    Commented Aug 27, 2012 at 9:41
0

Shay's answer is totally correct. I wanted to touch on your secondary question a bit, though. Here's what's happening under the hood in the pipeline:

gci -Path C:\ -Recurse -ea SilentlyContinue | where Extension -eq ".txt"

gci will start searching for files and directories at or under c:\, with any extension. As it finds each one, that single result is passed on to Where-Object, which discards it if the extension is not .txt. If the extension is .txt, that object is passed on in the pipeline and out to the console (or to a variable, or whatever). Then gci continues its search; when it finds the next file, it passes it on, and so on. So although it might take a couple of minutes to search the entire c:\ drive, you get partial results streamed back to you almost immediately, since the pipeline operates one object at a time.

What is not happening is that gci does the full disk search all at once, then hands the complete result set to Where-Object only when it's complete.
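The one-object-at-a-time behaviour can be observed without touching the disk. The sketch below uses a hypothetical function, Get-DemoItem, as a stand-in for gci, and records the order of events in a list named `$log` (also an illustrative name).

```powershell
# Shared log so we can see the order in which each pipeline stage runs.
$log = New-Object 'System.Collections.Generic.List[string]'

# Hypothetical stand-in for gci: emits the numbers 1..4 one at a time.
function Get-DemoItem {
    foreach ($i in 1..4) {
        $log.Add("emit $i")
        $i   # written to the pipeline as soon as it is produced
    }
}

# Where-Object tests each object as it arrives and passes along only the even ones.
Get-DemoItem |
    Where-Object { $log.Add("test $_"); $_ % 2 -eq 0 } |
    ForEach-Object { $log.Add("keep $_") }

$log -join ', '
```

If the pipeline streams as described, the log interleaves the stages (emit 1, test 1, emit 2, test 2, keep 2, and so on) rather than listing all four "emit" entries before the first "test" entry.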