4
\$\begingroup\$

I am trying to write a code that:

First Prompt: asks the user to select the folder that contains all zipped folders where each zipped folder contains a data file.

Second Prompt: Ask the user to select the master folder path to place the unzipped data files into.

The program will unzip the first zipped folder, copy the data file out, paste the data file into the designated master folder the user was prompted to select from their second prompt.

The program will find the next zipped folder, unzip and open the folder, copy out the file it contains, paste the file into the master folder, rinse and repeat until all the files from each of the zipped folders are moved over.

I am trying to make a folder that contains all the zipped data files in order to concatenate all the data into a master document. Then I will be able to run an analysis on this data.

This code works but I need some help with getting the code to run faster. I tested the program on a folder that contained only 6 zipped files and it moved the files over in 2 seconds. But now I'm testing the program on a folder that contains 8020 zipped files and its moving over the files at a rate of 1 file every 8 seconds.

I have "Microsoft Shell Controls And Automation", "Visual Basic For Applications", "Microsoft Excel 15.0 Object Library", OLE Automation", "Microsoft Office 15.0 Object Library", Microsoft Forms 2.0 Objection Library" options enabled.

Can anyone help me speed this up?

Option Explicit

Sub UnzipFileMoveToMasterFolder()

Dim oApp As Shell
Dim strFile As String
Dim FolderPath As FileDialog
Dim MasterFolder As String
Dim ZipFolder As String
Dim i As Integer

'Optimize Macro Speed
Application.ScreenUpdating = False
Application.EnableEvents = False
Application.Calculation = xlCalculationManual
Application.DisplayAlerts = False

'Retrieve target master workbook data folder path from user
Set FolderPath = Application.FileDialog(msoFileDialogFolderPicker)

With FolderPath
  .Title = "Select a master folder location to store the unzipped files."
  .AllowMultiSelect = False
    If .Show <> -1 Then GoTo CancelSelect1
    MasterFolder = .SelectedItems(1) & "\"
End With

'In case Cancel selected
CancelSelect1:
MasterFolder = MasterFolder
If MasterFolder = "" Then GoTo ResetSettings

'Retrieve target data folder path from user
Set FolderPath = Application.FileDialog(msoFileDialogFolderPicker)

With FolderPath
  .Title = "Select folder location that contains the zipped folders with desired data files."
  .AllowMultiSelect = False
    If .Show <> -1 Then GoTo CancelSelect2
      ZipFolder = .SelectedItems(1) & "\"
End With

'In case Cancel selected
CancelSelect2:
ZipFolder = ZipFolder
If ZipFolder = "" Then GoTo ResetSettings

'get a shell object
Set oApp = CreateObject("Shell.Application")
'get all the files in the directory that are zip files

strFile = Dir(ZipFolder & "*.zip")
'loop until we cannot find anymore files
Do While strFile <> ""
        'check to see if the zip contains items
        If oApp.Namespace(ZipFolder).Items.Count > 0 Then
            'loop through all the items in teh zip file
            For i = 0 To oApp.Namespace(ZipFolder).Items.Count - 1
            'save the files to the new location
            oApp.Namespace(MasterFolder).CopyHere oApp.Namespace(ZipFolder & strFile).Items.Item(i)
            Next i
        End If
    'get the next file in the directory
    strFile = Dir
Loop
'free memory
Set oApp = Nothing

'Reset Macro Optimization Settings
ResetSettings:
Application.EnableEvents = True
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
Application.DisplayAlerts = True

'Message Box when tasks are completed
MsgBox "Task Complete!"
Application.Cursor = xlDefault

End Sub
\$\endgroup\$

1 Answer 1

3
\$\begingroup\$

This code is very hard to follow at a casual reading, because of the inconsistent and non-standard indentation (10%) and the liberal use of GoTo's (90%).

Indentation: The following should increase your indentation level by one level - Sub\ Function, With, If, For, Do. In addition, labels (which we'll discuss in a moment) should not be indented. In fact, the IDE will attempt to pin them to the left margin to make that easier.

GoTo: Just because VBA has "Basic" in the name doesn't mean that it's the non-structured Basic I knew and loved back in the late 80s. It has evolved into a structured language, so really the only place where you should use GoTo is in error handling. Everything else that it can possibly be used for can be accomplished with different flow control structure or functions. For example, this section...

With FolderPath
  .Title = "Select a master folder location to store the unzipped files."
  .AllowMultiSelect = False
    If .Show <> -1 Then GoTo CancelSelect1
    MasterFolder = .SelectedItems(1) & "\"
End With

'In case Cancel selected
CancelSelect1:

...can be restructured in a much more readable way (and get rid of the heebie-jeebies I get when I see a jump out of a With block) like this:

With FolderPath
    .Title = "Select a master folder location to store the unzipped files."
    .AllowMultiSelect = False
    If .Show = -1 Then
        MasterFolder = .SelectedItems(1) & "\"
    End If
End With

Dead Code: Think about the line of code MasterFolder = MasterFolder, then ask yourself what this is intended to accomplish. There is no situation that I can come up with where setting a variable to itself will accomplish anything useful.

VbNullString: "" should be replaced with vbNullString. Note that the two are not the same - vbNullString is a compiler constant that is basically a string pointer to null. "" actually allocates itself as a string with zero length, which is very different, especially if you're using Declare'd functions and API calls. The following demonstrates:

Debug.Print StrPtr(vbNullString)
Debug.Print StrPtr("")

The first returns 0, the second returns a memory address (no huge speed gains here - the compiler is smart enough to re-use the allocation).

Testing For Empty Strings: Not a huge performance issue, but testing string equity with "" (or VbNullString for that matter) is roughly 25% slower than checking to see if the length is zero. This is due to how Strings are stored in VBA - internally they are stored as BSTR's, so the Len function only has to grab the length from the start of the string. This is much more efficient than a string equity check, even with the function call. So, ...

Do While strFile <> ""

... would be better as this:

Do While Len(strFile) > 0

String v. Variant Functions: You may have noticed that most string handling functions have 2 versions - the "normal" function, and a version that ends in "$". The ones that end in "$" return a String, the others return a Variant. If you are assigning the return value to a String or passing it to a function that takes a String as a parameter, you should use the function that returns a String. The reason is that there is an implicit cast performed, so this...

strFile = Dir(ZipFolder & "*.zip")

...is essentially this when compiled:

strFile = CStr(Dir(ZipFolder & "*.zip"))

Much better to just do this and omit the cast:

strFile = Dir$(ZipFolder & "*.zip")

Misc: You don't need to turn off ScreenUpdating, Events, Calculation, or Alerts. Your code doesn't do anything that would effect any of them, so it does nothing to "Optimize Macro Speed". You also don't need to set objects to Nothing. The variable names MasterFolder and ZipFolder are confusing - something like source and destination would be much clearer. While you're at it, I'd ditch the Hungarian notation. You also don't need to instantiate the FileDialog twice - you can reuse the same object (which also lets you get rid of a variable, since all of the work can go into one With block). You also don't need to concat the "\" to the end of your source folder - it's much clearer if you put the concat where it's actually required instead of in the folder dialog code. Finally, the MsgBox at the end is confusing with your current control flow. It gives the indication that work was performed if the user cancels out of either of the folder selections.

Performance: Unzipping over 8000 files is never going to be fast, but the main performance gain that you'll get is from removing the inner loop that copies items out of the zip files one at a time. The .CopyHere method doesn't need individual items - it can process the entire collection at once:

oApp.Namespace(MasterFolder).CopyHere oApp.Namespace(ZipFolder & strFile).Items

Final Result: Something more like this:

Sub UnzipFileMoveToMasterFolder()
    Dim appl As Shell
    Dim archive As String
    Dim destination As String
    Dim source As String

    With Application.FileDialog(msoFileDialogFolderPicker)
        .Title = "Select a master folder location to store the unzipped files."
        .AllowMultiSelect = False
        If .Show = -1 Then destination = .SelectedItems(1)
        If Len(destination) = 0 Then Exit Sub
        .Title = "Select folder location that contains the zipped folders with desired data files."
        If .Show = -1 Then source = .SelectedItems(1)
        If Len(source) = 0 Then Exit Sub
    End With

    Set appl = CreateObject("Shell.Application")
    archive = Dir$(source & "\*.zip")
    Do While Len(archive) > 0
        appl.Namespace(destination).CopyHere appl.Namespace(source & "\" & archive).Items
        archive = Dir$
    Loop
    MsgBox "Task Complete!"
End Sub
\$\endgroup\$
1
  • \$\begingroup\$ Thank you for the assistance. I obviously haven't taken a formal coding class yet and I still learning what each command is used for and how to use them as well as how to combine these commands efficiently. I like the clarity of your response. It is very helpful for people starting out in this. \$\endgroup\$
    – DKnight
    Commented Jan 4, 2016 at 14:53

Not the answer you're looking for? Browse other questions tagged or ask your own question.