10
\$\begingroup\$

The code below was refactored for performance improvements for another user on this site.

Functionality, high level:

  • Sheet1 - CodeName aIndex: used as the main reference to the structure of the data being processed in 2 other sheets: mapping column headers for incoming data in sheet2, to column headers to be processed for the final result on Sheet3

  • Sheet2 - CodeName bImport: this where external (raw) data is imported before processing. Importing of data is not part of this process

  • Sheet3 - CodeName cFinal: out of a set of about 50 incoming columns, Sheet1 will define a subset of 20 to 30 columns to be processed for the final result

The code is fully functional, without issues, and decent performance (50,000 rows and 44 columns processed in 4 to 5 seconds); it contains more comments than usual for learning purposes, explaining some basic steps, or things that may not be obvious or clear to an inexperienced person.

Notes:

  • This is not a request that requires understanding of the functionality, or finding inefficiencies (unless there are obvious parts that can be optimized).
  • It's about self improvement relative to coding practices: I am open to any criticism no matter how harsh, for any mistakes I may have made - I'll easily swallow my pride, as long as I can improve any bad habits I may have picked up along the way.
  • When I posted the question intended to make it as relevant to this site as possible: Does this code make my ass look fat?
  • I realize that members of this community are volunteers (like me), and provide feedback out of passion about the subject, so I tried to analyse the question objectively, as a reviewer:
    • The code is way too long to make me feel it's worth the effort, and this is the reason I didn't bring its functionality into the mix: there is less effort required for analyzing it at a high level (coding style), and not intricacies of functionality
    • There is nothing I can do to make it shorter: I was curious about its structure: did I modularize it enough, or maybe too much
    • I wouldn't want to get involved in a long review by attempting to understand its logic and reasons of doing what it does, but just quick feedback about anything obviously bad from a readability and maintainability perspective

.

That said, I will provide relevant details about functionality for each part as a contexts for the algorithm

The first Sub controls the start and end of the entire process (after an imported file): turns off all events and calculations in Excel that can slow down execution, starts a timer, starts the main process, captures the total duration, and turns all Excel features back on: .

Option Explicit

Public Sub projectionTemplateFormat()
    Dim t1 As Double, t2 As Double

    fastWB True      'turn off all Excel features related to GUI and calculation updates

        t1 = Timer   'start performance timer

        mainProcess

        t2 = Timer   'process is completed

    fastWB False     'turn Excel features back on

    'MsgBox "Duration: " & t2 - t1 & " seconds"   'optional measurement output

End Sub

The next Sub is where the main processing is done, and makes calls to smaller helper functions:

  • Sets up all references needed during processing: the 3 workbooks, and a set of local variables
  • Determines the columns and size of imported data (Sheet2)
  • Determines if there is any previous data on the result sheet (Sheet3) for cleanup
    • It doesn't remove the headers: these are the column to be migrated from the imported data
  • Overwrites the headers in Imported Sheet with a standard set of headers defined on Sheet1
    • The headers on Sheet1 can be adjusted by the user (added, removed, renamed) relative to the expected headers in the imported data
    • They are also aligned with the headers on Sheet3 (the final result)
  • Re-formats the imported data with specific text, number, and date formats
  • If there is at least 1 row of imported data on Sheet2, it starts the main process

The following steps are the most CPU intensive task:

  • Start looping over each column on Sheet3 (columns of the final result)
    • Find the first column to be migrated (based on the header name from Sheet3)
    • If found, set a reference to the entire column with data (50,000 rows or more)
    • Set a reference on Sheet3, to an area of the same size as the column of imported data
    • Copy the data from Sheet2 to Sheet3
  • Move on the the next column on Sheet3 an repeat the process until all predefined columns on Sheet3 are populated

  • Overwrite some imported values on Sheet3 with hard-coded data from Sheet1

  • Reformat the dates on 2 specific columns on Sheet3 to "YYYY" requirement
  • Reformat other specific columns on Sheet3
  • Convert all data on Sheet3 to UPPER CASE
  • Apply cell and font formatting to all data on Sheet3
  • Zoom all sheets to 85%

Private Sub mainProcess()

    Const SPACE_DELIM       As String = " "

    Dim wsIndex             As Worksheet
    Dim wsImport            As Worksheet    'Raw data
    Dim wsFinal             As Worksheet    'Processed data

    Dim importHeaderRng     As Range
    Dim importColRng        As Range
    Dim importHeaderFound   As Variant
    Dim importLastRow       As Long

    Dim finalHeaderRng      As Range
    Dim finalColRng         As Range
    Dim finalHeaderRow      As Variant
    Dim finalHeaderFound    As Variant

    Dim indexHeaderCol      As Range
    Dim header              As Variant  'Each item in the FOR loop
    Dim msg                 As String


    Set wsIndex = aIndex    'This is the Code Name; top-left pane: aIndex (Index)
    Set wsImport = bImport  'Direct reference to Code Name: bImport.Range("A1")
    Set wsFinal = cFinal    'Reference using Sheets collection: ThisWorkbook.Worksheets("Final")


    With wsImport.UsedRange
        Set importHeaderRng = .Rows(1)                      'Import - Headers
        importLastRow = getMaxCell(wsImport.UsedRange).Row  'Import - Total Rows
    End With
    With wsFinal.UsedRange
        finalHeaderRow = .Rows(1)       'Final - Headers (as Array)
        Set finalHeaderRng = .Rows(1)   'Final - Headers (as Range)
    End With
    With wsIndex.UsedRange              'Transpose col 3 from Index (without the header), as column names in Import
        Set indexHeaderCol = .Columns(3).Offset(1, 0).Resize(.Rows.Count - 1, 1)
        wsImport.Range(wsImport.Cells(1, 1), wsImport.Cells(1, .Rows.Count - 1)).Value2 = Application.Transpose(indexHeaderCol)
    End With

    applyColumnFormats bImport          'Apply date and number format to Import sheet

    If Len(bImport.Cells(2, 1).Value2) > 0 Then 'if Import sheet is not empty (excluding header row)

        With Application

            For Each header In finalHeaderRow   'Loop through all headers in Final

                If Len(Trim(header)) > 0 Then   'If the Final header is not empty

                    importHeaderFound = .Match(header, importHeaderRng, 0)      'Find header in Import sheet
                    If IsError(importHeaderFound) Then
                        msg = msg & vbLf & header & SPACE_DELIM & wsImport.Name 'Import doesn't have current header
                    Else

                        finalHeaderFound = .Match(header, finalHeaderRng, 0)    'Find header in Final sheet
                        With wsImport
                            Set importColRng = .UsedRange.Columns(importHeaderFound).Offset(1, 0).Resize(.UsedRange.Rows.Count - 1, 1)
                        End With

                        With wsFinal
                            Set finalColRng = .Range(.Cells(2, finalHeaderFound), .Cells(importLastRow, finalHeaderFound))
                            finalColRng.Value2 = vbNullString                   'Delete previous values (entire column)
                        End With

                        finalColRng.Value2 = importColRng.Value2                'Copy Import data in Final columns

                    End If
                End If
            Next
        End With

        setStaticData importLastRow

        extractYears

        applyColumnFormats cFinal          'Apply date and number format to Import sheet

        allUpper wsFinal

        'wsFinal.UsedRange.AutoFilter

        applyFormat wsFinal.Range(wsFinal.Cells(1, 1), wsFinal.Cells(importLastRow, wsFinal.UsedRange.Columns.Count))

        Dim ws As Worksheet
        For Each ws In Worksheets
            ws.Activate
            ActiveWindow.Zoom = 85
            ws.Cells(2, 2).Activate
            ActiveWindow.FreezePanes = True
            ws.Cells(1, 1).Activate
        Next

    Else
        MsgBox "Missing raw data (Sheet 2 - 'Import')", vbInformation, "   Missing Raw Data"
    End If

End Sub

Next method is a straight overwrite operation of static data from Sheet1 onto Sheet3


Private Sub setStaticData(ByVal lastRow As Long)

    With cFinal
        .Range("D2:D" & lastRow).Value = aIndex.Range("H2").Value
        .Range("F2:F" & lastRow).Value = aIndex.Range("H9").Value
        .Range("AC2:AC" & lastRow).Value = aIndex.Range("H3").Value
        .Range("X2:X" & lastRow).Value = aIndex.Range("H4").Value
        .Range("Y2:Y" & lastRow).Value = aIndex.Range("H5").Value
        .Range("AE2:AE" & lastRow).Value = aIndex.Range("H6").Value
        .Range("AF2:AF" & lastRow).Value = aIndex.Range("H7").Value
        .Range("AD2:AD" & lastRow).Value = aIndex.Range("H8").Value
    End With

End Sub

Another method of applying a specific text, number, date format to a set of columns (the same set of columns on either Sheet2 (Import), or Sheet3 (final result)


Private Sub applyColumnFormats(ByRef ws As Worksheet)

    With ws.UsedRange
        .Cells.NumberFormat = "@"                               'all cells will be "General"
        .Columns(colNum("G")).NumberFormat = "MM/DD/YYYY"
        .Columns(colNum("I")).NumberFormat = "MM/DD/YYYY"
        '.Columns(colNum("A")).NumberFormat = "@"
        '.Columns(colNum("B")).NumberFormat = "@"
        '.Columns(colNum("C")).NumberFormat = "@"
        .Columns(colNum("R")).NumberFormat = "MM/DD/YYYY"
        .Columns(colNum("Q")).NumberFormat = "MM/DD/YYYY"
        .Columns(colNum("T")).NumberFormat = "MM/DD/YYYY"
        .Columns(colNum("W")).NumberFormat = "@"    '"YYYY"
        .Columns(colNum("V")).NumberFormat = "@"    '"YYYY"
        .Columns(colNum("AC")).NumberFormat = "MM/DD/YYYY"
        .Columns(colNum("N")).NumberFormat = "_($* #,##0.00_);_($* (#,##0.00);_($* ""-""??_);_(@_)"
        .Columns(colNum("AM")).NumberFormat = "_($* #,##0.00_);_($* (#,##0.00);_($* ""-""??_);_(@_)"
        .Columns(colNum("AN")).NumberFormat = "_($* #,##0.00_);_($* (#,##0.00);_($* ""-""??_);_(@_)"
        .Columns(colNum("AO")).NumberFormat = "_($* #,##0.00_);_($* (#,##0.00);_($* ""-""??_);_(@_)"
    End With

End Sub

Helper method: Cell, border, and font formatting to all data on Sheet3


Private Sub applyFormat(ByRef rng As Range)

    With rng
        .ClearFormats
        With .Font
            .Name = "Georgia"
            .Color = RGB(0, 0, 225)
        End With

        .Interior.Color = RGB(216, 228, 188)

        With .Rows(1)
            .Font.Bold = True
            .Interior.ColorIndex = xlAutomatic
        End With

        With .Borders
            .LineStyle = xlDot  'xlContinuous
            .ColorIndex = xlAutomatic
            .Weight = xlThin
        End With

    End With

    refit rng

End Sub

Helper method: Converts all data to upper case

The main aspect about all helper methods acting on large ranges of data is that they perform:

  • Only one interaction with the worksheet to copy all data to memory
  • Processes each individual value by looping over the memory arrays (unavoidable nested loops for 2 dimensional arrays)
  • Then in another single interaction with the sheet places all data transformed back in the same area

  • This is, by far, the most overlooked performance improvement. It requires minimum coding effort, but might be perceived as a somewhat difficult concept to grasp for novice VBA enthusiasts (including myself) who just want to get the job done, without "complicating" things


Private Sub allUpper(ByRef sh As Worksheet)
    Dim arr As Variant, i As Long, j As Long

    If WorksheetFunction.CountA(sh.UsedRange) > 0 Then
        arr = sh.UsedRange
        For i = 2 To UBound(arr, 1)         'each "row"
            For j = 1 To UBound(arr, 2)     'each "col"
                arr(i, j) = UCase(RTrim(Replace(arr(i, j), Chr(10), vbNullString)))
            Next
        Next
        sh.UsedRange = arr
    End If
End Sub

Helper method: converts dates on certain columns to a YYYY format. In retrospect, I should have made it generic to accept a column name, range, letter, or number, as a parameter instead of hard-codding 2 columns. The point I was trying to make here was to combine multiple columns within one loop for improved performance, instead of several loops performing the same operation, on different columns


Private Sub extractYears()
    Dim arr As Variant, i As Long, j As Long, ur As Range, colW As Long, colV As Long

    Set ur = cFinal.UsedRange               '3rd sheet

    If WorksheetFunction.CountA(ur) > 0 Then
        colW = colNum("W")
        colV = colNum("V")

        arr = ur

        For i = 2 To getMaxCell(ur).Row     'each "row"
            If Len(arr(i, colW)) > 0 Then arr(i, colW) = Format(arr(i, colW), "yyyy")
            If Len(arr(i, colV)) > 0 Then arr(i, colV) = Format(arr(i, colV), "yyyy")
        Next

        ur = arr

    End If
End Sub

Private Sub refit(ByRef rng As Range)

    With rng
        .WrapText = False
        .HorizontalAlignment = xlGeneral
        .VerticalAlignment = xlCenter
        .Columns.EntireColumn.AutoFit
        .Rows.EntireRow.AutoFit
    End With

End Sub

Helper method: next, are 2 generic functions that return:

  • The column letter from the column number
  • The column number from the column letter

Not ideal naming convention as it's not descriptive enough (not intuitive or self-documented). My reason (not excuse): long names don't fit well in the small area provided - doesn't make it OK


Public Function colLtr(ByVal fromColNum As Long) As String  'get column leter from column number
    'maximum number of columns in Excel 2007, last column: "XFD" (16384)
    Const MAX_COLUMNS   As Integer = 16384

    If fromColNum > 0 And fromColNum <= MAX_COLUMNS Then

        Dim indx As Long, cond As Long

        For indx = Int(Log(CDbl(25 * (CDbl(fromColNum) + 1))) / Log(26)) - 1 To 0 Step -1
            cond = (26 ^ (indx + 1) - 1) / 25 - 1
            If fromColNum > cond Then
                colLtr = colLtr & Chr(((fromColNum - cond - 1) \ 26 ^ indx) Mod 26 + 65)
            End If
        Next indx

    Else
        colLtr = 0
    End If

End Function

Public Function colNum(ByVal fromColLtr As String) As Long

    'A to XFD (upper or lower case); if the parameter is invalid it returns 0
    'maximum number of columns in Excel 2007, last column: "XFD" (16384)

    Const MAX_LEN       As Byte = 4
    Const LTR_OFFSET    As Byte = 64
    Const TOTAL_LETTERS As Byte = 26
    Const MAX_COLUMNS   As Integer = 16384

    Dim paramLen        As Long
    Dim tmpNum          As Integer

    paramLen = Len(fromColLtr)
    tmpNum = 0

    If paramLen > 0 And paramLen < MAX_LEN Then
        Dim i           As Integer
        Dim tmpChar     As String
        Dim numArr()    As Integer

        fromColLtr = UCase(fromColLtr)
        ReDim Preserve numArr(paramLen)

        For i = 1 To paramLen
            tmpChar = Asc(Mid(fromColLtr, i, 1))
            If tmpChar < 65 Or tmpChar > 90 Then Exit Function              'make sure it's a letter. upper case: 65 to 90, lower case: 97 to 122
            numArr(i) = tmpChar - LTR_OFFSET                                'change lettr to number indicating place in alphabet (from 1 to 26)
        Next

        Dim highPower   As Integer
        highPower = UBound(numArr()) - 1                                    'the most significant digits occur to the left

        For i = 1 To highPower + 1
            tmpNum = tmpNum + (numArr(i) * (TOTAL_LETTERS ^ highPower))     'convert the number array using powers of 26
            highPower = highPower - 1
        Next
    End If

    If tmpNum < 0 Or tmpNum > MAX_COLUMNS Then tmpNum = 0

    colNum = tmpNum

End Function

For the next method I applied an extra performance improvement to the usual known method of determining the last cell with data:

  • Normal methods perform an inverse search of the first data value staring at the last row\column of an Excel sheet (which now has over 1 million rows and and 16 thousand columns

  • This method expects only on the "UsedRange" - the notoriously inaccurate range that remembers cell formatting, unused formulas, hidden objects, etc. However, this inaccurate range is much smaller the the entire sheet, but large enough to include all data, so it performs the inverse search over only a few excess rows and columns

  • By my definition, the last used cell can also be empty, a long as it represents the longest row and column with data


Public Function getMaxCell(ByRef rng As Range) As Range

    'search the entire range (usually UsedRange)
    'last row: find first cell with data, scanning rows, from bottom-right, leftwards
    'last col: find first cell with data, scanning cols, from bottom-right, upwards

    With rng
        Set getMaxCell = rng.Cells _
                        ( _
                            .Find( _
                                What:="*", _
                                SearchDirection:=xlPrevious, _
                                LookIn:=xlFormulas, _
                                After:=rng.Cells(1, 1), _
                                SearchOrder:=xlByRows).Row, _
                            .Find( _
                                What:="*", _
                                SearchDirection:=xlPrevious, _
                                LookIn:=xlFormulas, _
                                After:=rng.Cells(1, 1), _
                                SearchOrder:=xlByColumns).Column _
                        )
    End With
End Function

  • Helper method: another set of versatile general functions for turning off Excel features that might hinder VBA performance, main ones:
    • xlCalculationAutomatic - extremely convenient for manual interactions with sheets, huge potential of performance issues when performing VBA updates to large ranges as it triggers exponential calculations to all dependent formulas on the sheet(s)
    • EnableEvents - can trigger nested events (infinite recursion) which Excel terminates eventually). Also may cause inexplicable or unexpected VBA behavior when not turned back on
    • ScreenUpdating - well known
    • DisplayPageBreaks: I've seen an earlier comment referring to this. To me this is insidious, perceived harmless, when in fact it can cause extra work behind the scenes, especially when re-sizing rows and columns. I never print anything, so I never care about page breaks, but Excel cares about them at every move: re-size 1 column\row - it recalculates page size for all used area; it should be used and only when printing

Public Sub fastWB(Optional ByVal opt As Boolean = True)

    With Application
        .Calculation = IIf(opt, xlCalculationManual, xlCalculationAutomatic)
        If .DisplayAlerts <> Not opt Then .DisplayAlerts = Not opt
        If .DisplayStatusBar <> Not opt Then .DisplayStatusBar = Not opt
        If .EnableAnimations <> Not opt Then .EnableAnimations = Not opt
        If .EnableEvents <> Not opt Then .EnableEvents = Not opt
        If .ScreenUpdating <> Not opt Then .ScreenUpdating = Not opt
    End With

    fastWS , opt

End Sub

Public Sub fastWS(Optional ByVal ws As Worksheet, Optional ByVal opt As Boolean = True)

    If ws Is Nothing Then
        For Each ws In Application.ActiveWorkbook.Sheets
            setWS ws, opt
        Next
    Else
        setWS ws, opt
    End If

End Sub
Private Sub setWS(ByVal ws As Worksheet, ByVal opt As Boolean)

    With ws
        .DisplayPageBreaks = False
        .EnableCalculation = Not opt
        .EnableFormatConditionsCalculation = Not opt
        .EnablePivotTable = Not opt
    End With

End Sub

Public Sub xlResetSettings()    'default Excel settings
    With Application
        .Calculation = xlCalculationAutomatic
        .DisplayAlerts = True
        .DisplayStatusBar = True
        .EnableAnimations = False
        .EnableEvents = True
        .ScreenUpdating = True
        Dim sh As Worksheet
        For Each sh In Application.ActiveWorkbook.Sheets
            With sh
                .DisplayPageBreaks = False
                .EnableCalculation = True
                .EnableFormatConditionsCalculation = True
                .EnablePivotTable = True
            End With
        Next
    End With
End Sub

Any suggestions to improve readability for ease of maintenance, restructuring functions, naming conventions, etc, will be much appreciated

\$\endgroup\$
3
  • 2
    \$\begingroup\$ I've asked here why this has been put on hold out of interest for when I post code asking for review. \$\endgroup\$ Commented Jun 17, 2015 at 11:14
  • \$\begingroup\$ The specific problem, from a different aspect: can this code be improved in any way. I will review the question to include details about each part of the functionality so it will be easier to follow and provide as much and reasonable context as possible. Will these be addressing the issue main issue? Also, I carefully read the article on How to Ask, and ultimately this was my interpretation of it - obviously I got it wrong - as this being my first question, any help with this aspect will be appreciated and useful to me in the future \$\endgroup\$
    – paul bica
    Commented Jun 17, 2015 at 12:53
  • \$\begingroup\$ for a more detailed "how to ask" guidance (which is admittedly huge) you may want to check out meta.codereview.stackexchange.com/q/2436/37660 also you're always welcome to Code Review Chat for questions about closures and general site workings. Feel free to drop by ;) \$\endgroup\$
    – Vogel612
    Commented Jun 17, 2015 at 15:43

3 Answers 3

9
\$\begingroup\$

This isn't going to be a full-blown, fine-combed review. Just a few points.


Use PascalCase for procedure/member identifiers. Being consistent about this helps readability because it makes it easy to tell members from locals and parameters at a glance, without even reading them.


In general your indenting is fine, except here:

fastWB True      'turn off all Excel features related to GUI and calculation updates

    t1 = Timer   'start performance timer

    mainProcess

    t2 = Timer   'process is completed

fastWB False     'turn Excel features back on

Yes, it's a logical block, a bit like On Error Resume Next {instruction} On Error GoTo 0 would be. But it's not a syntactic code block. A different usage of vertical whitespace makes a better job at regrouping the statements I find:

fastWB True      'turn off all Excel features related to GUI and calculation updates
t1 = Timer   'start performance timer

mainProcess

t2 = Timer   'process is completed
fastWB False     'turn Excel features back on

The comments are annoying more than anything else. Consider using more descriptive identifiers instead:

ToggleExcelPerformance
startTime = Timer

RunMainProcess

endTime = Timer
ToggleExcelPerformance False

Note that the difference between startTime and endTime will be skewed if you run this code a few seconds before midnight on your system, because of how Timer works. Shameless plug, but with a little bit of abuse there are much more precise and reliable ways to time method execution (I co-own the project), especially if you don't need the duration to be in your "production code".


This declaration came as a surprise:

Dim ws As Worksheet
For Each ws In Worksheets

Why? Because it's the only declaration in the MainProcess method, that's declared close to usage (as it should). Either stick it to the top of the procedure with the other ones (eh, don't do that), or move the other declarations closer to their first usage (much preferred).

Pretty much the entire procedure's body is wrapped in this If..Else block:

If Len(bImport.Cells(2, 1).Value2) > 0 Then

    'wall of code

Else
    MsgBox "Missing raw data (Sheet 2 - 'Import')", vbInformation, "Missing Raw Data"
End If

I suggest you revert the condition to reduce nesting:

If Len(bImport.Cells(2, 1).Value2) = 0 Then 
    MsgBox "Missing raw data (Sheet 2 - 'Import')", vbInformation, "Missing Raw Data"
    Exit Sub
End If

'wall of code

This is what I like to call an abuse of the With statement:

With Application

    'wall of code

End With

I like that you're making explicitly qualified references to the Application object like this, ...but not like this - a With block should look like this:

With someInstance
    foobar = .Foo(42)
    .DoSomething
    .Bar smurf
End With

If you're merely wrapping a whole method with a With block just to avoid having to type Application the 3-4 times you're referring to the Application object, ...sorry to say, but you're just being lazy - and you've uselessly increased nesting for that reason, too.

IMO this is another abusive/lazy usage of With:

With wsImport
    Set importColRng = .UsedRange.Columns(importHeaderFound).Offset(1, 0).Resize(.UsedRange.Rows.Count - 1, 1)
End With

Versus:

Set importColRng = wsImport.UsedRange.Columns(importHeaderFound) _
                                     .Offset(1, 0) _
                                     .Resize(wsImport.UsedRange.Rows.Count - 1, 1)

This is awkward:

With rng
    Set getMaxCell = rng.Cells _
                    ( _
                        .Find( _
                            What:="*", _
                            SearchDirection:=xlPrevious, _
                            LookIn:=xlFormulas, _
                            After:=rng.Cells(1, 1), _
                            SearchOrder:=xlByRows).Row, _
                        .Find( _
                            What:="*", _
                            SearchDirection:=xlPrevious, _
                            LookIn:=xlFormulas, _
                            After:=rng.Cells(1, 1), _
                            SearchOrder:=xlByColumns).Column _
                    )
End With

You open up a With block, but the first statement in it ignores it:

    Set getMaxCell = rng.Cells _

Should be

    Set getMaxCell = .Cells _

And then After:=rng.Cells(1, 1) is also referring to rng. What do you need that With block for, really?

Now, I really don't like that .Cells call: that 15-liner single instruction is doing way too many things. An instruction should only have as few as possible reasons to fail. If either Find fails, you'll have a runtime error 91, and no clue if it's the row or the column find that's blowing up.

Function GetMaxCell(ByRef rng As Range) As Range

    On Error GoTo CleanFail

    Const NONEMPTY As String = "*"

    Dim foundRow As Long
    foundRow = rng.Find(What:=NONEMPTY, _
                        SearchDirection:=xlPrevious, _
                        LookIn:=xlFormulas, _
                        After:=rng.Cells(1, 1), _
                        SearchOrder:=xlByRows) _
                  .Row

    Dim foundColumn As Long
    foundColumn = rng.Find(What:=NONEMPTY, _
                           SearchDirection:=xlPrevious, _
                           LookIn:=xlFormulas, _
                           After:=rng.Cells(1, 1), _
                           SearchOrder:=xlByColumns) _
                     .Column

    Set GetMaxCell = rng.Cells(foundRow, foundColumn)

CleanExit:
    Exit Function

CleanFail:
    Set GetMaxCell = Nothing
    Resume CleanExit 'break here
    Resume 'set next statement here

End Function

That will return Nothing to the caller (for it to handle of course) instead of blowing up if the function is given an empty range, or any other edge case that wasn't accounted for. And as a bonus, all you need to do to find the problem is to place a breakpoint just before the error-handling subroutine finishes.


There's certainly a lot more to say about this code, ...but this answer is already long enough as it is ;-)

\$\endgroup\$
5
  • \$\begingroup\$ This is GOLD! Thank you for your time!!!. I knew I was getting lazy and fat :) --- PascalCase - point taken (didn't know) --- Logical block - to make its point, but for myself I usually use your suggestion --- Comments - I don't have any comments in my code and quite long identifiers (on my wide screen) --- Declarations close to first use: I kept all at the top and move them at cleanup (but I never get to it) --- Reverted IF - point taken ! \$\endgroup\$
    – paul bica
    Commented Jun 19, 2015 at 3:53
  • 1
    \$\begingroup\$ <pre> --- Abuse of the With - I'm not actually lazy about this; a long time ago I learned that the compiler makes a separate reference to the object and it's faster, so I do use it whenever a get a chance - am I wrong about my info though? --- Awkward .Find statements - I know but can't find a shorter way --- Ignored With for the .Find statements - thank you (there was a nested With before, I took it out and forgot to clean up --- On Error - excellent ! thank you I was expecting "brutal" and I just got slapped :) <code> \$\endgroup\$
    – paul bica
    Commented Jun 19, 2015 at 4:05
  • \$\begingroup\$ Just a note: none of the Markdowns I tried seem to work, so my comments are very ugly (sorry) \$\endgroup\$
    – paul bica
    Commented Jun 19, 2015 at 4:07
  • \$\begingroup\$ @paulbica comments only support mini-markdown, a bit like in chat but even more limited: `code`, *italic*, **bold**.. but I think \$MathJax\$ is supported. Let's see.. \$O(n)\$ ..yup. oh and [links work, too](url). \$\endgroup\$ Commented Jun 19, 2015 at 4:20
  • \$\begingroup\$ no problem - I think the checkmark might be a bit early though; I'm sure you'll end up with more answers.. give it a day or two :) \$\endgroup\$ Commented Jun 19, 2015 at 5:01
6
\$\begingroup\$

Improved versions of GetMaxCell()

  • The first function, using an array is much faster
  • If called without the optional parameter, will default to .ThisWorkbook.ActiveSheet
  • If the range is empty will returns Cell( 1, 1 ) as default, instead of Nothing

GetMaxCell (Array): Duration: 0.0000790063 seconds
GetMaxCell (Find): Duration: 0.0002903480 seconds

.Measured with MicroTimer

Public Function GetLastCell(Optional ByVal ws As Worksheet = Nothing) As Range
    Dim uRng As Range, uArr As Variant, r As Long, c As Long
    Dim ubR As Long, ubC As Long, lRow As Long

    If ws Is Nothing Then Set ws = Application.ThisWorkbook.ActiveSheet
    Set uRng = ws.UsedRange
    uArr = uRng
    If IsEmpty(uArr) Then
        Set GetLastCell = ws.Cells(1, 1):   Exit Function
    End If
    If Not IsArray(uArr) Then
        Set GetLastCell = ws.Cells(uRng.Row, uRng.Column):  Exit Function
    End If
    ubR = UBound(uArr, 1):  ubC = UBound(uArr, 2)
    For r = ubR To 1 Step -1    '----------------------------------------------- last row
        For c = ubC To 1 Step -1
            If Not IsError(uArr(r, c)) Then
                If Len(Trim$(uArr(r, c))) > 0 Then
                    lRow = r:   Exit For
                End If
            End If
        Next
        If lRow > 0 Then Exit For
    Next
    If lRow = 0 Then lRow = ubR
    For c = ubC To 1 Step -1    '----------------------------------------------- last col
        For r = lRow To 1 Step -1
            If Not IsError(uArr(r, c)) Then
                If Len(Trim$(uArr(r, c))) > 0 Then
                    Set GetLastCell = ws.Cells(lRow + uRng.Row - 1, c + uRng.Column - 1)
                    Exit Function
                End If
            End If
        Next
    Next
End Function

'Returns last cell (max row & max col) using Find
Public Function GetMaxCell2(Optional ByRef rng As Range = Nothing) As Range

    Const NONEMPTY As String = "*"

    Dim lRow As Range, lCol As Range

    If rng Is Nothing Then Set rng = Application.ActiveWorkbook.ActiveSheet.UsedRange

    If WorksheetFunction.CountA(rng) = 0 Then
        Set GetMaxCell2 = rng.Parent.Cells(1, 1)
    Else
        With rng
            Set lRow = .Cells.Find(What:=NONEMPTY, LookIn:=xlFormulas, _
                                        After:=.Cells(1, 1), _
                                        SearchDirection:=xlPrevious, _
                                        SearchOrder:=xlByRows)
            If Not lRow Is Nothing Then
                Set lCol = .Cells.Find(What:=NONEMPTY, LookIn:=xlFormulas, _
                                            After:=.Cells(1, 1), _
                                            SearchDirection:=xlPrevious, _
                                            SearchOrder:=xlByColumns)

                Set GetMaxCell2 = .Parent.Cells(lRow.Row, lCol.Column)
            End If
        End With
    End If
End Function

Private Declare PtrSafe Function getFrequency Lib "Kernel32" Alias "QueryPerformanceFrequency" (cyFrequency As Currency) As Long
Private Declare PtrSafe Function getTickCount Lib "Kernel32" Alias "QueryPerformanceCounter" (cyTickCount As Currency) As Long


'https://msdn.microsoft.com/en-us/library/office/ff700515(v=office.14).aspx#Anchor_5
Function MicroTimer() As Double
    Dim cyTicks1 As Currency
    Static cyFrequency As Currency

    MicroTimer = 0
    If cyFrequency = 0 Then getFrequency cyFrequency        'Get frequency
    getTickCount cyTicks1                                   'Get ticks
    If cyFrequency Then MicroTimer = cyTicks1 / cyFrequency 'Returns Seconds
End Function

More info on Macro performance slow when page breaks are visible (Microsoft)

\$\endgroup\$
1
\$\begingroup\$
Public Function getMaxCell(ByRef rng As Range) As Range

'search the entire range (usually UsedRange)
'last row: find first cell with data, scanning rows, from bottom-right, leftwards
'last col: find first cell with data, scanning cols, from bottom-right, upwards

With rng
    Set getMaxCell = rng.Cells _
                    ( _
                        .Find( _
                            What:="*", _
                            SearchDirection:=xlPrevious, _
                            LookIn:=xlFormulas, _
                            After:=rng.Cells(1, 1), _
                            SearchOrder:=xlByRows).Row, _
                        .Find( _
                            What:="*", _
                            SearchDirection:=xlPrevious, _
                            LookIn:=xlFormulas, _
                            After:=rng.Cells(1, 1), _
                            SearchOrder:=xlByColumns).Column _
                    )
End With
End Function

Kudos for using the UsedRange to cut down on unnecessary cell searching, but although this popular method is very good, it's vulnerable to an unlikely bug.

If the active cell of the worksheet is in a filtered ListObject, the code will fail (incorrect Range returned from Find method). To fix this, you have to disable events, select away from the table, then select back to the original cell maybe to avoid any risk of upsetting the user or any other macros.

This means that the most robust method which avoids all bugs(AFAIK) is the below:

Public Function GetLastRow(ByRef rng As Range) As Long
    Dim arr as Variant
    arr = rng.Value2

    Dim i As Long, j As Long
    For i = UBound(arr) To 1 Step - 1
        For j = Ubound(arr, 2) To 1 Step - 1
            If Not IsError(arr(i, j))
                If arr(i, j) <> vbNullString Then
                    GetLastRow = i + rng.Row -1
                    Exit Function
                End If
            Else
                GetLastRow = i + rng.Row -1
                Exit Function 
            End If
        Next j
    Next i
End Function

I have a similar function for GetLastColumn; with these two combined you can get your MaxCell easily...

\$\endgroup\$
3
  • \$\begingroup\$ Thanks for the feedback. You have a valid poit - any filter would cause an incorrect result (ListObject or not). The only way I see to fix getMaxCell(), using the find method would be to preserve any existing filter, unfilter the data, get last cell, and filter it back with the initial filter - I'll need to update it, however, the GetLastCell() , using the loops, works as expected \$\endgroup\$
    – paul bica
    Commented Aug 27, 2017 at 11:55
  • \$\begingroup\$ Your GetLastRow() should work but there are a few issues with it: 1) it errors out at line Set arr = rng.Value2 (Type mismatch), to fix it remove the Set. 2) It errors out if parameter rng is Nothing, or something other than a Range object. 3) On lineIf arr(i, j) <> vbNullString Then it will fail if the data contains #N/A (check for error values when comparing). 4) Line getMaxRow = i + rng.Row -1 the name of the function should be GetLastRow \$\endgroup\$
    – paul bica
    Commented Aug 27, 2017 at 12:13
  • \$\begingroup\$ Nice catch. I'll edit the post for the set and name. I typed it out wrong \$\endgroup\$
    – MacroMarc
    Commented Aug 27, 2017 at 14:02

Not the answer you're looking for? Browse other questions tagged or ask your own question.