07 December 2009

Simple cache in Excel using VBA static variables

Static variables in VBA functions, like their counterparts in C, retain their values between function calls. A common use for static variables is to maintain a running total or count. This article describes how static variables can be used for cacheing in Excel VBA, compares the performance of a simple approach versus a cache in Excel, and some discussion for using this approach.

Excel applications often have read or write data in specific cells, columns or rows, which are identified by cells containing specific strings. A simple approach is to write a function that returns a cell reference given a string, as below.

Function GetHeaderFind(ByVal str As String) As Range
  Set GetHeaderFind = Sheet1.Range("A:A").Find(What:=str, SearchOrder:=xlByRows, SearchDirection:=xlNext)
End Function

If we know that the cells referenced do not change during the execution of the program, then we could find them when they are first required and cache their values in a Collection. The next time the same string is presented, we just return a reference from the Collection:

Function GetHeaderStatic(ByVal str As String) As Range
  Static colRange As New Collection
  Dim rng As Range
  On Error Resume Next
  Set rng = colRange(str)
  If Err.Number <> 0 Then
    Set rng = Sheet1.Range("A:A").Find(What:=str, SearchOrder:=xlByRows, SearchDirection:=xlNext)
    colRange.Add key:=str, Item:=rng
  End If
  On Error GoTo 0
  Set GetHeaderStatic = rng
End Function

Does caching the results make a difference? We test it by searching for three values 100000 times and writing the time elapsed between the two approaches:

Public Sub TestCache()
  Dim sStart As Single, sTimeFind As Single, sTimeCache As Single
  Dim rng As Range, wks As Worksheet
  Dim rngHeader As Range
  Const iCOUNT As Long = 100000
  Dim i As Long, j As Integer
  Dim aKey(3) As String
  aKey(0) = "x"
  aKey(1) = "y"
  aKey(2) = "z"
  'Find implementation
  sStart = Timer
  For i = 1 To iCOUNT Step 1
    For j = 0 To 2 Step 1
      Set rng = GetHeaderFind(aKey(j))
    Next j
  Next i
  sTimeFind = Timer - sStart
  'Cache implementation
  sStart = Timer
  For i = 1 To iCOUNT Step 1
    For j = 0 To 2 Step 1
      Set rng = GetHeaderStatic(aKey(j))
    Next j
  Next i
  sTimeCache = Timer - sStart
  Debug.Print "Find, Cache times = " & sTimeFind & "," & sTimeCache
  Debug.Print "Ratio = " & sTimeFind / sTimeCache
  End 'Dispose of static variables
End Sub

The results below show that the cache is about 29 times faster than the simple implementation.

Find, Cache times = 16.92969,0.5859375
Ratio = 28.89333
Find, Cache times = 16.85156,0.578125
Ratio = 29.14865
Find, Cache times = 16.84375,0.578125
Ratio = 29.13514

Of course, we could have simply declared and initialised a module-level Collection, and saved ourselves the effort of writing a cache. Below are some reasons why creating a cache in a function is useful:

  • We are refactoring an existing function and do not want to (or cannot) change the function's interface.
  • Unlike a VBA module-level variable which can be inadvertently changed by any function, we ensure that the Collection can only be changed in a one location because only the function can modify its static variable.


03 December 2009

R1C1 notation in Excel VBA

Excel supports both A1 and R1C1 notation for referencing cells in formulas. Almost everyone uses A1 notation but it can be easier to use R1C1 notation to generate column references in VBA. For instance, to generate a formula that references a column in A1 notation, you need a function that returns A to Z for the first 26 columns, then returns two letter column names (such as AA) for columns past the 26th. With R1C1 notation, you just concatenate C with a column number (such as C1 or C27). Lastly, to assign a formula to a cell using R1C1 notation, use the cell's FormulaR1C1 property.

01 December 2009

Excel VBA programming with run-time errors

When writing VBA programs for Microsoft Excel 2003, you quickly find that you have to detect and handle run-time errors in your programs. VBA provides On Error and Resume statements for handling run-time errors. This article discusses how run-time errors can be used, explicitly or implicitly, in VBA programs.

Let's start with a script below, which calls the WorksheetFunction class' Match() function to find a cell with the value z in the range B1:B8:

Sub TestMatch()
  Dim wks As Worksheet
  Set wks = Sheet1
  Dim dblMatchResult As Double
  dblMatchResult = Application.WorksheetFunction.Match("z", wks.[B1:B8], 0)
  Debug.Print "Found at " & dblMatchResult
End Sub

If Match() function succeeds, then the relative row number is returned. However, if Match() fails, Excel generates this error message: Run-time error '1004': Unable to get the Match property of the WorksheetFunction class. In this case, the error means that Match() did not find z in the range B1:B8. It is similar to a #N/A result in a cell (though Error 1004 is a VBA error code and not related to the Excel-specific xlErrNA).

Below, we explicitly use the run-time error to (trivially) determine what to print in the statement after the error with On Error Resume Next. We also explicitly clear the error handler and allow errors to be detected again with On Error GoTo 0 (of course, detecting errors again doesn't matter in this short example).

Sub TestMatch()
  Dim wks As Worksheet
  Set wks = Sheet1
  Dim dblMatchResult As Double
  On Error Resume Next
  dblMatchResult = Application.WorksheetFunction.Match("z", wks.[B1:B8], 0)
  Debug.Print "z " & IIf(Err.Number = 0, "found at " & dblMatchResult, "not found")
  On Error GoTo 0
End Sub

Another common situation where you may explicitly test error codes is when you use a Collection. VBA's Collection class do not have a method to test if a key exists, so the only way to know if a key exists in your collection object is to try and retrieve the key from the collection and get a run-time error, as in the script below.

Sub TestCollection()
  Dim col As New Collection
  col.Add Key:="b", Item:="bog"
  col.Add Key:="d", Item:="dog"
  Dim str As String
  On Error Resume Next
  str = col("c")
  Debug.Print "Key c " & IIf(Err.Number = 0, " found", " not found") & " in collection"
  On Error GoTo 0
End Sub

In this example, the error would be Run-time error '5': Invalid procedure call or argument since the key c doesn't exist in the collection.

(Some run-time error messages generated by VBA are misleading: Error 1004 sounds like VBA failed to resolve the Match() function in the WorksheetFunctions class, and Error 5 is unclear because neither the procedure nor the argument is invalid. Error messages are also inconsistent. The commonly used Sheets Collection generates a different run-time error message than a VBA Collection if you try to access a key that doesn't exist. Try this in the Immediate Pane: ? Sheets(99).Name or ? Sheets("Blah").Name. Unless you have 99 sheets or a sheet called Blah in your workbook, you will see this more comprehensible error, Run-time error '9': Subscript out of range rather than Error 5.)

Some errors can be incidental to your program's logic and easily rectified, such that you can use VBA's On Error GoTo <label> statement to jump to a statement block to fix the error, then re-execute the statement that caused the error with Resume 0. For instance, the following function returns a reference to a sheet. It tries to obtain a sheet reference using the statement on line 3, jumping to lblCreateSheet to create a sheet first if it does not exist, then re-executing line 3.

1 Function AddSheetGoTo(ByVal str As String) As Worksheet
2   On Error GoTo lblCreateSheet:
3   Set AddSheetGoTo = Sheets(str) 'Run-time error '9': Subscript out of range
4   Exit Function
5 lblCreateSheet:
6   Sheets.Add.Name = str
7   Resume 0
8 End Function

Using On Error GoTo <label> ... Resume 0 means that the body of your function is not cluttered with exception handling statements, though you have to include Exit (Function or Sub) to explicitly end your function (and avoid inadvertently executing the error handler), and you should ensure that your error handler can actually fix the error, otherwise you will end up with an infinite loop. Compare this approach with explicitly testing and clearing an error using On Error Resume Next ... On Error GoTo 0, where you end up with a slightly longer and more cluttered function:

 1 Function AddSheetResume(ByVal str As String) As Worksheet
 2   Dim wks As Worksheet
 3   On Error Resume Next
 4   Set wks = Sheets(str) 'Run-time error '9': Subscript out of range
 5   If Err.Number <> 0 Then
 6     On Error GoTo 0
 7     Sheets.Add.Name = str
 8     Set wks = Sheets(str)
 9   End If
10   Set AddSheetResume = wks
11 End Function

This article discussed how VBA's run-time error handling can be used explicitly as part of your program logic, or implicitly to keep your program logic clear. It also notes that VBA run-error codes are not always clear or consistent.