Showing posts with label MS-Excel. Show all posts

2016-06-12

Formulas for Library Catalogue Sort Order

If you use library catalogues, you may notice that book or film titles in English are sorted ignoring a leading article ("A", "An" or "The"). For example, this list of titles ...

A Kind of Intimacy
An Awfully Big Adventure
The Girl In The Polka-Dot Dress

... is sorted like this in a library catalogue ...

An Awfully Big Adventure
The Girl In The Polka-Dot Dress
A Kind of Intimacy

To reproduce this sort order in a spreadsheet, create an additional column containing the titles without the leading article, then sort by this column.

Kind of Intimacy
Awfully Big Adventure
Girl In The Polka-Dot Dress

If the title is in cell A1, the Excel formula to transform it is the nested IF() formula below, which tests for each possible article at the start of the string and returns the title without the article (or the unchanged title if there is no leading article):

=IF(LEFT(A1,2)="A ",MID(A1,3,LEN(A1)),IF(LEFT(A1,3)="An ",MID(A1,4,LEN(A1)),IF(LEFT(A1,4)="The ",MID(A1,5,LEN(A1)),A1)))

The equivalent formula is much shorter in Google Sheets because it has regular expression functions. The one below simply replaces the leading article in a title with an empty string:

=REGEXREPLACE(A1, "^(A|An|The) ", "")
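Outside a spreadsheet, the same idea can be sketched in Python: strip the leading article to build a sort key, then sort by that key. This is only an illustration using the same pattern as the Google Sheets formula; the function name sort_key is made up for this example.

```python
import re

# Leading English articles to ignore when sorting (same pattern as the formula above).
_ARTICLE = re.compile(r"^(A|An|The) ")


def sort_key(title: str) -> str:
    """Return the title without a leading article, for use as a sort key."""
    return _ARTICLE.sub("", title, count=1)


titles = [
    "A Kind of Intimacy",
    "An Awfully Big Adventure",
    "The Girl In The Polka-Dot Dress",
]

# Sort the original titles by their article-free form.
catalogue_order = sorted(titles, key=sort_key)
```

Like REGEXREPLACE, this match is case-sensitive, so a title starting with "the " in lower case keeps its article.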

2016-05-31

Formula to Convert mm/dd/yyyy String to Date

Some data I receive has the date in a string in mm/dd/yyyy format, which is easier to sort or filter in Excel when converted to a date number. If the data is in cell A2 then the formulas for splitting the date string into substrings and creating a date value are:

String     Year  Month  Day  Date
5/31/2016  2016  5      31   31/05/2016
  • Year: =RIGHT(A2,4). The year is always the last four digits.
  • Month: =LEFT(A2,FIND("/",A2)-1). The month is the one or two digits before a forward slash.
  • Day: =MID(A2,FIND("/",A2)+1,FIND("/",A2,4)-FIND("/",A2)-1). The day is the digits between the first and second forward slash.
  • Date: =DATE(B2,C2,D2).

Probably the non-obvious bit is finding the second forward slash. The FIND() formula takes three arguments: find_text, within_text, start_num, where start_num is the position to start the search for find_text. Since the month and day numbers are always one or two digits, the second forward slash must be in position 4 ("m/d/yyyy"), 5 ("mm/d/yyyy", "m/dd/yyyy") or 6 ("mm/dd/yyyy"), so starting the search from position 4 will always find the position of the second forward slash.
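For comparison, here is a small Python sketch of the same splitting logic. It mirrors the FIND() trick of starting the second search at the fourth character; the function name parse_mdy is an assumption for this example and is not part of the spreadsheet.

```python
from datetime import date


def parse_mdy(text: str) -> date:
    """Convert an m/d/yyyy or mm/dd/yyyy string to a date."""
    first = text.index("/")       # like FIND("/",A2), but 0-based
    second = text.index("/", 3)   # like FIND("/",A2,4): start at the 4th character
    year = int(text[-4:])         # the year is always the last four digits
    month = int(text[:first])     # digits before the first slash
    day = int(text[first + 1:second])  # digits between the two slashes
    return date(year, month, day)
```

In practice datetime.strptime(text, "%m/%d/%Y") does the same job in one call; the version above is only meant to show why starting the search at position 4 always finds the second slash.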

Later ... I could just use Data, Text To Columns and split the date string into three columns using the forward slash as the delimiter.

2016-04-29

Excel VBA "Code execution has been interrupted"

Very strange bug: after starting an Excel VBA macro, the macro stops and Excel displays a "Code execution has been interrupted" dialog. The only way to get the macro to finish is to keep pressing the "Continue" button. The problem only occurs on my development account, not on other accounts. The solution is to wait for the dialog, press the "Debug" button to switch to the VBE, then type Ctrl+Break. Apparently, the issue is caused by Excel reactivating breakpoints in the macro from earlier debugging sessions. I guess that is why the macro runs to completion on other accounts. Hat tip to The two WORST Excel Errors EVER.

2015-11-10

Recursive preorder traversal of a folder tree in VBA

To process files in a folder tree using Excel, I implemented the recursive preorder traversal of a tree in VBA below. I haven't had to do this before because I usually use GNU "find . -exec" to process files in folders.
Option Explicit

'References
'1. Microsoft Scripting Runtime

Global gFso As Scripting.FileSystemObject

Sub Main()
  Set gFso = New Scripting.FileSystemObject
  TraversePreorder "C:\KS Work\Temp"
End Sub

Sub TraversePreorder(ByVal sPath As String)
  Dim oFolder As Scripting.Folder, vMember As Variant
  
  If gFso.FolderExists(sPath) Then
    Debug.Print sPath 'Process folder
    Set oFolder = gFso.GetFolder(sPath)
    For Each vMember In oFolder.Files
      TraversePreorder vMember.Path 'Pass the path explicitly, not the File object
    Next vMember
    For Each vMember In oFolder.SubFolders
      TraversePreorder vMember.Path
    Next vMember
  Else
    Debug.Print sPath 'Process file
  End If

End Sub
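
For readers without Excel to hand, the same preorder traversal can be sketched in Python. This is an illustration, not a translation of the VBA: the function name traverse_preorder and the visit callback are assumptions, entries are sorted by name to make the order predictable, and symbolic links are not followed.

```python
import os


def traverse_preorder(path, visit):
    """Visit path first, then its files, then recurse into its subfolders."""
    visit(path)  # process the folder (or file) before its children
    if os.path.isdir(path) and not os.path.islink(path):
        entries = sorted(os.scandir(path), key=lambda e: e.name)
        files = [e.path for e in entries if not e.is_dir(follow_symlinks=False)]
        folders = [e.path for e in entries if e.is_dir(follow_symlinks=False)]
        # Files first, then subfolders, matching the order of the two VBA loops.
        for child in files + folders:
            traverse_preorder(child, visit)


if __name__ == "__main__":
    traverse_preorder(".", print)
```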

2015-10-21

Excel auto-fill keyboard "shortcut"

Excel's AutoFill is usually activated by selecting a group of cells and dragging the fill handle (the black square on the bottom-right hand corner of a selection) with the mouse until you have the range you want. The keyboard "shortcut" (more a sequence of keystrokes than a simple shortcut) is to select the range you want first then type: Alt+h,fi,s,Alt+f,Return. The steps break down as follows:

  1. Select the range you want.
  2. Alt+h: show the Home menu strip.
  3. fi: show the Fill sub-menu.
  4. s: show the Series dialog.
  5. Alt+f: In the Series dialog, select Type=AutoFill.
  6. Return: In the Series dialog, press the OK button.

2015-01-29

Excel cannot find the data you're searching for

Sometimes Excel cannot find text in a cell that is clearly visible. Some workarounds and solutions:

  • In the "Find and Replace" dialog, untick "Match entire cell contents".
  • In VBA, specify the LookAt parameter, .Find(..., lookat:=XlLookAt.xlPart)
  • In your worksheet, remove custom formatting from the cell range you are searching.

The last workaround is pretty weird. One would think that formatting shouldn't affect searching for data!

2014-05-22

Excel 2010 build multi-criteria search in Autofilter

Excel 2010's search in AutoFilter can be used to easily build a multi-criteria filter, but I found that the steps for adding the first criterion are different from those for adding subsequent criteria.

Assuming you have started AutoFilter for your worksheet, the steps to add the first criterion are:

  1. Click on the filter icon in the column.
  2. In the Autofilter pane, enter your criterion.
  3. The Autofilter pane displays the reduced list, including your criterion (assuming it is found).
  4. Untick "(Select All Search Results)". Excel removes the tick marks from the reduced list, including your criterion. If you don't untick this option, all the rows in your worksheet remain selected.
  5. Don't tick "Add current selection to filter". If you tick this option, all the rows in your worksheet remain selected.
  6. Tick your criterion.
  7. Press the OK button to activate the filter.
  8. Excel closes the Autofilter pane and displays only rows containing your first criterion.

To add more criteria to the filter, the steps are:

  1. Click on the filter icon in the column.
  2. In the Autofilter pane, enter your next criterion.
  3. The Autofilter pane displays the reduced list, including your next criterion (assuming it is found).
  4. Leave "(Select All Search Results)" ticked.
  5. Tick "Add current selection to filter" to keep the rows previously filtered.
  6. Leave your criterion ticked.
  7. Press the OK button to update the filter.
  8. Excel closes the Autofilter pane and displays only rows containing all your criteria (including the first criterion).

What is confusing is that you can't see your previous criteria in the Autofilter pane; you have to infer it from the list of visible rows.

2014-03-13

Smart View "Error" text in Excel cells

If your Excel worksheet has a lot of Smart View formulas (e.g. more than ten thousand HsGetValue), you may find that each cell with a Smart View formula shows "Error" after you refresh the worksheet. I think Excel encounters an error during the refresh process and does not update any of the cells. The workaround is to type Ctrl+Alt+F9, which "calculates all worksheets in all open workbooks, regardless of whether they have changed since the last calculation" (see Excel shortcut and function keys). This method of recalculation isn't visible in the Excel Formulas, Calculation menu.

2014-02-21

Workaround to conditionally format for a range of cells in Excel

You can format cells in one column based on the value of a cell in the same row in another column using Excel conditional formatting, but you have to work around some quirks in the Excel Conditional Formatting Rules Manager (CFRM) user interface to get that result. Say that you want to format cells in column A based on the value of a cell in column B for the same row, i.e. A1 depends on B1, A2 on B2, A3 on B3, etc. The steps for creating a conditional formatting rule for column A are as follows:

  1. In Excel, select menu item Home, Conditional Formatting, Manage Rules.
  2. In the CFRM dialog, press the New Rule button.
  3. In the New Formatting Rule (NFR) dialog, select Use a formula to determine which cells to format.
  4. In Format values where this formula is true, enter B1.
  5. Set the format required.
  6. Press the OK button to close the NFR dialog.
  7. Back in the CFRM dialog, you should see your new rule. In the Applies to field, enter $A:$A to use this rule for Column A. At this stage, if you press the OK button, the conditional format does not work for Column A. Use the following steps to work around this.
  8. Press the Edit Rule button to display the Edit Formatting Rule (EFR) dialog.
  9. In Format values where this formula is true, change ="B1" to =B1 (i.e. remove the double quotes) then press the OK button.
  10. In the CFRM dialog, press the OK button.

After applying these steps, when a cell in Column B is TRUE, Excel applies the specified formatting to a cell in the same row in Column A.

2014-02-04

Excel find first word without IFERROR() or IF(ISERROR())

In Excel, to find the first word in a cell, find the position of first space then return all characters up to but excluding the space: =LEFT(A1,FIND(" ",A1)-1).

The formula above fails with #VALUE! if the string in A1 doesn't have a space or the cell is empty. An Excel 2007 solution is to wrap the entire formula in IFERROR to return a default value: =IFERROR(LEFT(A1,FIND(" ",A1)-1),A1). Excel 2003 doesn't have IFERROR, so the formula is much longer, with the same expression repeated in the condition and the output of the IF formula: =IF(ISERROR(LEFT(A1,(FIND(" ",A1)))),A1,LEFT(A1,(FIND(" ",A1)-1))).

A shorter solution for Excel 2003 onwards is to append a space to the cell value in the FIND formula so that a default value can always be returned: =LEFT(A1,FIND(" ",A1&" ")-1). Note: Test if it works for OpenOffice Calc.
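
The append-a-space trick is not specific to spreadsheets. As a rough Python sketch (the function name first_word is made up for this example), searching in the text plus a trailing space guarantees the search always succeeds, so no error handling is needed:

```python
def first_word(text: str) -> str:
    """Return the characters before the first space, or the whole text if there is none."""
    # Searching in text + " " always finds a space, like FIND(" ",A1&" ").
    return text[:(text + " ").index(" ")]
```

An empty string simply returns an empty string, which matches what the shorter Excel formula does for an empty cell.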

2013-06-07

Excel workbook always prints multiple copies

We had some Excel workbooks that, by default, always printed 7 copies of each sheet. A user had to change the number of copies to 1 each time she printed a sheet, and the number of copies always started at 7 again after she saved and reopened the workbook. The non-obvious solution was to change the printer default for the workbook. Steps below:

  1. Open the workbook in Excel.
  2. Select File, Print, Print Preview.
  3. Select Page Setup.
  4. In the Page Setup dialog, press the Options... button.
  5. In the printer setup dialog, change the field that determines the number of copies. It varies from printer to printer. For example, for HP Laserjet P3005, it is in the Advanced tab, Paper/Output, Copy Count.
  6. Press the OK button to save your change.

2012-10-28

Australian Accounting Number Formats

Number format codes for Excel and LibreOffice Calc used in Australian accounting (Excel does not offer these number format codes by default, while LibreOffice Calc has the second one).

Number format code           123456.78    -123456.78    0.00
#,##0_);(#,##0);-_)          123,457      (123,457)     -
#,##0.00_);(#,##0.00);-_)    123,456.78   (123,456.78)  -

Notes:

  • Semicolons separate the sections of the format code by the sign of the number being formatted: positive;negative;zero.
  • The underscore before the right parenthesis: append a trailing space the size of the parenthesis so that positive numbers are right-aligned with negative numbers (which are enclosed in parentheses).
  • Using the first format code, your numbers are formatted without cents.

The underlying value in the cell hasn't changed, just the presentation. Use sparingly; you can confuse a user who tries to use what they see in the cell in their formula and wonder why their formula doesn't work.
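
To make the three sections concrete, here is a rough Python imitation of these format codes. It is a sketch under assumptions: the function name accounting is hypothetical, an ordinary space stands in for the parenthesis-width padding produced by the underscore, and Python's rounding of exact halves can differ from Excel's.

```python
def accounting(value: float, decimals: int = 0) -> str:
    """Format a number in the positive;negative;zero style of the codes above."""
    if value > 0:
        # Positive section: thousands separators plus a trailing pad.
        return f"{value:,.{decimals}f} "
    if value < 0:
        # Negative section: magnitude enclosed in parentheses.
        return f"({-value:,.{decimals}f})"
    # Zero section: a dash plus the same trailing pad.
    return "- "
```

For example, accounting(123456.78) gives "123,457 " and accounting(-123456.78, 2) gives "(123,456.78)". As with the spreadsheet format, only the presentation changes; the value passed in is untouched.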

2012-07-07

Excel Data Validation Drop Down Control Not Visible

A user found a strange problem in an Excel workbook: the data validation drop down control did not appear for any data validation cells in a specific worksheet (they appear in other worksheets). I tried the possible solutions but they did not work.

The solution came when I tried to resave the workbook as an Excel 97-2003 XLS file and the Excel compatibility checker reported some formulas were unavailable (in this case, SUMIFS) for earlier versions of Excel. After saving the workbook as an XLSX file to avoid this error, the data validation drop down control reappeared in the worksheet. It looks like opening an XLS file using Excel 2007 in compatibility mode may cause unexpected problems like this.

2012-06-23

Flatten or Collapse Excel Multi-column Data Into One Dimension

You can flatten or collapse multi-column Excel data into one row or column using GnuWin utilities.

  1. Copy the worksheet data into the clipboard.
  2. Open a CMD window and enter this chain of commands: getclip | tr -s [:cntrl:] \n | putclip.
  3. Paste the data back into your worksheet. You should get a column of data.

The getclip-putclip pair of programs gets and puts data in the system clipboard and is part of the CygWin package in GnuWin. tr translates control characters (e.g. TABs) to NEWLINEs and the -s option squeezes out repeated characters. This chain of commands works because the columns of Excel data are separated by TAB characters in the clipboard. Here are more examples of this pattern.

If you want to remove duplicates in your data, insert sort -u into the chain: getclip | tr -s [:cntrl:] \n | sort -u | putclip. For example, if you start with this input (note the trailing TABs in rows 1 and 2) ...

1  
2 4 
3 5 7
4 6 8
5 7 9

... you end up with the following:


1
2
3
4
5
6
7
8
9

If you want a single row output, replace all control characters with TABs: getclip | tr -s [:cntrl:] \t | putclip. In this case, you cannot include sort because it sorts lines of data and there is only one line in the output.

1 2 4 3 5 7 4 6 8 5 7 9 
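
If the GnuWin tools are not available, the text transformation itself is easy to reproduce. The Python sketch below works on a string that is assumed to hold the clipboard text (TABs between columns, CR LF between rows); the function name flatten and its parameters are invented for this example, and getting the text in and out of the clipboard is left out.

```python
import re


def flatten(clip: str, sep: str = "\n", unique: bool = False) -> str:
    """Split text on runs of control characters and rejoin it with sep."""
    # One or more ASCII control characters (TAB, CR, LF, ...) act as one separator,
    # similar to tr -s [:cntrl:]; empty pieces at either end are dropped.
    tokens = [t for t in re.split(r"[\x00-\x1f\x7f]+", clip) if t]
    if unique:
        # Similar to sort -u: remove duplicates and sort as text.
        tokens = sorted(set(tokens))
    return sep.join(tokens)
```

Unlike the tr pipeline, this version does not leave a trailing separator, and like sort it compares values as text rather than as numbers.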

2011-12-02

Specify Rows or Columns for Excel AutoFit

If you use the Excel AutoFit Method to resize cells to display their contents nicely, remember to specify whether to AutoFit rows or columns in Excel, i.e. use Range().Columns.AutoFit or Range().Rows.AutoFit, not just Range().AutoFit. If you don't specify rows or columns, Excel produces Run-time error '1004': AutoFit method of Range class failed.

2010-10-15

Automatically Update Last Row in Range Formulas

In OpenOffice Calc, worksheet formulas that apply to a range of rows, such as SUM(), need to be updated when an extra row is added just above the cell with that formula. For example, if you start with the following worksheet values and formulas ...

  A
1 4
2 7
3 9
4 =SUM(A1:A3)

... then when you insert a new row between rows 3 and 4, OO-Calc does not automatically update the second argument to the SUM() formula (Note: MS-Excel 2003 does). A solution is to use the OFFSET formula to automatically update the second argument like this: =SUM(A1:OFFSET(A4;-1;0;1;1)). In this example, the OFFSET returns a reference to a cell one row above cell A4.

2010-10-11

Starting Excel 2003 Faster

I start two or three Excel 2003 sessions at the same time so that I can use multiple displays and work on a couple of tasks at a time. To start a new Excel session, you launch it from the Windows Start menu instead of opening an Excel file in Explorer (opening an Excel file loads it in the active Excel session). A new Excel session has, by default, a Task Pane (a panel titled 'Getting Started') and a new workbook, both of which I rarely use, so I close them before opening an existing file.

If you don't want the Task Pane opened by default, open Excel's Options dialog and uncheck View, Show, Startup Task Pane.

If you don't want a new workbook by default, create a new shortcut to start Excel with the /e argument and open Excel with that shortcut.

2010-05-31

VBA One-liner to Concatenate Excel Cell Values

Here's a VBA one-liner to concatenate a column of Excel cell values, with a separator, into one string: Join(Application.Transpose(<column range>), ","). The column range is an Nx1 array, Transpose() returns a one-dimensional N-element array and the Join() function takes a one-dimensional array as its first parameter.

To concatenate values from a row of cells, you have to use Transpose() twice like this: Join(Application.Transpose(Application.Transpose(<row range>)), ","). Excel's Range always returns a two-dimensional array (in this case, a row is a 1xM array) and you flatten the two-dimensional array by calling Transpose() twice. There doesn't seem to be a built-in function to flatten two-dimensional arrays.
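
The shapes involved are easier to follow in a small model. The Python sketch below imitates, with nested lists, how Transpose() is described above: an Nx1 array comes back as a one-dimensional array, a 1xM array comes back as Mx1. It is only a model for illustration; the function name transpose and the list representation are assumptions, not the Excel API.

```python
def transpose(values):
    """Rough model of the Transpose() behaviour described above, using lists."""
    if values and not isinstance(values[0], list):
        # A 1-D list of N items becomes an N x 1 two-dimensional list.
        return [[v] for v in values]
    cols = [list(col) for col in zip(*values)]
    # An N x 1 input transposes to a single row, which is returned as a 1-D list.
    return cols[0] if len(cols) == 1 else cols


column = [[1], [2], [3]]   # N x 1, like a column range
row = [[1, 2, 3]]          # 1 x M, like a row range

joined_column = ",".join(str(v) for v in transpose(column))
joined_row = ",".join(str(v) for v in transpose(transpose(row)))
```

Both joined_column and joined_row end up as the same comma-separated string, which is why the row case needs the second call.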

2010-05-27

Avoiding #N/A in Range VLOOKUP

I was using VLOOKUP to put percentages in buckets. Here's the lookup table:

   a b
1  0 1
2 25 2
3 50 3
4 75 4

And here's the formula I added to my data sheet: =VLOOKUP(input, A1:B4, 2, TRUE). When the fourth parameter, Range_Lookup, is set to TRUE, the function matches a range instead of an exact input value. If the input value is 0 or greater (e.g. 0, 25.5), VLOOKUP returns a bucket number, as expected. My input data was dirty and included negative numbers (e.g. -0.5, -1), so VLOOKUP would then return #N/A and dependent formulas would show an error.

One solution was to enclose the input value in an IF function and avoid negative input values: =VLOOKUP(IF(input < 0, 0, input), A1:B4, 2, TRUE).

After reading about VLOOKUP, Range_Lookup parameter ...

If TRUE or omitted, an exact or approximate match is returned. If an exact match is not found, the next largest value that is less than lookup_value is returned.

... another way to avoid the #N/A error is to provide a dummy value smaller than all the negative values and an extra bucket:

    a b
1 -99 0
2   0 1
3  25 2
4  50 3
5  75 4
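
The approximate-match behaviour can be sketched in Python with a binary search over the sorted first column. This is an illustration of the idea only: the function name range_lookup is made up, and None stands in for #N/A.

```python
from bisect import bisect_right


def range_lookup(value, table):
    """Return the bucket for the largest key that is <= value, or None if there is none."""
    keys = [row[0] for row in table]   # first column, assumed sorted ascending
    i = bisect_right(keys, value)      # number of keys that are <= value
    return table[i - 1][1] if i else None


buckets = [(0, 1), (25, 2), (50, 3), (75, 4)]
padded = [(-99, 0)] + buckets          # extra dummy row for negative inputs
```

With buckets, range_lookup(-0.5, buckets) finds no key at or below the input and returns None, which is the #N/A case; with padded, the same input falls into the extra bucket 0.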

2010-05-23

Rudimentary C# COM Interop Client for Excel

I started writing some basic C# COM Interop clients for Excel to keep a hand in C# and be able to develop more sophisticated applications than VBA could provide. Here are some basic things I learnt and a complete working example to read data from an Excel workbook.

Configure your project

  1. Add this project reference in the IDE: GAC (or COM) Microsoft.Office.Interop.Excel.
  2. Add to the source code:
    using System.Runtime.InteropServices;
    using Excel = Microsoft.Office.Interop.Excel; //Alias so that Excel.Application etc. resolve
    

Clean up COM References

After your program has finished using Excel, it should release all references to COM objects, otherwise the Excel automation process remains running (check the Task Manager process list for Excel.exe). It seems a bit back-to-front to mention this first but I like to know how to free unused resources before I start using them. Here is a discussion and statements you should add to your code.

Optional Parameters and Type.Missing

C# (pre-2010) doesn't support optional parameters so you have to supply an argument for all parameters in exposed methods. Since many frequently-used Excel methods require a lot of parameters, you have to tolerate verbose function calls in your C# source code.

Often you would use the default value for most parameters and you can simply provide the value Type.Missing. If you have Visual Studio Tools for Office (VSTO), you can use a global variable called missing. In the example below, Workbooks.Open() requires 15 arguments but I only provide the file name and use the default value for the remaining parameters.

      Excel.Application app = new Excel.Application();
      Excel.Workbook wbk = app.Workbooks.Open(
        path, Type.Missing, Type.Missing, Type.Missing, Type.Missing,
        Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing,
        Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);

Getting Data From Excel - Indexers, Arrays and Accessors

You can access Worksheets, Ranges and their values in VBA with this syntax: object.property or collection(index). Using the Excel object model in C# via COM Interop requires knowing about indexers, arrays and accessors.

Items in Excel collections (e.g. Worksheets) can be accessed using indexers, e.g. Workbook.Worksheets[1]. Like their VBA counterparts, some indexers support different types of arguments. The return value is always of type object so you have to cast the return result. For example:

      Excel.Worksheet wks0 = (Excel.Worksheet)wkss["Sheet1"];
      Excel.Worksheet wks1 = (Excel.Worksheet)wkss[1];

For me, an Excel indexer is syntactic sugar for an accessor method called get_Item(). (Note: The SharpDevelop IDE doesn't list this accessor in the list of object methods but VS C# 2008 Express does):

      Excel.Worksheet wks2 = (Excel.Worksheet)wkss.get_Item("Sheet1");
      Excel.Worksheet wks3 = (Excel.Worksheet)wkss.get_Item(1);

Another thing to note is that the indexers return a type object so you have to cast the return result.

Once you have a reference to a Worksheet, you can refer to individual cells using its Cells array (and remember to cast the return result):

      Excel.Range cell0 = (Excel.Range)wks0.Cells[1,1];
      Excel.Range cell1 = (Excel.Range)wks0.Cells.get_Item(1,1);

You can also get a single cell using the Worksheet's Range property. Unlike the Cells property, Range is a parameterised property, so you have to use the Worksheet method get_Range(object Cell1, object Cell2):

      Excel.Range rng0 = wks0.get_Range("A1", Type.Missing);

A common way to refer to a set of cells is to provide the top left and bottom right cell to the Worksheet's get_Range() accessor:

      Excel.Range rng1 = wks0.get_Range("A1", "A6");

If you try to get a Range using the function or array syntax, e.g. Excel.Range rng1 = wks0.Range("A1", "A6"); or Excel.Range rng0 = wks0.Range["A1","A6"];, you get a compile-time error: Property, indexer, or event 'Range' is not supported by the language; try directly calling accessor method 'Excel._Worksheet.get_Range(object, object)'. I take this to mean that Range is exposed as a property but can only be used as a method, since C# (pre-2010?) doesn't support parameterised properties.

After getting a reference to a Range, you can get its data using either the get_Value() accessor or Value2 property (like Range, the Value property is a parameterised property). Finally, since a Range object can refer to a single cell or a set of cells, you have to know whether you are retrieving a single object or an object array from a Range.

      string s0 = (string)rng0.get_Value(Type.Missing);
      string s1 = (string)rng0.Value2;
      object[,] arr0 = (object[,])rng1.get_Value(Type.Missing);
      object[,] arr1 = (object[,])rng1.Value2;

Complete Example

Putting all that information together, here is the complete code for a rudimentary C# COM Interop client to read data from cells A1:A6 in an Excel workbook. Note that some statements are redundant and are there to illustrate different ways of getting the same result.

The program's output is:

$A$1:$A$6
1,1 Text:Text
2,1 1:1
3,1 5.678:5.678
4,1 0.5:0.5
5,1 1/06/2010 12:00:00 AM:40330
6,1 21.5078:21.507812345
Check Excel.exe in Task Manager . . .

The program outputs both a cell's Value and Value2 property. As you can see, cell A5's Value is a date while its Value2 property is a number, and cell A6's Value is a currency amount to four decimal places while its Value2 property is a double. Just before the program exits, all COM references to Excel should be cleaned up and the program prompts the user to check the Task Manager, where there should not be any Excel.exe processes in the process list.

//Add to project reference:
//GAC, Microsoft.Office.Interop.Excel
using System;
using System.Runtime.InteropServices;
using Excel = Microsoft.Office.Interop.Excel; //Alias so that Excel.Application etc. resolve

namespace ExcelAutomation1 {
  class Program {
    public static void Main(string[] args) {
      string path = @"C:\Users\hung\Documents\Programming\C#\ExcelAutomation1\ExcelAutomation1\Test.xls";
      Excel.Application app = new Excel.Application();
      Excel.Workbook wbk = app.Workbooks.Open(
        path, false, true, Type.Missing, Type.Missing,
        Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing,
        Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);
      Excel.Sheets wkss = wbk.Worksheets;
      Excel.Worksheet wks0 = (Excel.Worksheet)wkss["Sheet1"];
      Excel.Worksheet wks1 = (Excel.Worksheet)wkss[1];
      Excel.Worksheet wks2 = (Excel.Worksheet)wkss.get_Item("Sheet1");
      Excel.Worksheet wks3 = (Excel.Worksheet)wkss.get_Item(1);
      Excel.Range cell0 = (Excel.Range)wks0.Cells[1,1];
      Excel.Range cell1 = (Excel.Range)wks0.Cells.get_Item(1,1);
      Excel.Range rng0 = wks0.get_Range("A1", Type.Missing);
      //Excel.Range rng1 = wks0.Range("A1", "A6"); //Not supported
      //Excel.Range rng1 = wks0.Range["A1","A6"]; //Not supported
      Excel.Range rng1 = wks0.get_Range("A1", "A6");
      Console.WriteLine(rng1.get_Address(Type.Missing, Type.Missing, Excel.XlReferenceStyle.xlA1, Type.Missing, Type.Missing));
      string s0 = (string)rng0.get_Value(Type.Missing);
      string s1 = (string)rng0.Value2;
      object[,] arr0 = (object[,])rng1.get_Value(Type.Missing);
      object[,] arr1 = (object[,])rng1.Value2;
      int cntRow = arr0.GetLength(0);
      int cntColumn = arr0.GetLength(1);
      for (int r = 1; r <= cntRow; ++r) {
        for (int c = 1; c <= cntColumn; ++c) {
          Console.WriteLine(r.ToString() + "," + c.ToString() + " " + arr0[r,c] + ":" + arr1[r,c]);
        }
      }
      rng0 = rng1 = null;
      cell0 = cell1 = null;
      wks0 = wks1 = wks2 = wks3 = null;
      wkss = null;
      
      GC.Collect();
      GC.WaitForPendingFinalizers();
      GC.Collect();
      GC.WaitForPendingFinalizers();

      wbk.Close(Type.Missing, Type.Missing, Type.Missing);
      Marshal.FinalReleaseComObject(wbk);
      app.Quit();
      Marshal.FinalReleaseComObject(app);
      Console.Write("Check Excel.exe in Task Manager . . . ");
      Console.ReadKey(true);
    }
  }
}
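
In the program's output, cell A5 shows 1/06/2010 as its Value and 40330 as its Value2. The number is Excel's date serial, and the relationship can be checked with a short Python sketch. It assumes the workbook uses Excel's default 1900 date system, for which 30 December 1899 is the usual day zero for dates after February 1900; the names EXCEL_EPOCH and serial_to_date are invented for this example.

```python
from datetime import date, timedelta

# Assumed day zero for the default 1900 date system (valid for serials after February 1900).
EXCEL_EPOCH = date(1899, 12, 30)


def serial_to_date(serial: float) -> date:
    """Convert an Excel date serial (as returned by Value2) to a calendar date."""
    # The integer part counts days; any fractional part is the time of day and is ignored here.
    return EXCEL_EPOCH + timedelta(days=int(serial))
```

serial_to_date(40330) returns 1 June 2010, which agrees with the Value shown for A5.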
