23 May 2010

Rudimentary C# COM Interop Client for Excel

I started writing some basic C# COM Interop clients for Excel to keep a hand in C# and be able to develop more sophisticated applications than VBA could provide. Here are some basic things I learnt and a complete working example to read data from an Excel workbook.

Configure your project

  1. Add to this project reference in the IDE: GAC (or COM) Microsoft.Office.Interop.Excel.
  2. Add to the source code:
    using System.Runtime.InteropServices;
    using Microsoft.Office.Interop.Excel;
    

Clean up COM References

After your program has finished using Excel, it should release all references to COM objects otherwise the Excel automation process remains running (check the Task Manager process list for Excel.exe). It seems a bit back-to-front to mention this first but I like to know how free unused resources before I start using them. Here is a discussion and statements you should add to your code.

Optional Parameters and Type.Missing

C# (pre-2010) doesn't support optional parameters so you have to supply an argument for all parameters in exposed methods. Since many frequently-used Excel methods require a lot of parameters, you have to tolerate verbose function calls in your C# source code.

Often you would use the default value for most parameters and you can simply provide the value Type.Missing. If you have Visual Studio Tools for Office (VSTO), you can use a global variable called missing. In the example below, Workbooks.Open() requires 15 arguments but I only provide the file name and use the default value for the remaining parameters.

      Excel.Application app = new Excel.Application();
      Excel.Workbook wbk = app.Workbooks.Open(
        path, Type.Missing, Type.Missing, Type.Missing, Type.Missing,
        Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing,
        Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);

Getting Data From Excel - Indexers, Arrays and Accessors

You can access Worksheets, Ranges and their values in VBA with this syntax: object.property or collection(index). Using the Excel object model in C# via COM Interop requires knowing about indexers, arrays and accessors.

Items in Excel collections (e.g. Worksheets) can be accessed using indexers, e.g. Workbook.Worksheets[1]. Like their VBA counterparts, some indexers support different types of arguments. The return value is always of type object so you have to cast the return result. For example:

      Excel.Worksheet wks0 = (Excel.Worksheet)wkss["Sheet1"];
      Excel.Worksheet wks1 = (Excel.Worksheet)wkss[1];

For me, an Excel indexer is syntactic sugar for an accessor method called get_Item(). (Note: The SharpDevelop IDE doesn't list this accessor in the list of object methods but VS C# 2008 Express does):

      Excel.Worksheet wks2 = (Excel.Worksheet)wkss.get_Item("Sheet1");
      Excel.Worksheet wks3 = (Excel.Worksheet)wkss.get_Item(1);

Another thing to note is that the indexers return a type object so you have to cast the return result.

Once you have a reference to a Worksheet, you can refer to individual cells using its Cells array (and remember to cast the return result):

      Excel.Range cell0 = (Excel.Range)wks0.Cells[1,1];
      Excel.Range cell1 = (Excel.Range)wks0.Cells.get_Item(1,1);

You can also get a single cell using the Worksheet's Range property. Unlike the Cells property, Range is a parameterised property, so you have to use the Worksheet method get_Range(object Cell1, object Cell2):

      Excel.Range rng0 = wks0.get_Range("A1", Type.Missing);

A common way to refer to a set of cells is to provide the top left and bottom right cell to the Worksheet's get_Range() accessor:

      Excel.Range rng1 = wks0.get_Range("A1", "A6");

If you try to get a Range using the function or array syntax, e.g. Excel.Range rng1 = wks0.Range("A1", "A6"); or Excel.Range rng0 = wks0.Range["A1","A6"];, you get a compile-time error: Property, indexer, or event 'Range' is not supported by the language; try directly calling accessor method 'Excel._Worksheet.get_Range(object, object)', which I take it to mean that Range is exposed as a property but can only be used as a method since C# (pre-2010?) doesn't support parameterised properties.

After getting a reference to a Range, you can get its data using either the get_Value() accessor or Value2 property (like Range, the Value property is a parameterised property). Finally, since a Range object can refer to a single cell or a set of cells, you have to know whether you are retrieving a single object or an object array from a Range.

      string s0 = (string)rng0.get_Value(Type.Missing);
      string s1 = (string)rng0.Value2;
      object[,] arr0 = (object[,])rng1.get_Value(Type.Missing);
      object[,] arr1 = (object[,])rng1.Value2;

Complete Example

Putting all that information together, here is the complete code for a rudimentary C# COM Interop client to read data from cells A1:A6 in an Excel workbook. Note that some statements are redundant and are there to illustrate different ways of getting the same result.

The program's output is:

$A$1:$A$6
1,1 Text:Text
2,1 1:1
3,1 5.678:5.678
4,1 0.5:0.5
5,1 1/06/2010 12:00:00 AM:40330
6,1 21.5078:21.507812345
Check Excel.exe in Task Manager . . .

The program outputs both a cell's Value and Value2 property. As you can see, cell A5's Value is a date while its Value2 property is a number, and cell A6's Value is a currency amount to four decimal points while its Value2 property is a double. Just before it exits, all COM references to Excel should be cleaned up and the program prompts the user to check the Task Manager, where there should not be any Excel.exe processes in the process list.

//Add to project reference:
//GAC, Microsoft.Office.Interop.Excel
using System;
using System.Runtime.InteropServices;
using Microsoft.Office.Interop.Excel;

namespace ExcelAutomation1 {
  class Program {
    public static void Main(string[] args) {
      string path = @"C:\Users\hung\Documents\Programming\C#\ExcelAutomation1\ExcelAutomation1\Test.xls";
      Excel.Application app = new Excel.Application();
      Excel.Workbook wbk = app.Workbooks.Open(
        path, false, true, Type.Missing, Type.Missing,
        Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing,
        Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);
      Excel.Sheets wkss = wbk.Worksheets;
      Excel.Worksheet wks0 = (Excel.Worksheet)wkss["Sheet1"];
      Excel.Worksheet wks1 = (Excel.Worksheet)wkss[1];
      Excel.Worksheet wks2 = (Excel.Worksheet)wkss.get_Item("Sheet1");
      Excel.Worksheet wks3 = (Excel.Worksheet)wkss.get_Item(1);
      Excel.Range cell0 = (Excel.Range)wks0.Cells[1,1];
      Excel.Range cell1 = (Excel.Range)wks0.Cells.get_Item(1,1);
      Excel.Range rng0 = wks0.get_Range("A1", Type.Missing);
      //Excel.Range rng1 = wks0.Range("A1", "A6"); //Not supported
      //Excel.Range rng1 = wks0.Range["A1","A6"]; //Not supported
      Excel.Range rng1 = wks0.get_Range("A1", "A6");
      Console.WriteLine(rng1.get_Address(Type.Missing, Type.Missing, Excel.XlReferenceStyle.xlA1, Type.Missing, Type.Missing));
      string s0 = (string)rng0.get_Value(Type.Missing);
      string s1 = (string)rng0.Value2;
      object[,] arr0 = (object[,])rng1.get_Value(Type.Missing);
      object[,] arr1 = (object[,])rng1.Value2;
      int cntRow = arr0.GetLength(0);
      int cntColumn = arr0.GetLength(1);
      for (int r = 1; r <= cntRow; ++r) {
        for (int c = 1; c <= cntColumn; ++c) {
          Console.WriteLine(r.ToString() + "," + c.ToString() + " " + arr0[r,c] + ":" + arr1[r,c]);
        }
      }
      rng0 = rng1 = null;
      cell0 = cell1 = null;
      wks0 = wks1 = wks2 = wks3 = null;
      wkss = null;
      
      GC.Collect();
      GC.WaitForPendingFinalizers();
      GC.Collect();
      GC.WaitForPendingFinalizers();

      wbk.Close(Type.Missing, Type.Missing, Type.Missing);
      Marshal.FinalReleaseComObject(wbk);
      app.Quit();
      Marshal.FinalReleaseComObject(app);
      Console.Write("Check Excel.exe in Task Manager . . . ");
      Console.ReadKey(true);
    }
  }
}

References