07 December 2009

Simple cache in Excel using VBA static variables

Static variables in VBA functions, like their counterparts in C, retain their values between function calls. A common use for static variables is to maintain a running total or count. This article describes how static variables can be used for cacheing in Excel VBA, compares the performance of a simple approach versus a cache in Excel, and some discussion for using this approach.

Excel applications often have read or write data in specific cells, columns or rows, which are identified by cells containing specific strings. A simple approach is to write a function that returns a cell reference given a string, as below.

Function GetHeaderFind(ByVal str As String) As Range
  Set GetHeaderFind = Sheet1.Range("A:A").Find(What:=str, SearchOrder:=xlByRows, SearchDirection:=xlNext)
End Function

If we know that the cells referenced do not change during the execution of the program, then we could find them when they are first required and cache their values in a Collection. The next time the same string is presented, we just return a reference from the Collection:

Function GetHeaderStatic(ByVal str As String) As Range
  Static colRange As New Collection
  Dim rng As Range
  On Error Resume Next
  Set rng = colRange(str)
  If Err.Number <> 0 Then
    Set rng = Sheet1.Range("A:A").Find(What:=str, SearchOrder:=xlByRows, SearchDirection:=xlNext)
    colRange.Add key:=str, Item:=rng
  End If
  On Error GoTo 0
  Set GetHeaderStatic = rng
End Function

Does caching the results make a difference? We test it by searching for three values 100000 times and writing the time elapsed between the two approaches:

Public Sub TestCache()
  Dim sStart As Single, sTimeFind As Single, sTimeCache As Single
  Dim rng As Range, wks As Worksheet
  Dim rngHeader As Range
  Const iCOUNT As Long = 100000
  Dim i As Long, j As Integer
  Dim aKey(3) As String
  aKey(0) = "x"
  aKey(1) = "y"
  aKey(2) = "z"
  'Find implementation
  sStart = Timer
  For i = 1 To iCOUNT Step 1
    For j = 0 To 2 Step 1
      Set rng = GetHeaderFind(aKey(j))
    Next j
  Next i
  sTimeFind = Timer - sStart
  'Cache implementation
  sStart = Timer
  For i = 1 To iCOUNT Step 1
    For j = 0 To 2 Step 1
      Set rng = GetHeaderStatic(aKey(j))
    Next j
  Next i
  sTimeCache = Timer - sStart
  Debug.Print "Find, Cache times = " & sTimeFind & "," & sTimeCache
  Debug.Print "Ratio = " & sTimeFind / sTimeCache
  End 'Dispose of static variables
End Sub

The results below show that the cache is about 29 times faster than the simple implementation.

Find, Cache times = 16.92969,0.5859375
Ratio = 28.89333
Find, Cache times = 16.85156,0.578125
Ratio = 29.14865
Find, Cache times = 16.84375,0.578125
Ratio = 29.13514

Of course, we could have simply declared and initialised a module-level Collection, and saved ourselves the effort of writing a cache. Below are some reasons why creating a cache in a function is useful:

  • We are refactoring an existing function and do not want to (or cannot) change the function's interface.
  • Unlike a VBA module-level variable which can be inadvertently changed by any function, we ensure that the Collection can only be changed in a one location because only the function can modify its static variable.


03 December 2009

R1C1 notation in Excel VBA

Excel supports both A1 and R1C1 notation for referencing cells in formulas. Almost everyone uses A1 notation but it can be easier to use R1C1 notation to generate column references in VBA. For instance, to generate a formula that references a column in A1 notation, you need a function that returns A to Z for the first 26 columns, then returns two letter column names (such as AA) for columns past the 26th. With R1C1 notation, you just concatenate C with a column number (such as C1 or C27). Lastly, to assign a formula to a cell using R1C1 notation, use the cell's FormulaR1C1 property.

01 December 2009

Excel VBA programming with run-time errors

When writing VBA programs for Microsoft Excel 2003, you quickly find that you have to detect and handle run-time errors in your programs. VBA provides On Error and Resume statements for handling run-time errors. This article discusses how run-time errors can be used, explicitly or implicitly, in VBA programs.

Let's start with a script below, which calls the WorksheetFunction class' Match() function to find a cell with the value z in the range B1:B8:

Sub TestMatch()
  Dim wks As Worksheet
  Set wks = Sheet1
  Dim dblMatchResult As Double
  dblMatchResult = Application.WorksheetFunction.Match("z", wks.[B1:B8], 0)
  Debug.Print "Found at " & dblMatchResult
End Sub

If Match() function succeeds, then the relative row number is returned. However, if Match() fails, Excel generates this error message: Run-time error '1004': Unable to get the Match property of the WorksheetFunction class. In this case, the error means that Match() did not find z in the range B1:B8. It is similar to a #N/A result in a cell (though Error 1004 is a VBA error code and not related to the Excel-specific xlErrNA).

Below, we explicitly use the run-time error to (trivially) determine what to print in the statement after the error with On Error Resume Next. We also explicitly clear the error handler and allow errors to be detected again with On Error GoTo 0 (of course, detecting errors again doesn't matter in this short example).

Sub TestMatch()
  Dim wks As Worksheet
  Set wks = Sheet1
  Dim dblMatchResult As Double
  On Error Resume Next
  dblMatchResult = Application.WorksheetFunction.Match("z", wks.[B1:B8], 0)
  Debug.Print "z " & IIf(Err.Number = 0, "found at " & dblMatchResult, "not found")
  On Error GoTo 0
End Sub

Another common situation where you may explicitly test error codes is when you use a Collection. VBA's Collection class do not have a method to test if a key exists, so the only way to know if a key exists in your collection object is to try and retrieve the key from the collection and get a run-time error, as in the script below.

Sub TestCollection()
  Dim col As New Collection
  col.Add Key:="b", Item:="bog"
  col.Add Key:="d", Item:="dog"
  Dim str As String
  On Error Resume Next
  str = col("c")
  Debug.Print "Key c " & IIf(Err.Number = 0, " found", " not found") & " in collection"
  On Error GoTo 0
End Sub

In this example, the error would be Run-time error '5': Invalid procedure call or argument since the key c doesn't exist in the collection.

(Some run-time error messages generated by VBA are misleading: Error 1004 sounds like VBA failed to resolve the Match() function in the WorksheetFunctions class, and Error 5 is unclear because neither the procedure nor the argument is invalid. Error messages are also inconsistent. The commonly used Sheets Collection generates a different run-time error message than a VBA Collection if you try to access a key that doesn't exist. Try this in the Immediate Pane: ? Sheets(99).Name or ? Sheets("Blah").Name. Unless you have 99 sheets or a sheet called Blah in your workbook, you will see this more comprehensible error, Run-time error '9': Subscript out of range rather than Error 5.)

Some errors can be incidental to your program's logic and easily rectified, such that you can use VBA's On Error GoTo <label> statement to jump to a statement block to fix the error, then re-execute the statement that caused the error with Resume 0. For instance, the following function returns a reference to a sheet. It tries to obtain a sheet reference using the statement on line 3, jumping to lblCreateSheet to create a sheet first if it does not exist, then re-executing line 3.

1 Function AddSheetGoTo(ByVal str As String) As Worksheet
2   On Error GoTo lblCreateSheet:
3   Set AddSheetGoTo = Sheets(str) 'Run-time error '9': Subscript out of range
4   Exit Function
5 lblCreateSheet:
6   Sheets.Add.Name = str
7   Resume 0
8 End Function

Using On Error GoTo <label> ... Resume 0 means that the body of your function is not cluttered with exception handling statements, though you have to include Exit (Function or Sub) to explicitly end your function (and avoid inadvertently executing the error handler), and you should ensure that your error handler can actually fix the error, otherwise you will end up with an infinite loop. Compare this approach with explicitly testing and clearing an error using On Error Resume Next ... On Error GoTo 0, where you end up with a slightly longer and more cluttered function:

 1 Function AddSheetResume(ByVal str As String) As Worksheet
 2   Dim wks As Worksheet
 3   On Error Resume Next
 4   Set wks = Sheets(str) 'Run-time error '9': Subscript out of range
 5   If Err.Number <> 0 Then
 6     On Error GoTo 0
 7     Sheets.Add.Name = str
 8     Set wks = Sheets(str)
 9   End If
10   Set AddSheetResume = wks
11 End Function

This article discussed how VBA's run-time error handling can be used explicitly as part of your program logic, or implicitly to keep your program logic clear. It also notes that VBA run-error codes are not always clear or consistent.


29 November 2009

VBA IIf() as a ternary conditional expression (sometimes)

VBA doesn't have a ternary conditional expression (e.g. C-style ?:), which leads to a long-winded conditional statement like this when you just want assign a value to the same variable based on a condition:

If <condition> Then
  x = 1
  x = 2
End If

In C-style languages, you would just write x = <condition> ? 1 : 2. You can mimic the ?: ternary operator using VBA's IIf(expr, truepart, falsepart) function, x = IIf(<condition>, 1, 2), which reduces the verbosity of VBA code.

There are some limitations, though. Unlike the C-style ?: operator which only evaluates either the truepart or falsepart depending on the condition, IIf() is a function call so all its arguments are evaluated and this evaluation strategy can lead to side-effects or cause an unexpected error message.

Side-effects: x = IIf(condition, function1, function2) would cause both function1 and function2 to be called, so this statement would be slower than the equivalent If condition Then function1 Else function2 End If, where either function1 or function2 are called.

Unexpected error message: You may encounter an unexpected error message if you rely on the value of a variable in truepart or falsepart. For example, the following Excel VBA script searches for a value in a worksheet range B1:B8 and prints its address if it is found, otherwise it prints "Not found".

Sub TestFind()
  Dim wks As Worksheet
  Set wks = Sheet1
  Dim rng As Range
  Set rng = wks.[B1:B8].Find(What:="z")
  Debug.Print IIf(Not rng Is Nothing, "Found " & rng.Address, "Not found")
End Sub

If z is in the range, the script would execute without any errors. On the other hand, if it is not, you may see this unhelpful error: Run-time error '91': Object variable or With block variable not set. The reason is that rng is Nothing, so evaluating the second argument, "Found " & rng.Address, causes an error.

To sum up, you can use IIf() as a exact substitute for ?: if both truepart and falsepart can always be evaluated (e.g. literals, constants or variables).

23 November 2009

Optimising VBA for Excel 2003: Avoid EntireRow.Insert

While optimising an Excel 2003 VBA program to write thousands of rows of data, I noticed that Excel's EntireRow.Insert consumed most of the program's processing time. To quantify the time, here's a simple test that just inserts rows into a workbook ...

Option Explicit

Sub Insert10kRows()
  Dim wks As Worksheet
  Set wks = Sheets("Sheet1")
  Dim intI As Integer
  For intI = 1 To 10000
    wks.Cells(1, 1).Value = "Test"
  Next intI
End Sub

Sub TestInsert()
  Dim douStart As Double
  Dim i As Integer
  For i = 1 To 6
    douStart = Timer
    Debug.Print i & "0000:" & Timer - douStart
  Next i
End Sub

... and the results:

Rows Inserted Time
10000 7.703125
20000 11.125
30000 14.75
40000 18.9375
50000 23.40625
60000 28.140625

On the other hand, if you just write rows and advance a row counter, like this ...

Sub Write10kRows()
  Dim wks As Worksheet
  Set wks = Sheets("Sheet1")
  Dim intI As Integer
  For intI = 1 To 10000
    wks.Cells(intI, 1).Value = "Test"
  Next intI
End Sub

Sub TestWrite()
  Dim douStart As Double
  Dim i As Integer
  For i = 1 To 6
    douStart = Timer
    Debug.Print i & "0000:" & Timer - douStart
  Next i
End Sub

... you get much better performance (low numbers are better).

Rows Written Time
10000 0.6875
20000 0.6875
30000 0.703125
40000 0.6875
50000 0.6875
60000 0.6875

Another observation is that the more rows exist in a worksheet, the longer Excel takes to insert rows (hence the nested loops in the sample code). I can imagine that Excel is spending a lot of time just shuttling data from one block of memory to another to make room for new rows.

For my program, the solution was to refactor the code to replace row insertion with a row counter and to write data to the appropriate row. It's a little more complicated but the program runs at least 10x faster.

21 November 2009

SharpDevelop MSB3105 Error for duplicate form

When you create a new Windows Application project in SharpDevelop 3.1, the IDE creates a default form called 'MainForm'. If you add an existing form with the same name to this project then build this project, you may get this error: The item "MainForm.Designer.cs" was specified more than once in the "Sources" parameter. Duplicate items are not supported by the "Sources" parameter. (MSB3105).

SharpDevelop displays a line number (though the line number is a furphy) but not the file with the error. It turns out that when you add the form, the IDE appends another <Compile Include="MainForm.Designer.cs"> entry in the .csproj file. The solution is to delete the duplicate entry. Note that if you use Visual Studio, there is no duplicate entry in the .csproj file.

17 November 2009

Local data files in Visual C# Express 2008

While learning data binding for Windows.Forms, I noticed that changes made to data in the DataGridView weren't written back into the local SQL Server database file when the form was closing, even after calling TableAdapter.Update(). Turns out that Visual C# Express 2008 (and VS 2008 in general) copies your local data file each time you build your project to the output folder. The original version of the database file is only modified if you change the schema or edit data using VS. This also means that if you make changes to the data while programming, then change the database schema in VS, the changes you have made in the database file in the output folder would be overwritten.

See Also

15 November 2009

Microsoft Word MailMerge Run-time error '5922'

When a MS-Access 2003 VBA script tried to open MS-Word and run the mail merge function, VBA displayed this error message:

Run-time error '5922':
Word was unable to open the data source

The problem was tracked to this statement:

    Dim objWordApp As Object
    Set objWordApp = CreateObject("Word.Application")
    With objWordApp
      .ActiveDocument.MailMerge.OpenDataSource Name:=CurrentDb.Name, SQLStatement:=sQuery, SubType:=wdMergeSubTypeWord2000

There's plenty of different solutions to this problem on the web. In this case, the problem was due to the second argument, the SQL query, sQuery, which had an error.

I spent some time browsing the web to investigate the run-time error, when it would have been more effective to first check the arguments in the subroutine call. Live and learn.

05 November 2009

Visual Studio 2008 Express Data Source Limitations

Visual Studio Express 2008 for C# or VB only allows you to choose a limited number of data sources in the IDE. Open the Choose Data Source dialog using menu item Tools / Connect to Database, and in the Data Source list, there are only three items:

  • Microsoft Access Database File
  • Microsoft SQL Server Compact 3.5
  • Microsoft SQL Server Database File

According to VS 2005 express: ".NET Framework Data Provider for ODBC" does not show up in Choose/Change Data Source dialog :(:

The data designers in the Express editions of Visual Studio only allow you to connect to SQL Server Express and Access (Jet) databases. I've passed your feedback along to the people who own these components in the hopes of getting this limitation clearly documented. As Ralph noted, you can still use the various .NET Data Providers at run-time. David Sceppa
ADO.NET Program Manager

I checked VisualStudio2008-ProductComparison-v1.08.pdf from Visual Studio 2008 Product Comparison Guide and there's no specific mention of this limitation for VS 2008 Express editions, other than Data Sources window displays the data sources in your project for creating data-bound controls.

Of course, I can work around this limitation programmatically but the learning value of VS Express editions falls when I can't match text and images from tutorials with what I see on screen.

03 November 2009

Visual Basic short-circuit evaluation of expressions

Visual Basic seems to be the only language that evaluates all the operands of AND and OR operators instead of implementing short-circuit evaluation. In all other programming languages, such as C# or Java, expressions are evaluated from left to right, and the evaluation of expressions in statements exp1 AND exp2 and exp3 OR exp4 stop when exp1 = false or exp3 = true, respectively. This difference does surprise you if you come to VB from another programming language and you first write a conditional statement that relies on short-circuit evaluation, like this: If s IsNot Nothing And s.Length > 0 Then …

Two new keywords, AndAlso and OrElse were introduced in VB.Net to provide similar semantics to other languages supported by .Net. Working in more than one programming language at a time, I would have preferred some sort of backward compatibility compiler Option rather than adding new keywords to a language that already has too many.

29 October 2009

Changing indent and tab size in Visual Basic 2008 Express

Minor annoyance in Microsoft Visual Basic 2008 Express Edition: To change the indent and tab size in Options dialog, Text Editor Basic / Editor node, you have to highlight the number in the field, then type in a new number in the Tab Size and Indent Size text fields. You can't delete the existing value first, then type in a new one.

2-Nov-2009: In Visual C# 2008 Express Edition, in Options dialog, Text Editor / All Languages / Tabs node, you can delete the existing value first.

19 October 2009

Hiding Facebook Application Posts

Following a tip from Steven S., I've hidden posts from friends' Facebook applications that don't interest me, resulting in a much cleaner home page. In only a month, I found that I have hidden more than a hundred (!) applications, which led me to wonder if Facebook shouldn't provide a reverse option, to only show certain applications instead.

To hide applications, I can imagine that there is a database query which has a NOT IN () clause that rapidly gets longer. In contrast, to only show selected applications would require a much shorter IN () clause that grows more slowly. From an engineering point of view, the latter solution would provide a more responsive system.

On the other hand, from a business point of view, encouraging subscribers to use applications would increase Facebook's stickiness and the number of ad presentations. If subscribers aren't notified about new applications (through their friends' application posts), they are unlikely to know about these applications.

15 October 2009

Blog Comment Spam

What a pain in the nether regions when spammers decide to use your blog as a vehicle for their crap advertisements. My blogs are moderated, so I see every comment before they are posted. I only twigged that my blog was being spammed when I saw the same comment twice for different postings on the same day (that spammer was enthusiastic). It seems (naïve, perhaps) that any comment with a URL, either in plain view or in an HTML <a> tag, is likely to be spam comment, and showed be examined closely.

Suggestion to Blogger team: include a option to mark a comment as spam, just like Gmail, and use the crowd to identify spam comments.

24 September 2009

Auto-arranging multiple application windows in Vista

If you're taking notes or eyeballing a couple of documents at once on screen, you might tile your application windows so that they are arranged side by side and fill up the screen. In Working with windows, Arranging windows automatically, the Vista desktop provides two commands for tiling your windows: Show Windows Stacked and Show Windows Side by Side. (If you Cascade Windows, your windows are stacked one on top of another and you can only see the contents of the top one.) This posting digs a little deeper into how this feature works and explains its limitations.

To use these commands, press the right-mouse-button (RMB) over the task bar and select the appropriate item from the context menu. The Windows desktop should automatically tile all application windows. If you want to restrict automatic tiling to specific windows, hold down the CTRL key and select the window icons in the task bar first, then press RMB and select the required menu item. A gotcha I noticed is that you can undo automatic tiling if you tiled all the windows at once, but you can't undo the last command if you tiled specific windows.

When you have selected only two or three windows, Show Windows Stacked tiles your windows one over the other vertically, like this:

|        |
|        |
|        |

As you would expect, Show Windows Side by Side tiles your windows horizontally:

|   ||   ||   |
|   ||   ||   |
|   ||   ||   |
|   ||   ||   |

If you select four or more windows, then it doesn't matter which of the two auto-tiling commands you choose because the windows are tiled in the same manner:

|   ||   |
|   ||   |
|   ||   |
|   |+---+
+---+|   |
|   |+---+
|   ||   |

The commands tile the selected windows based on the windows' titles, in reverse alphabetical order, as below. You can't make windows appear in a particular order by changing the windows' initial positions in the desktop or picking window icons in a particular sequence.

| E || C |
|   |+---+
+---+| B |
| D |+---+
|   || A |

03 September 2009

Floating IFRAMEs in Blogger

In my other blog, I wanted to show images to one side of the text, like this:

XXXXXX +--------+
XXXXXX | iframe |
XXXXXX +--------+

The images, generated by Amazon, are within an iframe element. After some false starts, I found that the simplest solution was to wrap the iframe elements in a div element, and float the div:

<div style="float:right">

27 August 2009

Microsoft Access 2003 finding the minimum and maximum column values using a VBA function

If you may have to write a query in MS-Access 2003 to get the maximum or minimum of the values of two columns, you could use the iif() expression like this:iif(n1 > n2, n1, n2) or iif(n1 < n2, n1, n2) to obtain the maximum or minimum, respectively. The expression becomes more complicated when you have to compare three columns:


Such expressions are hard to get right, and let's not even consider writing one to compare four columns! (In case you were wondering, we can't use Access' max() and min() aggregate functions because they operate on rows.)

Access allows you to call VBA functions in queries, so we can replace the long and cumbersome iif() expression with a call to a custom VBA function. In the rest of this article, we will:

  1. Define a custom VBA function.
  2. Calling a customer VBA function.

Define a custom VBA function

Here's a simple VBA function one to find the maximum of a list of values (just change the name and reverse the comparison to find the minimum):

Function max_value(ParamArray n() As Variant)
  Dim i As Integer
  Dim ret As Variant
  ret = n(0)
  For i = 1 To (UBound(n))
    If ret < n(i) Then ret = n(i)
  Next i
  max_value = ret
End Function

The ParamArray declares n() as a variable argument list (or varargs in C). Using ParamArray means that the function is not restricted to a pre-defined number of parameters, and using a Variant means that the function will work for different types of data, such as numbers and dates.

The body of the function just loops through all the values in the argument list and returns the largest value.

Calling a custom VBA function

If you have a table called Number with fields n1, n2, n3, n4, n5, you can call this function in a query, just like any other built-in function, like this:

SELECT max_value(n1, n2, n3, n4, n5) as max_value, n1, n2, n3, n4, n5 FROM [Number];

14 August 2009

Google Reader and Firefox's maximum popup limit

If you browse using Firefox and use the Google Reader keyboard v shortcut that I wrote in an earlier post, you may find that Firefox will eventually generate this message, Firefox prevented this site from opening a pop-up window, although you have added www.google.com to the Allowed Sites - Pop-ups dialog. This behaviour occurs because Firefox prevents any site from automatically opening too many popups.

According to Lifehacker's Increase Firefox's Maximum Pop-up Count, the number of pop-up windows a site is allowed to show is controlled by the dom.popup_maximum variable. By default, it is set to 20, so to avoid hitting the limit, I increased this value. Of course, now my browser is more vulnerable to a pop-up window attack or runaway script from all sites listed in the Allowed Sites.

Perhaps a better solution is to have a per-site limit?

12 August 2009

Keyboard shortcuts for navigating Google Reader and Gmail

After using any application for a while, I start using some keyboard shortcuts to speed up common operations. Desktop applications usually have keyboard shortcuts while web applications don't; luckily for me, two of my most used Google applications, Reader and Gmail, have keyboard shortcuts. You can find the list of shortcuts easily so rather than going them, I'll just highlight useful keystroke sequences for navigation and suggest mnemonics to remember them.


Both applications support ? to show and hide their keyboard shortcut help page.

Reader shortcut sequences

gF and gt
(g)o to items in people you (F)ollow or a (t)ag. Note: an (undocumented) alternative to gt is gl, (g)o to a (l)abel, like Gmail.
nov and pov
browse (n)ext or (p)revious item in a list, (o)pen it and (v)iew the original. Note: if you're using Firefox, you may be prompted to allow the www.google.com domain to open a window or tab.
browse (N)ext or (P)revious subscription, (O)pen it and mark (A)ll items read. Good for quickly scanning whole subscription (or tag) items. Just remember that operations on subscriptions use capital letters.

Gmail shortcut sequences

gi and gl
(g)o to (i)nbox or a (l)abel
j and k
browse next or previous conversation (or thread) in a label. These are Vi-style keystrokes (Ah, that brings back memories!).
n and p
browse (n)ext or (p)revious message in a conversation (or thread).
go (u)p to thread (or conversation) list from reading a message.

Note that Google uses the terms conversation or thread interchangeably in their online help. Unlike Reader, Gmail doesn't have shortcuts for browsing labels (or tags).

So, there's a very short list of keyboard shortcuts for that I've found useful for navigating Google Reader and Gmail; I hope it'd be useful for you too.

09 August 2009

Read 'Times Online' article comments earliest first

The comments appearing after articles in the Times Online are displayed in reverse chronological order (latest first), which I find difficult to read, so I wrote this GreaseMonkey jQuery script to click the Oldest first link after the page is loaded:

// ==UserScript==
// @name         Timesonline.co.uk earliest comment first
// @namespace    kamhungsoh.com
// @description  Show an article's comments, earliest comment first.
// @require      http://ajax.googleapis.com/ajax/libs/jquery/1.3/jquery.min.js
// @include      http://*.timesonline.co.uk/*
// ==/UserScript==

var evt = document.createEvent('HTMLEvents');
evt.initEvent('click', true, true);

$("a:contains('Oldest first')").each(function() {

The script is a little more complicated than I expected. Apparently jQuery cannot send events (see the last response in the thread) that it itself did not bind to objects, so my script has to create a click event object and send it to the link. Otherwise, I would written $("a:contains('Oldest first')").trigger('click');

Also, the script's a little hacky because it will send a click event to all links containing the text 'Oldest first', though it seems pretty unlikely that a page will have more than one link of that type in the first place.

05 August 2009

OpenOffice Calc AutoInput capitalization annoyance

When you enter text in a cell in OpenOffice Calc, you might find that your text is incorrectly capitalized and you cannot correct it! For example, while writing a vocabulary list, I entered 'sun' and it was capitalized as 'Sun' (maybe referring to a certain technology company). If you type Ctrl+Z to undo this capitalization, all the text in the cell is removed, not just the capitalization. It seems that Calc's AutoInput feature is automatically capitalizing text before it is entered (unlike 'AutoCorrect', which makes the correction after the text is entered, so you can undo the capitalization). I don't know how AutoInput differs from AutoCorrect, and I haven't found AutoInput useful so I've disabled it using menu item Tools / Cell Contents / AutoInput.

12 July 2009

java.sun.com - improve online readability

Since I read a lot of Sun Java development documentation online, here is a more readable style. I only applied a small number of changes to avoid accidentally screwing up the site's presentation:

  1. Set text columns' maximum width to 30em. Elements which may be naturally wide, such as <img> (for images) and <pre> (for sample code), are not affected. Text is easier to scan if the columns aren't too wide.
  2. Set font family in <code> and <pre> (usually code samples) from Courier to Consolas or Courier New. Courier characters don't seem to get aliased (right term?) properly and look broken up on my notebook screen. Does anyone else have this problem? I chose Consolas because it matches the font I use in my IDEs sessions, and Courier New as a second choice because if Consolas is not available.
  3. Set <pre> background from gray to transparent. Sun marks up code samples in different ways, so it doesn't work in all pages.

Sun does not mark up all text in tags. For example, in javax.sql package summary, the text that follows <h2> headings cannot be styled since CSS cannot select text nodes. Might check if this a limitation of javadoc or whether API documentation writers have to explicitly mark up their paragraphs in <p> tags.

05 July 2009

Blogger full window width edit box

The text pane in Blogger's edit page is rather cramped and becomes full when you're writing more than a couple of paragraphs. After looking at Blogger Wide Screen Editor, I thought it should be possible to write a style to automatically fit the edit box in a browser's window.

My Blogger full window width edit box style just changes the width of <div> elements from fixed values to auto so the edit box can fill the width of the window. A bonus of this technique is that the tabs and controls automatically stay close to the right hand border when you resize your browser's window.

Just to avoid unplanned effects on the Blogger pages, I've set my style to only be applied to Blogger's post pages using url-prefix("http://www.blogger.com/post").

28 June 2009

CSS inheritance and selector specificity for max-width

While finding it easier to read text using narrower columns using my custom CSS definition, I noticed that it didn't work on pages that use table rows for grouping sections, such as the following structure:

        p …

Here's the CSS definition I was using:

@namespace url(http://www.w3.org/1999/xhtml);

@-moz-document domain(java.sun.com)
  body {
    max-width : 35em;

It appears that the max-width property defined in the <body> element property isn't inherited by the <p> elements within the <table>. My first fix was to select all the elements in the path between <body> and <p> and make them inherit the max-width property:

  table, tbody, tr, td {
    max-width : inherit;

This isn't a nice solution because I would have to specify all the possible types (or elements) that could be in a table cell. It would be better the use the CSS universal selector, *, instead:

  * {
    max-width : inherit;

Now the definition is even applied to some elements that we don't really want to limit in size. Elements that are naturally wider than max-width, such as images in <img> tags, are now squashed, while preformatted text in <pre> elements are displayed in boxes with horizontal scrollbars.

If the max-width property is set to none, then the width of an element is not limited. The solution is to have two rules, a universal selector and a type selector, and we have a CSS definition that limits the width of all elements except for <div>, <img> and <pre>:

@namespace url(http://www.w3.org/1999/xhtml);

@-moz-document domain(java.sun.com)
  body {
    max-width : 35em;
  * {
    max-width : inherit;
  div, img, pre {
    max-width : none;

This definition works because the universal selector is a less specific than a type selector, so the rules for <div>, <img> and <pre> override the universal selector's rule.

As usual when hacking code, I now realise that there's no longer any need to use inherit and I can simplify my CSS definition by specifying the max-width of every type using the universal selector.

@namespace url(http://www.w3.org/1999/xhtml);

@-moz-document domain(java.sun.com)
  * {
    max-width : 35em;
  div, img, pre {
    max-width : none;

To see the effects of the different definitions, add the style to your Stylish add-in, visit this Sun Java page and observe what happens to the width of the navigation bar on the top of the page and the sequence diagram images.

24 June 2009

CSS max-width versus width properties for text columns

It's easier to read on a web page if the columns aren't too wide. This presentation Ideal line length for content suggests setting the values of the min-width and max-width CSS properties around 30 em. Columns of 30 em sounds very narrow for a PC-based browser, but I expect it makes sense for browsers on handheld devices.

Using the Stylish Firefox add-in, here's my Narrow Paragraph rule for sites that I read:

@-moz-document domain(<domain 1>)
  , domain(<domain 2>) {
  body {
    min-width : 25em;
    max-width : 30em;

Earlier, I used the width CSS property but when the browser window is made narrower than the width's value, the width of the text column doesn't change and you have to scroll left and right to read. Setting both min-width and max-width gives the designer and reader some flexibility in layout and viewing, respectively.

21 June 2009

Installing and configuring Subversion for personal use

I decided (finally!) to migrate my version control system from CVS to Subversion so that I could have atomic transactions when committing multiple files. There's plenty of tutorials on using and setting up Subversion on the net, so here's the simplest way to get started on Windows …

  1. Install and run Subversion on Windows
  2. Create a repository.
  3. Import a project into the repository.
  4. Check-out a project from the repository.

Install and run Subversion on Windows

  1. Install TortoiseSVN Subversion GUI client.
  2. Install CollabNet Subversion Server and Client:
    • I chose the svnserve option, otherwise by default the installer will also install the Apache web server and I already had Apache installed.
    • The installer will create a Windows service using svnserve. While it is possible to use the file protocol for personal development (since there's only one developer, yourself), it's likely that I'd use Subversion in a team, so I decided I might as well get used to the svn protocol and have a server manage my repository.

Create a repository.

  1. Open a command prompt and create a repository with this command: svnadmin create <repository path>.
    • Note that the installer should have added the Subversion bin folder to your PATH environment variable.
    • Note 2: I think I'm missing a step: How does svnserve map svn://localhost to the repository path?
  2. Edit <repository path>\conf\svnserve.conf and uncomment the following lines:
    • password-db = passwd
    • realm = My First Repository
    Note that passwd refers to the passwd file in the conf folder.
  3. Edit <repository path>\conf\passwd file and either uncomment one of the sample user and password pairs, or add a new pair. You could have anonymous read and write operations, but again, since I was likely to use it in a team, I might as well create a username-password pair.
  4. Open the Windows services console and start the CollabNet Subversion svnserve service if it is not already running.
  5. Check that TortoiseSVN can connect to your Subversion server.
    1. Open Windows Explorer
    2. Highlight an arbitrary file or folder and select the context menu item TortoiseSVN/repo-browser.
    3. When TortoiseSVN prompts you for a URL, enter svn://localhost (the address of your local Subversion server) and press the OK button.
    4. TortoiseSVN should display its Repository Browser window. If the svnserve service isn't running, you would get this message: Can't connect to host 'localhost': No connection could be made because the target machine actively refused it. If you get this message, check that the service is running in the Windows Services console.
    5. Leave the Repository Browser window open because you can use it to check that you have imported your projects successfully.

Import a project into the repository

  1. Copy or rename an existing project folder to something like <project>_import.
  2. In Explorer, select that folder and use the context menu item TortoiseSVN/Import....
  3. TortoiseSVN displays the Import dialog.
    1. In the URL of Repository field, enter svn://localhost/<project>.
    2. In the Import Message field, enter Importing <project>.
    3. Press the OK button.
  4. Since this is the first time you've used TortoiseSVN to import a project, you'll be prompted to enter a username and password. Optionally, select the Save authentication checkbox so that TortoiseSVN remembers your credentials. When you are satisfied, press the OK button to submit your credentials.
  5. Check that your project has been imported successfully using the TortoiseSVN's Repository Browser. You may have to press F5 to refresh the list of folders in the repository.

Check-out a project from the repository

  1. In Windows Explorer, highlight the folder to store your project and select the context menu item SVN Checkout....
  2. TortoiseSVN should display the Checkout dialog, with ...
    • URL of Repository = svn://localhost/<project>
    • Checkout directory = <path of project>
  3. Press the OK button and your project should be checked out to the target folder.
  4. Refresh the Windows Explorer window and you should see a green tick overlaid on the <project>'s folder icon. This shows that TortoiseSVN recognises this folder as a Subversion folder.

Final words

That's pretty much all the steps required to set up a basic Subversion system for personal development on Windows.

See Also

17 June 2009

Escaping special characters in Windows International Keyboard

The Windows International Keyboard allows you to enter letters with accent marks. For example, to type é, just type Apostrophe e. But what if you just want to enter (or escape) an apostrophe (for example, when entering a string literal while programming)? The trick is to type a space after the apostrophe, as per this bullet point in the support page: If you press the space bar, the symbol (apostrophe, quotation mark, accent grave, tilde, accent circumflex or caret) is displayed by itself.

14 June 2009

Merging multiple blog feeds into Facebook Notes with Yahoo Pipes

I started a second blog, Vibogafi, for writing about media, and wanted to display updates from that and this blog using the Facebook Notes application. Notes only allows users to import one RSS feed, so I had to merge both my RSS feeds first.

One aggregation or mashup system that came to mind was Yahoo Pipes. If you have a Yahoo account, then you can create your personal pipes. I followed the video tutorial and got my pipe working the second time around. Some tips and minor gotchas:

  • When entering the Blogger feed URL in the Pipes Fetch Feed gadget, you cannot simply enter http://<myblog>.blogspot.com and expect the gadget to find the feed URL; instead, you have to enter the entire feed URL, which looks like http://<myblog>.blogspot.com/feeds/posts/default.
  • Check the Pipes Debugger output by selecting each gadget. If you do not get the expected output, try to fix the problem before proceeding.
  • If you just want to merge two or more feeds, just use one Fetch Feed gadget instead of multiple Fetch Feed and one Union gadget; the Fetch Feed gadget allows you to enter more than one URL.
  • To use your new Pipe in Facebook Notes, you have to use your pipe's RSS output, whose URL ends with &_render=rss, not just the pipe's basic address. (Upon hindsight, it is obvious, but I tripped over that problem, too.)

10 June 2009

Localize dates in Scientific American

Scientific American formats dates in its pages as m/d/yy, while I prefer d/m/yy. Like an earlier post about formatting dates in Blogger's Dashboard, this Greasemonkey script uses a regular expression to find date strings in a page, and switches the month and day values.

// ==UserScript==
// @name           Scientific American localize dates
// @namespace      kamhungsoh.com
// @description    Convert dates from mm/dd/yy to dd/mm/yy format
// @require        http://jqueryjs.googlecode.com/files/jquery-1.3.2.min.js
// @include        http://www.scientificamerican.com/*
// ==/UserScript==

$('span').each(function() {
  var s = $(this).text().replace(/(.*)(\d+)\/(\d+)\/(\d+)/, '$1$3/$2/$4');

The script only works for the home page and I haven't figured out why I can't construct an expression to locate date strings in articles. Another thing to note is that jQuery's .text() function returns all the text values in a node, so some of the formatting, such as strong tags, are lost when the script replaces the text.

07 June 2009

Reformatting post dates in Blogger Dashboard

In the Blogger Dashboard, the posting date of your entries are in mm/dd/yy format and Google doesn't provide a way to localize it or choose a date format. Here's how you can use a Greasemonkey script to convert dates in Dashboard to dd/mm/yy format.

The dates in Dashboard have this structure:

<td class="date">

The following script uses jQuery to iterate through all span nodes of td elements with a date class attribute, then applies the Javascript string replace() function to swap the first two numbers in the span node. To change the output string, just modify the second argument of the replace() function.

// ==UserScript==
// @name           blogger.com localize dashboard date
// @namespace      kamhungsoh.com
// @description    Convert date format from m/d/yy to d/m/yy.
// @require        http://ajax.googleapis.com/ajax/libs/jquery/1.3/jquery.min.js
// @include        http://www.blogger.com/posts.g*
// ==/UserScript==

$("td.date > span").each(function() {
  var s = $(this).text().replace(/(\d+)\/(\d+)/, '$2\/$1');

The first argument of the replace() function is a regular expression where \d is a meta-character for a digit, + matches one or more of the preceding character (e.g. 4 or 23) and the parentheses group the characters to be memorised. The forward slash has to be escaped, \/, to allow us to match it in the input string. In the second argument, the $2/$1 represents the second and first memorised strings.

03 June 2009

Filtering busy mailing lists in Gmail

I subscribe to a busy mailing list and find that I only keep some of the messages and delete the rest. Then I realised it would be less work to automatically delete all messages from this list first and only keep the ones that interested me.

Here's how I set up my mailing list filter in Gmail:

  1. Specify the criteria (e.g. the mailing list name).
  2. Mark Skip the inbox (Archive it).
  3. Pick a label from the Apply the label field.
  4. Mark Delete it.

Using this filter, messages that match the criteria are automatically labelled and moved into the Bin (or Trash). I read messages from the mailing list in the Bin and if I want to keep one, I just uncheck the Bin label in the message's tool bar.

31 May 2009

Simple loading of HTML fragment using jQuery AJAX

jQuery provides a simple load() function to load an HTML fragment using AJAX. To test it, create a test data file called data.html:

      <li>Item 1</li>
      <li>Item 2</li>
      <li>Item 3</li>
      <li>Item 4</li>

Next, create a target HTML file called testLoad.html in the same folder with the following jQuery statement:

    <title>jQuery load HTML test</title>
    <script type='text/javascript' src='js/jquery-1.3.2.min.js'></script>
    <script type="text/javascript">
      $(document).ready(function() {
        $('#myLinks').load('data.html ul li');
    <ul id="myLinks"></ul>

When you view testLoad.html, you should see the list items in data.html inserted into the target file, in the unordered list named myLinks.

It was a little fiddly to get this working the first time. If your source HTML file has an error, nothing seems to happen; in that case, check your browser's error console to see what went wrong. For instance, in Firefox, when my source HTML file wasn't well-formed, I found this error:

Error: mismatched tag. Expected: </ul>.
Source File: 
Line: 8, Column: 5
Source Code:

Another thing I found is that you can't include an XML processing instruction (<? ... ?> lines) in your data file because when load() tries to insert your XML file into your target document, you would get an error like this:

Error: XML or text declaration not at start of entity
Source File: 
Line: 1, Column: 43
Source Code:
<div xmlns="http://www.w3.org/1999/xhtml"><?xml version="1.0" encoding="UTF-8"?>

29 May 2009

Selecting parent or ancestor of a node in jQuery

When manipulating an HTML document (especially one that you didn't generate), it can be easier to find a node by matching its descendant's unique id or class attribute and value first, then selecting that descendant's ancestor (which is the node you wanted to in the first place), compared to finding that node by referring to its position in the DOM, which is not obvious and isn't easy to maintain.

For example, in the Australian Government Bureau of Meterology site, you might want to highlight temperature and forecast for Melbourne, so the simplest thing to do is change the style of the row containing that string, but there's no unique attribute you can use to select that row (or tr node):

            <td><a href='…'>Sydney</a></td>
            <td>Shower or two</td>
            <td><a href='…'>Melbourne</a></td>
            <td>Shower or two</td>

For this site, the trick is to select the td node containing Melbourne, then find the first tr ancestor. Here's a sample script using jQuery:

// ==UserScript==
// @name           www.bom.gov.au Highlight City
// @namespace      kamhungsoh.com
// @description    Highlight row of a specific city.
// @require        http://ajax.googleapis.com/ajax/libs/jquery/1.3/jquery.min.js
// @include        http://www.bom.gov.au
// ==/UserScript==

$("a:contains(Melbourne)").each(function() {
  $(this).parents('tr:eq(0)').css('background-color', 'grey');

In this script, for each a node with a value of Melbourne, find the first tr parent using :eq(0) and change its background colour. You have to constrain the selection of parents to the first parent, otherwise all the tr ancestors will be selected; on this site, nested tables are used for layout so without this constraint, the enclosing tr node of the table will also be found and modified.

27 May 2009

Zebra-stripe table rows with jQuery's custom selectors

I thought I had a simple way to zebra-stripe table rows in jQuery:

$('table tbody tr').each(function(i) {
  $(this).addClass(i%2 ? 'OddRow' : 'EvenRow');

While I was looking up jQuery selectors, I found that the library has two custom selectors, :odd and :even, to select the odd and even elements in a matched element set, respectively, so you could zebra-stripe a table like this:

$('table tbody tr:odd').addClass('OddRow');
$('table tbody tr:even').addClass('EvenRow');

The second example seems a little more obvious.

See Also

26 May 2009

Read 'The Economist' article comments earliest first

A simple GreaseMonkey script using jQuery to show the comments to articles in The Economist, from earliest first. Basically, this script looks for <a> tags with specific text in their href attribute and appends &sort=asc. It's shorter and easier to read than the original plain Javascript version.

See Also

25 May 2009

CSS @media for Firefox Stylish Add-in

In an earlier entry, I wrote about using the @media print rule in a CSS file to insert the value of URLs in the print version of a web page. If the webmaster of the site didn't provide this rule, can you still print the URLs?

If you are using the Firefox Stylish add-in, you can provide your own CSS file to apply to a web site. I tried this:

@namespace url(http://www.w3.org/1999/xhtml);

@-moz-document domain("www.gamedev.net") {
  @media print {
    a:after { content:" (" attr(href) ")"; }

… but it didn't work. It seems that the Mozilla / Firefox @-moz-document rule doesn't allow @-rules within it (see the Mozilla links below).

If you move the @media print outside of @-moz-document rule, as below, then any page that you print will include URL strings in the output.

@namespace url(http://www.w3.org/1999/xhtml);

@-moz-document domain("www.gamedev.net") {}

@media print {
  a:after { content:" (" attr(href) ")"; }

The only way to manage this behaviour is to disable that stylesheet in Firefox's User Styles page of the Add-ons dialog.

See Also

24 May 2009

XPath selectors no longer supported in jQuery

It's old news, but being new to jQuery, I didn't realise that XPath selectors were no longer supported by this library until I read Upgrading to jQuery 1.2 (jQuery is now up to version 1.3).

23 May 2009

Show URL When Printing a Page

When you print a web page, you'd naturally expect to see the URL of a link, instead of just the name of the link (usually the underlined text). If you are the webmaster of a site, you could generate a different page formatted for printing (for example, look for the print link in many news sites). Another way is to add the following rule into your CSS file:

@media print {
  a:after { content:" (" attr(href) ") "; }

The rule above will append the value of the URL after the link. How does it work?

@media print
Specifies the media type (in this case, for printing) for the following set of statements in braces.
Instructs the CSS engine to select the <a> tag, while :after is a pseudo-element that instructs the CSS engine to insert some content.
{ content:" (" attr(href) ")"; }
Inserts (or generates) the value of the href attribute, in parentheses, into the output.

You can see the effect of this rule in my web site; just use the Print Preview function in your browser and examine the text after each link.

See Also

21 May 2009

Playing with PHP Data Objects (PDO)

Just migrated the database access code of my PHP sample page from the original (and old) mysql_* functions to PHP Data Objects (PDO). Some advantages of making this change:

  • You can write more general code because your application is not tied to a specific PHP database extension library.
  • PDO is similar to other database APIs, such as ADO or JDBC, so it was easier for me to write code in PDO than using, say, the PHP MySQL API.
  • You can use PHP try … catch blocks, so you can remove the clutter related to testing the return flag of a function. After instantiating a PDO class, $pdo, call $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION) as shown in Error Handling section. Note that as per this comment, exceptions may only work in the database driver supports it.

As per the PDO introduction, PDO is not a complete data abstraction layer. You can't simply connect to another vendor's DBMS and expect your code to work. For instance, you still have to write SQL queries that the DBMS can understand. One way to tackle that is to encapsulate all database-specific information into a class and call methods to obtain queries for that specific database.

18 May 2009

Configuring Apache and PHP libraries

Problem when configuring Apache, PHP and MySQL on Windows XP. Here's my environment:

  • Microsoft Windows XP Professional Service Pack 3
  • Installed apache_2.2.11-win32-x86-no_ssl.msi as a service.
  • Installed php-5.2.9-2-win32-installer.msi
    • Used Apache 2.2 option
    • Added MySQL and MySQLi extensions.
  • Installed mysql-essential-5.1.34-win32.msi.

Checked that the PHP configuration file, php.ini, has the following entries to support MySQL and MySQLi extensions:

extension_dir ="C:\Program Files\PHP\ext"

Checked that the folder C:\Program Files\PHP\ext has the following files: php_mysql.dll and php_mysqli.dll.

When starting Apache, its errors.log has these messages:

[Fri May 15 14:41:03 2009] [notice] Apache/2.2.11 (Win32) PHP/5.2.9-2 configured -- resuming normal operations
[Fri May 15 14:41:03 2009] [notice] Server built: Dec 10 2008 00:10:06
[Fri May 15 14:41:03 2009] [notice] Parent: Created child process 3868
PHP Warning:  PHP Startup: Unable to load dynamic library 'C:\\Program Files\\PHP\\ext\\php_mysql.dll' - The specified module could not be found.\r\n in Unknown on line 0
PHP Warning:  PHP Startup: Unable to load dynamic library 'C:\\Program Files\\PHP\\ext\\php_mysqli.dll' - The specified module could not be found.\r\n in Unknown on line 0

It means that the required PHP extensions could not be loaded. It doesn't matter how I change change the value of extension_dir, for example using forward slashes instead of backslash, or using double backslashes, or a relative path, and then restarting the Apache service; the same type of error appears and the required extensions aren't loaded. In the end, I restarted Windows and then Apache loaded the libraries!

Once I got my configuration working, I changed the folder name and the value of extension_dir, and restarted Apache. Unfortunately, my system kept working, so I'm none the wiser.

11 May 2009

Rename Folder in Google Reader

Oh, erm … you can't rename folders in Google Reader. The workaround is to move all your subscriptions from one folder to another folder, then delete the original one. Here's the steps:

  1. In the Subscriptions panel, select Manage Subscriptions. The Settings page should be displayed.
  2. In the Settings page, filter your subscriptions by the name of folder you want to remove using the Filter by name, folder, or URL text box.
  3. Using the Change Folders… drop down list for each subscription, select the new folder and unselect the old folder. Note that for the first subscription, you have to create the new folder.
  4. After reassigning all the required subscriptions to the new folder, select the Folders and Tags tab.
  5. In the Folders and Tags, check the folder you want to delete, then press the Delete selected button.

See Also

08 May 2009

Modify URL to Read Comments From Beginning

When I read responses or comments to articles, I prefer to read them from the earliest to the latest. Some sites order comments in reverse (that is, latest to earliest). If a site's comment link can take an 'order' argument, just modify that URL to specify your preferred order and save yourself an extra click. Below is a sample GreaseMonkey script that can specify the order of comments in The Economist, which are ordered from latest to earliest, by default.

var pattern = 'mode=comment';
var sortOrder = '&sort=asc';
var links = document.getElementsByTagName('a');
for (var i = links.length - 1; i >= 0; --i) {
  var link = links[i];
  var url = link.getAttribute('href');
  if (url.search(pattern) != -1) {
    link.setAttribute('href', url.replace(pattern, pattern + sortOrder));

To use it on other sites that have similar URLs, just modify the values of pattern and sortOrder.

06 May 2009

Simple Data Grid Using PHP and jQuery


This article walks though the process of writing a simple Web-based data grid application for browsing a database table, using the PHP Web scripting language, jQuery Javascript library and MySQL DBMS.

Requirement and Design

The grid is just an HTML table, with filters for each column and pagination controls at the bottom of the table. The filters are implemented using drop down lists, and the pagination can be done using two buttons, one to move to the next page, the other to move to the previous page of data:

|Column header|Column header|
| Filter      | Filter      |
|Data row1                  |
|Data row2                  |
|Data row3                  |
|Pagination controls        |

Two other requirements are to ensure that the data grid can be used with or without Javascript, and that it can be used in Firefox, MSIE and Opera.

One We Prepared Earlier …

To orient you, check out the data grid implementation first. Try it with Javascript enabled and disabled on your browser. When Javascript is disabled, you have to first select a filter value, then press the Submit button before the grid is updated.

When Javascript is enabled, just changing a filter value will update the grid and you can reset all the filters at once by pressing the Reset button. Also, the presentation is slightly enhanced by zebra-striping the rows to make them easier to view.

As you filter or paginate the data, the SQL query, below the grid, is updated.

Test Data

To allow us to develop and test the grid, we create some test data in the MySQL database. I use the same schema and and data from an earlier posting. Below is the SQL statement to create the table:

create table if not exists p0020_shirt (Region varchar(8), Category varchar(8), Shirt_Style varchar(8), ShipDate date, Units integer, Price decimal(4,2), Cost decimal(4,2));

We add data into the table using a series of insert statements, like the following:

insert into p0020_shirt values ('East','Boys','Tee',date('2005-01-01'),11,5.25,4.66);
insert into p0020_shirt values ('East','Boys','Golf',date('2005-01-01'),12,5.26,4.57);
insert into p0020_shirt values ('East','Boys','Polo',date('2005-01-01'),13,5.27,5.01);
insert into p0020_shirt values ('East','Girls','Tee',date('2005-01-01'),14,5.28,5.01);

We filter and paginate our data in one select statement using the where, and limit and offset clauses:

select * from [table reference] [filter] limit [number of rows] offset [position]

The filter is just a where clause, generated when the user selects a column and value to filter the rows.

The filter controls are implemented using drop down lists, and to populate them, we fetch all unique values for each string column using the distinct option in a select statement, such as: select distinct [column name] from [table reference] order by 1 asc.

When populating the drop down lists, we should also let the user reset the filter. To this end, we add a dummy value 'All' into the drop down list. Rather than having a special step in the PHP code when it generates the drop down lists, we can use union to combine the results of two select statements into one result set, then use a loop to populate the drop down list:

select 'All'
select distinct [column name] from [table reference] order by 1 asc

For example, the statement above would generate a list like (All, East, North, South, West).

The pagination controls just modify the offset position by adding or subtracting a constant and the current offset. We can easily stop the user from paging before the first row by ensuring that the offset value is always 0 or more. To stop the user from paging beyond the last row, we use another query to count the number of rows a query would return: select count(*) from [table reference] [filter]. (In the PHP code, we also pad the table with empty rows if there are fewer rows in the dataset than the standard number of rows so that the height of the table doesn't change in the last page.)

PHP Data Grid Implementation

By moving as much of the logic into SQL statements, the PHP implementation of the data grid control is straightforward. If you view the PHP source code, you will see that the entry point, the main() function, initializes variables using default values or from a previous form submission, connects to the database, then paints the drop down lists, data rows and pagination controls.

Enhance Data Grid UI with Javascript and jQuery

HTML forms should be viewable and usable without Javascript. If Javascript is enabled, then we can enhance the presentation and usability. With this in mind, the PHP code should just generate the non-Javascript form and create no event handlers, and if Javascript is enabled, the browser should use Javascript to add event handlers to HTML elements.

One library that makes it easier to manipulate the DOM in a browser is jQuery. For instance, the laborious DOM function calls such as document.getElementsById() are replaced by simpler ones such as $().

In this data grid example, the Javascript functions to enhance the presentation or add interactivity can be found page's head element.

Perhaps one odd feature is to hide the Submit button if Javascript is enabled. If Javascript is disabled, then the user must press the Submit button to submit the form. On the other hand, if Javascript is enabled, just changing the drop down list causes the onchange event handler to submit the form, enabled, so the Submit button is redundant.

Unexpected Problems

While developing this data grid control, I stumbled upon two unexpected problems.

The first problem is that the form can't reset select elements (the filters) to the first option in the list. Resetting the select elements just sets them to the default option, which is the option chosen when this form is generated in PHP.

The second problem was the Internet Explorer Submit Button Bug.


One obvious improvement to this data grid is to reduce the number of database queries required just to update the data rows or filters. At the moment, there are 5 queries (one for each filter, one for the data rows, and one for counting the number of rows) each time the page is updated. While this is not a problem for a small dataset or a small number of users, it may quickly overload a server with a lot of users and more complex queries. This improvement can be implemented using AJAX to update each control without refreshing the entire page.

Another improvement is to auto-generate the filter lists based on rules of the number of unique values and the column types. For example, a filter for a text column could allow the user to enter a regular expression or to auto-complete as the user enters more letters. A filter for numeric values could automatically generate quartiles. Finally, a filter for dates could automatically contain months, quarters or years.


This article has presented a way to implement a simple data grid control using PHP and jQuery. SQL queries are used as much as possible to simplify the page generation logic in PHP. PHP is used to provide the business logic to query the database and generate a basic form. jQuery (and Javascript) are used to enhance the usability and presentation of the page. I'll keep exploring this approach to see if it would make it easier to develop and maintain.

25 April 2009

Internet Explorer Submit Button Bug

While testing a simple form in Firefox and MSIE 7, I found that my server-side program got different values for HTML submit buttons depending on the browser. Below is a test HTML file to demonstrate the problem. Copy and paste the text below into a test file, open the file in your browser and press the Test button.

    <form method='get'>
      <button name='testSubmitButton' type='submit' value='value1'>Test</button>

Examine the URL in the address bar.

In Firefox 3.0.9 and Opera 9.52, the URL is: file:// … MsieSubmitButton.html?testSubmitButton=value1, so the value is sent from the browser with the button name.

In MSIE 6.0.2900.5512 and MSIE 7.0.6001.18000, the URL is: file:// … MsieSubmitButton.html?testSubmitButton=Test, so the text of the button is sent from the browser with the button name.

Both MSIE's behaviour doesn't conform to the HTML 4.01 recommendation on Controls:

Each control has both an initial value and a current value, both of which are character strings. … In general, a control's "initial value" may be specified with the control element's value (emphasis mine) attribute. …
The control's "current value" is first set to the initial value…

When a form is submitted for processing, some controls have their name paired with their current value and these pairs are submitted with the form. …

My workaround was to give each submit button a unique name and change the server-side program to use the button name instead of the button value. While this was adequate for my task, it was an annoying, basic problem that shouldn't still exist.

While researching this bug, I found other issues with MSIE when using submit buttons in HTML forms, so be careful!

See Also

23 April 2009

Enabling and Disabling Javascript in Internet Explorer 7

How to enable or disable Javascript in MSIE7:

  1. Select Tools / Internet Options.
  2. In the Internet Options dialog, select Security tab.
  3. In the Security tab, select Custom Level button.
  4. In the Security Settings dialog, scroll down the tree of properties in the Settings list until you find the Scripting / Active Scripting node, then select the Disable or Enable radio button to disable or enable Javascript, respectively.
  5. Press OK buttons to accept your change.

That's a lot of steps if you want test if your Web page works with and without Javascript! It's very annoying that Microsoft decided to call the feature Active scripting instead of Javascript, that you can't expand or collapse the nodes in the Settings tree, nor can you easily jump to the required node by typing the first few letters (the control jumps first to ActiveX controls … first)!

See Also

22 April 2009

Make Columns Narrower with GreaseMonkey or Stylish

It is hard to read a lot of text in a wide browser window, so below are two GreaseMonkey scripts to make the text columns narrower. They both work by changing an element's CSS width to a percentage of the window width.

If the text is within a known element tag (e.g. p), try the following script, which iterates through the collection of elements and sets the elements CSS width.

for each (e in document.getElementsByTagName('p')) {
  e.style.width = '60%';

If all the text is all within a single containing element, such as the body element, then the script can be shortened to …

document.getElementsByTagName('body')[0].style.width = '60%';.

If you want to also centre the text in the window, just change move the left margin by adding e.style.marginLeft = '20%';. marginLeft is the Javascript equivalent of CSS' margin-left property.

Later … I realised that since we're only changing CSS properties, it's a lot easier to use the Stylish add-in. Just add a style sheet like this:

  body {
    margin-left : 20%;
    width : 60%;

See also

21 April 2009

Open WinCVS in a Directory or Folder

WinCVS' tips.txt file includes this tip: If you run WinCvs with a file or directory name as a command line argument, then WinCvs will locate and select it for you so you can operate on it right away. To use this feature in Windows, create a new shortcut, and in the shortcut's Target field, enter: <path to wincvs.exe> <directory path>. When you click on this shortcut, WinCvs will open in the specified folder.

Note that you start WinCvs with a path argument, the folder shown in the Workspace pane shows the correct folder but the path shown in Browser Bar is not updated; it shows the last one you chose in a previous session.

20 April 2009

Outlook 2003 Paste Special disabled

When you write or respond to a HTML-formatted message in Outlook 2003, and paste some text from another source (e.g. a Web page), the pasted text looks out of place because Outlook uses the string's original formatting, which is almost always different from the formatting in the message. If, like me, you find multiple fonts in a message ugly, you'd want to paste the text into the message without any formatting, as in MS Word's Paste Special command. However, when editing a message in Outlook 2003 in HTML format, the Edit / Paste Special menu item is disabled. According to this thread, it's only enabled if you use MS Word as your editor!

One workaround is to use GnuWin32 commands and a pipe: getclip | putclip. getclip outputs the unformatted string from the clipboard and putclip copies its input string back into the clipboard. Now, when you paste your text, it won't have any formatting.

07 April 2009

Microsoft Access dummy row

If you want to generate a dummy or extra row, say for a combo box control, then you can write a SQL statement like this: select col1 from table union all select 'All' from table. What this statement does is create an extra row with text All in the last row in your combo box.

While testing, we found that this statement works in Microsoft Access 2003 but not in Access 2000. In Access 2000, it works if your table has at least one row, but if your table has no rows, no result is returned! However, if you use the OLEDB interface, the dummy row is returned, so at least we can continue working.

25 March 2009

Facebook News Feed workaround and RSS feeds

The new Facebook News Feed is pretty useless because, unlike the old one, you can't filter notifications based on type, nor does it collapse multiple notifications from the same application into one. What you end up with is a long list of notifications from every friend and application each time, and you simply miss notifications that interest you (e.g. x is now a friend of y) in all the noise.

If you use an RSS reader, you can workaround this problem by using an RSS feed where possible. Unfortunately, Facebook only provides RSS feeds for three types of events: Links (My Friends' Links), Notifications (Your Notifications) and Status Updates (Friends' Status Feeds). At least if you set up your reader for those events, you won't overlook them.

2009-03-26: Added names of the RSS feeds for each of the pages in parentheses.

2009-03-27: Noticed that Facebook updates their feeds once every 30 minutes.

2009-06-07: There is also an RSS feed for My Friends' Notes which lists the last note they made. Also added '… and RSS feeds so that I can find this entry more easily in future.

05 February 2009

Firefox most popular browser

Unbelievable. The browser statistics for W3 Schools show that in January '09, Firefox was more popular than all versions of MSIE combined: 45.5% vs 44.8%. Also, there are more visits with browsers that support SVG (Firefox, Chrome, Opera and Safari) than with browsers that don't (MSIE family): 54.7% versus 44.8%.

Yes, it's only one site frequented by web developers, not representative of the average user, yada yada, …

29 January 2009

Outlook 2003 rules don't support wildcards

I use MS Outlook 2003's rules to sort notification e-mail from a bug reporting system into different folders, one folder for each combination of product and major customer. Each message is sorted based on a keyword in the message subject or body. Sorting by product name easy because it is a sub-string in the subject. Sorting by customer name requires a rule to examine the body of message and find a string whose pattern is release: <version> <customer>. Here's a simple regular expression to match this string: release: ... name (assuming that version numbers are always in the form x.y).

But I got ahead of myself. After a fruitless half an hour, I realised that Outlook 2003 rules don't support regular expressions, nor do they support wildcards, for pattern matching. If you can find a reference, please send it to me.

Rather than spend more time hacking a VBA script, I just created rules that matched the product name in the subject and the customer name anywhere in the body of the e-mail. These rules work almost all the time, and only fail when someone enters the customer name somewhere in the body of the e-mail, which is different from the release name generated by the bug reporting system.

25 January 2009

Writing a good bug report

Have you ever reported a bug to a forum or mailing list, only to have developers pester you with dumb questions or, even worse, no one seems to be interested in helping you? Why do they keep asking you for more and more information? Aren't they interested in fixing bugs? Why won't anyone help you?

Being on the sending and receiving end of bug reports for years, I find that most developers take pride in their software and want to fix bugs but can't action bug reports unless they have at least the following information:

Software version
Quote the version string of the software. For Windows software, you can usually find the version string in the Help / About dialog. If you're using command-line software, check the on-line help for the required command-line option. For example, GnuWin32 utilities display their version string if you use the --version option.
Operating system configuration
If you're using Microsoft Windows, you can open a command console and type systeminfo. You'll see plenty of information about your computer, include hotfixes and hardware information. Some of the information may be private, so you decide how much you want to share when you write your report.
Input data
Sample files or parameters used to cause the problem.
Steps to reproduce the problem
Provide every step you took.
Expected results
What you think should have happened when you found the bug instead.

Often, users (and testers) balk at the thought of writing down every step or the expected results because they seem superfluous. "Don't developers know how their own software works?" you ask? Can't you just write, Stratoblaster feature doesn't work. Fix it! Problem is that unless you write down all the steps and what you think should happen, your bug will probably be ignored (if it's free – beer or libre – software) or developers can't figure out what you're going on about and write back asking for more information (if you're paying for support). Put yourself in the developers' shoes: how can they (1) figure how to reproduce the bug and (2) know if they have fixed it?

If you accept that you have to provide all the steps, then how much detail do you have to give? My rule is to imagine that I want to explain to a friend how to use a feature. For example in Firefox, let's pretend there's a bug moving a bookmark from one folder to another. Here's how I would write the steps and expected results of a bug report:

  1. Select Bookmarks / Organize Bookmarks menu item. Firefox should open the Library dialog.
  2. In the Library dialog, select the bookmark you want to move using your mouse pointer.
  3. Click and drag the bookmark to the destination folder.
  4. Release the bookmark. BUG: Bookmark remains in original folder. EXPECTED: Bookmark is moved to destination folder.

Happy bug reporting!

24 January 2009

Beginning Firefox Ubiquity

Been playing with Firefox Ubiquity add-in for the past month and found myself using these commands:

define (def) word
Gives definition of a word in Ubiquity popup.
map <location>
Shows Google Map in Ubiquity popup. If you press Enter, it opens Google Maps in a new tab.
tab <tab name>
Displays the named tab. For example, tab gmail would display the Gmail tab. Useful if when I open too many tabs and can't see tabs that are off-screen.
translate (tr)
Translate selected text to English in Ubiquity popup. I browse Malaysian news sites but my BM is rusty, so I like to check that I read some text correctly. The translation fails if there are proper names in the start of the text, so I still have to visit a translation site.
twitter (tw)
Update your Twitter status. You have to provide Firefox with your Twitter user name and login to send the update.

22 January 2009

Firefox spell check requires dictionary

Maria H.'s version of Firefox didn't highlight misspelt words in multi-line fields, although the Check my spelling as I type option is checked. I found that I had a English / United States dictionary installed (probably from a previous installation) while she didn't, so the spell check worked on my PC but not on hers. It seems that the Firefox spell checker only works after you install a dictionary; another one of those things that are obvious in hindsight.

21 January 2009

Microsoft Visio 2003 insert page annoyance

When you insert a new page in MS Visio 2003, the page added in the right-most position, after all other page tabs. This behaviour is different from MS Excel, where a sheet is added before the current sheet. Visio's behaviour is annoying because after I create a new page, I have to move it to the position I want.

12 January 2009

Microsoft Word 2003 symbol dialog

By accident, I found that if you double-click on a symbol character in your document in Microsoft Word 2003, the Symbol dialog is displayed. However, how does MS Word define a symbol? For instance, the dialog doesn't appear if you click on a copyright symbol but it appears if you click on a smiley face symbol.


09 January 2009

Dispatch, Static versus Dynamic, Single versus Double or Multiple

I read Stuart Holloway's Java.Next #3 Dispatch and decided to make clear to myself the concept of Dispatch and how it is used.

Dispatch means to call a method of an object and is normally applicable in the context of object-oriented programming. The rest of this summary assumes that there is more than one function with the same name and revolves around selecting the function (or method) to call (see Multiple Dispatch).

Static Dispatch that the system can pick the function to call at compile-time, as opposed to Dynamic Dispatch where the system has to choose the function to call at run-time. Dynamic dispatch can be supported by the programming language (e.g. polymorphism) or implemented in a program (e.g. case-statement).

Single Dispatch means that the system picks the function to call based on one parameter. In common object-oriented languages (e.g. C++, Java or C#), that parameter is the type of the object (see Dynamic Dispatch). For example:

class A { fn(); }
class B { fn(); }

A a;
B b;


When the system encounters a.fn(), the system calls the function fn() in class A because the object a is of the type A.

Multiple Dispatch (or Multimethods) means that more than one parameter is considered when calling a function. Multiple Dispatch gives mentions some languages the support this feature but I'm only familiar with Groovy, so I found this example by Danno Ferrin where the function chosen depends on the types of the object and function argument.

Double Dispatch is a special case of multiple dispatch (see Double Dispatch) where two parameters are considered and it is often implemented using the Visitor Pattern in single dispatch languages.