13 December 2008

Gimp 2.6: Order drawing tools by keyboard accelerators

It may be easier to remember keyboard accelerators (or shortcuts) if an application's tools follow the arrangement of the keys on the keyboard. Here's how to apply that idea to the Gimp 2.6 Toolbox.

The default arrangement of tools has this key order: E (ellipse select), R (rectangle select), F (free select), U (fuzzy select), Shift+O (select by colour) …

Gimp Default Toolbox

The new arrangement follows the QWERTY layout: Q (align), E (ellipse select), R (rectangle select), T (text), U (fuzzy select) …

Gimp Rearranged Toolbox
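
The ordering idea can be sketched in a few lines of Python. This is only an illustration: Gimp does not sort the Toolbox for you, the names QWERTY_ORDER, key_position() and order_tools() are my own, and ignoring modifier keys such as Shift is an assumption.

```python
# Hypothetical illustration: sort tools by where their accelerator key sits
# on a QWERTY keyboard (top row first, left to right). Modifiers are ignored.
QWERTY_ORDER = "QWERTYUIOPASDFGHJKLZXCVBNM"


def key_position(accelerator):
    """Return the QWERTY position of the last key in e.g. 'Shift+O'."""
    key = accelerator.split("+")[-1].upper()
    return QWERTY_ORDER.index(key)


def order_tools(tools):
    """Sort (accelerator, tool name) pairs by keyboard position."""
    return sorted(tools, key=lambda tool: key_position(tool[0]))


# Only the accelerators named in this post; the full Toolbox has more tools.
tools = [
    ("E", "ellipse select"),
    ("R", "rectangle select"),
    ("F", "free select"),
    ("U", "fuzzy select"),
    ("Shift+O", "select by colour"),
    ("Q", "align"),
    ("T", "text"),
]

for accelerator, name in order_tools(tools):
    print(accelerator, name)
```

The sorted list is a target order to aim for when moving tools by hand in the Tools dialog.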

Here's how to find and move Gimp's drawing tools:

    Gimp Tool Dialog
  1. Open the Tools dialog (see picture on the right) by selecting menu item Windows / Dockable Dialogs / Tools.
  2. When you type a key, Gimp will highlight the tool in the Toolbox. For example, Q highlights the Align tool.
  3. Press the up or down arrow button at the bottom of the Tools dialog to move the required tool to the desired position.

It's easier to customize Gimp's toolbox than Visio's toolbar because you can find a tool by typing its keyboard accelerator, while in Visio you have to browse through the tools in the Command menu until you find the required one.

See Also

12 December 2008

Visio 2003: Order drawing tools by shortcut keys

By default, Visio 2003's drawing tools are divided between two toolbars, Standard and Drawing, and the tools are ordered by importance or frequency.

Julian W. and I worked out that it's easier to remember the tools' shortcut keys if you move all of them into a single toolbar and arrange them by the shortcut keys, i.e. from Ctrl+1 (Pointer) to Ctrl+9 (Ellipse). Drawing tools which use Ctrl+Shift go into another group and drawing tools which don't have any shortcut keys go into a third group. Another advantage of this arrangement is that you get visual feedback when a tool is selected using its keyboard shortcut (it's highlighted); in the default arrangement, the tool selected by the keyboard shortcut is not shown if it is not the first tool in a drop-down control.

See Also

04 December 2008

Microsoft Project 2007 print time scale problem

It seemed like a simple job: print the Gantt Chart of my project in Microsoft Project 2007 so that the time scale fits on one page. Below is a picture of what I wanted to do:

+------+
|Page 1|
+------+
|Page 2|
+------+
|Page n|
+------+

Yet MS Project 2007 always printed the final milestone on an even page, so that I ended up with twice the number of pages:

+---------+-------+
|Page 1   |Page 2 |
+---------+-------+
|Page 3   |Page 4 |
+---------+-------+
|Page 2n-1|Page 2n|
+---------+-------+

A hack by Pak K. was to shrink the view such that the scale in the top tier of dates in the Gantt Chart is in quarters. A related hack I found is to specify only one tier in the Timescale dialog (select menu item Format / Timescale).

The notes in the "Print a view" on-line help topic suggest that, when printing, the timescale should be scaled down a little to fit a small amount of overflow, but that is not the case.

01 December 2008

Microsoft Word 2003 extra find next and find previous keyboard shortcuts

Unexpectedly found some extra keyboard shortcuts for Find Next and Find Previous functions in Microsoft Word 2003. The online help only lists Alt+Ctrl+Y for Find Next and no shortcut for Find Previous. You can also use Ctrl+PgDn for Find Next and Ctrl+PgUp for Find Previous. Bonus!

Later … Julian W. pointed out that this is part of the Browse Object feature. Press the circle icon below the page's scroll bar or type Alt+Ctrl+Home and Word displays the Browse Object panel. Select the object type you want to browse, e.g. Browse by Heading, then press the double-up or double-down arrowheads in the scrollbar to go to the previous or next object (in this case, heading). The tooltip displays the keyboard shortcuts, which are Ctrl+PgUp and Ctrl+PgDn, respectively.

Even later … Found Microsoft reference.

See Also

28 November 2008

Theme a site with Greasemonkey and Stylish

Motivated by Gmail's Terminal theme, I hacked a similar theme for an internal Web site using Firefox's Greasemonkey and Stylish add-ins. My approach was to write some small Greasemonkey scripts to touch up different parts of all pages independently, then use Stylish to apply style definitions (the theme) from one CSS file. If you want to create another theme for this site, just create a new CSS file.

  1. First, I identified logical components of each web page, for example navigation bar, header, footers, forms and tables.
  2. Wrote a Greasemonkey script for each logical component that:
    1. Removed any physical formatting such as border or bgcolor.
    2. Labelled each component with an ID and / or a class name.
  3. When naming each script, prefixed each script name with the site name to make them easy to find in Greasemonkey's Manage User Scripts dialog.
  4. Configured Greasemonkey's Included and Excluded pages to call each script as required.
  5. Wrote a site-wide CSS file, using the IDs and class names that were previously defined.
  6. Added the CSS file to Stylish. The CSS file name is prefixed with the site and theme name.

Instant theme!

See Also

20 November 2008

Foxit PDF Reader Page Navigation

The Foxit PDF reader lets you turn to the next page using Space or Right, or to the previous page using the Shift+Space, Left or Backspace shortcut keys. If you scroll a page instead, the page moves up or down by the height of the view, so the next or previous page is always slightly shifted vertically because the view is not the same height as the page. After scrolling several pages, you have to adjust the view to display the whole of a page again. When reading paginated documents, it's easier to turn pages than to scroll through them.

Foxit developers: The function names in italics are swapped in the on-line help.

Note: Maria H. reports that the same keys also work for the Adobe PDF Reader.

19 November 2008

Accelerate and Brake Moderately

There are many ideas for saving petrol, but which is the most effective? As a starting point, my Magna TS consumed 16L/100km of petrol in city driving. After some months of experimentation with different techniques, I found that the most effective method was (taa-daa!) … to accelerate and brake moderately. Many other techniques (e.g. leaving appropriate space behind the vehicle in front, driving smoothly) are a consequence of this one. Now, my fuel consumption is down to 12L/100km.

Of course, your mileage may vary!

08 November 2008

Simple Clock Custom Control using Swing and Windows.Forms

This article describes how to create a simple analogue clock using Java Swing and .Net Windows.Forms. It covers creating a custom control, simple 2D drawing and updating the display regularly with a timer. In addition, the control can display the time in a user-selectable time zone and is resizable.

To give you an idea of the goal, below are images of the Swing and Windows.Forms control, respectively, in a test application:

Clock Control Java Clock Control .Net

Define Custom Control

The custom control has to draw a clock face and hands. We specify the height and width of the clock face and use a high-quality drawing mode to smooth out the lines and avoid jaggies. By default, the control will display the time in the current time zone.

Java Swing

Sub-class JPanel (itself a JComponent) and override the paintComponent() method:

public class ClockPanel extends JPanel {
  private double cx, cy, diameter, radius;
  private final double RADIAN = 180.0 / Math.PI;
  private TimeZone timeZone = null;

  public void paintComponent(Graphics g) {
    super.paintComponent(g);
    Graphics2D g2 = (Graphics2D) g;
    g2.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON);
    paintFace(g2);
    paintHands(g2);
  }
}

Windows.Forms

Sub-class Control and override the OnPaint() method:

  public partial class ClockControl : Control {
    private double cx, cy, diameter, radius;
    private const double RADIAN = 180.0 / Math.PI;
    private TimeZoneInfo tzi = TimeZoneInfo.Local;

    protected override void OnPaint(PaintEventArgs e) {
      base.OnPaint(e);
      Graphics g = e.Graphics;
      g.SmoothingMode = System.Drawing.Drawing2D.SmoothingMode.HighQuality;
      PaintFace(g);
      PaintHands(g);
    }
  }

2D Drawing

The clock is drawn in two steps: first a circular clock face is drawn, then the hands are drawn on top of the face. The face is circular, so the minimum of the height and width of the drawable area is used.

Clock Face

To draw the clock face's numbers, we convert each number into a string, then an image, compute that image's mid-point and place the number using that mid-point. If we don't use the mid-point, then each number will be placed too high and on the right of the tip of the clock hands.

Java Swing
  private void paintFace(Graphics2D g2) {
    g2.setPaint(Color.GRAY);
    g2.fill(new Ellipse2D.Double(0, 0, diameter, diameter));
    g2.setPaint(Color.LIGHT_GRAY);
    g2.fill(new Ellipse2D.Double(20, 20, diameter - 40, diameter - 40));
    g2.setPaint(Color.BLACK);
    for (int i = 1; i <= 12; i++) {
      TextLayout tl = new TextLayout(Integer.toString(i), getFont(), g2.getFontRenderContext());
      Rectangle2D bb = tl.getBounds();
      double angle = (i * 30 - 90) / RADIAN;
      // getCenterX() and getCenterY() already return the mid-point of the bounds.
      double x = (radius - 10) * Math.cos(angle) - bb.getCenterX();
      double y = (radius - 10) * Math.sin(angle) - bb.getCenterY();
      tl.draw(g2, (float)(cx + x), (float)(cy + y));
    }
  }
Windows.Forms
    private void PaintFace(Graphics g) {
      SolidBrush brush = new SolidBrush(Color.DarkGray);
      g.FillEllipse(brush, 0, 0, (float)diameter, (float)diameter);
      brush.Color = Color.LightGray;
      g.FillEllipse(brush, 20, 20, (float)(diameter - 40), (float)(diameter - 40));
      brush.Color = Color.Black;
      for (int i = 1; i <= 12; i++) {
        string num = Convert.ToString(i);
        Size size = TextRenderer.MeasureText(g, num, Font, new Size(int.MaxValue, int.MaxValue), TextFormatFlags.NoPadding);
        double angle = (i * 30 - 90) / RADIAN;
        double x = (radius - 10) * Math.Cos(angle) - size.Width/2;
        double y = (radius - 10) * Math.Sin(angle) - size.Height/2;
        g.DrawString(num, Font, brush, (float)(cx+x), (float)(cy+y));
      }
      brush.Dispose();
    }
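
To check the placement arithmetic independently of either toolkit, here is a minimal Python sketch. It assumes a drawing call that takes the text's top-left corner (as DrawString() does); the function name number_position() and its parameters are my own, not part of either control.

```python
import math


def number_position(i, cx, cy, radius, text_width, text_height, margin=10):
    """Top-left corner at which to draw hour number i (1..12) so that the
    text's mid-point sits on a circle of (radius - margin) around (cx, cy)."""
    angle = math.radians(i * 30 - 90)  # 12 o'clock is straight up
    mid_x = cx + (radius - margin) * math.cos(angle)
    mid_y = cy + (radius - margin) * math.sin(angle)
    # Shift by half the text size so the mid-point, not the corner, is placed.
    return mid_x - text_width / 2, mid_y - text_height / 2


# Example: the "3" on a 200-pixel clock face, with text measuring 8 x 12.
x, y = number_position(3, cx=100, cy=100, radius=100, text_width=8, text_height=12)
print(x, y)
```

Without the final half-size shift, the corner of the text rather than its centre would land on the circle, which is the offset described above.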

Calculate Hand Positions

For each tick, we calculate the position of the hour, minute and second hands based on the current time and time zone. The hour hand is moved slightly forward each minute, and the minute hand is moved slightly forward each second, so that they don't jump at the start of the next hour and minute, respectively.

Java Swing
  private void paintHands(Graphics2D g2) {
    Calendar now = Calendar.getInstance(timeZone);
    int hour = now.get(Calendar.HOUR_OF_DAY);
    int minute = now.get(Calendar.MINUTE);
    int second = now.get(Calendar.SECOND);
    double angle = 0.0;
    
    angle = (hour % 12 * 30 + minute / 2.0) / RADIAN; // 2.0 avoids integer division
    paintHand(g2, 0.4, angle, 6, Color.YELLOW);
    
    angle = (minute * 6 + second / 10.0) / RADIAN; // 10.0 avoids integer division
    paintHand(g2, 0.6, angle, 4, Color.BLUE);
    
    angle = second * 6 / RADIAN;
    paintHand(g2, 0.8, angle, 2, Color.RED);
  }
Windows.Forms
    private void PaintHands(Graphics g) {
      DateTime dt = DateTime.UtcNow + tzi.GetUtcOffset(DateTimeOffset.UtcNow);
      double angle = 0.0;

      angle = (dt.Hour % 12 * 30 + dt.Minute / 2.0) / RADIAN; // 2.0 avoids integer division
      PaintHand(g, 0.4, angle, 6f, Color.Yellow);

      angle = (dt.Minute * 6 + dt.Second / 10.0) / RADIAN; // 10.0 avoids integer division
      PaintHand(g, 0.6, angle, 4f, Color.Blue);

      angle = dt.Second * 6 / RADIAN;
      PaintHand(g, 0.8, angle, 2f, Color.Red);
    }
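
The hand angles are easy to check by hand. Below is a small Python sketch of the same arithmetic in degrees, measured clockwise from 12 o'clock; hand_angles() is a name I made up for this illustration. It uses floating-point division so the hour and minute hands creep forward a little on every tick.

```python
def hand_angles(hour, minute, second):
    """Return (hour, minute, second) hand angles in degrees, clockwise from 12."""
    hour_angle = (hour % 12) * 30 + minute / 2   # 30 deg per hour, 0.5 deg per minute
    minute_angle = minute * 6 + second / 10      # 6 deg per minute, 0.1 deg per second
    second_angle = second * 6                    # 6 deg per second
    return hour_angle, minute_angle, second_angle


# At 15:30:00 the hour hand is halfway between 3 and 4.
print(hand_angles(15, 30, 0))
```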

Draw Hands

Now, we draw each clock hand.

Java Swing
  private void paintHand(Graphics2D g2, double proportion, double angle, float width, Color color) {
    double x = radius * proportion * Math.sin(angle);
    double y = -radius * proportion * Math.cos(angle);
    g2.setPaint(color);
    g2.setStroke(new BasicStroke(width, BasicStroke.CAP_ROUND, BasicStroke.JOIN_ROUND));
    g2.draw(new Line2D.Double(cx, cy, cx + x, cy + y));
  }
Windows.Forms
    private void PaintHand(Graphics g, double proportion, double angle, float width, Color color) {
      double x = radius * proportion * Math.Sin(angle);
      double y = -radius * proportion * Math.Cos(angle);
      Pen pen = new Pen(color, width);
      pen.EndCap = System.Drawing.Drawing2D.LineCap.Round;
      g.DrawLine(pen, (float)cx, (float)cy, (float)(cx + x), (float)(cy + y));
      pen.Dispose();
    }
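
The sign on the y term is the part that is easy to get wrong, so here is a minimal Python sketch of the end-point calculation. The function name hand_endpoint() is hypothetical, and it takes the angle in degrees for readability, whereas the controls above pass radians.

```python
import math


def hand_endpoint(cx, cy, radius, proportion, angle_degrees):
    """End point of a hand drawn from the centre (cx, cy).

    angle_degrees is measured clockwise from 12 o'clock. Screen y grows
    downwards, hence the minus sign on the cosine term."""
    angle = math.radians(angle_degrees)
    x = cx + radius * proportion * math.sin(angle)
    y = cy - radius * proportion * math.cos(angle)
    return x, y


# A second hand (proportion 0.8) pointing at 12 ends straight above the centre.
print(hand_endpoint(100, 100, 100, 0.8, 0))
```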

Timer

The clock control updates itself every second using a timer. We set up the timer in the control's constructor.

Java Swing

Every 1000 ms, the timer adds a repaint request to the Swing event queue, which eventually results in a call to the object's paintComponent() method.

  public ClockPanel() {
    super();
    ...  
    setTimeZone(TimeZone.getDefault());
    Timer t = new Timer(1000, new ActionListener() {
      public void actionPerformed(ActionEvent e) {
        repaint();
      }
    });
    t.start();
  }

Windows.Forms

Every 1000 ms, the timer calls the timer_Tick() function, which in turn invalidates the clock control so that Windows.Forms will call the control's OnPaint() method (that's how I think it works).

    public ClockControl() {
      CalculateSize(); // Control's size is available at this point.
      Timer timer = new System.Windows.Forms.Timer();
      timer.Enabled = true;
      timer.Interval = 1000;
      timer.Tick += new EventHandler(timer_Tick);
    }

    void timer_Tick(object sender, EventArgs e) {
      Invalidate();
    }

Time Zones

For interest, the clock control displays the time in a specified time zone, so we provide methods for an external caller to get and set the control's time zone.

Java Swing

  public TimeZone getTimeZone() { return timeZone; }
  public void setTimeZone(TimeZone tz) { timeZone = tz; }

Windows.Forms

    public TimeZoneInfo TZI {
      get { return tzi; }
      set { tzi = value; }
    }

Test Code

To test the clock control, create a test application.

NetBeans

  1. Using NetBeans, create a new application sample form called SimpleClockView.
  2. Drag the clock's source code icon from the Projects window into SimpleClockView's Design pane.
  3. Drag a JComboBox, called timeZoneList, into the Design pane.
  4. Add the following action handler into timeZoneList to change the clock's time zone when the user selects a new time zone:
      private void timeZoneListActionPerformed(java.awt.event.ActionEvent evt) {
        // TODO add your handling code here:
        String tzID = (String) this.timeZoneList.getSelectedItem();
        clockPanel1.setTimeZone(TimeZone.getTimeZone(tzID));
      }                                            
    
  5. Initialize timeZoneList with a list of time zones in SimpleClockView's constructor:
    public class SimpleClockView extends FrameView {
    
        public SimpleClockView(SingleFrameApplication app) {
            super(app);
    
            initComponents();
            String[] sortedTzID = TimeZone.getAvailableIDs();
            Arrays.sort(sortedTzID);
            timeZoneList.setModel(new javax.swing.DefaultComboBoxModel(sortedTzID));
        ...
    

SharpDevelop

  1. Using SharpDevelop, create a new Windows Applications Form called Form1.
  2. Drag your new clock control from the Toolbox into the Form1's Designer pane.
  3. Drag a ListBox, called timeZoneList, into the Design pane.
  4. Bind timeZoneList's SelectedIndexChanged event to the timeZoneList_SelectedIndexChanged() function to change the clock's time zone when the user selects a new time zone:
        private void timeZoneList_SelectedIndexChanged(object sender, EventArgs e) {
          String tzId = this.timeZoneList.SelectedValue as String;
          if (tzId != null) this.clockControl1.TZI = TimeZoneInfo.FindSystemTimeZoneById(tzId);
        }
    
  5. Initialize timeZoneList with a list of time zones in Form1's constructor:
    namespace ClockNS {
      public partial class Form1 : Form {
        public Form1() {
          InitializeComponent();
          this.timeZoneList.DataSource = TimeZoneInfo.GetSystemTimeZones();
          this.timeZoneList.DisplayMember = "DisplayName";
          this.timeZoneList.ValueMember = "Id"; // SelectedValue then returns the zone's Id
        }
      ...
      }
    }
    

Handling Resizing

A nice additional feature is for the clock control to resize itself when the test application's window is resized.

Java Swing

Add a componentResized handler to recalculate the component's size and request Swing to repaint the component. Your component may have a small inset within its boundary, so you should take that into account when calculating the clock's size.

  public ClockPanel() {
    super();
  
    addComponentListener(new ComponentAdapter() {
      @Override
      public void componentResized(ComponentEvent e) {
        calculateSize();
        repaint();
      }
    });
    ...
  }

  private void calculateSize() {
    Insets insets = getInsets();
    int width = getWidth() - insets.left - insets.right;
    int height = getHeight() - insets.top - insets.bottom;
    diameter = Math.min(width, height);
    cx = cy = radius = diameter / 2;
  }

Windows.Forms

Override the base class' OnResize() function to recalculate the clock's size and redraw the control.

    protected override void OnResize(EventArgs e) {
      base.OnResize(e);
      CalculateSize();
      Invalidate();
    }

    private void CalculateSize() {
      diameter = Math.Min(this.Width, this.Height) - 2;
      cx = cy = radius = diameter / 2;
    }

Conclusion

I've described how to create a custom control in Java Swing + NetBeans and .Net Windows.Forms + SharpDevelop using (nearly) the same procedure and structure. I experiment with both environments because a feature in one motivates me to find an equivalent feature in the other and to synthesize a common (and hopefully better) solution.

See Also

25 October 2008

Reusing Custom Page Styles With @-moz-document

After using the Firefox Stylish add-in to increase the contrast between the text and background colours of domains, I wanted to apply the same rule to different domains. In this case, I want to reuse the text colour of P elements in two or more domains.

Open a Stylish script and note that it has a sequence of @-moz-document definitions, for example:

@-moz-document domain(addons.mozilla.org) {
  p { color : black; }
}

@-moz-document domain(msdn.com) {
  p { color : black; }
}

The @-moz-document rule can accept a comma-separated URL list, so I can re-use my style definition like this:

@-moz-document domain(addons.mozilla.org), domain(msdn.com) {
  p { color : black; }
}
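
If you maintain the same rule for many domains, the combined block can be generated instead of typed. Below is a minimal Python sketch; moz_document_rule() is a hypothetical helper of mine, not part of Stylish, and it only handles domain() matches.

```python
def moz_document_rule(domains, css_body):
    """Build one @-moz-document block that applies css_body to every domain."""
    selectors = ", ".join("domain({})".format(d) for d in domains)
    return "@-moz-document {} {{\n  {}\n}}".format(selectors, css_body)


print(moz_document_rule(["addons.mozilla.org", "msdn.com"], "p { color : black; }"))
```

Paste the printed text into a Stylish style to apply it.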

See Also

04 October 2008

NetBeans and WinCVS local repository

On my Vista computer, I have a local CVS repository for version control in this folder: C:\Users\<user>\Documents\Repository. I checked out a module using WinCVS 2.0.2.4 Build 4, so the module's CVSROOT is C:\Users\<user>\Documents\Repository. When I start any CVS operation in NetBeans 6.1, the operation is queued and hangs, or NetBeans reports this error: The pipe has been ended.

If you look into the NetBeans log file, C:\Users\<user>\AppData\Local\Temp\outputN (where N is a number), you can find the following entry: cvs [server aborted]: C:\Users\<user>\Documents\Repository: no such repository.

Here's a workaround using CVSNT 2.0.51d.

First, add a repository name in the CVSNT Service. For me, I use Repository:

  1. Open Windows' Control Panel.
  2. Select the CVS for NT icon. Windows should open the CVSNT applet (in Vista, you have to confirm that you allow this legacy CPL applet to run).
  3. In the Service Status tab, stop CVS Service and CVS Lock Service.
  4. Select the Repositories tab.
  5. Press the Add button to create a new repository name. The Edit Repository dialog should appear.
  6. Set Location: = C:\Users\<user>\Documents\Repository.
  7. Set Name: = /Repository.
  8. Press the OK button to save your repository name and close the Edit Repository dialog.
  9. Select the Service Status tab and start CVS Service and CVS Lock Service.
  10. Press the OK button to save your settings and close the CVSNT applet.

Second, edit all CVS/ROOT files in your project and replace the folder path with: :local:/Repository.

This entry tells NetBeans to use the :local: protocol to find Repository. However, WinCVS does not understand the :local: protocol, so you can't use WinCVS to maintain your project any more.
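
Editing every CVS/ROOT file by hand is tedious in a project with many folders. Here is a minimal Python sketch that does it in one pass, assuming Python is installed; rewrite_cvs_roots() is my own helper name and the project path in the usage comment is a placeholder. Back up your working copy first, because the files are overwritten in place.

```python
import os


def rewrite_cvs_roots(project_dir, new_root=":local:/Repository"):
    """Replace the contents of every CVS/Root file under project_dir.

    Returns the list of files that were rewritten."""
    changed = []
    for folder, _subfolders, files in os.walk(project_dir):
        if os.path.basename(folder).upper() != "CVS":
            continue
        for name in files:
            if name.upper() == "ROOT":  # CVS writes 'Root'; match any case
                path = os.path.join(folder, name)
                with open(path, "w") as handle:
                    handle.write(new_root + "\n")
                changed.append(path)
    return changed


# Usage (placeholder path, adjust to your own project folder):
# for path in rewrite_cvs_roots(r"C:\path\to\project"):
#     print("updated", path)
```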

30 September 2008

Black Text, Please

It makes my eyes water to read grey text on a white background, so much so that I was motivated to learn how to use the Greasemonkey Ain't It Readable script to set text in <p> elements to black. Using Greasemonkey just to set a single CSS attribute was overkill, so I turned it off and wrote this rule for the Stylish add-in: p { color:black; }.

See Also

27 September 2008

GDB 6.8 in NetBeans 6.1 Hangs

When trying to debug a C++ program in NetBeans 6.1 using MinGW gdb 6.8, gdb would hang when starting (and cause NetBeans to hang as well). My workaround was to restart NetBeans, open the debug window (see Windows / Debug menu item) and delete all breakpoints.

The thread "gdb 6.8.0 debug process in netbeans 6.1 hangs?" shows where to find gdb's log file to help solve the problem (in Vista, it's C:\Users\<user>\AppData\Local\Temp\gdb-cmdsX.log).

17 September 2008

Cannot Open EAP (MDB) File

When I tried to open an old EAP file, EA displayed this error message: An Error has Occurred: The Microsoft Jet database engine cannot open the file '…'. It is already opened exclusively by another user, or you need permission to view its data.. I copied the EAP file to my hard disk and tried to open it again but EA displayed the same message. Maybe there's an internal lock on the file?

An EAP file is a Microsoft Access database, so I renamed the extension from '.eap' to '.mdb' and opened it in MS-Access to see if I could unlock the file. When MS-Access displayed this warning: The database '…' is read-only., I wondered if the solution was as simple as just removing the 'Read-only' attribute from the file. Tried it and now I can open my EAP file.
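
If you have several old EAP files to fix, the attribute can be cleared from a script instead of the file's Properties dialog. This is a minimal sketch, assuming Python is installed; clear_read_only() is my own name and the file name in the usage comment is a placeholder.

```python
import os
import stat


def clear_read_only(path):
    """Add the owner-write permission bit, which clears Windows' Read-only flag."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IWRITE)


# Usage (placeholder file name):
# clear_read_only("model.eap")
```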

13 September 2008

NetBeans JUnit "Forked Java VM exited abnormally"

While trying to run a test case for a Java program in NetBeans using JUnit, my test suite failed and I found this error:

Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.
junit.framework.AssertionFailedError
at org.netbeans.core.execution.RunClassThread.run(RunClassThread.java:151)

For some reason, NetBeans could not start a new virtual machine (I guess that's what the 'fork' means). I used NetBeans to debug the Ant test target and found in the nbproject\build-impl.xml file …

    <target name="-init-macrodef-junit">
        <macrodef name="junit" uri="http://www.netbeans.org/ns/j2se-project/3">
            <attribute default="${includes}" name="includes"/>
            <attribute default="${excludes}" name="excludes"/>
            <attribute default="**" name="testincludes"/>
            <sequential>
                <junit dir="${work.dir}" errorproperty="tests.failed" failureproperty="tests.failed" fork="true" showoutput="true">
…

replacing fork="true" with fork="false" gave a more meaningful error message:

Exit from within execution engine, normal
org.netbeans.core.execution.ExitSecurityException: Exit from within execution engine, normal
…
        at org.netbeans.core.execution.SecMan.checkExit(SecMan.java:66)
        at org.netbeans.TopSecurityManager.checkExit(TopSecurityManager.java:145)
        at java.lang.Runtime.exit(Runtime.java:88)
        at java.lang.System.exit(System.java:906)
        at PrintLineNumber.Main.main(Main.java:16)
        at PrintLineNumber.MainTest.testMainNoArguments(MainTest.java:47)
        at org.netbeans.core.execution.RunClassThread.run(RunClassThread.java:151)

So, the NetBeans security manager does not allow my program to call System.exit(1), which is fair enough, otherwise (I guess) NetBeans would exit as well.

The location of the problem is in the test code below, created by NetBeans, which calls the main() method of my application:

44  public void testMain() {
45    System.out.println("main");
46    String[] args = null;
47    Main.main(args);
48  }

And here is the code that was tripping JUnit:

15      if (args == null || args.length != 2) {
16        System.exit(1);
17      }

My code was checking if the argument list is null, but that should never happen in normal use because the Java launcher always calls main() with a String array, not null. I removed the test for a null args and allowed a NullPointerException to be thrown. Then I added a test case for this situation:

  @Test(expected=NullPointerException.class)
  public void testMainNullArguments() {
    Main.main(null);
  }

Now, JUnit runs my test suite to completion!

12 September 2008

ORA-24324, ORA-24323 and ORA-28547 Workaround

On a Windows notebook, one of our consultants couldn't connect to his local Oracle database. We deleted the database, but when we tried to create a new database using the Database Configuration Wizard, the wizard reported the following errors: ORA-24324, ORA-24323 and ORA-28547. Then we remembered that we recently installed software for a USB-based 3G modem, so we plugged in the modem and now Oracle works again! Don't know why at the moment, but we guess that the absence of the new network adapter was confusing Oracle.

Later … it seems that all that is required is to ensure that the modem adapter software is loaded into memory when Windows is restarted (it was originally disabled).

07 September 2008

Minor Bug Fix in Conquest Game

A game that I play, Conquest, occasionally crashes because of some bad input, so I ported it from Borland C to Visual C++ to compile and debug it. Some minimal changes required were:

  • Used double instead of float types.
  • Reimplemented clrscr() and gotoxy() functions using Windows Console functions.
  • Changed function names in proto.h: _cprintf(), _getch() and _putch().

In the VC++ debugger, I found that the problem occurs in the get_token() function, which parses the user's input. For some reason, when I type too fast, an illegal character is entered. Changing the get_line() function, called by get_token(), to only allow 7-bit ASCII characters in the input, seems to fix the problem.
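
The filter itself is tiny. The game is written in C, so the Python below is only a sketch of the idea, not the code in get_line(); keep_7bit_ascii() is a name I made up for the illustration.

```python
def keep_7bit_ascii(line):
    """Drop every character outside the 7-bit ASCII range (0..127)."""
    return "".join(ch for ch in line if ord(ch) < 128)


# A stray 8-bit character in the middle of the input is discarded.
print(keep_7bit_ascii("mo\xe9ve 3"))
```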

See Also

06 September 2008

Change Default Text Font in Eclipse

Here's how to change the default font used by all the text editors in the Eclipse IDE:

  1. Select menu item Window / Preferences.
  2. In Preferences dialog, select node General / Appearance / Colors and Fonts.
  3. In Colors and Fonts pane, select node Basic / Text Font.
  4. Press Change… button to open the font selection dialog, select the font that you like, then close that dialog.
  5. Press Apply button.

Text editors that have specific fonts selected won't use the font selected in the Text Font property. If you want an editor to always use the default text font, select that editor's font property in the Preferences dialog, then press the Reset button. You should see the text editor's font property change from … (overrides default: Text Font) to … (set to default: Text Font).

30 August 2008

NetBeans 'Main Project' Configuration Hack

I had an old folder of C++ exercises, containing a Makefile and some source files. Each source file would compile to one executable. I imported that folder into NetBeans then wondered how to run and debug each executable as a different 'Main Project'. After some twiddling, the solution (hack?) is to define different configurations for the project, one for each executable.

  1. Select your project's context menu item Set Configuration / Manage Configurations….
  2. In the Project Properties dialog, press the Manage Configurations button (I know that's rather confusing).
  3. In the Configurations dialog, create a new configuration with the target executable file name.
  4. Press the OK button to close the Configurations dialog.
  5. In the Project Properties dialog, select the Categories / Build / Make node.
  6. Select your configuration from the Configuration: drop down list. Note that you have to select any node other than General otherwise the Configuration: drop down list is disabled.
  7. In the Makefile panel:
    • Working Directory = . (current directory)
    • Build Command = ${MAKE} -f Makefile <file.exe>
    • Clean Command = ${MAKE} -f Makefile clean
    • Build Result = <file.exe>
  8. Press the OK button

Now, to run or debug a different executable, I just change the project's Active Configuration.

27 August 2008

What's the time?

14 quick ways to find the current time on your computer.

Cmd.exe has two built-in commands for the date and time. You have to add the /t option when calling these commands otherwise you are prompted to set the system time:

> date /t
Wed 27/08/2008
> time /t
07:42 PM

GnuWin's date command prints the date, time and time zone:

> date
Wed Aug 27 19:43:23 AUS Eastern Standard Time 2008

You can use the POSIX module in Perl to get the current date and time:

> perl -e "use POSIX; print asctime(localtime());"
Wed Aug 27 19:44:21 2008

Python has a time module similar to Perl's:

> python -c "import time; print(time.asctime())"
Wed Aug 27 19:48:07 2008

PHP's getdate() function returns an array, which you can dump using the print_r() function:

> php -r "print_r(getdate());"
Array
(
    [seconds] => 49
    [minutes] => 34
    [hours] => 14
    [mday] => 30
    [wday] => 6
    [mon] => 8
    [year] => 2008
    [yday] => 242
    [weekday] => Saturday
    [month] => August
    [0] => 1220070889
)

Ruby has a Time class:

> ruby -e "print Time.now"
Wed Aug 27 19:45:32 +1000 2008

PowerShell has a get-date cmdlet:

> get-date
Wednesday, 27 August 2008 7:50:13 PM

Or use the .Net System.DateTime.Now property in PowerShell:

> [System.DateTime]::Now
Thursday, 28 August 2008 9:53:21 AM

Firefox can tell you the time using the Javascript Date() object. Enter the following statement in your browser's address bar:

javascript:Date()
Wed Aug 27 2008 20:11:27 GMT+1000 (AUS Eastern Standard Time)

MSIE6 has a similar object but the output is different from Firefox's:

javascript:Date()
Thu Aug 28 10:06:59 2008

Groovy (and Java) has a java.util.Date object which defaults to the current time:

new java.util.Date()
Result: Thu Aug 28 09:58:45 EST 2008

24 August 2008

MinGW and Gdb in NetBeans

I configured NetBeans to use MinGW (Minimalist GNU for Windows), but when trying to debug an old C++ program, NetBeans kept displaying this error message: Gdb could not load your program. Terminating debug session. The version of gdb matched the MinGW distribution, so that wasn't the problem. After the usual bit of head-scratching, I ran gdb in a console and got this message: Reading symbols from <file>...(no debugging symbols found)...done. I manually compiled the source using g++ and found that my program was 3 times bigger than the NetBeans compiled version. So, the source of the problem is in the Makefile.

It turned out that when I imported my project into NetBeans, I forgot that my Makefile had this rule: $(CXX) -Wall -ggdb -s. The command-line option -s told the linker to strip all symbols, so gdb couldn't debug my program. Duh.

15 August 2008

Firefox Unexpectedly Popular

Falls in the use of older versions of MSIE are always offset by rises in the use of newer versions and some rise in the use of other browsers. The July 08 browser statistics for W3 Schools show that the use of all versions of MSIE fell 2.0% while the use of Firefox rose 1.6% (and Opera gained 0.2%). Has the use of MSIE7 peaked? (Insert usual caveats about selective audience of this site, monthly fluctuations, rounding errors, yada yada.)

05 August 2008

Change Default for Show Markup in MS-Word 2003

An annoying feature in Microsoft Word 2003 is that the default setting for the Reviewing toolbar is Final Showing Markup, so all changes in your document are highlighted when you open it, even after you have accepted all changes and saved the document. The solution is to unset your Security option Make hidden markup visible when opening or saving. See How to turn off annoying MS Word Features, 'Change Default for "Show Markup" Box' for this and other solutions to Word annoyances.

04 August 2008

List Empty Access Tables using Perl

A port of my Python DBI program to Perl. Note that you have to install package DBD-ODBC for the ODBC driver.

use warnings;
use strict;
use DBI;

use constant MDB_PATH => '<path>';

my $dbh = DBI->connect('DBI:ODBC:DRIVER=Microsoft Access Driver (*.mdb);Dbq=' . MDB_PATH);

my $sth = $dbh->prepare(
  "SELECT name FROM MSYSOBJECTS WHERE name NOT LIKE 'MSYS%' AND type = 1");
$sth->execute();
my $ref = $sth->fetchall_arrayref();

for my $row ( @{$ref} )  {
  my $table_name = $row->[0];
  my $sth = $dbh->prepare("SELECT COUNT(*) FROM [$table_name]");
  $sth->execute();
  my @data = $sth->fetchrow_array();
  if ($data[0] == 0) {
    print "$table_name is empty\n";
  }
}

$dbh->disconnect();

See Also

PS

2008-08-08: Replaced sprintf("SELECT COUNT(*) FROM [%s]", $table_name) with "SELECT COUNT(*) FROM [$table_name]" since Perl can evaluate variables in a string.

Added empty parentheses when calling functions to be consistent.

01 August 2008

List Empty Access Tables using Python

We wanted to find empty tables in a Microsoft Access database. Below is a Python script that uses the PythonWin odbc module to find empty tables in an MS-Access database. Edit the required path to your database file by modifying the MDB_PATH variable.

Follow the note at the start of the script to configure your MS-Access database security if you get the following message: dbi.program-error: [Microsoft][ODBC Microsoft Access Driver] Record(s) cannot be read; no read permission on 'MSYSOBJECTS'. in EXEC.

# List empty tables in an Access database by Kam-Hung Soh 2008.
# Before using this script, you have to allow User and Group Permissions in Access.
# 1. Open database.
# 2. Select menu item Tools / Security / User and Workgroup Permissions.
# 3. In 'User and Group Permissions' dialog:
# 3.1. Select User/Group Name = Admin.
# 3.2. Select Object Name = MSysObjects.
# 3.3. Check 'Read Data' check box.
# 3.4. Press OK button to close dialog box.

import odbc

MDB_PATH = r'<path>'

conn = odbc.odbc(r"DRIVER={Microsoft Access Driver (*.mdb)}; Dbq=%s;" % MDB_PATH)
cur = conn.cursor()
cur.execute(r"SELECT name from MSYSOBJECTS WHERE name NOT LIKE 'MSYS%' AND type = 1")
for x in cur.fetchall():
    table_name = x[0]
    cur.execute(r'SELECT COUNT(*) FROM [%s]' % table_name)
    row = cur.fetchone()
    if row[0] == 0:
        print table_name + ' is empty'
cur.close()
conn.close()

Script Notes

MS-Access stores object metadata in a system table called MSysObjects. In this table, a user table object has a type value of '1' and its name doesn't start with 'MSys'. This script first gets a list of all user tables from MSysObjects, then counts the number of rows in those tables. If a table has no rows, the script prints the table name and a message.

The fetchall() function always returns a list of tuples even if only one column is selected, so you have to extract the required column data using an array operator (e.g. x[0]).

The table name in the second cur.execute() SQL statement is delimited by square brackets in case the table name contains whitespace. Without these delimiters, you may see the following message: dbi.program-error: [Microsoft][ODBC Microsoft Access Driver] Syntax error in WITH OWNERACCESS OPTION declaration. in EXEC.
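The two-step pattern described above (list the user tables, then count the rows in each) can be tried without an Access database. Below is a minimal sketch in Python 3 using the standard sqlite3 module. It is an illustration under assumptions, not the script above: sqlite_master stands in for MSysObjects, double quotes stand in for the square brackets, and the empty_tables name is my own.

```python
import sqlite3


def empty_tables(conn):
    """Return the names of user tables in `conn` that contain no rows."""
    cur = conn.cursor()
    # sqlite_master plays the role that MSysObjects plays in Access.
    cur.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    # fetchall() returns a list of 1-tuples, so take element 0 of each row.
    names = [row[0] for row in cur.fetchall()]
    result = []
    for name in names:
        if name.startswith("sqlite_"):
            continue  # internal tables, like the MSys* tables in Access
        # Quote the identifier so that names containing spaces still work.
        quoted = '"' + name.replace('"', '""') + '"'
        cur.execute("SELECT COUNT(*) FROM " + quoted)
        if cur.fetchone()[0] == 0:
            result.append(name)
    return result


# Small demonstration with an in-memory database.
demo = sqlite3.connect(":memory:")
demo.execute('CREATE TABLE "order lines" (id INTEGER)')
demo.execute("CREATE TABLE customers (id INTEGER)")
demo.execute("INSERT INTO customers VALUES (1)")
for table_name in empty_tables(demo):
    print(table_name + " is empty")
demo.close()
```

The table name with a space in it is there to show why the delimiters matter.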

See Also

28 July 2008

Extract Lines with Line Numbers using Gawk, Groovy, Perl, Python and Ruby

More ways to extract a block of text from a stream and prepend the line number to each line.

Below is the Gawk version. The built-in variable NR is the number of the current line and $0 is the content of the current line.

gawk "(NR >= r1 && NR <= r2) {printf("""%4d %s\n""", NR, $0)}"

The Perl and Ruby scripts are exactly the same. The built-in variable $. holds the number of the current line and $_ holds the text of the current line.

perl|ruby -ne "printf '%4d %s', $., $_ if $. >= r1 && $. <= r2"

The Groovy command line options are similar to the Perl and Ruby version, except that you have to separate -n and -e. The built-in variable count holds the number of the current line and line holds the text of the current line.

groovy -n -e "if (count >= r1 && count <= r2) out.format '%4d %s\n', count, line"

The Python version is verbose due to boilerplate code to iterate through all rows in a file:

python -c "import sys; print ''.join('%4d %s' % (r + 1, l) for r, l in enumerate(sys.stdin) if r + 1 >= r1 and r + 1 <= r2)"
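If the one-liner is too dense, the same idea reads more easily as a small function. This is a sketch in Python 3 (the one-liner above is Python 2), and the number_lines name is my own. Line numbers start at 1 to match NR, $. and count in the other versions.

```python
import sys


def number_lines(lines, first, last):
    """Yield 'NNNN text' for lines whose 1-based number is in [first, last]."""
    for number, text in enumerate(lines, start=1):
        if number > last:
            break  # no later line can match, so stop reading
        if number >= first:
            yield "%4d %s" % (number, text)


# Demonstration with a fixed list; pass sys.stdin instead to filter a stream.
sample = ["alpha\n", "beta\n", "gamma\n", "delta\n"]
sys.stdout.writelines(number_lines(sample, 2, 3))
```

Stopping at the last requested line means the rest of a long stream is never read.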

See Also

PS

2008-07-29: Added Groovy version.

27 July 2008

Basic Perl Tk HTTP Server Monitor

Here's a port of my simple Python HTTP Server Monitor to Perl, using the Tkx module to interface with Tk. A minor difference is to use the Tk options database to specify the font of the headers in a configuration file.

# Basic HTTP Server Monitor by Kam-Hung Soh 2008.
use strict;
use warnings;
use Log::Log4perl qw(:easy);
use LWP::Simple;
use Text::CSV;
use POSIX;
use Tkx;

use constant CONFIGURATION_PATH => 'HttpServerMonitor.csv';
use constant LOG_PATH           => 'HttpServerMonitor.log';
use constant OPTION_PATH        => 'HttpServerMonitor.db';
use constant REFRESH_INTERVAL   => 60000; # Milliseconds
use constant TIME_FORMAT        => '%H:%M:%S %d-%m-%y';

sub create_widgets {
  my $logger = get_logger;
  my $app = Tkx::widget->new('.');
  Tkx::wm_title($app, 'HTTP Server Monitor');
  my $col = 0;
  for my $text ('Name', 'Host', 'Port', 'Status', 'Last Check') {
    $app->new_label(-name => 'header' . $col, -text => $text)
      ->g_grid(-row => 0, -column => $col, -padx => 2, -pady => 2);
    $col++;
  }

  my $row = 1;
  my $csv = Text::CSV->new;
  open CSV, "<", CONFIGURATION_PATH;
  <CSV>; # Skip header row
  while (<CSV>) {
    $logger->debug('$. = ' . $.);
    if ($csv->parse($_)) {
      my @field = $csv->fields;
      $col = 0;
      for my $s (@field) {
        $app->new_label(-text => $s)
          ->g_grid(-row => $row, -column => $col, -padx => 2, -pady => 2, -sticky => 'W');
        $col++;
      }
      my ($name, $host, $port) = @field;
      my $key = $host . ':' . $port;
      $::status_label{$key} = $app->new_label(-background => 'yellow', -text => 'unknown');
      $::status_label{$key}->g_grid(-row => $row, -column => $col, -padx => 2, -pady => 2, -sticky => 'W');
      $::time_label{$key} = $app->new_label(-text => strftime TIME_FORMAT, localtime);
      $::time_label{$key}->g_grid(-row => $row, -column => $col+1, -padx => 2, -pady => 2, -sticky => 'W');
    }
    $row = $.;
  }
  close CSV;

  $app->new_button(-text => "Refresh", -command => \&refresh)
    ->g_grid(-row => $row, -column => 4, -padx => 2, -pady => 2, -sticky => 'E');
}

sub refresh {
  my $logger = get_logger;
  for my $key (keys %::status_label) {
    my $url = 'http://' . $key;
    $logger->debug($url);
    if (head $url) {
      $::status_label{$key}->configure(-background => 'green', -text => 'up');
    } else {
      $::status_label{$key}->configure(-background => 'red', -text => 'down');
    }
    $::time_label{$key}->configure(-text => strftime TIME_FORMAT, localtime);
  }
  Tkx::after(REFRESH_INTERVAL, \&refresh);
}

Log::Log4perl->easy_init($INFO);
my $logger = get_logger;
$logger->info('Start');
Tkx::option_readfile(OPTION_PATH);
create_widgets;
refresh;
Tkx::MainLoop;
$logger->info('Finish');

This script reads the list of servers to monitor from the CSV file named in CONFIGURATION_PATH. The file has three columns (the display name, the host name and the port), such as the one below:

Name,Host,Port
Google,google.com.au,80

When the script starts, it reads Tk widget configuration from an Xdefaults-style file in OPTION_PATH, such as the one below. Note that according to Options and Tk - A Beginner's Guide, you can't set grid options (that's why the script is peppered with padx and pady options).

*header0.font : -size 10 -weight bold
*header1.font : -size 10 -weight bold
*header2.font : -size 10 -weight bold
*header3.font : -size 10 -weight bold
*header4.font : -size 10 -weight bold

The Tk configuration file is more verbose than I expected. Each widget in Tk belongs in a container, containers can be members of other containers, and all widgets belong to a root container (similar to a file system). In each line of a Tk configuration file, you specify the path to a widget (all text up to the last dot), the option (the text between the last dot and colon) and the value to use (the text after the colon).

!---- pathname ---+ +option+     +----- value -------+
application.header0.font       : -size 10 -weight bold

You can use an asterisk in the widget pathname if you don't care about the container of the widget. However, there's no wildcard for the widget's name, so in this case, I have to enumerate each widget that I want to configure.
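To make the three parts of a line concrete, here's a small sketch in Python 3 that splits a configuration line the way it's described above. It is not how Tk parses the file; the parse_option_line name is my own and it only handles the simple "pathname.option : value" form used in this post.

```python
def parse_option_line(line):
    """Split an option line into (pathname, option, value)."""
    # The value is the text after the first colon.
    spec, _, value = line.partition(":")
    # The pathname is all text up to the last dot; the option follows it.
    pathname, _, option = spec.strip().rpartition(".")
    return pathname, option, value.strip()


for text in ("*header0.font : -size 10 -weight bold",
             "application.header0.font       : -size 10 -weight bold"):
    print(parse_option_line(text))
```

Both sample lines give the same option (font) and value; only the pathname differs.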

See Also

13 July 2008

Basic Python Tk HTTP Server Monitor

We had some servers which would occasionally go offline, so I wrote a basic HTTP server monitor using Python and Tkinter (the interface to the Tk GUI library):

# HTTP Server Monitor by Kam-Hung Soh 2008
from csv     import reader
from httplib import HTTPConnection
from logging import basicConfig, error, info, INFO
from os.path import exists
from time    import strftime
from tkFont  import Font
from Tkinter import Button, Frame, Label

CONFIGURATION_PATH = 'HttpServerMonitor.csv'
LOG_PATH           = 'HttpServerMonitor.log'
REFRESH_INTERVAL   = 60000 # Milliseconds
TIME_FORMAT        = '%H:%M:%S %d-%m-%y'
GRID_DEFAULT       = {'padx':2, 'pady':2}

class Application(Frame):
  def __init__(self, master=None):
    Frame.__init__(self, master)
    self.status_label = {}
    self.time_label = {}
    self.grid(**GRID_DEFAULT)
    self.create_widgets()

  def create_widgets(self):
    for i, s in enumerate(['Name', 'Host', 'Port', 'Status', 'Last Check']):
      Label(self, font=Font(size=10, weight='bold'), text=s).grid(column=i, row=0)

    if not exists(CONFIGURATION_PATH):
      error("Cannot open,%s" % CONFIGURATION_PATH)
      exit(1)

    f = open(CONFIGURATION_PATH, "rb")
    f.next() # Skip header row
    for r, p in enumerate(reader(f)):
      row_num = r + 1
      for col_num, s in enumerate(p):
        Label(self, justify='left', text="%s" % s).grid(column=col_num, row=row_num, sticky='w', **GRID_DEFAULT)
      host_name, host, port = p
      key = host + ":" + port
      self.status_label[key] = Label(self, background='yellow', text='unknown')
      self.status_label[key].grid(column=col_num + 1, row=row_num, sticky='w', **GRID_DEFAULT)
      self.time_label[key] = Label(self, text='%s' % strftime(TIME_FORMAT))
      self.time_label[key].grid(column=col_num + 2, row=row_num, sticky='w', **GRID_DEFAULT)

    Button(self, text='Refresh', command=self.refresh).grid(column=4, sticky='e', **GRID_DEFAULT)

  def refresh(self):
    for key in self.status_label.keys():
      self.time_label[key].config(text=strftime(TIME_FORMAT))
      label = self.status_label[key]
      h = HTTPConnection(key)
      try:
        h.connect()
        label.config(background='green', text='up')
      except:
        label.config(background='red', text='down')
      finally:
        h.close()
    self.after(REFRESH_INTERVAL, self.refresh)

if __name__ == "__main__":
  basicConfig(
    datefmt='%Y%m%d_T%H%M%S',
    filemode='a',
    filename=LOG_PATH,
    format='%(asctime)s,%(levelname)s,%(message)s',
    level=INFO
  )
  info('Started')
  app = Application()
  app.master.title('HTTP Server Monitor')
  app.refresh()
  app.mainloop()
  info('Ended')

This program reads a CSV file specified in CONFIGURATION_PATH constant for a list of servers to monitor. The CSV file has three columns: the display name, the server address and the server's port. The first line of the CSV file is for information only; it is not used by the program. Below is a sample CSV file:

Name,Host,Port
My server,myserver.com,80
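The CSV handling can be tried on its own, without any Tk widgets. Below is a sketch in Python 3 (the program above is Python 2); the read_servers name is my own, and skipping blank or malformed lines is an assumption rather than what the program above does.

```python
import csv
import io


def read_servers(csv_file):
    """Return a list of (display_name, 'host:port') pairs from the CSV."""
    rows = csv.reader(csv_file)
    next(rows, None)  # skip the header row, which is for information only
    servers = []
    for row in rows:
        if len(row) != 3:
            continue  # assumption: ignore blank or malformed lines
        name, host, port = (field.strip() for field in row)
        servers.append((name, host + ":" + port))
    return servers


sample = io.StringIO("Name,Host,Port\nMy server,myserver.com,80\n")
for name, key in read_servers(sample):
    print(name, "->", key)
```

The "host:port" string is the same key that the program uses to look up the status and time labels.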

You can define the time interval between checks by modifying the REFRESH_INTERVAL constant. This constant is in milliseconds, not seconds, so don't set too small a value!

If you are using Windows, run it using pythonw HttpServerMonitor.py.

See Also

12 July 2008

Extract Columns From Tabular Text - Powershell and Python

Finishing off different ways to extract columns, here are the PowerShell and Python versions:

foreach-object { $_.Split('<delimiter>')[-1] }

$_ is the current object (or record) in the loop. When processing tabular text, $_ is a .NET String object, so we use its Split() method to divide the input on the <delimiter>. Split() returns a String array, and index -1 refers to the last String (or column) in that array.

python -c "import sys; print ''.join(s.split('<delimiter>')[-1] for s in sys.stdin)"

Unlike Perl or Ruby, Python doesn't have any special command-line support to iterate through all lines of input or split the input, so we have to use this generator hack. Like the PowerShell version, each record (s) is a string, so we use a string's split() function to divide the input into an array and use index -1 to refer to the last column in that array.
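The generator hack is easier to follow when written out as a function. This is a sketch in Python 3 (the one-liner above is Python 2), and the last_column name is my own. It uses rsplit() with a limit of 1 because only the last column is needed.

```python
def last_column(lines, delimiter):
    """Yield the last delimiter-separated field of each line."""
    for line in lines:
        # rsplit(..., 1) splits once from the right: [everything else, last]
        yield line.rstrip("\n").rsplit(delimiter, 1)[-1]


paths = ["./Profiles/9ls0tqn1.default/blocklist.xml\n", "README\n"]
for name in last_column(paths, "/"):
    print(name)
```

A line without the delimiter is returned whole, because it only has one column.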

See Also

11 July 2008

Extract Columns From Tabular Text - Perl and Ruby

My previous posting described using the GnuWin cut command to extract columns from tabular text data but you couldn't specify columns relative to the last column. The cut command is pretty easy to use in a command console, so if you want to overcome this limitation without too much additional effort, you could write an ad-hoc script using the Perl or Ruby programming language.

A Perl solution: perl -F <delimiter> -ane "print $F[-1]".

A Ruby solution: ruby -F <delimiter> -ane "print $F[-1]".

Both Perl and Ruby have the same command line switches for splitting lines: -n makes the interpreter iterate through all lines of input for the statement specified in the -e switch, the -a switch turns on the auto-split mode and -F changes the character used to split a line.

All columns in a record are collected in the global F array. For example, you extract column two using $F[1] in both Perl and Ruby (in Perl, @F[1] is a one-element array slice, which also works but triggers a warning when warnings are enabled). To extract the last column in a record, use $F[-1].

See Also

05 July 2008

Browser Usage Forecast

W3Schools Browser Statistics page shows that in June 2008, 41% of hits came from developers using Firefox and 53.5% from developers using MSIE7 or MSIE6.

What would the list be like at the end of this year? IE7 should cross 30%, IE6 should be about 22% and IE5 may disappear from the list. Unlike IE5, Moz could barely stay on the list because its number of hits is declining more slowly than IE5's. FF may cross 43%, after the jump caused by the release of FF3 in June has been absorbed. Opera and Safari will noodle along at about 5% in total.

Enough crystal ball gazing …

Extract Columns From Tabular Text - Cut

A quick way to extract one or more columns from tabular or character delimited data, such as Web pages or log files, is to use the GnuWin cut command.

Some examples:

  • Print just the bug number and title from a list of bugs in a Web page (e.g. from Bugzilla): cut -f1,8.
  • Print the URLs requested from Apache log (in common format): cut -d" " -f7.

The -f switch specifies the column to extract. By default, the delimiter is TAB and the -d switch specifies an alternative delimiter.
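To show what -f and -d do, here's a rough sketch in Python 3 of selecting fields by their 1-based numbers. The cut_fields name is my own and this is not a reimplementation of cut: for example, it doesn't handle field ranges, and it treats lines without a delimiter as having a single field.

```python
def cut_fields(lines, fields, delimiter="\t"):
    """Yield the selected 1-based fields of each line, joined by the delimiter."""
    for line in lines:
        columns = line.rstrip("\n").split(delimiter)
        # Skip field numbers that are past the end of the line.
        picked = [columns[i - 1] for i in fields if 1 <= i <= len(columns)]
        yield delimiter.join(picked)


# Field 7 of a log line in common format is the requested URL.
log = ['127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326\n']
for url in cut_fields(log, [7], delimiter=" "):
    print(url)
```

Note that field numbers count from the first column only, which is the limitation discussed next.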

One limitation of cut is that you can't specify the columns relative to the last column, unless you know the index of the last column. If your data has a varying number of columns, like the path strings printed by the find . -type f command in the example below …

./Profiles/9ls0tqn1.default/blocklist.xml
./Profiles/9ls0tqn1.default/bookmarkbackups/bookmarks-2008-06-14.html

… you can't easily extract just the file name (the last column) in every line.

A related command is colrm, which removes character columns from the input. It's quite limited and does the opposite of what I expect, so I haven't used it.

Cut is a simple utility to extract columns of data, and it can't process the column data like a scripting language. I'll write a bit more about processing tabular data in future.

See Also

27 June 2008

Firefox 3 SVG Performance Improvement

I did a quick check of my SVG Game of Life demonstration with Firefox 3 and found that it ran much faster compared to Adobe's ancient SVG Viewer 3.03 (ASV). The performance ratio used to be roughly 3:2 in favour of Firefox. Now, it's close to 5:1!

The raw numbers on the same computer: 24-27 fps on Firefox 3, 5-6 fps for MSIE6 + ASV.

25 June 2008

Event 7000, DS1410D service failed to start

Windows Event Viewer kept reporting error event 7000 each time I restarted my computer:

The DS1410D service failed to start due to the following error: 
The system cannot find the file specified.

The error is caused by Windows trying to load a file called DS1410D.SYS. This file is part of an application called FlexLM but since I've uninstalled FlexLM, the file has also been deleted.

The fix is to find all instances of .../services/DS1410D/Start in the registry and change the value from 2 (Auto load) to 4 (disabled). See the Microsoft KB 103000 article for more information.

24 June 2008

Gawk Print Last Field

gawk script to print the last field of each line:

gawk -F <delimiter> "{ print $NF }".

-F defines the separator in the command line, otherwise you would prepend "BEGIN { FS=<delimiter> }".

NF is the number of fields in a line and $n is the value of the n'th field, so $NF outputs the last field.

See Also

23 June 2008

Firefox 3 - Second Impressions

Second impressions after using Firefox 3 for a couple of days:

  • Smart Location Bar pretty much replaces the Search bar in most situations.
  • Faster than FF2 when rendering pages with JavaScript controls. Load http://msdn.microsoft.com/en-us/library/aa923541.aspx and select some of the menu links at the top of the page. In FF3, the menus expand immediately while in FF2, they used to take about a second or two.
  • Identity button is a great shortcut for checking the certificates of secure sites (try http://paypal.com and click on the site's favicon).
  • UI improvements:
    • FF3 is prettier than FF2 in Win XP and Vista. The toolbar buttons are clearer and brighter, and there's less clutter.
    • The Find dialog is shorter and less intrusive.
    • Password Manager prompt appears on top of page instead of a modal dialog. You can continue browsing or tell the Password Manager to remember your password.
    • Buttons with focus now have bright highlighting (FF2 drew a hard-to-see dotted line border around the button's label). Would be nice if all controls were highlighted the same way.

2008-06-24: The OK button in the dialog to open or download files now has focus when the dialog opens. In FF2, the OK button didn't get focus until the dialog lost and regained focus.

21 June 2008

Firefox 3 on ABC Radio National Breakfast

Firefox 3's release was reported on ABC Radio National Breakfast Tech review with Peter Marks - Firefox. The ABC doesn't seem to make transcripts of that programme (at least, I couldn't find any), so here's one:

Mark Bannerman: Well increasingly we do business, receive entertainment and socialize through the Internet. And the way we access the Internet is with a Web browser. This week, a new version of the popular Firefox browser was released, and it achieved a massive 8.3 million downloads in the first 24 hours. To explain all this interest, we are joined again by our technology editor, Peter Marks. Peter, good morning.

Peter Marks: Good morning, Mark.

MB: Well, let's start with the basics. 8.3 million downloads in one day. That's a lot of interest. But exactly what are they seeking? What is it?

PM: Well, the Web browser is the thing that turns the mark up language of the Internet, it's called HTML, it's a simple text mark up designed to just transmit ???. Originally it's just text, but it turns it into that beautiful rendered page when you go to Web sites, like the ABC's Web site for example. And the software you run to do that rendering on your screen is the Web browser. Originally, it was very simple text, but now we're doing applications like banking, bidding on auctions, we're arranging parties ... we're doing all sorts of things through the Web browser. And in a sense, the operating system underneath it, Windows, or the Mac or Linux or whatever, is becoming less important, and the platform that we work in is the Web browser. Different browsers are different. Some are faster than others, some are more secure and so on. So, you know, people like to have a bit of a choice about which one they use.

MB: So, in those terms, computers come with a browser, but what you're saying is that, or what we're clearly learning here, is that some people choose to install a different one.

PM: That's right.

MB: Why is that?

PM: Well, Windows comes with Internet Explorer, for example. The Mac comes with Safari, which I should add, is available on Windows as well. And Internet Explorer was incredibly dominant. That had well into the 90's of all browser impressions. So it became the dominant browser everyone was using. And that caused a number of problems a few years ago. The first one was that it had some quirks, and so Web sites were designed just primarily to work with Internet Explorer. If you were unlucky enough to use something else, you found that sites looked a bit strange. The other problem was that it had some security holes. And people, because it was the dominant browser, targeted it. And I can remember my kids would go to a Web site, that was kind of fishing for kids, and just by visiting that Web site in Internet Explorer, they would get a virus or a Trojan horse installed on their computer. So it had some bad things happening because it was so dominant. It's a bit like a monoculture and it can get diseased.

MB: So having a variety of them saves us from that?

PM: It's a good thing, yes. It's a bit like with farming that you have a variety of crops, then you're not going to be all wiped out.

MB: It's extraordinary isn't it, really?

PM: Yeah, it's the same thing. And we talk about viruses, it's very much like life. Firefox happily has started to appear as quite a dominant browser, and I think people had a good experience with versions 1 and 2, and they're recommending it to each other. Now, in fact, the statistics in Q2 2008 this year were that IE has 74% of the market share, Firefox has 18%, which is pretty good, given that it doesn't come with either of the computers. So people have actually got to make a decision to download and install it, which of course is a big thing. Safari on the Mac has 6% but the trend has strongly been against IE and pro-Firefox. It's been growing as IE has been shrinking. It's just great to see Firefox on all platforms. So now Web designers know that if they build a site that works in Firefox, they can say to a user, "Whatever you're using, go and get Firefox and it will work."

MB: So, without wanting to, sort of, you know, to do a total ad for this, what are the features that have made that many people rush to it within 24 hours?

PM: I should say that Firefox is free, so it's hardly an ad if you're talking about something that's free to download. Firefox 3 looks better than previous versions. It has native themes, so if you're on Vista, it looks like Vista, it fits right in. The old version looked a little bit odd. On the Mac, it looks like a genuine Mac application. It's really, really fast at rendering. It's much faster than IE, it's about 2 or 3 times faster than the previous version. So, when you go to a Web site, it pops. It just, bang, it's rendered. ??? which is fantastic. They fixed a lot of memory leaks and problems that were there in the past. My favourite feature is called, if I can do the voice, the Awesome Bar.

MB: [Laughs]

PM: Now, this is the address bar where you used to type URLs, so you would type, you know, H T T P whatever abc.net.au, and what people often do is that they can't remember it's a ".com" or ".org" and so they would typically Google for the URL.

MB: Right.

PM: In the Awesome Bar, it remembers where you've been in the past, and you can type in any part of the URL, any word in there, or any part of the name of the site or the page, and it will pop up this list of suggestions, and, I can tell you, it really works. You will find that the site you want is right there, and you can choose it. I love the name.

MB: It's scary, isn't it? Because it almost does it before you think of it. Now you were saying, though, that Firefox is free, so how do they make money?

PM: Well, they're paid with ... it's got this little search box that defaults to Google. When you do a search through Firefox, Google sends them over some money. I think about half of their funding does come from Google. And that of course raises some questions. It gives Google a lot of leverage to say, "Hey, Firefox team, you know, can you put this feature in for us and so on." But it's generally a good thing. There's no money changing hands. It's an open-source project and various companies do participate with it. IBM contributes code and ??? codes for it. So it is kind of a community project. It's fantastic to see it's getting such dominance.

MB: All right then, Peter Marks. We'll leave it there. Thanks very much for that.

PM: Thanks, Mark.

MB: That was our technology editor, Peter Marks, talking to us about the new Firefox Web browser.

Firefox 3 Gmail Problem Fixed

Upgraded from Firefox 2.0.0.14 to Firefox 3.0 and found that I couldn't load the default Gmail client (Firefox displayed the progress bar and then stopped). Same problem with using the HTTPS URL. The plain HTML client and older Gmail client were OK. I worked my way through the Firefox Basic Troubleshooting guide and the problem was fixed only after making a new profile.

18 June 2008

Creating UML Composite States in Sparx Enterprise Architect

How to create a UML composite state element using Sparx Systems' Enterprise Architect application:

  1. In a state machine diagram, create a new state element.
  2. Select the state element's context menu item Advanced / Composite Element.

The selected state element is converted into a composite state element (the image has an infinity symbol) with its own state machine diagram (check the Project Browser). Now you can draw a transition line to and from this composite state and include it in state transition tables.

Annoyingly, Enterprise Architect's on-line help describes a composite element but doesn't show how to make one!

10 June 2008

Visio 2003 Cannot Resize Shape Directly with Keyboard

Microsoft Visio 2003 doesn't provide a keyboard shortcut for the user to resize a shape instance. You have to use the mouse pointer or (shudder) enter the required height and width of the shape in the Size & Position window.

04 June 2008

Match Multiple String Patterns

To find multiple string patterns in an input file or stream, these commands are equivalent:

  • sed -n -e "/pattern1/p" -e "/pattern2/p". -n suppresses printing all input lines.
  • sed -n -r -e "/pattern1|pattern2/p". -r enables extended regular expressions.
  • grep -e "pattern1" -e "pattern2".
  • grep -E "pattern1|pattern2". -E enables extended regular expressions.
  • findstr "pattern1 pattern2". You have to delimit the patterns with spaces in a single string argument. To find a string that itself contains spaces, use the /C:"string" option, which treats the whole argument as one literal search string.
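The same alternation trick works in a scripting language. Below is a sketch in Python 3 using the standard re module; the matching_lines name is my own. Like grep, it prints a line once even if the line matches more than one pattern.

```python
import re


def matching_lines(lines, patterns):
    """Yield lines that match at least one of the regular expressions."""
    # Wrap each pattern in a non-capturing group before joining with "|"
    # so that an alternation inside one pattern can't leak into the next.
    combined = re.compile("|".join("(?:%s)" % p for p in patterns))
    for line in lines:
        if combined.search(line):
            yield line


log = ["error: disk full\n", "all good\n", "warning: cpu hot\n"]
for line in matching_lines(log, ["error", "warn"]):
    print(line, end="")
```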

26 May 2008

Pivot Table Hack in Sqlite3 and MySQL

Introduction

A pivot table or cross tabulation is a reporting feature that BAs love to use to summarise transaction data, such as server logs and sales figures. Spreadsheet programs such as Microsoft Excel or OpenOffice.org Calc have nifty wizards to help you create a pivot table. You can also create pivot tables in databases. For example, Microsoft Access has a TRANSFORM … PIVOT SQL statement for generating a crosstab or pivot table.

What if you're using a database program that doesn't directly support pivot tables? For example, Sqlite 3 and MySQL don't seem to have any SQL statements for pivot tables.

All is not lost; another way to express a pivot table is to use aggregate functions, condition clauses and GROUP BY clause in this template:

SELECT col1, col2, … <aggregate>(<condition>) … FROM table1 GROUP BY col1, col2, ….

For Sqlite 3, the aggregate functions and GROUP BY are similar to SQL in other database programs. The condition clause we can use has this syntax: case when <expression> then <expression> end.

In the next section, we'll demonstrate how to create pivot tables in Sqlite 3 using this template. All examples will be shown using Sqlite 3's command line interface, sqlite3.exe.

Sqlite 3 Pivot Table Demonstration

First, you have to download some sample transaction data. I used the NumberGo Pivot Table Tutorial AcmeShirtsCompany.xls spreadsheet as the raw data for this demonstration.

We start sqlite3.exe and use the -column -header arguments to make the output of queries more readable.

sqlite3 -column -header test.db
SQLite version 3.5.9
Enter ".help" for instructions

Now we create a shirt table based on the headings in that spreadsheet:

create table shirt (Region varchar(8), Category varchar(8), Shirt_Style varchar(8), ShipDate date, Units integer, Price double, Cost double);

Next we load some transaction data into the shirt table:

insert into shirt values ('East','Boys','Tee',date('2005-01-01'),11,5.25,4.66);
insert into shirt values ('East','Boys','Golf',date('2005-01-01'),12,5.26,4.57);
insert into shirt values ('East','Boys','Polo',date('2005-01-01'),13,5.27,5.01);
insert into shirt values ('East','Girls','Tee',date('2005-01-01'),14,5.28,5.01);
insert into shirt values ('East','Girls','Golf',date('2005-01-01'),15,5.29,5.10);
insert into shirt values ('East','Girls','Polo',date('2005-01-01'),16,5.30,4.67);
insert into shirt values ('West','Boys','Tee',date('2005-01-01'),33,6.25,5.36);
insert into shirt values ('West','Boys','Golf',date('2005-01-01'),35,6.26,6.24);
insert into shirt values ('West','Boys','Polo',date('2005-01-01'),36,6.27,6.03);
…

Let's begin our analysis with a simple question: How many shirts were sold in each region?

select Region, sum(Units) from shirt group by Region;
Region      sum(Units)
----------  ----------
East        21841
North       27275
South       29994
West        23984

Next: in each region, how many Boys' and Girls' shirts were sold? Here's where a pivot table is useful:

select
  Region
  , sum(case when Category = 'Boys' then Units end) as Boys
  , sum(case when Category = 'Girls' then Units end) as Girls
  , sum(Units) as SubTotal
from shirt
group by Region;
Region      Boys        Girls       SubTotal
----------  ----------  ----------  ----------
East        10586       11255       21841
North       14049       13226       27275
South       14312       15682       29994
West        10763       13221       23984

We can drill further into the data: How many of each shirt style were sold?

select
  Region
  , Category
  , sum(case when Shirt_Style = 'Tee' then Units end) as Tee
  , sum(case when Shirt_Style = 'Golf' then Units end) as Golf
  , sum(case when Shirt_Style = 'Polo' then Units end) as Polo
  , sum(Units) as SubTotal
from shirt
group by Region, Category;
Region      Category    Tee         Golf        Polo        SubTotal
----------  ----------  ----------  ----------  ----------  ----------
East        Boys        3458        3096        4032        10586
East        Girls       3688        3481        4086        11255
North       Boys        4597        4702        4750        14049
North       Girls       4196        4598        4432        13226
South       Boys        5192        4670        4450        14312
South       Girls       5113        5377        5192        15682
West        Boys        3722        3791        3250        10763
West        Girls       4472        4235        4514        13221

The pattern becomes obvious, if rather tedious: each specific value that you want to use as a new virtual column needs its own sum(case when … then … end) expression.

Discussion

In this article, I've presented a SQL template for generating pivot tables for database programs, such as SQLite 3, that do not have explicit support for this feature. While this template is extensible, it relies on the developer knowing beforehand the possible values (e.g. Category has 'Boys' and 'Girls', or Shirt_Style has 'Tee', 'Golf' and 'Polo') to use in the condition clause of the template. If there are many possible values, then it becomes very tedious to enumerate each of them in the SQL case when … then … end clause.
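One way to reduce the tedium is to let a script enumerate the values and write the case when … then … end clauses. The following is only a sketch in Python 3 using its sqlite3 module; the function name pivot_sql and the handful of rows are made up for illustration, and the table and column names are assumed to be trusted (they are pasted into the SQL text as-is).

```python
import sqlite3

def pivot_sql(conn, table, row_col, pivot_col, value_col):
    """Build a pivot query with one sum(case ...) column per distinct value.

    Assumes table and column names are trusted identifiers, not user input.
    """
    # Look up the distinct values that will become the virtual columns.
    query = "select distinct {p} from {t} order by 1".format(p=pivot_col, t=table)
    values = [row[0] for row in conn.execute(query)]
    cases = []
    for value in values:
        literal = str(value).replace("'", "''")  # escape for a SQL string literal
        alias = str(value).replace('"', '""')    # escape for a quoted column alias
        cases.append(
            "sum(case when {p} = '{lit}' then {v} end) as \"{a}\"".format(
                p=pivot_col, lit=literal, v=value_col, a=alias))
    return ("select {r}, {c}, sum({v}) as SubTotal "
            "from {t} group by {r} order by {r}").format(
                r=row_col, c=", ".join(cases), v=value_col, t=table)

# Made-up rows for illustration only; they are not the article's data set.
conn = sqlite3.connect(":memory:")
conn.execute("create table shirt (Region text, Category text, Units integer)")
conn.executemany("insert into shirt values (?, ?, ?)", [
    ("East", "Boys", 12), ("East", "Girls", 14),
    ("West", "Boys", 33), ("West", "Girls", 5), ("West", "Boys", 2)])
sql = pivot_sql(conn, "shirt", "Region", "Category", "Units")
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

The generated statement has the same shape as the hand-written queries above, so it can be printed and pasted into the sqlite3 or mysql interpreter if you prefer to run it there.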

2008-06-01: I had a play with MySQL and found that I can use the same SQL statements to create the pivot tables.

I had saved the data in AcmeShirtsCompany.sql, so to set up my MySQL database, I created the shirt table using mysql.exe, exited the interpreter, then loaded the data into the database using this cmd.exe command: mysql -u root -p -D test < AcmeShirtsCompany.sql.

2008-06-06: See also SQL Cookbook by Anthony Molinaro, O'Reilly Media.

25 May 2008

Disable Vista Memory Diagnostic Tool

Vista has a Memory Diagnostic Tool which you can turn on to test your computer's memory when you restart it. Once it is enabled, this tool starts every time you restart your computer. Be warned: the Vista help system doesn't explain how to disable it!

After some Web searching, I found this tip:

- Open a command prompt as Administrator: type ''cmd'' in the Start menu, right-click the .exe file and click Run as administrator. - Then type in the console: ''bcdedit /bootsequence {memdiag} /remove'' and press Enter. After that, you can restart your computer and the tool won't start.

24 May 2008

Fix Incorrectly Encoded Unicode Files with Python

The Problem

We had a lot of text files committed into our CVS repository in Unicode format. When these files were checked out later, we found that they were neither plain text files nor valid Unicode files because CVS had prepended two bytes, FF FE, to the start of these files but left only one byte for encoding each character. Some text editors such as Vim could open these files but other applications such as Notepad and Excel showed only gibberish.

Unicode Encoded Text in Files

Unicode is an encoding standard … for processing, storage and interchange of text data in any language. For the purpose of fixing this problem, we just have to know how to identify and write valid Unicode files.

We use two tools to experiment and visualize the effect of different encoding methods:

  1. Microsoft Notepad editor, because it can save text files using different encoding methods.
  2. GnuWin32 od utility to output the data in a file as byte values.

Open Notepad and enter this text: Hello World. Select the File / Save As menu item. In the Save As dialog, there are four encoding methods in the Encoding drop down list: ANSI, Unicode, Unicode big endian and UTF-8. Save the same text using each of the encoding methods into four files, say HelloANSI.txt, HelloUnicode.txt, HelloUnicodeBigEndian.txt and HelloUTF8.txt, respectively.

Examine the contents of each file using od:

>od -A x -t x1 HelloANSI.txt
000000 48 65 6c 6c 6f 20 57 6f 72 6c 64
00000b

>od -A x -t x1 HelloUnicode.txt
000000 ff fe 48 00 65 00 6c 00 6c 00 6f 00 20 00 57 00
000010 6f 00 72 00 6c 00 64 00
000018

>od -A x -t x1 HelloUnicodeBigEndian.txt
000000 fe ff 00 48 00 65 00 6c 00 6c 00 6f 00 20 00 57
000010 00 6f 00 72 00 6c 00 64
000018

>od -A x -t x1 HelloUTF8.txt
000000 ef bb bf 48 65 6c 6c 6f 20 57 6f 72 6c 64
00000e

The ANSI encoded file contains 11 bytes representing the characters you typed. The Unicode encoded files contain 24 bytes, starting with a two-byte BOM (byte order mark) and using two bytes to represent each character. If the first two bytes are FF FE, then the two bytes of each character are stored in low-byte, high-byte order. Conversely, if the first two bytes are FE FF, then they are stored in high-byte, low-byte order. Finally, when a file starts with the bytes EF BB BF, only one byte is used to encode each ASCII character and two or more bytes are used to encode other characters (not demonstrated).
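If you don't have Notepad and od at hand, the same byte patterns can be reproduced in Python 3. This is a rough sketch, not a general encoding detector: guess_encoding and hex_dump are names I made up, and the "ANSI" answer is simply what is left when no BOM is found.

```python
import codecs

# Byte-order marks and the Notepad encoding name each one corresponds to.
BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),                   # EF BB BF
    (codecs.BOM_UTF16_LE, "Unicode"),             # FF FE
    (codecs.BOM_UTF16_BE, "Unicode big endian"),  # FE FF
]

def guess_encoding(data):
    """Return the Notepad-style encoding name suggested by the leading bytes."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return "ANSI"  # no BOM found; assumed to be a plain one-byte encoding

def hex_dump(data):
    """Format bytes like od -t x1, without the offsets."""
    return " ".join("%02x" % b for b in data)

text = "Hello World"
little = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
big = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
utf8 = codecs.BOM_UTF8 + text.encode("utf-8")
for data in (text.encode("ascii"), little, big, utf8):
    print(guess_encoding(data), len(data), hex_dump(data))
```

The BOM is added by hand here because the utf-16-le and utf-16-be codecs write only the character bytes.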

Fixing Incorrectly Encoded Files in Python

Now we know the format of a Unicode encoded file: it starts with FF FE and stores each character in low-byte, high-byte order. Our text files in CVS just have ANSI characters, so we just have to insert a 0 byte after each character, starting from the third byte. Julian W. wrote a short Python script to do this. I don't have his code right now, so here's my version for correcting the Unicode encoding for a file:

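import codecs

# Read the raw bytes; binary mode keeps CR-LF pairs intact on Windows.
raw = map(ord, file(r'HelloBadUnicode.txt', 'rb').read())
# A bad file has the FF FE BOM but no 0 byte after the first character.
if raw[0] == 255 and raw[1] == 254 and raw[3] != 0:
  # The UTF-16 codec writes the BOM and two bytes per character for us.
  output = codecs.open(r'HelloFixedUnicode.txt', 'w', 'UTF-16')
  for i in raw[2:]:
    # unichr() also copes with byte values above 127, where chr() would fail.
    output.write(unichr(i))
  output.close()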

Postscript

I started with a more complicated piece of Python code using lists and generators:

from itertools import repeat
from operator import concat

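# Binary mode stops Windows from translating line endings on read and write.
raw = map(ord, file(r'HelloBadUnicode.txt', 'rb').read())
if raw[0] == 255 and raw[1] == 254 and raw[3] != 0:
  output = file(r'HelloFixedUnicode.txt', 'wb')
  # Write the BOM (FF FE) first.
  output.write(chr(255))
  output.write(chr(254))
  # Pair each character byte with a 0 byte, then flatten the pairs.
  for i in reduce(concat, zip(raw[2:], repeat(0, len(raw)-2))):
    output.write(chr(i))
  output.close()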

But then I realised I just had to write a 0 byte after each ANSI character, so here's a simpler version:

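# Binary mode stops Windows from translating line endings on read and write.
raw = map(ord, file(r'HelloBadUnicode.txt', 'rb').read())
if raw[0] == 255 and raw[1] == 254 and raw[3] != 0:
  output = file(r'HelloFixedUnicode.txt', 'wb')
  # Write the BOM (FF FE) first.
  output.write(chr(255))
  output.write(chr(254))
  # Follow each character byte with a 0 byte.
  for i in raw[2:]:
    output.write(chr(i))
    output.write(chr(0))
  output.close()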

2008-05-25. I remembered that Python had no problems with writing Unicode files, resulting in the even simpler code in the body of this article.
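The scripts in this article are written for Python 2. For what it's worth, here is a sketch of the same repair in Python 3, working on bytes in memory rather than on files. The function name fix_bad_utf16 is mine, and it assumes that each byte after the BOM stands for one character with the same code point (true for plain ASCII text, only an approximation for other ANSI characters).

```python
import codecs

def fix_bad_utf16(data):
    """Return corrected UTF-16 LE bytes, or the input unchanged if it looks fine."""
    bom = codecs.BOM_UTF16_LE  # FF FE
    # Same test as the article: BOM present, but no 0 byte after the first character.
    if len(data) >= 4 and data.startswith(bom) and data[3] != 0:
        # Treat each remaining byte as one character (Latin-1 maps bytes 1:1
        # to code points) and re-encode with two bytes per character.
        return bom + data[2:].decode("latin-1").encode("utf-16-le")
    return data

bad = b"\xff\xfeHello World"
fixed = fix_bad_utf16(bad)
print(fixed.decode("utf-16"))
```

To repair a file, read it with open(name, 'rb'), pass the bytes through the function and write the result with open(name, 'wb').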

23 May 2008

MDI Child Window Menu Shortcuts

I accidentally moved PythonWin's Interactive window out of sight when I grabbed and dropped it with my mouse pointer. Restarting PythonWin didn't help because the position of the child window was stored in the Windows Registry, so it remained hidden even after the restart. I considered hacking the Windows Registry to reset the child window's position, until I found the keyboard shortcuts to move the keyboard focus to a child window, show the System Menu and select the Move menu item.

Background: A child window is a window inside an MDI (multiple-document interface) application.

The keyboard shortcuts required to bring the child window back into view were:

  1. Move focus through child windows: Control+F6.
  2. Show a child window's System Menu: Alt+- (Alt Minus).
  3. Move child window: m, then press the cursor keys.

Keyboard shortcuts for Microsoft Windows applications: KB 126449.

20 May 2008

GnuWin32 find and missing argument for exec

Reminder on how to use the -exec action of the GnuWin32 find command in Windows cmd.exe. For example, if you want to find a string, the format is:

find . -type f -exec grep <pattern> {} ;

If you do any of the following, you can get this cryptic error message: find: missing argument to `-exec'

  • Put double-quote marks around the command:
    find . -type f -exec "grep <pattern> {} ;"
  • Don't leave a space between braces and semi-colon:
    find . -type f -exec grep <pattern> {};
  • Use Unix shell escape character:
    find . -type f -exec grep <pattern> {} \;

Finally, if all else fails and you lack time to investigate, use xargs:

find . -type f | xargs grep <pattern>

Python Command Line (-c option) Test 2

Julian W. suggested that I write a one line Python loop for my command scripts instead of map(), as in my earlier article. Instead of map(lambda l: expression(l), sys.stdin), I could write for l in sys.stdin: expression(l). An example trivial command to echo all input lines would be:

python -c "import sys; for line in sys.stdin: print line,"

The problem is that the Python interpreter complains:

  File "<string>", line 1
    import sys; for line in sys.stdin: print line,
                  ^
SyntaxError: invalid syntax
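My understanding of the error is that a semicolon can only join simple statements, and a compound statement such as for has to start its own logical line, so it cannot follow import sys; on the same line. The sketch below (Python 3, with a helper name, compiles, that I made up) checks three variants using the built-in compile() function without running them.

```python
def compiles(source):
    """Return True if Python accepts the source text as a program."""
    try:
        compile(source, "<string>", "exec")
        return True
    except SyntaxError:
        return False

one_line = "import sys; for line in sys.stdin: pass"     # the failing form
two_lines = "import sys\nfor line in sys.stdin: pass"    # loop on its own line
no_loop = "import sys; sys.stdout.writelines(sys.stdin)" # no compound statement

for source in (one_line, two_lines, no_loop):
    print(compiles(source), repr(source))
```

The third variant stays on one line, so it is the easiest one to use with -c in cmd.exe; it echoes all input lines like the intended loop.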

13 May 2008

Outlook 2003 Save HTML Limitation

If you want to save an e-mail message in HTML format in Microsoft Outlook 2003, you may find that Outlook, unlike the MSIE or Firefox browsers, only saves the text in the message but not any of the embedded images or attachments. Worse, Outlook doesn't warn you that it is not saving the entire message.

Another annoyance is that if you try to save an image in the message using the context menu item Save Picture As, then you can only save using BMP format.

11 May 2008

Assign USB Drives to Folder

Each time you plug a USB device into your Windows computer, Windows can assign a different drive letter to your device. If you have programs that rely on a fixed drive letter (e.g. portable applications on a USB drive or backups) or if you use more than one computer regularly, then it gets annoying to reset the programs' configuration after plugging in your drive or to remember to plug in devices in a particular sequence. Assign USB Drives to Folder describes how to use Windows' Disk Management to assign a fixed path to each device.

I wonder if it's possible to refer to a device using its volume name, which would make this method redundant?

07 May 2008

Sed Translate / Transform / Transliterate Command

Note to self: sed's (Stream EDitor) command y/list1/list2/ transforms (transliterates) each character found in list1 into the character at the same position in list2. list1 and list2 must be explicit character lists, not regular expressions (and hence, not character classes). In other words, if you enter y/[a-z]/[A-Z]/, sed will look for these characters in the input, '[', 'a', '-', 'z' and ']', to replace with '[', 'A', '-', 'Z' and ']' respectively; sed does not expand a character class [a-z] to replace with [A-Z]. The same goes for Posix character class names such as [:lower:] and [:upper:].

I had confused sed's transform command with the tr (translate) command, which supports interpreted sequences, e.g. tr [:lower:] [:upper:] will transform all lower case characters to upper case.
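Python's str.maketrans and str.translate behave much like sed's y command, which makes the difference easy to try out. This is only an analogy in Python 3, not a description of how sed or tr are implemented.

```python
import string

# Positional mapping, like sed's y/abc/xyz/: a->x, b->y, c->z.
table = str.maketrans("abc", "xyz")
positional = "aabbcc".translate(table)

# The brackets and hyphen are ordinary characters here, as in y/[a-z]/[A-Z]/:
# only '[', 'a', '-', 'z' and ']' are mapped, so 'b' and 'u' are left alone.
literal = str.maketrans("[a-z]", "[A-Z]")
not_a_class = "[a-z] buzz".translate(literal)

# To get what tr [:lower:] [:upper:] does, spell out both lists in full.
upper = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)
shouted = "[a-z] buzz".translate(upper)

print(positional, not_a_class, shouted, sep="\n")
```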

05 May 2008

Obscure Cmd.exe Output Replacement (Back Tick)

With Unix shells such as bash and [t]csh, you can set the value of a variable to the result of a command using the back-tick operator (or output replacement). For example, in bash, LINES=`wc -l filename` (with no spaces around the equals sign) would set the variable LINES to the result of wc -l, which is the number of lines in filename. This technique is useful when you want to pass the value of a computed variable to subsequent commands in a script.

Windows' cmd.exe also supports this feature, in an obscure way, using the for command: for /f %i in ('command') do set VARIABLE=%i. To reproduce the previous example in cmd.exe: for /f %i in ('wc -l filename') do set LINES=%i.

Notes:

  • Use %%i in a script.
  • Use single-quote marks to delimit a command. If you use double-quote marks, for treats the argument in parentheses as a string.
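For comparison, this is roughly how the same idea looks in a Python 3 script, using subprocess.run from the standard library (the capture_output argument needs Python 3.7 or later). The helper name command_output is mine, and the command being run is a stand-in so that the sketch works on any machine.

```python
import subprocess
import sys

def command_output(args):
    """Run a command and return its standard output without the trailing newline."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout.rstrip("\n")

# A stand-in command so the sketch runs anywhere; substitute e.g.
# ["wc", "-l", "filename"] to mirror the example above.
lines = command_output([sys.executable, "-c", "print(6 * 7)"])
print(lines)
```

Passing the command as a list avoids the quoting rules of the shell, which is where most of the obscurity in the cmd.exe version comes from.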

I saw this and other cmd.exe hacks somewhere but I didn't take a note of it. Grr. Remind myself to update this page when I find that site again.

01 May 2008

More Uses of Getclip-Putclip

More uses of GnuWin32 / Cygutils tools getclip and putclip using this recipe: getclip | <command chain> | putclip.

  • Copy m'th and n'th column of a table from a browser: cut -fm,n.
  • Copy columns from Excel and replace tab character with space: tr \t " ".
  • Capitalize letters: tr [:lower:] [:upper:]. (Duh! Enter Shift-F3 in Microsoft Word, thanks to Maria H.).
  • Remove indentation from e-mail messages: sed "s/> //".
  • Remove indentation from source code in Word document: sed -e "s/^ //" (5-May-2008).
  • Join lines broken into multiple lines by e-mail clients: dos2unix | tr -d \n. On a Windows system, tr doesn't recognise CR-LF pairs for terminating a line, so you have to convert them to a Unix-style LF using dos2unix first (6-May-2008).
  • Another way to join broken lines: tr -d \r\n using escape codes for carriage return and line feed, respectively (11-May-2008).
  • Remove formatting from string: getclip | putclip. This is equivalent to Microsoft Word's Paste Special / Unformatted Text. It also works around an annoyance in Outlook 2003, where Edit / Paste Special is disabled when you are responding to an HTML-formatted document (7-May-2008).
  • Remove HTML / XML formatting from input: sed -e "s/<[^>]*>//g" (12-Jun-2008).
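If the GnuWin32 tools are not installed, some of these filters are easy to approximate in Python 3. The sketch below covers three of them as plain text-in, text-out functions; the function names are made up, and getting the text from and to the clipboard is left out because the standard library has no clipboard module. One known difference: sed works line by line, while the tag-stripping expression here can also match a tag that spans a line break.

```python
import re

def strip_quote_prefix(text):
    """Like sed "s/> //": remove the first "> " on each line."""
    return "\n".join(line.replace("> ", "", 1) for line in text.split("\n"))

def join_lines(text):
    """Like tr -d \\r\\n: delete every carriage return and line feed."""
    return text.replace("\r", "").replace("\n", "")

def strip_tags(text):
    """Like sed -e "s/<[^>]*>//g": delete anything that looks like a tag."""
    return re.sub(r"<[^>]*>", "", text)

print(strip_quote_prefix("> quoted\n> > twice"))
print(join_lines("broken \r\nline"))
print(strip_tags("<b>bold</b> text"))
```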

A second recipe is (for /f %i in ('getclip') do @command %i) | putclip if command cannot be used in a pipeline. Two examples are basename (return name of file in a path string) and dirname (return path string without file name).

2008-05-01: Don't simply list transformations and filters that can be done with GnuWin32 tools, but ones where existing applications (e.g. Excel, Firefox, Outlook or Word) don't have an easy way to achieve a particular action.

2012-06-25: Flatten or collapse Excel multi-column data.

30 April 2008

Using Clipboard in the Command Line

GnuWin32 / Cygutils package has two tools for interacting with the Windows clipboard: getclip and putclip. The first copies text from the clipboard to standard output and the second copies text from standard input to the clipboard. These tools are useful when you want to process text from one Windows application before pasting the text into another application, in the following recipe: getclip | <filters> | putclip.

For example, I want to paste all DLL file names in a folder into a document:

  1. Navigate to the required folder using 2xExplorer browser.
  2. Type Alt+a to select all files.
  3. Type Alt+c to copy all file names. 2xExplorer copies the absolute path for each file.
  4. Start cmd.exe console.
  5. In cmd.exe console, enter: getclip | cut -d\ -fn | grep dll$ | putclip. cut is a GnuWin32 tool which selects a column of data given a column delimiter (-d\ defines backslash) and field number (-fn defines column n). grep filters the output to only list files whose names end in "dll".
  6. Start editor.
  7. Paste the text in the clipboard in destination document.
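The filter part of this recipe (the cut and grep commands in step 5) can be sketched in Python 3 as follows. The function name dll_names and the sample paths are invented for illustration. Unlike cut -fn, which picks a fixed column, ntpath.basename always takes the last component of a Windows path, so you don't have to count the fields.

```python
import ntpath

def dll_names(clipboard_text):
    """Return the file names ending in "dll" from a list of Windows paths."""
    names = (ntpath.basename(line.strip()) for line in clipboard_text.splitlines())
    return [name for name in names if name.endswith("dll")]

# Hypothetical clipboard contents: one absolute path per line.
paths = ("C:\\Tools\\bin\\libintl3.dll\n"
         "C:\\Tools\\bin\\grep.exe\n"
         "C:\\Tools\\bin\\dllhost.txt\n")
print(dll_names(paths))
```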

Of course, you can do the same using Excel:

  1. Navigate to the required folder using 2xExplorer browser.
  2. Type Alt+a to select all files.
  3. Type Alt+c to copy all file names. 2xExplorer copies the absolute path for each file.
  4. Start Excel.
  5. Paste data in a worksheet column.
  6. Select all cells by typing Shift+Space.
  7. Open Convert Text to Columns Wizard by typing Alt+d+e.
  8. Select Delimited data type by typing Alt+d.
  9. Type Alt+n to go to page 2.
  10. Select Other delimiter by typing Alt+o, then enter "\" for paths.
  11. Type Alt+f to run the wizard.
  12. Start Auto Filter by typing Alt+d+f+f.
  13. Move to filter column using the mouse (no keyboard shortcuts?) then select from the drop down list (Custom …).
  14. Select the ends with criterion, enter .dll, then press Enter.
  15. Move cursor to required column and select it using Control+Space.
  16. Copy column by typing Control+C.
  17. Start editor.
  18. Paste the text in the clipboard in destination document.

The Excel solution has many more steps than the getclip-putclip solution but Excel leads you through to a solution step-by-step. If you're familiar with GNU tools, then the getclip-putclip recipe is faster to use and much more extensible.

2008-05-07. I should have remembered that the basename command would output the name of the file without the leading path string. See later article More Uses of Getclip-PutClip about how to use basename in a pipeline.

25 April 2008

Strange GnuWin32 Invalid Argument Error Messages

When chaining GnuWin32 commands in Windows cmd.exe, you may encounter strange error messages like this:

> ls | grep
…
ls: write error: Invalid argument

The first command reports a write error but the error is really in the second command after the pipe symbol.

You may also encounter a similar write error if the wrong command is found in your PATH variable. For instance, Windows and GnuWin32 both have a find and sort command which support different command-line options, so depending on the order of directories listed in your PATH variable, one version or the other is used. If you enter the wrong command-line options for these commands, they won't start and cause the command earlier in the chain to report some sort of I/O error.

24 April 2008

Python Command Line (-c option)

Perl has a -n option which implicitly runs a while-loop over all lines in STDIN (while (<>) { }). This mode is handy in a command shell when Perl is the recipient of the output of another command and you don't want to write a script. Can we do the same for Python?

Python has a -c option which runs a command in the string following it. While it's not entirely clear to me what is a Python command, I found that you can write some useful functions using list functions and statements using this template:

python -c "import sys, <package>; print '\n'.join(<list function>(lambda x: <expression>, (s.strip() for s in sys.stdin)))"

To use this template, replace <package> with a package name (e.g. os), <list function> with a list function (e.g. filter()) and <expression> with, well, an expression. The rest of the template just constructs a list of strings (without a trailing "\n") from the input and prints the results.

For simple string processing, the list function and expression are not required, resulting in a simplified version of this template:

python -c "import sys, <package>; print '\n'.join(<fn>(s.strip()) for s in sys.stdin)"

While researching this topic, I found an ASPN Python Recipe called Pyline to help write commands. Here are the examples in that recipe rewritten using my template:

Print the first 20 characters of each line:

tail test.txt | python -c "import sys; print '\n'.join(s.strip()[:20] for s in sys.stdin)"

Print the 7th word in each line, assuming the separator is ' ':

tail test.txt | python -c "import sys; print '\n'.join(''.join(s.strip().split(' ')[6:7]) for s in sys.stdin)"

Note that you can also get columns of text from a file using the cut command. Also note that the reason for using the array slice is to avoid getting an IndexError exception if the string is not long enough.

List all files that are greater than 1024 bytes in size:

ls | python -c "import os, sys; print '\n'.join(filter(lambda x: os.path.isfile(x) and os.stat(x).st_size > 1024, (s.strip() for s in sys.stdin)))"

Generate MD5 digest values for a list of files, like md5sum.

ls *.txt | python -c "import md5, sys; print ''.join('%s %s' % (md5.new(file(s.strip(), 'rb').read()).hexdigest(), s) for s in sys.stdin)"

26-Apr-2008: Replaced list comprehension statement (for-in with square brackets) with generator expression (for-in with parentheses) in the template to avoid very large lists stored in memory.

Added MD5 digest example, and realised that we only need to use list functions (e.g. filter()) if we want to change the members of the resulting list. Otherwise, the simpler template suffices.
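The md5 module used in the last example was deprecated in favour of hashlib from Python 2.5 and is gone in Python 3. As a sketch only, here is how the MD5 example might look as a small Python 3 module; the function names are mine, and the file is read in chunks so that large files don't have to fit in memory.

```python
import hashlib

def md5_of_file(path):
    """Return the hex MD5 digest of a file, reading it in binary mode."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 64 KB at a time until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def md5_lines(names):
    """Yield one "digest  name" line per file name, like md5sum."""
    for name in names:
        name = name.strip()
        if name:
            yield "%s  %s" % (md5_of_file(name), name)
```

To use it in a pipeline, pass sys.stdin as the list of names and print the joined result, e.g. print("\n".join(md5_lines(sys.stdin))).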

11 April 2008

Firefox Greasemonkey Kills Google Groups Spam

If you read Usenet newsgroups, no doubt you'd be familiar with spam messages spruiking credit, fake jewellery, external organ enlargements and free graduate degrees. On a PC, you can use killfiles in newsreading software to ignore spam messages. If you're reading newsgroups using the Google Groups web-based reader with Firefox, you can ignore annoying spam messages using a Greasemonkey script called Google Groups Killfile (GGK).

You can add entries to your killfile list using GGK's context menu but the list becomes hard to view and manage once you have a lot of entries. It is easier to edit GGK's kill list variable:

  • Enter "about:config" in Firefox's location bar.
  • Enter "kill" in the Filter field.
  • Click on greasemonkey.scriptvals.www.penney.org/Google Groups Killfile.GoogleKillFile and edit the configuration string.

2008-04-14: If you use regular expressions (RE), you can reduce the number of entries in the killfile list by using wildcards and the "alternate" operator (vertical bar symbol ("|")). You can further reduce the number of patterns to define by specifying case-insensitive comparison in GGK. Just search for the REs' "compile()" function in the GGK script and add a second "i" argument.
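GGK itself is a JavaScript script, so the following Python 3 snippet is only an analogy for how alternation and case-insensitive matching shrink a kill list. The entries are hypothetical, and re.IGNORECASE plays the role of the "i" argument mentioned above.

```python
import re

# Hypothetical kill-list entries, written out one per spelling...
separate = ["cheap watches", "Cheap Watches", "free degree", "FREE DEGREE"]

# ...and the same list collapsed into one case-insensitive pattern.
combined = re.compile(r"cheap watches|free degree", re.IGNORECASE)

def is_spam(subject):
    """Return True if the subject matches any alternative in the pattern."""
    return combined.search(subject) is not None

print(all(is_spam(entry) for entry in separate))
print(is_spam("Question about sed's y command"))
```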