Computing

Posted by Jeet Sukumaran

Tar has no dedicated option for ignoring the directory paths of the files that it is archiving. So, for example, if you have a large number of files scattered about in subdirectories, there is no simple way to tell tar to archive all the files while ignoring their subdirectories, such that when unpacking the archive you extract all the files to the same location. You can, however, tell tar to strip a fixed number of leading elements from the full (relative) path to each file when extracting, using the "--strip-components" option. For example:

Posted by Jeet Sukumaran

In a previous post, I discussed a couple of approaches to dealing with the "Argument list too long" error when transferring large numbers of files. The solution to this problem is to archive the files, using the "-T" option of the "tar" command to pass in a list of files generated by a "find" command:

  1. Create a list of the files to be archived using the "find" command:
    $ find . -name "*.tre" > filelist.txt
  2. Use the "-T" option of the "tar" command to pass in this list of filenames:
    $ tar cvjf archive.tbz -T filelist.txt

If you want to delete a long list of files, however, this approach will not work, as "rm" does not support the very convenient "-T"/"--files-from" flag or the equivalent (so convenient, in fact, that I have started adding this to virtually every file-processing script or program that I write).
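As a rough illustration, a "-T"/"--files-from"-style option in a Python script might look something like the sketch below. The function name `remove_listed_files`, the `main` wrapper, and the one-path-per-line list format are assumptions made for this example, not code taken from any particular program.

```python
import argparse
import os


def remove_listed_files(list_path):
    """Delete every regular file named in list_path (one path per line).

    Returns the list of paths that were actually removed.
    """
    removed = []
    with open(list_path) as src:
        for line in src:
            path = line.rstrip("\n")
            if not path:
                continue  # skip blank lines
            if os.path.isfile(path):
                os.remove(path)
                removed.append(path)
    return removed


def main(argv=None):
    # Hypothetical command-line wrapper mirroring tar's "-T"/"--files-from".
    parser = argparse.ArgumentParser(description="Delete files named in a list.")
    parser.add_argument("-T", "--files-from", dest="files_from", required=True,
                        help="file containing one path per line")
    args = parser.parse_args(argv)
    for path in remove_listed_files(args.files_from):
        print("removed", path)
```

Calling `main(["-T", "filelist.txt"])` would then remove every file listed in "filelist.txt". Because the paths are read from a file rather than passed on the command line, the "Argument list too long" limit never comes into play.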

Luckily, however, "find" does support a "-delete" flag, so to recursively delete all files and directories:

 
Posted by Jeet Sukumaran

There are several functions that calculate principal component statistics in R. Two of these are "prcomp()" and "princomp()". The "prcomp()" function has fewer features, but is numerically more stable than "princomp()".

Both of these functions can be invoked by simply passing in a suitable data frame, in which case all columns will be used:

    pca1 = prcomp(d)
    pca2 = princomp(d)

Alternatively, the columns to be used can be specified using a formula notation:

Posted by Jeet Sukumaran

If you have opened a file and see a bunch of "^M" or "^J" characters in it, chances are that for some reason Vim is confused as to the line-ending type. You can force it to interpret the file with a specific line-ending by using the "++ff" argument and asking Vim to re-read the file using the ":e" command:

SGE Queue Tantrums

16 Jun 2010
Posted by Jeet Sukumaran

You can get a "bird's eye" view of your cluster load by running:

Posted by Jeet Sukumaran

Every day I discover at least one new thing about Vim. Sometimes useful, sometimes not. Sometimes rather prosaic, sometimes sublime.

This one falls in the useful but prosaic category: to get a count of the number of characters, lines, words etc. in the current selection, type "g CTRL-G".

Posted by Jeet Sukumaran

Given a list of strings, how would you interpolate a multi-character string in front of each element?

For example, given:

    >>> k = ['the quick', 'brown fox', 'jumps over', 'the lazy', 'dog']

The objective is to get:

    ['-c', 'the quick', '-c', 'brown fox', '-c', 'jumps over', '-c', 'the lazy', '-c', 'dog']

Of course, the naive solution would be to compose a new list by iterating over the original list:
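A minimal sketch of what that naive loop might look like, assuming the string to insert is '-c' as in the example above. The function name `interpolate` is made up for this illustration.

```python
k = ['the quick', 'brown fox', 'jumps over', 'the lazy', 'dog']


def interpolate(flag, items):
    """Return a new list with flag placed in front of each element of items."""
    result = []
    for item in items:
        result.append(flag)   # the interpolated string comes first ...
        result.append(item)   # ... followed by the original element
    return result


result = interpolate('-c', k)
print(result)
```

The original list is left untouched; the loop builds a new list that is twice as long.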

Posted by Jeet Sukumaran

For the past few months, I've been 'defensive coding' with respect to Python 3.x; basically, if there is a construct that:

  • will be broken under 3.x

and

  • the alternate (which is not broken) is supported under 2.6+

I've been trying to use that instead.

Here is a "3K-ism" that got me, and that was completely unanticipated. I encountered it when running my Python environment description script under Python 3.x.

Posted by Jeet Sukumaran

I love text editors. Which is a good thing, because I spend the overwhelming majority of my computing time (and, hence, sadly, most of my conscious life) in one text editor or another. For years I have been an Emacs user, only relatively recently moving to BBEdit with my adoption/inheritance of a Mac as a personal machine. Using and often administrating Linux-based systems has necessitated that I use Vi now and then, but I have long held the opinion that the only Vi command one needs to know is: ":q!", perhaps to be followed by "emacs".

This attitude was born out of some unpleasant experiences really early on in my computing history. I distinctly remember a few occasions when I was trapped in an apparently psychotic terminal session that would not accept my typing despite dozens of increasingly-frenzied keystrokes, and then suddenly and inexplicably it started accepting my typing but refused to let me stop typing and exit. This was my introduction to the Vi editor. After once or twice resorting to disconnecting and relogging-in as the only way to break out of the grip of this insane editor, I learned how to properly quit it: ":q!". For many years after that, those three keystrokes probably summed up 90-99% of my Vi usage: whenever I inadvertently triggered an editing session with it, I would quit it with alacrity and get on with life. It was a long while before I stopped getting a flash of an "Arrgh! Not again!" semi-panicky feeling whenever I saw a screen with all those tildes running down the left hand side. As far as I was concerned, a Vi session was synonymous with an operating system glitch or failure.

All that has recently radically changed ...

Posted by Jeet Sukumaran

This is pretty slick: enter "fc" in the shell and your last command opens up for editing in your default editor (as given by "$EDITOR"). Works perfectly with vi. The "$EDITOR" variable approach does not seem to work with BBEdit though, and you have to name the editor explicitly:

    $ fc -e '/usr/bin/bbedit --wait'

With vi, ":cq" aborts execution of the command. Not sure how to do the same thing with BBEdit.