Saturday, July 16, 2011

Cluster, I hate you.

Working with huge datasets as I do, I have to use my school's cluster computing environment, which is really different from any other computer I'm familiar with. For example, in order to use programs such as emacs, svn, or java, I have to enable them using SoftEnv. This has to be done either each time I log on or with a startup script. I've been doing the former, since basically all I use are emacs, svn, and java, and usually not all three every time I log in, but I've been planning to do the latter eventually... (Hint: This is foreshadowing about how my laziness may have been my salvation.)

Yesterday, late at night, I was struggling to do something I had done successfully before and kept getting bizarre errors. I finally tried one last time to run my script, failed again, gave up, and went to bed. This morning I began to tackle the problem again. My first step was to verify the error by rerunning the script.

Um....what error? Errors, I hate you when you exist, but I hate you even more when by not existing, you make me look crazy.

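Once I had slept and could think clearly, I found the problem: yesterday I had to add java in order to use javac, and the version that got added was the wrong one for a particular jarfile, so I could no longer run it. Because I add things by hand instead of in a startup script, this morning's fresh login didn't have that java loaded, and the error vanished. Java versions, I hate you.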
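For anyone hitting a similar jar-versus-java mismatch, here is a minimal Python sketch of one way to check which Java release a jar's classes were compiled for. It assumes the trouble is a class-file version mismatch, which is only one way a jar can disagree with the active java, and the function names are made up for illustration. It relies on the fact that every .class file starts with the magic number 0xCAFEBABE followed by a minor and a major version number.

```python
import struct
import zipfile

CLASS_MAGIC = 0xCAFEBABE  # first four bytes of every Java class file


def class_major_version(header: bytes) -> int:
    """Return the major version stored in the first 8 bytes of a .class file."""
    if len(header) < 8:
        raise ValueError("need at least 8 bytes of class-file header")
    # Layout: u4 magic, u2 minor_version, u2 major_version (big-endian).
    magic, _minor, major = struct.unpack(">IHH", header[:8])
    if magic != CLASS_MAGIC:
        raise ValueError("not a Java class file")
    return major


def jar_major_versions(jar) -> set:
    """Collect the major versions of all .class entries in a jar.

    `jar` may be a path or a binary file object; a jar is a zip archive.
    """
    versions = set()
    with zipfile.ZipFile(jar) as zf:
        for name in zf.namelist():
            if name.endswith(".class"):
                with zf.open(name) as fh:
                    versions.add(class_major_version(fh.read(8)))
    return versions


def java_release(major: int) -> int:
    """Map a class-file major version to a Java release (49 -> 5, 50 -> 6, 51 -> 7)."""
    return major - 44
```

Calling `jar_major_versions` on the jar in question and comparing the result with what `java -version` reports in the current shell shows whether the classes are newer than the runtime that is trying to load them.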

The lessons I learned from this are: First, on the cluster, if I'm having problems running something I have run before, I should log on in a new shell and see if that fixes the problem. Second, if I decrease my announced memory requirements and do all my processing at night, my process won't sit in the queue as long, so I'll find out more quickly if there are any immediate problems I need to solve.
