Free Apache Spark course with Docker image

My friends over at Big Data University just launched a refresh of their Apache Spark course. Spark is an engine for processing and mining large amounts of data quickly.

The course takes advantage of a Docker image for Spark. Docker provides a way to run Spark on a typical laptop without getting a beefy server.

Spark’s on my personal to-learn list, so I just pencilled a slot into my calendar to take the course myself.

Dockerize all the things

As my team creates new web apps and microservices in Rails, Meteor JS, and other stacks, we dockerize them. Docker is great because it enforces loose coupling, modular design, and process isolation.

A Docker container can be thought of as a very lightweight VM that shares a Linux kernel with its host. The infrastructure and tooling around it make configuration management, scaling, and efficient use of resources easier.

My team’s also dockerized the Informix and DB2 database servers. The direct hardware access that Docker allows will give you better performance than you’d have seen with a VM equivalent in the past, while the container format allows for easy automation, operation, and management.

Unix command of the day: watch

The project I’m working on right now involves not just Dockerized Rails microservices, Meteor JS, and a data set measured in tens of terabytes, but also a big Bash code base.

Bash is a language that makes it easy to shoot yourself in the foot. I have some thoughts on how to write robust, modular, loosely coupled, unit tested Bash that go beyond Bash strict mode and shellcheck. However, let’s save those for later.

Here’s a useful shell command for today: watch

watch will run a command for you every few seconds and output the results on a clean screen. If you combine it with a split-screen or split-pane tool, you can quickly create a mini-dashboard.

For example, watch -n15 df -h will print your free disk space every 15 seconds.

By way of another example, watch -n60 db2 list utilities show detail will check on the status of your DB2 load and other operations every 60 seconds.
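To make the mechanics concrete, here's a rough sketch of the idea behind watch as a plain shell function. It's a simplified stand-in, not how watch itself is implemented: the name watch_lite and its COUNT argument are my own inventions, and it stops after a fixed number of runs instead of looping until you hit Ctrl-C.

```shell
# Rough stand-in for watch: run a command repeatedly with a pause in between.
# Usage: watch_lite INTERVAL_SECONDS COUNT COMMAND [ARGS...]
# Unlike the real watch, this stops after COUNT runs (hypothetical helper).
watch_lite() {
  interval="$1"
  count="$2"
  shift 2
  i=0
  while [ "$i" -lt "$count" ]; do
    # Clear the screen only when writing to a terminal.
    if [ -t 1 ]; then clear; fi
    "$@"
    i=$((i + 1))
    # Skip the pause after the final run.
    if [ "$i" -lt "$count" ]; then sleep "$interval"; fi
  done
}

# Print free disk space 3 times, 15 seconds apart:
# watch_lite 15 3 df -h
```

The real watch adds a header line with the interval and timestamp, which this sketch leaves out.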

Mac OS X does not include watch by default. Assuming you have Homebrew, running brew install watch will install it for you.

libdb2.so.1: cannot open shared object file

I got this error starting a Ruby application:

A .so file is a Linux shared library, equivalent to a .dll on Windows or a .dylib on Mac. Note that the error mentions two different libraries: ibm_db.so is present, while libdb2.so.1 is missing.

You can verify the dependencies using the ldd command:
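A minimal sketch of that check: ldd prints one line per dependency, and anything the loader can't resolve shows up with "=> not found". The helper name missing_libs is mine, and the path in the usage comment is a placeholder, since the real location of ibm_db.so depends on where your gems are installed.

```shell
# List the shared-library dependencies of a binary that the loader cannot find.
# ldd prints one line per dependency; unresolved ones end in "=> not found".
missing_libs() {
  ldd "$1" 2>/dev/null | grep 'not found'
}

# Hypothetical path: point this at the ibm_db.so inside your gem directory.
# missing_libs /path/to/ibm_db.so
```

If the function prints nothing, every dependency was resolved.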

For me, the issue was caused by a missing IBM Data Server Driver directory. IBM_DB_LIB was pointing to a non-existent directory:
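One quick way to confirm this kind of problem is to test whether the directory named by the variable exists. This is only a sketch; the function name check_ibm_db_lib and its messages are made up for illustration.

```shell
# Report whether the directory named by IBM_DB_LIB actually exists.
# (check_ibm_db_lib is a hypothetical helper, not part of any IBM tooling.)
check_ibm_db_lib() {
  if [ -z "${IBM_DB_LIB:-}" ]; then
    echo "IBM_DB_LIB is not set"
  elif [ -d "$IBM_DB_LIB" ]; then
    echo "IBM_DB_LIB exists: $IBM_DB_LIB"
  else
    echo "IBM_DB_LIB points to a missing directory: $IBM_DB_LIB"
  fi
}

check_ibm_db_lib
```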

Reinstalling the Data Server Driver restored libdb2.so.1 and eliminated the error.

(If you run into this issue with different libraries, you will likely need to examine your LD_LIBRARY_PATH environment variable and use the ldconfig command to reload any changes.)
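As a starting point for that kind of digging, here's a small sketch that walks LD_LIBRARY_PATH and flags entries that don't exist on disk. The function name check_library_path is hypothetical, and the output format is just what I find readable.

```shell
# Print each LD_LIBRARY_PATH entry and whether that directory exists.
# (check_library_path is a hypothetical helper.)
check_library_path() {
  printf '%s\n' "${LD_LIBRARY_PATH:-}" | tr ':' '\n' | while IFS= read -r dir; do
    [ -n "$dir" ] || continue
    if [ -d "$dir" ]; then
      echo "ok      $dir"
    else
      echo "missing $dir"
    fi
  done
}

check_library_path

# After changing /etc/ld.so.conf or adding libraries to a system directory,
# rebuild the loader cache (typically requires root):
# sudo ldconfig
```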

How to run and install IPMIView on Mac

The SoftLayer cloud uses IPMIView for direct console access to bare metal hardware. Supermicro makes a Mac version of IPMIView available for download.

Bizarrely, Supermicro doesn’t appear to have ever tested it, because double-clicking on the downloaded IPMIView20 application doesn’t do anything. This is because someone forgot to set the execute permission on the installer.

Supposing it’s in your downloads folder, open up Terminal and run the following commands:
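A minimal sketch of the fix, under two assumptions you should verify with ls first: that the download is an app bundle named IPMIView20.app, and that its launcher lives under Contents/MacOS like most Mac apps. The helper name fix_app_permissions is my own.

```shell
# Set the execute bit on every file in an app bundle's Contents/MacOS directory.
# (fix_app_permissions is a hypothetical helper.)
fix_app_permissions() {
  app="$1"
  if [ ! -d "$app/Contents/MacOS" ]; then
    echo "no Contents/MacOS directory in $app" >&2
    return 1
  fi
  chmod +x "$app"/Contents/MacOS/*
}

# Assumed bundle name and location; adjust to match your download.
app_path="$HOME/Downloads/IPMIView20.app"
if [ -d "$app_path" ]; then
  fix_app_permissions "$app_path" && open "$app_path"
fi
```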

This will open up the installer for you. Once it’s installed, it will show up in your Launchpad like a normal application.


Persistent SSH sessions with screen

Do you ever need to kick off a long-running command while SSHed to a server, but be able to disconnect and reconnect at will? You can do this with screen.

Before doing anything, start a screen session by running screen.

When you’re ready to put your work on hold, detach the screen by pressing Ctrl-A and then D.

If a long-running command is in progress, you can detach its screen from a different shell session by passing the process id to screen -d.

You can now disconnect from the server safely.

When you reconnect, you can also reattach to your screen session with screen -r.
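If you've lost track of which sessions exist, screen -ls lists them, and each session line starts with the process id followed by a dot. Here's a small sketch that pulls those ids out of the listing. The helper name list_screen_sessions is hypothetical, and it assumes the usual indented "pid.tty.host" layout, which may vary between screen versions.

```shell
# Extract session ids ("<pid>.<name>") from the output of `screen -ls`.
# Session lines are indented and start with the process id followed by a dot.
# (list_screen_sessions is a hypothetical helper.)
list_screen_sessions() {
  awk '$1 ~ /^[0-9]+\./ { print $1 }'
}

# Typical use:
# screen -ls | list_screen_sessions
# Then reattach to one of them with: screen -r <pid>
```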


How to rename a file in a File upload dialog on Mac

Windows users will scoff at this, but renaming a file in a File Upload dialog box on Mac is a surprisingly obscure action. Renaming is not available in the context menu, nor does the usual shortcut work.

To rename a file in a regular Finder window, you can select it and hit the Enter key. However, you can’t do this in a File Upload dialog as the Enter key has a different meaning.

To rename a file, select it, hit Cmd+I, open up the Name & Extension pane, and change the name.