My friends over at Big Data University just launched a refresh of their Apache Spark course. Spark is an engine for processing and mining large amounts of data quickly.
The course takes advantage of a Docker image for Spark. Docker provides a way to run Spark on a typical laptop without getting a beefy server.
Spark’s on my personal to-learn list, so I just pencilled in a slot on my calendar to take the course myself.
As my team creates new web apps and microservices in Rails, Meteor JS, and other stacks, we dockerize them. Docker is great because it enforces loose coupling, modular design, and process isolation.
A Docker container can be thought of as a very lightweight VM that shares a Linux kernel with its host. The infrastructure and tooling around it make configuration management, scaling, and efficient use of resources easier.
My team’s also dockerized the Informix and DB2 database servers. The direct hardware access that Docker allows will give you better performance than you’d have seen with a VM equivalent in the past, while the container format allows for easy automation, operation, and management.
The project I’m working on right now involves not just Dockerized Rails microservices, Meteor JS, and a data set measured in tens of terabytes, but also a big Bash code base.
Bash is a language that makes it easy to shoot yourself in the foot. I have some thoughts on how to write robust, modular, loosely coupled, unit-tested Bash that go beyond Bash strict mode and shellcheck. However, let’s save those for later.
Here’s a useful shell command for today: watch
watch will run a command for you every few seconds and output the results on a clean screen. If you combine it with a split-screen or split-pane tool, you can quickly create a mini-dashboard.
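To make that concrete, here's a rough sketch of what watch is doing under the hood, assuming a POSIX-ish shell. The function name poll and its COUNT argument are my own inventions for illustration; the real watch has many more features and no COUNT argument.

```shell
# poll INTERVAL COUNT COMMAND [ARGS...]
# Re-run COMMAND every INTERVAL seconds, COUNT times (0 means forever).
# "poll" is a hypothetical helper, not a standard tool.
poll() {
  local interval count i
  interval=$1
  count=$2
  i=0
  shift 2
  while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
    # Only clear when writing to a terminal, so output can still be piped.
    if [ -t 1 ]; then clear; fi
    # Header line loosely modelled on the one watch prints.
    printf 'Every %ss: %s    %s\n\n' "$interval" "$*" "$(date)"
    "$@"
    i=$((i + 1))
    # Skip the sleep after the final run.
    if [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; then
      sleep "$interval"
    fi
  done
}
```

Something like poll 15 0 df -h would then behave much like watch -n15 df -h.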
For example, watch -n15 df -h will print your free disk space every 15 seconds:
Every 15.0s: df -h                          Tue Apr 14 13:43:25 2015

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       1.8T  691G  1.1T  40% /
tmpfs           253G   12K  253G   1% /dev/shm
/dev/sda1       248M   58M  178M  25% /boot
/dev/sdb1        30T   12T   18T  41% /my_disk
By way of another example, watch -n60 db2 list utilities show detail will check on the status of your DB2 load and other operations every 60 seconds.
Every 60.0s: db2 list utilities show detail Mon Apr 13 12:06:24 2015
ID = 4
Type = LOAD
Database Name = MY_DB
Member Number = 0
Description = [LOADID: 1177.2015-04-13-18.104.22.1682010.0 (20;15)]
[*LOCAL.MY_USER.150413204511] OFFLINE LOAD DEL
AUTOMATIC INDEXING INSERT NON-RECOVERABLE MY_SCHEMA .MY_TABLE
Start Time = 04/13/2015 15:58:12.298788
State = Executing
Invocation Type = User
Phase Number = 1
Description = SETUP
Total Work = 0 bytes
Completed Work = 0 bytes
Start Time = 04/13/2015 15:58:12.298799
Phase Number [Current] = 2
Description = LOAD
Total Work = 4105362974 rows
Completed Work = 1069904365 rows
Start Time = 04/13/2015 15:58:15.821695
Phase Number = 3
Description = BUILD
Total Work = 2 indexes
Completed Work = 0 indexes
Start Time = Not Started
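If all you want from that wall of text is a single progress number, a small awk filter can pull it out. This is only a sketch: it assumes the output keeps the layout shown above (a Phase Number line marked [Current], followed by Total Work and Completed Work lines), and load_progress is a hypothetical helper name, not part of DB2.

```shell
# load_progress: read `db2 list utilities show detail` output on stdin and
# print percent complete for the phase marked [Current].
# Hypothetical helper; assumes the field layout shown in the sample output.
load_progress() {
  awk '
    # Each "Phase Number" line starts a new phase; remember if it is current.
    /Phase Number/ { current = ($0 ~ /\[Current\]/) }
    # Fields: Total Work = <number> <unit>
    current && /Total Work/ { total = $4 }
    # Guard against dividing by zero when no work has been sized yet.
    current && /Completed Work/ && total > 0 {
      printf "%.1f%% (%s of %s %s)\n", 100 * $4 / total, $4, total, $5
    }
  '
}
```

Usage would be db2 list utilities show detail | load_progress. For the snapshot above, that works out to roughly 26% of the rows loaded.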
Mac OS X does not include watch by default. Assuming you have Homebrew, the following will install watch for you:
brew install watch
I got this error starting a Ruby application:
libdb2.so.1: cannot open shared object file:
No such file or directory -
A .so file is a Linux shared library, equivalent to a .dll on Windows or a .dylib on Mac. Note that the error mentions two different libraries: ibm_db.so, which is present, and libdb2.so.1, which is missing.
You can verify the dependencies using the ldd command:
$ ldd /usr/local/rvm/gems/ruby-1.9.3-p547/extensions/x86_64-linux/1.9.1/ibm_db-2.5.11/ibm_db.so
linux-vdso.so.1 => (0x00007fff93545000)
libruby.so.1.9 => /usr/local/rvm/rubies/ruby-1.9.3-p547/lib/libruby.so.1.9 (0x00007fb8ba5a5000)
libdb2.so.1 => not found
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fb8ba37f000)
librt.so.1 => /lib64/librt.so.1 (0x00007fb8ba177000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fb8b9f72000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007fb8b9d3b000)
libm.so.6 => /lib64/libm.so.6 (0x00007fb8b9ab7000)
libc.so.6 => /lib64/libc.so.6 (0x00007fb8b9722000)
libfreebl3.so => /lib64/libfreebl3.so (0x00007fb8b94ab000)
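On a binary with a long dependency list, it can be handy to filter the ldd output down to only the unresolved entries. A minimal sketch, where missing_libs is a hypothetical helper name of my own:

```shell
# missing_libs: read ldd output on stdin and print the name of every
# library the dynamic linker could not resolve.
# Hypothetical helper; relies on ldd's "name => not found" line format.
missing_libs() {
  awk '/=> not found/ { print $1 }'
}
```

You would run it as ldd /path/to/ibm_db.so | missing_libs. An empty result means every dependency was found.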
For me, the issue was caused by a missing IBM Data Server Driver directory. IBM_DB_LIB was pointing to a non-existent directory:
$ set | grep IBM
Reinstalling the Data Server Driver restored libdb2.so.1 and eliminated the error.
(If you run into this issue with different libraries, you will likely need to check that your LD_LIBRARY_PATH environment variable includes the library’s directory, or add that directory to the system linker configuration and run ldconfig to rebuild the linker cache.)
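Since my problem boiled down to a variable pointing at a directory that didn't exist, a quick check along these lines would have saved me some time. This is a sketch only; missing_dirs is a hypothetical helper name, and it simply treats its argument as a colon-separated list of directories.

```shell
# missing_dirs PATHLIST: print every entry of a colon-separated list
# that is not an existing directory.
# Hypothetical helper for sanity-checking path-style variables.
missing_dirs() {
  printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r dir; do
    # Skip empty entries (for example from a leading or trailing colon).
    if [ -n "$dir" ] && [ ! -d "$dir" ]; then
      printf '%s\n' "$dir"
    fi
  done
}
```

For example, missing_dirs "$IBM_DB_LIB" or missing_dirs "$LD_LIBRARY_PATH" prints nothing when every directory exists.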
SoftLayer’s cloud uses IPMIView for direct console access to bare metal hardware. SuperMicro makes a Mac version of IPMIView available for download.
Bizarrely, SuperMicro doesn’t appear to have ever tested it, because double-clicking on the downloaded IPMIView20 application doesn’t do anything. This is because someone forgot to set the execute permission on the installer.
Supposing it’s in your downloads folder, open up Terminal and run the following commands:
cd ~/Downloads/IPMIView20.app/Contents/MacOS
chmod u+x IPMIView20
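With the execute permission set, launching IPMIView20 (by double-clicking it again, or by running ./IPMIView20 from the same directory) will open up the installer for you. Once it’s installed, it will show up in your Launchpad like a normal application.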
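If you'd like to confirm that the missing execute bit really is the culprit before changing anything, a small check like the following does the job. It's a sketch, and ensure_executable is a hypothetical helper name of my own.

```shell
# ensure_executable FILE: add the owner execute bit only if it is missing,
# and report what was done. Hypothetical helper for illustration.
ensure_executable() {
  if [ ! -f "$1" ]; then
    printf 'not a file: %s\n' "$1" >&2
    return 1
  fi
  if [ -x "$1" ]; then
    printf 'already executable: %s\n' "$1"
  else
    chmod u+x "$1" && printf 'added execute permission: %s\n' "$1"
  fi
}
```

Running ensure_executable IPMIView20 from the MacOS directory tells you which case you were in.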
Do you ever need to kick off a long-running command while SSHed to a server, and still be able to disconnect and reconnect at will? You can do this with screen.
Before doing anything, start a screen session:
screen
When you’re ready to put your work on hold, detach the screen by pressing Ctrl-a followed by d.
If you have a long-running command running, you can detach that screen from a different shell session by specifying the process id:
# look up the process id
ps aux | grep SCREEN
# detach that screen
screen -d <my_process_id>
You can now disconnect from the server safely.
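As an alternative to grepping ps, screen can list its own sessions, and giving a session a name saves you from looking up the process id at all. The sketch below shows that workflow in comments, plus a tiny filter for pulling process ids out of the listing. The session name longjob and the helper session_pids are hypothetical names for illustration.

```shell
# Named-session workflow (the name "longjob" is illustrative):
#   screen -S longjob      # start a session with a name
#   screen -ls             # list sessions, shown as <pid>.<name>
#   screen -d longjob      # detach it from a different shell
#   screen -r longjob      # reattach to it later

# session_pids: read `screen -ls` output on stdin and print the process id
# of each listed session (the digits before the first dot).
# Hypothetical helper; assumes the usual <pid>.<name> listing format.
session_pids() {
  awk '$1 ~ /^[0-9]+\./ { split($1, parts, "."); print parts[1] }'
}
```

With that in place, screen -ls | session_pids gives you the ids to pass to screen -d.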
When you log back in to the server, you can reattach to your screen session:
screen -r
Windows users will scoff at this, but renaming a file in a File Upload dialog box on Mac is a surprisingly obscure action. Renaming is not available in the context menu, nor does the usual shortcut work.
To rename a file in a regular Finder window, you can select it and hit the Enter key. However, you can’t do this in a File Upload dialog as the Enter key has a different meaning.
To rename a file, select it, hit Cmd+I, open up the Name & Extension pane, and change the name.