$10k! Spark hackathon in San Fran

I’ll be in San Francisco this weekend helping run the Apache Spark hackathon, and afterwards I’ll be at Spark Summit 2015.

If you’re curious at all about Spark, you should come out and hack with us. We’ll have some fun data sets and help you find a team.

You can take the free Spark Fundamentals course on Big Data University to brush up on your Spark skills. Spark is a framework for fast in-memory and batch analytics processing. It can keep working data in memory between steps instead of writing intermediate results to disk, which makes it a lot faster than traditional Hadoop MapReduce for many workloads.

There’s $10k in prizes at the hackathon.

Encrypt Gmail, Facebook emails with OpenPGP and Mailvelope

Facebook just released a great use case for OpenPGP encryption in Gmail and other web email providers. You can now configure Facebook to encrypt all email it sends you with OpenPGP.

Whether or not you use Facebook, it’s surprisingly easy to use Mailvelope to integrate OpenPGP with Gmail and other email providers. Mailvelope is a browser extension for Chrome and Firefox that lets you encrypt messages that you write and decrypt messages that you receive.

The encryption can be done externally to the web email interface, so your email provider does not have access to the plain text of your email message.

OpenPGP is based on public key cryptography. You have two keys — a public key you can share with everyone, and a private key that you keep secret. Everyone can use your public key to encrypt messages they send you, but only you can decrypt them using your private key.
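
To make the two-key idea concrete, here is a toy sketch of RSA-style arithmetic in shell. The numbers are tiny textbook values I picked for illustration (primes 61 and 53), and the modpow helper is a made-up name. This is not what Mailvelope or OpenPGP actually run, since real keys are thousands of bits long and OpenPGP layers a lot more on top.

```shell
# Toy illustration only: textbook RSA with tiny, assumed numbers.
# n = 61 * 53 = 3233. The public key is (e, n), the private key is (d, n),
# and e * d leaves remainder 1 when divided by 60 * 52 = 3120.
n=3233
e=17      # public exponent: safe to share with everyone
d=2753    # private exponent: keep this secret

# modpow BASE EXP MOD: square-and-multiply modular exponentiation.
modpow() {
  base=$(( $1 % $3 ))
  exp=$2
  mod=$3
  result=1
  while [ "$exp" -gt 0 ]; do
    if [ $(( exp % 2 )) -eq 1 ]; then
      result=$(( result * base % mod ))
    fi
    base=$(( base * base % mod ))
    exp=$(( exp / 2 ))
  done
  echo "$result"
}

message=65
ciphertext=$(modpow "$message" "$e" "$n")    # anyone can do this with the public key
decrypted=$(modpow "$ciphertext" "$d" "$n")  # only the private key reverses it
echo "ciphertext=$ciphertext decrypted=$decrypted"
```

The point to take away is the asymmetry: encrypting needs only e and n, while getting the message back needs d.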

Why encrypt email? Email is generally transmitted in plain text across the internet, meaning a hostile party can intercept it. With the web (HTTP) moving to encrypted connections for everything, email is left behind as an insecure communication medium. You as a user have to take active steps to make it secure.

Here’s how you can transparently integrate OpenPGP encryption with Gmail and Facebook:

  1. Install Mailvelope
  2. Generate a new key in Mailvelope options
  3. Go to Display Keys, click on a key, go to Export, and copy the public key
  4. Open your Facebook profile, go to About > Contact and Basic Info, and paste in the key in the PGP Public Key field
  5. Facebook will send you an encrypted notification. Mailvelope should turn your browser cursor into a golden key when you hover over the encrypted contents
  6. Tada!

You may also want to share your public key on keyservers like the MIT PGP Key Server or the PGP Global Directory. In principle, that will allow other people to send you encrypted email messages that only you can decrypt.

$7,500! Spark hackathon in Boston May 28-30

My team will be supporting the Spark hackathon in Boston on May 28-30.

This weekend’s a good chance to teach yourself Spark.

“Apache Spark is an open source processing engine built around speed, ease of use, and analytics. If you have large amounts of data that requires low latency processing that a typical Map Reduce program cannot provide, Spark is the alternative. Spark performs at speeds up to 100 times faster than Map Reduce for iterative algorithms or interactive data mining.”

Happy Memorial Day weekend!

How I made my Android Galaxy S3 phone run faster

My primary phone is a Samsung Galaxy S3 running Android 4.4.2 KitKat. It’s a little old at this point, but I’m disinclined to upgrade for the sake of upgrading. It was getting slower, so I’ve recently taken a number of steps to tune and optimize its performance.

Things I’ve done to make Android slower

1. Enable device encryption

If someone were to steal my phone without encryption, they would be able to hijack any number of my online accounts through email password resets. This is a frightening possibility, and the surest way to protect against it is to encrypt your device.

Device encryption would be mandatory if I chose to access work email on my phone, but that is something I choose to avoid.

2. Encrypt microSD storage

The Samsung Galaxy S2 through S5 are fantastic phones in that they have replaceable batteries and expandable storage. I’ve recently added a high-performance 32GB microSD card to my phone, and sensibly encrypted it.

Things I’ve done to make Android faster

1. Replace Samsung keyboard with anything else

Ever since I purchased my phone, it’s had an infuriating multi-second delay in bringing up the Samsung touch keyboard. I’ve recently switched to the Google Keyboard available in the Google Play app store, and the difference is night and day. The keyboard comes up instantaneously.

I’ve also heard great things about Swype.

2. Clear caches with CCleaner

Back when I was a Windows user, CCleaner was the one reputable, malware-free application for tuning Windows. They have an Android version that does a good job of clearing the many caches and temporary files of your Android apps.

3. Delete all photos

Dropbox automatically backs up all of my photos, so there’s absolutely no reason to keep them on disk. Having a full file system on your Android phone reduces file system and encryption performance.

I went into the Gallery app and bulk-deleted all my old photos.

4. Move applications to external storage

The built-in storage on your phone is not as performant as a quality microSD card. Many applications will let you relocate them after installation to the microSD storage.

5. Disable animated transitions

You can enable developer mode on Android without rooting your phone.

Once you do, you can go into Settings > More > Developer options and disable all useless animations. This is something I’ve done in the past on other operating systems. On Android, I again got better responsiveness and minimal difference in visual experience.

6. Get an extended battery

Samsung phones other than the newest Galaxy S6 have user-serviceable batteries. That means you can easily purchase a replacement battery, an external spare battery charger, or an extended battery.

Since I’ve gotten an extended battery, my phone lasts two whole days on a single charge. This is fantastic for my user experience.

Going beyond

One thing I haven’t done is root my device. If you do root it, you can uninstall all the bundleware cruft that Samsung puts on the phone.

Note that if you do intend to use your phone to access work email, there may be a policy against rooting your device.

Installing DB2 on Mac OS X

It’s useful to have a local installation of DB2 on Mac for development and test purposes. I have a local installation on my MacBook to develop DB2-backed Ruby on Rails applications.

IBM DB2 is a mature relational database server. It supports lots of neat things like SQL, XQuery for XML, SPARQL for RDF, full text search, and so on.

DB2 on Mac using Docker

A quicker option for getting up and running with DB2 on Mac is to use the DB2 Docker image. It runs DB2 in a virtualized Linux environment on your Mac, so you can skip the native install.

DB2 on Mac natively

Here are the current instructions for installing DB2 10.1 on Mac OS X, courtesy of my colleague Kevin Rose:

Instructions to install DB2 v10.1 on Mac OS X Yosemite:

Prerequisite: Xcode developer tools must be installed. These can be installed from the Mac App Store.

1. Ensure the following entries are in the /etc/sysctl.conf. Create the file /etc/sysctl.conf if it does not exist.


Restart your Mac after creating the file to make the values take effect.

2. Open a terminal with a shell for the user that will become instance owner.

3. Ensure that otool is in the path. Execute otool:


If the error is “command not found” then run the following

     export PATH=$PATH:/Applications/XCode.app/Contents/Developer/usr/bin

4. Extract the DB2 install image from the tar archive:

     tar -xzvf db2_v101_macos_expc.tar.gz

The image will be extracted into an expc directory.

5. Enter the expc directory and run the installer and perform a non-root install:

     cd expc
     ./db2_install
     # *** DO NOT RUN db2_install AS ROOT ***

This will install DB2 to the following default location: /Users/$(whoami)/sqllib

Execute step 6 if you need to enable connections for a userid other than the instance owner:

6. Enable OS authentication. (You need to be an Admin user to run these commands):

cd /Users/$(whoami)/sqllib/security
sudo chown root /Users/$(whoami)/sqllib/security/db2ckpw
sudo chmod u+rxs /Users/$(whoami)/sqllib/security/db2ckpw 
sudo chmod o+rx  /Users/$(whoami)/sqllib/security/db2ckpw

The instructions for starting DB2 and configuring remote access are the same as before.
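
As a small convenience, the otool check from step 3 can be written so that PATH is only extended when the command is actually missing. This is a sketch and not part of Kevin's instructions. The helper name ensure_on_path is made up, and the directory is simply the one given in step 3.

```shell
# ensure_on_path CMD DIR: append DIR to PATH only if CMD cannot be found.
# (Hypothetical helper for illustration, not part of the DB2 installer.)
ensure_on_path() {
  cmd=$1
  dir=$2
  if ! command -v "$cmd" >/dev/null 2>&1; then
    PATH="$PATH:$dir"
    export PATH
  fi
}

# Directory taken from step 3; adjust it if your developer tools live elsewhere.
ensure_on_path otool /Applications/XCode.app/Contents/Developer/usr/bin
```

Like the export in step 3, this only affects the current shell session.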

Free Apache Spark course with Docker image

My friends over at Big Data University just launched a refresh of their Apache Spark course. Spark is an engine for processing and mining large amounts of data quickly.

The course takes advantage of a Docker image for Spark. Docker provides a way to run Spark on a typical laptop without getting a beefy server.

Spark’s on my personal to-learn list, so I just pencilled a slot into my calendar to take the course myself.

Dockerize all the things

As my team creates new web apps and microservices in Rails, Meteor JS, and other stacks, we dockerize them. Docker is great because it enforces loose coupling, modular design, and process isolation.

A Docker container can be thought of as a very lightweight VM that shares a Linux kernel with its host. The infrastructure and tooling around it makes configuration management, scaling, and efficient use of resources easier.

My team’s also dockerized the Informix and DB2 database servers. The direct hardware access that Docker allows will give you better performance than you’d have seen with a VM equivalent in the past, while the container format allows for easy automation, operation, and management.

Unix command of the day: watch

The project I’m working on right now involves not just Dockerized Rails microservices, Meteor JS, and a data set measured in tens of terabytes, but also a big Bash code base.

Bash is a language that makes it easy to shoot yourself in the foot. I have some thoughts on how to write robust, modular, loosely coupled, unit tested Bash that go beyond Bash strict mode and shellcheck. However, let’s save those for later.

Here’s a useful shell command for today: watch

watch will run a command for you every few seconds and output the results on a clean screen. If you combine it with a split-screen or split-pane tool, you can quickly create a mini-dashboard.

For example, watch -n15 df -h will print your free disk space every 15 seconds:

Every 15.0s: df -h        Tue Apr 14 13:43:25 2015

Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3             1.8T  691G  1.1T  40% /
tmpfs                 253G   12K  253G   1% /dev/shm
/dev/sda1             248M   58M  178M  25% /boot
/dev/sdb1              30T   12T   18T  41% /my_disk

By way of another example, watch -n60 db2 list utilities show detail will check on the status of your DB2 load and other operations every 60 seconds.

Every 60.0s: db2 list utilities show detail            Mon Apr 13 12:06:24 2015

ID                               = 4
Type                             = LOAD
Database Name                    = MY_DB
Member Number                    = 0
Description                      = [LOADID: 1177.2015-04-13- (20;15)] 
Start Time                       = 04/13/2015 15:58:12.298788
State                            = Executing
Invocation Type                  = User
Progress Monitoring:
   Phase Number                  = 1
      Description                = SETUP
      Total Work                 = 0 bytes
      Completed Work             = 0 bytes
      Start Time                 = 04/13/2015 15:58:12.298799

   Phase Number [Current]        = 2
      Description                = LOAD
      Total Work                 = 4105362974 rows
      Completed Work             = 1069904365 rows
      Start Time                 = 04/13/2015 15:58:15.821695

   Phase Number                  = 3
      Description                = BUILD
      Total Work                 = 2 indexes
      Completed Work             = 0 indexes
      Start Time                 = Not Started

Mac OS X does not include watch by default. Assuming you have Homebrew, the following will install watch for you:

brew install watch
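
If you can't install watch at all, a plain shell loop gets you a rough approximation. The sketch below is my own stand-in and not how watch is implemented. The function name poll is made up, it does not clear the screen, and it stops after a fixed number of runs instead of looping forever.

```shell
# poll INTERVAL COUNT COMMAND...: run COMMAND every INTERVAL seconds, COUNT times.
# (Hypothetical helper; unlike watch it neither clears the screen nor runs forever.)
poll() {
  interval=$1
  count=$2
  shift 2
  i=0
  while [ "$i" -lt "$count" ]; do
    echo "--- $(date) ---"
    "$@"
    i=$(( i + 1 ))
    if [ "$i" -lt "$count" ]; then
      sleep "$interval"
    fi
  done
}

# Example: print free disk space three times, 15 seconds apart.
# poll 15 3 df -h
```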

libdb2.so.1: cannot open shared object file

I got this error starting a Ruby application:

/usr/local/rvm/gems/ruby-1.9.3-p547/gems/bundler-1.7.3/lib/bundler/runtime.rb:76:in `require': 
libdb2.so.1: cannot open shared object file: 
No such file or directory - 
/usr/local/rvm/gems/ruby-1.9.3-p547/extensions/x86_64-linux/1.9.1/ibm_db-2.5.11/ibm_db.so (LoadError)

A .so file is a shared library on Linux, equivalent to a .dll on Windows or a .dylib on Mac. Note that the error mentions two different libraries: ibm_db.so is present, while libdb2.so.1, which it depends on, is missing.

You can verify the dependencies using the ldd command:

$ ldd /usr/local/rvm/gems/ruby-1.9.3-p547/extensions/x86_64-linux/1.9.1/ibm_db-2.5.11/ibm_db.so
    linux-vdso.so.1 =>  (0x00007fff93545000)
    libruby.so.1.9 => /usr/local/rvm/rubies/ruby-1.9.3-p547/lib/libruby.so.1.9 (0x00007fb8ba5a5000)
    libdb2.so.1 => not found
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fb8ba37f000)
    librt.so.1 => /lib64/librt.so.1 (0x00007fb8ba177000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007fb8b9f72000)
    libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007fb8b9d3b000)
    libm.so.6 => /lib64/libm.so.6 (0x00007fb8b9ab7000)
    libc.so.6 => /lib64/libc.so.6 (0x00007fb8b9722000)
    /lib64/ld-linux-x86-64.so.2 (0x0000003730400000)
    libfreebl3.so => /lib64/libfreebl3.so (0x00007fb8b94ab000)
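
With a long dependency list it can be easier to filter the ldd output down to just the unresolved entries. Here is a minimal sketch, assuming the "name => not found" layout shown above. The helper name missing_libs is made up for this example.

```shell
# missing_libs: read ldd output on stdin and print the name of every
# library that the dynamic linker reports as "not found".
# (Hypothetical helper; it assumes the "name => not found" layout shown above.)
missing_libs() {
  awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}

# Example (the path is a placeholder, substitute your own .so file):
# ldd /path/to/ibm_db.so | missing_libs
```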

For me, the issue was caused by a missing IBM Data Server Driver directory. IBM_DB_LIB was pointing to a non-existent directory:

$ set | grep IBM

Reinstalling the Data Server Driver restored the libdb2.so.1 and eliminated the error.

(If you run into this issue with different libraries, you will likely need to examine your LD_LIBRARY_PATH environment variable, or run the ldconfig command to refresh the linker cache after changing the system library directories.)

How to run and install IPMIView on Mac

Edit: The download link doesn’t work anymore, but it looks like IPMIView is now available in the Mac App Store. Also, just in case, someone suggested running the Linux version on Mac.

Softlayer cloud uses IPMIView for direct console access to bare metal hardware. SuperMicro makes a Mac version of IPMIView available for download.

Bizarrely, SuperMicro doesn’t appear to have ever tested it, because double-clicking on the downloaded IPMIView20 application doesn’t do anything. This is because someone forgot to set the execute permission on the installer.

Supposing it’s in your downloads folder, open up Terminal and run the following commands:

cd /Users/$( whoami )/Downloads/IPMIView20.app/Contents/MacOS
chmod u+x IPMIView20

With the execute permission set, launching the IPMIView20 application will open the installer. Once it’s installed, it will show up in your Launchpad like a normal application.
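
If you want to see what the chmod changes before touching the real download, you can try it on a throwaway file. This is only a sketch: the temporary file stands in for the IPMIView20 binary, and the variable names are made up.

```shell
# Demonstrate the execute bit on a throwaway script (not the real installer).
demo=$(mktemp)
printf '#!/bin/sh\necho "installer would start here"\n' > "$demo"
chmod u-x "$demo"     # make sure it starts out non-executable

if [ -x "$demo" ]; then before=yes; else before=no; fi
chmod u+x "$demo"     # the same fix that is applied to IPMIView20
if [ -x "$demo" ]; then after=yes; else after=no; fi

echo "executable before: $before, after: $after"
rm -f "$demo"
```

The same [ -x ... ] test works on the IPMIView20 file itself if you want to confirm the fix took.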