Category: Database

  • Retiring Planet Db2 and Planet Big Data

    Planet Db2 and Planet Big Data are two blog aggregators that I created when I first started at IBM. They were similar to Planet Debian or Planet Intertwingly but about the Db2 relational database server and about big data technologies like Hadoop. If you are interested in taking over the blogrolls, please contact me.

    Blog aggregators are something from an earlier era of the web. We now live in a closed-off world of walled gardens and information silos like Facebook, Snapchat, and TikTok. In that earlier era, people ran their own blogs independently of any social network. The content was published in a machine-readable format that anyone or anything could pull to get notifications. Anyone could set up the equivalent of the Facebook Wall by using a tool to aggregate (that is, to federate) other people’s feeds.

    I still inhabit that lost world to some extent because I use NewsBlur to subscribe to RSS and Atom feeds for news sources like Ars Technica, webcomics like XKCD, and so on. NewsBlur is one of many successors to the late Google Reader. It’s great because I get exactly the content that I have curated myself, with none of the noise and none of the ads of Facebook or even Reddit.

    The Planets are based on the Planet Venus software created by Sam Ruby. The architecture is simple: a cron job runs at a regular interval to pull updates from all the blogs in the blogroll, then generates a static site that can be served by any web server, such as Apache or Nginx.

    I’m shutting the planets down because the current incarnation is too old to update in place, and I can’t justify spending the time myself to recreate the sites on new technology.

    If you are a member of the community who’s interested in taking over Planet Db2 or Planet Big Data, please contact me.

  • Command line client for Sentry (Bash)

    Sentry is a great error aggregation service. We use it for every service we deploy at work. It lets us monitor and troubleshoot incidents of errors. It also integrates nicely with Slack, a messaging tool we use for everything.

    Sentry integrates nicely with JavaScript, Ruby, and Python stacks among others, but as a RESTful service you can also access it directly from the command line.

    Sometimes, you start writing a Bash shell script that grows so much that you need to start logging errors in a central error aggregation service. Frankly, that’s a sign that you should have picked a different language for the initial implementation, whether Python or Ruby or something else more robust. However, once you have such a Bash script, porting it arguably becomes as problematic as instrumenting it.

    Bash client for Sentry

    A quick Google search turns up more feature-complete attempts at a command line client. You may want to follow the link and use that instead.

    Still, for posterity, here’s something I used to instrument such a Bash script last year:

    # Install dependencies on Alpine Linux
    apk --no-cache add --virtual build-dependencies gcc python-dev musl-dev
    pip2 install httpie
    
    # Transform a Sentry DSN into useful components
    trim() {
        local var="$*"
        # remove leading whitespace characters
        var="${var#"${var%%[![:space:]]*}"}"
        # remove trailing whitespace characters
        var="${var%"${var##*[![:space:]]}"}"
        echo -n "$var"
    }
    SENTRY_DSN=$(trim "${SENTRY_DSN:-}")
    SENTRY_KEY="$(echo "$SENTRY_DSN" | sed -E 's|^.*//(.*):.*|\1|')"
    SENTRY_SECRET="$(echo "$SENTRY_DSN" | sed -E 's|^.*:(.*)@.*|\1|')"
    SENTRY_PROJECT_ID="$(echo "$SENTRY_DSN" | sed -E 's|^.*/([0-9]*)|\1|')"
    SENTRY_URL_WITH_PROJECT="${SENTRY_DSN/${SENTRY_KEY}:${SENTRY_SECRET}@/}"
    # Strip the trailing /<project id> to get the base URL
    SENTRY_URL="${SENTRY_URL_WITH_PROJECT%/*}"
    
    # Bash function to report errors to Sentry
    # Usage:
    # report_error "${FUNCNAME[0]}:$LINENO" "Uh oh, spaghettios!"
    report_error() {
      [[ -z "${SENTRY_DSN:-}" ]] && return
    
      declare culprit
      declare timestamp
      declare message
      declare x_sentry_auth
      declare referer
      declare body
      declare url
      declare content_type
    
      culprit=${1:?}
      # Sentry treats timestamps without a timezone as UTC
      timestamp=$(date -u +%Y-%m-%dT%H:%M:%S)
      message=${2:?}
    
      x_sentry_auth="X-Sentry-Auth:Sentry sentry_version=5"
      x_sentry_auth="${x_sentry_auth:?},sentry_client=0.1.0"
      x_sentry_auth="${x_sentry_auth:?},sentry_timestamp=${timestamp:?}"
      x_sentry_auth="${x_sentry_auth:?},sentry_key=${SENTRY_KEY:?}"
      x_sentry_auth="${x_sentry_auth:?},sentry_secret=${SENTRY_SECRET:?}"
    
      referer="Referer:http://example.com/"
      content_type="Content-Type: application/json"
    
      url="${SENTRY_URL:?}/api/${SENTRY_PROJECT_ID:?}/store/"
    
      body=$(cat <<BODY
    {
      "culprit": "${culprit:?}",
      "timestamp": "${timestamp:?}",
      "message": "${message:?}",
      "tags": {
        "BACKUP_INTERVAL": "${BACKUP_INTERVAL:?}"
      },
      "exception": [{
        "type": "BackupError",
        "value": "${message:?}",
        "module": "${BASH_SOURCE[0]}"
      }]
    }
    BODY
    )
    
      echo "$body" | http POST "${url:?}" "${x_sentry_auth:?}" "${referer:?}" "${content_type:?}"
    }
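
    The DSN splitting at the top of the script can also be done with shell parameter expansion alone, which sidesteps the quoting pitfalls of sed. This is only a rough sketch, not what the script above uses: the function name parse_sentry_dsn and the sample DSN are made up for illustration, and it assumes the legacy DSN shape scheme://key:secret@host/project_id.

```shell
# Hypothetical helper: split a legacy-style Sentry DSN of the form
#   scheme://key:secret@host/project_id
# into its parts using only POSIX parameter expansion.
parse_sentry_dsn() {
  dsn=$1
  scheme=${dsn%%://*}        # e.g. https
  rest=${dsn#*://}           # key:secret@host/project_id
  creds=${rest%%@*}          # key:secret
  hostpath=${rest#*@}        # host/project_id

  echo "key=${creds%%:*}"
  echo "secret=${creds#*:}"
  echo "project_id=${hostpath##*/}"
  echo "url=${scheme}://${hostpath%/*}"
}

# Made-up DSN, for illustration only
parse_sentry_dsn "https://public:secret@sentry.example.com/42"
```

    For the sample DSN this prints key=public, secret=secret, project_id=42, and url=https://sentry.example.com, one per line. A DSN with a port number or a path prefix in front of the project ID would need more care.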
    

     

     

  • Fix full Ubuntu /boot partition

    Linux kernel images are stored on a separate partition mounted under /boot. This partition can fill up, at which point you can no longer install any software updates.

    Ubuntu (and possibly Debian and Mint) has a command called purge-old-kernels that helps to prevent you from ever getting into that situation. Similarly, RHEL/CentOS/Fedora have a command called package-cleanup.

    However, if your /boot partition is already full, purge-old-kernels won’t work. You will need to run something like the following:

    dpkg --list 'linux-image*' | cut -d' ' -f3 | grep linux-image | grep -v "$(uname -r)" | grep "[0-9]" | xargs dpkg -r --force-depends
    
    apt-get -fy install
    
    purge-old-kernels -y
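
    Because the first command force-removes packages, it may be worth previewing what it would select before running it. Here is a minimal sketch of such a dry run. The helper name list_removable_kernels is made up, and it uses awk rather than cut to pick out the package name column, so it is a variation on the pipeline above rather than a copy of it.

```shell
# Hypothetical helper: read `dpkg --list` output on stdin and print the
# versioned kernel image packages that do NOT match the kernel release
# passed as $1 (normally "$(uname -r)").
list_removable_kernels() {
  running=$1
  awk '{ print $2 }' \
    | grep '^linux-image-[0-9]' \
    | grep -vF -- "$running" \
    || true   # an empty result is not an error
}

# Preview only; nothing is removed here.
if command -v dpkg >/dev/null 2>&1; then
  dpkg --list 'linux-image*' | list_removable_kernels "$(uname -r)"
fi
```

    If the list looks right (in particular, if the running kernel is not in it), the same names are what the xargs dpkg -r --force-depends step would act on.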

     

     

  • #DUTO2015 conference

    I’m at the Data Unconference in Toronto today. Jarred Gaertner just gave a thought-provoking keynote on the ethics of big data, and I’m about to dig my hands into some open data sets in Richard Pietro’s hands-on session.

    My colleague Polong Lin will be on a panel about IBM’s data science tools this afternoon, which should be interesting if only because it’s hard to keep track of everything that’s out there.

    #DUTO2015 is sponsored by my friends at Big Data University.

  • Encrypt Gmail, Facebook emails with OpenPGP and Mailvelope

    Facebook just released a great use case for OpenPGP encryption in Gmail and other web email providers. You can now configure Facebook to encrypt all email it sends you with OpenPGP.

    Whether or not you use Facebook, it’s surprisingly easy to use Mailvelope to integrate OpenPGP with Gmail and other email providers. Mailvelope is a browser extension for Chrome and Firefox that lets you encrypt messages that you write and decrypt messages that you receive.

    The encryption can be done externally to the web email interface, so your email provider does not have access to the plain text of your email message.

    OpenPGP is based on public key cryptography. You have two keys — a public key you can share with everyone, and a private key that you keep secret. Everyone can use your public key to encrypt messages they send you, but only you can decrypt them using your private key.

    Why encrypt email? Email is generally transmitted in plain text across the internet, meaning a hostile party can intercept it. With the web (http) moving to encrypted connections for everything, email is left as an insecure communication medium. You as a user have to take active steps to make it secure.

    Here’s how you can transparently integrate OpenPGP encryption with Gmail and Facebook:

    1. Install Mailvelope
    2. Generate a new key in Mailvelope options
    3. Go to Display Keys, click on a key, go to Export, and copy the public key
    4. Open your Facebook profile, go to About > Contact and Basic Info, and paste in the key in the PGP Public Key field
    5. Facebook will send you an encrypted notification. Mailvelope should turn your browser cursor into a golden key when you hover your cursor over the encrypted contents
    6. Tada!

    You may also want to share your public key on keyservers like the MIT PGP Key Server or the PGP Global Directory. In principle, that will allow other people to send you encrypted email messages that only you can decrypt.

  • How I made my Android Galaxy S3 phone run faster

    My primary phone is a Samsung Galaxy S3 running Android 4.4.2 Kit Kat. It’s a little old at this point, but I’m disinclined to upgrade for the sake of upgrading. It was getting slower, so I’ve recently taken a number of steps to tune and optimize its performance.

    Things I’ve done to make Android slower

    1. Enable device encryption

    If someone were to steal my phone without encryption, they would be able to hijack any number of my online accounts through email password resets. This is a frightening possibility, and the surest way to protect against it is to encrypt your device.

    Device encryption would be mandatory if I chose to access work email on my phone, but that is something I choose to avoid.

    2. Encrypt microSD storage

    The Samsung Galaxy S2 through S5 are fantastic phones in that they have replaceable batteries and expandable storage. I’ve recently added a high-performance 32GB microSD card to my phone and, sensibly, encrypted it.

    Things I’ve done to make Android faster

    1. Replace Samsung keyboard with anything else

    Ever since I purchased my phone, it’s had an infuriating multi-second delay in bringing up the Samsung touch keyboard. I’ve recently switched to the Google Keyboard available in the Google Play app store, and the difference is night and day. The Keyboard comes up instantaneously.

    I’ve also heard great things about Swype.

    2. Clear caches with CCleaner

    Back when I was a Windows user, CCleaner was the one reputable, malware-free application for tuning Windows. They have an Android version that does a good job of clearing the many caches and temporary files of your Android apps.

    3. Delete all photos

    Dropbox automatically backs up all of my photos, so there’s absolutely no reason to keep them on disk. Having a full file system on your Android phone reduces file system and encryption performance.

    I went into the Gallery app and bulk-deleted all my old photos.

    4. Move applications to external storage

    The built-in storage on your phone is not as performant as a quality microSD card. Many applications will let you relocate them after installation to the microSD storage.

    5. Disable animated transitions

    You can enable developer mode on Android without jailbreaking.

    Once you do, you can go into Settings > More > Developer options and disable all useless animations. This is something I’ve done in the past on other operating systems. On Android, I again got better responsiveness and minimal difference in visual experience.
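
    If you would rather not tap through the menus, the same three animation scales can most likely be set over adb. This is a rough sketch under a few assumptions: USB debugging is enabled, adb is installed on your computer, and your Android build exposes the settings shell command with these global keys. The helper name set_animation_scale is made up, and its optional second argument swaps in another command so you can preview what would be sent.

```shell
# Hypothetical helper: set all three animation scales to the given value
# (0 disables animations, 1 restores the usual default).
# $1 = scale, $2 = command to run instead of adb (optional, for previews)
set_animation_scale() {
  scale=$1
  adb_cmd=${2:-adb}
  for key in window_animation_scale transition_animation_scale animator_duration_scale; do
    "$adb_cmd" shell settings put global "$key" "$scale"
  done
}

# Dry run: print the adb arguments instead of talking to a device.
set_animation_scale 0 echo
```

    Dropping the trailing echo sends the commands to the connected phone, and calling it with 1 should put the animations back.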

    6. Get an extended battery

    Samsung phones other than the newest Galaxy S6 have user-serviceable batteries. That means you can easily purchase a replacement battery, an external spare battery charger, or an extended battery.

    Since I’ve gotten an extended battery, my phone lasts two whole days on a single charge. This is fantastic for my user experience.

    Going beyond

    One thing I haven’t done is root or jailbreak my device. If you do jailbreak, you can uninstall all the bundleware cruft that Samsung puts on the phone.

    Note that if you do intend to use your phone to access work email, there may be a policy against jailbreaking your device.

  • Installing DB2 on Mac OS X

    It’s useful to have a local installation of DB2 on Mac for development and test purposes. I have a local installation on my MacBook to develop DB2-backed Ruby on Rails applications.

    IBM DB2 is a mature relational database server. It supports lots of neat things like SQL, XQuery for XML, SPARQL for RDF, full text search, and so on.

    DB2 on Mac using Docker

    A quicker option for getting running with DB2 on Mac is to use the DB2 docker image. It should get you started quickly with DB2 by running it in a virtualized Linux environment on your Mac.

    DB2 on Mac natively

    Here are the current instructions for installing DB2 10.1 on Mac OS X, courtesy of my colleague Kevin Rose:

    Instructions to install DB2 v10.1 on Mac OS X Yosemite:

    Prerequisite: Xcode developer tools must be installed. These can be installed from the Mac App Store.

    1. Ensure the following entries are in the /etc/sysctl.conf. Create the file /etc/sysctl.conf if it does not exist.

    kern.sysv.shmmax=1073741824
    kern.sysv.shmmin=1
    kern.sysv.shmmni=4096
    kern.sysv.shmseg=32
    kern.sysv.shmall=1179648
    kern.maxfilesperproc=65536
    kern.maxfiles=65536

    Restart your Mac after creating the file to make the values take effect.

    2. Open a terminal with a shell for the user that will become instance owner.

    3. Ensure that otool is in the path. Execute otool:

         otool

    If the error is “command not found” then run the following:

         export PATH=$PATH:/Applications/Xcode.app/Contents/Developer/usr/bin

    4. Extract the DB2 install image from the tar archive:

         tar -xzvf db2_v101_macos_expc.tar.gz

    The image will be extracted into an expc directory.

    5. Enter the expc directory and run the installer as a non-root user:

         cd expc
         # *** DO NOT RUN db2_install AS ROOT ***
         ./db2_install

    This will install DB2 to the following default location: /Users/$(whoami)/sqllib

    Execute step 6 if you need to enable connections for a userid other than the instance owner:

    6. Enable OS authentication. (You need to be an Admin user to run these commands):

    cd /Users/$(whoami)/sqllib/security
    sudo chown root /Users/$(whoami)/sqllib/security/db2ckpw
    sudo chmod u+rxs /Users/$(whoami)/sqllib/security/db2ckpw 
    sudo chmod o+rx  /Users/$(whoami)/sqllib/security/db2ckpw

    The instructions for starting DB2 and configuring remote access are the same as before.
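
    For step 1, hand-editing /etc/sysctl.conf is easy to get slightly wrong, so here is a small sketch that appends only the entries that are missing. The function name ensure_sysctl_entries is made up, and the choice to leave existing keys alone (even when their values differ from the ones listed above) is mine, so review the file afterwards.

```shell
# Hypothetical helper: append any of the kernel settings from step 1 that
# are missing from the given file. Existing keys are left untouched, even
# if their values differ.
ensure_sysctl_entries() {
  conf=$1
  touch "$conf"
  for entry in \
    kern.sysv.shmmax=1073741824 \
    kern.sysv.shmmin=1 \
    kern.sysv.shmmni=4096 \
    kern.sysv.shmseg=32 \
    kern.sysv.shmall=1179648 \
    kern.maxfilesperproc=65536 \
    kern.maxfiles=65536
  do
    key=${entry%%=*}
    grep -q "^${key}=" "$conf" || echo "$entry" >> "$conf"
  done
}

# Try it on a scratch file first. For real use, point it at
# /etc/sysctl.conf and run the script as root.
scratch=$(mktemp)
ensure_sysctl_entries "$scratch"
cat "$scratch"
rm -f "$scratch"
```

    Running it twice against the same file adds nothing the second time. The restart mentioned in step 1 is still needed for the values to take effect.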

  • Dockerize all the things

    As my team creates new web apps and microservices in Rails, Meteor JS, and other stacks, we dockerize them. Docker is great because it enforces loose coupling, modular design, and process isolation.

    A Docker container can be thought of as a very lightweight VM that shares a Linux kernel with its host. The infrastructure and tooling around it makes configuration management, scaling, and efficient use of resources easier.

    My team’s also dockerized the Informix and DB2 database servers. The direct hardware access that Docker allows will give you better performance than you’d have seen with a VM equivalent in the past, while the container format allows for easy automation, operation, and management.

  • libdb2.so.1: cannot open shared object file: No such file or directory – … ibm_db.so

    Got this error while deploying a Rails app on Nginx:

    libdb2.so.1: cannot open shared object file: No such file or directory - ... ibm_db.so

    This means that the ibm_db adapter is installed, but it can’t find the DB2 libraries. The issue is that IBM_DB_HOME and some other environment variables are not set.

    The best solution is to make sure all users have db2profile loaded. Edit /etc/profile and add:

    . /opt/dsdriver/db2profile

    You should now reload your profile (. /etc/profile) and restart Nginx.

    This assumes that you already have IBM Data Server Driver installed under /opt/dsdriver.
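
    A quick way to confirm that the profile took effect is to check for the variable and the library from a fresh shell. This is a rough diagnostic sketch: the function name check_db2_env is made up, and it assumes the driver keeps libdb2.so.1 under "$IBM_DB_HOME/lib", which may not match every installation.

```shell
# Hypothetical diagnostic: report whether the environment that db2profile
# is expected to set up is visible to the current shell.
# Assumption: the driver keeps libdb2.so.1 under "$IBM_DB_HOME/lib".
check_db2_env() {
  if [ -z "${IBM_DB_HOME:-}" ]; then
    echo "IBM_DB_HOME is not set"
    return 1
  fi
  if [ ! -e "${IBM_DB_HOME}/lib/libdb2.so.1" ]; then
    echo "libdb2.so.1 not found under ${IBM_DB_HOME}/lib"
    return 1
  fi
  echo "ok: ${IBM_DB_HOME}/lib/libdb2.so.1"
}

check_db2_env || true
```

    This only checks the shell you run it from. Nginx picks up its environment when it starts, which is why the restart above still matters.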

  • JRuby for the Java .class is .java_class

    I’ve been having a lot of fun working with Apache jclouds in JRuby. All the examples for the API are in Java and Clojure, while online JRuby docs could be better, so there’ve been some interesting translation challenges.

    I just had to re-Google what .class becomes in JRuby, so a quick note for the future.

    Java:

    template.getOptions().as(EC2TemplateOptions.class).keyPair("bluforcloud-test");

    JRuby:

    template.getOptions.as(EC2TemplateOptions.java_class).keyPair "bluforcloud-test"