Blog

  • My team at IBM Skills Network

    Some of the out-of-town team members visited in October. We wrapped up the workday at a local escape room here in Ontario. It was good to have everyone in the same room, or perhaps trapped in three separate rooms filled with puzzling situations, each room more inescapable than the last.

    At IBM Skills Network, we run education portals for companies and educational institutions, including CognitiveClass.ai. We also create the best-rated and most popular data science courses on Coursera and edX, and we operate a containerized lab environment for data science and machine learning.

  • Retiring Planet Db2 and Planet Big Data

    Planet Db2 and Planet Big Data are two blog aggregators that I created when I first started at IBM. They were similar to Planet Debian or Planet Intertwingly but about the Db2 relational database server and about big data technologies like Hadoop. If you are interested in taking over the blogrolls, please contact me.

    Blog aggregators are something from an earlier era of the web. We now live in a closed-off world of walled gardens and information silos like Facebook, Snapchat, and TikTok. In that earlier era, people ran their own blogs independently of any social network. The content was published in a machine-readable format that anyone or anything could poll for updates. Anyone could set up the equivalent of the Facebook Wall by using a tool to aggregate (that is, to federate) other people’s feeds.

    I still inhabit that lost world to some extent because I use NewsBlur to subscribe to RSS and Atom feeds for news sources like Ars Technica, webcomics like XKCD, and so on. NewsBlur is one of many successors to the late Google Reader. It’s great because I get exactly the content that I have myself curated as something that I care about with none of the noise and none of the ads of Facebook or even Reddit.

    The Planets are based on the Planet Venus software created by Sam Ruby. The architecture is simple: a cron job runs at a regular interval and pulls updates from every blog in the blogroll. It then generates a static site that can be served by any web server, such as Apache or Nginx.
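
    As a rough illustration of that architecture (this is not Planet Venus’s actual code), the sketch below merges entries from RSS 2.0 and Atom documents and renders a static HTML page, newest first. The function names are hypothetical, the feed text is assumed to have been fetched already, and the cron schedule mentioned in the comment is only an example.

```python
import html
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import parsedate_to_datetime

ATOM = "{http://www.w3.org/2005/Atom}"


def parse_feed(xml_text):
    """Return (published, title, link) tuples from an RSS 2.0 or Atom document.

    Assumes every entry carries a date, a title, and a link.
    """
    root = ET.fromstring(xml_text)
    entries = []
    # RSS 2.0: <item> elements with an RFC 822 <pubDate>
    for item in root.iter("item"):
        published = parsedate_to_datetime(item.findtext("pubDate"))
        entries.append((published, item.findtext("title"), item.findtext("link")))
    # Atom: namespaced <entry> elements with an ISO 8601 <updated>
    for entry in root.iter(ATOM + "entry"):
        updated = entry.findtext(ATOM + "updated").replace("Z", "+00:00")
        link = entry.find(ATOM + "link").get("href")
        entries.append(
            (datetime.fromisoformat(updated), entry.findtext(ATOM + "title"), link)
        )
    return entries


def render_page(feeds):
    """Merge all feeds and return one static HTML page, newest entry first."""
    entries = sorted(
        (entry for feed in feeds for entry in parse_feed(feed)),
        key=lambda entry: entry[0],
        reverse=True,
    )
    items = "\n".join(
        f'<li><a href="{html.escape(link)}">{html.escape(title)}</a> ({published:%Y-%m-%d})</li>'
        for published, title, link in entries
    )
    return f"<html><body><ul>\n{items}\n</ul></body></html>"

# A cron job (for example, every 30 minutes) would fetch each feed in the
# blogroll, call render_page(), and write the result where the web server
# can serve it as a plain file.
```

    Because the output is a plain file, the web server needs no application code at all, which is much of the appeal of this design.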

    I’m shutting the planets down because the current incarnation is too old to update in place, and I can’t justify spending the time myself to recreate the sites on new technology.

    If you are a member of the community who’s interested in taking over Planet Db2 or Planet Big Data, please contact me.

  • Skynet Golems in Cyberspace

    There is no stealth in space. You can see a rocket burn anywhere between here and past Pluto, or any room-temperature object in infrared at the same distance. You cannot hide a spaceship. It is much the same in cyberspace. Anyone on the internet can attack anyone else on the internet.

    Yet across the gulf of space, minds that are to our minds as ours are to those of the beasts that perish, intellects vast and cool and unsympathetic, regarded this earth with envious eyes, and slowly and surely drew their plans against us.

    H.G. Wells, The War of the Worlds

    All systems on the internet are continuously under attack. Your phone, your laptop, your smart refrigerator, your nannycam that you use to watch your children — they are all targets. Some of them have already been hacked and are now drones in a botnet, unbeknownst to you.

    There are many motivations for cybercrime. Sometimes people hack your system because it’s your system. Sometimes they do it because it’s a system. Sometimes they do it to use it as a tool against an entirely different target.

    Motivation is important because you need to understand a threat model to defend against it. Are you defending against your own government? Are you defending against the advanced persistent threat of a different government, like Google was in 2009? Are you defending against a hacker who is targeting you specifically and wants to spearphish you into wiring company funds to the wrong account? Are you defending against an undiscerning hacker who wants to cryptojack your system or hold your files for ransom? Are you defending against an abusive significant other who wants to stalk and control you? The best approach for one threat isn’t right for another.

    In certain circles, a lot of ink has been spilled on the AI explosion, the Singularity, and so on. From my perspective, any speculation in that regard has to make unjustifiable assumptions and tends to predict the unpredictable. More importantly, I do not fret about Skynet, because the Golems are already here.

    “I heard there was a golem who was made to dig a trench and they forgot about it and they only remembered it when there was all this water ’cos it had dug all the way to the river.” – Terry Pratchett, Feet of Clay

    A golem is a creature from Jewish folklore: a clay being, animated by an inscription, that follows directions. I’m thinking less of the mythological version and more of the modern interpretation, whether in the comic fantasy of Terry Pratchett or in the generic mythos of Dungeons and Dragons.

    The internet is filled with animated hammers ceaselessly hammering. Do you run an obsolete version of some software with known security vulnerabilities? The animated hammers will break in, because there is a directory of all known instances of that software on the internet, and there is a hammer hitting each one to see if it cracks or not. When the hammer breaks into one, it uses it to send out more hammers.

    Security by obscurity is increasingly impossible, because nothing is obscure. There is no stealth in space or in cyberspace. Everything has to be secure by default, because the window of time between vulnerable and hacked is ever narrowing.

    We live in an age of artificial stupidity. Perhaps someday soon we’ll build a human-level artificial stupidity, an artificial general stupidity if you will. Until then, we live in a world of animated hammers.

    [Image: animated marching hammers]

  • Python Library of the Day: retrying

    I’ve learned through extensive experience that Bash is the wrong choice for anything longer than a few lines. I needed to write a command line app, so I put one together in Python (Python 3 of course, as Python 2 is going away by 2020). In the process I discovered a new-to-me Python library called retrying.

    If you want to learn Python, check out the Python for Data Science course on Cognitive Class.

    retrying

    I needed my Python code to repeat a bunch of operations until they succeeded. It’s easy to write a naive loop for that, but the logic gets convoluted and makes the actual operation ugly to look at. By the time you do something three times over, you should automate.

    You can of course write an abstraction yourself, but for this sort of common problem it is best to use an existing library.

    [Image: XKCD comic on automation]

    The benefit to using an existing library is not just that someone else maintains it, but also that you benefit from the collective wisdom and experience of everyone else using the library. Computing is full of strange edge cases and unexpected security holes. These are harder to avoid when rolling your own abstraction.

    For my purpose, I found a Python library called retrying. It provides a simple decorator called @retry that you can apply to any function or method. The decorator also takes additional parameters so you can configure all the timeouts, intervals, exponential backoff, and smart exception handling that you want.
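
    To show what a decorator of this kind does, here is a minimal standard-library sketch. It is a hypothetical stand-in, not the retrying library’s implementation: the parameter names (max_attempts, wait_seconds, backoff, retry_on) are invented for illustration, and the library’s own options are documented in its README.

```python
import functools
import time


def retry(max_attempts=3, wait_seconds=0.1, backoff=2.0, retry_on=(Exception,)):
    """Hypothetical retry decorator with exponential backoff between attempts."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            delay = wait_seconds
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    # Out of attempts: let the last exception propagate.
                    if attempt == max_attempts:
                        raise
                    time.sleep(delay)
                    delay *= backoff  # wait longer before each new attempt
        return wrapper
    return decorator


calls = []


@retry(max_attempts=5, wait_seconds=0.01, retry_on=(ConnectionError,))
def flaky_operation():
    """Fails twice with ConnectionError, then succeeds on the third call."""
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("not yet")
    return "ok"
```

    The point of the decorator form is that flaky_operation contains only the operation itself; the retry policy sits on one line above it and can be changed without touching the body.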

    Kudos to everyone working on the library. It’s a great little tool.

  • Kudos to Firefox team on Quantum release

    The new Firefox Quantum release is incredibly fast. It feels faster than Chrome, faster than old Firefox, and faster than all the other browsers on my MacBook.

    Impressively, despite Firefox ditching the old extension model, all my extensions continue to work. I did have to manually reinstall the indispensable Tree Style Tab, but it works and Firefox is incredibly speedy.

    Kudos on the great effort!

    [Image: my Firefox extensions]

  • Cryptocurrency and irreversible transactions

    There’s a current news story about a wallet blunder freezing up $280,000,000 of Ether, a cryptocurrency. I try to avoid posting too much opinion on my blog, but I do have a view on this.

    Cryptocurrency

    A cryptocurrency like Bitcoin or Ether is based on the idea of unbreakable contracts and irreversible transactions. This is great in many contexts, but somewhat scary to me as a consumer, should I ever choose to pay for something using a cryptocurrency.

    If you want to know more about cryptocurrency and Blockchain, you should check out the Blockchain Essentials course on Cognitive Class.

    Mostly Harmless

    I think this Douglas Adams parable about the design problem of un-openable windows applies to many things in tech, including cryptocurrency:

    …all the windows in the buildings were built sealed shut. This is true.

    While the systems were being installed, a number of people who were going to work in the buildings found themselves having conversations with Breathe-o-Smart systems fitters which went something like this:

    “But what if we want to have the windows open?”

    “You won’t want to have the windows open with new Breathe-o-Smart.”

    “Yes but supposing we just wanted to have them open for a little bit?”

    “You won’t want to have them open even for a little bit. The new Breathe-o-Smart system will see to that.”

    “Hmmm.”

    “Enjoy Breathe-o-Smart!”

    “OK, so what if the Breathe-o-Smart breaks down or goes wrong or something?”

    “Ah! One of the smartest features of the Breathe-o-Smart is that it cannot possibly go wrong. So. No worries on that score. Enjoy your breathing now, and have a nice day.”

    It was, of course, as a result of the Great Ventilation and Telephone Riots of SrDt 3454, that all mechanical or electrical or quantum-mechanical or hydraulic or even wind, steam or piston-driven devices, are now required to have a certain legend emblazoned on them somewhere. It doesn’t matter how small the object is, the designers of the object have got to find a way of squeezing the legend in somewhere, because it is their attention which is being drawn to it rather than necessarily that of the user’s.

    The legend is this:

    “The major difference between a thing that might go wrong and a thing that cannot possibly go wrong is that when a thing that cannot possibly go wrong goes wrong it usually turns out to be impossible to get at or repair.”

  • Integrate your Rails app with Open edX SSO and OAuth2

    Earlier this year, I put together the omniauth-cognitiveclass gem for integrating Rails apps with Open edX SSO (single sign-on). Open edX is an open source platform for running massive open online courses (MOOCs). Ruby on Rails is a web application framework. I develop the services and infrastructure for IBM Cognitive Class, which is partly based on Open edX.

    Cognitive Class

    Cognitive Class has a whole bunch of learning paths and courses covering topics like data science, deep learning, and machine learning. My role on the team is to architect and develop the hands-on labs environment, which I think is one of the best in the industry. We provision a full suite of industry tools on demand for any student looking to do data science exercises.

    Open edX

    I’m assuming that your Open edX online course system is set up as an OAuth2 authentication provider.

    Ruby on Rails

    Behind the scenes at Cognitive Class, we use a mix of micro-services and web applications built in Ruby, Python, and Node.js to manage the infrastructure. Rails is a great framework for creating web services or web applications.

    The usual way to add authentication to Rails is using the devise and omniauth gems.

    Open edX SSO

    The omniauth-cognitiveclass gem is a plugin that extends omniauth to support Open edX SSO with OAuth2 as an authentication provider. I’ve deployed it in production with Cognitive Class, but it should work generally for any Open edX installation. Let me know if you run into any issues.

  • Brace expansion to match multiple files in Bash

    Bash has handy brace expansion powers that I’ve belatedly discovered.

    $ echo I love hippo{griffs,potamuses,dromes}
    I love hippogriffs hippopotamuses hippodromes

    For example, you can quickly diff a file with and without a suffix:

    $ echo diff .env{,.example}
    diff .env .env.example

    Or tail multiple log files:

    $ echo tail -f /var/log/{messages,secure}
    tail -f /var/log/messages /var/log/secure

    Bash brace expansion can do other things too, such as expanding a range with the .. operator: {1..5} expands to 1 2 3 4 5.

  • Command line client for Sentry (Bash)

    Sentry is a great error aggregation service. We use it for every service we deploy at work. It lets us monitor and troubleshoot incidents of errors. It also integrates nicely with Slack, a messaging tool we use for everything.

    It integrates nicely with JavaScript, Ruby, and Python stacks among others. As a RESTful service, it can also be accessed directly from the command line.

    Sometimes, you start writing a Bash shell script that grows so much that you need to start logging errors in a central error aggregation service. Frankly, that’s a sign that you should have picked a different language for the initial implementation, whether Python or Ruby or something else more robust. However, once you have such a Bash script, porting it arguably becomes as problematic as instrumenting it.

    Bash client for Sentry

    A quick Google search turns up more feature-complete attempts at a command line client. You may want to follow the link and use that instead.

    Still, for posterity, here’s something I used to instrument such a Bash script last year:

    # Install dependencies on Alpine Linux
    apk --no-cache add --virtual build-dependencies gcc python-dev musl-dev
    pip2 install httpie
    
    # Transform a Sentry DSN into useful components
    trim() {
        local var="$*"
        # remove leading whitespace characters
        var="${var#"${var%%[![:space:]]*}"}"
        # remove trailing whitespace characters
        var="${var%"${var##*[![:space:]]}"}"
        echo -n "$var"
    }
    SENTRY_DSN=$(trim "${SENTRY_DSN:-}")
    # A DSN looks like https://KEY:SECRET@HOST/PROJECT_ID
    SENTRY_KEY="$(echo "$SENTRY_DSN" | sed -E 's#^.*//(.*):.*#\1#')"
    SENTRY_SECRET="$(echo "$SENTRY_DSN" | sed -E 's#^.*:(.*)@.*#\1#')"
    SENTRY_PROJECT_ID="$(echo "$SENTRY_DSN" | sed -E 's#^.*/([0-9]*)#\1#')"
    # Drop the credentials, then the trailing /PROJECT_ID, to get the base URL
    SENTRY_URL_WITH_PROJECT="${SENTRY_DSN/${SENTRY_KEY}:${SENTRY_SECRET}@/}"
    SENTRY_URL="${SENTRY_URL_WITH_PROJECT%/*}"
    
    # Bash function to report errors to Sentry
    # Usage:
    # report_error "${FUNCNAME[0]}:$LINENO" "Uh oh, spaghettios!"
    report_error() {
      [[ -z "${SENTRY_DSN:-}" ]] && return
    
      declare culprit
      declare timestamp
      declare message
      declare x_sentry_auth
      declare referer
      declare body
      declare url
      declare content_type
    
      culprit=${1:?}
      timestamp=$(date +%Y-%m-%dT%H:%M:%S)
      message=${2:?}
    
      x_sentry_auth="X-Sentry-Auth:Sentry sentry_version=5"
      x_sentry_auth="${x_sentry_auth:?},sentry_client=0.1.0"
      x_sentry_auth="${x_sentry_auth:?},sentry_timestamp=${timestamp:?}"
      x_sentry_auth="${x_sentry_auth:?},sentry_key=${SENTRY_KEY:?}"
      x_sentry_auth="${x_sentry_auth:?},sentry_secret=${SENTRY_SECRET:?}"
    
      referer="Referer:http://example.com/"
      content_type="Content-Type: application/json"
    
      url="${SENTRY_URL:?}/api/${SENTRY_PROJECT_ID:?}/store/"
    
      body=$(cat <<BODY
    {
      "culprit": "${culprit:?}",
      "timestamp": "${timestamp:?}",
      "message": "${message:?}",
      "tags": {
        "BACKUP_INTERVAL": "${BACKUP_INTERVAL:?}"
      },
      "exception": [{
        "type": "BackupError",
        "value": "${message:?}",
        "module": "${BASH_SOURCE[0]}"
      }]
    }
    BODY
    )
    
      echo "$body" | http POST "${url:?}" "${x_sentry_auth:?}" "${referer:?}" "${content_type:?}"
    }
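
    For comparison, here is a minimal Python sketch of the DSN-splitting step from the script above, using only the standard library. The function name parse_sentry_dsn and the example DSN are hypothetical, and the /api/PROJECT_ID/store/ path is simply carried over from the script rather than checked against current Sentry documentation.

```python
from urllib.parse import urlsplit


def parse_sentry_dsn(dsn):
    """Split a DSN of the form scheme://KEY:SECRET@HOST/PROJECT_ID into parts."""
    parts = urlsplit(dsn.strip())
    # Everything before the last slash of the path is an optional prefix;
    # the last path segment is the project id.
    base_path, _, project_id = parts.path.rpartition("/")
    host = parts.hostname
    if parts.port:
        host = f"{host}:{parts.port}"
    base_url = f"{parts.scheme}://{host}{base_path}"
    return {
        "key": parts.username,
        "secret": parts.password,  # None when the DSN carries no secret
        "project_id": project_id,
        "base_url": base_url,
        "store_url": f"{base_url}/api/{project_id}/store/",
    }
```

    Letting a URL parser do the splitting avoids the regular expressions entirely, and a DSN whose host includes a port or digits is handled without special cases.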
    

  • Update OpenLDAP SSL certificate on CentOS 6

    You may need to update your OpenLDAP SSL certificate, as well as the CA certificate and signing key on a regular basis. I ran into an issue that was ultimately resolved by doing that.

    Connections to an OpenLDAP server I administer stopped working with this error:

    ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)

    The server itself was up and the relevant ports were accessible. In fact, unencrypted LDAP continued to work while LDAPS saw the error above.

    I restarted slapd with the -d 255 flag (-d 8 is sufficient for this error) and started seeing this error:

    TLS: error: could not initialize moznss security context - error -5925:The one-time function was previously called and failed. Its error code is no longer available
    TLS: can't create ssl handle.

    At the start of the log, I saw several related errors including this one:

    ... is not valid - error -8181:Peer's Certificate has expired..

    Ultimately this meant that I had to replace not just my certificate but also the CA certificate and the signing key in OpenLDAP’s moznss database. I believe my CA’s certificate had to be replaced because of the SHA-1 retirement last year.

    The steps I had to follow were surprisingly involved and undocumented:

    • Upload the new certificates to /etc/openldap/ssl
    • cd /etc/openldap/certs
    • List the existing certificates in the database:
    certutil -L -d .
    • Remove the existing certs:
    certutil -D -d . -n "OpenLDAP Server"
    
    certutil -D -d . -n "My CA Certificate"
    • Load the new OpenLDAP SSL certificate and CA certificate:
    certutil -A -n "OpenLDAP Server" -t CTu,u,u -d . -a -i ../ssl/my_certificate.bundle.crt
    
    certutil -A -n "My CA Certificate" -t CT,C,c -d . -a -i ../ssl/my_CA_certificate.intermediate.crt
    • Verify:
    certutil -L -d .
    • Convert the key to pkcs12 format:
    openssl pkcs12 -export -out ../ssl/my_certificate.key.pkcs12 -inkey ../ssl/my_certificate.key -in ../ssl/my_certificate.bundle.crt -certfile ../ssl/my_CA_certificate.intermediate.crt
    • Import the signing key:
    pk12util -i ../ssl/my_certificate.key.pkcs12 -d .
    
    # Database password is in /etc/openldap/certs/password
    # Key password is the export password you set in the previous step
    • Restart slapd:
    service slapd restart

    I hope that’s enough to help anyone facing the same problem on CentOS, RHEL, Fedora, and possibly other distros.

    See Also