Category Archives: Announcement

Posting more frequently

I’ve been really enjoying Rafe Colburn’s technical blog since he made his pledge to post more frequently. It makes a lot of sense for a technical blog to also have linkblogging with brief commentary within the same stream of content. I would argue that the appeal of sites like Reddit and Hacker News relates to people doing the same en masse.

Naturally, I’ve also been doing some techie linkblogging on my Twitter account.

CASCON 2012

On November 6, 2012, I’m teaching a hands-on lab at CASCON together with Bradley Steinfeld and Marius Butuc. The lab is called Crunching Big Data with Hadoop and BigInsights in the Cloud. The lab is based on the Hadoop Fundamentals course at Big Data University.

Morning

1.0 Welcome
1.1 What is Big Data?
1.2 Lab Setup
- Setup Lab
- Setup Lab (PDF Download)
1.3 What is Hadoop?
1.4 Hadoop Architecture – HDFS
- HDFS Lab
- Lab (PDF Download)
1.5 Hadoop Architecture – MapReduce
- MapReduce Lab
- Lab (PDF Download)

Afternoon

1.6 Pig, Hive, and Jaql
- Pig, Hive, and Jaql Lab
- Lab (PDF Download)
1.8 Working with BigInsights
- Web Console Lab
- Web Console Lab (PDF Download)
1.9 Data Discovery with BigSheets

Module 1.7 covers Flume. It’s available for free on Big Data University.

Dehacking this blog

The first rule of security is, of course, to assume that everything is compromised. If some code is compromised, everything is compromised. The correct response to a hacked WordPress is to nuke all the code.

My WordPress installation was recently compromised. There’s a limit to how far I can apply the principle because this particular WordPress is currently on shared hosting, but all code I have access to is now nuked. WordPress has been reinstalled from scratch, and all the various hanger-on sites that had accumulated in the same hosting account are now no more.

I’ve also adopted the pertinent steps from My WordPress Site Was Hacked, Hardening WordPress, and the Ultimate Security Checker plugin (guide).

Last line of defense:

The attack’s objective was to inject PHP code into various pages. The code was obfuscated via a double pass through those two functions. The two shell commands above will show any instances of those two functions.
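The same kind of check can also be sketched in Python. This is only an illustration under stated assumptions: the function names `eval` and `base64_decode` are my assumption, picked because they are commonly abused for this sort of obfuscation, and `find_suspicious_calls` is a hypothetical helper, not part of WordPress or any plugin. Substitute whichever functions you are actually hunting for.

```python
import re
from pathlib import Path

# Assumed function names, for illustration only; adjust to your own case.
SUSPICIOUS_FUNCTIONS = ("eval", "base64_decode")


def find_suspicious_calls(root, functions=SUSPICIOUS_FUNCTIONS):
    """Return (path, line number, line) for each PHP line that calls a listed function."""
    # Match the function name as a whole word followed by an opening parenthesis.
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, functions)) + r")\s*\(")
    hits = []
    for path in sorted(Path(root).rglob("*.php")):
        # Injected code is not always valid UTF-8, so replace undecodable bytes.
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), number, line.strip()))
    return hits


if __name__ == "__main__":
    for path, number, line in find_suspicious_calls("."):
        print(f"{path}:{number}: {line[:120]}")
```

Expect some false positives: plenty of legitimate code calls these functions, so each hit still needs a human look.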

Materials for the Hadoop workshop at CASCON

This is the syllabus for the workshop I’m chairing at CASCON 2011 with @mariusbutuc and @bsteinfe. If you’re interested, you can also take the course at your own pace online at BigDataUniversity.

SSH

Attendees will be provided with access to machines running Hadoop in a cloud environment. The necessary SSH credentials will be provided in class.

Materials

Chairing a Hadoop workshop at CASCON 2011

I’ll be chairing the Crunching Big Data in the Cloud with Hadoop and BigInsights workshop at CASCON 2011 in Toronto on Wednesday, November 9th. @BSteinfe and @MariusButuc will be joining me as co-chairs.

The workshop will be an all-day, hands-on introduction to Hadoop, HDFS, MapReduce, Hive, and JAQL. The plan is to have Hadoop clusters ready and running in the cloud for the various exercises.

Hadoop is a parallelized data processing framework. It lends itself very nicely to running in cloud environments like Amazon EC2 and IBM SCE, as the core concept is to split sophisticated queries across clusters of commodity hardware. On a basic level it’s an implementation of MapReduce in Java, but a great many tools in its ecosystem make it easy to formulate and execute queries on the fly.
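To give a feel for the MapReduce model itself, here is a minimal single-process sketch in Python. It is not Hadoop's API and runs nothing in parallel; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration. It only shows the three conceptual steps using the classic word-count example.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big cluster"]))
```

In a real cluster, the map and reduce steps would run on many machines at once, with the framework handling the shuffle between them. The sketch keeps everything in one process so the data flow is easy to follow.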

The material will have some things in common with the free Hadoop Fundamentals course you can take on Big Data University today, though naturally adapted for the CASCON themes and with added hands-on instruction.

Next steps

Worst Google Translation ever

I was writing a response to a forum post in Russian and thought to run it through Google Translate for verification. If nothing else, it would catch the sort of misspelling I tend to make.

It surprised me by completely reversing the meaning of what I wrote:


That’s the complete opposite meaning!

This is the first time I’ve been irritated enough to correct a Google translation. It’s a common word, so I’m not sure what could have led to the misapprehension on Google’s part.

Adopting a new WordPress theme

After a pointer from rc3, I read an interesting article earlier today:

In short, 9 of the top 10 Google search results for free WordPress themes provide themes full of malware and spammy links. The one site that doesn’t is the official site. Unfortunately, I have to say from experience that the free themes on the official site are consistently poor in quality.

You can verify that your current theme is free of malware by using the Theme-Check and Theme Authenticity Checker plugins.

The theme I was using before was clean, but the design quality was low. I began to consider buying a quality theme somewhere, but the article did point out two decent sites that have some quality free themes:

I can’t vouch for them, as all I have to go on is the word of that article. I did end up adopting the free TypeBased theme from the latter site, and I am very happy with it so far. It’s well-designed, polished, and it integrates nicely with WordPress 3.0.

Oddly, Theme-Check does flag TypeBased as using base64_encode() and base64_decode() functions, but from what I can tell it’s in the legitimate context of an FTP API.