Some fun with D3.js

player-comparer

I’ve been intrigued by D3.js, and by the impressive visualizations people make with it, for quite some time, and I finally got to try it out for myself. I wanted to write this post to share my experiences and revelations from the perspective of someone who once spent a fair amount of time with Matlab/Octave and my personal fave, matplotlib.

The first major point I want to make is that D3.js is not a plotting library. It should not be compared directly to Matlab/Octave or matplotlib because they serve different purposes. D3.js is a visualization library. Sure, that visualization may be a graph, but you shouldn’t be playing with D3.js until you already understand your data and simply want to show it off. While you are still exploring and figuring out your data, matplotlib is going to be way faster and more effective (note that there are wrappers like NVD3 that build standard plotting functions on top of D3.js, if JavaScript is really your thing).

Once you’ve figured out your data and it’s time to show off, D3.js really shines. In my opinion the main reason is a fundamental shift in thinking: rather than calling plot with an array of x values and an array of y values, I bind an array of JavaScript objects to my plot and then tell it which attribute of each object to use for what. Essentially you are pulling more data into the plot, and you can write code against it at each level. Let me show you the snippet that made me realize how powerful this shift in thinking is:

// data, barWidth, height and the y scale are defined elsewhere.
var chart = d3.select(".chart");

// d3.tip comes from the d3-tip plugin; the tooltip HTML is built from the bound object.
var tip = d3.tip()
    .attr('class', 'd3-tip')
    .offset([-10, 0])
    .html(function(d) { return "<span>" + d.text + "</span>"; });

chart.call(tip);

// One <g> per object in data, shifted right by its index.
var bar = chart.selectAll("g")
    .data(data)
  .enter().append("g")
    .attr("transform", function(d, i) { return "translate(" + i * barWidth + ",0)"; });

bar.append("rect")
    .attr("y", function(d) { return y(d.value); })
    .attr("height", function(d) { return height - y(d.value); })
    .attr("width", barWidth - 1)
    .on('mouseover', tip.show)
    .on('mouseout', tip.hide);

The important parts to note are these. I grab the y position and height of each rect from d.value, where d is one of the objects in the data array I am plotting. So far this is pretty standard, but look at the tip function that gets called on mouseover: it shows a span which contains d.text! I’ve attached another piece of data to each object, called text, that can be used to display more information. This was the wow moment for me, because this kind of thing doesn’t come naturally in matplotlib, and showing a simple tooltip is really only the beginning of what you can do with all this added context.

Also something about plots with a :hover effect is kind of awesome!

If you haven’t played with D3 yet and have some cool data, I definitely recommend it. I used it to make some pretty sweet visualizations/tools for our Parity Ultimate Frisbee League here in Ottawa, check ’em out:

trade-dashboard

https://github.com/kevinhughes27/ocua-parity-league

Building my first Shopify App

A few weeks ago, in my first quarterly hackathon at Shopify, I joined a team that was building a Shopify App to help charities issue donation receipts for orders on their store. We got pretty far during the hackathon and afterwards I kept working on it in my free time. It was a good way to dogfood our API and tooling, which are my responsibility at work.

I finally finished the app and a few days ago it launched on the Shopify App Store. The app uses webhooks to automate sending customers tax receipts for their donations to a non-profit Shopify store. It’s a pretty cool little app and a good example of how to build a simple piece of automation using webhooks.

I knew the scope of the app was going to be small, so I wanted to pick an appropriately minimalist framework instead of Rails. I went with Sinatra and ended up extracting a small gem, shopify-sinatra-app, for others to use. The app itself is also open source; you can check out all the code and follow the ongoing development and maintenance here.

Testing javascript with python

I was recently tasked with adding Mailcheck.js to some of our production pages and I want to describe a bit of the process I went through because I did some things a bit differently and had some fun along the way.

Let’s start with a PSA - do not simply drop Mailcheck onto your website as is! In my experience the default algorithm is way too greedy: it will suggest that almost every email should be ____@gmail.com. It is worth taking the time to tweak Mailcheck for your particular userbase, because no one wants to see a correction for an email address that is already correct!

The first thing I did was dump a ton of emails from our database to create a dataset to work with. I could have used Node to write some scripts to test out the Mailcheck behaviour, but Python is just so much more convenient for doing numerical analysis. Plus it’s what our data team uses, so I could leverage some of their knowledge and code. So now for the fun part - I ended up using PyV8 (a Python wrapper for calling out to Google’s V8 JavaScript engine). With this setup I was able to slice and dice through our production emails using Python and pandas, calling the exact JavaScript Mailcheck algorithm and collecting my results. After tweaking the algorithm I could take the settings and new JS code and put them in production.

Check out this wacky franken script that got the job done (pandas not included):

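import json

import PyV8


def init_mailcheck():
  # Keep one V8 context alive for the whole run and load the library into it.
  global ctxt
  ctxt = PyV8.JSContext()
  ctxt.enter()
  with open("mailcheck.js") as f:
    ctxt.eval(f.read())


def js_str(value):
  # A JSON string literal is also a valid JavaScript string literal, so
  # quotes or backslashes in an email cannot break the generated script.
  return json.dumps(value)


def run_sift3Distance(s1, s2):
  script = "Mailcheck.mailcheck.sift3Distance(%s, %s)" % (js_str(s1), js_str(s2))
  return ctxt.eval(script)


def run_splitEmail(email):
  script = "Mailcheck.mailcheck.splitEmail(%s)" % js_str(email)
  return ctxt.eval(script)


def run_mailcheck(email):
  script = "Mailcheck.mailcheck.run({ email: %s })" % js_str(email)
  result = ctxt.eval(script)
  if result:
    try:
      # Rebuild the suggested address from the returned object.
      result = result.address + '@' + result.domain
    except AttributeError:
      pass

  return result


if __name__ == "__main__":
  init_mailcheck()
  print(run_mailcheck("kevinhughes27@gmil.com"))
  # >>> kevinhughes27@gmail.com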
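
To give a feel for the kind of analysis loop this setup enables, here is a rough sketch that counts how often a suggestion fires for each email domain. It is not the original analysis: it uses collections.Counter instead of pandas, the sample addresses are invented, and fake_suggest is a hypothetical, deliberately greedy stand-in so the sketch runs without PyV8. In practice you would pass in something like run_mailcheck instead.

```python
from collections import Counter


def suggestion_rates(emails, suggest):
    """Return {domain: fraction of addresses that triggered a suggestion}."""
    seen = Counter()
    flagged = Counter()
    for email in emails:
        domain = email.rsplit("@", 1)[-1].lower()
        seen[domain] += 1
        if suggest(email):  # any truthy result counts as "a correction was offered"
            flagged[domain] += 1
    return {domain: flagged[domain] / float(seen[domain]) for domain in seen}


def fake_suggest(email):
    # Hypothetical stand-in for the real algorithm (an assumption, not Mailcheck):
    # it pushes every domain starting with "gm" towards gmail.com.
    local, _, domain = email.rpartition("@")
    if domain != "gmail.com" and domain.startswith("gm"):
        return local + "@gmail.com"
    return None


# Invented sample data; gmx.com is a legitimate provider that should not be "corrected".
sample = ["a@gmail.com", "b@gmil.com", "c@gmx.com", "d@gmx.com", "e@example.org"]
rates = suggestion_rates(sample, fake_suggest)

for domain, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print("%-12s %.0f%% of addresses got a suggestion" % (domain, rate * 100))
```

A legitimate domain with a high suggestion rate is exactly the kind of false positive worth tuning away before shipping.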