Tune the method or the question?

Last modified on 2019-01-26

When I was in primary school, my math teacher asked me to answer this question:

A ship is 40 meters long, 15 meters wide, and 10 meters high.
How old is the captain?

Luckily, the same teacher previously taught me how to add and subtract numbers! Immediately, I started doing the math.

40 + 15 + 10 years old?

40 + 15 - 10 years old?

15 - 10 - 40 years old?

Even though I was not quite sure, I felt there must be a way! Something was just odd about this question, though.

Fast-forward 25 years

Recently, a client asked me a very similar question. At a higher level, it was not very different from the one I already knew:

A ship is 40 meters long, 15 meters wide, and 10 meters high.
How old is the captain?

I was glad my mathematical tool set had expanded in the meantime. I now also know Python and some machine learning libraries!

So how about the following project proposal: Why not scrape Wikipedia for famous ships' sizes? Next, scrape and structure information on their captains' years of birth. If we only find the captains' names, why not correlate additional data? Age could be estimated from when certain names were commonly given to babies. Combined, these data would make a decent feature matrix or two! Finally, we could predict the captain's age with a regression model. We know that our estimate is uncertain, though. So let us use a type of model that reports its uncertainty, e.g. a Bayesian one.
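The final modeling step of that proposal could be sketched roughly as follows. This is only an illustration: the data is entirely made up, and scikit-learn's BayesianRidge is just one possible choice of a Bayesian regression model that returns predictive uncertainty.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

# Hypothetical feature matrix: [length, width, height] in meters per ship.
# In the imagined project, these rows would come from scraping Wikipedia.
X = np.array([
    [300.0, 40.0, 60.0],
    [250.0, 35.0, 50.0],
    [120.0, 20.0, 30.0],
    [ 90.0, 15.0, 25.0],
    [200.0, 30.0, 45.0],
])
# Hypothetical captains' ages, invented purely for illustration
y = np.array([55.0, 52.0, 45.0, 40.0, 50.0])

# A Bayesian linear model gives a predictive distribution,
# not just a point estimate
model = BayesianRidge()
model.fit(X, y)

# "Predict" the age of the captain of a 40 x 15 x 10 meter ship,
# together with the standard deviation of the prediction
mean, std = model.predict(np.array([[40.0, 15.0, 10.0]]), return_std=True)
print(f"Estimated age: {mean[0]:.1f} +/- {std[0]:.1f} years")
```

Of course, a model like this will happily produce a number with an uncertainty band, which is exactly the trap: the output looks rigorous regardless of whether the question made sense.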

Fast-forward 5 minutes

After this brief moment of triumph, I thought of that day in primary school. Maybe, again, something was odd about the question.

It seems common that we feel quite invigorated by our new methods. Sometimes we just overlook the elephant in the room, which in this case is: Are we trying to answer the right question in the first place?

My conclusion to this, for now: let us invest at least as much energy into tuning questions as into tuning methods.