Thinking Books

I was a research assistant at two universities in the 1990s. So I spent a lot of time loading and crunching data on mainframes with SAS, then I’d take stacks of greenbar fanfold printouts to my bosses (professors and/or policy institute types with PhDs) and we’d pore over summary statistics, plots, regression output, etc. I enjoyed seeing the way their minds worked as they interpreted the results. They had a way of questioning what appeared to be strong results, framing stories about what was actually happening, and identifying the next steps in analysis to send me off to do.

As I read current books on data analysis, data science, big data, whatever, there always seems to be a hand-wave when it comes to the steps of output interpretation, framing, questioning, discrimination, etc. Something like “This step is beyond the scope of this book, go figure it out somewhere else, now let’s get back to R and Python code.” I know that the “somewhere else” comes in part from experience with subject matter, but it seems to me there should be some structured, if not formal, ways to approach building this experience.

So over the past year I accumulated a list of books toward this goal. If I read an article that mentioned a book relevant to thinking about interpreting data, I’d put that book in the list. Christmas came last month and brought some sweet Amazon gift certificates my way, so I bought the books on my list and dug into them. They’ve been great! By now, I’m pretty sure people around me are already tired of hearing me say things like “You know, according to Kahneman…” or “Nate Silver has a related story to that” or “Nassim Taleb would tell you that…” Anyway, the books are listed below from most favorite to least, but each of them was good enough that I’d buy and read it again.

The Signal and the Noise: Why So Many Predictions Fail — but Some Don’t – by Nate Silver. This was probably the first book that started me down the rabbit hole of this list since I regularly read Silver’s 538 blog.

Fooled by Randomness – by Nassim Nicholas Taleb. This is older than his popular Black Swan book but I liked it more. This book’s theme is that events we think have a cause may just be due to chance.

Thinking, Fast and Slow – by Daniel Kahneman. If you have a background in psychology or economics (or like me, both) then you’ve probably heard “Kahneman & Tversky” muttered by professors several times. This book deals with fast and slow thinking, aka System 1 and System 2 thinking, or thinking about dealing with a bear versus deciding whether to get a data science degree.

The Black Swan: The Impact of the Highly Improbable – another Nassim Nicholas Taleb book, this one on how to not be a turkey.

How We Know What Isn’t So: The Fallibility of Human Reason in Everyday Life – by Thomas Gilovich. Gilovich worked with Kahneman but there’s almost no overlap with Kahneman’s book above, so get them both.

Thinking in Time: The Uses of History for Decision-Makers – by Richard Neustadt & Ernest May. The authors intended this as a book for policy makers and government employees, but I think the material generalizes to any situation.

I’d enjoy hearing your recommendations on other similar books I could add to my reading list. One that I’ve been considering is Thinking with Data by Max Shron, so if you’ve read it please let me know what you thought about it.


GUI Phooey

Sometimes a complex undertaking can be avoided by asking the right questions. The best example of this for me was when a client asked me to meet with the head of operations to discuss a potential project; he wanted a GUI interface on one of his department’s batch processes.

So the head of operations takes me to his office to show me the problem. He fires off the batch process from a command line, then tells me “Now I wait 2 hours for it to finish. For those 2 hours the operations team has no idea what stage it’s at, how far it’s progressed, how much longer it might take, etc. Plus, people drive us crazy for those 2 hours stopping by and asking how far the batch job has progressed, why is it taking so long, when will it be done, etc. and we have nothing to show them.”

The request made a lot of sense to me, but I also knew how much work it would be to create such a GUI interface or dashboard, as well as the new complexity it would add to the system. So I asked the head of operations the following questions…

Me: Would you need this GUI solution if the batch completed in 2 minutes instead of 2 hours?

Ops: No, I wouldn’t need anything else!

Me: What if it completed in 10 minutes?

Ops: No, that would be fine.

Me: How about if it completed in 15 minutes?

Ops: Hmmm, I’d have to think about that, maybe, I’m not sure.

Now I knew the pain threshold. So I asked if I could have a week to try getting the batch completion time down to 10 minutes just by performance tuning the database operations. I got the go-ahead to try performance tuning since the client had been anticipating a month of work to build the GUI interface.

Of course, it took very little time to find a frightening query that was consuming most of the 2 hours. After a couple of days I got the batch completion time to under 10 minutes just by tuning some SQL.
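The actual query and schema are long gone, but the shape of the check is easy to sketch with SQLite (the table, column, and index names here are all made up): before an index exists, the planner scans the whole table; after adding one, it does an index search.

```shell
# Sketch: compare query plans before and after adding an index (SQLite).
rm -f /tmp/demo.db
sqlite3 /tmp/demo.db <<'SQL'
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
-- No index yet: the plan reports a full table SCAN.
EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42;
CREATE INDEX idx_orders_customer ON orders(customer_id);
-- With the index: the plan reports SEARCH ... USING INDEX.
EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42;
SQL
```

In the real engagement the fix was tuning SQL in the client’s own database, not SQLite; the point is just that the query planner will tell you when a query is doing far more work than it needs to.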

I know it annoys people when I ask questions and push back on their ideas, but it can pay off!


Ops Books


These books are never far from my desk

My operations management experience was needed for a consulting project last year. I had been the head of a technology team before joining President Obama’s re-election campaign in 2011, plus I worked in management early in my career, so I had some books nearby to help get my head back into the operations and management realm.

While thumbing through these books looking for excerpts or chapters that might help, there were four of them that I decided to re-read from cover-to-cover. Even though some of them were published years ago, their material still seems relevant and useful. These four books plus a couple more are listed below.

Web Operations: Keeping the Data on Time by John Allspaw – This is one of those books of essays by various experts that O’Reilly seems to like churning out. For example, Baron Schwartz wrote the chapter on databases. In other words, this is a good book.

Scalable Internet Architectures by Theo Schlossnagle – Examples in this book involve very large systems, but this is a good book to have around even if you aren’t working with large environments. I’d also say that it’s a useful book for admins, developers and management.

Release It!: Design and Deploy Production-Ready Software by Michael T. Nygard – The first three sections on stability, capacity and design build up to the final section on operations. It’s easy to think this is a book on software development from its title and description, but it’s a valuable ops book as well.

The Visible Ops Handbook by Kevin Behr et al. – The subtitle is “Implementing ITIL in 4 Practical and Auditable Steps” where ITIL stands for Information Technology Infrastructure Library. Sounds like a real page-turner, right? Actually, it is… I’ve highlighted something on almost every page.

The Goal: A Process of Ongoing Improvement by Eliyahu Goldratt – This book is a vehicle for the author’s Theory of Constraints (TOC), which is discussed in The Visible Ops Handbook, so I’ll list it here as well. The Goal was already an old book when I read it for the first time in 1995. It’s written in the form of a novel about the manager of a plant where everything is always going wrong, and the lessons he learns from a scientist friend who is trying to help him see the non-intuitive ways to solve his problems.

Learning from First Responders: When Your Systems Have to Work by Dylan Richard – Dylan was one of my amazing managers at President Obama’s re-election campaign. Also, this book is free. Need I say more?


Git ‘store’ credential helper with encrypted partition

Tonight I was setting up git on a new Linux box so that it can access GitHub. I enabled two-factor authentication on my GitHub account almost a year ago; some great instructions for doing this are available here. I had been using “credential.helper cache” for storing my credentials on Linux machines, but this is a temporary store that by default caches your credentials for 15 minutes. I could increase that default, but it’s still going to be temporary. On my MacBook, I use the OS X Keychain to store these credentials permanently, which has unfortunately made me lazy. I wanted a way to store these credentials safely on my Linux box so that I didn’t have to type them in repeatedly.
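For reference, raising the cache helper’s timeout is just a one-line config change. A minimal sketch, using a scratch config file so a real ~/.gitconfig isn’t touched (the GIT_CONFIG_GLOBAL variable requires git 2.32 or later):

```shell
# Point git at a throwaway global config file for this demonstration.
export GIT_CONFIG_GLOBAL=/tmp/demo-gitconfig
rm -f "$GIT_CONFIG_GLOBAL"
# Cache credentials in memory for one hour instead of the 15-minute default.
git config --global credential.helper 'cache --timeout=3600'
git config --global credential.helper   # prints: cache --timeout=3600
```

Even at an hour, the store is still in-memory and temporary, which is what sent me looking for something permanent.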

This led me to the git-credential-store helper. This stores credentials on disk, but they are not encrypted. So I began looking for an alternative. I wondered if I could use the GNOME keyring with git. A search suggested this might be possible, but it wouldn’t be easy. Then it occurred to me that I have an encrypted partition on this machine, using dm-crypt plus LUKS and mounted under my home directory. If the git-credential-store helper stored credentials in this encrypted partition, that would provide some protection. There are still vulnerabilities, but I carried on.

Beware: I am NOT a security professional so what I am doing here might be horrible advice. It is quite possible that I have no idea what I am doing.

The git-credential-store helper has a “--file=” option that can be used to specify the file where credentials are stored. I set this to a file in my encrypted partition. By default this is “~/.git-credentials” so I used that same file name and replaced the “~” with an encrypted directory path (in the example here that is “/home/myname/encrypted/”). Let’s say that git is configured with the commands below.

$ git config --global user.name "MyName"
$ git config --global user.email "me@somewhere"
$ git config --global credential.helper 'store --file=/home/myname/encrypted/.git-credentials'

As a result, the “~/.gitconfig” file should look something like the below snippet.

[user]
	name = MyName
	email = me@somewhere
[credential]
	helper = store --file=/home/myname/encrypted/.git-credentials

If the “/home/myname/encrypted/.git-credentials” file doesn’t exist, it will be created the next time that git requests credentials (when using two-factor authentication then remember that the Personal Access Token is used for the password and not the regular GitHub password). After that, credentials should not have to be entered again (of course, this assumes that the encrypted partition is available and at the same mount point).
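For the curious, the stored file is plain text, one line per host, with the token in the clear, which is exactly why it belongs on the encrypted partition. A sketch with a throwaway path and a made-up placeholder token; tightening the file permissions is a sensible extra step:

```shell
# Throwaway path for illustration; the real file lives on the encrypted partition.
CRED_FILE=/tmp/example-git-credentials
# The store helper writes one URL-encoded line per host; this token is fake.
printf 'https://MyName:ghp_exampleplaceholder@github.com\n' > "$CRED_FILE"
chmod 600 "$CRED_FILE"   # owner-only access, like an SSH private key
cat "$CRED_FILE"
```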


Ubuntu 11.10 Upgrade – Getting Back to Work

The new Ubuntu 11.10 version (Oneiric Ocelot) has arrived. I was happy using Ubuntu 11.04 (Natty Narwhal) with the classic GNOME desktop environment, but it’s always hard to turn down an upgrade to the new flashy toy when it just takes a click of the mouse.

So I said “yes” to the upgrade, and when it was done I was sad to see that the classic desktop was gone, replaced with Unity. For browsing and light activity on my netbook Unity is fine, but I was having trouble getting any real work done with it. I like moving forward and learning new things, but this interface is downright frustrating to me. I wanted my old experience back, or at least something close to it.

Fortunately, it wasn’t hard to find out how to do this because so many others feel the same way. After reading several posts, I found one that was loaded with ad pop-up junk but gave quite straightforward advice for “falling back” to a classic GNOME experience. From a terminal I just entered “sudo apt-get install gnome-session-fallback” at the command line. After that, log out and then log back in choosing “GNOME Classic” (click the little gears on the login screen to get this choice) and things will look more familiar.

After that I added some applets to the upper panel (Alt-right-click on the panel and choose “Add to Panel…”) for things I like having there, for example “The main GNOME menu”, Show Desktop, Shutdown, Trash, and some commonly used application launchers.

I fired up the Nautilus file manager and restored some of the features I prefer there. I like seeing a tree structure in the left panel, so I brought that back by clicking View in the menu, then Sidebar -> Tree. I don’t like icon views, so I chose Edit -> Preferences and selected List View as the Default View. Finally, I selected View -> Statusbar to get back the status bar on the bottom of Nautilus.

In a few minutes I was able to return my experience to something much more usable. So far it seems fine, and I’ll probably report back here if I run into any issues.
