MacBook Pro, Terminal Window and Mousing Around

I am in the process of making the switch to a MacBook Pro (specifically, the new MBP with Retina Display). After some 17 years on Wintel machines, I was not sure what kind of switching costs I might encounter. So far, however, the transition has been pretty smooth, helped in part by VMware, which lets me run Windows 7 if and when needed.

This encore has been a long time in coming. After several years on an Apple IIc, I bought my first Mac (a used Macintosh SE with dual 800K drives) in the summer of 1988. Shortly thereafter I bought my first hard drive, a Jasmine Direct Drive 40 (a 40 MB external drive). The drive cost something like $799, as I recall. The last time I used a Mac was circa 1995: a PowerBook Duo 230.

Fast forward to 2012: so far I’ve used nothing but the trackpad, mostly because each time I plugged in a mouse, the pointer moved too slowly (the mouse scaling, or tracking, speed was set too low). Today I decided to figure out how to fix it.

Step 1: Open a Terminal window. This can be done by navigating to your Applications folder, opening Utilities, and double-clicking on Terminal.

Step 2: In Terminal, type the following:

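defaults write -g com.apple.mouse.scaling 5.0

The 5.0 sets the mouse scaling (tracking) speed stored under the com.apple.mouse.scaling key and can be changed to suit your preferences. You may need to log out and back in before the new value takes effect.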

Alternatively, you can adjust these settings under the Apple menu > System Preferences > Hardware > Mouse. Happy mousing…

Gmail Tricks — Find Big Mail

I converted to Gmail full-time several years ago. As of today, I’m still only using 4201 MB of my 7693 MB, meaning my Gmail account is only about 54% full. Still, as I prepare to transition from Penn State to Alberta, it seemed like a good time for a little inbox analysis.

The first application I tried, Find Big Mail, is pretty much a one-trick pony, but it appears to execute that one trick flawlessly.

Step 1. Go to Find Big Mail and enter your Google Username.

Step 2. Let Find Big Mail run in the background.

Step 3. In my case, nine minutes later the results were back:

Some 96.5% of my messages were “small,” meaning less than 1 MB in this case. The remaining 3.5% were larger than 1 MB.

[Chart: Number of Messages By Size]

Collectively, however, the largest 3.5% of my messages consumed a whopping 74.5% of my Gmail space, far more lopsided than the 80/20 split of the Pareto Principle. Stated another way, I could keep the smallest 96.5% of my messages and use only 25.5% of the space my messages currently require.

[Chart: Space By Size]

Pretty slick — and painless.

Importing Large Stata Files

Recently I ran into a problem when trying to use a large Stata file (nearly 10 GB). The file contained data for the period 1981 to 2011, but I only needed data for the period 1991 to 2009. To complicate matters, I initially didn’t even know the names of the variables in the file, a problem that can be resolved without loading the data:

describe using "filename"

In this case, knowing the variable names turned out to be unimportant. Instead, after a bit of trial and error, I ended up importing the observations in batches (1 million at a time). Below is the code for two such batches.

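* Step 1: load observations 8,000,001 to 9,000,000 and keep the blockholders
use in 8000001/9000000 using "1980-2011.dta", clear
gen pct = round((shares / outstanding), .01)
keep if pct >= .05 & pct != .
save blockholders, replace

* Step 2: load the next batch, filter it, and add the results saved in Step 1
use in 9000001/10000000 using "1980-2011.dta", clear
gen pct = round((shares / outstanding), .01)
keep if pct >= .05 & pct != .
append using blockholders
save blockholders, replace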


Step 1 imports a chunk of 1 million observations and keeps only those in which an investor owns 5% or more of a particular company (after rounding the ownership fraction to the nearest 0.01). About 22,000 out of one million observations meet this criterion, and these ~22,000 observations are saved. In Step 2, the procedure is repeated for the next chunk: its ~22,000 qualifying observations are combined with those already in the blockholders file, and the file is saved again. The procedure is then repeated until all the observations have been evaluated and only those relevant to my research project have been retained.
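
The repetition could also be automated with a loop instead of copying the block once per batch. What follows is only a minimal sketch under a few assumptions: it reuses the file and variable names from the code above, it relies on describe using to report the number of observations (in r(N)) without loading the data, and the local macro names total, step, first and last are made up for illustration.

```stata
* Sketch only: the local macro names (total, step, first, last) are illustrative.
* Get the number of observations without loading the file.
describe using "1980-2011.dta", short
local total = r(N)
local step  = 1000000

forvalues first = 1(`step')`total' {
    * Last observation of this batch, capped at the end of the file
    local last = min(`first' + `step' - 1, `total')
    use in `first'/`last' using "1980-2011.dta", clear
    gen pct = round((shares / outstanding), .01)
    keep if pct >= .05 & pct != .
    * From the second batch on, add the results saved so far
    if `first' > 1 {
        append using blockholders
    }
    save blockholders, replace
}
```

Because blockholders.dta is overwritten on every pass, this sketch assumes there is no existing file of that name that needs to be kept.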