Google has this habit of providing extra links to sites within its search results, which go to specific supposedly popular pages within the site. I’ve made use of them a number of times and they’re usually pretty helpful, though I’ve never determined the sort of rules that produce the links (they’re plentiful enough that I assume they aren’t done manually).

Then today I noticed an interesting instance of the feature when I searched for “wiktionary”:

Comments

There's some javascript trying to load some comments, and if you're reading this, it's probably not working.

Texting

2010-12-26

Being most of the time a late adopter of new technologies, I’m still composing text messages on a traditional 12-button telephone keypad with 26 letters spread across 8 keys. The classical entry mode requires pressing each key up to several times to specify which letter was intended, while the more clever entry mode lets you press one key per letter and makes use of the redundancy in English to guess which letters you intended. Of course I use the latter since it’s much faster. Anybody who has used such a thing knows that not too infrequently there are collisions, where two English words are represented by the same sequence of keystrokes (the most common one that I come across being “go” and “in”). Only recently did it occur to me that there could be some interesting computation to be done with these collisions.

1
ABC
2
DEF
3
GHI
4
JKL
5
MNO
6
PQRS
7
TUV
8
WXYZ
9

So I grabbed the /usr/share/dict/american-english word list file and scrapped together some code. The first thing I wanted to do was explore the collisions in the normal key layout. My word list has 73895 words in it (I filtered out anything that included punctuation). Of those, 13186 (or 17.8%) collide with some other word. The largest set of mutually colliding words consists of the following twelve: acres, bards, barer, bares, barfs, baser, bases, caper, capes, cards, cares, and cases. To anybody who uses all twelve words in a sentence I will award one hundred points divided by the number of words in the sentence.

I also looked for some of the most dissimilar colliding words, where dissimilarity is measured by number of positions with differing letters. The winner, at eight, is “Khoikhoi” and “jingling”. Since I hadn’t heard of the proper noun “Khoikhoi” until a moment ago, I’ll include the sevens as well:

  • astride, brushed, crushed
  • amounts, contour
  • hotbeds, invader, invades
  • rampages, scorcher, scorches
  • Smirnoff, pogromed, poisoned

The next question is how much better other key layouts can do. This is an interesting computational problem, since searching all possible layouts is practically infeasible. If we describe a layout as a partitioning of the alphabet into eight sets, and we limit ourselves to two sets of four and six sets of three, my sloppy and careless calculations put the number of combinations at 10421411600600000. My computer gets tired just thinking about looping through all of the English words that many times.

The most straightforward approach seems to be a hill-climbing algorithm. I started from the standard layout, and at each step made the most helpful swap of two letters out of all possible swaps. Usually about ten steps would produce a layout with no improving swaps. The state I reached from the standard layout has a collision percentage of only 12.7%.

1
BKN
2
DHI
3
FGR
4
CLY
5
AJM
6
EPVX
7
OQT
8
SUWZ
9

Its collisions include (chant, limbo), (scanty, symbol), (ghoul, ritzy), (speedy, weevil), and (hold, itch). Of course we couldn’t switch to something like this because it would mess up all those phone number mnemonics. And anyhow nobody writes text messages this way anymore.

Also I think “The Speedy Ritzy Weevil-Ghouls” could be a promising name for a rock band.

Comments

There's some javascript trying to load some comments, and if you're reading this, it's probably not working.

Bitwise

2010-12-09

I think it's interesting the sort of patterns that come out when you view the bitwise operators as tables.

XOR


AND


OR


Comments

There's some javascript trying to load some comments, and if you're reading this, it's probably not working.

There’s a drawing I made a couple years back that I posted on my original website, and I decided I ought to stick it up here to prevent myself from losing track of it.

I had the idea while I was taking a GIS class. The professor had mentioned the tactic of displaying statistics about geographic regions by distorting their size to match the statistic. I realized that it wouldn’t be difficult to scribble out a map of the United States where each state can have an arbitrary shape, but the boundary relationships have to be preserved (both the relationship of which states border each other, as well as which states are on the U.S. border).

And so this is what I came up with:

I’m not sure if I had ever consciously noticed the way that New York and New Hampshire both run along the U.S. border in two separate locations.

Everybody try this at home!

Comments

There's some javascript trying to load some comments, and if you're reading this, it's probably not working.

Squared Squared

2010-12-05

I accidentally noticed some unexpected behavior from the “gcalctool” calculator included with Ubuntu 10.10. It has a “squared” button, and if I push it twice:

rather than returning the value to the fourth power (which agrees with either interpretation of associativity), it actually computes the value to the 22nd power:

Which means the calculator is capable of computing any exponent as long as its decimal representation is all 2’s.

Comments

There's some javascript trying to load some comments, and if you're reading this, it's probably not working.