Yesterday, @tadej sent me an article that called the Dvorak keyboard layout a myth, an urban legend, a lie made up to retain funding for Dvorak’s research.
I have been typing dvorak for two years now, and know only one other person crazy enough to do it (he even has both layouts printed on the keys). I’ve never really been a vocal proponent of the layout – it took me roughly two months to learn, it doesn’t seem faster, and I’ve developed some new typical typing mistakes. It does, however, feel a bit more ergonomic. Definitely not enough to bother, and I’ve been actively discouraging people from switching, but since I already know how to use it I wouldn’t go back to Qwerty.
So this article was a very interesting read. I can buy the theory that the whole story is a scientific fabrication, but the feeling of comfort I have is also a fact. I decided to test it statistically – design a simple model of typing, count the finger movement overhead for both layouts and let the numbers speak for themselves.
It took me a couple of hours, so it wasn’t really hard. It also isn’t very detailed – I tried to capture the main points that come up whenever keyboard layout efficiency is discussed, and I’m totally open to corrections / suggestions / …
The model consists of the following concepts:
- the keys: arranged in 4 rows and 12 columns, just like you’d find them on any PC keyboard
- three hands: one with 4 fingers for each half of the keyboard, plus the thumbs as a separate ‘hand’ for pressing space
- the fingers: each finger is assigned 3-6 keys it can press at any point
- simplified text: consisting of words and spaces only, stripped of non-alphabet characters. Also no caps.
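The text simplification can be sketched in Python roughly like this. It is only an illustration, not the actual script: the function name `simplify` is made up, and it assumes that non-alphabet characters are dropped rather than replaced by spaces (so ‘lock-in’ becomes ‘lockin’) and that only the letters a-z count as alphabet.

```python
import re

def simplify(text):
    """Lowercase the text and keep only the letters a-z and single spaces."""
    text = text.lower()                    # no caps
    text = re.sub(r"[^a-z\s]", "", text)   # drop digits, punctuation, etc.
    return " ".join(text.split())          # collapse runs of whitespace

print(simplify("Lock-in? QWERTY's 'luck', 1936!"))  # lockin qwertys luck
```

A text with accented letters would need a wider character class than `a-z`.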
The rules for counting overhead are:
- always look at pairs of: previous key – current key
- if we are at the beginning of the word, use ‘space’ as the previous key
- if we switched hands, the movement overhead is 1
- if we switched fingers on the same hand, the movement overhead is 1.5
- if we pressed the key with the same finger as the previous one, the overhead is the vector-distance between the keys
So, for example, if I type ‘oh’ on the dvorak layout, I hit ‘o’ with my left ring-finger, then hit ‘h’ with my right index-finger, making it a very simple word that ‘costs’ 1 + 1 = 2 moves.
If I write ‘oh’ on qwerty, I’d move my right ring-finger up, and then my right index-finger to the left, accounting for 1 + 1.5 moves.
Yes, these rules are somewhat arbitrary, but the idea is to follow these assumptions:
- it’s easy to hit the first key
- it’s a bit harder to hit the second one with the same hand while retracting the first one
- it’s hardest to hit the second key with the same finger, and it gets harder with the distance the finger has to travel.
For instance, typing ‘ny’ in qwerty is really hard, because the right index-finger has to do some funky acrobatics. Try hitting “qz” on qwerty now… 😉
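The rules above can be sketched in a few lines of Python. This is a minimal illustration, not the keyboardlayouttest.pl script. It assumes only the three letter rows (10 columns) instead of the full 4 × 12 grid, the usual touch-typing finger for each column, and plain Euclidean distance on the grid as the ‘vector-distance’. The names `QWERTY`, `DVORAK`, `FINGERS`, `pair_cost` and `word_cost` are made up for this sketch.

```python
from math import hypot

# Assumption: letter block only (3 rows x 10 columns), no number row.
QWERTY = ["qwertyuiop", "asdfghjkl;", "zxcvbnm,./"]
DVORAK = ["',.pyfgcrl", "aoeuidhtns", ";qjkxbmwvz"]

# Assumption: one (hand, finger) per column, index fingers cover two columns.
FINGERS = [("L", "pinky"), ("L", "ring"), ("L", "middle"), ("L", "index"),
           ("L", "index"), ("R", "index"), ("R", "index"), ("R", "middle"),
           ("R", "ring"), ("R", "pinky")]

def key_positions(rows):
    """Map every character of a layout to its (row, column) position."""
    return {ch: (r, c) for r, row in enumerate(rows) for c, ch in enumerate(row)}

def pair_cost(pos, prev, cur):
    """Overhead of pressing `cur` right after `prev`."""
    if prev == " " or cur == " ":
        return 1.0                      # thumbs are a separate 'hand'
    (r1, c1), (r2, c2) = pos[prev], pos[cur]
    hand1, finger1 = FINGERS[c1]
    hand2, finger2 = FINGERS[c2]
    if hand1 != hand2:
        return 1.0                      # switched hands
    if finger1 != finger2:
        return 1.5                      # switched fingers on the same hand
    return hypot(r2 - r1, c2 - c1)      # same finger: distance travelled

def word_cost(rows, word):
    """Total overhead of a word, with 'space' as the key before it."""
    pos = key_positions(rows)
    return sum(pair_cost(pos, a, b) for a, b in zip(" " + word, word))

print(word_cost(DVORAK, "oh"))  # 2.0
print(word_cost(QWERTY, "oh"))  # 2.5
print(word_cost(QWERTY, "ny"))  # 3.0
```

With these assumptions the sketch agrees with the ‘oh’ example, but it is not guaranteed to reproduce the numbers in the tables, since the real script may assign keys, fingers or distances differently.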
Now, I’m sure you’re all curious about the results already.
I tested the layouts on that very article from the beginning. The article had 35304 characters (36125 with commas and dots included) in 5865 words, 1610 of them distinct. Here’s a bird’s-eye view of the performance of the two layouts:
| | dvorak | qwerty |
|---|---|---|
| strokes needed (alphabet only) | 32927 | 33560 |
| strokes needed (commas and dots) | 33615 | 34340 |
| better at words | 2375 | 1301 |
| better at distinct words | 654 | 529 |
If each stroke were worth 10ms, the dvorak layout (about 1.9% fewer strokes) would win by 1 minute in a 1-hour typing match. Dismissible?
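One way to arrive at a figure of that size is to scale the ratio of the two stroke counts to an hour, in which case the time per stroke cancels out. A quick check in Python, using the alphabet-only counts from the table (the variable names are made up for this sketch):

```python
dvorak_strokes = 32927
qwerty_strokes = 33560

# Fraction of strokes dvorak saves relative to qwerty.
saved_fraction = (qwerty_strokes - dvorak_strokes) / qwerty_strokes

# The same fraction of a one-hour typing session, in seconds.
seconds_per_hour = saved_fraction * 3600

print(f"{saved_fraction:.2%} fewer strokes, about {seconds_per_hour:.0f} s per hour")
```

That comes out at roughly 1.9% and 68 seconds, which is where the ‘1 minute’ comes from.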
So much for the most important metric – the two layouts seem to be of roughly the same efficiency. They also seem similarly efficient across the distinct words. However, the total number of words each layout excels at differs noticeably, so we can hypothesize that dvorak is more efficient at the more frequent words.
I’ve calculated the difference between the layouts’ performances for each distinct word in the article, and the number of times each word was repeated. The product of these two indicators is an interesting ‘score’, indicating the impact the winning layout had on that particular word. Here are the top-30 lists:
| dvorak word | length | gain | frequency | score | qwerty word | length | gain | frequency | score |
|---|---|---|---|---|---|---|---|---|---|
| of | 2 | 0.5 | 222 | 111 | and | 3 | 0.5 | 91 | 45.5 |
| to | 2 | 0.5 | 167 | 83.5 | it | 2 | 1 | 40 | 40 |
| for | 3 | 1 | 71 | 71 | typists | 7 | 1 | 37 | 37 |
| that | 4 | 0.5 | 128 | 64 | typing | 6 | 1 | 28 | 28 |
| in | 2 | 0.5 | 121 | 60.5 | evidence | 8 | 2 | 13 | 26 |
| keyboard | 8 | 1 | 54 | 54 | study | 5 | 1 | 25 | 25 |
| but | 3 | 2.5 | 19 | 47.5 | which | 5 | 2 | 12 | 24 |
| not | 3 | 1 | 41 | 41 | since | 5 | 2 | 12 | 24 |
| example | 7 | 2 | 20 | 40 | dependence | 10 | 2 | 9 | 18 |
| was | 3 | 1 | 34 | 34 | can | 3 | 1 | 17 | 17 |
| qwerty | 6 | 0.5 | 57 | 28.5 | with | 4 | 0.5 | 34 | 17 |
| dvorak | 6 | 0.5 | 52 | 26 | cincinnati | 10 | 3 | 5 | 15 |
| by | 2 | 1.5 | 17 | 25.5 | academic | 8 | 3 | 5 | 15 |
| these | 5 | 1 | 25 | 25 | machine | 7 | 2 | 7 | 14 |
| only | 4 | 1.5 | 16 | 24 | choice | 6 | 2 | 7 | 14 |
| would | 5 | 1 | 22 | 22 | article | 7 | 1.5 | 9 | 13.5 |
| on | 2 | 0.5 | 44 | 22 | luck | 4 | 1.5 | 9 | 13.5 |
| more | 4 | 1 | 21 | 21 | scientific | 10 | 3.5 | 3 | 10.5 |
| published | 9 | 4 | 5 | 20 | such | 4 | 1.5 | 7 | 10.5 |
| it | 2 | 0.5 | 40 | 20 | mcgurrin | 8 | 1.5 | 7 | 10.5 |
| minute | 6 | 1.5 | 13 | 19.5 | success | 7 | 1.5 | 7 | 10.5 |
| were | 4 | 0.5 | 37 | 18.5 | conducted | 9 | 2.5 | 4 | 10 |
| we | 2 | 0.5 | 36 | 18 | switch | 6 | 2 | 5 | 10 |
| as | 2 | 0.5 | 34 | 17 | standard | 8 | 1 | 10 | 10 |
| keyboards | 9 | 1.5 | 11 | 16.5 | studies | 7 | 1 | 10 | 10 |
| results | 7 | 1.5 | 11 | 16.5 | chance | 6 | 3 | 3 | 9 |
| found | 5 | 2 | 8 | 16 | lockin | 6 | 1.5 | 6 | 9 |
| story | 5 | 1 | 16 | 16 | just | 4 | 1 | 9 | 9 |
| although | 8 | 2.5 | 6 | 15 | so | 2 | 0.5 | 18 | 9 |
| group | 5 | 1.5 | 9 | 13.5 | speed | 5 | 0.5 | 17 | 8.5 |
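The score in the lists above is simply gain × frequency, sorted in descending order. A minimal Python sketch of that ranking follows; the function name `top_scores` and its arguments are made up for illustration, and the sample data is taken from four dvorak rows of the list.

```python
from collections import Counter

def top_scores(words, gain, limit=30):
    """Rank distinct words by gain * frequency.

    words: the simplified text as a list of words
    gain:  dict mapping a word to the moves the winning layout saves on it
    """
    freq = Counter(words)
    rows = [(w, gain[w], n, gain[w] * n) for w, n in freq.items() if w in gain]
    return sorted(rows, key=lambda row: row[3], reverse=True)[:limit]

# Gains and frequencies copied from the 'of', 'to', 'for' and 'but' rows.
gain = {"of": 0.5, "to": 0.5, "for": 1, "but": 2.5}
words = ["of"] * 222 + ["to"] * 167 + ["for"] * 71 + ["but"] * 19

for word, g, n, score in top_scores(words, gain):
    print(word, g, n, score)
```

This gives the scores 111, 83.5, 71 and 47.5, matching the list: a small gain on a very frequent word like ‘of’ outweighs a large gain on a rare one.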
This table gives a clear indication that the dvorak layout performed better at often-used, shorter words. Let’s compare the graphs of frequency × gain for both of them:
Dvorak: [graph: frequency × gain]

Qwerty: [graph: frequency × gain]
We can see what is going on – while the majority of words behave roughly the same, dvorak wins over most of the frequent ones. Overall averages were:
| | dvorak | qwerty | document |
|---|---|---|---|
| avg repeats of a word | 3.63 | 2.46 | 3.64 |
| avg length of a word | 7.03 | 7.47 | 6.85 |
| avg gain over the other layout | 1 | 1.14 | |
| avg score | 6.52 | 5.41 | |
| avg score with dots and commas | 10.78 | 7.21 | |
The average score was calculated as the average typing improvement over all words where the layout was superior. It is very interesting, however, that in the end both layouts level out. Interesting enough to try it with another text, this time a shorter and more mundane one – an email to a friend. Here are the results:
| | dvorak | qwerty | document |
|---|---|---|---|
| characters / strokes (alphabet only) | 3162 | 3278 | 3518 |
| strokes (with commas and dots) | 3335 | 3403 | 3658 |
| better at words | 260 | 116 | 632 |
| better at distinct words | 136 | 80 | 308 |
| avg repeats of a word | 1.91 | 1.45 | 2.05 |
| avg length of a word | 5.93 | 6.35 | 5.72 |
| avg gain | 0.96 | 1.05 | |
| avg score | 2.74 | 1.11 | |
| avg score with dots and commas | 4.07 | 1.87 | |
This e-mail would take 5min to write, and dvorak would have saved me 7sec had I been using it back then. Dvorak’s gain per word would be even lower than before, but it again wins on more words, and on the ones that count. The difference in score is even bigger if I count the dots and commas.
Now, this approach is not language-specific, so it made sense to test the final dvorak myth – that it’s supposed to be designed for the English language. Here is the table for a journalistic-type text in Slovene:
| | dvorak | qwerty | document |
|---|---|---|---|
| characters / strokes (alphabet only) | 20709 | 20783 | 21168 |
| strokes (with commas and dots) | 21203 | 21339 | 22699 |
| better at words | 1266 | 842 | 3210 |
| better at distinct words | 705 | 437 | 1476 |
| avg repeats of a word | 1.76 | 1.93 | 2.18 |
| avg length of a word | 7.7 | 7.82 | 7.38 |
| avg gain | 1.11 | 1.38 | |
| avg score | 14.77 | 8.82 | |
| avg score with dots and commas | 18.05 | 11.42 | |
This document would take 36min to write and dvorak would save almost no time. Slovene words of all types come in many inflected forms, so the number of repeated words is lower, yet the ratio of success in distinct words is roughly the same as for the English documents.
The source code (keyboardlayouttest.pl) is available, feel free to abuse it. It would be very interesting to create a more generic word-count tool that would calculate the time wasted by not using dvorak. 😛