Rational Idealist

Twitter Weekly Updates for 2010-01-31

January 31st, 2010

new blog post: dvorak vs qwerty performance test: http://bostjan.konstrukt.it/?p=469 [for @tadej]
New blog post: Shopping cart http://cultureshocks.konstrukt.it/?p=8
learned new word: nullify
dreamhost giving away ipads, i'd love one 😉
New blog post: States of the Union http://cultureshocks.konstrukt.it/?p=10
25kg = $150

Dvorak vs. Qwerty performance test

January 23rd, 2010

Yesterday, @tadej sent me an article that called the Dvorak keyboard layout a myth, an urban legend, a lie made up to retain funding for Dvorak’s research.

I have been typing dvorak for two years now, and know only one other person crazy enough to do it (he even has both layouts printed on the keys). I’ve never really been a vocal proponent of the layout: it took me roughly two months to learn, it doesn’t seem faster, and I’ve developed some new typical typing mistakes. It does, however, feel a bit more ergonomic. Definitely not enough to bother switching, and I’ve been actively discouraging people from doing so, but since I already know how to use it I wouldn’t go back to Qwerty.

So this article was a very interesting read. I can buy the theory that the whole story is a scientific fabrication, but the feeling of comfort I have is also a fact. I decided to test it statistically: design a simple model of typing, count the finger movement overhead for both layouts and let the numbers speak for themselves.

It took me a couple of hours, so it wasn’t really hard. It also isn’t very detailed: I tried to capture the main points that come up whenever keyboard layout efficiency is discussed, and I’m totally open to corrections / suggestions / …

The model consists of the following concepts:

the key: keys are arranged in 4 rows and 12 columns, just like you’d find them on any PC keyboard
three hands: one with 4 fingers for each half of the keyboard, and the thumbs as a separate ‘hand’ for pressing space
the finger: each finger is assigned 3-6 keys it can press at any point
simplified text: consisting of words and spaces only, stripped of non-alphabet characters and with no caps
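The ‘simplified text’ step could look roughly like the sketch below. It is not the original keyboardlayouttest.pl; the function names `simplify` and `word_stats` are made up for illustration, and the ASCII-only `a-z` filter is an assumption (it would also throw away letters such as č, š and ž, so a non-English text would need a wider character class).

```python
import re
from collections import Counter


def simplify(text):
    """Lower-case the text and drop everything except a-z and whitespace."""
    # Assumption: characters are removed rather than replaced by a space,
    # so "lock-in" becomes the single word "lockin".
    cleaned = re.sub(r"[^a-z\s]", "", text.lower())
    return cleaned.split()


def word_stats(words):
    """Count total and distinct words in the simplified text."""
    counts = Counter(words)
    return {"words": len(words), "distinct": len(counts)}


words = simplify("Lock-in? It's the QWERTY lock-in story.")
print(words)
print(word_stats(words))
```

Removing punctuation instead of splitting on it is a guess, but it matches the word ‘lockin’ that shows up in the results further down.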

The rules for counting overhead are:

always look at pairs: previous key – current key
if we are at the beginning of a word, use ‘space’ as the previous key
if we switched hands, the movement overhead is 1
if we switched fingers on the same hand, the movement overhead is 1.5
if we pressed the key with the same finger as the previous one, the overhead is the vector distance between the keys

So, for example, if I type ‘oh’ on the dvorak layout, I hit ‘o’ with my left ring-finger, then hit ‘h’ with my right index-finger. Both keystrokes are hand switches (space to left hand, then left hand to right hand), making it a very simple word that ‘costs’ 1 + 1 = 2 moves.

If I write ‘oh’ on qwerty, I’d move my right ring-finger up for ‘o’ (a hand switch from space, 1), and then my right index-finger to the left for ‘h’ (a finger switch on the same hand, 1.5), accounting for 1 + 1.5 = 2.5 moves.
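The rules and the ‘oh’ example above can be sketched in a few lines. This is a simplified reconstruction, not the original keyboardlayouttest.pl: it assumes only the three letter rows with 10 columns each (not the full 4 × 12 grid), a plain touch-typing finger assignment per column, no row stagger, and an arbitrary position for the space key. All names (`build`, `pair_cost`, `word_cost`, `FINGER_BY_COLUMN`) are hypothetical, so it will not necessarily reproduce the numbers in the tables below.

```python
import math

# Assumption: letter rows only, top to bottom, 10 columns each.
QWERTY = ["qwertyuiop", "asdfghjkl;", "zxcvbnm,./"]
DVORAK = ["',.pyfgcrl", "aoeuidhtns", ";qjkxbmwvz"]

# Assumption: standard touch typing, columns 0-4 left hand, 5-9 right hand.
FINGER_BY_COLUMN = [
    ("left", "pinky"), ("left", "ring"), ("left", "middle"),
    ("left", "index"), ("left", "index"),
    ("right", "index"), ("right", "index"), ("right", "middle"),
    ("right", "ring"), ("right", "pinky"),
]


def build(rows):
    """Map each character to (hand, finger, (row, column))."""
    # The thumbs are the third 'hand'; the space position is arbitrary here.
    keys = {" ": ("thumbs", "thumb", (3, 4.5))}
    for r, row in enumerate(rows):
        for c, ch in enumerate(row):
            hand, finger = FINGER_BY_COLUMN[c]
            keys[ch] = (hand, finger, (r, c))
    return keys


def pair_cost(keys, prev, cur):
    """Overhead of pressing `cur` right after `prev`."""
    prev_hand, prev_finger, prev_pos = keys[prev]
    cur_hand, cur_finger, cur_pos = keys[cur]
    if prev_hand != cur_hand:
        return 1.0                        # switched hands
    if prev_finger != cur_finger:
        return 1.5                        # same hand, different finger
    return math.dist(prev_pos, cur_pos)   # same finger: distance travelled


def word_cost(keys, word):
    """Total overhead of a word, with 'space' as the key before it."""
    total, prev = 0.0, " "
    for ch in word:
        total += pair_cost(keys, prev, ch)
        prev = ch
    return total


print(word_cost(build(DVORAK), "oh"))  # 2.0
print(word_cost(build(QWERTY), "oh"))  # 2.5
```

Under these assumptions a same-finger pair is charged its grid distance, so a word like ‘ny’ on qwerty costs 1 for the first key plus 2 for the jump across two rows.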

Yes, these rules are somewhat arbitrary, but the idea is to follow the assumptions:

it’s easy to hit the first key
it’s a bit harder to hit the second one with the same hand while retracting from the first one
it’s hardest to hit the second key with the same finger, and the difficulty increases with the distance the finger has to travel.

For instance, typing ‘ny’ on qwerty is really hard, because the right index-finger has to do some funky acrobatics. Try hitting “qz” on qwerty now… 😉

Now, I’m sure you’re all curious about the results already.

I tested the layouts on that very article from the beginning. The article had 35304 characters (36125 with commas and dots included) in 5865 words, 1610 of them distinct. Here’s a bird’s-eye view of the performance of the two layouts:

	dvorak	qwerty
strokes needed (alphabet only)	32927	33560
strokes needed (commas and dots)	33615	34340
better at words	2375	1301
better at distinct words	654	529

If each stroke were worth 10ms, the dvorak layout would win by about 1 minute in a 1-hour typing match. Dismissible?
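For the curious, here is the arithmetic behind that minute, using the alphabet-only stroke counts from the table above. The 10ms per stroke is just the hypothetical figure from the previous paragraph, and the variable names are mine.

```python
DVORAK_STROKES = 32927   # alphabet only, from the table above
QWERTY_STROKES = 33560
MS_PER_STROKE = 10       # hypothetical cost of one stroke

saved_strokes = QWERTY_STROKES - DVORAK_STROKES
saved_fraction = saved_strokes / QWERTY_STROKES

# Time saved on this one article, and scaled up to an hour of typing.
saved_seconds_article = saved_strokes * MS_PER_STROKE / 1000
saved_seconds_per_hour = saved_fraction * 3600

print(saved_strokes)                     # 633
print(round(saved_fraction * 100, 2))    # 1.89 (percent)
print(round(saved_seconds_article, 2))   # 6.33
print(round(saved_seconds_per_hour, 1))  # 67.9
```

So the difference is a bit under 2% of the strokes, which is where the roughly one minute per hour comes from.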

So much for the most important metric: the two layouts seem to be of roughly the same efficiency. They also seem similarly efficient across the distinct words. However, since the total number of words each layout excels at differs noticeably (2375 vs. 1301), we can hypothesize that dvorak is more efficient at the more frequent words.

I’ve calculated the difference between the layouts’ performances for each distinct word in the article, and the number of times each word is repeated. The product of these two indicators is an interesting ‘score’, indicating the impact the winning layout had on that particular word. Here are the top-30 lists:

dvorak					qwerty
word	length	gain	frequency	score	word	length	gain	frequency	score
of	2	0.5	222	111	and	3	0.5	91	45.5
to	2	0.5	167	83.5	it	2	1	40	40
for	3	1	71	71	typists	7	1	37	37
that	4	0.5	128	64	typing	6	1	28	28
in	2	0.5	121	60.5	evidence	8	2	13	26
keyboard	8	1	54	54	study	5	1	25	25
but	3	2.5	19	47.5	which	5	2	12	24
not	3	1	41	41	since	5	2	12	24
example	7	2	20	40	dependence	10	2	9	18
was	3	1	34	34	can	3	1	17	17
qwerty	6	0.5	57	28.5	with	4	0.5	34	17
dvorak	6	0.5	52	26	cincinnati	10	3	5	15
by	2	1.5	17	25.5	academic	8	3	5	15
these	5	1	25	25	machine	7	2	7	14
only	4	1.5	16	24	choice	6	2	7	14
would	5	1	22	22	article	7	1.5	9	13.5
on	2	0.5	44	22	luck	4	1.5	9	13.5
more	4	1	21	21	scientific	10	3.5	3	10.5
published	9	4	5	20	such	4	1.5	7	10.5
it	2	0.5	40	20	mcgurrin	8	1.5	7	10.5
minute	6	1.5	13	19.5	success	7	1.5	7	10.5
were	4	0.5	37	18.5	conducted	9	2.5	4	10
we	2	0.5	36	18	switch	6	2	5	10
as	2	0.5	34	17	standard	8	1	10	10
keyboards	9	1.5	11	16.5	studies	7	1	10	10
results	7	1.5	11	16.5	chance	6	3	3	9
found	5	2	8	16	lockin	6	1.5	6	9
story	5	1	16	16	just	4	1	9	9
although	8	2.5	6	15	so	2	0.5	18	9
group	5	1.5	9	13.5	speed	5	0.5	17	8.5
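The score in the lists above is simply gain × frequency, with each word credited to whichever layout typed it more cheaply. A minimal sketch of that ranking follows; the function name `top_scores` and the signed-gain convention (positive means dvorak was cheaper) are my assumptions, and the sample gains and frequencies are copied from a few rows of the table.

```python
def top_scores(signed_gain, frequency, n=30):
    """Split words by winning layout and rank them by gain * frequency.

    signed_gain: word -> overhead difference; positive means dvorak was
    cheaper, negative means qwerty was cheaper (an assumed convention).
    """
    dvorak, qwerty = [], []
    for word, gain in signed_gain.items():
        if gain == 0:
            continue  # a tie counts for neither layout
        row = (word, abs(gain), frequency[word], abs(gain) * frequency[word])
        (dvorak if gain > 0 else qwerty).append(row)

    def by_score(row):
        return row[3]

    return (sorted(dvorak, key=by_score, reverse=True)[:n],
            sorted(qwerty, key=by_score, reverse=True)[:n])


# A few rows from the table above.
gains = {"of": 0.5, "to": 0.5, "but": 2.5, "and": -0.5, "evidence": -2.0}
freqs = {"of": 222, "to": 167, "but": 19, "and": 91, "evidence": 13}

dvorak_top, qwerty_top = top_scores(gains, freqs)
print(dvorak_top)
print(qwerty_top)
```

Short words with a small gain can still top the list, because the frequency dominates the product.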

The top-30 lists show clearly that the dvorak layout performed better on often-used, shorter words. Let’s compare graphs of frequency × gain for both of them:

Dvorak:

Qwerty:

We can see what is going on: while the majority of words behave roughly the same, dvorak wins on most of the frequent ones. Overall averages were:

	dvorak	qwerty	document
avg repeats of a word	3.63	2.46	3.64
avg length of a word	7.03	7.47	6.85
avg gain over the other layout	1	1.14
avg score	6.52	5.41
avg score with dots and commas	10.78	7.21

The average score was calculated as the average of typing improvements over all words where the layout was superior. It is very interesting, however, that in the end both layouts level out. Interesting enough to try it with another text, this time shorter and more mundane: an email to a friend. Here are the results:

	dvorak	qwerty	document
characters / strokes (alphabet only)	3162	3278	3518
strokes (with commas and dots)	3335	3403	3658
better at words	260	116	632
better at distinct words	136	80	308
avg repeats of a word	1.91	1.45	2.05
avg length of a word	5.93	6.35	5.72
avg gain	0.96	1.05
avg score	2.74	1.11
avg score with dots and commas	4.07	1.87

This e-mail would take 5min to write and dvorak would have saved me 7sec had I been using it back then. Dvorak’s average gain per word is again lower than qwerty’s (0.96 vs. 1.05), but it wins on more of the words that count, and by way more if I count the dots and commas.

Now, this approach is not language-specific, so it made sense to test the final dvorak myth: that it’s designed specifically for the English language. Here is the table for a journalistic-type text in Slovene:

	dvorak	qwerty	document
characters / strokes (alphabet only)	20709	20783	21168
strokes (with commas and dots)	21203	21339	22699
better at words	1266	842	3210
better at distinct words	705	437	1476
avg repeats of a word	1.76	1.93	2.18
avg length of a word	7.7	7.82	7.38
avg gain	1.11	1.38
avg score	14.77	8.82
avg score with dots and commas	18.05	11.42

This document would take 36min to write and dvorak would save almost no time. Slovene inflects all word types, so the number of repeated words is lower, yet the ratio of success on distinct words is similar to that of the English documents.

The source code (keyboardlayouttest.pl) is available, feel free to abuse it. It would be very interesting to create a more generic word-count tool that would calculate the time wasted by not using dvorak. 😛