Messages - iandoug

1
Keyboards and Other Interfaces / Re: Balanced Keyboard Layout
« on: Yesterday at 03:37 AM »
puttin it out there

Do you have the KLA json files? :-)

thanks, Ian

2
Keyboards and Other Interfaces / BEAKL PLLT x1
« on: 2018-Sep-18 12:34 »
@Den

Hi

1. How does KLAtest handle BEAKL PLLT x1, in particular typing things requiring AltGr + Shift?

2. is the Num key a toggle or hold-and-type-another-key ?

thanks, Ian

3
So I went back to the drawing board a bit

Again. Finally figured out what I *should* be doing, which is this:
1. we have standard character distribution for English.
2. for each input text, work out character distribution percentage.
3. then for each character in standard distribution, find difference between the frequencies : abs(frequency[standard] - frequency[inputtext])
4. if character not in input text, use zero in above calculation == frequency[standard] as is.
5. sum these numbers, and rank scores for each input text. Low score wins, indicates least percentage difference from standard distribution.
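
A minimal sketch of the five steps above, assuming the standard distribution is held as a dict of percentages. The names `char_percentages` and `freq_match` and the toy numbers are made up for illustration, not taken from the actual program:

```python
from collections import Counter

def char_percentages(text):
    """Step 2: percentage of each character in one input text."""
    counts = Counter(text)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ch: 100.0 * n / total for ch, n in counts.items()}

def freq_match(standard, text):
    """Steps 3-5: sum of absolute differences against the standard.

    A character missing from the text contributes its full standard
    percentage (step 4), because observed.get() falls back to 0.
    """
    observed = char_percentages(text)
    return sum(abs(pct - observed.get(ch, 0.0)) for ch, pct in standard.items())

# Toy "standard" distribution (illustrative values only, not real English).
standard = {" ": 20.0, "e": 50.0, "t": 30.0}
samples = {"a.txt": "e t e t e", "b.txt": "eeee"}

# Low score wins: least difference from the standard distribution.
ranked = sorted(samples, key=lambda name: freq_match(standard, samples[name]))
```

Note this only walks the characters in the standard list, as described; characters that appear in the text but not in the standard are ignored.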

So results attached, as well as ten best as measured by "Similarity" metric, which looks more at a weighted order, and FreqMatch, which is the frequency match detailed above.

By Similarity:
other-peoples-money.txt
the-fable-of-the-keys.txt
play-equipment.txt
economic-consequences-of-peace.txt
war-is-a-racket.txt
170-chinese-poems.txt
deadly-world-order.txt
what-schools-teach.txt
the-elements-of-style.txt
common-sense.txt


By FrequencyMatch:
the-devils-dictionary.txt
play-equipment.txt
the-elements-of-style.txt
war-is-a-racket.txt
what-schools-teach.txt
miller-the-crucible.txt
the-fable-of-the-keys.txt
economic-consequences-of-peace.txt
logical.txt
And-Then-There-Were-None.txt

In both lists:
play-equipment.txt
the-elements-of-style.txt
war-is-a-racket.txt
what-schools-teach.txt
the-fable-of-the-keys.txt
economic-consequences-of-peace.txt


4
How long is it supposed to take for the MTGAP algorithm to run its course? It took me almost 8 hours to get to the 19 swap rounds (default 30 key).

Depends a lot on your hardware I suppose... Sorry I think I may have done that exercise a year or more ago but can't remember, I do remember not being particularly impressed with the end result.
Maybe Den knows.

5
Keyboards and Other Interfaces / auto-testing
« on: 2018-Sep-15 14:34 »
I used to use iMacros to bulk-test the layouts with the different inputs.

I see they've gone and gotten into "give us money" mode, and now the browser add-on is severely restricted in what it can do (Firefox).

So I either need to get to grips with Selenium Webdriver and write a PHP script or something to drive it, or delay testing until such time as I ever get around to having a stand-alone program to do the same thing as KLA in all its variants.

Anyone got experience with Selenium?

thanks, Ian

6
Unrelated note, but how exactly do you run the MTGAP program? I'm using a mac and I'm not exactly tech savvy (I have gcc installed with homebrew but I can't seem to run "Makefile").

what happens when you change to the Typing directory and do
make
?

(FWIW, doing that worked here, though gcc complained a lot during the compile).

Then
./optimizer
runs the program.

7
So I went back to the drawing board a bit, added some new tests, dropped two, then ranked the layouts for each test, and added up their ranks.
sum-of-ranks == Score, and lowest Score wins.
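
A rough sketch of the sum-of-ranks idea, assuming every test gives a "higher is better" score. The function name and the toy scores are invented for illustration, and ties are simply left in whatever order the sort gives them:

```python
def sum_of_ranks(scores, higher_is_better=True):
    """scores: {test_name: {item: score}} -> {item: sum of ranks over all tests}."""
    totals = {}
    for per_item in scores.values():
        # Rank 1 goes to the best item in this test.
        ordered = sorted(per_item, key=per_item.get, reverse=higher_is_better)
        for rank, item in enumerate(ordered, start=1):
            totals[item] = totals.get(item, 0) + rank
    return totals

# Toy numbers only.
scores = {
    "test1": {"a": 0.9, "b": 0.5, "c": 0.7},
    "test2": {"a": 0.6, "b": 0.8, "c": 0.1},
}
totals = sum_of_ranks(scores)
winner = min(totals, key=totals.get)  # lowest Score wins
```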

Went through top 1000 most popular books on Gutenberg, as well as newly-in-public-domain over last few years, looking for interesting material.
After much cleaning up, have settled on these 17. (because, like, obviously because 17 is the most random random number. Everyone knows that. Or because these scored above the 0.55 Average Score cut-off point.)

In general, I stripped top and bottom matter, and replaced non-ANSI-US characters with either the plain character or the HTML-entity version (eg ë -> e, £ -> &pound; ).
Also set line-endings to Unix, and removed the Byte Order Mark from start of file.
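
For anyone repeating the clean-up, something along these lines would do the mechanical part. This is a sketch under assumptions: input is UTF-8, accented letters are reduced via Unicode decomposition, and anything else outside printable ASCII becomes an HTML entity. The function name `clean_text` is made up; stripping the top and bottom matter is still a manual job.

```python
import unicodedata
from html.entities import codepoint2name

def clean_text(raw: bytes) -> str:
    # "utf-8-sig" drops a leading UTF-8 Byte Order Mark if there is one.
    text = raw.decode("utf-8-sig")
    # Unix line endings.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    out = []
    for ch in text:
        if ch == "\n" or 32 <= ord(ch) <= 126:
            out.append(ch)
            continue
        # Decompose, then drop the combining marks: e-diaeresis -> e.
        plain = unicodedata.normalize("NFKD", ch).encode("ascii", "ignore").decode("ascii")
        if plain:
            out.append(plain)
        else:
            # No plain equivalent: use a named entity, else a numeric one.
            name = codepoint2name.get(ord(ch))
            out.append("&%s;" % name if name else "&#%d;" % ord(ch))
    return "".join(out)
```

Whether an entity or a plain substitute is the better choice for a given character is a judgment call; the entity adds "&" and ";" to the character counts.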

List of rejected titles attached, in case someone else wants to try a similar exercise. No sense in reploughing the same field.
So a lot of the files that I was using previously for testing, are now less-desirable due to having bad character frequency compared to English.

Files and analysis attached.

There's a good mix of styles, dates and lengths.

Now to have a go with MTGAP program ... think I had it working before.

8
Unrelated note, but how exactly do you run the MTGAP program? I'm using a mac and I'm not exactly tech savy (I have gcc installed with homebrew but I can't seem to run "Makefile").

Do you mean from
https://github.com/michaeldickens/Typing

Will have a go and revert.

9
Yes I know you're probably sick of this by now, but I feel that if we are developing layouts for English, then the input texts should be as close to generalised English as possible.

Was not happy with this, and found out why when I tried whittling the list down to 10 or 15. Did it in stages and noticed the order jumping around too much, so modified program to scale the four test results in range 0 - 1, then took average of each input text, high average wins. Also pushed the "length" in Jaro-Winkler to 28, effectively matching against  " etaoinsrhldcumpfg.ywb-,v0k1" (sans quotes).
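
The scale-then-average step might look roughly like this. It assumes plain min-max scaling per test across all input texts and that higher is better on every test; the original program may scale differently. Names and numbers are illustrative only:

```python
def scale_and_average(results):
    """results: {text_name: [score_test1, score_test2, ...]} -> {text_name: average of scaled scores}."""
    names = list(results)
    # One column per test, holding that test's score for every text.
    columns = list(zip(*(results[n] for n in names)))
    scaled = {n: [] for n in names}
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        for n, value in zip(names, col):
            # Map the worst score to 0 and the best to 1 (all 0 if every text ties).
            scaled[n].append((value - lo) / span if span else 0.0)
    return {n: sum(vals) / len(vals) for n, vals in scaled.items()}

# Two tests only, to keep the toy example short.
averages = scale_and_average({"a": [10, 2], "b": [20, 4], "c": [30, 0]})
best = max(averages, key=averages.get)  # high average wins
```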

Results attached. Putins-resistance is "part of the resistance" text pasted onto bottom of Putin's speech text. The combo is better than each individually.

Now been scratching my head about a corpus for code, and getting nowhere.

Current code tests are various sample tasks in an assortment of languages, borrowed from RosettaCode, as well as Google home page (not real world typing), plus Keyboard Layout Editor single-page app.

I think we should be using samples from the most-used languages. Problem is determining which those are. Our local tech website regularly publishes a pair of lists of "top" languages, but we're all sceptical of those lists because of how they are generated. Current lists here: https://mybroadband.co.za/news/software/274403-python-climbing-most-popular-programming-languages-list.html and screengrab attached.

StackOverflow did their own survey, results here: https://insights.stackoverflow.com/survey/2018/  , scroll down to Most Popular Technologies, which gives what is probably a more real-world list.

The part that worries me about these is the absence of things like COBOL and Fortran and possibly even Ada ... and I suspect they are missing because programmers in these languages don't hang out on StackOverflow or need to run Google searches on how to sort an array ... they already know what they're doing.

So I think I will go with the top end of the StackOverflow list, down to "C". Typescript is very similar to Javascript which is first.

Started poking around for samples on GitHub, then the "James Bond" problem reared its head again.

1. pretty comment separators eg /* ------------------------------------------------------------------------------*/
which are going to mess up the character distribution analysis.

2. for example, in CSS, repeated use of various phrases, eg
Code:
.hvr-pulse-shrink:hover, .hvr-pulse-shrink:focus, .hvr-pulse-shrink:active {
  -webkit-animation-name: hvr-pulse-shrink;
  animation-name: hvr-pulse-shrink;
  -webkit-animation-duration: 0.3s;
  animation-duration: 0.3s;
  -webkit-animation-timing-function: linear;
  animation-timing-function: linear;
  -webkit-animation-iteration-count: infinite;
  animation-iteration-count: infinite;
  -webkit-animation-direction: alternate;
  animation-direction: alternate;
}
which introduces a surplus of w, b and k, which are less-common letters normally, as well as "-".

or repeated use of variable or class names, eg
Code:
.token.property,
.token.tag,
.token.boolean,
.token.number,
.token.constant,
.token.symbol,
.token.deleted {
color: #905;
}

or
Code:
def parse_args():
  parser = argparse.ArgumentParser(description='Run Electron tests')
  parser.add_argument('--use_instrumented_asar',
                      help='Run tests with coverage instructed asar file',
                      action='store_true',
                      required=False)
  parser.add_argument('--rebuild_native_modules',
                      help='Rebuild native modules used by specs',
                      action='store_true',
                      required=False)
  parser.add_argument('--ci',
                      help='Run tests in CI mode',
                      action='store_true',
                      required=False)
  parser.add_argument('-g', '--grep',
                      help='Only run tests matching <pattern>',
                      metavar='pattern',
                      required=False)
  parser.add_argument('-i', '--invert',
                      help='Inverts --grep matches',
                      action='store_true',
                      required=False)
  parser.add_argument('-v', '--verbose',
                      action='store_true',
                      help='Prints the output of the subprocesses')
  parser.add_argument('-c', '--configuration',
                      help='Build configuration to run tests against',
                      default='D',
                      required=False)
  return parser.parse_args()

So the various samples chosen are going to skew the character distribution in different directions, the only way around that is to have a large number of source projects and only take small bits from each, and hope that it all balances out in the end.
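
One way the "small bits from many projects" idea could be tried is sketched below. All of it is assumption: the 60% threshold for spotting decorative separator lines, the 20 lines per source, and the function names are arbitrary choices for illustration, not a tested method.

```python
import random

def is_separator(line, threshold=0.6):
    """Guess whether a line is a decorative separator (one character dominating)."""
    stripped = line.strip()
    if len(stripped) < 10:
        return False
    most = max(stripped.count(c) for c in set(stripped))
    return most / len(stripped) >= threshold

def sample_snippets(sources, lines_per_source=20, seed=0):
    """Take one short run of consecutive lines from each source text."""
    rng = random.Random(seed)  # fixed seed so the corpus is reproducible
    corpus = []
    for text in sources:
        lines = [ln for ln in text.splitlines() if not is_separator(ln)]
        if len(lines) <= lines_per_source:
            start = 0
        else:
            start = rng.randrange(len(lines) - lines_per_source + 1)
        corpus.extend(lines[start:start + lines_per_source])
    return "\n".join(corpus)
```

This does nothing about repeated identifiers inside a snippet (the `.token.` or `parser.add_argument` cases above); it only limits how much any one project can contribute.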

Am open to better ideas at this point ....

Thanks, Ian


10
For the next round of keyboard tests, I will be using these for English:

Yes I know you're probably sick of this by now, but I feel that if we are developing layouts for English, then the input texts should be as close to generalised English as possible.

So I went back to the drawing board a bit, added some new tests, dropped two, then ranked the layouts for each test, and added up their ranks.
sum-of-ranks == Score, and lowest Score wins.

Tests used:

1. my "similarity" test, which is not a generalised "how similar are two strings" test but rather calculates a score based on the letter frequency distribution in English.

2. modified Levenshtein distance, the Damerau-Levenshtein. Code borrowed from https://github.com/Oefenweb/damerau-levenshtein/blob/master/src/DamerauLevenshtein.php
I use 95 - this number, so that higher results are better. 95 being the number of printable characters on ANSI-US.
https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance

3. Jaro-Winkler distance, code borrowed from https://stackoverflow.com/questions/16925150/php-string-comparison-and-similarity-index/38236357#38236357
The "length" is set to 10, which translates to " etaoinsrh". Maybe this needs to be higher.
https://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance

4. Sørensen–Dice coefficient, code borrowed from https://www.programmingalgorithms.com/algorithm/s%C3%B8rensen%E2%80%93dice-coefficient
https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient

Spreadsheet with prose scores attached.
I suppose the top 15 or 20 would suffice for real-world testing, although the first two are "heavy".

I see Alice still falls way down the list.

To-Do: something similar for code.


11
Sure as purely phonetics, the two sounds are distinct. However, in English the two are (mostly) interchangeable allophones. Thus, English only needs one new letter for TH. Then other languages can invent their own method (on top of the new letter, most likely) if they really need to differentiate between hard and soft.

So in-between all the other side projects running in my head, I'm busy rethinking the alphabet, wondering whether to re-introduce eth and thorn and eng (eth ðÐ, thorn þÞ, eng ŋŊ) (my initial rant is at https://iandoug.com/?p=235 ).

Reading various things led to this:

1. Ye (as in Ye Olde Curiosity Shoppe) should be pronounced "the" not jee. The y was actually thorn written badly so it looked like a y, but at some point in the middle ages they started using th instead, and that stuck.

2. today I realised that possibly we went in the other direction as well ... with "thou" for "you". So "you" should actually be pronounced "thou", which ties up nicely with "them" and "their" ... all referring to other people. Not biblical at all.

Which leads to : replace th with y (or thorn) again. Use other letter(s) for y. For example:
1. ugly -> uglee
2. quietly -> quietlee
3. fly -> fli
4. dunno what to do about things like yellow ...--> jellow ? jesterday ?
5. hard g and soft g can be split to use g and z. Most current uses of z are either x or s anyway?

12
Is there a list anywhere of the differences between versions 1,2,3 of Den KLA?

Was waiting for Den to reply. I'm sure I wrote a summary for the first few, but can't find it where I thought it would be ...
So briefly:

1. Patrick / KLA original: measures distance travelled (plus finger usage etc, with particular weighting for each finger).
2. Den 1 (AFAIK): same as above, but includes vertical distance for each keypress, at 4mm down and 4mm up.
3. Den 2: as for Den 1 but changed weightings on fingers, think started punishing pinky use more. This version not used much and superseded by Den 3.
4. Den 3: as for Den 2 but punished inner column usage or in general, horizontal movements of fingers. So ortholinear layouts score much higher. Also includes some word-based metrics.
5. Den 4: as for Den 3 but other changes which are now surfacing... :-), like punishing left hand.

Regret have not looked at the code for a while, it should mostly be in analyze.js

FWIW I'm "trying" to get going on a comprehensive analyzer which will offer the option of picking analysis models/parameters, but this will be stand-alone program not browser-based. Browser/JS leaks too much memory and can't handle large inputs, or multiple inputs, etc. and writing logfiles is becoming problematic due to increased sandboxing. I have the basic idea in my head but want to do it in Ada (because.) which I don't know yet. So big learning curve. :-)
In theory the GUI should work on Linux/Windows/Mac (eventually).
In theory it would allow you to import some layout from KLE and analyze that, rather than relying on the preset form factors in KLA.

I'll probably end up asking for design inputs here (or on Den's new platform).

13
Keyboards and Other Interfaces / Another input text
« on: 2018-Sep-07 04:23 »
The Yanks are probably well aware of the anti-Trump rant posted by the New York Times.
https://www.nytimes.com/2018/09/05/opinion/trump-white-house-anonymous-resistance.html

As it happens, it does reasonably well as an input test for English.

1. Currently second best on PHP's "similar_text" function (after the Logical Argument FAQ)
2. Currently third best on Levenshtein distance test, after Economic Consequences of Peace, and Wikipedia Constantine article.
3. Not so good on my "similarity" test, presumably because it only uses 54 of the 95 chars on the keyboard.

Char frequency order is
⍽etiaonsrhdlcumpfwgy.vb,k-T'AIxMWRP"BOSH:KjJD2CG5VFzNU   (complete).

This is the "cleaned up" version, all non-ASCII chars replaced with ASCII equivalents.
At only 5440 bytes long, it is also one of the shorter tests, making for enhanced work flow... :-)

Attached.

14
yes. left hand 1.54 times more effort cost.

Mmm .... since when?

thanks, Ian

15

It would help if you specified which version of KLA you are using :-)

- How does KLA calculate distance for row changes? More specifically for the keys above and below the index finger's resting position.

- How does KLA calculate word difficulty?

- Does KLA strongly favor the right hand?

- Does KLA glitch out when using a really small input text (say less than 5 characters)?

I'm asking because I tested an input text of (without quotes) "rst" and the results were a bit odd. For example it put MTGAP over Colemak for that one trigram and I know for a fact that "rst" is easier to type with ring-middle-index than it is to type it with pinky-ring-middle.

Den will probably have better answers, but:
1. On ANSI/ISO, the stagger is not symmetrical, so the distance (on QWERTY) from J to U is less than the distance from J to M (measured centre to centre). Hence keys on top row favoured over keys on bottom.

2. see Den... :-)

3. Not that I know of. Might vary between versions. But be aware that on ANSI/ISO, right Shift key is further from right pinky than for left hand. So any caps on left hand get extra penalty. Part of the reason why ANSI/ISO form factor sucks.

4. Depending on version... Den's latest versions look at the whole keyboard (AFAIK) not just the input text. Den? So factors other than input text may influence the score.
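
Point 1 can be checked with a bit of geometry. The sketch below assumes the usual ANSI row offsets, taken from the widths of Tab (1.5u), Caps Lock (1.75u) and left Shift (2.25u), and measures centre to centre in key units (1u is normally 19.05mm). It is not taken from the KLA code.

```python
import math

# Width of the key to the left of each alpha row, in key units.
ROW_OFFSET_U = {"top": 1.5, "home": 1.75, "bottom": 2.25}
ROWS = {"top": "qwertyuiop", "home": "asdfghjkl;", "bottom": "zxcvbnm,./"}
ROW_Y = {"top": 0, "home": 1, "bottom": 2}

def key_centre(ch):
    """Centre of a QWERTY key as (x, y) in key units."""
    for row, keys in ROWS.items():
        if ch in keys:
            return ROW_OFFSET_U[row] + keys.index(ch) + 0.5, ROW_Y[row]
    raise ValueError("key not in the three alpha rows: %r" % ch)

def distance_u(a, b):
    (x1, y1), (x2, y2) = key_centre(a), key_centre(b)
    return math.hypot(x2 - x1, y2 - y1)

j_to_u = distance_u("j", "u")  # 0.25u sideways, 1u up
j_to_m = distance_u("j", "m")  # 0.5u sideways, 1u down
```

With these offsets J to U comes out at about 1.03u and J to M at about 1.12u, so the top row is the shorter reach.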

That said, I am also a bit puzzled by the rankings on both KLA3 and Klatest. On Den1, with input rst, Colemak gets 98 and MTGap gets 127. So possibly you highlighted a bug in the scoring..? :-)

16
3. Consequence 2: the navigation buttons in the centre also suck. Same issues: in theory ambidextrous, in reality awkward. So I'm contemplating a joystick for the arrow keys. I think a playstation-style will maybe work better than a thumbpoint style, but it must be switches not potentiometers, and struggling to find that (and with decent switch life -- 10k presses ain't gonna cut it.)

Am a bit late to this idea it seems.

https://www.massdrop.com/buy/tex-yoda-ii-mechanical-keyboard-kit

Though in fairness they're aping IBM's laptops with the trackpoint-is-your-mouse while I'm aiming at joystick-is-your-arrow-keys.

17
1. US keyboards don't have AltGr key.

Actually they do, it's just not defined as such in the keyboard drivers/layouts. But if you load custom layout, then you can define it.

At least on Linux, and I assume on Windows...

Cheers, Ian

18

@Den, see spreadsheet. From Jones 2004 analysis, I added totals and calculated percentages and sorted. Other analysis I've seen puts percentage for "e" around 9.xx rather than over 10.
Comments? You got better numbers?
Need to figure out how to assign a value for "space" percentage.

I want to redo my "similarity" calculations using actual frequency of character rather than just its relative position.

Okay sorted. I took the space percentage from the other analysis above (17.xx), then recalculated based on adding that percentage of spaces to the Jones data. Their letter frequencies make sense once you realise they are excluding space.
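
A guess at how that recalculation could go, for anyone wanting to repeat it: given letter counts that exclude space, add enough spaces that space ends up as the wanted percentage of the new total, then recompute every percentage. The function name is made up and the 20% below is a placeholder, not the 17.xx figure.

```python
def add_space(letter_counts, space_pct):
    """Return percentages for letters plus space, with space fixed at space_pct."""
    letters_total = sum(letter_counts.values())
    # Solve space / (letters + space) = space_pct / 100 for the space count.
    space_count = letters_total * space_pct / (100.0 - space_pct)
    total = letters_total + space_count
    pct = {ch: 100.0 * n / total for ch, n in letter_counts.items()}
    pct[" "] = 100.0 * space_count / total
    return pct

# Toy counts: 60% "e" and 40% "t" when space is excluded.
with_space = add_space({"e": 60, "t": 40}, space_pct=20.0)
```

Every letter percentage shrinks by the same factor, (100 - space_pct) / 100, which is why a figure over 10 for "e" without space can drop to around 8.5 to 9 once space is counted.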

So that produces attached. Figures are as per PHP similar_text function, 95 - Levenshtein distance, Similarity % (to known frequency) and the actual score achieved.
The Known Frequency score for this exercise is:
 Ideal is 8435.3245409231

Observations:
1. scores are much closer than previous method, mainly because differences between fractions are mostly less than 1, while previous method had stepsize of 1.
2. t and a are only around 0.5 apart, and similarly a o i n are within about a 0.5 range. So that would explain why texts with slight variations in the expected etaoin order produce close scores.

So guess will use fable of the keys for further playing around with English layouts. :-)

19
http://www.fitaly.com/board/domper3/posts/136.html

puts e at 8.5 and space at 17% ....

20
Also if something better comes along, will use that ... the current cut-off is a score of 93% or better on weighted similarity to English letter frequency order.

Found a few things which may or may not be useful. Some books and things are no longer in copyright... numbers are my "similarity" score. Current top score is 97.34%
1. Economic Consequences of Peace. 97.3 ⍽etaoinrshldcufmpy,wgbv0.-1TG (etc)
2. Supply and Demand 96.47 ⍽etoainsrhlcdumfpwygb,.vkTI;x (etc)
3. The Curse of QWERTY 94.67  ⍽etonarsihdlcmuypfgw,bvk.T-E (etc)

Problem is that the first two, while scoring well, are bloody large. 400K and 300K, which takes a while for KLA to process. (3) is 23K which is better.
Attached.

@Den, see spreadsheet. From Jones 2004 analysis, I added totals and calculated percentages and sorted. Other analysis I've seen puts percentage for "e" around 9.xx rather than over 10.
Comments? You got better numbers?
Need to figure out how to assign a value for "space" percentage.

I want to redo my "similarity" calculations using actual frequency of character rather than just its relative position.

21
For the next round of keyboard tests, I will be using these for English:

the-fable-of-the-keys.txt
quotes.txt
war-is-a-racket.txt
keyboards.txt
constantine.txt
Daode-jing.txt
Paul-Atreides-Wikipedia.txt
magna-carta-english.txt
logical.txt
tlp.txt  (The Little Prince)
blushing-morel.txt
animalfarm.txt
american-states.txt
typing-champ-2.txt
classiccollection.txt
putins-speech.txt
1984-chapter-1.txt
A-Nice-Cup-of-Tea-by-George-Orwell.txt
nightfall.txt
dragonboy.txt
origin.txt


Note: no Alice, or word lists.
Also if something better comes along, will use that ... the current cut-off is a score of 93% or better on weighted similarity to English letter frequency order.

22
Cleaned up my various corpus folders a bit. Discovered bug in analysis program (well I think it's actually a PHP bug not a coding bug, but coded around it anyway ... bottom line is that it was dropping the character zero ("0") from the analysis strings.)

Spreadsheet attached with analysis, sorted by similarity to known character frequency for English.

Two additional texts included. Also cleaned up Blushing Morel and The Crucible a bit.
@Den: have not included the other Academic texts yet ... they need some cleaning up and I've had enough of that for today ;-)
Also my sorting is a bit different to yours.... put the academic/technical stuff with prose, because I didn't know what to do with quotes/lyrics/poems etc, so just lumped all together.

The two new ones score well... I've had Smedley's rant for a while but not used it much, and grabbed Atreides today after being motivated/inspired by the MTG/Ixalan spoilers text. Was surprised at both their scores.



23
top few.

24
Keyboards and Other Interfaces / Ideal input....
« on: 2018-Aug-25 21:48 »
Couldn't sleep so wrote some code ...

It dawned on me a few days ago that neither PHP's "similar_text" function, nor Levenshtein function, was giving the correct evaluation of the character frequency in the input files.

For our purposes, we want the similarity to be skewed in favour of the most frequent characters. ie in


space  etaoinsrhldcumpfg.ywb-,v0k1TAIS2C'"/3ED9:MN=RP;4OB5)L(HFx8W67G_UjqzJ<?Y@*VK!|$~[]%X&+#QZ}{>`\^


It doesn't really matter what order \^ are in, but it does matter what order etaoin is in.

So I wrote some code...
1. There are 95 characters.
2. Give each character a "power" based on its position in the string above. So space is 95, e is 94, etc.
3. Calculate the frequency list for each input text.
4. For each character in the frequency list, multiply its position (counting down from 95) by its power calculated in (2).
5. Add these all up and produce a percentage out of the ideal score.
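
The steps above, sketched for an ideal string of any length (pass the full 95-character string to get the real thing). Two assumptions that the steps do not spell out: characters not in the ideal string are skipped, and characters with equal counts keep first-seen order. The function name is invented.

```python
from collections import Counter

def similarity_pct(ideal_order, text):
    n = len(ideal_order)
    # Step 2: first character gets power n, the next n - 1, and so on.
    power = {ch: n - i for i, ch in enumerate(ideal_order)}
    # Step 3: the text's own frequency order, most common first.
    observed = [ch for ch, _ in Counter(text).most_common() if ch in power]
    # Step 4: position (counting down from n) times power.
    score = sum(power[ch] * (n - pos) for pos, ch in enumerate(observed))
    # Step 5: the ideal score is every character in its ideal position.
    ideal = sum(k * k for k in range(1, n + 1))
    return 100.0 * score / ideal

# Toy ideal order of three characters.
perfect = similarity_pct("eta", "eeetta")   # order e, t, a
reversed_ = similarity_pct("eta", "aaatte")  # order a, t, e
```

Because the weights multiply, getting the first few characters in the right place matters far more than the tail; for 95 characters the ideal score is 1² + 2² + ... + 95² = 290320.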

Which produces the attached spreadsheet, sorted according to this metric.

And surprisingly, my favourite input text, Putin's speech, is not at the top. In a bizarre twist of fate, it is the famous paper praising QWERTY ....
@Den: those two Sci-Fi pieces do quite well. (20 and 23 out of 126). Wikipedia Keyboards (3rd)  and Constantine (5th) very good. (note: these are my cleaned-up versions. Well at least Constantine.)

FWIW, Alice only comes in at number 40.

I should make another zip with all the inputs.

ToDo: do this separately for code and numpad tests. Problem is the numpad tests include chars not on the numpad (comma, colon, quote, space).

Note: There might be some random "other" files in this list, which are lurking in the /presets folder for some reason.


25
Keyboards and Other Interfaces / common linux commands
« on: 2018-Aug-25 07:34 »
Attached file has 50 or so common Linux commands, x 10, wordwrapped.

Should probably have a few more with switch options.

X7.1 is best on KLA3, while BEAKL PLLT x1 is best on KLatest, by quite a margin.

Linux/Unix commands often have letter combinations that don't occur in normal English.

I suppose should do the same for Windows universe .... do people still use command prompt a lot on Windows?

26
I was messing around with English.

Have you tried swapping the d and f on your tweaked version?

f and g have very similar frequency in English. So you need to check which bigrams show up on the same finger.

27
J is very common in English names. Can't be called the best layout if common names are hard to type.

For that matter, list of names should be another special category of corpus to consider (just like source code.)

Attached list of full names, made by combining previous list of 482 first names with the 482 most common surnames in UK.
In theory it should have been done with more Smiths and Joneses etc in proportion to their commonness.

But good enough for our purposes. I see the capital letter distribution is now better, with J less frequent. Curious that lower t is relatively low down compared to full English frequency.

Letter frequency list:


 ea,nrliostyhdmcuAMBgHSCkwJLbRvEWGPDpFKNTxOfzIVZXj'YQ


Mm..... missing two letters.... q and U ....

28
and here's where I'm currently at with tweaks:

Is this for English or French?

29

As a reminder, this is what we want:  etaoinsrhldcumpfg.ywb-,v0k1T (scores were based on the full frequency list, not just this snippet)


Finally found something ... extract from Chapter 1 of Darwin's On the Origin of Species.


31.25 71 etaoinsrhdlcmfubpgy,wv.k-I;TxESACRHOMBNqFWDjPz":GLV()?U!KJ'Y2613


Which is correct up to "h", ie first 10 letters including space, and next few are not too bad.
Sourced from Gutenberg.

30
J is very common in English names. Can't be called the best layout if common names are hard to type.

For that matter, list of names should be another special category of corpus to consider (just like source code.)

Attached sorted, linewrapped, comma-separated list of common US and UK baby names, male and female. Curiously, in UK, John is now in middle 50s, while top boy's name is Mohammed ...

I should maybe remove the commas because character frequency is

, space a e i n l r o y s t


Which is very different to regular English, even apart from the comma.

FWIW X7.1 does well with this list on both KLA3 and Klatest. I assume because of the commas.

I see "J" is second most common capital after "A".

31
I don't think the "subject" in a suitably sized text sample would be significant. Certainly, the vocabulary of the author and their grammatical style (genre) will directly affect the metrics -- not necessarily a bad thing as I have used collections of my own written material in klatest to evaluate the results against Den's selections to further validate the utility of a layout for my particular use case.

 I was thinking of things like "James slowly pulled out his Beretta." "Careful, Bond" said Miss Moneypenny. Bond shrugged and returned the pistol to its holster.
etc. "J" is normally rare, but not in a James Bond or Jason Bourne novel.

I don't know where I stumbled upon these google word frequency files but I have two 100K word frequency lists, one of word usage culled from the web (page content), the other from the google books library, which I use to analyze bigram and trigram frequencies. A mechanism to input these lists would be handy -- the alternative being to simply expand them out by frequency (you'd need to reduce by a suitable factor as, for example, THE in the mentioned frequency files comes in at 23,135,851,162 and 53,097,401,461 respectively, not to mention combining them!).

Attached is a few years old. Can't remember where  I found it either... :-)


32
I included Nightfall and Dragonboy to test different genres. Namely sci-fi and fantasy. So if you were a writer in these genres, these corpora are worth consideration.

I think the problem with any given text (prose or scientific papers) is that subject (eg hero of story, chemical or plant discussed in paper) will get used more than normal, and depending on what it is, that's going to skew the character distribution.

I had the bright idea of generating the "perfect" test piece, by working backwards from the character distribution, eg 1000 spaces, 600 (/whatever) "e", 400 (/whatever) "t", etc, then use bigram, trigram, quadgram etc frequency lists plus common words lists to generate text that is English-like and uses all the characters in the pool.

But got stuck at that point, figuring out the next move... :-)

The "bigrams" test is actually a good starting point, the numbers look like this:

bigrams.txt 34.426 76  etaoinrslchdumpgfybwvkxjzq


where the 34.426 is the similarity score as per samples above, and the 76 is the Levenshtein distance from ideal distribution. But only has lower case letters and no punctuation.
At least the e t a o i n part is correct.

(the Logical Argument text above has a Levenshtein distance of 72.)
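
For reference, the distance being quoted can be reproduced along these lines. This is the plain Levenshtein distance written out by hand (PHP has it built in as `levenshtein`), applied to frequency-order strings; the helper names are made up.

```python
from collections import Counter

def frequency_order(text):
    """Characters of the text, most frequent first, as one string."""
    return "".join(ch for ch, _ in Counter(text).most_common())

def levenshtein(a, b):
    """Minimum number of single-character inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute (free if equal)
        prev = cur
    return prev[-1]

# Two neighbours swapped costs 2 here; the Damerau variant would count it as 1.
swap_cost = levenshtein("etaoin", "etoain")
```

A short frequency string such as the 27-character bigrams one is at least 68 away from the 95-character ideal purely on length, which accounts for most of that 76.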

33
I see you've added some new input texts ... I'll run them through my analysers to see how their letter frequency looks.

Well that took way longer than I thought it would.
I had to clean up the texts a bit, to get rid of non-ANSI characters (assorted diacritics, Greek, bullets, em-dash and en-dash, etc ...)

Anyway, compared to "standard letter frequency for English", we have the following "similarity" (using PHP "similar_text" function):
Also first few letters in frequency (up to first Capital)

1. Constructing a Logical Argument, from the old alt.atheism FAQ: 37.437    etaoisnrhlcumfdpgyw.bvT
2. Putin's speech: 36.025    etaonisrhldcumfgpyw,bv.kI
3. Daode-jing: 35.503    etoinasrhldu.cgfywmpvbT
4. A nice cup of tea (George Orwell): 33.540  etoainsrhludyfwcgpbm,.kv-I
5. Constantine (Wikipedia): 33.136   etnaiosrhlducmfpgw,ybC
... (skip a few)
6. Keyboards (Wikipedia): 31.214   etaosrindchlyumpkbfgw,.vT
skip more
7. Nightfall: 25.926   etoanishrlducfmwyg.pb',vk-I
skip more
8. Dragonboy: 23.171   etanohsridlgucwmfyb.,vp"k'T

As a reminder, this is what we want:  etaoinsrhldcumpfg.ywb-,v0k1T (scores were based on the full frequency list, not just this snippet)

So I would not recommend either Nightfall or Dragonboy as good for optimising for English.




34
Code:
KLAtest I just made the thumb penalty score almost as bad as pinky. now you can see the difference in finger usage versus space on thumb.

Just had a look at how this scores, using Putin's speech as base.

First observation is that X7.1 is knocked down the list, beaten by BEAKL 15 and Opted 4 Ergo Alt.
Second is that your new layout PLLT x1 does not beat any of them. Which surprised me because I was expecting it at #1. (see later)
Second surprise is that my mods to your BEAKL 15 is still best ... was not expecting that. (ditto)

Then did you change the way the "words" score is done? IIRC on previous scoring the best were single-digits (eg 6/7/8/9) but now they're less than 1?

I see you've added some new input texts ... I'll run them through my analysers to see how their letter frequency looks.

Played some more, using different inputs, and the ordering is all over the place ... eg top six for different inputs:
1. Putin
BEAKL 15 Matrix mod Ian
BEAKL Opted4 Ergo Alt   +4
BEAKL 15 Matrix   +4
X7.1H Ergolinear   +6
BEAKL PLLT x1   +9
BEAKL Stretch   +17


Alice:
BEAKL 15 Matrix mod Ian
BEAKL PLLT x1   +7
BEAKL Opted4 Ergo Alt  +9
X7.1H Ergolinear  +9
BEAKL 15 Matrix  +10
P_RN   +17


Nightfall:
BEAKL PLLT x1
BEAKL 15 Matrix mod Ian  +12
BEAKL Opted4 Ergo Alt  +13
X7.1H Ergolinear  +16
BEAKL 15 Matrix  +17
P_RN  +18

Dragonboy:
BEAKL 15 Matrix mod Ian
BEAKL Opted4 Ergo Alt  +4
X7.1H Ergolinear  +4
BEAKL 15 Matrix  +6
P_RN  +11
BEAKL PLLT x1  +11


It looks like BEAKL PLLT was optimised for Nightfall ... either that, or there's something odd about the scoring calcs?
These are all essentially English prose tests, so shouldn't the general pattern of the scoring, and differences between layouts, be largely the same, with some differences?
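
To make the "all over the place" point concrete, here is a rough sketch that tabulates where each layout lands in the four top-six lists above. The orderings are copied from those lists as given (ties shown by equal "+" values are kept in listed order). The helper names `positions` and `spread` are made up for illustration and have nothing to do with how KLAtest scores.

```python
# Top-six orderings per input text, copied from the lists above.
TOP_SIX = {
    "Putin": [
        "BEAKL 15 Matrix mod Ian", "BEAKL Opted4 Ergo Alt", "BEAKL 15 Matrix",
        "X7.1H Ergolinear", "BEAKL PLLT x1", "BEAKL Stretch",
    ],
    "Alice": [
        "BEAKL 15 Matrix mod Ian", "BEAKL PLLT x1", "BEAKL Opted4 Ergo Alt",
        "X7.1H Ergolinear", "BEAKL 15 Matrix", "P_RN",
    ],
    "Nightfall": [
        "BEAKL PLLT x1", "BEAKL 15 Matrix mod Ian", "BEAKL Opted4 Ergo Alt",
        "X7.1H Ergolinear", "BEAKL 15 Matrix", "P_RN",
    ],
    "Dragonboy": [
        "BEAKL 15 Matrix mod Ian", "BEAKL Opted4 Ergo Alt", "X7.1H Ergolinear",
        "BEAKL 15 Matrix", "P_RN", "BEAKL PLLT x1",
    ],
}


def positions(layout, rankings=TOP_SIX):
    """Map each input text to the layout's 1-based position (None if not in the top six)."""
    return {
        text: (order.index(layout) + 1 if layout in order else None)
        for text, order in rankings.items()
    }


def spread(layout, rankings=TOP_SIX):
    """Worst minus best position, over the inputs where the layout is listed."""
    listed = [p for p in positions(layout, rankings).values() if p is not None]
    return max(listed) - min(listed) if listed else None


if __name__ == "__main__":
    for name in ("BEAKL 15 Matrix mod Ian", "X7.1H Ergolinear", "BEAKL PLLT x1"):
        print(name, positions(name), "spread:", spread(name))
```

A layout whose spread is small behaves consistently across the prose inputs; a large spread is the kind of swing being questioned here.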

Back when we started we could count on MTGap doing very well regardless of the English input tests. ie scores from different prose tests should not differ radically?
Or am I missing something?

Cheers, Ian

35
KLAtest I just made the thumb penalty score almost as bad as pinky. now you can see the difference in finger usage versus space on thumb.

Um, so you are abandoning Arensito and Malt theory in favour of Scholes theory, and want to basically type with two/three fingers each hand? :-)

Just checking. :-)

36
brewing a combination pinkyless + low thumb usage.

Thumb use probably too high for you. But does better than X7.1 on at least some of the inputs. (KLAtest scoring)

Work in progress. Don't laugh about the Shift keys ... was only thing that worked.

I think in some of the CLP/Essie layouts we had space on right middle or somesuch, which created enormous problems about what to put above and below, to avoid high same-finger.
Net result was that those keys became effectively underused.

So am not sure that taking space off thumb will be ideal.


37

3. Fewest keys to cover ASCII visible characters.  95 keys to start. but digits will be accessed on numpad or number row. So only need 85 chars. so need 29 keys with 3 layers, or 22 keys with 4 layers.


Can you clarify re digits? (apart from numpad). You want separate row for digits? But digits only, no puncts on them?

thanks, Ian

38
1. No pinkies for typing. One may reassign other utility functions, like editing or navigation.
2. Low thumb. Ideally 5% usage per thumb. so no space. Probably modifiers including shift and altgr, due to next point.
3. Fewest keys to cover ASCII visible characters.  95 keys to start. but digits will be accessed on numpad or number row. So only need 85 chars. so need 29 keys with 3 layers, or 22 keys with 4 layers.

Given these constraints, then, pinkyless suggests 22 keys on 4 layers. this is doable with just shift and altgr, the home block, plus two more keys per hand. probably home pinky and index inside column. (if making room for arrows on bottom left hand, left hand will also need another index inside key.)
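
The key counts in point 3 follow from dividing the characters to be covered by the number of layers and rounding up. A quick sketch of that arithmetic, assuming the 95 figure is the ASCII printable range including space and that exactly the ten digits are moved elsewhere (the function name `keys_needed` is just for illustration):

```python
import math

ASCII_VISIBLE = 95   # printable ASCII, 0x20 through 0x7E, as counted in the post
DIGITS = 10          # 0-9, assumed to live on the numpad or number row


def keys_needed(characters, layers):
    """Smallest number of keys so that keys * layers covers all characters."""
    return math.ceil(characters / layers)


if __name__ == "__main__":
    remaining = ASCII_VISIBLE - DIGITS
    for layers in (3, 4):
        print(layers, "layers:", keys_needed(remaining, layers), "keys")
```

With 4 layers on 22 keys there are 88 slots for 85 characters, so three slots are left spare.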

This post may end up being quite long. Read twice. :-)

Coincidentally I've gone back to the drawing board as well, now that I've identified some issues with ErgoLinear 1 form factor.

1. Years ago when I first saw the Maltron boards, I thought the central numpad was a good idea. Suitable for both lefties and righties. So that got integrated into my thinking at some deep level.
Now however, having built one and tested it a bit, I feel that it's like trying to make a shoe that fits both feet. What seems like "good for either" is actually "good for neither".

I wanted a lefty-righty layout for possible selling purposes. However it would be better to have proper lefty designs, just like lefty scissors. One design does not suit both. And since I'm right-handed, the num pad goes back on the right.

I like to have a numpad for when I need it, and don't like extra separate ones.

2. Consequence: The F-keys, which were integrated into the central numpad, now need to come back. I use F2/F3/F4 quite a bit, the others hardly ever. But I accept that some other users/programs may use them a lot. On the plus side, they take over the 4th row so are much closer than on ANSI/ISO.

3. Consequence 2: the navigation buttons in the centre also suck. Same issues: in theory ambidextrous, in reality awkward. So I'm contemplating a joystick for the arrow keys. I think a playstation-style will maybe work better than a thumbpoint style, but it must be switches not potentiometers, and I'm struggling to find that (and with decent switch life -- 10k presses ain't gonna cut it).

4. Consequence 3: other nav keys also need to move. I need shift close to arrows (sorted), and ctrl-home or ctrl-end must also be easy. So should shift-home and shift-end. So have ended up with something similar/borrowed from Den's matrix layouts, but split between hands. The hand can drop down and work them with index and middle, while thumb can hit shift and pinky hit ctrl.

5. The two thumb keys next to each other were awkward, particularly turning the thumb in. So moved it down and out for more natural thumb movement.

6. Escape key moves to centre, and added Compose key. Don't know if the Emacs/Vim users use Escape, I know they complain about `~ key position.

7. Put backspace back in similar place to ANSI/ISO mainly due to muscle memory looking for it there. Put Delete in mirror spot on left, dunno if viable.

8. Layout now X7.1. I know the score on KLAtest was good, but my mind is still not convinced that h on AltGr is such a good idea. Time will tell.

9. X7.1 actually goes quite a way to meeting your requirements above .... the pinky only has minor characters (except when programming C-style languages). To meet your requirements I think you will have to put all/most of the punctuation on AltGr and Shift-AltGr. I'll play around with designs like that in the next few days once I catch up with current tasks.

Screenshots of current thinking (work in progress), with gamepad style joystick and thumbstick style.

[Aside: still struggling with xkb to get it to do all I want, in particular putting the Function keys on the numpad, and menu/OS on the Escape key... xkb just doesn't like what I tell it. Must be some other config file overriding my instructions.]

40
I see X7.1 addresses my same finger issues above. Will have to print some stickers...

41
Why didn't you go X7.1H?

Keys were already printed for 6.5.

This is essentially a prototype for finding what is good and bad about the form factor.
And a learning curve in hardware design and construction.

Some key choices are annoying already, maybe because I'm typing slowly and more aware of same finger than on qwerty.
eg the GOY and AD and CT. Also KE especially on this forum.

Once I adjust to this form factor and sort out the remaining hardware/software issues I will try other layouts.


42
Hi

So my X6.5 layout on ergolinear is up and running, and I'm using it to type this (slowly).

Pic attached. Assorted comments:

Current version was supposed to be finished in black leather but I damaged that layer and more trying to widen the holes to put the pins through. They were originally going to be screws. So ended up with fallback superwood which does not look that great. It is partially assembled.

My carefully calculated split angle is too steep, which combined with my short pinkies makes hitting i and r awkward.

h on space is not as awkward as I thought it would be.

Linear layout feels much better than ANSI staggered.

Tend to hit capslock by accident.

AltGr and shift on same thumb not strictly legal for KLE. Having one shift on thumb also a new experience.

ctrl-c v x z t n s r q f w p all work fine on right hand.

Still struggling to get Function keys to work.

Must still sort out LEDs... and blue one is way too bright.

Arrow keys and other nav need a rethink.

FWIW, I am basically letting QMK send qwerty scan codes and then remapping them in xkb.

Okay all this hardwired-brain-rewiring is mentally exhausting.

43
Will revert.

Discovered I had specified the wrong chip in the configs.... so now keyboard is "live". Keypresses work, most letters are okay. Some to fix, like the hH which is on the space key, but I was planning on letting xkb handle that anyway ..., and all the other AltGr letters.
And have two Ms and no W.

Two of the LEDs are on permanently, will need to figure that out. Want to use LEDs for
1. Insert Mode is on
2. Caps lock. Or possibly "AltGr is pushed" ...
3. Scroll lock (I think).

Originally had 4 LEDs, the 4th was supposed to be for the built-in Pomodoro timer (still need to figure out how to do that....)

There's no Num lock function at all.

Going to be a huge learning curve getting used to the different form factor and letter layout....



44

I suggest you post an SOS at https://www.reddit.com/r/olkb/. Builders there should be able to provide hw debugging advice.


Thanks, was wondering where the help channel was. Let me first try to make sense of Soarer's docs, as they're apparently quite good.

Will revert.

45
Keyboards and Other Interfaces / Programming the board
« on: 2018-Aug-01 14:07 »
@sdothum

Hi. Posting publicly in case someone else can help (or it can help someone else).

Finally trying to get my keyboard up and running. I had previously done some configs in QMK, and after the usual fun and games, managed to get it compiled and written to the Teensy++ 2.0.

Which bricked it. Reset it by rebooting and holding switch for 15 seconds while rebooting.

At that point I realised that QMK is not user-friendly. A keyboard config system should not have end users directly editing C code to make it work. It should work via one config file.

So then stumbled upon exactly that on DT, where Soarer had done something like that a few years back.

I followed the brief "big dummy's" guide and flashed the appropriate hex file, rebooted it, and hid_listen this time picked up the Teensy.

But pressing keys does nothing ... neither in hid_listen nor in xev.

Which is a problem because I have no idea where to start trying to debug this.

Possibilities:
1. my wiring sucks
2. the diodes are the other way around to what the programs expect
3. I fried something when soldering in the wires
4. general incompatibility between Soarer's stuff or QMK, and my grid layout.
5. I'm using a pin on the board that I'm not supposed to. Or not using one that I should be. But don't think so.

Trying to figure out where to start to find the problem. Then can look for solutions...

Any ideas welcome .. :)
I need some way to verify that the hardware is all okay and correct. Pressing keys does nothing. I would be happy with "anything" because that at least says something works.

Thanks, Ian

46
and I like the outcome.

Yes, looks promising. Some interesting ideas. Can you please post full layout when you're done? :-)

Thanks, Ian

47
It's ironic that the most common character (space) is typed by a slow finger (thumb). Wouldn't it be better to move it to a faster finger? In fact, we should ask, does the thumb deserve a role in typing (like the pinky)?

We did, see attached. Also
beakl-clp-0.en.matrix
seelpy-1-lowscore.en.ergolinear

and Schizo put space on Shift-AltGr layer
SchizoKBD AltGrSpc
SchizoKBD shifted AltGrSpc

which all my logic screams against.... :-)   [ ... mmm these two look the same ... must check if I made a mistake renaming them and lost one]

It would work for me to replace the space with numlock. To provide easier access to numpad. One might also put arrows and nav keys in the same layer, on the other hand opposite from the numpad.

I'm wondering if possibly the Kinesis form factor / key positioning is playing a part here. Would you still have the same issues on a Matrix layout?

In terms of "Malt theory" (for lack of a better term) the thumbs are strong and should be used for common letters.

In terms of "Ian theory", (for lack of a better term), the Direction of Travel for thumb keys is All Wrong. The natural motion for us chimpanzees with opposable thumbs is towards the palm, not at right angles to the plane of the palm, although clearly that is also "doable", as anyone who has played thumb drums on their desk can confirm. Or keyboard users and pianists, for that matter :-)

In my "stick keys on paper and see how it feels" experiments I came to realise that clusters like on Maltron or Kinesis were awkward ... the thumb can't easily navigate those clusters. (see other attached). Or maybe just my thumbs. So I reduced them and eventually ended up with Ergolinear structure. Where Thumb is used, but does not move sideways much, just mostly up and down.

I think the issue with Space on Thumb on ANSI/ISO layouts is that they typically make it HARDER to press than regular keys. Which makes it slower. And strains the thumbs more. I've seen keyboard designs where the designer deliberately picks a heavier switch for the thumbs. Never understood that.


48
Thumb rolls could feel better only if their size and placement exactly conform to one's hand.

I came to the same conclusion about the arrow keys the other day.
Actually the whole Ctrl/Shift/Insert/Delete/Home/End/Arrows cluster, that I use so much.

And I need it one-handed because my left hand is typically on the trackball when doing those operations.

49
The major stumbling block is how to prepare my Kinesis keyboard for thumb and arrows, per recent discussions. So I can test drive and experiment with these radical layouts.

You got USD100?

https://www.massdrop.com/buy/xd75-aluminum-mechanical-keyboard


(after reading comments on massdrop, am not so sure about this anymore... but it's certainly matrix-compatible for you....)

50
After that visit to Reddit, I have to give another downvote. I find it nigh unusable. And it's not from lack of computer experience.

Alternatives (I accept politics will play a role in your decision):

1. deskthority?
2. Google group? (groups.google.com)
3. G+ group? (but don't like that interface either. Maybe am too old-school...). These sorts of things need an easily navigable structure.
4. there's always geekhack but don't like that interface either. And I hear discussions can get quite heated... :-)
