Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements.  See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License.  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

=== Probe length: 8B ===
                            N  | --- ML ablation ----------------------------------------------------------------------------------------------------------------- | --- Baselines ------------------------------------------------- |
Charset                        | Stat R%  S%   T3%    D%    A%  | +ISO R%  S%   T3%    D%    A%  | +CJK R%  S%   T3%    D%    A%  | All  R%  S%   T3%    D%    A%  | ICU4J R% S%   T3%    D%    A%  | juniv R% S%   T3%    D%    A%  |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Big5-HKSCS              30334  |  74.0  74.0  94.1  78.3  78.3  |  73.6  73.6  89.9  81.5  81.6  |  77.3  77.3  89.9  85.0  85.1  |  77.3  77.3  89.9  85.0  85.1  |   0.0  51.5  88.5  56.2  56.5  |   0.0   0.0   0.0   7.7   8.8  |
EUC-JP                  37043  |  62.1  62.1  77.0  79.4  80.4  |  61.9  61.9  75.2  82.0  83.0  |  61.7  61.7  74.4  81.2  82.6  |  61.7  61.7  74.4  81.2  82.6  |   0.0   0.0  37.7  16.3  19.1  |   0.0   0.0   0.0  17.6  20.0  |
EUC-KR                  36883  |  66.5  66.5  86.3  73.1  73.3  |  68.0  68.0  85.6  75.5  75.7  |  67.8  67.8  85.2  75.2  75.4  |  67.8  67.8  85.2  75.2  75.4  |   0.0   0.0  63.1   6.9   7.3  |   0.0   0.0   0.0   7.3   7.8  |
GB18030                 36862  |  72.1  72.1  81.9  84.2  84.9  |  71.7  71.7  80.6  85.6  86.3  |  71.8  71.8  80.5  85.5  86.3  |  71.8  71.8  80.5  85.5  86.3  |   0.4   0.4   7.3  13.6  14.6  |   1.0   1.0   1.0  14.3  15.6  |
IBM1047                 34790  |   6.9  54.4  69.1  54.3  54.4  |   7.2  55.8  70.0  55.8  55.8  |   7.3  56.0  70.3  56.0  56.0  |   7.3  56.0  70.3  56.0  56.0  |   0.0  78.7  83.5  78.5  78.6  |   0.0   0.0   0.0   0.0   0.0  |
IBM420-ltr              36874  |  74.5  94.4  97.1  74.5  74.5  |  74.6  94.7  96.7  74.6  74.6  |  74.7  94.7  96.7  74.7  74.7  |  74.7  94.7  96.7  74.7  74.7  |   0.0  55.6  62.6   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM420-rtl              37018  |  73.2  92.5  97.3  73.2  73.2  |  73.8  93.3  97.1  73.8  73.8  |  73.9  93.4  97.1  73.9  73.9  |  73.9  93.4  97.1  73.9  73.9  |   0.0  65.2  70.5   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM424-ltr              34927  |  76.2  93.0  96.6  76.2  76.2  |   9.5  14.2  17.6   9.5   9.5  |   9.6  14.3  17.7   9.6   9.6  |   9.6  14.3  17.7   9.6   9.6  |   0.0  51.4  56.6   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM424-rtl              32984  |  82.5  93.4  97.4  82.5  82.5  |  12.3  13.3  16.8  12.3  12.3  |  12.3  13.4  16.8  12.3  12.3  |  12.3  13.4  16.8  12.3  12.3  |   0.0  56.3  59.5   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM500                  35083  |  47.6  54.4  69.3  54.4  54.4  |  48.7  55.7  70.2  55.7  55.7  |  48.9  56.0  70.5  56.0  56.0  |  48.9  56.0  70.5  56.0  56.0  |  78.7  78.7  83.4  78.7  78.7  |   0.0   0.0   0.0   0.0   0.0  |
IBM850                  34505  |  29.0  29.0  58.2  91.8  92.0  |  17.0  17.0  22.0  94.3  94.5  |  17.1  17.1  22.2  94.5  94.5  |  17.1  17.1  22.2  94.5  94.5  |   0.0   0.0   0.0  71.8  72.4  |   0.0   0.0   0.0  75.5  76.3  |
IBM852                  35418  |  46.3  46.3  72.0  91.2  91.3  |  24.4  24.4  30.6  93.5  93.6  |  24.6  24.6  30.8  93.7  93.7  |  24.6  24.6  30.8  93.7  93.7  |   0.0   0.0   0.0  62.0  62.1  |   0.0   0.0   0.0  67.0  67.2  |
IBM855                  36702  |  65.8  65.8  90.4  70.4  70.4  |  68.3  68.3  90.6  73.2  73.2  |  73.8  73.8  91.5  78.7  78.7  |  73.8  73.8  91.5  78.7  78.7  |   0.0   0.0   0.0   4.6   4.7  |  79.7  79.7  79.7  84.6  84.6  |
IBM866                  36985  |  76.3  76.3  90.9  81.1  81.1  |  80.7  80.7  91.4  85.8  85.8  |  82.5  82.5  92.2  87.6  87.6  |  82.5  82.5  92.2  87.6  87.6  |  30.5  30.5  49.2  35.4  35.4  |  88.9  88.9  88.9  94.1  94.1  |
ISO-2022-CN             40954  |   0.0   0.0   0.0  13.6  17.0  |  81.4  81.4  81.4  99.1 100.0  |  81.4  81.4  81.4  99.1 100.0  |  81.4  81.4  81.4  99.1 100.0  |  36.5  36.5  79.2  50.2  54.3  |   0.0   0.0   0.0  14.2  18.2  |
ISO-2022-JP             37151  |   0.0   0.0   0.0  16.1  19.3  |  80.5  80.5  80.5  98.2  99.9  |  80.5  80.5  80.5  98.2  99.9  |  80.5  80.5  80.5  98.2  99.9  |  69.9  69.9  73.6  86.4  89.7  |  78.6  78.6  78.6  96.4 100.0  |
ISO-2022-KR             36860  |   0.0   0.0   0.0   7.0   9.3  |  89.7  89.7  89.7  99.3 100.0  |  89.7  89.7  89.7  99.3 100.0  |  89.7  89.7  89.7  99.3 100.0  |  88.0  88.0  89.6  94.9  97.7  |  89.7  89.7  89.7  97.1 100.0  |
ISO-8859-16             32901  |  49.7  49.7  66.2  93.8  94.0  |  20.9  20.9  24.6  95.5  95.6  |  21.0  21.0  24.7  95.5  95.6  |  21.0  21.0  24.7  95.5  95.6  |   0.0   0.0   0.0  82.8  83.2  |   0.0   0.0   0.0  91.4  92.1  |
ISO-8859-3              35648  |  46.5  46.5  75.6  91.8  91.9  |  18.4  18.4  22.3  95.1  95.1  |  18.6  18.6  22.5  95.4  95.4  |  18.6  18.6  22.5  95.4  95.4  |   0.0   0.0   0.0  72.2  72.2  |   0.0   0.0   0.0  76.8  76.8  |
KOI8-R                  36850  |  67.4  81.8  93.3  86.6  86.7  |  67.4  81.8  92.8  87.0  87.0  |  68.2  82.8  93.0  88.0  88.1  |  68.2  82.8  93.0  88.0  88.1  |  40.9  40.9  47.2  45.9  45.9  |  90.7  90.7  90.7  95.9  96.0  |
KOI8-U                  36846  |  57.6  83.4  94.9  85.6  85.6  |  57.6  83.4  94.6  85.8  85.9  |  59.0  85.3  95.0  87.7  87.7  |  59.0  85.3  95.0  87.7  87.7  |   0.0  33.7  40.1  24.7  24.7  |   0.0  89.2  89.2  61.1  61.1  |
Shift_JIS               36917  |  60.2  60.2  75.3  76.7  76.9  |  63.8  63.8  76.3  81.6  81.8  |  63.6  63.6  75.5  81.1  81.2  |  63.6  63.6  75.5  81.1  81.2  |   0.0   0.0  15.3  16.4  17.3  |   0.0   0.0   0.0  17.4  18.6  |
US-ASCII                36759  |   0.0   5.7  18.2  97.2  97.2  |   0.0   0.0   0.0 100.0 100.0  |   0.0   0.0   0.0 100.0 100.0  |   0.0   0.0   0.0 100.0 100.0  |   0.0   9.2  75.4  95.8  95.8  |   0.0   0.0   0.0 100.0 100.0  |
UTF-16-BE               36799  |  70.9  71.5  84.2  70.9  70.9  |  92.4  92.5  96.8  92.4  92.4  |  95.2  95.3  97.4  95.2  95.2  |  95.2  95.3  97.4  95.2  95.2  |   0.0   0.0   0.0   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
UTF-16-LE               36736  |  69.2  69.5  83.9  69.2  69.2  |  95.1  95.1  97.4  95.1  95.1  |  95.6  95.6  97.8  95.6  95.6  |  95.6  95.6  97.8  95.6  95.6  |   0.0   0.0   0.0   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
UTF-32-BE               36757  |   0.0   0.0   0.0   0.0   0.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |  90.1  90.1  99.8  90.1  90.1  |   0.0   0.0   0.0   0.0   0.0  |
UTF-32-LE               37011  |   0.0   0.0   0.0   0.0   0.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |  96.9  96.9 100.0  96.9  96.9  |   0.0   0.0   0.0   0.0   0.0  |
UTF-8                   36254  |  64.8  64.8  76.7  86.5  86.6  |  62.2  62.2  69.2  87.3  87.4  |  62.3  62.3  69.3  87.3  87.4  |  62.3  62.3  69.3  87.3  87.4  |  60.2  60.2  93.1  79.6  79.6  |  69.0  69.0  69.0  94.0  94.0  |
windows-1250            34548  |  20.6  20.6  35.0  88.3  88.6  |  20.0  20.0  26.3  90.3  90.3  |  20.0  20.0  26.4  90.4  90.4  |  20.0  20.0  26.4  90.4  90.4  |   4.5  40.3  63.0  80.0  80.3  |   0.0   0.0   0.0  78.3  81.6  |
windows-1251            36852  |  77.5  77.5  87.7  82.0  82.1  |  77.6  77.6  87.7  82.4  82.4  |  77.8  77.8  88.0  82.6  82.6  |  77.8  77.8  88.0  82.6  82.6  |  35.1  35.4  41.1  39.8  39.9  |  64.8  65.6  65.6  69.6  69.8  |
windows-1252            25975  |   8.6   8.6  16.1  81.2  81.7  |  77.1  77.1  84.3  83.3  83.4  |  77.1  77.1  84.3  83.3  83.4  |  77.1  77.1  84.3  83.3  83.4  |   1.8  42.3  76.9  82.9  83.2  |   0.0  98.9  98.9  94.9  98.9  |
windows-1253            36845  |  75.3  75.3  88.6  82.4  82.5  |  75.1  75.1  87.6  82.8  82.9  |  75.4  75.4  87.9  83.1  83.1  |  75.4  75.4  87.9  83.1  83.1  |   0.7  38.0  52.9  45.1  45.1  |   0.0  75.6  75.6  81.5  83.0  |
windows-1254            36705  |  40.6  40.6  60.8  86.9  87.1  |  33.8  33.8  42.1  89.2  89.2  |  34.0  34.0  42.3  89.4  89.4  |  34.0  34.0  42.3  89.4  89.4  |   1.9  32.9  45.8  74.1  74.3  |   0.0   0.0   0.0  68.0  71.9  |
windows-1255            36774  |  75.8  75.8  93.5  79.1  79.1  |  75.8  75.8  93.4  79.3  79.3  |  75.9  75.9  93.5  79.3  79.3  |  75.9  75.9  93.5  79.3  79.3  |   0.7  15.5  20.1  18.6  18.8  |  80.9  82.0  82.0  85.3  85.5  |
windows-1256            41912  |  89.5  89.5  95.6  92.5  92.6  |  89.5  89.5  95.3  92.8  92.8  |  90.1  90.1  95.5  93.3  93.3  |  90.1  90.1  95.5  93.3  93.3  |  27.6  39.5  47.4  30.7  30.7  |   0.0   0.0   0.0   3.2   3.3  |
windows-1257            35316  |  48.0  48.0  73.0  88.4  88.7  |  25.5  25.5  34.5  89.8  89.8  |  25.6  25.6  34.6  89.8  89.9  |  25.6  25.6  34.6  89.8  89.9  |   0.0   0.0   0.0  64.2  64.6  |   0.0   0.0   0.0  68.0  70.7  |
windows-1258            36885  |  69.4  69.4  81.7  88.0  89.5  |  63.5  63.5  72.0  88.9  89.9  |  64.3  64.3  72.3  89.7  90.2  |  64.3  64.3  72.3  89.7  90.2  |   0.0   0.0   0.0  25.0  26.0  |   0.0   0.0   0.0  24.6  26.0  |
windows-874             36809  |  58.2  58.2  77.1  74.6  74.7  |  60.2  60.2  74.8  78.6  78.7  |  62.0  62.0  75.9  80.5  80.5  |  62.0  62.0  75.9  80.5  80.5  |   0.0   0.0   0.0  17.5  17.6  |   0.0   0.0   0.0  86.8  89.0  |
x-EUC-TW                26788  |  69.2  69.2  89.1  75.9  76.1  |  69.4  69.4  86.3  77.3  77.5  |  69.0  69.0  85.6  76.6  77.0  |  69.0  69.0  85.6  76.6  77.0  |   0.0   0.0   0.0   7.6   8.3  |   0.0   0.0   0.0   7.6   8.3  |
x-MacRoman              36756  |  29.0  29.0  62.8  90.0  90.3  |  17.2  17.2  22.1  92.7  92.8  |  17.5  17.5  22.3  93.0  93.1  |  17.5  17.5  22.3  93.0  93.1  |   0.0   0.0   0.0  72.1  72.4  |   0.0   0.0   0.0  75.5  75.9  |
x-mac-cyrillic          36631  |  77.5  77.5  88.4  77.5  77.5  |  77.7  77.7  88.5  77.7  77.7  |  77.9  77.9  88.7  77.9  77.9  |  77.9  77.9  88.7  77.9  77.9  |   0.0   0.0   0.0   0.0   0.0  |  42.1  42.1  42.1  42.1  42.1  |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
OVERALL               1469647  |  50.8  54.9  68.1  71.5  71.9  |  58.6  62.1  70.0  82.7  82.9  |  59.1  62.6  70.2  83.2  83.4  |  59.1  62.6  70.2  83.2  83.4  |  16.7  30.2  42.1  44.0  44.5  |  17.2  23.1  23.1  46.1  47.1  |
  Stat=model only | +ISO=+C1-correction | +CJK=+grammar | All=ML+rules | R%=strict | S%=soft | T3%=top-3 hit | D%=decode-match | A%=alpha-match
  µs/sample                    |                          10.4  |                           6.4  |                           6.3  |                           6.2  |                           5.1  |                           2.2  |
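
The gap between R% and D% in rows such as US-ASCII (0.0 strict, 100.0 decode-match, with every miss going to windows-1252) can be illustrated with a minimal sketch. It rests on two assumptions that this report does not spell out: that "strict" means the predicted label names the same charset as the ground truth, and that "decode-match" means the probe bytes decode to the same string under both charsets. The function names are hypothetical, and Python codec names stand in for the charsets used by the benchmark.

```python
import codecs


def strict_match(true_charset: str, predicted: str) -> bool:
    """Assumed meaning of R%: both labels resolve to the same codec."""
    return codecs.lookup(true_charset).name == codecs.lookup(predicted).name


def decode_match(probe: bytes, true_charset: str, predicted: str) -> bool:
    """Assumed meaning of D%: the probe decodes to identical text either way."""
    try:
        expected = probe.decode(true_charset)
        actual = probe.decode(predicted)
    except (UnicodeDecodeError, LookupError):
        # An undecodable probe or an unknown label counts as a miss.
        return False
    return expected == actual


# Pure ASCII bytes labelled windows-1252: a strict miss, but harmless
# for the decoded text, so it still counts as a decode-match.
ascii_probe = b"plain ok"
ascii_strict = strict_match("US-ASCII", "windows-1252")
ascii_decode = decode_match(ascii_probe, "US-ASCII", "windows-1252")

# A high byte read through the wrong single-byte code page changes
# the decoded text, so it is a miss on both metrics.
latin_probe = "é".encode("windows-1252")
latin_decode = decode_match(latin_probe, "windows-1252", "windows-1251")
```

Under these assumptions, a detector that reports windows-1252 for pure ASCII input loses the strict score but still produces correct text, which is why D% is the more forgiving of the two columns.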

--- Confusion (All/ML+rules, 8B, top errors per charset) ---
  Big5-HKSCS              22.7% wrong → windows-1252:7.7%, GB18030:7.5%, EUC-JP:1.0%, EUC-KR:0.9%, KOI8-U:0.6%
  EUC-JP                  38.3% wrong → windows-1252:17.6%, GB18030:6.3%, EUC-KR:4.5%, x-EUC-TW:1.8%, Big5-HKSCS:1.3%
  EUC-KR                  32.2% wrong → GB18030:8.4%, windows-1252:8.2%, EUC-JP:5.4%, windows-1251:3.0%, Big5-HKSCS:1.4%
  GB18030                 28.2% wrong → windows-1252:13.6%, EUC-JP:2.2%, windows-1251:1.9%, EUC-KR:1.7%, x-EUC-TW:1.2%
  IBM1047                 92.7% wrong → IBM500:48.8%, IBM424-rtl:12.1%, IBM424-ltr:6.2%, IBM420-ltr:5.2%, GB18030:3.2%
  IBM420-ltr              25.3% wrong → IBM420-rtl:20.1%, IBM500:1.1%, IBM424-rtl:0.9%, IBM424-ltr:0.8%, windows-1252:0.7%
  IBM420-rtl              26.1% wrong → IBM420-ltr:19.5%, IBM424-rtl:3.6%, windows-1252:0.5%, IBM500:0.4%, IBM424-ltr:0.4%
  IBM424-ltr              90.4% wrong → windows-1252:79.3%, IBM424-rtl:4.7%, IBM500:2.3%, IBM420-ltr:0.8%, GB18030:0.4%
  IBM424-rtl              87.7% wrong → windows-1252:80.9%, IBM420-rtl:1.3%, IBM500:1.1%, IBM424-ltr:1.0%, IBM420-ltr:0.7%
  IBM500                  51.1% wrong → IBM424-rtl:12.5%, IBM1047:7.1%, IBM424-ltr:6.1%, IBM420-ltr:5.1%, GB18030:3.4%
  IBM850                  82.9% wrong → windows-1252:75.6%, IBM852:2.4%, x-MacRoman:0.8%, IBM424-rtl:0.5%, ISO-8859-16:0.5%
  IBM852                  75.4% wrong → windows-1252:67.1%, IBM850:2.8%, x-MacRoman:0.8%, IBM424-rtl:0.5%, windows-1256:0.5%
  IBM855                  26.2% wrong → GB18030:8.0%, windows-1252:5.0%, IBM850:3.0%, windows-1256:2.0%, IBM866:1.2%
  IBM866                  17.5% wrong → windows-1252:5.2%, IBM850:3.1%, Big5-HKSCS:1.6%, x-mac-cyrillic:1.3%, GB18030:1.1%
  ISO-2022-CN             18.6% wrong → windows-1252:15.2%, ISO-2022-JP:3.4%, UTF-16BE:0.0%, UTF-16-BE:0.0%, IBM424-rtl:0.0%
  ISO-2022-JP             19.5% wrong → windows-1252:19.5%
  ISO-2022-KR             10.3% wrong → windows-1252:8.1%, ISO-2022-JP:2.3%
  ISO-8859-16             79.0% wrong → windows-1252:73.1%, windows-1250:1.6%, windows-1257:1.4%, IBM852:0.6%, IBM850:0.3%
  ISO-8859-3              81.4% wrong → windows-1252:76.4%, x-MacRoman:1.2%, IBM850:0.7%, ISO-8859-16:0.4%, UTF-16-LE:0.3%
  KOI8-R                  31.8% wrong → KOI8-U:14.6%, windows-1252:5.3%, windows-1253:3.7%, windows-1251:2.3%, windows-1256:2.1%
  KOI8-U                  41.0% wrong → KOI8-R:26.3%, windows-1252:3.4%, GB18030:2.9%, windows-1253:2.4%, windows-1256:2.0%
  Shift_JIS               36.4% wrong → windows-1252:17.4%, x-mac-cyrillic:2.9%, GB18030:2.5%, x-MacRoman:2.3%, IBM424-rtl:2.2%
  US-ASCII               100.0% wrong → windows-1252:100.0%
  UTF-16-BE                4.8% wrong → windows-1252:0.8%, windows-1256:0.4%, windows-1253:0.4%, IBM424-rtl:0.3%, IBM852:0.2%
  UTF-16-LE                4.4% wrong → windows-1252:0.8%, IBM420-ltr:0.5%, IBM850:0.3%, IBM852:0.2%, GB18030:0.2%
  UTF-8                   37.7% wrong → windows-1252:25.6%, GB18030:4.5%, IBM850:1.7%, IBM866:1.1%, windows-1256:0.8%
  windows-1250            80.0% wrong → windows-1252:65.1%, ISO-8859-16:3.5%, x-MacRoman:2.1%, windows-1257:2.0%, IBM852:1.5%
  windows-1251            22.2% wrong → x-mac-cyrillic:5.3%, windows-1252:5.1%, windows-1253:3.0%, windows-1256:2.2%, windows-1250:1.1%
  windows-1252            22.9% wrong → windows-1257:5.4%, windows-1250:2.8%, IBM850:2.3%, windows-1254:2.0%, x-MacRoman:1.6%
  windows-1253            24.6% wrong → windows-1252:9.7%, windows-1256:4.7%, KOI8-R:2.0%, windows-1251:1.4%, KOI8-U:1.1%
  windows-1254            66.0% wrong → windows-1252:53.7%, IBM852:2.5%, windows-1257:2.4%, IBM850:1.2%, x-MacRoman:1.1%
  windows-1255            24.1% wrong → windows-1252:5.7%, windows-1253:4.0%, windows-1251:3.0%, windows-1257:2.9%, windows-1250:2.3%
  windows-1256             9.9% wrong → windows-1252:3.7%, windows-1253:1.6%, windows-1251:1.3%, KOI8-R:0.6%, windows-1250:0.6%
  windows-1257            74.4% wrong → windows-1252:62.7%, IBM850:2.0%, ISO-8859-16:1.8%, IBM852:1.6%, windows-1254:1.4%
  windows-1258            35.7% wrong → windows-1252:24.7%, windows-1257:2.0%, IBM850:1.9%, windows-874:1.0%, x-MacRoman:0.8%
  windows-874             38.0% wrong → windows-1252:18.4%, GB18030:5.5%, EUC-JP:3.5%, windows-1251:2.1%, KOI8-U:1.1%
  x-EUC-TW                31.0% wrong → windows-1252:7.6%, GB18030:7.1%, windows-1256:2.8%, KOI8-R:2.0%, windows-1251:1.8%
  x-MacRoman              82.5% wrong → windows-1252:75.5%, IBM850:2.4%, IBM852:0.9%, windows-1256:0.5%, windows-1250:0.3%
  x-mac-cyrillic          22.1% wrong → windows-1252:5.5%, windows-1251:5.3%, windows-1253:2.1%, windows-1256:1.8%, windows-1250:1.4%

=== Probe length: 32B ===
                            N  | --- ML ablation ----------------------------------------------------------------------------------------------------------------- | --- Baselines ------------------------------------------------- |
Charset                        | Stat R%  S%   T3%    D%    A%  | +ISO R%  S%   T3%    D%    A%  | +CJK R%  S%   T3%    D%    A%  | All  R%  S%   T3%    D%    A%  | ICU4J R% S%   T3%    D%    A%  | juniv R% S%   T3%    D%    A%  |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Big5-HKSCS              30334  |  94.6  94.6  97.3  97.3  97.3  |  94.6  94.6  96.9  97.5  97.6  |  94.9  94.9  96.9  97.8  97.9  |  94.9  94.9  96.9  97.8  97.9  |   0.0  78.1  86.5  65.3  65.4  |   0.0  70.2  70.2  67.6  67.8  |
EUC-JP                  37043  |  89.2  89.2  94.9  92.9  93.2  |  89.2  89.2  94.7  93.2  93.5  |  89.2  89.2  94.6  93.2  93.5  |  89.2  89.2  94.6  93.2  93.5  |  61.5  61.5  66.9  65.2  65.7  |  79.8  79.8  79.8  83.5  84.0  |
EUC-KR                  36883  |  92.9  92.9  97.1  94.2  94.3  |  93.0  93.0  97.1  94.4  94.4  |  93.0  93.0  97.1  94.4  94.4  |  93.0  93.0  97.1  94.4  94.4  |  78.9  78.9  83.2  80.2  80.3  |  92.7  92.7  92.7  94.0  94.2  |
GB18030                 36862  |  93.6  93.6  95.5  96.6  96.8  |  93.5  93.5  95.0  96.9  97.1  |  93.6  93.6  95.1  96.9  97.1  |  93.6  93.6  95.1  96.9  97.1  |  59.8  59.8  66.2  63.0  63.5  |  71.7  71.7  71.7  75.0  75.5  |
IBM1047                 34790  |  12.6  90.0  95.1  89.0  89.9  |  12.6  90.2  95.2  89.1  90.1  |  12.7  90.2  95.2  89.1  90.1  |  12.7  90.2  95.2  89.1  90.1  |   0.0  82.4  98.6  80.9  82.1  |   0.0   0.0   0.0   0.0   0.0  |
IBM420-ltr              36874  |  96.1  99.0  99.5  96.1  96.1  |  96.2  99.0  99.5  96.2  96.2  |  96.2  99.0  99.5  96.2  96.2  |  96.2  99.0  99.5  96.2  96.2  |   0.0  91.9  94.6   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM420-rtl              37018  |  96.0  99.1  99.8  96.0  96.0  |  96.0  99.1  99.8  96.0  96.0  |  96.0  99.1  99.8  96.0  96.0  |  96.0  99.1  99.8  96.0  96.0  |   0.0  94.2  96.1   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM424-ltr              34927  |  92.9  97.1  98.9  92.9  92.9  |  34.2  37.7  39.5  34.2  34.2  |  34.2  37.7  39.5  34.2  34.2  |  34.2  37.7  39.5  34.2  34.2  |   0.0  85.4  90.3   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM424-rtl              32984  |  96.4  99.6  99.9  96.4  96.4  |  29.0  29.7  30.0  29.0  29.0  |  29.0  29.7  30.0  29.0  29.0  |  29.0  29.7  30.0  29.0  29.0  |   0.0  89.6  92.3   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM500                  35083  |  78.1  90.7  95.4  90.6  90.7  |  78.2  90.8  95.5  90.7  90.8  |  78.2  90.8  95.5  90.7  90.8  |  78.2  90.8  95.5  90.7  90.8  |  82.8  82.8  98.7  82.8  82.8  |   0.0   0.0   0.0   0.0   0.0  |
IBM850                  34505  |  59.3  59.3  84.9  93.7  93.9  |  49.7  49.7  56.5  94.0  94.2  |  49.7  49.7  56.5  94.0  94.2  |  49.7  49.7  56.5  94.0  94.2  |   0.0   0.0   0.0  41.8  42.7  |   0.0   0.0   0.0  42.1  43.1  |
IBM852                  35418  |  80.6  80.6  93.2  94.8  94.9  |  66.7  66.7  71.8  95.1  95.1  |  66.8  66.8  71.8  95.2  95.2  |  66.8  66.8  71.8  95.2  95.2  |   0.0   0.0   0.0  26.3  26.6  |   0.0   0.0   0.0  26.9  27.2  |
IBM855                  36702  |  95.0  95.0  98.9  95.6  95.7  |  95.1  95.1  98.9  95.8  95.9  |  97.9  97.9  99.0  98.6  98.6  |  97.9  97.9  99.0  98.6  98.6  |   0.0   0.0   0.0   0.7   0.7  |  97.1  97.1  97.1  97.8  97.8  |
IBM866                  36985  |  97.9  97.9  99.0  98.7  98.7  |  98.1  98.1  99.0  98.9  98.9  |  98.3  98.3  99.0  99.1  99.1  |  98.3  98.3  99.0  99.1  99.1  |  63.9  63.9  86.6  64.7  64.7  |  97.7  97.7  97.7  98.5  98.5  |
ISO-2022-CN             40954  |   0.0   0.0   0.0   4.0   4.5  |  95.4  95.4  95.4  99.8 100.0  |  95.4  95.4  95.4  99.8 100.0  |  95.4  95.4  95.4  99.8 100.0  |  92.5  92.5  94.3  96.5  97.1  |   0.0   0.0   0.0   4.0   4.6  |
ISO-2022-JP             37151  |   0.0   0.0   0.0   3.8   4.2  |  95.8  95.8  95.8  99.7 100.0  |  95.8  95.8  95.8  99.7 100.0  |  95.8  95.8  95.8  99.7 100.0  |  86.5  86.5  89.2  90.3  90.7  |  95.7  95.7  95.7  99.6 100.0  |
ISO-2022-KR             36860  |   0.0   0.0   0.0   1.5   1.7  |  98.3  98.3  98.3  99.9 100.0  |  98.3  98.3  98.3  99.9 100.0  |  98.3  98.3  98.3  99.9 100.0  |  96.4  96.4  96.6  97.9  98.1  |  98.3  98.3  98.3  99.8 100.0  |
ISO-8859-16             32901  |  69.1  69.1  86.1  94.2  94.3  |  51.6  51.6  61.9  93.9  93.9  |  51.6  51.6  62.0  93.9  93.9  |  51.6  51.6  62.0  93.9  93.9  |   0.0   0.0   0.0  80.2  80.6  |   0.0   0.0   0.0  75.5  76.0  |
ISO-8859-3              35648  |  85.1  85.1  95.8  96.6  96.7  |  61.4  61.4  64.3  96.9  96.9  |  61.5  61.5  64.3  97.0  97.0  |  61.5  61.5  64.3  97.0  97.0  |   0.0   0.0   0.0  35.5  35.5  |   0.0   0.0   0.0  36.3  36.3  |
KOI8-R                  36850  |  95.4  98.2  99.0  98.9  98.9  |  95.3  98.2  99.0  98.9  98.9  |  95.4  98.3  99.0  99.0  99.0  |  95.4  98.3  99.0  99.0  99.0  |  76.4  76.4  86.2  77.1  77.1  |  98.3  98.3  98.3  99.0  99.0  |
KOI8-U                  36846  |  87.8  98.3  99.4  96.3  96.4  |  87.8  98.2  99.3  96.4  96.4  |  88.3  98.8  99.3  96.9  96.9  |  88.3  98.8  99.3  96.9  96.9  |   0.0  68.4  81.0  11.4  11.4  |   0.0  98.4  98.4  17.8  17.8  |
Shift_JIS               36917  |  89.6  89.6  94.7  93.1  93.2  |  91.7  91.7  94.9  95.3  95.4  |  91.7  91.7  94.9  95.4  95.5  |  91.7  91.7  94.9  95.4  95.5  |  65.1  65.1  65.8  68.7  69.0  |  85.3  85.3  85.3  89.0  89.4  |
US-ASCII                36759  |   0.0   1.2   4.3  99.6  99.6  |   0.0   0.0   0.0 100.0 100.0  |   0.0   0.0   0.0 100.0 100.0  |   0.0   0.0   0.0 100.0 100.0  |   0.0   0.4  59.7  99.5  99.5  |   0.0   0.0   0.0 100.0 100.0  |
UTF-16-BE               36799  |  84.8  85.2  95.3  84.8  84.8  |  98.3  98.3  99.3  98.3  98.3  |  99.0  99.1  99.7  99.0  99.0  |  99.0  99.1  99.7  99.0  99.0  |  71.0  71.0  92.8  71.0  71.0  |   0.0   0.0   0.0   0.0   0.0  |
UTF-16-LE               36736  |  93.5  93.5  98.5  93.5  93.5  |  99.1  99.1  99.7  99.1  99.1  |  99.5  99.5  99.9  99.5  99.5  |  99.5  99.5  99.9  99.5  99.5  |  72.0  72.0  93.1  72.0  72.0  |   0.0   0.0   0.0   0.0   0.0  |
UTF-32-BE               36757  |   0.0   0.0   0.0   0.0   0.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |   0.0   0.0   0.0   0.0   0.0  |
UTF-32-LE               37011  |   0.0   0.0   0.0   0.0   0.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |   0.0   0.0   0.0   0.0   0.0  |
UTF-8                   36254  |  81.8  81.8  87.4  93.6  93.6  |  87.7  87.7  87.7 100.0 100.0  |  87.7  87.7  87.7 100.0 100.0  |  87.7  87.7  87.7 100.0 100.0  |  86.0  86.0  92.8  98.0  98.0  |  86.9  86.9  86.9  99.2  99.2  |
windows-1250            34548  |  58.7  58.7  78.6  91.1  91.4  |  56.7  56.7  67.8  91.0  91.2  |  56.7  56.7  67.8  91.0  91.2  |  56.7  56.7  67.8  91.0  91.2  |  17.2  59.4  91.3  82.4  82.4  |   0.0   0.0   0.0  45.1  48.9  |
windows-1251            36852  |  93.6  93.6  98.2  94.5  94.5  |  93.6  93.6  98.2  94.5  94.5  |  93.7  93.7  98.3  94.6  94.6  |  93.7  93.7  98.3  94.6  94.6  |  69.5  69.5  80.6  70.4  70.4  |  79.8  79.9  79.9  80.7  80.8  |
windows-1252            25975  |  46.3  46.3  65.3  88.3  88.7  |  82.5  82.5  91.6  88.4  88.6  |  82.5  82.5  91.7  88.4  88.6  |  82.5  82.5  91.7  88.4  88.6  |   5.6  74.2  95.9  89.5  89.5  |   0.0  98.6  98.6  91.3  98.6  |
windows-1253            36845  |  95.4  95.4  97.6  96.9  96.9  |  95.4  95.4  97.6  96.9  96.9  |  95.4  95.4  97.6  96.9  96.9  |  95.4  95.4  97.6  96.9  96.9  |   3.4  84.6  92.4  85.0  85.1  |   0.1  93.7  93.7  90.0  94.0  |
windows-1254            36705  |  86.6  86.6  94.8  94.5  94.6  |  83.0  83.0  88.4  94.4  94.5  |  83.1  83.1  88.4  94.5  94.5  |  83.1  83.1  88.4  94.5  94.5  |   8.6  69.7  90.6  81.2  81.2  |   0.0   0.0   0.0  22.6  26.0  |
windows-1255            36774  |  97.9  97.9  99.1  98.5  98.6  |  97.8  97.8  99.1  98.5  98.6  |  97.9  97.9  99.1  98.6  98.6  |  97.9  97.9  99.1  98.6  98.6  |   4.8  34.5  47.9  34.6  35.3  |  96.7  98.3  98.3  98.8  99.0  |
windows-1256            41912  |  98.9  98.9  99.1  99.6  99.6  |  98.9  98.9  99.1  99.6  99.6  |  98.9  98.9  99.1  99.6  99.6  |  98.9  98.9  99.1  99.6  99.6  |  39.2  66.3  86.4  40.0  40.0  |   0.0   0.0   0.0   0.7   0.8  |
windows-1257            35316  |  82.0  82.0  94.1  93.3  93.4  |  69.0  69.0  76.2  93.0  93.1  |  69.0  69.0  76.2  93.1  93.1  |  69.0  69.0  76.2  93.1  93.1  |   0.0   0.0   0.0  36.3  36.3  |   0.0   0.0   0.0  36.0  38.5  |
windows-1258            36885  |  96.3  96.3  98.0  98.1  98.3  |  96.0  96.0  97.4  98.1  98.3  |  96.2  96.2  97.4  98.2  98.4  |  96.2  96.2  97.4  98.2  98.4  |   0.0   0.0   0.0   2.1   2.1  |   0.0   0.0   0.0   2.0   2.1  |
windows-874             36809  |  89.5  89.5  94.3  94.8  94.8  |  89.8  89.8  93.6  95.3  95.3  |  91.4  91.4  93.7  96.9  97.0  |  91.4  91.4  93.7  96.9  97.0  |   0.0   0.0   0.0   5.5   5.5  |   0.0   0.0   0.0  93.2  97.7  |
x-EUC-TW                26788  |  92.7  92.7  96.9  95.5  95.6  |  92.6  92.6  96.6  95.6  95.6  |  92.6  92.6  96.6  95.6  95.6  |  92.6  92.6  96.6  95.6  95.6  |   0.0   0.0   0.0   2.9   3.0  |  68.3  68.3  68.3  71.2  71.3  |
x-MacRoman              36756  |  64.9  64.9  89.4  92.8  93.1  |  52.2  52.2  57.8  93.1  93.4  |  52.2  52.2  57.8  93.1  93.4  |  52.2  52.2  57.8  93.1  93.4  |   0.0   0.0   0.0  40.6  41.0  |   0.0   0.0   0.0  41.0  41.4  |
x-mac-cyrillic          36631  |  92.7  92.7  98.3  92.7  92.7  |  92.8  92.8  98.3  92.8  92.8  |  92.8  92.8  98.4  92.8  92.8  |  92.8  92.8  98.4  92.8  92.8  |   0.0   0.0   0.0   0.0   0.0  |  58.5  58.5  58.5  58.5  58.5  |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
OVERALL               1469647  |  72.0  74.8  80.4  82.8  82.9  |  80.7  83.4  86.3  93.2  93.3  |  80.8  83.6  86.3  93.4  93.5  |  80.8  83.6  86.3  93.4  93.5  |  33.9  55.1  63.8  54.8  55.0  |  29.8  37.8  37.8  49.2  49.9  |
  Stat=model only | +ISO=+C1-correction | +CJK=+grammar | All=ML+rules | R%=strict | S%=soft | T3%=top-3 hit | D%=decode-match | A%=alpha-match
  µs/sample                    |                          12.1  |                           8.3  |                           8.2  |                           8.0  |                          14.1  |                           3.2  |

--- Confusion (All/ML+rules, 32B, top errors per charset) ---
  Big5-HKSCS               5.1% wrong → windows-1252:2.9%, GB18030:1.6%, IBM850:0.1%, ISO-8859-16:0.1%, IBM852:0.1%
  EUC-JP                  10.8% wrong → GB18030:4.2%, windows-1252:3.8%, EUC-KR:0.6%, x-EUC-TW:0.3%, IBM850:0.2%
  EUC-KR                   7.0% wrong → GB18030:2.5%, windows-1252:1.9%, EUC-JP:1.1%, windows-1251:0.3%, Big5-HKSCS:0.1%
  GB18030                  6.4% wrong → windows-1252:3.4%, windows-1251:0.3%, EUC-JP:0.3%, UTF-8:0.3%, IBM850:0.3%
  IBM1047                 87.3% wrong → IBM500:77.6%, IBM424-rtl:3.6%, IBM420-ltr:2.0%, IBM424-ltr:1.3%, IBM852:0.5%
  IBM420-ltr               3.8% wrong → IBM420-rtl:2.8%, IBM500:0.6%, IBM424-ltr:0.1%, IBM424-rtl:0.1%, IBM1047:0.1%
  IBM420-rtl               4.0% wrong → IBM420-ltr:3.1%, IBM424-rtl:0.8%, IBM500:0.0%, GB18030:0.0%, x-mac-cyrillic:0.0%
  IBM424-ltr              65.8% wrong → windows-1252:58.6%, IBM424-rtl:3.5%, IBM500:1.9%, UTF-8:0.8%, IBM420-ltr:0.3%
  IBM424-rtl              71.0% wrong → windows-1252:69.7%, IBM424-ltr:0.6%, UTF-8:0.2%, IBM500:0.2%, IBM420-rtl:0.1%
  IBM500                  21.8% wrong → IBM1047:12.6%, IBM424-rtl:3.4%, IBM420-ltr:1.8%, IBM424-ltr:1.2%, IBM852:0.4%
  IBM850                  50.3% wrong → windows-1252:42.4%, IBM852:3.0%, x-MacRoman:1.8%, ISO-8859-16:0.9%, windows-1257:0.8%
  IBM852                  33.2% wrong → windows-1252:27.1%, IBM850:2.4%, ISO-8859-16:0.8%, windows-1250:0.8%, x-MacRoman:0.7%
  IBM855                   2.1% wrong → windows-1252:0.7%, GB18030:0.4%, IBM850:0.3%, windows-1256:0.1%, IBM852:0.1%
  IBM866                   1.7% wrong → windows-1252:0.7%, x-mac-cyrillic:0.3%, IBM850:0.2%, x-MacRoman:0.1%, IBM852:0.1%
  ISO-2022-CN              4.6% wrong → windows-1252:4.2%, ISO-2022-JP:0.5%, UTF-16BE:0.0%
  ISO-2022-JP              4.2% wrong → windows-1252:4.2%
  ISO-2022-KR              1.7% wrong → windows-1252:1.6%, ISO-2022-JP:0.1%, UTF-16BE:0.0%
  ISO-8859-16             48.4% wrong → windows-1252:34.8%, windows-1250:8.6%, windows-1257:1.8%, IBM852:0.8%, UTF-8:0.8%
  ISO-8859-3              38.5% wrong → windows-1252:35.4%, x-MacRoman:0.7%, ISO-8859-16:0.4%, IBM852:0.4%, windows-1257:0.4%
  KOI8-R                   4.6% wrong → KOI8-U:2.9%, windows-1252:0.8%, windows-1253:0.2%, windows-1251:0.1%, windows-1256:0.1%
  KOI8-U                  11.7% wrong → KOI8-R:10.6%, windows-1252:0.5%, windows-1253:0.1%, windows-1251:0.1%, windows-1256:0.1%
  Shift_JIS                8.3% wrong → windows-1252:3.7%, x-MacRoman:0.7%, GB18030:0.7%, IBM852:0.6%, x-mac-cyrillic:0.6%
  US-ASCII               100.0% wrong → windows-1252:100.0%
  UTF-16-BE                1.0% wrong → windows-1256:0.1%, IBM850:0.1%, x-MacRoman:0.1%, IBM420-rtl:0.1%, windows-1250:0.1%
  UTF-16-LE                0.5% wrong → windows-1253:0.1%, windows-1251:0.0%, UTF-16BE:0.0%, IBM420-rtl:0.0%, KOI8-U:0.0%
  UTF-8                   12.3% wrong → windows-1252:12.3%
  windows-1250            43.3% wrong → windows-1252:26.9%, ISO-8859-16:7.3%, IBM852:3.2%, x-MacRoman:1.7%, windows-1257:1.6%
  windows-1251             6.3% wrong → x-mac-cyrillic:4.0%, windows-1252:1.1%, windows-1257:0.2%, windows-1253:0.1%, windows-1255:0.1%
  windows-1252            17.5% wrong → x-MacRoman:3.5%, windows-1250:3.1%, windows-1257:2.8%, IBM850:2.3%, windows-1254:1.4%
  windows-1253             4.6% wrong → windows-1252:2.7%, KOI8-R:0.4%, windows-1256:0.3%, windows-1257:0.2%, windows-1254:0.2%
  windows-1254            16.9% wrong → windows-1252:11.6%, IBM852:1.1%, windows-1250:0.8%, windows-1257:0.8%, ISO-8859-16:0.6%
  windows-1255             2.1% wrong → windows-1252:1.2%, windows-1250:0.2%, windows-1257:0.2%, windows-1251:0.1%, windows-1253:0.1%
  windows-1256             1.1% wrong → windows-1252:0.8%, windows-1253:0.1%, UTF-8:0.1%, x-MacRoman:0.0%, windows-1250:0.0%
  windows-1257            31.0% wrong → windows-1252:22.8%, windows-1250:1.5%, IBM852:1.4%, windows-1254:1.2%, ISO-8859-16:1.1%
  windows-1258             3.8% wrong → windows-1252:2.0%, x-MacRoman:0.3%, ISO-8859-16:0.2%, windows-1257:0.2%, IBM850:0.2%
  windows-874              8.6% wrong → windows-1252:5.4%, GB18030:0.8%, windows-1251:0.3%, windows-1257:0.2%, EUC-JP:0.2%
  x-EUC-TW                 7.4% wrong → GB18030:3.2%, windows-1252:2.9%, UTF-8:0.2%, windows-1256:0.2%, Big5-HKSCS:0.1%
  x-MacRoman              47.8% wrong → windows-1252:41.2%, IBM850:2.7%, IBM852:1.2%, ISO-8859-16:0.7%, windows-1250:0.5%
  x-mac-cyrillic           7.2% wrong → windows-1251:4.8%, windows-1252:1.0%, windows-1255:0.2%, windows-1253:0.2%, x-MacRoman:0.2%

=== Probe length: 128B ===
                            N  | --- ML ablation --------------------------------------------------- | --- Baselines --------------------------------- |
Charset                        | Stat R%   S%  T3%  D%   A%  | +ISO R%   S%  T3%  D%   A%  | +CJK R%   S%  T3%  D%   A%  | All  R%   S%  T3%  D%   A%  | ICU4J R%   S%  T3%  D%   A%  | juniv R%   S%  T3%  D%   A%  |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Big5-HKSCS              30334  |  99.8  99.8  99.9  99.9  99.9  |  99.8  99.8  99.9  99.9  99.9  |  99.8  99.8  99.9  99.9  99.9  |  99.8  99.8  99.9  99.9  99.9  |   0.0  99.6  99.8  58.9  58.9  |   0.0  84.3  84.3  58.6  58.6  |
EUC-JP                  37043  |  98.5  98.5  98.7  98.8  98.8  |  98.5  98.5  98.7  98.8  98.8  |  98.5  98.5  98.7  98.8  98.8  |  98.5  98.5  98.7  98.8  98.8  |  98.6  98.6  98.8  99.0  99.0  |  97.7  97.7  97.7  98.1  98.1  |
EUC-KR                  36883  |  99.6  99.6  99.6  99.6  99.6  |  99.6  99.6  99.6  99.6  99.6  |  99.6  99.6  99.6  99.6  99.6  |  99.6  99.6  99.6  99.6  99.6  |  99.7  99.7  99.7  99.8  99.8  |  99.8  99.8  99.8  99.8  99.8  |
GB18030                 36862  |  99.1  99.1  99.2  99.5  99.5  |  99.1  99.1  99.2  99.5  99.6  |  99.1  99.1  99.2  99.5  99.6  |  99.1  99.1  99.2  99.5  99.6  |  97.0  97.0  98.3  97.4  97.5  |  98.4  98.4  98.4  98.8  98.9  |
IBM1047                 34790  |  10.4  98.5  98.8  92.9  98.1  |  10.4  98.6  98.8  93.0  98.2  |  10.4  98.6  98.8  93.0  98.2  |  10.4  98.6  98.8  93.0  98.2  |   0.0  88.9 100.0  82.6  88.1  |   0.0   0.0   0.0   0.0   0.0  |
IBM420-ltr              36874  |  99.9  99.9  99.9  99.9  99.9  |  99.9  99.9  99.9  99.9  99.9  |  99.9  99.9  99.9  99.9  99.9  |  99.9  99.9  99.9  99.9  99.9  |   0.0  98.4  99.6   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM420-rtl              37018  |  99.7  99.8  99.8  99.7  99.7  |  99.7  99.8  99.8  99.7  99.7  |  99.7  99.8  99.8  99.7  99.7  |  99.7  99.8  99.8  99.7  99.7  |   0.0  98.6  99.7   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM424-ltr              34927  |  99.5  99.7  99.7  99.5  99.5  |  70.9  71.1  71.1  70.9  70.9  |  70.9  71.1  71.1  70.9  70.9  |  70.9  71.1  71.1  70.9  70.9  |   0.0  96.3  99.1   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM424-rtl              32984  |  99.9 100.0 100.0  99.9  99.9  |  56.8  56.9  56.9  56.8  56.8  |  56.8  56.9  56.9  56.8  56.8  |  56.8  56.9  56.9  56.8  56.8  |   0.0  96.3  98.9   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM500                  35083  |  89.5  98.7  98.9  98.4  98.7  |  89.5  98.7  99.0  98.4  98.7  |  89.5  98.7  99.0  98.4  98.7  |  89.5  98.7  99.0  98.4  98.7  |  89.0  89.0 100.0  89.0  89.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM850                  34505  |  88.5  88.5  93.9  94.7  94.7  |  87.3  87.3  90.7  94.7  94.7  |  87.3  87.3  90.7  94.7  94.7  |  87.3  87.3  90.7  94.7  94.7  |   0.0   0.0   0.0   7.0   7.3  |   0.0   0.0   0.0   7.0   7.3  |
IBM852                  35418  |  97.7  97.7  98.5  99.1  99.1  |  96.2  96.2  96.7  99.1  99.1  |  96.2  96.2  96.7  99.1  99.1  |  96.2  96.2  96.7  99.1  99.1  |   0.0   0.0   0.0   2.7   2.8  |   0.0   0.0   0.0   2.7   2.8  |
IBM855                  36702  |  99.9  99.9  99.9  99.9  99.9  |  99.9  99.9  99.9  99.9  99.9  |  99.9  99.9  99.9  99.9  99.9  |  99.9  99.9  99.9  99.9  99.9  |   0.0   0.0   0.0   0.0   0.0  |  99.8  99.8  99.8  99.9  99.9  |
IBM866                  36985  |  99.9  99.9  99.9 100.0 100.0  |  99.9  99.9  99.9 100.0 100.0  |  99.9  99.9  99.9 100.0 100.0  |  99.9  99.9  99.9 100.0 100.0  |  94.5  94.5  99.5  94.6  94.6  |  99.8  99.8  99.8  99.9  99.9  |
ISO-2022-CN             40954  |   0.0   0.0   0.0   0.8   0.8  |  99.2  99.2  99.2 100.0 100.0  |  99.2  99.2  99.2 100.0 100.0  |  99.2  99.2  99.2 100.0 100.0  |  98.3  98.3  98.6  99.1  99.2  |   0.0   0.0   0.0   0.8   0.8  |
ISO-2022-JP             37151  |   0.0   0.0   0.0   0.5   0.5  |  99.5  99.5  99.5 100.0 100.0  |  99.5  99.5  99.5 100.0 100.0  |  99.5  99.5  99.5 100.0 100.0  |  99.1  99.1  99.3  99.6  99.7  |  99.5  99.5  99.5 100.0 100.0  |
ISO-2022-KR             36860  |   0.0   0.0   0.0   0.1   0.1  |  99.9  99.9  99.9 100.0 100.0  |  99.9  99.9  99.9 100.0 100.0  |  99.9  99.9  99.9 100.0 100.0  |  99.2  99.2  99.3  99.3  99.3  |  99.9  99.9  99.9 100.0 100.0  |
ISO-8859-16             32901  |  83.2  83.2  89.0  96.3  96.3  |  77.5  77.5  82.4  96.3  96.3  |  77.5  77.5  82.4  96.3  96.3  |  77.5  77.5  82.4  96.3  96.3  |   0.0   0.0   0.0  57.8  58.1  |   0.0   0.0   0.0  49.9  50.1  |
ISO-8859-3              35648  |  99.3  99.3  99.7  99.6  99.6  |  97.5  97.5  97.8  99.6  99.6  |  97.5  97.5  97.8  99.6  99.6  |  97.5  97.5  97.8  99.6  99.6  |   0.0   0.0   0.0   2.4   2.4  |   0.0   0.0   0.0   2.4   2.4  |
KOI8-R                  36850  |  99.9  99.9  99.9 100.0 100.0  |  99.9  99.9  99.9 100.0 100.0  |  99.9  99.9  99.9 100.0 100.0  |  99.9  99.9  99.9 100.0 100.0  |  97.3  97.3  99.6  97.3  97.3  |  99.8  99.8  99.8  99.8  99.8  |
KOI8-U                  36846  |  98.9  99.9  99.9  99.2  99.2  |  98.9  99.9  99.9  99.2  99.2  |  98.9  99.9  99.9  99.2  99.2  |  98.9  99.9  99.9  99.2  99.2  |   0.0  94.2  99.1   0.4   0.4  |   0.0  99.7  99.7   0.5   0.5  |
Shift_JIS               36917  |  99.2  99.2  99.3  99.5  99.5  |  99.2  99.2  99.3  99.6  99.6  |  99.2  99.2  99.3  99.6  99.6  |  99.2  99.2  99.3  99.6  99.6  |  99.0  99.0  99.0  99.3  99.3  |  99.1  99.1  99.1  99.5  99.5  |
US-ASCII                36759  |   0.0   0.0   0.1 100.0 100.0  |   0.0   0.0   0.0 100.0 100.0  |   0.0   0.0   0.0 100.0 100.0  |   0.0   0.0   0.0 100.0 100.0  |   0.0   0.0  65.8 100.0 100.0  |   0.0   0.0   0.0 100.0 100.0  |
UTF-16-BE               36799  |  88.6  89.1  91.7  88.6  88.6  |  98.4  98.4  98.8  98.4  98.4  |  98.8  98.8  98.8  98.8  98.8  |  98.8  98.8  98.8  98.8  98.8  |  69.0  69.0  94.4  69.0  69.0  |   0.0   0.0   0.0   0.0   0.0  |
UTF-16-LE               36736  |  94.2  94.2  96.1  94.2  94.2  |  99.0  99.0  99.2  99.0  99.0  |  99.1  99.1  99.3  99.1  99.1  |  99.1  99.1  99.3  99.1  99.1  |  69.6  69.6  94.6  69.6  69.6  |   0.0   0.0   0.0   0.0   0.0  |
UTF-32-BE               36757  |   0.0   0.0   0.0   0.0   0.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |   0.0   0.0   0.0   0.0   0.0  |
UTF-32-LE               37011  |   0.0   0.0   0.0   0.0   0.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |   0.0   0.0   0.0   0.0   0.0  |
UTF-8                   36254  |  92.7  92.7  93.6  95.4  95.4  |  97.3  97.3  97.3 100.0 100.0  |  97.3  97.3  97.3 100.0 100.0  |  97.3  97.3  97.3 100.0 100.0  |  96.5  96.5  98.0  99.3  99.3  |  96.6  96.6  96.6  99.3  99.3  |
windows-1250            34548  |  88.6  88.6  90.7  95.5  95.7  |  88.5  88.5  90.4  95.5  95.6  |  88.5  88.5  90.4  95.5  95.6  |  88.5  88.5  90.4  95.5  95.6  |  49.6  74.4  98.9  80.3  80.3  |   0.0   0.0   0.0  10.6  13.5  |
windows-1251            36852  |  98.2  98.2  98.5  98.3  98.3  |  98.2  98.2  98.5  98.3  98.3  |  98.2  98.2  98.5  98.3  98.3  |  98.2  98.2  98.5  98.3  98.3  |  94.7  94.7  97.8  94.7  94.7  |  88.2  88.2  88.2  88.2  88.2  |
windows-1252            25975  |  89.4  89.4  93.7  95.2  95.3  |  94.0  94.0  96.6  95.1  95.2  |  94.0  94.0  96.6  95.1  95.2  |  94.0  94.0  96.6  95.1  95.2  |  16.0  88.0  99.3  93.8  93.8  |   0.0  99.1  99.1  81.5  99.0  |
windows-1253            36845  |  99.8  99.8  99.8  99.8  99.8  |  99.8  99.8  99.8  99.8  99.8  |  99.8  99.8  99.8  99.8  99.8  |  99.8  99.8  99.8  99.8  99.8  |  11.9  99.0  99.7  96.1  96.2  |   0.3  98.9  98.9  84.5  95.9  |
windows-1254            36705  |  99.6  99.6  99.8  99.7  99.7  |  99.5  99.5  99.7  99.7  99.7  |  99.5  99.5  99.7  99.7  99.7  |  99.5  99.5  99.7  99.7  99.7  |  29.0  96.3  99.8  96.5  96.5  |   0.0   0.0   0.0   0.6   0.8  |
windows-1255            36774  |  99.9  99.9  99.9  99.9  99.9  |  99.9  99.9  99.9  99.9 100.0  |  99.9  99.9  99.9  99.9 100.0  |  99.9  99.9  99.9  99.9 100.0  |  14.3  60.4  77.9  58.1  60.5  |  99.6  99.7  99.7  99.8  99.8  |
windows-1256            41912  |  99.9  99.9  99.9 100.0 100.0  |  99.9  99.9  99.9 100.0 100.0  |  99.9  99.9  99.9 100.0 100.0  |  99.9  99.9  99.9 100.0 100.0  |  47.8  82.4  97.5  47.9  47.9  |   0.0   0.0   0.0   0.1   0.1  |
windows-1257            35316  |  98.3  98.3  99.3  99.2  99.2  |  97.4  97.4  98.4  99.2  99.2  |  97.4  97.4  98.4  99.2  99.2  |  97.4  97.4  98.4  99.2  99.2  |   0.0   0.0   0.0  15.0  15.0  |   0.0   0.0   0.0  14.9  16.5  |
windows-1258            36885  |  99.8  99.8  99.9  99.9  99.9  |  99.8  99.8  99.8  99.9  99.9  |  99.8  99.8  99.8  99.9  99.9  |  99.8  99.8  99.8  99.9  99.9  |   0.0   0.0   0.0   0.1   0.1  |   0.0   0.0   0.0   0.1   0.1  |
windows-874             36809  |  99.4  99.4  99.5  99.8  99.8  |  99.4  99.4  99.5  99.8  99.8  |  99.4  99.4  99.5  99.9  99.9  |  99.4  99.4  99.5  99.9  99.9  |   0.0   0.0   0.0   0.4   0.4  |   0.0   0.0   0.0  88.3  99.9  |
x-EUC-TW                26788  |  99.4  99.4  99.5  99.5  99.5  |  99.4  99.4  99.5  99.5  99.5  |  99.4  99.4  99.5  99.5  99.5  |  99.4  99.4  99.5  99.5  99.5  |   0.0   0.0   0.0   0.1   0.1  |  81.1  81.1  81.1  81.2  81.2  |
x-MacRoman              36756  |  92.2  92.2  96.8  96.1  96.2  |  90.1  90.1  92.7  96.1  96.2  |  90.1  90.1  92.7  96.1  96.2  |  90.1  90.1  92.7  96.1  96.2  |   0.0   0.0   0.0   6.1   6.2  |   0.0   0.0   0.0   6.1   6.2  |
x-mac-cyrillic          36631  |  99.7  99.7  99.8  99.7  99.7  |  99.7  99.7  99.8  99.7  99.7  |  99.7  99.7  99.8  99.7  99.7  |  99.7  99.7  99.8  99.7  99.7  |   0.0   0.0   0.0   0.0   0.0  |  76.4  76.4  76.4  76.4  76.4  |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
OVERALL               1469647  |  80.0  82.4  83.1  85.7  85.9  |  91.4  93.7  94.2  97.3  97.5  |  91.4  93.8  94.2  97.4  97.5  |  91.4  93.8  94.2  97.4  97.5  |  42.0  65.8  71.6  56.9  57.2  |  32.9  41.4  41.4  44.7  45.8  |
  Stat=model only | +ISO=+C1-correction | +CJK=+grammar | All=ML+rules | R%=strict | S%=soft | T3%=top-3 hit | D%=decode-match | A%=alpha-match
  µs/sample                    |                       11.4  |                        8.8  |                        8.4  |                        8.3  |                       40.3  |                        5.8  |

--- Confusion (All/ML+rules, 128B, top errors per charset) ---
  Big5-HKSCS               0.2% wrong → GB18030:0.1%, windows-1252:0.0%, IBM850:0.0%, ISO-8859-16:0.0%, windows-874:0.0%
  EUC-JP                   1.5% wrong → GB18030:0.9%, windows-1252:0.3%, IBM850:0.1%, x-MacRoman:0.1%, IBM852:0.0%
  EUC-KR                   0.4% wrong → windows-1252:0.2%, GB18030:0.1%, x-MacRoman:0.0%, EUC-JP:0.0%, IBM850:0.0%
  GB18030                  0.9% wrong → windows-1252:0.5%, IBM850:0.1%, x-MacRoman:0.1%, ISO-8859-3:0.0%, ISO-8859-16:0.0%
  IBM1047                 89.6% wrong → IBM500:88.2%, IBM420-ltr:0.7%, IBM424-rtl:0.3%, UTF-16-BE:0.2%, IBM424-ltr:0.1%
  IBM420-ltr               0.1% wrong → IBM500:0.1%, IBM420-rtl:0.0%, UTF-16-BE:0.0%, IBM424-ltr:0.0%, IBM1047:0.0%
  IBM420-rtl               0.3% wrong → IBM424-rtl:0.2%, IBM420-ltr:0.1%
  IBM424-ltr              29.1% wrong → windows-1252:28.4%, IBM500:0.2%, UTF-8:0.2%, IBM424-rtl:0.2%, IBM420-ltr:0.0%
  IBM424-rtl              43.2% wrong → windows-1252:43.0%, UTF-8:0.1%, IBM424-ltr:0.0%, IBM500:0.0%, windows-1253:0.0%
  IBM500                  10.5% wrong → IBM1047:9.2%, IBM420-ltr:0.6%, IBM424-rtl:0.2%, UTF-16-BE:0.2%, IBM424-ltr:0.1%
  IBM850                  12.7% wrong → windows-1252:9.2%, x-MacRoman:1.9%, IBM852:0.7%, windows-1257:0.4%, ISO-8859-16:0.4%
  IBM852                   3.8% wrong → windows-1252:2.8%, ISO-8859-16:0.3%, IBM850:0.3%, windows-1250:0.2%, x-MacRoman:0.1%
  IBM855                   0.1% wrong → windows-1252:0.1%, IBM852:0.0%, x-MacRoman:0.0%, windows-1250:0.0%, UTF-8:0.0%
  IBM866                   0.1% wrong → windows-1252:0.1%, x-mac-cyrillic:0.0%, IBM852:0.0%, ISO-8859-3:0.0%, IBM850:0.0%
  ISO-2022-CN              0.8% wrong → windows-1252:0.8%, ISO-2022-JP:0.0%
  ISO-2022-JP              0.5% wrong → windows-1252:0.5%
  ISO-2022-KR              0.1% wrong → windows-1252:0.1%, ISO-2022-JP:0.0%
  ISO-8859-16             22.5% wrong → windows-1250:12.9%, windows-1252:8.5%, IBM852:0.4%, windows-1257:0.2%, x-MacRoman:0.2%
  ISO-8859-3               2.5% wrong → windows-1252:2.2%, x-MacRoman:0.1%, IBM852:0.0%, UTF-8:0.0%, ISO-8859-16:0.0%
  KOI8-R                   0.1% wrong → windows-1252:0.1%, KOI8-U:0.0%, x-MacRoman:0.0%, IBM850:0.0%, windows-1253:0.0%
  KOI8-U                   1.1% wrong → KOI8-R:1.0%, windows-1252:0.0%, x-MacRoman:0.0%, windows-1251:0.0%, ISO-8859-16:0.0%
  Shift_JIS                0.8% wrong → windows-1252:0.3%, x-MacRoman:0.1%, IBM850:0.1%, IBM852:0.1%, x-mac-cyrillic:0.0%
  US-ASCII               100.0% wrong → windows-1252:100.0%, ISO-2022-JP:0.0%
  UTF-16-BE                1.2% wrong → windows-1252:1.0%, UTF-8:0.0%, UTF-16LE:0.0%, ISO-8859-16:0.0%, IBM500:0.0%
  UTF-16-LE                0.9% wrong → windows-1252:0.7%, KOI8-U:0.1%, IBM866:0.0%, IBM852:0.0%, UTF-16BE:0.0%
  UTF-8                    2.7% wrong → windows-1252:2.7%
  windows-1250            11.5% wrong → ISO-8859-16:5.6%, windows-1252:3.2%, IBM852:2.0%, windows-1257:0.3%, x-MacRoman:0.2%
  windows-1251             1.8% wrong → x-mac-cyrillic:1.6%, windows-1252:0.1%, windows-1250:0.0%, windows-1257:0.0%, IBM852:0.0%
  windows-1252             6.0% wrong → x-MacRoman:2.6%, IBM850:1.0%, windows-1257:0.5%, windows-1250:0.4%, ISO-8859-16:0.3%
  windows-1253             0.2% wrong → windows-1252:0.2%, KOI8-R:0.0%, ISO-8859-3:0.0%, windows-1257:0.0%, x-MacRoman:0.0%
  windows-1254             0.5% wrong → windows-1252:0.4%, IBM852:0.0%, windows-1250:0.0%, IBM850:0.0%, ISO-8859-16:0.0%
  windows-1255             0.1% wrong → windows-1252:0.1%, x-MacRoman:0.0%, IBM852:0.0%, IBM850:0.0%, windows-1254:0.0%
  windows-1256             0.1% wrong → windows-1252:0.1%, x-MacRoman:0.0%, UTF-8:0.0%, windows-1254:0.0%, ISO-8859-3:0.0%
  windows-1257             2.6% wrong → windows-1252:2.0%, x-MacRoman:0.1%, windows-1250:0.1%, IBM852:0.1%, windows-1254:0.1%
  windows-1258             0.2% wrong → windows-1252:0.2%, x-MacRoman:0.0%, IBM852:0.0%, IBM850:0.0%, ISO-8859-3:0.0%
  windows-874              0.6% wrong → windows-1252:0.4%, x-MacRoman:0.0%, windows-1254:0.0%, IBM850:0.0%, GB18030:0.0%
  x-EUC-TW                 0.6% wrong → GB18030:0.5%, windows-1252:0.1%, windows-1256:0.0%, Big5-HKSCS:0.0%, x-MacRoman:0.0%
  x-MacRoman               9.9% wrong → windows-1252:7.8%, IBM850:1.3%, ISO-8859-16:0.3%, IBM852:0.2%, windows-1257:0.1%
  x-mac-cyrillic           0.3% wrong → windows-1251:0.2%, windows-1252:0.0%, IBM852:0.0%, x-MacRoman:0.0%, IBM866:0.0%

=== Probe length: full ===
                            N  | --- ML ablation --------------------------------------------------- | --- Baselines --------------------------------- |
Charset                        | Stat R%   S%  T3%  D%   A%  | +ISO R%   S%  T3%  D%   A%  | +CJK R%   S%  T3%  D%   A%  | All  R%   S%  T3%  D%   A%  | ICU4J R%   S%  T3%  D%   A%  | juniv R%   S%  T3%  D%   A%  |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Big5-HKSCS              30334  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |   0.0  99.9 100.0  33.8  33.8  |   0.0  84.6  84.6  33.7  33.7  |
EUC-JP                  37043  |  99.8  99.8  99.8  99.8  99.8  |  99.8  99.8  99.8  99.8  99.8  |  99.8  99.8  99.8  99.8  99.8  |  99.8  99.8  99.8  99.8  99.8  |  99.9  99.9  99.9  99.9  99.9  |  99.6  99.6  99.6  99.6  99.6  |
EUC-KR                  36883  |  99.9  99.9 100.0  99.9  99.9  |  99.9  99.9 100.0  99.9  99.9  |  99.9  99.9 100.0  99.9  99.9  |  99.9  99.9 100.0  99.9  99.9  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |
GB18030                 36862  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |  99.5  99.5  99.8  99.5  99.5  |  99.7  99.7  99.7  99.7  99.7  |
IBM1047                 34790  |  15.2  99.6  99.7  82.2  98.1  |  15.2  99.7  99.7  82.2  98.2  |  15.2  99.7  99.7  82.2  98.2  |  15.2  99.7  99.7  82.2  98.2  |   0.0  75.7 100.0  59.7  73.6  |   0.0   0.0   0.0   0.0   0.0  |
IBM420-ltr              36874  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |   0.0  99.5  99.9   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM420-rtl              37018  |  99.9 100.0 100.0  99.9  99.9  |  99.9 100.0 100.0  99.9  99.9  |  99.9 100.0 100.0  99.9  99.9  |  99.9 100.0 100.0  99.9  99.9  |   0.0  99.3  99.9   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM424-ltr              34927  |  99.9  99.9  99.9  99.9  99.9  |  99.9  99.9  99.9  99.9  99.9  |  99.9  99.9  99.9  99.9  99.9  |  99.9  99.9  99.9  99.9  99.9  |   0.0  94.3  99.7   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM424-rtl              32984  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |   0.0  92.0  99.6   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM500                  35083  |  89.0  99.7  99.8  98.3  99.7  |  89.1  99.8  99.8  98.4  99.8  |  89.1  99.8  99.8  98.4  99.8  |  89.1  99.8  99.8  98.4  99.8  |  76.1  76.1 100.0  76.1  76.1  |   0.0   0.0   0.0   0.0   0.0  |
IBM850                  34505  |  99.7  99.7  99.9  99.8  99.8  |  99.7  99.7  99.9  99.8  99.8  |  99.7  99.7  99.9  99.8  99.8  |  99.7  99.7  99.9  99.8  99.8  |   0.0   0.0   0.0   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM852                  35418  |  99.9  99.9 100.0  99.9  99.9  |  99.9  99.9 100.0  99.9  99.9  |  99.9  99.9 100.0  99.9  99.9  |  99.9  99.9 100.0  99.9  99.9  |   0.0   0.0   0.0   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
IBM855                  36702  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |   0.0   0.0   0.0   0.0   0.0  | 100.0 100.0 100.0 100.0 100.0  |
IBM866                  36985  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |  98.9  98.9  99.9  98.9  98.9  | 100.0 100.0 100.0 100.0 100.0  |
ISO-2022-CN             40954  |   0.0   0.0   0.0   0.3   0.3  |  99.7  99.7  99.7 100.0 100.0  |  99.7  99.7  99.7 100.0 100.0  |  99.7  99.7  99.7 100.0 100.0  |  99.4  99.4  99.4  99.6  99.6  |   0.0   0.0   0.0   0.3   0.3  |
ISO-2022-JP             37151  |   0.0   0.0   0.0   0.1   0.1  |  99.9  99.9  99.9 100.0 100.0  |  99.9  99.9  99.9 100.0 100.0  |  99.9  99.9  99.9 100.0 100.0  |  99.8  99.8  99.8  99.9  99.9  |  99.9  99.9  99.9 100.0 100.0  |
ISO-2022-KR             36860  |   0.0   0.0   0.0   0.0   0.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |  99.7  99.7  99.7  99.7  99.7  | 100.0 100.0 100.0 100.0 100.0  |
ISO-8859-16             32901  |  95.4  95.4  96.5  97.5  97.5  |  95.4  95.4  96.5  97.5  97.5  |  95.4  95.4  96.5  97.5  97.5  |  95.4  95.4  96.5  97.5  97.5  |   0.0   0.0   0.0  14.1  14.4  |   0.0   0.0   0.0  11.9  12.2  |
ISO-8859-3              35648  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |   0.0   0.0   0.0   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
KOI8-R                  36850  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |  99.3  99.3  99.9  99.3  99.3  |  99.9  99.9  99.9  99.9  99.9  |
KOI8-U                  36846  |  99.7 100.0 100.0  99.8  99.8  |  99.7 100.0 100.0  99.8  99.8  |  99.7 100.0 100.0  99.8  99.8  |  99.7 100.0 100.0  99.8  99.8  |   0.0  98.2  99.8   0.1   0.1  |   0.0  99.9  99.9   0.1   0.1  |
Shift_JIS               36917  |  99.9  99.9 100.0  99.9  99.9  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |  99.9  99.9  99.9  99.9  99.9  |
US-ASCII                36759  |   0.0   0.0   0.0 100.0 100.0  |   0.0   0.0   0.0 100.0 100.0  |   0.0   0.0   0.0 100.0 100.0  |   0.0   0.0   0.0 100.0 100.0  |   0.0   0.0  69.4 100.0 100.0  |   0.0   0.0   0.0 100.0 100.0  |
UTF-16-BE               36799  |  93.7  94.4  95.1  93.7  93.7  |  98.7  98.7  98.9  98.7  98.7  |  98.8  98.9  98.9  98.8  98.8  |  98.8  98.9  98.9  98.8  98.8  |  68.6  68.6  95.7  68.6  68.6  |   0.0   0.0   0.0   0.0   0.0  |
UTF-16-LE               36736  |  97.5  97.5  98.1  97.5  97.5  |  99.3  99.3  99.5  99.3  99.3  |  99.4  99.4  99.5  99.4  99.4  |  99.4  99.4  99.5  99.4  99.4  |  68.8  68.8  96.5  68.8  68.8  |   0.0   0.0   0.0   0.0   0.0  |
UTF-32-BE               36757  |   0.0   0.0   0.0   0.0   0.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |   0.0   0.0   0.0   0.0   0.0  |
UTF-32-LE               37011  |   0.0   0.0   0.0   0.0   0.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |   0.0   0.0   0.0   0.0   0.0  |
UTF-8                   36254  |  98.8  98.8  99.0  98.8  98.8  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |  99.6  99.6  99.6  99.6  99.6  |
windows-1250            34548  |  99.2  99.2  99.3  99.2  99.2  |  99.2  99.2  99.3  99.2  99.2  |  99.2  99.2  99.3  99.2  99.2  |  99.2  99.2  99.3  99.2  99.2  |  80.1  85.5  99.9  84.5  84.5  |   0.0   0.0   0.0   0.0   0.0  |
windows-1251            36852  |  99.5  99.5  99.6  99.5  99.5  |  99.5  99.5  99.6  99.5  99.5  |  99.5  99.5  99.6  99.5  99.5  |  99.5  99.5  99.6  99.5  99.5  |  98.9  98.9  99.6  98.9  98.9  |  93.2  93.2  93.2  93.2  93.2  |
windows-1252            25975  |  99.7  99.7  99.7  99.7  99.7  |  99.7  99.7  99.7  99.7  99.7  |  99.7  99.7  99.7  99.7  99.7  |  99.7  99.7  99.7  99.7  99.7  |  46.5  94.4  99.8  94.4  94.4  |   0.0  98.9  98.9  50.8  98.1  |
windows-1253            36845  |  99.9  99.9 100.0  99.9  99.9  |  99.9  99.9 100.0  99.9  99.9  |  99.9  99.9 100.0  99.9  99.9  |  99.9  99.9 100.0  99.9  99.9  |  33.7  99.8  99.9  93.8  93.9  |   1.0  99.7  99.7  61.1  90.4  |
windows-1254            36705  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |  60.3  99.3 100.0  99.3  99.3  |   0.0   0.0   0.0   0.0   0.0  |
windows-1255            36774  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |  38.0  81.7  94.2  76.3  81.7  |  99.9  99.9  99.9  99.9  99.9  |
windows-1256            41912  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |  52.0  91.6  99.4  52.0  52.0  |   0.0   0.0   0.0   0.0   0.0  |
windows-1257            35316  |  99.9  99.9 100.0  99.9  99.9  |  99.9  99.9 100.0  99.9  99.9  |  99.9  99.9 100.0  99.9  99.9  |  99.9  99.9 100.0  99.9  99.9  |   0.0   0.0   0.0   0.0   0.0  |   0.0   0.0   0.0   0.0   0.1  |
windows-1258            36885  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |   0.0   0.0   0.0   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
windows-874             36809  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  | 100.0 100.0 100.0 100.0 100.0  |   0.0   0.0   0.0   0.0   0.0  |   0.0   0.0   0.0  70.6  99.9  |
x-EUC-TW                26788  |  99.9  99.9  99.9  99.9  99.9  |  99.9  99.9  99.9  99.9  99.9  |  99.9  99.9  99.9  99.9  99.9  |  99.9  99.9  99.9  99.9  99.9  |   0.0   0.0   0.0   0.0   0.0  |  81.1  81.1  81.1  81.1  81.1  |
x-MacRoman              36756  |  99.6  99.6  99.9  99.6  99.6  |  99.7  99.7  99.9  99.7  99.7  |  99.7  99.7  99.9  99.7  99.7  |  99.7  99.7  99.9  99.7  99.7  |   0.0   0.0   0.0   0.0   0.0  |   0.0   0.0   0.0   0.0   0.0  |
x-mac-cyrillic          36631  |  99.9  99.9 100.0  99.9  99.9  |  99.9  99.9 100.0  99.9  99.9  |  99.9  99.9 100.0  99.9  99.9  |  99.9  99.9 100.0  99.9  99.9  |   0.0   0.0   0.0   0.0   0.0  |  86.7  86.7  86.7  86.7  86.7  |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
OVERALL               1469647  |  81.9  84.2  84.3  86.3  86.7  |  95.0  97.2  97.3  99.4  99.8  |  95.0  97.3  97.3  99.4  99.8  |  95.0  97.3  97.3  99.4  99.8  |  45.5  67.0  72.6  54.8  55.3  |  33.5  42.0  42.0  41.2  43.5  |
  Stat=model only | +ISO=+C1-correction | +CJK=+grammar | All=ML+rules | R%=strict | S%=soft | T3%=top-3 hit | D%=decode-match | A%=alpha-match
  µs/sample                    |                       18.4  |                       13.7  |                       12.8  |                       13.1  |                      146.6  |                       16.4  |

--- Confusion (All/ML+rules, full, top errors per charset) ---
  Big5-HKSCS               0.0% wrong → GB18030:0.0%, windows-1252:0.0%, ISO-8859-16:0.0%
  EUC-JP                   0.2% wrong → GB18030:0.2%, ISO-2022-JP:0.0%, IBM852:0.0%, IBM850:0.0%, EUC-KR:0.0%
  EUC-KR                   0.1% wrong → windows-1252:0.0%, GB18030:0.0%
  GB18030                  0.0% wrong → windows-1252:0.0%, Big5-HKSCS:0.0%, x-EUC-TW:0.0%, windows-874:0.0%, IBM424-ltr:0.0%
  IBM1047                 84.8% wrong → IBM500:84.5%, IBM420-ltr:0.1%, UTF-16-BE:0.1%, IBM424-rtl:0.0%, UTF-16-LE:0.0%
  IBM420-ltr               0.0% wrong → IBM420-rtl:0.0%, IBM500:0.0%
  IBM420-rtl               0.1% wrong → IBM424-rtl:0.0%, IBM420-ltr:0.0%
  IBM424-ltr               0.1% wrong → IBM500:0.1%, IBM424-rtl:0.0%, UTF-8:0.0%, IBM420-ltr:0.0%, x-mac-cyrillic:0.0%
  IBM424-rtl               0.0% wrong → IBM424-ltr:0.0%, UTF-8:0.0%, IBM420-rtl:0.0%
  IBM500                  10.9% wrong → IBM1047:10.7%, IBM420-ltr:0.1%, UTF-16-BE:0.1%, IBM424-rtl:0.0%, IBM424-ltr:0.0%
  IBM850                   0.3% wrong → windows-1252:0.1%, x-MacRoman:0.1%, IBM852:0.1%, windows-1257:0.0%, ISO-8859-16:0.0%
  IBM852                   0.1% wrong → ISO-8859-16:0.0%, windows-1250:0.0%, IBM850:0.0%, windows-1252:0.0%, IBM424-rtl:0.0%
  IBM855                   0.0% wrong → windows-1252:0.0%, IBM850:0.0%, x-MacRoman:0.0%, IBM852:0.0%, UTF-8:0.0%
  IBM866                   0.0% wrong → x-MacRoman:0.0%, IBM852:0.0%, x-mac-cyrillic:0.0%
  ISO-2022-CN              0.3% wrong → windows-1252:0.3%, ISO-2022-JP:0.0%
  ISO-2022-JP              0.1% wrong → windows-1252:0.1%
  ISO-2022-KR              0.0% wrong → windows-1252:0.0%, ISO-2022-JP:0.0%
  ISO-8859-16              4.6% wrong → windows-1250:4.5%, windows-1252:0.0%, windows-1257:0.0%, IBM852:0.0%, x-MacRoman:0.0%
  ISO-8859-3               0.0% wrong → windows-1252:0.0%, x-MacRoman:0.0%, IBM852:0.0%, UTF-8:0.0%, windows-1254:0.0%
  KOI8-R                   0.0% wrong → KOI8-U:0.0%, windows-874:0.0%, windows-1252:0.0%
  KOI8-U                   0.3% wrong → KOI8-R:0.3%, windows-1250:0.0%, x-MacRoman:0.0%, windows-1251:0.0%, windows-1252:0.0%
  Shift_JIS                0.0% wrong → IBM852:0.0%, x-mac-cyrillic:0.0%, IBM866:0.0%, GB18030:0.0%, UTF-16-LE:0.0%
  US-ASCII               100.0% wrong → windows-1252:100.0%, x-MacRoman:0.0%, windows-1255:0.0%, ISO-2022-JP:0.0%
  UTF-16-BE                1.2% wrong → windows-1252:1.0%, UTF-16LE:0.1%, ISO-8859-16:0.1%, UTF-8:0.0%, UTF-16-LE:0.0%
  UTF-16-LE                0.6% wrong → windows-1252:0.5%, UTF-16BE:0.0%, IBM852:0.0%, IBM866:0.0%, windows-1256:0.0%
  windows-1250             0.8% wrong → ISO-8859-16:0.7%, IBM852:0.0%, windows-1252:0.0%, x-MacRoman:0.0%, windows-1257:0.0%
  windows-1251             0.5% wrong → x-mac-cyrillic:0.5%, IBM852:0.0%, windows-1250:0.0%, KOI8-R:0.0%
  windows-1252             0.3% wrong → EUC-KR:0.1%, windows-1256:0.0%, IBM850:0.0%, windows-1253:0.0%, KOI8-R:0.0%
  windows-1253             0.1% wrong → windows-1252:0.0%, windows-1255:0.0%, KOI8-R:0.0%, IBM424-ltr:0.0%, IBM852:0.0%
  windows-1254             0.0% wrong → windows-1252:0.0%, windows-1250:0.0%, x-MacRoman:0.0%, IBM852:0.0%, ISO-8859-16:0.0%
  windows-1256             0.0% wrong → windows-1252:0.0%, x-MacRoman:0.0%, windows-874:0.0%, windows-1250:0.0%
  windows-1257             0.1% wrong → windows-1252:0.1%, windows-1250:0.0%, IBM852:0.0%, x-MacRoman:0.0%, IBM850:0.0%
  windows-1258             0.0% wrong → x-MacRoman:0.0%, ISO-8859-3:0.0%, windows-874:0.0%, windows-1257:0.0%
  windows-874              0.0% wrong → windows-1252:0.0%, GB18030:0.0%, IBM850:0.0%, IBM852:0.0%
  x-EUC-TW                 0.1% wrong → GB18030:0.1%
  x-MacRoman               0.3% wrong → IBM850:0.2%, windows-1252:0.1%, IBM852:0.0%, ISO-8859-16:0.0%, windows-1257:0.0%
  x-mac-cyrillic           0.1% wrong → windows-1251:0.0%, windows-1252:0.0%, x-MacRoman:0.0%, windows-1250:0.0%

=== Accuracy by probe length (All detector) ===
  Length     Strict%     Soft%     Top3%   Decode%    Alpha%
  ----------------------------------------------------------
  8B            59.1      62.6      70.2      83.2      83.4
  32B           80.8      83.6      86.3      93.4      93.5
  128B          91.4      93.8      94.2      97.4      97.5
  full          95.0      97.3      97.3      99.4      99.8
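
The summary table above can also be read programmatically, for example to see how much strict accuracy improves at each probe length. The following is a minimal sketch, assuming the four data rows are pasted verbatim into a string. The names SUMMARY, COLUMNS, parse_summary, and strict_gains are illustrative only and are not part of the benchmark tool.

```python
# Rows copied verbatim from the "Accuracy by probe length (All detector)" table.
SUMMARY = """\
  8B            59.1      62.6      70.2      83.2      83.4
  32B           80.8      83.6      86.3      93.4      93.5
  128B          91.4      93.8      94.2      97.4      97.5
  full          95.0      97.3      97.3      99.4      99.8
"""

# Hypothetical short names for the Strict%, Soft%, Top3%, Decode%, Alpha% columns.
COLUMNS = ("strict", "soft", "top3", "decode", "alpha")


def parse_summary(text):
    """Return {probe_length: {metric: percent}}, preserving table order."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS) + 1:
            continue  # skip blank or malformed lines
        rows[fields[0]] = dict(zip(COLUMNS, map(float, fields[1:])))
    return rows


def strict_gains(rows):
    """Percentage-point gain in strict accuracy between consecutive probe lengths."""
    labels = list(rows)
    return {
        f"{a}->{b}": round(rows[b]["strict"] - rows[a]["strict"], 1)
        for a, b in zip(labels, labels[1:])
    }


table = parse_summary(SUMMARY)
gains = strict_gains(table)
for step, gain in gains.items():
    print(f"{step:>12}: +{gain:.1f} points strict")
```

On these rows the largest step is from 8B to 32B, and the gain shrinks with each longer probe.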
