Diachronic normalisation of Polish texts
Transform old Polish texts into modern spelling. [ver. 1.0.0]
This is the full list of all submissions; to see only the best results, go to the leaderboard.
| # | submitter | when | ver. | description | dev-0 CharMatch | dev-1 CharMatch | test-A CharMatch |
|---|---|---|---|---|---|---|---|
| 26 | ked | 2023-10-12 10:01 | 1.0.0 | plt5-base_normalizer_test_pruned, no finetuning | N/A | N/A | 0.0021 |
| 14 | p/tlen | 2022-07-07 06:18 | 1.0.0 | Lucene Transducers ver. 0.25-SNAPSHOT extended=yes rule-based | 1.0000 | 0.5508 | 0.5968 |
| 7 | p/tlen | 2022-07-07 06:18 | 1.0.0 | Lucene Transducers ver. 0.25-SNAPSHOT extended=no rule-based | 1.0000 | 0.6633 | 0.6662 |
| 17 | p/tlen | 2022-07-06 18:45 | 1.0.0 | Lucene Transducers ver. 0.24 extended=yes rule-based | 1.0000 | 0.5375 | 0.5839 |
| 6 | p/tlen | 2022-07-06 18:45 | 1.0.0 | Lucene Transducers ver. 0.24 extended=no rule-based | 1.0000 | 0.6633 | 0.6662 |
| 16 | p/tlen | 2022-07-06 18:35 | 1.0.0 | Lucene Transducers ver. 0.24 extended=yes rule-based | 1.0000 | 0.5375 | 0.5839 |
| 5 | p/tlen | 2022-07-06 18:35 | 1.0.0 | Lucene Transducers ver. 0.24 extended=no rule-based | 1.0000 | 0.6633 | 0.6662 |
| 15 | p/tlen | 2022-07-06 18:23 | 1.0.0 | Lucene Transducers ver. 0.24-SNAPSHOT extended=yes rule-based | 1.0000 | 0.5375 | 0.5839 |
| 4 | p/tlen | 2022-07-06 18:23 | 1.0.0 | Lucene Transducers ver. 0.24-SNAPSHOT extended=no rule-based | 1.0000 | 0.6633 | 0.6662 |
| 22 | p/tlen | 2022-07-06 18:21 | 1.0.0 | Lucene Transducers ver. 0.24-SNAPSHOT extended=yes rule-based | 1.0000 | 0.3570 | 0.4012 |
| 3 | p/tlen | 2022-07-06 18:21 | 1.0.0 | Lucene Transducers ver. 0.24-SNAPSHOT extended=no rule-based | 1.0000 | 0.6633 | 0.6662 |
| 19 | p/tlen | 2022-07-06 14:41 | 1.0.0 | Lucene Transducers ver. 0.24-SNAPSHOT rule-based | 1.0000 | 0.5328 | 0.5820 |
| 1 | p/tlen | 2022-02-24 19:47 | 1.0.0 | Lucene Transducers ver. 0.23-SNAPSHOT rule-based | 1.0000 | 0.6618 | 0.6732 |
| 2 | p/tlen | 2021-10-20 14:00 | 1.0.0 | Lucene Transducers ver. 0.23-SNAPSHOT rule-based | 1.0000 | 0.6628 | 0.6663 |
| 9 | p/tlen | 2021-10-20 11:02 | 1.0.0 | Lucene Transducers ver. 0.22-SNAPSHOT rule-based | 1.0000 | 0.6724 | 0.6580 |
| 8 | [anonymized] | 2021-08-15 13:19 | 1.0.0 | 0.22 use nosecondary option | 1.0000 | 0.6724 | 0.6580 |
| 21 | [anonymized] | 2021-08-03 20:03 | 1.0.0 | Lucene transducers 0.22 - move pairs to a separate file | 1.0000 | 0.4064 | 0.4101 |
| 23 | p/tlen | 2020-04-22 19:16 | 1.0.0 | PSI-Toolkit Diachroniser 2020 | 1.0000 | 0.2045 | 0.2934 |
| 10 | p/tlen | 2019-10-26 19:38 | 1.0.0 | Lucene Transducers 0.21 | 1.0000 | 0.6724 | 0.6580 |
| 11 | p/tlen | 2019-10-19 20:13 | 1.0.0 | Lucene Transducers 20 | 1.0000 | 0.6143 | 0.6189 |
| 24 | p/tlen | 2018-03-30 12:49 | 1.0.0 | PSI-Toolkit better-diachronizer | 1.0000 | 0.1375 | 0.1951 |
| 12 | p/tlen | 2018-03-17 11:07 | 1.0.0 | use Lucene token filter with sub-word variants (v. 0.15) | 1.0000 | 0.6110 | 0.6093 |
| 25 | [anonymized] | 2018-03-16 20:38 | 1.0.0 | Raw normalization | N/A | 0.0150 | 0.0284 |
| 13 | p/tlen | 2018-03-16 13:25 | 1.0.0 | use Lucene filter with words mined using word2vec (v. 0.14) | 1.0000 | 0.6061 | 0.6031 |
| 18 | p/tlen | 2018-03-16 11:16 | 1.0.0 | use Lucene filter without OCR fixes (v. 0.13) | 1.0000 | 0.6122 | 0.5833 |
| 20 | p/tlen | 2018-03-15 20:55 | 1.0.0 | use Lucene token filter (v. 0.12) | 1.0000 | 0.5181 | 0.4656 |
| 27 | p/tlen | 2018-03-15 20:47 | 1.0.0 | do nothing stupid | 0.0000 | 0.0000 | 0.0000 |