Leaderboard - NKJP Tagset (Morfeusz)


Rank Model name Model url Pretrained embeddings Dataset Metric Average Tokens Sentences Words UPOS XPOS UFeats AllTags Lemmas UAS LAS CLAS MLAS BLEX
1 combo LINK HerBERT NKJP AligndAcc 97.28 - - - 98.76 96.54 96.65 96.08 98.35 - - - - -
F1 96.65 99.16 93.08 99.07 97.84 95.65 95.76 95.19 97.44 - - - - -
2 stanza LINK fasttext NKJP AligndAcc 95.55 - - - 97.97 94.10 94.38 93.85 97.43 - - - - -
F1 95.89 99.77 92.70 99.46 97.45 93.59 93.88 93.35 96.91 - - - - -
3 combo LINK fasttext NKJP AligndAcc 95.78 - - - 98.22 94.63 94.41 93.77 97.85 - - - - -
F1 95.72 99.16 93.08 99.07 97.31 93.76 93.54 92.91 96.95 - - - - -
4 udpipe LINK fasttext NKJP AligndAcc 93.35 - - - 97.67 90.87 91.21 90.87 96.13 - - - - -
F1 94.42 99.73 90.58 99.70 97.38 90.60 90.94 90.60 95.84 - - - - -
5 trankit LINK xlm-RoBERTa-base NKJP AligndAcc 92.99 - - - 97.43 91.69 92.03 90.65 93.15 - - - - -
F1 92.36 98.24 88.58 97.72 95.21 89.59 89.93 88.58 91.02 - - - - -
6 concraft LINK - NKJP AligndAcc 92.92 - - - 96.22 90.22 90.79 90.22 97.15 - - - - -
F1 91.52 98.55 71.10 99.62 95.86 89.88 90.45 89.88 96.79 - - - - -
7 spaCy LINK dkleczek NKJP AligndAcc 70.88 - - - 98.55 96.03 31.49 30.91 97.44 - - - - -
F1 76.00 99.56 61.06 98.46 97.03 94.55 31.00 30.43 95.94 - - - - -
8 spaCy LINK pl-core-news-lg NKJP AligndAcc 69.66 - - - 97.77 92.31 31.49 30.54 96.21 - - - - -
F1 75.25 99.56 61.06 98.46 96.26 90.89 31.00 30.07 94.73 - - - - -
9 spaCy LINK fasttext NKJP AligndAcc 69.29 - - - 97.35 91.20 31.49 30.49 95.94 - - - - -
F1 75.02 99.56 61.06 98.46 95.85 89.79 31.00 30.02 94.46 - - - - -

Legend:

  • NKJP - test set, prepared by fairly splitting the frozen 1M NKJP snapshot by document type (news, fiction, poetry, etc.) and then randomly selecting sentences in even proportions, so that each split contains an equal number of sentences from each document type (a sampling sketch follows below). Does not contain dependency parsing annotations. Prediction is done on a plain text file.
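A minimal sketch of the sampling procedure described above, assuming each sentence carries a document-type label; the function name, the `sentences` structure, and `per_type` are hypothetical illustrations, not part of the original benchmark code:

```python
# Hypothetical sketch: draw an equal number of sentences from each document
# type, as the legend describes. `sentences` is assumed to be an iterable of
# (doc_type, sentence) pairs; `per_type` is how many sentences to keep per type.
import random
from collections import defaultdict

def sample_equal_per_type(sentences, per_type, seed=0):
    by_type = defaultdict(list)
    for doc_type, sent in sentences:
        by_type[doc_type].append(sent)
    rng = random.Random(seed)
    sample = []
    for doc_type, sents in sorted(by_type.items()):
        # Even proportions: the same number of sentences from every type.
        sample.extend(rng.sample(sents, per_type))
    return sample
```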

Leaderboard - UD Tagset


Rank Model name Model url Pretrained embeddings Dataset Metric Average Tokens Sentences Words UPOS XPOS UFeats AllTags Lemmas UAS LAS CLAS MLAS BLEX
1 combo3-base LINK allegro/herbert-base-cased NKJP AligndAcc 95.10 - - - 96.85 93.76 95.01 92.48 97.40 - - - - -
F1 94.61 99.07 87.80 99.05 95.93 92.86 94.10 91.60 96.47 - - - - -
PDB AligndAcc 94.34 - - - 98.92 96.25 96.68 95.48 97.90 95.23 93.66 92.22 87.32 89.69
F1 94.90 99.42 97.24 99.37 98.29 95.64 96.07 94.87 97.28 94.63 93.07 91.73 86.86 89.22
PDB3 AligndAcc 92.47 - - - 98.00 95.24 96.12 94.30 97.64 93.36 91.05 89.15 83.42 86.41
F1 93.35 99.68 92.04 99.63 97.63 94.89 95.77 93.95 97.28 93.01 90.71 89.15 83.42 86.41
2 trankit LINK xlm-RoBERTa-base NKJP AligndAcc 92.58 - - - 97.32 91.43 91.62 89.40 93.15 - - - - -
F1 92.12 98.24 88.58 97.73 95.11 89.36 89.55 87.37 91.04 - - - - -
PDB AligndAcc 92.51 - - - 99.18 96.28 96.44 95.68 89.08 95.89 94.34 93.10 87.88 77.26
F1 94.03 99.90 98.51 99.89 99.07 96.18 96.34 95.57 88.98 95.79 94.24 93.00 87.79 77.18
PDB3 AligndAcc 90.44 - - - 98.24 95.15 95.55 94.33 88.50 94.03 91.84 89.89 83.38 73.46
F1 91.99 99.49 95.39 99.44 97.69 94.62 95.02 93.80 88.00 93.51 91.32 90.19 83.66 73.71
3 stanza LINK fasttext NKJP AligndAcc 94.82 - - - 97.78 93.42 93.41 92.05 97.45 - - - - -
F1 95.46 99.76 92.89 99.47 97.26 92.93 92.91 91.56 96.93 - - - - -
PDB AligndAcc 90.60 - - - 98.21 93.71 93.76 92.69 96.32 91.62 89.34 87.25 80.22 82.87
F1 92.10 99.86 96.83 99.42 97.64 93.17 93.22 92.15 95.77 91.09 88.83 86.90 79.90 82.53
PDB3 AligndAcc 86.85 - - - 96.65 91.42 91.95 90.18 95.68 87.68 84.30 81.24 72.74 76.67
F1 88.31 99.46 92.27 98.48 95.18 90.03 90.55 88.81 94.22 86.34 83.02 80.90 72.43 76.34
4 spaCy LINK dkleczek (transformer) NKJP AligndAcc 96.83 - - - 98.67 96.30 96.48 95.68 97.02 - - - - -
F1 91.97 99.54 61.06 98.46 97.15 94.82 94.99 94.20 95.52 - - - - -
PDB AligndAcc 87.45 - - - 99.02 95.77 95.95 95.27 95.19 89.41 81.54 77.23 72.67 72.41
F1 87.98 99.65 71.46 98.51 97.54 94.35 94.52 93.85 93.77 88.08 80.33 80.50 75.75 75.48
PDB3 AligndAcc 81.53 - - - 98.09 - 95.09 - 94.66 85.11 75.81 70.90 66.17 66.42
F1 83.05 99.20 68.05 97.14 95.28 - 92.36 - 91.95 82.68 73.63 74.31 69.35 69.61
5 udpipe LINK fasttext NKJP AligndAcc 93.17 - - - 97.59 90.88 91.24 90.20 95.94 - - - - -
F1 94.35 99.77 90.59 99.76 97.35 90.65 91.02 89.98 95.70 - - - - -
PDB AligndAcc 85.14 - - - 97.43 88.71 89.21 88.16 94.44 86.82 83.14 79.52 69.52 74.48
F1 88.16 99.86 95.90 99.84 97.28 88.57 89.07 88.02 94.29 86.68 83.01 79.53 69.53 74.49
PDB3 AligndAcc 82.04 - - - 95.72 86.32 87.20 85.59 93.54 84.38 79.62 75.04 63.27 69.72
F1 85.33 99.46 92.44 99.43 95.17 85.82 86.70 85.10 93.01 83.90 79.17 75.41 63.57 70.06
6 spaCy LINK pl-core-news-lg NKJP AligndAcc 94.24 - - - 97.84 92.75 93.01 91.50 96.10 - - - - -
F1 90.38 99.54 61.06 98.46 96.34 91.32 91.57 90.09 94.62 - - - - -
PDB AligndAcc 82.58 - - - 97.95 91.50 91.79 90.05 93.07 83.39 74.71 72.66 64.21 66.50
F1 84.21 99.65 71.46 98.51 96.49 90.14 90.42 88.71 91.69 82.15 73.60 75.71 66.90 69.29
PDB3 AligndAcc 75.72 - - - 96.48 - 90.19 - 92.33 78.68 68.01 64.93 55.95 59.19
F1 78.83 99.20 68.05 97.14 93.72 - 87.61 - 89.68 76.43 66.06 68.23 58.79 62.20
7 spaCy LINK fasttext NKJP AligndAcc 93.49 - - - 97.44 91.77 92.03 90.42 95.79 - - - - -
F1 89.91 99.54 61.06 98.46 95.93 90.35 90.62 89.03 94.32 - - - - -
PDB AligndAcc 81.07 - - - 97.06 89.91 90.13 88.31 93.10 82.13 73.33 70.76 61.02 64.93
F1 83.03 99.65 71.46 98.51 95.62 88.57 88.79 87.00 91.72 80.91 72.24 73.72 63.57 67.64
PDB3 AligndAcc 74.26 - - - 95.32 - 88.22 - 92.04 77.08 66.42 63.60 53.51 57.86
F1 77.69 99.20 68.05 97.14 92.59 - 85.69 - 89.40 74.87 64.52 66.57 56.01 60.56

Legend:

  • NKJP - test set, prepared by fairly splitting the frozen 1M NKJP snapshot by document type (news, fiction, poetry, etc.) and then randomly selecting sentences in even proportions, so that each split contains an equal number of sentences from each document type. Does not contain dependency parsing annotations. Prediction is done on a plain text file.
  • PDB - test set from the PDB-UD treebank. Contains dependency parsing annotations, so the parsing metrics (UAS, LAS, CLAS, MLAS, BLEX) are also reported. Prediction is done on a plain text file (a prediction-and-scoring sketch follows below).
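The metric set in the tables (Tokens, Sentences, Words, UPOS, XPOS, UFeats, AllTags, Lemmas, UAS, LAS, CLAS, MLAS, BLEX, reported as both F1 and AligndAcc) matches the output of the CoNLL 2018 shared task evaluation script, so the scores were presumably computed that way. A hedged sketch of the prediction-and-scoring loop for one of the listed models (stanza); the file names are placeholders, not the benchmark's actual paths:

```python
# Sketch, not the benchmark's actual harness: tag a plain text file with
# stanza and write the prediction as CoNLL-U for scoring.
import stanza
from stanza.utils.conll import CoNLL

stanza.download("pl")  # fetch the Polish models once
nlp = stanza.Pipeline("pl", processors="tokenize,mwt,pos,lemma,depparse")

with open("pl_test.txt", encoding="utf-8") as f:  # placeholder file name
    doc = nlp(f.read())  # prediction is done on a plain text file

CoNLL.write_doc2conll(doc, "system.conllu")  # placeholder output name

# Scoring with the CoNLL 2018 evaluation script (-v prints the per-metric
# rows, including the AligndAcc column used in the tables above):
#   python conll18_ud_eval.py -v gold.conllu system.conllu
```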