Word vectors
- XPM Config xpmir.text.wordvec_vocab.WordvecVocab(*, data, learn, random)[source]
Bases: TokensEncoder, TorchModule
Submit type: xpmir.text.wordvec_vocab.WordvecVocab
Word-based pre-trained embeddings
- Parameters:
learn – Should the word embeddings be (re-)trained?
- learn: bool = False
- random: xpmir.learning.base.Random
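To make the role of this configuration concrete, here is a minimal, self-contained sketch of a word-based pre-trained embedding lookup. It is illustrative only: the toy vectors and the `encode` helper are assumptions for this example, not xpmir's actual internals (xpmir loads real pre-trained data and wraps the matrix in a Torch module).

```python
import numpy as np

# Toy pre-trained word vectors; in practice these would be loaded
# from a file such as GloVe or word2vec output.
pretrained = {
    "neural": np.array([0.1, 0.2, 0.3]),
    "ranking": np.array([0.4, 0.5, 0.6]),
}

# Build the embedding matrix and a word -> row index map.
vocab = {word: i for i, word in enumerate(pretrained)}
matrix = np.stack([pretrained[w] for w in vocab])

def encode(tokens):
    """Map known tokens to their pre-trained vectors (unknowns are dropped)."""
    return np.stack([matrix[vocab[t]] for t in tokens if t in vocab])

vectors = encode(["neural", "ranking"])
```

With `learn = False` (the default), such a matrix would be kept frozen during training; setting it to `True` corresponds to fine-tuning the embedding weights.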
- XPM Config xpmir.text.wordvec_vocab.WordvecHashVocab(*, data, learn, random, hashspace, init_stddev, log_miss)[source]
Bases: WordvecVocab
Submit type: xpmir.text.wordvec_vocab.WordvecHashVocab
Word-based embeddings with hash-based OOV
A vocabulary in which all unknown terms are assigned a position in a flexible cache based on their hash value. Each position is assigned its own random weight.
- learn: bool = False
- random: xpmir.learning.base.Random
- hashspace: int = 1000
- init_stddev: float = 0.5
- log_miss: bool = False
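The hash-based OOV scheme above can be sketched in a few lines of plain Python. This is an assumption-laden illustration (the helper names and the use of CRC32 as a stable hash are ours, not xpmir's): each unknown term is mapped to one of `hashspace` buckets, and each bucket carries its own fixed random vector drawn with standard deviation `init_stddev`.

```python
import zlib
import numpy as np

HASHSPACE = 1000    # mirrors the hashspace parameter
INIT_STDDEV = 0.5   # mirrors the init_stddev parameter
DIM = 50            # embedding dimension (arbitrary here)

# One fixed random vector per hash bucket.
rng = np.random.default_rng(0)
hash_embeddings = rng.normal(0.0, INIT_STDDEV, size=(HASHSPACE, DIM))

def oov_index(term: str) -> int:
    """Assign an unknown term a stable bucket based on its hash value."""
    # CRC32 is used here because Python's built-in hash() is salted per run.
    return zlib.crc32(term.encode("utf-8")) % HASHSPACE

def oov_vector(term: str) -> np.ndarray:
    """Look up the random weight assigned to the term's bucket."""
    return hash_embeddings[oov_index(term)]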
- XPM Config xpmir.text.wordvec_vocab.WordvecUnkVocab(*, data, learn, random)[source]
Bases: WordvecVocab
Submit type: xpmir.text.wordvec_vocab.WordvecUnkVocab
Word-based embeddings with OOV
A vocabulary in which all unknown terms are given the same token (UNK; 0) with random weights.
- learn: bool = False
- random: xpmir.learning.base.Random
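In contrast to the hash-based variant, this scheme collapses every unknown term onto a single shared token. A minimal sketch (the `token_id` helper and vocabulary contents are illustrative assumptions, not xpmir's API):

```python
import numpy as np

UNK_ID = 0  # every unknown term shares token id 0

# Known vocabulary; id 0 is reserved for the UNK token.
vocab = {"<unk>": UNK_ID, "neural": 1, "ranking": 2}

# Random weights for each token, including the shared UNK row.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(vocab), 50))

def token_id(term: str) -> int:
    """Map a term to its token id, falling back to UNK for unknowns."""
    return vocab.get(term, UNK_ID)
```

All out-of-vocabulary terms therefore share one random embedding row, whereas WordvecHashVocab distinguishes them (up to hash collisions) across its hash space.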