Word vectors
- XPM Config xpmir.text.wordvec_vocab.WordvecVocab(*, data, learn, random)
Bases: TokensEncoder, TorchModule
Submit type: xpmir.text.wordvec_vocab.WordvecVocab
Word-based pre-trained embeddings
- Parameters:
learn – Should the word embeddings be re-trained?
- learn: bool = False
- random: xpmir.learning.base.Random
- XPM Config xpmir.text.wordvec_vocab.WordvecHashVocab(*, data, learn, random, hashspace, init_stddev, log_miss)
Bases: WordvecVocab
Submit type: xpmir.text.wordvec_vocab.WordvecHashVocab
Word-based embeddings with hash-based OOV
A vocabulary in which all unknown terms are assigned a position in a flexible cache based on their hash value. Each position is assigned its own random weight.
- learn: bool = False
- random: xpmir.learning.base.Random
- hashspace: int = 1000
- init_stddev: float = 0.5
- log_miss: bool = False
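The hash-based OOV scheme described above can be illustrated with a minimal, standard-library-only sketch. It is not the xpmir implementation: the class `HashOOVLookup`, the use of `zlib.crc32` as the hash function, and the choice to place the hash buckets right after the known-term rows are all assumptions made for illustration. Only the parameter names `hashspace` and `init_stddev` come from the signature above.

```python
import random
import zlib


class HashOOVLookup:
    """Toy lookup (hypothetical): known terms have their own row,
    unknown terms share `hashspace` extra rows chosen by hash."""

    def __init__(self, known_terms, dim, hashspace=1000, init_stddev=0.5, seed=0):
        rng = random.Random(seed)
        self.term2id = {term: i for i, term in enumerate(known_terms)}
        self.hashspace = hashspace
        # Stand-in for the pre-trained vectors (real ones would be loaded, not sampled)
        self.weights = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in known_terms
        ]
        # One random vector per hash bucket, drawn with standard deviation init_stddev
        self.oov_weights = [
            [rng.gauss(0.0, init_stddev) for _ in range(dim)] for _ in range(hashspace)
        ]

    def bucket(self, term):
        # Deterministic hash (assumed: CRC32) folded into the bucket range
        return zlib.crc32(term.encode("utf-8")) % self.hashspace

    def index(self, term):
        # Known terms keep their own index; unknown terms map past the known rows
        if term in self.term2id:
            return self.term2id[term]
        return len(self.term2id) + self.bucket(term)

    def vector(self, term):
        ix = self.index(term)
        n_known = len(self.term2id)
        return self.weights[ix] if ix < n_known else self.oov_weights[ix - n_known]


hash_lookup = HashOOVLookup(["cat", "dog"], dim=4, hashspace=8)
print(hash_lookup.index("cat"), hash_lookup.index("zebra"))
```

In this sketch the same unknown term always lands in the same bucket, so it always receives the same vector, while two different unknown terms may collide when `hashspace` is small.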
- XPM Config xpmir.text.wordvec_vocab.WordvecUnkVocab(*, data, learn, random)
Bases: WordvecVocab
Submit type: xpmir.text.wordvec_vocab.WordvecUnkVocab
Word-based embeddings with OOV
A vocabulary in which all unknown terms are given the same token (UNK; 0) with random weights.
- learn: bool = False
- random: xpmir.learning.base.Random
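For comparison, the single-UNK behaviour can be sketched the same way. Again this is an illustrative assumption rather than the library's code: `UnkLookup` is a hypothetical class, and shifting known terms to start at index 1 is simply one way to keep index 0 free for UNK, as the description "(UNK; 0)" suggests.

```python
import random

UNK_ID = 0  # index shared by every unknown term


class UnkLookup:
    """Toy lookup (hypothetical): all unknown terms share row 0."""

    def __init__(self, known_terms, dim, seed=0):
        rng = random.Random(seed)
        # Known terms start at 1 because 0 is reserved for UNK (assumed layout)
        self.term2id = {term: i + 1 for i, term in enumerate(known_terms)}
        # Randomly initialised UNK vector
        unk_vector = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Stand-in for the pre-trained vectors (real ones would be loaded, not sampled)
        known_vectors = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in known_terms
        ]
        self.weights = [unk_vector] + known_vectors

    def index(self, term):
        # Any term missing from the vocabulary falls back to UNK_ID
        return self.term2id.get(term, UNK_ID)

    def vector(self, term):
        return self.weights[self.index(term)]


unk_lookup = UnkLookup(["cat", "dog"], dim=4)
print(unk_lookup.index("cat"), unk_lookup.index("zebra"))
```

Unlike the hash-based variant, every unknown term here is indistinguishable from every other one, which costs no extra rows beyond the single UNK vector.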