SimpleVocab
-
class finalfusion.vocab.simple_vocab.SimpleVocab(*args, **kwds)
Bases: finalfusion.vocab.vocab.Vocab
Simple vocabulary.
A SimpleVocab provides a simple string-to-index mapping and an index-to-string mapping.
-
__init__(words: List[str])
Initialize a SimpleVocab.
Initializes the vocabulary with the given words. The nth word in the words list is assigned index n. The word list cannot contain duplicate entries.
- Parameters
words (List[str]) – List of unique words
- Raises
AssertionError – if words contains duplicate entries.
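The indexing rule described for `__init__` can be sketched in plain Python. This is an illustrative stand-in, not finalfusion's implementation; the helper name `build_word_index` and the sample words are hypothetical.

```python
from typing import Dict, List


def build_word_index(words: List[str]) -> Dict[str, int]:
    """Assign index n to the nth word and reject duplicates (hypothetical helper)."""
    index = {word: n for n, word in enumerate(words)}
    # A duplicate word collapses two list entries into a single dict key,
    # so the dict ends up shorter than the list.
    assert len(index) == len(words), "words contains duplicate entries"
    return index


word_index = build_word_index(["the", "quick", "fox"])
print(word_index["quick"])  # 1
```

Passing a list such as `["the", "the"]` to this sketch trips the assertion, mirroring the documented `AssertionError`.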
-
property words
Get the list of known words.
- Returns
words – list of known words
- Return type
List[str]
-
property word_index
Get the index of known words.
-
property upper_bound
The exclusive upper bound of indices in this vocabulary.
- Returns
upper_bound – Exclusive upper bound of indices covered by the vocabulary.
- Return type
int
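As a rough illustration, assuming indices are assigned consecutively from 0 (the nth word gets index n, as described for `__init__`), the exclusive upper bound equals the number of words. The snippet uses plain Python values rather than the library itself, and the sample words are made up.

```python
words = ["the", "quick", "fox"]
indices = list(range(len(words)))  # indices 0, 1, 2

# Exclusive bound: one past the largest index (assumed here to equal len(words)).
upper_bound = max(indices) + 1

# Every valid index lies in the half-open range [0, upper_bound).
assert all(0 <= i < upper_bound for i in indices)
print(upper_bound)  # 3
```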
-
idx(item: str, default: Optional[Union[list, int]] = None) → Optional[Union[list, int]]
Look up the given query item.
This lookup does not raise an exception if the vocab can’t produce indices.
- Parameters
item (str) – The query item.
default (Optional[Union[int, List[int]]]) – Fall-back value to return if the vocab can’t provide indices.
- Returns
index –
An integer if there is a single index for a known item.
A list if the vocab can provide subword indices for an unknown item.
The provided default item if the vocab can’t provide indices.
- Return type
Optional[Union[list, int]]
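The non-raising lookup contract can be sketched as follows. This is a stand-in built on a plain dict, not the library's code; the function `lookup` and the sample data are hypothetical, and the subword case is left out of this sketch.

```python
from typing import Dict, List, Optional, Union

Index = Union[int, List[int]]


def lookup(word_index: Dict[str, int], item: str,
           default: Optional[Index] = None) -> Optional[Index]:
    """Return the index of item, or default instead of raising KeyError."""
    return word_index.get(item, default)


word_index = {"the": 0, "quick": 1, "fox": 2}
print(lookup(word_index, "fox"))      # 2
print(lookup(word_index, "dog"))      # None
print(lookup(word_index, "dog", -1))  # -1
```

Returning a caller-supplied default keeps the missing-item case explicit at the call site without a `try`/`except` block.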
-
static read_chunk(file: BinaryIO) → finalfusion.vocab.simple_vocab.SimpleVocab
Read the Chunk and return it.
The file must be positioned before the contents of the Chunk but after its header.
- Parameters
file (BinaryIO) – a finalfusion file containing the given Chunk
- Returns
chunk – The chunk read from the file.
- Return type
finalfusion.vocab.simple_vocab.SimpleVocab
-
write_chunk(file: BinaryIO)
Write the Chunk to a file.
- Parameters
file (BinaryIO) – Output file for the Chunk
-
static chunk_identifier() → finalfusion.io.ChunkIdentifier
Get the ChunkIdentifier for this Chunk.
- Returns
chunk_identifier
- Return type
finalfusion.io.ChunkIdentifier
-
-
finalfusion.vocab.simple_vocab.load_simple_vocab(file: Union[str, bytes, int, os.PathLike]) → finalfusion.vocab.simple_vocab.SimpleVocab
Load a SimpleVocab from the given finalfusion file.
- Parameters
file (str) – Path to file containing a SimpleVocab chunk.
- Returns
vocab – Returns the first SimpleVocab in the file.
- Return type
finalfusion.vocab.simple_vocab.SimpleVocab