banhxeo.model.old.n_gram module

class banhxeo.model.old.n_gram.NgramTrieNode(id: int, children: Dict[int, NgramTrieNode] = <factory>, count: int = 0, is_end_of_ngram: bool = False)

Bases: object

id: int
children: Dict[int, NgramTrieNode]
count: int = 0
is_end_of_ngram: bool = False
__init__(id: int, children: Dict[int, NgramTrieNode] = <factory>, count: int = 0, is_end_of_ngram: bool = False) → None
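
The fields above suggest a trie keyed by token id, where each node stores how often the path leading to it was seen. The following is a minimal sketch of that idea, not the library's implementation: `TrieNode`, `insert_ngram`, `lookup_count`, and the sentinel root id `-1` are all hypothetical names and choices made for illustration, and the rule that every node on the path has its count incremented is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Sequence


@dataclass
class TrieNode:
    """Hypothetical stand-in mirroring the documented fields of NgramTrieNode."""
    id: int
    children: Dict[int, "TrieNode"] = field(default_factory=dict)
    count: int = 0
    is_end_of_ngram: bool = False


def insert_ngram(root: TrieNode, token_ids: Sequence[int]) -> None:
    """Walk the trie along token_ids, creating nodes and incrementing counts."""
    node = root
    for token_id in token_ids:
        node = node.children.setdefault(token_id, TrieNode(id=token_id))
        node.count += 1  # assumption: every prefix on the path is counted
    node.is_end_of_ngram = True


def lookup_count(root: TrieNode, token_ids: Sequence[int]) -> int:
    """Return the stored count for the path token_ids, or 0 if it is absent."""
    node = root
    for token_id in token_ids:
        child = node.children.get(token_id)
        if child is None:
            return 0
        node = child
    return node.count


root = TrieNode(id=-1)  # sentinel root; the id -1 is an arbitrary choice
tokens = [1, 2, 3, 1, 2, 4]
n = 2
for i in range(len(tokens) - n + 1):
    insert_ngram(root, tokens[i:i + n])

print(lookup_count(root, [1, 2]))  # 2
print(lookup_count(root, [4, 1]))  # 0
```

Because counts are kept on every node along the path, the count of a context (here a single token id) is available from the same structure, which is what a conditional probability estimate needs.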
class banhxeo.model.old.n_gram.NGramConfig(*, vocab_size: int | None = None, n: int, smoothing: bool | str = False, k: float | None = None)

Bases: ModelConfig

n: int
smoothing: bool | str
k: float | None
check_smoothing() → Self
check_n() → Self
model_config: ClassVar[ConfigDict] = {}

Configuration for the model. It should be a dictionary conforming to pydantic.config.ConfigDict.

class banhxeo.model.old.n_gram.NGram(vocab: Vocabulary, n: int = 2, smoothing: bool | str = False, k: int | None = None)

Bases: BaseLanguageModel

ConfigClass

alias of NGramConfig

__init__(vocab: Vocabulary, n: int = 2, smoothing: bool | str = False, k: int | None = None)

Initializes the BaseLanguageModel.

Parameters:
  • model_config – The configuration object for the model. It should be an instance of ModelConfig or its subclass.

  • vocab – The Vocabulary instance to be used by the model.

fit(corpus: list[str])
generate_sequence(prompt: str, sampling: str = 'greedy', max_length: int | None = 20, **kwargs) → str
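
To illustrate what a `fit` followed by a greedy `generate_sequence` can look like for an n-gram model, here is a self-contained sketch. It does not call banhxeo and should not be read as how `NGram` is implemented: `ToyNGram` is a hypothetical class, whitespace tokenization replaces the `Vocabulary` object, no smoothing is applied, and `max_length` is assumed to cap the number of generated tokens (the library may define it differently).

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


class ToyNGram:
    """Hypothetical stand-in for illustration; not banhxeo's NGram."""

    def __init__(self, n: int = 2) -> None:
        if n < 2:
            raise ValueError("this sketch needs n >= 2")
        self.n = n
        # Maps a context of n-1 tokens to a counter of next tokens.
        self.counts: Dict[Tuple[str, ...], Counter] = defaultdict(Counter)

    def fit(self, corpus: List[str]) -> None:
        """Count every n-gram in the corpus (assumed whitespace tokenization)."""
        for sentence in corpus:
            tokens = sentence.split()
            for i in range(len(tokens) - self.n + 1):
                context = tuple(tokens[i:i + self.n - 1])
                next_token = tokens[i + self.n - 1]
                self.counts[context][next_token] += 1

    def generate_sequence(self, prompt: str, max_length: int = 20) -> str:
        """Greedy decoding: always append the most frequent next token."""
        tokens = prompt.split()
        for _ in range(max_length):
            context = tuple(tokens[-(self.n - 1):])
            followers = self.counts.get(context)
            if not followers:
                break  # unseen context: stop instead of guessing
            tokens.append(followers.most_common(1)[0][0])
        return " ".join(tokens)


model = ToyNGram(n=2)
model.fit(["the cat sat on the mat", "the cat sat"])
result = model.generate_sequence("the", max_length=3)
print(result)  # the cat sat on
```

Greedy decoding is deterministic, so the same prompt always yields the same continuation. A sampling strategy would instead draw the next token in proportion to its count.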