banhxeo.data.config module

class banhxeo.data.config.DatasetSplit(train: int, test: int, val: int | None = None)

Bases: object

train: int
test: int
val: int | None = None
__init__(train: int, test: int, val: int | None = None) -> None
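
The signature above reads like a plain dataclass-style container. As a minimal sketch under that assumption, the stand-in below mirrors the documented fields to show that `val` can be omitted. It is not the banhxeo class itself, and the numbers are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


# Stand-in mirroring the documented signature; not the banhxeo class itself.
@dataclass
class DatasetSplit:
    train: int
    test: int
    val: Optional[int] = None


# Only train and test are required; val falls back to None.
split = DatasetSplit(train=20000, test=5000)
print(split.val)  # None

# Supplying all three fields explicitly.
with_val = DatasetSplit(train=20000, test=2500, val=2500)
print(with_val.val)  # 2500
```

With the real library, the equivalent import would presumably be `from banhxeo.data.config import DatasetSplit`.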
class banhxeo.data.config.DownloadDatasetFile(name: str, ext: str, source: str | None = None)

Bases: object

name: str
ext: str
source: str | None = None
__init__(name: str, ext: str, source: str | None = None) -> None
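
The reference does not say how `name` and `ext` are combined or whether `ext` carries a leading dot. The sketch below uses a stand-in dataclass with the documented fields and a hypothetical helper, `local_filename`, which assumes `ext` has no leading dot. Neither the helper nor the example values come from banhxeo.

```python
from dataclasses import dataclass
from typing import Optional


# Stand-in mirroring the documented fields; not the banhxeo class itself.
@dataclass
class DownloadDatasetFile:
    name: str
    ext: str
    source: Optional[str] = None


def local_filename(info: DownloadDatasetFile) -> str:
    """Hypothetical helper: join name and ext, assuming ext has no leading dot."""
    return f"{info.name}.{info.ext}"


info = DownloadDatasetFile(name="reviews", ext="tar.gz")
print(local_filename(info))  # reviews.tar.gz
print(info.source)           # None
```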
class banhxeo.data.config.DatasetConfig(*, name: str, url: str | None = None, file_info: DownloadDatasetFile | None = None, md5: str | None = None, hf_path: str | None = None, hf_name: str | None = None, text_column: str = 'text', label_column: str | None = 'label', label_map: Dict[str, int] = {'neg': 0, 'pos': 1}, split: DatasetSplit | None = None)

Bases: BaseModel

name: str
url: str | None
file_info: DownloadDatasetFile | None
md5: str | None
hf_path: str | None
hf_name: str | None
text_column: str
label_column: str | None
label_map: Dict[str, int]
split: DatasetSplit | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict (pydantic.config.ConfigDict).
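
In the `DatasetConfig` signature, the leading `*` makes every argument keyword-only, `name` is the only required field, and `label_map` defaults to a binary `neg`/`pos` mapping. The sketch below imitates a subset of those fields with a stdlib dataclass (the real class is a pydantic `BaseModel`). The helper `encode_labels` is hypothetical and only illustrates one plausible use of `label_map`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


# Stand-in with a subset of the documented fields; the real class is a
# pydantic BaseModel, which this dataclass only imitates.
@dataclass
class DatasetConfigSketch:
    name: str
    text_column: str = "text"
    label_column: Optional[str] = "label"
    # default_factory gives each instance its own copy of the default mapping.
    label_map: Dict[str, int] = field(default_factory=lambda: {"neg": 0, "pos": 1})


def encode_labels(cfg: DatasetConfigSketch, raw: List[str]) -> List[int]:
    """Hypothetical helper: look up each raw label in cfg.label_map."""
    return [cfg.label_map[label] for label in raw]


# Default binary mapping.
binary = DatasetConfigSketch(name="sentiment_demo")
print(encode_labels(binary, ["pos", "neg", "pos"]))  # [1, 0, 1]

# Overriding label_map for a dataset with three classes.
three_way = DatasetConfigSketch(
    name="three_way_demo",
    label_map={"negative": 0, "neutral": 1, "positive": 2},
)
print(encode_labels(three_way, ["neutral", "positive"]))  # [1, 2]
```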

class banhxeo.data.config.TorchDatasetConfig(*, tokenizer: Tokenizer, tokenizer_config: TokenizerConfig, vocab: Vocabulary, is_classification: bool = False, transforms: List[Transforms] | ComposeTransforms = [], text_column_name: str = 'text', label_column_name: str | None = 'label', label_map: Dict[str, int] | None)

Bases: BaseModel

tokenizer: Tokenizer
tokenizer_config: TokenizerConfig
vocab: Vocabulary
is_classification: bool
transforms: List[Transforms] | ComposeTransforms
text_column_name: str
label_column_name: str | None
label_map: Dict[str, int] | None
classmethod ensure_compose_transforms(v)
class Config

Bases: object

arbitrary_types_allowed = True
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict (pydantic.config.ConfigDict).
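
The `transforms` field of `TorchDatasetConfig` accepts either a list of transforms or a `ComposeTransforms` object. Judging only by its name, `ensure_compose_transforms` probably normalizes the list form into a composed object, but its source is not shown here. The sketch below illustrates that kind of normalization with hypothetical stand-ins (`ComposeTransformsSketch`, `ensure_compose`) that treat a transform as a plain `str -> str` callable. It does not reproduce banhxeo's implementation.

```python
from typing import Callable, List, Union

# Assumption for this sketch: a transform is any callable from str to str.
Transform = Callable[[str], str]


class ComposeTransformsSketch:
    """Hypothetical stand-in: applies a list of transforms in order."""

    def __init__(self, transforms: List[Transform]) -> None:
        self.transforms = list(transforms)

    def __call__(self, text: str) -> str:
        for transform in self.transforms:
            text = transform(text)
        return text


def ensure_compose(
    value: Union[List[Transform], ComposeTransformsSketch],
) -> ComposeTransformsSketch:
    """Wrap a plain list; pass an existing composed object through unchanged."""
    if isinstance(value, ComposeTransformsSketch):
        return value
    return ComposeTransformsSketch(value)


# A list is wrapped so downstream code can always call a single object.
pipeline = ensure_compose([str.strip, str.lower])
print(pipeline("  Hello World  "))  # hello world
```

Normalizing at validation time means the rest of the code can call one object regardless of which form the caller passed in. An empty list, which is the documented default, becomes a no-op pipeline.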