本文件根据../textclf/config/trainer.py自动生成
MLTrainerConfig有以下属性:
Attribute name | Type | Default | Description |
---|---|---|---|
vectorizer | VectorizerConfig | CountVectorizerConfig() | |
model | MLModelConfig | LogisticRegressionConfig() | |
raw_data_path | str | "textclf.joblib" | |
save_dir | str | "ckpts/" | the dir to save model and vectorizer |
Traning config for deep learning model
DLTrainerConfig有以下属性:
Attribute name | Type | Default | Description |
---|---|---|---|
use_cuda | bool | True | 是否使用GPU |
epochs | int | 10 | : Training epochs |
score_method | str | "accuracy" | score method 指定保存最优模型的方式如果score_method为accuracy,那么保存验证集上准确率最高的模型如果score_method为loss,那么保存损失最小的模型 |
ckpts_dir | str | "ckpts" | 指定checkpoints保存的目录 |
save_ckpt_every_epoch | bool | True | 是否每个epoch都保存ckpt |
random_state | Optional[int] | 2020 | 随机数种子,保证每次结果相同 |
state_dict_file | Optional[str] | None | random_state: Optional[int] = None从state_dict_file指定的断点开始训练state_dict_file: Optional[str] = "./ckpts/1.pt" |
early_stop_after | Optional[int] | None | : Stop after how many epochs when the eval metric is not improving |
max_clip_norm | Optional[float] | None | : Clip gradient norm if set |
do_eval | bool | True | : Whether to do evaluation and model selection based on it. |
load_best_model_after_train | bool | True | : if do_eval, do we load the best model state dict after training or justuse the latest model state |
num_batch_to_print | int | 10 | : Number of samples for print training info. |
optimizer | OptimizerConfig | AdamConfig() | : config for optimizer, used in parameter update |
scheduler | Optional[SchedulerConfig] | NoneSchedulerConfig() | |
model | DLModelConfig | DLModelConfig() | config Classifer |
data_loader | DataLoaderConfig | DataLoaderConfig() | config data loader |
criterion | CriterionConfig | CrossEntropyLossConfig() | config criterion |