Configurations
Although the default configurations can already achieve rather good performance, it is often easy to gain more by specifying configurations based on our prior knowledge.
We've already covered the basic ideas of configuring carefree-learn in the Introduction, so this page focuses on how to actually configure `Pipeline`s.
Specify Configurations
There are three ways to specify configurations in carefree-learn:
- Construct a `Pipeline` from scratch.
- Leverage `DLZoo` to construct a `Pipeline` with a JSON file.
- Utilize `cflearn.api` (recommended!).
Let's say we want to construct a `Pipeline` to train a `resnet18` model on the MNIST dataset. Here are three different ways to achieve this:
- From Scratch
- DLZoo
- cflearn.api
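Below is a rough sketch of the three approaches (construction only; fitting on the MNIST data is omitted). The model name `"clf"`, the task / model key `"clf/resnet18"`, and the convenience function `cflearn.api.resnet18` are illustrative assumptions and may differ between carefree-learn versions:

```python
import cflearn

# 1) From Scratch: construct the Pipeline manually
m = cflearn.cv.CarefreePipeline(
    model_name="clf",                  # assumed name of a generic classifier model
    model_config={"num_classes": 10},  # exact keys depend on the model
)

# 2) DLZoo: load a predefined configuration from its JSON file
m = cflearn.DLZoo.load_pipeline("clf/resnet18", num_classes=10)

# 3) cflearn.api (recommended): a thin, auto-completable wrapper of DLZoo
m = cflearn.api.resnet18(num_classes=10)
```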
We'll describe some details in the following sections.
DLPipeline
Since carefree-learn exposes almost every parameter to users, we can control every part of the `Pipeline` through the args and kwargs of `__init__`.
Although Machine Learning, Computer Vision and Natural Language Processing are very different, they share many things in common when solved with Deep Learning. Therefore carefree-learn implements `DLPipeline` to handle this shared logic.
note
`DLPipeline` serves as the base class of all `Pipeline`s; each specific domain needs to inherit from `DLPipeline` and implement its own `Pipeline` class.
- `loss_name`
  - Loss that we'll use for training.
  - Currently carefree-learn supports:
    - Common losses: `mae`, `mse`, `quantile`, `cross_entropy`, `focal`, ...
    - Task specific losses: `vae`, `vqvae`, ...
- `loss_config` [default = `{}`]
  - Configurations of the corresponding loss.
- `state_config` [default = `{}`]
  - Configurations of the `TrainerState`.
- `num_epoch` [default = `40`]
  - Specify the number of epochs.
  - Notice that in most cases this will not be the final epoch number.
- `max_epoch` [default = `1000`]
  - Specify the maximum number of epochs.
- `fixed_epoch` [default = `None`]
  - Specify the (fixed) number of epochs.
  - If specified, both `num_epoch` and `max_epoch` will be set to it.
- `fixed_steps` [default = `None`]
  - Specify the (fixed) number of steps.
- `log_steps` [default = `None`]
  - Specify the (fixed) number of steps for logging.
- `valid_portion` [default = `1.0`]
  - Specify how much data from the validation set we want to use for monitoring.
- `amp` [default = `False`]
  - Specify whether to use the `amp` (automatic mixed precision) technique or not.
- `clip_norm` [default = `0.0`]
  - Given a gradient `g` and the `clip_norm` value, we will normalize `g` so that its L2-norm is less than or equal to `clip_norm`.
  - If `0.0`, no gradient clipping will be performed.
- `cudnn_benchmark` [default = `False`]
  - Specify whether to use the `cudnn.benchmark` technique or not.
- `metric_names` [default = `None`]
  - Specify which metrics we want to use for monitoring.
  - If `None`, no metrics will be used, and losses will be treated as metrics.
- `metric_configs` [default = `{}`]
  - Configurations of the corresponding metrics.
- `use_losses_as_metrics` [default = `None`]
  - Specify whether to use losses as metrics or not.
  - It will always be `True` if `metric_names` is `None`.
- `loss_metrics_weights` [default = `None`]
  - Specify the weight of each loss when they are used as metrics.
- `recompute_train_losses_in_eval` [default = `True`]
  - Specify whether we should recompute losses on the training set at monitor steps when a validation set is not provided.
- `monitor_names` [default = `None`]
  - Specify which monitors we want to use for monitoring.
  - If `None`, `BasicMonitor` will be used.
- `monitor_configs` [default = `{}`]
  - Configurations of the corresponding monitors.
- `callback_names` [default = `None`]
  - Specify which callbacks we want to use during training.
  - If `None`, `_LogMetricsMsgCallback` will be used.
- `callback_configs` [default = `{}`]
  - Configurations of the corresponding callbacks.
- `lr` [default = `None`]
  - Default learning rate.
  - If not specified, carefree-learn will try to infer the best default value.
- `optimizer_name` [default = `None`]
  - Specify which optimizer will be used.
  - If not specified, carefree-learn will try to infer the best default value.
- `scheduler_name` [default = `None`]
  - Specify which learning rate scheduler will be used.
  - If not specified, carefree-learn will try to infer the best default value.
- `optimizer_config` [default = `{}`]
  - Specify the optimizer's configuration.
- `scheduler_config` [default = `{}`]
  - Specify the scheduler's configuration.
- `optimizer_settings` [default = `None`]
  - Specify the fine-grained configurations of optimizers and schedulers.
  - We should not specify `optimizer_name`, ... if we want to specify `optimizer_settings`.
  - See `OptimizerPack` for more details.
- `workplace` [default = `"_logs"`]
  - Specify the workplace of the whole training process.
  - In general, carefree-learn will create a folder (with a timestamp as its name) in the workplace, and will dump everything generated during the training process into it.
- `finetune_config` [default = `None`]
  - Specify the finetune configurations.
  - If `None`, the finetune mechanism supported by carefree-learn will not be utilized.
  - See finetune_config for more details.
- `tqdm_settings` [default = `None`]
  - Specify the `tqdm` configurations.
  - See `TqdmSettings` for more details.
- `in_loading` [default = `False`]
  - In most cases this is an internal property handled by carefree-learn itself.
dl.SimplePipeline
This `Pipeline` aims to solve general deep learning tasks.
- `model_name`
  - Model that we'll use for training.
- `model_config` [default = `{}`]
  - Configurations of the corresponding model.
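For instance, a minimal sketch (`"some_model"` is a placeholder for any registered model name; the remaining kwargs are the shared `DLPipeline` arguments described above):

```python
import cflearn

m = cflearn.dl.SimplePipeline(
    model_name="some_model",  # placeholder: any registered model name
    model_config={},
    loss_name="mse",          # one of the common losses listed above
    metric_names=["mae"],     # monitor MAE instead of the raw loss
    num_epoch=40,
    clip_norm=1.0,            # clip gradients to an L2-norm of at most 1.0
)
```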
dl.CarefreePipeline
This `Pipeline` provides some useful default settings on top of `dl.SimplePipeline`.
cv.SimplePipeline
This `Pipeline` is exactly the same as `dl.SimplePipeline`, just an alias.
cv.CarefreePipeline
This `Pipeline` is exactly the same as `dl.CarefreePipeline`, just an alias.
ml.SimplePipeline
This `Pipeline` aims to solve tabular tasks. It will always use `MLModel` as its model, and we can only specify the core of the `MLModel`.
info
The reason why carefree-learn does so is that tabular tasks involve many common practices which shall be applied every time, such as:
- Encode the categorical columns (to `one_hot` / `embedding` format, required).
- Pre-process the numerical columns (with the `min_max` / `normalize` / ... methods, optional).
- Decide the binary threshold in binary classification tasks.
- ......
In order to prevent users from re-implementing these steps over and over again, carefree-learn provides `MLModel`, which wraps everything up. In this way, we can focus on the core algorithms without worrying about the rest.
- `core_name` [default = `"fcnn"`]
  - Core of the `MLModel` that we'll use for training.
- `core_config` [default = `{}`]
  - Configurations of the corresponding core.
- `input_dim` [default = `None`]
  - Input dimension of the task.
  - If not provided, then `cf_data` should be provided in the `MLData` which we want to train on.
- `output_dim` [default = `None`]
  - Output dimension of the task.
  - If not provided, then `cf_data` should be provided in the `MLData` which we want to train on.
- `loss_name` [default = `"auto"`]
  - Loss that we'll use for training.
  - By default (`"auto"`), carefree-learn will use:
    - `"mae"` for regression tasks.
    - `"focal"` for classification tasks.
- `loss_config` [default = `{}`]
  - Configurations of the corresponding loss.
- `only_categorical` [default = `False`]
  - Specify whether all columns in the task are categorical columns or not.
- `encoder_config` [default = `{}`]
  - Configurations of the `Encoder`.
- `encoding_methods` [default = `None`]
  - Encoding methods we will use to encode the categorical columns.
- `encoding_configs` [default = `{}`]
  - Configurations of the corresponding methods.
- `default_encoding_methods` [default = `["embedding"]`]
  - Default encoding methods we will use to encode the categorical columns.
- `default_encoding_configs` [default = `{}`]
  - Default configurations of the corresponding methods.
- `pre_process_batch` [default = `False`]
  - Specify whether we should pre-process the input batch or not.
- `num_repeat` [default = `None`]
  - In most cases this is an internal property handled by carefree-learn itself.
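A minimal sketch of constructing it directly (data preparation is omitted; alternatively, `input_dim` / `output_dim` can be inferred from `cf_data`):

```python
import cflearn

m = cflearn.ml.SimplePipeline(
    core_name="fcnn",  # the default core
    core_config={},
    input_dim=10,      # or provide `cf_data` in `MLData` instead
    output_dim=1,
    loss_name="auto",  # resolves to "mae" (regression) / "focal" (classification)
)
```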
ml.CarefreePipeline
This `Pipeline` provides some useful default settings on top of `ml.SimplePipeline`.
Configure DLZoo
Since it would be tedious to re-define similar configurations over and over, carefree-learn provides `DLZoo` to improve the user experience. Internally, `DLZoo` reads configurations from `cflearn/api/zoo/configs`, which contains a bunch of JSON files.
The general usage of DLZoo should be as follows:
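In code, this boils down to something like the following sketch (the `"clf/resnet18"` task / model key is an assumption used throughout this page's snippets):

```python
import cflearn

# general format: "task/model.type"; ".type" may be omitted,
# in which case "default" is used as the model type.
# Keyword arguments are forwarded to the underlying Pipeline.
m = cflearn.DLZoo.load_pipeline("clf/resnet18", num_classes=10)
```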
- `task`
  - Specify the task we want to work with.
  - See Supported Models for more details.
- `model`
  - Specify the model we want to use.
  - See Supported Models for more details.
- `type`
  - Specify the model type we want to use.
  - If not provided, `default` will be used as the model type.
- `kwargs`
  - Specify the keyword arguments of the `Pipeline`, as described above.
  - See the Example section for more details.
__requires__
Although carefree-learn wants to make everything as easy as possible, there are still some properties that carefree-learn cannot decide for you (e.g. `img_size`, `num_classes`, etc.). These properties are listed in the `__requires__` field of each JSON file.
For example, `resnet18` needs you to provide the `num_classes` property, so the corresponding JSON file will contain something like the following:
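A sketch of its shape (assuming `__requires__` maps each config scope to its required keys, consistent with the `model_config` note below; other fields are omitted):

```json
{
  "__requires__": {
    "model_config": ["num_classes"]
  }
}
```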
This means we need to specify `num_classes` if we want to use `resnet18`:
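A minimal sketch (same assumed task / model key as above):

```python
import cflearn

m = cflearn.DLZoo.load_pipeline("clf/resnet18", num_classes=10)
```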
info
In fact, the 'original' configuration should be:
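Roughly (same assumptions as above):

```python
import cflearn

m = cflearn.DLZoo.load_pipeline(
    "clf/resnet18",
    model_config={"num_classes": 10},
)
```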
Because `num_classes` should be defined under the `model_config` scope.
Since this is quite troublesome, we decided to allow users to specify these 'requirements' directly by their names, which makes `DLZoo` more readable and easier to use!
Example
The following two code snippets have the same effects:
- From Scratch
- DLZoo
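A sketch of the equivalence (the model name and task / model key remain assumptions):

```python
import cflearn

# From Scratch
m = cflearn.cv.CarefreePipeline(
    model_name="clf",                  # assumed classifier model name
    model_config={"num_classes": 10},
    metric_names="acc",
)

# DLZoo: the same Pipeline, with kwargs forwarded through load_pipeline
m = cflearn.DLZoo.load_pipeline(
    "clf/resnet18",
    num_classes=10,
    metric_names="acc",
)
```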
Configure cflearn.api
Since `DLZoo` mainly depends on JSON files, which cannot provide useful auto-completion, carefree-learn further provides `cflearn.api`, a thin wrapper of `DLZoo`, as the recommended user interface.
Configuring `cflearn.api` is exactly the same as configuring `DLZoo`, except that it can utilize auto-completion, which significantly improves the user experience.
Configuration Details
make_multiple mechanism
This mechanism is based on the Register Mechanism.
The `make_multiple` mechanism is useful when we need either one single instance or multiple instances (e.g. using one metric / multiple metrics to monitor the training process):
- When we need one single instance, only a single name (`str`) and the corresponding config are required.
- When we need multiple instances, their names (`List[str]`) are required, and the configs should be a dictionary, where:
  - The keys should be the names.
  - The values should be the corresponding configs.
The following simplified sketch illustrates how it works:
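This is a minimal illustration rather than the actual carefree-learn implementation; `WithRegister` and its `d` registry dict are assumptions modeled on the register mechanism:

```python
from typing import Any, Dict, List, Optional, Type, Union

class WithRegister:
    # name -> registered class (filled in by the register mechanism)
    d: Dict[str, Type["WithRegister"]] = {}

    @classmethod
    def make(cls, name: str, config: Dict[str, Any]) -> "WithRegister":
        return cls.d[name](**config)

    @classmethod
    def make_multiple(
        cls,
        names: Union[str, List[str]],
        configs: Optional[Dict[str, Any]] = None,
    ) -> Union["WithRegister", List["WithRegister"]]:
        configs = configs or {}
        if isinstance(names, str):
            # single instance -> `configs` is that instance's config
            return cls.make(names, configs)
        # multiple instances -> `configs` maps each name to its own config
        return [cls.make(name, configs.get(name, {})) for name in names]
```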
TrainerState
Source code: protocol.py
- `loader`
  - This will be handled by carefree-learn internally.
- `num_epoch`
  - Specify the number of epochs.
  - Notice that in most cases this will not be the final epoch number.
- `max_epoch`
  - Specify the maximum number of epochs.
- `fixed_steps` [default = `None`]
  - Specify the (fixed) number of steps.
- `extension` [default = `None`]
  - Specify the number of extended epochs per extension.
  - So basically, the number of epochs will never be extended beyond `max_epoch`.
- `enable_logging` [default = `True`]
  - Whether to enable logging or not.
- `min_num_sample` [default = `3000`]
  - We'll not start monitoring until the model has already seen `min_num_sample` samples.
  - This prevents monitors from stopping too early, while the model is still trying to optimize away from its initial state.
- `snapshot_start_step` [default = `None`]
  - Specify the number of steps after which we start to take snapshots.
  - If not specified, carefree-learn will try to infer the best default value.
- `max_snapshot_file` [default = `5`]
  - Specify the maximum number of checkpoint files we can save during training.
- `num_snapshot_per_epoch` [default = `2`]
  - Indicates how many snapshots we would like to take per epoch.
  - The final behaviour will be affected by `max_step_per_snapshot`.
- `num_step_per_log` [default = `350`]
  - Indicates the number of steps of each logging period.
- `num_step_per_snapshot` [default = `None`]
  - Specify the number of steps of each snapshot period.
  - If not specified, carefree-learn will try to infer the best default value.
- `max_step_per_snapshot` [default = `1000`]
  - Specify the maximum number of steps of each snapshot period.
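These options are passed through the `state_config` kwarg described earlier. A sketch (with a placeholder model name):

```python
import cflearn

m = cflearn.dl.SimplePipeline(
    model_name="some_model",  # placeholder: any registered model name
    state_config={
        "num_step_per_log": 100,      # log more frequently
        "num_snapshot_per_epoch": 4,  # take snapshots more often
    },
)
```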
BasicMonitor
Source code: monitors.py.
This is the default monitor of carefree-learn. It's fairly simple, but quite useful in practice:
- It will take a snapshot whenever a new SOTA score is achieved.
- It will terminate the training after `patience` steps if the new score is even worse than the worst score so far.
- It will not punish extensions.
info
So in most cases, `BasicMonitor` will not early-stop until `max_epoch` is reached.
_LogMetricsMsgCallback
Source code: general.py.
This is the default callback of carefree-learn. It reports the validation metrics to the console periodically, along with the current steps / epochs and the execution time since the last report. It also writes this information to disk.
info
When writing to disk, `_LogMetricsMsgCallback` will also record the `lr` (learning rate) of the corresponding steps.
OptimizerPack
- `scope`
  - Specify the parameter 'scope' of this pack.
  - If `scope="all"`, all trainable parameters will be considered.
  - Otherwise, it represents an attribute of the model, and:
    - If this attribute is an `nn.Module`, its parameters will be considered.
    - Otherwise, this attribute should be a list of parameters, which will be considered.
- `optimizer_name`
  - Specify which optimizer will be used.
- `scheduler_name` [default = `None`]
  - Specify which learning rate scheduler will be used.
  - If not specified, no scheduler will be used.
- `optimizer_config` [default = `{}`]
  - Specify the optimizer's configuration.
- `scheduler_config` [default = `{}`]
  - Specify the scheduler's configuration.
Since directly constructing `OptimizerPack`s would be troublesome, carefree-learn provides several convenient interfaces for specifying optimizer settings. For instance, these configurations have the same effects:
- Via `kwargs`
- Via `optimizer_settings`
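A sketch of the equivalence (the inner `optimizer` / `scheduler` / `optimizer_config` key names of `optimizer_settings` are assumptions, as is the placeholder model name):

```python
import cflearn

# Via kwargs
m = cflearn.dl.SimplePipeline(
    model_name="some_model",  # placeholder
    optimizer_name="adam",
    scheduler_name="plateau",
    optimizer_config={"lr": 1.0e-3},
)

# Via optimizer_settings: "all" scopes every trainable parameter
m = cflearn.dl.SimplePipeline(
    model_name="some_model",
    optimizer_settings={
        "all": {
            "optimizer": "adam",
            "scheduler": "plateau",
            "optimizer_config": {"lr": 1.0e-3},
        }
    },
)
```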
If we need to apply different optimizers to different parameters (which is quite common in GANs), we need to walk through the following two steps:
- Define a `property` in your `Model` which returns the list of parameters you want to optimize.
- Define the corresponding optimizer configs with the `property`'s name as the dictionary key.
Here's an example:
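A sketch of both steps (the `SimpleGAN` model, its `g_parameters` / `d_parameters` properties, and the inner key names are all this sketch's own assumptions):

```python
from typing import List

import torch.nn as nn

# Step 1: define properties that return the parameter lists to optimize.
class SimpleGAN(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.generator = nn.Linear(16, 32)
        self.discriminator = nn.Linear(32, 1)

    @property
    def g_parameters(self) -> List[nn.Parameter]:
        return list(self.generator.parameters())

    @property
    def d_parameters(self) -> List[nn.Parameter]:
        return list(self.discriminator.parameters())

# Step 2: use the property names as the scopes of `optimizer_settings`.
optimizer_settings = {
    "g_parameters": {"optimizer": "adam", "optimizer_config": {"lr": 2.0e-4}},
    "d_parameters": {"optimizer": "adam", "optimizer_config": {"lr": 1.0e-4}},
}
```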
finetune_config
Source code: `_init_finetune`.
carefree-learn supports a finetune mechanism, where we can specify:
- The initial states we want to start training from.
- Which parameters should be frozen / trained during the finetune process (regular expressions are supported!).
Example
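A sketch (the `pretrained_ckpt` key name and the placeholder model name are assumptions; `freeze` / `freeze_except` come from the note below and accept regular expressions):

```python
import cflearn

m = cflearn.dl.SimplePipeline(
    model_name="some_model",  # placeholder
    finetune_config={
        "pretrained_ckpt": "path/to/pretrained.pt",  # assumed key name
        "freeze": "encoder.*",  # freeze every parameter matching this regex
    },
)
```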
info
`freeze` & `freeze_except` should not be provided simultaneously.
TqdmSettings
- `use_tqdm` [default = `False`]
  - Whether to enable the `tqdm` progress bar or not.
- `use_step_tqdm` [default = `False`]
  - Whether to enable the `tqdm` progress bar on steps or not.
- `use_tqdm_in_validation` [default = `False`]
  - Whether to enable the `tqdm` progress bar in the validation procedure or not.
- `in_distributed` [default = `False`]
  - This will be handled by carefree-learn internally.
- `position` [default = `0`]
  - This will be handled by carefree-learn internally.
- `desc` [default = `"epoch"`]
  - This will be handled by carefree-learn internally.
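These are passed through the `tqdm_settings` kwarg. A sketch (placeholder model name):

```python
import cflearn

m = cflearn.dl.SimplePipeline(
    model_name="some_model",  # placeholder
    tqdm_settings={
        "use_tqdm": True,       # epoch-level progress bar
        "use_step_tqdm": True,  # step-level progress bar
    },
)
```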
Supported Models
info
In this section, we will:
- Use `load` to represent `cflearn.DLZoo.load_pipeline`.
- Use `key=...` to represent the `__requires__` field.
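For instance, with this shorthand, loading `resnet18` would look like the following sketch (the `"clf/resnet18"` key remains an assumption):

```python
import cflearn

load = cflearn.DLZoo.load_pipeline
m = load("clf/resnet18", num_classes=10)  # num_classes comes from __requires__
```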
tip
It's also recommended to browse the `cflearn/api/zoo/configs` folder, where you can see all the JSON files representing the supported models.