# Configurations
Although the default configurations can already achieve rather good performance, we can often gain extra performance by specifying configurations based on our prior knowledge. We've already covered the basic ideas of configuring `carefree-learn` in the Introduction, so this page focuses on how to actually configure `Pipeline`s.
## Specify Configurations
There are three ways to specify configurations in `carefree-learn`:

- Construct a `Pipeline` from scratch.
- Leverage `DLZoo` to construct a `Pipeline` with a JSON file.
- Utilize `cflearn.api` (recommended!).
Let's say we want to construct a `Pipeline` to train `resnet18` on the MNIST dataset. Here are three different ways to achieve this:
- From Scratch
- DLZoo
- cflearn.api
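The snippets below are a minimal sketch of these three styles, assuming a `"clf/resnet18"` zoo identifier and a `cflearn.api.resnet18` helper (both names are illustrative, not verbatim API):

```python
import cflearn

# 1) From scratch: construct the Pipeline (and its model config) by hand
m = cflearn.cv.CarefreePipeline(
    "resnet18",
    model_config={"num_classes": 10},
)

# 2) DLZoo: let the bundled JSON config fill in most of the details
m = cflearn.DLZoo.load_pipeline("clf/resnet18", num_classes=10)

# 3) cflearn.api: a thin, auto-completion friendly wrapper (recommended)
m = cflearn.api.resnet18(num_classes=10)
```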
We'll describe some details in the following sections.
## DLPipeline
Since `carefree-learn` exposes almost every parameter to users, we can control every part of a `Pipeline` through the args and kwargs of its `__init__`.

Although Machine Learning, Computer Vision and Natural Language Processing are very different, they share much in common when solved with Deep Learning. Therefore, `carefree-learn` implements `DLPipeline` to handle this shared functionality.
note

The `DLPipeline` serves as the base class of all `Pipeline`s; each specific domain needs to inherit `DLPipeline` and implement its own `Pipeline` class.
`loss_name`
- Loss that we'll use for training.
- Currently `carefree-learn` supports:
  - Common losses: `mae`, `mse`, `quantile`, `cross_entropy`, `focal`, ...
  - Task specific losses: `vae`, `vqvae`, ...
`loss_config` [default = `{}`]
- Configurations of the corresponding loss.
`state_config` [default = `{}`]
- Configurations of the `TrainerState`.
`num_epoch` [default = `40`]
- Specify the number of epochs.
- Notice that in most cases this will not be the final epoch number.
`max_epoch` [default = `1000`]
- Specify the maximum number of epochs.
`fixed_epoch` [default = `None`]
- Specify the (fixed) number of epochs.
- If specified, then `num_epoch` and `max_epoch` will both be set to it.
`fixed_steps` [default = `None`]
- Specify the (fixed) number of steps.
`log_steps` [default = `None`]
- Specify the (fixed) number of steps between loggings.
`valid_portion` [default = `1.0`]
- Specify how much of the validation set we want to use for monitoring.
`amp` [default = `False`]
- Specify whether to use the `amp` (automatic mixed precision) technique or not.
`clip_norm` [default = `0.0`]
- Given a gradient `g` and the `clip_norm` value, we will normalize `g` so that its L2-norm is less than or equal to `clip_norm`.
- If `0.0`, then no gradient clipping will be performed.
`cudnn_benchmark` [default = `False`]
- Specify whether to use the `cudnn.benchmark` technique or not.
`metric_names` [default = `None`]
- Specify what metrics we want to use for monitoring.
- If `None`, then no metrics will be used, and losses will be treated as metrics.
`metric_configs` [default = `{}`]
- Configurations of the corresponding metrics.
`use_losses_as_metrics` [default = `None`]
- Specify whether to use losses as metrics or not.
- It will always be `True` if `metric_names` is `None`.
`loss_metrics_weights` [default = `None`]
- Specify the weight of each loss when they are used as metrics.
`recompute_train_losses_in_eval` [default = `True`]
- Specify whether we should recompute losses on the training set at monitor steps, when no validation set is provided.
`monitor_names` [default = `None`]
- Specify what monitors we want to use for monitoring.
- If `None`, then `BasicMonitor` will be used.
`monitor_configs` [default = `{}`]
- Configurations of the corresponding monitors.
`callback_names` [default = `None`]
- Specify what callbacks we want to use during training.
- If `None`, then `_LogMetricsMsgCallback` will be used.
`callback_configs` [default = `{}`]
- Configurations of the corresponding callbacks.
`lr` [default = `None`]
- Default learning rate.
- If not specified, `carefree-learn` will try to infer the best default value.
`optimizer_name` [default = `"None"`]
- Specify which optimizer will be used.
- If not specified, `carefree-learn` will try to infer the best default value.
`scheduler_name` [default = `"None"`]
- Specify which learning rate scheduler will be used.
- If not specified, `carefree-learn` will try to infer the best default value.
`optimizer_config` [default = `{}`]
- Specify the optimizer's configuration.
`scheduler_config` [default = `{}`]
- Specify the scheduler's configuration.
`optimizer_settings` [default = `None`]
- Specify the fine-grained configurations of optimizers and schedulers.
- We should not specify `optimizer_name`, etc. if we want to specify `optimizer_settings`.
- See OptimizerPack for more details.
`workplace` [default = `"_logs"`]
- Specify the workplace of the whole training process.
- In general, `carefree-learn` will create a folder (with a timestamp as its name) in the workplace, and will dump everything generated during training into it.
`finetune_config` [default = `None`]
- Specify the finetune configurations.
- If `None`, then the finetune mechanism supported by `carefree-learn` will not be utilized.
- See finetune_config for more details.
`tqdm_settings` [default = `None`]
- Specify the `tqdm` configurations.
- See TqdmSettings for more details.
`in_loading` [default = `False`]
- In most cases this is an internal property handled by `carefree-learn` itself.
## dl.SimplePipeline

This `Pipeline` aims to solve general deep learning tasks.
`model_name`
- Model that we'll use for training.
`model_config` [default = `{}`]
- Configurations of the corresponding model.
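For instance, a minimal sketch (the concrete model / loss / metric names here are illustrative, not verbatim defaults):

```python
import cflearn

# everything is specified explicitly with dl.SimplePipeline
m = cflearn.dl.SimplePipeline(
    "resnet18",                        # model_name
    model_config={"num_classes": 10},  # forwarded to the model
    loss_name="cross_entropy",         # inherited from DLPipeline
    metric_names="acc",                # single name -> single metric
)
```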
## dl.CarefreePipeline

This `Pipeline` provides some useful default settings on top of `dl.SimplePipeline`.
## cv.SimplePipeline

This `Pipeline` is exactly the same as `dl.SimplePipeline`, just an alias.
## cv.CarefreePipeline

This `Pipeline` is exactly the same as `dl.CarefreePipeline`, just an alias.
## ml.SimplePipeline

This `Pipeline` aims to solve tabular tasks. It will always use `MLModel` as its model, and we can only specify the `core` of the `MLModel`.
info

The reason why `carefree-learn` does so is that in tabular tasks, there are many common practices which should be applied every time, such as:

- Encoding the categorical columns (to `one_hot` / `embedding` format, required).
- Pre-processing the numerical columns (with the `min_max` / `normalize` / ... method, optional).
- Deciding the binary threshold in binary classification tasks.
- ......
In order to prevent users from re-implementing this over and over again, `carefree-learn` provides `MLModel`, which wraps everything up. This way, we can focus on the core algorithms without worrying about the rest.
`core_name` [default = `"fcnn"`]
- Core of `MLModel` that we'll use for training.
`core_config` [default = `{}`]
- Configurations of the corresponding core.
`input_dim` [default = `None`]
- Input dimension of the task.
- If not provided, then `cf_data` should be provided in the `MLData` we want to train on.
`output_dim` [default = `None`]
- Output dimension of the task.
- If not provided, then `cf_data` should be provided in the `MLData` we want to train on.
`loss_name` [default = `"auto"`]
- Loss that we'll use for training.
- As the default (`"auto"`), `carefree-learn` will use:
  - `"mae"` for regression tasks.
  - `"focal"` for classification tasks.
`loss_config` [default = `{}`]
- Configurations of the corresponding loss.
`only_categorical` [default = `False`]
- Specify whether all columns in the task are categorical columns or not.
`encoder_config` [default = `{}`]
- Configurations of the `Encoder`.
`encoding_methods` [default = `None`]
- Encoding methods we will use to encode the categorical columns.
`encoding_configs` [default = `{}`]
- Configurations of the corresponding methods.
`default_encoding_methods` [default = `["embedding"]`]
- Default encoding methods we will use to encode the categorical columns.
`default_encoding_configs` [default = `{}`]
- Default configurations of the corresponding methods.
`pre_process_batch` [default = `False`]
- Specify whether we should pre-process the input batch or not.
`num_repeat` [default = `None`]
- In most cases this is an internal property handled by `carefree-learn` itself.
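Putting the parameters above together, a minimal sketch (the dimensions and the core choice are illustrative):

```python
import cflearn

# a tabular pipeline: MLModel is implied, we only pick its core
m = cflearn.ml.SimplePipeline(
    core_name="fcnn",    # the core of MLModel
    core_config={},      # forwarded to the core
    input_dim=10,        # could instead be inferred from cf_data in MLData
    output_dim=1,        # could instead be inferred from cf_data in MLData
    loss_name="auto",    # resolves to "mae" / "focal" depending on the task
)
```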
## ml.CarefreePipeline

This `Pipeline` provides some useful default settings on top of `ml.SimplePipeline`.
## DLZoo

### Configure

Since it would be tedious to re-define similar configurations over and over, `carefree-learn` provides `DLZoo` to improve the user experience. Internally, `DLZoo` reads configurations from `cflearn/api/zoo/configs`, which contains a bunch of JSON files.
The general usage of `DLZoo` is as follows:
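A minimal sketch of this usage; the exact `"{task}/{model}.{type}"` naming scheme is an assumption made for illustration:

```python
import cflearn

# general form: "{task}/{model}.{type}" plus the Pipeline kwargs described below
# (".{type}" may be omitted, in which case "default" is used)
m = cflearn.DLZoo.load_pipeline("clf/resnet18.default", num_classes=10)
```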
`task`
- Specify the task we want to work with.
- See Supported Models for more details.

`model`
- Specify the model we want to use.
- See Supported Models for more details.

`type`
- Specify the model type we want to use.
- If not provided, we will use `default` as the model type.

`kwargs`
- Specify the keyword arguments of the `Pipeline`, described above.
- See the Example section for more details.
### __requires__

Although `carefree-learn` wants to make everything as easy as possible, there are still some properties that `carefree-learn` cannot decide for you (e.g. `img_size`, `num_classes`, etc.). These properties are presented in the `__requires__` field of each JSON file.
For example, `resnet18` requires you to provide the `num_classes` property, so the corresponding JSON file will be:
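An illustrative reconstruction of such a file (other fields omitted; the exact layout of the real file may differ):

```json
{
  "__requires__": {
    "model_config": ["num_classes"]
  }
}
```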
This means we need to specify `num_classes` if we want to use `resnet18`:
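For instance (with the same assumed `"clf/resnet18"` identifier as above):

```python
m = cflearn.DLZoo.load_pipeline("clf/resnet18", num_classes=10)
```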
info
In fact, the 'original' configuration should be:
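Sketched with the same assumed identifier:

```python
m = cflearn.DLZoo.load_pipeline(
    "clf/resnet18",
    model_config={"num_classes": 10},
)
```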
Because `num_classes` should be defined under the `model_config` scope.
Since this is quite troublesome, we decided to allow users to specify these 'requirements' directly by name, which makes `DLZoo` more readable and easier to use!
### Example

The following two code snippets have the same effect:
- From Scratch
- DLZoo
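A sketch of the pair (identifiers as assumed above):

```python
# From Scratch
m = cflearn.cv.CarefreePipeline(
    "resnet18",
    model_config={"num_classes": 10},
)

# DLZoo
m = cflearn.DLZoo.load_pipeline("clf/resnet18", num_classes=10)
```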
## cflearn.api

### Configure

Since `DLZoo` mainly depends on JSON files, which cannot provide useful auto-completion, `carefree-learn` further provides `cflearn.api`, a thin wrapper around `DLZoo`, as the recommended user interface.
Configuring `cflearn.api` is exactly the same as configuring `DLZoo`, except that it can utilize auto-completion, which significantly improves the user experience.
## Configuration Details
### make_multiple mechanism

This mechanism is based on the Register Mechanism.
The `make_multiple` mechanism is useful when we need either one single instance or multiple instances (e.g. using one metric / multiple metrics to monitor the training process):

- When we need one single instance, only a single name (`str`) and the corresponding config are required.
- When we need multiple instances, their names (`List[str]`) are required, and the configs should be a dictionary, where:
  - The keys should be the names.
  - The values should be the corresponding configs.
The source code demonstrates how it works:
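The following is a condensed sketch of the pattern rather than the verbatim source; the `WithRegister` name and its `d` registry attribute are assumptions:

```python
from typing import Any, Dict, List, Optional, Type, Union

class WithRegister:
    d: Dict[str, Type["WithRegister"]] = {}  # name -> registered class

    @classmethod
    def make(cls, name: str, config: Dict[str, Any]) -> "WithRegister":
        return cls.d[name](**config)

    @classmethod
    def make_multiple(
        cls,
        names: Union[str, List[str]],
        configs: Optional[Dict[str, Any]] = None,
    ) -> Union["WithRegister", List["WithRegister"]]:
        configs = configs or {}
        # single name -> single instance, `configs` is its config
        if isinstance(names, str):
            return cls.make(names, configs)
        # multiple names -> `configs` maps each name to its own config
        return [cls.make(name, configs.get(name, {})) for name in names]
```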
### TrainerState

Source code: protocol.py
`loader`
- This will be handled by `carefree-learn` internally.
`num_epoch`
- Specify the number of epochs.
- Notice that in most cases this will not be the final epoch number.
`max_epoch`
- Specify the maximum number of epochs.
`fixed_steps` [default = `None`]
- Specify the (fixed) number of steps.
`extension` [default = `None`]
- Specify the number of extended epochs per extension.
- So basically, we'll not extend the epochs for more than `(max_epoch - num_epoch) / extension` times.
`enable_logging` [default = `True`]
- Whether to enable logging or not.
`min_num_sample` [default = `3000`]
- We'll not start monitoring until the model has already seen `min_num_sample` samples.
- This prevents monitors from stopping too early, while the model is still trying to optimize from its initial state.
`snapshot_start_step` [default = `None`]
- Specify the number of steps after which we start to take snapshots.
- If not specified, `carefree-learn` will try to infer the best default value.
`max_snapshot_file` [default = `5`]
- Specify the maximum number of checkpoint files we could save during training.
`num_snapshot_per_epoch` [default = `2`]
- Indicates how many snapshots we would like to take per epoch.
- The final behaviour will be affected by `max_step_per_snapshot`.
`num_step_per_log` [default = `350`]
- Indicates the number of steps of each logging period.
`num_step_per_snapshot` [default = `None`]
- Specify the number of steps of each snapshot period.
- If not specified, `carefree-learn` will try to infer the best default value.
`max_step_per_snapshot` [default = `1000`]
- Specify the maximum number of steps of each snapshot period.
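These fields are passed through the `state_config` kwarg of the `Pipeline`s described above; a minimal sketch (the model and the tweaked values are illustrative):

```python
import cflearn

m = cflearn.cv.CarefreePipeline(
    "resnet18",
    model_config={"num_classes": 10},
    state_config={
        "min_num_sample": 5000,    # delay monitoring a bit longer
        "num_step_per_log": 100,   # log more frequently
        "max_snapshot_file": 10,   # keep more checkpoints around
    },
)
```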
### BasicMonitor

Source code: monitors.py.

This is the default monitor of `carefree-learn`. It's fairly simple, but quite useful in practice:
- It will take a snapshot when SOTA is achieved.
- It will terminate the training after `patience` steps, if the new score is even worse than the worst score so far.
- It will not punish extension.
info

So in most cases, `BasicMonitor` will not early-stop until `max_epoch` is reached.
### _LogMetricsMsgCallback

Source code: general.py.

This is the default callback of `carefree-learn`. It periodically reports the validation metrics to the console, along with the current steps / epochs and the execution time since the last report. It will also write this information to disk.
info

When writing to disk, `_LogMetricsMsgCallback` will also write the `lr` (learning rate) of the corresponding steps.
### OptimizerPack
`scope`
- Specify the parameter 'scope' of this pack.
- If `scope="all"`, all trainable parameters will be considered.
- Otherwise, it represents an attribute of the model, and:
  - If this attribute is an `nn.Module`, then its parameters will be considered.
  - Otherwise, this attribute should be a list of parameters, which will be considered.
`optimizer_name`
- Specify which optimizer will be used.
`scheduler_name` [default = `"None"`]
- Specify which learning rate scheduler will be used.
- If not specified, no scheduler will be used.
`optimizer_config` [default = `{}`]
- Specify the optimizer's configuration.
`scheduler_config` [default = `{}`]
- Specify the scheduler's configuration.
Since directly constructing `OptimizerPack`s would be troublesome, `carefree-learn` provides many convenient interfaces for specifying optimizer settings. For instance, these configurations will have the same effect:
- Via `kwargs`
- Via `optimizer_settings`
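A sketch of the pair; the optimizer / scheduler names are illustrative, and the exact shape of each `optimizer_settings` entry is an assumption modelled on the `OptimizerPack` fields above:

```python
import cflearn

# Via `kwargs`
m = cflearn.cv.CarefreePipeline(
    "resnet18",
    model_config={"num_classes": 10},
    optimizer_name="adam",
    scheduler_name="plateau",
    optimizer_config={"lr": 1e-3},
)

# Via `optimizer_settings` (keys are the `scope`s)
m = cflearn.cv.CarefreePipeline(
    "resnet18",
    model_config={"num_classes": 10},
    optimizer_settings={
        "all": {
            "optimizer_name": "adam",
            "scheduler_name": "plateau",
            "optimizer_config": {"lr": 1e-3},
        },
    },
)
```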
If we need to apply different optimizers to different sets of parameters (which is quite common in GANs), we need to walk through the following two steps:

- Define a `property` in your `Model` which returns a list of the parameters you want to optimize.
- Define the corresponding optimizer configs, with the `property`'s name as the dictionary key.
Here's an example:
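(A sketch under the same assumptions; the model class and property names are illustrative.)

```python
from typing import List

import torch.nn as nn

class GANModel(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.generator = nn.Linear(16, 16)
        self.discriminator = nn.Linear(16, 1)

    # step 1: expose the parameter groups as properties
    @property
    def g_parameters(self) -> List[nn.Parameter]:
        return list(self.generator.parameters())

    @property
    def d_parameters(self) -> List[nn.Parameter]:
        return list(self.discriminator.parameters())

# step 2: use the property names as the `optimizer_settings` keys
# (this dict would be passed as the `optimizer_settings` kwarg)
optimizer_settings = {
    "g_parameters": {"optimizer_name": "adam", "optimizer_config": {"lr": 2e-4}},
    "d_parameters": {"optimizer_name": "adam", "optimizer_config": {"lr": 2e-4}},
}
```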
### finetune_config

Source code: `_init_finetune`.
`carefree-learn` supports a finetune mechanism, in which we can specify:

- The initial states we want to start training from.
- Which parameters we should freeze / train during the finetune process. Regex is supported!
#### Example
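A sketch of such a config: the checkpoint key name is an assumption, while `freeze` / `freeze_except` come from the note below:

```python
finetune_config = {
    # the pretrained states to start from (key name is an assumption)
    "pretrained_ckpt": "path/to/checkpoint.pt",
    # freeze every parameter whose name matches this regex
    "freeze": r".*bn.*",
    # or: freeze everything EXCEPT the matches (mutually exclusive with `freeze`)
    # "freeze_except": r".*head.*",
}
```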
info

`freeze` & `freeze_except` should not be provided simultaneously.
### TqdmSettings
`use_tqdm` [default = `False`]
- Whether to enable the `tqdm` progress bar or not.
`use_step_tqdm` [default = `False`]
- Whether to enable the `tqdm` progress bar on steps or not.
`use_tqdm_in_validation` [default = `False`]
- Whether to enable the `tqdm` progress bar in the validation procedure or not.
`in_distributed` [default = `False`]
- This will be handled by `carefree-learn` internally.
`position` [default = `0`]
- This will be handled by `carefree-learn` internally.
`desc` [default = `"epoch"`]
- This will be handled by `carefree-learn` internally.
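These fields are passed through the `tqdm_settings` kwarg of the `Pipeline`s described above; a minimal sketch (model and values are illustrative):

```python
import cflearn

m = cflearn.cv.CarefreePipeline(
    "resnet18",
    model_config={"num_classes": 10},
    tqdm_settings={
        "use_tqdm": True,       # show the epoch-level progress bar
        "use_step_tqdm": True,  # and a step-level one as well
    },
)
```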
## Supported Models

info

In this section, we will:

- Use `load` to represent `cflearn.DLZoo.load_pipeline`.
- Use `key=...` to represent the `__requires__` field.
tip

It's also recommended to browse the `cflearn/api/zoo/configs` folder, where you can see all the JSON files representing the corresponding supported models.