ezmsg.learn.model.mlp#

Classes

class MLP(input_size, hidden_size, num_layers=None, output_heads=2, norm_layer=None, activation_layer='ReLU', inplace=None, bias=True, dropout=0.0)[source]#

Bases: Module

A simple Multi-Layer Perceptron (MLP) model. Adapted from Ezmsg MLP.

Attributes:
feature_extractor#

The sequential feature extractor part of the MLP.

Type:

Sequential

heads#

A dictionary of output linear layers for each output head.

Type:

ModuleDict

__init__(input_size, hidden_size, num_layers=None, output_heads=2, norm_layer=None, activation_layer='ReLU', inplace=None, bias=True, dropout=0.0)[source]#

Initialize the MLP model.

Parameters:
  • input_size (int) – The size of the input features.

  • hidden_size (int | list[int]) – The sizes of the hidden layers. If a list, num_layers must be None or equal to the length of the list. If a single integer, num_layers must be specified and determines the number of hidden layers.

  • num_layers (int, optional) – The number of hidden layers; if None, the length of hidden_size is used. Default is None.

  • output_heads (int | dict[str, int], optional) – The number of output features or classes for a single-head output, or a dictionary mapping head names to output sizes for a multi-head output. Default is 2 (single head).

  • norm_layer (str, optional) – A normalization layer applied after each linear layer. Common choices are “BatchNorm1d” or “LayerNorm”. Default is None.

  • activation_layer (str, optional) – An activation function applied after each normalization layer. Default is “ReLU”.

  • inplace (bool, optional) – Whether the activation function is performed in-place. Default is None.

  • bias (bool, optional) – Whether to use bias in the linear layers. Default is True.

  • dropout (float, optional) – The dropout rate applied after each linear layer. Default is 0.0.
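The interaction between hidden_size and num_layers can be sketched in plain Python. This is a hypothetical helper for illustration, not part of the library:

```python
def resolve_hidden_sizes(hidden_size, num_layers=None):
    """Resolve the per-layer hidden sizes from the two constructor arguments."""
    if isinstance(hidden_size, list):
        # A list fixes the architecture; num_layers must agree if given.
        if num_layers is not None and num_layers != len(hidden_size):
            raise ValueError("num_layers must be None or len(hidden_size)")
        return list(hidden_size)
    # A single integer is repeated num_layers times.
    if num_layers is None:
        raise ValueError("num_layers is required when hidden_size is an int")
    return [hidden_size] * num_layers
```

For example, resolve_hidden_sizes(64, num_layers=3) yields three hidden layers of width 64, while resolve_hidden_sizes([128, 64]) yields a 128-then-64 stack with no num_layers needed.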

classmethod infer_config_from_state_dict(state_dict)[source]#

Infer the configuration from the state dict.

Parameters:

state_dict (dict) – The state dict of the model.

Returns:

A dictionary containing the inferred configuration.

Return type:

dict[str, int | float]
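The kind of inference this method performs can be sketched as follows, assuming linear weights are stored with shape (out_features, in_features) under keys like "feature_extractor.0.weight" (illustrative key names, not necessarily the library's exact ones; head layers and biases are ignored):

```python
def infer_sizes(shapes):
    """shapes: mapping of parameter name -> weight shape tuple (out, in)."""
    weights = [shape for name, shape in sorted(shapes.items())
               if name.startswith("feature_extractor.")
               and name.endswith(".weight")]
    input_size = weights[0][1]              # in_features of the first linear layer
    hidden_sizes = [w[0] for w in weights]  # out_features of each layer
    return {"input_size": input_size, "hidden_size": hidden_sizes}
```

The real method works on an actual state dict of tensors, but the principle is the same: layer widths are read off the weight shapes rather than stored separately.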

forward(x)[source]#

Forward pass through the MLP.

Parameters:

x (Tensor) – Input tensor of shape (batch, seq_len, input_size).

Returns:

A dictionary mapping head names to output tensors.

Return type:

dict[str, Tensor]
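The multi-head output pattern (shared features feeding one linear map per head name) can be illustrated without torch; the function and head names below are a simplified stand-in for the real module, which operates on batched tensors:

```python
def multi_head_forward(features, heads):
    """features: list of floats; heads: name -> weight matrix (rows = outputs).

    Returns a dict mapping each head name to that head's output vector,
    mirroring the dict[str, Tensor] returned by MLP.forward.
    """
    return {
        name: [sum(wi * fi for wi, fi in zip(row, features)) for row in w]
        for name, w in heads.items()
    }
```

Each head sees the same extracted features but applies its own linear layer, so outputs for all heads are produced in a single forward pass.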
