Policies#

navground.learning.policies

Null#

Bases: BasePolicy

This class describes a dummy policy that always returns zeros.

Parameters:
  • squash_output (bool)

This class describes a dummy predictor that always returns zeros and conforms to navground.learning.types.PolicyPredictor

Parameters:
  • action_space (gym.spaces.Box) – The action space

  • observation_space (gym.Space[Any]) – An optional observation space
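The behaviour described above can be illustrated with a minimal sketch. It is not the library's implementation: `ZeroPredictor` is a hypothetical name, the action shape is passed directly instead of being read from `action_space`, and the SB3-style `predict` signature is an assumption about what the predictor protocol requires.

```python
import numpy as np


class ZeroPredictor:
    """Hypothetical stand-in for a predictor that always returns zeros."""

    def __init__(self, action_shape: tuple[int, ...], dtype=np.float32):
        # In the documented class, the shape would come from `action_space`.
        self.action_shape = action_shape
        self.dtype = dtype

    def predict(self, observation, state=None, episode_start=None,
                deterministic=False):
        # The observation is ignored; the state is passed through unchanged.
        return np.zeros(self.action_shape, dtype=self.dtype), state


predictor = ZeroPredictor(action_shape=(2,))
action, _ = predictor.predict(observation=np.ones(5))
```

A predictor of this kind is mostly useful as a baseline or as a placeholder where an object with a `predict` method is required.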

Random#

Bases: BasePolicy

This class describes an ONNX-exportable policy that returns random actions.

Parameters:
  • squash_output (bool)

This class describes a predictor that returns random actions and conforms to navground.learning.types.PolicyPredictor

Parameters:
  • action_space (gym.spaces.Box) – The action space

  • observation_space (gym.Space[Any]) – An optional observation space
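As a rough illustration of a random predictor, the sketch below samples uniformly within the bounds of a box. `UniformPredictor`, its `low`/`high`/`seed` arguments, and the SB3-style `predict` signature are all assumptions made for this example; the sampling distribution used by the documented class is not stated above.

```python
import numpy as np


class UniformPredictor:
    """Hypothetical predictor that samples uniformly within box bounds."""

    def __init__(self, low, high, seed=None):
        # In the documented class, the bounds would come from `action_space`.
        self.low = np.asarray(low, dtype=np.float32)
        self.high = np.asarray(high, dtype=np.float32)
        self.rng = np.random.default_rng(seed)

    def predict(self, observation, state=None, episode_start=None,
                deterministic=False):
        # The observation is ignored; one value is drawn per action dimension.
        action = self.rng.uniform(self.low, self.high).astype(np.float32)
        return action, state


predictor = UniformPredictor(low=[-1.0, -1.0], high=[1.0, 1.0], seed=0)
action, _ = predictor.predict(observation=None)
```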

Info#

A predictor that extracts navground actions from the info dictionary and conforms to navground.learning.types.PolicyPredictorWithInfo

Parameters:
  • action_space (gym.Space[Any]) – The action space

  • key (str) – The key of the action in the info dictionary

  • observation_space (gym.Space[Any]) – An optional observation space
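The idea of reading the action from the info dictionary can be sketched as follows. The helper `action_from_info` and the key `"navground_action"` are illustrative assumptions, not names confirmed by the text above; the actual predictor also has to conform to navground.learning.types.PolicyPredictorWithInfo, which this sketch does not attempt.

```python
import numpy as np


def action_from_info(info: dict, key: str) -> np.ndarray:
    """Return the action stored under `key` in an info dictionary."""
    # Raises KeyError if no action was stored under `key`.
    return np.asarray(info[key])


# Hypothetical info dictionary, such as an environment step might return.
info = {"navground_action": [0.5, 0.0]}
action = action_from_info(info, key="navground_action")
```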

Ordering-invariant extractor#

A variation of SB3's stable_baselines3.common.torch_layers.CombinedExtractor that applies an ordering invariant MLP feature extractor to a group of keys, after optionally masking them.

Otherwise it behaves like the original CombinedExtractor, a combined features extractor for Dict observation spaces: it builds a features extractor for each key of the space. Input from each space is fed through a separate submodule (CNN or MLP, depending on input shape); the output features are concatenated and fed through an additional MLP network (“combined”).

Parameters:
  • observation_space (gym.spaces.Dict) – the observation space

  • cnn_output_dim (int) – Number of features to output from each CNN submodule(s). Defaults to 256 to avoid exploding network sizes.

  • normalized_image (bool) – Whether to assume that the image is already normalized or not (this disables dtype and bounds checks): when True, it only checks that the space is a Box and has 3 dimensions. Otherwise, it checks that it has expected dtype (uint8) and bounds (values in [0, 255]).

  • order_invariant_keys (Collection[str]) – the keys to group together and process by an ordering invariant feature extractor

  • replicated_keys (Collection[str]) – additional keys to add to the ordering invariant groups, replicating the values for each group

  • filter_key (str) – the key to use for masking to select items with positive values of this key

  • removed_keys (Collection[str]) – keys removed from the observations before concatenating with ordering invariant features

  • net_arch (list[int]) – the ordering invariant MLP layer sizes

  • activation_fn (type[nn.Module] | None) – the ordering invariant MLP activation function. If not set, it defaults to torch.nn.ReLU.

  • reductions (Sequence[Reduction] | None) – A sequence of (order-invariant) modules. If not set, it defaults to [torch.sum].
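Why a shared MLP followed by a sum reduction is ordering invariant can be checked numerically. The sketch below uses NumPy rather than PyTorch and is not the extractor's implementation: the sizes, the single linear layer with ReLU, and the way masked items are zeroed before the reduction are assumptions chosen to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 items with 3 features each, one hidden layer of 8 units.
items = rng.normal(size=(4, 3))
w = rng.normal(size=(3, 8))
b = rng.normal(size=8)
# Filter values: only items with a positive value are kept.
mask_values = np.array([1.0, 0.0, 1.0, 1.0])


def extract(items, mask_values):
    # Shared layer (linear + ReLU) applied to every item independently.
    hidden = np.maximum(items @ w + b, 0.0)
    # Zero out the features of the masked items.
    keep = (mask_values > 0)[:, None]
    # Sum reduction over the items: their order does not affect the result.
    return np.sum(hidden * keep, axis=0)


features = extract(items, mask_values)
perm = np.array([2, 0, 3, 1])
permuted = extract(items[perm], mask_values[perm])
```

Here `features` and `permuted` agree up to floating-point rounding, because every item passes through the same weights and the sum does not depend on the order of its terms.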

Similar to OrderInvariantCombinedExtractor but for flat observation spaces.

Parameters:
  • observation_space (gym.spaces.Box) – the observation space

  • order_invariant_slices (Collection[slice]) – the slices to group together and process by an ordering invariant feature extractor

  • replicated_slices (Collection[slice]) – additional slices to add to the ordering invariant groups, replicating the values for each group

  • filter_slice (slice | None) – the slice to use for masking to select items with positive values for indices in this slice

  • removed_slices (Collection[slice]) – slices removed from the observations before concatenating with ordering invariant features

  • net_arch (list[int]) – the ordering invariant MLP layer sizes

  • activation_fn (type[nn.Module] | None) – the ordering invariant MLP activation function. If not set, it defaults to torch.nn.ReLU.

  • reductions (Sequence[Reduction] | None) – A sequence of (order-invariant) modules. If not set, it defaults to [torch.sum].

  • use_masked_tensors (bool) – Whether to use masked tensors

  • number (int)
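One plausible reading of these parameters, not confirmed by the text above, is that each ordering invariant slice holds the values of `number` items stored contiguously, and that replicated slices are copied onto every item. The sketch below illustrates that reading with NumPy; the layout, the slice names, and the interpretation of `number` are all assumptions.

```python
import numpy as np

number = 3  # hypothetical number of items (for example, neighbors)

# Hypothetical flat observation: 3 positions (2 values each), 3 radii, 1 speed.
flat = np.arange(10, dtype=np.float32)
position_slice = slice(0, 6)
radius_slice = slice(6, 9)
speed_slice = slice(9, 10)  # replicated for every item

# Reshape each ordering invariant slice to one row per item.
groups = [flat[s].reshape(number, -1) for s in (position_slice, radius_slice)]
# Copy the replicated values onto every row.
replicated = np.tile(flat[speed_slice], (number, 1))
# Join the columns: each row now holds all the features of one item.
items = np.concatenate(groups + [replicated], axis=1)
```

Under this reading, `items` has one row per item, which is the shape an ordering invariant feature extractor would then process row by row.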

Helper function that creates an OrderInvariantFlattenExtractor, using information from a dictionary space to infer the layout of the flat observation space.

Parameters:
  • observation_space (Box) – the observation space

  • dict_space (Dict) – The dictionary space

  • cnn_output_dim (int) – Number of features to output from each CNN submodule(s). Defaults to 256 to avoid exploding network sizes.

  • normalized_image (bool) – Whether to assume that the image is already normalized or not (this disables dtype and bounds checks): when True, it only checks that the space is a Box and has 3 dimensions. Otherwise, it checks that it has expected dtype (uint8) and bounds (values in [0, 255]).

  • order_invariant_keys (Collection[str]) – the keys to group together and process by an ordering invariant feature extractor

  • replicated_keys (Collection[str]) – additional keys to add to the ordering invariant groups, replicating the values for each group

  • filter_key (str) – the key to use for masking to select items with positive values of this key

  • removed_keys (Collection[str]) – keys removed from the observations before concatenating with ordering invariant features

  • net_arch (list[int]) – the ordering invariant MLP layer sizes

  • activation_fn (type[Module] | None) – the ordering invariant MLP activation function. If not set, it defaults to torch.nn.ReLU.

  • reductions (Sequence[Reduction] | None) – A sequence of (order-invariant) modules. If not set, it defaults to [torch.sum].

  • use_masked_tensors (bool) – Whether to use masked tensors

Returns:

The order invariant flatten extractor.

Raises:

AssertionError – if filter_key is associated with a non-flat space.

Return type:

OrderInvariantFlattenExtractor
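Inferring a flat layout from a dictionary layout amounts to assigning each key a slice of the flattened vector. The sketch below shows the idea in plain Python; `slices_from_shapes` is a hypothetical helper, not part of the library, and it assumes the keys are concatenated in the dictionary's iteration order, which may differ from the order used when a real Dict space is flattened.

```python
from math import prod


def slices_from_shapes(shapes: dict[str, tuple[int, ...]]) -> dict[str, slice]:
    """Map each key to its slice in the concatenated flat vector."""
    slices, start = {}, 0
    for key, shape in shapes.items():
        # Each key occupies as many entries as its shape has elements.
        size = prod(shape)
        slices[key] = slice(start, start + size)
        start += size
    return slices


# Hypothetical dictionary layout: 3 neighbors with positions and radii.
layout = slices_from_shapes({"position": (3, 2), "radius": (3,), "speed": (1,)})
```

With such a mapping, key-based arguments like order_invariant_keys or filter_key can be translated into the corresponding slice-based arguments.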