navground_learning 0.1.0 documentation

Navground Components

Contents

  • Behaviors
    • PolicyBehavior
      • PolicyBehavior
        • PolicyBehavior.clone()
        • PolicyBehavior.clone_behavior()
        • PolicyBehavior.deterministic
        • PolicyBehavior.fix_orientation
        • PolicyBehavior.flat
        • PolicyBehavior.history
        • PolicyBehavior.include_radius
        • PolicyBehavior.include_target_angular_speed
        • PolicyBehavior.include_target_direction
        • PolicyBehavior.include_target_direction_validity
        • PolicyBehavior.include_target_distance
        • PolicyBehavior.include_target_distance_validity
        • PolicyBehavior.include_target_speed
        • PolicyBehavior.include_velocity
        • PolicyBehavior.max_acceleration
        • PolicyBehavior.max_angular_acceleration
        • PolicyBehavior.policy_path
        • PolicyBehavior.use_acceleration_action
        • PolicyBehavior.use_wheels
  • Scenarios
    • CorridorWithObstacle
      • CorridorWithObstacle
        • CorridorWithObstacle.init_world()
        • CorridorWithObstacle.length
        • CorridorWithObstacle.max_radius
        • CorridorWithObstacle.min_radius
        • CorridorWithObstacle.width
    • Forward
      • ForwardScenario
        • ForwardScenario.init_world()
  • Probes
    • GymProbe
      • GymProbe.with_env()
    • RewardProbe
      • RewardProbe.dtype
      • RewardProbe.with_reward()

Navground Components#

Behaviors#

navground.learning.behaviors

PolicyBehavior#

class PolicyBehavior(kinematics: core.Kinematics | None = None, radius: float = 0.0, policy: AnyPolicyPredictor | None = None, action_config: ControlActionConfig = ControlActionConfig(), observation_config: DefaultObservationConfig = DefaultObservationConfig(), deterministic: bool = False)#

Bases: Behavior

A navigation behavior that evaluates an ML policy.

Registered properties:

  • policy_path (str)

  • flat (bool)

  • history (int)

  • fix_orientation (bool)

  • include_target_direction (bool)

  • include_target_direction_validity (bool)

  • include_target_distance (bool)

  • include_target_distance_validity (bool)

  • include_target_speed (bool)

  • include_target_angular_speed (bool)

  • include_velocity (bool)

  • include_radius (bool)

  • use_wheels (bool)

  • use_acceleration_action (bool)

  • max_acceleration (float)

  • max_angular_acceleration (float)

  • deterministic (bool)

State: core.SensingState

Parameters:
  • kinematics (core.Kinematics | None) – The agent kinematics

  • radius (float) – The agent radius

  • policy (AnyPolicyPredictor | None) – The policy

  • action_config (ControlActionConfig) – How the policy output is converted into control actions (default if not specified)

  • observation_config (DefaultObservationConfig) – How the observations passed to the policy are built (default if not specified)

  • deterministic (bool) – Whether or not to output deterministic actions
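
For illustration only (this does not appear in the generated reference), a PolicyBehavior can be constructed directly; the kinematics class and the import path of the configuration classes below are assumptions that may need adapting:

from navground import core
from navground.learning import ControlActionConfig, DefaultObservationConfig
from navground.learning.behaviors import PolicyBehavior

# assumed kinematics class: any core.Kinematics subclass works here
kinematics = core.kinematics.OmnidirectionalKinematics(max_speed=1.0,
                                                       max_angular_speed=1.0)
behavior = PolicyBehavior(kinematics=kinematics,
                          radius=0.1,
                          action_config=ControlActionConfig(),
                          observation_config=DefaultObservationConfig(),
                          deterministic=True)
# instead of passing a policy object, a saved model can be referenced
# through the registered property `policy_path`
behavior.policy_path = "policy.onnx"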

clone() → PolicyBehavior#

Creates a new policy behavior with the same properties but a separate state.

Returns:

Copy of this object.

Return type:

PolicyBehavior

classmethod clone_behavior(behavior: core.Behavior, policy: AnyPolicyPredictor | PathLike | None, action_config: ControlActionConfig, observation_config: DefaultObservationConfig, deterministic: bool = False) → Self#

Configures a new policy behavior from an existing behavior.

Parameters:
  • behavior (core.Behavior) – The behavior to replicate

  • policy (AnyPolicyPredictor | PathLike | None) – The policy

  • action_config (ControlActionConfig) – The action configuration

  • observation_config (DefaultObservationConfig) – The observation configuration

  • deterministic (bool) – Whether or not to output deterministic actions

Returns:

The configured policy behavior

Return type:

Self
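
As a hedged sketch of typical usage (agent and "policy.onnx" are placeholders), an agent's scripted behavior can be swapped for a learned policy while keeping its other properties:

from navground.learning import ControlActionConfig, DefaultObservationConfig
from navground.learning.behaviors import PolicyBehavior

# `agent` is any navground agent whose behavior should be replaced
agent.behavior = PolicyBehavior.clone_behavior(
    behavior=agent.behavior,                    # the behavior to replicate
    policy="policy.onnx",                       # a path (PathLike) is accepted
    action_config=ControlActionConfig(),
    observation_config=DefaultObservationConfig(),
    deterministic=True)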

property deterministic: bool#

Whether or not to output deterministic actions

property fix_orientation: bool#

See ControlActionConfig.fix_orientation

property flat: bool#

See DefaultObservationConfig.flat

property history: int#

See DefaultObservationConfig.history

property include_radius: bool#

See DefaultObservationConfig.include_radius

property include_target_angular_speed: bool#

See DefaultObservationConfig.include_target_angular_speed

property include_target_direction: bool#

See DefaultObservationConfig.include_target_direction

property include_target_direction_validity: bool#

See DefaultObservationConfig.include_target_direction_validity

property include_target_distance: bool#

See DefaultObservationConfig.include_target_distance

property include_target_distance_validity: bool#

See DefaultObservationConfig.include_target_distance_validity

property include_target_speed: bool#

See DefaultObservationConfig.include_target_speed

property include_velocity: bool#

See DefaultObservationConfig.include_velocity

property max_acceleration: float#

See ControlActionConfig.max_acceleration

property max_angular_acceleration: float#

See ControlActionConfig.max_angular_acceleration

property policy_path: str#

The file from which to load the model

property use_acceleration_action: bool#

See ControlActionConfig.use_acceleration_action

property use_wheels: bool#

See ControlActionConfig.use_wheels

Scenarios#

navground.learning.scenarios

CorridorWithObstacle#

class CorridorWithObstacle(length: float = 10.0, width: float = 1.0, min_radius: float = 0.1, max_radius: float = 0.5)#

Bases: Scenario

Simple worlds with:

  • one agent that starts at (0, y), with y ~ U(0, width), and wants to travel towards +x

  • two long horizontal walls at y = 0 and y = width

  • one circular obstacle centered at (length, y), with y ~ U(0, width), and radius ~ U(min_radius, max_radius)

The simulation finishes when the agent reaches x >= 2 * length.

Parameters:
  • length (float) – The length of the corridor, i.e. the x-coordinate of the obstacle

  • width (float) – The width of the corridor

  • min_radius (float) – The minimal obstacle radius

  • max_radius (float) – The maximal obstacle radius

init_world(world: World, seed: int | None = None) → None#

Initializes the world.

Parameters:
  • world (World) – The world

  • seed (int | None) – The seed

Return type:

None

property length: float#

The length of the corridor, which is also the x-coordinate of the obstacle

property max_radius: float#

The maximal obstacle radius

property min_radius: float#

The minimal obstacle radius

property width: float#

The width of the corridor
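
A minimal usage sketch, relying only on the signatures documented above:

from navground import sim
from navground.learning.scenarios import CorridorWithObstacle

scenario = CorridorWithObstacle(length=10.0, width=1.0,
                                min_radius=0.1, max_radius=0.5)
world = sim.World()
scenario.init_world(world, seed=0)  # adds the agent, the two walls and the obstacle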

Forward#

class ForwardScenario(width: float = 2.0, length: float = 6.0, min_number_of_obstacles: int = 0, max_number_of_obstacles: int = 0, min_obstacle_radius: float = 0.2, max_obstacle_radius: float = 0.2, margin: float = 0.0, periodic_x: bool = True, periodic_y: bool = True)#

Bases: Scenario

Periodic worlds in which agents travel along (constant) random target directions among randomly placed obstacles.

Parameters:
  • width (float) – The width of the (periodic) cell

  • length (float) – The length of the (periodic) cell

  • min_number_of_obstacles (int) – The minimum number of obstacles

  • max_number_of_obstacles (int) – The maximum number of obstacles

  • min_obstacle_radius (float) – The minimal obstacle radius

  • max_obstacle_radius (float) – The maximal obstacle radius

  • margin (float) – The margin between agents at initialization

  • periodic_x (bool) – Whether the world is periodic along the x-axis

  • periodic_y (bool) – Whether the world is periodic along the y-axis

init_world(world: World, seed: int | None = None) → None#

Initializes the world.

Parameters:
  • world (World) – The world

  • seed (int | None) – The seed

Return type:

None
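
A brief sketch of plugging the scenario into a navground experiment; the Experiment arguments below are assumptions about navground.sim and may need adapting:

from navground import sim
from navground.learning.scenarios import ForwardScenario

experiment = sim.Experiment(time_step=0.1, steps=300)
experiment.scenario = ForwardScenario(width=2.0, length=6.0,
                                      max_number_of_obstacles=2,
                                      min_obstacle_radius=0.1,
                                      max_obstacle_radius=0.3)
# agent groups are added to the scenario as in any navground scenario
experiment.run()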

Probes#

navground.learning.probes

class GymProbe(groups: Collection[GroupConfig])#

A probe to record observations, rewards and actions, as during a rollout. Internally, it uses an imitation.data.rollout.TrajectoryAccumulator to store the data, which it writes to datasets only at the end of the run:

  • observations/<agent_index>/<key>

  • actions/<agent_index>

  • rewards/<agent_index>

Parameters:

groups (Collection[GroupConfig]) – The configuration of the groups

classmethod with_env(env: BaseEnv | BaseParallelEnv) → Callable[[], GymProbe]#

Creates a probe factory to record all actions and observations in an environment

Parameters:

env (BaseEnv | BaseParallelEnv) – The environment

Returns:

A callable that can be added to runs or experiments, using navground.sim.ExperimentalRun.add_record_probe() or navground.sim.Experiment.add_record_probe()

Return type:

Callable[[], GymProbe]
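
A hedged usage sketch: because the returned factory takes no dataset argument, it is added below with Experiment.add_probe(); the environment construction is a placeholder (see the environment reference pages):

from navground import sim
from navground.learning.probes import GymProbe
from navground.learning.scenarios import ForwardScenario

env = ...  # a navground Gymnasium or PettingZoo environment (placeholder)

experiment = sim.Experiment(time_step=0.1, steps=300)
experiment.scenario = ForwardScenario()
experiment.add_probe(GymProbe.with_env(env))  # records observations, actions and rewards
experiment.run()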

class RewardProbe(ds: Dataset, groups: Collection[GroupConfig] = (), reward: Reward | None = None)#

A probe to record rewards to a single dataset of shape (#agents, #steps)

Parameters:
  • ds (sim.Dataset) – The dataset

  • groups (Collection[GroupConfig]) – The configuration of the groups

  • reward (Reward | None) – The reward function

dtype#

alias of float64

classmethod with_reward(reward: Reward) → Callable[[Dataset], RewardProbe]#

Creates a probe factory to record a reward

Parameters:

reward (Reward) – The reward

Returns:

A callable that can be added to runs or experiments using navground.sim.ExperimentalRun.add_record_probe() or navground.sim.Experiment.add_record_probe().

Return type:

Callable[[Dataset], RewardProbe]
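
A sketch mirroring the previous examples; SocialReward is an assumed reward function (see the Rewards functions reference):

from navground import sim
from navground.learning.probes import RewardProbe
from navground.learning.rewards import SocialReward  # assumed import
from navground.learning.scenarios import ForwardScenario

experiment = sim.Experiment(time_step=0.1, steps=300)
experiment.scenario = ForwardScenario()
experiment.add_record_probe("reward", RewardProbe.with_reward(SocialReward()))
experiment.run()
# each run now stores a dataset of shape (#agents, #steps) under the key "reward"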
