Examples#

navground.learning.examples

Corridor#

The environment of Corridor with obstacle.

get_env(flat: bool = True, duration: float = 40.0, time_step: float = 0.1) → Env[dict[str, ndarray[Any, dtype[Any]]] | ndarray[Any, dtype[Any]], ndarray[Any, dtype[Any]]]#

Creates an environment where a single agent travels along a corridor with a single static obstacle.

Parameters:

flat (bool) – Whether the observation space is flat
duration (float) – The duration of an episode
time_step (float) – The simulation time step

Returns:

A Gymnasium environment

Return type:

Env[dict[str, ndarray[Any, dtype[Any]]] | ndarray[Any, dtype[Any]], ndarray[Any, dtype[Any]]]
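As a usage illustration, the returned object can be driven like any other Gymnasium environment. The sketch below is not taken from the library: `run_episode` and `make_corridor_env` are hypothetical helper names, and the module path `navground.learning.examples.corridor_with_obstacle` is an assumption, since this page only names the parent package. The import is deferred so the rollout helper can be tried with any Gymnasium-style environment.

```python
from typing import Any


def run_episode(env: Any, max_steps: int = 1000) -> float:
    """Roll out one episode with random actions and return the summed reward.

    `env` is any object that follows the Gymnasium `Env` API.
    """
    observation, info = env.reset(seed=0)
    total_reward = 0.0
    for _ in range(max_steps):
        # Random policy, for illustration only.
        action = env.action_space.sample()
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        if terminated or truncated:
            break
    return total_reward


def make_corridor_env() -> Any:
    """Build the Corridor environment with the documented default arguments."""
    # ASSUMPTION: the submodule name is not given in this section.
    from navground.learning.examples.corridor_with_obstacle import get_env

    return get_env(flat=True, duration=40.0, time_step=0.1)
```

Calling `run_episode(make_corridor_env())` would then run a single random episode. With `duration=40.0` and `time_step=0.1`, an episode should span roughly 400 steps, assuming the duration is what bounds the episode length.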

Crossing#

The single and multi-agent environments of Crossing.

get_env(flat: bool = True, use_acceleration_action: bool = True, multi_agent: bool = False, **kwargs: Any) → Env[dict[str, ndarray[Any, dtype[Any]]] | ndarray[Any, dtype[Any]], ndarray[Any, dtype[Any]]] | ParallelEnv[int, dict[str, ndarray[Any, dtype[Any]]] | ndarray[Any, dtype[Any]], ndarray[Any, dtype[Any]]]#

Creates an environment where 20 agents travel back and forth between waypoints, crossing in the middle.

Parameters:

flat (bool) – Whether the observation space is flat
use_acceleration_action (bool) – Whether actions are accelerations or velocities
multi_agent (bool) – Whether to expose all agents or just one.
kwargs (Any) – Arguments passed to the environment constructor

Returns:

A Parallel PettingZoo environment if multi_agent is set, else a Gymnasium environment.

Return type:

Env[dict[str, ndarray[Any, dtype[Any]]] | ndarray[Any, dtype[Any]], ndarray[Any, dtype[Any]]] | ParallelEnv[int, dict[str, ndarray[Any, dtype[Any]]] | ndarray[Any, dtype[Any]], ndarray[Any, dtype[Any]]]
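When `multi_agent` is set, the result follows the PettingZoo Parallel API, so actions and rewards are dictionaries keyed by agent. The following is a minimal sketch of such a rollout. `run_parallel_episode` is a hypothetical helper, not part of navground, and the random policy is only a placeholder.

```python
from typing import Any


def run_parallel_episode(env: Any, max_steps: int = 1000) -> dict:
    """Roll out one episode with random actions; return summed reward per agent.

    `env` is any object that follows the PettingZoo `ParallelEnv` API.
    """
    observations, infos = env.reset(seed=0)
    totals = {agent: 0.0 for agent in env.agents}
    for _ in range(max_steps):
        if not env.agents:
            # PettingZoo empties `agents` once the episode is over.
            break
        # One action per live agent, sampled from that agent's action space.
        actions = {agent: env.action_space(agent).sample() for agent in env.agents}
        observations, rewards, terminations, truncations, infos = env.step(actions)
        for agent, reward in rewards.items():
            totals[agent] = totals.get(agent, 0.0) + float(reward)
    return totals
```

An environment obtained from `get_env(multi_agent=True)` could be passed to this helper directly. With `multi_agent=False`, the single-agent Gymnasium loop applies instead.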

Pad#

The single and multi-agent environments, the reward, and the sensors of Exclusive crossing on a pad.

Environments#

get_env(action: ControlActionConfig, observation: ObservationConfig, sensors: SensorSequenceLike = (), reward: Reward = PadReward(pad_penalty=10, neighbor_weight=0), max_duration: float = 20, time_step: float = 0.1, init_success: bool = False, intermediate_success: bool = False, include_success: bool = True, render_mode: str | None = None, render_kwargs: dict = {}, state: StateConfig | None = None, multi_agent: bool = True, **kwargs: Any) → BaseEnv | BaseParallelEnv#

Creates an environment where 2 agents cross each other along a corridor containing a pad that should not be entered by more than one agent at a time.

Parameters:

action (ControlActionConfig) – The action config
observation (ObservationConfig) – The observation config
sensors (SensorSequenceLike) – The sensors
reward (Reward) – The reward function
max_duration (float) – The maximal duration [s]
time_step (float) – The time step [s]
init_success (bool) – The initialization value for intermediate success
intermediate_success (bool) – Whether to return intermediate success
include_success (bool) – Whether to include success
render_mode (str | None) – The render mode
render_kwargs (dict) – The rendering keyword arguments
state (StateConfig | None) – The global state config (only relevant if multi_agent=True)
multi_agent (bool) – Whether the environment controls both agents
kwargs (Any) – Keyword arguments passed to navground.learning.scenarios.PadScenario.

Returns:

A Parallel PettingZoo environment if multi_agent is set, else a Gymnasium environment.

Return type:

BaseEnv | BaseParallelEnv
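To show how the pieces documented on this page fit together, here is a hedged sketch that assembles a Pad environment from the sensor and reward factories. `make_pad_env` is a hypothetical wrapper, and the module path `navground.learning.examples.pad` is an assumption. The action and observation configs are taken as arguments because their constructors are not described here. All keyword values repeat the documented defaults.

```python
from typing import Any


def make_pad_env(action: Any, observation: Any, multi_agent: bool = True) -> Any:
    """Assemble a Pad environment from caller-supplied action/observation configs."""
    # ASSUMPTION: the submodule name is not given in this section.
    from navground.learning.examples.pad import PadReward, get_env, marker, neighbor

    # Sensors for the pad and for the other agent, with the documented defaults.
    sensors = (
        marker(min_x=-1, max_x=1),
        neighbor(range=10, max_speed=0.166),
    )
    return get_env(
        action=action,
        observation=observation,
        sensors=sensors,
        reward=PadReward(pad_penalty=10, neighbor_weight=0),
        max_duration=20,
        time_step=0.1,
        multi_agent=multi_agent,
    )
```

Passing `multi_agent=False` should yield a Gymnasium environment rather than a Parallel PettingZoo one, as stated under Returns.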

Reward#

class PadReward(pad_penalty: float = 10, neighbor_weight: float = 0)#

Bases: EfficacyReward

An efficacy reward that also penalizes by pad_penalty when both agents are inside the pad area.

When neighbor_weight > 0, it includes the efficacy of the neighbor, weighted accordingly.

Parameters:

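pad_penalty (float) – The penalty applied when both agents are inside the pad area
neighbor_weight (float) – The weight of the neighbor's efficacy in the reward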
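The description above can be read as the following arithmetic. This is only an illustrative sketch, not the library's implementation: `pad_reward_sketch` is a hypothetical function, the efficacy values are assumed to be given, and the purely additive combination (including how `EfficacyReward` scales or offsets its own term) is an assumption.

```python
def pad_reward_sketch(
    own_efficacy: float,
    neighbor_efficacy: float,
    both_on_pad: bool,
    pad_penalty: float = 10.0,
    neighbor_weight: float = 0.0,
) -> float:
    """Illustrative reading of PadReward; the real computation may differ."""
    reward = own_efficacy
    if neighbor_weight > 0:
        # Include the neighbor's efficacy, weighted by neighbor_weight.
        reward += neighbor_weight * neighbor_efficacy
    if both_on_pad:
        # Penalize the case where both agents are inside the pad area.
        reward -= pad_penalty
    return reward
```

Under this reading, the default `neighbor_weight=0` gives each agent a purely individual reward, while a positive weight makes the agents share part of each other's progress.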

Sensors#

comm(size: int = 1, name: str = 'neighbor', binarize: bool = False) → Sensor#

The sensor that receives messages broadcast by the neighbor.

Parameters:

size (int) – The size of the message
name (str) – The namespace
binarize (bool) – Whether to binarize the message

Returns:

The sensor

Return type:

Sensor

marker(min_x: float = -1, max_x: float = 1) → MarkerStateEstimation#

The sensor that detects the pad.

Parameters:

min_x (float) – The lower bound of the relative (horizontal) position.
max_x (float) – The upper bound of the relative (horizontal) position.

Returns:

The sensor

Return type:

MarkerStateEstimation

neighbor(range: float = 10, max_speed: float = 0.166) → DiscsStateEstimation#

The sensor that detects the neighbor.

Parameters:

range (float) – The range
max_speed (float) – The neighbor maximum speed

Returns:

The sensor

Return type:

DiscsStateEstimation
