Multi-agent PettingZoo Environment#

navground.learning.parallel_env

The environment base class

Bases: NavgroundBaseEnv, ParallelEnv[int, dict[str, ndarray[Any, dtype[Any]]] | ndarray[Any, dtype[Any]], ndarray[Any, dtype[Any]]]

This class describes an environment that uses a navground.sim.Scenario to generate and then simulate a navground.sim.World.

Actions and observations relate to one or more selected navground.sim.Agent.

We provide the convenience functions parallel_env() and shared_parallel_env() to create the environment.

Parameters:
  • args (Any)

  • kwargs (Any)

Gets the action configuration for a (possibly configured) agent

Parameters:

index (int) – The agent index

Returns:

The action configuration, or None if undefined.

Return type:

ActionConfig | None

Load the class from the JSON representation

Parameters:

value (Mapping[str, Any]) – A JSON-able dict

Returns:

An instance of the class

Return type:

Self

Converts an action to a navground command for a (possibly configured) agent

Parameters:
  • index (int) – The agent index

  • action (Action) – The action

  • time_step (float) – The time step

Returns:

A control command or None if no agent is configured at the given index.

Return type:

Twist2 | None

A policy that returns the action computed by the navground agent.

Returns:

The policy.

Parameters:

index (int)

Return type:

InfoPolicy
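Because the navground command is exposed in the info dictionaries (at key "navground_action", per the reset()/step() descriptions below), such a policy can be used, for instance, to collect expert demonstrations for imitation learning. A minimal, library-agnostic sketch of extracting the expert actions from per-agent infos (the helper name and the toy values are illustrative, not part of the API):

```python
def expert_actions(infos, key="navground_action"):
    """Extract the navground expert action for each agent from the
    per-agent info dicts returned by reset()/step() (sketch)."""
    return {agent: info[key] for agent, info in infos.items() if key in info}

# toy per-agent infos, shaped like the environment's return values
infos = {0: {"navground_action": [0.1, 0.0]},
         1: {"navground_action": [0.0, 0.2]}}
actions = expert_actions(infos)
```

The extracted dict is keyed by agent index, so it can be passed directly as the `actions` argument of a subsequent step().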

Gets the observation configuration for a (possibly configured) agent

Parameters:

index (int) – The agent index

Returns:

The observation configuration, or None if undefined.

Return type:

ObservationConfig | None

Conforms to pettingzoo.utils.env.ParallelEnv.reset().

It samples a new world from a scenario and runs one dry simulation step using navground.sim.World.update_dry(). Then, it converts the agents' states to observations, and the commands they would actuate to actions, which it includes in the info dictionary at key "navground_action".

Return type:

tuple[dict[int, dict[str, ndarray[Any, dtype[Any]]] | ndarray[Any, dtype[Any]]], dict[int, dict[str, ndarray[Any, dtype[Any]]]]]

Conforms to pettingzoo.utils.env.ParallelEnv.step().

It converts the actions to commands that the navground agents actuate. Then, it updates the world for one step, calling navground.sim.World.update(). Finally, it converts the agents' states to observations and the commands they would actuate to actions, which it includes in the info dictionary at key "navground_action", and computes rewards.

Termination for individual agents is set when they complete the task, exit the boundary, or get stuck. Truncation for all agents is set when the maximal duration has passed.

Parameters:

actions (dict[int, Action])

Return type:

tuple[dict[int, dict[str, ndarray[Any, dtype[Any]]] | ndarray[Any, dtype[Any]]], dict[int, float], dict[int, bool], dict[int, bool], dict[int, dict[str, ndarray[Any, dtype[Any]]]]]
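The reset()/step() cycle described above follows the PettingZoo parallel API, with all return values keyed by agent index. The sketch below uses a toy stand-in class (not the real environment, which requires a navground scenario) purely to illustrate the shapes of the returned dicts and the "navground_action" info field:

```python
# Toy stand-in mimicking the dict-keyed PettingZoo parallel API used by
# the navground environment. NOT the real environment: it only
# illustrates the structure of the return values described above.

class ToyParallelEnv:
    def __init__(self, num_agents=2, horizon=3):
        self.agents = list(range(num_agents))
        self.horizon = horizon
        self._t = 0

    def reset(self):
        self._t = 0
        observations = {i: [0.0, 0.0] for i in self.agents}
        # like the navground env, expose the command the navground agent
        # would actuate in the info dict at key "navground_action"
        infos = {i: {"navground_action": [0.0, 0.0]} for i in self.agents}
        return observations, infos

    def step(self, actions):
        self._t += 1
        observations = {i: [float(self._t), 0.0] for i in self.agents}
        rewards = {i: 0.0 for i in self.agents}
        terminations = {i: False for i in self.agents}
        # truncation for all agents once the maximal duration has passed
        truncations = {i: self._t >= self.horizon for i in self.agents}
        infos = {i: {"navground_action": list(actions[i])}
                 for i in self.agents}
        return observations, rewards, terminations, truncations, infos

env = ToyParallelEnv()
observations, infos = env.reset()
total_reward = {i: 0.0 for i in env.agents}
done = False
while not done:
    # here a trained policy would map observations to actions; we just
    # replay the navground command taken from the info dict
    actions = {i: infos[i]["navground_action"] for i in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)
    for i, r in rewards.items():
        total_reward[i] += r
    done = all(terminations[i] or truncations[i] for i in env.agents)
```

The episode ends when every agent has either terminated or truncated, which matches the `wait=True` aggregation described for the environment parameters below.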

A JSON-able representation of the instance

Returns:

A JSON-able dict

Returns the arguments used to initialize the environment

Returns:

The initialization arguments.

The navground scenario, if set.

Create a multi-agent PettingZoo environment that uses a navground.sim.Scenario to generate and then simulate a navground.sim.World. All controlled agents share the same configuration.

>>> from navground.learning.parallel_env import shared_parallel_env
>>> from navground import sim
>>>
>>> scenario = sim.load_scenario(...)
>>> env = shared_parallel_env(scenario=scenario, ...)
Parameters:
  • scenario (sim.Scenario | str | dict[str, Any] | None) – The scenario to initialize all simulated worlds. If a str, it will be interpreted as the YAML representation of a scenario. If a dict, it will be dumped to YAML and then treated as a str.

  • max_number_of_agents (int | None) – The maximal number of agents that we will expose. It needs to be specified only for scenarios that generate worlds with a variable number of agents.

  • indices (IndicesLike) – The world indices of the agents to control. All other agents are controlled solely by navground.

  • sensor (SensorLike | None) – An optional sensor that will be added to sensors.

  • sensors (SensorSequenceLike) – A sequence of sensors to generate observations for the agents, or their YAML representations. Items of class str will be interpreted as the YAML representation of a sensor; items of class dict will be dumped to YAML and then treated as a str. If empty, it will use the agents’ own sensors.

  • action (ActionConfig) – The configuration of the action space to use.

  • observation (ObservationConfig) – The configuration of the observation space to use.

  • reward (Reward | None) – The reward function to use. If none, it will default to constant zeros.

  • time_step (float) – The simulation time step applied at every MultiAgentNavgroundEnv.step().

  • max_duration (float) – If positive, it will signal a truncation after this simulated time.

  • terminate_if_idle (bool) – Whether to terminate when an agent is idle

  • truncate_outside_bounds (bool) – Whether to truncate when an agent exits the bounds

  • bounds (Bounds | None) – The area to render, and a fence that truncates an agent’s episode when the agent exits it.

  • render_mode (str | None) – The render mode. If “human”, it renders a simulation in real time via websockets (see navground.sim.ui.WebUI). If “rgb_array”, it uses navground.sim.ui.render.image_for_world() to render the world on demand.

  • render_kwargs (Mapping[str, Any]) – Arguments passed to navground.sim.ui.render.image_for_world()

  • realtime_factor (float) – a realtime factor for render_mode=”human”: larger values speed up the simulation.

  • stuck_timeout (float) – The time to wait before considering an agent stuck.

  • color (str) – An optional color of the agents (only used for displaying)

  • tag (str) – An optional tag to be added to the agents (only used as metadata)

  • wait (bool) – Whether to signal termination/truncation only when all agents have terminated/truncated.

  • truncate_fast (bool) – Whether to signal truncation for all agents as soon as one agent truncates.

  • state (StateConfig | None) – An optional global state configuration

  • terminate_on_success (bool) – Whether to terminate the episode on success.

  • terminate_on_failure (bool) – Whether to terminate the episode on failure.

  • success_condition (TerminationCondition | None) – An optional success criteria

  • failure_condition (TerminationCondition | None) – An optional failure criteria

  • include_action (bool) – Whether to include field “navground_action” in the info

  • include_success (bool) – Whether to include field “is_success” in the info

  • init_success (bool | None) – The default value of success (valid until a termination condition is met)

  • intermediate_success (bool) – Whether to include “is_success” in the info at intermediate steps (vs only at termination).

Returns:

The multi agent navground environment.

Return type:

MultiAgentNavgroundEnv

Create a multi-agent PettingZoo environment that uses a navground.sim.Scenario to generate and then simulate a navground.sim.World.

>>> from navground.learning.parallel_env import parallel_env
>>> from navground.learning import GroupConfig
>>> from navground import sim
>>>
>>> scenario = sim.load_scenario(...)
>>> groups = [GroupConfig(...), ...]
>>> env = parallel_env(scenario=scenario, groups=groups)
Parameters:
  • scenario (sim.Scenario | str | dict[str, Any] | None) – The scenario to initialize all simulated worlds. If a str, it will be interpreted as the YAML representation of a scenario. If a dict, it will be dumped to YAML and then treated as a str.

  • groups (Collection[GroupConfig]) – The configuration of the agents controlled by the environment. All other agents are controlled solely by navground.

  • max_number_of_agents (int | None) – The maximal number of agents that we will expose. It needs to be specified only for scenarios that generate worlds with a variable number of agents.

  • time_step (float) – The simulation time step applied at every MultiAgentNavgroundEnv.step().

  • max_duration (float) – If positive, it will signal a truncation after this simulated time.

  • terminate_if_idle (bool) – Whether to terminate when an agent is idle

  • truncate_outside_bounds (bool) – Whether to truncate when an agent exits the bounds

  • bounds (Bounds | None) – The area to render, and a fence that truncates an agent’s episode when the agent exits it.

  • render_mode (str | None) – The render mode. If “human”, it renders a simulation in real time via websockets (see navground.sim.ui.WebUI). If “rgb_array”, it uses navground.sim.ui.render.image_for_world() to render the world on demand.

  • render_kwargs (Mapping[str, Any]) – Arguments passed to navground.sim.ui.render.image_for_world()

  • realtime_factor (float) – a realtime factor for render_mode=”human”: larger values speed up the simulation.

  • stuck_timeout (float) – The time to wait before considering an agent stuck.

  • wait (bool) – Whether to signal termination/truncation only when all agents have terminated/truncated.

  • truncate_fast (bool) – Whether to signal truncation for all agents as soon as one agent truncates.

  • state (StateConfig | None) – An optional global state configuration

  • include_action (bool) – Whether to include field “navground_action” in the info

  • include_success (bool) – Whether to include field “is_success” in the info

  • init_success (bool | None) – The default value of success (valid until a termination condition is met)

  • intermediate_success (bool) – Whether to include “is_success” in the info at intermediate steps (vs only at termination).

Returns:

The multi agent navground environment.

Return type:

MultiAgentNavgroundEnv

Converts a PettingZoo parallel environment to a Stable-Baselines3 vectorized environment

The multiple agents of the single PettingZoo environment (whose action/observation spaces need to be shared) are stacked as multiple single-agent environments using supersuit.pettingzoo_env_to_vec_env_v1(), and then concatenated using supersuit.concat_vec_envs_v1().

Parameters:
  • env (BaseParallelEnv) – The environment

  • num_envs (int) – The number of parallel envs to concatenate

  • processes (int) – The number of processes

  • seed (int) – The seed

  • black_death (bool) – Whether to allow dynamic number of agents

  • monitor (bool) – Whether to wrap the vector env in a VecMonitor

  • monitor_keywords (tuple[str]) – The keywords passed to VecMonitor

Returns:

The vector environment with num_envs × |agents| single-agent environments.

Return type:

VecEnv
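The size arithmetic of the resulting vector environment can be sketched in a few lines. The index layout shown is an assumption for illustration (the actual ordering is determined by supersuit's concatenation internals):

```python
def vec_size(num_envs, num_agents):
    # the vectorized env exposes num_envs * num_agents
    # single-agent environments in total
    return num_envs * num_agents

def vec_index(env_idx, agent_idx, num_agents):
    # plausible layout with all agents of one parallel env contiguous
    # (assumption: the real ordering depends on supersuit internals)
    return env_idx * num_agents + agent_idx

# e.g. 4 concatenated copies of a 5-agent parallel env
total = vec_size(4, 5)
first_agent_of_second_copy = vec_index(1, 0, 5)
```

This is why rollout buffers sized for the vectorized env must account for the agent count of the underlying parallel environment, not just `num_envs`.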

Creates a shared parallel environment using the configuration of the agent exposed in a (navground) single-agent environment.

Parameters:
  • env (BaseEnv) – The environment.

  • max_number_of_agents (int | None) – The maximal number of agents that we will expose. It needs to be specified only for scenarios that generate worlds with a variable number of agents.

  • indices (Indices | slice | list[int] | tuple[int] | set[int] | Literal['ALL']) – The world indices of the agents to control. All other agents are controlled solely by navground.

Raises:

TypeError – If env.unwrapped is not a subclass of env.NavgroundEnv

Return type:

MultiAgentNavgroundEnv

Bases: Env

Wraps a multi-agent parallel environment as a single-agent environment, stacking observations and aggregating rewards, terminations, and truncations.

Requires homogeneous observation spaces (if state=False) and action spaces.

Parameters:
  • env (ParallelEnv) – The parallel environment

  • state (bool) – Whether to return the global state as observations (vs the stacked observation).
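The stacking and aggregation described above can be sketched as follows, using plain per-agent dicts (a simplification of what the wrapper does; summing rewards and requiring all agents to finish are illustrative aggregation choices, the latter mirroring the `wait` semantics described earlier):

```python
def stack_and_aggregate(observations, rewards, terminations, truncations):
    """Stack per-agent observations into one joint observation and
    aggregate per-agent rewards/terminations/truncations into scalars,
    as a single-agent wrapper over a parallel env might do (sketch)."""
    agents = sorted(observations)
    obs = [observations[i] for i in agents]           # stacked observation
    reward = sum(rewards[i] for i in agents)          # aggregated reward
    terminated = all(terminations[i] for i in agents) # all agents done
    truncated = all(truncations[i] for i in agents)
    return obs, reward, terminated, truncated

# toy per-agent values shaped like a parallel env's step() returns
obs, reward, terminated, truncated = stack_and_aggregate(
    observations={0: [0.0], 1: [1.0]},
    rewards={0: 0.5, 1: 1.0},
    terminations={0: False, 1: True},
    truncations={0: True, 1: True},
)
```

With `state=True`, the wrapper would instead return the global state configured via StateConfig in place of the stacked observation.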