Multi-agent PettingZoo Environment

navground.learning.parallel_env

The environment base class

This class describes an environment that uses a navground.sim.Scenario to generate and then simulate a navground.sim.World.

Actions and observations relate to one or more selected navground.sim.Agent instances.

Instead of initializing the class directly, use the convenience functions parallel_env() and shared_parallel_env() to create the environment.

Parameters:
  • args (Any)

  • kwargs (Any)

Load the class from the JSON representation

Parameters:

value (Mapping[str, Any]) – A JSON-able dict

Returns:

An instance of the class

Return type:

Self

A policy that returns the action computed by the navground agent.

Returns:

The policy.

Parameters:

index (int)

Return type:

InfoPolicy

Conforms to pettingzoo.utils.env.ParallelEnv.reset().

It samples a new world from the scenario and runs one dry simulation step using navground.sim.World.update_dry(). Then, it converts the agents' states to observations and the commands they would actuate to actions, which it includes in the info dictionary at key “navground_action”.

Parameters:
Return type:

tuple[dict[int, dict[str, ndarray[Any, dtype[Any]]] | ndarray[Any, dtype[Any]]], dict[int, dict[str, ndarray[Any, dtype[Any]]]]]
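
The info dictionary described above can be used, for example, to collect the actions that navground itself would have applied. The following is a minimal sketch: the helper name expert_actions is hypothetical, and the infos value is a hand-written stand-in for the second element returned by reset() (real values would be arrays produced by the environment).

```python
from typing import Any, Mapping


def expert_actions(infos: Mapping[int, Mapping[str, Any]]) -> dict[int, Any]:
    """Collect, per agent index, the action stored at key "navground_action".

    Agents whose info dictionary lacks the key are skipped.
    """
    return {
        index: info["navground_action"]
        for index, info in infos.items()
        if "navground_action" in info
    }


# Hand-written stand-in for the infos returned by reset(); NOT real output.
infos = {
    0: {"navground_action": [0.5, 0.0]},
    1: {"navground_action": [0.2, -0.1]},
}
actions = expert_actions(infos)
```

Here actions maps each agent index to the action found in its info dictionary, which is the same shape that step() expects for its actions argument.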

Conforms to pettingzoo.utils.env.ParallelEnv.step().

It converts the actions to commands that the navground agents actuate. Then, it updates the world for one step, calling navground.sim.World.update(). Finally, it converts the agents' states to observations and the commands they would actuate to actions, which it includes in the info dictionary at key “navground_action”, and computes rewards.

Termination for individual agents is set when they complete the task, exit the boundary, or get stuck. Truncation for all agents is set when the maximal duration has passed.

Parameters:

actions (dict[int, Action])

Return type:

tuple[dict[int, dict[str, ndarray[Any, dtype[Any]]] | ndarray[Any, dtype[Any]]], dict[int, float], dict[int, bool], dict[int, bool], dict[int, dict[str, ndarray[Any, dtype[Any]]]]]
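
A rollout loop over the five values returned by step() might look like the sketch below. To keep it runnable without navground, it uses StubParallelEnv, a toy stand-in invented for this example: it only mimics the return shapes, its termination rule is made up, and it counts steps instead of simulated time. With a real environment, only the rollout() function would be relevant.

```python
class StubParallelEnv:
    """Toy stand-in mimicking the reset()/step() return shapes.

    NOT the navground environment: observations are dummy numbers and
    the termination rule below is invented for illustration.
    """

    def __init__(self, n_agents=3, max_steps=3):
        self.n_agents = n_agents
        self.max_steps = max_steps  # plays the role of the maximal duration
        self.agents = []
        self.steps = 0

    def reset(self, seed=None):
        self.steps = 0
        self.agents = list(range(self.n_agents))
        obs = {i: 0.0 for i in self.agents}
        infos = {i: {"navground_action": 0.0} for i in self.agents}
        return obs, infos

    def step(self, actions):
        self.steps += 1
        acting = list(actions)
        obs = {i: float(self.steps) for i in acting}
        rewards = {i: 0.0 for i in acting}
        # Invented rule: agent i "completes its task" after i + 2 steps.
        terminated = {i: self.steps >= i + 2 for i in acting}
        # Truncation is set for all agents at once when time runs out.
        truncated = {i: self.steps >= self.max_steps for i in acting}
        infos = {i: {"navground_action": 0.0} for i in acting}
        self.agents = [i for i in acting if not (terminated[i] or truncated[i])]
        return obs, rewards, terminated, truncated, infos


def rollout(env):
    """Step until no agent is left; replay the action stored in the infos."""
    obs, infos = env.reset()
    steps = 0
    finished = {}
    while env.agents:
        actions = {i: infos[i]["navground_action"] for i in env.agents}
        obs, rewards, terminated, truncated, infos = env.step(actions)
        steps += 1
        for i in actions:
            if terminated[i] or truncated[i]:
                finished[i] = "terminated" if terminated[i] else "truncated"
    return steps, finished


steps, finished = rollout(StubParallelEnv(n_agents=3, max_steps=3))
```

In this toy run, agents 0 and 1 end by termination and agent 2 by truncation, which illustrates that termination is tracked per agent while truncation applies to every agent still active.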

A JSON-able representation of the instance

Returns:

A JSON-able dict

Returns the arguments used to initialize the environment

Returns:

The initialization arguments.

The navground scenario, if set.

Create a multi-agent PettingZoo environment that uses a navground.sim.Scenario to generate and then simulate a navground.sim.World. All controlled agents share the same configuration.

>>> from navground.learning.parallel_env import shared_parallel_env
>>> from navground import sim
>>>
>>> scenario = sim.load_scenario(...)
>>> env = shared_parallel_env(scenario=scenario, ...)
Parameters:
  • scenario (sim.Scenario | str | dict[str, Any] | None) – The scenario to initialize all simulated worlds. If a str, it will be interpreted as the YAML representation of a scenario. If a dict, it will be dumped to YAML and then treated as a str.

  • max_number_of_agents (int | None) – The maximal number of agents that we will expose. It needs to be specified only for scenarios that generate worlds with a variable number of agents.

  • indices (IndicesLike) – The world indices of the agents to control. All other agents are controlled solely by navground.

  • sensor (sim.Sensor | str | dict[str, Any] | None) – A sensor to produce observations for the selected agents. If a str, it will be interpreted as the YAML representation of a sensor. If a dict, it will be dumped to YAML and then treated as a str. If None, it will use the agents’ own state estimation, provided that it is a sensor.

  • action (ActionConfig) – The configuration of the action space to use.

  • observation (ObservationConfig) – The configuration of the observation space to use.

  • reward (Reward | None) – The reward function to use. If None, it will default to constant zeros.

  • time_step (float) – The simulation time step applied at every MultiAgentNavgroundEnv.step().

  • max_duration (float) – If positive, it will signal a truncation after this simulated time.

  • terminate_outside_bounds (bool) – Whether to terminate when an agent exits the bounds.

  • bounds (Bounds | None) – The area to render and a fence for truncating processes when agents exit it.

  • render_mode (str | None) – The render mode. If “human”, it renders a simulation in real time via websockets (see navground.sim.ui.WebUI). If “rgb_array”, it uses navground.sim.ui.render.image_for_world() to render the world on demand.

  • render_kwargs (Mapping[str, Any]) – Arguments passed to navground.sim.ui.render.image_for_world()

  • realtime_factor (float) – A real-time factor for render_mode=“human”: larger values speed up the simulation.

  • stuck_timeout (float) – The time to wait before considering an agent stuck.

  • color (str) – An optional color of the agents (only used for displaying)

  • tag (str) – An optional tag to be added to the agents (only used as metadata)

Returns:

The multi-agent navground environment.

Return type:

MultiAgentNavgroundEnv
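
Since shared_parallel_env() accepts many keyword arguments, it can help to keep them in one place and catch misspelled names early. The sketch below is one way to do that. The helpers shared_env_kwargs and make_shared_env are hypothetical, the keyword names are copied from the parameter list above, and the values are illustrative only (they are not the library defaults). In particular, passing "ALL" for indices assumes that IndicesLike accepts that literal.

```python
# Keyword names copied from the parameter list above ("scenario" is
# passed separately by make_shared_env below).
DOCUMENTED_KWARGS = frozenset({
    "max_number_of_agents", "indices", "sensor", "action", "observation",
    "reward", "time_step", "max_duration", "terminate_outside_bounds",
    "bounds", "render_mode", "render_kwargs", "realtime_factor",
    "stuck_timeout", "color", "tag",
})


def shared_env_kwargs(**overrides):
    """Merge illustrative settings with overrides, rejecting unknown names."""
    unknown = set(overrides) - DOCUMENTED_KWARGS
    if unknown:
        raise TypeError(f"undocumented keyword(s): {sorted(unknown)}")
    # Illustrative values only; they are NOT the library defaults.
    kwargs = {
        "indices": "ALL",       # assumption: IndicesLike accepts "ALL"
        "time_step": 0.1,       # simulated seconds per step()
        "max_duration": 60.0,   # positive, so truncation is enabled
        "render_mode": None,    # no rendering
    }
    kwargs.update(overrides)
    return kwargs


def make_shared_env(scenario, **overrides):
    """Create the environment; requires navground.learning to be installed."""
    # Imported lazily so the helper above works without navground.
    from navground.learning.parallel_env import shared_parallel_env

    return shared_parallel_env(scenario=scenario, **shared_env_kwargs(**overrides))


kwargs = shared_env_kwargs(max_duration=30.0, color="red")
```

The action and observation configurations are deliberately left out of the illustrative values; they would be passed as overrides.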

Create a multi-agent PettingZoo environment that uses a navground.sim.Scenario to generate and then simulate a navground.sim.World.

>>> from navground.learning.parallel_env import parallel_env
>>> from navground.learning import GroupConfig
>>> from navground import sim
>>>
>>> scenario = sim.load_scenario(...)
>>> groups = [GroupConfig(...), ...]
>>> env = parallel_env(scenario=scenario, groups=groups)
Parameters:
  • scenario (sim.Scenario | str | dict[str, Any] | None) – The scenario to initialize all simulated worlds. If a str, it will be interpreted as the YAML representation of a scenario. If a dict, it will be dumped to YAML and then treated as a str.

  • groups (Collection[GroupConfig]) – The configuration of the agents controlled by the environment. All other agents are controlled solely by navground.

  • max_number_of_agents (int | None) – The maximal number of agents that we will expose. It needs to be specified only for scenarios that generate worlds with a variable number of agents.

  • time_step (float) – The simulation time step applied at every MultiAgentNavgroundEnv.step().

  • max_duration (float) – If positive, it will signal a truncation after this simulated time.

  • terminate_outside_bounds (bool) – Whether to terminate when an agent exits the bounds.

  • bounds (Bounds | None) – The area to render and a fence for truncating processes when agents exit it.

  • render_mode (str | None) – The render mode. If “human”, it renders a simulation in real time via websockets (see navground.sim.ui.WebUI). If “rgb_array”, it uses navground.sim.ui.render.image_for_world() to render the world on demand.

  • render_kwargs (Mapping[str, Any]) – Arguments passed to navground.sim.ui.render.image_for_world()

  • realtime_factor (float) – A real-time factor for render_mode=“human”: larger values speed up the simulation.

  • stuck_timeout (float) – The time to wait before considering an agent stuck.

Returns:

The multi-agent navground environment.

Return type:

MultiAgentNavgroundEnv

Converts a PettingZoo Parallel environment to a Stable-Baselines3 vectorized environment.

The multiple agents of the single PettingZoo environment (whose action and observation spaces need to be shared) are stacked as multiple environments of a single agent using supersuit.pettingzoo_env_to_vec_env_v1(), and then concatenated using supersuit.concat_vec_envs_v1().

Parameters:
  • env (BaseParallelEnv) – The environment

  • num_envs (int) – The number of parallel envs to concatenate

  • processes (int) – The number of processes

Returns:

The vector environment with num_envs x |agents| single-agent environments.

Return type:

VecEnv
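
The two SuperSuit calls named above can be combined roughly as in the following sketch. It is not the library's implementation: the helper names to_vec_env and total_single_agent_envs are hypothetical, and mapping the processes argument to SuperSuit's num_cpus is an assumption.

```python
def total_single_agent_envs(num_agents: int, num_envs: int) -> int:
    """Size of the resulting vector environment: num_envs x |agents|."""
    return num_envs * num_agents


def to_vec_env(env, num_envs=1, processes=1):
    """Stack the agents of one parallel env, then concatenate copies.

    Requires supersuit and stable_baselines3; imported lazily so that
    the counting helper above works without them.
    """
    import supersuit as ss

    # Each agent becomes one single-agent environment of a vector env.
    venv = ss.pettingzoo_env_to_vec_env_v1(env)
    # Concatenate num_envs copies into one Stable-Baselines3 VecEnv.
    # Assumption: `processes` corresponds to SuperSuit's `num_cpus`.
    return ss.concat_vec_envs_v1(
        venv, num_envs, num_cpus=processes, base_class="stable_baselines3"
    )


# With 20 controlled agents and 4 concatenated copies:
size = total_single_agent_envs(num_agents=20, num_envs=4)
```

A policy trained on the resulting VecEnv sees each agent as an independent single-agent environment, which is why the action and observation spaces have to be shared.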

Creates a shared parallel environment using the configuration of the agent exposed in a (navground) single-agent environment.

Parameters:
  • env (BaseEnv) – The environment.

  • max_number_of_agents (int | None) – The maximal number of agents that we will expose. It needs to be specified only for scenarios that generate worlds with a variable number of agents.

  • indices (Indices | slice | list[int] | tuple[int] | set[int] | Literal['ALL']) – The world indices of the agents to control. All other agents are controlled solely by navground.

Raises:

TypeError – If env.unwrapped is not a subclass of env.NavgroundEnv

Return type:

MultiAgentNavgroundEnv
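
The forms accepted for indices (a slice, a collection of integers, or the literal 'ALL') can be read as different ways of selecting positions in the world's list of agents. The sketch below illustrates that reading with a hypothetical helper, select_agents; it does not use the library's Indices class, and dropping out-of-range indices is an assumption made for the example.

```python
def select_agents(indices, number_of_agents):
    """Resolve an indices-like value to a sorted list of world indices."""
    all_indices = range(number_of_agents)
    if isinstance(indices, str):
        if indices != "ALL":
            raise ValueError(f"unsupported literal: {indices!r}")
        return list(all_indices)
    if isinstance(indices, slice):
        # Slicing a range applies start/stop/step like on a list.
        return list(all_indices[indices])
    # list, tuple or set of integers; assumption: ignore out-of-range values.
    return sorted(i for i in set(indices) if i in all_indices)


everyone = select_agents("ALL", 3)
every_other = select_agents(slice(0, None, 2), 5)
some = select_agents({3, 1, 7}, 4)
```

Agents that are not selected would remain controlled solely by navground, as stated in the parameter description.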