Multi-agent PettingZoo Environment
navground.learning.parallel_env
Bases: NavgroundBaseEnv, ParallelEnv[int, dict[str, ndarray[Any, dtype[Any]]] | ndarray[Any, dtype[Any]], ndarray[Any, dtype[Any]]]

This class describes an environment that uses a navground.sim.Scenario to generate and then simulate a navground.sim.World. Actions and observations relate to one or more selected navground.sim.Agent.

We provide the convenience functions parallel_env() and shared_parallel_env() to create the environment.

- Parameters:
args (Any)
kwargs (Any)
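Being a PettingZoo ParallelEnv, the class exposes the usual per-agent surface (possible_agents, action_space(), observation_space()), with integer agent identifiers. A minimal inspection sketch, assuming the environment has been created with one of the convenience functions below:

>>> env.possible_agents                            # integer identifiers of the controlled agents
>>> env.action_space(env.possible_agents[0])       # per-agent action space
>>> env.observation_space(env.possible_agents[0])  # per-agent observation space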
Gets the action configuration for a (possible) agent
- Parameters:
index (int) – The agent index
- Returns:
The action configuration, or None if undefined.
- Return type:
ActionConfig | None
Load the class from the JSON representation
Convert action to navground command for a (possible) agent
A policy that returns the action computed by the navground agent.
- Returns:
The policy.
- Parameters:
index (int)
- Return type:
Gets the observation configuration for a (possible) agent
- Parameters:
index (int) – The agent index
- Returns:
The observation configuration, or None if undefined.
- Return type:
ObservationConfig | None
Conforms to pettingzoo.utils.env.ParallelEnv.reset().

It samples a new world from the scenario and runs one dry simulation step using navground.sim.World.update_dry(). Then, it converts the agents' states to observations, and the commands they would actuate to actions, which it includes in the info dictionary at key "navground_action".
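Because the navground-computed commands are already reported at reset, they can be read from the returned infos; a short sketch, with the environment construction elided:

>>> observations, infos = env.reset(seed=0)
>>> # one entry per controlled agent, keyed by the agent index
>>> navground_actions = {agent: info["navground_action"] for agent, info in infos.items()}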
Conforms to pettingzoo.utils.env.ParallelEnv.step().

It converts the actions to commands that the navground agents actuate. Then, it updates the world for one step, calling navground.sim.World.update(). Finally, it converts the agents' states to observations and the commands they would actuate to actions, which it includes in the info dictionary at key "navground_action", and computes rewards.

Termination for individual agents is set when they complete their task, exit the boundary, or get stuck. Truncation for all agents is set when the maximal duration has passed.
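A typical rollout loop following the PettingZoo Parallel API, sampling random actions as a stand-in for a learned policy (a sketch; the environment construction is elided):

>>> observations, infos = env.reset(seed=0)
>>> while env.agents:
...     # one action per currently active agent
...     actions = {agent: env.action_space(agent).sample() for agent in env.agents}
...     observations, rewards, terminations, truncations, infos = env.step(actions)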
A JSON-able representation of the instance
- Returns:
A JSON-able dict
Returns the arguments used to initialize the environment
- Returns:
The initialization arguments.
The navground scenario, if set.
Create a multi-agent PettingZoo environment that uses a navground.sim.Scenario to generate and then simulate a navground.sim.World. All controlled agents share the same configuration.

>>> from navground.learning.parallel_env import shared_parallel_env
>>> from navground import sim
>>>
>>> scenario = sim.load_scenario(...)
>>> env = shared_parallel_env(scenario=scenario, ...)
- Parameters:
scenario (sim.Scenario | str | dict[str, Any] | None) – The scenario to initialize all simulated worlds. If a str, it will be interpreted as the YAML representation of a scenario. If a dict, it will be dumped to YAML and then treated as a str.
max_number_of_agents (int | None) – The maximal number of agents that we will expose. It needs to be specified only for scenarios that generate worlds with a variable number of agents.
indices (IndicesLike) – The world indices of the agents to control. All other agents are controlled solely by navground.
sensor (SensorLike | None) – An optional sensor that will be added to sensors.
sensors (SensorSequenceLike) – A sequence of sensors to generate observations for the agents, or their YAML representation. Items of class str will be interpreted as the YAML representation of a sensor. Items of class dict will be dumped to YAML and then treated as a str. If empty, it will use the agents’ own sensors.
action (ActionConfig) – The configuration of the action space to use.
observation (ObservationConfig) – The configuration of the observation space to use.
reward (Reward | None) – The reward function to use. If None, it will default to constant zeros.
time_step (float) – The simulation time step applied at every MultiAgentNavgroundEnv.step().
max_duration (float) – If positive, it will signal a truncation after this simulated time.
terminate_if_idle (bool) – Whether to terminate when an agent is idle
truncate_outside_bounds (bool) – Whether to truncate when an agent exits the bounds
bounds (Bounds | None) – The area to render and a fence used for truncation when agents exit it.
render_mode (str | None) – The render mode. If “human”, it renders a simulation in real time via websockets (see navground.sim.ui.WebUI). If “rgb_array”, it uses navground.sim.ui.render.image_for_world() to render the world on demand.
render_kwargs (Mapping[str, Any]) – Arguments passed to navground.sim.ui.render.image_for_world()
realtime_factor (float) – A real-time factor for render_mode=”human”: larger values speed up the simulation.
stuck_timeout (float) – The time to wait before considering an agent stuck.
color (str) – An optional color of the agents (only used for display)
tag (str) – An optional tag to be added to the agents (only used as metadata)
wait (bool) – Whether to signal termination/truncation only when all agents have terminated/truncated.
truncate_fast (bool) – Whether to signal truncation for all agents as soon as one agent truncates.
state (StateConfig | None) – An optional global state configuration
terminate_on_success (bool) – Whether to terminate the episode on success.
terminate_on_failure (bool) – Whether to terminate the episode on failure.
success_condition (TerminationCondition | None) – An optional success criterion
failure_condition (TerminationCondition | None) – An optional failure criterion
include_action (bool) – Whether to include field “navground_action” in the info
include_success (bool) – Whether to include field “is_success” in the info
init_success (bool | None) – The default value of success (valid until a termination condition is met)
intermediate_success (bool) – Whether to include “is_success” in the info at intermediate steps (vs only at termination).
- Returns:
The multi-agent navground environment.
- Return type:
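As a fuller illustration of the parameters documented above, a hedged sketch of a call (the configuration objects are elided with ..., and the numeric values are illustrative, not defaults):

>>> env = shared_parallel_env(
...     scenario=scenario,       # or its YAML / dict representation
...     indices=...,             # which world agents to control
...     sensors=[...],           # sensors used to build observations
...     action=...,              # an ActionConfig
...     observation=...,         # an ObservationConfig
...     reward=...,              # a Reward; None defaults to constant zeros
...     time_step=0.1,           # illustrative value
...     max_duration=60.0,       # illustrative value
...     render_mode="rgb_array")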
Create a multi-agent PettingZoo environment that uses a navground.sim.Scenario to generate and then simulate a navground.sim.World.

>>> from navground.learning.parallel_env import parallel_env
>>> from navground.learning import GroupConfig
>>> from navground import sim
>>>
>>> scenario = sim.load_scenario(...)
>>> groups = [GroupConfig(...), ...]
>>> env = parallel_env(scenario=scenario, groups=groups)
- Parameters:
scenario (sim.Scenario | str | dict[str, Any] | None) – The scenario to initialize all simulated worlds. If a str, it will be interpreted as the YAML representation of a scenario. If a dict, it will be dumped to YAML and then treated as a str.
groups (Collection[GroupConfig]) – The configuration of the agents controlled by the environment. All other agents are controlled solely by navground.
max_number_of_agents (int | None) – The maximal number of agents that we will expose. It needs to be specified only for scenarios that generate worlds with a variable number of agents.
time_step (float) – The simulation time step applied at every MultiAgentNavgroundEnv.step().
max_duration (float) – If positive, it will signal a truncation after this simulated time.
terminate_if_idle (bool) – Whether to terminate when an agent is idle
truncate_outside_bounds (bool) – Whether to truncate when an agent exits the bounds
bounds (Bounds | None) – The area to render and a fence used for truncation when agents exit it.
render_mode (str | None) – The render mode. If “human”, it renders a simulation in real time via websockets (see navground.sim.ui.WebUI). If “rgb_array”, it uses navground.sim.ui.render.image_for_world() to render the world on demand.
render_kwargs (Mapping[str, Any]) – Arguments passed to navground.sim.ui.render.image_for_world()
realtime_factor (float) – A real-time factor for render_mode=”human”: larger values speed up the simulation.
stuck_timeout (float) – The time to wait before considering an agent stuck.
wait (bool) – Whether to signal termination/truncation only when all agents have terminated/truncated.
truncate_fast (bool) – Whether to signal truncation for all agents as soon as one agent truncates.
state (StateConfig | None) – An optional global state configuration
include_action (bool) – Whether to include field “navground_action” in the info
include_success (bool) – Whether to include field “is_success” in the info
init_success (bool | None) – The default value of success (valid until a termination condition is met)
intermediate_success (bool) – Whether to include “is_success” in the info at intermediate steps (vs only at termination).
- Returns:
The multi-agent navground environment.
- Return type:
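The heterogeneous variant takes one GroupConfig per controlled group; a hedged sketch with two groups (the GroupConfig fields are elided, since they are not documented in this section, and the numeric values are illustrative):

>>> groups = [GroupConfig(...), GroupConfig(...)]  # e.g. two differently configured groups
>>> env = parallel_env(
...     scenario=scenario,
...     groups=groups,
...     time_step=0.1,       # illustrative value
...     max_duration=60.0)   # illustrative value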
Converts a PettingZoo Parallel environment to a Stable-Baselines3 vectorized environment.

The multiple agents of the single PettingZoo environment (action/observation spaces need to be shared) are stacked as multiple single-agent environments using supersuit.pettingzoo_env_to_vec_env_v1(), and then concatenated using supersuit.concat_vec_envs_v1().

- Parameters:
env (BaseParallelEnv) – The environment
num_envs (int) – The number of parallel envs to concatenate
processes (int) – The number of processes
seed (int) – The seed
black_death (bool) – Whether to allow a dynamic number of agents
monitor (bool) – Whether to wrap the vector env in a VecMonitor
monitor_keywords (tuple[str]) – The keywords passed to VecMonitor
- Returns:
The vector environment with num_envs x |agents| single-agent environments.
- Return type:
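For instance, the resulting vectorized environment can be trained with Stable-Baselines3 directly. A hedged sketch, assuming the converter described above is exposed as make_vec_from_penv (the function name is not shown in this extract) and that observations are dict-shaped (hence "MultiInputPolicy"):

>>> from navground.learning.parallel_env import make_vec_from_penv  # name assumed, see lead-in
>>> from stable_baselines3 import PPO
>>>
>>> venv = make_vec_from_penv(penv, num_envs=4, processes=1)
>>> model = PPO("MultiInputPolicy", venv)  # use "MlpPolicy" for flat observations
>>> model.learn(total_timesteps=10_000)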
Creates a shared parallel environment using the configuration of the agent exposed in a (navground) single-agent environment.
- Parameters:
env (BaseEnv) – The environment.
max_number_of_agents (int | None) – The maximal number of agents that we will expose. It needs to be specified only for scenarios that generate worlds with a variable number of agents.
indices (Indices | slice | list[int] | tuple[int] | set[int] | Literal['ALL']) – The world indices of the agents to control. All other agents are controlled solely by navground.
- Raises:
TypeError – If env.unwrapped is not a subclass of env.NavgroundEnv
- Return type:
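A hedged usage sketch; the helper's actual name is not shown in this extract, so make_shared_parallel_env_with_env below is a hypothetical placeholder:

>>> penv = make_shared_parallel_env_with_env(  # hypothetical name, see lead-in
...     env=single_agent_env,                  # a (navground) single-agent environment
...     indices="ALL")                         # control every agent in the world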
Bases: Env

Wraps a multi-agent parallel environment as a single-agent environment, stacking observations but aggregating rewards, terminations and truncations.

Requires homogeneous observation spaces (if state=False) and action spaces.

- Parameters:
env (ParallelEnv) – The parallel environment
state (bool) – Whether to return the global state as observations (vs the stacked observation).
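A hedged usage sketch of the wrapper; its class name is not shown in this extract, so JointEnv below is a hypothetical placeholder. It follows the single-agent Gymnasium API:

>>> joint = JointEnv(env=penv, state=False)  # hypothetical class name, see lead-in
>>> obs, info = joint.reset(seed=0)          # stacked observations of all agents
>>> obs, reward, terminated, truncated, info = joint.step(joint.action_space.sample())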