Single-agent Gymnasium Environment
navground.learning.env
Bases: NavgroundBaseEnv, Env[dict[str, ndarray[Any, dtype[Any]]] | ndarray[Any, dtype[Any]], ndarray[Any, dtype[Any]]]
This class describes an environment that uses a navground.sim.Scenario to generate and then simulate a navground.sim.World. Actions and observations relate to a single selected navground.sim.Agent.
The environment is registered under the id "navground":
>>> import gymnasium as gym
>>> import navground.learning.env
>>> from navground import sim
>>>
>>> scenario = sim.load_scenario(...)
>>> env = gym.make("navground", scenario=scenario)
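Because the environment conforms to the Gymnasium API, the usual reset/step loop applies to it. The following is a minimal sketch of such a loop, not code taken from navground: `rollout` and `StubEnv` are hypothetical names, and `StubEnv` only stands in for an environment created with `gym.make("navground", scenario=scenario)` so that the snippet runs without navground installed.

```python
from typing import Any, Callable, Optional


class StubEnv:
    """Hypothetical stand-in exposing the Gymnasium reset/step signatures."""

    def __init__(self, episode_length: int = 3) -> None:
        self.episode_length = episode_length
        self.steps = 0

    def reset(self, seed: Optional[int] = None) -> tuple:
        self.steps = 0
        # observation, info
        return 0.0, {}

    def step(self, action: Any) -> tuple:
        self.steps += 1
        truncated = self.steps >= self.episode_length
        # observation, reward, terminated, truncated, info
        return float(self.steps), 1.0, False, truncated, {}


def rollout(env: Any, policy: Callable[[Any], Any],
            seed: Optional[int] = None) -> tuple:
    """Run one episode and return (total reward, number of steps)."""
    obs, info = env.reset(seed=seed)
    total_reward, steps = 0.0, 0
    done = False
    while not done:
        action = policy(obs)
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        steps += 1
        # An episode ends on either termination or truncation.
        done = terminated or truncated
    return total_reward, steps


# The stub ignores the action, so a constant policy is enough here.
total, steps = rollout(StubEnv(episode_length=3), policy=lambda obs: 0)
```

With a real environment, `policy` would map the observation to an element of the environment's action space.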
- Parameters:
scenario (sim.Scenario | str | dict[str, Any] | None) – The scenario used to initialize all simulated worlds. A str is interpreted as the YAML representation of a scenario. A dict is dumped to YAML and then treated as a str.
agent_index (int) – The world index of the selected agent; it must be smaller than the number of agents.
sensor (SensorLike | None) – An optional sensor that will be added to sensors.
sensors (SensorSequenceLike) – A sequence of sensors used to generate observations for the agents, or its YAML representation. Items of class str are interpreted as the YAML representation of a sensor. Items of class dict are dumped to YAML and then treated as a str. If empty, the agents’ own sensors are used.
action (ActionConfig) – The configuration of the action space to use.
observation (ObservationConfig) – The configuration of the observation space to use.
reward (Reward | None) – The reward function to use. If None, it defaults to a constant zero reward.
time_step (float) – The simulation time step applied at every step().
max_duration (float) – If positive, it will signal a truncation after this simulated time.
terminate_if_idle (bool) – Whether to terminate when an agent is idle.
truncate_outside_bounds (bool) – Whether to truncate when an agent exits the bounds.
bounds (Bounds | None) – The area to render, which also acts as a fence: episodes are truncated when agents exit it.
render_mode (str | None) – The render mode. If “human”, it renders a simulation in real time via websockets (see navground.sim.ui.WebUI). If “rgb_array”, it uses navground.sim.ui.render.image_for_world() to render the world on demand.
render_kwargs (Mapping[str, Any]) – Arguments passed to navground.sim.ui.render.image_for_world().
realtime_factor (float) – A real-time factor for render_mode="human": larger values speed up the simulation.
stuck_timeout (float) – The time to wait before considering an agent stuck and terminating it.
color (str) – An optional color of the agent (only used for displaying)
tag (str) – An optional tag to be added to the agent (only used as metadata)
terminate_on_success (bool) – Whether to terminate the episode on success.
terminate_on_failure (bool) – Whether to terminate the episode on failure.
success_condition (TerminationCondition | None) – An optional success criterion
failure_condition (TerminationCondition | None) – An optional failure criterion
include_action (bool) – Whether to include field “navground_action” in the info
include_success (bool) – Whether to include field “is_success” in the info
init_success (bool | None) – The default value of success (valid until a termination condition is met)
intermediate_success (bool) – Whether to include “is_success” in the info at intermediate steps (vs only at termination).
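The keyword arguments listed above can be collected in a plain dictionary and forwarded through `gym.make`. This is a sketch with illustrative values only: none of them is claimed to be the library default, and the commented-out call assumes that `scenario` was loaded as in the example at the top of this page.

```python
from typing import Any, Dict

# Illustrative values only (assumptions, not the library defaults).
env_kwargs: Dict[str, Any] = {
    "agent_index": 0,               # must be smaller than the number of agents
    "time_step": 0.1,               # simulated time advanced by each step()
    "max_duration": 60.0,           # positive: truncate after this simulated time
    "terminate_on_success": True,
    "terminate_on_failure": True,
    "include_action": True,         # adds "navground_action" to info
    "include_success": True,        # adds "is_success" to info
    "intermediate_success": False,  # report "is_success" only at termination
    "render_mode": None,            # or "human" / "rgb_array"
}

# With navground installed and a scenario loaded:
# env = gym.make("navground", scenario=scenario, **env_kwargs)
```

Keeping the arguments in one dictionary makes it easy to reuse the same configuration for training and evaluation environments.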
Gets the action configuration for a (possible) agent
- Parameters:
index (int) – The agent index
- Returns:
The action configuration, or None if undefined.
- Return type:
ActionConfig | None
Load the class from the JSON representation
Convert action to navground command for a (possible) agent
Gets the observation configuration for a (possible) agent
- Parameters:
index (int) – The agent index
- Returns:
The observation configuration, or None if undefined.
- Return type:
ObservationConfig | None
Conforms to gymnasium.Env.reset().
It samples a new world from a scenario and runs one dry simulation step using navground.sim.World.update_dry(). Then, it converts the agent’s state to observations and the command it would actuate to an action, which it includes in the info dictionary at key “navground_action”.
Conforms to gymnasium.Env.step().
It converts the action to a command that the navground agent actuates. Then, it updates the world for one step, calling navground.sim.World.update(). Finally, it converts the agent’s state to observations and the command it would actuate to an action, which it includes in the info dictionary at key “navground_action”, and computes a reward.
Termination is set when the agent completes the task, exits the boundary, or gets stuck. Truncation is set when the maximum duration has passed.
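The flags and info fields described above can be inspected after each step to decide how an episode ended. The helper below is a hypothetical sketch (`episode_outcome` is not part of navground); it assumes the environment was created with include_success enabled, so that info may contain “is_success”, and it falls back to the Gymnasium flags when that field is missing.

```python
from typing import Any, Mapping


def episode_outcome(terminated: bool, truncated: bool,
                    info: Mapping[str, Any]) -> str:
    """Classify the result of one step() call (hypothetical helper)."""
    if not (terminated or truncated):
        return "running"
    success = info.get("is_success")
    if success is None:
        # No success flag available: fall back to the Gymnasium flags.
        return "terminated" if terminated else "truncated"
    return "success" if success else "failure"


# Example values standing in for the outputs of env.step(action).
outcome = episode_outcome(True, False, {"is_success": True})
```

Separating the classification from the rollout loop keeps the loop itself unchanged when the success and failure conditions of the environment change.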
A JSON-able representation of the instance
- Returns:
A JSON-able dict
Returns the arguments used to initialize the environment
- Returns:
The initialization arguments.
A policy that returns the action computed by the navground agent.
- Returns:
The policy.
The navground scenario, if set.