Single-agent Gymnasium Environment
navground.learning.env
Bases: NavgroundBaseEnv, Env[dict[str, ndarray[tuple[Any, …], dtype[Any]]] | ndarray[tuple[Any, …], dtype[Any]], ndarray[tuple[Any, …], dtype[Any]]]

This class describes an environment that uses a navground.sim.Scenario to generate and then simulate a navground.sim.World. Actions and observations relate to a single selected navground.sim.Agent.

The environment is registered under the id "navground":

>>> import gymnasium as gym
>>> import navground.learning.env
>>> from navground import sim
>>>
>>> scenario = sim.load_scenario(...)
>>> env = gym.make("navground", scenario=scenario)
- Parameters:
  - scenario (sim.Scenario | str | dict[str, Any] | None) – The scenario used to initialize all simulated worlds. If a str, it is interpreted as the YAML representation of a scenario. If a dict, it is dumped to YAML and then treated as a str.
  - agent_index (int) – The world index of the selected agent; must be smaller than the number of agents.
  - sensor (SensorLike | None) – An optional sensor that will be added to sensors.
  - sensors (SensorSequenceLike) – A sequence of sensors used to generate observations for the agent, or their YAML representations. Items of class str are interpreted as the YAML representation of a sensor. Items of class dict are dumped to YAML and then treated as a str. If empty, the agents' own sensors are used.
  - action (ActionConfig) – The configuration of the action space to use.
  - observation (ObservationConfig) – The configuration of the observation space to use.
  - reward (Reward | None) – The reward function to use. If None, it defaults to constant zeros.
  - time_step (float) – The simulation time step applied at every step().
  - max_duration (float) – If positive, it signals a truncation after this simulated time.
  - terminate_if_idle (bool) – Whether to terminate when the agent is idle.
  - truncate_outside_bounds (bool) – Whether to truncate when the agent exits the bounds.
  - bounds (Bounds | None) – The area to render, and a fence for truncating episodes when the agent exits it.
  - render_mode (str | None) – The render mode. If "human", it renders the simulation in real time via websockets (see navground.sim.ui.WebUI). If "rgb_array", it uses navground.sim.ui.render.image_for_world() to render the world on demand.
  - render_kwargs (Mapping[str, Any]) – Arguments passed to navground.sim.ui.render.image_for_world().
  - realtime_factor (float) – A real-time factor for render_mode="human": larger values speed up the simulation.
  - stuck_timeout (float) – The time to wait before considering an agent stuck and terminating it.
  - color (str) – An optional color for the agent (used only for display).
  - tag (str) – An optional tag added to the agent (used only as metadata).
  - terminate_on_success (bool) – Whether to terminate the episode on success.
  - terminate_on_failure (bool) – Whether to terminate the episode on failure.
  - success_condition (TerminationCondition | None) – An optional success criterion.
  - failure_condition (TerminationCondition | None) – An optional failure criterion.
  - include_action (bool) – Whether to include the field "navground_action" in the info.
  - include_success (bool) – Whether to include the field "is_success" in the info.
  - init_success (bool | None) – The default value of success (valid until a termination condition is met).
  - intermediate_success (bool) – Whether to include "is_success" in the info at intermediate steps (vs. only at termination).
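For example, the environment can be constructed through gym.make with any of the parameters above passed as keyword arguments (a minimal sketch; the scenario loading call and the specific values are placeholders):

>>> import gymnasium as gym
>>> import navground.learning.env
>>> from navground import sim
>>>
>>> scenario = sim.load_scenario(...)   # any navground scenario
>>> env = gym.make(
...     "navground",
...     scenario=scenario,
...     agent_index=0,          # control the first agent in the world
...     time_step=0.1,          # simulation step in seconds
...     max_duration=60.0,      # truncate after 60 simulated seconds
...     terminate_if_idle=True,
... )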
Gets the action configuration for a (possible) agent
- Parameters:
index (int) – The agent index
- Returns:
The action configuration, or None if undefined.
- Return type:
ActionConfig | None
Load the class from the JSON representation
- Parameters:
value (Mapping[str, Any]) – A JSON-able dict
- Returns:
An instance of the class
- Return type:
Self
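A minimal round-trip sketch, assuming the JSON-able representation is exposed as an asdict property and the loader as a from_dict classmethod (both names are assumptions based on the descriptions in this section):

>>> import json
>>> data = env.unwrapped.asdict                      # hypothetical property name
>>> text = json.dumps(data)                          # the dict is JSON-serializable
>>> restored = type(env.unwrapped).from_dict(json.loads(text))   # hypothetical classmethod name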
Converts an action to a navground command for a (possible) agent
Gets the observation configuration for a (possible) agent
- Parameters:
index (int) – The agent index
- Returns:
The observation configuration, or None if undefined.
- Return type:
ObservationConfig | None
Conforms to gymnasium.Env.reset().

It samples a new world from the scenario and runs one dry simulation step using navground.sim.World.update_dry(). Then, it converts the agent's state to observations, and the command it would actuate to an action, which it includes in the info dictionary at key "navground_action".
Conforms to gymnasium.Env.step().

It converts the action to a command that the navground agent actuates. Then, it updates the world for one step, calling navground.sim.World.update(). Finally, it converts the agent's state to observations and the command it would actuate to an action, which it includes in the info dictionary at key "navground_action", and computes a reward.

Termination is set when the agent completes the task, exits the boundary, or gets stuck. Truncation is set when the maximal duration has passed.
A JSON-able representation of the instance
- Returns:
A JSON-able dict
Returns the arguments used to initialize the environment
- Returns:
The initialization arguments.
A policy that returns the action computed by the navground agent.
- Returns:
The policy.
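A sketch of querying this expert policy, assuming it is exposed as a policy attribute with a Stable-Baselines3-style predict() method (both the attribute name and the interface are assumptions):

>>> expert = env.unwrapped.policy               # hypothetical attribute name
>>> obs, info = env.reset(seed=0)
>>> action, _ = expert.predict(obs)             # assuming an SB3-style predict() interface
>>> obs, reward, terminated, truncated, info = env.step(action)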
The navground scenario, if set.