Single-agent Gymnasium Environment

navground.learning.env

type BaseEnv = gymnasium.Env[Observation, Action]

The environment base class.

class NavgroundEnv(scenario: sim.Scenario | str | dict[str, Any] | None = None, agent_index: int = 0, sensor: SensorLike | None = None, sensors: SensorSequenceLike = (), action: ActionConfig = ControlActionConfig(), observation: ObservationConfig = DefaultObservationConfig(ignore_keys=[]), reward: Reward | None = None, time_step: float = 0.1, max_duration: float = -1.0, terminate_if_idle: bool = True, bounds: Bounds | None = None, truncate_outside_bounds: bool = False, render_mode: str | None = None, render_kwargs: Mapping[str, Any] = {}, realtime_factor: float = 1.0, stuck_timeout: float = -1, color: str = '', tag: str = '', terminate_on_success: bool = True, terminate_on_failure: bool = True, success_condition: TerminationCondition | None = None, failure_condition: TerminationCondition | None = None, include_action: bool = True, include_success: bool = True, init_success: bool | None = None, intermediate_success: bool = False)

Bases: NavgroundBaseEnv, Env[dict[str, ndarray[Any, dtype[Any]]] | ndarray[Any, dtype[Any]], ndarray[Any, dtype[Any]]]

This class describes an environment that uses a navground.sim.Scenario to generate and then simulate a navground.sim.World.

Actions and observations relate to a single selected navground.sim.Agent.

The environment is registered under the id "navground":

>>> import gymnasium as gym
>>> import navground.learning.env
>>> from navground import sim
>>>
>>> scenario = sim.load_scenario(...)
>>> env = gym.make("navground", scenario=scenario)

Parameters:

scenario (sim.Scenario | str | dict[str, Any] | None) – The scenario to initialize all simulated worlds. If a str, it will be interpreted as the YAML representation of a scenario. If a dict, it will be dumped to YAML and then treated as a str.
agent_index (int) – The world index of the selected agent, must be smaller than the number of agents.
sensor (SensorLike | None) – An optional sensor that will be added to sensors.
sensors (SensorSequenceLike) – A sequence of sensors used to generate observations for the agent, or their YAML representation. Items of class str will be interpreted as the YAML representation of a sensor. Items of class dict will be dumped to YAML and then treated as a str. If empty, it will use the agent’s own sensors.
action (ActionConfig) – The configuration of the action space to use.
observation (ObservationConfig) – The configuration of the observation space to use.
reward (Reward | None) – The reward function to use. If None, the reward defaults to a constant zero.
time_step (float) – The simulation time step applied at every step().
max_duration (float) – If positive, it will signal a truncation after this simulated time.
terminate_if_idle (bool) – Whether to terminate when the agent is idle.
truncate_outside_bounds (bool) – Whether to truncate when the agent exits the bounds.
bounds (Bounds | None) – The area to render, also used as the fence for truncating when the agent exits it.
render_mode (str | None) – The render mode. If “human”, it renders a simulation in real time via websockets (see navground.sim.ui.WebUI). If “rgb_array”, it uses navground.sim.ui.render.image_for_world() to render the world on demand.
render_kwargs (Mapping[str, Any]) – Arguments passed to navground.sim.ui.render.image_for_world()
realtime_factor (float) – A realtime factor for render_mode=“human”: larger values speed up the simulation.
stuck_timeout (float) – The time to wait before considering the agent stuck and terminating its episode.
color (str) – An optional color of the agent (only used for displaying)
tag (str) – An optional tag to be added to the agent (only used as metadata)
terminate_on_success (bool) – Whether to terminate the episode on success.
terminate_on_failure (bool) – Whether to terminate the episode on failure.
success_condition (TerminationCondition | None) – An optional success criterion
failure_condition (TerminationCondition | None) – An optional failure criterion
include_action (bool) – Whether to include field “navground_action” in the info
include_success (bool) – Whether to include field “is_success” in the info
init_success (bool | None) – The default value of success (valid until a termination condition is met)
intermediate_success (bool) – Whether to include “is_success” in the info at intermediate steps (vs only at termination).
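
The timing parameters above interact: with a positive max_duration, the number of step() calls before truncation is roughly max_duration / time_step. The following is a minimal sketch, not part of the library: the helpers approx_max_steps and make_env_kwargs are hypothetical names introduced here, and the exact step at which truncation fires depends on the implementation.

```python
from __future__ import annotations

import math
from typing import Any


def approx_max_steps(max_duration: float, time_step: float) -> int | None:
    """Estimate how many step() calls fit before truncation.

    Returns None when max_duration is not positive (no time limit).
    This is an estimate only; the library decides the exact boundary.
    """
    if max_duration <= 0:
        return None
    # A small epsilon guards against floating-point noise in the division.
    return math.ceil(max_duration / time_step - 1e-9)


def make_env_kwargs(scenario: Any, max_duration: float = 60.0,
                    time_step: float = 0.1) -> dict[str, Any]:
    """Collect a few of the constructor keyword arguments documented above."""
    return {
        "scenario": scenario,          # sim.Scenario, YAML str, dict, or None
        "agent_index": 0,              # which agent of the world to control
        "time_step": time_step,
        "max_duration": max_duration,  # positive: truncate after this time
        "terminate_on_success": True,
        "include_success": True,       # adds "is_success" to the info
        "render_mode": None,
    }


kwargs = make_env_kwargs(scenario=None, max_duration=10.0, time_step=0.5)
print(approx_max_steps(kwargs["max_duration"], kwargs["time_step"]))
```

With a real scenario in place of None, the dictionary could then be passed on as gym.make("navground", **kwargs).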

action_config(index: int) → ActionConfig | None

Gets the action configuration for the agent at the given index, if any.

Parameters: index (int) – The agent index
Returns: The action configuration, or None if undefined.
Return type: ActionConfig | None

classmethod from_dict(value: Mapping[str, Any]) → Self

Load the class from the JSON representation

Parameters: value (Mapping[str, Any]) – A JSON-able dict
Returns: An instance of the class
Return type: Self

get_cmd_from_action(index: int, action: Action, time_step: float) → Twist2 | None

Converts an action to a navground command for the agent at the given index, if any.

Parameters:

index (int) – The agent index
action (Action) – The action
time_step (float) – The time step

Returns:

A control command or None if no agent is configured at the given index.

Return type:

Twist2 | None

observation_config(index: int) → ObservationConfig | None

Gets the observation configuration for the agent at the given index, if any.

Parameters: index (int) – The agent index
Returns: The observation configuration, or None if undefined.
Return type: ObservationConfig | None

reset(*, seed: int | None = None, options: dict[str, Any] | None = None) → tuple[Observation, dict[str, Any]]

Conforms to gymnasium.Env.reset().

It samples a new world from the scenario and runs one dry simulation step using navground.sim.World.update_dry(). Then, it converts the agent’s state to observations and the command the agent would actuate to an action, which it includes in the info dictionary at key “navground_action”.

Parameters:

seed (int | None)
options (dict[str, Any] | None)

Return type:

tuple[Observation, dict[str, Any]]
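
As a minimal sketch of the behavior described above, the helper below resets an environment and reads the action stored at key “navground_action”. The name reset_and_peek is hypothetical; the function relies only on the Gymnasium reset() contract and on the info key named in this section.

```python
from __future__ import annotations

from typing import Any


def reset_and_peek(env: Any, seed: int | None = None) -> tuple[Any, Any]:
    """Reset env and return (observation, navground action or None)."""
    observation, info = env.reset(seed=seed)
    # The key is absent when the environment was built with
    # include_action=False, so fall back to None instead of raising.
    return observation, info.get("navground_action")
```

Passing the same seed twice is the usual Gymnasium way to ask for a reproducible initial world.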

step(action: Action) → tuple[Observation, float, bool, bool, dict[str, Action]]

Conforms to gymnasium.Env.step().

It converts the action to a command that the navground agent actuates. Then, it updates the world for one step by calling navground.sim.World.update(). Finally, it converts the agent’s state to observations, converts the command the agent would have actuated to an action, which it includes in the info dictionary at key “navground_action”, and computes a reward.

Termination is set when the agent completes the task, exits the boundary, or gets stuck. Truncation is set when the maximal duration has passed.

Parameters: action (Action)
Return type: tuple[Observation, float, bool, bool, dict[str, Action]]
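
The reset/step cycle can be sketched as a plain episode loop that stops on either termination or truncation. This is an illustrative helper, not library code: run_episode and its max_steps guard are assumptions introduced here, and the loop uses only the five-value step() contract described above.

```python
from __future__ import annotations

from typing import Any, Callable


def run_episode(env: Any, policy: Callable[[Any], Any],
                max_steps: int = 1000, seed: int | None = None) -> dict[str, Any]:
    """Run one episode and summarize how it ended."""
    observation, info = env.reset(seed=seed)
    total_reward = 0.0
    terminated = truncated = False
    steps = 0
    # max_steps is a safety net for environments without max_duration.
    while steps < max_steps and not (terminated or truncated):
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        steps += 1
    return {
        "steps": steps,
        "total_reward": total_reward,
        "terminated": terminated,
        "truncated": truncated,
        # Present only if include_success is enabled (and, at intermediate
        # steps, only if intermediate_success is enabled).
        "is_success": info.get("is_success"),
    }
```

For example, with env = gym.make("navground", scenario=scenario), a random rollout would be run_episode(env, lambda obs: env.action_space.sample()).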

property asdict: dict[str, Any]

A JSON-able representation of the instance

Returns: A JSON-able dict
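
Since asdict is described as JSON-able and from_dict loads from that representation, the two should compose into a save/restore round trip. The sketch below assumes exactly that and nothing more; roundtrip is a hypothetical helper that works on any object exposing this pair.

```python
from __future__ import annotations

import json
from typing import Any


def roundtrip(obj: Any) -> Any:
    """Rebuild an object from its JSON-able representation."""
    data = obj.asdict
    # Passing through JSON text checks that the dict really is JSON-able.
    restored = json.loads(json.dumps(data))
    return type(obj).from_dict(restored)
```

An environment created with gym.make() is typically wrapped, so the NavgroundEnv instance itself would likely need to be reached through env.unwrapped before calling roundtrip.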

property init_args: dict[str, Any]

Returns the arguments used to initialize the environment

Returns: The initialization arguments.

property policy: InfoPolicy

A policy that returns the action computed by the navground agent.

Returns: The policy.
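
The idea behind such a policy can be illustrated without InfoPolicy itself: replay the action found at key “navground_action” and record it next to the observation it was computed for, for instance to gather demonstrations. This is a sketch under that assumption; collect_demonstrations is a hypothetical helper and does not use or reproduce the InfoPolicy API.

```python
from __future__ import annotations

from typing import Any


def collect_demonstrations(env: Any, num_steps: int,
                           seed: int | None = None) -> list[tuple[Any, Any]]:
    """Follow the navground action from info and record (obs, action) pairs."""
    pairs: list[tuple[Any, Any]] = []
    observation, info = env.reset(seed=seed)
    for _ in range(num_steps):
        action = info.get("navground_action")
        if action is None:
            # Requires include_action=True; nothing to imitate otherwise.
            break
        pairs.append((observation, action))
        observation, _, terminated, truncated, info = env.step(action)
        if terminated or truncated:
            # Start a fresh episode and keep collecting.
            observation, info = env.reset()
    return pairs
```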

property scenario: Scenario | None

The navground scenario, if set.
