Single-agent Gymnasium Environment

navground.learning.env

The environment base class

Bases: NavgroundBaseEnv, Env[dict[str, ndarray[Any, dtype[Any]]] | ndarray[Any, dtype[Any]], ndarray[Any, dtype[Any]]]

This class describes an environment that uses a navground.sim.Scenario to generate and then simulate a navground.sim.World.

Actions and observations relate to a single selected navground.sim.Agent.

The environment is registered under the id "navground":

>>> import gymnasium as gym
>>> import navground.learning.env
>>> from navground import sim
>>>
>>> scenario = sim.load_scenario(...)
>>> env = gym.make("navground", scenario=scenario)
Parameters:
  • scenario (sim.Scenario | str | dict[str, Any] | None) – The scenario to initialize all simulated worlds. If a str, it will be interpreted as the YAML representation of a scenario. If a dict, it will be dumped to YAML and then treated as a str.

  • agent_index (int) – The world index of the selected agent, must be smaller than the number of agents.

  • sensor (SensorLike | None) – An optional sensor that will be added to sensors.

  • sensors (SensorSequenceLike) – A sequence of sensors used to generate observations for the agents, or its YAML representation. Items of class str will be interpreted as the YAML representation of a sensor. Items of class dict will be dumped to YAML and then treated as a str. If empty, it will use the agents’ own sensors.

  • action (ActionConfig) – The configuration of the action space to use.

  • observation (ObservationConfig) – The configuration of the observation space to use.

  • reward (Reward | None) – The reward function to use. If None, the reward defaults to a constant zero.

  • time_step (float) – The simulation time step applied at every step().

  • max_duration (float) – If positive, it will signal a truncation after this simulated time.

  • terminate_if_idle (bool) – Whether to terminate when an agent is idle

  • truncate_outside_bounds (bool) – Whether to truncate when an agent exits the bounds

  • bounds (Bounds | None) – The area to render, which also serves as the fence used to truncate episodes when agents exit it.

  • render_mode (str | None) – The render mode. If “human”, it renders a simulation in real time via websockets (see navground.sim.ui.WebUI). If “rgb_array”, it uses navground.sim.ui.render.image_for_world() to render the world on demand.

  • render_kwargs (Mapping[str, Any]) – Arguments passed to navground.sim.ui.render.image_for_world()

  • realtime_factor (float) – A real-time factor for render_mode=“human”: larger values speed up the simulation.

  • stuck_timeout (float) – The time to wait before considering an agent stuck and terminating it.

  • color (str) – An optional color of the agent (only used for displaying)

  • tag (str) – An optional tag to be added to the agent (only used as metadata)

  • terminate_on_success (bool) – Whether to terminate the episode on success.

  • terminate_on_failure (bool) – Whether to terminate the episode on failure.

  • success_condition (TerminationCondition | None) – An optional success criterion

  • failure_condition (TerminationCondition | None) – An optional failure criterion

  • include_action (bool) – Whether to include field “navground_action” in the info

  • include_success (bool) – Whether to include field “is_success” in the info

  • init_success (bool | None) – The default value of success (valid until a termination condition is met)

  • intermediate_success (bool) – Whether to include “is_success” in the info at intermediate steps (vs only at termination).
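
As a minimal sketch of how a few of these parameters might be assembled, the helper below collects them into a dictionary of keyword arguments. The name navground_env_kwargs is hypothetical and not part of navground; the default values and the time-step check are assumptions made for illustration, not the library's own defaults or validation.

```python
from __future__ import annotations

from typing import Any


def navground_env_kwargs(
    scenario: str | dict[str, Any],
    *,
    agent_index: int = 0,
    time_step: float = 0.1,
    max_duration: float = 60.0,
    render_mode: str | None = None,
) -> dict[str, Any]:
    """Collect keyword arguments for the "navground" environment.

    Hypothetical helper: defaults are illustrative, not the library's.
    """
    # Assumed sanity check: a simulation step needs a positive duration.
    if time_step <= 0:
        raise ValueError("time_step must be positive")
    return {
        # A str is read as YAML; a dict is dumped to YAML first.
        "scenario": scenario,
        "agent_index": agent_index,
        "time_step": time_step,
        # Per the parameter list, only a positive value signals truncation.
        "max_duration": max_duration,
        "render_mode": render_mode,
    }
```

The resulting dictionary could then be unpacked into the registration call shown earlier, as in gym.make("navground", **kwargs).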

Gets the action configuration for a (possible) agent

Parameters:

index (int) – The agent index

Returns:

The action configuration, or None if undefined.

Return type:

ActionConfig | None

Load the class from the JSON representation

Parameters:

value (Mapping[str, Any]) – A JSON-able dict

Returns:

An instance of the class

Return type:

Self

Convert action to navground command for a (possible) agent

Parameters:
  • index (int) – The agent index

  • action (Action) – The action

  • time_step (float) – The time step

Returns:

A control command or None if no agent is configured at the given index.

Return type:

Twist2 | None

Gets the observation configuration for a (possible) agent

Parameters:

index (int) – The agent index

Returns:

The observation configuration, or None if undefined.

Return type:

ObservationConfig | None

Conforms to gymnasium.Env.reset().

It samples a new world from the scenario and runs one dry simulation step using navground.sim.World.update_dry(). Then, it converts the agent’s state to observations and the command it would actuate to an action, which it includes in the info dictionary at key “navground_action”.

Parameters:
  • seed (int | None)

  • options (dict[str, Any] | None)

Return type:

tuple[Observation, dict[str, Any]]
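
The sketch below shows one way to read the action suggested by the navground agent right after a reset. The function name reset_with_expert_action is hypothetical; it relies only on the Gymnasium reset() contract and on the “navground_action” key described above, and it returns None for the action when that key is absent (for example, when include_action is disabled).

```python
from __future__ import annotations

from typing import Any


def reset_with_expert_action(
    env: Any, seed: int | None = None
) -> tuple[Any, Any]:
    """Reset a Gymnasium-style env and return (observation, expert action).

    Hypothetical helper: the expert action is None when the info
    dictionary has no "navground_action" entry.
    """
    observation, info = env.reset(seed=seed)
    return observation, info.get("navground_action")
```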

Conforms to gymnasium.Env.step().

It converts the action to a command that the navground agent actuates. Then, it updates the world for one step by calling navground.sim.World.update(). Finally, it converts the agent’s state to observations and the command it would actuate to an action, which it includes in the info dictionary at key “navground_action”, and computes a reward.

Termination is set when the agent completes the task, exits the boundary, or gets stuck. Truncation is set when the maximum duration has elapsed.

Parameters:

action (Action)

Return type:

tuple[Observation, float, bool, bool, dict[str, Action]]
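
A typical episode loop built on reset() and step() could look like the sketch below. The names EpisodeSummary, run_episode, and follow_navground are hypothetical, as is the convention that a policy receives both the observation and the latest info dictionary. The loop stops on termination, on truncation, or after an assumed safety cap of max_steps steps.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class EpisodeSummary:
    steps: int
    total_reward: float
    terminated: bool
    truncated: bool


def follow_navground(observation: Any, info: dict[str, Any]) -> Any:
    """Replay the action the navground agent would take.

    Requires the env to expose "navground_action" in the info dictionary.
    """
    return info["navground_action"]


def run_episode(
    env: Any,
    policy: Callable[[Any, dict[str, Any]], Any],
    max_steps: int = 1000,
) -> EpisodeSummary:
    """Run one episode on a Gymnasium-style env and sum the rewards."""
    observation, info = env.reset()
    total_reward = 0.0
    steps = 0
    terminated = truncated = False
    while not (terminated or truncated) and steps < max_steps:
        action = policy(observation, info)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        steps += 1
    return EpisodeSummary(steps, total_reward, terminated, truncated)
```

Keeping terminated and truncated separate in the summary preserves the distinction drawn above between ending an episode because of the agent's state and ending it because the time limit was reached.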

A JSON-able representation of the instance

Returns:

A JSON-able dict

Returns the arguments used to initialize the environment

Returns:

The initialization arguments.

A policy that returns the action computed by the navground agent.

Returns:

The policy.

The navground scenario, if set.