Single-agent Gymnasium Environment#

navground.learning.env

The environment base class

This class describes an environment that uses a navground.sim.Scenario to generate and then simulate a navground.sim.World.

Actions and observations relate to a single selected navground.sim.Agent.

The environment is registered under the id "navground":

>>> import gymnasium as gym
>>> import navground.learning.env
>>> from navground import sim
>>>
>>> scenario = sim.load_scenario(...)
>>> env = gym.make("navground", scenario=scenario)
Parameters:
  • scenario (sim.Scenario | str | dict[str, Any] | None) – The scenario to initialize all simulated worlds. If a str, it will be interpreted as the YAML representation of a scenario. If a dict, it will be dumped to YAML and then treated as a str.

  • agent_index (int) – The world index of the selected agent, must be smaller than the number of agents.

  • sensor (sim.Sensor | str | dict[str, Any] | None) – A sensor to produce observations for the selected agent. If a str, it will be interpreted as the YAML representation of a sensor. If a dict, it will be dumped to YAML and then treated as a str. If None, it will use the agent’s own state estimation, provided it is a sensor.

  • action (ActionConfig) – The configuration of the action space to use.

  • observation (ObservationConfig) – The configuration of the observation space to use.

  • reward (Reward | None) – The reward function to use. If None, it defaults to a constant zero reward.

  • time_step (float) – The simulation time step applied at every step().

  • max_duration (float) – If positive, it will signal a truncation after this simulated time.

  • terminate_outside_bounds (bool) – Whether to terminate the episode when the agent exits the bounds.

  • bounds (Bounds | None) – The area to render, and a fence used to end the episode when the agent exits it (see terminate_outside_bounds).

  • render_mode (str | None) – The render mode. If “human”, it renders a simulation in real time via websockets (see navground.sim.ui.WebUI). If “rgb_array”, it uses navground.sim.ui.render.image_for_world() to render the world on demand.

  • render_kwargs (Mapping[str, Any]) – Arguments passed to navground.sim.ui.render.image_for_world()

  • realtime_factor (float) – A real-time factor for render_mode="human": larger values speed up the simulation.

  • stuck_timeout (float) – The time to wait before considering an agent stuck.

  • color (str) – An optional color for the agent (only used for display).

  • tag (str) – An optional tag added to the agent (only used as metadata).
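For illustration, the scenario parameter may be passed as a YAML string or the equivalent dict. The sketch below is hypothetical: the field names and values are assumptions chosen to illustrate the str/dict forms, and the actual schema and available scenario types should be checked against the navground documentation.

```yaml
# Hypothetical scenario configuration (illustrative only — verify the
# schema against the navground documentation for installed scenarios).
type: Cross
side: 4.0
groups:
  - number: 10
    radius: 0.1
    control_period: 0.1
    behavior:
      type: HL
    kinematics:
      type: Omni
      max_speed: 1.0
```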

Load the class from the JSON representation

Parameters:

value (Mapping[str, Any]) – A JSON-able dict

Returns:

An instance of the class

Return type:

Self
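The load/dump pair follows a common serialization pattern; a minimal, self-contained sketch of such a round trip, using a stand-in dataclass rather than navground's actual implementation:

```python
from dataclasses import dataclass, asdict
from typing import Any, Mapping


@dataclass
class EnvConfig:
    """Stand-in for an environment's JSON-able configuration."""
    agent_index: int = 0
    time_step: float = 0.1
    max_duration: float = -1.0

    @classmethod
    def from_dict(cls, value: Mapping[str, Any]) -> "EnvConfig":
        # Rebuild an instance from a JSON-able dict.
        return cls(**value)


config = EnvConfig(agent_index=2, time_step=0.05)
data = asdict(config)              # JSON-able dict
clone = EnvConfig.from_dict(data)  # round trip back to an instance
assert clone == config
```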

Conforms to gymnasium.Env.reset().

It samples a new world from the scenario and runs one dry simulation step using navground.sim.World.update_dry(). Then it converts the agent’s state to observations, and the command the agent would actuate to an action, which it includes in the info dictionary at key “navground_action”.

Parameters:

  • seed (int | None) – The random seed

  • options (dict[str, Any] | None) – Additional information to specify how the environment is reset

Return type:

tuple[Observation, dict[str, Any]]

Conforms to gymnasium.Env.step().

It converts the action to a command that the navground agent actuates. Then it updates the world for one step, calling navground.sim.World.update(). Finally, it converts the agent’s state to observations and the command it would actuate to an action (included in the info dictionary at key “navground_action”), and computes a reward.

Termination is set when the agent completes the task, exits the boundary, or gets stuck. Truncation is set when the maximal duration has passed.

Parameters:

action (Action)

Return type:

tuple[Observation, float, bool, bool, dict[str, Action]]
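The reset/step contract above can be exercised with the usual Gymnasium rollout loop. A minimal sketch with a stub environment: the stub only mimics the documented return shapes, including the "navground_action" info key and the termination/truncation flags, and is not the real class.

```python
import random


class StubEnv:
    """Stub mimicking the documented reset()/step() return shapes."""

    def __init__(self, max_steps: int = 5) -> None:
        self.max_steps = max_steps
        self.steps = 0

    def reset(self, seed=None, options=None):
        self.steps = 0
        random.seed(seed)
        observation = [0.0, 0.0]
        # The action the navground agent would take, as documented.
        info = {"navground_action": [0.0, 0.0]}
        return observation, info

    def step(self, action):
        self.steps += 1
        observation = [random.random(), random.random()]
        reward = 0.0                               # default reward is constant zero
        terminated = False                         # task done / out of bounds / stuck
        truncated = self.steps >= self.max_steps   # e.g. max_duration elapsed
        info = {"navground_action": action}
        return observation, reward, terminated, truncated, info


env = StubEnv()
observation, info = env.reset(seed=0)
done = False
total_reward = 0.0
while not done:
    action = info["navground_action"]  # follow the navground expert action
    observation, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated
```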

A JSON-able representation of the instance

Returns:

A JSON-able dict

Returns the arguments used to initialize the environment

Returns:

The initialization arguments.

A policy that returns the action computed by the navground agent.

Returns:

The policy.
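Such an expert policy is convenient for collecting imitation-learning data. A sketch with a stand-in policy (a plain callable with a hypothetical rule, not the actual property):

```python
# Stand-in expert policy: maps an observation to an action.
# In the real environment, the policy would come from the env itself.
def expert_policy(observation):
    # Hypothetical rule: head toward the origin.
    x, y = observation
    return (-x, -y)


# Collect (observation, action) pairs, e.g. for behavior cloning.
dataset = []
for observation in [(1.0, 0.0), (0.0, 2.0), (-1.0, -1.0)]:
    action = expert_policy(observation)
    dataset.append((observation, action))
```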

The navground scenario, if set.