Single-agent Gymnasium Environment
navground.learning.env
This class describes an environment that uses a navground.sim.Scenario to generate and then simulate a navground.sim.World. Actions and observations relate to a single selected navground.sim.Agent.

The environment is registered under the id "navground":

>>> import gymnasium as gym
>>> import navground.learning.env
>>> from navground import sim
>>>
>>> scenario = sim.load_scenario(...)
>>> env = gym.make("navground", scenario=scenario)
- Parameters:
scenario (sim.Scenario | str | dict[str, Any] | None) – The scenario used to initialize all simulated worlds. If a str, it will be interpreted as the YAML representation of a scenario. If a dict, it will be dumped to YAML and then treated as a str.
agent_index (int) – The world index of the selected agent; it must be smaller than the number of agents.
sensor (sim.Sensor | str | dict[str, Any] | None) – A sensor to produce observations for the selected agent. If a str, it will be interpreted as the YAML representation of a sensor. If a dict, it will be dumped to YAML and then treated as a str. If None, it will use the agent's own state estimation, provided that it is a sensor.
action (ActionConfig) – The configuration of the action space to use.
observation (ObservationConfig) – The configuration of the observation space to use.
reward (Reward | None) – The reward function to use. If None, it will default to constant zeros.
time_step (float) – The simulation time step applied at every step().
max_duration (float) – If positive, it will signal a truncation after this simulated time.
terminate_outside_bounds (bool) – Whether to terminate when an agent exits the bounds.
bounds (Bounds | None) – The area to render and a fence for truncating processes when agents exit it.
render_mode (str | None) – The render mode. If "human", it renders the simulation in real time via websockets (see navground.sim.ui.WebUI). If "rgb_array", it uses navground.sim.ui.render.image_for_world() to render the world on demand.
render_kwargs (Mapping[str, Any]) – Arguments passed to navground.sim.ui.render.image_for_world().
realtime_factor (float) – A real-time factor for render_mode="human": larger values speed up the simulation.
stuck_timeout (float) – The time to wait before considering an agent stuck.
color (str) – An optional color of the agent (only used for displaying).
tag (str) – An optional tag to be added to the agent (only used as metadata).
Load the class from the JSON representation
Conforms to gymnasium.Env.reset().
It samples a new world from the scenario and runs one dry simulation step using navground.sim.World.update_dry(). Then, it converts the agent's state to observations and the command it would actuate to an action, which it includes in the info dictionary at key "navground_action".
Conforms to gymnasium.Env.step().
It converts the action to a command that the navground agent actuates. Then, it updates the world for one step, calling navground.sim.World.update(). Finally, it converts the agent's state to observations and the command it would actuate to an action, which it includes in the info dictionary at key "navground_action", and computes a reward.
Termination is set when the agent completes the task, exits the boundary, or gets stuck. Truncation is set when the maximal duration has passed.
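The termination and truncation rules can be summarised in a small stand-alone sketch. The function episode_flags and its argument names are hypothetical and not part of navground. The exact comparison used for the duration check is an assumption, and the caller is expected to pass outside_bounds=True only when exiting the bounds should end the episode.

```python
from typing import Tuple


def episode_flags(
    task_completed: bool,
    outside_bounds: bool,
    stuck: bool,
    elapsed_time: float,
    max_duration: float,
) -> Tuple[bool, bool]:
    """Return (terminated, truncated) following the rules described above.

    Hypothetical helper for illustration only; it is not navground code.
    """
    # Termination: the task is completed, the boundary is exited,
    # or the agent is considered stuck.
    terminated = task_completed or outside_bounds or stuck
    # Truncation: only checked when max_duration is positive.
    # Assumption: ">=" is used here; the library may compare differently.
    truncated = max_duration > 0 and elapsed_time >= max_duration
    return terminated, truncated
```

With max_duration set to zero or a negative value, this sketch never reports a truncation, matching the "if positive" wording of the max_duration parameter.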
A JSON-able representation of the instance
- Returns:
A JSON-able dict
Returns the arguments used to initialize the environment
- Returns:
The initialization arguments.
A policy that returns the action computed by the navground agent.
- Returns:
The policy.
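As a rough illustration of what such a policy amounts to, the loop below replays the action stored under "navground_action" in the info dictionary returned by reset() and step(). It is a sketch that relies only on the standard Gymnasium reset()/step() signatures. The function name rollout_with_navground_action and its parameters are hypothetical, and env is assumed to be an environment created with gym.make("navground", scenario=scenario).

```python
from typing import Any, Optional, Tuple


def rollout_with_navground_action(
    env: Any, max_steps: int = 1000, seed: Optional[int] = None
) -> Tuple[float, int]:
    """Run one episode applying the action the navground agent would take.

    Hypothetical helper: `env` is any object following the Gymnasium API
    whose info dictionary contains the key "navground_action".
    Returns the cumulated reward and the number of steps performed.
    """
    # reset() returns (observation, info) in the Gymnasium API.
    _, info = env.reset(seed=seed)
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        # Replay the action derived from the navground agent's own command.
        action = info["navground_action"]
        # step() returns (observation, reward, terminated, truncated, info).
        _, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        steps += 1
        if terminated or truncated:
            break
    return total_reward, steps
```

The observations are ignored on purpose here, since the replayed action comes from the simulated agent rather than from a learned policy.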
The navground scenario, if set.