Reward functions
navground.learning.rewards
Null
-
class NullReward
A dummy reward that always returns zero.
Social
-
class SocialReward(alpha: float = 0.0, beta: float = 1.0, critical_safety_margin: float = 0.0, safety_margin: float | None = None, default_social_margin: float = 0.0, social_margins: dict[int, float] = <factory>)
Reward function for social navigation, inspired by [TODO add citation]
It returns a weighted sum of
violations of the social margin (in [-1, 0], weight alpha)
violations of the safety margin (in [-1, 0], weight beta)
efficacy (in [-1, 0], weight 1)
so that it is always less than or equal to zero. A value of zero corresponds to no violations
while moving at optimal speed towards the target.
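As a rough illustration of the weighted sum described above, the following sketch combines three precomputed terms. It is not the library's implementation: `social_reward_sketch` and its arguments `social_violation`, `safety_violation`, and `efficacy` are hypothetical names, and each term is assumed to already lie in [-1, 0]. Only `alpha`, `beta`, and their defaults come from the signature above.

```python
def social_reward_sketch(
    social_violation: float,
    safety_violation: float,
    efficacy: float,
    alpha: float = 0.0,
    beta: float = 1.0,
) -> float:
    """Combine three penalty terms into a single non-positive reward.

    Hypothetical helper (not part of navground). Each term is assumed to be
    a precomputed value in [-1, 0], where 0 means "no violation" or
    "moving at optimal speed towards the target".
    """
    terms = {
        "social_violation": social_violation,
        "safety_violation": safety_violation,
        "efficacy": efficacy,
    }
    for name, value in terms.items():
        if not -1.0 <= value <= 0.0:
            raise ValueError(f"{name} must be in [-1, 0], got {value}")
    # Weighted sum: alpha and beta scale the two violation terms,
    # while efficacy always has weight 1.
    return alpha * social_violation + beta * safety_violation + efficacy


# No violations at optimal speed gives the maximal reward of zero.
best = social_reward_sketch(0.0, 0.0, 0.0)
# A partial social violation and reduced efficacy lower the reward.
worse = social_reward_sketch(-0.5, 0.0, -0.25, alpha=0.5)
```

With non-negative weights, every term contributes a value at or below zero, which is why the sum can never be positive.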
- Parameters:
alpha (float) – The weight of social margin violations
beta (float) – The weight of safety violations
critical_safety_margin (float) – Violating this margin incurs the maximal penalty of -1
safety_margin (float | None) – Violations between this margin and
critical_safety_margin incur a linear penalty. If not set,
it defaults to the agent’s own safety_margin.
default_social_margin (float) – The default social margin
social_margins (dict[int, float]) – The social margins assigned to neighbors’ ids
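The two sketches below illustrate one plausible reading of the margin parameters. Both functions are hypothetical and are not part of navground. The first assumes the linear penalty is an interpolation on the distance to the nearest obstacle, from 0 at `safety_margin` down to -1 at `critical_safety_margin`. The second assumes a neighbor without an entry in `social_margins` falls back to `default_social_margin`.

```python
def safety_penalty_sketch(
    distance: float,
    safety_margin: float,
    critical_safety_margin: float = 0.0,
) -> float:
    """Return a penalty in [-1, 0] for a given distance to an obstacle.

    Assumed behavior (illustrative only): -1 at or inside the critical
    margin, 0 at or beyond the safety margin, linear in between.
    """
    if distance <= critical_safety_margin:
        return -1.0
    if distance >= safety_margin:
        return 0.0
    # Here critical_safety_margin < distance < safety_margin,
    # so the denominator is strictly positive.
    return -(safety_margin - distance) / (safety_margin - critical_safety_margin)


def social_margin_for(
    neighbor_id: int,
    social_margins: dict[int, float],
    default_social_margin: float = 0.0,
) -> float:
    """Look up a neighbor's social margin, falling back to the default."""
    return social_margins.get(neighbor_id, default_social_margin)


halfway = safety_penalty_sketch(0.5, safety_margin=1.0)
margin = social_margin_for(4, {3: 0.5}, default_social_margin=0.25)
```

Under these assumptions, a distance halfway between the two margins yields a penalty of -0.5, and an unlisted neighbor id receives the default margin.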
- Returns:
A function that returns -1 if the safety margin is violated,
or a weighted sum of social margin violations and efficacy otherwise.