.. index:: Intents

.. role:: json(code)
   :language: json

.. _intents:

Intents
=======

**Intents** are the general mechanism used on the robot to *aggregate* user
commands and present them to the robot's *mission controller*. They are
published on the :topic:`/intents` topic (as :msg:`hri_actions_msgs/msg/Intent`
messages).

What are intents?
-----------------

An :term:`intent` is an abstract description of an operation to be performed by
the robot. Intents are represented as ROS messages of type
:msg:`hri_actions_msgs/msg/Intent` and published on the :topic:`/intents`
topic.

While inspired by the Android intents [android-intents]_, ROS intents are
primarily designed to capture user-initiated intents: for instance, a button
click on a touchscreen, the result of a chatbot-based verbal interaction, or a
command issued from a remote user interface.

Intents are emitted (*published*) by nodes that track the user's activities
(eg, the touchscreen or the dialogue manager), and are meant to be consumed by
a :term:`mission controller` (ie, the top-level controller of the robot).

Structure of an intent
----------------------

Intents comprise four mandatory fields:

- the ``intent``, which should be one of the available predefined intents,
- the ``data``, a JSON object containing the data required to fully
  instantiate the intent,
- the ``source`` of the intent (for instance, a user),
- and the ``modality`` by which the intent was conveyed to the robot.

Optionally, you can also specify a ``priority`` and a level of ``confidence``.

Intent name and data
++++++++++++++++++++

Intents are primarily composed of an *intent name* and *data* to parametrise
the intent.

The ``intent`` field is a string describing the action intended by this intent.
Where suitable, the intent name SHOULD be one of the constants defined in the
table below. However, we recognise that the list of possible intents is large;
therefore, custom strings are also permissible.

.. attention::

   **Possible terminology confusion**

   Even though an intent describes a desired action, the ``intent`` field is
   unrelated to ROS actions. Here, the *intent* is the intended action to be
   performed (going somewhere, picking something up...), while *ROS actions*
   are a low-level asynchronous remote procedure call (RPC) technique. They are
   not directly related (even though ROS actions can be used to implement
   intents).

The intent's ``data`` is a JSON object containing the data required to fully
specify the intent. The keys of the object should be one of the following
*thematic roles*, or the generic ``other_data``:

- ``agent``: the agent expected to perform the intent (if omitted, the robot
  itself is assumed)
- ``object`` (also named *theme* or *patient* in the linguistics literature):
  the entity undergoing the effect of the intent
- ``goal``: the entity towards which the intent is directed or moves
- ``recipient``: the entity that receives the object

Examples:

- *"I want you to go to the kitchen"*:

  - intent: ``MOVE_TO``
  - data: :json:`{"goal": "kitchen_1"}`

- *"Can you take the groceries to Luke in the kitchen?"*:

  - intent: ``BRING_OBJECT``
  - data: :json:`{"object": "groceries", "goal": "kitchen_1", "recipient": "person_luke"}`

.. note::

   Additional complete examples of intents are provided below:
   `Examples of intents`_.
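In practice, the ``data`` field is carried as a JSON object serialised to a
string. As a minimal sketch (the identifiers are the illustrative ones from the
example above), the thematic roles of the second example can be encoded and
decoded with the standard ``json`` module:

.. code-block:: python

   import json

   # thematic roles for "Can you take the groceries to Luke in the kitchen?"
   roles = {
       "object": "groceries",        # entity undergoing the effect of the intent
       "goal": "kitchen_1",          # entity towards which the intent is directed
       "recipient": "person_luke",   # entity that receives the object
   }

   # value to place in the intent's ``data`` field
   data = json.dumps(roles)

   # a consumer decodes it back into a dictionary of thematic roles
   assert json.loads(data)["recipient"] == "person_luke"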
Each intent defines a specific set of required and optional thematic roles,
listed in the following table (note that the ``agent`` role can optionally be
added to all intents, and is omitted from the table for clarity):

.. list-table::
   :header-rows: 1

   * - Intent
     - Description
     - Required thematic roles
     - Optional thematic roles
   * - ``ENGAGE_WITH``
     - an agent wants to engage with another one
     - ``recipient``
     -
   * - ``MOVE_TO``
     - navigate to a specific location
     - ``goal``
     -
   * - ``GUIDE``
     - guide someone somewhere
     - ``goal``, ``recipient``
     -
   * - ``GRAB_OBJECT``
     - pick up a specific object
     - ``object``
     -
   * - ``BRING_OBJECT``
     - bring a specific object to a specific place
     - ``object``, ``recipient``
     -
   * - ``PLACE_OBJECT``
     - put an object on a support (eg, a table)
     - ``recipient``
     - ``object`` (only required if more than one object could be placed)
   * - ``GREET``
     - greet an agent
     - ``recipient``
     -
   * - ``SAY``
     - say some text, optionally annotated with gestures or expressions
     - ``object`` (the text to say)
     - ``recipient``
   * - ``PRESENT_CONTENT``
     - present predefined content (via a screen, pre-recorded text...)
     - ``object`` (the content identifier)
     - ``recipient``
   * - ``PERFORM_MOTION``
     - perform a motion (eg, a dance, or a specific gesture like pointing or waving)
     - ``object`` (the system-specific name of the motion/gesture)
     - ``recipient``
   * - ``START_ACTIVITY``
     - start a scripted behaviour/activity
     - ``object`` (the name of the activity)
     - any additional parameters required to start the activity
   * - ``CANCEL_ACTIVITY``
     - request cancellation of an activity
     -
     - ``object`` (the name of the activity; if unset, the current main activity)

.. note::

   If you believe your intent should be standardised and added to the list of
   pre-defined intents, fill in the corresponding entry in the "thematic roles"
   table above and submit a pull request on the `hri_actions_msg `_ repository.

Source of the intent
++++++++++++++++++++

The *source* of the intent is a string describing who created the intent. This
is *not* the node that published the intent, but the actual agent who expressed
the intent/command/desire.

``source`` can be either one of the constants below, or the specific ID of the
person/agent expressing the intent. In a `REP-155 `_ compliant system, this ID
must be the person ID of the agent.

.. code-block::

   # for intents originating from the robot itself
   string ROBOT_ITSELF = "__myself__"

   # for intents originating from an external robot control system (for
   # instance, a remote control tablet)
   string REMOTE_SUPERVISOR = "__remote_supervisor__"

   # for intents coming from an agent interacting with the robot, but not
   # uniquely identified
   string UNKNOWN_AGENT = "__unknown_agent__"

   # for unknown sources
   string UNKNOWN = "__unknown__"

Modality of the intent
++++++++++++++++++++++

The intent's *modality* conveys how the intent was expressed: verbally, via the
touchscreen, via a gesture, etc. The special modality ``MODALITY_INTERNAL``
must be used for intents coming from the robot's internal processes, when
applicable.

The ``modality`` field MUST be one of the ``MODALITY_`` constants below.

.. code-block::

   string MODALITY_SPEECH = "speech"
   string MODALITY_MOTION = "motion"
   string MODALITY_TOUCHSCREEN = "touchscreen"
   string MODALITY_INTERNAL = "internal"
   string MODALITY_OTHER = "other"

Intent priority
+++++++++++++++

The intent's *priority* MAY be used as a hint by the robot's application
controller to prioritise the intent appropriately. The application controller
is, however, *not* required to respect this priority level.

0 is the lowest priority, 128 is the default priority, and 255 is the highest
priority.

Intent confidence
+++++++++++++++++

The intent's *confidence* is a value between 0.0 (no confidence) and 1.0 (full
confidence) that the intent was correctly perceived and interpreted.

For instance, a 'waving' gesture could be interpreted as an implicit request
from a user for the robot to greet back or engage. As this interpretation is
not certain, the confidence of the intent may be below 1.0.
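Putting the fields together, a node that detects a user command (for instance,
a touchscreen interface) could publish an intent as in the following sketch.
This is only an illustration: it assumes a ROS 2 Python node and that
:msg:`hri_actions_msgs/msg/Intent` exposes the fields and constants documented
above; ``room_X`` is the same illustrative goal identifier used in the examples
below.

.. code-block:: python

   import json

   import rclpy
   from rclpy.node import Node

   from hri_actions_msgs.msg import Intent


   class TouchscreenIntents(Node):
       """Toy node publishing a MOVE_TO intent when a 'go to' button is pressed."""

       def __init__(self):
           super().__init__("touchscreen_intents_example")
           self._intents_pub = self.create_publisher(Intent, "/intents", 10)

       def on_goto_button(self, goal_id: str):
           msg = Intent()
           msg.intent = Intent.MOVE_TO                 # one of the predefined intents
           msg.data = json.dumps({"goal": goal_id})    # thematic roles, as JSON
           msg.source = Intent.UNKNOWN_AGENT           # the user is not identified
           msg.modality = Intent.MODALITY_TOUCHSCREEN  # conveyed via the touchscreen
           msg.priority = 200                          # user-initiated: above default
           msg.confidence = 1.0                        # a button press is unambiguous
           self._intents_pub.publish(msg)


   def main():
       rclpy.init()
       node = TouchscreenIntents()
       node.on_goto_button("room_X")  # illustrative goal identifier
       rclpy.spin(node)


   if __name__ == "__main__":
       main()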
Examples of intents
-------------------

User approaches the robot
++++++++++++++++++++++++++

- **Possible intent trigger**: user less than 2 meters away, looking at the robot
- **Possible published intent**:

  .. code-block::

     intent: ENGAGE_WITH
     data: {"recipient": "anonymous_person_a2f5"}
     source: "anonymous_person_a2f5"
     modality: MODALITY_MOTION
     priority: 128
     confidence: 0.6

User presses a button on the touchscreen to navigate
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

- **Possible intent trigger**: button press on "Go to room X"
- **Possible published intent**:

  .. code-block::

     intent: MOVE_TO
     data: {"goal": "room_X"}
     source: UNKNOWN_USER
     modality: MODALITY_TOUCHSCREEN
     priority: 200
     confidence: 1.0

User presses a button on the touchscreen to play a game
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

- **Possible intent trigger**: button press: "Play memory game"
- **Possible published intent**:

  .. code-block::

     intent: START_ACTIVITY
     data: {"object": "games_memory_game"}
     source: UNKNOWN_USER
     modality: MODALITY_TOUCHSCREEN
     priority: 100
     confidence: 1.0

User presses a button on the touchscreen to display a page
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

- **Possible intent trigger**: button press: "Go to page_X"

The result depends on what ``page_X`` is about:

- if ``page_X`` is a purely informational page that does not require any
  additional robot capability (eg, does not require the robot to speak or to
  move), **no intent** needs to be generated. As this action is 'read-only',
  with no impact on the robot, it can be handled directly by the touchscreen.

- if ``page_X`` requires additional robot resources, an intent needs to be
  published:

  - **Possible published intent**:

    .. code-block::

       intent: PRESENT_CONTENT
       data: {"object": "page_X"}
       source: UNKNOWN_USER
       modality: MODALITY_TOUCHSCREEN
       priority: 100
       confidence: 1.0

User asks the robot to display a specific page
++++++++++++++++++++++++++++++++++++++++++++++++++

- **Possible intent trigger**: the chatbot recognises the command 'display page_X'
- **Possible published intent**:

  .. code-block::

     intent: PRESENT_CONTENT
     data: {"object": "page_X", "recipient": "anonymous_person_e4da"}
     source: "anonymous_person_e4da"
     modality: MODALITY_SPEECH
     priority: 128
     confidence: 0.4

User asks the robot to go somewhere
++++++++++++++++++++++++++++++++++++++++

- **Possible intent trigger**: the chatbot recognises the command 'take me to place_X'
- **Possible published intent**:

  .. code-block::

     intent: GUIDE
     data: {"goal": "place_X", "recipient": "person_55dc"}
     source: "person_55dc"
     modality: MODALITY_SPEECH
     priority: 128
     confidence: 0.8

User wants to cancel a task
++++++++++++++++++++++++++++++

- **Possible intent trigger**: the user presses a 'cancel' button on the touchscreen
- **Possible published intent**:

  .. code-block::

     intent: CANCEL_ACTIVITY
     data: {"object": ""}
     source: UNKNOWN_USER
     modality: MODALITY_TOUCHSCREEN
     priority: 255
     confidence: 1.0

Supervisor sends a command for the robot to dock via tablet
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

- **Possible intent trigger**: a button press on a remote control tablet
- **Possible published intent**:

  .. code-block::

     intent: MOVE_TO
     data: {"goal": "poi_docking"}
     source: REMOTE_SUPERVISOR
     modality: MODALITY_TOUCHSCREEN
     priority: 255
     confidence: 1.0

How are intents used by the robot?
------------------------------------

Intents published on the ``/intents`` topic represent each of the user's
desires or commands understood by the robot. These intents need to be acted
upon by a dedicated node (or a group of nodes, depending on the architecture
design) called the **robot's application controller**.

The general role of the :term:`mission controller` is to schedule and run the
different :term:`capabilities` based on the received intents, and to allocate
the robot's resources (ensuring that no two actions use, eg, the arms or the
navigation system at the same time).

You can use any supervision technique to implement your own application
controller: simple Python scripts, finite state machines, behaviour trees,
symbolic task planners... the PAL SDK does not enforce any particular paradigm.
You can learn more about how to program applications for the robot here:
:ref:`intro-development`. To get started, the :ref:`basic-interaction` tutorial
explains how to create your own simple Python controller.
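As a first illustration, such a controller can be as simple as a node that
subscribes to ``/intents``, decodes the JSON ``data`` field and dispatches on
the intent name. The sketch below is *not* PAL's default controller: it assumes
a ROS 2 Python node, the message fields and constants documented above, and
hypothetical ``go_to``/``say`` helpers standing in for the actual robot
capabilities.

.. code-block:: python

   import json

   import rclpy
   from rclpy.node import Node

   from hri_actions_msgs.msg import Intent


   class MinimalMissionController(Node):
       """Toy mission controller dispatching incoming intents to handlers."""

       def __init__(self):
           super().__init__("minimal_mission_controller")
           self.create_subscription(Intent, "/intents", self.on_intent, 10)

       def on_intent(self, msg: Intent):
           roles = json.loads(msg.data) if msg.data else {}

           # ignore poorly-recognised intents (threshold chosen arbitrarily here)
           if msg.confidence < 0.5:
               self.get_logger().info(f"ignoring low-confidence intent {msg.intent}")
               return

           if msg.intent == Intent.MOVE_TO:
               self.go_to(roles["goal"])
           elif msg.intent == Intent.SAY:
               self.say(roles["object"], recipient=roles.get("recipient"))
           else:
               self.get_logger().warning(f"no handler for intent {msg.intent}")

       def go_to(self, goal_id):
           # hypothetical helper: would trigger navigation towards `goal_id`
           self.get_logger().info(f"navigating to {goal_id}")

       def say(self, text, recipient=None):
           # hypothetical helper: would trigger text-to-speech
           self.get_logger().info(f"saying '{text}' to {recipient or 'everyone'}")


   def main():
       rclpy.init()
       rclpy.spin(MinimalMissionController())


   if __name__ == "__main__":
       main()

A real controller would additionally arbitrate between concurrent intents (for
instance using the ``priority`` field) and manage access to the robot's
resources, as described above.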
PAL robots normally come with a default application controller (the one
underpinning the landing demo page, see eg :ref:`ari-first-startup` for ARI)
that reacts to different types of intents. You can have a look at its source
code and use it as a reference.

When *not* to use intents?
++++++++++++++++++++++++++++

There are two interaction situations where intents should not be used:
interactions with no side-effects on the robot (instead, directly use the
chatbot's actions), and short user confirmations (use the :ref:`user-input` API
instead).

Interactions with no side-effects
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

User interactions do not always have to generate intents. In particular, during
a chatbot interaction, the chatbot engine might need to perform simple
*actions* to answer the user's questions that do not impact the robot's state.
In this case, it is unnecessary to generate an intent, as no complex action
scheduling is necessary.

For instance, if the user asks the robot about the weather, the chatbot can
generate an answer by querying an online weather forecast API. This does not
require any specific robot resources. Similarly, checking the battery level of
the robot has no impact on the robot's state or resources.

In these cases, instead of publishing an intent, the chatbot engine can
directly perform the API requests or ROS service calls needed to answer the
user's question.

.. note::

   :ref:`dialogue_management` explains how to create and customise your own
   chatbots. You can also specifically refer to :ref:`rasa_chatbot` or
   :ref:`llm_chatbot`.

.. TODO: not yet implemented!

Short user input
~~~~~~~~~~~~~~~~~~

For short user input (eg, a yes/no confirmation dialogue, a simple
multiple-choice question, a simple text entry), you can use the robot's
multi-modal user-input mechanism: :ref:`user-input`.

.. [android-intents] `Android Intents `_