This API supports two modes, HTTP polling and HTTP SSE, selected by the "sse" parameter. HTTP polling mode (sse == False) is described first; HTTP SSE mode (sse == True) is covered in the “Interface Usage Instructions” section.
Interface address: https://<your_server_url>/sapi/intentchat_begin (replace <your_server_url> with your actual server URL; the same applies below)
HTTP GET parameters
user_name: user's name
access_key: API access key, a string of 60 characters or numbers
start_tree: the name of the ChatTree to start (without the ".xmind" or ".py" suffix)
infoitem_params: The initial InfoItem values passed in when starting the ChatTree, in the format "{infoitem_1}=value|{infoitem_2}=value|...". The InfoItem names and values must not contain the characters "?", "=", "&", or "|"; otherwise you need to handle URL escaping and unescaping yourself. The entire interface URL should not exceed 2000 characters (after escaping), because some web servers or proxies limit URL length. If a large number of parameters must be passed, it is recommended not to use infoitem_params of this interface, but to call the relevant API or HTTP service inside the ChatTree to obtain them.
audio_back: In the first interaction, in addition to the text, whether the voice should also be returned at the same time (that is, whether the returned json text contains the audio field), True/False, if this parameter is not passed, the default is False
sse: Whether to use HTTP SSE mode, True/False, if this parameter is not passed, the default is False
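The parameter rules above can be sketched as a small URL builder. This is a minimal illustration, not an official client: the helper names build_infoitem_params and build_begin_url are hypothetical, and it assumes the braces around each InfoItem name are literal and that the server accepts standard percent-encoded query strings.

```python
from urllib.parse import urlencode

RESERVED = set("?=&|")      # characters the text says must not appear
MAX_URL_LENGTH = 2000       # limit recommended in the parameter description


def build_infoitem_params(items):
    """Join InfoItems as "{name}=value|{name}=value|..." (braces assumed literal)."""
    for name, value in items.items():
        if RESERVED & set(str(name)) or RESERVED & set(str(value)):
            raise ValueError(f"reserved character in InfoItem {name!r}")
    return "|".join(f"{{{name}}}={value}" for name, value in items.items())


def build_begin_url(server_url, user_name, access_key, start_tree,
                    items=None, audio_back=False, sse=False):
    """Build the GET URL for intentchat_begin and enforce the length limit."""
    query = {
        "user_name": user_name,
        "access_key": access_key,
        "start_tree": start_tree,
        "audio_back": str(audio_back),   # "True" / "False"
        "sse": str(sse),
    }
    if items:
        query["infoitem_params"] = build_infoitem_params(items)
    url = f"https://{server_url}/sapi/intentchat_begin?{urlencode(query)}"
    if len(url) > MAX_URL_LENGTH:
        raise ValueError("URL exceeds 2000 characters after escaping")
    return url
```

The length check runs on the escaped URL, since that is the form a web server or proxy actually sees.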
HTTP GET return value
If the return value is a piece of HTML starting with a "<...>" tag, the server may not be ready yet. Otherwise it is json text similar to {"return_code":"...", "return_content":"...", ...}, where return_code is a mandatory field in the response of every API interface, and the remaining fields depend on the value of return_code (the other interfaces below follow the same pattern, which is not repeated). The return_code field is:
"error" indicates an error. At this time:
Both the return_content and return_content_delta fields contain the complete error content
"wait_result" indicates that the dialogue has been successfully started. You need to continue to call the "Dialogue Result" interface to obtain the complete utterance returned by the dialogue robot in a streaming manner. At this time:
session_id field is the id of the dialogue started
return_content and return_content_delta fields are both part of the first utterance of the dialogue robot
audio_delta field (when the audio_back parameter is True) is the incremental voice of the dialogue robot (a list containing one or more speech segments / each segment is in base64-encoded mp3 format / or it may be an empty list)
structured_output field is the complete structured information returned (generally it should be json or xml, it may be empty)
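The return-value rules above might be handled along these lines. This is a sketch under the assumption that the response body is available as a string; parse_api_response is a hypothetical helper name, not part of the API.

```python
import json


def parse_api_response(body):
    """Return the decoded json dict, or raise if the server is not ready or reports an error."""
    text = body.lstrip()
    if text.startswith("<"):
        # HTML instead of json: the server may not be ready yet.
        raise RuntimeError("server returned HTML; it may not be ready yet")
    data = json.loads(text)
    if "return_code" not in data:
        raise ValueError("response has no return_code field")
    if data["return_code"] == "error":
        # return_content carries the complete error content.
        raise RuntimeError(data.get("return_content", "unknown error"))
    return data
```

A caller would typically read session_id from the returned dict when return_code is "wait_result".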
Other instructions
The returned json may contain escaped characters such as carriage returns, line feeds, and single and double quotes. Python's json.loads() handles them automatically; in other programming languages, make sure your json parser handles them as well (the same below)
If the text in the return value is to be displayed in a browser, you may need to escape characters such as greater-than and less-than signs, single quotes, double quotes, carriage returns, and backslashes to avoid conflicts with HTML tags (the same below)
After the dialogue is successfully started, the expiration time of the dialogue session is 2 hours after the last call to the API interface (any of the 5 interfaces)
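For the note above on displaying returned text in a browser, one possible approach in Python is the standard library's html.escape. The helper name to_html is hypothetical, and converting line breaks to <br> is an assumption about how the page should render them.

```python
import html


def to_html(text):
    """Escape text from return_content so it can be placed inside an HTML page."""
    escaped = html.escape(text, quote=True)   # escapes & < > " '
    # Normalize carriage returns and line feeds, then render them as line breaks.
    escaped = escaped.replace("\r\n", "\n").replace("\r", "\n")
    return escaped.replace("\n", "<br>")
```

Backslashes need no escaping in HTML itself; they only matter if the text is later embedded in a JavaScript string literal, which this sketch does not cover.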
access_key: API access key, a string of 60 characters or numbers
session_id: session_id returned by the "Start Dialogue" interface
user_input: text entered by the user
sse: Whether to use HTTP SSE mode, True/False, if this parameter is not passed, the default is False (the False case, that is, HTTP polling mode, is described here; for the True case, that is, SSE mode, see the "Interface Description" section later)
HTTP GET return value
return_code field is:
"error" indicates an error. At this time:
Both the return_content and return_content_delta fields are error content
"wait_result" means that the dialogue robot has received the user's input and needs to continue to call the "Dialogue Result" interface to obtain the complete utterance returned by the dialogue robot in a streaming manner. At this time:
return_content and return_content_delta fields are both part of the first utterance of the dialogue robot
structured_output field is the complete structured information returned (generally it should be json or xml, it may be empty)
access_key: API access key, a string of 60 characters or numbers
session_id: session_id returned by the "Start Dialogue" interface
audio_format: refers to the uploaded audio format, optional wav, pcm, mp3, speex, silk, m4a, aac, amr, ogg-opus (the returned audio format will only be mp3)
audio_back: In this interaction, whether the system returns voice in addition to text (that is, whether the returned json text contains the audio field), True/False, if this parameter is not passed, the default is True
[files]: HTTP POST body, user voice (original audio data/non-base64 encoding), 16K sampling frequency, audio format see audio_format parameter
sse: Whether to use HTTP SSE mode, True/False, if this parameter is not passed, the default is False (the False case, that is, HTTP polling mode, is described here; for the True case, that is, SSE mode, see the "Interface Description" section below)
HTTP POST return value
return_code field is:
"error" indicates an error. At this time:
Both the return_content and return_content_delta fields are error content. If the error content is "input audio too long...", it means that the user's voice input is too long. For compressed voice formats (such as mp3, opus), the duration should be less than 30 seconds; for non-compressed formats (such as wav, pcm), it should be even shorter.
"wait_result" means that the dialogue robot has received the user's input and needs to continue to call the "Dialogue Result" interface to obtain the complete utterance returned by the dialogue robot in a streaming manner. At this time:
return_content and return_content_delta fields are both part of the first utterance of the dialogue robot
audio_delta field is the incremental voice of the dialogue robot's speech (list / contains one or more speech segments / each segment is in base64-encoded mp3 format / may be an empty list)
user_input field is the text content of the user's voice
structured_output field is the complete structured information returned (generally it should be json or xml, it may be empty)
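The audio_delta field described above is a list of base64-encoded mp3 segments, so a client needs to decode it before playback or storage. The following is a minimal sketch; the helper names are hypothetical, and appending the decoded segments into one buffer is an assumption about how a client might store them, not documented behaviour.

```python
import base64


def decode_audio_delta(audio_delta):
    """Decode a list of base64 mp3 segments; a missing or empty list yields []."""
    return [base64.b64decode(segment) for segment in (audio_delta or [])]


def append_audio(buffer, response):
    """Append the decoded segments of one response to a bytearray; return bytes added."""
    added = 0
    for chunk in decode_audio_delta(response.get("audio_delta")):
        buffer.extend(chunk)
        added += len(chunk)
    return added
```

Calling append_audio on each response in turn accumulates the robot's speech for one interaction in arrival order.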
access_key: API access key, a string of 60 characters or numbers
session_id: session_id returned by the "Start Dialogue" interface
HTTP GET return value
return_code field is:
"error" indicates an error. At this time:
Both the return_content and return_content_delta fields are error content
"go_on_chat" or "end_chat" or "end_chat_by_redirect_to_human" indicates that a streaming dialogue interaction has been successfully completed. If it is "go_on_chat", you can continue to call the "Dialogue Interaction" interface or the "Dialogue Interaction (Voice)" interface for the next dialogue interaction. If it is "end_chat", the dialogue should be ended. If it is "end_chat_by_redirect_to_human" you can transfer the dialogue to a human and end the dialogue. At this time:
return_content is all the words spoken by the dialogue robot so far in this round of dialogue interaction
return_content_delta is the incremental discourse content generated by this call
user_input field (if this interaction is initiated by calling the "Dialogue Interaction (Voice)" interface) is the text content of the user's voice
audio_delta field (if this interaction is initiated by calling the "Dialogue Interaction (Voice)" interface, or when the "Start Dialogue" interface with the parameter audio_back is True) contains the incremental speech of the robot's speech (list / contains one or more speech segments / each speech segment is in base64-encoded mp3 format / may also be an empty list)
At this point, you can continue to call the "Get Execution Trace" interface to get the dialogue status. In particular, "end_chat_by_redirect_to_human" means that the dialogue should be transferred to a human; you need to get the dialogue status to learn the specific reason for the transfer and to pass the relevant context information to the human. For details, see the "Get Execution Trace" interface description below.
"wait_result" means that you still need to continue calling this interface to obtain the dialogue robot's words. At this time:
return_content field is the accumulated content of all utterances generated so far by the dialogue robot, returned in streaming mode and growing with each call
return_content_delta field is the incremental discourse content generated this time
user_input field (if this interaction is initiated by calling the "Dialogue Interaction (Voice)" interface) is the text content of the user's voice
audio_delta field (if this interaction is initiated by calling the "Dialogue Interaction (Voice)" interface, or when the "Start Dialogue" interface with the parameter audio_back is True) contains the incremental speech of the robot's speech (list / contains one or more speech segments / each segment is in base64-encoded mp3 format / may be an empty list)
Other instructions
This interface only needs to be called in HTTP polling mode, and does not need to be called in HTTP SSE mode. See the "Interface Description" section below.
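The polling behaviour of the "Dialogue Result" interface described above can be sketched as a loop. Because the request itself is not shown here, the sketch takes a caller-supplied function fetch (hypothetical) that performs one call and returns the decoded json dict; the interval and max_polls values are illustrative assumptions, not documented limits.

```python
import time

FINAL_CODES = {"go_on_chat", "end_chat", "end_chat_by_redirect_to_human"}


def poll_dialogue_result(fetch, interval=0.2, max_polls=600):
    """Call fetch() until return_code is no longer "wait_result"."""
    deltas = []
    for _ in range(max_polls):
        reply = fetch()
        code = reply.get("return_code")
        if code == "error":
            raise RuntimeError(reply.get("return_content", "unknown error"))
        # return_content_delta is the increment produced by this call.
        deltas.append(reply.get("return_content_delta", ""))
        if code in FINAL_CODES:
            # return_content holds everything said in this interaction.
            return {
                "return_code": code,
                "return_content": reply.get("return_content", ""),
                "deltas": deltas,
            }
        if code != "wait_result":
            raise ValueError(f"unexpected return_code: {code!r}")
        time.sleep(interval)
    raise TimeoutError("dialogue result did not complete")
```

Passing the request as a function keeps the loop independent of any particular HTTP client.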
access_key: API access key, a string of 60 characters or numbers
session_id: session_id returned by the "Start Dialogue" interface
HTTP GET return value
return_code field is:
"error" indicates an error. At this time:
return_content field is the content of the error
"success" indicates success. At this time:
return_content is the execution trace content in json text format. The fields included in this json are: <UserInputOfThisTurn>, <ActiveNodeBeforeThisTurn>, <InfoitemValuesBeforeThisTurn>, <ExecutedNodeListInThisTurn>, <ActiveNodeAfterThisTurn>, <ScriptDebugInfoInThisTurn>, <InfoitemValuesAfterThisTurn>, <ActiveTopicAfterThisTurn>, <ActiveTriggerAfterThisTurn>, <ActiveChatTreeAfterThisTurn>, <ChatRecord>, <ThisTurnServerTTFT>
Other instructions
This interface should be called after all streaming return results have been received through the "Dialogue Result" interface or HTTP SSE (that is, the return_code of the returned data packet is no longer "wait_result"). If you call this interface after another interface has returned an error, the results may be incorrect.
When the dialogue ends (including the case where the return_code field returned by the "Dialogue Result" interface is "end_chat" or "end_chat_by_redirect_to_human"), you can call this interface to obtain detailed final dialogue status information (see the json field description above). Fields such as <InfoitemValuesAfterThisTurn> and <ChatRecord> are especially useful for analyzing the dialogue process, optimizing the ChatTree, and passing context information along when transferring to a human.
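Reading the execution trace described above might look like the following sketch. It is not clear from the field list whether the angle brackets are part of the json keys, so the hypothetical helper trace_field tries both spellings; it is also an assumption that return_content may arrive either as json text or as an already decoded object.

```python
import json


def parse_execution_trace(response):
    """Return the execution trace as a dict when return_code is "success"."""
    if response.get("return_code") != "success":
        raise RuntimeError(response.get("return_content", "unknown error"))
    content = response["return_content"]
    # The trace is described as json text; decode it if it is still a string.
    return json.loads(content) if isinstance(content, str) else content


def trace_field(trace, name):
    """Look up a trace field by name, with or without angle brackets."""
    for key in (name, f"<{name}>"):
        if key in trace:
            return trace[key]
    return None
```

For a transfer to a human, trace_field(trace, "ChatRecord") and trace_field(trace, "InfoitemValuesAfterThisTurn") would be the natural values to hand over.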
Start by executing the following once: call the "Start Dialogue" interface to create a dialogue, then call the "Dialogue Result" interface repeatedly to obtain the AI robot's welcome message in a streaming manner until it is complete. Then repeat the following until the dialogue ends: call the "Dialogue Interaction" or "Dialogue Interaction (Voice)" interface with the user's input, then call the "Dialogue Result" interface repeatedly to obtain the AI robot's response or question in a streaming manner until it is complete.
Whether you call "Start Dialogue", "Dialogue Interaction", "Dialogue Interaction (Voice)", or "Dialogue Result", the returned return_content is the entire AI reply or question so far in this round of dialogue interaction, and return_content_delta is the incremental reply or question text generated by this call.
There is a sse parameter in the previous three interfaces "Start Dialogue", "Dialogue Interaction", and "Dialogue Interaction (Voice)". If this parameter is True:
After calling any of these three interfaces, keep receiving HTTP SSE ("data:") data packets, parse each packet, and process return_code, return_content and the other fields as described above.
In the above description, the places that require (continue) calling "Dialogue Result" are replaced by continuing to receive HTTP SSE ("data:") data packets, and there is no need to call the "Dialogue Result" interface.
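Handling the SSE "data:" packets described above could be sketched as follows. This assumes each packet arrives as a single "data:" line whose payload is one json object with the same fields as in polling mode; multi-line SSE events are not handled, and the function names are hypothetical.

```python
import json


def iter_sse_packets(lines):
    """Yield the decoded json payload of every "data:" line."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue            # skip blank lines, comments and other SSE fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


def collect_sse_reply(lines):
    """Consume packets until return_code is no longer "wait_result"; return the last one."""
    last = None
    for packet in iter_sse_packets(lines):
        code = packet.get("return_code")
        if code == "error":
            raise RuntimeError(packet.get("return_content", "unknown error"))
        last = packet
        if code != "wait_result":
            break               # go_on_chat, end_chat, ... : the reply is complete
    return last
```

The lines argument can be any iterable of decoded text lines, such as the line iterator of a streaming HTTP response.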
It is recommended to establish an automated testing mechanism through API
Please refer to the following two code examples for details
access_key: API access key, a string of 60 characters or numbers
file_type: One of "chattree" for a ChatTree (.xmind/.py), "knowledge" for a knowledge base (.docx/.md/.zip), or "codefile" for a code file (.py)
[files]: One or more uploaded files, each file is a triplet: (file name, file content, file MIME type), where the file name needs to be URL encoded (use urllib.parse.quote in Python)
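The file triplets described above can be assembled as in this sketch. The multipart field name "files" and the fallback MIME type are assumptions, and build_upload_files is a hypothetical helper; the resulting list has the shape accepted by the files argument of the Python requests library.

```python
import mimetypes
from urllib.parse import quote


def build_upload_files(named_contents, field_name="files"):
    """Turn (file name, bytes) pairs into (field, (name, content, MIME type)) entries."""
    entries = []
    for file_name, content in named_contents:
        mime_type, _ = mimetypes.guess_type(file_name)
        entries.append((
            field_name,
            # The file name is URL encoded, as the parameter description requires.
            (quote(file_name), content, mime_type or "application/octet-stream"),
        ))
    return entries
```

Only the file name is URL encoded; the file content is passed through unchanged.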
HTTP POST return value
If return_code:
is "error", it means an error occurred, and return_content is the content of the error.
access_key: API access key, a string of 60 characters or numbers
start_tree: ChatTree started when making an outbound call
infoitem_params: The value of the initial InfoItem passed in when starting the ChatTree in an outbound call, the format is "{InfoItem_1}=value|{InfoItem_2}=value|..."
number: Outbound phone number
HTTP GET return value
If return_code:
is "error", it means an error occurred, and return_content is the error content
is "success", it means starting to try to call, return_content returns the call sequence number, and then you can use this call sequence number to call the "Outbound Call Status" interface to obtain the status (result) of the outbound call