Designing our bot to have an effective data footprint is a good idea from many perspectives. It lets us query the conversational log data much faster, brings a number of good practices into focus, and makes us conscious of what we're storing in the conversational log data, which helps from a privacy perspective.
Teneo Inquire, the analytics and data part of the Teneo platform, is built, like the rest of Teneo, to perform well on conversational data. Teneo is not designed to store large chunks of binary data or large, unique JSON structures.
Typical conversational sessions in the Teneo platform are normally in the range of 50 KB to 400 KB. Very large sessions, for example long sessions with many turns of dialog, can be somewhat larger, at 400 KB to 800 KB.
Teneo Inquire scales well with traffic for typical conversational data, meaning it scales well with regard to API calls per session.
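As a rough illustration of these ranges, the sketch below buckets a session size given in kilobytes. The function name and the bucket labels are invented for this example and are not part of Teneo; only the boundaries come from the figures above.

```python
def classify_session_size(size_kb: float) -> str:
    """Bucket a session size (in KB) using the ranges quoted above.

    Hypothetical helper for illustration only; not part of the Teneo platform.
    """
    if size_kb < 0:
        raise ValueError("session size cannot be negative")
    if size_kb < 50:
        return "small"                 # below the typical range
    if size_kb <= 400:
        return "typical"               # 50 KB to 400 KB
    if size_kb <= 800:
        return "very large"            # long sessions with many turns
    return "outside normal range"      # worth investigating


for size in (120, 650, 2300):
    print(size, "KB ->", classify_session_size(size))
```

A session that lands in the last bucket is a good candidate for checking what non-conversational data is being stored.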
Teneo Inquire scales less well with data outside of its purpose, including:
- Conversational log data that includes a large amount of non-conversational data, such as big JSON payloads or large binary objects.
- Very large conversational log sessions that fall outside the normal ranges. This is often an indicator that the bot includes large chunks of non-conversational data.
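To get a feel for how quickly a stored payload can push a session outside the normal range, one option is to measure its serialised size before deciding to keep it. The sketch below is a standalone assumption-based example: the helper name and the sample catalogue are invented, and 1 KB is taken as 1000 bytes.

```python
import json


def payload_size_kb(payload: object) -> float:
    """Return the UTF-8 size of a JSON-serialisable object in KB (1 KB = 1000 bytes)."""
    encoded = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return len(encoded) / 1000


# Invented example: a product catalogue returned by a web service.
catalogue = [
    {"id": i, "name": f"Product {i}", "description": "x" * 200}
    for i in range(1000)
]
print(f"Storing this payload would add about {payload_size_kb(catalogue):.0f} KB to the session")
```

A payload of this kind is already larger than many complete sessions, which is why it is usually better to fetch it on demand than to store it.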
An effective solution data footprint is a key indicator of a well-designed bot. It also greatly impacts the performance of Teneo Inquire, which affects how fast we can query our conversational log data and how quickly Teneo Studio can give us feedback, for example in the Optimization section.
You can use this script to get a sense of how large your session logs are. To do this, you need to download a Groovy file and upload it to your solution resources.
- Download the following Groovy file: SessionSizeStatistics.groovy
- Navigate to the solution backstage and select 'Resources'.
- Select 'File' at the top.
- Use 'Add' on the upper right to add the SessionSizeStatistics.groovy file. (Alternatively, you can drag and drop it.)
- Set the 'Published Location' for this file from / to /script_lib. This ensures the file can be accessed using a Groovy script later.
- Hit 'Save'.
The uploaded Groovy script can be used in several ways, for example to:
- Continuously print the session size (in KB) in Tryout
- Only print warnings for sessions passing 0.5, 0.75, and 1 MB
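The warning behaviour in the second example can be pictured roughly as follows. This is a standalone sketch, not the code inside SessionSizeStatistics.groovy: the function name is hypothetical, the per-turn sizes are simulated, and 1 MB is taken as 1000 KB.

```python
from typing import List

WARNING_THRESHOLDS_KB = (500, 750, 1000)  # 0.5, 0.75 and 1 MB


def crossed_thresholds(previous_kb: float, current_kb: float) -> List[int]:
    """Return the thresholds passed between two consecutive measurements."""
    return [t for t in WARNING_THRESHOLDS_KB if previous_kb < t <= current_kb]


# Simulated session size after each turn, in KB.
previous = 0
for size in (120, 480, 510, 790, 1020):
    for threshold in crossed_thresholds(previous, size):
        print(f"Warning: session passed {threshold} KB (now {size} KB)")
    previous = size
```

Comparing against the previous measurement means each threshold is reported once, when it is first passed, instead of on every later turn.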
Each session's size can be retrieved and stored in a global variable. The script to retrieve it differs depending on whether you want to access it in your Development (Dev) and Quality Assurance / Staging (QA) environments or in your Production (Prod) environment.
Here is how to do that:
- Navigate to your solution backstage, then to 'Globals' and 'Variables'.
- Create a new global variable called sessionSize and give it the value
- Save and navigate over to 'Scripts'.
- Add a new 'End dialog' script and paste in the following snippet:
| Environment | Script |
|---|---|
| Development (Dev) and Quality Assurance / Staging (QA) | sessionSize = SessionSizeStatistics.build(engineAccess).size |
| Production (Prod) | sessionSize = SessionSizeStatistics.build(engineAccess).production().size |
You can then retrieve the session sizes with Teneo Query Language (TQL) using one of the following queries:
| Query | Description |
|---|---|
| la avg s.sv:n:sessionSize | Get the average session size |
| la s.sv:n:sessionSize as 'sessionSize' order by sessionSize desc | List all sessions in decreasing order of size |
| la s.id, s.sv:n:sessionSize : s.sv:n:sessionSize >= 500 | List all sessions of 0.5 MB or more |
| ca s.sv:n:sessionSize : s.sv:n:sessionSize >= 500 | Count the sessions of 0.5 MB or more |
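To make the intent of the four queries concrete, here is the same logic applied to a small in-memory list. The session IDs and sizes are invented, and the code does not call Teneo Inquire; it only mirrors what each query computes, assuming sessionSize is stored in KB.

```python
from statistics import mean

# Invented (session id, sessionSize in KB) pairs standing in for query results.
sessions = [("s1", 120), ("s2", 640), ("s3", 310), ("s4", 505)]

# Average session size.
average_kb = mean(size for _, size in sessions)

# All session sizes in decreasing order.
sizes_desc = sorted((size for _, size in sessions), reverse=True)

# Sessions of 500 KB (0.5 MB) or more, and how many there are.
large = [(sid, size) for sid, size in sessions if size >= 500]
large_count = len(large)

print(average_kb, sizes_desc, large, large_count)
```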
Here are some good practices when working on your Data Footprint.
- Avoid sending in large 'blobs' or strings representing objects of data as these are very costly. Instead, use integrations to call web services when you need to retrieve this data.
- Use Adorners! Adorners can be used to copy variables from event level to session level, which means that they will be faster to query for. You can read more about Adorners in the documentation and here in the Developers pages.
- Use Aggregators! Aggregators are used to aggregate data, for example the amount of traffic towards the bot's key flows. These are incredibly fast to query against, and can be used to power dashboards. You can read more about Aggregators in the documentation and here in the Developers pages.
- Use Sample! When your bot is successful and your datasets grow larger, TQL queries will take longer to run. To quickly design queries, you can use the sample command to ask Teneo to run your query over a small subset of sessions and return results. Read more about sampling in the TQL Reference.
- When you are working on reporting and analytics, it's a good idea to work on your Teneo Query Language queries in Teneo Studio as you have much more support there. However, it is recommended to use the Teneo Inquire Client to run long-running queries.
- Teneo Studio also gives you the possibility of sharing queries, which is a perfect way to save commonly used queries.
- You can publish queries, which are then easy to retrieve using the Teneo Inquire Client.
- Do not wait to set up efficient reporting. Set it up in sprint 1 and extend it as you go. This makes sure things are done right from the beginning.
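The idea behind the sampling practice above can be sketched outside of TQL: design the calculation on a small random subset, then run it over the full data set once the logic is settled. The example uses Python's random.sample on invented data and says nothing about how Teneo selects its own sample.

```python
import random
from statistics import mean

random.seed(7)  # fixed seed so the run is repeatable

# Invented session sizes in KB standing in for a large data set.
all_sizes = [random.randint(50, 800) for _ in range(100_000)]

# Work on a small random subset while designing the calculation...
subset = random.sample(all_sizes, k=1_000)
print(f"sample mean: {mean(subset):.1f} KB")

# ...then run it over everything once the logic is settled.
print(f"full mean:   {mean(all_sizes):.1f} KB")
```

The subset gives an answer close to the full result at a fraction of the work, which is what makes it useful while a query is still being designed.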
Further reading can be found in the Forum, where you can ask questions to a Teneo Developer.