Skip to content

Creating a Knowledge Graph from CSV or JSON Files

The XO Platform gives you the option to create a Knowledge Graph in a spreadsheet or JSON and then import it into the VA instead of creating the Knowledge Graph from scratch. Learn more.

The process to create a Knowledge Graph using an editor is summarized below:

  1. Download the sample CSV or a JSON file. You can download these sample files from a blank Knowledge Graph too.
  2. Edit the file by adding rows corresponding to the questions, responses, synonyms, etc.
  3. Import the file to your VA.

CSV File

You can create the Knowledge Graph using a sample spreadsheet that you can download from the VA. If you anticipate frequent changes to the Knowledge Graph, we recommend that you create it in a spreadsheet as it is easier to perform bulk updates compared to the application UI.

Follow the instructions below to build your Knowledge Graph in a spreadsheet.

Download the Sample File

  1. Select the VA to work with and go to Virtual Assistant > Knowledge AI > FAQs.
  2. In the top right corner, click more icon (three dots).
  3. You can find the Import option on the respective Knowledge Graph.
  4. You are prompted to back up the Knowledge Graph before proceeding. Choose the CSV or JSON format for the backup.
  5. After backup, click Proceed.
  6. On the corresponding dialog box, click Sample CSV. The CSV file is downloaded to your local computer.

    download sample csv

Build the Knowledge Graph

The format for the CSV file includes details regarding alternate answers, extended responses, and advanced responses.

The following types of entries are supported:

  • Faq – The leaf level nodes with questions and answers.
  • Node – For node/tags, traits, preconditions, and output context.
  • Synonyms
  • KG Params
  • Traits

Each of the above categories needs to be preceded by the appropriate header. The header helps identify the new vs old versions of the JSON file by the platform.

Moving forward, this article discusses detailed information for each section and the content expected for each.

FAQ

This contains the actual questions and answers along with the alternate questions, answers, and extended answers.

FAQ Sample

The following column details can be used:

  • Faq: Mandatory entry in the header, must be left blank in the following rows.
  • Que ID: The Question ID is auto-generated by the Platform. This field uniquely identifies the FAQs and it should not be added or edited manually. Leave this field blank if you are adding a new FAQ. Do not alter the value of this field if you are updating an existing FAQ. Do not manually add any data in this field.
  • Path: To which the FAQ belongs. The mandatory node names must be prefixed with ** and organizer nodes with !!
  • Primary Question: The actual question users might ask: When left blank, the entry in the Answer column is considered as the alternative answer to the previous primary question.
  • Alternate Question: Optional: Alternate question to the primary question if there are multiple alternate questions, they must be given in multiple rows.
  • Tags: For each question or alternate question.
  • Answer: Answer to the question serves as an alternate answer when the primary question field is left blank. The Answer format can be:
    • Plain text
    • Script with SYS_SCRIPT prefix i.e. SYS_SCRIPT; <answer in javascript format>
    • Channel-specific formatted response when prefixed with SYS_CHANNEL_<channel-name>, the answer can be simple or in script format:
      • SYS_CHANNEL_<channel-name> SYS_TEXT; <answer>
      • SYS_CHANNEL_<channel-name> SYS_SCRIPT; <answer in javascript format>
    • Trigger a dialog then prefix with SYS_INTENT i.e.<SYS_INTENT> <dialog ref id>
  • Extended Answer-1: Optional to be used in case the response is lengthy.
  • Extended Answer-2: Optional to be used in case the response is lengthy.
  • ReferenceId: reference to any external content used as a source for this FAQ
  • Display Name: The name that would be used for presenting the FAQ to the end-users in case of ambiguity.

Nodes

This section includes settings for both nodes and tags.

Nodes and Tags Settings

Key Terms and Definitions

  • Node: Mandatory entry in the header must be blank in the following rows.
  • Que ID: The Question ID is auto-generated by the Platform. This field uniquely identifies the FAQs and it should not be added or edited manually. Leave this field blank if you are adding a new FAQ. Do not alter the value of this field if you are updating an existing FAQ. Do not manually add any data in this field.
  • Nodepath: Path for reaching the node/tag.
  • Tag: Mandatory for tag settings, leave blank for node.
  • Precondition: Conditions to be satisfied for qualifying this node/tag.
  • outputcontext: Context to be populated by this node/tag.
  • Traits: Traits for this node/tag.

Synonyms

Use this section to enter the synonyms as key-value pairs.

enter synonyms

  • Synonyms: Mandatory entry in the header, must be blank in the following rows.
  • Phrase: for which the synonym needs to be entered.
  • Synonyms: Comma-separated values.

Use of synonyms in KG term identification can be enabled using the following:

  • confidenceConfigs: Mandatory entry in the header, must be blank in the following rows.
    • parameter: useBotSynonyms in this case.
    • value: Set to true or false.

KG Params

  • KG Params: mandatory entry in the header, must be blank in the following rows.
  • lang: VA language code. For example, “en” for English.
  • stopwords: Comma-separated values. kg params

Traits

Trait related information can be specified as follows:

trait related information

  • Traits: Mandatory entry in the header, must be blank in the following rows.
  • lang: VA language code. For example, “en” for English.
  • GroupName: Trait group name.
  • matchStrategy: Pattern or probability (for ML-based).
  • scoreThreshold: Threshold value (between 0 and 1) when the matchStrategy above is set to ML-based.
  • TraitName: The name of the trait.
  • Training data: Utterances for the trait.

For Taxonomy Based KG, the following fields can be included if there are one or more faqs linked to another faq in the KG:

  • faqLinkedTo: The faqLinkedto field identifies the source FAQ to which another FAQ is linked to. The faqLinkedTo field must contain a single, valid ‘Que ID’ of the source FAQ. ‘Que Id’ should be a valid identity generated by the platform. Do not give a reference to an FAQ that is already linked to another FAQ.
  • faqLinkedBy: The faqLinkedBy field contains the list of ‘Que Ids’ of the FAQs that are linked to a particular FAQ. ‘Que Id’ should be a valid identity generated by the platform..
  • isSoftDeleted: The isSoftDeleted field is used to identify the FAQs that are deleted but it has one or more FAQs linked to it.

JSON file

The XO Platform allows you to create the Knowledge Graph in JSON and upload it. You can download a sample JSON from the VA to understand its structure.

Follow the instructions below to build your Knowledge Graph using JSON:

Download the Sample File

  1. Select the VA to work with and go to Virtual Assistant > Knowledge AI > FAQs.
  2. In the top right corner, click more icon (three dots).
  3. You are prompted to back up the Knowledge Graph before proceeding. Choose the CSV or JSON format for the backup.
  4. After backup, click Proceed.
  5. On the corresponding dialog box, click Sample JSON. The JSON file is downloaded to your local computer.

JSON Reference

json reference

PROPERTY NAME TYPE DESCRIPTION
FAQ Array Consists of the following:
  • Question
  • Answer
  • Leaf and parent terms up to the First-level node in the path
  • Alternative question
Question String Primary question; included in the FAQ array.
Answer String VA response; included in the FAQ array.
Terms Array Includes the leaf node to which the question is added, and its parents up to the First-level node.
refId String Optional reference to any external content used as a source for this FAQ.
Alternate Questions Array Consists of alternative questions and terms. Include terms from leaf to the First-level node.
Synonyms Object Consists of arrays of terms and their synonyms.
Unmappedpath Array Consists of arrays of nodes that do not have any questions, and all their parents up to the First-level node.
Traits Object Consists of trait names as keys and an array of utterances as values.

For a Taxonomy Based KG, the following fields can be included if there are one or more faqs linked to another faq in the KG:

  • faqLinkedTo: To identify the source faq.
  • faqLinkedBy: To identify the linked faqs.
  • isSoftDeleted: To identify if the faq is deleted but has some linked faqs.