Knowledge Base

  • The name of the knowledge base is the file name without its “.md”, “.docx”, or “.zip” suffix.
  • A knowledge base can be written in Word or Markdown format; both are described below.
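The naming rule above can be sketched in a few lines of Python. The function name `knowledge_base_name` is hypothetical, and treating the suffix case-insensitively is an assumption that the documentation does not state.

```python
from pathlib import Path

# Suffixes named in the documentation.
KB_SUFFIXES = {".md", ".docx", ".zip"}


def knowledge_base_name(filename: str) -> str:
    """Return the knowledge base name: the file name without its suffix."""
    path = Path(filename)
    # Assumption: suffixes are compared case-insensitively.
    if path.suffix.lower() not in KB_SUFFIXES:
        raise ValueError(f"not a knowledge base file: {filename}")
    return path.stem
```

For example, `knowledge_base_name("contract_law.md")` returns `"contract_law"`.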

Word Format Knowledge Base

Knowledge Base Word File Template

  • Start from the template empty_document_of_knowledge_base.docx (after downloading, make sure the file is named “empty_document_of_knowledge_base.docx”) when creating a knowledge base document. The template has 9 levels of titles preset.
  • The file must be a *.docx file.

Set Hierarchical Title Style

  • Apply the hierarchical title styles to the titles in the knowledge base Word file one by one (Ctrl + 1, 2, 3, ..., 8, 9 sets a title level quickly), so that the titles divide the document content at an appropriate granularity.
  • If a title has neither text nor subtitles under it, add a line of text containing only a period ("."); otherwise the title will be ignored.
  • Content that cannot be separated by titles should be split into separate paragraphs as far as possible. Ideally, no paragraph exceeds 500 English words.
  • An original title may start with a number that disappears automatically once the title style is applied. It is recommended to re-enter the number manually, exactly as it was.
  • For QA (question and answer) knowledge, generally use the Q (question) as the title and the A (answer) as the text.
  • For lines that start with a number (such as "1.", "(1)", "1)", etc.):
    • If the line has little content, set it to a title style.
    • If the line has a lot of content, it does not need a title style (keep the text style). However, if numbered lines follow at a lower level, set the number of this line as a title on its own.
  • Some lines that do not start with a number are short and read like titles; these can also be set as titles.

Table Processing

Tables need to be converted manually into row-by-row descriptive text, for example:

Table: Grading of newborn's red buttocks

Grading | Clinical Manifestations
Degree I | Local skin flushing with a small amount of rash, small range
Degree II | Red skin, large area, rash ulceration accompanied by peeling
Degree III | Red skin, wide area, accompanied by rash, large area of skin erosion, exfoliation and exudation

Can be converted to:

When the degree of neonatal red buttocks is I, the clinical manifestations are local skin flushing with a small amount of rash, small range.
When the degree of neonatal red buttocks is II, the clinical manifestations are red skin, large area, rash ulceration accompanied by peeling.
When the degree of neonatal red buttocks is III, the clinical manifestations are red skin, wide area, accompanied by rash, large area of skin erosion, exfoliation and exudation.
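For larger tables, the row-by-row conversion could be scripted before pasting the result into the knowledge base. The sketch below is only one possible approach: the function name `table_to_text` and the sentence template are assumptions, and the generated wording is more mechanical than the hand-written sentences above, so the output should still be reviewed by hand.

```python
def table_to_text(title, headers, rows):
    """Turn each table row into one self-contained sentence.

    The first column is treated as the key of the row; every other
    column is described as "<header> is <value>".
    The sentence template is an assumption, not a required format.
    """
    sentences = []
    for row in rows:
        if len(row) != len(headers):
            raise ValueError("each row must have one value per header")
        details = "; ".join(
            f"{header} is {value}" for header, value in zip(headers[1:], row[1:])
        )
        sentences.append(
            f"In the table '{title}', when {headers[0]} is {row[0]}, {details}."
        )
    return sentences


# Usage with the first two rows of the example table.
for sentence in table_to_text(
    "Grading of newborn's red buttocks",
    ["Grading", "Clinical Manifestations"],
    [
        ["Degree I", "local skin flushing with a small amount of rash, small range"],
        ["Degree II", "red skin, large area, rash ulceration accompanied by peeling"],
    ],
):
    print(sentence)
```

Each sentence repeats the table title and the column headers, so a row still makes sense when it is retrieved on its own.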

Other Notes

  • Sometimes, after a title style is applied, the title's font changes and the title appears in the left navigation pane, yet no title style is selected in the style box at the top. In that case, cut the title, paste it back as unformatted text, and set the title style again. Make sure the relevant title style is selected in the style box.
  • Delete pictures and text boxes. Short descriptions attached to pictures should be deleted as well, but descriptions that carry independent meaning do not need to be deleted.
  • Delete irrelevant signatures, references, appendices, etc.

Markdown Format Knowledge Base

  • To build a Markdown format knowledge base, follow the hierarchical title method described for the Word format knowledge base, but express the titles with Markdown's title syntax: "#" is a first-level title, "##" a second-level title, "###" a third-level title, and so on, up to nine levels of titles.
  • A Markdown format knowledge base file must be a plain text file encoded in UTF-8, with a .md suffix.
  • The Markdown format is recommended.
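Two of the title rules described for the Word format also apply to a Markdown file and can be checked mechanically: a title with neither text nor subtitles is ignored unless a "." line is added, and paragraphs should ideally stay under 500 words. The following is a minimal sketch of such a check, not part of ChatTree itself. The function name `check_markdown_kb` is hypothetical, a "paragraph" is assumed to be a run of non-blank lines, and "#" lines inside fenced code blocks are not handled.

```python
import re

# One to nine "#" characters, whitespace, then the title text.
HEADING = re.compile(r"^(#{1,9})\s+(\S.*)$")


def check_markdown_kb(text, max_words=500):
    """Return a list of human-readable problems found in the Markdown text."""
    problems = []
    pending = None  # (line_no, level) of a title with no text or subtitle yet
    words = 0       # word count of the paragraph currently being read
    para_line = 0   # line number where that paragraph started

    def close_paragraph():
        nonlocal words
        if words > max_words:
            problems.append(
                f"line {para_line}: paragraph has {words} words (limit {max_words})"
            )
        words = 0

    def close_title(next_level=None):
        # A pending title is empty unless a deeper-level title follows it.
        if pending and (next_level is None or next_level <= pending[1]):
            problems.append(
                f"line {pending[0]}: title has no text or subtitle; add a '.' line"
            )

    for line_no, line in enumerate(text.splitlines(), start=1):
        match = HEADING.match(line)
        if match:
            close_paragraph()
            level = len(match.group(1))
            close_title(level)
            pending = (line_no, level)
        elif line.strip():
            if words == 0:
                para_line = line_no
            words += len(line.split())
            pending = None  # the current title now has text
        else:
            close_paragraph()  # a blank line ends the paragraph

    close_paragraph()
    close_title()
    return problems
```

Reading the file with `Path(name).read_text(encoding="utf-8")` before calling the function also confirms the UTF-8 requirement, since Python raises `UnicodeDecodeError` for a file in another encoding.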

Use Zip to Package Multiple Knowledge Base Files

If a ChatTree needs multiple knowledge base documents in Word and/or Markdown format, they must be zip-packed together to form one knowledge base:

  • Compress the *.docx and/or *.md files directly into one *.zip file; do not compress the directory that contains them.
  • Each *.docx and/or *.md file must have a unique first-level title that covers all the contents of the file.
  • The name of the knowledge base is then the name of the zip file (without the “.zip” suffix).
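The packaging step can be sketched with Python's standard `zipfile` module. The function name `pack_knowledge_base` is hypothetical. Each file is written under its bare file name so that the archive contains no directory; the sketch does not check the first-level title of each file.

```python
import zipfile
from pathlib import Path


def pack_knowledge_base(files, zip_path):
    """Pack *.docx / *.md files flat into one zip; return the knowledge base name."""
    files = [Path(f) for f in files]
    zip_path = Path(zip_path)

    # Validate everything first so that no partial archive is left behind.
    names = set()
    for file in files:
        if file.suffix.lower() not in {".docx", ".md"}:
            raise ValueError(f"unsupported file type: {file}")
        if file.name in names:
            raise ValueError(f"duplicate file name in archive: {file.name}")
        names.add(file.name)

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file in files:
            # arcname=file.name drops the directory part of the path.
            archive.write(file, arcname=file.name)

    # The knowledge base is named after the zip file, without its suffix.
    return zip_path.stem
```

For example, `pack_knowledge_base(["docs/a.md", "docs/b.docx"], "law_kb.zip")` would create an archive containing `a.md` and `b.docx` at the top level and return `"law_kb"`.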

Define Term Concepts in the Knowledge Base

Sometimes scattered terms and concepts in the knowledge base need to be clarified separately. To do this, define a first-level title called "term concept" (written as "Terminology Concept" in the example below) and put each term under it as a second-level title. The text under each second-level title can explain and describe the term, list related synonyms, describe how it differs from related concepts, and so on. For example:

# Terminology Concept
## AI
### Definition
Artificial Intelligence (AI) refers to tasks performed by computer systems that usually require human intelligence. It includes machine learning, natural language processing, computer vision and other fields.
### Synonyms
intelligent computing
### Differences from Related Concepts
Artificial intelligence is different from traditional software programming, which is based on clear rules and logic, while artificial intelligence learns and makes decisions in a data-driven manner.
## Machine Learning
### Definition
Machine Learning is a subfield of artificial intelligence that focuses on developing algorithms that enable computers to learn from data and improve performance without having to be explicitly programmed.
### Synonyms
adaptive systems
### Differences from Related Concepts
Machine learning is a method for achieving artificial intelligence, but not all artificial intelligence systems rely on machine learning.

Support Scale

Cases supported by the system include a knowledge base of 25,000 legal documents (approximately 200 million pieces of information), and the response speed remains very fast at that scale.

Location where the Knowledge Base is Loaded

The system tries to load the knowledge base onto the GPU (faster access) only when the knowledge base reaches a certain size and the GPU has enough free video memory; otherwise the knowledge base is loaded on the CPU (slightly slower access). The remaining video memory can be checked with nvidia-smi.
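The free video memory can also be read from a script. The sketch below calls `nvidia-smi` with its `--query-gpu` option and parses the result; the function names are hypothetical. The documentation does not say how much free memory ChatTree requires, so the sketch only reports the values and makes no load decision.

```python
import shutil
import subprocess


def parse_free_memory(output):
    """Parse one free-memory value (MiB) per line; skip non-numeric lines."""
    values = []
    for line in output.splitlines():
        line = line.strip()
        if line.isdigit():
            values.append(int(line))
    return values


def free_gpu_memory_mib():
    """Return the free memory of each GPU in MiB, or [] if nvidia-smi is missing."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.free", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_free_memory(result.stdout)


if __name__ == "__main__":
    for index, free in enumerate(free_gpu_memory_mib()):
        print(f"GPU {index}: {free} MiB free")
```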

In "ChatTree Examples", there are also related knowledge base file examples in md format and zip format for reference.