TherapyBot: Exploring Knowledge Graphs and LLMs in Chat-Based Applications

Introduction

In my spare time I’m trying to make some progress on a coding project called TherapyBot. Despite its name, it’s not strictly about therapy – that’s just one potential use case. The main aim? To test out some ideas and, let’s be honest, to level up my coding skills.

What’s TherapyBot, anyway?

At its core, TherapyBot is built around chat sessions – back-and-forth text conversations with a Large Language Model (LLM). These chats aren’t just ephemeral; they’re encrypted and saved to a database. We also generate an embedding vector for each chat message, so that relevant earlier messages can be retrieved and supplied to the LLM – a form of RAG (Retrieval-Augmented Generation).
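
To make the retrieval idea concrete, here is a minimal sketch of how stored message vectors could be ranked against a query vector. It is not TherapyBot's actual code: the function names, the three-dimensional toy vectors and the sample messages are all made up for illustration, and a real setup would get its vectors from an embedding model and probably query a vector store rather than a Python list.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vector, stored_messages, top_k=2):
    """Return the text of the top_k stored messages most similar to the query."""
    ranked = sorted(
        stored_messages,
        key=lambda m: cosine_similarity(query_vector, m["vector"]),
        reverse=True,
    )
    return [m["text"] for m in ranked[:top_k]]


# Hypothetical data: toy vectors standing in for real embeddings.
stored_messages = [
    {"text": "My dad went into hospital last week.", "vector": [0.9, 0.1, 0.0]},
    {"text": "I went to primary school nearby.", "vector": [0.0, 0.2, 0.9]},
    {"text": "The doctors are running more tests.", "vector": [0.8, 0.3, 0.1]},
]
query_vector = [1.0, 0.2, 0.0]  # stands in for the embedding of a new message

relevant = retrieve(query_vector, stored_messages)
```

The retrieved texts would then be added to the prompt alongside the user's latest message.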

The project is split into two main components:

  1. A backend: A FastAPI app coded in Python (where most of my sweat and tears are going)
  2. A frontend: A React app (which, admittedly, is getting less of my attention)

The Current Challenge: Knowledge Graphs

Right now, I’m wrestling with functionality to generate a knowledge graph by parsing the chat text. The idea is to extract people, places, events, organisations, and objects from the chat and set these as nodes in a knowledge graph.
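
As a rough sketch of what that could look like in the Python backend, the graph can start life as a couple of small data classes. Everything here (`Node`, `KnowledgeGraph`, `add_entity`, `add_relation`, the `mentions` counter) is a hypothetical naming choice for illustration, not the project's real schema.

```python
from dataclasses import dataclass, field

# The five entity types mentioned above.
ENTITY_TYPES = {"person", "place", "event", "organisation", "object"}


@dataclass
class Node:
    name: str
    entity_type: str
    mentions: int = 0  # how many times the entity has come up in chat


@dataclass
class KnowledgeGraph:
    nodes: dict = field(default_factory=dict)  # lower-cased name -> Node
    edges: list = field(default_factory=list)  # (source, relation, target)

    def add_entity(self, name, entity_type):
        """Add a node, or bump the mention count if it already exists."""
        if entity_type not in ENTITY_TYPES:
            raise ValueError(f"Unknown entity type: {entity_type}")
        key = name.lower()
        node = self.nodes.setdefault(key, Node(name, entity_type))
        node.mentions += 1
        return node

    def add_relation(self, source, relation, target):
        """Record a directed, labelled edge between two entity names."""
        self.edges.append((source.lower(), relation, target.lower()))


graph = KnowledgeGraph()
graph.add_entity("Dad", "person")
graph.add_entity("hospital", "place")
graph.add_entity("dad", "person")  # same entity, different capitalisation
graph.add_relation("Dad", "is in", "hospital")
```

Keying nodes on the lower-cased name is a deliberately crude way to merge duplicates; it says nothing yet about pronouns or aliases.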

But why a knowledge graph? Well, it’s not just for the sake of using fancy tech.

The Power of Knowledge Graphs

  1. Compressed Representation: It provides a neat way to generate a compressed representation of facts that can be fed back into the chat. This helps the LLM understand more about the context.
  2. Relevance Filtering: As chats progress, not all information remains relevant. Your primary school might not matter when discussing your dad’s illness. Knowledge graphs centre the conversation around “things,” allowing facts to be supplied on demand.
  3. Visualisation: Users can visualise the graph, providing a representation of what’s important in their life. Frequent mentions could translate to larger node sizes, for instance.
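
The first and third points can be sketched in a few lines. This is only an illustration under my own assumptions: the helper names `facts_about` and `node_size`, the sample facts, and the logarithmic scaling are all invented here rather than taken from the project.

```python
import math

# Hypothetical facts stored as (subject, predicate, object) tuples.
triples = [
    ("Dad", "is in", "hospital"),
    ("Dad", "has", "illness"),
    ("User", "attended", "primary school"),
]


def facts_about(entity_names, triples):
    """Return short sentences for facts that mention any of the given entities."""
    wanted = {name.lower() for name in entity_names}
    return [
        f"{s} {p} {o}."
        for s, p, o in triples
        if s.lower() in wanted or o.lower() in wanted
    ]


def node_size(mentions, base=10.0, scale=5.0):
    """Grow node size with mention count; log scaling stops one node dominating."""
    return base + scale * math.log1p(mentions)


# Only facts about the entities in the current message are supplied.
context = facts_about(["dad"], triples)
```

In this toy example a message about “dad” pulls in the two facts about him and leaves the primary school fact out of the prompt.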

Why Not Just Text?

You might ask, “Couldn’t you just use text for this?” Sure, we could have a text database where an LLM generates headings for each entity and lists properties underneath. But that lacks the flexibility and visual potential of a graph structure.

The State of Knowledge Graphs

Knowledge graphs have had their ups and downs in the tech world. Historically, they stored facts as a series of triples: (subject, predicate, object). Before LLMs, generating these from free-form text was a nightmare. You either built them manually or used clunky, unreliable automated methods.

Enter LLMs. These models can reliably convert free-form text into structured formats, breathing new life into knowledge graph creation.
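
In practice the output still needs checking before it goes anywhere near the graph. Below is a minimal sketch that assumes the LLM has been prompted to return a JSON list of objects with `subject`, `predicate` and `object` keys. That response format, the `Triple` type and the `parse_triples` helper are my own assumptions for illustration; the sample string stands in for a real model response, so no LLM API is called.

```python
import json
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def parse_triples(llm_output: str) -> List[Triple]:
    """Parse a JSON list of fact objects, silently skipping malformed items."""
    try:
        data = json.loads(llm_output)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    triples = []
    for item in data:
        if not isinstance(item, dict):
            continue
        values = (item.get("subject"), item.get("predicate"), item.get("object"))
        # Keep the item only if all three fields are non-empty strings.
        if all(isinstance(v, str) and v.strip() for v in values):
            triples.append(Triple(*(v.strip() for v in values)))
    return triples


# Hypothetical model response, hard-coded for the example.
sample_output = """
[
  {"subject": "Dad", "predicate": "is in", "object": "hospital"},
  {"subject": "Dad", "predicate": "has"},
  "not a fact"
]
"""

triples = parse_triples(sample_output)
```

Dropping malformed items quietly is one design choice among several; logging them, or asking the model to retry, may turn out to be more useful.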

A Sidenote on Named Entity Recognition (NER)

NER is a similar technology that’s worth mentioning. Before neural networks, it often relied on grammar rules implemented in software. The neural network era brought BERT-style architectures, which more accurately label word tokens with entity types.

But NER has its limits, particularly with pronouns and coreferences – the “he,” “she,” “it,” “they” problem. LLMs, however, excel at this task.

Beyond TherapyBot: Other Applications

I had another project parsing books from Project Gutenberg. NLP libraries like spaCy were great at identifying capitalised names, but struggled with pronouns and multiple references to the same entity (e.g., “Sally,” “Harry’s wife,” “my wife,” “her”).
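
To show why this matters for fact extraction, here is a small sketch of what happens once the references have been resolved. The alias table is written by hand and stands in for whatever a coreference step (an LLM, in my case) would produce; the names `alias_to_canonical` and `count_mentions` are hypothetical, and mapping a pronoun like “her” to one fixed person only works within a single, known context.

```python
from collections import Counter

# Hand-written stand-in for the output of a coreference resolution step.
alias_to_canonical = {
    "sally": "Sally",
    "harry's wife": "Sally",
    "my wife": "Sally",
    "her": "Sally",  # assumption: only valid for this particular passage
}


def count_mentions(mentions, alias_to_canonical):
    """Count mentions per entity after mapping each alias to its canonical name."""
    counts = Counter()
    for mention in mentions:
        canonical = alias_to_canonical.get(mention.lower(), mention)
        counts[canonical] += 1
    return counts


mentions = ["Sally", "Harry", "Harry's wife", "her", "my wife"]
counts = count_mentions(mentions, alias_to_canonical)
```

Without the alias table, the same passage would yield four separate “entities” for one person, which is roughly the problem I kept hitting.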

The core idea I’m testing with TherapyBot is using LLMs to enhance fact extraction through improved coreference resolution.

The Road Ahead

There’s still plenty to do:

  1. Build the MVP for parsing chat messages and creating the knowledge graph
  2. Develop minimal code for visualising the knowledge graph
  3. Test it on custom conversations
  4. Iterate and improve based on results

Future Extensions

The potential extensions are exciting:

  1. Life Representation: Parse the knowledge graph to develop a representation of a user’s life.
  2. Geographical Rooting: Use place representations to explore how a user’s life is anchored in their immediate surroundings.
  3. Relationship Mapping: Analyse people representations to identify important relationships and potentially help users improve them.

A Philosophical Musing

I have a theory: a lot of modern anxiety stems from our daily exposure to global news and social media. We see possibilities that aren’t actually accessible to us because we’re rooted in specific places and relationships.

My hope is that the ideas in TherapyBot can help root users in their actual lives, focusing on what they can change and finding peace with what they cannot.

Stay tuned for updates as TherapyBot evolves!
