Part 6: What string theory reveals about AI chat models

Published on 19 November 2024

This post is part of the AI Apprenticeship series:

By Dr Anita Lamprecht (supported by DiploAI and Gemini)

AI Apprenticeship online course

In Diplo’s AI Apprenticeship online course, we are now delving deeper into the theory behind the DiploAI chatbots we built. So far, we have concentrated on gaining a solid understanding of the core concepts and terminology needed to build our chatbots. The aim of our apprenticeship is to enable diplomats to utilise chatbots in their profession and provide them with the necessary AI literacy for negotiations on the topic of AI.

A vital part of this literacy involves an awareness of what we should and can know (‘the knowns’) and what we cannot know (‘the unknowns’). Large language models (LLMs) are a type of neural network, and we should know that such networks have an AI ‘black box problem’. This means there is a phase in the entire process that nobody fully understands. According to our course lecturer, Jovan Kurbalija, this known unknown (i.e. the black box problem) is often utilised or misused in dystopian AI narratives about the dangers of AI and artificial general intelligence (AGI).

Just as physicists grapple with the ‘known unknowns’ and even ‘unknown unknowns’ of the universe, we, as AI apprentices, are learning to navigate the complexities of these powerful language models. Sometimes, the most unexpected connections emerge, like finding a string theory metaphor hidden within the universe of AI.

Yarn, string theory, and AI?

Take, for example, our recent exploration of how LLMs store and represent language. To illustrate this, our lecturer used strings – yes, actual pieces of yarn! This seemed a bit odd to me at first, but it reminded me of an article I had read recently about string theory.

String theory is a complex concept in physics that attempts to explain the universe as being composed of tiny, vibrating strings. I recently listened to a fascinating interview with American physicist and professor Leonard Susskind, one of the pioneers of string theory. Surprisingly, he stated, ‘String theory is definitely not the theory of the real world.’ While he acknowledged that string theory offers a compelling framework, there is still much we do not know about whether it accurately reflects reality. Susskind also noted that we do not know if string theory will help us uncover these mysteries and that we remain uncertain about what ‘it’ truly is.

A web of words

Jovan’s version of string theory is much simpler. He uses strings to illustrate what we can know, a crucial aspect for making informed decisions about AI governance and its application in governance. Picture our classroom transformed into a web of interconnected words. We, the apprentices, became nodes, each holding a string that represented a word. The connections between us symbolised the relationships between those words. For example, ‘diplomacy’ might be closely linked to ‘treaty,’ ‘negotiation,’ and ‘ambassador,’ while ‘sanction’ might connect to ’embargo,’ ‘resolution,’ and ‘compliance.’

Suddenly, language is no longer a simple linear sequence of words. It becomes a dynamic, multidimensional network, similar to the complex web of interactions described in string theory. The ‘strings’ represent the probabilistic positioning of words – the likelihood of one word appearing in proximity to another. This is how LLMs learn to generate human-like text: by navigating this complex web of relationships and probabilities.
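The idea of ‘the likelihood of one word appearing in proximity to another’ can be made concrete with a few lines of code. This is a minimal sketch, not how LLMs actually work – the corpus is invented, and real models learn far richer patterns than simple next-word counts – but it shows where such probabilities come from:

```python
from collections import Counter, defaultdict

# A tiny invented corpus; the training data for a real LLM is vastly larger.
corpus = (
    "the ambassador signed the treaty "
    "the ambassador opened the negotiation "
    "the treaty followed the negotiation"
).split()

# Count how often each word is followed by each other word (bigram counts).
follows = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follows[current_word][next_word] += 1

def next_word_probability(word, candidate):
    """Estimated probability that `candidate` appears right after `word`."""
    total = sum(follows[word].values())
    return follows[word][candidate] / total if total else 0.0

# In this corpus, 'treaty' follows 'the' about a third of the time.
print(next_word_probability("the", "treaty"))
```

A model that repeatedly samples the most probable next word from counts like these already produces vaguely sentence-like output; LLMs generalise this idea across billions of parameters instead of a lookup table.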

Layers, dimensions, and a bit of confusion

Our lecturer introduced a new concept that can feel a bit complex at first. He explained that words in AI are connected within something called a ‘high-dimensional vector space’. Think of this as a multi-dimensional map where each word is represented as a point, and their positions show how closely related they are.

A vector space of seven words in three contexts (Hypothesis).

These points, called ‘vectors’, are built using mathematical formulas that capture the relationships between words. You can imagine these as layers of meaning. Each layer highlights a different aspect of how a word relates to other words – like its meaning, context, or how often it appears near another word.

In simpler terms:

  • Vectors in AI: These are like coordinates that help AI organise and process data.
  • What makes up a vector: It’s just a list of numbers (called ‘floating-point numbers’) that shows where the word is located in this multi-dimensional space.

To make this easier to picture, imagine holding physical strings that connect words. These strings represent the relationships between words, like how ‘apple’ connects to ‘fruit’, or ‘king’ connects to ‘queen’. Visualising it this way helps you see how words are related in a space that has many dimensions.
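The ‘king’–‘queen’ connection is the basis of a famous property of word vectors: relationships become arithmetic. The sketch below uses hand-picked three-dimensional vectors so the analogy works exactly; real embeddings learn such structure from data, where it holds only approximately:

```python
# Toy 3-dimensional vectors chosen by hand so the classic analogy works.
words = {
    "king":  [0.9, 0.9, 0.1],
    "queen": [0.9, 0.1, 0.9],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
}

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

# The famous analogy: king - man + woman lands on (or near) queen.
result = add(sub(words["king"], words["man"]), words["woman"])
print(result)
```

In a real model, you would search for the vector nearest to `result` and find ‘queen’; here the toy numbers make the match exact up to floating-point rounding.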

Jovan’s ‘string theory’ is a fantastic way to bring these abstract ideas to life. By physically showing how words connect, he makes it easier to grasp concepts that would otherwise remain stuck in mathematical definitions.

The AI Apprenticeship online course is part of the Diplo AI Campus programme.


