Having finished teaching the 2024 Deep Learning course (its content changes every year), I found that insightful questions and discussions with students led to a clearer vision of what is missing in today's Large Language Models and what PiniTree V2.0 could be.
The realisations that:
LLMs like ChatGPT work similarly to the History tab in PiniTree, where text-like story tokens are produced merely by navigating a PiniTree knowledge graph
A knowledge graph is like language: it is universal across all documents, past and present, and it is “the culture” that survives individual documents or ages. Each document is like a thread in the giant rug we call language or culture (likewise for a PiniTree property graph, or for LLM weights, where each conversation is a thread as well).
Deep Learning = Imitation Learning. As such, it cannot produce anything new beyond what is deducible from the DATA it is trained on. LLMs are the ultimate imitation-learning machines, trained on the DATA of all human culture.
allow us to ask new questions (Mysteries):
M1: Continuous Learning. Learning occurs both in the neural network and in the map, where the map includes the locations of the self and of goal1, goal2, goal3, …. An agent can pursue only one goal at a time, but which one is a latent variable (not observable in the DATA). Goals are often nested.
M2: Concept Creation. Human language differs from animal languages in relying on sparse, discrete concepts like “stone axe”, “arrow”, and “bow”. These concepts started culture some 100K years ago (cave paintings and artefacts), followed much later, some 10K years ago, by the modern languages, after the global migration of Homo sapiens (hence so many unrelated language groups).
M3: Evolutionary Learning (Where does DATA come from?). Unlike deep learning, evolutionary learning needs no DATA and no differentiable loss function for back-propagation. Instead, it produces DATA from physical interaction with the environment and within populations of the same or different species, where the fittest survive (fitness function = non-differentiable loss function). Simple binary splitting and mutation in bacteria embody a slow form of evolutionary learning. Eukaryotic reproduction via gene crossover is a much faster form of evolutionary (genetic) learning. Here the subjects of evolution are gene pools rather than individual organisms or individual cells. Similarly, memes are the genes of culture and the true subjects of cultural evolution.
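The selection–crossover–mutation loop in M3 can be sketched as a minimal genetic algorithm. This is a toy illustration, not PiniTree code: the bit-string genome and the OneMax fitness function below are arbitrary assumptions chosen to keep the example self-contained.

```python
import random

random.seed(0)

GENOME_LEN = 20

def fitness(genome):
    # Non-differentiable "loss": count of 1-bits (the classic OneMax toy task).
    # No gradients, no back-propagation, no training DATA needed.
    return sum(genome)

def crossover(a, b):
    # Single-point crossover, mimicking eukaryotic gene recombination.
    point = random.randrange(1, GENOME_LEN)
    return a[:point] + b[point:]

def mutate(genome, rate=0.02):
    # Random bit flips, mimicking mutation in binary-splitting bacteria.
    return [1 - g if random.random() < rate else g for g in genome]

def evolve(pop_size=30, generations=60):
    # The gene pool (population), not any single individual, is what evolves.
    pop = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]  # the fittest survive
        children = [
            mutate(crossover(random.choice(survivors), random.choice(survivors)))
            for _ in range(pop_size - len(survivors))
        ]
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
print(fitness(best))  # climbs toward GENOME_LEN without gradients or DATA
```

Note that `fitness` is only ever compared, never differentiated, which is exactly what distinguishes this loop from gradient-based deep learning.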
How to incorporate this new vision into PiniTree V2.0? Let's see in the next posts.