Mr. Spock meets Pythagoras
Over Easter break, I did what probably counts as a very specific kind of relaxation: I disappeared into philosophy, AI, Docker logs, and GPU temperatures, and came back with something called PhiloGPT.
The question.
The original idea was not “let me build a platform.” It was much smaller and more personal than that. I wanted to explore a question that has been sitting in the back of my mind for a while: what happens when you treat dialogue itself as the interface for thinking?
Reading philosophy is one thing. You can read Plato, Marcus Aurelius, Nietzsche, or Kant and feel like you understand the shape of an idea. But dialogue is different. In dialogue, ideas push back. They expose contradictions. They force clarification. They stay with you longer. I wanted to build something that could support that kind of exchange, not as a one-off chatbot gimmick, but as a system I could actually live with and improve.
So I started building.
What the system needed.
At first, it was the usual optimistic version of a side project: a frontend, a backend, a database, a model endpoint, and the naive belief that once the messages were flowing, the hard part was over. It was not over.
The first thing I learned was that good dialogue depends on memory. If every session starts from zero, the conversation stays shallow. So I added persistent client memory, not because “memory” sounds impressive in an AI demo, but because without it, the system kept forgetting the very things that make a real conversation meaningful.
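The idea is simple even if the real implementation is not: every exchange gets written to durable storage keyed by client, and each new session starts by reloading recent history. Here is a minimal sketch of that pattern using SQLite; the class and schema are hypothetical illustrations, not PhiloGPT's actual code.

```python
import sqlite3


class SessionMemory:
    """Minimal sketch of persistent per-client memory.

    Hypothetical schema for illustration: one row per message,
    so a new session can reload where the last one left off.
    """

    def __init__(self, path: str = "memory.db"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "client_id TEXT, role TEXT, content TEXT, "
            "ts DATETIME DEFAULT CURRENT_TIMESTAMP)"
        )

    def remember(self, client_id: str, role: str, content: str) -> None:
        self.conn.execute(
            "INSERT INTO messages (client_id, role, content) VALUES (?, ?, ?)",
            (client_id, role, content),
        )
        self.conn.commit()

    def recall(self, client_id: str, limit: int = 20) -> list:
        # Fetch the most recent messages, then flip them back into
        # chronological order for the model's context window.
        rows = self.conn.execute(
            "SELECT role, content FROM messages WHERE client_id = ? "
            "ORDER BY ts DESC, rowid DESC LIMIT ?",
            (client_id, limit),
        ).fetchall()
        return [{"role": r, "content": c} for r, c in reversed(rows)]
```

The `limit` matters: unbounded recall would eventually blow past the model's context window, so some windowing or summarization policy has to sit on top of a store like this.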
Then I ran into a different problem: models are fluent, but fluency is not the same as grounded knowledge. A philosophical exchange falls apart quickly when names, dates, schools of thought, or historical details get fuzzy. That is why the Wikipedia tool exists. Not for novelty. Not for a flashy checkbox. It exists because the conversation became more honest and more useful once the system could ground itself in facts instead of bluffing.
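A grounding tool like this can be surprisingly small. The sketch below uses Wikipedia's public REST summary endpoint; the function names are my own illustration, and PhiloGPT's actual tool may well do more (disambiguation, caching, fallback search).

```python
import json
import urllib.parse
import urllib.request


def summary_url(title: str) -> str:
    """Build the Wikipedia REST API URL for a page summary."""
    return (
        "https://en.wikipedia.org/api/rest_v1/page/summary/"
        + urllib.parse.quote(title.replace(" ", "_"))
    )


def fetch_summary(title: str, timeout: float = 5.0) -> str:
    """Fetch a short factual extract the model can ground itself in.

    Illustrative sketch only; a production tool would also handle
    missing pages, redirects, and rate limits.
    """
    req = urllib.request.Request(
        summary_url(title),
        headers={"User-Agent": "philo-demo/0.1 (example)"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp).get("extract", "")
```

The point is less the fetch itself than where its output goes: the extract is injected into the model's context as a tool result, so claims about names, dates, and schools of thought have something to stand on.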
Then came structure. Open-ended dialogue can be beautiful, but it can also drift. For more reflective or counseling-style interactions, I needed a way for the system to remember where the conversation was heading, what had already shifted, and what the next step might be. That became the counseling plan tool. Again, not because I wanted more tools, but because the conversation itself kept showing me what was missing.
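Conceptually, the plan is just a small piece of state carried between turns: a goal, what has already shifted, and what to try next. A hypothetical sketch of that shape, not PhiloGPT's actual data model:

```python
from dataclasses import dataclass, field


@dataclass
class CounselingPlan:
    """Illustrative sketch of per-conversation direction-keeping:
    where the dialogue is heading, what shifted, what comes next."""

    goal: str
    insights: list = field(default_factory=list)    # what has already shifted
    next_steps: list = field(default_factory=list)  # candidate directions

    def record_insight(self, note: str) -> None:
        self.insights.append(note)

    def advance(self) -> str:
        # Pop the next planned step; fall back to revisiting the goal
        # when the plan runs dry, so the dialogue never just drifts.
        return self.next_steps.pop(0) if self.next_steps else "revisit the goal"
```

Serialized alongside the session memory, a structure like this is what lets the system pick up a reflective conversation where it left off instead of restarting it.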
And because some tasks need explicit reasoning rather than beautifully improvised guessing, I added a sandboxed System2 tool for constrained logic and code execution. That made the system feel less magical and more inspectable, which I increasingly think is the healthier direction for AI systems in general.
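The core of such a tool is running untrusted, model-generated code somewhere it cannot do damage and cannot run forever. A minimal sketch of the idea, assuming a separate interpreter process with a hard timeout; real sandboxing (and presumably PhiloGPT's) would add filesystem, network, and memory restrictions, for example via containers:

```python
import subprocess
import sys


def run_sandboxed(code: str, timeout: float = 3.0) -> str:
    """Execute model-generated Python in an isolated child process.

    Sketch only: -I runs the interpreter in isolated mode (no site
    or user paths), and the timeout bounds runaway computation.
    """
    proc = subprocess.run(
        [sys.executable, "-I", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode != 0:
        # Surface the failure to the model instead of hiding it,
        # so it can correct its own reasoning.
        return f"error: {proc.stderr.strip()}"
    return proc.stdout.strip()
```

Feeding both successes and errors back into the dialogue is what makes the reasoning inspectable: the model's explicit computation, and its failures, are visible artifacts rather than hidden guesses.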
When it grew up.
Somewhere along the way, the architecture grew up. What started as “something on my machine” turned into a proper deployment. The stack now runs through Docker, with the public-facing traffic coming in through a Synology reverse proxy, while a separate GPU server with an RTX 4090 and 24 GB VRAM runs Ollama locally for model inference. That gave me something I cared about from the start: control. Control over the infrastructure, control over the data path, control over the models, and control over cost.
That local-first setup also changed the feel of the project. Once Ollama was in the loop, this stopped being “yet another app that forwards prompts to a cloud API” and became something I could actually shape end to end. At the same time, I did not want to hardwire the whole system to one provider, so I added a provider abstraction layer. That means I can run local models when privacy and cost matter most, and still switch to OpenAI-compatible endpoints when capability matters more. The same runtime, different tradeoffs.
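What makes this abstraction cheap is that Ollama exposes an OpenAI-compatible chat endpoint, so "local" and "cloud" can share one request shape and differ only in base URL, model name, and credentials. A sketch of that idea, with illustrative names rather than PhiloGPT's actual layer:

```python
import json
import urllib.request
from dataclasses import dataclass


@dataclass
class Provider:
    """One runtime, swappable backends: a local Ollama server or any
    OpenAI-compatible API. Illustrative sketch, not the real layer."""

    base_url: str
    model: str
    api_key: str = ""  # empty for local inference

    def _request(self, messages: list) -> urllib.request.Request:
        # Both backends accept the same /v1/chat/completions body.
        body = json.dumps({"model": self.model, "messages": messages}).encode()
        headers = {"Content-Type": "application/json"}
        if self.api_key:
            headers["Authorization"] = f"Bearer {self.api_key}"
        return urllib.request.Request(
            self.base_url + "/v1/chat/completions", data=body, headers=headers
        )

    def chat(self, messages: list) -> str:
        with urllib.request.urlopen(self._request(messages), timeout=60) as resp:
            return json.load(resp)["choices"][0]["message"]["content"]
```

Switching tradeoffs then becomes a configuration change: something like `Provider("http://localhost:11434", "llama3")` for privacy and cost, or an OpenAI-compatible base URL plus an API key when capability matters more.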
Where it got real.
The less glamorous part, and probably the more important part, was learning where systems actually break. Latency was one lesson. A single tool call could easily add enough delay to ruin the feeling of dialogue. One Wikipedia path was taking far too long, and tracing that through the stack led to a satisfying but humbling discovery: I was doing extra work because I had designed the flow badly, not because the universe was against me.
Another lesson was production drift. Something that behaved perfectly locally turned into a visible bug in production because an LLM config field was never actually being persisted. That was a good reminder that software only becomes real when it survives contact with deployment.
That is also why I cared about versioned seed patches and steered updates. I did not want every deployment to feel like rolling dice. So the project now carries its own evolution path: schema changes, default data updates, and configuration updates are version-managed and applied deliberately instead of through hopeful manual steps. I suspect the MLOps folks will appreciate that this part gave me almost as much satisfaction as the model work. It turns out I enjoy the intersection where model behavior, product design, and operational discipline all have to work together.
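The mechanism behind deliberate, repeatable updates is old and simple: every schema or seed change gets a version number, the database records which version it is at, and deployment applies only the patches above that mark, in order. A minimal sketch of that pattern (hypothetical patches and table names, not PhiloGPT's real migration set):

```python
import sqlite3

# Hypothetical ordered patches: schema changes and seed data together,
# each applied exactly once, in version order.
PATCHES = {
    1: "CREATE TABLE personas (id INTEGER PRIMARY KEY, name TEXT)",
    2: "INSERT INTO personas (name) VALUES ('Socratic guide')",
    3: "ALTER TABLE personas ADD COLUMN system_prompt TEXT DEFAULT ''",
}


def apply_patches(conn: sqlite3.Connection) -> int:
    """Apply all pending patches; return the resulting version."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (v INTEGER)")
    row = conn.execute("SELECT MAX(v) FROM schema_version").fetchone()
    current = row[0] or 0
    for version in sorted(PATCHES):
        if version > current:
            conn.execute(PATCHES[version])
            conn.execute("INSERT INTO schema_version (v) VALUES (?)", (version,))
            current = version
    conn.commit()
    return current
```

Because the applied version lives in the database itself, re-running the deployment is a no-op rather than a dice roll, which is exactly the property that makes updates feel steered instead of hopeful.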
What it became.
What I like most about the project is that it still feels personal even though it is now much more than a toy. It started as a nerdy Easter-break experiment about mind, memory, and philosophy in dialogue. It turned into a self-hosted multi-persona AI platform with real-time chat, tool use, deployment discipline, and a cleaner separation between product ideas and infrastructure than I expected when I began.
If you want to try it.
If you want to try it, it is live here:
And if you want to look at the code or contribute, the repository is here:
https://github.com/trueal82/philoGPT
I am sharing it partly because I think it is genuinely interesting, and partly because projects like this get better when other people poke at them, challenge the assumptions, and suggest directions I would not have thought of on my own.
If any of that resonates.
So if you are into AI systems, dialogue design, local inference, MLOps, philosophy, or just slightly overengineered holiday projects, I would be happy to compare notes.