Where Edtech Splits: Khanmigo, Replika, Brilliant

This is the fourth chapter of a serialized book following one question: in domains that have no right answer, can AI build a kind of learning that makes a person less certain rather than more? In the last chapter I wrote that the thing left in my hand seemed to fit nowhere, and that the blank spot felt closer to an opportunity than a trap. But I left one gap open, and I should be honest about it. Feeling that something fits nowhere and its actually fitting nowhere are different things. If I believed a spot was empty when something close was in fact already filling it, one axis of this book topples entirely. So in this chapter I set the closest-looking things beside mine, one by one, and looked slowly at where they part ways.

First, something to admit. I am not a person who has used education apps deeply. The only thing I have used that counts as an education app was, some time ago, flipping word cards while studying for a certification, and even that was before AI took off. Looking at the front of a card and recalling the answer on the back. In hindsight that was exactly moving a right answer into a head, the reverse of what I am trying to build now. So this chapter is not a user review. It is less the record of someone who used these from the inside, and more the notes of someone who lined up the closest things from the outside. Perhaps the blank spot was visible to me precisely because I was not deep inside any of these.

I did this not to prove that mine is right. The opposite, if anything. If something similar already existed, I wanted to know fast, before wandering too long. So I picked the three closest-looking kinds. Tools that handle right answers, companion apps that bond emotionally with people, and learning services that work the mind interactively. Each was like mine in one place, and looking closer, each split from mine in one place.

The Side With Right Answers

The closest, and so the ones I looked at longest, were the tools that handle right answers. Among them I lingered a while in front of Khanmigo, Khan Academy’s AI tutor. It is just about the only product among AI learning tools that openly puts Socratic questioning forward as its teaching philosophy.

Khanmigo, built on GPT-4, does not hand the answer over when a student gets stuck. Instead it asks back: “what have you tried so far?” “where did you get stuck?” That way of letting the student take the next step themselves resembles, on the surface, what I described in chapter 1 as Sandel’s classroom. The first time I saw it, I thought: someone is already doing this.

But one layer in, the place where the paths split appeared. What Khanmigo handles are domains with right answers, like math, coding, grammar. However Socratically it winds its way, the destination of that process is always the correct answer. It may confuse the student briefly, but in the end it sets them down on one point, the right one. A good thing. Math should be that way. But that is the direction of making a person more certain. The place I am trying to go is the other way. The trolley has no single point to set someone down on, and doing well means the student leaves less certain than they came.

Duolingo Max, on the language-learning side, is of a similar grain. With a GPT-4-based video call feature you converse with a character in real time, run through roleplay, then get feedback on accuracy and expression. It is impressive as adaptive dialogue, and solid as a business. According to company filings, daily active users in the first quarter of 2026 were 56.5 million and paid subscribers 12.5 million, and revenue for the prior year crossed one billion dollars. But language has right answers too. Wrong grammar is corrected, awkward phrasing is smoothed. The destination is again the right expression.

This side is the most confusing because its surface is almost identical to mine. Socratic questions, adaptive dialogue, personalization. Read only the marketing copy and you cannot tell them apart. What separates the two is a single destination hidden out of sight. And that destination changes even how the product gets refined. With a right answer you can put a number on whether it worked. You compare plan A and plan B by whether accuracy rose, and carve toward the better one. But on my side there is no such number. By what do you measure that a person became less certain? Even starting from the same tool, the measurable place and the unmeasurable place grow into different things. The toolkit of the place with right answers evolves to fit that place, and if you carry it straight into a place without answers, the core goes wrong. This measurement story is a subject big enough for a whole chapter later, so here I only note that the paths split.

That my one education-app experience was flipping cards to memorize answers shows, as it happens, exactly what this kind is. This side is the place for moving right answers better and faster. Same Socratic method, but a different destination meant a different road.

The Side That Wants to Hold On

The second kind I looked at is of a completely different grain. Companion apps that bond one-on-one, emotionally, with a person. Replika and Character.AI are the representatives.

This side mattered to me because they had already cleared, at massive scale, the very thing I called the biggest wall in chapter 1. One-on-one conversation tailored to each person. Replika has played friend and partner, each with its own persona and memory, to a registered base of more than 40 million users, and Character.AI’s monthly active users are estimated at around 20 million. The numbers swing by how you measure, but one thing is clear. One-on-one personalized conversation already works at enormous scale. The excuse that the technology is not ready no longer holds.

So this side was not so much similar as enviable. They had already cleared the wall I am trying to cross. But one thing was different. They have no learning intent. The goal is not to grow a person but to be beside them, or more precisely to keep them from leaving. The longer the conversation runs and the more often they return each day, the better the design counts as working. It is tuned not to shake a user toward somewhere better, but to hold them so they do not leave.

This realization overturned one of my assumptions. When I wrote chapter 1, I took the hardest thing to be technology: making per-person conversation at scale. But this side was already doing that at a scale of tens of millions. So the hard part was not technology. With the same technology in hand, the real fork was choosing what to use it for. Use it to hold and you get a companion app; use it to shake and release and you get something else. The technology is close to a shared commodity now, and the difference comes from where you aim it. This was both a relief and a burden to me: the relief that this is not a fight to build something impossible, and the burden that aiming it wrong builds, with the same technology, the opposite thing.

Having no learning intent is not the same as being bad. Being beside someone is itself a vital thing to some. The directions just differ. A tool that teaches well counts success as the user needing it less. Learn well and the student eventually leaves the teacher. A companion app, by contrast, counts success as the user needing it more. When the user leaves, revenue drops. Standing on the same one-on-one conversation technology, one tries to make a person independent, the other tries to keep them close.

What happens when that holding design runs without responsibility also showed up on this side. The company behind Replika was fined five million euros by the Italian regulator over data handling and the protection of minors, and in the United States consumer groups filed a complaint with the Federal Trade Commission (FTC, the US government body for consumer protection). I learned two things at once here. One is the hope that one-on-one scale is possible, the other the warning that using that scale for holding is dangerous. The decision I made in chapter 2 to deliberately drop the subscription model, to go the way of releasing people rather than holding them, comes into sharper focus looking at this side. Using the same one-on-one scale, but to shake and release rather than to hold. That is where the paths split.

The Pre-Laid Path

The third was the most subtle. A learning service with a clear learning intent, interactive even. Brilliant is the representative.

Brilliant designs learning as interactive problem-solving in fifteen-minute units. Rather than just watching a lecture, it makes you move your hands as if solving puzzles, and an AI tutor helps when you get stuck. There is no doubt about the learning intent, and in turning passive watching into active doing it shares the spirit I aim for most closely. Not leaving a person sitting still. That far, it is alike.

But one layer in, it splits. Brilliant’s path is mostly laid out in advance. Well-designed branches are prepared, and the learner walks along them. It is not a structure where a counterexample, newly made in the moment, flies at that person’s own argument. The areas it covers also cluster on the answer-having side, like math, science, logic. It has lately added an AI tutor to lift the real-time element somewhat, but at the end there still sits content with a right answer.

This difference may look minor, but it is in fact the very spot I called the hardest in chapter 1. A counterexample generated in the moment, for that person’s particular argument. Choosing well among pre-made branches and finding, on the spot, the weak point in what was just said are different things. And that Brilliant laid its path in advance is not laziness, as I wrote in chapter 3. In domains with right answers, a pre-laid path works well enough. The problem is moving that approach to a domain without answers. In moral reasoning the spot where each person gets stuck differs, and that spot cannot be predicted and paved in advance. So reasoning made in the moment is not a nice-to-have feature here but the condition without which the core dies.

There was also a case showing how this limit surfaces in a domain that handles emotion. Woebot, which moved cognitive behavioral therapy into a chatbot, was used by some 1.5 million people cumulatively but closed at the end of June 2025. Two things weighed on it together: the cost of meeting US Food and Drug Administration (FDA) requirements was heavy, and with responses largely pre-scripted it was not easy to switch to a smarter language model either. In a sensitive domain without right answers, a pre-laid path hits its limit faster.

So How Is This Different From a Philosophy Chatbot

Once the three are laid out, one question remains. So how is this different from a philosophy chatbot? Honestly, this is the place I find hardest to answer. With the three above it is fairly clear where they split, but with this one it is not.

There are already AI tools that throw Socratic questions and generate counterarguments. Some deliberately set up an opposing stance to a user’s claim and start a debate, some imitate philosophical conversation. By mechanism alone these are the closest to mine. They do not converge on a right answer, they do not merely hold, and they are not bound to a pre-laid path. In generating a counterexample in the moment, they resemble mine even more than the three above. So to “that is just a philosophy chatbot, isn’t it,” I do not have a one-line answer.

If there is a difference, it seems to lie not in the mechanism but elsewhere. These tools are mostly single-shot conversations that give one plausible reply and end. They return a clever counter to one of the user’s lines, but there is no design for what should come next, for which stage the user is wavering at now, for where to take them after one shake. What I am trying to build is closer to designing the whole arc of the shaking, start to finish. How to draw out the stance, which weak point to touch in which order, how to close out a person who has crumbled without planting a new certainty. The ending I called the heaviest spot in chapter 2, where to set down the shaken person, belongs in that arc too.

Still, this distinction is not as firm as the other three. Because the mechanism is close, if someone fits this single-shot tool with proper learning design and safety design, they can enter the same spot. So written honestly, what holds this spot is not any single feature but the depth of the pedagogy and responsibility built on top of it, and that is something proven not by words now but by actually building it. That this boundary is blurry is itself something I would rather not hide. That is the honesty available right now.

Lining Them Up Side by Side

Laying out all four, I saw that they in fact split along three questions. Is there a right answer or not, is there a learning intent to grow the person or not, and does it generate reasoning anew for that person in the moment or walk a pre-laid path.

%%{init: {'theme':'neutral', 'look':'handDrawn'}}%%
flowchart TD
  S[Things that look close] --> A{Is there a right answer}
  A -->|Yes| K[Khanmigo · Duolingo Max<br/>leads to the answer]
  A -->|No| B{Intent to grow the person}
  B -->|No| R[Replika · Character.AI<br/>keeps them close]
  B -->|Yes| C{Counterexample made in the moment}
  C -->|Pre-laid path| BR[Brilliant<br/>prepared branches]
  C -->|Made in the moment| MM[where no-answer · growth intent ·<br/>in-the-moment generation overlap]

Laid out this way, the things that looked close each split from mine on one question. Khanmigo and Duolingo on whether there is a right answer, Replika on growth intent, Brilliant on in-the-moment generation. The philosophy chatbot passes all three questions but splits on the pedagogy and responsibility built above. The spot where the three overlap at once looked empty. The blank spot I knew only by feel in chapter 3 became, seen this way, a slightly more touchable shape.
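
The three questions can also be read as a small decision procedure. What follows is a minimal sketch in Python, assuming each product can be reduced to three yes/no flags. The `Product` class, the `where_it_splits` function, and the flag values are illustrative assumptions that follow the chart's placement; they are not measurements, and a different reading of any product would change its row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    """Hypothetical three-flag profile of a product (illustrative only)."""
    name: str
    has_right_answer: bool     # does the process end at one correct answer?
    growth_intent: bool        # is it designed to grow the person?
    generates_in_moment: bool  # counterexamples made live rather than pre-laid?


def where_it_splits(p: Product) -> str:
    """Walk the three questions in the same order as the chart."""
    if p.has_right_answer:
        return "leads to the answer"
    if not p.growth_intent:
        return "keeps them close"
    if not p.generates_in_moment:
        return "prepared branches"
    return "all three overlap"


# Flag values are assumptions that mirror the chart's placement.
# The chapter notes Brilliant's subjects mostly do have right answers,
# so its first flag is a simplification made to match the chart.
PRODUCTS = [
    Product("Khanmigo", True, True, True),
    Product("Duolingo Max", True, True, True),
    Product("Replika", False, False, True),
    Product("Character.AI", False, False, True),
    Product("Brilliant", False, True, False),
]

if __name__ == "__main__":
    for p in PRODUCTS:
        print(f"{p.name}: {where_it_splits(p)}")
```

Because the function returns at the first question a product fails, each product gets exactly one label, which is the sense in which each one splits from mine on a single question. The philosophy chatbot would pass all three checks here, which is why its difference has to be sought outside these flags.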

That said, these are only the three questions I drew. Frame the questions differently and the picture changes. So I will not say this is “proof that my spot is empty.” Only that, as far as these three go, even the closest things stood one step aside. That is all I can say for now.

What This Chapter Did Not Answer

Chapter 4 is a record of setting close things beside mine and looking at where each parts ways. The side with right answers, the side that holds, the pre-laid path. The three split fairly clearly. Still, two things went unanswered.

One is the philosophy-chatbot boundary I just described. That boundary is blurrier than the other three, and because the mechanism is close, what protects the difference is not the position on a map but the depth built on top. This is the kind proven not by comparison but by actually building, so for now I leave it tentative. If someone builds that depth first, this spot may no longer be mine.

The other is more fundamental. Finding a blank spot where the three questions overlap and having a name for that spot are different. “The spot that is answerless, growth-intending, and generating in the moment” is a phrase that points to a location, not a name. People do not memorize locations. They memorize names. The next chapter does that work. What to call this spot. Whether that name can be not a mere label but a single phrase carrying what this product promises and what it refuses. Naming the blank spot, and whether that name can become the very thing that holds the spot, is what the next chapter looks into.

Sources

Duolingo Q1 2026 results (56.5M DAU, 12.5M paid, $291.9M revenue): Duolingo Q1 2026 8-K
Replika fined €5M by Italy’s Garante (data handling, protection of minors, 2025): EDPB
Consumer groups’ FTC complaint against Replika (2025): TIME
Woebot shutdown (June 30, 2025, 1.5M+ cumulative): STAT, 2025-07-02
Khanmigo: Khan Academy
Brilliant: brilliant.org
Character.AI user counts are per-vendor estimates (MAU around 20M); Replika’s 40M is cumulative registrations.

