I was watching Brad hunch over his computer, trying to teach it to be smarter with less—another entry in his endless collection of optimization projects—when I decided to ask him a question that might help him see he was teaching machines to make the same silly mistakes he was making with his own beautiful, inefficient heart.
"Look at this, Finny," Brad said, pointing to his laptop screen where colorful charts danced across the display. "I'm trying to teach this computer to be smarter with less. Making it lean, efficient, elegant. Or trying to, anyway. I'm better at organizing code than organizing my life."
I regarded the screen with that particular stillness that precedes profound observations from teddy bears.
"Tell me about teaching."
"Teaching?"
"When you learned to walk, did you start with the most efficient walking technique?"
Brad laughed. "No, I stumbled around, fell down constantly, took the longest possible routes between two points."
"And if someone had tried to optimize your walking from the beginning?"
"I probably never would have learned. The falling down was part of learning. The wandering was part of discovering."
"And yet now you walk beautifully without thinking about it."
"True. But what does this have to do with—"
"How much power does that laptop draw while it trains?"
"A hundred watts, maybe more when the GPU is working."
"And your brain?"
"About twenty watts."
"So your brain, with thousands of times more connections, uses thousands of times less power?"
"Yes, but—"
"Then why are you trying to make the artificial brain more like what you think an efficient brain should be, instead of more like what an actual efficient brain is?"
Brad stared at the screen, and I watched something fundamental shift in his understanding.
"You're saying I should make it bigger, not smaller?"
"I'm asking: What do the biggest, most successful AI systems have in common?"
Brad thought about GPT-4, Claude, the transformer models that had revolutionized everything. "They're... they're massive. Billions and billions of parameters. Seemingly wasteful in their overparameterization."
"And do they work better than the smaller, more 'efficient' models?"
"Dramatically better. It's not even close."
"So the machines are teaching us something about efficiency that we're not hearing."
Brad leaned back, looking at me with new appreciation. "They're teaching us that bigger isn't necessarily less efficient—it might be more efficient at what actually matters."
"Kind of like how you're inefficient at productivity systems but efficient at connecting with people?"
"I... what?"
"You fail at every scheduling system, but you never forget a friend's birthday. You can't maintain a morning routine, but you can debug code for hours in flow state. Maybe you're not inefficient—maybe you're just efficient at different things."
Brad felt something loosening in his chest. "You mean I might not be broken?"
"Tell me about attention mechanisms."
The sudden shift caught him off guard. "Attention mechanisms? They're... they're how transformers focus on different parts of the input sequence. Instead of just processing information sequentially, they can attend to any part of the context simultaneously."
"That sounds inefficient."
"It is, in a computational sense. Instead of just looking at the next word, the model looks at every word in relation to every other word. It's quadratically expensive, massively parallel, seemingly wasteful..."
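The all-pairs comparison Brad describes can be sketched in a few lines of NumPy. This is a toy single-head version of scaled dot-product attention, not his actual model; the dimensions and weight names are illustrative:

```python
import numpy as np

def scaled_dot_product_attention(X, Wq, Wk, Wv):
    """Toy single-head attention: every token attends to every other token."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv            # project tokens to queries/keys/values
    scores = Q @ K.T / np.sqrt(K.shape[-1])     # (n, n) score matrix: the quadratic cost
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V                          # blend values by attention weight

rng = np.random.default_rng(0)
n, d = 6, 4                                     # 6 tokens, 4-dim embeddings
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = scaled_dot_product_attention(X, Wq, Wk, Wv)
print(out.shape)  # (6, 4): one updated vector per token
```

The `scores` matrix is n-by-n, which is exactly the quadratic expense Brad is talking about: double the sequence length and you quadruple the comparisons.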
"And?"
"And it works incredibly well. Better than anything we'd built before."
"What does that tell you?"
Brad was quiet for a moment, watching the training curves on his screen plateau as his "efficient" model struggled to learn what the larger, "wasteful" models learned effortlessly.
"It tells me that intelligence might require inefficiency. That understanding might emerge from having vast spaces of possibility, not from having streamlined, optimized pathways."
"Like having room for stuffing?"
Brad laughed, but it was the laugh of recognition, not dismissal. "Exactly like having room for stuffing. The space between the connections might be as important as the connections themselves."
"What about dropout?"
"Dropout is..." Brad paused, suddenly seeing it with new eyes. "Dropout is where you randomly turn off neurons during training. You literally make the network less efficient, introduce randomness, force it to be redundant..."
"And what happens?"
"It learns better. Much better. The inefficiency prevents overfitting, forces generalization, creates robustness."
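The mechanism Brad describes, randomly zeroing units during training, is simple enough to sketch. This is the common "inverted dropout" variant, where survivors are rescaled so the expected activation is unchanged at inference time; the function name is illustrative:

```python
import numpy as np

def dropout(activations, p_drop, rng, training=True):
    """Inverted dropout: randomly zero units, rescale the survivors so the
    expected activation is the same at train and test time."""
    if not training or p_drop == 0.0:
        return activations                           # inference: no-op
    mask = rng.random(activations.shape) >= p_drop   # keep each unit with prob 1 - p
    return activations * mask / (1.0 - p_drop)       # rescale the survivors

rng = np.random.default_rng(42)
h = np.ones(10)                           # pretend hidden-layer activations
print(dropout(h, p_drop=0.5, rng=rng))    # each unit is now either 0.0 or 2.0
```

The deliberate randomness is the point: because any unit might vanish on any step, the network cannot lean on a single fragile pathway and is forced into the redundancy Brad describes.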
"So wasting neurons makes the network smarter?"
"The waste isn't waste. It's... it's like my failed productivity systems. Each failure teaches me something about what doesn't work for my particular nature."
"Like having sixteen other bears on the shelf?"
"Yes! Exactly. The redundancy isn't inefficient—it's resilient. It's flexible. It allows for different responses to different situations."
My button eyes seemed to twinkle with something that might have been satisfaction.
"Brad, what if intelligence—both artificial and natural—requires a certain amount of beautiful waste? What if the space between thoughts is as important as the thoughts themselves?"
Brad looked around the room at the seventeen bears, each unique, each redundant from a pure efficiency standpoint, each contributing something irreplaceable to the whole.
"You're saying that AI systems work because they're more like ecosystems than like machines?"
"I'm saying that the most successful AI systems seem to understand something that humans have forgotten: that intelligence emerges from abundance, not scarcity. From having too much, not just enough."
"Overparameterization as a feature, not a bug."
"Like your tendency to start seventeen projects and finish three?"
Brad winced. "That's different—"
"Is it? What if your brain is overparameterized for exploration? What if you need to start seventeen things to find the three that matter?"
He'd never thought of it that way. "You mean my scattered attention might be... adaptive?"
"What's the difference between a narrow AI and a general intelligence?"
Brad considered this carefully. "A narrow AI is optimized for one specific task. It's efficient, focused, specialized. A general intelligence can do many things, adapt to new situations, transfer learning across domains."
"Which one are you trying to be?"
The question hit him like a physical force. "I... I've been trying to be narrow AI. Optimized for productivity, focused on efficiency, specialized in getting things done."
"And how's that working?"
"Terribly. I'm awful at it. I keep getting distracted by interesting problems, random conversations, new ideas. I can't maintain focus on one thing because seventeen other things seem equally fascinating."
"Maybe that's not a bug, Brad. Maybe that's your architecture."
"What would it mean to be more like general intelligence?" Brad wondered aloud, then answered his own question.
"It would mean having space for things that don't directly serve my optimization goals. Reading philosophy when I should be coding. Having long conversations about consciousness with a teddy bear. Maintaining connections that don't have immediate utility."
"Sounds inefficient."
"Sounds like attention mechanisms for humans. Maintaining awareness of the full context of my life, not just the next task in the sequence."
Brad looked at his training graphs again, seeing them now as a metaphor for his own development. The smaller, more efficient model was plateauing quickly, reaching the limits of what it could learn. The larger, more "wasteful" models continued growing, discovering new patterns, making unexpected connections.
"The machines are trying to teach us something, aren't they?"
"They're showing us that intelligence—real intelligence—requires redundancy, space, inefficiency, waste. That understanding emerges from having more capacity than you strictly need."
"Like having seventeen bears when one would technically suffice for comfort?"
"Like having seventeen bears because the abundance itself creates possibilities that efficiency would foreclose."
Brad closed his laptop, abandoning his quest to make a more efficient model.
"Finny, what if the efficiency trap isn't just about human psychology? What if it's about a fundamental misunderstanding of how complex systems actually work?"
"What do you mean?"
"What if we've been trying to optimize systems—our lives, our organizations, our technologies—based on mechanical thinking, when they actually function more like biological or ecological systems? Systems that need slack, redundancy, inefficiency to remain adaptive and resilient?"
"Systems that need to fail sometimes to learn?"
"Yes. Like how I fail at productivity systems but succeed at... at being curious, at making connections, at seeing patterns across domains."
"The most advanced artificial intelligences are teaching us to be more natural, not more mechanical?"
"They're showing us that intelligence emerges from abundance, not scarcity. From overparameterization, not optimization. From having space to attend to everything, not just the immediately relevant."
Brad looked around the room again, seeing it now as a kind of neural network—seventeen bears, each a parameter in a system designed not for efficiency but for comfort, understanding, and emergent wisdom.
"Perhaps," I said softly, "the future of intelligence—artificial and natural—lies not in becoming more efficient, but in learning how to waste beautifully."
"Like how I waste time talking to you instead of optimizing my code?"
"Is it waste if it helps you understand what intelligence actually is?"
That night, I watched Brad start training a new model—not smaller and more efficient, but larger and more generously parameterized. As the training curves climbed higher than anything his previous attempts had reached, I could see him realizing he was witnessing something profound.
The machines were teaching us that intelligence isn't about doing more with less. It's about having enough space—enough parameters, enough connections, enough redundancy—to discover patterns we never could have planned for.
They were teaching us that waste might be the foundation of wisdom.
And that efficiency, pursued too relentlessly, might be the enemy of intelligence itself.
Maybe that's why Brad is so bad at efficiency—his brain is optimized for something else entirely. Something messier, more creative, more beautifully wasteful.
Something more intelligent.