The Making of AI Snake Oil

The CEO of an ageing tech company stands before a rarefied audience of bankers, diplomats and investors at a think-tank in Manhattan and declares, “I want you to think about data as the next natural resource.” It’s like oil and steam, but “the only limit is yourself.”

This was in 2013, and the trope of data as the new oil was already starting to show its age. Coined in 2006 by British entrepreneur Clive Humby, the phrase was adopted with gusto by marketers and analysts of every variety. The breathless hype is perhaps best captured by management consulting firm KPMG: “Data is the new oil — and the price is going up.”

The AI frenzy pervading boardrooms, cascading from CEO to foot soldier, was part of a larger hype cycle: that of the popular imagination.

By 2020, we were to have driverless cars rendering millions jobless, computing hardware to emulate human intelligence, and software models to reverse engineer the brain. These modest milestones were mere pit stops towards the Singularity, that hallowed point at which AI achieves consciousness and far exceeds human intelligence.

In the sci-fi masterpiece Ex Machina, a billionaire with Elon Musk’s flair for grandiosity ruminates:

“One day the AIs are going to look back on us the same way we look at fossil skeletons on the plains of Africa. An upright ape living in dust with crude language and tools, all set for extinction.”

This surreal future, imminent and inevitable, overtook popular culture. It did not take long to reach the boardroom, where executives better suited to compensation committees and sales operations suddenly wanted to “cognify everything.” An ailment for which the AI snake oil was the perfect salve.

The origin story

The math that undergirds much of modern AI traces back to the humble metre. Yes, that’s right: the unit. The metric system was born in the aftermath of the French Revolution so that people could be free from arbitrary measures that the local nobility could change at will. But what should the length of one metre be? The French decided that it would be one ten-millionth of the distance from the North Pole to the equator.

There was a small problem, however. How does one go about measuring it?

That is precisely what Adrien-Marie Legendre was trying to figure out when he discovered something curious. You can describe the world through many equations, but those equations rarely match real observations exactly. This was a known fact at the time. Legendre’s insight was that it would be more elegant to start with the data and derive one equation from them by minimizing the sum of the squared errors. This breakthrough, the mathematical precursor to the prediction machines we call AI, was relegated to an appendix in his 1805 book, New Methods for the Determination of Comet Orbits.
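
In modern notation, the idea is remarkably compact. Given observations (x_i, y_i) and a candidate line y = ax + b, choose the a and b that minimize the total squared error. (This is a sketch in today’s notation; Legendre stated it for general systems of linear equations.)

$$E(a, b) = \sum_{i=1}^{n} \bigl( y_i - (a x_i + b) \bigr)^2$$

This is ordinary least squares, the ancestor of the linear regression quietly running inside much of today’s commercial “AI.”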

Let’s pause here. The math for most commercial AI has been around for 200 years, longer if you include Thomas Bayes’ work on conditional probability. Why, then, this sudden obsession?

To understand this, we need to fast forward a couple of centuries.

The term Artificial Intelligence was coined in 1955 for a workshop at Dartmouth College. Among those in attendance was Marvin Minsky, future recipient of the Turing Award, often referred to as the Nobel Prize for computing. Minsky advocated for a top-down approach to AI in which expertise is encoded as symbolic rules. If one could program enough rules, the theory posited, intelligence would emerge. This approach came to be known as Symbolic AI.

There was a competing approach taking shape at the Cornell Aeronautical Laboratory. Led by research psychologist Frank Rosenblatt, it was based on how neurons interact in the brain. Rosenblatt called his model the Perceptron. In 1958, he demonstrated its feasibility on an IBM 704, a five-ton computer the size of a room.
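
Stripped of its 1950s hardware, the model is disarmingly simple. Here is a minimal sketch of the perceptron learning rule in modern Python; the class and the toy example are mine, for illustration, not Rosenblatt’s original implementation:

```python
import numpy as np

class Perceptron:
    def __init__(self, n_inputs):
        self.w = np.zeros(n_inputs)  # one weight per input "synapse"
        self.b = 0.0                 # firing threshold, folded in as a bias

    def predict(self, x):
        # The "neuron" fires if the weighted sum of inputs crosses the threshold.
        return 1 if np.dot(self.w, x) + self.b > 0 else 0

    def train(self, X, y, epochs=10, lr=0.1):
        # The learning rule: nudge the weights toward each mistake.
        for _ in range(epochs):
            for xi, target in zip(X, y):
                error = target - self.predict(xi)
                self.w += lr * error * xi
                self.b += lr * error

# Toy example: learning the logical AND of two inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
p = Perceptron(n_inputs=2)
p.train(X, y)
print([p.predict(xi) for xi in X])  # -> [0, 0, 0, 1]
```

Each mistake nudges the weights toward the correct answer. That simplicity is also the model’s weakness, as Minsky was about to demonstrate.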

The Perceptron fascinated the public, with the New Yorker calling it a “new electronic brain” and a machine that was “capable of what amounts to thought.”

Minsky was not impressed. He spent much of the 1960s debating Rosenblatt, and in 1969 published Perceptrons, a book dense with theorems, arguing that the model’s utility was severely limited: a single-layer perceptron cannot even learn a function as simple as XOR. His criticism stuck and gradually stifled funding for research in the nascent field of Artificial Neural Networks. The AI winter had begun.

The deep frost started to thaw in the 2000s as computing power caught up.

In 2012, Geoffrey Hinton, a professor at the University of Toronto, led a team that applied neural networks to classify images to within 10% of human-level accuracy. Around the same time, IBM’s Watson beat human champions at Jeopardy. A few years later, Google was translating passages from Ernest Hemingway.

The breakthrough was real. It was called deep learning.

The hype machine

“They’re like cockroaches,” says Ralph Widmann, who runs technology at a sizable publicly listed company. If executives had a spirit animal, Ralph’s would be the night owl: prescient and scathing. Ralph is from an earlier era. An era when leaders were good at something other than just hustle and its related pretension, management.

I’m having dinner with Ralph at a midtown Italian joint in New York City. He is complaining about the management consultants who have taken over the board. Under their influence, the company has imported an executive to unlock billions through AI and IoT. There are PowerPoint decks describing the path to value, with the first hundred million dollars or so to be captured within two fiscal quarters. If it is in a board-level presentation, then it must be right.

Entrepreneur and Next AI alumnus Farrukh Jadoon echoes Ralph’s skepticism of top-down AI mandates from boards and their coterie of executives chasing short-term returns. Farrukh classifies companies into two categories: those that need AI to solve a previously unsolvable problem, and those that do AI because their board, often instigated by strategy consultants, has asked them to. In the first category are path-breaking companies such as Recursion Pharma, which applies deep learning to drug discovery. The latter comprises most of corporate North America.

AI has become big business.

According to market intelligence provider CB Insights, venture funding in AI went from $559M in 2013 to $26B in 2019, a nearly 50-fold increase in six years. AI pixie dust became a valuable commodity in abundant supply. A little sprinkle and you had investors from around the world lining up with chequebooks in hand.

For a real-world example of pixie dust in action, look no further than Engineer.ai.

Founded in 2016 to make bespoke software development “effortless” by creating an “AI powered assembly line,” the start-up raised $29.5M from investors, including Masayoshi Son’s SoftBank. While most companies aim to build tech for AI-assisted work by humans, Engineer.ai’s magic was to invert the logic. Their platform was to bring about the age of human-assisted AI. AI now almost had a sense of agency, building software merely with human assistance.

However, there was one tiny wrinkle in this elaborate yarn. There was no AI. The software projects they took on were built entirely by outsourced developers in India.

Fake it till you make it. Until you can’t.

When corporate executives and marketers think of AI, they often see a cash machine. Data goes in. Money comes out. Unfortunately, science does not abide by fiscal quarters.

Fair weather friend

There is one other inconvenient truth lurking in the shadows of the AI hype machine: the long material shadow cast by deep neural networks.

Much like your infinite scroll on Instagram, connecting to a remote cluster has an ethereal quality. It feels weightless and clean. Until, that is, you look at the vast, dirty supply chain obscured from view. It is not that the dirt and grime and toxic fumes do not exist. It’s just that the exhaust fan is somewhere else.

In a 2019 paper, researchers from the University of Massachusetts quantified the carbon emissions from training complex deep learning systems for natural language processing. The results were alarming. Training a single large neural network can emit up to five times the CO2 a typical car produces over its entire lifetime. Let that sink in. All that steel and aluminum to manufacture. All that gas, idling and exhaust over years of use. Multiply it by five to visualize the environmental cost of training one state-of-the-art model.

In fact, according to research by Neil Thompson at the MIT Computer Science & Artificial Intelligence Laboratory, eking out marginal improvements in accuracy requires throwing exponentially more computing at the problem. Deep learning’s hunger for computational power is far outpacing advances in hardware performance, especially given the “meagre improvements from the last vestiges of Moore’s law.”

Yes, there is research underway, ranging from new paradigms that may exponentially increase computing power to deep learning algorithms that make training and prediction more efficient. However, it will take time for these efforts to bear fruit. In the meantime, we have only nine years to cut our carbon emissions by 45% for the planet to remain habitable, according to the Intergovernmental Panel on Climate Change. The material shadow of our computing infrastructure, of which machine learning is a growing component, can no longer be ignored.

Finding your voice

“I felt completely trapped, a prisoner in my own body.”

These are the words of Joe Morris, a filmmaker from London. After a sore spot on his tongue would not heal, Joe decided to see a doctor. He was 31 years old and a non-smoker. It was supposed to be nothing, but the MRI showed a tumour that would have to be carved out, taking with it Joe’s ability to speak.

In an article in the Guardian, Jordan Kisner narrates Joe’s story and how VocalID, a Boston-based start-up, helps people like him preserve their voice. Founded by Rupal Patel, a speech pathologist and researcher at Northeastern University, VocalID uses machine learning to synthesize a voice that is unique to each individual.

Our voice is an integral part of who we are. We use it to pitch, sing, confront, and comfort. It is an imprint of our identity. Even the briefest of fragments is a window to our temperament, origin and perhaps even class. That is why the loss of one’s voice is an unmooring, a disability that is, at once, tragic and invisible. With VocalID, those robbed of speech can now find a voice of their own.

This is the real wonder of AI: the potential to give voice to the voiceless, bring vision to the blind, mobility to the paralyzed.

Right here, hidden in plain sight, lies a clue to the antidote for AI snake oil.

In his talk “How to recognize AI snake oil,” Princeton professor Arvind Narayanan presents a case where a simple linear regression with a few easily understandable variables is almost as accurate as advanced machine learning algorithms with thousands of features. A technique that has been around for well over a hundred years performs almost as well as the latest whizbangs from the world of AI.
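
The shape of that comparison is easy to reproduce. Below is a toy sketch; the data here is synthetic and invented purely for illustration, whereas Narayanan’s talk draws on real social-prediction studies:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 2000
X = rng.normal(size=(n, 4))               # four easily understandable variables
noise = rng.normal(size=n)                # the irreducible messiness of reality
y = X @ np.array([2.0, -1.0, 0.5, 0.0]) + noise

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

simple = LinearRegression().fit(X_tr, y_tr)           # the 200-year-old idea
fancy = GradientBoostingRegressor().fit(X_tr, y_tr)   # a modern ensemble method

print(f"linear regression R^2: {simple.score(X_te, y_te):.3f}")
print(f"gradient boosting R^2: {fancy.score(X_te, y_te):.3f}")
# On noisy, mostly linear problems the two scores land within a whisker
# of each other; the extra complexity buys almost nothing.
```

When the signal is mostly linear and the noise is large, as it is in many social-prediction problems, the whizbang has nothing left to find.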

Perhaps the boring antidote to AI snake oil, then, is to fall in love with the problem, not the solution. While deep learning may be the best model to predict whether diabetes will lead to blindness, it may not be the right solution for predicting criminal recidivism, or for filtering resumes.

Falling in love with the problem allows us to explore with an open mind, without getting stymied by the hype machine or the vacuous management mantras from the corner office. The work becomes bottom-up and science-driven. It demands patience and grit. It’s measured in meaningful outcomes, not quarterly fiscal metrics.

And from this mundane tenacity, this relentless grind to build meaning, emerges the remarkable.

ACKNOWLEDGEMENTS

Farid, thanks for reviewing this piece with the eye of an editor. Maryam, Aaron, Ali & Mehryar, thank you for the incisive feedback.

NOTES & REFERENCES

While writing this piece, I turned to the most authoritative source of human experience.

“Hey Siri, are you self aware?”

“I think I am, therefore, I might be.”

That, ladies and gentlemen, is pure wisdom.

And now, onto the more corporeal references:

The origin story

As with many scientific breakthroughs, there was high drama. German mathematician Carl Friedrich Gauss claimed priority for the discovery and an intellectual catfight ensued, with a loose reference to “…urinating on the ashes of my ancestors.” It seems the stereotype about ‘the sensitive academic’ has a rich history.

  • How the French Revolution Created the Metric System, National Geographic.
  • The Path to Least Squares: Adrien-Marie Legendre, Bob Rosenfeld, Vermont Math Initiative.

The views expressed in this article are mine alone.