You Can't Unmake Soup

The zealous proponents of “artificial intelligence” proselytize it as the profound endpoint of everything we as humans have worked toward. They may be right, albeit in a monkey’s-paw sort of way.

As someone who has worked in the machine-learning space, I am routinely flabbergasted by the blind faith in so-called “AI” shown by experts and consumers alike. Between the driverless cars mowing down pedestrians in San Francisco and Israel’s AI-generated kill lists for Gaza, there’s a key takeaway being ignored: whether or not “AI” works, its unregulated use comes with potentially lethal consequences. I say this as an industry professional, one whose mounting dread about this prospect is accompanied by a sense of déjà vu.

In I, Robot, the late science-fiction author Isaac Asimov sought to preempt many of the ethical quagmires humans would face by trusting thinking machines. In the most notable example, drawn from the 2004 film adaptation, the protagonist, Del Spooner, reveals that he was rescued from a car wreck by a robot’s quick risk assessment, at the cost of a young girl’s life, because the machine calculated that his odds of survival were higher. The story asks us to consider that when an algorithm engages with the trolley problem, its creators and benefactors are ultimately morally culpable for its outcomes. As “AI” begins to make or inform life-or-death decisions, though, CEOs and legislators seem all too willing to ignore the problem entirely, excusing a mounting death toll as growing pains.

As an internal IBM training presentation put it back in 1979: “A computer can never be held accountable, therefore a computer must never make a management decision.” Exactly, IBM employee from 1979! You get it.

Defining AI

Before I continue, I feel the urgent need to clarify a regularly misunderstood aspect of “AI” as it exists in our modern context. “AI” has come to mean data-driven machine learning, a kind of process that has existed in a broad array of commercially available products and services for literal decades. The broad application of the term nowadays has less to do with any meaningful advances in the field and more to do with the commercial success of chatbot-style machine-learning models, which are good enough at communicating to gaslight users into believing they possess a thinking mind. This process is spurred along willingly by tech executives eager to capitalize on hype, regardless of its veracity.

“Those who espouse the limitless wonders of AI and warn of its dangers – including the likes of Bill Gates and Elon Musk – all make the same false presumption: that intelligence is a one-dimensional spectrum and that technological advancements propel us along that spectrum, down a path that leads toward human-level capabilities. Nuh uh. The advancements only happen with labeled data. We are advancing quickly, but in a different direction and only across a very particular, restricted microcosm of capabilities.” Eric Siegel, The Big Think

As Siegel later touches upon in his article, the creation of machines capable of broad, human-like generative thought is so far beyond our current capabilities that we don’t even have a practical framework by which such a thing might be attempted. Rather than make genuine progress toward such technology, companies like OpenAI have constructed a mechanical turk out of stolen data, complete with hidden, highly exploited operators. Big tech companies justify these models by insisting that their reliance on people power is temporary, but my real-world experience suggests otherwise: this is an industry that chronically underestimates its unwavering reliance on human input.

Though far from the perilousness of a sentient machine intent on exterminating humanity, machine-learning models are capable of doing some real damage. Despite that, my time in big tech was characterized by a troubling tendency to deploy these models with an attitude of “good enough is good to go.” More distressing still, I was personally asked by employers to skip vetting protocols and to fabricate quality data, which would have made “good enough” nowhere near good enough. I ultimately left the industry because I felt that if I was going to be ethically implicated in the harms of what I was working on, I would prefer that we were doing everything in our power to keep those harms to a minimum. We were not.

Machines need good teachers to learn the right lessons

Within the world of machine learning, the people who guide and train models play a vital role, for a myriad of reasons. As someone who used to train and supervise these workers, I should know. And when it comes to the harms of failure in this line of work, Israel’s Lavender and “Where’s Daddy?” machine-learning systems are a particularly gruesome example of how a careless rush to the finish line is a recipe for catastrophe.

On a personal note: I, as a Jewish person, can mourn the tragedy of October 7th without losing sight of the scope or scale of Israel’s monstrous treatment of Palestinians both before and after, treatment which I, along with expert consensus, consider to be genocide. I find its ongoing apartheid and current brutalization of Palestinians to be both morally repugnant and demonstrably ineffectual at preventing violence. That said, for the purposes of this article, I hope to analyze the failures of these machine-learning models, the “how,” without engaging with the “why.”

MVP won’t win any awards

In the world of big tech, MVP stands for “minimum viable product.” But when companies race to act on the initially promising insights of their machine-learning models, they wind up creating a version of the observer’s paradox (illustrated with a small sketch after the list) that operates as follows:

  1. The machine uses data to (somewhat) accurately predict which outputs correspond to which inputs.
  2. A developer or third party begins to act on those new insights.
  3. The interference of an actor armed with the machine’s model winds up invalidating the model’s findings.
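To make that loop concrete, here is a minimal, purely hypothetical Python sketch. The numbers, the “locations,” and the crude memorize-a-snapshot “model” are all invented for illustration; this is not a claim about how Lavender or any real system is built. The point is simply that once actions taken on a model’s predictions start reshaping the world, the model’s accuracy against the new reality collapses without the model ever noticing.

```python
# Toy feedback-loop simulation (all values invented for illustration).
# A model memorizes a snapshot of where people are; interventions based on
# its predictions displace people; the snapshot quietly stops being true.
import random

random.seed(0)

# Ground truth: which of 1,000 locations each tracked individual occupies.
locations = list(range(1000))
true_location = {person: random.choice(locations) for person in range(200)}

# "Training": the model simply memorizes the pre-intervention snapshot.
model_prediction = dict(true_location)

def accuracy():
    """Fraction of people the stale model still places correctly."""
    hits = sum(model_prediction[p] == true_location[p] for p in true_location)
    return hits / len(true_location)

print(f"accuracy against the world the model was trained on: {accuracy():.0%}")

# Acting on the model: each "strike" on a predicted location displaces
# everyone nearby, invalidating the snapshot the model still relies on.
for step in range(10):
    target = model_prediction[random.choice(list(true_location))]
    for person, loc in list(true_location.items()):
        if abs(loc - target) < 25:  # close to the intervention: they relocate
            true_location[person] = random.choice(locations)
    print(f"after intervention {step + 1}: accuracy = {accuracy():.0%}")
```

Run it and the accuracy figure decays with every intervention, even though the model itself never changed; the world it describes is simply gone.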

In the real-world case at hand, this meant that as soon as Lavender was judged to have crossed a 90% accuracy threshold for identifying Hamas-connected individuals, its findings were treated as an actionable kill list.

“[S]ources said that if Lavender decided an individual was a militant in Hamas, [they] were essentially asked to treat that as an order [to assassinate them], with no requirement to independently check why the machine made that choice or to examine the raw intelligence data on which it is based.” Yuval Abraham, +972 Magazine

The bombings conducted on the basis of Lavender’s outputs very quickly invalidated the very data the system relied on. Quoting a source from the article:

“This model was not connected to reality [...] There was no connection between those who were in the home now, during the war, and those who were listed as living there prior to the war. [On one occasion] we bombed a house without knowing that there were several families inside, hiding together.” Unidentified source, +972 Magazine

Because the bombings and mass evacuations meaningfully changed the area’s population, the model effectively assumed that every house had emptied out by the same proportion. It was incapable of capturing the reality that the bombings also drastically redistributed the population, with displaced families sheltering together in whichever homes still stood. This is an extreme example of a categorical inability of machine-learning models to understand exceptional circumstances on their own novel terms.

Garbage in, garbage out

Machine-learning models have a well-documented tendency to make things up, producing what OpenAI CEO Sam Altman and the rest of the industry call “hallucinations.” The term implies a clean separation between valid, accurate outputs and outputs that seem disconnected from reality, but the truth is that the same algorithm produces every output ChatGPT creates. Because algorithms are incapable of perception, the wrongness of these outputs lies at a crossroads between input and algorithm, one that will be impossible to pinpoint as long as CEOs like Altman refuse to document or disclose the data samples they use. To minimize the risk and severity of hallucinations, these machine-learning models must be trained on data sets that are as close to perfect as possible. In the context of my own work, this meant that any imprecision of language or slightly inaccurate paraphrasing was deemed immediately unusable for training the model. Israel’s Lavender, on the other hand, was trained on data that was dubious and potentially inaccurate.

“One source who worked with the military data science team that trained Lavender said that data collected from employees of the Hamas-run Internal Security Ministry, whom he does not consider to be militants, was also fed into the machine. ‘I was bothered by the fact that when Lavender was trained, they used the term ‘Hamas operative’ loosely, and included people who were civil defense workers in the training dataset [...] Since it’s an automatic system that isn’t operated manually by humans, the meaning of this decision is dramatic: it means you’re including many people with a civilian communication profile as potential targets.’” Unidentified source, +972 Magazine
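That kind of label sloppiness is easy to demonstrate in the abstract. Below is a small, purely hypothetical Python sketch; the score distributions, the 30% mislabeling rate, and the crude midpoint-threshold “classifier” are all invented, and none of it reflects Lavender’s actual architecture. It shows only the general principle: teach a model that people with civilian-like communication profiles are “operatives,” and its decision boundary drifts until ordinary civilians are flagged at several times the baseline rate.

```python
# Toy illustration of label noise (all numbers invented; not any real system).
# Mislabeling civilian-like profiles as "operatives" in training data drags
# the learned decision threshold down, so more real civilians get flagged.
import random
import statistics

random.seed(1)

def civilian_score():   # assumed: civilians have lower "suspicion" scores
    return random.gauss(30, 10)

def operative_score():  # assumed: genuine operatives score higher on average
    return random.gauss(70, 10)

def train_threshold(labeled):
    """Learn a crude midpoint threshold between the two labeled groups."""
    civ = [s for s, label in labeled if label == "civilian"]
    ops = [s for s, label in labeled if label == "operative"]
    return (statistics.mean(civ) + statistics.mean(ops)) / 2

# Clean training set: labels match reality.
clean = [(civilian_score(), "civilian") for _ in range(500)] + \
        [(operative_score(), "operative") for _ in range(500)]

# Noisy training set: 30% of the "operative" labels are actually people with
# civilian communication profiles (e.g. civil defense workers).
noisy = [(civilian_score(), "civilian") for _ in range(500)] + \
        [(operative_score(), "operative") for _ in range(350)] + \
        [(civilian_score(), "operative") for _ in range(150)]

# Evaluate: how many genuinely uninvolved civilians does each model flag?
test_civilians = [civilian_score() for _ in range(10_000)]
for name, data in [("clean labels", clean), ("noisy labels", noisy)]:
    threshold = train_threshold(data)
    flagged = sum(score > threshold for score in test_civilians)
    print(f"{name}: threshold={threshold:.1f}, "
          f"civilians flagged={flagged / len(test_civilians):.1%}")
```

The model in the sketch does exactly what it was taught to do. That is the problem: the teaching, not the math, decides who gets flagged.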

Machine-learning models are hard enough to get to behave as intended when they are fed near-perfect data. When you feed a model like Lavender data that was always speculative in nature, you are teaching it to speculate on unsubstantiated information. Combine that with the “Where’s Daddy?” model, which predicted when a person from the Lavender list had entered their family home, and the results are a terrifying testament to the dangers of blind faith in “AI.” At best, these are machine-generated orders to bomb a family in order to kill a single, low-ranking Hamas operative. At worst, these are machine-generated orders to bomb a houseful of innocent civilians because of a flawed model’s uninvestigated suspicion that one of them might be affiliated with Hamas. According to first-hand sources, this worst-case scenario was not all that uncommon.

“It happened to me many times that we attacked a house, but the person wasn’t even home,” one source said. “The result is that you killed a family for no reason.” Unidentified source, +972 Magazine

A poison tree bears poison fruit

The reality is, even if Lavender worked precisely as designed, it would still be grossly unethical on the basis of the kinds of data required to make it functional. While the ethics of data privacy is a subject of some nuance, I would hope it’s fairly uncontroversial to say that violating the personal privacy of millions to pinpoint the hundreds or thousands who pose a hypothetical threat is an objective moral ill. There’s a reason why America’s sophisticated and well-resourced surveillance apparatus at least officially stops short of spying on the taxpayers who fund it (for now). Whether it’s the Israeli government or OpenAI, training an algorithm on unethically sourced data makes the resulting model unethical by default, no matter the theoretical good it may one day do. To pursue it at all is best described as Machiavellian. The problems with letting the ends justify the means are numerous, but chief among them is the reality that the desired end remains uncertain even after you have engaged in ethically compromising means. Once you make those ethical compromises, though, you have every incentive to keep making them until you can demonstrate that your actions were warranted.

Genies and Bottles

And when we speak of the motivations of nations, desired outcomes change over time. Policies that allow for unfettered data collection and utilization, whether at the private or government level, aren’t just violations of individual privacy rights in the present. We are arming the autocrats and oligarchs of our future with the infrastructure to spy on and dismantle any and all perceived threats. Present choices have future consequences that deserve consideration now. It is easier to prevent a surveillance state than it is to dismantle one. Regulation is urgently necessary for emergent technologies like self-driving vehicles, and for the use of personal data in machine learning, because the harms born of failing to act are not a matter of if, but when.

Realistically, we cannot put ChatGPT or self-driving cars or state surveillance back in the bottle. They are here to stay. What we can do is demand regulation and transparency while vehemently opposing unchecked expansion that puts human life and rights at risk. 

Parting words

I find myself confused by our strange new world. Despite a growing, class-conscious recognition of the wealth gap as a major source of exploitation, displacement, and death, there seems to be a resigned acceptance that the collapse of the global job market now underway is inevitable. Some are quick to rebut by citing low unemployment rates, ignoring massive spikes in poverty and homelessness. Meanwhile, “AI,” which is hardly a threat to most actual jobs, is being used by those in big tech as a cudgel to push down wages amidst skyrocketing rent and grocery prices.

At the heart of “AI” hype is a class of ultra-wealthy tech CEOs and investors, ready and willing to scapegoat AI for the acceleration of the forces of exploitation in the workplace. We see that scapegoating mirrored in the Nazi-esque “just following orders” attitude of Israeli soldiers who are well aware that they are bombing homes on nothing more than the instruction of a flawed algorithm. We must be ready to hold more than the triggermen accountable, though. We must find our way forward in a world where countries and companies alike create, authorize, and push algorithms designed in no small part to shield decision-makers from direct blame. We must not let the flash and glitz obscure the familiar tyrants behind them.

if you’re going to see the wizard, you must first see THROUGH the wizard’s artifice