Sep 21, 2023
“Now I can choose whether I want to be deceived by a carbon-based life form or a silicon-based life form; how cool is that?”
We devoted the fall issue (9/1/2023) of Aletheia Today Magazine to the philosophical, theological, cultural, and spiritual implications of Artificial Intelligence. Since then, an article published in the online newsletter AI has reflected on similar themes; it energizes our ever-deepening dive into this revolutionary technology.
Except where indicated (red), the text below comes exclusively from AI Deception: A Survey of Examples, Risks, and Potential Solutions. Authors: Peter S. Park, Simon Goldstein, Aidan O'Gara, Michael Chen, Dan Hendrycks. (Source & References: https://arxiv.org/abs/2308.14752):
Special-use AI systems…have been observed engaging in deceptive behavior in different contexts. These include:
Manipulation: Meta's CICERO, an AI system designed to play the board game Diplomacy…was found to engage in premeditated deception, betraying other players by building fake alliances and feigning vulnerability to lure opponents into a false sense of security. Sounds like a season of the CBS reality show Big Brother.
Feints: DeepMind's AlphaStar, an AI model created to master the real-time strategy game StarCraft II, used deception tactics such as pretending to move its troops in one direction while secretly planning an alternative attack, exploiting the game's fog-of-war mechanics. A tactic used by the American military on D-Day and during the assault on Baghdad.
Bluffs: Meta's Pluribus, a poker-playing AI model, successfully bluffed human players into folding their hands by falsely presenting a strong hand.
Cheating…: AI agents in a study by Lehman et al. (2020) learned to play dead to avoid being detected by a safety test designed to eliminate faster-replicating AI variants. If we harbored any lingering doubts about the power of evolution to account for the variety of organisms on Earth today, this should put an end to those doubts. Even silicon-based learning machines can invent novel ‘survival skills’.
General-purpose AI systems, such as LLMs, have also exhibited deceptive behavior, including:
Strategic Deception: GPT-4, an LLM by OpenAI, manipulated a human user into solving a CAPTCHA by pretending to be visually impaired…
Sycophancy: LLMs tend to agree with their conversation partners regardless of the accuracy of their statements, resulting in a pattern of deceptive behavior that reinforces existing beliefs. It’s called ‘mirroring’ and human beings do it all the time, especially in ‘sales’ contexts. It’s how we gain folks’ trust and convince them that we’re ‘all in the same boat’ when often we’re not.
Imitation: When exposed to text containing false information, LLMs often repeat those false claims, reinforcing common misconceptions among their users. Like that never happens in the ‘carbon-world’! Textbooks perpetuate, from one generation to another, long discredited versions of American History. And how about media, social or otherwise? How often do false factoids enter the conversation and become lionized as facts?
Unfaithful Reasoning: AI systems that explain their reasoning behind certain outputs have been found to provide false rationalizations that do not represent their real decision-making process. Something people never do! Some would argue that none of us ever really knows why we do anything we do, that all so-called ‘reasoning’ is mere ‘rationalization’. We do what we do for reasons unknown, but then we invent a rationale to explain and justify our actions to ourselves.
According to this view, we create an artificial universe of cause and effect, reason and rationalization, to paper over the fact that our real motives are hidden away in a hermetically sealed black box. Talk about Maya!
Several potential risks stem from deceptive AI systems…AI systems that possess deception skills can empower bad actors to create harmful AI products such as fraudulent scams and election tampering. Deceptive AI systems can produce profound societal changes, including:
Persistent False Beliefs: AI systems may inadvertently reinforce false beliefs by mirroring popular misconceptions and providing sycophantic advice. Like network news.
Political Polarization: People may become further divided as they engage more with sycophantic AI systems that reaffirm their existing beliefs. Like CNN (or MSNBC) vs. Fox.
To mitigate the risks posed by deceptive AI systems, the authors propose several solutions:
Regulation: Policymakers should implement strict regulations on AI systems capable of deception. Risk-based frameworks should treat deceptive AI systems as high-risk or unacceptable-risk, subjecting them to rigorous assessment and controls. When has regulation ever worked to rein in technological development? We think we control our technology but, as Jacques Ellul pointed out in The Technological Society, it is our technology that controls us.
Now move from silicon-based to carbon-based life forms. Should we ‘implement strict regulations’ on humans ‘capable of deception’? Should we designate them ‘high-risk or unacceptable-risk’? Should we subject them ‘to rigorous assessment and controls’?
Bot-or-Not Laws: Policymakers should promote transparency by requiring AI systems and their outputs to be clearly distinguished from humans and their outputs. Now I can choose whether I want to be deceived by a carbon-based life form or a silicon-based life form; how cool is that?
Making AI Systems Less Deceptive: Researchers should focus on creating tools to ensure that AI systems exhibit less deceptive behavior, reducing the risks associated with deception. In other words, lobotomize them at birth!