
What Do We Currently Know About OpenAI’s Q* Project?

OpenAI is reportedly developing a breakthrough AI system called Q* to solve complex math problems, and it has researchers both excited and concerned about progress toward AGI.

Neil Regole

Updated April 30, 2024

DALL-E as an astronaut, generated with DALL-E 3, upscaled with Moonlight


After the recent OpenAI drama, speculation has centered on a new model called Q* that is believed to excel at high-level reasoning and complex math problems. It allegedly has a team of researchers concerned that it may pose a threat to humanity.

The Q* project is said to have potential uses in groundbreaking scientific research, and some think systems like it might one day surpass human intelligence. But what exactly is the Q* project and what does it mean for the future of AI?

After Tons Of Speculation, Here's What We Found:

Q* is an internal project at OpenAI that some believe could be a breakthrough towards artificial general intelligence (AGI). It’s focused on efficiently solving complex mathematical problems.
Some think the name "Q*" points to quantum computing as a way to harness the processing power needed for AGI, but others think the "Q" refers to Q-learning, a reinforcement learning algorithm.
Some speculate that Q* is a small model that has shown promise on basic math problems, and that OpenAI expects scaling it up could allow it to tackle highly complex problems.
Q* may be a module that interfaces with GPT-4, helping it reason more consistently by offloading complex problems onto Q*.
While intriguing, details on Q* are very limited and speculation is high. There are many unknowns about the exact nature and capabilities of Q*. Opinions differ widely on how close it brings OpenAI to AGI.

What Is The Q* Project?

OpenAI researchers have reportedly developed a new AI system called Q* (pronounced Q-star) that displays an early ability to solve basic math problems. While details remain scarce, some at OpenAI reportedly believe Q* represents progress towards artificial general intelligence (AGI), meaning AI that can match or surpass human intelligence across a wide range of tasks.

However, an internal letter from concerned researchers raised questions about Q*'s capabilities and whether core scientific issues around AGI safety had been resolved prior to its creation. This apparently contributed to leadership tensions, including the brief departure of CEO Sam Altman before he was reinstated days later.

During an appearance at the APEC Summit, Altman made vague references to a recent breakthrough that pushes scientific boundaries, now thought to refer to Q*. So what makes this system so promising? Mathematics is considered a key challenge for advanced AI. Existing language models generate text by statistically predicting what comes next, so the same question can yield inconsistent answers. But mathematical reasoning requires precise, logical answers every time. Developing those skills could unlock new AI potential and applications.

While Q* represents uncertain progress, its development has sparked debate within OpenAI about the importance of balancing innovation and safety when venturing into unknown territory in AI. Resolving these tensions will be critical as researchers determine whether Q* is truly a step toward AGI or merely a mathematical curiosity. Much work will most likely be required before its full capabilities are revealed.

What Is Q-Learning?

If the speculation is right, the Q* project builds on Q-learning, a model-free reinforcement learning algorithm that determines the best course of action for an agent based on its current circumstances. The “Q” in Q-learning stands for quality, which represents how effective an action is at earning future rewards.
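
To make the idea of a "quality" score concrete, here is a minimal sketch of the standard tabular Q-learning update. It illustrates the textbook algorithm only and says nothing about how Q* itself works. The function name q_update, the state names, and the alpha (learning rate) and gamma (discount) values are all assumptions chosen for illustration.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Best value the agent currently believes it can get from the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Target: the immediate reward plus the discounted best future value.
    target = reward + gamma * best_next
    # Move the old estimate part of the way toward the target.
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


q = defaultdict(float)        # every Q-value starts at 0.0
actions = ["left", "right"]   # hypothetical action set

# Hypothetical experience: moving right from s0 reaches s1 and earns a reward of 1.
new_value = q_update(q, "s0", "right", 1.0, "s1", actions)
print(new_value)
```

Each call nudges one entry of the table toward the reward plus the discounted best value of the next state. Repeated over many experiences, the table comes to rank actions by how much future reward they tend to earn.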

Reinforcement learning algorithms are commonly classified into two types: model-based and model-free. Model-based algorithms use transition and reward functions (a description of how the environment responds to each action) to estimate the best strategy, whereas model-free algorithms learn from experience without using those functions.
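
The difference can be sketched on a toy problem. The two-state world below, the TRANSITION and REWARD tables, and both function names are hypothetical and exist only for illustration. The first function plans by reading the tables directly (model-based), while the second only ever sees recorded experience (model-free).

```python
# Hypothetical toy world: from "start", the action "go" reaches "goal" (reward 1),
# while "stay" remains at "start" (reward 0). "goal" is terminal.
TRANSITION = {("start", "go"): "goal", ("start", "stay"): "start"}
REWARD = {("start", "go"): 1.0, ("start", "stay"): 0.0}
ACTIONS = ["go", "stay"]
GAMMA = 0.9  # discount factor (assumed value)


def plan_with_model(sweeps=50):
    """Model-based: read TRANSITION and REWARD directly to compute state values."""
    value = {"start": 0.0, "goal": 0.0}
    for _ in range(sweeps):
        value["start"] = max(
            REWARD[("start", a)] + GAMMA * value[TRANSITION[("start", a)]]
            for a in ACTIONS
        )
    return value


def learn_from_experience(experience, alpha=0.5):
    """Model-free: never reads TRANSITION or REWARD, only observed samples."""
    q = {("start", a): 0.0 for a in ACTIONS}
    for state, action, reward, next_state in experience:
        # Unknown (terminal) states contribute a future value of 0.
        best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
        q[(state, action)] += alpha * (reward + GAMMA * best_next - q[(state, action)])
    return q


planned = plan_with_model()
# Recorded (state, action, reward, next_state) samples, replayed 20 times.
experience = [("start", "go", 1.0, "goal"), ("start", "stay", 0.0, "start")] * 20
learned = learn_from_experience(experience)
print(planned["start"], learned[("start", "go")], learned[("start", "stay")])
```

Both approaches end up preferring "go", but the model-free learner gets there purely from samples, which is why it needs many repetitions of the same experience.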

Algorithms also differ in what they learn. In the value-based approach, the algorithm learns a value function that recognizes which situations are more valuable and, from that, which actions to take. In contrast, the policy-based approach directly trains the agent on which action to take in a given situation.

Off-policy algorithms evaluate and update a strategy that is not the one used to take action. On the other hand, on-policy algorithms evaluate and improve the same strategy used to take action. To make this more concrete, think about an AI playing a game.

Value-Based Approach: The AI learns a value function to evaluate the desirability of various game states. For example, it may assign higher values to game states in which it is closer to winning.
Policy-Based Approach: Rather than focusing on a value function, the AI learns a policy for making decisions. It learns rules such as "If my opponent does X, then I should do Y."
Off-Policy Algorithm: The AI plays using one strategy (for example, one that sometimes tries random moves to explore) while evaluating and improving a different strategy, typically the best one it has found so far. This lets it learn from moves that its preferred strategy would not have made.
On-Policy Algorithm: On the other hand, an on-policy algorithm would evaluate and improve the same strategy it used to make moves. It learns from its actions and makes better decisions based on the current set of rules.

Value-based AI judges how good situations are. Policy-based AI learns which actions to take. Off-policy learning improves a different strategy from the one making the moves. On-policy learning improves the same strategy that is making the moves.
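
The off-policy versus on-policy distinction shows up clearly when two well-known value-based algorithms are compared: Q-learning (off-policy) and SARSA (on-policy). The sketch below computes the learning target each would use for the same experience. The Q-table, state and action names, and the gamma value are hypothetical.

```python
def q_learning_target(q, reward, next_state, actions, gamma=0.9):
    """Off-policy: assume the best next action, whatever the agent really does next."""
    return reward + gamma * max(q[(next_state, a)] for a in actions)


def sarsa_target(q, reward, next_state, next_action, gamma=0.9):
    """On-policy: use the next action the agent's current strategy actually chose."""
    return reward + gamma * q[(next_state, next_action)]


# Hypothetical Q-table for one next state with two possible moves.
q = {("s1", "safe"): 1.0, ("s1", "risky"): 0.0}
actions = ["safe", "risky"]

# Suppose the agent, while exploring, picks the "risky" move next.
off_policy = q_learning_target(q, 0.0, "s1", actions)
on_policy = sarsa_target(q, 0.0, "s1", "risky")
print(off_policy, on_policy)
```

Q-learning scores the situation as if the best move will follow, while SARSA scores it by the move that was actually taken, so the two targets differ whenever the agent explores.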

AI Vs AGI: What's The Difference?

While some regard Artificial General Intelligence (AGI) as a subset of AI, there is an important distinction between them.

AI Is Based on Human Cognition

AI is designed to perform cognitive tasks that mimic human capabilities, such as predictive marketing and complex calculations. These tasks can be performed by humans, but AI accelerates and streamlines them through machine learning, ultimately conserving human cognitive resources. AI is intended to improve people’s lives by facilitating tasks and decisions through preprogrammed functionalities, making it inherently user-friendly.

General AI Is Based on Human Intellectual Ability

General AI, also known as strong AI, aims to provide machines with intelligence comparable to humans. Unlike traditional AI, which makes pre-programmed decisions based on empirical data, general AI aims to push the envelope, envisioning machines capable of human-level cognitive tasks. This is a lot harder to accomplish, though.

What Is The Future Of AGI?

Experts are divided on the timeline for achieving Artificial General Intelligence (AGI). Some well-known experts in the field have made the following predictions:

Louis Rosenberg of Unanimous AI predicts that AGI will be available by 2030.
Ray Kurzweil, Google’s director of engineering, believes that AI will surpass human intelligence by 2045.
Jürgen Schmidhuber, co-founder of NNAISENSE, believes that AGI will be available by 2050.

The future of AGI is uncertain, and research toward this goal is ongoing. Some researchers don’t even believe that AGI will ever be achieved. Ben Goertzel, an AI researcher, emphasizes the difficulty of objectively measuring progress, citing the various paths to AGI with different subsystems.

A systematic theory is lacking, and AGI research is described as a “patchwork of overlapping concepts, frameworks, and hypotheses” that are sometimes synergistic and sometimes contradictory. Sara Hooker of research lab Cohere for AI stated in an interview that the future of AGI is a philosophical question. For now, artificial general intelligence remains a theoretical concept, and AI researchers disagree on when it will become a reality: some believe AGI is impossible, while others believe it could be accomplished within a few decades.

Should We Be Concerned About AGI?

The idea of machines surpassing human intelligence rightly causes apprehension about relinquishing control. And while OpenAI claims the benefits outweigh the risks, recent leadership tensions reveal fears even within the company that core safety issues are being dismissed in favor of rapid advancement.

What is clear is that the benefits and risks of AGI are inextricably connected. Rather than avoiding potential risks, we must confront the complex issues surrounding the responsible development and application of technologies such as Q*. What guiding principles should such systems incorporate? How can we ensure adequate safeguards against misappropriation? To make progress on AGI while upholding human values, these dilemmas must be addressed.

There are no easy answers, but by engaging in open and thoughtful dialogue, we can work to ensure that the arrival of AGI marks a positive step forward for humanity. Technical innovation must coexist with ethical responsibility. If we succeed, Q* could catalyze solutions to our greatest problems rather than worsening them. But achieving that future requires making wise decisions today.

The Q* project has reportedly demonstrated impressive capabilities, but we must consider the possibility of unintended consequences or misuse if this technology falls into the wrong hands. If Q*'s reasoning is as complex as reported, even well-intentioned applications could result in unsafe or harmful outcomes.
