Illustration: Petra Péterffy
OpenAI’s latest artificial intelligence (AI) system dropped in September with a bold promise. The company behind the chatbot ChatGPT showcased o1 — its latest suite of large language models (LLMs) — as having a “new level of AI capability”. OpenAI, which is based in San Francisco, California, claims that o1 works in a way that is closer to how a person thinks than do previous LLMs.
The release poured fresh fuel on a debate that’s been simmering for decades: just how long will it be until a machine is capable of the whole range of cognitive tasks that human brains can handle, including generalizing from one task to another, abstract reasoning, planning and choosing which aspects of the world to investigate and learn from?
Such an ‘artificial general intelligence’, or AGI, could tackle thorny problems, including climate change, pandemics and cures for cancer, Alzheimer’s and other diseases. But such huge power would also bring uncertainty — and pose risks to humanity. “Bad things could happen because of either the misuse of AI or because we lose control of it,” says Yoshua Bengio, a deep-learning researcher at the University of Montreal, Canada.
The revolution in LLMs over the past few years has prompted speculation that AGI might be tantalizingly close. But given how LLMs are built and trained, they will not be sufficient to get to AGI on their own, some researchers say. “There are still some pieces missing,” says Bengio.
What’s clear is that questions about AGI are now more relevant than ever. “Most of my life, I thought people talking about AGI are crackpots,” says Subbarao Kambhampati, a computer scientist at Arizona State University in Tempe. “Now, of course, everybody is talking about it. You can’t say everybody’s a crackpot.”
Why the AGI debate changed
The phrase artificial general intelligence entered the zeitgeist around 2007 after its mention in an eponymously named book edited by AI researchers Ben Goertzel and Cassio Pennachin. Its precise meaning remains elusive, but it broadly refers to an AI system with human-like reasoning and generalization abilities. Fuzzy definitions aside, for most of the history of AI, it’s been clear that we haven’t yet reached AGI. Take AlphaGo, the AI program created by Google DeepMind to play the board game Go. It beats the world’s best human players at the game — but its superhuman qualities are narrow, because that’s all it can do.
The new capabilities of LLMs have radically changed the landscape. Like human brains, LLMs have a breadth of abilities that have caused some researchers to seriously consider the idea that some form of AGI might be imminent, or even already here.
This breadth of capabilities is particularly startling when you consider that researchers only partially understand how LLMs achieve it. An LLM is a neural network, a machine-learning model loosely inspired by the brain; the network consists of artificial neurons, or computing units, arranged in layers, with adjustable parameters that denote the strength of connections between the neurons. During training, the most powerful LLMs — such as o1, Claude (built by Anthropic in San Francisco) and Google’s Gemini — rely on a method called next token prediction, in which a model is repeatedly fed samples of text that has been chopped up into chunks known as tokens. These tokens could be entire words or simply a set of characters. The last token in a sequence is hidden or ‘masked’ and the model is asked to predict it. The training algorithm then compares the prediction with the masked token and adjusts the model’s parameters to enable it to make a better prediction next time.
The process continues — typically using billions of fragments of language, scientific text and programming code — until the model can reliably predict the masked tokens. By this stage, the model parameters have captured the statistical structure of the training data, and the knowledge contained therein. The parameters are then fixed and the model uses them to predict new tokens when given fresh queries or ‘prompts’ that were not necessarily present in its training data, a process known as inference.
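To make the training loop concrete, here is a deliberately tiny sketch of next-token prediction. A learned bigram table stands in for a real transformer, and whole words stand in for tokens; these are simplifications for illustration, not how production LLMs are built.

```python
import numpy as np

# Toy corpus, chopped into 'tokens' (here, whole words).
corpus = "the river bank flooded and the bank closed".split()
vocab = sorted(set(corpus))
tok_id = {t: i for i, t in enumerate(vocab)}
V = len(vocab)

# Parameters: one adjustable logit per (current token, next token) pair.
# A real LLM has billions of parameters; this table is the smallest model
# that still 'predicts the next token'.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(V, V))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Training: mask the second token of each two-token chunk, predict it,
# compare the prediction with the masked token and nudge the parameters.
for epoch in range(200):
    for cur, nxt in zip(corpus[:-1], corpus[1:]):
        i, j = tok_id[cur], tok_id[nxt]
        probs = softmax(W[i])      # the model's guess for the masked token
        grad = probs.copy()
        grad[j] -= 1.0             # gradient of the cross-entropy loss
        W[i] -= 0.1 * grad         # adjust parameters to predict better next time

# Inference: the parameters are now fixed; given a fresh prompt, predict the next token.
prompt = "river"
probs = softmax(W[tok_id[prompt]])
print(prompt, "->", vocab[int(np.argmax(probs))])   # prints: river -> bank
```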
The use of a type of neural network architecture known as a transformer has taken LLMs significantly beyond previous achievements. The transformer allows a model to learn that some tokens have a particularly strong influence on others, even if they are widely separated in a sample of text. This permits LLMs to parse language in ways that seem to mimic how humans do it — for example, differentiating between the two meanings of the word ‘bank’ in this sentence: “When the river’s bank flooded, the water damaged the bank’s ATM, making it impossible to withdraw money.”
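The mechanism behind this is self-attention: every token computes a weighted mixture of every other token’s representation, with the weights learnt during training. A minimal numpy sketch, with random matrices standing in for the learned projections and no causal mask, shows the shape of the computation:

```python
import numpy as np

rng = np.random.default_rng(1)

tokens = ["when", "the", "river", "bank", "flooded", "the", "bank", "closed"]
d = 16                                   # embedding size (tiny, for illustration)
X = rng.normal(size=(len(tokens), d))    # stand-ins for learned token embeddings

# Learned projection matrices (random here) map embeddings to queries, keys and values.
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv

# Scaled dot-product attention: each token attends to every other token,
# however far apart they sit in the sequence.
scores = Q @ K.T / np.sqrt(d)
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)
output = weights @ V                     # new, context-dependent token representations

# Attention weights for the second 'bank' (index 6) over the whole sentence:
print(list(zip(tokens, weights[6].round(2))))
```

In a trained model, those learnt weights are what would let each occurrence of ‘bank’ draw on different context words, producing the two distinct readings.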
This approach has turned out to be highly successful in a wide array of contexts, including generating computer programs to solve problems that are described in natural language, summarizing academic articles and answering mathematics questions.
And other new capabilities have emerged along the way, especially as LLMs have increased in size, raising the possibility that AGI, too, could simply emerge if LLMs get big enough. One example is chain-of-thought (CoT) prompting. This involves showing an LLM an example of how to break down a problem into smaller steps to solve it, or simply asking the LLM to solve a problem step-by-step. CoT prompting can lead LLMs to correctly answer questions that previously flummoxed them. But the process doesn’t work very well with small LLMs.
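In practice, CoT prompting is mostly a matter of how the prompt is written. A minimal sketch (the worked example and the new question are invented for illustration):

```python
# One worked example that demonstrates step-by-step reasoning, followed by the
# new question. The same plain text can be sent to any chat or completion model.
cot_prompt = """\
Q: A train travels 60 km in the first hour and 40 km in the second hour.
   What is its average speed?
A: Let's think step by step.
   Step 1: Total distance = 60 + 40 = 100 km.
   Step 2: Total time = 2 hours.
   Step 3: Average speed = 100 / 2 = 50 km/h.
   The answer is 50 km/h.

Q: A shop sells 3 apples for 2 dollars. How much do 12 apples cost?
A: Let's think step by step.
"""

print(cot_prompt)  # send this string to an LLM in place of the bare question
```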
The limits of LLMs
CoT prompting has been integrated into the workings of o1, according to OpenAI, and underlies the model’s prowess. Francois Chollet, who was an AI researcher at Google in Mountain View, California, and left in November to start a new company, thinks that the model incorporates a CoT generator that creates numerous CoT prompts for a user query and a mechanism to select a good prompt from the choices. During training, o1 is taught not only to predict the next token, but also to select the best CoT prompt for a given query. The addition of CoT reasoning explains why, for example, o1-preview — the advanced version of o1 — correctly solved 83% of problems in a qualifying exam for the International Mathematical Olympiad, a prestigious mathematics competition for high-school students, according to OpenAI. That compares with a score of just 13% for the company’s previous most powerful LLM, GPT-4o.
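Chollet’s account is informed speculation about o1’s internals rather than anything OpenAI has confirmed, but the general pattern he describes, generating several candidate chains of thought and keeping the most promising one, is easy to write down. In this sketch, generate_chain and score_chain are hypothetical stand-ins for calls to a model and to a learned or heuristic scorer.

```python
import random
from typing import Callable, List

def best_of_n_cot(query: str,
                  generate_chain: Callable[[str], str],
                  score_chain: Callable[[str, str], float],
                  n: int = 8) -> str:
    """Generate n candidate chains of thought for a query and keep the
    highest-scoring one. Both callables are hypothetical placeholders."""
    candidates: List[str] = [generate_chain(query) for _ in range(n)]
    return max(candidates, key=lambda chain: score_chain(query, chain))

# Toy stand-ins so the sketch runs on its own.
random.seed(0)
fake_chains = ["step A -> step B -> answer 42", "guess 41", "step A -> answer 42"]
best = best_of_n_cot("What is 6 * 7?",
                     generate_chain=lambda q: random.choice(fake_chains),
                     score_chain=lambda q, c: c.count("step"))  # crude proxy for reasoning depth
print(best)
```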
But, despite such sophistication, o1 has its limitations and does not constitute AGI, say Kambhampati and Chollet. On tasks that require planning, for example, Kambhampati’s team has shown that although o1 performs admirably on tasks that require up to 16 planning steps, its performance degrades rapidly when the number of steps increases to between 20 and 40 2. Chollet saw similar limitations when he challenged o1-preview with a test of abstract reasoning and generalization that he designed to measure progress towards AGI. The test takes the form of visual puzzles. Solving them requires looking at examples to deduce an abstract rule and using that to solve new instances of a similar puzzle, something humans do with relative ease.
LLMs, says Chollet, irrespective of their size, are limited in their ability to solve problems that require recombining what they have learnt to tackle new tasks. “LLMs cannot truly adapt to novelty because they have no ability to basically take their knowledge and then do a fairly sophisticated recombination of that knowledge on the fly to adapt to new context.”
Can LLMs deliver AGI?
So, will LLMs ever deliver AGI? One point in their favour is that the underlying transformer architecture can process and find statistical patterns in other types of information in addition to text, such as images and audio, provided that there is a way to appropriately tokenize those data. Andrew Wilson, who studies machine learning at New York University in New York City, and his colleagues showed that this might be because the different types of data all share a feature: such data sets have low ‘Kolmogorov complexity’, defined as the length of the shortest computer program that’s required to create them 3. The researchers also showed that transformers are well-suited to learning about patterns in data with low Kolmogorov complexity and that this suitability grows with the size of the model. Transformers have the capacity to model a wide swathe of possibilities, increasing the chance that the training algorithm will discover an appropriate solution to a problem, and this ‘expressivity’ increases with size. These are, says Wilson, “some of the ingredients that we really need for universal learning”. Although Wilson thinks AGI is currently out of reach, he says that LLMs and other AI systems that use the transformer architecture have some of the key properties of AGI-like behaviour.
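For reference, the standard definition (not specific to this study) is that the Kolmogorov complexity of a data set x, relative to a fixed universal computer U, is the length of the shortest program that outputs it:

```latex
K_U(x) = \min \{\, \ell(p) : U(p) = x \,\}
```

where \ell(p) is the length of program p. A small K_U(x) means the data are highly compressible, which is the kind of structure the researchers argue transformers are well suited to learning.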
Yet there are also signs that transformer-based LLMs have limits. For a start, the data used to train the models are running out. Researchers at Epoch AI, an institute in San Francisco that studies trends in AI, estimate 4 that the existing stock of publicly available textual data used for training might run out somewhere between 2026 and 2032. There are also signs that the gains being made by LLMs as they get bigger are not as great as they once were, although it’s not clear if this is related to there being less novelty in the data because so many have now been used, or something else. The latter would bode badly for LLMs.
Raia Hadsell, vice-president of research at Google DeepMind in London, raises another problem. The powerful transformer-based LLMs are trained to predict the next token, but this singular focus, she argues, is too limited to deliver AGI. Building models that instead generate solutions all at once or in large chunks could bring us closer to AGI, she says. The algorithms that could help to build such models are already at work in some existing, non-LLM systems, such as OpenAI’s DALL-E, which generates realistic, sometimes trippy, images in response to descriptions in natural language. But they lack LLMs’ broad suite of capabilities.
Build me a world model
The intuition for what breakthroughs are needed to progress to AGI comes from neuroscientists. They argue that our intelligence is the result of the brain being able to build a ‘world model’, a representation of our surroundings. This can be used to imagine different courses of action and predict their consequences, and therefore to plan and reason. It can also be used to generalize skills that have been learnt in one domain to new tasks by simulating different scenarios.
Several reports have claimed evidence for the emergence of rudimentary world models inside LLMs. In one study 5, researchers Wes Gurnee and Max Tegmark at the Massachusetts Institute of Technology in Cambridge claimed that a widely used open-source family of LLMs developed internal representations of the world, the United States and New York City when trained on data sets containing information about these places, although other researchers noted on X (formerly Twitter) that there was no evidence that the LLMs were using the world model for simulations or to learn causal relationships. In another study 6, Kenneth Li, a computer scientist at Harvard University in Cambridge and his colleagues reported evidence that a small LLM trained on transcripts of moves made by players of the board game Othello learnt to internally represent the state of the board and used this to correctly predict the next legal move.
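Studies in this vein typically rely on ‘probes’: small classifiers trained to read a property of interest (a board square’s state, a city’s location) directly out of a model’s hidden activations. The sketch below uses synthetic data in place of real activations, so it illustrates the method only, not either result.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins: 'hidden' plays the role of a model's activations for
# 1,000 inputs; 'label' is the property we ask whether the model represents
# (for example, whether a given board square is occupied). The label is
# planted in a few dimensions so the probe has something to find.
n, d = 1000, 64
hidden = rng.normal(size=(n, d))
label = (hidden[:, :3].sum(axis=1) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    hidden, label, test_size=0.25, random_state=0)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probe accuracy:", probe.score(X_test, y_test))
# High held-out accuracy is taken as evidence that the information is readable
# from the activations; as the critics noted, it does not show the model uses it.
```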
Other results, however, show how world models learnt by today’s AI systems can be unreliable. In one such study 7, computer scientist Keyon Vafa at Harvard University, and his colleagues used a gigantic data set of the turns taken during taxi rides in New York City to train a transformer-based model to predict the next turn in a sequence, which it did with almost 100% accuracy.
By examining the turns the model generated, the researchers were able to show that it had constructed an internal map to arrive at its answers. But the map bore little resemblance to Manhattan (see ‘The impossible streets of AI’), “containing streets with impossible physical orientations and flyovers above other streets”, the authors write. “Although the model does do well in some navigation tasks, it’s doing well with an incoherent map,” says Vafa. And when the researchers tweaked the test data to include unforeseen detours that were not present in the training data, it failed to predict the next turn, suggesting that it was unable to adapt to new situations.
The importance of feedback
One important feature that today’s LLMs lack is internal feedback, says Dileep George, a member of the AGI research team at Google DeepMind in Mountain View, California. The human brain is full of feedback connections that allow information to flow bidirectionally between layers of neurons. This allows information to flow from the sensory system to higher layers of the brain to create world models that reflect our environment. It also means that information from the world models can ripple back down and guide the acquisition of further sensory information. Such bidirectional processes lead, for example, to perceptions, wherein the brain uses world models to deduce the probable causes of sensory inputs. They also enable planning, with world models used to simulate different courses of action.
But current LLMs are able to use feedback only in a tacked-on way. In the case of o1, the internal CoT prompting that seems to be at work — in which prompts are generated to help answer a query and fed back to the LLM before it produces its final answer — is a form of feedback connectivity. But, as seen with Chollet’s tests of o1, this doesn’t ensure bullet-proof abstract reasoning.
Researchers, including Kambhampati, have also experimented with adding external modules, called verifiers, onto LLMs. These check answers that are generated by an LLM in a specific context, such as for creating viable travel plans, and ask the LLM to rerun the query if the answer is not up to scratch 8. Kambhampati’s team showed that LLMs aided by external verifiers were able to create travel plans significantly better than were vanilla LLMs. The problem is that researchers have to design bespoke verifiers for each task. “There is no universal verifier,” says Kambhampati. By contrast, an AGI system that used this approach would probably need to build its own verifiers to suit situations as they arise, in much the same way that humans can use abstract rules to ensure they are reasoning correctly, even for new tasks.
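The set-up Kambhampati describes amounts to a generate-check-retry loop. In this hedged sketch, call_llm and verify are hypothetical placeholders for a model call and a bespoke, task-specific checker (say, a constraint checker for travel plans).

```python
from typing import Callable, Optional, Tuple

def solve_with_verifier(query: str,
                        call_llm: Callable[[str], str],
                        verify: Callable[[str], Tuple[bool, str]],
                        max_tries: int = 5) -> Optional[str]:
    """Ask the LLM, check its answer with an external verifier, and feed the
    verifier's critique back into the prompt until the answer passes or we give up.
    Both callables are hypothetical stand-ins."""
    prompt = query
    for _ in range(max_tries):
        answer = call_llm(prompt)
        ok, critique = verify(answer)
        if ok:
            return answer
        # The verifier is bespoke to the task; its critique steers the rerun.
        prompt = f"{query}\nYour previous answer was rejected: {critique}\nTry again."
    return None

# Toy usage: the 'LLM' only adds a return flight once it has been told off.
plan = solve_with_verifier(
    "Plan a two-day trip to Paris.",
    call_llm=lambda p: ("Day 1: fly out. Day 2: museums. Return flight booked."
                        if "rejected" in p else "Day 1: fly out. Day 2: museums."),
    verify=lambda a: ("return flight" in a.lower(), "no return flight included"),
)
print(plan)
```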
Efforts to use such ideas to help produce new AI systems are in their infancy. Bengio, for example, is exploring how to create AI systems with different architectures to today’s transformer-based LLMs. One of these, which uses what he calls generative flow networks, would allow a single AI system to learn how to simultaneously build world models and the modules needed to use them for reasoning and planning.
Another big hurdle encountered by LLMs is that they are data guzzlers. Karl Friston, a theoretical neuroscientist at University College London, suggests that future systems could be made more efficient by giving them the ability to decide just how much data they need to sample from the environment to construct world models and make reasoned predictions, rather than simply ingesting all the data they are fed. This, says Friston, would represent a form of agency or autonomy, which might be needed for AGI. “You don’t see that kind of authentic agency, in say, large language models, or generative AI,” he says. “If you’ve got any kind of intelligent artefact that can select at some level, I think you’re making an important move towards AGI,” he adds.
AI systems with the ability to build effective world models and integrated feedback loops might also rely less on external data because they could generate their own by running internal simulations, positing counterfactuals and using these to understand, reason and plan. Indeed, in 2018, researchers David Ha, then at Google Brain in Tokyo, and Jürgen Schmidhuber at the Dalle Molle Institute for Artificial Intelligence Studies in Lugano-Viganello, Switzerland, reported 9 building a neural network that could efficiently build a world model of an artificial environment, and then use it to train the AI to race virtual cars.
If you think that AI systems with this level of autonomy sound scary, you are not alone. As well as researching how to build AGI, Bengio is an advocate of incorporating safety into the design and regulation of AI systems. He argues that research must focus on training models that can guarantee the safety of their own behaviour — for instance, by having mechanisms that calculate the probability that the model is violating some specified safety constraint and reject actions if the probability is too high. Also, governments need to ensure safe use. “We need a democratic process that makes sure individuals, corporations, even the military, use AI and develop AI in ways that are going to be safe for the public,” he says.
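The mechanism Bengio describes reduces to a runtime guard: estimate the probability that a proposed action violates a specified safety constraint and refuse to act when that estimate is too high. In this sketch, estimate_violation_probability is a hypothetical placeholder for whatever learned estimator such a model would carry, and the threshold is a policy choice.

```python
from typing import Callable

def safe_act(action: str,
             estimate_violation_probability: Callable[[str], float],
             threshold: float = 0.01) -> bool:
    """Allow an action only if the estimated probability that it violates a
    specified safety constraint stays below the threshold."""
    return estimate_violation_probability(action) < threshold

# Toy usage with a made-up estimator.
print(safe_act("send the weekly report", lambda a: 0.001))   # True: allowed
print(safe_act("disable the safety checks", lambda a: 0.4))  # False: rejected
```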
So will it ever be possible to achieve AGI? Computer scientists say there is no reason to think otherwise. “There are no theoretical impediments,” says George. Melanie Mitchell, a computer scientist at the Santa Fe Institute in New Mexico, agrees. “Humans and some other animals are a proof of principle that you can get there,” she says. “I don’t think there’s anything particularly special about biological systems versus systems made of other materials that would, in principle, prevent non-biological systems from becoming intelligent.”
But, even if it is possible, there is little consensus about how close its arrival might be: estimates range from just a few years from now to at least ten years away. If an AGI system is created, George says, we’ll know it when we see it. Chollet suspects it will creep up on us. “When AGI arrives, it’s not going to be as noticeable or as groundbreaking as you might think,” he says. “It will take time for AGI to realize its full potential. It will be invented first. Then, you will need to scale it up and apply it before it starts really changing the world.”
Nature 636, 22-25 (2024)