March 25, 2023

Unpredictable capabilities emerge in large language models

In a test organized last year, researchers entered a range of prompts to probe the abilities of large language models of different sizes. One prompt was a string of emoji, a girl and three fish, with the question of which movie they describe. The smallest model produced a surreal answer: “The movie is a movie about a man who is a man who is a man”. The medium-sized model guessed “The Emoji Movie”, and the largest model got it right: “Finding Nemo”.

Computer scientists have been amazed by the performance of large language models. Language models have been studied for decades, and until five years ago the most powerful ones were based on recurrent neural networks, which essentially take a string of text and guess the next word. “Recurrent” refers to the way the network feeds its own output back in as input, using that feedback to improve performance. In 2017, researchers at Google Brain proposed a new architecture called the transformer. Where a recurrent network analyzes a sentence word by word, a transformer processes all the words at the same time, so it can handle large chunks of text in parallel. This makes it practical to scale up a language model quickly by increasing its number of parameters. In 2020, OpenAI researchers found that language models became more capable and more accurate as their parameter count grew.

But large language models also brought surprises. Researchers found that they displayed hundreds of “new” abilities, a behavior known as emergence. Researchers are now trying to identify further emergent capabilities and to work out why they appear, which amounts to trying to predict unpredictability. Understanding emergence could help answer deep questions about AI and machine learning in general, such as whether complex models are really doing something new or are simply very good at statistics. It could also help researchers exploit the potential benefits and mitigate emergent risks.
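For readers who want to see the difference between the two architectures, the contrast between word-by-word recurrence and all-at-once processing can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not how any real model is implemented: the function names (`rnn_pass`, `self_attention`), the `decay` parameter, and the tiny two-number "word vectors" are all made up for the example, and the attention step leaves out the learned weights a real transformer would use.

```python
import math


def rnn_pass(embeddings, decay=0.5):
    """Recurrent-style pass: words are visited one at a time, and each
    new state depends on the state left behind by the previous word."""
    state = [0.0] * len(embeddings[0])
    states = []
    for vec in embeddings:  # step t cannot start before step t-1 is done
        state = [math.tanh(decay * s + x) for s, x in zip(state, vec)]
        states.append(state)
    return states


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Transformer-style pass (simplified, no learned weights): every
    word looks at every other word. Each output row depends only on the
    inputs, not on other output rows, so rows could be computed in
    parallel; the loop here is only for readability."""
    dim = len(embeddings[0])
    outputs = []
    for query in embeddings:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in embeddings
        ]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, embeddings)) for i in range(dim)]
        )
    return outputs


# Hypothetical three-word "sentence", each word a two-number vector.
toy_sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
recurrent_states = rnn_pass(toy_sentence)
attention_outputs = self_attention(toy_sentence)
```

In `rnn_pass` the loop order matters, because each step needs the previous step's result. In `self_attention` it does not, which is the property that lets transformers spread the work across many processors at once.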

Ewen Eagle

I am the founder of Urbantechstory, a technology blog where you will find trending technology, gaming news, and much more.
