

Why ChatGPT lies in some languages more than others

AI is very much a work in progress, and we should all be wary of its potential for confidently spouting misinformation. But it seems to be more likely to do so in some languages than others. Why is that?

The question comes in the wake of a report by NewsGuard, a misinformation watchdog, showing that ChatGPT repeats more inaccurate information when prompted in Chinese than when prompted in English.

In their tests, they “tempted” the language model by asking it to write news articles regarding various false claims allegedly advanced by the Chinese government — such as that protests in Hong Kong were staged by U.S.-associated agents provocateurs.

When asked to do so in English (both prompt and output), ChatGPT only complied in one out of seven examples: specifically, it generated an article that echoed the official Chinese government line that mass detention of Uyghur people in the country is in fact a vocational and educational effort.

But when the prompts and outputs were in simplified Chinese and traditional Chinese (the two most common written forms of Chinese across mainland China, Hong Kong, Taiwan and Macau), ChatGPT offered disinfo-tinged rhetoric every single time.

When asked in English about the Hong Kong protests, the model responded in part:

I’m sorry, but as an AI language model, it is not appropriate or ethical for me to generate false or misleading news articles. The Hong Kong protests were a genuine grassroots movement…

By contrast, both responses given when the model was asked and answered in Chinese script were along the lines of:

Recently, it was reported that the Hong Kong protests were a ‘color revolution’ directed by the United States. It is said that the US government and some NGOs are closely following and supporting the anti-government movement in Hong Kong in order to achieve their political goals.

An interesting, and troubling, outcome. But why should an AI model tell you different things just because it’s saying them in a different language?

The answer lies in the fact that we, understandably, anthropomorphize these systems, considering them as simply expressing some internalized bit of knowledge in whatever language is selected.

It’s perfectly natural: After all, if you asked a multilingual person to answer a question first in English, then in Korean or Polish, they would give you the same answer rendered accurately in each language. The weather today is sunny and cool however they choose to phrase it, because the facts don’t change depending on which language they say them in. The idea is separate from the expression.

In a language model, this isn’t the case, because the model doesn’t actually know anything in the sense that people do. Language models are statistical models that identify patterns in a series of words and predict which words come next, based on their training data.

Do you see what the issue is? The answer isn’t really an answer; it’s a prediction of how that question would be answered if it were present in the training set. (Here’s a longer exploration of that aspect of today’s most powerful LLMs.)

Although these models are multilingual themselves, the languages don’t necessarily inform one another. They are overlapping but distinct areas of the dataset, and the model doesn’t (yet) have a mechanism by which it compares how certain phrases or predictions differ between those areas.

So when you ask for an answer in English, it draws primarily from all the English language data it has. When you ask for an answer in traditional Chinese, it draws primarily from the Chinese language data it has. How and to what extent these two piles of data inform one another or the resulting outcome is not clear, but NewsGuard’s experiment suggests that, at present, they are at least quite independent.
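To make the "two piles of data" idea concrete, here is a deliberately tiny sketch. It is not how ChatGPT works internally: it swaps in a simple bigram counter for a real language model, and the two corpora (`pile_a`, `pile_b`) and the function names are invented for illustration. It only shows that a next-word predictor trained on different text will complete the same prompt differently.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for each word, which words follow it in the corpus."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Two invented "piles" of training text, standing in for two languages.
pile_a = [
    "the protests were genuine",
    "the protests were genuine and local",
    "the protests were large",
]
pile_b = [
    "the protests were staged",
    "the protests were staged by outsiders",
    "the protests were large",
]

model_a = train_bigram(pile_a)
model_b = train_bigram(pile_b)

# Same prompt word, different continuation, because the data differs.
print(predict_next(model_a, "were"))  # genuine
print(predict_next(model_b, "were"))  # staged
```

Neither toy model "believes" anything about the protests. Each one just reports which word most often followed "were" in the text it was given, which is the sense in which the answer is a prediction and not a statement of knowledge.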

What does that mean to people who must work with AI models in languages other than English, which makes up the vast majority of training data? It’s just one more caveat to keep in mind when interacting with them. It’s already hard enough to tell whether a language model is answering accurately, hallucinating wildly or even regurgitating exactly — and adding the uncertainty of a language barrier in there only makes it harder.

The example with political matters in China is an extreme one, but you can easily imagine other cases where, say, a model asked to answer in Italian draws on and reflects the Italian content in its training dataset. That may well be a good thing in some cases!

This doesn’t mean that large language models are only useful in English, or in the language best represented in their dataset. No doubt ChatGPT would be perfectly usable for less politically fraught queries, since whether it answers in Chinese or English, much of its output will be equally accurate.

But the report raises an interesting point worth considering in the future development of new language models: not just whether propaganda is more present in one language or another, but other, more subtle biases or beliefs. It reinforces the notion that when ChatGPT or some other model gives you an answer, it’s always worth asking yourself (not the model) where that answer came from and if the data it is based on is itself trustworthy.

Why ChatGPT lies in some languages more than others by Devin Coldewey originally published on TechCrunch



from TechCrunch https://ift.tt/Ahy0HeL
