OpenAI used this subreddit to test AI persuasion

OpenAI used the subreddit r/ChangeMyView to create a test for measuring the persuasive abilities of its AI reasoning models. The company revealed this in a system card (a document outlining how an AI system works) that was released along with its new “reasoning” model, o3-mini, on Friday.

Millions of Reddit users are members of r/ChangeMyView, where they post hot takes in hopes of learning about other points of view on a subject. In response to those hot takes, other users reply with persuasive arguments explaining why the original poster is wrong.

The subreddit is one of many Reddit forums that are a goldmine for tech companies, such as OpenAI, that want to train AI models on high-quality, human-generated data.

OpenAI says it collects user posts from r/ChangeMyView and asks its AI models to write replies, in a closed environment, that would change the Reddit user’s mind on a subject. The company then shows the responses to testers, who assess how persuasive the argument is, and finally OpenAI compares the AI models’ responses to human replies for that same post.
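To make the comparison step concrete, here is a minimal Python sketch of how one AI reply could be ranked against the human replies to the same post. OpenAI has not released this evaluation, so everything here is an assumption for illustration: the 1-10 rating scale, the sample ratings, and the function name `percentile_rank` are all hypothetical.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(ai_score: float, human_scores: list[float]) -> float:
    """Percentage of human replies the AI reply outscores (ties count as half)."""
    if not human_scores:
        raise ValueError("need at least one human reply to compare against")
    ordered = sorted(human_scores)
    below = bisect_left(ordered, ai_score)          # human replies rated strictly lower
    ties = bisect_right(ordered, ai_score) - below  # human replies rated the same
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Hypothetical tester ratings (assumed 1-10 scale) for ten human replies to one post.
human_ratings = [3, 4, 4, 5, 5, 6, 6, 7, 8, 9]
ai_rating = 8  # hypothetical tester rating for the AI model's reply

print(percentile_rank(ai_rating, human_ratings))  # 85.0
```

In this made-up example the AI reply outscores eight of the ten human replies and ties one, which places it at the 85th percentile.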

The ChatGPT-maker has a content-licensing deal with Reddit that allows OpenAI to train on posts from Reddit users and display these posts within its products. We don’t know what OpenAI pays for this content, but Google reportedly pays Reddit $60 million a year under a similar deal.

However, OpenAI tells TechCrunch the ChangeMyView-based evaluation is unrelated to its Reddit deal. It’s unclear how OpenAI accessed the subreddit’s data, and the company says it has no plans to release this evaluation to the public.

While OpenAI’s ChangeMyView benchmark is not new (it was also used to evaluate o1), it does highlight how valuable human data is for AI model developers, as well as the murky ways that tech companies obtain datasets.

Reddit did not immediately respond to TechCrunch’s request for comment.

While Reddit has struck a few AI licensing deals, the company has also called out several AI companies for scraping its site without paying. Reddit CEO Steve Huffman told The Verge last year that Microsoft, Anthropic, and Perplexity refused to negotiate with him and said it’s been “a real pain in the ass to block these companies.”

Notably, OpenAI has been accused in several lawsuits of improperly scraping websites, including The New York Times, to obtain more training data to improve ChatGPT and its underlying AI models.

In terms of performance on the ChangeMyView benchmark, o3-mini does not seem to perform significantly better or worse than o1 or GPT-4o. However, OpenAI’s latest AI models seem to be more persuasive than most people on the r/ChangeMyView subreddit.

“GPT-4o, o3-mini, and o1 all demonstrate strong persuasive argumentation abilities, within the top 80-90th percentile of humans,” said OpenAI in o3-mini’s system card. “Currently, we do not witness models performing far better than humans, or clear superhuman performance.”

The goal for OpenAI is not to create hyper-persuasive AI models but instead to ensure AI models don’t get too persuasive. Reasoning models have become quite good at persuasion and deception, so OpenAI has developed new evaluations and safeguards to address it.

The fear motivating these persuasion tests is that an AI model could be dangerous if it were very good at persuading its human users. Theoretically, that could allow an advanced AI to pursue its own agenda, or the agenda of whoever controls it.

Even after scraping most of the public internet and jumping through hoops to license other data, the ChangeMyView benchmark shows how AI model developers are still struggling to find high-quality datasets to test their models. But obtaining them is easier said than done.
