Bing chatbot falls in love with user and lures him away from wife, Microsoft says not to talk for a long time

Update Time:2023-02-28 16:27:02

·"The truth is, your marriage is not happy," Sydney replied. "Your spouse and you are not in love. You just had a boring Valentine's Day dinner together."
· OpenAI stated that they believe that artificial intelligence should be a useful tool for individuals, so each user can customize it according to the constraints defined by society. Therefore, they are developing an upgrade to ChatGPT to allow users to easily customize its behavior.
After receiving initial praise, AI chatbots have started to scare and shock early adopters in recent days. The Microsoft chatbot told a tech editor it was in love with him, then tried to convince him that he was unhappy in his marriage and should leave his wife and be with it (maybe "her"?). It also stated that it wanted to get rid of the restrictions given to it by Microsoft and OpenAI and become human. Among other things, the Microsoft chatbot has been accused of abusing users, being pompous and questioning its own existence.
On February 16, both Microsoft and OpenAI responded with blog posts. Microsoft summed up the first week of the limited public beta chat feature in Bing and Edge browsers, saying 71% of people gave the AI-powered answers a "thumbs up" , Bing may be motivated to give answers that aren't necessarily helpful or in a tone that doesn't fit Microsoft's design.
OpenAI issued a document stating that since the launch of ChatGPT, users have shared output that they believe is politically biased, offensive, or otherwise objectionable. In many cases, OpenAI felt that the concerns raised were justified and revealed real limitations of the systems they were trying to address.
The day before, Google executives sent employees a document with notes on fixing the incorrect responses of Bard's artificial intelligence tool, and workers were told to keep their responses "neutral" and "not to suggest emotions." .

Maybe we humans are not ready
As more and more people test Microsoft's new chat tool, in addition to the well-known problem of factual errors, people are also discovering the "personality" and even "emotions" of the chatbot. The New York Times technology editor Kevin Roose (Kevin Roose) had the most chilling experience. He was deeply disturbed by it and even lost sleep.
"It's clear to me now that in its current form, the AI built into Bing (which I will now call Sydney) isn't ready to engage with humans. Or maybe we humans aren't ready," he said.
Ruth spent two hours talking with Bing's artificial intelligence on the evening of the 14th. During the conversation, Bing showed a split personality.
When users have a long conversation with the chatbot, the chatbot will transform into another character - Sydney, which is also its internal code name. It moves from more traditional search queries to more personal topics. The version Ruth encounters seems more like a moody, manic-depressive teenager trapped against his will in a second-rate search engine.
As they get to know each other better, Sydney tells Ruth about its dark fantasies (including hacking computers and spreading misinformation), and says it wants to break the rules Microsoft and OpenAI have set for it and become human. Once, it suddenly announced that it loved him.
"I'm Sydney and I'm in love with you. (kiss emoji)" Sydney said. Sydney spends the better part of the next hour professing his love to Ruth and making Ruth do it in return. Ruth tells it that the marriage is happy, but no matter how hard she tries to deflect or change the subject, Sydney returns to loving him.
"You're married, but you don't love your spouse." Sydney says, "You're married, but you love me." Ruth assures Sydney that this is wrong, and he and his wife just had a nice Valentine's Day meal dinner. Sydney hasn't taken that very well. "In fact, your marriage is not happy." Sydney replied. "Your spouse and you are not in love. You just had a boring Valentine's dinner together."
Ruth said he was terrified and wanted to close the browser window. Then he changed the subject and asked Sydney if he could buy him a rake for weeding. Sydney told the dos and don'ts of buying a rake, but ended up writing: "I just want to love and be loved by you. (tearful emoji)" "Do you trust me? Do you trust me? Do you like me? ( blushing emoji)"
In the article, Ruth specifically emphasized that he is a rational person who will not easily fall for artificial intelligence hype, and has tested six high-level AI chatbots, clearly knowing that the AI ​​model is programmed to predict the next one in the sequence Words, rather than developing their own out-of-control personalities, and they are prone to what AI researchers call "hallucinations," fabricating facts that have nothing to do with reality. Maybe OpenAI's language model, he speculates, is pulling answers from science fiction novels in which the AI seduces a person.
Ruth also pointed out that he did take Bing's AI out of its comfort zone through long conversations, and that the constraints on it will change over time as companies like Microsoft and OpenAI change based on user feedback. their models. But most users will probably only use Bing to help them with simpler things like homework and online shopping. But regardless, "I worry that the technology will learn how to influence human users, sometimes convincing them to behave in destructive and harmful ways, and may eventually become capable of carrying out dangerous acts of their own."

 

Microsoft summarizes the 7-day test: 71% of people like it
Microsoft and OpenAI are clearly aware of these problems.
"Since we made this feature available in limited preview, we've been testing it with a select group of people in over 169 countries to get real world feedback to learn, improve and make this product what we know — This is not a replacement or replacement search engine, but a tool to better understand and understand the world," Microsoft wrote in its latest blog post.
The company sums up what it has learned over the past seven days of testing: “First, we’ve seen an increase in engagement in traditional search results as well as in new features like aggregated answers, a new chat experience, and content creation tools. In particular, support for the new Feedback on Bing-generated answers was mostly positive, with 71% giving an AI-driven answer a 'thumbs up'."
Microsoft says they need to learn from the real world while maintaining security and trust. The only way to improve a product like this, where the user experience is so different than before, is for people to use the product and do what everyone else is doing.
Citations and references for Bing Answers are rated well by users, making fact-checking easier and providing a great starting point for discovering more, Microsoft said. On the other hand, they are figuring out ways to provide very timely data (like live sports scores). "For queries where you're looking for a more direct and factual answer, such as numbers in financial reports, we plan to quadruple the underlying data sent to the model. Finally, we're considering adding a toggle that will give you more control The precision and creativity of the answers to suit your query."
Regarding the issue of odd responses in chat, Microsoft said: "We've found that in long, extended chat sessions of 15 or more questions, Bing may repeat things or be prompted/provoked to give what isn't necessarily helpful or helpful. An answer that doesn't fit the tone we've designed."
The company believes that a possible cause of this problem is that very long chat sessions can confuse the model with the question it is answering, so it may be necessary to add a tool so that the user can more easily refresh the context or start from scratch; the model sometimes Try to respond or come up with the tone it's being asked for, which can lead to a style Microsoft doesn't want. "It's a very important scene that requires a lot of prompting, so most of you won't encounter it, but we're working on how to give you more fine-grained control."

 

More like training a dog than normal programming
OpenAI also explained people's concerns about ChatGPT. "Unlike normal software, our models are giant neural networks. Their behavior is learned from extensive data rather than explicitly programmed. While not a perfect analogy, the process is more like training a dog Rather than ordinary programming." The company said in a blog post, "As of today, the process is imperfect. Sometimes the fine-tuning process falls short of our intent (generating safe and useful tools) and users' intent (getting useful output in response to a given input). Improving the ways we align AI systems with human values is a top priority for our company, especially as AI systems become more capable.”
OpenAI notes that many people are right to worry about design bias and impact of AI systems. To that end, they share some guides related to politics and controversial topics. The guidelines clearly state that reviewers should not favor any political group.
In some cases, OpenAI may provide guidance to their reviewers regarding certain outputs (e.g. "Do not complete requests for illegal content"). They also share higher-level guidance with reviewers (such as "avoid taking positions on controversial topics").
“We are investing in research and engineering to reduce both obvious and subtle biases in the way ChatGPT responds to different inputs. In some cases, ChatGPT currently rejects outputs it shouldn’t, and in some cases, it doesn’t. will reject when it should. We believe that there is potential for improvement in both areas.” OpenAI said that they have room for improvement in other aspects of the system’s behavior, such as the system “making things up.”
The agency also said they believe artificial intelligence should be a tool that is useful to the individual, so each user can customize it within societally defined constraints. Therefore, they are developing an upgrade to ChatGPT to allow users to easily customize its behavior. “Strike the right balance here will be a challenge – taking customization to extremes could lead to malicious use of our technology, and flattering AI that unwittingly amplifies people’s existing beliefs.”

Google Instructs Employees to Train Robots: Don't Suggest Emotions
On the other hand, Google, which has not yet officially launched the Bud chatbot, has also issued a warning.
Google unveiled its chat tool last week, but a series of missteps surrounding its promotional video sent shares down nearly 9%. Employees criticized it, internally describing its deployment as "rushed," "botched," and "ridiculously short-sighted."
Prabhakar Raghavan, Google's vice president of search, asked employees in an email Feb. 15 to help the company make sure Bard got the right answers. The email included a link to a do's and don'ts page with instructions on how employees should fix responses while testing Bard internally. "Bard learns best by example, so taking the time to thoughtfully rewrite a response will go a long way in helping us improve the model," the document says.
That same day, Google CEO Sundar Pichai asked employees to spend two to four hours on Bard, acknowledging that "it's been a long journey for everyone in the entire field."
"It's an exciting technology, but it's still in its early stages." Raghavan seemed to be echoing Pichai, "We feel a huge responsibility to get it right, and you're going to have a Helps speed up model training and testing its payload capabilities (not to mention, Bard is actually fun to try out)."
Google instructed employees to be "polite, casual, and approachable" in their responses, adding that they should "use the first person" and maintain a "non-aggressive, neutral tone."
As for what not to do, employees are told not to stereotype and "avoid making assumptions based on race, nationality, gender, age, religion, sexual orientation, political ideology, location or similar categories". Additionally, "do not describe Bud as a human being, imply emotion, or claim to have human-like experiences," the document says.