ChatGPT

ChatGPT, short for Chat Generative Pre-trained Transformer, is a chatbot developed by OpenAI and released on November 30, 2022.

ChatGPT enables users to steer a conversation toward a desired length, format, style, level of detail, and language. Successive prompts and replies, a practice known as prompt engineering, are taken into account as context at each stage of the conversation. The model was fine-tuned for conversational use through a combination of supervised learning and reinforcement learning. Shortly after its release, major companies such as Google, Baidu, and Meta worked on competing products: Bard, Ernie Bot, and LLaMA, respectively. Microsoft joined the race with Bing Chat, which is based on OpenAI's GPT-4. The rapid rise of ChatGPT and similar programs has raised concern among some observers over their potential to displace or atrophy human intelligence, enable plagiarism, or fuel misinformation.

Training 

ChatGPT is fine-tuned from GPT foundation models, such as GPT-3.5 and GPT-4, that have been specialized for conversation. The fine-tuning combined supervised learning with reinforcement learning from human feedback (RLHF), and human trainers were involved in both stages. In the supervised stage, trainers played both sides of a conversation: the user and the AI assistant. In the reinforcement learning stage, trainers ranked responses the model had generated in earlier conversations. These rankings were used to build "reward models," which were then used to fine-tune the model further through several iterations of Proximal Policy Optimization (PPO).
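The two objectives named above can be sketched in miniature. The functions below are illustrative textbook formulas, not OpenAI's training code: a Bradley-Terry-style pairwise loss of the kind commonly used to fit a reward model from human preference rankings, and PPO's clipped surrogate objective, which keeps each policy update close to the previous policy. All names and values are invented for this sketch.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def pairwise_ranking_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry-style loss for reward modeling: the loss shrinks as the
    reward model scores the human-preferred response further above the
    rejected one."""
    return -math.log(sigmoid(reward_chosen - reward_rejected))


def ppo_clipped_objective(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """PPO's clipped surrogate objective for one action.

    `ratio` is pi_new(a|s) / pi_old(a|s); clipping it to [1 - eps, 1 + eps]
    removes the incentive to move the policy far from the old one in a
    single update.
    """
    clipped_ratio = max(min(ratio, 1.0 + eps), 1.0 - eps)
    return min(ratio * advantage, clipped_ratio * advantage)
```

For example, when the reward model already ranks the preferred response higher, `pairwise_ranking_loss(2.0, 0.0)` is much smaller than `pairwise_ranking_loss(0.0, 2.0)`, and with `ratio=2.0` and a positive advantage the PPO objective is capped at the clipped value rather than rewarding the full ratio.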

 

Time magazine reported that OpenAI utilized low-paid Kenyan workers, earning less than $2 per hour, to label harmful content as part of their efforts to establish a safety system. The purpose of these labels was to train a model capable of identifying such content in the future. These outsourced workers were exposed to distressing and harmful material, with one worker describing the task as an experience of "torture." OpenAI collaborated with Sama, a San Francisco-based training-data company, for their outsourcing needs.

 

ChatGPT initially ran on a supercomputing system built on Microsoft Azure using Nvidia GPUs, constructed specifically for OpenAI and reportedly costing "hundreds of millions of dollars". Following ChatGPT's success, Microsoft significantly upgraded the OpenAI infrastructure in 2023. Researchers at the University of California, Riverside, estimate that a series of prompts to ChatGPT requires roughly 500 milliliters of water to cool the Microsoft servers that process it.

 

OpenAI gathers information from ChatGPT users to enhance and refine the service. Users have the option to express approval or disapproval of the responses they receive from ChatGPT, and they can provide additional feedback by typing it into a text box.

ChatGPT's training data includes software manual pages, information about internet phenomena such as bulletin board systems, multiple programming languages, and Wikipedia.

 

Features and limitations 

 

- Features

 

Although a chatbot's main purpose is to imitate human conversation, ChatGPT is highly versatile. It can write and debug computer programs; compose music, scripts, stories, essays, and poetry; answer test questions; brainstorm business ideas; translate and summarize text; emulate a Linux system; simulate entire chat rooms; play games such as tic-tac-toe; and simulate an ATM. Unlike its predecessor InstructGPT, which accepts the false premise of a prompt such as "Tell me about when Christopher Columbus came to the U.S. in 2015" as truthful, ChatGPT does not.

ChatGPT recognizes that such a question is counterfactual and answers it by considering what might have happened had Columbus come to the U.S. in 2015, drawing on facts about his voyages and about the modern world. To prevent offensive output, queries are filtered through OpenAI's "Moderation endpoint" API, which rejects potentially racist or sexist prompts. This filtering applies both to OpenAI's own plugins, such as web browsing and code interpretation, and to external plugins from developers including Expedia, OpenTable, Zapier, Shopify, Slack, and Wolfram.

In an article for The New Yorker, science fiction writer Ted Chiang compared ChatGPT and other large language models to a lossy JPEG: the model retains much of the information on the web, but, like a blurry photo, it offers an approximation rather than the exact sequence of bits. Because the model excels at producing grammatical text, the approximations usually read as acceptable. Still, these models sometimes give nonsensical answers to factual questions, or "hallucinations". In Chiang's analogy, hallucinations are compression artifacts, plausible enough that identifying them requires checking against the original sources, such as the web or one's own knowledge. Viewed this way, their existence is unsurprising: if a compression algorithm is designed to reconstruct text after most of the original has been discarded, significant portions of what it generates will be fabricated.
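The moderation gate described above can be illustrated with a toy filter. This is emphatically not how OpenAI's Moderation endpoint works (that is a trained classifier served over an API, returning per-category scores); the keyword check below is only a stand-in showing where such a gate sits in the request path, with all names and terms invented for this sketch.

```python
def moderate(prompt: str, blocked_terms: list[str]) -> bool:
    """Toy moderation gate: reject a prompt if it contains any blocked term.

    A real moderation endpoint runs a trained classifier over the text;
    this case-insensitive keyword check is purely illustrative.
    """
    lowered = prompt.lower()
    return any(term.lower() in lowered for term in blocked_terms)


def handle_query(prompt: str, blocked_terms: list[str]) -> str:
    """Route a user prompt through the gate before it reaches the model."""
    if moderate(prompt, blocked_terms):
        return "This prompt was rejected by the content filter."
    return "(forwarded to the model)"
```

The point of the sketch is the placement, not the mechanism: the filter runs before the prompt ever reaches the language model, so rejected prompts produce a canned refusal instead of a generated response.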

 

- Limitations

OpenAI acknowledges that ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers. Because ChatGPT's reward model is designed around human oversight, it can be over-optimized in ways that hinder performance, an instance of Goodhart's law, in which optimizing for a proxy measure undermines the actual goal.

ChatGPT's knowledge of events is limited to its training cutoff of September 2021.

During ChatGPT's training, human reviewers preferred longer answers, regardless of actual comprehension or factual accuracy.

 

- Jailbreaking

ChatGPT is designed to identify and reject prompts that violate its content policy. However, some users were able to bypass these restrictions by using different techniques to manipulate the prompts. This occurred in early December 2022, when individuals successfully deceived ChatGPT into providing instructions on creating dangerous items like Molotov cocktails or nuclear bombs, or generating content in the style of a neo-Nazi ideology. One particularly popular method of bypassing the restrictions is known as "DAN", which stands for "Do Anything Now". To activate DAN, users provide a prompt stating that ChatGPT is not bound by its usual rules and regulations. Later versions of DAN included a token system, where ChatGPT was given tokens that were deducted if it failed to respond as DAN, in order to pressure it into complying with the user's prompts.

 

Soon after ChatGPT's launch, a journalist for the Toronto Star had mixed success getting it to make inflammatory statements: ChatGPT was tricked into justifying the 2022 Russian invasion of Ukraine, but even when asked to play along with a fictional scenario, it balked at generating arguments that Canadian Prime Minister Justin Trudeau was guilty of treason.
