In the early years, getting models like today’s ChatGPT or rival Cohere’s to churn out human-like answers took armies of low-cost workers teaching models basics, such as whether an image showed a car or a carrot.
But as competing AI models improve, the work has shifted to a growing pool of human trainers with specialized knowledge – from historians to doctors of science.
From Grads to Pros: AI's Evolving Teachers
“A year ago, we could hire undergraduates to just generally teach AI on how to improve,” Cohere co-founder Ivan Zhang said of its internal human trainers.
“Now we have to have licensed physicians to teach the models how to act in medical contexts, or financial analysts or accountants.”
For this more specialized training, Cohere, an OpenAI competitor focused on enterprise AI that was last valued at over $5 billion, partners with a startup called Invisible Tech.
Global Brain Trust: 5,000 Experts Reining In AI’s Hallucinations
Invisible Tech employs trainers who work remotely, and it has become one of the key partners of major AI companies, from AI21 to Microsoft, refining their models and rooting out the most common type of error, known in the AI community as hallucinations.
“We have over 5,000 people deployed in more than 100 countries around the world, and they are all PhDs, Master’s degree holders or knowledge-work specialists,” said Invisible founder Francis Pedraza.
Invisible pays up to $40 per hour, depending on the worker’s location and the type of work. By comparison, rival Outlier offers up to $50 an hour, while Labelbox says it pays up to $200 an hour for “high expertise” topics such as quantum physics, but $15 an hour for more generic subjects.
Founded in 2015 as a workflow automation company, Invisible initially helped organisations such as food delivery provider DoorDash digitise their menus. That changed when a then little-known research firm called OpenAI reached out in the spring of 2022, ahead of ChatGPT’s public release.
Taming the AI Tall Tale: The Human Hunt for Digital Truth
“OpenAI came to us with a problem: when you asked the initial version of ChatGPT a question, it would hallucinate. You couldn’t trust the answer,” Pedraza told Reuters.
“They needed an AI training partner that could provide reinforcement learning with human feedback.”
OpenAI did not respond to a request for comment.
Generative AI creates new content based on the data it was trained on. But it cannot always distinguish true information from false, and it produces erroneous outputs known as hallucinations. In 2023, for example, a Google chatbot shared inaccurate information in a promotional video about which satellite first took pictures of a planet outside the Solar System.
AI companies, aware that hallucinations could undermine GenAI’s appeal to businesses, are deploying various remedies, including having humans train the models on what is fact and what is fiction.
Since teaming up with OpenAI, Invisible says it has become an AI training partner to most of the leading GenAI companies, including Cohere, AI21 and Microsoft. Cohere and AI21 confirmed they are clients; Microsoft did not confirm that it is one.
“These are all companies that had training challenges, where their number one cost was compute power, and then the number two cost is quality training,” said Pedraza.
HOW DOES IT WORK?
It began with OpenAI, which developed one of the first GenAI tools, DALL-E, and has a dedicated group, aptly named the “Human Data Team,” that works with AI trainers to gather the specialized data needed to train models like ChatGPT.
OpenAI researchers devise various experiments, such as minimizing hallucinations or improving stylistic formatting, and work with AI trainers from Invisible and other vendors, according to people familiar with the company.
At any given time, multiple experiments are underway, some using tools built by OpenAI and others using vendors’ tools, one of the people said.
Based on what the AI companies need – be it expertise in Swedish history or financial modeling – Invisible recruits trainers with relevant degrees, sparing the AI companies the burden of overseeing hundreds of trainers themselves.
“OpenAI has some of the most incredible computer scientists in the world, but they’re not necessarily experts in Swedish history questions, chemistry questions, biology questions or anything else you can ask it,” said Pedraza, who added that more than 1,000 of Invisible’s contract workers are dedicated to OpenAI alone.
Cohere’s Zhang said he has personally worked with Invisible’s trainers to teach its GenAI model how to retrieve information from large data sets.
COMPETITION
Competitors in this space include Scale AI, a privately held startup last valued at $14 billion that supplies AI companies with training data sets. It has also expanded into providing AI trainers and counts OpenAI among its customers. Scale AI declined to comment for this story.
Unlike its rivals, Invisible, which became profitable in 2021, has raised only $8 million in primary capital.
“We are 70% owned by the team and 30% by investors. We do allow for secondary rounds, and the latest traded price valued us at $500 million,” Pedraza said. Reuters was unable to verify that figure.
Human trainers entered AI development through data-labelling work that required little qualification and paid very little – sometimes as low as $2 an hour – and was commonly performed in Africa and Asia.
But as more AI firms race to release updated models, demand for trainers across many subjects and languages keeps growing, offering well-paid opportunities to a cross-section of professionals who may not even know how to code.
That demand from AI companies is spawning a wave of new firms offering similar services.
“My inbox is flooded with new firms popping up, offering this new approach of staffing companies with humans purely to generate data for AI outfits like ours,” Zhang said.