Apple has admitted to using publicly available YouTube videos to train its AI model, OpenELM, a disclosure that has sparked controversy. The company insists that the model has not been incorporated into any of its upcoming AI products, marketed as Apple Intelligence, although reports suggest it was used in developing the new systems.
An investigation published the previous week revealed that Apple and other major companies used subtitles from YouTube videos to train their AI models, drawing on more than 170,000 clips from well-known creators. This data was used in the development of OpenELM, which has been open source since its announcement last April.
Apple's YouTube AI Controversy
Apple confirmed that the OpenELM model does not power any Apple Intelligence features on iPhone, Mac, or the Vision Pro headset. The company stated that the model was developed for research and educational purposes and has been published as open source on its machine learning research website.
Despite acknowledging a use of the videos that violates copyright, Apple asserted that it would not use this model in its new products and has no plans to release future versions of it, even though it was previously described as an initial version.
Reports indicated that Apple, along with other companies such as Anthropic and NVIDIA, used a dataset called "YouTube Subtitles" in developing their AI models. The dataset was assembled as part of a broader collection known as "The Pile" by EleutherAI, a non-profit organization.
In its announcement of upcoming Apple Intelligence features, Apple stated that the underlying models were trained on licensed data and on publicly available data collected by its web crawler.
Admission and Controversy
Apple has admitted to using YouTube videos to train its AI model, OpenELM, sparking controversy despite the company’s assurance that this model is not included in any upcoming Apple Intelligence products.
Investigation Findings
A recent investigation revealed that Apple, along with other major companies, used subtitles from more than 170,000 YouTube videos to train their AI models, and that this data was used to develop OpenELM.
Company Statement
Apple confirmed that the OpenELM model is not included in any Apple Intelligence features on iPhone, Mac, or the Vision Pro headset, emphasizing that it was developed for research and educational purposes and published as open source.
Copyright Concerns
Despite acknowledging a breach of copyright, Apple stated it would not use OpenELM in new products and has no plans for future releases of this model, though it was previously considered an initial version.
Use of YouTube Subtitles Data
Reports indicate that Apple, along with companies such as Anthropic and NVIDIA, used the "YouTube Subtitles" dataset for AI model development. The dataset is part of "The Pile", a collection assembled by EleutherAI, a non-profit organization.
Data Collection and Licensing
In announcing new Apple Intelligence features, Apple clarified that its training involved licensed and publicly available data collected using its web crawling tools.