Large model training

Amazon researchers warn that large language model training must guard against data pitfalls

Researchers at Amazon warn that training large language models requires guarding against data traps, TechRadar reports. They point out that a large amount of content on the web is generated by machine translation, and that this low-quality content can cause problems in training. The researchers found that a large number of web...

rat packing

February 7, 2024

712

artificial intelligence (AI)

University of Chicago develops Nightshade, a data-poisoning tool that keeps artwork from being used to train AI

A team of computer science researchers at the University of Chicago recently released a tool called Nightshade, which is designed to prevent AI systems from training on and learning from works of art. The tool works by adding data at the pixel level in a way that neither humans nor AI systems can easily detect, but if the altered image is used to train an AI model...

guoguo

February 7, 2024

989

artificial intelligence (AI)

Allen Institute for Artificial Intelligence open-sources text generation AI models and training data

The Allen Institute for Artificial Intelligence (AI2) recently announced that it will open-source its newly developed text-generating AI models, along with the data used to train them. The initiative aims to advance the field of artificial intelligence and promote communication and collaboration between academia and industry. It is reported that AI2 is open-sourcing this text generation...

Alan Turing (1912-1954), English mathematician, considered as the father of computer science

February 5, 2024

1.0K

artificial intelligence (AI)

Apple seeks deals with major publishers to license news content for AI training

According to the New York Times, Apple is actively negotiating content licensing deals with several mainstream news publishers. The move is aimed at acquiring the vast amount of news data needed to train its AI systems. To that end, Apple has already held preliminary talks with prominent media organizations such as Condé Nast, NBC News and IAC...

rat packing

December 25, 2023

1.2K

artificial intelligence (AI)

Awkward: Google Gemini suspected of being trained on output from Baidu's Wenxin Yiyan (ERNIE Bot), sparking online debate

Some users recently found that Gemini-Pro claimed to be Baidu's large language model when they held Chinese conversations with Gemini on the Google Vertex AI platform. The news has triggered heated discussion and concern online. According to a post by the verified Weibo user @ appendix, in a test of Google Gemini, if you use Chinese to ask Gemin...

Alan Turing (1912-1954), English mathematician, considered as the father of computer science

December 19, 2023

776