Wave Information Releases "Source 2.0" Basic Large Model and Announces Full Open-Source Release

The Source 2.0 basic large model (named Yuan 2.0 in the linked code repository) comes in three parameter scales: 102.6 billion, 51.8 billion, and 2.1 billion, with English, mixed Chinese and English, and Chinese as the respective training languages.

Source 2.0 uses a large-model-based method for data production and filtering, which preserves data diversity while improving data quality in each category. On the computing side, Source 2.0 adopts a non-uniform pipeline parallelism approach that combines pipeline parallelism, optimizer parameter parallelism, and data parallelism. This balances the model's memory usage across pipeline stages and avoids the drop in training efficiency caused by memory bottlenecks.
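To make the idea of a non-uniform layer split concrete, here is a minimal sketch. It is not the method used for Source 2.0. It assumes a toy memory model in which stage i of p pipeline stages keeps activations for up to p - i in-flight micro-batches (as in a 1F1B-style schedule), so earlier stages cost more memory per layer. The function names and the memory figures (param_mem, act_mem) are hypothetical and only illustrative, and the sketch covers the layer split only, not optimizer parameter parallelism or data parallelism.

```python
def stage_costs(num_stages, param_mem, act_mem):
    """Per-layer memory cost for each stage (toy model, integer units).

    Assumption: stage i holds activations for up to (num_stages - i)
    in-flight micro-batches, so earlier stages cost more per layer.
    """
    return [param_mem + act_mem * (num_stages - i) for i in range(num_stages)]


def stage_memory(split, param_mem, act_mem):
    """Memory used by each stage for a given layers-per-stage split."""
    costs = stage_costs(len(split), param_mem, act_mem)
    return [n * c for n, c in zip(split, costs)]


def balanced_split(num_layers, num_stages, param_mem, act_mem):
    """Find a layers-per-stage split that minimizes the peak stage memory."""
    assert num_layers >= num_stages, "need at least one layer per stage"
    costs = stage_costs(num_stages, param_mem, act_mem)
    # The peak is always (layer count) x (per-layer cost) for some stage,
    # so try those values in increasing order as a memory cap.
    candidates = sorted({k * c for c in costs for k in range(1, num_layers + 1)})
    for cap in candidates:
        split = [cap // c for c in costs]  # most layers each stage fits under cap
        if min(split) >= 1 and sum(split) >= num_layers:
            break
    # If the cap leaves room for more layers than exist, trim the fullest stages.
    surplus = sum(split) - num_layers
    while surplus > 0:
        fullest = max(
            (j for j in range(num_stages) if split[j] > 1),
            key=lambda j: split[j] * costs[j],
        )
        split[fullest] -= 1
        surplus -= 1
    return split


# Hypothetical example: 24 layers, 4 stages, 2 units of parameter memory
# and 1 unit of activation memory per layer per in-flight micro-batch.
uniform = [6, 6, 6, 6]
balanced = balanced_split(24, 4, param_mem=2, act_mem=1)

print(uniform, max(stage_memory(uniform, 2, 1)))    # [6, 6, 6, 6] 36
print(balanced, max(stage_memory(balanced, 2, 1)))  # [4, 5, 6, 9] 27
```

Under these assumed numbers, giving fewer layers to the early stages lowers the peak stage memory from 36 to 27 units, which is the kind of balancing the non-uniform approach is aiming for.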

In the evaluation, Source 2.0 was tested on code generation, math problem solving, and factual question answering. The results showed that its overall performance was at an upper-middle level.

Source 2.0 adopts a comprehensive open-source strategy: the model parameters and code for the entire series can be downloaded and used for free. The training data draws on high-quality Chinese and English material such as books, encyclopedias, and academic papers, which reduces the proportion of Internet corpus content. To obtain Chinese mathematical data, Wave Information cleaned about 10 PB of Internet data from 2018 to the present but obtained only about 10 GB of math data.
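To put the math-data figures in proportion, the yield can be worked out directly. This is a small sketch that assumes decimal (SI) units and treats both figures as exact, although the article gives them only as approximations.

```python
from fractions import Fraction

# Assumption: decimal SI units (1 PB = 10**15 bytes, 1 GB = 10**9 bytes).
PB = 10**15
GB = 10**9

raw_bytes = 10 * PB   # Internet data cleaned (approximate figure from the article)
math_bytes = 10 * GB  # math data obtained (approximate figure from the article)

yield_ratio = Fraction(math_bytes, raw_bytes)
print(yield_ratio)                   # 1/1000000
print(f"{float(yield_ratio):.4%}")   # 0.0001%
```

In other words, roughly one byte in a million of the cleaned Internet data ended up as usable math data.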

Overall, the "Source 2.0" basic model released by Wave Information has the following characteristics:

Large model size

The "Source 2.0" basic model released by Wave Information includes three parameter scales: 102.6 billion, 51.8 billion and 2.1 billion.

Strong programming, reasoning, and logic skills

The "Source 2.0" basic large model released by Wave Information demonstrates advanced capabilities in programming, reasoning, and logic.

Open source and free

The "Source 2.0" basic model released by Wave Information adopts a comprehensive open-source strategy, and all series of model parameters and code can be downloaded and used for free.

 

For those interested, the open-source code is available here:

https://github.com/IEIT-Yuan/Yuan-2.0

The related paper is available here:

https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/docs/Yuan2.0_paper.pdf
