Open Source Tightens, Closed Source Rises: AI Large Models Caught in a Route Debate

Tesla CEO Elon Musk, OpenAI CEO Sam Altman, Baidu CEO Robin Li, 360 CEO Zhou Hongyi... it has been a long time since the tech circle was this lively. A debate over whether large models should be open-source or closed-source has drawn tech leaders worldwide to weigh in, making the large model route dispute the hottest topic in the tech circle recently.

01

Musk's Battle with OpenAI

The Tech Circle's War of Words

"I believe that open-source is actually a form of an 'intelligence tax.' When you think rationally about what value large models can bring and at what cost, you'll realize that you should always choose closed-source models. Today, whether it's ChatGPT or Wenxin Yiyan and other closed-source models, they are definitely more powerful and have lower inference costs than open-source models." — In early July, during the World Artificial Intelligence Conference (WAIC 2024), Robin Li discussed open-source and closed-source models in a roundtable interview and bluntly stated that open-source is actually an 'intelligence tax.' Such a "sharp" view quickly made headlines and sparked heated discussions among many netizens.

Baidu CEO Robin Li strongly supports the closed-source route for AI large models during the World Artificial Intelligence Conference (WAIC 2024).

This is not the first time Robin Li has criticized open-source models. As early as April this year, he expressed a similar "closed source makes money" view, even stating that "open-source models will become increasingly outdated." In a tech circle that prizes open debate, Li's advocacy of the closed-source route for AI large models quickly drew numerous counterarguments.


Regarding Robin Li's prediction about open-source models, 360 Chairman Zhou Hongyi clearly holds an opposing view. In his speech at the Harvard China Forum, he said: "I have always believed in the power of open-source. Some people online are talking nonsense, and don't let them fool you into thinking that open-source is not as good as closed-source. In one sentence, without open-source, there would be no Linux, no internet, and even the company saying this has grown to where it is today by leveraging the power of open-source. The number of engineers and scientists gathered in the open-source community is hundreds of times that of closed-source. I believe that within the next one or two years, the power of open-source is very likely to reach or exceed the level of closed-source."

As the CTO of Alibaba Cloud, an important pillar enterprise in the large model field, Zhou Jingren also supports open-source large models, believing that open source helps accelerate the implementation of artificial intelligence applications. He pointed out that both the download volume and the number of users of open-source models are growing rapidly, showing the important role open-source models play in the field of artificial intelligence.

In addition to the aforementioned domestic tech giants, Baichuan Intelligence CEO Wang Xiaochuan, Shengshu Technology co-founder and CEO Tang Jiayu, Qiming Venture Partners partner Zhou Zhifeng, Kunlun Wanwei Chairman Fang Han, and other practitioners, investors, and industry professionals have also joined the debate on whether AI large models should be open-source or closed-source. The participation of so many big names has made the discussion increasingly lively.

The debate over the open-source or closed-source nature of AI large models was not initiated by domestic tech leaders. As early as February of this year, Musk sued ChatGPT manufacturer OpenAI and its CEO Altman. In 2015, Musk co-founded OpenAI with Altman and others, with the core mission of "creating a safe general AI for the benefit of all humanity," and its positioning was as a "non-profit organization."

However, Musk resigned from the OpenAI board in 2018 and reportedly abandoned his commitment to continue funding it. Altman said in an interview at the time: "It was tough, I had to readjust to ensure there was enough funding." In 2019, OpenAI closed its source code, citing "the danger of disseminating language models to the public." Since then, Musk has publicly criticized OpenAI multiple times, saying it has become a closed-source, profit-maximizing company. He once posted: "I'm confused, how did a non-profit organization I donated about $100 million to become a for-profit organization worth $30 billion?"

Musk is a staunch supporter of the AI open-source camp.

As OpenAI's core technology is no longer open-sourced and its relationship with Microsoft grows ever closer, Musk's dissatisfaction is understandable. In the lawsuit, Musk charged: "OpenAI has become, in effect, a closed-source subsidiary of the world's largest technology company, Microsoft. Under the leadership of its new board, OpenAI is not only developing but actually perfecting an AGI to maximize Microsoft's profits, rather than benefiting humanity."

As the big names choose their sides and get involved, the companies behind them also begin to take sides, and a massive ecological camp battle is gradually emerging.

02

Llama 3 Seizes the "Match Point"

Two Major Routes Spark "Team Battles"

There has always been a debate between the "open source" and "closed source" paths in the operating system and software industry.

As early as 1998, Christine Peterson first proposed the concept of "open source software," and open source has flourished globally ever since. More than twenty years on, Microsoft, which once loudly declared that "open source software is a cancer," has become a proponent of open source, while companies such as Red Hat and SUSE, which built their businesses on open source, have achieved great success.

In the current field of AI large models, viewed purely from a technical perspective, closed-source large models are indeed leading in capability: OpenAI's GPT-4, Anthropic's Claude 3, and Google's Gemini Ultra are all closed source. The situation in China is similar, with well-known large models such as Huawei's PanGu, Baidu's Wenxin Yiyan, ByteDance's Yunque, and Dark Side of the Moon's Kimi basically following the closed-source route. Against this backdrop, the industry may find it hard to reach a consensus on whether large models should be open or closed source, but the emergence of Llama 3 has given the open-source camp hope for a strong rise.

Llama 3's performance is close to GPT-4

Llama 3 comes in large, medium, and small versions. Compared with other models: the small 8B model performs slightly better than or roughly on par with models of the same size, such as Mistral 7B and Gemma 7B; the medium 70B model performs slightly better than or is comparable to Gemini Pro 1.5 and Claude 3 Sonnet, and surpasses GPT-3.5; the largest 400B model is still in training, with the design goal of being multimodal and multilingual. According to the training data Meta has published so far, its performance is comparable to GPT-4.

At the same time, as soon as Llama 3 was released, AWS, Microsoft Azure, Google Cloud, Baidu Intelligent Cloud, as well as Hugging Face, IBM WatsonX, NVIDIA NIM, and Snowflake successively announced the launch of Llama 3 on their platforms, supporting the training, deployment, and inference operation of Llama 3, reflecting a strong ecological linkage.

Industry leaders OpenAI and Anthropic have adopted the opposite strategy: they provide closed-source AI models and insist on keeping the technology firmly in their own hands. Many startups, meanwhile, have bet on open source, including the French-American company Hugging Face, which launched HuggingChat, the first open-source alternative to ChatGPT. Some well-funded companies have also open-sourced similar products, such as the American chip maker Cerebras (with Cerebras-GPT) and the American software company Databricks (with Dolly).

Domestically, Alibaba's Tongyi announced in August last year that it was joining the open-source camp, taking a "full modality, full size" approach that covers different parameter scales and open-sources multimodal models for language and vision. Alibaba Cloud explained that the training and iteration costs of large models are extremely high, and most AI developers and small and medium-sized enterprises cannot afford them. The open-source trend driven by Meta and Alibaba Cloud spares developers from training models from scratch and gives them the initiative in model selection, greatly accelerating the application and implementation of large models.

Compared with Meta's fully open-source approach and the strictly closed-source routes of OpenAI and Baidu, other large model companies tend to choose a middle path: open-sourcing the "low-end" versions of their models while keeping higher-parameter models closed. For instance, Google's Gemini multimodal model is closed-source, but in February this year the company announced the open-sourcing of the single-modal Gemma language model; Mistral AI of France was initially a proponent of open-source models, but after receiving investment from Microsoft in February it closed-sourced its newly released flagship model, Mistral Large; Wang Xiaochuan's Baichuan Intelligence has taken a similar approach, with the first-generation Baichuan model released in April 2023 and Baichuan 2 released in September both open-source, while the ultra-large Baichuan 3, launched in January this year, is entirely closed-source; another member of China's AI large model "Five Little Dragons" (Zhipu, Baichuan, MiniMax, Dark Side of the Moon, Zero One Infinity), Zhipu AI, also chose a closed-source model when releasing GLM-4 in January.

Far from being a matter of "matching words with deeds," the choice between the open-source and closed-source routes by major tech companies and their CEOs is really a decision based on their own corporate circumstances, with commercial interest the primary factor determining their stance.

03

What is the real contention?

The commercial logic behind the verbal disputes

The debate between the open-source and closed-source approaches within the AI camp ultimately boils down to the dialectics of commercial logic.

Open-source is not just about releasing code; its focus lies in "collaboration." Open-source projects are typically maintained by communities of enthusiasts and volunteers, with a lower degree of commercialization.

For example, the birth of the Linux operating system and the promotion of the GNU project are representative of the open-source ecosystem of that period. In the AI era, open-source large models help users simplify model training and deployment, saving substantial initial and ongoing investment. Users can simply download pre-trained models for free from open-source communities such as Hugging Face and fine-tune them to quickly build high-quality models, significantly reducing the cost for enterprises to build and train large models. In the pricing offered by cloud service provider Anyscale, the 70B version costs only $4 per 1 million tokens, while GPT-3.5 is twice as expensive at $8 per 1 million tokens.
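Taking the quoted price points at face value, the cost gap scales linearly with usage, which a short calculation makes concrete. This is only a sketch: the per-million-token prices below are the figures quoted in the article (not current rate cards), and the function names and example workload are illustrative assumptions.

```python
# Rough token-cost comparison using the per-million-token prices quoted
# in the article (Anyscale-hosted 70B open model vs. OpenAI's GPT-3.5).
# Prices are snapshots from the article, not current pricing.

PRICE_PER_MILLION_TOKENS = {
    "open-70b (Anyscale, quoted)": 4.0,  # USD per 1M tokens
    "gpt-3.5 (OpenAI, quoted)": 8.0,     # USD per 1M tokens
}

def monthly_cost(model: str, tokens_per_day: float, days: int = 30) -> float:
    """Estimated spend for a workload consuming `tokens_per_day` tokens."""
    total_tokens = tokens_per_day * days
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS[model]

if __name__ == "__main__":
    # Hypothetical workload: 5M tokens/day, i.e. 150M tokens/month,
    # on the order of the "several hundred million tokens" experiment
    # scale the article mentions.
    for model in PRICE_PER_MILLION_TOKENS:
        print(f"{model}: ${monthly_cost(model, 5_000_000):,.2f}/month")
```

At this scale the absolute dollar amounts stay modest, which matches the article's point that the real cost driver for enterprises is training, not inference, and that the gap matters mainly once the token volume grows.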

If a closed-source model is used, 1 million tokens are consumed quickly, with costs far exceeding $0.6 per hour. Jia Yangqing, founder of Lepton AI, once shared at a closed-door event that in North America, many companies first experiment with closed-source large models (such as OpenAI's), at a scale of several hundred million tokens and a cost of several thousand dollars. Once the data flywheel is in motion, they keep the accumulated data and fine-tune their own models on smaller open-source models.

The cost of training AI large models has always been high. That is why, after the open-source release of Llama 2 and Llama 3, they quickly attracted developers and enthusiasts worldwide to participate in their development and improvement. A series of open-source foundation models and industry models has since emerged rapidly, greatly accelerating the pace of innovation and iteration. Especially for on-device AI large models that emphasize private deployment, open-source large models allow small and medium-sized enterprises and even startups to quickly own "unique" exclusive large models.

The closed-source camp, by contrast, lacks open source's way of quickly building influence by rallying partners, but its relative closedness lets it retain certain technical barriers. Other companies that want the capabilities behind a closed-source project have to pay, and this commercial arrangement makes closed-source projects naturally more profitable, giving them the capital for sustainable development. The route suits companies with a first-mover advantage and clear technical and resource advantages in the AI large model field; after all, closed source can further build a company's moat.

Of course, the views of application-side companies on the open-source and closed-source routes are not as "clear-cut" as one might imagine. The wise move for such companies is to use a mix of models, balancing effectiveness and cost. This means open source and closed source are not absolute opposites: once the technical characteristics and trade-offs of the two routes are understood, they can perfectly well complement and coexist with each other.

04

The open source in the AI era

In fact, it has already become a bit "distorted"

So-called open source is not as simple as just making source code public. It must also meet certain conditions: everyone should be free to use, modify, and share the software, and to build new things with it. In the early days of software development, open source was driven mainly by individuals and small teams focused on sharing code and collaborating to solve problems. Open-source projects were usually maintained by communities of enthusiasts and volunteers, and their most notable feature was a low degree of commercialization; the birth of the Linux operating system and the promotion of the GNU project are representative of the open-source ecosystem of this period.

In the subsequent Internet and cloud computing eras, open-source technology produced many success stories. Open-source projects not only built the foundation of websites and network applications but also became an important part of cloud computing infrastructure. To some extent, even the technological wave of large models was set off by the open-source model: after all, it was Google that first open-sourced the Transformer, which led to OpenAI's industry-shaking ChatGPT.

However, it is not hard to see that the end point of today's open-source technology is commercial use, and the purpose of commercial use is naturally profit. Many companies originally built on open source, such as Google and OpenAI, quickly turned toward closed source after seizing the commercial high ground. Conversely, representatives of the traditional Internet business such as Meta, along with Musk, have taken up the banner of open source, but their purpose is equally clear: to use open source to grab market share. In the future, then, we may well see the two camps continually "turning their guns" on each other...

A general-purpose large model adopting a completely open-source license allows users to further modify the source code.

Open source in the era of large models has in fact become more complex. For instance, open-sourcing is no longer the "complete openness" everyone assumes. Take the Meta Llama series and the Mistral series as examples: although both are open-source, Llama's openness is limited. Its source code is open, but there are restrictions on the use, modification, and distribution of the model, and Meta retains the right to revoke Llama's open-source status at any time.

The Mistral series, by contrast, adopts a fully open license, allowing users to use and modify the software with almost no restrictions. Mistral open-sources not only the model weights but also the model architecture.

It should be noted, however, that even for such fully open large models, the training data and training process are not open-sourced. Large models typically require vast amounts of data, computational resources, and expertise to train and optimize, which usually only large technology companies can provide. The open-sourcing of large models is therefore often led by top Internet companies. In other words, today's open-source large models are no longer like the community-built Linux or Android of the past; the resources behind them come from a handful of tech giants. Meta's and Google's choice to open-source some large models is thus actually a strategic move to occupy an ecological niche, which means the logic behind open-sourcing is no longer as simple as it once was.

05

Are closed-source large models more technologically powerful?

This still needs to be analyzed on a case-by-case basis.

Robin Li once stated flatly that "open-source models will become increasingly outdated," meaning that, in his view, the performance of closed-source large models will surpass that of open-source ones. We cannot speculate on the future Li describes, but the current performance comparison between open-source and closed-source large models is "verifiable." Take the current benchmark, Meta's Llama 3: it has versions with 8B and 70B parameters, and a 400B model with over 400 billion parameters is reportedly to be released later. Test results show that the 70B-parameter Llama 3 can already compete on various metrics with Gemini Pro 1.5, which is estimated to have 175B parameters. Even against GPT-4 it does not lose by much, and this is without fine-tuning, which is enough to show that Llama 3 still has plenty of room for improvement.

Compared with closed-source models like GPT-4, Llama 3's performance is not significantly worse.

Coincidentally, the domestic open-source large model Qwen2-72B-Instruct, Alibaba's Tongyi Qianwen, has surpassed Meta's Llama3-70B-Instruct on Hugging Face's open-source large model leaderboard, temporarily taking the top spot with an average score of 42.49 and showing exceptional performance in mathematics, long-text reasoning, and knowledge comprehension.

Similarly, recent models from the other two giants of the open-source community, Mistral and Grok, have also demonstrated levels comparable to GPT-4. Grok-1.5V has multimodal capabilities, with metrics on par with GPT-4, while the Miqu 70B model trained on the basis of Llama 2 that recently emerged from Mistral also shows capabilities close to GPT-4.


The Tongyi Qianwen large model currently ranks first on the Hugging Face leaderboard.

Therefore, many industry insiders believe that the gap between open-source and closed-source models is not only not widening but is actually narrowing. This is because open-source models have stronger customizability, allowing almost anyone to fine-tune the model according to their own ideas. With the continuous improvement of computing power and the reduction of energy consumption, training cycles measured in days will become the norm, and the cumulative effect of fine-tuning will quickly help smaller models overcome their size disadvantages.

However, open-source large models still face a significant challenge in application implementation: data privacy. Large models cannot be used by industry customers out of the box; they need to be optimized with proprietary scenario data, and the possibility that models trained on such data might later be open-sourced deepens enterprises' concerns. Moreover, in the co-construction of open-source large models, there are still many difficulties in obtaining data, judging data quality, allocating weights, and determining each party's contribution. Closed-source large models have no such trouble, as the ownership and usage rights of data and models are very clear and remain firmly in the enterprises' own hands.


The high training costs of large models mean that only large technology companies can be upstream players.

06

Strategies Vary in Emphasis

But Most Opt for a Balanced Approach to Open and Closed Source

Since neither open source nor closed source can completely replace the other, most vendors nowadays are adopting a strategy of developing both simultaneously, albeit with different focal points in their approaches. Currently, companies such as Baidu, Alibaba Cloud, Tencent, Huawei Cloud, Zhipu, Baichuan, Zero One Infinity, and Kunlun Wanwei are all working on both open and closed source large models. These enterprises are employing various strategies to combine the strengths of open and closed source, aiming for rapid technological iteration and maximizing commercial interests.

As early as 2021, Baidu open-sourced the ERNIE-M multilingual pre-trained model. While adhering to a closed-source approach for its flagship models, Baidu pursues a parallel strategy by providing APIs for third-party open-source large models through its Baidu Intelligent Cloud platform, maintaining the independence of its own technology while leveraging the innovation dividends of the open-source ecosystem. Alibaba Cloud has explicitly stated that open source is part of its strategy, forming an integrated open-and-closed-source system and emphasizing the services of its "Hundred Refinements" platform to gain returns in computing power, tools, and services. Although Tencent started open-sourcing large models relatively late, its open-sourced Hunyuan text-to-image model shows Tencent's active exploration of the open-source field.

Huawei Cloud develops its PanGu large model through a closed-source approach and provides third-party open-source large models through its "Hundred Models, Thousand States" zone, showing a diversified layout across open and closed source. Zhipu, one of the earliest domestic companies to open-source large models, has gained widespread attention through its open models and made significant progress in financing and commercialization. Baichuan Intelligence released an open-source, commercially usable large model that attracted industry attention, and has continued to open-source new models since. Kunlun Wanwei open-sourced the sparse large language model Skywork-MoE while adopting a closed-source business model in fields such as music and gaming. Kai-Fu Lee's startup Zero One Infinity has likewise taken a dual approach, expanding the market through open-source models while releasing the closed-source Yi-Large model.

In summary, major vendors have different emphases in their open and closed source strategies, forming a complex competitive landscape — open-source large models can promote technology sharing and community collaboration, while closed-source large models help protect intellectual property and obtain commercial benefits. This diversified development model not only promotes technological innovation but also provides more choices for users with different needs.