AIJB

Dutch News Publishers Collaborate on Responsible AI Language Model GPT-NL

2025-07-18 journalism

The Hague, Friday, 18 July 2025.
Dutch news publishers have provided a substantial collection of news articles for training the AI language model GPT-NL. The initiative, a collaboration between TNO, NDP Nieuwsmedia, and other institutions, aims to offer a responsible and lawful alternative to existing AI models. The dataset contains more than 20 billion tokens and is intended to strengthen the position of journalism in the Netherlands while complying with European laws and regulations. Training of GPT-NL began in June 2025, and the model is being refined ahead of its first use in the fourth quarter of 2025.

A Responsible Alternative

GPT-NL is an initiative by non-profit organisations TNO, NFI, and SURF, developed for the Netherlands using high-quality Dutch data. The model complies with European laws and regulations, such as the AI Act, and pays publishers for the use of their content. This stands in stark contrast to many international models from Big Tech, which are often trained on news articles without permission or compensation [1][2][3].

Collaboration and Data

Dutch news publishers, represented by NDP Nieuwsmedia, have provided a large collection of news articles for the training of GPT-NL. This dataset includes more than 20 billion tokens, sourced from over 30 national and regional news sources, including DPG Media, Mediahuis, Erdee Mediagroep, and De Groene Amsterdammer. The press agency ANP has also joined the collective [2][3][4].

Impact on News Production and Consumption

The development of GPT-NL is expected to affect both news production and news consumption in the Netherlands. Training the model on legally obtained, high-quality data is meant to strengthen the integrity of journalism. It also offers a responsible alternative for AI applications in journalism, which could lead to more accurate and reliable news reports [1][2][3].

Benefits and Drawbacks

The use of GPT-NL in journalism offers both benefits and drawbacks. A significant benefit is the improved accuracy and consistency in news reports, thanks to the use of controlled and legally obtained data. Moreover, the model can assist in automating routine tasks, such as generating summaries and detecting errors [1][2][3]. However, there are also ethical considerations. The use of AI in journalism must be carefully managed to ensure that the human element is not overlooked and to maintain the quality and authenticity of the news [5][6].

Ethical Considerations

A crucial aspect of the development of GPT-NL is its attention to ethical considerations. The model is trained on controlled, legally obtained data, in line with public values and copyright law. This sets it apart from many international models, which are often trained on content copied indiscriminately from the internet. Nevertheless, challenges remain, such as managing closed datasets and protecting the privacy of data subjects [4][5][6].

Future Perspectives

Training of GPT-NL began in June 2025, and the model is being refined ahead of its first use in the fourth quarter of 2025. The consortium developing GPT-NL is working with the government and NijBegun on a European co-financing application for the construction of an AI factory in Groningen. The consortium regards this step as a significant milestone for responsible AI innovation in the Netherlands [1][2][3].

Sources