Wikimedia

Wikimedia Software: Toward a common good on the net

Zarah Ziadi

In our efforts to provide the world with free and open access to knowledge, Wikimedia Deutschland prioritizes developing, improving and expanding sustainable open-source software. Open-source software forms an infrastructure that allows people to share, use and increase knowledge. In 2023 our focus remained on Wikidata, the free knowledge base, and our free software Wikibase. Both contributed significantly to keeping that infrastructure strong, as did our project “Technical Wishes”. As a result, we had much to celebrate in 2023.

10 years of Technical Wishes and a long Reparatursommer

To ensure that volunteers in Wikimedia projects can do their work as well as possible, the technical conditions must be right, above all Wikipedia’s user-friendliness. The “Technical Wishes” project has existed since 2013, when a Wikipedian initiated a community poll. Not long after, Wikimedia Deutschland joined the project and made the poll a permanent fixture that now drives regular improvements.

“Every year, we invite you to let us know what problems you see. That’s what the wiki page ‘Wish parking lot’ is for,” explains Johanna Strodt, the project manager responsible for community communications around Technical Wishes at Wikimedia. Prior to each survey, all the concerns are clustered and divided into subject areas. The vote determines the area the Wikimedia product development team will focus on for two years. “This does not only serve the German-speaking community; we also improve the underlying MediaWiki software for all wikis across the world,” explains Strodt. The most recent survey, in which around 1,000 volunteers participated, saw the topic “How to simplify re-using references” come out on top. Until this project, volunteer Wikipedia writers who cited multiple passages from the same book as sources for their text had to re-enter the source manually each time.

In 2023, Wikimedia Deutschland’s Technical Wishes added a second project for the first time: Reparatursommer. Strodt explains: “The community has often asked us if we could help repair certain tools or helper applications developed by volunteers that are no longer working, when for various reasons the community doesn’t know what to do.” Because capacity was available at short notice, the “repair summer” began as a trial balloon. We received 30 requests, of which we could address 19, partly by providing an assistance service that enabled volunteer developers to solve the problems themselves, and partly through bug fixes by Wikimedia Deutschland. The good news is that we were able to extend the project beyond the period we initially planned. It now goes by the name “Technical Wishes Repair Help”.

Connecting more easily with Wikidata — introducing the REST API

One focus of our technology work is the further development of Wikidata, the free and open knowledge base. With roughly 110 million data records, Wikidata has become a central resource used by numerous Wikimedia projects (with Wikipedia leading the way), as well as by a growing number of external users. To make it easier for people around the world to access its structured data, Wikidata has an application programming interface (API). In 2023 the project added a second, more user-friendly interface: the REST API. It complies with the latest industry standards, which makes it easier to use than the older Action API.

Above all, this benefits developers who have no previous experience with Wikidata. Although not all API functions have been implemented in the REST API yet, we are already noticing positive effects on user-friendliness, as the feedback on the REST API from 2023 shows.
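
To give a rough idea of what working with such an interface can look like, here is a minimal Python sketch that reads a single Wikidata item. It is an illustration only: the base path (including the version segment `v1`), the shape of the JSON response and the helper names `item_url`, `fetch_item` and `english_label` are assumptions made for this example and should be checked against the current REST API documentation.

```python
import json
import urllib.request
from typing import Optional

# Assumption: base path of the Wikibase REST API on wikidata.org.
# The version segment may differ; check the current documentation.
BASE_URL = "https://www.wikidata.org/w/rest.php/wikibase/v1"


def item_url(item_id: str) -> str:
    """Build the URL for one item, e.g. 'Q42'."""
    if not (item_id.startswith("Q") and item_id[1:].isdigit()):
        raise ValueError(f"not a valid item ID: {item_id!r}")
    return f"{BASE_URL}/entities/items/{item_id}"


def fetch_item(item_id: str, timeout: float = 10.0) -> dict:
    """Request the item and decode the JSON response (needs network access)."""
    request = urllib.request.Request(
        item_url(item_id),
        # Placeholder value; replace it with a User-Agent describing your tool.
        headers={"User-Agent": "example-script/0.1 (placeholder contact)"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def english_label(item: dict) -> Optional[str]:
    """Return the English label, assuming labels map language codes to text."""
    return item.get("labels", {}).get("en")
```

A call such as `english_label(fetch_item("Q42"))` would then return the English name of the item, provided the response has the assumed structure.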

Wikidata & AI — the fight against disinformation

Since the release of ChatGPT in 2022, the rise of AI models has progressed rapidly. Many large language models (LLMs) have been trained on Free Knowledge from Wikidata and Wikipedia. The structured, referenced data in the Wikidata knowledge graph is particularly beneficial: it significantly improves the quality and accuracy of LLM results, and “hallucinations” are often considerably reduced.

In 2023, Wikimedia Deutschland also sought ways to incorporate Wikidata data more effectively into large language models in order to further improve the reliability of their results. This led to the idea for a prototype generative AI system incorporating a retrieval-augmented generation (RAG) pipeline, which was presented at the AI.Dev/Cassandra Summit in the USA. Our further work in this area should first and foremost benefit open-source and open-data organizations, as well as startups that use large language models for non-profit open projects and whose values match those of Wikimedia Deutschland.
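
The general idea behind retrieval-augmented generation can be sketched in a few lines of Python. The example below is not the prototype mentioned above. It is a toy illustration in which the fact list, the stop-word list and the functions `retrieve` and `build_prompt` are all invented for this purpose. It ranks stored facts by word overlap with the question and puts the best matches into the prompt, which a language model would then answer.

```python
import re

# Hypothetical mini "knowledge base"; a real system would query Wikidata.
FACTS = [
    "Vesuvius is a volcano.",
    "Vesuvius is located in Italy.",
    "Berlin is the capital of Germany.",
]

# Very small stop-word list so that common words do not count as matches.
STOPWORDS = {"a", "in", "is", "of", "the", "what", "where"}


def tokenize(text: str) -> set:
    """Lower-case the text and return its set of non-stop-word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def retrieve(question: str, facts: list, k: int = 2) -> list:
    """Return up to k facts sharing the most words with the question."""
    query = tokenize(question)
    scored = [(len(query & tokenize(fact)), fact) for fact in facts]
    matching = [pair for pair in scored if pair[0] > 0]
    ranked = sorted(matching, key=lambda pair: pair[0], reverse=True)
    return [fact for _, fact in ranked[:k]]


def build_prompt(question: str, facts: list) -> str:
    """Combine the retrieved facts and the question into one prompt."""
    context = "\n".join(f"- {fact}" for fact in retrieve(question, facts))
    return f"Answer using only these facts:\n{context}\n\nQuestion: {question}"
```

Real pipelines typically replace the word-overlap ranking with a search index or embedding-based similarity, but the principle stays the same: retrieve referenced facts first, then generate.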

Wikimedia Deutschland is also continuing its collaboration with the Linux Foundation’s AI & Data Generative AI Commons. This work was started to define and support the open technology ecosystem for generative AI, and it has been put into practice by building a community of organizations that develop, promote and curate open data and models. The main mission of the LF AI & Data Foundation is to raise awareness of open generative AI and to support community projects in the areas of artificial intelligence, machine learning, education, outreach, and ethical and responsible data-related AI.

Knowledge graph — using networked information to create knowledge

A knowledge graph is akin to a large map or an information network. The best way to illustrate this is to imagine many small cards, each with the name of a thing written on it. The cards are connected according to how the concepts on them relate to each other. For example, one card might say “Vesuvius” and another “volcano”; they would be connected because Vesuvius is a volcano. In this way, a knowledge graph shows how different ideas and concepts relate to one another. As examples of knowledge graphs, the Wikidata project and the Wikibase software represent much more than conventional databases.
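
The card picture translates quite directly into code. The following Python sketch is a simplified illustration, not the way Wikidata or Wikibase store data. The class `KnowledgeGraph` and the relation names are chosen for this example. Each connection is stored as a subject, a relation and an object.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny in-memory graph: each card (subject) has labeled connections."""

    def __init__(self):
        # Maps a subject to a list of (relation, object) pairs.
        self._edges = defaultdict(list)

    def add(self, subject: str, relation: str, obj: str) -> None:
        """Connect two cards with a named relation."""
        self._edges[subject].append((relation, obj))

    def related(self, subject: str, relation: str) -> list:
        """Return every object linked to the subject by the given relation."""
        return [o for r, o in self._edges.get(subject, []) if r == relation]


graph = KnowledgeGraph()
graph.add("Vesuvius", "instance of", "volcano")
graph.add("Vesuvius", "country", "Italy")
graph.add("volcano", "subclass of", "landform")

# Which cards is "Vesuvius" connected to via "instance of"?
print(graph.related("Vesuvius", "instance of"))
```

Because every connection carries its own label, new kinds of relations can be added at any time without redesigning a table layout, which is one way a knowledge graph differs from a conventional database.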

No match for “mismatch” — better data quality with the Mismatch Finder

Occasionally, owing to inconsistent sources or research results, conflicting data on a single subject may exist. This can lead to mismatches between the data in Wikidata and external databases. To allow these discrepancies to be checked in a timely manner, Wikidata launched the Mismatch Finder, an important tool for improving data quality.

The Mismatch Finder stores discrepancies that have been identified by individuals or institutions, creating a comprehensive pool of information about potential mismatches for the Wikidata community. Community members can then review and resolve these mismatches without having to track them down first, which makes their work much easier. The Mismatch Finder thus contributes to improving the quality of networked data obtained from different databases, which leads to more accurate results in digital applications.
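
What identifying such a discrepancy involves can be shown with a short Python sketch. It does not reproduce the Mismatch Finder’s actual data format or code. The record type `Mismatch`, the function `find_mismatches` and the sample values are hypothetical and only illustrate the comparison between a Wikidata value and the value in an external database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mismatch:
    """One potential discrepancy for the community to review."""
    item_id: str
    property_name: str
    wikidata_value: str
    external_value: str


def find_mismatches(wikidata: dict, external: dict) -> list:
    """Compare values both sources hold for the same item and property."""
    mismatches = []
    for item_id, properties in wikidata.items():
        external_properties = external.get(item_id, {})
        for name, value in properties.items():
            # Only report properties that exist in both sources but differ.
            if name in external_properties and external_properties[name] != value:
                mismatches.append(
                    Mismatch(item_id, name, value, external_properties[name])
                )
    return mismatches


# Placeholder data: the item ID and the dates are invented for illustration.
wikidata_data = {"Q_EXAMPLE": {"date of birth": "1900-01-01", "country": "Italy"}}
external_data = {"Q_EXAMPLE": {"date of birth": "1901-01-01", "country": "Italy"}}

for mismatch in find_mismatches(wikidata_data, external_data):
    print(mismatch)
```

The sketch only flags differences; deciding which source is correct remains a task for human reviewers.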

Wikibase — networking the entire knowledge of humankind

Just like the data contained in our free knowledge base (Wikidata), the software that it’s based on (Wikibase) is available under an open license: it can be used by anyone working with large data sets, for example in galleries, libraries, archives, and museums (GLAM institutions) or in science and research. As with Wikidata, the Wikibase software can store, manage, retrieve and link structured data for a wide variety of knowledge repositories. More and more institutions and organizations are using Wikibase and similar Linked Open Data technologies. The world’s knowledge is thus becoming increasingly networked; knowledge silos are being dismantled. We owe this to the principle of Linked Open Data, which links data and makes it publicly accessible. The result: networked information ecosystems that build on one another and give rise to new insights.

Wikibase Cloud Beta opens up!

Wikibase Cloud celebrated an important milestone in 2023: the cloud service entered the open beta phase. Now anyone can use our cloud service to create a free knowledge database of linked data, also known as a knowledge graph. With this, we are taking another important step towards expanding the infrastructure for Free Knowledge. Open beta allows us to continue testing and optimizing the platform before it is fully developed.

Wikibase and museums — guests at the ICOM Triennial

The ICOM Triennial in Valencia in 2023 was significant in raising Wikibase’s profile. Wikimedia Deutschland was on site to demonstrate the advantages of free software to the museums and cultural institutions represented there. We took part in numerous conversations that earned us greater credibility, trust and acceptance across the museum landscape. In fact, the event led to new partnerships with institutions in India and Brazil, to name but two.

Cooperation with the Goethe-Institut; workshops in Nigeria

In 2023, in addition to the ongoing development of the software itself, Wikimedia Deutschland also drove projects and collaborations to make Wikibase better known. One example is the workshop program we developed with the Goethe-Institut aimed at cultural heritage bodies and cultural institutions. At the kick-off seminar, organized with the Goethe-Institut Thessaloniki in Greece, participants learned how GLAM institutions might make use of Wikibase and how the principles of Linked Open Data work.

Another impactful project was our Wikibase workshop series for librarians in Nigeria, for which Wikimedia Deutschland took a targeted “train the trainer” approach. Individual participants received intensive training in the Wikibase software and then passed on their knowledge to other librarians. This approach was successful: during the workshop, several Wikibase instances were created that can now be put to use in libraries and institutions. On top of that, Wikimedia Deutschland has begun a collaboration with AfLIA (African Library and Information Associations and Institutions), an organization which facilitates collaboration with libraries throughout Africa.