Wikidata Transforms Data into Vectors for Open LLM Access

Wikidata's vectorization project opens up its vast knowledge to Large Language Models, improving their reliability and reducing false information. The project, a collaboration with Jina AI and Astra DB, aims to enhance LLM applications like fact-checking.

By Administrator

October 1, 2025, 2:07 PM

Wikidata, the world's largest open knowledge graph, is converting its data into vectors and storing them in Astra DB, a vector database. The project, a collaboration between the Wikimedia Foundation and Jina AI, began in September 2024. Its goal is to give Large Language Models (LLMs) a freely accessible interface to Wikidata, making them more transparent, reliable, and fair.

Wikidata is maintained by around 24,000 volunteers worldwide each month and contains approximately 119 million entries. The project recommends a two-step approach: first use semantic vector search to find the right datasets, then structure the retrieved knowledge with a graph database (GraphRAG). The vector database supports search queries in English, French, and Arabic, with Spanish and Mandarin planned by the end of the year.
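To make the two-step idea concrete, here is a minimal sketch in plain Python. It is not the project's code: the entry IDs, the three-dimensional vectors, the graph edges, and the function names `vector_search` and `expand_with_graph` are all made up for illustration. A real system would obtain embeddings from a model and query a vector database instead of a dictionary.

```python
import math

# Toy 3-dimensional "embeddings". Real embeddings have hundreds of dimensions
# and come from an embedding model. IDs and values here are invented.
ENTRY_VECTORS = {
    "E1": [0.9, 0.1, 0.0],  # stands in for an entry about a physicist
    "E2": [0.1, 0.9, 0.0],  # stands in for an entry about a city
    "E3": [0.8, 0.2, 0.1],  # stands in for an entry about a physics prize
}

# Toy graph: entry ID -> list of (property, target entry ID). Also invented.
GRAPH = {
    "E1": [("award received", "E3")],
    "E2": [],
    "E3": [],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def vector_search(query_vector, k=2):
    """Step 1: return the IDs of the k entries most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vec), entry_id)
        for entry_id, vec in ENTRY_VECTORS.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [entry_id for _, entry_id in scored[:k]]


def expand_with_graph(entry_ids):
    """Step 2: collect the graph statements attached to the retrieved entries."""
    facts = []
    for entry_id in entry_ids:
        for prop, target in GRAPH.get(entry_id, []):
            facts.append((entry_id, prop, target))
    return facts
```

With these toy values, a query vector of `[1.0, 0.0, 0.0]` retrieves `E1` and `E3`, and the graph step then surfaces the statement linking them.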

The new technology aims to improve LLMs by providing them with structured, up-to-date, and verified information, which should reduce incorrect answers and hallucinations. Wikimedia envisions applications such as fact-checking and vandalism-prevention tools. The application's source code is available under the MIT license.

The embedding project, carried out with partners Jina AI and Astra DB, lets developers connect Wikidata's vectorized data to LLMs using Retrieval-Augmented Generation (RAG) and the Model Context Protocol (MCP). This open access to Wikidata aims to improve the quality of LLMs worldwide.
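The generation half of RAG can be sketched in a few lines: retrieved facts are placed in the prompt so the model answers from them rather than from memory. The function name `build_grounded_prompt` and the prompt wording are assumptions for illustration, not part of the Wikidata project, and the call to an actual LLM is deliberately left out.

```python
def build_grounded_prompt(question, facts):
    """Combine retrieved facts and the user's question into one prompt string."""
    if facts:
        context = "\n".join(f"- {fact}" for fact in facts)
    else:
        # Make an empty retrieval explicit so the model does not guess.
        context = "(no matching facts found)"
    return (
        "Answer using only the facts below. "
        "If they are insufficient, say so.\n"
        f"Facts:\n{context}\n"
        f"Question: {question}"
    )


# Usage: the fact list would normally come from a retrieval step.
prompt = build_grounded_prompt(
    "What is the capital of France?",
    ["Paris is the capital of France."],
)
```

Keeping the instruction to admit insufficient facts in the prompt is one simple way to discourage hallucinated answers when retrieval comes back empty.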
