Best Distributed Databases - Page 2

Compare the Top Distributed Databases as of June 2025 - Page 2

  • 1
    Greenplum

    Greenplum Database

    Greenplum Database® is an advanced, fully featured, open source data warehouse. It provides powerful and rapid analytics on petabyte-scale data volumes. Uniquely geared toward big data analytics, Greenplum Database is powered by the world’s most advanced cost-based query optimizer, delivering high analytical query performance on large data volumes. The Greenplum Database® project is released under the Apache 2 license. We want to thank all our current community contributors and are interested in all new potential contributions. For the Greenplum Database community, no contribution is too small; we encourage all types of contributions. An open-source massively parallel data platform for analytics, machine learning and AI. Rapidly create and deploy models for complex applications in cybersecurity, predictive maintenance, risk management, fraud detection, and many other areas. Experience the fully featured, integrated, open source analytics platform.
  • 2
    Blazegraph

    Blazegraph

    Blazegraph™ DB is an ultra-high-performance graph database supporting Blueprints and RDF/SPARQL APIs. It supports up to 50 billion edges on a single machine. It is in production use at Fortune 500 customers such as EMC, Autodesk, and many others. It supports key Precision Medicine applications and is widely used in life science applications. It is used extensively to support cyber analytics in commercial and government applications. It powers the Wikimedia Foundation's Wikidata Query Service. You can choose an executable jar, war file, or tar.gz distribution. Blazegraph is designed to be easy to use and to get started with. For this reason, it ships without SSL or authentication by default. For production deployments, we strongly recommend you enable SSL, authentication, and appropriate network configurations. There are some helpful links below to enable you to do this.
  • 3
    Grakn

    Grakn Labs

    Building intelligent systems starts at the database. Grakn is an intelligent database: a knowledge graph. An insanely intuitive & expressive data schema, with constructs to define hierarchies, hyper-entities, hyper-relations and rules, to build rich knowledge models. An intelligent language that performs logical inference of data types, relationships, attributes and complex patterns, at runtime, over distributed & persisted data. Out-of-the-box distributed analytics algorithms (Pregel and MapReduce), accessible through simple queries in the language. Strong abstraction over low-level patterns, enabling simpler expressions of complex constructs, while the system figures out the optimal query execution. Scale your enterprise Knowledge Graph with Grakn KGMS and Workbase. A distributed database designed to scale over a network of computers through partitioning and replication.
  • 4
    GaussDB

    Huawei Cloud

    GaussDB (for MySQL) is a next-generation, MySQL-compatible, enterprise-class distributed database service. It uses a decoupled compute and storage architecture and data functions virtualization (DFV) storage that auto-scales up to 128 TB per DB instance. There is virtually no risk of data loss. It supports millions of queries per second (QPS) and cross-AZ deployment, combining the performance and reliability of commercial databases with the flexibility of open source databases. By decoupling compute and storage, connecting them through RDMA, and using a "log as database" architecture, you can get seven times the performance of open-source databases. To scale read capacity and performance, you can add up to 15 read replicas for a primary node within minutes. GaussDB (for MySQL) is fully compatible with MySQL. You can easily migrate your MySQL databases to GaussDB (for MySQL) without reconstructing existing applications and without sharding.
    Starting Price: $2,586.04 per month
  • 5
    CrateDB

    CrateDB

    The enterprise database for time series, documents, and vectors. Store any type of data and combine the simplicity of SQL with the scalability of NoSQL. CrateDB is an open source distributed database running queries in milliseconds, whatever the complexity, volume and velocity of data.
  • 6
    GigaSpaces

    GigaSpaces

    Smart DIH is an operational data hub that powers real-time modern applications. It unleashes the power of customers’ data by transforming data silos into assets, turning organizations into data-driven enterprises. Smart DIH consolidates data from multiple heterogeneous systems into a highly performant data layer. Low-code tools empower data professionals to deliver data microservices in hours, shortening development cycles and ensuring data consistency across all digital channels. XAP Skyline is a cloud-native, in-memory data grid (IMDG) and developer framework designed for mission-critical, cloud-native apps. XAP Skyline delivers maximal throughput, microsecond latency and scale, while maintaining transactional consistency. It provides extreme performance, significantly reducing data access time, which is crucial for real-time decisioning and transactional applications. XAP Skyline is used in financial services, retail, and other industries where speed and scalability are critical.
  • 7
    HCL OneDB

    HCL Software

    Build and run distributed, database-driven, fully cloud-native enterprise applications with the highest levels of availability, scalability, and performance. For enterprises just starting their cloud native journey or those already executing a multi-cloud strategy, OneDB offers the flexibility, reliability, and ease of use needed to meet your application needs. Capturing the value of data for insight and actionable intelligence is made easier through fully automated database administration. You can drastically reduce the need for deep technical expertise to launch new ideas and still stay ahead of the competition. OneDB is great for application development. From broad support of interfaces and APIs to extensive programming language support, developers will find everything they need with OneDB. HCL offers the most versatile cloud native database in the market.
  • 8
    PolarDB

    Alibaba Cloud

    PolarDB is designed for business-critical database applications that require fast performance, high concurrency, and automatic scaling. You can scale up to millions of queries per second and 100 TB per database cluster with 15 low latency read replicas. PolarDB is six times faster than standard MySQL databases, and delivers the security, reliability, and availability of traditional commercial databases at 1/10 the cost. PolarDB embodies the proven database technology and best practices honed over the last decade that supported hyper-scale events such as the Alibaba Double 11 Global Shopping Festival. To support the developer community, we are introducing Always Free ApsaraDB for PolarDB (all three variations) when you use no more than 1 instance (2-core and 8GB of memory), and up to 50GB of storage. Register now and renew each month to continue this benefit. Regional resource availability is subject to change.
  • 9
    Citus

    Citus Data

    Citus gives you the Postgres you love, plus the superpower of distributed tables. 100% open source. Now with schema-based and row-based sharding, plus Postgres 16 support. Scale Postgres by distributing data & queries. You can start with a single Citus node, then add nodes & rebalance shards when you need to grow. Speed up queries by 20x to 300x (or more) through parallelism, keeping more data in memory, higher I/O bandwidth, and columnar compression. Citus is an extension (not a fork) to the latest Postgres versions, so you can use your familiar SQL toolset & leverage your Postgres expertise. Reduce your infrastructure headaches by using a single database for both your transactional and analytical workloads. Download and use Citus open source for free. You can manage Citus yourself, embrace open source, and help us improve Citus via GitHub. Focus on your application & forget about your database. Run your app on Citus in the cloud with Azure Cosmos DB for PostgreSQL.
    Starting Price: $0.27 per hour
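    As a rough illustration of the row-based sharding and "add nodes & rebalance shards" workflow described above, here is a minimal sketch in plain Python. It is not Citus code and does not reflect the extension's actual hashing or rebalancing algorithm; `SHARD_COUNT`, `shard_for`, `place_shards`, and `rebalance` are hypothetical names chosen for this example. The idea shown is that rows map to a fixed set of logical shards, so growing the cluster only moves whole shards between nodes.

```python
import hashlib

SHARD_COUNT = 8  # hypothetical: a fixed number of logical shards


def shard_for(key: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a distribution-column value to a logical shard via a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def place_shards(nodes: list[str], shard_count: int = SHARD_COUNT) -> dict[int, str]:
    """Assign logical shards to nodes round-robin."""
    return {shard: nodes[shard % len(nodes)] for shard in range(shard_count)}


def rebalance(placement: dict[int, str], nodes: list[str]) -> dict[int, str]:
    """Move shards from the fullest node to the emptiest until counts differ by at most 1.

    Assumes every node already present in `placement` is also listed in `nodes`.
    """
    new_placement = dict(placement)
    by_node: dict[str, list[int]] = {node: [] for node in nodes}
    for shard, node in sorted(new_placement.items()):
        by_node[node].append(shard)
    while True:
        busiest = max(nodes, key=lambda n: len(by_node[n]))
        idlest = min(nodes, key=lambda n: len(by_node[n]))
        if len(by_node[busiest]) - len(by_node[idlest]) <= 1:
            return new_placement
        shard = by_node[busiest].pop()
        by_node[idlest].append(shard)
        new_placement[shard] = idlest


if __name__ == "__main__":
    placement = place_shards(["node-a", "node-b"])
    grown = rebalance(placement, ["node-a", "node-b", "node-c"])
    moved = [s for s in placement if placement[s] != grown[s]]
    print("row 'tenant-42' lives in shard", shard_for("tenant-42"))
    print("shards moved to the new node:", moved)
```

    Because the row-to-shard mapping never changes, adding a node in this sketch relocates only a few shards rather than rehashing every row.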
  • 10
    Tarantool

    Tarantool

    Corporations need a way to ensure uninterrupted operation of their systems, high speed of data processing, and reliability of storage. The in-memory technologies have proven themselves well in solving these problems. For more than 10 years, Tarantool has been helping companies all over the world build smart caches, data marts, and golden client profiles while saving server capacity. Reduce the cost of storing credentials compared to siloed solutions and improve the service and security of client applications. Reduce data management costs of maintaining a large number of disparate systems that store customer identities. Increase sales by improving the speed and quality of customer recommendations for goods or services through the analysis of user behavior and user data. Improve mobile and web channel service by accelerating frontends to reduce user outflow. IT systems of large organizations operate in a closed loop of a local network, where data circulates unprotected.
  • 11
    NuoDB

    NuoDB

    The world is moving to distributed applications and architectures, and your database should too. Learn how you can deploy where you want, when you want, and how you want with a distributed SQL database. Migrate existing SQL applications to a distributed, multi-node architecture that can dynamically scale out and in. Our Transaction Engines (TEs) and Storage Managers (SMs) work together to ensure ACID compliance across multiple nodes. Deploy in a distributed architecture. When you deploy your database with multiple nodes, the loss of one or more nodes will not result in the loss of database access. Deploy TEs and SMs to meet your variable workload needs, or deploy in the different environments the teams in your organization use: in private and public clouds, in hybrid environments, and across clouds.
  • 12
    Couchbase

    Couchbase

    Unlike other NoSQL databases, Couchbase provides an enterprise-class, multicloud-to-edge database that offers the robust capabilities required for business-critical applications on a highly scalable and available platform. As a distributed cloud-native database, Couchbase runs in modern dynamic environments and on any cloud, either customer-managed or fully managed as-a-service. Couchbase is built on open standards, combining the best of NoSQL with the power and familiarity of SQL, to simplify the transition from mainframe and relational databases. Couchbase Server is a multipurpose, distributed database that fuses the strengths of relational databases, such as SQL and ACID transactions, with JSON’s versatility, on a foundation that is extremely fast and scalable. It’s used across industries for things like user profiles, dynamic product catalogs, GenAI apps, vector search, high-speed caching, and much more.
  • 13
    MarkLogic

    Progress Software

    Unlock data value, accelerate insightful decisions, and securely achieve data agility with the MarkLogic data platform. Combine your data with everything known about it (metadata) in a single service and reveal smarter decisions—faster. Get a faster, trusted way to securely connect data and metadata, create and interpret meaning, and consume high-quality contextualized data across the enterprise with the MarkLogic data platform. Know your customers in-the-moment and provide relevant and seamless experiences, reveal new insights to accelerate innovation, and easily enable governed access and compliance with a single data platform. MarkLogic provides a proven foundation to help you achieve your key business and technical outcomes—now and in the future.
  • 14
    VoltDB

    VoltDB

    Volt Active Data is a data platform built to make your entire tech stack leaner, faster, and less expensive, so that your applications (and your company) can scale seamlessly to meet the ultra-low latency SLAs of 5G, IoT, edge computing, and whatever comes next. Designed to augment your existing big data investments, such as NoSQL, Hadoop, Kubernetes, Kafka, and traditional databases or data warehouses, Volt Active Data replaces the various layers typically required to make contextual decisions on streaming data with a single, unified layer that can handle ingest to action in less than 10 milliseconds. The world is full of data that’s generated, stored, forgotten, and then deleted. “Active Data” is data that needs to be acted on immediately to gain business value from it. There are lots of traditional and NoSQL data storage products that you can use to keep such data. There’s also data that you can make money from, if only you can act on it fast enough to ‘influence the moment’.
  • 15
    Neo4j

    Neo4j

    Neo4j’s graph data platform is purpose-built to leverage not only data but also data relationships. Using Neo4j, developers build intelligent applications that traverse today's large, interconnected datasets in real time. Powered by a native graph storage and processing engine, Neo4j’s graph database delivers an intuitive, flexible and secure database for unique, actionable insights.
  • 16
    Apache HBase

    The Apache Software Foundation

    Use Apache HBase™ when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables (billions of rows by millions of columns) atop clusters of commodity hardware. Automatic failover support between RegionServers. Easy-to-use Java API for client access. Thrift gateway and a RESTful Web service that supports XML, Protobuf, and binary data encoding options. Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia, or via JMX.
  • 17
    Google Cloud Bigtable
    Google Cloud Bigtable is a fully managed, scalable NoSQL database service for large analytical and operational workloads. Fast and performant: Use Cloud Bigtable as the storage engine that grows with you from your first gigabyte to petabyte-scale for low-latency applications as well as high-throughput data processing and analytics. Seamless scaling and replication: Start with a single node per cluster, and seamlessly scale to hundreds of nodes dynamically supporting peak demand. Replication also adds high availability and workload isolation for live serving apps. Simple and integrated: Fully managed service that integrates easily with big data tools like Hadoop, Dataflow, and Dataproc. Plus, support for the open source HBase API standard makes it easy for development teams to get started.
  • 18
    TiDB

    PingCAP

    An open-source, cloud-native, distributed SQL database for elastic scale and real-time analytics. Supported by a wealth of open-source data migration tools in the ecosystem, TiDB gives you the freedom to choose your own vendor and avoid lock-in. Purpose-built to deliver SQL at scale, TiDB eliminates the scaling problems of traditional relational databases without intruding on your application. An HTAP database platform that enables real-time situational awareness and decision making on live transactional data and eliminates friction between IT and business goals. TiDB is ACID-compliant and strongly consistent. You can use TiDB as a scale-out MySQL database with familiar SQL syntax and ecosystem tools. TiDB automatically shards your data so you don’t have to do it manually. You can simply add new nodes to scale horizontally and elastically to meet your business growth. TiDB simplifies the ETL process and automatically recovers from errors.
  • 19
    Vitess

    Vitess

    A database clustering system for horizontal scaling of MySQL. Vitess combines many important MySQL features with the scalability of a NoSQL database. Its built-in sharding features let you grow your database without adding sharding logic to your application. Vitess automatically rewrites queries that hurt database performance. It also uses caching mechanisms to mediate queries and prevent duplicate queries from simultaneously reaching your database. Vitess automatically handles functions like master failovers and backups. It uses a lock server to track and administer servers, letting your application be blissfully ignorant of database topology. Vitess eliminates the high-memory overhead of MySQL connections. Vitess servers easily handle thousands of connections at once. MySQL doesn’t natively support sharding, but you will likely need it as your database grows.
  • 20
    JanusGraph

    JanusGraph

    JanusGraph is a scalable graph database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster. JanusGraph is a project under The Linux Foundation, and includes participants from Expero, Google, GRAKN.AI, Hortonworks, IBM and Amazon. Elastic and linear scalability for a growing data and user base. Data distribution and replication for performance and fault tolerance. Multi-datacenter high availability and hot backups. All functionality is totally free. No need to buy commercial licenses. JanusGraph is fully open source under the Apache 2 license. JanusGraph is a transactional database that can support thousands of concurrent users executing complex graph traversals in real time. Support for ACID and eventual consistency. In addition to online transactional processing (OLTP), JanusGraph supports global graph analytics (OLAP) with its Apache Spark integration.
  • 21
    Nebula Graph
    The graph database built for super large-scale graphs with millisecond latency. We are continuing to collaborate with the community to prepare, popularize and promote the graph database. Nebula Graph only allows authenticated access via role-based access control. Nebula Graph supports multiple storage engine types, and the query language can be extended to support new algorithms. Nebula Graph provides low-latency reads and writes while still maintaining high throughput, to simplify the most complex data sets. With a shared-nothing distributed architecture, Nebula Graph offers linear scalability. Nebula Graph's SQL-like query language is easy to understand and powerful enough to meet complex business needs. With horizontal scalability and a snapshot feature, Nebula Graph guarantees high availability even in case of failures. Large Internet companies like JD, Meituan, and Xiaohongshu have deployed Nebula Graph in production environments.
  • 22
    Apache Geode
    Build high-speed, data-intensive applications that elastically meet performance requirements at any scale. Take advantage of Apache Geode's unique technology that blends advanced techniques for data replication, partitioning and distributed processing. Apache Geode provides a database-like consistency model, reliable transaction processing and a shared-nothing architecture to maintain very low latency performance with high concurrency processing. Data can easily be partitioned (sharded) or replicated between nodes allowing performance to scale as needed. Durability is ensured through redundant in-memory copies and disk-based persistence. Super fast write-ahead-logging (WAL) persistence with a shared-nothing architecture that is optimized for fast parallel recovery of nodes or an entire cluster.
  • 23
    FoundationDB

    FoundationDB

    FoundationDB is multi-model, meaning you can store many types of data in a single database. All data is safely stored, distributed, and replicated in the Key-Value Store component. FoundationDB is easy to install, grow, and manage. It has a distributed architecture that gracefully scales out and handles faults while acting like a single ACID database. FoundationDB provides amazing performance on commodity hardware, allowing you to support very heavy loads at low cost. FoundationDB has been running in production for years and has been hardened with lessons learned. Backing FoundationDB up is an unmatched testing system based on a deterministic simulation engine. We encourage your participation in our open-source community! Join us in technical and user discussions on the community forums, and learn how to contribute.
  • 24
    Apache Accumulo

    The Apache Software Foundation

    With Apache Accumulo, users can store and manage large data sets across a cluster. Accumulo uses Apache Hadoop's HDFS to store its data and Apache ZooKeeper for consensus. While many users interact directly with Accumulo, several open source projects use Accumulo as their underlying store. To learn more about Accumulo, take the Accumulo tour, read the user manual and run the Accumulo example code. Feel free to contact us if you have any questions. Accumulo has a programming mechanism (called Iterators) that can modify key/value pairs at various points in the data management process. Every Accumulo key/value pair has its own security label which limits query results based off user authorizations. Accumulo runs on a cluster using one or more HDFS instances. Nodes can be added or removed as the amount of data stored in Accumulo changes.
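    The per-entry security label idea above can be sketched in a few lines of plain Python. This is only an illustration and not the Accumulo API: `Cell` and `scan` are hypothetical names, and each label is simplified to a set of tokens that must all be held by the reader, whereas real visibility labels can be richer boolean expressions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One key/value pair carrying its own security label (hypothetical type)."""
    row: str
    column: str
    value: str
    # Simplification: the label is a set of tokens that are ALL required.
    # An empty label means the cell is visible to everyone.
    label: frozenset = frozenset()


def scan(cells, authorizations):
    """Return only the cells whose label is satisfied by the user's authorizations."""
    auths = frozenset(authorizations)
    return [cell for cell in cells if cell.label <= auths]


if __name__ == "__main__":
    cells = [
        Cell("patient1", "name", "Ada"),
        Cell("patient1", "diagnosis", "flu", frozenset({"medical"})),
        Cell("patient1", "billing", "$40", frozenset({"medical", "finance"})),
    ]
    for user_auths in (set(), {"medical"}, {"medical", "finance"}):
        visible = [cell.column for cell in scan(cells, user_auths)]
        print(sorted(user_auths), "->", visible)
```

    The same query returns different results for different users because filtering happens per key/value pair, not per table.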
  • 25
    HerdDB

    Diennea

    HerdDB is a distributed SQL database implemented in Java. It has been designed to be embeddable in any Java Virtual Machine. It is optimized for fast "writes" and primary key read/update access patterns. HerdDB is designed to manage hundreds of tables. It is simple to add and remove hosts and to reconfigure tablespaces to easily distribute the load across multiple systems. HerdDB leverages Apache ZooKeeper and Apache BookKeeper to build a fully replicated, shared-nothing architecture without any single point of failure. At the low level, HerdDB is very similar to a key-value NoSQL database. On top of that, an SQL abstraction layer and JDBC driver support enable every user to leverage existing know-how and port existing applications to HerdDB. At Diennea we developed EmailSuccess, a powerful MTA (Mail Transfer Agent), designed to deliver millions of email messages per hour to inboxes all around the world.
  • 26
    Apache Kudu

    The Apache Software Foundation

    A Kudu cluster stores tables that look just like tables you're used to from relational (SQL) databases. A table can be as simple as a binary key and value, or as complex as a few hundred different strongly-typed attributes. Just like SQL, every table has a primary key made up of one or more columns. This might be a single column like a unique user identifier, or a compound key such as a (host, metric, timestamp) tuple for a machine time-series database. Rows can be efficiently read, updated, or deleted by their primary key. Kudu's simple data model makes it a breeze to port legacy applications or build new ones: there is no need to worry about how to encode your data into binary blobs or make sense of a huge database full of hard-to-interpret JSON. Tables are self-describing, so you can use standard tools like SQL engines or Spark to analyze your data. Kudu's APIs are designed to be easy to use.
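    To make the compound primary key idea concrete, here is a small sketch in plain Python, assuming an in-memory table keyed by a (host, metric, timestamp) tuple. It does not use the Kudu client API; `PrimaryKeyTable` and its methods are hypothetical names for illustration. Keeping keys sorted is what makes both point access and time-range scans for one host and metric cheap.

```python
import bisect


class PrimaryKeyTable:
    """Toy table whose rows are addressed by a compound (host, metric, timestamp) key."""

    def __init__(self):
        self._keys = []  # sorted list of (host, metric, timestamp) tuples
        self._rows = {}  # key -> value

    def upsert(self, host, metric, timestamp, value):
        """Insert a row, or update it in place if the primary key already exists."""
        key = (host, metric, timestamp)
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = value

    def get(self, host, metric, timestamp):
        """Point read by full primary key; returns None if the row is absent."""
        return self._rows.get((host, metric, timestamp))

    def delete(self, host, metric, timestamp):
        """Delete by full primary key; returns True if a row was removed."""
        key = (host, metric, timestamp)
        if key not in self._rows:
            return False
        del self._rows[key]
        self._keys.pop(bisect.bisect_left(self._keys, key))
        return True

    def scan(self, host, metric, start_ts, end_ts):
        """Return (timestamp, value) pairs with start_ts <= timestamp < end_ts."""
        lo = bisect.bisect_left(self._keys, (host, metric, start_ts))
        hi = bisect.bisect_left(self._keys, (host, metric, end_ts))
        return [(key[2], self._rows[key]) for key in self._keys[lo:hi]]


if __name__ == "__main__":
    table = PrimaryKeyTable()
    table.upsert("web1", "cpu", 30, 0.7)
    table.upsert("web1", "cpu", 10, 0.5)
    table.upsert("web1", "cpu", 20, 0.6)
    table.upsert("web2", "cpu", 15, 0.9)
    print(table.scan("web1", "cpu", 10, 30))
```

    Putting the timestamp last in the key keeps all samples for one host and metric adjacent, so a time-range query touches a single contiguous slice.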
  • 27
    BigchainDB

    BigchainDB

    With high throughput, low latency, powerful query functionality, decentralized control, immutable data storage and built-in asset support, BigchainDB is like a database with blockchain characteristics. BigchainDB allows developers and enterprises to deploy blockchain proof-of-concepts, platforms and applications with a blockchain database, supporting a wide range of industries and use cases. Rather than trying to enhance blockchain technology, BigchainDB starts with a big data distributed database and then adds blockchain characteristics: decentralized control, immutability and the transfer of digital assets. No single point of control. No single point of failure. Decentralized control via a federation of voting nodes makes for a P2P network. Write and run any MongoDB query to search the contents of all stored transactions, assets, metadata and blocks. Powered by MongoDB itself.
  • 28
    rqlite

    rqlite

    The lightweight, user-friendly, distributed relational database built on SQLite. Fault tolerance and high availability with zero hassle. rqlite is a distributed relational database that combines the simplicity of SQLite with the robustness of a fault-tolerant, highly available system. It's developer-friendly, its operation is straightforward, and it's designed for reliability with minimal complexity. Deploy in seconds, with no complex configurations. Seamlessly integrates with modern cloud infrastructures. Built on SQLite, the world’s most popular database. Supports full-text search, Vector Search, and JSON documents. Access controls and encryption for secure deployments. Rigorous, automated testing ensures high quality. Clustering provides high availability and fault tolerance. Automatic node discovery simplifies clustering.
  • 29
    OceanBase

    OceanBase

    OceanBase eliminates the complexity of traditional sharding databases, enabling you to effortlessly scale your database to meet ever-growing workloads, whether horizontally, vertically, or even at the tenant level. This facilitates on-the-fly scaling and linear performance growth without downtime or necessitating changes to applications in high-concurrency scenarios, ensuring quicker and more reliable responses to performance-intensive critical workloads. Empower mission-critical workloads and performance-intensive applications across both OLTP and OLAP scenarios, all while maintaining full compatibility with MySQL. 100% ACID Compliance, natively supports distributed transactions with multi-replica strong synchronization built upon Paxos protocols. Experience ultimate query performance that your mission-critical and time-sensitive workloads can depend on. This effectively eliminates downtime, and ensures your mission-critical workload remains always available.
  • 30
    ArangoDB

    ArangoDB

    Natively store data for graph, document and search needs. Utilize feature-rich access with one query language. Map data natively to the database and access it with the best patterns for the job – traversals, joins, search, ranking, geospatial, aggregations – you name it. Polyglot persistence without the costs. Easily design, scale and adapt your architectures to changing needs and with much less effort. Combine the flexibility of JSON with semantic search and graph technology for next generation feature extraction even for large datasets.