High Performance and Efficiency
Regatta was built for extreme performance. Each individual Regatta node delivers maximal performance, and Regatta’s distributed and parallel algorithms exploit cluster-wide resources optimally, resulting in linearly scalable performance for both transactional and analytical activities. As more nodes and resources are added to the cluster, more transactions can be executed per second and lengthy queries complete faster.
Regatta’s implementation aims to use resources such as RAM, CPU, disk, and network as leanly as possible. This resource-usage efficiency not only shortens execution time, but also reduces the cost of performing those operations, since less hardware is required. Regatta’s groundbreaking technology delivers extreme performance for distributed transactions with strong ACID guarantees.
Regatta uses state-of-the-art parallel and distributed analytics algorithms, aided by a powerful query planner and optimizer that drives the calculation strategy dynamically. Read more about Regatta’s query and analytics here. Unlike in other databases, Regatta’s transactional capabilities do not compromise its analytical capabilities or bulk workloads. On the contrary, OLAP activities such as lengthy reporting can take place concurrently with transactional (OLTP) activities. Bulk operations, such as high-rate ingress, can be performed at full steam, leveraging the inherent high bandwidth of scale-out architectures that allow clients to communicate with all the database nodes concurrently.
To eliminate single-node bottlenecks, Regatta utilizes many-to-many data propagation, which is also a powerful strategy for cutting the processing time of large tasks. Many-to-many propagation can also significantly improve throughput between the clients and Regatta, in both directions. Clients can submit commands as well as transfer data in parallel from many client nodes to many Regatta nodes, and large result sets can be transferred from many Regatta nodes to many client nodes in the same manner. In addition, Regatta can store results in a distributed and parallel manner, allowing clients to access them at a later stage. This approach may sometimes dramatically reduce the overall time an operation takes to execute.
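To make the many-to-many approach concrete, here is a minimal, illustrative Python sketch of a client-side loader that shards a batch of rows across several database nodes and sends the shards in parallel. The node addresses and the send_batch placeholder are assumptions for illustration, not Regatta’s actual client API.

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder node endpoints; in a real deployment these would come from
# cluster discovery or configuration (illustrative values only).
NODES = ["node-1:5432", "node-2:5432", "node-3:5432", "node-4:5432"]

def send_batch(node, rows):
    """Placeholder for the real client call that inserts `rows` via `node`.

    A real loader would open a connection to the node and issue the inserts;
    here the work is only simulated so the sketch stays self-contained.
    """
    return f"{node}: {len(rows)} rows sent"

def parallel_ingest(rows, nodes=NODES):
    """Shard `rows` round-robin across nodes and send all shards concurrently."""
    shards = {node: rows[i::len(nodes)] for i, node in enumerate(nodes)}
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        futures = [pool.submit(send_batch, node, shard)
                   for node, shard in shards.items()]
        return [f.result() for f in futures]

if __name__ == "__main__":
    data = [(i, f"value-{i}") for i in range(100_000)]
    for status in parallel_ingest(data):
        print(status)
```

The same fan-out pattern applies in reverse when many Regatta nodes stream results back to many client nodes.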
Often, disk I/O is the main cause of database performance problems. While disks are dramatically slower than RAM, each single access to disk moves a much larger chunk of data than a single access to RAM. As a result, the database’s data layout – the persistent data structure the database uses to store table data on disk – is a critical factor in database performance under various conditions. Most databases are built on one type of data layout. Unfortunately, there is no single data layout that fits all cases optimally. Generally, the optimal data layout depends on the disk media type (i.e., magnetic HDD or flash SSD), the type of data stored, and the nature of the workload. Therefore, “single layout” databases tend to perform well with a limited set of combinations of specific media, data types, and workloads, but under-perform elsewhere.
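As a rough illustration of why no single layout fits all workloads, the following Python sketch compares how many disk pages a single-column scan and a single-row lookup would touch under a row-oriented versus a column-oriented layout. The page size, row width, and column count are assumed values chosen only to show the trade-off.

```python
# Illustrative comparison: disk pages touched by two workloads under a
# row-oriented vs. a column-oriented layout. All figures are assumptions.
PAGE = 8192              # bytes per disk page (assumed)
ROWS = 10_000_000        # rows in the table
COLUMNS = 25             # columns per row
VALUE = 8                # bytes per column value (assumed fixed width)
ROW_SIZE = COLUMNS * VALUE

def pages(total_bytes):
    return -(-total_bytes // PAGE)   # ceiling division

# Scanning one column of every row:
scan_row_layout = pages(ROWS * ROW_SIZE)     # must read whole rows
scan_col_layout = pages(ROWS * VALUE)        # reads only that column

# Fetching one complete row by key (ignoring index pages):
lookup_row_layout = 1                        # the row sits in one page
lookup_col_layout = COLUMNS                  # one page per column segment

print("column scan  (pages):", scan_row_layout, "vs", scan_col_layout)
print("point lookup (pages):", lookup_row_layout, "vs", lookup_col_layout)
```

Under these assumptions the column layout wins the scan by roughly the row-width-to-value ratio, while the row layout wins the point lookup, which is exactly why the media, data, and workload mix determines the right layout.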
Unlike most databases, which work with only a single data layout, Regatta uses various data layouts that are optimal for different types of data, media types, and workloads. Regatta’s SSD-aware optimizations extract the highest possible performance from flash media. Furthermore, when persistent-RAM devices (e.g., Optane Memory or other NVRAM solutions) are installed in the servers, Regatta leverages them to accelerate disk-related work even further. Read more about Regatta’s optimized data layouts here.
Like all relational databases, when indexes cannot boost a query, Regatta performs partial scans and sometimes full scans to retrieve the required data. Regatta optimizes those scans so that they can reach the theoretical bandwidth of the underlying disk devices. Since Regatta distributes and balances data across the relevant nodes and disks, those scans can leverage the cluster’s massive I/O and compute parallelism to run dramatically faster.
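As a back-of-envelope illustration of that parallelism (with assumed numbers, not measured Regatta figures): if each node scans its locally stored share of a table at, say, 3 GB/s, a full scan of a 6 TB table that takes roughly 2,000 seconds on a single node drops to about 125 seconds on 16 nodes, provided the data is evenly balanced. The short Python sketch below restates this arithmetic.

```python
# Back-of-envelope scan-time estimate for an evenly balanced table.
# Table size and per-node bandwidth are assumptions for illustration only.
TABLE_BYTES = 6 * 10**12          # 6 TB table
PER_NODE_BANDWIDTH = 3 * 10**9    # 3 GB/s of local disk bandwidth per node

for nodes in (1, 4, 16, 64):
    seconds = TABLE_BYTES / (nodes * PER_NODE_BANDWIDTH)
    print(f"{nodes:>3} nodes: ~{seconds:,.0f} s for a full scan")
```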
Regatta provides sophisticated partitioning capabilities that can be used when data locality boosts certain analytical and/or transactional workloads. Furthermore, users can easily implement various tiering strategies that place hotter data on faster disk media and/or faster nodes. Read about Regatta’s partitioning and tiering here. Regatta’s RAM management combines state-of-the-art caching mechanisms, the ability to store specific columns or tables in RAM (as in-memory databases do), and the ability to use RAM resources across the entire cluster. The latter allows large intermediate calculation results to be kept temporarily in distributed RAM, which may be significantly faster than the alternative of swapping to disk.
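To illustrate the kind of policy such tiering enables, here is a minimal, hypothetical Python sketch that assigns partitions to NVRAM, SSD, or HDD tiers by access recency. The Partition type, tier names, and thresholds are illustrative assumptions, not Regatta syntax or defaults.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Partition:
    name: str
    last_access: datetime   # most recent read or write of this partition

# Illustrative thresholds; a real policy would be tuned to the workload.
HOT = timedelta(days=1)
WARM = timedelta(days=30)

def choose_tier(p: Partition, now: datetime) -> str:
    """Map a partition to a storage tier by access recency."""
    age = now - p.last_access
    if age <= HOT:
        return "nvram"    # hottest data on persistent RAM / fastest nodes
    if age <= WARM:
        return "ssd"
    return "hdd"

if __name__ == "__main__":
    now = datetime.now()
    parts = [Partition("orders_2024_q4", now - timedelta(hours=2)),
             Partition("orders_2024_q2", now - timedelta(days=12)),
             Partition("orders_2021_q1", now - timedelta(days=900))]
    for p in parts:
        print(p.name, "->", choose_tier(p, now))
```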