Simple beats complex – more detail

Feature-rich, scale-out products may sound complex to manage, but when they are designed right, with architectural simplicity at the core, the opposite is true.

Simplicity in a platform translates to simpler and better code, which in turn accelerates iteration velocity for developers. This post analyzes this topic in more detail and discusses how Regatta achieves it.

Regatta is easy to use: it offers users a powerful yet simple experience. Most of all, as Regatta matures, it represents one of the best opportunities to simplify the broader data-platform complexity that frustrates us all.

Regatta offers the strongest possible ACID guarantees, powered by groundbreaking technology that executes distributed transactions at high performance. Such guarantees let the application developer focus on business logic rather than handling the failure scenarios, race conditions, and other corner cases that are typical of databases with no or limited ACID guarantees. Developers working with Regatta write less code and reach the market sooner than those relying on eventually-consistent NoSQL databases, or on scale-out sharding databases that do not guarantee ACID across the entire cluster. Read more about Regatta’s transactional support here.
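To make the "less code" claim concrete, here is a minimal sketch of what an ACID transaction buys the application. Regatta's client API is not shown in this post, so the sketch uses Python's standard DB-API with sqlite3 purely for illustration; the pattern, not the driver, is the point. Either both updates apply or neither does, so there is no partial state for the application to detect and repair:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    # The `with conn:` block commits on success and rolls back on exception,
    # so a failed transfer leaves no half-applied state behind.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        cur = conn.execute("SELECT balance FROM accounts WHERE name = ?", (src,))
        if cur.fetchone()[0] < 0:
            raise ValueError("insufficient funds")  # triggers rollback

        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                     (amount, dst))

transfer(conn, "alice", "bob", 30)       # succeeds: both rows updated
try:
    transfer(conn, "alice", "bob", 1000) # fails: both updates rolled back
except ValueError:
    pass

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)  # {'alice': 70, 'bob': 80}
```

Without transactional guarantees, every line of this function would need explicit compensation logic for crashes and concurrent writers; with them, the error path is a single `raise`.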

Regatta guarantees strong ACID across the entire cluster, enabling transactions to safely span multiple nodes. Unlike traditional scale-out-by-sharding databases, which offer no or limited functionality across the boundaries of a single shard, all of Regatta’s functionality, whether transactional or analytical, is always fully supported across the entire cluster. This simplifies development: working around sharding limitations forces developers to write far more, and far more complex, application code. Worse, in many cases those limitations simply cannot be handled at the application level, no matter how hard one tries.
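To illustrate the kind of code that shard boundaries push into the application, here is a sketch (hypothetical helper names, plain Python, no real driver) of the scatter-gather routine a developer must hand-write when the database cannot run a filtered, sorted query across shards. The application fans out to every shard, merges the partial results, and re-sorts them itself:

```python
from typing import Callable

def query_all_shards(shards: list[list[dict]],
                     predicate: Callable[[dict], bool],
                     sort_key: str, limit: int) -> list[dict]:
    partials = []
    for shard in shards:                      # one round-trip per shard
        partials.extend(r for r in shard if predicate(r))
    partials.sort(key=lambda r: r[sort_key])  # the merge step lives in the app
    return partials[:limit]

# Each inner list stands in for the rows held by one shard.
shards = [
    [{"id": 1, "score": 90}, {"id": 4, "score": 55}],
    [{"id": 2, "score": 70}],
    [{"id": 3, "score": 85}],
]
top = query_all_shards(shards, lambda r: r["score"] >= 70, "score", 2)
print([r["id"] for r in top])  # [2, 3]
```

Every query shape (joins, aggregates, pagination) needs its own variant of this routine, and none of it is transactionally consistent across shards; with cluster-wide SQL, the same request is a single query.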

Traditional scale-out-by-sharding databases bring additional complexity and headaches to both developers and operations engineers. Because of the various limitations of scale-out sharding, the choice of sharding criterion is often critical. Unfortunately, no matter how hard one tries, any choice of sharding criterion will generally benefit one type of usage and hurt another, and the way the application code is written is directly shaped by that choice. Sooner or later the sharding criterion may need to be modified (a process known as resharding), for instance to expand the cluster or to deal with the capacity overflow of a specific shard. Resharding is a notoriously painful and disruptive process: it involves massive reshuffling of data across single-node database shards, and often requires significant changes to the application code.
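To see why resharding reshuffles so much data, consider a minimal sketch of the common modulo-hash routing scheme (hypothetical helper, not any particular product's code). Growing the cluster from four shards to five remaps the large majority of keys, and every remapped key's rows must physically move:

```python
import zlib

def shard_for(key: str, num_shards: int) -> int:
    # Stable hash -> shard index; this routing logic is baked into the app.
    return zlib.crc32(key.encode()) % num_shards

keys = [f"user-{i}" for i in range(10_000)]
before = {k: shard_for(k, 4) for k in keys}
after = {k: shard_for(k, 5) for k in keys}   # cluster grows by one shard

moved = sum(before[k] != after[k] for k in keys)
print(f"{moved / len(keys):.0%} of keys change shards")  # roughly 80%
```

A key stays put only when its hash gives the same remainder mod 4 and mod 5, which happens for about one key in five; the other ~80% of the data set must be migrated. Consistent hashing reduces the movement but not the application-level coupling to the sharding scheme.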

In contrast, Regatta suffers from none of the above. The developer does not need to define any sharding criteria, because Regatta automatically distributes tables across the cluster. Excessively uneven growth or shrinkage of table rows across the intended disks and nodes triggers automatic, non-disruptive rebalancing, as does adding (or removing) nodes or disks in a pool on which table data is stored. Regatta’s rebalancing requires no modifications to the application code, since data placement is transparent to the application to begin with. This completely frees administrators from the operational headache and cost traditionally associated with resharding. Moreover, Regatta’s elasticity eliminates the need for careful capacity planning ahead of time: rather than committing to a fixed configuration, you can start with your best guess and fine-tune later if needed. Read more about Regatta’s elasticity here.

Regatta’s sophisticated partitioning provides several benefits: it maintains locality among related rows, it allows tiering, and it supports segregation for multi-tenancy, which also benefits scenarios with multiple competing workloads. Partitioning is purely optional in Regatta. Users define partitioning with simple, intuitive partitioning criteria, high-level policies, and placement rules, leaving Regatta to do the underlying hard work. Unlike some other databases, Regatta takes on as much responsibility as possible; for example, node capacity overflow is dealt with automatically, well ahead of time, by automatic rebalancing. Read more about Regatta’s partitioning and tiering here.

Regatta provides a single solution that delivers functionality which traditionally requires multiple separate types of databases, cumbersome ETL processes, and data warehouses, an approach that forces various types of data to be kept apart in “silos”. The traditional approach often results in operational complexity and inefficiency. Executing queries on real-time data across those separate data silos requires extensive, convoluted application code, or may simply not be possible. Regatta lets businesses gain fast insights into both near-real-time and historical data, which is generally impossible with traditional solutions. Managing a variety of solutions means becoming familiar with more products and operating a more complex set of intertwined systems, which in turn requires more staff that must be far more knowledgeable and experienced. All of this translates into higher costs. Organizations’ data management requirements tend to evolve over time, yet a plethora of databases, processes, and integrations hinders agility. The resulting rigidity makes any significant change hard without massive application-code rewrites, acquisition of new skill sets, disruptive migrations, painful bugs, overprovisioning of compute, network, and storage infrastructure, and so forth.

In contrast with the traditional approach, Regatta can store multiple types of data and handle transactional, analytical, and high-ingress workloads simultaneously. That eliminates the operational headaches of ETL as well as the complexity of managing data silos. It also makes it possible to run analytics on up-to-date data, without special effort or added complexity in the application’s code, yielding insights in real time or close to it. Having a single universal database also enables standardization of skills: rather than employing multiple product-specific specialists, each limited to a single database, with some working overtime while others sit under-utilized, members of a Regatta database team can share workloads and stay productive.

Regatta is cloud- and platform-agnostic. It does not rely on third-party clustering systems, which improves deployment flexibility. Regatta supports a large variety of hardware configurations and any mix of server types in the same cluster. This is good news whether you deploy Regatta on-premises, in the cloud, or both, since it simplifies deployment and lowers TCO.

Regatta supports standard SQL, the most popular and universal database language and the most widely used relational database language. As such, most developers can start benefiting from Regatta straight away. Perhaps even more importantly, SQL is a declarative language: you define “what you need” the database to do, not “how exactly the database should achieve it”, as in the non-declarative languages provided by many NoSQL databases. Using SQL often results in dramatically shorter code. Furthermore, the developer does not need to find the optimal way to retrieve data; Regatta’s query optimizer uses its inside knowledge of the data and a variety of approaches to optimize data access and results.
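The declarative point is easy to demonstrate. The sketch below runs a standard SQL aggregate through Python's sqlite3 module (any SQL engine would do; this is not Regatta-specific code). The query states only the desired result, grouped sums per customer, while the scanning, grouping, and sorting strategy is left entirely to the engine's optimizer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("alice", 30.0), ("bob", 20.0), ("alice", 25.0)])

# Declarative: say *what* you want; the optimizer decides *how* to get it.
rows = conn.execute(
    "SELECT customer, SUM(amount) FROM orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()
print(rows)  # [('alice', 55.0), ('bob', 20.0)]
```

The imperative equivalent, looping over rows, maintaining a dictionary of running totals, then sorting, is several times longer and forces the developer to pick an access strategy that the optimizer could have chosen better.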
