HTAP databases, i.e., Hybrid Transactional/Analytical Processing databases, have become a popular new type of database. Not only is the concept hot, it has also gradually become, alongside OLTP and OLAP, a new selection criterion for more and more database users. At the same time, several problems have appeared. First, seemingly overnight, every database has become an HTAP database. Second, beyond the easily blurred notion of "can run transactional and analytical SQL at the same time", there is no unambiguous definition. Naturally, the application scenarios claimed for HTAP are also varied and unclear, with each vendor, like the Eight Immortals crossing the sea, showing off its own approach. All of this pushes HTAP toward becoming a gimmick.
This article argues that if HTAP is to become a new standard and specification, it must be defined as clearly as possible. The most basic principle is that it must differ technically from the classic database capabilities of the past (and not merely by being distributed), and must bring innovation and upgrading to the customer's digitalization process across the business architecture, application architecture, data architecture, and technical architecture. The precise definition and boundaries can continue to be explored; this article only offers the following points for reference:
(1) In its technical architecture and design goals, HTAP should not be equivalent to classic Oracle and MySQL, or to distributed Oracle-like and MySQL-like systems. If classic Oracle and MySQL also count as HTAP (and, measured by "can run transactional and analytical SQL at the same time", they certainly do), then the definition of HTAP is meaningless;
(2) In an HTAP database, transactional and analytical tasks should execute transparently to the user and essentially without affecting each other, rather than a heavy AP load badly degrading TP, or a heavy TP load badly degrading AP, as happens with classic Oracle and MySQL;
(3) HTAP should not target pure OLAP needs of the data-warehouse kind. That is, its enhancement of the enterprise data architecture should not, at this stage, aim at abandoning the data warehouse system;
(4) A modern HTAP database should be a distributed database.
As stated earlier, HTAP should bring innovation and enhancement to business and architecture, not merely replacement or performance improvement. From this point of view, this article holds that HTAP application scenarios focus mainly on the following two areas:
Analytic-Embedded OLTP
With HTAP capabilities, future transactional business systems should be able to run analysis on the transaction side as a built-in capability, without affecting transaction performance or data consistency. Functions such as risk control and marketing, which data platforms previously delivered in the background through data migration and synchronization, can in considerable part move to the business-system side and run in real time. They become inherent functions of the business system, allowing it to close a certain degree of the business loop on its own. This is bound to be an important direction in the technology-driven development of modern business.
Future business systems should be designed to this standard, which matters greatly for transforming and upgrading the business capabilities of modern transactional systems.
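To make the idea concrete, the following is a minimal sketch of a risk-control check that runs inside the business transaction itself. It is an illustration under stated assumptions, not a reference design: Python's built-in `sqlite3` is used only as a stand-in so the example runs anywhere (it is not an HTAP database), and the `payments` table, the `pay` function, and the `DAILY_LIMIT` threshold are all hypothetical.

```python
import sqlite3

# Assumption: sqlite3 is only a stand-in so the sketch runs anywhere;
# it is NOT an HTAP database. Table, function, and limit are hypothetical.
DAILY_LIMIT = 1000.0  # hypothetical risk-control threshold per account per day


def open_demo_db():
    # isolation_level=None puts the connection in autocommit mode,
    # so transactions are controlled explicitly with BEGIN/COMMIT/ROLLBACK.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE payments (account TEXT, amount REAL, day TEXT)")
    return conn


def pay(conn, account, amount, day):
    """Record a payment only if the account's daily total stays within the limit."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        # Analytical step: aggregate over live transactional data,
        # inside the same transaction as the write that follows.
        (spent,) = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM payments "
            "WHERE account = ? AND day = ?",
            (account, day),
        ).fetchone()
        if spent + amount > DAILY_LIMIT:
            conn.execute("ROLLBACK")
            return False  # rejected by the embedded risk check
        conn.execute("INSERT INTO payments VALUES (?, ?, ?)", (account, amount, day))
        conn.execute("COMMIT")
        return True
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

In this sketch the aggregate and the insert share one transaction, so the check sees consistent, up-to-date data without any synchronization to a separate platform. On a single-node engine such an aggregate competes with transactional traffic; keeping the two from affecting each other is exactly what point (2) above asks of an HTAP database.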
Data SuperStore
Most data warehouse systems were born for "management", which makes it hard for applications to enjoy the dividends of data. In the vast majority of enterprises, after great effort has been spent building a data warehouse, business systems and most staff can use the data in only two ways: by asking technical staff to do the work for them, or by importing the data into the business system. This separation of applications from data is a pain point that has troubled most enterprises for a long time and still does today.
Building a data services platform on top of the existing data platform, oriented toward data consumption, with "use" as its core and "management" as its foundation, is the correct interpretation of the "data center" concept, and it has become one of the key innovation and upgrade projects in many enterprises' planning and implementation. Unlike a data warehouse, which focuses on storing and managing data, it lets users across the enterprise choose and consume data as freely as goods in a supermarket, so that the whole enterprise enjoys the dividends of data. (More precisely, what is consumed is data assets organized around the business; this is not the focus of this article and is not elaborated here.) It is therefore more appropriate to call it a Data SuperStore. Leaving aside the data asset system and other architectural and modeling questions, what kind of database should host the SuperStore?
Because it serves data consumption, the SuperStore must carry a large volume of high-concurrency, service-oriented queries from across the enterprise (QPS-oriented, TP-type), and also a large volume of exploratory statistical analysis (AP-type). Neither a pure OLAP database nor a pure OLTP database can meet this requirement, and the system clearly must be distributed and elastic. For this reason, a considerable number of data services platforms in the past combined several types of databases to meet the different needs. An HTAP database should be the best choice for this scenario.
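The two kinds of demand can be sketched as two service functions over one store. This is only an illustration under stated assumptions: `sqlite3` again stands in for the single store so the example is runnable (a real SuperStore would sit on a distributed HTAP database), and the `customer_orders` table and both function names are hypothetical.

```python
import sqlite3

# Assumption: sqlite3 stands in for the single store so the sketch is runnable;
# it is NOT distributed or HTAP. Table and function names are hypothetical.


def build_store():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer_orders (customer_id INTEGER, region TEXT, amount REAL)"
    )
    conn.execute("CREATE INDEX idx_customer ON customer_orders (customer_id)")
    conn.executemany(
        "INSERT INTO customer_orders VALUES (?, ?, ?)",
        [(1, "north", 120.0), (1, "north", 80.0), (2, "south", 50.0), (3, "south", 200.0)],
    )
    conn.commit()
    return conn


def orders_for_customer(conn, customer_id):
    """TP-type service query: an indexed point lookup, expected at high QPS."""
    return conn.execute(
        "SELECT region, amount FROM customer_orders "
        "WHERE customer_id = ? ORDER BY amount",
        (customer_id,),
    ).fetchall()


def revenue_by_region(conn):
    """AP-type exploratory query: an aggregate over the whole table."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM customer_orders GROUP BY region"
    ).fetchall()
    return dict(rows)
```

Both functions read the same table through the same interface, which is the point of the sketch: consumers do not need to know which engine answers which query. Serving both shapes of query at scale without mutual interference is the part that a combination of separate databases used to provide, and that the article expects an HTAP database to provide in one system.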