Database High Availability vs High Performance
Published on: April 10th 2010
10 years ago, people did not care too much about getting the database fast or available. Databases were small and ECommerce or online payment or any kind of social community did not exists. Performance and availability were somewhat important for several companies but a large portion of them did not care much. They could either tell their employees that the database will be back up and running in a few hours. Performance_wise, obviously we could not wait forever but once again large databases were not that common.
I do remember working for a company that waited every single night an entire night batch to get their accounting consolidated. Not that this was that major of a work. I believe today it would take about 10-20 minutes to do the same work.
Now, we can say that machine 2 machine (telecommunication for example) and people 2 machine (Ecommerce for example) require extreme availability and performance. Additional technology enriches the panel of connection types such as widgets, services farms. Heterogeneous behaviour makes database availibility and performance a science.
Historically the data-tier has alwways been the centralized storage area where all data needed to be persisted, backed up, secured resides. There is therefore a direct connection with reading and writing on disk. For DBAs, that is the main bottleneck for performance and availability: the storage area.
Other problems might happen but there are all related to typical application single point of failures like the operating system failing, the memory failing, the network failing, etc...
You can have serious application requiring a very tiny portion of disk to run. Lately enriched website would need serious storage for images, large-code files, multimedia files but technically, these are all related to storage.
The data-tier could work the same way: bringing all data onto memory and therefore removing the impact of disk and all database editors work on that whether or not they wanted too.
NoSql and grid technology like xkoto or Oracle RAC try to make the persistent data brought to memory and make it like a buffer so that the "operational" database is on the cache layer.
Sql Server cachestore and SAN cache do that as well making reads much faster and ensurring somewhat of better performance.
So there are two areas to take care off:
- the getting the data or writing the data part: either on the cache of disk
- the transactional part to keep consistency of the data
getting data and writing data can be done fairly easily: you have a standalone server and you are done. You need to keep the data available: you get a replication, a mirroring on a different server that would be your partner. You can silmplify all by clustering the all thing although even this solution will give you a spof on the storage layer.
Writing the data, same.
Now that you cover your availability, you need to ensure that your data is secured. Your availability does not secure your data: it just makes sure your data will be available but if this is crap data, this will be still crap data. Recently a client of mine was subject to sql injection. Yeah, you got it: no form validation on the client side; no validation on the server side and a bunch of dynamic ad-hoc queries. Classic! Another client of mine recently did want to update one line out of millions of row but actually committed an update on all rows. Oops!
Well in this case, your availability does not help you. Great! you can still connect your data but it is all crap! You need to make sure you know your tolerance when you have a disastry like that and your backup regime will become handy. For both clients of mine we were able to recover all data correctly thanks to a good supervised backup regime.
Ok, let's way we cover all basis on that: good availability, good disastry recovery strategy, what is next? Well you get your performance to take care off and this is where it gets complicated.
Because sometimes performance does not help availability or disastry recovery plan and vice-versa.
Here are some clear example:*
1. you can denormalize data to make them eithier to query: more storage, more need of disk, more backup place, more network for the robocopy, more transaction log, etc...
2. you increse the need of indexing: same results
3. you increase the need of reindexing: same result
So how in the world will you be able to improve performance without getting your database on their knees.
Solutions 1 and 2 work no problem if you accept the consequences. If you add a couple of ressources in the momery, cpu, SAN area, you are probably fine and this might be some of the components to deal with.
But there are other areas
1. Federation of servers: get availability automatically be spreading the load onto different servers. Something I implemented very easily with telecommunication companies. We talk more about gateways in those companies. Fir high volumetric data, we just get several servers to handle the data onto different servers. This gives the availability because if one gateway does not work, then we go on another one. We secure the data by implementing serious backup regime and a datawarehouse that lets the database tier upfront very tiny: improving availability because less ressources needed and performance because less data to query no matter how your execution plan is.
2. Keep only operational data on your operational server; separate front and back office data: some clients unfortunately do not separate the data that resides at the back office and the front office. You should separate it not only at the database level but ideally at the server level. It is like the best practice for database architecture. You do not want to have CLR or application stuff on your server because your server is not made to handle different type of technologies that eat eachother on ressources like CPU or memory. generally your back office data will be rich and focus on the business to be run (datamining, billing, client information b2b, etc...). Front office will try to be light if possible.
3. Application tier: you can have as many layers as you want. In some way, it is good architecture to have multi-tier environnement to make it scalable. This will enable your to switch servers easily and maintain your layers independantly. However, it does not change the fact that, in an ECommerce, for example, your web navigation is critical. If you manage to make your client uses less dependancy for the most viewed pages you will gain in performance. Let's say there is one data you need to show to the client but you know it is changing all the time and it will consume ressources, you will try to make it available on a less-viewed page like you will make your client click on it, not show it automatically. I am not saying it is what you have to do. Functional aspect of your application might require the availability of the data very quickly. I said, you need to consider seriously the performance aspect of your application navigation from the web-tier to the application-tier to the data-tier.
4. Virtualising the data-tier: as virtualisation becomes very popular and cloud computing, this is definitely another performance and availability improvement. NoSql projects are looking on this sense by trying to get into the data-tier only if needed pretty much. Grids goes to the database layer but try to spread the load on identical data servers. In other words, they read data from the most ressourcefull server and write the data on the most available server and then peer-to-peer replications (mySql for example) or command-type replication (xkoto for example) can take care of it. Virtualisation is a way to do federation of servers in some ways, so I can dig that. However, it is to count on an additional layer of abstraction that might be a black box for you.
5. Data partitioning: you can obviously store data even within the same table in different raid array drives getting the loads more evenly whether your write or you read.
6. Make your database model towards performance. Debate around row-store vs. column-store / relational vs. non-relational database is important especially for performance. Lately referential integrity is less important than performance. Think about it, you delete a member of your website but because you want it to be fast you only delete the core information (the row on the primary table). Well tehcnically your member is gone whether you have parasite data. Then you can have a cleanup process on the background that would clean those useless data anymore. For a non-experienced DBA it is difficult to say that as this is one of the important point: referential integrity. But believe me, you have to adapt your database to the situation! After all, some people would say that database is just a way to store the data in a usefull order.
On all the solution, I prefer the federation of servers and I think you probably can apply to most of the industries. If ECommerce might require getting very large "lookup" data like looking at the memeber data, you can still federate data like orders or payments or emails sent, etc... I usually advise client to segment their databases/servers based on their websites, on the countries for international websites, on the alphabetical order of the logins, etc...
It is always possible to segment your data this way.
Any comments would be apreciated 🙂