Big data - do you really need it in your production database?
While governance and regulations may require that certain data is kept for legal purposes, it is not necessary to store this data in expensive, "instant access" databases. Historical data can be archived, saving money and time, and helping organisations to use the right big data to make better business decisions.
As content generation continues to explode, data-storage strategies have become increasingly important for businesses. Managing this data effectively should be a top priority, for greater cost-effectiveness and efficiency. When it comes to big data, storing everything up front in the production database is neither cost-effective nor practical and can, in fact, degrade everyday server performance. Organisations need a strategy to reduce storage costs in the face of exponential data growth, optimise performance according to the needs of the business and mitigate the risk of lost data and information.
Performance compromised
The production database should contain only the current data that is needed for the day-to-day business and operations of the organisation. This database should feature high-performance capabilities to deliver the data to users quickly and efficiently. However, if it is being used to store data that is not needed for everyday use and becomes "heavy" or bogged down with data, performance will inevitably be compromised.
Data cleansing and consolidation can help to reduce data volumes in the production database, but this is often not enough to deliver the required performance gains. A strategy needs to be put in place to ensure that data is archived, removed from production and moved to more cost-effective storage. This strategy, however, must be tied to the business and its needs, including its daily operations. If data storage, retention and archiving strategies are not in line with the needs of the business, users will not be able to access the data they need, when they need it and as quickly as they need it. It is vital first to understand the needs of the business and then to put archiving rules in place. This means that archiving is not simply an IT decision, but a business decision too, and database administrators need to understand the business in order to advise on a better strategy.
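As a rough illustration of what such archiving rules might look like once the business has agreed them, the sketch below records a retention period per table and checks whether a record has aged out of production. The table names, the retention periods and the names RetentionRule, RULES and should_archive are hypothetical and chosen only for this example; real values would come from the business owners of the data.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionRule:
    """How long records of one table stay in the production database."""
    table: str
    keep_in_production_days: int  # hypothetical value, agreed with the business


# Hypothetical rules: one entry per table, owned by the business, not only by IT.
RULES = {
    "orders": RetentionRule("orders", 365),
    "audit_log": RetentionRule("audit_log", 90),
}


def should_archive(table: str, record_date: date, today: date) -> bool:
    """Return True when the record is older than its table's retention period."""
    rule = RULES[table]
    cutoff = today - timedelta(days=rule.keep_in_production_days)
    return record_date < cutoff


today = date(2024, 1, 1)
old_order = should_archive("orders", date(2022, 1, 1), today)       # two years old
recent_log = should_archive("audit_log", date(2023, 12, 1), today)  # one month old
```

Keeping the rules as data rather than burying them in scripts makes them easier for business users and database administrators to review together.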
Making searching faster
With data maintenance plans and archiving strategies in place, data can be moved out of the production environment onto archive servers, which will still enable the data to be easily accessed by users, but will not affect the performance of the production server. This will make searching faster and increase performance when accessing or creating data. Historical data will take longer to access, but, as this is not needed as often, the performance gains on daily data outweigh the minor inconvenience.
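A minimal sketch of such a move is shown below, using two in-memory SQLite databases to stand in for the production and archive servers. The orders table, its columns, the cutoff date and the function name archive_old_orders are assumptions made for the example; a real deployment would use the organisation's own database platform and tooling. Rows are written to the archive before they are deleted from production, so nothing is removed until it has been stored.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
    )
    return conn


def archive_old_orders(prod, archive, cutoff_date: str) -> int:
    """Copy orders older than cutoff_date to the archive, then delete them
    from production. Dates are ISO strings, so text comparison is valid."""
    rows = prod.execute(
        "SELECT id, order_date, amount FROM orders WHERE order_date < ?",
        (cutoff_date,),
    ).fetchall()
    # Store the rows in the archive first; 'with' commits on success.
    with archive:
        archive.executemany(
            "INSERT OR IGNORE INTO orders (id, order_date, amount) VALUES (?, ?, ?)",
            rows,
        )
    # Only then remove them from the production database.
    with prod:
        prod.execute("DELETE FROM orders WHERE order_date < ?", (cutoff_date,))
    return len(rows)


prod, archive = make_db(), make_db()
with prod:
    prod.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "2019-03-01", 120.0), (2, "2020-07-15", 80.5), (3, "2024-01-10", 42.0)],
    )

moved = archive_old_orders(prod, archive, "2023-01-01")
prod_count = prod.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
archive_count = archive.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

After the run, production holds only the current order while the two historical orders remain queryable on the archive side.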
Partitioning data in this way also brings down the cost of hardware, software and licensing, saving organisations money. Production databases must deliver high performance, which means higher cost. Database size, server memory, CPU count and licensing are interlinked. Companies that keep historical data on the production server will need to spend more on ensuring high performance and on licensing that is based on CPUs. Archive servers do not need to provide the same levels of performance, so lower-cost, lower-specification servers can be used for this purpose. This approach also makes maintenance of the production environment easier: rebuilding indexes is quicker and back-ups run faster. In general, performance and uptime will be maximised. The archive server can also be used as a quality-control environment to validate data integrity more safely, as doing this on the production server can have a negative impact on business performance.
Ultimately, the rules of data storage strategy are simple. The production database should contain only the data needed for day-to-day operations, and all historical data should be moved onto an archive server. This will allow the production server to be streamlined and deliver the best possible performance, and will optimise the cost of maintaining and running storage. This, in turn, will allow organisations to deal with big data in a more intelligent fashion, comply with regulations concerning data retention, and make more agile decisions based on current data thanks to optimised system performance.