Many organizations are now facing with a lot of compliance regulations, from SOCII to PCI, GDPR to KYC compliance and many audits are done to understand how they protect their data from external hacking but also internal employees.

The biggest threat might not be the one we believe and this articles explains why:
https://hbr.org/2016/09/the-biggest-cybersecurity-threats-are-inside-your-company

Nowdays, system engineers and infra architects are making sure to protect thoroughly the access to company network, implementing network segmentation, disk encryption of laptops, indirect access from the production boxes, 2-way authentication… And this is all good. DevOps teams are increasing as the cloud is taking control. However, many security breaches are still at the data layer because of the lack of expertise or the lack of processes in place.

Working as the chief data officer for a large financial company and senior data architect for a large telecommunication company, both having PII personal information, I needed to find an objective and secure system to prevent access to production data or at least access to data within a specific scope of responsibilities.

I have implemented a lot of security supervision system, from success/failed logon traces to policy-based restriction avoiding DDL implementation by unauthorized personnel, fine-grained security audit and implementation but today I like to talk about a process that is somewhat forgotten and yet absolutely vital for security.

Companies care about their data and particularly their transactional data and wants to know how to prevent data loss. They usually confuse data availability and data loss (I invite you to look at this 101 explanation: https://www.cloudberrylab.com/resources/blog/rto-vs-rpo-difference/).

Once you pass this through and you have setup appropriate concept, then you need to care about security of your data at REST, regarding the backup.

I will take the example of SQL server databases but this can be applied to all database engine.

Here are the precautions to take beforehand:

  1. Identify the right persons in charge of database administration and enforce separation of duties:
    In the beginning, when you are a startup, this makes sense your agility must be extreme and a company can grow faster than its security but at some point, many reasons require separation. Your developers should not access your production and just a handful of people should be able to access the databases. Additionally specific NDA and confidentiality protocol should be required for those people as they have an access that is absolutely paramount to the company.
  2. Stick to the process:
    One of the hardest thing is sticking to the process while keeping the velocity of the development cycle. It is absolutely possible. The company just need to be on top of it. Something that will make companies jump here is the following statement: functionality and regression risks are important. New code needs to be tested. However it is less important than a security breach and why? Because a security hole will hunt you forever and will be dragged. Some of my clients are giving up on such issues because they do not know how to re-implement or how to keep such culture. It is absolutely necessary to keep on to it. Keeping Security by design is an asset that many companies lack (even though they say formally they are secured). Auditors are usually checking boxes and do not really check all the layers.
  3. Classify your data and validate new structure via a data governance team:
    This is something that starts being implemented but it is again pretty cumbersome, especially at the beginning and it is yet fundamental. The argument that “kills” the reason that company should not do this is the following: every new structure, small or large should always go via the validation of a data architect. I have seen many clients believing the impact of a “small” structural change cannot affect production and yet it does grandly. For the clients who have to enforce data privacy regulations such as GDPR, keep your DPO involved in the technical changes by giving him access to agnostic mechanisms to check security. The DPO should for example have an interface to see how the user has the right to be forgotten, how security to the production databases change when the database relates to personal data, etc…
  4. SSL encryption, Network segmentation, Disk encryption and backup encryption:
    To enforce Data in transit and data at REST, implement encryption at all layers. You might have to change or upgrade drivers for application. All database engines are now offering encryption at the database layer (you might have to upgrade your database engine as well). For example, on SQL server, transparent database encryption or backup encryption are offered. The second one is less intrusive (no performance hint) and offered on standard edition so a lot of companies might rather consider this option.
  5. Set up a “REBASE” server and we will focus on it here.

 

A rebase server, as I call it is a server that prepares the data for the development environments by obfuscating the data and reducing or not the database size, in order to create development environments or refresh development environments on-demand or on an automated basis. We have been setting up the solution for many clients successfully and this prevents major database hacking.

Here is the implementation on SQL server on premises (or dedicated VM, not serverless).

PREREQUISITES

  • A windows-based server (virtual or not must be installed with the necessary space to restore the databases from production.
  • Install of developer edition of SQL server 2016 instances on the collation corresponding to the production platforms. The developer edition is free of licence. The rebase server is just to get the development, testing and demo instances ready so this does not provide any production-based activity nor the applications are connecting there.
  • The rebase server MUST access the production backups (read permissions)

ENCRYPTION

  • DATA IN TRANSIT: Force encryption from SQL server instances, whether or not with certificates.
    As the rebase server is not meant to be connected with applications, it will be easy to use Sql server data studio or Sql server management client with upgraded drivers that are up-to-date with such encryption. If no certificate is setup, there is no real need to specify the options. If you use the self-signed certificate, use TrustCertitifcate=True or apply the certificate at the client machine. This would ensure encryption on the DATA in transit. You can easily check that connections with the dynamic management view.

SELECT session_id, connect_time, net_transport, encrypt_option, auth_scheme, client_net_address

FROM sys.dm_exec_connections

 

  • DATA AT REST:
    Restore from a database with transparent database encryption implemented or from an encrypted backup. I will take the case of the backup encryption example.
    1. Backup master key and certificate from the source database server where the backup is encrypted. For ease of maintenance, let’s assume we want all our production SQL server database servers with same database master key and same certificate. You can of course use different certificates but I recommend keeping all the same all across for a consistent maintenance.

Use Master
GO

BACKUP MASTER KEY
TO FILE = '<Share>\Keys\DMK.key'
ENCRYPTION BY PASSWORD = 'XXXXXXXXXXXXXXX';
GO
BACKUP CERTIFICATE BackupCert
TO FILE = '<Share>\\Keys\BackupCert.cer'
WITH PRIVATE KEY
(
FILE = 'G:\Keys\BackupCertPrivateKey.key'
, ENCRYPTION BY PASSWORD = 'XXXXXXXXXXXXXXX'
);
GO

 

2. Restore the key and certificate onto the rebase server:

Use Master
GO

RESTORE MASTER KEY

FROM FILE = 'G:\Keys\DMK.key'

-- the password that was used to encrypt the master key in the source SQL Server instance
DECRYPTION BY PASSWORD = 'XXXXXXXXXXXXXXX'
-- the password with which we want to encrpyt it on the target SQL Server instance
-- the service key is different on the source and the target SQL Server instances
ENCRYPTION BY PASSWORD = 'XXXXXXXXXXXXXXX'
;

-- Since the master key is not registered in the master database
-- We need to open it in order to decrypt it
-- It stays opened for the session duration

OPEN MASTER KEY
DECRYPTION BY PASSWORD = 'XXXXXXXXXXXXXXX'

-- Restoring the certificate by the private key
-- the password is the one we used to encrypt it on the source SQL Server instance

CREATE CERTIFICATE BackupCert
FROM FILE = 'G:\Keys\BackupCert.cer'
WITH PRIVATE KEY
(
FILE = 'G:\Keys\BackupCertPrivateKey.key'
, DECRYPTION BY PASSWORD = 'XXXXXXXXXXXXXXX' )

 

3. Implement encrypted stored procedures on the Master database to automate implementation

In order to restore encrypted backup, we recommend to create objects that allows the restore without knowing the password and give only permissions to DBA to do the restores. Only the handful of people allowed to be sysadmin will be able to restore the database. Technically, the architect can implement the encryption without telling the DBA the password, increasing even more the separation of duties.

USE master

GO

CREATE PROCEDURE sp_OpenMK
WITH ENCRYPTION
AS

OPEN MASTER KEY
DECRYPTION BY PASSWORD = 'XXXXXXXXXXXXXXX'
GO

 

USE master

GO

CREATE PROCEDURE sp_CloseMK

WITH ENCRYPTION

AS

CLOSE MASTER KEY

GO

 

4. Testing the restore from an encrypted backup

Backup script from source database:

-- Backing up a database and ciphering the resulting file only requires us to choose an encryption algorithm and to specify the certificate we want to use:
BACKUP DATABASE testencrypt.bak
TO DISK = 'G:\Keys\testencrypt'
WITH COPY_ONLY, COMPRESSION, STATS = 1
, ENCRYPTION (ALGORITHM = AES_256, SERVER CERTIFICATE = BackupCert)

And the restore without opening the key confirms the protection:

RESTORE DATABASE testencrypt.bak
FROM DISK = 'G:\Keys\testencrypt'

Msg 33111, Level 16, State 3, Line 1

Cannot find server certificate with thumbprint '0x683E6D9192B48856C67B5CB4F4FD9ECCB980BB46'.

Msg 3013, Level 16, State 1, Line 1

RESTORE DATABASE is terminating abnormally.

 

With the key open, the restore does perform with no issue:

-– Finally the database by first opening the key if it is not open

EXEC Master.dbo.sp_OpenMK
GO

-- The instruction is identical for a non-encrypted database

RESTORE DATABASE testencrypt.bak
FROM DISK = 'G:\Keys\testencrypt'
WITH REPLACE

-- Closing the master key

GO
EXEC Master.dbo.sp_CloseMK

 

 

Result is showing restore completion:

Processed 328 pages for database 'testencrypt', file 'testencrypt' on file 1.
Processed 1 pages for database 'testencrypt', file 'testencrypt_log' on file 1.

RESTORE DATABASE successfully processed 329 pages in 0.061 seconds (42.032 MB/sec).

 

DATABASE RESTORE AUTOMATION

You then need to create a database to manage the restore automation and it is even more important the more databases you need to manage.

On one of our clients, hundreds of databases need to be restored and we need to make sure we handle automation through a data model handling:

  • The server and database settings: the server name, database name
  • The rebase schedule: backup type, recovery type, certificate to use (if multiple server (in our case we simplified to use only one certificate), the sequence of the backup regime to restore
  • The server and database settings at the rebase server : where the data and log files are located, the new database name (if needed to be changed to avoid collision of database name if multiple source servers with same-name databases must be restore on the same rebase server)
  • A database that logs each restoration task: the schedule start end end date so that we can add stats and monitoring on the restoration process at the whole group of rebase or at the database and even at the backup type level.

Contact us if you need the database model implemented.

Then we need to implement a stored procedure to manger the schedule and a stored procedure that handles each specific restoration task.

The Rebase schedule stored procedure would be able to read the schedule table and go through the restoration with parameters such as @SourceServer, @SourceDatabase, @BackupType…

The Restore stored procedure will need to dynamically script the full or diff backup (log can be done but usually this suffices for rebase implementation). The restore stored procedure needs to be able to handle to scan through a folder to use the most recent full and diff backups and needs to be able to look at the log sequence number of the backup to compare the restoring state of the database you restore.

Contact us if you need the stored procedure code.

 

DATABASE OBFUSCATION

Once the databases are restored. The database needs to be anonymized and obfuscated to then backup unencrypted backups for the development servers.

Setting up scripts and implementing them into a table with a sequencing will enable logging of complex obfuscation processes. If your obfuscation process is simple, a scheduled job might suffice with job steps.

 

DATABASE REDUCTION

Reduction of database size is logical and physical. Usually large tables can easily be truncated or we can keep only a few thousand rows of such large tables. Additional functional setup might be also done to prepare development environment including the change of urls, the removal of accounts, etc… All of those actions can reduce logically the size of the database.

Once this is perform, reduction of files can be performed either easily (shrink of files although this will fragment) or if it is too complicated by moving the indexes (including clustered ones) to different filegroups. This process is very complex and must be tune according to the client environments.

DATABASE BACKUPS

You can then backup your database with obfuscated and reduced databases unencrypted and put the backups onto an accessible share for the dev environments whether the developers can restore the backups themselves or the DBA upon requests of the developers.