I’ve faced the task to create a daily job which generates a CSV-files and store it for an an indefinite period in a table. As you know CSV-files are not the tiniest ones and “indefinite period” could be result into “ORA-01653: unable to extend table YYY by 128 in tablespace XXX“.
So I’ve looked deeper into Oracle Securefiles , available since release 11gR1. The former LOB technic is called “Basicfile“. They have implemented a lot of nice features to handle unstructured data easier and less resource consumed. My post will point out the ability to save storage by using compression and deduplication features.
At first I create a table with secure files
CREATE TABLE MY_SECUREFILES ( ID NUMBER, CSV_FILE_NAME VARCHAR2(4000 BYTE), CSV_FILE BLOB, MIME_TYPE VARCHAR2(255 BYTE), CREATED_ON DATE, CREATED_BY VARCHAR2(255 BYTE), CONSTRAINT PK_MY_SECUREFILES PRIMARY KEY ( ID ) USING INDEX ) LOB ( CSV_FILE ) STORE AS SECUREFILE ( NOCACHE -- Enable to hold LOB at the buffer cache COMPRESS HIGH -- Enable compression for LOB DEDUPLICATE -- Enable deduplication feature ) ENABLE ROW MOVEMENT;
Secondly I imported mutliple CSV-files with an approximate size of 32 MB per file.
DECLARE l_query varchar2(32767); BEGIN l_query := q'#SELECT * FROM <SOURCE TABLE>#'; INSERT INTO MY_SECUREFILES ( csv_file_name, csv_file, mime_type ) VALUES ( 'Report_' || TO_CHAR(SYSDATE,'DD-MON-YYYY') || '.csv', FK_TRANSFORMTOCSV(l_query), -- Selfmade function to transform SQL-query result into BLOB 'text/csv' ); END; /
Results
I ran several tests by enabling only compression, deduplication or combining both.
Source Size-MB | Target Size-MB | Factor | Comment |
160 | 4,1 | 38,8 | DEDUPLICATION + COMPRESSION_HIGH |
5,1 | 31,2 | COMPRESSION_HIGH | |
145,2 | 1,1 | Only DEDUPLICATION | |
184,1 | 0,9 | <nothing> |
In best case you get storage savings with a factor of 38 ! In absolute figures my 160 MB files are only 4,1 MB big at the database. This is awesome. To be fair I had a little bit luck because other file types are less compression friendly.
Keep in mind that take advantage of the “normal” database features: Files will be backed up, they are protected from OS users.