Monday, March 9, 2015

Divide large SQL file under Linux in an Emergency Run.
-----------------------------------------------------------------------------------------------------

Requirement  : Imagine an emergency situation in which you have to run a large SQL file, but you are on a remote desktop and have no GUI access.

Methodology  : In such a situation, let's divide the SQL file into smaller files and run them separately.

Advantage    : You don't have to wait for one huge run to finish, which would take a long time.
               It saves time if SQL errors occur in the middle of the process.
               You can fix any SQL errors as they occur instead of restarting from scratch.

1. Change directory to file location

Command : cd /home/saminda_06488/Documents/DB-CCG

2. Read the size of the file in bytes

Command : du -b CM_CDRS_DATA_TABLE.sql

5476295390 bytes ( around 5.1 GB )
  
3. Methodology
  
Let's read the file 500 MB at a time ( 500 X 1024 X 1024 bytes = 524288000 bytes ).
We use the Linux command "head" with the number of bytes specified,
and redirect the output to a new SQL file called "1.sql".

Command : head -c 524288000 CM_CDRS_DATA_TABLE.sql > 1.sql

4. Once it finishes, read the last lines of 1.sql with the Linux command "tail"

Command : tail 1.sql

The output will look something like this:

Insert into CCG.CM_CDRS (UCID,CLI,YEAR,MONTH,DURATION,CM_TIMESTAMP,CM_STATUS,CM_START_TIME,CALL_VDN) values ('1050761310051687','776541947','2011','07',83,to_date('07-JUL-11','DD-MON-RR'),'N','2011-07-07 20:44:47','6340');
Insert into CCG.CM_CDRS (UCID,CLI,YEAR,MONTH,DURATION,CM_TIMESTAMP,CM_STATUS,CM_START_TIME,CALL_VDN) values ('1054221310051772','771087148','2011','07',61,to_d



5. Count the characters of the last, incomplete SQL line ( only if such a line exists )

Insert into CCG.CM_CDRS (UCID,CLI,YEAR,MONTH,DURATION,CM_TIMESTAMP,CM_STATUS,CM_START_TIME,CALL_VDN) values ('1054221310051772','771087148','2011','07',61,to_d

The count is 159 characters.
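Counting by hand is error-prone; the incomplete last line can also be measured with "wc". This assumes the piece really ends mid-statement, so its last line has no trailing newline:

```shell
# Byte count of the last (incomplete) line of 1.sql.
# Because that line has no trailing newline, wc -c returns exactly
# the number of bytes to trim from the piece.
tail -n 1 1.sql | wc -c
```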

6. Let's adjust the size of 1.sql by subtracting 159 bytes from 524288000 bytes, which gives 524287841 bytes.

Command : head -c 524287841 CM_CDRS_DATA_TABLE.sql > 1.sql


7. Verify it by reading the last 10 lines.

Command : tail 1.sql

Insert into CCG.CM_CDRS (UCID,CLI,YEAR,MONTH,DURATION,CM_TIMESTAMP,CM_STATUS,CM_START_TIME,CALL_VDN) values ('1056491310051831','773833438','2011','07',3,to_date('07-JUL-11','DD-MON-RR'),'N','2011-07-07 20:47:11','6341');
Insert into CCG.CM_CDRS (UCID,CLI,YEAR,MONTH,DURATION,CM_TIMESTAMP,CM_STATUS,CM_START_TIME,CALL_VDN) values ('1053431310051750','772605427','2011','07',74,to_date('07-JUL-11','DD-MON-RR'),'N','2011-07-07 20:45:50','6341');
Insert into CCG.CM_CDRS (UCID,CLI,YEAR,MONTH,DURATION,CM_TIMESTAMP,CM_STATUS,CM_START_TIME,CALL_VDN) values ('1050761310051687','776541947','2011','07',83,to_date('07-JUL-11','DD-MON-RR'),'N','2011-07-07 20:44:47','6340');



The last line is now a complete INSERT statement, so it is OK.


8. Keep the newly generated 1.sql in a separate place; now we are going to generate the next 500 MB.

We have to generate a new CM_CDRS_DATA_TABLE.sql without the first 500 MB ( minus the 159 trimmed bytes ).
This time we read bytes from the end of the file, using the command "tail".

Bytes already extracted : ( 524288000 - 159 ) bytes = 524287841 bytes

Remaining bytes for the new CM_CDRS_DATA_TABLE.sql :
( 5476295390 - 524287841 ) bytes = 4952007549 bytes


Command : tail -c 4952007549 CM_CDRS_DATA_TABLE.sql > CM_CDRS_DATA_TABLE.sql_TMP_1


9. Now repeat steps 3 to 8 on the remaining file until you have worked through the entire SQL file.
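The loop in steps 3 to 8 can also be sketched as a small shell script. This is an illustrative sketch, not from the original post: the function name split_sql and the file names are assumptions, and it presumes GNU head/tail and that no single INSERT line is longer than one chunk.

```shell
#!/bin/sh
# Sketch: split SRC into N.sql pieces of at most CHUNK bytes,
# trimming each piece back to the last complete line.
split_sql() {
    SRC=$1
    CHUNK=$2                 # bytes per piece
    OFFSET=0
    N=1
    while :; do
        # next CHUNK bytes starting at OFFSET (tail -c +K is 1-based)
        tail -c +"$((OFFSET + 1))" "$SRC" | head -c "$CHUNK" > "$N.sql"
        GOT=$(wc -c < "$N.sql")
        [ "$GOT" -eq 0 ] && { rm -f "$N.sql"; break; }
        # Trim an incomplete trailing line, but only when the piece is
        # full and its last byte is not a newline.
        if [ "$GOT" -eq "$CHUNK" ] && [ -n "$(tail -c 1 "$N.sql")" ]; then
            PART=$(tail -n 1 "$N.sql" | wc -c)
            head -c "$((GOT - PART))" "$N.sql" > "$N.tmp" && mv "$N.tmp" "$N.sql"
            GOT=$((GOT - PART))
        fi
        OFFSET=$((OFFSET + GOT))
        N=$((N + 1))
    done
}

# Example: split_sql CM_CDRS_DATA_TABLE.sql 524288000   # 500 MB pieces
```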


  
10. Once you have split the entire file into 500 MB pieces, you can run them one by one in the correct sequence.
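If GNU coreutils is available, the same result can likely be had in one command: split's -C option writes at most the given number of bytes per output file without ever breaking a line, so every piece ends on a complete INSERT statement. The part_ prefix and suffix options here are illustrative:

```shell
# Split into files of at most 500 MB each, never cutting a line in half.
# Produces part_00.sql, part_01.sql, ... in order.
split -C 500m -d --additional-suffix=.sql CM_CDRS_DATA_TABLE.sql part_
```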