Chapter 14: Sorting and Merging



Sorting arranges records in a particular order, usually so that the file can then be processed sequentially.
(We saw how important sorting is when we performed sequential updates using an old master file and a new master file, and again when we performed control break processing.)

Two ways to sort a file:
1)    Using a utility or database management sort program.
2)    Using the COBOL SORT verb.

Simplified Format for COBOL Sort Statement:
The programmer must specify whether the key field is to be an ASCENDING KEY or a DESCENDING KEY.
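The general shape of the statement can be sketched as follows. This is an informal outline assembled from the clauses used in this chapter, not a copy of any compiler manual: square brackets mark optional words, braces mark a choice, and the lowercase names are placeholders.

```text
SORT file-name-1
    [ON] { ASCENDING | DESCENDING } [KEY] data-name-1 ...
    USING file-name-2
    GIVING file-name-3
```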

Collating Sequence:
Two major codes used for representing data in a computer are
1)    EBCDIC (an abbreviation for Extended Binary Coded Decimal Interchange Code), primarily used on mainframes, and

2)    ASCII (an abbreviation for American Standard Code for Information Interchange), widely used on PCs.
The sequencing of characters from lowest to highest, which is referred to as the collating sequence, is somewhat different in EBCDIC and ASCII.
Basic numeric sorting and basic alphabetic sorting are performed the same way in EBCDIC and ASCII.
The two codes give different results, however, when alphanumeric fields containing both letters and digits, or special characters, are sorted:
-  Letters are considered “less than” numbers in EBCDIC and “greater than” numbers in ASCII.
-  Lowercase letters are considered “less than” uppercase letters in EBCDIC and “greater than” uppercase letters in ASCII.
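One way to see which collating sequence a given system uses is to compare a letter with a digit. The short program below is only a sketch: the program name COLLSEQ and the data names are made up for this illustration, and the message that appears depends on the machine the program runs on.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. COLLSEQ.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-LETTER           PIC X VALUE 'A'.
       01  WS-DIGIT            PIC X VALUE '1'.
       PROCEDURE DIVISION.
       100-MAIN-MODULE.
      * ALPHANUMERIC COMPARISON FOLLOWS THE NATIVE COLLATING
      * SEQUENCE OF THE MACHINE THE PROGRAM RUNS ON.
           IF WS-LETTER < WS-DIGIT
               DISPLAY 'LETTERS BEFORE DIGITS (AS IN EBCDIC)'
           ELSE
               DISPLAY 'DIGITS BEFORE LETTERS (AS IN ASCII)'
           END-IF
           STOP RUN.
```

A SORT on an alphanumeric key orders mixed letters and digits the same way this comparison does, which is why a file sorted on a mainframe can come out in a different order than the same file sorted on a PC.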


Sequencing Records with More Than One SORT Key:
The SORT verb may be used to sequence records with more than one key field.
The first KEY field indicated is the major field to be sorted, the next KEY fields represent intermediate sort fields, followed by minor sort fields.
The following is a SORT statement that sorts records into ascending alphabetic NAME sequence within LEVEL-NO within OFFICE-NO:
SORT SORT-FILE
ON ASCENDING KEY OFFICE-NO
ON ASCENDING KEY LEVEL-NO
ON ASCENDING KEY NAME
USING PAYROLL-FILE-IN
GIVING SORTED-PAYROLL-FILE-OUT
Different sequences: Because all key fields are independent, some key fields can be sorted in ASCENDING sequence and others in DESCENDING sequence.
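For instance, the payroll sort could list the highest LEVEL-NO first within each office while keeping the other keys ascending. The fragment below is a sketch that reuses the file and key names from the example above; which key descends is chosen purely for illustration.

```cobol
      * FRAGMENT ONLY: THE FILES AND KEYS ARE ASSUMED TO BE
      * DEFINED AS IN THE PAYROLL EXAMPLE.
           SORT SORT-FILE
               ON ASCENDING KEY OFFICE-NO
               ON DESCENDING KEY LEVEL-NO
               ON ASCENDING KEY NAME
               USING PAYROLL-FILE-IN
               GIVING SORTED-PAYROLL-FILE-OUT.
```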
Combining the ON keyword: Note too that the words ON and KEY are optional. If all key fields are to be sorted in ascending sequence, as in the preceding example, we can condense the coding by using the phrase ON ASCENDING KEY only once. Note that keys can be combined under one phrase only when they all follow the same sequence.
SORT SORT-FILE
ON ASCENDING KEY MAJOR-KEY
                 INTERMEDIATE-KEY
                 MINOR-KEY

WITH DUPLICATES IN ORDER: With current versions of COBOL, you can request that records with the same value in the sort field be placed in the sort file in the same order in which they appeared in the original input file. The WITH DUPLICATES IN ORDER clause accomplishes this.
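The clause is coded after the key phrases and before USING. The fragment below is a sketch that borrows the payroll names from the earlier example: records with the same OFFICE-NO would keep their original relative order.

```cobol
      * FRAGMENT ONLY: THE FILES AND KEY ARE ASSUMED TO BE
      * DEFINED AS IN THE PAYROLL EXAMPLE.
           SORT SORT-FILE
               ON ASCENDING KEY OFFICE-NO
               WITH DUPLICATES IN ORDER
               USING PAYROLL-FILE-IN
               GIVING SORTED-PAYROLL-FILE-OUT.
```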


Coding Simple Sort Procedure using the USING and GIVING clause:
3 files are used in a sort:
1. Input file: File of unsorted input records.
2. Work or sort file: File used to store records temporarily during the sorting process.
3. Output file: File of sorted output records.
All these files would be defined in the ENVIRONMENT DIVISION using standard ASSIGN clauses.

SORT data is usually assigned to a special work device indicated by SYSWORK.
SELECT SORT-FILE ASSIGN TO SYSWORK.
Your system may use SYSWORK (or some other special name) in the ASSIGN clause for the work or sort file. The SORT-FILE is actually assigned to a temporary work area that is used during processing but not saved.
FDs are used in the DATA DIVISION to define and describe the input and output files in the usual way, as in any batch program.
The sort or work file is described with an SD entry (SD is an abbreviation for sort file description).
SD and FD entries are very similar.
Also note that the field(s) specified as the KEY field(s) for sorting purposes must be defined as part of the sort record format.
SORT SORT-FILE
ON ASCENDING KEY S-DEPT-NO      <-- defined within the SD file
USING UNSORTED-MASTER-FILE
GIVING SORTED-MASTER-FILE
STOP RUN
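The entries behind the statement above could look like the following sketch. Only SORT-FILE, SYSWORK and S-DEPT-NO come from the text; the record name SORT-REC, the field sizes and the 80-byte record length are assumptions made for illustration, and the SELECT and FD entries for the input and output files are left out.

```cobol
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * THE WORK FILE GETS A SELECT LIKE ANY OTHER FILE.
           SELECT SORT-FILE ASSIGN TO SYSWORK.
       DATA DIVISION.
       FILE SECTION.
      * SD DESCRIBES THE WORK FILE. THE SORT KEY MUST BE A
      * FIELD OF THIS RECORD.
       SD  SORT-FILE.
       01  SORT-REC.
           05  S-DEPT-NO       PIC 9(3).
           05  FILLER          PIC X(77).
```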

Program:
***************************** Top of Data ******************************
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SORTTES1.
       AUTHOR. SUKUL MAHADIK.
      *AUTHOR IS NOT COMPULSORY. BUT IS A GOOD COMMENT
       ENVIRONMENT DIVISION.
      *ENVIRONMENT DIVISION HAS CONFIG SECTIONS AND I-O SECTION
       CONFIGURATION SECTION.
       SOURCE-COMPUTER. IBM-370.
       OBJECT-COMPUTER. IBM-370.
       INPUT-OUTPUT SECTION.
      *FILE-CONTROL IS A PARAGRAPH
       FILE-CONTROL.
           SELECT INPUT-FILE ASSIGN TO INPUT01
           ORGANIZATION IS SEQUENTIAL
           FILE STATUS IS WS-INPUT-FILE-STATUS.
      *********************************************
      * WHEN WE CODE FILE STATUS FOR THE SORT INPUT AND OUTPUT
      * DATASETS, THE COMPILER SHOWS THE FOLLOWING MESSAGE:
      * FILE "INPUT-FILE" IN THE "USING" PHRASE OF THE "SORT"
      * STATEMENT WAS ACCEPTED AS BEING ELIGIBLE FOR THE
      * "FASTSRT" COMPILER OPTION, BUT HAD A FILE STATUS
      * DATA-NAME.  THE FILE STATUS DATA ITEM WILL NOT BE SET
      * DURING THE SORT.
      * THE REASON IS THAT SORT ITSELF HANDLES THE OPENING AND
      * CLOSING OF THE DATASETS, SO THE STATUS VARIABLES ARE NOT
      * SET.
      * WHEN WE USE INPUT AND OUTPUT PROCEDURES, THE FASTSRT
      * OPTION DOES NOT APPLY TO THE FILES THEY HANDLE.
      ***************************************
           SELECT SORT-FILE ASSIGN TO SORT01.
      * WE DO NOT NEED TO CODE THE ORGANIZATION CLAUSE OR FILE
      * STATUS FOR SORT DATASETS. IF WE DO, THEY ARE IGNORED.
      *    ORGANIZATION IS SEQUENTIAL.
      * FILE STATUS IS NOT ALLOWED BECAUSE THE PROGRAMMER NEVER
      * OPENS OR CLOSES THE SORT DATASET.
           SELECT OUTPUT-FILE ASSIGN TO OUTPUT01
           ORGANIZATION IS SEQUENTIAL
           FILE STATUS IS WS-OUTPUT-FILE-STATUS.
      * NOTE NO - BETWEEN FILE AND STATUS
       DATA DIVISION.
       FILE SECTION.
       FD INPUT-FILE
      * RECORDING,LABEL,BLOCK,RECORD SHOULD NOT BEGIN IN AREA A
           RECORDING MODE IS F
           LABEL RECORDS ARE STANDARD
           BLOCK CONTAINS 0 RECORDS
           RECORD CONTAINS 80 CHARACTERS.
       01 INPUT-FILE-REC.
           05  IM-ACCT-NO PIC 9(8).
           05  IM-NAME PIC X(13).
           05  IM-AMOUNT  PIC 999.99.
       SD SORT-FILE.
      * WE CANNOT USE RECORDING MODE, LABEL, BLOCK ETC. FOR
      * A SORT DATASET. IF WE CODE THEM, THEY WILL BE PROCESSED
      * AS COMMENTS.
      * THE RECORD LENGTH OF THE SORT FILE SHOULD NOT BE LESS THAN
      * THAT OF THE INPUT FILE. OTHERWISE THE COMPILER GIVES A
      * WARNING. NOTE THAT THE INPUT AND OUTPUT RECORD LAYOUTS DO
      * NOT COVER ALL 80 CHARACTERS, BUT THEIR RECORD CONTAINS
      * CLAUSES INDICATE THE LENGTH.
       01 SORT-FILE-REC.
           05  SR-ACCT-NO PIC 9(8).
           05  SR-NAME PIC X(13).
           05  SR-AMOUNT  PIC 999.99.
           05  FILLER PIC X(53).
      * ADDED A FILLER OF 53 BYTES TO BRING THE LENGTH TO 80.
      * THE SORT DATASET SHOULD NOT BE SHORTER THAN THE INPUT FILE.
       FD OUTPUT-FILE
           RECORDING MODE IS F
           LABEL RECORDS ARE STANDARD
           BLOCK CONTAINS 0 RECORDS
           RECORD CONTAINS 80 CHARACTERS.
       01 OUTPUT-FILE-REC.
           05  OP-ACCT-NO PIC 9(8).
           05  OP-NAME PIC X(13).
           05  OP-AMOUNT  PIC 999.99.
       WORKING-STORAGE SECTION.
       01  WS-INPUT-FILE-STATUS PIC XX.
       01  WS-OUTPUT-FILE-STATUS PIC XX.
      * END OF FILE INDICATOR
       01  WS-INPUT-FILE-EOF  PIC X VALUE 'N'.
       01  WS-OUTPUT-FILE-EOF PIC X VALUE 'N'.
      * FILE STATUS IS ALWAYS 2 CHARACTERS
       PROCEDURE DIVISION.
       100-MAIN-MODULE.
           SORT SORT-FILE ON ASCENDING KEY SR-ACCT-NO
      * NOTE THAT WE SORT THE SORT-FILE (AND NOT THE INPUT FILE).
      * THE KEY FIELD MUST COME FROM THE SORT DATASET DESCRIPTION.
      * IT IS POSSIBLE TO SORT DATA ON MULTIPLE FIELDS.
               USING INPUT-FILE
               GIVING OUTPUT-FILE
           STOP RUN.
**************************** Bottom of Data ****************************

JCL:
***************************** Top of Data *********************
//TESTJCL3 JOB (EWDS),'TEST JCL',NOTIFY=&SYSUID               
//DWJ030C0 EXEC PGM=SORTTES1                                  
//STEPLIB DD DSN=CMN.EDWS.STGO.#001621.LOD,DISP=SHR           
//INPUT01 DD DSN=SM017R.SORT.INPUT,DISP=(SHR)                 
//OUTPUT01 DD DSN=SM017R.SORT.OUTPUT,DISP=(NEW,CATLG,DELETE), 
//            DCB=(LRECL=80,RECFM=FB,BLKSIZE=0)               
//SORT01 DD DSN=SM017R.SORT.FILE,DISP=(NEW,DELETE,DELETE)     
//SYSOUT DD SYSOUT=*                                          
**************************** Bottom of Data *******************

Input File:
***************************** Top of Data ******************************
00000014sukul mahadik 234.45                                            
00000001Dushyant jadha124.53                                           
00000500roger kyambhej045.35                                           
**************************** Bottom of Data ****************************

Output file:
***************************** Top of Data *******
00000001Dushyant jadha124.53
00000014sukul mahadik 234.45
00000500roger kyambhej045.35
**************************** Bottom of Data *****



Self Test:
1. Suppose we want EMPLOYEE-FILE records in alphabetic order by NAME within DISTRICT within TERRITORY, all in ascending sequence. The output file is called SORTED-EMPLOYEE-FILE.
Complete the following SORT statement:
SORT WORK-FILE ...
Answer:
ON ASCENDING KEY TERRITORY
ON ASCENDING KEY DISTRICT
ON ASCENDING KEY NAME
USING EMPLOYEE-FILE
GIVING SORTED-EMPLOYEE-FILE.

2. How many files are required in a simple SORT routine? Describe these files.
Answer: 3 files
Input file
Sort work file
Output file

3. The work or sort file is defined as an _______ in the DATA DIVISION.
Answer: SD

4. Suppose we have an FD called NET-FILE-IN, an SD called NET-FILE, and an FD called NET-FILE-OUT.
We want NET-FILE-OUT sorted into ascending DEPT-NO sequence. Code the PROCEDURE DIVISION entry.

Answer:
SORT NET-FILE
ON ASCENDING KEY DEPT-NO
USING NET-FILE-IN
GIVING NET-FILE-OUT

5. In Question 4, DEPT-NO must be a field defined within the (SD/FD) file.
Answer: SD


The following is what actually happens when we sort a dataset using the USING and GIVING clauses.
Consider the following SORT statement:
SORT SORT-FILE
ON ASCENDING KEY TERR
USING IN-FILE
GIVING SORTED-MSTR
This statement performs the following operations:
1. Opens IN-FILE and SORTED-MSTR.
2. Moves IN-FILE records to the SORT-FILE.
3. Sorts SORT-FILE into ascending sequence by TERR, which is a field defined as part of the SD SORT-FILE record.
4. Moves the sorted SORT-FILE to the output file called SORTED-MSTR.
5. Closes IN-FILE and SORTED-MSTR after all records have been processed.
Note that the records from the input file are first moved to the sort dataset.
The SORT statement can, however, also be used with procedures that process records before they are sorted, after they are sorted, or both.





INPUT PROCEDURE:
We can process input records before they are sorted by coding an INPUT PROCEDURE in place of the USING clause.
Expanded format:
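In outline, and only as an informal sketch rather than a copy of a compiler manual (brackets mark optional words, braces mark a choice, lowercase names are placeholders), the expanded statement looks like this:

```text
SORT file-name-1
    [ON] { ASCENDING | DESCENDING } [KEY] data-name-1 ...
    { INPUT PROCEDURE [IS] procedure-name-1 | USING file-name-2 }
    GIVING file-name-3
```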
The INPUT PROCEDURE processes data from the incoming file prior to sorting.
We may use an INPUT PROCEDURE to:
1)    Validate data
2)    Eliminate records
3)    Eliminate unwanted fields
4)    Count records
With the USING and GIVING clauses (a simple SORT), the SORT itself opens and closes the files.
With an INPUT PROCEDURE, however, the procedure is responsible for opening and closing the input file.
Also note that we do not WRITE records to be sorted; instead, we RELEASE them. In an INPUT PROCEDURE we must release each record to the sort file ourselves; with the USING option this is done for us automatically.
The RELEASE verb is followed by a record-name, just like WRITE, and functions like a WRITE that outputs sort records.
In summary, an INPUT PROCEDURE opens the input file, processes input records, and releases them to the sort file so that they can be sorted. After all input records are processed, the procedure must close the input file, because the file is not closed automatically as it is when the USING clause is coded.
The format for the RELEASE is:
RELEASE sort-record-name-1 [FROM identifier-1]
Because RELEASE writes the sort record, the input data must first be moved to the sort record (or supplied with FROM) before it is released.
Example:
MOVE IN-REC TO SORT-REC
RELEASE SORT-REC.
Or:
RELEASE SORT-REC FROM IN-REC.
(This functions like a WRITE ... FROM.)

Program:
***************************** Top of Data ******************************
       IDENTIFICATION DIVISION.                                        
       PROGRAM-ID. SORTTES2.
       AUTHOR. SUKUL MAHADIK.                                          
      *AUTHOR IS NOT COMPULSORY. BUT IS A GOOD COMMENT                 
       ENVIRONMENT DIVISION.                                           
      *ENVIRONMENT DIVISION HAS CONFIG SECTIONS AND I-O SECTION        
       CONFIGURATION SECTION.                                          
       SOURCE-COMPUTER. IBM-370.                                       
       OBJECT-COMPUTER. IBM-370.                                        
       INPUT-OUTPUT SECTION.                                           
      *FILE-CONTROL IS A PARAGRAPH                                     
       FILE-CONTROL.                                                   
           SELECT INPUT-FILE ASSIGN TO INPUT01                         
           ORGANIZATION IS SEQUENTIAL                                  
           FILE STATUS IS WS-INPUT-FILE-STATUS.                        
      *********************************************                     
      * WHEN WE CODE FILE STATUS FOR THE SORT INPUT AND OUTPUT
      * DATASETS, THE COMPILER SHOWS THE FOLLOWING MESSAGE:
      * FILE "INPUT-FILE" IN THE "USING" PHRASE OF THE "SORT"
      * STATEMENT WAS ACCEPTED AS BEING ELIGIBLE FOR THE
      * "FASTSRT" COMPILER OPTION, BUT HAD A FILE STATUS
      * DATA-NAME.  THE FILE STATUS DATA ITEM WILL NOT BE SET
      * DURING THE SORT.
      * THE REASON IS THAT SORT ITSELF HANDLES THE OPENING AND
      * CLOSING OF THE DATASETS, SO THE STATUS VARIABLES ARE NOT
      * SET.
      ***************************************                          
           SELECT SORT-FILE ASSIGN TO SORT01.                          
      * WE DO NOT NEED TO CODE THE ORGANIZATION CLAUSE FOR SORT
      * DATASETS. IF WE DO, IT IS IGNORED.
      *    ORGANIZATION IS SEQUENTIAL.                                 
           SELECT OUTPUT-FILE ASSIGN TO OUTPUT01                       
           ORGANIZATION IS SEQUENTIAL                                  
           FILE STATUS IS WS-OUTPUT-FILE-STATUS.                       
      * NOTE NO - BETWEEN FILE AND STATUS                              
       DATA DIVISION.                                                  
       FILE SECTION.                                                    
       FD INPUT-FILE                                                   
      * RECORDING,LABEL,BLOCK,RECORD SHOULD NOT BEGIN IN AREA A        
           RECORDING MODE IS F                                         
           LABEL RECORDS ARE STANDARD                                  
           BLOCK CONTAINS 0 RECORDS                                    
           RECORD CONTAINS 80 CHARACTERS.                              
       01 INPUT-FILE-REC.                                               
           05  IM-ACCT-NO PIC 9(8).                                    
           05  IM-NAME PIC X(13).                                      
           05  IM-AMOUNT  PIC 999.99.                                  
       SD SORT-FILE.                                                   
      * WE CANNOT USE RECORDING MODE, LABEL, BLOCK ETC. FOR
      * A SORT DATASET. IF WE CODE THEM, THEY WILL BE PROCESSED
      * AS COMMENTS.
      * THE RECORD LENGTH OF THE SORT FILE SHOULD NOT BE LESS THAN
      * THAT OF THE INPUT FILE. OTHERWISE THE COMPILER GIVES A
      * WARNING. NOTE THAT THE INPUT AND OUTPUT RECORD LAYOUTS DO
      * NOT COVER ALL 80 CHARACTERS, BUT THEIR RECORD CONTAINS
      * CLAUSES INDICATE THE LENGTH.
       01 SORT-FILE-REC.                                               
           05  SR-ACCT-NO PIC 9(8).                                    
           05  SR-NAME PIC X(13).                                      
           05  SR-AMOUNT  PIC 999.99.                                  
           05  FILLER PIC X(53).                                        
      * ADDED A FILLER OF 53 BYTES TO BRING THE LENGTH TO 80.
      * THE SORT DATASET SHOULD NOT BE SHORTER THAN THE INPUT FILE.
       FD OUTPUT-FILE                                                  
           RECORDING MODE IS F                                         
           LABEL RECORDS ARE STANDARD                                  
           BLOCK CONTAINS 0 RECORDS                                    
           RECORD CONTAINS 80 CHARACTERS.                              
       01 OUTPUT-FILE-REC.                                             
           05  OP-ACCT-NO PIC 9(8).                                    
           05  OP-NAME PIC X(13).                                      
           05  OP-AMOUNT  PIC 999.99.                                  
       WORKING-STORAGE SECTION.                                        
       01  WS-INPUT-FILE-STATUS PIC XX.                                
       01  WS-OUTPUT-FILE-STATUS PIC XX.                                
      * END OF FILE INDICATOR                                          
       01  WS-INPUT-FILE-EOF  PIC X VALUE 'N'.                         
       01  WS-OUTPUT-FILE-EOF PIC X VALUE 'N'.                         
      * COUNTER                                                        
       01 COUNT-KEEP PIC 999999 VALUE ZEROS.                           
       01 COUNTDROP PIC 999999 VALUE ZEROS.                            
      * FILE STATUS IS ALWAYS 2 CHARACTERS                             
       PROCEDURE DIVISION.                                             
       100-MAIN-MODULE.                                                
           SORT SORT-FILE ON ASCENDING KEY SR-ACCT-NO
      * NOTE THAT WE SORT THE SORT-FILE.
      * THE KEY FIELD MUST COME FROM THE SORT DATASET DESCRIPTION.
      * IT IS POSSIBLE TO SORT DATA ON MULTIPLE FIELDS.
               INPUT PROCEDURE IS 200-PROCESS-INPUT
               GIVING OUTPUT-FILE
           DISPLAY 'RECORDS KEPT:' COUNT-KEEP
           DISPLAY 'RECORDS DROPPED:' COUNTDROP
           STOP RUN.
       200-PROCESS-INPUT.
           OPEN INPUT INPUT-FILE
      * WHEN USING AN INPUT PROCEDURE WE MUST OPEN THE FILE.
      * SORT WILL NOT DO THAT FOR US.
           PERFORM 300-READ-INPUT-FILE
      * READ THE 1ST RECORD BEFORE THE LOOP.
      * THE NEXT RECORD IS READ AT THE END OF THE LOOP.
           PERFORM UNTIL WS-INPUT-FILE-EOF = 'Y'
               IF IM-ACCT-NO NOT = 0
                   COMPUTE COUNT-KEEP = COUNT-KEEP + 1
                   RELEASE SORT-FILE-REC FROM INPUT-FILE-REC
      * NOTE THAT RELEASE NAMES THE RECORD DEFINED IN THE SD.
      * RELEASE IS SIMILAR TO WRITE: IT WRITES RECORDS FROM THE
      * INPUT PROCEDURE TO THE SORT WORK FILE.
               ELSE
                   COMPUTE COUNTDROP = COUNTDROP + 1
               END-IF
               PERFORM 300-READ-INPUT-FILE
           END-PERFORM
           CLOSE INPUT-FILE.
      * WHEN USING PROCEDURES SORT DOES NOT CLOSE THE INPUT FILE
      * ON ITS OWN. WE MUST CLOSE IT IN THE INPUT PROCEDURE.
       300-READ-INPUT-FILE.
           READ INPUT-FILE
               AT END MOVE 'Y' TO WS-INPUT-FILE-EOF.
**************************** Bottom of Data ****************************

JCL:
***************************** Top of Data ******************************
//TESTJCL3 JOB (EWDS),'TEST JCL',NOTIFY=&SYSUID                        
//DWJ030C0 EXEC PGM=SORTTES2                                           
//STEPLIB DD DSN=CMN.EDWS.STGO.#001621.LOD,DISP=SHR                    
//INPUT01 DD DSN=SM017R.SORT.INPUT,DISP=(SHR)                          
//OUTPUT01 DD DSN=SM017R.SORT.OUTPU2,DISP=(NEW,CATLG,DELETE),          
//            DCB=(LRECL=80,RECFM=FB,BLKSIZE=0)                        
//SORT01 DD DSN=SM017R.SORT.FILE,DISP=(NEW,DELETE,DELETE)              
//SYSOUT DD SYSOUT=*                                                   
**************************** Bottom of Data ****************************

Input:
***************************** Top of Data *************
00000000kiran bedi    900.34                          
00000014sukul mahadik 234.45                          
00000001Dushyant jadha124.53                          
00000500roger kyambhej045.35                           
00000000SUNTL mahadik 098.56                          
**************************** Bottom of Data ***********

Output:
***************************** Top of Data **********
00000001Dushyant jadha124.5                        
00000014sukul mahadik 234.4                        
00000500roger kyambhej045.3                        
**************************** Bottom of Data ********
Note that the last digit of each amount is missing from this output. The names in the input occupy 14 characters, but IM-NAME is declared as PIC X(13), so INPUT-FILE-REC describes only the first 27 bytes of the record, and RELEASE ... FROM copies only those 27 bytes to the sort record. Declaring IM-NAME and SR-NAME as PIC X(14), with the FILLER reduced to 52 bytes, would carry the full amount through.

SYSOUT output:
RECORDS KEPT:000003
RECORDS DROPPED:000002
****************************


Input Procedure summary:
1) The INPUT PROCEDURE of the SORT usually refers to a paragraph-name, but it can also refer to a section-name.

2) In the paragraph specified in the INPUT PROCEDURE:
a.       OPEN the input file.
b.       PERFORM a paragraph that will read and process input records until there is no more data.
c.       After all records have been processed, close the input file.
d.       After the last sentence in the INPUT PROCEDURE paragraph is executed, control will then return to the SORT, at which time the records in the sort file will be sorted.
3) In the paragraph that processes input records prior to sorting:
a.       Perform any operations on input that are required.
b.       MOVE input data to the sort record.
c.       RELEASE each sort record, which makes it available for sorting.
d.       Continue to read input until there is no more data.

Note, too, that we never OPEN or CLOSE the sort file-name specified in the SD. It is always opened and closed automatically, as are files specified with USING or GIVING. Only the input file processed in an INPUT PROCEDURE needs to be opened and closed by the program.

SORT layout different from Input Layout:
Not all the fields in the input file may be required in the sorted output file.
In such cases the sort file layout differs from the input file layout.
It would be possible, although inefficient, to (1) first sort the input and produce a sorted master, and (2) then code a separate module to read from the sorted master, moving the data in a rearranged format to a new sorted master.
Instead, we can use an INPUT PROCEDURE that moves only the required fields from the input record to the sort record and then releases it.
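As a sketch of this idea, suppose the input record carries a name that the sorted output does not need. Every name and field size below (IN-FILE, IN-REC, SORT-REC and their fields) is hypothetical and chosen only for illustration, and the fragment shows just the record layouts and the statements that would run for each input record inside the INPUT PROCEDURE.

```cobol
      * HYPOTHETICAL LAYOUTS: THE SORT RECORD OMITS THE NAME.
       FD  IN-FILE.
       01  IN-REC.
           05  IN-ACCT-NO      PIC 9(8).
           05  IN-NAME         PIC X(14).
           05  IN-AMOUNT       PIC 9(3)V99.
       SD  SORT-FILE.
       01  SORT-REC.
           05  S-ACCT-NO       PIC 9(8).
           05  S-AMOUNT        PIC 9(3)V99.

      * INSIDE THE INPUT PROCEDURE, FOR EACH RECORD READ:
           MOVE IN-ACCT-NO TO S-ACCT-NO
           MOVE IN-AMOUNT  TO S-AMOUNT
           RELEASE SORT-REC
```

Moving the fields one by one, instead of using RELEASE ... FROM, is what lets the sort record have a different layout from the input record.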




OUTPUT Procedure:
An OUTPUT PROCEDURE is very similar to an INPUT PROCEDURE, except that an INPUT PROCEDURE processes records before they are sorted and an OUTPUT PROCEDURE processes records in the sort file after they have been sorted.
The full format for the SORT, including both INPUT and OUTPUT PROCEDURE options, is as follows:
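In outline, and only as an informal sketch rather than a copy of a compiler manual (brackets mark optional words, braces mark a choice, lowercase names are placeholders):

```text
SORT file-name-1
    [ON] { ASCENDING | DESCENDING } [KEY] data-name-1 ...
    { INPUT PROCEDURE [IS] procedure-name-1 | USING file-name-2 }
    { OUTPUT PROCEDURE [IS] procedure-name-2 | GIVING file-name-3 ... }
```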
The word GIVING can be followed by more than one file-name, which means that we can create multiple copies of the sorted file.
An OUTPUT PROCEDURE processes all sorted records in the sort file and handles the transfer of these records to the output file.
In an INPUT PROCEDURE we RELEASE records to a sort file rather than writing them.
In an OUTPUT PROCEDURE we RETURN records from the sort file rather than reading them. The syntax of RETURN is almost the same as that of READ.
In short, RETURN reads from the sort file and RELEASE writes to it.
Format:
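In general shape, RETURN names the sort file and takes an AT END clause, much as READ does: RETURN sort-file-name [INTO identifier] AT END imperative-statement. The paragraph below is a minimal sketch of a RETURN loop inside an OUTPUT PROCEDURE. The names 400-WRITE-SORTED, OUT-FILE, OUT-REC, SORT-REC and WS-SORT-EOF are hypothetical and assumed to be defined elsewhere in the program.

```cobol
      * HYPOTHETICAL OUTPUT PROCEDURE PARAGRAPH.
       400-WRITE-SORTED.
           OPEN OUTPUT OUT-FILE
           PERFORM UNTIL WS-SORT-EOF = 'Y'
      * RETURN NAMES THE SD FILE, JUST AS READ NAMES AN FD FILE.
               RETURN SORT-FILE
                   AT END
                       MOVE 'Y' TO WS-SORT-EOF
                   NOT AT END
                       WRITE OUT-REC FROM SORT-REC
               END-RETURN
           END-PERFORM
           CLOSE OUT-FILE.
```

The sort file itself is never opened or closed here; only the output file is, because the SORT manages the SD file on its own.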

Output procedure summary:
1) The OUTPUT PROCEDURE of the SORT usually refers to a paragraph-name, but it can also refer to a section-name.
2) In the paragraph specified in the OUTPUT PROCEDURE:
          a. OPEN the output file.
          b. PERFORM a paragraph that will RETURN (which is like a READ) and process records from the sort file until there is no more data. The records are in sequence in the sort file.
          c. After all records have been processed, CLOSE the output file.
          d. When the OUTPUT PROCEDURE paragraph has been fully executed, control will then return to the SORT.
3) In the paragraph that processes the sort records after they have been sorted but before they are created as output:
          a. Perform any operations on the work or sort records.
          b. MOVE the work or sort record to the output area.
          c. WRITE each sort record to the output file. (A WRITE ... FROM can be used in place of a MOVE and WRITE.)

Recall that the SD file as well as files specified with USING or GIVING are opened and closed automatically. The programmer opens and closes the input file in an INPUT PROCEDURE and the output file in an OUTPUT PROCEDURE.

Program:
***************************** Top of Data ******************************
       IDENTIFICATION DIVISION.                                        
       PROGRAM-ID. SORTTES1.                                           
       AUTHOR. SUKUL MAHADIK.                                          
      *AUTHOR IS NOT COMPULSORY. BUT IS A GOOD COMMENT                  
       ENVIRONMENT DIVISION.                                           
      *ENVIRONMENT DIVISION HAS CONFIG SECTIONS AND I-O SECTION        
       CONFIGURATION SECTION.                                          
       SOURCE-COMPUTER. IBM-370.                                       
       OBJECT-COMPUTER. IBM-370.                                       
       INPUT-OUTPUT SECTION.                                           
      *FILE-CONTROL IS A PARAGRAPH                                      
       FILE-CONTROL.                                                   
           SELECT INPUT-FILE ASSIGN TO INPUT01                         
           ORGANIZATION IS SEQUENTIAL                                  
           FILE STATUS IS WS-INPUT-FILE-STATUS.                        
      *********************************************                    
      * WHEN WE CODE FILE STATUS FOR THE SORT INPUT AND OUTPUT
      * DATASETS, THE COMPILER SHOWS THE FOLLOWING MESSAGE:
      * FILE "INPUT-FILE" IN THE "USING" PHRASE OF THE "SORT"
      * STATEMENT WAS ACCEPTED AS BEING ELIGIBLE FOR THE
      * "FASTSRT" COMPILER OPTION, BUT HAD A FILE STATUS
      * DATA-NAME.  THE FILE STATUS DATA ITEM WILL NOT BE SET
      * DURING THE SORT.
      * THE REASON IS THAT SORT ITSELF HANDLES THE OPENING AND
      * CLOSING OF THE DATASETS, SO THE STATUS VARIABLES ARE NOT
      * SET.
      ***************************************                          
           SELECT SORT-FILE ASSIGN TO SORT01.                          
      * WE DO NOT NEED TO CODE THE ORGANIZATION CLAUSE FOR SORT
      * DATASETS. IF WE DO, IT IS IGNORED.
      *    ORGANIZATION IS SEQUENTIAL.                                 
           SELECT OUTPUT-FILE ASSIGN TO OUTPUT01                       
           ORGANIZATION IS SEQUENTIAL                                  
           FILE STATUS IS WS-OUTPUT-FILE-STATUS.                       
      * NOTE: THERE IS NO HYPHEN BETWEEN FILE AND STATUS
       DATA DIVISION.                                                  
       FILE SECTION.                                                   
       FD INPUT-FILE                                                   
      * RECORDING,LABEL,BLOCK,RECORD SHOULD NOT BEGIN IN AREA A        
           RECORDING MODE IS F                                         
           LABEL RECORDS ARE STANDARD                                  
           BLOCK CONTAINS 0 RECORDS                                     
           RECORD CONTAINS 80 CHARACTERS.                              
       01 INPUT-FILE-REC.                                              
           05  IM-ACCT-NO PIC 9(8).                                    
           05  IM-NAME PIC X(14).                                      
           05  IM-AMOUNT  PIC 999.99.                                  
       SD SORT-FILE.                                                   
      * WE CANNOT USE RECORDING MODE, LABEL, BLOCK ETC. FOR
      * A SORT DATASET. IF WE CODE THESE CLAUSES THEY WILL BE
      * TREATED AS COMMENTS.
      * THE RECORD LENGTH OF THE SORT FILE SHOULD NOT BE LESS THAN
      * THAT OF THE INPUT FILE. OTHERWISE THE COMPILER GIVES A WARNING.
      * NOTE THAT FOR THE INPUT AND OUTPUT FILES THE RECORD LAYOUT
      * DOES NOT COVER ALL 80 CHARACTERS, BUT THE RECORD CONTAINS
      * CLAUSE INDICATES THE LENGTH.
       01 SORT-FILE-REC.                                               
           05  SR-ACCT-NO PIC 9(8).                                    
           05  SR-NAME PIC X(14).                                       
           05  SR-AMOUNT  PIC 999.99.                                  
           05  FILLER PIC X(52).
      * ADDED A FILLER OF 52 BYTES TO BRING THE LENGTH TO 80
      * (8 + 14 + 6 + 52). THE SORT RECORD SHOULD NOT BE SHORTER
      * THAN THE INPUT RECORD.
       FD OUTPUT-FILE                                                  
           RECORDING MODE IS F                                         
           LABEL RECORDS ARE STANDARD                                   
           BLOCK CONTAINS 0 RECORDS                                    
           RECORD CONTAINS 80 CHARACTERS.                              
       01 OUTPUT-FILE-REC.                                             
           05  OP-ACCT-NO PIC 9(8).                                    
           05  OP-NAME PIC X(14).                                      
           05  OP-AMOUNT  PIC 999.99.                                  
       WORKING-STORAGE SECTION.                                         
       01  WS-INPUT-FILE-STATUS PIC XX.                                
       01  WS-OUTPUT-FILE-STATUS PIC XX.                               
      * END OF FILE INDICATOR                                          
       01  WS-INPUT-FILE-EOF  PIC X VALUE 'N'.                         
       01  WS-OUTPUT-FILE-EOF PIC X VALUE 'N'.                         
       01  WS-SORT-FILE-EOF PIC X VALUE 'N'.                           
      * COUNTER                                                         
       01 COUNT-KEEP PIC 999999 VALUE ZEROS.                           
       01 COUNTDROP PIC 999999 VALUE ZEROS.                            
       01 TEMP-NUM-FLD PIC 999V99.                                     
       PROCEDURE DIVISION.                                             
       100-MAIN-MODULE.                                                
              SORT SORT-FILE ON ASCENDING KEY SR-ACCT-NO               
      * NOTE THAT WE SORT THE SORT-FILE.
      * THE KEY FIELD MUST COME FROM THE SORT DATASET DESCRIPTION.
      * IT IS POSSIBLE TO SORT DATA ON MULTIPLE FIELDS.
              INPUT PROCEDURE IS 200-PROCESS-INPUT                     
              OUTPUT PROCEDURE IS 400-WRITE-OUTPUT                     
              DISPLAY 'RECORDS KEPT:' COUNT-KEEP                       
              DISPLAY 'RECORDS DROPPED:' COUNTDROP                     
              STOP RUN.                                                 
       200-PROCESS-INPUT.                                              
           OPEN INPUT INPUT-FILE                                       
      * WHEN USING INPUT PROCEDURE WE SHOULD OPEN THE FILE             
      * SORT WILL NOT DO THAT FOR US.                                  
           PERFORM 300-READ-INPUT-FILE                                 
      * READ THE FIRST RECORD BEFORE THE LOOP.
      * THE NEXT RECORD IS READ AT THE END OF EACH ITERATION.
           PERFORM UNTIL WS-INPUT-FILE-EOF = 'Y'
              IF IM-ACCT-NO NOT = 0
                 COMPUTE COUNT-KEEP = COUNT-KEEP + 1
                 RELEASE SORT-FILE-REC FROM INPUT-FILE-REC
      * NOTE THAT WITH RELEASE WE USE THE RECORD DEFINED IN THE SD.
      * RELEASE IS SIMILAR TO WRITE: IT WRITES RECORDS FROM THE
      * INPUT PROCEDURE TO THE SORT WORK FILE.
              ELSE
                 COMPUTE COUNTDROP = COUNTDROP + 1
              END-IF
              PERFORM 300-READ-INPUT-FILE
           END-PERFORM
           CLOSE INPUT-FILE.                                           
      * WHEN USING PROCEDURES SORT DOES NOT CLOSE THE INPUT FILE       
      * ON ITS OWN. WE SHOULD CLOSE IT IN THE INPUT PROCEDURE.         
       400-WRITE-OUTPUT.                                                
           OPEN OUTPUT OUTPUT-FILE                                     
      * WHEN USING AN OUTPUT PROCEDURE WE MUST OPEN AND CLOSE THE
      * FILES OURSELVES. SORT WILL NOT DO IT FOR US THE WAY IT DOES
      * WITH USING AND GIVING.
           RETURN SORT-FILE                                            
      * JUST AS WITH READ, WE USE RETURN WITH A FILE NAME.
      * READ AND RETURN TAKE A FILE NAME;
      * WRITE AND RELEASE TAKE A RECORD NAME.
           AT END MOVE 'Y' TO WS-SORT-FILE-EOF                         
           END-RETURN                                                  
      * RETURN ALSO HAS AN EXPLICIT SCOPE TERMINATOR, JUST AS READ HAS
           PERFORM UNTIL WS-SORT-FILE-EOF = 'Y'
              DISPLAY 'SR-AMOUNT:' SR-AMOUNT
              DISPLAY 'SR-NAME:' SR-NAME
              MOVE SR-AMOUNT TO OP-AMOUNT
              MOVE SR-NAME TO OP-NAME
              MOVE SR-ACCT-NO TO OP-ACCT-NO
      *       MOVE SR-AMOUNT TO TEMP-NUM-FLD
              COMPUTE TEMP-NUM-FLD = FUNCTION NUMVAL(OP-AMOUNT) + 100
      * WE CAN USE NUMVAL TO GET THE NUMERIC VALUE OF A NUMERIC
      * EDITED FIELD.
              MOVE TEMP-NUM-FLD TO OP-AMOUNT
              WRITE OUTPUT-FILE-REC
              RETURN SORT-FILE
                 AT END MOVE 'Y' TO WS-SORT-FILE-EOF
              END-RETURN
           END-PERFORM
           CLOSE OUTPUT-FILE.                                          
       300-READ-INPUT-FILE.                                            
            READ INPUT-FILE                                             
            AT END MOVE 'Y' TO WS-INPUT-FILE-EOF.                      
**************************** Bottom of Data ****************************

JCL:
***************************** Top of Data ***********************
//TESTJCL3 JOB (EWDS),'TEST JCL',NOTIFY=&SYSUID                 
//DWJ030C0 EXEC PGM=SORTTES1                                    
//STEPLIB DD DSN=CMN.EDWS.STGO.#001621.LOD,DISP=SHR             
//INPUT01 DD DSN=SM017R.SORT.INPUT,DISP=(SHR)                   
//OUTPUT01 DD DSN=SM017R.SORT.OUTPUT,DISP=(NEW,CATLG,DELETE),   
//            DCB=(LRECL=80,RECFM=FB,BLKSIZE=0)                 
//SORT01 DD DSN=SM017R.SORT.FILE,DISP=(NEW,DELETE,DELETE)       
//SYSOUT DD SYSOUT=*                                             
**************************** Bottom of Data *********************

Input file:
***************************** Top of Data ****
00000000kiran bedi    900.34                 
00000014sukul mahadik 234.45                 
00000001Dushyant jadha124.53                  
00000500roger kyambhej045.35                 
00000000SUNTL mahadik 098.56                 
**************************** Bottom of Data **

Output:
***************************** Top of Data ****
00000001Dushyant jadha224.53                 
00000014sukul mahadik 334.45                 
00000500roger kyambhej145.35                 
**************************** Bottom of Data **




The Merge Statement:
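COBOL has a MERGE statement that combines two or more files into a single file.
Its format is similar to that of the SORT:

MERGE file-name-1
    ON {ASCENDING | DESCENDING} KEY data-name-1 ...
    USING file-name-2 file-name-3 ...
    {OUTPUT PROCEDURE IS procedure-name-1 | GIVING file-name-4}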

Rules for ASCENDING/DESCENDING KEY, USING, GIVING, and OUTPUT PROCEDURE are the same as for the SORT.
With the USING clause, we indicate the files to be merged.
At least two file-names must be included for a merge, but more than two are permitted.

Unlike the SORT, however, an INPUT PROCEDURE may not be specified with a MERGE statement.
That is, with the MERGE statement you may process records only after they have been merged, not before; the input must come directly from the files named in USING.
The OUTPUT PROCEDURE has the same format as with the SORT.
With USING and GIVING, the MERGE statement automatically handles the opening, closing, and input/output (READ/WRITE functions) associated with the files.
The files to be merged must each already be in sequence by the key field.
If ASCENDING KEY is specified, the merged output file will have records in increasing order by key field; if DESCENDING KEY is specified, the merged output file will have key fields from high to low.
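
The MERGE rules above can be sketched as a small program. This is only an illustration, not part of the original chapter: the program name MERGEDEM, the file names, the record layout and the sample data are all assumptions. It also assumes a compiler that supports ORGANIZATION IS LINE SEQUENTIAL (for example GnuCOBOL); on a mainframe the files would normally be ORGANIZATION IS SEQUENTIAL with DD names, as in the SORTTES1 program. The program first writes two small input files, each already in ascending key order, and then merges them, reading the merged records back with RETURN in an OUTPUT PROCEDURE.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. MERGEDEM.
      * ILLUSTRATIVE SKETCH ONLY. FILE NAMES, RECORD LAYOUT AND
      * SAMPLE DATA ARE ASSUMPTIONS MADE FOR THIS EXAMPLE.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT IN-FILE-1 ASSIGN TO "MERGEIN1"
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT IN-FILE-2 ASSIGN TO "MERGEIN2"
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT MERGE-FILE ASSIGN TO "MERGEWRK".
       DATA DIVISION.
       FILE SECTION.
       FD  IN-FILE-1.
       01  IN-REC-1.
           05  I1-ACCT-NO      PIC 9(8).
           05  I1-NAME         PIC X(14).
       FD  IN-FILE-2.
       01  IN-REC-2.
           05  I2-ACCT-NO      PIC 9(8).
           05  I2-NAME         PIC X(14).
      * THE MERGE WORK FILE IS DESCRIBED WITH AN SD, LIKE A SORT FILE.
       SD  MERGE-FILE.
       01  MERGE-REC.
           05  MR-ACCT-NO      PIC 9(8).
           05  MR-NAME         PIC X(14).
       WORKING-STORAGE SECTION.
       01  WS-MERGE-EOF        PIC X VALUE 'N'.
       PROCEDURE DIVISION.
       100-MAIN-MODULE.
           PERFORM 200-BUILD-INPUTS
      * THE KEY COMES FROM THE SD RECORD. THERE IS NO INPUT
      * PROCEDURE: THE INPUT MUST COME FROM THE USING FILES.
           MERGE MERGE-FILE ON ASCENDING KEY MR-ACCT-NO
               USING IN-FILE-1 IN-FILE-2
               OUTPUT PROCEDURE IS 300-SHOW-MERGED
           STOP RUN.
       200-BUILD-INPUTS.
      * EACH INPUT FILE IS WRITTEN ALREADY IN ASCENDING KEY ORDER
      * AND IS CLOSED AGAIN BEFORE THE MERGE RUNS.
           OPEN OUTPUT IN-FILE-1
           MOVE '00000001ALPHA' TO IN-REC-1
           WRITE IN-REC-1
           MOVE '00000500CHARLIE' TO IN-REC-1
           WRITE IN-REC-1
           CLOSE IN-FILE-1
           OPEN OUTPUT IN-FILE-2
           MOVE '00000014BRAVO' TO IN-REC-2
           WRITE IN-REC-2
           MOVE '00000900DELTA' TO IN-REC-2
           WRITE IN-REC-2
           CLOSE IN-FILE-2.
       300-SHOW-MERGED.
      * RETURN READS THE MERGED RECORDS, JUST AS IN A SORT
      * OUTPUT PROCEDURE.
           RETURN MERGE-FILE
               AT END MOVE 'Y' TO WS-MERGE-EOF
           END-RETURN
           PERFORM UNTIL WS-MERGE-EOF = 'Y'
               DISPLAY MR-ACCT-NO ' ' MR-NAME
               RETURN MERGE-FILE
                   AT END MOVE 'Y' TO WS-MERGE-EOF
               END-RETURN
           END-PERFORM.
```

The records should be displayed in ascending account-number order, interleaved from the two inputs. To write the merged records straight to a file, the OUTPUT PROCEDURE clause would be replaced by GIVING followed by the name of an output file described with its own SELECT and FD; MERGE would then open and close that file itself.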




Self Test:
Code a simple SORT to read a file called IN-FILE, sort it into ascending name sequence, and create an output file called OUT-FILE.
Answer:

SORT SORT-FILE
ON ASCENDING KEY NAME
USING IN-FILE
GIVING OUT-FILE

It is possible to process records before they are sorted by using the _______ option in place of the _______ option.
Answer: INPUT PROCEDURE , USING

A(n) (unsorted input, sorted output) file is opened in an INPUT PROCEDURE and a(n) (un-sorted input, sorted output) file is opened in an OUTPUT PROCEDURE.
Answer:
Unsorted input
Sorted output


In place of a WRITE statement in an INPUT PROCEDURE, the _______ verb is used to write records onto the sort or work file.
Answer:
RELEASE, for example: RELEASE SORT-REC FROM IN-REC

In place of a READ statement in an OUTPUT PROCEDURE, the _______ verb is used to read records from the sort or work file.
Answer:
RETURN SORT-FILE-NAME
AT END
NOT AT END
END-RETURN

(T or F) The RELEASE statement uses a file-name, as does the RETURN statement.
Answer:
False. RELEASE statement uses the record name and RETURN uses the file name.

Code a simple SORT to read a file called IN-PAYROLL, sort it into ascending NAME sequence, and create an output file called OUT-PAYROLL.
Answer:
SORT SORT-FILE ON ASCENDING KEY NAME
USING IN-PAYROLL
GIVING OUT-PAYROLL

(T or F) A WORK or SORT file is required when sorting.
Answer: True



True-False Question

False____ 1. If the OUTPUT PROCEDURE is used with the SORT verb, then the INPUT PROCEDURE is required.

True____ 2. RELEASE must be used in an INPUT PROCEDURE; RETURN must be used in an OUTPUT PROCEDURE.

False____ 3. The results of sorting will always be the same regardless of whether the computer uses ASCII or EBCDIC.
Exp: The collating sequence is different in ASCII and EBCDIC.


True____ 4. The RELEASE statement is used in place of the WRITE statement in an INPUT PROCEDURE.

False____ 5. A maximum of three SORT fields are permitted in a single SORT statement.

False____ 6. The only method for sorting a disk file is with the use of the SORT statement in COBOL.
Exp: A sort utility or a database management system's sort program can also be used.

False____ 7. Data may be sorted in either ascending or descending sequence and the sort field must be numeric.

True____ 8. The procedure-name specified in the INPUT PROCEDURE clause is a paragraph-name.
Exp: It can also be a section-name.

False____ 9. If a file is described by an SD, it is not defined in a SELECT clause and does not have an FD.
Exp: It is defined in a SELECT clause. It does not have an FD; it has an SD instead.

False____ 10. In the EBCDIC collating sequence, a blank has the lowest value and the SORT verb does not distinguish between upper- and lowercase letters.

False____ 11. The syntax for SORT and MERGE are very different.
Exp: The formats are similar. The main difference is that SORT can have an INPUT PROCEDURE and MERGE cannot; the input to a MERGE must come directly from the files named in USING.

False____ 12. A sort can be performed with a minimum of two files: the input file and the file of sorted output records.
Exp: A sort work file is also required.
