Chapter 14: Sorting and Merging
Sorting is arranging records in a particular order, generally so that they can be processed sequentially.
(We saw how important sorting is when we performed sequential updates using the old and new master files, and again when we performed control break processing.)
Two ways to sort a file:
1) Using a utility or database management sort program.
2) Using the COBOL SORT verb.
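Simplified Format for COBOL Sort Statement:
SORT file-name-1
    { ON { ASCENDING | DESCENDING } KEY data-name-1 ... } ...
    USING file-name-2
    GIVING file-name-3
(Braces enclose a required choice between the alternatives separated by |, and ... means the preceding item may be repeated.)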
The programmer must specify whether the key field is to be an ASCENDING KEY or a DESCENDING KEY.
Collating Sequence:
Two major codes used for representing data in a computer are:
1) EBCDIC (an abbreviation for Extended Binary Coded Decimal Interchange Code), primarily used on mainframes, and
2) ASCII (an abbreviation for American Standard Code for Information Interchange), widely used on PCs.
The sequencing of characters from lowest to highest, which is referred to as the collating sequence, is somewhat different in EBCDIC and ASCII.
Basic numeric sorting and basic alphabetic sorting are performed the same way in EBCDIC and ASCII.
The results are, however, not the same when alphanumeric fields containing both letters and digits or special characters are sorted:
- Letters are considered "less than" numbers in EBCDIC, and letters are considered "greater than" numbers in ASCII.
- Lowercase letters are considered "less than" uppercase letters in EBCDIC and "greater than" uppercase letters in ASCII.
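The difference can be checked outside COBOL. The following is a minimal Python sketch, not part of the original COBOL material, that sorts the same keys by the byte values they have under each code. It assumes code page 037 (Python's `cp037` codec) as a stand-in for EBCDIC; the sample keys are made up for illustration.

```python
# Model of the two collating sequences using Python's built-in codecs.
# Assumption: code page 037 ("cp037") stands in for EBCDIC; a real system
# may use a different EBCDIC code page.

def sort_by_encoding(values, encoding):
    """Sort strings by the byte values they have in the given encoding."""
    return sorted(values, key=lambda s: s.encode(encoding))

# Hypothetical alphanumeric keys mixing digits, uppercase and lowercase.
keys = ["a1", "A1", "1A", "B2"]

ascii_order = sort_by_encoding(keys, "ascii")
ebcdic_order = sort_by_encoding(keys, "cp037")

print("ASCII :", ascii_order)
print("EBCDIC:", ebcdic_order)
```

Under these assumptions the ASCII order is 1A, A1, B2, a1 (digits, then uppercase, then lowercase), while the EBCDIC order is a1, A1, B2, 1A (lowercase, then uppercase, then digits).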
Sequencing Records with More Than One SORT Key:
The SORT verb may be used to sequence records with more than one key field.
The first KEY field indicated is the major field to be sorted, the next KEY fields represent intermediate sort fields, followed by minor sort fields.
The following is a SORT statement that sorts records into ascending alphabetic NAME sequence within LEVEL-NO within OFFICE-NO:
SORT SORT-FILE
ON ASCENDING KEY OFFICE-NO
ON ASCENDING KEY LEVEL-NO
ON ASCENDING KEY NAME
USING PAYROLL-FILE-IN
GIVING SORTED-PAYROLL-FILE-OUT
Different sequences: Because all key fields are independent, some
key fields can be sorted in ASCENDING sequence and others in DESCENDING sequence.
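As an illustration of mixed sequences, here is a small Python sketch (a model of the idea, not COBOL) that orders records by ascending office, descending level, and ascending name. The field names and sample records are hypothetical. It relies on Python's sort being stable, so the keys are applied from minor to major.

```python
from operator import itemgetter

# Hypothetical payroll records; the field names are illustrative only.
records = [
    {"office_no": 2, "level_no": 1, "name": "SMITH"},
    {"office_no": 1, "level_no": 3, "name": "JONES"},
    {"office_no": 1, "level_no": 3, "name": "ADAMS"},
    {"office_no": 1, "level_no": 5, "name": "BROWN"},
]

# Because the sort is stable, apply the keys from minor to major.
result = sorted(records, key=itemgetter("name"))                    # minor key, ascending
result = sorted(result, key=itemgetter("level_no"), reverse=True)   # intermediate key, descending
result = sorted(result, key=itemgetter("office_no"))                # major key, ascending

names = [r["name"] for r in result]
print(names)
```

Within office 1 the level-5 record comes first because LEVEL-NO is descending, and the two level-3 records stay in ascending name order.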
Combining the ON keyword: Note too that the words ON and KEY are optional in the SORT statement. If all key fields are to be sorted in ascending sequence, as in the preceding example, we can condense the coding by using the phrase ON ASCENDING KEY only once. Note that keys can be combined this way only when all the sort keys follow the same sequence.
SORT SORT-FILE
ON ASCENDING KEY
MAJOR-KEY
INTERMEDIATE-KEY
MINOR-KEY
WITH DUPLICATES IN ORDER: With recent versions of COBOL, you can ask for records that have the same value in the sort field to be placed in the sort file in the same order in which they appeared in the original input file.
We add the WITH DUPLICATES IN ORDER clause to accomplish this.
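The behaviour requested by WITH DUPLICATES IN ORDER is what is usually called a stable sort. A minimal Python sketch of the idea (a model only, with made-up records) follows; Python's built-in `sorted` is stable, so records with equal keys keep their input order.

```python
# Hypothetical (key, payload) records; two keys occur twice.
input_records = [
    ("0200", "first"),
    ("0100", "second"),
    ("0200", "third"),
    ("0100", "fourth"),
]

# sorted() is stable: records with equal keys keep their input order,
# which models the effect of WITH DUPLICATES IN ORDER.
sorted_records = sorted(input_records, key=lambda rec: rec[0])

for rec in sorted_records:
    print(rec)
```

Here "second" still precedes "fourth" and "first" still precedes "third" after sorting, because each pair shares a key.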
Coding a Simple Sort Procedure Using the USING and GIVING Clauses:
Three files are used in a sort:
1. Input file: File of unsorted input records.
2. Work or sort file: File used to store records temporarily during the sorting process.
3. Output file: File of sorted output records.
All these files are defined in the ENVIRONMENT DIVISION using standard ASSIGN clauses.
SORT data is usually assigned to a special work device indicated by SYSWORK.
SELECT SORT-FILE
ASSIGN TO SYSWORK.
Your system may use SYSWORK (or some other special name) in the ASSIGN clause for the work or sort file. The SORT-FILE is actually assigned to a temporary work area that is used during processing but not saved.
FDs are used in the DATA DIVISION to define and describe the input and output files in a batch program in the usual way.
The sort or work file is described with an SD entry (which is an abbreviation for sort file description).
SD and FD entries are very similar.
Also note that the field(s) specified as the KEY field(s) for sorting purposes must be defined as part of the sort record format.
SORT SORT-FILE
ON ASCENDING KEY S-DEPT-NO     <-- Defined within the SD file
USING UNSORTED-MASTER-FILE
GIVING SORTED-MASTER-FILE
STOP RUN
Program:
***************************** Top of Data ******************************
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SORTTES1.
       AUTHOR. SUKUL MAHADIK.
      *AUTHOR IS NOT COMPULSORY. BUT IS A GOOD COMMENT
       ENVIRONMENT DIVISION.
      *ENVIRONMENT DIVISION HAS CONFIG SECTIONS AND I-O SECTION
       CONFIGURATION SECTION.
       SOURCE-COMPUTER. IBM-370.
       OBJECT-COMPUTER. IBM-370.
       INPUT-OUTPUT SECTION.
      *FILE-CONTROL IS A PARAGRAPH
       FILE-CONTROL.
           SELECT INPUT-FILE ASSIGN TO INPUT01
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS WS-INPUT-FILE-STATUS.
      *********************************************
      * WHEN WE CODE FILE STATUS FOR THE SORT INPUT AND OUTPUT
      * DATASETS FOLLOWING IS THE MESSAGE THAT IT SHOWS:
      * FILE "INPUT-FILE" IN THE "USING" PHRASE OF THE "SORT"
      * STATEMENT WAS ACCEPTED AS BEING ELIGIBLE FOR THE
      * "FASTSRT" COMPILER OPTION, BUT HAD A FILE STATUS
      * DATA-NAME. THE FILE STATUS DATA ITEM WILL NOT BE SET
      * DURING THE SORT.
      * THE REASON BEING SORT ITSELF HANDLES THE OPENING AND CLOSING
      * OF THE DATASETS AND HENCE STATUS VARIABLES WILL NOT BE
      * CONSIDERED.
      * WHEN WE USE INPUT AND OUTPUT PROCEDURES
      * FASTSRT OPTION GETS DISABLED.
      ***************************************
           SELECT SORT-FILE ASSIGN TO SORT01.
      * WE DONT NEED TO CODE ORGANIZATION CLAUSE OR FILE STATUS
      * FOR SORT DATASETS. IF WE DO, IT WOULD BE IGNORED.
      *    ORGANIZATION IS SEQUENTIAL.
      * REASON FILE STATUS IS NOT ALLOWED IS THAT SORT DATASET
      * OPENING AND CLOSING IS NEVER TAKEN CARE OF BY PROGRAMMER.
           SELECT OUTPUT-FILE ASSIGN TO OUTPUT01
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS WS-OUTPUT-FILE-STATUS.
      * NOTE NO - BETWEEN FILE AND STATUS
       DATA DIVISION.
       FILE SECTION.
       FD  INPUT-FILE
      * RECORDING,LABEL,BLOCK,RECORD SHOULD NOT BEGIN IN AREA A
           RECORDING MODE IS F
           LABEL RECORDS ARE STANDARD
           BLOCK CONTAINS 0 RECORDS
           RECORD CONTAINS 80 CHARACTERS.
       01  INPUT-FILE-REC.
           05 IM-ACCT-NO           PIC 9(8).
           05 IM-NAME              PIC X(13).
           05 IM-AMOUNT            PIC 999.99.
       SD  SORT-FILE.
      * WE CANNOT USE THE RECORDING MODE, LABEL, BLOCK ETC FOR
      * A SORT DATASET. IF WE CODE THESE THINGS THEY WILL BE
      * PROCESSED AS COMMENTS.
      * THE RECORD LENGTH OF THE SORT FILE SHOULD NOT BE LESS THAN
      * INPUT FILE. OTHERWISE COMPILER GIVES A WARNING.
      * NOTE THAT FOR THESE FILES THE RECORD LAYOUT IS NOT COMPLETE
      * 80 CHARACTERS. BUT THE RECORD CONTAINS CLAUSE INDICATES
      * THE LENGTH.
       01  SORT-FILE-REC.
           05 SR-ACCT-NO           PIC 9(8).
           05 SR-NAME              PIC X(13).
           05 SR-AMOUNT            PIC 999.99.
           05 FILLER               PIC X(53).
      * ADDED A FILLER FOR 53 BYTES TO MAKE THE LENGTH 80.
      * SORT DATASET SHOULD NOT HAVE LENGTH LESS THAN INPUT FILE
       FD  OUTPUT-FILE
           RECORDING MODE IS F
           LABEL RECORDS ARE STANDARD
           BLOCK CONTAINS 0 RECORDS
           RECORD CONTAINS 80 CHARACTERS.
       01  OUTPUT-FILE-REC.
           05 OP-ACCT-NO           PIC 9(8).
           05 OP-NAME              PIC X(13).
           05 OP-AMOUNT            PIC 999.99.
       WORKING-STORAGE SECTION.
       01  WS-INPUT-FILE-STATUS    PIC XX.
       01  WS-OUTPUT-FILE-STATUS   PIC XX.
      * END OF FILE INDICATOR
       01  WS-INPUT-FILE-EOF       PIC X VALUE 'N'.
       01  WS-OUTPUT-FILE-EOF      PIC X VALUE 'N'.
      * FILE STATUS IS ALWAYS 2 CHARACTERS
       PROCEDURE DIVISION.
       100-MAIN-MODULE.
           SORT SORT-FILE ON ASCENDING KEY SR-ACCT-NO
      *NOTE THAT WE SORT THE SORT-FILE.(AND NOT INPUT FILE)
      *THE KEY FIELD SHOULD BE FROM THE SORT DATASET DESCRIPTION.
      * ITS POSSIBLE TO SORT DATA ON MULTIPLE FIELDS.
               USING INPUT-FILE
               GIVING OUTPUT-FILE
           STOP RUN.
**************************** Bottom of Data ****************************
JCL:
***************************** Top of Data ******************************
//TESTJCL3 JOB (EWDS),'TEST JCL',NOTIFY=&SYSUID
//DWJ030C0 EXEC PGM=SORTTES1
//STEPLIB  DD DSN=CMN.EDWS.STGO.#001621.LOD,DISP=SHR
//INPUT01  DD DSN=SM017R.SORT.INPUT,DISP=(SHR)
//OUTPUT01 DD DSN=SM017R.SORT.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            DCB=(LRECL=80,RECFM=FB,BLKSIZE=0)
//SORT01   DD DSN=SM017R.SORT.FILE,DISP=(NEW,DELETE,DELETE)
//SYSOUT   DD SYSOUT=*
**************************** Bottom of Data ****************************
Input file:
***************************** Top of Data ******************************
00000014sukul mahadik 234.45
00000001Dushyant jadha124.53
00000500roger kyambhej045.35
**************************** Bottom of Data ****************************
Output file:
***************************** Top of Data ******************************
00000001Dushyant jadha124.53
00000014sukul mahadik 234.45
00000500roger kyambhej045.35
**************************** Bottom of Data ****************************
Self Test:
1. Suppose we want EMPLOYEE-FILE records in alphabetic order by NAME within DISTRICT within TERRITORY, all in ascending sequence. The output file is called SORTED-EMPLOYEE-FILE.
Complete the following SORT statement:
SORT WORK-FILE ...
Answer:
ON ASCENDING KEY TERRITORY
ON ASCENDING KEY DISTRICT
ON ASCENDING KEY NAME
USING EMPLOYEE-FILE
GIVING SORTED-EMPLOYEE-FILE.
2. How many files are required in a simple SORT routine? Describe these files.
Answer: 3 files
Input file
Sort work file
Output file
3. The work or sort file is defined as an _______ in the DATA DIVISION.
Answer: SD
4. Suppose we have an FD called NET-FILE-IN, an SD called NET-FILE, and an FD called NET-FILE-OUT.
We want NET-FILE-OUT sorted into ascending DEPT-NO sequence. Code the PROCEDURE DIVISION entry.
Answer:
SORT NET-FILE
ON ASCENDING KEY DEPT-NO
USING NET-FILE-IN
GIVING NET-FILE-OUT
5. In Question 4, DEPT-NO must be a field defined within the (SD/FD) file.
Answer: SD
The following is what actually happens when we SORT a dataset using the USING and GIVING clauses.
Consider the following SORT statement:
SORT SORT-FILE
ON ASCENDING KEY TERR
USING IN-FILE
GIVING SORTED-MSTR
This statement performs the following operations:
1. Opens IN-FILE and SORTED-MSTR.
2. Moves IN-FILE records to the SORT-FILE.
3. Sorts SORT-FILE into ascending sequence by TERR, which is a field defined as part of the SD SORT-FILE record.
4. Moves the sorted SORT-FILE to the output file called SORTED-MSTR.
5. Closes IN-FILE and SORTED-MSTR after all records have been processed.
Note that the records from the input file are first moved to the sort dataset.
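The five operations can be modelled in a few lines of Python. This is only a sketch of the sequence of steps, not how a COBOL SORT is implemented: the function name, the temporary file names, the sample records, and the assumption that the key occupies the first two characters are all hypothetical.

```python
import os
import tempfile

def sort_using_giving(in_path, out_path, key):
    """Model of SORT ... USING in-file GIVING out-file."""
    # Steps 1 and 5: the "with" block opens both files and closes them at the end.
    with open(in_path) as in_file, open(out_path, "w") as out_file:
        sort_file = list(in_file)        # Step 2: move input records to the work area
        sort_file.sort(key=key)          # Step 3: sort the work area on the key
        out_file.writelines(sort_file)   # Step 4: move sorted records to the output

# Hypothetical input file with a 2-character territory key at the front.
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "in_file.txt")
out_path = os.path.join(workdir, "sorted_mstr.txt")
with open(in_path, "w") as f:
    f.write("30 WEST\n10 EAST\n20 NORTH\n")

sort_using_giving(in_path, out_path, key=lambda rec: rec[:2])

with open(out_path) as f:
    sorted_text = f.read()
print(sorted_text, end="")
```

The caller never opens or closes the two data files itself, which mirrors the way USING and GIVING leave file handling to the SORT.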
The SORT statement can, however, be used in conjunction with procedures that process records before they are sorted and/or after they are sorted.
INPUT PROCEDURE:
We can perform certain processing of input records before they are sorted by using an INPUT PROCEDURE in place of the USING clause.
Expanded format:
SORT file-name-1
    { ON { ASCENDING | DESCENDING } KEY data-name-1 ... } ...
    { INPUT PROCEDURE IS procedure-name-1 [ { THRU | THROUGH } procedure-name-2 ]
    | USING file-name-2 ... }
    GIVING file-name-3 ...
The INPUT PROCEDURE processes data from the incoming file prior to sorting.
We may use an INPUT PROCEDURE to:
1) Validate data.
2) Eliminate records.
3) Eliminate unwanted fields.
4) Count the number of records.
Earlier, when using the USING and GIVING clauses (simple SORT), the opening and closing of the files was taken care of by the SORT itself.
But when we use an INPUT PROCEDURE, the responsibility for opening and closing the input file lies with the input procedure.
Also note that we do not WRITE records to be sorted; instead, we RELEASE them for sorting purposes. We must release records to the sort file in an INPUT PROCEDURE. With the USING option, this is done for us automatically.
Note that the RELEASE verb is followed by a record-name, just like the WRITE statement. That is, the RELEASE verb functions just like a WRITE but is used to output sort records. In summary, an INPUT PROCEDURE opens the input file, processes input records, and releases them to the sort file so they can then be sorted. After all input records are processed, the input file must be closed, because when an INPUT PROCEDURE is used the input file is not automatically closed as it is when the USING clause is coded. The format for the RELEASE is:
RELEASE sort-record-name-1 [ FROM identifier-1 ]
RELEASE is the verb used to write records to a sort file. Hence we need to move the input record to the sort record first and then release it.
Example:
MOVE IN-REC TO SORT-REC
RELEASE SORT-REC.
Or:
RELEASE SORT-REC FROM IN-REC.
(This functions like a WRITE ... FROM.)
Program:
***************************** Top of Data ******************************
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SORTTES1.
       AUTHOR. SUKUL MAHADIK.
      *AUTHOR IS NOT COMPULSORY. BUT IS A GOOD COMMENT
       ENVIRONMENT DIVISION.
      *ENVIRONMENT DIVISION HAS CONFIG SECTIONS AND I-O SECTION
       CONFIGURATION SECTION.
       SOURCE-COMPUTER. IBM-370.
       OBJECT-COMPUTER. IBM-370.
       INPUT-OUTPUT SECTION.
      *FILE-CONTROL IS A PARAGRAPH
       FILE-CONTROL.
           SELECT INPUT-FILE ASSIGN TO INPUT01
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS WS-INPUT-FILE-STATUS.
      *********************************************
      * WHEN WE CODE FILE STATUS FOR THE SORT INPUT AND OUTPUT
      * DATASETS FOLLOWING IS THE MESSAGE THAT IT SHOWS:
      * FILE "INPUT-FILE" IN THE "USING" PHRASE OF THE "SORT"
      * STATEMENT WAS ACCEPTED AS BEING ELIGIBLE FOR THE
      * "FASTSRT" COMPILER OPTION, BUT HAD A FILE STATUS
      * DATA-NAME. THE FILE STATUS DATA ITEM WILL NOT BE SET
      * DURING THE SORT.
      * THE REASON BEING SORT ITSELF HANDLES THE OPENING AND CLOSING
      * OF THE DATASETS AND HENCE STATUS VARIABLES WILL NOT BE
      * CONSIDERED.
      ***************************************
           SELECT SORT-FILE ASSIGN TO SORT01.
      * WE DONT NEED TO CODE ORGANIZATION CLAUSE FOR SORT DATASETS
      * IF WE DO, IT WOULD BE IGNORED.
      *    ORGANIZATION IS SEQUENTIAL.
           SELECT OUTPUT-FILE ASSIGN TO OUTPUT01
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS WS-OUTPUT-FILE-STATUS.
      * NOTE NO - BETWEEN FILE AND STATUS
       DATA DIVISION.
       FILE SECTION.
       FD  INPUT-FILE
      * RECORDING,LABEL,BLOCK,RECORD SHOULD NOT BEGIN IN AREA A
           RECORDING MODE IS F
           LABEL RECORDS ARE STANDARD
           BLOCK CONTAINS 0 RECORDS
           RECORD CONTAINS 80 CHARACTERS.
       01  INPUT-FILE-REC.
           05 IM-ACCT-NO           PIC 9(8).
           05 IM-NAME              PIC X(13).
           05 IM-AMOUNT            PIC 999.99.
       SD  SORT-FILE.
      * WE CANNOT USE THE RECORDING MODE, LABEL, BLOCK ETC FOR
      * A SORT DATASET. IF WE CODE THESE THINGS THEY WILL BE
      * PROCESSED AS COMMENTS.
      * THE RECORD LENGTH OF THE SORT FILE SHOULD NOT BE LESS THAN
      * INPUT FILE. OTHERWISE COMPILER GIVES A WARNING.
      * NOTE THAT FOR THESE FILES THE RECORD LAYOUT IS NOT COMPLETE
      * 80 CHARACTERS. BUT THE RECORD CONTAINS CLAUSE INDICATES
      * THE LENGTH.
       01  SORT-FILE-REC.
           05 SR-ACCT-NO           PIC 9(8).
           05 SR-NAME              PIC X(13).
           05 SR-AMOUNT            PIC 999.99.
           05 FILLER               PIC X(53).
      * ADDED A FILLER FOR 53 BYTES TO MAKE THE LENGTH 80.
      * SORT DATASET SHOULD NOT HAVE LENGTH LESS THAN INPUT FILE
       FD  OUTPUT-FILE
           RECORDING MODE IS F
           LABEL RECORDS ARE STANDARD
           BLOCK CONTAINS 0 RECORDS
           RECORD CONTAINS 80 CHARACTERS.
       01  OUTPUT-FILE-REC.
           05 OP-ACCT-NO           PIC 9(8).
           05 OP-NAME              PIC X(13).
           05 OP-AMOUNT            PIC 999.99.
       WORKING-STORAGE SECTION.
       01  WS-INPUT-FILE-STATUS    PIC XX.
       01  WS-OUTPUT-FILE-STATUS   PIC XX.
      * END OF FILE INDICATOR
       01  WS-INPUT-FILE-EOF       PIC X VALUE 'N'.
       01  WS-OUTPUT-FILE-EOF      PIC X VALUE 'N'.
      * COUNTER
       01  COUNT-KEEP              PIC 999999 VALUE ZEROS.
       01  COUNTDROP               PIC 999999 VALUE ZEROS.
      * FILE STATUS IS ALWAYS 2 CHARACTERS
       PROCEDURE DIVISION.
       100-MAIN-MODULE.
           SORT SORT-FILE ON ASCENDING KEY SR-ACCT-NO
      *NOTE THAT WE SORT THE SORT-FILE.
      *THE KEY FIELD SHOULD BE FROM THE SORT DATASET DESCRIPTION.
      * ITS POSSIBLE TO SORT DATA ON MULTIPLE FIELDS.
               INPUT PROCEDURE IS 200-PROCESS-INPUT
               GIVING OUTPUT-FILE
           DISPLAY 'RECORDS KEPT:' COUNT-KEEP
           DISPLAY 'RECORDS DROPPED:' COUNTDROP
           STOP RUN.
       200-PROCESS-INPUT.
           OPEN INPUT INPUT-FILE
      * WHEN USING INPUT PROCEDURE WE SHOULD OPEN THE FILE
      * SORT WILL NOT DO THAT FOR US.
           PERFORM 300-READ-INPUT-FILE
      * READ 1ST RECORD BEFORE THE LOOP.
      * NEXT RECORD WILL BE READ AT THE END OF THE LOOP.
           PERFORM UNTIL WS-INPUT-FILE-EOF = 'Y'
               IF IM-ACCT-NO NOT = 0
                   COMPUTE COUNT-KEEP = COUNT-KEEP + 1
                   RELEASE SORT-FILE-REC FROM INPUT-FILE-REC
      *NOTE THAT WITH RELEASE WE USE THE FIELD DEFINED IN SD
      * RELEASE IS SIMILAR TO WRITE.
      * WE BASICALLY WRITE RECORDS FROM INPUT PROCEDURE TO THE SORT
      * WORK FILE USING RELEASE.
               ELSE
                   COMPUTE COUNTDROP = COUNTDROP + 1
               END-IF
               PERFORM 300-READ-INPUT-FILE
           END-PERFORM
           CLOSE INPUT-FILE.
      * WHEN USING PROCEDURES SORT DOES NOT CLOSE THE INPUT FILE
      * ON ITS OWN. WE SHOULD CLOSE IT IN THE INPUT PROCEDURE.
       300-READ-INPUT-FILE.
           READ INPUT-FILE
               AT END MOVE 'Y' TO WS-INPUT-FILE-EOF.
**************************** Bottom of Data ****************************
JCL:
***************************** Top of Data ******************************
//TESTJCL3 JOB (EWDS),'TEST JCL',NOTIFY=&SYSUID
//DWJ030C0 EXEC PGM=SORTTES2
//STEPLIB  DD DSN=CMN.EDWS.STGO.#001621.LOD,DISP=SHR
//INPUT01  DD DSN=SM017R.SORT.INPUT,DISP=(SHR)
//OUTPUT01 DD DSN=SM017R.SORT.OUTPU2,DISP=(NEW,CATLG,DELETE),
//            DCB=(LRECL=80,RECFM=FB,BLKSIZE=0)
//SORT01   DD DSN=SM017R.SORT.FILE,DISP=(NEW,DELETE,DELETE)
//SYSOUT   DD SYSOUT=*
**************************** Bottom of Data ****************************
Input:
***************************** Top of Data ******************************
00000000kiran bedi 900.34
00000014sukul mahadik 234.45
00000001Dushyant jadha124.53
00000500roger kyambhej045.35
00000000SUNTL mahadik 098.56
**************************** Bottom of Data ****************************
Output:
***************************** Top of Data ******************************
00000001Dushyant jadha124.5
00000014sukul mahadik 234.4
00000500roger kyambhej045.3
**************************** Bottom of Data ****************************
SYSOUT output:
RECORDS KEPT:000003
RECORDS DROPPED:000002
Input Procedure summary:
1) The INPUT PROCEDURE of the SORT should refer to a paragraph-name, but it could refer to a section-name.
2) In the paragraph specified in the INPUT PROCEDURE:
a. OPEN the input file.
b. PERFORM a paragraph that will read and process input records until there is no more data.
c. After all records have been processed, CLOSE the input file.
d. After the last sentence in the INPUT PROCEDURE paragraph is executed, control will then return to the SORT, at which time the records in the sort file will be sorted.
3) At the paragraph that processes input records prior to sorting:
a. Perform any operations on input that are required.
b. MOVE input data to the sort record.
c. RELEASE each sort record, which makes it available for sorting.
d. Continue to read input until there is no more data.
Note, too, that we never OPEN or CLOSE the sort file-name specified in the SD. It is always opened and closed automatically, as are files specified with USING or GIVING. Only the input file processed in an INPUT PROCEDURE needs to be opened and closed by the program.
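The flow summarized above can be sketched in Python as a model of the filtering program shown earlier. This is not COBOL and not the author's implementation: the `release` callback stands in for the RELEASE verb, the list `sort_file` stands in for the sort work file, and the account number is assumed to occupy the first eight characters of each record.

```python
def input_procedure(input_records, release):
    """Model of an INPUT PROCEDURE: validate, count, and release records."""
    kept = dropped = 0
    for rec in input_records:            # read until there is no more data
        if int(rec[:8]) != 0:            # keep only non-zero account numbers
            release(rec)                 # RELEASE makes the record available for sorting
            kept += 1
        else:
            dropped += 1
    return kept, dropped

# Sample input, copied from the input listing of the program above.
input_records = [
    "00000000kiran bedi 900.34",
    "00000014sukul mahadik 234.45",
    "00000001Dushyant jadha124.53",
    "00000500roger kyambhej045.35",
    "00000000SUNTL mahadik 098.56",
]

sort_file = []                                   # stands in for the sort work file
kept, dropped = input_procedure(input_records, sort_file.append)
sort_file.sort(key=lambda rec: rec[:8])          # the sort runs after the procedure ends

print("RECORDS KEPT:", kept)
print("RECORDS DROPPED:", dropped)
```

As in the COBOL version, sorting starts only once the input procedure has finished releasing records, and the two zero-account records never reach the sort file.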
SORT Layout Different from Input Layout:
It may be that not all the fields in the input file are required in the sorted output file.
In such cases the sort file layout would be different from the input file layout.
It would be possible, although inefficient, to (1) first sort the input and produce a sorted master, and (2) then code a separate module to read from the sorted master, moving the data in a rearranged format to a new sorted master.
Instead, we can use an INPUT PROCEDURE to move only the required fields from the input record to the sort record and then release it.
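OUTPUT PROCEDURE:
An OUTPUT PROCEDURE is very similar to an INPUT PROCEDURE, except that an INPUT PROCEDURE processes records before they are sorted and an OUTPUT PROCEDURE processes records in the sort file after they have been sorted.
The full format for the SORT, including both INPUT and OUTPUT PROCEDURE options, is as follows:
SORT file-name-1
    { ON { ASCENDING | DESCENDING } KEY data-name-1 ... } ...
    [ WITH DUPLICATES IN ORDER ]
    { INPUT PROCEDURE IS procedure-name-1 [ { THRU | THROUGH } procedure-name-2 ]
    | USING file-name-2 ... }
    { OUTPUT PROCEDURE IS procedure-name-3 [ { THRU | THROUGH } procedure-name-4 ]
    | GIVING file-name-3 ... }
The word GIVING can be followed by more than one file-name, which means that we can create multiple copies of the sorted file.
An OUTPUT PROCEDURE processes all sorted records in the sort file and handles the transfer of these records to the output file.
In an INPUT PROCEDURE we RELEASE records to a sort file rather than writing them.
In an OUTPUT PROCEDURE we RETURN records from the sort file rather than reading them. The syntax for RETURN is almost the same as for READ.
RETURN basically means read from the sort file, and RELEASE basically means write to the sort file.
Format:
RETURN sort-file-name-1 [ RECORD ] [ INTO identifier-1 ]
    AT END imperative-statement-1
    [ NOT AT END imperative-statement-2 ]
    [ END-RETURN ]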
Output Procedure summary:
1) The OUTPUT PROCEDURE of the SORT should refer to a paragraph-name, but it could refer to a section-name.
2) In the paragraph specified in the OUTPUT PROCEDURE:
a. OPEN the output file.
b. PERFORM a paragraph that will RETURN (which is like a READ) and process records from the sort file until there is no more data. The records are in sequence in the sort file.
c. After all records have been processed, CLOSE the output file.
d. When the OUTPUT PROCEDURE paragraph has been fully executed, control will then return to the SORT.
3) At the paragraph that processes the sort records after they have been sorted but before they are created as output:
a. Perform any operations on the work or sort records.
b. MOVE the work or sort record to the output area.
c. WRITE each sort record to the output file. (A WRITE ... FROM can be used in place of a MOVE and WRITE.)
Recall that the SD file as well as files specified with USING or GIVING are opened and closed automatically. The programmer opens and closes the input file in an INPUT PROCEDURE and the output file in an OUTPUT PROCEDURE.
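The RETURN loop described in the summary can be modelled in Python as follows. This is a sketch under stated assumptions, not COBOL: `return_record` is a hypothetical helper that plays the role of RETURN ... AT END, the records are already-sorted tuples of account number, name and amount, and the processing step (adding 100 to the amount) is chosen purely for illustration.

```python
from decimal import Decimal

def return_record(sort_iter):
    """Model of RETURN ... AT END: the next sorted record, or None at end."""
    return next(sort_iter, None)

def output_procedure(sorted_records, write):
    """Model of an OUTPUT PROCEDURE: RETURN, process, and WRITE each record."""
    sort_iter = iter(sorted_records)
    record = return_record(sort_iter)        # first RETURN before the loop
    while record is not None:                # loop until the sort file is exhausted
        acct_no, name, amount = record
        write((acct_no, name, amount + Decimal("100")))   # process, then WRITE
        record = return_record(sort_iter)    # next RETURN at the end of the loop

# Hypothetical sorted records, as they would come back from the sort file.
sorted_records = [
    ("00000001", "Dushyant jadha", Decimal("124.53")),
    ("00000014", "sukul mahadik", Decimal("234.45")),
    ("00000500", "roger kyambhej", Decimal("045.35")),
]

output_file = []                             # stands in for the output file
output_procedure(sorted_records, output_file.append)
for rec in output_file:
    print(rec)
```

The loop has the same shape as a read loop over an ordinary file: one RETURN before the loop and one at the bottom of each iteration.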
Program:
***************************** Top of Data ******************************
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SORTTES1.
       AUTHOR. SUKUL MAHADIK.
      *AUTHOR IS NOT COMPULSORY. BUT IS A GOOD COMMENT
       ENVIRONMENT DIVISION.
      *ENVIRONMENT DIVISION HAS CONFIG SECTIONS AND I-O SECTION
       CONFIGURATION SECTION.
       SOURCE-COMPUTER. IBM-370.
       OBJECT-COMPUTER. IBM-370.
       INPUT-OUTPUT SECTION.
      *FILE-CONTROL IS A PARAGRAPH
       FILE-CONTROL.
           SELECT INPUT-FILE ASSIGN TO INPUT01
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS WS-INPUT-FILE-STATUS.
      *********************************************
      * WHEN WE CODE FILE STATUS FOR THE SORT INPUT AND OUTPUT
      * DATASETS FOLLOWING IS THE MESSAGE THAT IT SHOWS:
      * FILE "INPUT-FILE" IN THE "USING" PHRASE OF THE "SORT"
      * STATEMENT WAS ACCEPTED AS BEING ELIGIBLE FOR THE
      * "FASTSRT" COMPILER OPTION, BUT HAD A FILE STATUS
      * DATA-NAME. THE FILE STATUS DATA ITEM WILL NOT BE SET
      * DURING THE SORT.
      * THE REASON BEING SORT ITSELF HANDLES THE OPENING AND CLOSING
      * OF THE DATASETS AND HENCE STATUS VARIABLES WILL NOT BE
      * CONSIDERED.
      ***************************************
           SELECT SORT-FILE ASSIGN TO SORT01.
      * WE DONT NEED TO CODE ORGANIZATION CLAUSE FOR SORT DATASETS
      * IF WE DO, IT WOULD BE IGNORED.
      *    ORGANIZATION IS SEQUENTIAL.
           SELECT OUTPUT-FILE ASSIGN TO OUTPUT01
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS WS-OUTPUT-FILE-STATUS.
      * NOTE NO - BETWEEN FILE AND STATUS
       DATA DIVISION.
       FILE SECTION.
       FD  INPUT-FILE
      * RECORDING,LABEL,BLOCK,RECORD SHOULD NOT BEGIN IN AREA A
           RECORDING MODE IS F
           LABEL RECORDS ARE STANDARD
           BLOCK CONTAINS 0 RECORDS
           RECORD CONTAINS 80 CHARACTERS.
       01  INPUT-FILE-REC.
           05 IM-ACCT-NO           PIC 9(8).
           05 IM-NAME              PIC X(14).
           05 IM-AMOUNT            PIC 999.99.
       SD  SORT-FILE.
      * WE CANNOT USE THE RECORDING MODE, LABEL, BLOCK ETC FOR
      * A SORT DATASET. IF WE CODE THESE THINGS THEY WILL BE
      * PROCESSED AS COMMENTS.
      * THE RECORD LENGTH OF THE SORT FILE SHOULD NOT BE LESS THAN
      * INPUT FILE. OTHERWISE COMPILER GIVES A WARNING.
      * NOTE THAT FOR THESE FILES THE RECORD LAYOUT IS NOT COMPLETE
      * 80 CHARACTERS. BUT THE RECORD CONTAINS CLAUSE INDICATES
      * THE LENGTH.
       01  SORT-FILE-REC.
           05 SR-ACCT-NO           PIC 9(8).
           05 SR-NAME              PIC X(14).
           05 SR-AMOUNT            PIC 999.99.
           05 FILLER               PIC X(52).
      * ADDED A FILLER FOR 52 BYTES TO MAKE THE LENGTH 80
      * (8 + 14 + 6 + 52 = 80).
      * SORT DATASET SHOULD NOT HAVE LENGTH LESS THAN INPUT FILE
       FD  OUTPUT-FILE
           RECORDING MODE IS F
           LABEL RECORDS ARE STANDARD
           BLOCK CONTAINS 0 RECORDS
           RECORD CONTAINS 80 CHARACTERS.
       01  OUTPUT-FILE-REC.
           05 OP-ACCT-NO           PIC 9(8).
           05 OP-NAME              PIC X(14).
           05 OP-AMOUNT            PIC 999.99.
       WORKING-STORAGE SECTION.
       01  WS-INPUT-FILE-STATUS    PIC XX.
       01  WS-OUTPUT-FILE-STATUS   PIC XX.
      * END OF FILE INDICATOR
       01  WS-INPUT-FILE-EOF       PIC X VALUE 'N'.
       01  WS-OUTPUT-FILE-EOF      PIC X VALUE 'N'.
       01  WS-SORT-FILE-EOF        PIC X VALUE 'N'.
      * COUNTER
       01  COUNT-KEEP              PIC 999999 VALUE ZEROS.
       01  COUNTDROP               PIC 999999 VALUE ZEROS.
       01  TEMP-NUM-FLD            PIC 999V99.
       PROCEDURE DIVISION.
       100-MAIN-MODULE.
           SORT SORT-FILE ON ASCENDING KEY SR-ACCT-NO
      *NOTE THAT WE SORT THE SORT-FILE.
      *THE KEY FIELD SHOULD BE FROM THE SORT DATASET DESCRIPTION.
      * ITS POSSIBLE TO SORT DATA ON MULTIPLE FIELDS.
               INPUT PROCEDURE IS 200-PROCESS-INPUT
               OUTPUT PROCEDURE IS 400-WRITE-OUTPUT
           DISPLAY 'RECORDS KEPT:' COUNT-KEEP
           DISPLAY 'RECORDS DROPPED:' COUNTDROP
           STOP RUN.
       200-PROCESS-INPUT.
           OPEN INPUT INPUT-FILE
      * WHEN USING INPUT PROCEDURE WE SHOULD OPEN THE FILE
      * SORT WILL NOT DO THAT FOR US.
           PERFORM 300-READ-INPUT-FILE
      * READ 1ST RECORD BEFORE THE LOOP.
      * NEXT RECORD WILL BE READ AT THE END OF THE LOOP.
           PERFORM UNTIL WS-INPUT-FILE-EOF = 'Y'
               IF IM-ACCT-NO NOT = 0
                   COMPUTE COUNT-KEEP = COUNT-KEEP + 1
                   RELEASE SORT-FILE-REC FROM INPUT-FILE-REC
      *NOTE THAT WITH RELEASE WE USE THE FIELD DEFINED IN SD
      * RELEASE IS SIMILAR TO WRITE.
      * WE BASICALLY WRITE RECORDS FROM INPUT PROCEDURE TO THE SORT
      * WORK FILE USING RELEASE.
               ELSE
                   COMPUTE COUNTDROP = COUNTDROP + 1
               END-IF
               PERFORM 300-READ-INPUT-FILE
           END-PERFORM
           CLOSE INPUT-FILE.
      * WHEN USING PROCEDURES SORT DOES NOT CLOSE THE INPUT FILE
      * ON ITS OWN. WE SHOULD CLOSE IT IN THE INPUT PROCEDURE.
       400-WRITE-OUTPUT.
           OPEN OUTPUT OUTPUT-FILE
      * WHEN USING OUTPUT PROCEDURE WE SHOULD OPEN AND CLOSE THE
      * FILES. SORT WONT DO IT FOR US THE WAY IT DOES WITH
      * USING AND GIVING
           RETURN SORT-FILE
      * JUST AS WITH READ WE USE RETURN WITH A FILE.
      * NOTE THAT WE READ,RETURN WITH A FILE
      * AND WRITE AND RELEASE WITH A RECORD NAME
               AT END MOVE 'Y' TO WS-SORT-FILE-EOF
           END-RETURN
      * RETURN ALSO HAS AN EXPLICIT SCOPE TERMINATOR THE WAY READ HAS
           PERFORM UNTIL WS-SORT-FILE-EOF = 'Y'
               DISPLAY 'SR-AMOUNT:' SR-AMOUNT
               DISPLAY 'SR-NAME:' SR-NAME
               MOVE SR-AMOUNT TO OP-AMOUNT
               MOVE SR-NAME TO OP-NAME
               MOVE SR-ACCT-NO TO OP-ACCT-NO
      *        MOVE SR-AMOUNT TO TEMP-NUM-FLD
               COMPUTE TEMP-NUM-FLD =
                   FUNCTION NUMVAL(OP-AMOUNT) + 100
      * WE CAN USE NUMVAL TO GET THE NUMERIC VALUE OF A NUMERIC
      * EDITED FIELD.
               MOVE TEMP-NUM-FLD TO OP-AMOUNT
               WRITE OUTPUT-FILE-REC
               RETURN SORT-FILE
                   AT END MOVE 'Y' TO WS-SORT-FILE-EOF
               END-RETURN
           END-PERFORM
           CLOSE OUTPUT-FILE.
       300-READ-INPUT-FILE.
           READ INPUT-FILE
               AT END MOVE 'Y' TO WS-INPUT-FILE-EOF.
**************************** Bottom of Data ****************************
JCL:
***************************** Top of Data ******************************
//TESTJCL3 JOB (EWDS),'TEST JCL',NOTIFY=&SYSUID
//DWJ030C0 EXEC PGM=SORTTES1
//STEPLIB  DD DSN=CMN.EDWS.STGO.#001621.LOD,DISP=SHR
//INPUT01  DD DSN=SM017R.SORT.INPUT,DISP=(SHR)
//OUTPUT01 DD DSN=SM017R.SORT.OUTPUT,DISP=(NEW,CATLG,DELETE),
//            DCB=(LRECL=80,RECFM=FB,BLKSIZE=0)
//SORT01   DD DSN=SM017R.SORT.FILE,DISP=(NEW,DELETE,DELETE)
//SYSOUT   DD SYSOUT=*
**************************** Bottom of Data ****************************
Input file:
***************************** Top of Data ******************************
00000000kiran bedi 900.34
00000014sukul mahadik 234.45
00000001Dushyant jadha124.53
00000500roger kyambhej045.35
00000000SUNTL mahadik 098.56
**************************** Bottom of Data ****************************
Output:
***************************** Top of Data ******************************
00000001Dushyant jadha224.53
00000014sukul mahadik 334.45
00000500roger kyambhej145.35
**************************** Bottom of Data ****************************
The MERGE Statement:
COBOL has a MERGE statement that will combine two or more files into a single file.
Its format is similar to that of the SORT:
MERGE file-name-1
    { ON { ASCENDING | DESCENDING } KEY data-name-1 ... } ...
    USING file-name-2 file-name-3 ...
    { OUTPUT PROCEDURE IS procedure-name-1 [ { THRU | THROUGH } procedure-name-2 ]
    | GIVING file-name-4 ... }
Rules for ASCENDING/DESCENDING KEY, USING, GIVING, and OUTPUT PROCEDURE are the same as for the SORT.
With the USING clause, we indicate the files to be merged.
At least two file-names must be included for a merge, but more than two are permitted.
Unlike the SORT, however, an INPUT PROCEDURE may not be specified with a MERGE statement.
That is, using the MERGE statement, you may process records only after they have been merged, not before (the input has to come from files).
The OUTPUT PROCEDURE has the same format as with the SORT.
The MERGE statement automatically handles the opening, closing, and input/output (READ/WRITE functions) associated with the files named in USING and GIVING.
The files to be merged must each be in sequence by the key field.
If ASCENDING KEY is specified, then the merged output file will have records in increasing order by key field, and if DESCENDING KEY is specified, the merged output file will have key fields from high to low.
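The requirement that each input already be in key sequence is what lets a merge run in a single pass. The following Python sketch models an ascending merge of two presorted files with the standard library's `heapq.merge`; it illustrates the idea only, and the records and the key position (first element of each tuple) are assumptions.

```python
import heapq

# Two hypothetical input files, each already in ascending key sequence.
file_a = [("0001", "A"), ("0005", "A"), ("0009", "A")]
file_b = [("0002", "B"), ("0005", "B"), ("0007", "B")]

# heapq.merge reads its inputs in order and assumes each one is already sorted
# on the key, just as MERGE expects its USING files to be in key sequence.
merged = list(heapq.merge(file_a, file_b, key=lambda rec: rec[0]))

merged_keys = [rec[0] for rec in merged]
print(merged_keys)
```

If one of the inputs were not in key sequence, the result would not be in sequence either, since a merge compares only the current record of each input.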
Self Test:
1. Code a simple SORT to read a file called IN-FILE, sort it into ascending name sequence, and create an output file called OUT-FILE.
Answer:
SORT SORT-FILE
ON ASCENDING KEY NAME
USING IN-FILE
GIVING OUT-FILE
2. It is possible to process records before they are sorted by using the _______ option in place of the _______ option.
Answer: INPUT PROCEDURE, USING
3. A(n) (unsorted input, sorted output) file is opened in an INPUT PROCEDURE and a(n) (unsorted input, sorted output) file is opened in an OUTPUT PROCEDURE.
Answer:
Unsorted input
Sorted output
4. In place of a WRITE statement in an INPUT PROCEDURE, the _______ verb is used to write records onto the sort or work file.
Answer: RELEASE, for example:
RELEASE SORT-REC FROM IN-REC
5. In place of a READ statement in an OUTPUT PROCEDURE, the _______ verb is used to read records from the sort or work file.
Answer: RETURN, in the form:
RETURN SORT-FILE-NAME
AT END ...
NOT AT END ...
END-RETURN
6. (T or F) The RELEASE statement uses a file-name, as does the RETURN statement.
Answer: False. The RELEASE statement uses the record name and RETURN uses the file name.
7. Code a simple SORT to read a file called IN-PAYROLL, sort it into ascending NAME sequence, and create an output file called OUT-PAYROLL.
Answer:
SORT SORT-FILE ON ASCENDING KEY NAME
USING IN-PAYROLL
GIVING OUT-PAYROLL
8. (T or F) A WORK or SORT file is required when sorting.
Answer: True
True-False Questions:
False____ 1. If the OUTPUT PROCEDURE is used with the SORT verb, then the INPUT PROCEDURE is required.
True____ 2. RELEASE must be used in an INPUT PROCEDURE; RETURN must be used in an OUTPUT PROCEDURE.
False____ 3. The results of sorting will always be the same regardless of whether the computer uses ASCII or EBCDIC.
Exp: The collating sequence is different in ASCII and EBCDIC.
True____ 4. The RELEASE statement is used in place of the WRITE statement in an INPUT PROCEDURE.
False____ 5. A maximum of three SORT fields are permitted in a single SORT statement.
False____ 6. The only method for sorting a disk file is with the use of the SORT statement in COBOL.
Exp: We can also use a sort utility or a database management sort program.
False____ 7. Data may be sorted in either ascending or descending sequence and the sort field must be numeric.
True____ 8. The procedure-name specified in the INPUT PROCEDURE clause is a paragraph-name.
Exp: It can also be a section-name.
False____ 9. If a file is described by an SD, it is not defined in a SELECT clause and does not have an FD.
Exp: It is defined in a SELECT clause. It does not have an FD; it has an SD.
False____ 10. In the EBCDIC collating sequence, a blank has the lowest value and the SORT verb does not distinguish between upper- and lowercase letters.
Exp: The SORT verb does distinguish between upper- and lowercase letters; in EBCDIC, lowercase letters are "less than" uppercase letters.
False____ 11. The syntax for SORT and MERGE are very different.
Exp: The syntax is very similar. The main difference is that SORT can have an INPUT PROCEDURE and MERGE cannot; the input to a merge has to come directly from the files named in USING.
False____ 12. A sort can be performed with a minimum of two files: the input file and the file of sorted output records.
Exp: A sort work file is also required.