Thursday, September 30, 2010

COBOL Tutorial 000300 – Edited Fields

As I have mentioned in the last tutorial, you use edited fields in COBOL to format data fields into human-readable display strings. Let’s start with a numeric field:

01 NUMERIC-FIELD PIC 999999V99.

and some COBOL code that sets and displays the field value:

MOVE 1234.5 TO NUMERIC-FIELD.
DISPLAY 'NUMERIC-FIELD: ' NUMERIC-FIELD.

As we’ve demonstrated in the previous tutorial, unused digits are padded with ugly zeros:

NUMERIC-FIELD: 001234.50

Let me put my C# programmer hat on again (apologies to Java, Ruby, Python, C/C++, assembly and many other programmers who don’t like C#). When we have to format a variable for display, we often use the string.Format method with a format string containing special formatting characters, which is “0,0.00” in the following example:

// returns 1,234.50
string.Format("{0:0,0.00}", 1234.5)

Now let’s come back to COBOL. An edited field is basically a normal COBOL data field with a formatting string in the picture clause instead of the “A”, “X” or “9” data type specifiers. The edited field's formatting string is based on ideas similar to C#’s. To achieve the same output as the C# code above, I use the “ZZZ,ZZZ.99” formatting string as shown in the following example:

01 EDITED-NUMERIC-FIELD PIC ZZZ,ZZZ.99.

Unlike the placeholder character “9”, each unused “Z” in the picture clause is filled with a space rather than a “0”, and the “,” inserts a comma in the display value (the comma also becomes a space when no digit is shown to its left). Therefore, if we move the value of NUMERIC-FIELD to EDITED-NUMERIC-FIELD and then display the content of EDITED-NUMERIC-FIELD:

MOVE NUMERIC-FIELD TO EDITED-NUMERIC-FIELD.
DISPLAY 'EDITED-NUMERIC-FIELD: ' EDITED-NUMERIC-FIELD.

The result is a much more readable output:

EDITED-NUMERIC-FIELD:   1,234.50
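
To see how “Z” and “,” behave when the value has fewer (or more) digits, here is a minimal sketch that moves a few other values into the same edited field. The program name EDITED-SMALL and the sample values are made up for illustration, and the square brackets are only there to make the leading spaces visible.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EDITED-SMALL.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 EDITED-NUMERIC-FIELD PIC ZZZ,ZZZ.99.

       PROCEDURE DIVISION.
      *> Only two digits before the decimal point: the unused
      *> Z positions and the comma all turn into spaces.
            MOVE 12.3 TO EDITED-NUMERIC-FIELD.
            DISPLAY '[' EDITED-NUMERIC-FIELD ']'.

      *> Zero: every Z is suppressed, only the .99 part shows.
            MOVE 0 TO EDITED-NUMERIC-FIELD.
            DISPLAY '[' EDITED-NUMERIC-FIELD ']'.

      *> All six Z positions are used, so the comma appears.
            MOVE 123456.78 TO EDITED-NUMERIC-FIELD.
            DISPLAY '[' EDITED-NUMERIC-FIELD ']'.
            STOP RUN.
```

Going by the picture clause rules, this should print [     12.30], [       .00] and [123,456.78]. The field is always 10 characters wide, whatever the value.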

Keep in mind that an edited field is basically an alpha-numeric field, so you cannot use it as an operand in an arithmetic calculation. For example, if you add the following line of code to your program:

ADD 1 TO EDITED-NUMERIC-FIELD.

You will get the following compile error:

Error: 'EDITED-NUMERIC-FIELD' is not numeric name
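
One way around this, as a minimal sketch: do the arithmetic on the plain numeric field and only move the result into the edited field when it is time to display it. The program name EDITED-ARITH is made up for this example; the two fields are the same ones used above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EDITED-ARITH.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 NUMERIC-FIELD PIC 999999V99.
       01 EDITED-NUMERIC-FIELD PIC ZZZ,ZZZ.99.

       PROCEDURE DIVISION.
            MOVE 1234.5 TO NUMERIC-FIELD.
      *> Do the arithmetic on the plain numeric field...
            ADD 1 TO NUMERIC-FIELD.
      *> ...then format the result for display.
            MOVE NUMERIC-FIELD TO EDITED-NUMERIC-FIELD.
            DISPLAY 'EDITED-NUMERIC-FIELD: ' EDITED-NUMERIC-FIELD.
            STOP RUN.
```

This should display 1,235.50 with the same two leading spaces as before.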

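This is a summary of commonly used special formatting characters:

  • “B” – Inserts a blank space.
  • “Z” – Placeholder for a digit, shown as a space if it is an unused leading zero.
  • “,” – Inserts a comma.
  • “/” – Inserts a slash.
  • “0” – Inserts a zero.
  • “.” – Inserts the decimal point.
  • “-” – Shows a minus sign if the value is negative, or a space otherwise.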

Here are some examples to demonstrate how to use them:

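Picture Clause   Input      Output
9999/99/99       20100101   2010/01/01
9999B99B99       20100101   2010 01 01
-ZZZ,ZZZ.99      -1234.5    -  1,234.50
ZZZ,ZZZ.99-      -1234.5      1,234.50-
X0X0X0X          ABCD       A0B0C0D

Note that the two unused “Z” positions in the negative examples show up as spaces in the output.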

Finally, here's the COBOL program I used to develop the example code in this tutorial.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. EDITED-FIELD.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 NUMERIC-FIELD PIC 999999V99.
       01 EDITED-NUMERIC-FIELD PIC ZZZ,ZZZ.99.
       01 EDITED-NEGATIVE-FIELD1 PIC -ZZZ,ZZZ.99.
       01 EDITED-NEGATIVE-FIELD2 PIC ZZZ,ZZZ.99-.
       01 EDITED-DATE-FIELD1 PIC 9999/99/99.
       01 EDITED-DATE-FIELD2 PIC 9999B99B99.
       01 EDITED-ZERO-FIELD PIC X0X0X0X.

       PROCEDURE DIVISION.
            MOVE 1234.5 TO NUMERIC-FIELD.
            DISPLAY '       NUMERIC-FIELD: ' NUMERIC-FIELD.
            
            MOVE NUMERIC-FIELD TO EDITED-NUMERIC-FIELD.
            DISPLAY 'EDITED-NUMERIC-FIELD: ' EDITED-NUMERIC-FIELD.

            MOVE -1234.5 TO EDITED-NEGATIVE-FIELD1.
            DISPLAY 'EDITED-NEGATIVE-FIELD1: ' EDITED-NEGATIVE-FIELD1.

            MOVE -1234.5 TO EDITED-NEGATIVE-FIELD2.
            DISPLAY 'EDITED-NEGATIVE-FIELD2: ' EDITED-NEGATIVE-FIELD2.

            MOVE 20100101 TO EDITED-DATE-FIELD1.
            DISPLAY 'EDITED-DATE-FIELD1: ' EDITED-DATE-FIELD1.

            MOVE 20100101 TO EDITED-DATE-FIELD2.
            DISPLAY 'EDITED-DATE-FIELD2: ' EDITED-DATE-FIELD2.
            
            MOVE 'ABCD' TO EDITED-ZERO-FIELD.
            DISPLAY 'EDITED-ZERO-FIELD: ' EDITED-ZERO-FIELD.
            STOP RUN.

Sunday, September 26, 2010

COBOL Tutorial 000200 – Data Fields

Variables are called fields in COBOL, and each field is defined with a picture clause (PICTURE, which can be abbreviated to PIC). Why is it called the picture clause? According to the book Sams Teach Yourself COBOL in 24 Hours, this is because it “paints a picture of how a field looks by defining every details and characteristic of the field”. That still doesn’t quite make sense to me, but anyway.

Let’s start by talking about what data fields (variables) look like in C#. When we declare a variable, the first thing we have to think about is the data type, which determines what kind of data it can hold. Normally we wouldn’t worry about the number of digits or the length of the string unless we know their values can get ridiculously large or long.

int integerVariable = 12345678;
string stringVariable = "abcd1234";
decimal decimalVariable = 1234.5678m;

In the COBOL world, however, size does matter: you have to specify both the type and the size of each data field at the same time with a special character mask in its picture clause (the PIC keyword), as shown in the code below:

01 ALPHA-FIELD         PIC AAAAAAAAAA.
01 NUMERIC-FIELD       PIC 9999999999.
01 ALPHA-NUMERIC-FIELD PIC XXXXXXXXXX.

There are really only 3 basic types of data field in COBOL: alphabetic, numeric and alpha-numeric. As their names suggest, they hold alphabetical, numeric and alpha-numeric characters respectively.

Let’s look at the ALPHA-FIELD first. Its picture clause specifies an “AAAAAAAAAA” mask, and each “A” character is a placeholder for a single alphabetical character, so the ALPHA-FIELD can be used to store 10 alphabetical characters. Similarly, each “9” is a placeholder for a single digit and an “X” is a placeholder for a single alpha-numeric character. Therefore, the NUMERIC-FIELD and ALPHA-NUMERIC-FIELD can hold 10 digits and 10 alpha-numeric characters respectively.
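
To make the point about size concrete, here is a small sketch of what happens when a value doesn't fit. It uses the MOVE command, which assigns a value to a field. The program name TRUNCATE-DEMO and all four field names are made up for this example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TRUNCATE-DEMO.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 LONG-TEXT    PIC XXXXXXXX.
       01 SHORT-TEXT   PIC XXXXX.
       01 BIG-NUMBER   PIC 999999.
       01 SMALL-NUMBER PIC 9999.

       PROCEDURE DIVISION.
            MOVE 'ABCDEFGH' TO LONG-TEXT.
            MOVE 123456 TO BIG-NUMBER.

      *> Alpha-numeric data is cut off on the right.
            MOVE LONG-TEXT TO SHORT-TEXT.
            DISPLAY 'SHORT-TEXT: ' SHORT-TEXT.

      *> Numeric data loses its leftmost digits instead.
            MOVE BIG-NUMBER TO SMALL-NUMBER.
            DISPLAY 'SMALL-NUMBER: ' SMALL-NUMBER.
            STOP RUN.
```

I'd expect this to display ABCDE and 3456: the extra characters are dropped silently, which is why picking the right field size matters.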

To declare a decimal data field, we need to add the special character “V” to the numeric data field mask, which specifies where the decimal point goes. The “V” is only an implied decimal point: it takes up no storage of its own. For example, the DECIMAL-NUMERIC-FIELD data field declared below lets you store decimal values with up to 5 digits before and 5 digits after the decimal point.

01 DECIMAL-NUMERIC-FIELD PIC 99999V99999.
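
As a rough sketch of what “V” means for storage: the field above holds 10 digit characters and nothing else, and the “V” only tells COBOL where the decimal point sits among them. The example below peeks at the stored characters through a second name for the same storage. It uses the REDEFINES clause, which this tutorial doesn't cover, and RAW-CHARACTERS and the program name IMPLIED-POINT are made up for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. IMPLIED-POINT.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 DECIMAL-NUMERIC-FIELD PIC 99999V99999.
      *> Hypothetical second name for the same 10 characters
      *> of storage, viewed as plain alpha-numeric data.
       01 RAW-CHARACTERS REDEFINES DECIMAL-NUMERIC-FIELD
                         PIC XXXXXXXXXX.

       PROCEDURE DIVISION.
            MOVE 1234.5 TO DECIMAL-NUMERIC-FIELD.
            DISPLAY 'RAW-CHARACTERS: ' RAW-CHARACTERS.
            STOP RUN.
```

This should display 0123450000: ten digits with no decimal point stored anywhere.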

As you have probably noticed, the format mask can get very ugly for large fields, so COBOL allows you to abbreviate it by writing the special character followed by the number of repetitions in round brackets. For example, we can abbreviate “XXXXXXXXXX” to X(10) and “99999V99999” to 9(5)V9(5).

Let’s put these all together into a small program:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. EDITED.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 ALPHA-FIELD PIC A(10).
       01 NUMERIC-FIELD PIC 9(10).
       01 ALPHA-NUMERIC-FIELD PIC X(10).
       01 DECIMAL-NUMERIC-FIELD PIC 9(5)V9(5).

       PROCEDURE DIVISION.
            MOVE 'ABCEFG' TO ALPHA-FIELD.
            DISPLAY '          ALPHA-FIELD: ' ALPHA-FIELD.

            MOVE 123456 TO NUMERIC-FIELD.
            DISPLAY '        NUMERIC-FIELD: ' NUMERIC-FIELD.

            MOVE 'ABC123' TO ALPHA-NUMERIC-FIELD.
            DISPLAY '  ALPHA-NUMERIC-FIELD: ' ALPHA-NUMERIC-FIELD.

            MOVE 1234.5 TO DECIMAL-NUMERIC-FIELD.
            DISPLAY 'DECIMAL-NUMERIC-FIELD: ' DECIMAL-NUMERIC-FIELD.

            STOP RUN.

Notice that I’ve used the abbreviated picture clause mask (e.g. X(10) instead of XXXXXXXXXX) in the example above. Also, we haven’t covered the MOVE command yet, but basically that’s how you move (assign) values into data fields in COBOL. If you compile and run this program, you’ll see the following output in the console:

          ALPHA-FIELD: ABCEFG    
        NUMERIC-FIELD: 0000123456
  ALPHA-NUMERIC-FIELD: ABC123    
DECIMAL-NUMERIC-FIELD: 01234.50000

As you have probably noticed, unused digits in numeric fields are padded with zeros, which is quite ugly. We’ll cover how to make it look prettier with Edited Fields in the next tutorial.

Update: This is the 3rd revision of this COBOL data fields tutorial. I’ve decided to cut it into two parts because it was getting too long and messy.

Thursday, September 23, 2010

COBOL Tutorial 000100 – the ‘Hello World’

Due to a new project at work, I'm starting to learn COBOL (not the sexiest language, I know). I found that the hardest part of learning COBOL is working out what COBOL keywords mean in terms of more modern programming languages. Therefore, I thought I’d write a couple of short tutorials here to explain some of these differences, in case some other programmers are interested in learning this 50+ year old programming language.

As with learning any other programming language, the first example has to be the “Hello World” and here’s the source code:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. HELLOWORLD.
       PROCEDURE DIVISION.
            DISPLAY 'HELLO WORLD'.
            STOP RUN.

Every COBOL program needs an IDENTIFICATION DIVISION and a PROGRAM-ID (which is HELLOWORLD in our example). All program logic sits under the PROCEDURE DIVISION. The rest of the program should be pretty self-explanatory.

The full stop (.) is the equivalent of the semi-colon (;) in C-derived programming languages: it marks the end of a statement rather than the end of a physical line.

To compile this code, I used the OpenCOBOL compiler. You can install it under Ubuntu 10.04 by typing the following command in the shell:

sudo apt-get install open-cobol

After installing the compiler, you can then compile the program by running (assuming that you’ve saved the source code in a file called helloworld.cob):

cobc -x -free helloworld.cob

The -free compiler flag tells the cobc compiler to use the free source code format. Without it, the compiler expects the fixed source format, where the first 7 columns of each line are reserved and your code has to start after them. The -x flag, on the other hand, tells the compiler to produce an executable rather than a .so file (we’ll talk about .so files later).

Finally, the compiler may produce a few warnings about “dereferencing type-punned pointer”, but you can just ignore them. After the compilation finishes, you will find the executable helloworld in your directory.