Record Reformatter (RR32)
A tool to process EBCDIC or ASCII files and typically produce fixed length data
The major application of the Record Reformatter (RR32) is in restructuring records for moving between different systems or to produce simple reports based on complex file structures. The restructuring is often used to expand packed numeric fields such as COMP-3 or binary/floating point numbers, including floating point numbers from a VAX system. It can also read IBM type records on your PC.
Other applications include extracting fields from a database for indexing or summary applications. It can also be used to do an intelligent EBCDIC/ASCII conversion on data where there is a mixture of text and binary fields.
The Record Reformatter has many tools built in to assist the user in analyzing unknown records and field structures. This includes calculating record lengths by looking for patterns in the data and showing data in several formats which will allow the user to select a sensible conversion.
Where Cobol data descriptions are available in a file form, this can be read and the record structure determined automatically. In a similar way AS400 savelib tapes may be read with MediaMerge/PC to produce a Record Reformatter file description.
The date conversion routines and the general restructuring of records may assist in converting data files in applications requiring updates due to any left over year 2000 (Y2K) problem.
The Record Reformatter may be used either as a stand-alone tool or in conjunction with InterMedia for Windows or MM/PC, and as such it provides a very powerful method of handling many IBM type tapes and records.
About Records
Before details of how to use this are explained, a few notes on record structure and definitions are required. As with all applications, there will be exceptions that cannot be handled, but in practice these will not be very common. Records can be fixed length or variable length.
Fixed length records
A fixed length record always has the same length, but a file may be made up of many different types of record which may each have different lengths.
The Record Reformatter analyze function will try to determine the record length automatically, even for records without carriage returns. The user can override any automatic record length determination.
A record type is identified by a unique code in a fixed location, irrespective of the record length. For instance, the first 2 characters may be a two-digit number that determines the type of record. Up to 20 different types of record may be defined, each with different record lengths and field structures. RR32 will operate with fixed length records where some records are truncated (e.g. by a CRLF), indicating that the rest of the record is just spaces.
Fields have fixed lengths and locations within a record. Different record types will have different fields and field positions.
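The record-type handling described above can be sketched as follows. This is only an illustration, not RR32's own code: the type codes `01` and `02`, their lengths, and the name `split_records` are hypothetical. It assumes a two-character type code at the start of each record, and that a record cut short by CRLF should be padded with spaces.

```python
# Hypothetical layout: the two-character code at offset 0 selects the record length.
RECORD_LENGTHS = {b"01": 12, b"02": 8}

def split_records(data: bytes, terminator: bytes = b"\r\n"):
    """Yield (type_code, record) pairs, padding truncated records with spaces."""
    pos = 0
    while pos < len(data):
        code = data[pos:pos + 2]
        length = RECORD_LENGTHS[code]
        # A terminator inside the expected span marks a truncated record.
        end = data.find(terminator, pos, pos + length)
        if end != -1:
            record = data[pos:end].ljust(length, b" ")
            pos = end + len(terminator)
        else:
            record = data[pos:pos + length]
            pos += length
        yield code, record
```

Real binary fields may contain the terminator bytes by chance, so a production tool would need a stricter rule than this sketch uses.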
Before any work may be performed on records, it is essential to define the following points:
Types of input records:
- Record identifier code, position and length
- Each record length
- Field structure for each record type
- Length of output record it translates to
- Optional record terminator for truncated records
Output records are defined by:
- Length
- Field locations
- Types of conversion between input and output fields
- Value of filler for unmapped fields
All these values are entered via the Record Reformatter Editor. In analysis mode, field types can be determined automatically. Due to the nature of text-based records, this initial analysis will not always be correct, so full editing of the first approximation may be carried out later. Alternatively, the structure may be entered manually, from a Cobol Data definition file, or from a DBase file.
Variable Length Records
With fixed length records, each field and record has a pre-determined length. With variable length records, fields and records are marked by end characters. Typically, fields are separated by a comma and records by a CR. Thus two records could look as below
field1,field2,A longer field,last field(CR)
fa,fb,fc,final field in this record(CR)
Thus to define such a file structure, it is necessary to assign values for field and record delimiters. With the Record Reformatter both of these delimiters may be one or two characters long.
With variable length fields, only a single record structure can be handled.
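As a rough sketch of the delimiter scheme above (not RR32's implementation), the following splits a text buffer into records and fields. The function name `parse_delimited` is hypothetical; the delimiters default to the comma and CR used in the sample and may each be one or two characters long.

```python
def parse_delimited(data: str, field_delim: str = ",", record_delim: str = "\r"):
    """Split text into records, then split each record into fields."""
    records = data.split(record_delim)
    if records and records[-1] == "":
        records.pop()  # drop the empty piece after the final record delimiter
    return [record.split(field_delim) for record in records]
```

For the two sample records shown earlier, this returns one list of four fields per record.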
Automatic Analysis Mode
A very powerful feature of RR32 is the ability to analyze records and create an outline input field definition automatically. Although this is not a complete substitute for documentation describing a record, it can be extremely useful in analyzing an otherwise unknown record.
Analyze will try to determine field breaks, as well as field types. This includes text (ASCII or EBCDIC) and packed fields. Field breaks may be moved, added or deleted. The generated routine may then be edited in any way required. This can include adding, deleting, concatenating and sorting field definitions.
Output creation and testing
Once an input record has been defined a typical application is to convert it into say a quote comma quote delimited record. The output can be automatically generated based on the input field locations and descriptions. Several typical output types may be created such as Comma delimited and Quote comma Quote. If an input field is described as a 'Strip field' it will not be included in the output record.
The output may then be edited as required so that lengths can be controlled or modified. In the same way fixed data fields may be added.
Once created, the output may be tested, and the first 128K of the input sample file will be converted and displayed on a split screen of both input and output file using the InterMedia File Viewer.
Specification of Features
| Maximum number of record types | 20 |
| Maximum number of field definitions | 3000 |
| Maximum file length | Only limited by disk space |
| Record Analysis | First 50 records |
| Setup wizard | Yes |
| Win95/98 | Yes |
| Win NT 4.0, 2000 | Yes |
| Win 3.x | No |
Field Conversions
Each field may be converted by the following commands. The position of each output field is entirely dependent on the user, and fields may be omitted or included multiple times as required.
Copy
This is probably the most common conversion rule to be used. It simply copies the input field to the output field. If the input and output fields are different lengths, either the end of the field will be truncated or the field will be padded.
Copy Reverse
This will copy the field, as described above, and then reverse it. Thus a field such as
"Hello 1234"
will become
"4321 olleH"
The reversing works on the final length of the output field so if padding was required, the padding would end up at the start of the field.
For fields of 1,2,4 bytes in length, this operation is identical to Swap 8, Swap 16 and Swap 32.
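A minimal sketch of Copy and Copy Reverse follows. The function names are hypothetical, and the space used as the pad character is an assumption, since the filler value is configurable in the product.

```python
def copy_field(src: bytes, out_len: int, pad: bytes = b" ") -> bytes:
    """Copy src into a field of out_len bytes, truncating or padding on the right."""
    return src[:out_len].ljust(out_len, pad)

def copy_reverse(src: bytes, out_len: int, pad: bytes = b" ") -> bytes:
    """Copy as above, then reverse the final output field."""
    return copy_field(src, out_len, pad)[::-1]
```

Because the reversal is applied after padding, a short input such as `AB` in a four-byte field ends up with its padding at the start.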
Copy Pascal
This will copy a Pascal text string. A Pascal string starts with a byte giving the length, followed by the string. The length byte is stripped when copying.
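The Pascal string layout can be illustrated with a short sketch. The name `copy_pascal` is hypothetical, and padding or truncation to the output field length is left out.

```python
def copy_pascal(field: bytes) -> bytes:
    """Return the text of a length-prefixed (Pascal) string, without the length byte."""
    length = field[0]             # first byte holds the string length
    return field[1:1 + length]    # bytes beyond the stated length are ignored
```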
As in Copy, but converts an ASCII input field to EBCDIC.
As in Copy, but converts an EBCDIC field to ASCII.
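The two text conversions can be sketched with Python's built-in `cp037` codec. Which EBCDIC code page RR32 uses is not stated here, so `cp037` is an assumption, and the function names are hypothetical.

```python
def ascii_to_ebcdic(field: bytes) -> bytes:
    """Convert ASCII text bytes to EBCDIC (code page 037 assumed)."""
    return field.decode("ascii").encode("cp037")

def ebcdic_to_ascii(field: bytes) -> bytes:
    """Convert EBCDIC text bytes (code page 037 assumed) to ASCII."""
    # Characters with no ASCII equivalent are replaced with '?'.
    return field.decode("cp037").encode("ascii", errors="replace")
```

Such a conversion is only safe on text fields. Applying it to packed or binary fields would corrupt them, which is why a field-by-field conversion is needed for mixed records.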
FILL ASCII
This fills the output field with the ASCII string defined in the parameters. Any codes in <> are treated as hex values. An example string is 12<09>Hello. It may typically be used for inserting tags between fields, or even a ',' to make a record ',' delimited.
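One way to expand such a fill string is sketched below. It assumes each `<>` group holds exactly two hex digits, which the text does not confirm, and the name `expand_fill` is hypothetical.

```python
import re

def expand_fill(spec: str) -> bytes:
    """Replace each <hh> group with the byte it names; keep other characters as ASCII."""
    out = bytearray()
    pos = 0
    for match in re.finditer(r"<([0-9A-Fa-f]{2})>", spec):
        out += spec[pos:match.start()].encode("ascii")  # literal text before the code
        out.append(int(match.group(1), 16))             # the hex-coded byte
        pos = match.end()
    out += spec[pos:].encode("ascii")                   # trailing literal text
    return bytes(out)
```

With the example string from the text, `12<09>Hello` becomes the characters `12`, a tab, then `Hello`.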
FILL HEX
This is the same as Fill ASCII but allows the user to insert non-printing characters.
For example to insert a CRLF, the output string
0D0A
would be used.
It may also be used to insert EBCDIC characters
ASCII-PACKED
This converts an ASCII number to a packed field. A packed field uses one nibble (4 bits) per digit. Thus the number '1234', which in hex is 31 32 33 34, would be converted to 01 23 4C, where C represents +. A D would represent -, and an F unsigned. This method of storing numbers roughly halves the space required, and is common within many IBM based record structures. It is also known as IBM COMP-3.
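The packing rule can be sketched as below. The name `pack_comp3` and the way the sign is passed in are hypothetical; the sign nibbles C, D and F are those given in the text.

```python
def pack_comp3(digits: str, sign: str = "+") -> bytes:
    """Pack a string of decimal digits into nibbles, followed by a sign nibble."""
    sign_nibble = {"+": 0xC, "-": 0xD, "": 0xF}[sign]  # "" means unsigned
    nibbles = [int(d) for d in digits] + [sign_nibble]
    if len(nibbles) % 2:
        nibbles.insert(0, 0)  # add a leading zero nibble to fill whole bytes
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))
```

Here `pack_comp3("1234")` gives the three bytes 01 23 4C from the example above.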
EBCDIC-PACKED
This converts an EBCDIC number to a packed field. (See ASCII-PACKED)
Packed to ASCII
Packed to EBCDIC
Packed fields are a very common occurrence in many IBM records. The numbers may be signed as above or unsigned, in which case a series of hex characters 12 34 56 would represent the decimal number 123456. There is a feature where packed numbers may be decoded even when placed on 4 bit (nibble) boundaries, rather than byte boundaries.
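Decoding works in the opposite direction, as in this sketch. The name `unpack_comp3` is hypothetical. A final nibble above 9 is treated as a sign, with D meaning negative as described above; otherwise the field is taken as unsigned digits. Fields starting on nibble boundaries are not handled.

```python
def unpack_comp3(field: bytes) -> str:
    """Decode a packed decimal field to an ASCII number string."""
    nibbles = [n for b in field for n in (b >> 4, b & 0x0F)]
    sign = ""
    if nibbles[-1] > 9:          # last nibble is a sign, not a digit
        if nibbles[-1] == 0xD:
            sign = "-"
        nibbles = nibbles[:-1]
    if any(n > 9 for n in nibbles):
        raise ValueError("not a packed decimal field")
    # Strip leading zeros but keep a single "0" for an all-zero field.
    return sign + ("".join(map(str, nibbles)).lstrip("0") or "0")
```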
Convert Date
The convert date operator will convert a date field to a DDMMYYYY date format.
The actual output date format is selected on the configuration screen of the routine.
The type of date conversion is dependent on the combo box at the right of the line.
Conversion options are as below
- Date 7-4-5. This relates to the bits of a 16-bit number, where the 7 most significant bits represent the year, from 1900 to 2027. The next 4 bits are the month, 1-12, and the final 5 bits, the day 1-31.
- DATE YYMMDD This inserts system date as YYYY/MM/DD
- DATE MMDDYY This inserts system date as MM/DD/YYYY
- DATE DDMMYY This inserts system date as DD/MM/YYYY
- Julian Date, IBM date from about 4700BC!
- Many more date formats are added as found
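The Date 7-4-5 layout can be decoded with a few shifts and masks, as sketched here. The name `decode_date_745` is hypothetical. The function takes the 16-bit value as an integer, because the byte order used to read it from the record is not stated, and it produces the DDMMYYYY form mentioned above.

```python
def decode_date_745(value: int) -> str:
    """Decode a 16-bit date: 7 bits year offset, 4 bits month, 5 bits day."""
    year = 1900 + (value >> 9)     # top 7 bits, 0-127, so 1900-2027
    month = (value >> 5) & 0x0F    # next 4 bits
    day = value & 0x1F             # low 5 bits
    return f"{day:02d}{month:02d}{year:04d}"
```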
TIME
Inserts system time in output string, as HH:MM:SS
SWAP 32
Swaps 4 byte arrays. This can be useful to convert numbers from little-endian to big-endian.
SWAP 16
Swaps each pair of characters from the input string.
Example:
Input = InterMedia
Output = nIetMrdeai
SWAP 8
Swaps two nibbles from an input byte. For example, 0D(hex) would be converted to D0(hex)
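The three swap operations can be sketched as follows. The function names are hypothetical, and Swap 32 is assumed to reverse the byte order within each 4-byte group, which matches its use for changing endianness.

```python
def _reverse_groups(data: bytes, size: int) -> bytes:
    """Reverse the byte order inside each consecutive group of `size` bytes."""
    return b"".join(data[i:i + size][::-1] for i in range(0, len(data), size))

def swap32(data: bytes) -> bytes:
    return _reverse_groups(data, 4)

def swap16(data: bytes) -> bytes:
    return _reverse_groups(data, 2)

def swap8(data: bytes) -> bytes:
    """Swap the high and low nibble of every byte."""
    return bytes(((b << 4) & 0xF0) | (b >> 4) for b in data)
```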
Record Count
This inserts the current record count
+/- Number HiLo
+/- Number LoHi
VAX Float
This converts the input binary number to an ASCII string. The output buffer is right justified, and if not large enough the most significant digits will be truncated. If the value is negative, a '-' sign will be added. This conversion feature can be extremely important when trying to import binary files into a text file format
The range of numbers is as below
| Input size (bytes) | Output buffer size |
| 1 | 4 |
| 2 | 6 |
| 3 | 8 |
| 4 | 11 |
| 8 | 38 max |
| 10 | 200 (not yet implemented) |
For VAX floating point numbers there are 4 defined lengths:
- F-Float 4 bytes
- D-Float 8 Bytes
- G-Float 8 Bytes (Not implemented yet)
- H-Float 16 Bytes (Not implemented yet)
All VAX numbers are signed, and the ordering is fixed.
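As an illustration of the 4-byte case, the sketch below decodes a VAX F-Float using the standard F_floating layout: two little-endian 16-bit words, a sign bit, an 8-bit exponent with a bias of 128, and a 23-bit fraction with a hidden leading bit. The name `vax_f_to_float` is hypothetical, and how RR32 itself performs the conversion is not documented here.

```python
def vax_f_to_float(field: bytes) -> float:
    """Decode a 4-byte VAX F_floating value to a Python float."""
    word0 = int.from_bytes(field[0:2], "little")  # sign, exponent, high fraction bits
    word1 = int.from_bytes(field[2:4], "little")  # low 16 fraction bits
    sign = word0 >> 15
    exponent = (word0 >> 7) & 0xFF
    fraction = ((word0 & 0x7F) << 16) | word1
    if exponent == 0:
        return 0.0  # true zero (the reserved operand case is not handled here)
    mantissa = (fraction | 0x800000) / float(1 << 24)  # 0.5 <= mantissa < 1
    value = mantissa * 2.0 ** (exponent - 128)
    return -value if sign else value
```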
The size of the input number will be taken from the input field definition and may be 1-4 bytes in length. The ordering of the number will be high byte first for HiLo and low byte first for LoHi.
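The binary-to-text conversion for integer fields can be sketched like this. The name `number_to_ascii` and its parameters are hypothetical. The output is right justified, and when the buffer is too small the most significant characters are dropped, as described above.

```python
def number_to_ascii(field: bytes, order: str = "HiLo",
                    signed: bool = True, width: int = 11) -> str:
    """Convert a 1-4 byte binary integer to a right-justified ASCII string."""
    byteorder = "big" if order == "HiLo" else "little"  # HiLo = high byte first
    value = int.from_bytes(field, byteorder, signed=signed)
    text = str(value).rjust(width)
    return text[-width:]  # keep the rightmost characters if the buffer is too small
```

The widths in the table above follow from this: a signed 2-byte value can be as long as `-32768`, which needs 6 characters.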
For floating point numbers (8 characters in length) the output is almost unlimited. If the output is longer than the field allowed for, the number will be displayed in scientific notation (e.g. 2.63E5). If the output from a floating point number contains invalid characters, this is most likely for one of two reasons:
- It is not a floating point number
- The order should be swapped, i.e. HiLo or LoHi
The 6-byte number is a special floating point implementation. It is not known how standard this is. The 6-byte array is as below
- Byte 1 mantissa
- Byte 2-6 Exponent in Lo-Hi ordering
- The exponent is in the range of 0.5 - 1.5
- The mantissa is a multiple of 2
Number HiLo
Number LoHi
This is as above, but the number is not signed, and so the output buffer can be one character shorter.
Cobol Num
This rule will convert signed strings from Cobol systems. The string includes its sign as part of the last digit. The output may also be formatted with the same commands as described below in Formatting Numeric fields.
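A sign carried in the last digit is commonly stored as an "overpunch" character. The sketch below assumes the usual ASCII rendering of that convention, where `{` and `A`-`I` mean +0 to +9 and `}` and `J`-`R` mean -0 to -9. Whether RR32 uses exactly this mapping is not stated here, and the name `decode_cobol_num` is hypothetical.

```python
POSITIVE = "{ABCDEFGHI"  # overpunch characters for +0 .. +9 (assumed mapping)
NEGATIVE = "}JKLMNOPQR"  # overpunch characters for -0 .. -9 (assumed mapping)

def decode_cobol_num(field: str) -> int:
    """Decode a signed Cobol display number whose last character carries the sign."""
    body, last = field[:-1], field[-1]
    if last in POSITIVE:
        sign, digit = 1, POSITIVE.index(last)
    elif last in NEGATIVE:
        sign, digit = -1, NEGATIVE.index(last)
    else:
        sign, digit = 1, int(last)  # plain digit: treated as unsigned
    return sign * int(body + str(digit))
```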
Formatting Numeric fields
Numeric fields have an extra edit field at the right of the screen. This is to allow for formatting of the output. By using this the number of decimal places may be determined and leading zeros displayed or suppressed.
The options are extremely variable and the command line is in the structure below
- #,4.2
where the symbols are as below
- If a £ or $ sign is shown, the money sign is added in front of the number.
- If # is set then leading zeros are displayed
- If a , is in the line, then significant numbers are broken down into groups of 3, separated by a comma, such as 1,234,567.12
- The final number gives the number of leading digits and decimal places. Thus 3.2 would be 3 leading digits and 2 decimal places.
- If the field is left blank, then no numeric formatting will take place.
- If the output field is too short, then significant digits are truncated.
Some examples:
- 3.4 100.1234
- #6.2 0000012.99
- #,6.2 123,456.55
- ,8.2 1,435.55
- 5.0 43241
TransTab (InterMedia for Windows only)
The TransTab option is the same as 'Copy Field', but an IMW translation table may be applied. The translation table is any IMW table and it performs a complete string and byte translation. Thus the output string may be longer than the input string. If too long for the output field, it will be truncated (from the right).
Typical applications could be case mapping (make all lower case, or all upper case) or handling accented characters, or different EBCDIC conversions.
A maximum of 8 different translation tables may be defined within a single Record Reformatter table. There is no limit on the number of fields that may be converted.
IB Field
This will insert an IntelliBase / IntelliBase 95 field marker. The parameter should be a number between 1 and 9999. The output data is always a 4-digit number, preceded by a 0EH, and followed by a 0FH. The length should always be 6.
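The marker layout can be sketched as below. The name `ib_field_marker` is hypothetical, and the sketch assumes the 4-digit number is written as zero-padded ASCII digits, which the text implies but does not state.

```python
def ib_field_marker(number: int) -> bytes:
    """Build a 6-byte field marker: 0x0E, four ASCII digits, 0x0F."""
    if not 1 <= number <= 9999:
        raise ValueError("field number must be between 1 and 9999")
    return b"\x0e" + f"{number:04d}".encode("ascii") + b"\x0f"
```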