Metadata-Version: 2.1
Name: pycobol2csv
Version: 1.0.6
Summary: A Python library to convert COBOL ebcdic file to CSV format based on copybook
Home-page: https://github.com/jasonli-lijie/pycobol2csv
Download-URL: https://github.com/user/reponame/archive/v_01.tar.gz
Author: Jason Li
Author-email: niomobileapp@gmail.com
License: MIT
Project-URL: Bug Tracker, https://github.com/jasonli-lijie/pycobol2csv/issues
Keywords: COBOL,EBCDIC,CSV
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Build Tools
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: ebcdic

# pycobol2csv
pycobol2csv is a Python library that converts COBOL EBCDIC files to CSV format. The package is built to handle advanced features in COBOL copybooks such as *OCCURS x TIMES*, *BINARY*, and *COMP*.

The CSV output is RDBMS friendly: all headers are ready to be used as database column names.
CSV conversion is controlled by the config file *csv_config.json*.
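
As a rough illustration of why *BINARY* and *COMP* fields need special handling: they hold raw binary integers rather than EBCDIC characters, so they cannot simply be decoded as text. The sketch below is not taken from pycobol2csv. It assumes the common IBM mainframe convention of big-endian two's-complement storage, and `decode_comp` is a hypothetical helper name.

```python
def decode_comp(raw: bytes, signed: bool = True) -> int:
    """Decode a COMP/BINARY field, assuming big-endian two's complement."""
    return int.from_bytes(raw, byteorder="big", signed=signed)


# Assumption: a PIC S9(4) COMP field occupies 2 bytes.
positive = decode_comp(b"\x04\xd2")
negative = decode_comp(b"\xff\x85")
```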

- [x] Update in version 1.0.6

Microsoft recently upgraded Spark in Synapse from version 3.2 to 3.4, which in turn upgraded the bundled Python from 3.8 to 3.10, a version with known issues in the csv writer.

Anyone on Python 3.10 should **upgrade to pycobol2csv 1.0.6 or above ASAP**; otherwise the conversion may fail with the error *_csv.Error: need to escape, but no escapechar set*.

Other Python versions (such as 3.8 and 3.11) have shown no such problem so far.
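
The error message itself comes from Python's standard `csv` module. The sketch below reproduces it in isolation, assuming a writer configured with `quoting=csv.QUOTE_NONE`. It does not show pycobol2csv's internals or the Python 3.10 specific trigger, only what the message means and how setting `escapechar` avoids it.

```python
import csv
import io

row = ["a,b", "c"]  # the first field contains the delimiter

# Without quoting, the writer must escape the comma but has no escapechar.
try:
    csv.writer(io.StringIO(), quoting=csv.QUOTE_NONE).writerow(row)
    error_message = None
except csv.Error as exc:
    error_message = str(exc)

# Supplying an escapechar lets the same row be written.
buffer = io.StringIO()
csv.writer(buffer, quoting=csv.QUOTE_NONE, escapechar="\\").writerow(row)
escaped_line = buffer.getvalue()
```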

- [x] Update in version 1.0.5

Added more enhancements for handling the outdated REDEFINES and PIC syntax, requested for a new client.



#### Install the Python module:

`pip install pycobol2csv`

#### To use the module:

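```
from pycobol2csv import convert_cobol_file, decode_copybook_file

# Parse the copybook to get the record length and the field structure
row_length, cobol_struc = decode_copybook_file(copybook_file)

# Convert the EBCDIC data file to CSV based on the copybook
convert_cobol_file(copybook_file, data_file, output_file, config_file, codepage, debug=False)
```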

- copybook_file: copybook filename
- data_file: data filename 
- output_file: output csv filename
- config_file: csv configuration filename, refer to csv_config.json
- codepage: code page of the EBCDIC data, refer to https://docs.python.org/3.7/library/codecs.html#standard-encodings for details
- debug: enable for more debug information, default is False
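
To get a feel for the `codepage` argument, the following sketch decodes a few EBCDIC bytes with Python's built-in `cp037` codec. Which code page your data actually uses depends on the source system, so `cp037` here is only an assumption for illustration.

```python
import codecs

# Confirm the codec name is known to Python before passing it on.
codec_name = codecs.lookup("cp037").name

# "HELLO" encoded in EBCDIC code page 037 (assumed code page).
ebcdic_bytes = b"\xc8\xc5\xd3\xd3\xd6"
text = ebcdic_bytes.decode("cp037")
```

Decoding the same bytes with the wrong code page usually gives unreadable text rather than an error, so it is worth checking a known field by eye.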

#### Test

Two sets of test data have been created from scratch. Each set includes a copybook and an EBCDIC data file.

To test:

```
python convert_cobol_test_main.py --copybook [COPYBOOK_FILE] --data [DATA_FILE] --output [CSV_FILE]

```

#### Known issues and limitations

- Be aware of the resources available in your runtime environment, and make sure the COBOL file is not so large that it exceeds them or causes performance issues.

To handle large COBOL files, you can split them into smaller chunks and then process the chunks in parallel. Please refer to the [Medium post](https://medium.com/@jasonli.lijie/process-large-cobol-files-efficiently-with-pycobol2csv-pycobol2parquet-f023533607e4) for details.
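
One possible way to split a file is sketched below. It assumes fixed-length records, with the record length taken from `row_length` as returned by `decode_copybook_file`. The function `split_fixed_records` and the chunk file naming are hypothetical and not part of pycobol2csv.

```python
from pathlib import Path


def split_fixed_records(data_file, record_length, records_per_chunk, out_dir):
    """Split a fixed-length record file into chunk files on record boundaries."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    # Reading whole multiples of the record length keeps records intact.
    chunk_size = record_length * records_per_chunk
    chunk_files = []
    with open(data_file, "rb") as source:
        index = 0
        while True:
            block = source.read(chunk_size)
            if not block:
                break
            chunk_file = out_path / f"chunk_{index:05d}.dat"
            chunk_file.write_bytes(block)
            chunk_files.append(chunk_file)
            index += 1
    return chunk_files
```

Each chunk file could then be passed to `convert_cobol_file` in its own worker process, for example through `concurrent.futures.ProcessPoolExecutor`, with one output CSV per chunk.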


<!-- Repo: https://github.com/jasonli-lijie/pycobol2csv -->
