Objective
To split large input files into smaller files based on unique Order_ID values and generate individual output files for each order. These files will be uploaded to the target SFTP location.
Problem Statement
Currently, the system generates large files (400MB–600MB), each containing records for multiple Order_IDs. To enhance processing efficiency and manageability, we need to split these files into smaller, more granular files — one per unique Order_ID.
Example: If the input file contains records with 3 unique Order_IDs, the output should include 3 separate files — each corresponding to one Order_ID — on the target SFTP location.
Proposed Solution Using Adeptia: Step-by-Step Implementation
1. Define Source File and Layout:
   a. Configure the source activity to read the input file (e.g., FTP Source).
   b. Create a Schema/Layout based on the structure of the input file.
2. Extract Unique Order_ID Values:
   a. Create another text layout containing only the Order_ID field. This layout is used solely for extracting distinct Order_IDs.
3. Create a Data Mapping for Unique Order_IDs:
   a. Source Layout: use the layout created in step 1.b above.
   b. Target Layout: use the layout created in step 2.a above.

   Mapping rules:
   - Apply a for-each at the record level.
   - Use the following predicate to filter unique values:
     [not(preceding::ORDER_ID = ORDER_ID)]
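Outside Adeptia, the effect of this predicate can be sketched in plain Python: a record's ORDER_ID is kept only when no earlier record carries the same value, so the first occurrence of each Order_ID survives. The sample records below are illustrative, not from the article:

```python
def distinct_order_ids(records):
    """Return ORDER_ID values in first-seen order, mirroring the
    [not(preceding::ORDER_ID = ORDER_ID)] predicate: a value is kept
    only when no preceding record already carried it."""
    seen = set()
    unique = []
    for record in records:
        order_id = record["ORDER_ID"]
        if order_id not in seen:
            seen.add(order_id)
            unique.append(order_id)
    return unique

records = [
    {"ORDER_ID": "A100", "ITEM": "pen"},
    {"ORDER_ID": "A200", "ITEM": "pad"},
    {"ORDER_ID": "A100", "ITEM": "ink"},
]
print(distinct_order_ids(records))  # → ['A100', 'A200']
```

This is the same "first occurrence wins" deduplication the mapping performs; in the process flow the result feeds the Data Splitter in the next step.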
4. Add a Data Splitter and Set Up a Gateway:
   a. Use a Data Splitter to iterate through each unique Order_ID and split the records accordingly.
   b. Configure a Gateway condition to control the flow for each Order_ID.
   c. Refer to this documentation for plugin configuration: https://support.adeptia.com/hc/en-us/articles/207875483-Data-Splitter-Custom-Plugin
5. Store the Order_ID in a Context Schema:
   a. Create a Data Mapping that stores each unique Order_ID into a context variable using a Context Schema.
   b. Source Layout: use the layout from step 2.a above.
   c. Target Layout: use a Context Schema with a variable named Order_ID.

   Mapping rules:
   - Map Order_ID → Order_ID in the context.
6. Transform Data per Order_ID:
   a. Create a new Data Mapping to transform the filtered records to JSON format.
   b. Source Layout: use the main input layout created in step 1.b above.
   c. Target Layout: use a JSON layout.

   Mapping rules:
   - At the root level of the target layout, create a local variable that fetches the current Order_ID from context:
     - Variable name: varGetOrderID
     - Variable value: get-context('Order_ID', '')
   - Apply a for-each at the record level of the JSON layout (here, the 'item' element).
   - Use the following predicate to select only the records belonging to the current Order_ID:
     $Input_Source_Layout/Root/Record[ORDER_ID = $varGetOrderID]
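The per-order mapping above can be sketched in Python: a dictionary stands in for the process-flow context (get-context), and a list comprehension plays the role of the Record[ORDER_ID = $varGetOrderID] predicate, with the matching records serialized under the JSON layout's 'item' element. The sample data and field names are illustrative:

```python
import json

def records_for_order(all_records, context):
    """Keep only the records whose ORDER_ID matches the value stored in
    the context (a stand-in for get-context('Order_ID', '')), then
    serialize them under the 'item' key of the JSON output."""
    var_get_order_id = context.get("Order_ID", "")
    items = [r for r in all_records if r["ORDER_ID"] == var_get_order_id]
    return json.dumps({"item": items}, indent=2)

records = [
    {"ORDER_ID": "A100", "ITEM": "pen"},
    {"ORDER_ID": "A200", "ITEM": "pad"},
    {"ORDER_ID": "A100", "ITEM": "ink"},
]
print(records_for_order(records, {"Order_ID": "A100"}))
```

Each pass of the Data Splitter would run this transformation once, with a different Order_ID in the context, producing one JSON payload per order.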
7. Write Output Files to the Target Location:
   a. Attach the target JSON layout.
   b. Configure the required target location (e.g., FTP Target).
   c. Save the Process Flow and execute it.
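Taken together, the steps above amount to "group by Order_ID, write one JSON file per group." A minimal end-to-end sketch, writing to a local directory (the final SFTP upload is a separate delivery step and is not shown; the sample records are illustrative):

```python
import json
import os
import tempfile

def split_by_order_id(records, out_dir):
    """Group records by ORDER_ID and write one JSON file per order,
    approximating the whole process flow in miniature. Uploading each
    file to the SFTP target would be an additional step."""
    grouped = {}
    for record in records:
        grouped.setdefault(record["ORDER_ID"], []).append(record)
    paths = []
    for order_id, items in grouped.items():
        path = os.path.join(out_dir, f"{order_id}.json")
        with open(path, "w") as fh:
            json.dump({"item": items}, fh, indent=2)
        paths.append(path)
    return paths

records = [
    {"ORDER_ID": "A100", "ITEM": "pen"},
    {"ORDER_ID": "A200", "ITEM": "pad"},
    {"ORDER_ID": "A100", "ITEM": "ink"},
    {"ORDER_ID": "A300", "ITEM": "tape"},
]
with tempfile.TemporaryDirectory() as out_dir:
    files = split_by_order_id(records, out_dir)
    print(len(files))  # 3 unique Order_IDs → 3 output files
```

This mirrors the expected behavior described below: three unique Order_IDs in the input yield three output files.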
Expected Output:
- Each iteration produces one JSON file containing the records for a single Order_ID.
- The files are uploaded to the specified target location.
Benefits of This Approach:
- Efficient handling of large files.
- Improved system performance.
- Simplified downstream processing.
- Modular, maintainable output with one file per Order_ID.
Please find the attached ZIP file containing a sample process flow for reference.