HD4DP v2 csv upload

Last updated: 2022-07-05 09:17

Introduction

This page explains how the CSVUploader feature works. The CSVUploader performs a bulk upload of records from a csv file: each row holds one record and represents one submission, so a user can add as many records as needed.

Architecture

The CSVUploader is located under hd-connect/csvuploader. It uses both the hd-connect-csvuploader and hd-connect-proxy modules.

The CSVUploader overall architecture is explained in the sequence diagram below.

3rd party libraries & frameworks

  • Apache Camel : https://camel.apache.org/
  • Spring Boot :

Testing and functioning

  • The CSVUploader creates, at root level (SFTP for the end user, or hd-all for the developer), a folder that contains one subfolder per existing organization.
  • In each organization folder, one folder per DCD is created.
  • To test the CSVUploader, the tester puts a csv test file in the folder that matches the organization and the DCD concerned.
  • The CSVUploader polls with a delay of 1 min, processes the csv file and then creates 3 folders:
    • ARCHIVE folder: contains the source csv file
    • RESULT folder: contains the result file of the csv file processing. This file repeats the submitted data and gives the final status of the processing: Success or Error. If an error occurred, the error message is displayed. When several files are uploaded, each result is appended to the end of the result file.
    • ERROR folder: this folder is created if the csv test file could not be parsed due to an I/O error (file corrupted, not found, etc.). For now, only technical errors are caught; in that case the source csv file is moved to this folder instead of the ARCHIVE folder. Eventually, this folder should contain every result that is an error, and the RESULT folder should contain only results that end with a SUCCESS status.
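
The folder handling described above can be pictured with a small local simulation. This is only a sketch under stated assumptions, not the Camel route used by the CSVUploader: the helper names `dcd_folder` and `route_processed_file`, the `parsed_ok` flag and the organization name are hypothetical. Only the ARCHIVE and ERROR folder names and the organization/DCD nesting come from this page.

```python
import shutil
import tempfile
from pathlib import Path


def dcd_folder(upload_root: Path, organization: str, dcd: str) -> Path:
    """Return <upload_root>/<organization>/<dcd>, where a csv file is dropped."""
    return upload_root / organization / dcd


def route_processed_file(csv_file: Path, parsed_ok: bool) -> Path:
    """Move the source csv to ARCHIVE (parsed) or ERROR (technical failure)."""
    target_dir = csv_file.parent / ("ARCHIVE" if parsed_ok else "ERROR")
    target_dir.mkdir(exist_ok=True)
    moved_to = shutil.move(str(csv_file), str(target_dir / csv_file.name))
    return Path(moved_to)


# Demo in a throwaway directory; the organization name is made up.
with tempfile.TemporaryDirectory() as tmp:
    folder = dcd_folder(Path(tmp), "org-demo", "testDCD01")
    folder.mkdir(parents=True)
    source = folder / "dwhTestDCD.csv"
    source.write_text("field_a,field_b\n1,2\n", encoding="utf-8")
    archived = route_processed_file(source, parsed_ok=True)
    archived_relative = archived.relative_to(folder).as_posix()
    print(archived_relative)
```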

Sample test files are available :

  • dwhTestDCD.csv
  • eHealthTestDCD_with_repeatables.csv
  • eHealthTestDCD.csv

A test file is defined with this structure :

  • first row: the columns of the DCD. Each column corresponds to a field in the DCD.
  • rows 2 to N: one record per line. Each cell contains the value of that record for the field named in the column header.
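
As an illustration of this structure, the following sketch builds a small csv text with Python's standard `csv` module. The field names `field_a`, `field_b` and `field_c` are hypothetical placeholders for real DCD field names, and the comma delimiter is an assumption, since this page does not state which column delimiter the CSVUploader expects.

```python
import csv
import io

# Hypothetical DCD field names; real files use the field names of the DCD.
header = ["field_a", "field_b", "field_c"]

# One dictionary per record, i.e. one row per submission.
records = [
    {"field_a": "68545", "field_b": "13245", "field_c": "1"},
    {"field_a": "68548", "field_b": "13245", "field_c": "1"},
]

buffer = io.StringIO()
# Assumption: comma-delimited columns.
writer = csv.DictWriter(buffer, fieldnames=header, lineterminator="\n")
writer.writeheader()        # first row: the columns of the DCD
writer.writerows(records)   # rows 2 to N: one record per line
csv_text = buffer.getvalue()
print(csv_text)
```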

Formats

Some formats are specific :

  • Dates : should be dd/mm/yyyy
  • Boolean : true / false
  • Codes : the value of the code (not the translation)
  • Multi codes: there is only one column per field, so when a select box allows multiple values, the values have to be separated by a "|", e.g. 68452|68453|68454
  • Repeatable blocks: in some DCDs, a complete block of fields is repeatable. In that case, the values have to be separated by a ";".
    • e.g.: a block contains 3 fields: A (Lob), B (Type klep) and C (Aantal kleppen).

The block is repeated once by clicking on the "Add another" button. In the CSV, there is still only one column for each field.

If for the first block, values are :

  • A -> 68545 (=RLL)
  • B -> 13245 (=38101000053)
  • C -> 1

and for the second block, values are :

  • A -> 68548 (=LLL)
  • B -> 13245 (=38101000053)
  • C -> 1

In the CSV file it will result in the following :

  • Column A : 68545;68548
  • Column B : 13245;13245
  • Column C : 1;1
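
The formats above can be produced with a few small helpers. This is a minimal sketch: the function names are hypothetical, and zero-padded days and months for dd/mm/yyyy are an assumption. The last lines rebuild the repeatable block example shown above.

```python
from datetime import date


def format_date(value: date) -> str:
    """Dates: dd/mm/yyyy (zero-padded day and month assumed)."""
    return value.strftime("%d/%m/%Y")


def format_boolean(value: bool) -> str:
    """Booleans: true / false."""
    return "true" if value else "false"


def format_multi_code(codes) -> str:
    """Multi codes: code values of one field joined with '|'."""
    return "|".join(str(code) for code in codes)


def format_repeatable(values) -> str:
    """Repeatable blocks: one value per block, joined with ';'."""
    return ";".join(str(value) for value in values)


# The two blocks of the example above, one dictionary per block.
blocks = [
    {"A": 68545, "B": 13245, "C": 1},
    {"A": 68548, "B": 13245, "C": 1},
]
columns = {
    name: format_repeatable(block[name] for block in blocks)
    for name in ("A", "B", "C")
}
print(columns)
```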

It is possible to mix multi-select values and repeatable blocks (when a multi-select box sits inside a block that can be repeated). The result looks as follows:

If for the first block, values are :

  • A -> 68545|68944|68946
  • B -> 1
  • C -> 2

and for the second block, values are :

  • A -> 78945|78950
  • B -> 3
  • C -> 4

In the CSV file it will result in the following :

  • Column A : 68545|68944|68946;78945|78950
  • Column B : 1;3
  • Column C : 2;4
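
To check a row before uploading it, the mixed notation can be read back into blocks. The sketch below is a reading aid, not the parser used by the CSVUploader: the function names are hypothetical, and the rule that every column of a repeatable block must hold the same number of blocks is an assumption.

```python
from typing import Dict, List


def parse_cell(cell: str) -> List[List[str]]:
    """Split a cell into blocks on ';', then each block into values on '|'."""
    return [block.split("|") for block in cell.split(";")]


def rebuild_blocks(row: Dict[str, str]) -> List[Dict[str, List[str]]]:
    """Turn one csv row into a list of blocks, each mapping column -> values."""
    parsed = {column: parse_cell(cell) for column, cell in row.items()}
    block_counts = {len(blocks) for blocks in parsed.values()}
    if len(block_counts) != 1:
        # Assumption: all columns of a repeatable block repeat equally often.
        raise ValueError("columns do not hold the same number of blocks")
    count = block_counts.pop()
    return [
        {column: parsed[column][index] for column in parsed}
        for index in range(count)
    ]


row = {"A": "68545|68944|68946;78945|78950", "B": "1;3", "C": "2;4"}
blocks = rebuild_blocks(row)
for block in blocks:
    print(block)
```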

Process to upload the csv file of DCDs and its verification in HD4DP v2.0

NOTE1: The IT service on the Data Provider's side will make sure that the csv files get into the correct folder of the relevant DCD.
NOTE2: The CSV files that are extracted from the databases by the IT services of the Data Providers must use the UTF-8 character set.
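
One way to verify the character set before uploading is sketched below. The helper names `is_utf8` and `convert_to_utf8` are hypothetical, and the Latin-1 source encoding in the demo is only an example of what a database export might use.

```python
import tempfile
from pathlib import Path


def is_utf8(path: Path) -> bool:
    """Return True if the file content decodes cleanly as UTF-8."""
    try:
        path.read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def convert_to_utf8(path: Path, source_encoding: str) -> None:
    """Rewrite the file in UTF-8, given its current encoding."""
    text = path.read_bytes().decode(source_encoding)
    path.write_bytes(text.encode("utf-8"))


# Demo: a Latin-1 export is detected and converted.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "export.csv"
    sample.write_bytes("naam\nJosé\n".encode("latin-1"))
    before = is_utf8(sample)
    convert_to_utf8(sample, "latin-1")
    after = is_utf8(sample)
    print(before, after)
```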

Uploading the csv file into the correct DCD folder

The DevOps team of Sciensano will provide the Data Providers with an SFTP user and password. With these credentials, the IT services of the Data Providers will be able to connect to the upload folders of Sciensano.

The main folder structure of the csv upload folder will look as such:

Each folder contains subfolders, one for each DCD.

Find below an overview of the DCDs that can be uploaded.

  • Handhygiene-1:
    • PRE campaign
    • POST campaign
    • IN & OUTSIDE campaign
  • MEARaxone-2
    • meaRaxone
  • Endobronchialvalve-4 (Zephyr):
    • Primo-implantation
    • Replacement
    • Follow-up
  • Orthopridehip-7:
    • Primo-implantation
    • Revision
    • Resection
  • Orthoprideknee-8:
    • Primo-implantation
    • Revision
    • Resection
  • Orthopridetotalfemur-9:
    • Primo-implantation
    • Revision
    • Resection
  • Spine Tango:
    • Intake
    • Conservative treatment
    • Surgery
    • Patient questionnaire
  • Surgical and percutaneous heart valves:
    • Implantation
    • Follow-up
  • TestDCD_PROJECT-6
    • testDCD01
    • testDCD02

This example shows the different DCDs of an Orthopride Knee project.

The csv files are placed in the folder of the DCD that we want to upload. When we double-click the DCD folder, it is either empty or contains folders with the following names:

  • ARCHIVE (after a csv file has been processed, the original csv file is saved in this folder)
  • RESULT (when the csv file has been processed, a file is created, or appended to, with the result of the upload of the csv file)
  • ERROR (when the csv file contains erroneous formatting, the csv file is not processed and an error file is created, or appended to, with the errors and the reason why the csv file could not be processed)

Example: I want to upload an Orthopride Knee Primo-implantation DCD, so I place the Orthopride Knee Primo-implantation csv file into the dcd-21-v-1-kneePrimoImplantation folder.

  • Open the dcd-21-v-1-kneePrimoImplantation folder
  • Put the orthopride knee primo implantation csv file into the folder
  • Wait until the file has been processed (the file will disappear from this folder if it has been processed - don't forget to click on the refresh button)
  • Go into the RESULT folder and refresh the folder to update the file with the latest changes
  • Double click the file to open it and to read the result of the upload process
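
The "wait until the file has been processed" step can also be scripted. The sketch below polls a folder listing until the csv file has disappeared. It is not part of HD4DP: the function name, the timeout and the interval are assumptions (the interval only needs to be shorter than the 1 min polling delay of the CSVUploader). The `listdir` argument is any callable that returns the file names in a folder: `os.listdir` for a local folder, or the `listdir` method of an SFTP client object such as paramiko's `SFTPClient`, if that library is used.

```python
import os
import tempfile
import time
from typing import Callable, List


def wait_until_processed(
    listdir: Callable[[str], List[str]],
    folder: str,
    file_name: str,
    timeout_s: float = 180.0,
    interval_s: float = 10.0,
) -> bool:
    """Return True once file_name is gone from folder, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        if file_name not in listdir(folder):
            return True  # the file was picked up and moved
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Local demo; the csv file name is hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    name = "kneePrimoImplantation.csv"
    open(os.path.join(tmp, name), "w").close()
    first_check = wait_until_processed(os.listdir, tmp, name, timeout_s=0, interval_s=0)
    os.remove(os.path.join(tmp, name))  # simulate the uploader moving the file
    processed = wait_until_processed(os.listdir, tmp, name, timeout_s=0, interval_s=0)
    print(first_check, processed)
```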

The status is the most important line: it indicates whether the upload was a success or not.

Verifying the upload of the file into the correct DCD folder

Now log in to HD4DP, go to the relevant DCD and check the data of the processed file.