Flexible Data Loader


  1. Log in to OpenEMPI and select File > File List. The User File Management system opens.
  2. Enter a Name for the file in the space provided. 
  3. Click the Browse button and locate the .csv file that contains the persons you want to import into OpenEMPI.
    Sample data file in the "data" directory: pixpdq2013_connectathon_patients_v3.csv
  4. Click the Upload button. The uploaded file is displayed in the file list.

  5. You can now import the file. Select the check box for the file you want to import.
  6. Click Import. The File Loader dialog box opens so that you can choose the file loader. Select flexibleDataLoader from the File Loader drop-down list.
     
  7. The File Loader dialog box is where you configure the file loading process. Selecting the Flexible Data Loader (flexibleDataLoader) from the drop-down list adjusts the list of configuration options to those available for this loader. The first option indicates whether the first row in the file is a header row that should be skipped.

  8. The next option specifies whether this is an import process. OpenEMPI differentiates between adding and importing records. When you import a record, it is simply added to the repository and no further processing is performed on it. When you add a record, the system adds it to the repository and also invokes the matching algorithm to determine whether any other records in the repository refer to the same person and whether they should be linked.

  9. The bulk import option tells the system to optimize the import process. Use it when importing large numbers of records, but not while the system is in production mode and processing incoming transactions from any of the API endpoints, because the bulk import process severely affects the processing of those requests.

  10. The Mapping File Name entry box is used for specifying the mapping file. The system assumes that the mapping file is present in the OpenEMPI configuration directory (conf) in the server’s OpenEMPI home directory so the mapping file name entered should be just the filename and not an explicit path. This is done for security reasons.
  11. The last checkbox tells the system to simulate the import process without actually loading the records onto the server. With this option enabled, you can verify that the mapping file works properly by reviewing the messages in the log file.
  12. Click the Import button on the File Loader dialog box to start the import process. After the process is finished, a success message is displayed. The number of rows processed and imported is displayed for the file. After the import operation is initiated you can check the openempi.log file to ensure that the process is progressing normally and to review any errors that may have occurred.
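
Step 10 expects a bare file name rather than a path. As an illustration only, the following Python sketch shows one way to check a name before entering it in the dialog box. The helper `is_bare_filename` is hypothetical and is not part of OpenEMPI; the checks that OpenEMPI itself performs are not described on this page.

```python
from pathlib import PurePosixPath, PureWindowsPath


def is_bare_filename(name: str) -> bool:
    """Return True if `name` has no directory or drive component.

    The name is checked under both POSIX and Windows path rules so that
    neither '/' nor '\\' separators slip through.
    """
    if not name or name in (".", ".."):
        return False
    return (
        PurePosixPath(name).name == name
        and PureWindowsPath(name).name == name
    )


# A plain file name is accepted; anything with a directory part is not.
print(is_bare_filename("file-loader-map-connectathon-v3-preconnectathon.xml"))
print(is_bare_filename("conf/file-loader-map-connectathon-v3-preconnectathon.xml"))
```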


 

Mapping File Configuration

The data file format and the mapping file configuration can be customized to suit your requirements. The sample data file used for uploading (pixpdq2013_connectathon_patients_v3.csv) is shown below:

Patient ID,Family Name,Given Name,Mother Maiden Name,DOB,Admin Sex,Street,City,State,Zip,Home Phone,,SSN 
GK-891:NIST2010:2.16.840.1.113883.3.72.5.9.1:ISO,GREGORYX,KELLY,JACKSONX,19261215,F,378 Gregory Lane,LOUISVILLE,KY,40202,502,566-5945,401-25-4355 
GD-951:NIST2010:2.16.840.1.113883.3.72.5.9.1:ISO,GREGORYX,DAVID,JACKSONX,19840515,M,4646 Hayhurst Lane,LIVONIA,MI,48150,248,819-3049,382-31-8194 
GD-698:NIST2010:2.16.840.1.113883.3.72.5.9.1:ISO,GREGORYX,DANNY,JACKSONX,19291015,M,20 Point Street,CHICAGO,IL,60606,773,831-8091,336-07-2225 
HJ-361:NIST2010:2.16.840.1.113883.3.72.5.9.1:ISO,HINOJOXS,JOYCE,GROSSX,19671214,F,2183 Radio Park Drive,ATLANTA,GA,30303,706,750-4736,258-10-2602 
BJ-516:NIST2010:2.16.840.1.113883.3.72.5.9.1:ISO,BRACYX,JANE,DOXEZ,19361218,F,1905 Romrog Way,ROCK SPRINGS,WY,82901,307,887-5902,520-04-4608 
SM-942:NIST2010:2.16.840.1.113883.3.72.5.9.1:ISO,STAMXM,MARK,FAUCHERX,19850621,M,3277 Kenwood Place,POMPANO BEACH,FL,33060,954,530-6237,267-32-9501 
WL-251:NIST2010:2.16.840.1.113883.3.72.5.9.1:ISO,WILXLIS,LISA,GARDNXER,19641030,F,4168 Brannon Avenue,JACKSONVILLE,FL,32044,904,777-8423,771-18-2570 
WV-941:NIST2010:2.16.840.1.113883.3.72.5.9.1:ISO,WILXLIS,VIRGINIA,FLIMINXG,19341012,F,159 Pearl Street,SACRAMENTO,CA,94260,916,283-4592,570-49-5327 
WP-410:NIST2010:2.16.840.1.113883.3.72.5.9.1:ISO,WILXLIS,PHILLIP,BEARXD,19550201,M,1691 Cook Hill Road,BRIDGEPORT,CT,06604,203,500-5517,040-22-7380

 

In this file, Patient ID is an identifier made up of four components separated by ':' (colon). The Field Names for the corresponding Columns are obtained from the Person.java model file (OpenEMPI core module).
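
To make the structure of the Patient ID column concrete, here is a minimal Python sketch that splits one value from the sample file on the colon. It is not OpenEMPI code. The four component names follow the subfield names used in the mapping file on this page, rewritten in Python naming style; `Identifier` and `parse_patient_id` are names chosen for this example.

```python
from typing import NamedTuple


class Identifier(NamedTuple):
    """The four colon-separated components of a Patient ID value."""
    identifier: str
    namespace_identifier: str
    universal_identifier: str
    universal_identifier_type_code: str


def parse_patient_id(value: str, delimiter: str = ":") -> Identifier:
    """Split a Patient ID into its four components, or raise ValueError."""
    parts = value.split(delimiter)
    if len(parts) != 4:
        raise ValueError(
            f"expected 4 components, got {len(parts)}: {value!r}"
        )
    return Identifier(*parts)


example = parse_patient_id("GK-891:NIST2010:2.16.840.1.113883.3.72.5.9.1:ISO")
print(example.identifier)            # GK-891
print(example.universal_identifier)  # 2.16.840.1.113883.3.72.5.9.1
```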

Data File Format for the uploaded file:

No. of Columns: 13
Delimiter: Comma

Column Index | Column Name        | Field Name        | Data Type
-------------|--------------------|-------------------|----------
1            | Patient ID         | identifier        | String
2            | Family Name        | familyName        | String
3            | Given Name         | givenName         | String
4            | Mother Maiden Name | mothersMaidenName | String
5            | DOB                | dateOfBirth       | Date
6            | Admin Sex          | gender            | String
7            | Street             | address1          | String
8            | City               | city              | String
9            | State              | state             | String
10           | Zip                | postalCode        | String
11           | Home Phone         | phoneAreaCode     | String
12           | (no header)        | phoneNumber       | String
13           | SSN                | ssn               | String
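
Before uploading a file, it can help to confirm that every row has the 13 columns described above. The following Python sketch is one way to do that, using only the standard `csv` module; it is not part of OpenEMPI, and `find_bad_rows` and `to_record` are names chosen for this example. The sample lines are the header and first record of the sample file.

```python
import csv

# Field names in column order, taken from the Data File Format description.
FIELD_NAMES = [
    "identifier", "familyName", "givenName", "mothersMaidenName",
    "dateOfBirth", "gender", "address1", "city", "state", "postalCode",
    "phoneAreaCode", "phoneNumber", "ssn",
]

# Header and first record of the sample file.
SAMPLE_LINES = [
    "Patient ID,Family Name,Given Name,Mother Maiden Name,DOB,Admin Sex,"
    "Street,City,State,Zip,Home Phone,,SSN",
    "GK-891:NIST2010:2.16.840.1.113883.3.72.5.9.1:ISO,GREGORYX,KELLY,JACKSONX,"
    "19261215,F,378 Gregory Lane,LOUISVILLE,KY,40202,502,566-5945,401-25-4355",
]


def find_bad_rows(lines, expected_columns=len(FIELD_NAMES)):
    """Return (line_number, column_count) for each row of the wrong width."""
    return [
        (line_no, len(row))
        for line_no, row in enumerate(csv.reader(lines), start=1)
        if len(row) != expected_columns
    ]


def to_record(line):
    """Pair each value of one data line with its field name."""
    row = next(csv.reader([line]))
    return dict(zip(FIELD_NAMES, row))


print(find_bad_rows(SAMPLE_LINES))  # [] when every row has 13 columns
record = to_record(SAMPLE_LINES[1])
print(record["phoneAreaCode"], record["phoneNumber"])  # 502 566-5945
```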

 

For this file format, the mapping file configuration is given below (file-loader-map-connectathon-v3-preconnectathon.xml). The delimiter and header are configured with the following attributes of the file-loader-map element (note that the schema spells the attribute name "delimeter"):

  • delimeter and header-first-line.

Since the first field is the identifier, which consists of four sub fields separated by a colon, it is configured accordingly using:

  • is-identifier, one-to-many, datatype and delimeter (field)
  • datatype, column-index and field-name (sub fields)

Subsequent fields are configured using the values for datatype, column-index and field-name from the Data File Format table above. Fields of data type Date can be formatted using the date-format-string attribute.
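
The sample mapping file uses the value yyyyMMdd for date-format-string. That pattern resembles the Java SimpleDateFormat notation, although this page does not state which date library OpenEMPI uses, so treat that as an assumption. The Python sketch below checks that a DOB value matches the same year-month-day layout before you import the file; `parse_dob` is a name chosen for this example and the `%Y%m%d` format is the Python counterpart of the pattern, not something OpenEMPI reads.

```python
from datetime import date, datetime


def parse_dob(value: str) -> date:
    """Parse an 8-digit yyyyMMdd date, or raise ValueError."""
    # strptime alone accepts short forms such as "1926121", so the
    # length and digit checks keep the match strict.
    if len(value) != 8 or not value.isdigit():
        raise ValueError(f"expected 8 digits (yyyyMMdd), got {value!r}")
    return datetime.strptime(value, "%Y%m%d").date()


print(parse_dob("19261215"))  # 1926-12-15
```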

file-loader-map-connectathon-v3-preconnectathon.xml
<?xml version="1.0" encoding="UTF-8"?> 
<file-loader-map 
    xsi:schemaLocation="http://configuration.openempi.openhie.org/fileloadermap fileloadermap.xsd" 
    xmlns="http://configuration.openempi.openhie.org/fileloadermap" 
    xmlns:fl="http://configuration.openempi.openhie.org/fileloadermap" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    delimeter="," 
    header-first-line="true"> 
    <fields> 
        <field 
            datatype="String" 
            one-to-many="true" 
            is-identifier="true" 
            delimeter=":"> 
            <subfields> 
                <field 
                    datatype="String"> 
                    <column-index>1</column-index> 
                    <field-name>identifier</field-name> 
                </field> 
                <field 
                    datatype="String"> 
                    <column-index>2</column-index> 
                    <field-name>namespaceIdentifier</field-name> 
                </field> 
                <field 
                    datatype="String"> 
                    <column-index>3</column-index> 
                    <field-name>universalIdentifier</field-name> 
                </field> 
                <field 
                    datatype="String"> 
                    <column-index>4</column-index> 
                    <field-name>universalIdentifierTypeCode</field-name> 
                </field> 
            </subfields> 
        </field> 
        <field 
            datatype="String"> 
            <column-index>2</column-index> 
            <field-name>familyName</field-name> 
        </field> 
        <field 
            datatype="String"> 
            <column-index>3</column-index> 
            <field-name>givenName</field-name> 
        </field> 
        <field 
            datatype="String"> 
            <column-index>4</column-index> 
            <field-name>mothersMaidenName</field-name> 
        </field> 
        <field 
            datatype="Date" 
            date-format-string="yyyyMMdd"> 
            <column-index>5</column-index> 
            <field-name>dateOfBirth</field-name> 
        </field>         
        <field 
            datatype="String"> 
            <column-index>6</column-index> 
            <field-name>gender</field-name> 
        </field>         
        <field 
            datatype="String"> 
            <column-index>7</column-index> 
            <field-name>address1</field-name> 
        </field> 
        <field 
            datatype="String"> 
            <column-index>8</column-index> 
            <field-name>city</field-name> 
        </field>         
        <field 
            datatype="String"> 
            <column-index>9</column-index> 
            <field-name>state</field-name> 
        </field> 
        <field 
            datatype="String"> 
            <column-index>10</column-index> 
            <field-name>postalCode</field-name> 
        </field> 
        <field 
            datatype="String"> 
            <column-index>11</column-index> 
            <field-name>phoneAreaCode</field-name> 
        </field> 
        <field 
            datatype="String"> 
            <column-index>12</column-index> 
            <field-name>phoneNumber</field-name> 
        </field> 
        <field 
            datatype="String"> 
            <column-index>13</column-index> 
            <field-name>ssn</field-name> 
        </field> 
    </fields> 
</file-loader-map>

 

If a Column in the data file needs to be ignored, it can be configured as follows:

<field is-ignored="true">
  <column-index>3</column-index>
  <field-name/>
</field>
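
To review which columns a mapping file loads and which it skips, the file can be read with any XML parser. The following Python sketch uses the standard `xml.etree.ElementTree` module on a shortened, hypothetical mapping that combines a normal field, an ignored column and a date field. It only lists what the file declares; it is not how OpenEMPI processes the mapping, and `summarize_mapping` is a name chosen for this example.

```python
import xml.etree.ElementTree as ET

# Namespace used by the mapping files on this page.
NS = {"fl": "http://configuration.openempi.openhie.org/fileloadermap"}

# Shortened mapping for illustration; not one of the shipped files.
MAPPING = """<file-loader-map
    xmlns="http://configuration.openempi.openhie.org/fileloadermap"
    delimeter="," header-first-line="true">
  <fields>
    <field datatype="String">
      <column-index>2</column-index>
      <field-name>familyName</field-name>
    </field>
    <field is-ignored="true">
      <column-index>3</column-index>
      <field-name/>
    </field>
    <field datatype="Date" date-format-string="yyyyMMdd">
      <column-index>5</column-index>
      <field-name>dateOfBirth</field-name>
    </field>
  </fields>
</file-loader-map>
"""


def summarize_mapping(xml_text):
    """Return (column_index, field_name, is_ignored) for each top-level field."""
    root = ET.fromstring(xml_text)
    summary = []
    for field in root.findall("fl:fields/fl:field", NS):
        index_text = field.findtext("fl:column-index", namespaces=NS)
        index = int(index_text) if index_text else None
        name = field.findtext("fl:field-name", default="", namespaces=NS)
        # xsd:boolean allows both "true" and "1" for a true value.
        ignored = field.get("is-ignored", "false") in ("true", "1")
        summary.append((index, (name or "").strip() or None, ignored))
    return summary


for index, name, ignored in summarize_mapping(MAPPING):
    print(index, name, "ignored" if ignored else "loaded")
```

Fields that have no column-index of their own, such as the identifier field with subfields in the sample mapping file, are reported with an index of None.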

 

 

Schema

Mapping File Format is defined by the XML Schema file-loader-map.xsd. The current version of the schema is shown below:

<xsd:schema
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:fl="http://configuration.openempi.openhie.org/fileloadermap"
    targetNamespace="http://configuration.openempi.openhie.org/fileloadermap"
    elementFormDefault="qualified">

    <xsd:annotation>
        <xsd:documentation xml:lang="en">
            File loader mapping objects used to configure the import of data from a
            file and map them into the schema
            of OpenEMPI
        </xsd:documentation>
    </xsd:annotation>

    <xsd:element name="file-loader-map">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="fields" type="fl:FieldsType" minOccurs="1"/>
            </xsd:sequence>
            <xsd:attribute name="delimeter" type="xsd:token" />
            <xsd:attribute name="header-first-line" type="xsd:boolean"/>
            <xsd:attribute name="training-data-extractor" type="xsd:string"/>
        </xsd:complexType>
    </xsd:element>

    <xsd:complexType name="SubFieldsType">
        <xsd:sequence minOccurs="1" maxOccurs="unbounded">
            <xsd:element name="field" type="fl:FieldType"/>
        </xsd:sequence>
    </xsd:complexType>

    <xsd:complexType name="FieldsType">
        <xsd:sequence minOccurs="1" maxOccurs="unbounded">
            <xsd:element name="field" type="fl:FieldType"/>
        </xsd:sequence>
    </xsd:complexType>
    
    <xsd:complexType name="FieldType">
        <xsd:sequence>
            <xsd:element name="column-index" minOccurs="1" type="xsd:integer"/>
            <xsd:element name="field-name" minOccurs="0" type="xsd:token"/>
            <xsd:element name="subfields" minOccurs="0" type="fl:SubFieldsType"/>
        </xsd:sequence>
        <xsd:attribute name="datatype" type="fl:datatype"/>
        <xsd:attribute name="date-format-string" type="xsd:string"/>
        <xsd:attribute name="delimeter" type="xsd:token"/>
        <xsd:attribute name="enclosing-character" type="xsd:string"/>
        <xsd:attribute name="is-cluster-id" type="xsd:boolean" default="false"/>
        <xsd:attribute name="is-identifier" type="xsd:boolean" default="false"/>
        <xsd:attribute name="is-ignored" type="xsd:boolean" default="false"/>
        <xsd:attribute name="identifier-domain-name" type="xsd:string"/>
        <xsd:attribute name="namespace-identifier" type="xsd:string"/>
        <xsd:attribute name="one-to-many" type="xsd:boolean" default="false"/>
        <xsd:attribute name="universal-identifier" type="xsd:string"/>
        <xsd:attribute name="universal-identifier-type-code" type="xsd:string"/>
    </xsd:complexType>
    
    <xsd:simpleType name="datatype">
        <xsd:restriction base="xsd:string">
            <xsd:enumeration value="String"/>
            <xsd:enumeration value="Date"/>
            <xsd:enumeration value="Integer"/>
        </xsd:restriction>
    </xsd:simpleType>
</xsd:schema>

 

 

 
