Release 4.3.1
Upgraded the version of various libraries to eliminate security vulnerabilities that were recently reported
Added a framework for evaluating the performance of the blocking algorithm against a ground-truth dataset where the set of duplicate records is known
Added a framework for evaluating the performance of the matching algorithm against a ground-truth dataset where the set of duplicate records is known
Reduced the startup time of the server, especially for sites with millions of records in the repository
Fixed an issue with the operation to start and stop the listener service for the HL7v2 PIX/PDQ protocol support
Fixed an issue where, after adding a new entity using an import statement, you were unable to create a custom field against it
Fixed an issue where, after switching to the probabilistic matching algorithm, you were unable to use a custom field as one of the matching fields until the server was restarted
Fixed an issue with the cache manager resetting the connection to the configuration database
Fixed an issue where, after deleting an entity, the configuration of the blocking and matching algorithms associated with that entity was not being deleted
Added support for quickly loading the configuration of an entity from a JSON backup to help support customers that need assistance with the configuration of their instances
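The two ground-truth evaluation frameworks listed for this release can be pictured with the standard record-linkage metrics: pair completeness and reduction ratio for blocking, and precision, recall, and F1 for matching. The sketch below is a minimal illustration under stated assumptions. The function names, the data layout, and the choice of metrics are hypothetical and are not taken from OpenEMPI, which may compute and report its results differently.

```python
from itertools import combinations


def true_pairs(clusters):
    """Expand ground-truth clusters of duplicates into a set of record pairs."""
    pairs = set()
    for cluster in clusters:
        for a, b in combinations(sorted(cluster), 2):
            pairs.add((a, b))
    return pairs


def normalize(pairs):
    """Store every pair with its two record ids in sorted order."""
    return {tuple(sorted(pair)) for pair in pairs}


def blocking_metrics(candidate_pairs, truth, record_count):
    """Pair completeness and reduction ratio for a set of candidate pairs."""
    candidates = normalize(candidate_pairs)
    total_pairs = record_count * (record_count - 1) // 2
    return {
        # Share of the known duplicate pairs that blocking kept as candidates.
        "pair_completeness": len(candidates & truth) / len(truth),
        # Share of all possible comparisons that blocking avoided.
        "reduction_ratio": 1 - len(candidates) / total_pairs,
    }


def matching_metrics(predicted_links, truth):
    """Precision, recall, and F1 of predicted links against the known pairs."""
    predicted = normalize(predicted_links)
    hits = len(predicted & truth)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical dataset: six records, two known clusters of duplicates.
truth = true_pairs([{1, 2, 3}, {4, 5}])
blocking = blocking_metrics([(2, 1), (1, 3), (4, 5), (5, 6)], truth, record_count=6)
matching = matching_metrics([(1, 2), (4, 5), (5, 6)], truth)
```

In this made-up dataset, blocking keeps three of the four known duplicate pairs, and the matching step links two true pairs plus one false one.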
Release 4.3.0
Modified the architecture of OpenEMPI to rely solely on the graph database and to no longer use a relational database to store configuration information
Added a new machine learning-based algorithm that allows the user to determine the best similarity metric (comparator) and threshold to use for a specific matching field in their own dataset
Added a data standardization feature that allows the user to identify issues in their dataset and resolve them across the entire dataset
Added the ability to switch matching algorithms for a particular entity through the UI
Added a dashboard of performance-related metrics to the home page of the web manager application to help users identify resource-related issues early and resolve them before they become critical
Added a scheduled task that deletes logged links from the database so that they don’t accumulate and cause resource issues over time
Modified the reports to not use fonts that require additional packages to be installed on some Linux distributions
Added more validation of the parameters that are used to configure transformation functions for custom fields
Upgraded numerous libraries and packages to recent versions to eliminate any security vulnerabilities that have been identified
Added the ability to rebuild individual blocking rounds instead of having to rebuild all blocking rounds at once
Added the ability to copy the blocking and matching configuration from one entity to another through the UI
Made various changes that improved the performance of the system
Release 4.2.7
Added a number of charts on the home page of the administrative application that present performance information about the instance to help administrators identify resource issues early
Upgraded a number of libraries to eliminate potential security vulnerabilities
Fixed an issue with the REST API where a request to retrieve clusters of a given size fails when a record has an identifier associated with an unknown identifier domain
Added a report that presents statistics about probable links grouped by the associated vector number
Resolved an issue with some REST API calls where they would fail to serialize the response as an XML document
Release 4.2.6
Modified the probabilistic matching algorithm service so that when changing the upper and/or lower bounds through the UI, the classification of each pattern is updated immediately.
Added the ability to copy the blocking and matching configuration from one entity to another through the UI or REST API so that it doesn't have to be done manually.
Added the ability to rebuild the blocking indexes of a single blocking round, which reduces the impact on production instances compared with rebuilding all rounds at the same time
Upgraded Postgres and other dependencies that had reported security vulnerabilities
Modified the fonts used for some of the reports to eliminate dependencies on the Times New Roman font that is not present by default on all Linux distributions.
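For the change to the probabilistic matching algorithm service in this release, the effect of moving the upper and lower bounds can be sketched as a re-run of a threshold rule over every pattern. This is only an illustration. The function names, the class labels, the example weights, and the inclusive handling of the boundaries are assumptions, not details confirmed by these notes.

```python
def classify(weight, lower, upper):
    """Classify one pattern weight against the two bounds (boundaries assumed inclusive)."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "probable-link"


def reclassify(pattern_weights, lower, upper):
    """Recompute the class of every pattern after the bounds change."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return {
        pattern: classify(weight, lower, upper)
        for pattern, weight in pattern_weights.items()
    }


# Hypothetical agreement patterns (1 = field agrees, 0 = field disagrees) and weights.
weights = {"111": 12.4, "110": 6.1, "100": 1.3, "000": -8.7}

before = reclassify(weights, lower=0.0, upper=10.0)
after = reclassify(weights, lower=2.0, upper=6.0)
```

Narrowing the bounds from (0.0, 10.0) to (2.0, 6.0) moves pattern "110" from a probable link to a link and pattern "100" from a probable link to a non-link.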
Release 4.2.5
Added a REST API endpoint that, given a record, returns the candidate records identified by the blocking algorithm for evaluation, grouped by blocking round
Enhanced the concatenation transformation function to allow for sorting of the values of the fields to be concatenated thereby increasing the effectiveness of the custom field in identifying matching records
Performed some optimizations to the blocking service to reduce the potential duplicate evaluation of record pairs by the matching algorithm
Fixed a bug where the dates associated with an entity in the UI were not being displayed properly
Fixed a bug with the import of an exported entity schema where custom fields were not being imported correctly
Fixed an issue in the creation of entity indexes where the naming convention of the indexes was causing the database parser to misinterpret the command
Upgraded various libraries used by OpenEMPI, including Artemis and GSON, to eliminate security vulnerabilities that have been identified
Upgraded the version of the Postgres driver to eliminate a security vulnerability present in an earlier version
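The sorting option added to the concatenation transformation function in this release can be pictured as follows: sorting the values before joining them makes the custom field independent of which source field each value was entered in. This is a minimal sketch. The function, its parameters, and the field names are hypothetical and do not reflect the actual OpenEMPI implementation.

```python
def concatenate(record, fields, separator="", sort_values=False):
    """Join the non-empty values of the given fields, optionally sorting them first."""
    values = [str(record[field]) for field in fields if record.get(field)]
    if sort_values:
        values.sort()
    return separator.join(values)


# Two hypothetical records for the same person, with the names swapped in one.
fields = ["givenName", "familyName"]
rec_a = {"givenName": "JOHN", "familyName": "SMITH"}
rec_b = {"givenName": "SMITH", "familyName": "JOHN"}

unsorted_a = concatenate(rec_a, fields)
unsorted_b = concatenate(rec_b, fields)
sorted_a = concatenate(rec_a, fields, sort_values=True)
sorted_b = concatenate(rec_b, fields, sort_values=True)
```

Without sorting, the two records yield different custom field values ("JOHNSMITH" and "SMITHJOHN"). With sorting, both yield "JOHNSMITH", so a blocking or matching rule on the custom field can still pair them.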
Release 4.2.4
Fixed a bug with the generation of a Data Profile based on a file where the job was not completing successfully if the data file was not formatted properly
Added an option to the user profile page on the UI to allow the user to specify a blank value for the state field
Replaced the use of log4j-1 with reload4j to eliminate any concern regarding the vulnerabilities detected in log4j-1
Upgraded the version of the Postgres driver used by the application to a more recent version that is free from security vulnerabilities
Release 4.2.3
Enforced the access restriction that a user without the USERS_VIEW permission cannot view their own profile
Fixed a bug where disabling a user account through the UI would not actually disable the account
Made fixes to the PIX Feed v3 interface to ensure that the validation of identifiers doesn’t fail when identifiers are specified using the PIX/PDQv3 convention
Fixed the configuration of the deployment of the PIX/PDQ v3 interface to work properly when the interface is enabled
Upgraded the reference data lookup code to ensure the hibernate layer works properly with the version of Hibernate included in this release
Upgraded the version of Apache CXF used to eliminate any potential security vulnerabilities
Fixed an issue with the update of global identifiers assigned to records where the global identifiers were not updated correctly when an update changed the association between two clusters of records that each also had manual links from an external source
Applied a validation check to prevent the search capability from allowing users to query for all records using the wildcard character
Release 4.2.2
Fixed a bug where certain data issues would cause the process that regenerates all links to stop before visiting every record
Added support to the UI for replacing one matching field with another without having to update the probabilistic model in the probabilistic matching algorithm
Extended the concatenation function to allow for the concatenation of more than two fields
Release 4.2.1
Added a new configuration setting that allows you to enforce the policy where manual links stored as part of the remember manual classifications feature override classifications by the matching algorithm to link records together
Improved the performance of the file load operation by using caching of identifier domain information
Fixed an issue where the modal dialog that displays the evaluation of the association between two records in the system did not correctly color-code null-scored vector patterns
Improved the performance of the blocking re-indexing operation by implementing lower granularity locking of resources
Fixed a bug where the information dialog that pops up when a record pair is classified manually through the review record pairs page could not be dismissed
Release 4.2.0
Added the ability to classify clusters of more than two records that have been linked together using a graphical interface that allows the users to review the field values and associations of all the records involved in the cluster
Added the ability to submit two records for evaluation by the matching algorithm both through the UI and the REST API
Added the ability to link any two records together manually through the UI
Added support for Ground Truth Analysis via the REST API to allow the user to evaluate the accuracy of the algorithm based on a labeled dataset
Added support for the Single Best Record functionality that allows the user to retrieve the most representative record (golden record) from a cluster of records that have been linked together by the matching algorithm
Added the ability to customize the color coding used for displaying vector agreement/disagreement patterns and pairs of records
Updated many libraries to eliminate security vulnerabilities
Fixed an issue with the Reevaluate Probable Links functionality when the Remember Manual Classifications feature has been enabled
Fixed an issue with paging in the User Files page of the UI
Fixed an issue in the displaying of two records side-by-side in the UI when the value for a field on the left-hand side record is null
Fixed an issue with Identifier Domain management using the latest version of Hibernate
Fixed an issue with the Reporting Module using the latest version of Hibernate
...