The ChoiceMaker 2 Record Matching System

2023-07-19

This paper describes the key features of ChoiceMaker 2, an innovative record matching system developed by ChoiceMaker Technologies (CMT). We begin with an overview of the stages that a record matching system goes through to find an incoming "query record" in a database, and then consider those stages one by one. We sketch our patent-pending process for identifying possible matches to the query record, a step known as "blocking". We explain how we use maximum entropy modeling, a machine learning technique, to tune the system to the problem at hand. Next we describe ClueMaker, the programming language that CMT has developed for describing record matching characteristics. We then present our method for testing record matching models, show how our IDE facilitates this testing, and outline the process by which we develop such models. Finally, we discuss systems integration issues and the interfaces that ChoiceMaker offers for deployment.