[journal article]

dc.contributor.author: Comber, Sam [de]
dc.date.accessioned: 2020-01-14T09:10:23Z
dc.date.available: 2020-01-14T09:10:23Z
dc.date.issued: 2019 [de]
dc.identifier.issn: 2409-5370 [de]
dc.identifier.uri: https://www.ssoar.info/ssoar/handle/document/66034
dc.description.abstract: The last decade has seen an unprecedented rise in the number, frequency and availability of data sources. Yet these sources are often incomplete, so data fusion is required to enhance their quality and scope. In the context of spatial analysis, address matching is critical to enhancing data on household socio-economic and demographic characteristics. Matching administrative, commercial, or lifestyle data sources to items such as household surveys can improve data quality, enable spatial data visualisation, and lower respondent burden in household surveys. Typically, when a practitioner has high-quality data, unique identifiers are used to link household addresses directly. However, real-world databases often lack the unique identifiers needed for a one-to-one match. Moreover, irregularities between the text representations of potential matches mean extensive data cleaning is often required as a pre-processing step. For this reason, practitioners have traditionally relied on two families of linkage techniques for matching the text representations of addresses: deterministic and mathematical approaches. Deterministic matching consists of constructing hand-crafted rules that classify address matches and non-matches based on specialist domain knowledge, while mathematical approaches have increasingly adopted machine learning techniques for resolving pairs of addresses to a match. In this notebook we focus on the latter, demonstrating the utility of machine learning approaches in the address matching workflow. To achieve this, we construct a predictive model that resolves matches between two small datasets of restaurant addresses in the US. While the problem case may seem trivial, the intention of the notebook is to demonstrate an approach that is reproducible and extensible to larger data challenges. Thus, in the present notebook, we document an end-to-end pipeline that is replicable and instructive for future address matching problems faced by the regional scientist. [de]
dc.language: en [de]
dc.subject.ddc: Naturwissenschaften [de]
dc.subject.ddc: Science [en]
dc.title: Demonstrating the utility of machine learning innovations in address matching to spatial socio-economic applications [de]
dc.description.review: begutachtet (peer reviewed) [de]
dc.description.review: peer reviewed [en]
dc.identifier.url: https://openjournals.wu-wien.ac.at/ojs/index.php/region/article/view/276 [de]
dc.source.journal: Region: the journal of ERSA
dc.source.volume: 6 [de]
dc.publisher.country: AUT
dc.source.issue: 3 [de]
dc.subject.classoz: Naturwissenschaften, Technik(wissenschaften), angewandte Wissenschaften [de]
dc.subject.classoz: Natural Science and Engineering, Applied Sciences [en]
dc.rights.licence: Creative Commons - Namensnennung, Nicht-kommerz. 4.0 [de]
dc.rights.licence: Creative Commons - Attribution-NonCommercial 4.0 [en]
internal.status: formal und inhaltlich fertig erschlossen (formally and substantively indexed) [de]
dc.type.stock: article [de]
dc.type.document: Zeitschriftenartikel [de]
dc.type.document: journal article [en]
dc.source.pageinfo: 17-37 [de]
internal.identifier.classoz: 50200
internal.identifier.journal: 791
internal.identifier.document: 32
internal.identifier.ddc: 500
dc.identifier.doi: https://doi.org/10.18335/region.v6i3.276 [de]
dc.description.pubstatus: Veröffentlichungsversion [de]
dc.description.pubstatus: Published Version [en]
internal.identifier.licence: 32
internal.identifier.pubstatus: 1
internal.identifier.review: 1
internal.dda.reference: https://openjournals.wu-wien.ac.at/ojs/index.php/region/oai/@@oai:ojs.openjournals.wu.ac.at:article/276
internal.dda.reference: https://openjournals.wu-wien.ac.at/ojs/index.php/region/oai@@oai:ojs.openjournals.wu.ac.at:article/276
ssoar.urn.registration: false [de]

