dc.contributor.author | Sarracino, Francesco | |
dc.contributor.author | Mikucka, Malgorzata | |
dc.date.accessioned | 2017-10-23T10:06:45Z | |
dc.date.available | 2017-10-23T10:06:45Z | |
dc.date.issued | 2017 | |
dc.identifier.issn | 1864-3361 | |
dc.identifier.uri | http://www.ssoar.info/ssoar/handle/document/54374 | |
dc.description.abstract | "Recent studies documented that survey data contain duplicate records. We assess how duplicate records affect regression estimates, and we evaluate the effectiveness of solutions to deal with duplicate records. Results show that the chances of obtaining unbiased estimates when data contain 40 doublets (about 5% of the sample) range between 3.5% and 11.5% depending on the distribution of duplicates. If 7 quintuplets are present in the data (2% of the sample), then the probability of obtaining biased estimates ranges between 11% and 20%. Weighting the duplicate records by the inverse of their multiplicity, or dropping superfluous duplicates outperform other solutions in all considered scenarios. Our results illustrate the risk of using data in presence of duplicate records and call for further research on strategies to analyze affected data." (author's abstract) | en |
dc.language | en | |
dc.subject.ddc | Sozialwissenschaften, Soziologie | de |
dc.subject.ddc | Social sciences, sociology, anthropology | en |
dc.subject.other | duplicated observations; estimation bias; Monte Carlo simulation; inference | |
dc.title | Bias and efficiency loss in regression estimates due to duplicated observations: a Monte Carlo simulation | |
dc.description.review | begutachtet (peer reviewed) | de |
dc.description.review | peer reviewed | en |
dc.source.journal | Survey Research Methods | |
dc.source.volume | 11 | |
dc.publisher.country | DEU | |
dc.source.issue | 1 | |
dc.subject.classoz | Erhebungstechniken und Analysetechniken der Sozialwissenschaften | de |
dc.subject.classoz | Methods and Techniques of Data Collection and Data Analysis, Statistical Methods, Computer Methods | en |
dc.subject.thesoz | Umfrageforschung | de |
dc.subject.thesoz | survey research | en |
dc.subject.thesoz | Datenqualität | de |
dc.subject.thesoz | data quality | en |
dc.subject.thesoz | Regression | de |
dc.subject.thesoz | regression | en |
dc.subject.thesoz | Schätzung | de |
dc.subject.thesoz | estimation | en |
dc.rights.licence | Deposit Licence - Keine Weiterverbreitung, keine Bearbeitung | de |
dc.rights.licence | Deposit Licence - No Redistribution, No Modifications | en |
internal.status | formal and subject indexing completed | |
internal.identifier.thesoz | 10040714 | |
internal.identifier.thesoz | 10055811 | |
internal.identifier.thesoz | 10056459 | |
internal.identifier.thesoz | 10057146 | |
dc.type.stock | article | |
dc.type.document | Zeitschriftenartikel | de |
dc.type.document | journal article | en |
dc.source.pageinfo | 17-44 | |
internal.identifier.classoz | 10105 | |
internal.identifier.journal | 674 | |
internal.identifier.document | 32 | |
internal.identifier.ddc | 300 | |
dc.identifier.doi | https://doi.org/10.18148/srm/2017.v11i1.7149 | |
dc.description.pubstatus | Veröffentlichungsversion | de |
dc.description.pubstatus | Published Version | en |
internal.identifier.licence | 3 | |
internal.identifier.pubstatus | 1 | |
internal.identifier.review | 1 | |
internal.check.abstractlanguageharmonizer | CERTAIN | |