System no. 000476270
Author Forteza, Nicolás
Author García-Uribe, Sandra
Title A score function to prioritize editing in household survey data [Electronic resource] : a machine learning approach / Nicolás Forteza, Sandra García-Uribe.
Published in Journal of official statistics [Articles], v. 41, issue 1, March 2025, pp. 144–171
General note Journal article
Abstract Errors in household finance survey data collection can lead to inaccuracies in population estimates. Manual case-by-case revision has traditionally been used to identify and edit potential errors and omissions in the data, such as omitted or misreported assets, income, and debts. Selective editing strategies aim at reducing the editing burden by prioritizing cases through a scoring function. However, the application of traditional selective editing strategies to household finance survey data is challenging due to their underlying assumptions. Using data from the Spanish Survey of Household Finances, we develop a machine learning approach to classify data during the editing phase into cases affected by severe errors and omissions. We compare the performance of several supervised classification algorithms and find that a Gradient Boosting Trees classifier outperforms the competitors. We then use the resulting score to prioritize cases and consider data editing efforts into the choice of an optimal classification threshold. [Author's abstract] [eng]
Restrictions Free public access to the electronic version on the Internet
Electronic access Access to the full text.
Related to Documentos de Trabajo / Banco de España ; 2330
Classification C3-Econometric and Statistical Methods.
C6-Econometrics software.
R81-Big data and artificial intelligence.
Subject Data collection
Machine learning
Forecasting
Surveys
Subject Spain