| System No. | 000476270 |
| Author | Forteza, Nicolás |
| Author | García-Uribe, Sandra |
| Title | A score function to prioritize editing in household survey data [Electronic resource] : a machine learning approach / Nicolás Forteza, Sandra García-Uribe. |
| Published in | Journal of official statistics [Articles], v.41, issue 1, March 2025, pp. 144–171 |
| General note | Journal article |
| Abstract | Errors in household finance survey data collection can lead to inaccuracies in population estimates. Manual case-by-case revision has traditionally been used to identify and edit potential errors and omissions in the data, such as omitted or misreported assets, income, and debts. Selective editing strategies aim at reducing the editing burden by prioritizing cases through a scoring function. However, the application of traditional selective editing strategies to household finance survey data is challenging due to their underlying assumptions. Using data from the Spanish Survey of Household Finances, we develop a machine learning approach to classify data during the editing phase into cases affected by severe errors and omissions. We compare the performance of several supervised classification algorithms and find that a Gradient Boosting Trees classifier outperforms the competitors. We then use the resulting score to prioritize cases and consider data editing efforts into the choice of an optimal classification threshold. [Author's abstract] [eng] |
| Restrictions | Free public access to the electronic version on the Internet |
| Related to | Documentos de Trabajo / Banco de España ; 2330 |