Regression analysis for interval-valued symbolic data versus noisy variables and outliers


Author(s): Marcin Pełka, Andrzej Dudek
Subject(s): Economy
Published by: Wydawnictwo Uniwersytetu Ekonomicznego we Wrocławiu
Keywords: regression analysis; interval-valued symbolic data; noisy variables; outliers

Summary/Abstract: Regression analysis is perhaps the best known and most widely used method for the analysis of dependence; that is, for examining the relationship between a set of independent variables (X's) and a single dependent variable (Y). In general, the regression model is a linear combination of independent variables that corresponds as closely as possible to the dependent variable [Lattin, Carroll, Green 2003, p. 38]. The aim of the article is to present two adaptations of regression analysis for symbolic interval-valued data (the centre method and the centre and range method) and to compare their usefulness when dealing with noisy variables and/or outliers. The empirical part of the paper presents the results of simulation studies based on artificial and real data, both without and with noisy variables and/or outliers. The results are compared according to the values of two coefficients of determination, $R^2_L$ and $R^2_U$. The results show that the centre and range method usually obtains better results, even when the data set contains noisy variables and outliers, but in some cases the centre method outperforms the centre and range method.
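
The abstract names the two methods without giving their formulas, so the following is only a minimal sketch of how they are commonly formulated in the interval-regression literature, not the authors' implementation. It assumes that the centre method fits ordinary least squares to interval midpoints and applies the fitted coefficients to the lower and upper bounds of the predictors, and that the centre and range method fits one model to midpoints and a second to half-ranges. The function names, the synthetic data, and the definition of the coefficient of determination as 1 - SS_res/SS_tot (computed separately for lower and upper bounds) are all assumptions made for illustration.

```python
import numpy as np

def _ols(X, y):
    """Least-squares coefficients, intercept first."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def _predict(beta, X):
    return beta[0] + X @ beta[1:]

def centre_method(X_lo, X_hi, y_lo, y_hi):
    """Fit on midpoints; apply the same coefficients to each bound (assumed form)."""
    beta = _ols((X_lo + X_hi) / 2, (y_lo + y_hi) / 2)
    return _predict(beta, X_lo), _predict(beta, X_hi)

def centre_range_method(X_lo, X_hi, y_lo, y_hi):
    """Fit separate models for midpoints and half-ranges (assumed form)."""
    Xc, Xr = (X_lo + X_hi) / 2, (X_hi - X_lo) / 2
    yc, yr = (y_lo + y_hi) / 2, (y_hi - y_lo) / 2
    c_hat = _predict(_ols(Xc, yc), Xc)
    r_hat = _predict(_ols(Xr, yr), Xr)
    return c_hat - r_hat, c_hat + r_hat

def r_squared(y, y_hat):
    """Assumed definition: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1 - ss_res / ss_tot

# Synthetic intervals (hypothetical data, not from the paper):
# midpoints and half-ranges of y depend linearly on those of x.
rng = np.random.default_rng(0)
Xc = rng.uniform(0, 10, size=(50, 1))
Xr = rng.uniform(0.5, 2, size=(50, 1))
X_lo, X_hi = Xc - Xr, Xc + Xr
yc = 1 + 2 * Xc[:, 0]
yr = 0.5 + Xr[:, 0]
y_lo, y_hi = yc - yr, yc + yr

lo_c, hi_c = centre_method(X_lo, X_hi, y_lo, y_hi)
lo_cr, hi_cr = centre_range_method(X_lo, X_hi, y_lo, y_hi)

r2_lower = {"centre": r_squared(y_lo, lo_c), "centre_range": r_squared(y_lo, lo_cr)}
r2_upper = {"centre": r_squared(y_hi, hi_c), "centre_range": r_squared(y_hi, hi_cr)}
print(r2_lower, r2_upper)
```

On this noise-free toy data the centre and range sketch recovers both bounds exactly, because the half-ranges were generated from their own linear model, while the centre sketch cannot represent that second relationship. Adding a pure-noise column to the predictors or perturbing a few rows would mimic, in a very rough way, the noisy-variable and outlier scenarios the article studies.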

  • Issue Year: 2016
  • Issue No: 52
  • Page Range: 35-42
  • Page Count: 8
  • Language: English