The influence of feature grouping algorithm in outlier detection with categorical data

outlier detection

Authors

  • Sharon Femi Paul Sunder Nathaniel, Sri Venkateswara College of Engineering, https://orcid.org/0000-0002-2984-9521
  • Kala Alwarsamy, Sri Venkateswara College of Engineering
  • Rajalakshmi Viswanathan, Sri Venkateswara College of Engineering
  • Ganesh Vaidyanathan Subramanian, Sri Venkateswara College of Engineering
  • Vidhya Veerabahu, Sri Venkateswara College of Engineering

DOI:

https://doi.org/10.4025/actascitechnol.v46i1.66902

Keywords:

outlier detection; feature grouping; categorical data; LOF; isolation forest.

Abstract

Outlier mining has developed rapidly in recent years and is increasingly important in fields such as banking, sensor networks, and health care. Most anomaly detection methods are designed for numerical data and ignore categorical data, and several methods already exist for outlier detection in high-dimensional numerical data. In real-world problems, however, both numerical and categorical data must be considered to obtain accurate results. This paper proposes a feature grouping algorithm for anomaly detection that also handles categorical data. The algorithm correlates the categorical features, groups them into feature clusters, and detects outliers within those clusters. Each feature is assigned a weight based on its levels of appearance, and the outlier scores are computed from these weights. The performance of the feature grouping algorithm is compared on UCI datasets with traditional algorithms such as LOF and Isolation Forest and with state-of-the-art methods such as WATCH. The experimental evaluation indicates that the proposed algorithm performs comparatively better than the existing algorithms on categorical data.
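
The abstract does not give the formulas for the feature weights or the outlier score, so the following is only a minimal sketch of the general idea of frequency-based scoring on categorical data, not the authors' algorithm. The function name `frequency_outlier_scores`, the uniform default weights, and the toy records are all assumptions made for illustration, and the feature grouping step is left out entirely.

```python
from collections import Counter


def frequency_outlier_scores(rows, weights=None):
    """Score each row by how rare its categorical values are.

    rows: list of equal-length tuples of categorical values.
    weights: optional per-feature weights (hypothetical; uniform if omitted).
    """
    n = len(rows)
    n_features = len(rows[0])
    if weights is None:
        # Assumption: with no weighting rule available, treat features equally.
        weights = [1.0 / n_features] * n_features

    # Count how often each value appears in each feature column.
    counts = [Counter(row[j] for row in rows) for j in range(n_features)]

    scores = []
    for row in rows:
        # A rare value contributes close to its full weight, a common one close to 0.
        score = sum(
            w * (1.0 - counts[j][row[j]] / n) for j, w in enumerate(weights)
        )
        scores.append(score)
    return scores


# Toy records invented for this example: (colour, size).
rows = [
    ("red", "small"),
    ("red", "small"),
    ("red", "small"),
    ("red", "large"),
    ("blue", "large"),
]
scores = frequency_outlier_scores(rows)
most_anomalous = max(range(len(rows)), key=scores.__getitem__)
```

In this toy input the last record combines the rarest colour with the less common size, so it receives the highest score. A weighting scheme like the one the abstract describes would replace the uniform `weights` argument.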

Published

2024-04-17

How to Cite

Nathaniel, S. F. P. S., Alwarsamy, K., Viswanathan, R., Subramanian, G. V., & Veerabahu, V. (2024). The influence of feature grouping algorithm in outlier detection with categorical data: outlier detection. Acta Scientiarum. Technology, 46(1), e66902. https://doi.org/10.4025/actascitechnol.v46i1.66902

Section

Computer Science

 
