JAIT 2026 Vol.17(1): 141-152
doi: 10.12720/jait.17.1.141-152

Feature Selection for High-Dimensional Data: A Case Study of NFT Valuation

Geun-Cheol Lee 1, Heejung Lee 2, and Hoon-Young Koo 3,*
1. College of Business, Konkuk University, Seoul 05029, South Korea
2. School of Interdisciplinary Industrial Studies, Hanyang University, Seoul 04763, South Korea
3. School of Business, Chungnam National University, Daejeon 34134, South Korea
Email: gclee@konkuk.ac.kr (G.L.); stdream@hanyang.ac.kr (H.L.); koohy@cnu.ac.kr (H.K.)
*Corresponding author

Manuscript received October 9, 2025; revised October 28, 2025; accepted November 11, 2025; published January 20, 2026.

Abstract—In this study, we propose hedonic models for valuing Non-Fungible Tokens (NFTs) from the Azuki collection. We first analyze the NFT metadata and introduce a dependent variable that is robust to market volatility. The attribute information of each Azuki token is encoded via Term Frequency-Inverse Document Frequency (TF-IDF) to reflect both the presence of an attribute and its collection-wide scarcity, yielding hundreds of features per token. Two hedonic models are considered: a linear model and a squared model. To address high dimensionality, we tailor three variable-selection procedures (forward, backward, and stepwise) and compare them with regularization benchmarks and machine-learning methods. Using actual Azuki transaction data, we evaluate performance on a train-validation partition. The squared model overfits and performs poorly out of sample, while the linear model generalizes better and is adopted as the baseline. Applying variable selection to the linear baseline improves both parsimony and predictive performance. Machine-learning models exhibit very high training fit but notable performance degradation on the validation set, indicating overfitting in this setting. Overall, carefully specified hedonic models combined with principled variable selection offer competitive, interpretable, and more generalizable NFT valuation.
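
As an illustration of the encoding idea described in the abstract, the following minimal sketch assigns each token a weight per attribute that is zero when the attribute is absent and grows with the attribute's rarity across the collection. It is not the authors' implementation: the function name `tfidf_encode`, the toy trait names, and the specific weighting (binary presence multiplied by log(N / document frequency)) are assumptions chosen for illustration, and the paper may use a different TF-IDF variant.

```python
import math
from collections import Counter

def tfidf_encode(tokens):
    """Encode trait metadata as presence-times-scarcity features.

    tokens: list of dicts mapping trait type -> trait value
            (hypothetical toy metadata, not real Azuki data).
    Returns (vocab, rows), where rows[i][j] is the weight of vocab[j]
    for token i.
    """
    n = len(tokens)
    # Treat each "type=value" pair as a single term.
    docs = [{f"{k}={v}" for k, v in t.items()} for t in tokens]
    # Document frequency: number of tokens carrying each term.
    df = Counter(term for d in docs for term in d)
    vocab = sorted(df)
    # Assumed IDF variant: log(N / df). A term present in every token gets 0.
    idf = {term: math.log(n / df[term]) for term in vocab}
    # Term frequency is binary here (a trait is either present or not).
    rows = [[idf[term] if term in d else 0.0 for term in vocab] for d in docs]
    return vocab, rows

toy = [
    {"Hair": "Pink", "Eyes": "Calm"},
    {"Hair": "Black", "Eyes": "Calm"},
    {"Hair": "Black", "Eyes": "Calm"},
    {"Hair": "Black", "Eyes": "Closed"},
]
vocab, X = tfidf_encode(toy)
```

In this toy collection the rare trait `Hair=Pink` (one token in four) receives a larger weight than the common trait `Eyes=Calm` (three tokens in four), which is the scarcity effect the encoding is meant to capture. With hundreds of distinct trait values, the resulting feature matrix becomes wide, which is the setting that motivates variable selection.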
 
Keywords—Non-Fungible Token (NFT), NFT valuation, hedonic model, variable selection, high-dimensional data, Term Frequency-Inverse Document Frequency (TF-IDF), Azuki

Cite: Geun-Cheol Lee, Heejung Lee, and Hoon-Young Koo, "Feature Selection for High-Dimensional Data: A Case Study of NFT Valuation," Journal of Advances in Information Technology, Vol. 17, No. 1, pp. 141-152, 2026. doi: 10.12720/jait.17.1.141-152

Copyright © 2026 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
