Research Objective
To investigate missing-data prediction in coevolving time series using temporal dynamic matrix factorization techniques.
Research Findings
The proposed temporal dynamic matrix factorization methods improve missing-data prediction in large-scale coevolving time series and keep prediction errors low even at high missing ratios. They are both effective and efficient, particularly when implemented on Apache Spark for large-scale data processing. Future work may extend the methods to heterogeneous data sets.
Research Limitations
The proposed methods are designed for homogeneous data sets and do not yet handle heterogeneous data sets.
1: Experimental Design and Method Selection:
The study employs temporal dynamic matrix factorization techniques to predict missing values in large-scale coevolving time series. It utilizes both the interior patterns of each time series and the information across multiple sources to build an initial model. Hybrid regularization terms are imposed to constrain the objective functions of matrix factorization.
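The summary above does not state the exact objective function, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a squared error on observed entries, an L2 penalty on both factors, and a temporal smoothness penalty on consecutive rows of the time factor as one possible "hybrid" regularizer. The function name `factorize` and the parameters `lam`, `mu`, `lr`, and `iters` are hypothetical.

```python
import numpy as np

def factorize(X, mask, rank=3, lam=0.1, mu=0.1, lr=0.01, iters=2000, seed=0):
    """Masked matrix factorization X ~= U @ V.T fitted by gradient descent.

    X    : (n_series, n_steps) array; values at missing entries are ignored
    mask : same shape, 1.0 where X is observed, 0.0 where it is missing
    lam  : weight of the L2 penalty on U and V (assumed regularizer)
    mu   : weight of the temporal smoothness penalty on V (assumed regularizer)
    """
    rng = np.random.default_rng(seed)
    n_series, n_steps = X.shape
    U = 0.1 * rng.standard_normal((n_series, rank))  # per-series factors
    V = 0.1 * rng.standard_normal((n_steps, rank))   # per-time-step factors
    X0 = np.where(mask > 0, X, 0.0)                  # neutralize missing values (e.g. NaN)

    for _ in range(iters):
        R = mask * (U @ V.T - X0)        # residual on observed entries only
        D = np.diff(V, axis=0)           # D[t] = V[t + 1] - V[t]
        smooth = np.zeros_like(V)        # gradient of 0.5 * sum_t ||D[t]||^2
        smooth[:-1] -= D
        smooth[1:] += D
        grad_U = R @ V + lam * U
        grad_V = R.T @ U + lam * V + mu * smooth
        U -= lr * grad_U
        V -= lr * grad_V
    return U, V
```

Missing values are then predicted by reading the corresponding entries of `U @ V.T`. The smoothness term uses information within each time step sequence, while the shared factors couple the series to one another.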
2: Sample Selection and Data Sources:
The experiments are conducted on four data sets: two medium-scale data sets (Motes and Sea-Surface Temperature) and two large-scale data sets (Gas Sensor Array under dynamic gas mixtures and a synthetic data set).
3: List of Experimental Equipment and Materials:
The study uses the Apache Spark platform for parallel computing experiments on a cluster of four working machines.
4: Experimental Procedures and Operational Workflow:
The proposed methods include building initial models via hybrid regularization, updating models with new samples using batch updating and fine-tuning strategies, and implementing the methods on Apache Spark for large-scale data processing.
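The summary does not describe how batch updating and fine-tuning are carried out, so the sketch below shows only one plausible reading, stated as an assumption: when a batch of new time steps arrives, the series factors `U` are held fixed and the factors of the new steps are obtained by ridge regression on the observed entries; a few gradient steps on the new batch then adjust both factors. The names `fit_new_columns` and `fine_tune` and all parameters are hypothetical.

```python
import numpy as np

def fit_new_columns(U, X_new, mask_new, lam=0.1):
    """Batch update (assumed form): factors for new time steps with U held fixed.

    For each new step j, solves the ridge problem
        min_v || U_obs @ v - x_obs ||^2 + lam * ||v||^2
    using only the series that are observed at step j.
    """
    rank = U.shape[1]
    V_new = np.zeros((X_new.shape[1], rank))
    for j in range(X_new.shape[1]):
        obs = mask_new[:, j] > 0
        U_obs = U[obs]
        A = U_obs.T @ U_obs + lam * np.eye(rank)
        b = U_obs.T @ X_new[obs, j]
        V_new[j] = np.linalg.solve(A, b)
    return V_new

def fine_tune(U, V_new, X_new, mask_new, lam=0.1, lr=0.01, steps=50):
    """Fine-tuning (assumed form): a few gradient steps on the new batch only."""
    U = U.copy()
    V_new = V_new.copy()
    X0 = np.where(mask_new > 0, X_new, 0.0)   # neutralize missing values
    for _ in range(steps):
        R = mask_new * (U @ V_new.T - X0)     # residual on observed entries
        grad_U = R @ V_new + lam * U
        grad_V = R.T @ U + lam * V_new
        U -= lr * grad_U
        V_new -= lr * grad_V
    return U, V_new
```

Holding `U` fixed makes each new time step an independent small linear solve, which is the kind of per-column work that could be distributed across workers on a platform such as Apache Spark; the summary does not say how the authors partition the computation.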
5: Data Analysis Methods:
The performance of the proposed methods is evaluated using root mean squared error (RMSE) to measure the prediction quality.
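For reference, RMSE over a set of evaluated entries is the square root of the mean squared difference between true and predicted values. The sketch below assumes the error is computed only over entries selected by an evaluation mask (for example, entries that were hidden to simulate missing data); the function name `rmse` and the mask argument are illustrative, not taken from the study.

```python
import numpy as np

def rmse(X_true, X_pred, eval_mask):
    """Root mean squared error over the entries where eval_mask is nonzero."""
    selected = np.asarray(eval_mask).astype(bool)
    diff = np.asarray(X_true)[selected] - np.asarray(X_pred)[selected]
    return float(np.sqrt(np.mean(diff ** 2)))
```

Lower values indicate better predictions, and the result is in the same units as the data.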