In my opinion, you should run an automated Empirical Orthogonal Function (EOF) analysis and use the resulting principal components to cluster your entities by "chart" type.
EOF analysis is conceptually similar to Fourier analysis (FA). The main difference is that FA always uses a fixed set of known functions (sines and cosines) as the basis, whereas EOF analysis derives the eigenfunctions from the data itself, along with their relative importance.
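To make this a bit more concrete, here is a minimal sketch of how the EOF step could look with NumPy. It is only an illustration under some assumptions of mine: the function name `eof_analysis`, the synthetic rising/falling data, and the choice to standardize each series first are all hypothetical, and the series are assumed to sit on a common year axis with no gaps (series that start in different years would first need truncating or gap-filling).

```python
import numpy as np

def eof_analysis(X, n_modes=2):
    """EOF/PCA via SVD on an (n_entities, n_years) array without gaps.

    Returns (eofs, scores, explained):
      eofs      - (n_modes, n_years) data-derived shape functions
      scores    - (n_entities, n_modes) weight of each shape per entity
      explained - fraction of total variance captured by each mode
    """
    X = np.asarray(X, dtype=float)
    # Standardize each series so shape, not level or scale, drives the result.
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # leave constant series at zero instead of dividing by 0
    Z = (X - mean) / std
    # Remove the across-entity mean for each year (usual PCA centering).
    Z = Z - Z.mean(axis=0, keepdims=True)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    eofs = Vt[:n_modes]
    scores = U[:, :n_modes] * s[:n_modes]
    explained = (s ** 2 / np.sum(s ** 2))[:n_modes]
    return eofs, scores, explained

# Synthetic example (made-up data): 10 rising and 10 falling noisy series.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 60)
rising = np.array([t + 0.05 * rng.standard_normal(t.size) for _ in range(10)])
falling = np.array([-t + 0.05 * rng.standard_normal(t.size) for _ in range(10)])
X = np.vstack([rising, falling])

eofs, scores, explained = eof_analysis(X, n_modes=2)
# Crude two-way split on the sign of the leading score, for illustration only.
labels = (scores[:, 0] > 0).astype(int)
```

With real data you would probably keep the first few modes and pass `scores` to a proper clustering routine (for example k-means) rather than splitting on a sign, and how many modes to keep is a judgment call based on `explained`.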
I hope this helps you with this problem.
Kind regards, GEN
manpreet
Best Answer
2 years ago
I have more than 20,000 entities, A1 to A20000. Each entity has its own time series of data points. For example, A1 has data points from year 1 to year 60, A2 has data points from year 5 to year 60, and so on. For each entity we can plot value against year.
My task is to cluster the entities based on the shape of their charts. For example, the chart for A1 (assume a bar plot) has a quadratic-like shape, the chart for A2 has an exponential-like shape, and so on. Some entities will have a random, scattered-looking chart.
Is there an algorithm for this type of clustering? I tried writing a detector for just one shape, monotonic increase, and I think it works well, but I need automatic shape detection. My method is also not robust to small fluctuations. For example, for the monotonic increase shape (each year's value is greater than the previous year's), if the value in one year drops by a fairly large amount, the series is not detected as monotonically increasing even though, generally speaking, it is.
Any suggestions?
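On the robustness problem with the monotonic-increase check: one possible approach, sketched here and not taken from the thread, is to replace the strict year-over-year test with a rank-based trend score such as Kendall's tau between time order and value. The function names and the 0.8 threshold below are hypothetical choices for illustration and would need tuning on real data.

```python
def kendall_tau(values):
    """Kendall's tau-a between time order and value.

    +1 means strictly increasing, -1 strictly decreasing, near 0 no trend.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    concordant = discordant = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            if values[j] > values[i]:
                concordant += 1
            elif values[j] < values[i]:
                discordant += 1
    pairs = n * (n - 1) // 2
    return (concordant - discordant) / pairs

def is_mostly_increasing(values, threshold=0.8):
    """True when the overall trend is upward despite occasional drops.

    The threshold is an assumed value, not a recommended constant.
    """
    return kendall_tau(values) >= threshold

# One sizeable dip (4 -> 2.5) breaks a strict check but not the trend score.
series = [1, 2, 3, 4, 2.5, 6, 7, 8, 9, 10]
strict = all(b > a for a, b in zip(series, series[1:]))
robust = is_mostly_increasing(series)
```

Because the score counts all pairs of years rather than only adjacent ones, a single drop changes it only slightly, while a genuinely scattered series ends up close to zero.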