K-optimal pattern discovery

K-optimal pattern discovery is a data mining technique that provides an alternative to the frequent pattern discovery approach that underlies most association rule learning techniques.

Frequent pattern discovery techniques find all patterns for which there are sufficiently frequent examples in the sample data. In contrast, k-optimal pattern discovery techniques find the k patterns that optimize a user-specified measure of interest. The parameter k is also specified by the user.
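
The difference between the two approaches can be illustrated with a minimal Python sketch. It is not taken from any of the cited algorithms: the function names, the toy transactions, the `max_size` limit and the leverage-style interest measure are all assumptions chosen for illustration, and the candidates are enumerated by brute force, whereas practical k-optimal algorithms prune the search space.

```python
from itertools import combinations
import heapq


def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def candidate_itemsets(transactions, max_size):
    """Enumerate every itemset of up to max_size items (brute force)."""
    items = sorted(set().union(*transactions))
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def frequent_patterns(transactions, min_support, max_size=2):
    """Frequent pattern discovery: keep ALL patterns above a support threshold."""
    return [p for p in candidate_itemsets(transactions, max_size)
            if support(p, transactions) >= min_support]


def k_optimal_patterns(transactions, k, measure, max_size=2):
    """k-optimal discovery: keep the k patterns that maximize the given measure."""
    return heapq.nlargest(k, candidate_itemsets(transactions, max_size),
                          key=lambda p: measure(p, transactions))


def leverage(itemset, transactions):
    """Illustrative measure (an assumption): observed support minus the
    support expected if the items occurred independently."""
    expected = 1.0
    for item in itemset:
        expected *= support(frozenset([item]), transactions)
    return support(itemset, transactions) - expected


# Hypothetical toy data.
transactions = [frozenset(t) for t in (
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "butter"},
)]

print(frequent_patterns(transactions, min_support=0.5))
print(k_optimal_patterns(transactions, k=1, measure=leverage))
```

In the first call the user controls a support threshold and the number of results follows from the data; in the second the user fixes k and the measure, and no minimum support is required.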

Examples of k-optimal pattern discovery techniques include:

k-optimal classification rule discovery.^[1]
k-optimal subgroup discovery.^[2]
finding k most interesting patterns using sequential sampling.^[3]
mining top-k frequent closed patterns without minimum support.^[4]
k-optimal rule discovery.^[5]

In contrast to k-optimal rule discovery and frequent pattern mining techniques, subgroup discovery focuses on mining interesting patterns with respect to a specified target property of interest. The target may be a binary, nominal, or numeric attribute,^[6] or a more complex target concept such as a correlation between several variables. Background knowledge,^[7] such as constraints and ontological relations, can often be used to focus the search and improve the discovery results.
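
As a rough sketch of subgroup discovery with a binary target, the following Python code ranks attribute-value descriptions and returns the k best. The text above does not name a quality function; weighted relative accuracy (WRAcc), a commonly used one, is assumed here. The function names, the `max_depth` limit and the toy records are likewise hypothetical, and the search is exhaustive rather than pruned.

```python
import heapq
from itertools import combinations


def wracc(rows, description, target):
    """Weighted relative accuracy (assumed quality function):
    coverage * (target share in the subgroup - target share overall)."""
    covered = [r for r in rows if all(r[a] == v for a, v in description)]
    if not covered:
        return 0.0
    coverage = len(covered) / len(rows)
    p_subgroup = sum(r[target] for r in covered) / len(covered)
    p_overall = sum(r[target] for r in rows) / len(rows)
    return coverage * (p_subgroup - p_overall)


def top_k_subgroups(rows, target, k, max_depth=2):
    """Return the k conjunctive descriptions with the highest WRAcc."""
    # Every (attribute, value) pair seen in the data, excluding the target.
    selectors = sorted({(a, v) for r in rows for a, v in r.items() if a != target})
    candidates = []
    for depth in range(1, max_depth + 1):
        for desc in combinations(selectors, depth):
            # Skip descriptions that constrain the same attribute twice.
            if len({a for a, _ in desc}) == depth:
                candidates.append(desc)
    return heapq.nlargest(k, candidates, key=lambda d: wracc(rows, d, target))


# Hypothetical toy records; "ill" is the binary target (1 = positive).
rows = [
    {"smoker": "yes", "age": "old", "ill": 1},
    {"smoker": "yes", "age": "young", "ill": 1},
    {"smoker": "yes", "age": "old", "ill": 1},
    {"smoker": "no", "age": "old", "ill": 0},
    {"smoker": "no", "age": "young", "ill": 0},
    {"smoker": "no", "age": "young", "ill": 0},
]

for description in top_k_subgroups(rows, target="ill", k=3):
    print(description, round(wracc(rows, description, "ill"), 3))
```

Unlike the itemset setting, every candidate is scored against the chosen target, so a description is interesting only insofar as the target's distribution inside the subgroup departs from its distribution in the whole data set.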

References

[1] Webb, G. I. (1995). OPUS: An efficient admissible algorithm for unordered search. Journal of Artificial Intelligence Research, 3, 431-465.
[2] Wrobel, S. (1997). An algorithm for multi-relational discovery of subgroups. In Proceedings First European Symposium on Principles of Data Mining and Knowledge Discovery. Springer.
[3] Scheffer, T., & Wrobel, S. (2002). Finding the most interesting patterns in a database quickly by using sequential sampling. Journal of Machine Learning Research, 3, 833-862.
[4] Han, J., Wang, J., Lu, Y., & Tzvetkov, P. (2002). Mining top-k frequent closed patterns without minimum support. In Proceedings of the International Conference on Data Mining, pp. 211-218.
[5] Webb, G. I., & Zhang, S. (2005). K-optimal rule discovery. Data Mining and Knowledge Discovery, 10(1), 39-79.
[6] Kloesgen, W. (1996). EXPLORA: A multipattern and multistrategy discovery assistant. Advances in Knowledge Discovery and Data Mining, pp. 249-271. Retrieved 2021-04-14.
[7] Atzmueller, M., Puppe, F., & Buscher, H.-P. (2005). Exploiting background knowledge for knowledge-intensive subgroup discovery. Proceedings of the 19th international joint conference on Artificial intelligence. Morgan Kaufmann Publishers, pp. 647-652.

External links

"Bringing you the state-of-the-art in Data Science". Bringing you the state-of-the-art in Data Science. 2017-05-06. Retrieved 2021-04-14.
Atzmueller, Martin (2015-05-17). "VIKAMINE: Subgroup Discovery and Analytics". VIKAMINE. Retrieved 2021-04-14.
