Rank | Title | Journal | Year | Citations |
---|---|---|---|---|
1 | A Tutorial on Support Vector Machines for Pattern Recognition | Data Mining and Knowledge Discovery | 1998 | 12,673 |
2 | Stochastic gradient boosting | Computational Statistics and Data Analysis | 2002 | 4,655 |
3 | PLS path modeling | Computational Statistics and Data Analysis | 2005 | 4,249 |
4 | ggplot2 | Wiley Interdisciplinary Reviews: Computational Statistics | 2011 | 2,171 |
5 | Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach | Data Mining and Knowledge Discovery | 2004 | 2,034 |
6 | Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values | Data Mining and Knowledge Discovery | 1998 | 1,773 |
7 | Deep learning for time series classification: a review | Data Mining and Knowledge Discovery | 2019 | 1,656 |
8 | Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals | Data Mining and Knowledge Discovery | 1997 | 1,218 |
9 | Response surface methodology | Wiley Interdisciplinary Reviews: Computational Statistics | 2010 | 1,210 |
10 | Experiencing SAX: a novel symbolic representation of time series | Data Mining and Knowledge Discovery | 2007 | 1,190 |
11 | Algorithms and applications for approximate nonnegative matrix factorization | Computational Statistics and Data Analysis | 2007 | 1,162 |
12 | Frequent pattern mining: current status and future directions | Data Mining and Knowledge Discovery | 2007 | 1,109 |
13 | Density-Based Clustering in Spatial Databases: The Algorithm GDBSCAN and Its Applications | Data Mining and Knowledge Discovery | 1998 | 1,059 |
14 | E-Commerce Recommendation Applications | Data Mining and Knowledge Discovery | 2001 | 1,056 |
15 | Goodness-of-fit indices for partial least squares path modeling | Computational Statistics | 2013 | 978 |
16 | Discovery of Frequent Episodes in Event Sequences | Data Mining and Knowledge Discovery | 1997 | 974 |
17 | Partial least squares regression and projection on latent structure regression (PLS Regression) | Wiley Interdisciplinary Reviews: Computational Statistics | 2010 | 961 |
18 | Bursty and Hierarchical Structure in Streams | Data Mining and Knowledge Discovery | 2003 | 898 |
19 | Graph based anomaly detection and description: a survey | Data Mining and Knowledge Discovery | 2015 | 897 |
20 | The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances | Data Mining and Knowledge Discovery | 2017 | 838 |
21 | Robust smoothing of gridded data in one and higher dimensions with missing values | Computational Statistics and Data Analysis | 2010 | 805 |
22 | Empirical characterization of random forest variable importance measures | Computational Statistics and Data Analysis | 2008 | 783 |
23 | Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey | Data Mining and Knowledge Discovery | 1998 | 751 |
24 | On Bias, Variance, 0/1—Loss, and the Curse-of-Dimensionality | Data Mining and Knowledge Discovery | 1997 | 729 |
25 | Discretization: An Enabling Technique | Data Mining and Knowledge Discovery | 2002 | 729 |
26 | Levelwise Search and Borders of Theories in Knowledge Discovery | Data Mining and Knowledge Discovery | 1997 | 706 |
27 | A survey of hierarchical classification across different application domains | Data Mining and Knowledge Discovery | 2011 | 693 |
28 | On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach | Data Mining and Knowledge Discovery | 1997 | 653 |
29 | Adaptive Fraud Detection | Data Mining and Knowledge Discovery | 1997 | 649 |
30 | On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration | Data Mining and Knowledge Discovery | 2003 | 649 |
31 | BIRCH: A New Data Clustering Algorithm and Its Applications | Data Mining and Knowledge Discovery | 1997 | 643 |
32 | Consistent and asymptotically normal PLS estimators for linear structural equations | Computational Statistics and Data Analysis | 2015 | 634 |
33 | How many principal components? stopping rules for determining the number of non-trivial axes revisited | Computational Statistics and Data Analysis | 2005 | 626 |
34 | Experimental comparison of representation methods and distance measures for time series data | Data Mining and Knowledge Discovery | 2013 | 612 |
35 | Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem | Data Mining and Knowledge Discovery | 1998 | 582 |
36 | Multicollinearity | Wiley Interdisciplinary Reviews: Computational Statistics | 2010 | 580 |
37 | Testing and dating of structural changes in practice | Computational Statistics and Data Analysis | 2003 | 573 |
38 | Estimating classification error rate: Repeated cross-validation, repeated hold-out and bootstrap | Computational Statistics and Data Analysis | 2009 | 564 |
39 | A classification EM algorithm for clustering and two stochastic versions | Computational Statistics and Data Analysis | 1992 | 550 |
40 | InceptionTime: Finding AlexNet for time series classification | Data Mining and Knowledge Discovery | 2020 | 542 |
41 | An adjusted boxplot for skewed distributions | Computational Statistics and Data Analysis | 2008 | 514 |
42 | Cluster-wise assessment of cluster stability | Computational Statistics and Data Analysis | 2007 | 512 |
43 | Practical variable selection for generalized additive models | Computational Statistics and Data Analysis | 2011 | 512 |
44 | Community detection in Social Media | Data Mining and Knowledge Discovery | 2012 | 509 |
45 | The EM algorithm for graphical association models with missing data | Computational Statistics and Data Analysis | 1995 | 500 |
46 | Bayesian Networks for Data Mining | Data Mining and Knowledge Discovery | 1997 | 489 |
47 | An application of changepoint methods in studying the effect of age on survival in breast cancer | Computational Statistics and Data Analysis | 1999 | 488 |
48 | Controlled experiments on the web: survey and practical guide | Data Mining and Knowledge Discovery | 2009 | 486 |
49 | Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models | Computational Statistics and Data Analysis | 2003 | 474 |
50 | The Bayesian information criterion: background, derivation, and applications | Wiley Interdisciplinary Reviews: Computational Statistics | 2012 | 473 |
51 | Selecting and estimating regular vine copulae and application to financial returns | Computational Statistics and Data Analysis | 2013 | 467 |
52 | On the exact distribution of maximally selected rank statistics | Computational Statistics and Data Analysis | 2003 | 455 |
53 | Hierarchical Clustering Algorithms for Document Datasets | Data Mining and Knowledge Discovery | 2005 | 452 |
54 | Robust forecasting of mortality and fertility rates: A functional data approach | Computational Statistics and Data Analysis | 2007 | 447 |
55 | Three naive Bayes approaches for discrimination-free classification | Data Mining and Knowledge Discovery | 2010 | 445 |
56 | On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study | Data Mining and Knowledge Discovery | 2016 | 445 |
57 | Training and assessing classification rules with imbalanced data | Data Mining and Knowledge Discovery | 2014 | 444 |
58 | Multiple factor analysis (AFMULT package) | Computational Statistics and Data Analysis | 1994 | 439 |
59 | Advances in Instance Selection for Instance-Based Learning Algorithms | Data Mining and Knowledge Discovery | 2002 | 435 |
60 | Characteristic-Based Clustering for Time Series Data | Data Mining and Knowledge Discovery | 2006 | 435 |
61 | Why the Monte Carlo method is so important today | Wiley Interdisciplinary Reviews: Computational Statistics | 2014 | 423 |
62 | Evaluating latent class analysis models in qualitative phenotype identification | Computational Statistics and Data Analysis | 2006 | 422 |
63 | Computing LTS Regression for Large Data Sets | Data Mining and Knowledge Discovery | 2006 | 415 |
64 | BACON: blocked adaptive computationally efficient outlier nominators | Computational Statistics and Data Analysis | 2000 | 412 |
65 | PARAFAC: Parallel factor analysis | Computational Statistics and Data Analysis | 1994 | 401 |
66 | Bayesian computing with INLA: New features | Computational Statistics and Data Analysis | 2013 | 400 |
67 | Fuzzy set theory | Wiley Interdisciplinary Reviews: Computational Statistics | 2010 | 387 |
68 | Bayesian spatial modeling of genetic population structure | Computational Statistics | 2008 | 377 |
69 | Maximum likelihood estimation in nonlinear mixed effects models | Computational Statistics and Data Analysis | 2005 | 374 |
70 | Relaxed Lasso | Computational Statistics and Data Analysis | 2007 | 372 |
71 | Genetic process mining: an experimental evaluation | Data Mining and Knowledge Discovery | 2007 | 372 |
72 | Ordinal, Continuous and Heterogeneous k-Anonymity Through Microaggregation | Data Mining and Knowledge Discovery | 2005 | 370 |
73 | Overdispersion: Models and estimation | Computational Statistics and Data Analysis | 1998 | 368 |
74 | Classification of time series by shapelet transformation | Data Mining and Knowledge Discovery | 2014 | 368 |
75 | An improved approximation to the precision of fixed effects from restricted maximum likelihood | Computational Statistics and Data Analysis | 2009 | 359 |
76 | ROCKET: exceptionally fast and accurate time series classification using random convolutional kernels | Data Mining and Knowledge Discovery | 2020 | 359 |
77 | Benchmark for filter methods for feature selection in high-dimensional classification data | Computational Statistics and Data Analysis | 2020 | 356 |
78 | maxLik: A package for maximum likelihood estimation in R | Computational Statistics | 2011 | 353 |
79 | FURIA: an algorithm for unordered fuzzy rule induction | Data Mining and Knowledge Discovery | 2009 | 351 |
80 | Time series classification with ensembles of elastic distance measures | Data Mining and Knowledge Discovery | 2015 | 349 |
81 | Community discovery using nonnegative matrix factorization | Data Mining and Knowledge Discovery | 2011 | 348 |
82 | A general class of zero-or-one inflated beta regression models | Computational Statistics and Data Analysis | 2012 | 347 |
83 | An extensive comparison of recent classification tools applied to microarray data | Computational Statistics and Data Analysis | 2005 | 344 |
84 | The Akaike information criterion: Background, derivation, properties, application, interpretation, and refinements | Wiley Interdisciplinary Reviews: Computational Statistics | 2019 | 344 |
85 | Possibility theory and statistical reasoning | Computational Statistics and Data Analysis | 2006 | 343 |
86 | The BOSS is concerned with time series classification in the presence of noise | Data Mining and Knowledge Discovery | 2015 | 340 |
87 | A note on the validity of cross-validation for evaluating autoregressive time series prediction | Computational Statistics and Data Analysis | 2018 | 329 |
88 | On-Line Unsupervised Outlier Detection Using Finite Mixtures with Discounting Learning Algorithms | Data Mining and Knowledge Discovery | 2004 | 323 |
89 | An anova test for functional data | Computational Statistics and Data Analysis | 2004 | 321 |
90 | Statistical analysis of financial networks | Computational Statistics and Data Analysis | 2005 | 321 |
91 | Multiple factor analysis: principal component analysis for multitable and multiblock data sets | Wiley Interdisciplinary Reviews: Computational Statistics | 2013 | 319 |
92 | A comparison of algorithms for fitting the PARAFAC model | Computational Statistics and Data Analysis | 2006 | 316 |
93 | Outlier identification in high dimensions | Computational Statistics and Data Analysis | 2008 | 313 |
94 | Model-based clustering of high-dimensional data: A review | Computational Statistics and Data Analysis | 2014 | 312 |
95 | Mining the customer credit using classification and regression tree and multivariate adaptive regression splines | Computational Statistics and Data Analysis | 2006 | 311 |
96 | Survey on mining subjective data on the web | Data Mining and Knowledge Discovery | 2012 | 307 |
97 | Open-source machine learning: R meets Weka | Computational Statistics | 2009 | 305 |
98 | Ridge regression | Wiley Interdisciplinary Reviews: Computational Statistics | 2009 | 305 |
99 | Adaptive proposal distribution for random walk Metropolis algorithm | Computational Statistics | 1999 | 304 |
100 | Mining Non-Redundant Association Rules | Data Mining and Knowledge Discovery | 2004 | 304 |