Papers: 11.5K (top 1%)
Citations: 276.7K (top 1%)
h-index: 179 (top 1%)
g-index: 348 (top 1%)
All documents: 164.1K
Doc citations: 13.8K

Top Articles

# | Title | Journal | Year | Citations
1 | A Tutorial on Support Vector Machines for Pattern Recognition | Data Mining and Knowledge Discovery | 1998 | 12,673
2 | Stochastic gradient boosting | Computational Statistics and Data Analysis | 2002 | 4,655
3 | PLS path modeling | Computational Statistics and Data Analysis | 2005 | 4,249
4 | ggplot2 | Wiley Interdisciplinary Reviews: Computational Statistics | 2011 | 2,171
5 | Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach | Data Mining and Knowledge Discovery | 2004 | 2,034
6 | Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values | Data Mining and Knowledge Discovery | 1998 | 1,773
7 | Deep learning for time series classification: a review | Data Mining and Knowledge Discovery | 2019 | 1,656
8 | Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals | Data Mining and Knowledge Discovery | 1997 | 1,218
9 | Response surface methodology | Wiley Interdisciplinary Reviews: Computational Statistics | 2010 | 1,210
10 | Experiencing SAX: a novel symbolic representation of time series | Data Mining and Knowledge Discovery | 2007 | 1,190
11 | Algorithms and applications for approximate nonnegative matrix factorization | Computational Statistics and Data Analysis | 2007 | 1,162
12 | Frequent pattern mining: current status and future directions | Data Mining and Knowledge Discovery | 2007 | 1,109
13 | Density-Based Clustering in Spatial Databases: The Algorithm GDBSCAN and Its Applications | Data Mining and Knowledge Discovery | 1998 | 1,059
14 | E-Commerce Recommendation Applications | Data Mining and Knowledge Discovery | 2001 | 1,056
15 | Goodness-of-fit indices for partial least squares path modeling | Computational Statistics | 2013 | 978
16 | Discovery of Frequent Episodes in Event Sequences | Data Mining and Knowledge Discovery | 1997 | 974
17 | Partial least squares regression and projection on latent structure regression (PLS Regression) | Wiley Interdisciplinary Reviews: Computational Statistics | 2010 | 961
18 | Bursty and Hierarchical Structure in Streams | Data Mining and Knowledge Discovery | 2003 | 898
19 | Graph based anomaly detection and description: a survey | Data Mining and Knowledge Discovery | 2015 | 897
20 | The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances | Data Mining and Knowledge Discovery | 2017 | 838
21 | Robust smoothing of gridded data in one and higher dimensions with missing values | Computational Statistics and Data Analysis | 2010 | 805
22 | Empirical characterization of random forest variable importance measures | Computational Statistics and Data Analysis | 2008 | 783
23 | Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey | Data Mining and Knowledge Discovery | 1998 | 751
24 | On Bias, Variance, 0/1—Loss, and the Curse-of-Dimensionality | Data Mining and Knowledge Discovery | 1997 | 729
25 | Discretization: An Enabling Technique | Data Mining and Knowledge Discovery | 2002 | 729
26 | Levelwise Search and Borders of Theories in Knowledge Discovery | Data Mining and Knowledge Discovery | 1997 | 706
27 | A survey of hierarchical classification across different application domains | Data Mining and Knowledge Discovery | 2011 | 693
28 | On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach | Data Mining and Knowledge Discovery | 1997 | 653
29 | Adaptive Fraud Detection | Data Mining and Knowledge Discovery | 1997 | 649
30 | On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration | Data Mining and Knowledge Discovery | 2003 | 649
31 | BIRCH: A New Data Clustering Algorithm and Its Applications | Data Mining and Knowledge Discovery | 1997 | 643
32 | Consistent and asymptotically normal PLS estimators for linear structural equations | Computational Statistics and Data Analysis | 2015 | 634
33 | How many principal components? stopping rules for determining the number of non-trivial axes revisited | Computational Statistics and Data Analysis | 2005 | 626
34 | Experimental comparison of representation methods and distance measures for time series data | Data Mining and Knowledge Discovery | 2013 | 612
35 | Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem | Data Mining and Knowledge Discovery | 1998 | 582
36 | Multicollinearity | Wiley Interdisciplinary Reviews: Computational Statistics | 2010 | 580
37 | Testing and dating of structural changes in practice | Computational Statistics and Data Analysis | 2003 | 573
38 | Estimating classification error rate: Repeated cross-validation, repeated hold-out and bootstrap | Computational Statistics and Data Analysis | 2009 | 564
39 | A classification EM algorithm for clustering and two stochastic versions | Computational Statistics and Data Analysis | 1992 | 550
40 | InceptionTime: Finding AlexNet for time series classification | Data Mining and Knowledge Discovery | 2020 | 542
41 | An adjusted boxplot for skewed distributions | Computational Statistics and Data Analysis | 2008 | 514
42 | Cluster-wise assessment of cluster stability | Computational Statistics and Data Analysis | 2007 | 512
43 | Practical variable selection for generalized additive models | Computational Statistics and Data Analysis | 2011 | 512
44 | Community detection in Social Media | Data Mining and Knowledge Discovery | 2012 | 509
45 | The EM algorithm for graphical association models with missing data | Computational Statistics and Data Analysis | 1995 | 500
46 | Bayesian Networks for Data Mining | Data Mining and Knowledge Discovery | 1997 | 489
47 | An application of changepoint methods in studying the effect of age on survival in breast cancer | Computational Statistics and Data Analysis | 1999 | 488
48 | Controlled experiments on the web: survey and practical guide | Data Mining and Knowledge Discovery | 2009 | 486
49 | Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models | Computational Statistics and Data Analysis | 2003 | 474
50 | The Bayesian information criterion: background, derivation, and applications | Wiley Interdisciplinary Reviews: Computational Statistics | 2012 | 473
51 | Selecting and estimating regular vine copulae and application to financial returns | Computational Statistics and Data Analysis | 2013 | 467
52 | On the exact distribution of maximally selected rank statistics | Computational Statistics and Data Analysis | 2003 | 455
53 | Hierarchical Clustering Algorithms for Document Datasets | Data Mining and Knowledge Discovery | 2005 | 452
54 | Robust forecasting of mortality and fertility rates: A functional data approach | Computational Statistics and Data Analysis | 2007 | 447
55 | Three naive Bayes approaches for discrimination-free classification | Data Mining and Knowledge Discovery | 2010 | 445
56 | On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study | Data Mining and Knowledge Discovery | 2016 | 445
57 | Training and assessing classification rules with imbalanced data | Data Mining and Knowledge Discovery | 2014 | 444
58 | Multiple factor analysis (AFMULT package) | Computational Statistics and Data Analysis | 1994 | 439
59 | Advances in Instance Selection for Instance-Based Learning Algorithms | Data Mining and Knowledge Discovery | 2002 | 435
60 | Characteristic-Based Clustering for Time Series Data | Data Mining and Knowledge Discovery | 2006 | 435
61 | Why the Monte Carlo method is so important today | Wiley Interdisciplinary Reviews: Computational Statistics | 2014 | 423
62 | Evaluating latent class analysis models in qualitative phenotype identification | Computational Statistics and Data Analysis | 2006 | 422
63 | Computing LTS Regression for Large Data Sets | Data Mining and Knowledge Discovery | 2006 | 415
64 | BACON: blocked adaptive computationally efficient outlier nominators | Computational Statistics and Data Analysis | 2000 | 412
65 | PARAFAC: Parallel factor analysis | Computational Statistics and Data Analysis | 1994 | 401
66 | Bayesian computing with INLA: New features | Computational Statistics and Data Analysis | 2013 | 400
67 | Fuzzy set theory | Wiley Interdisciplinary Reviews: Computational Statistics | 2010 | 387
68 | Bayesian spatial modeling of genetic population structure | Computational Statistics | 2008 | 377
69 | Maximum likelihood estimation in nonlinear mixed effects models | Computational Statistics and Data Analysis | 2005 | 374
70 | Relaxed Lasso | Computational Statistics and Data Analysis | 2007 | 372
71 | Genetic process mining: an experimental evaluation | Data Mining and Knowledge Discovery | 2007 | 372
72 | Ordinal, Continuous and Heterogeneous k-Anonymity Through Microaggregation | Data Mining and Knowledge Discovery | 2005 | 370
73 | Overdispersion: Models and estimation | Computational Statistics and Data Analysis | 1998 | 368
74 | Classification of time series by shapelet transformation | Data Mining and Knowledge Discovery | 2014 | 368
75 | An improved approximation to the precision of fixed effects from restricted maximum likelihood | Computational Statistics and Data Analysis | 2009 | 359
76 | ROCKET: exceptionally fast and accurate time series classification using random convolutional kernels | Data Mining and Knowledge Discovery | 2020 | 359
77 | Benchmark for filter methods for feature selection in high-dimensional classification data | Computational Statistics and Data Analysis | 2020 | 356
78 | maxLik: A package for maximum likelihood estimation in R | Computational Statistics | 2011 | 353
79 | FURIA: an algorithm for unordered fuzzy rule induction | Data Mining and Knowledge Discovery | 2009 | 351
80 | Time series classification with ensembles of elastic distance measures | Data Mining and Knowledge Discovery | 2015 | 349
81 | Community discovery using nonnegative matrix factorization | Data Mining and Knowledge Discovery | 2011 | 348
82 | A general class of zero-or-one inflated beta regression models | Computational Statistics and Data Analysis | 2012 | 347
83 | An extensive comparison of recent classification tools applied to microarray data | Computational Statistics and Data Analysis | 2005 | 344
84 | The Akaike information criterion: Background, derivation, properties, application, interpretation, and refinements | Wiley Interdisciplinary Reviews: Computational Statistics | 2019 | 344
85 | Possibility theory and statistical reasoning | Computational Statistics and Data Analysis | 2006 | 343
86 | The BOSS is concerned with time series classification in the presence of noise | Data Mining and Knowledge Discovery | 2015 | 340
87 | A note on the validity of cross-validation for evaluating autoregressive time series prediction | Computational Statistics and Data Analysis | 2018 | 329
88 | On-Line Unsupervised Outlier Detection Using Finite Mixtures with Discounting Learning Algorithms | Data Mining and Knowledge Discovery | 2004 | 323
89 | An anova test for functional data | Computational Statistics and Data Analysis | 2004 | 321
90 | Statistical analysis of financial networks | Computational Statistics and Data Analysis | 2005 | 321
91 | Multiple factor analysis: principal component analysis for multitable and multiblock data sets | Wiley Interdisciplinary Reviews: Computational Statistics | 2013 | 319
92 | A comparison of algorithms for fitting the PARAFAC model | Computational Statistics and Data Analysis | 2006 | 316
93 | Outlier identification in high dimensions | Computational Statistics and Data Analysis | 2008 | 313
94 | Model-based clustering of high-dimensional data: A review | Computational Statistics and Data Analysis | 2014 | 312
95 | Mining the customer credit using classification and regression tree and multivariate adaptive regression splines | Computational Statistics and Data Analysis | 2006 | 311
96 | Survey on mining subjective data on the web | Data Mining and Knowledge Discovery | 2012 | 307
97 | Open-source machine learning: R meets Weka | Computational Statistics | 2009 | 305
98 | Ridge regression | Wiley Interdisciplinary Reviews: Computational Statistics | 2009 | 305
99 | Adaptive proposal distribution for random walk Metropolis algorithm | Computational Statistics | 1999 | 304
100 | Mining Non-Redundant Association Rules | Data Mining and Knowledge Discovery | 2004 | 304