Papers: 11.4K (top 1%)
Citations: 262.8K (top 1%)
h-index: 178 (top 1%)
g-index: 325 (top 1%)
All documents: 174.8K
Doc citations: 13.5K

Top Articles

# | Title | Journal | Year | Citations
1 | Stochastic gradient boosting | Computational Statistics and Data Analysis | 2002 | 4,655
2 | PLS path modeling | Computational Statistics and Data Analysis | 2005 | 4,249
3 | ggplot2 | Wiley Interdisciplinary Reviews: Computational Statistics | 2011 | 2,171
4 | Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach | Data Mining and Knowledge Discovery | 2004 | 2,034
5 | Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values | Data Mining and Knowledge Discovery | 1998 | 1,773
6 | Deep learning for time series classification: a review | Data Mining and Knowledge Discovery | 2019 | 1,656
7 | Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals | Data Mining and Knowledge Discovery | 1997 | 1,218
8 | Response surface methodology | Wiley Interdisciplinary Reviews: Computational Statistics | 2010 | 1,210
9 | Experiencing SAX: a novel symbolic representation of time series | Data Mining and Knowledge Discovery | 2007 | 1,190
10 | Algorithms and applications for approximate nonnegative matrix factorization | Computational Statistics and Data Analysis | 2007 | 1,162
11 | Frequent pattern mining: current status and future directions | Data Mining and Knowledge Discovery | 2007 | 1,109
12 | Density-Based Clustering in Spatial Databases: The Algorithm GDBSCAN and Its Applications | Data Mining and Knowledge Discovery | 1998 | 1,059
13 | E-Commerce Recommendation Applications | Data Mining and Knowledge Discovery | 2001 | 1,056
14 | Goodness-of-fit indices for partial least squares path modeling | Computational Statistics | 2013 | 978
15 | Discovery of Frequent Episodes in Event Sequences | Data Mining and Knowledge Discovery | 1997 | 974
16 | Partial least squares regression and projection on latent structure regression (PLS Regression) | Wiley Interdisciplinary Reviews: Computational Statistics | 2010 | 961
17 | Bursty and Hierarchical Structure in Streams | Data Mining and Knowledge Discovery | 2003 | 898
18 | Graph based anomaly detection and description: a survey | Data Mining and Knowledge Discovery | 2015 | 897
19 | The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances | Data Mining and Knowledge Discovery | 2017 | 838
20 | Robust smoothing of gridded data in one and higher dimensions with missing values | Computational Statistics and Data Analysis | 2010 | 805
21 | Empirical characterization of random forest variable importance measures | Computational Statistics and Data Analysis | 2008 | 783
22 | Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey | Data Mining and Knowledge Discovery | 1998 | 751
23 | On Bias, Variance, 0/1—Loss, and the Curse-of-Dimensionality | Data Mining and Knowledge Discovery | 1997 | 729
24 | Discretization: An Enabling Technique | Data Mining and Knowledge Discovery | 2002 | 729
25 | Levelwise Search and Borders of Theories in Knowledge Discovery | Data Mining and Knowledge Discovery | 1997 | 706
26 | A survey of hierarchical classification across different application domains | Data Mining and Knowledge Discovery | 2011 | 693
27 | Adaptive Fraud Detection | Data Mining and Knowledge Discovery | 1997 | 649
28 | On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration | Data Mining and Knowledge Discovery | 2003 | 649
29 | BIRCH: A New Data Clustering Algorithm and Its Applications | Data Mining and Knowledge Discovery | 1997 | 643
30 | Consistent and asymptotically normal PLS estimators for linear structural equations | Computational Statistics and Data Analysis | 2015 | 634
31 | How many principal components? stopping rules for determining the number of non-trivial axes revisited | Computational Statistics and Data Analysis | 2005 | 626
32 | Experimental comparison of representation methods and distance measures for time series data | Data Mining and Knowledge Discovery | 2013 | 612
33 | Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem | Data Mining and Knowledge Discovery | 1998 | 582
34 | Multicollinearity | Wiley Interdisciplinary Reviews: Computational Statistics | 2010 | 580
35 | Testing and dating of structural changes in practice | Computational Statistics and Data Analysis | 2003 | 573
36 | Estimating classification error rate: Repeated cross-validation, repeated hold-out and bootstrap | Computational Statistics and Data Analysis | 2009 | 564
37 | A classification EM algorithm for clustering and two stochastic versions | Computational Statistics and Data Analysis | 1992 | 550
38 | InceptionTime: Finding AlexNet for time series classification | Data Mining and Knowledge Discovery | 2020 | 542
39 | An adjusted boxplot for skewed distributions | Computational Statistics and Data Analysis | 2008 | 514
40 | Cluster-wise assessment of cluster stability | Computational Statistics and Data Analysis | 2007 | 512
41 | Practical variable selection for generalized additive models | Computational Statistics and Data Analysis | 2011 | 512
42 | Community detection in Social Media | Data Mining and Knowledge Discovery | 2012 | 509
43 | The EM algorithm for graphical association models with missing data | Computational Statistics and Data Analysis | 1995 | 500
44 | An application of changepoint methods in studying the effect of age on survival in breast cancer | Computational Statistics and Data Analysis | 1999 | 488
45 | Controlled experiments on the web: survey and practical guide | Data Mining and Knowledge Discovery | 2009 | 486
46 | Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models | Computational Statistics and Data Analysis | 2003 | 474
47 | The Bayesian information criterion: background, derivation, and applications | Wiley Interdisciplinary Reviews: Computational Statistics | 2012 | 473
48 | Selecting and estimating regular vine copulae and application to financial returns | Computational Statistics and Data Analysis | 2013 | 467
49 | On the exact distribution of maximally selected rank statistics | Computational Statistics and Data Analysis | 2003 | 455
50 | Hierarchical Clustering Algorithms for Document Datasets | Data Mining and Knowledge Discovery | 2005 | 452
51 | Robust forecasting of mortality and fertility rates: A functional data approach | Computational Statistics and Data Analysis | 2007 | 447
52 | Three naive Bayes approaches for discrimination-free classification | Data Mining and Knowledge Discovery | 2010 | 445
53 | On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study | Data Mining and Knowledge Discovery | 2016 | 445
54 | Training and assessing classification rules with imbalanced data | Data Mining and Knowledge Discovery | 2014 | 444
55 | Multiple factor analysis (AFMULT package) | Computational Statistics and Data Analysis | 1994 | 439
56 | Advances in Instance Selection for Instance-Based Learning Algorithms | Data Mining and Knowledge Discovery | 2002 | 435
57 | Characteristic-Based Clustering for Time Series Data | Data Mining and Knowledge Discovery | 2006 | 435
58 | Why the Monte Carlo method is so important today | Wiley Interdisciplinary Reviews: Computational Statistics | 2014 | 423
59 | Evaluating latent class analysis models in qualitative phenotype identification | Computational Statistics and Data Analysis | 2006 | 422
60 | Computing LTS Regression for Large Data Sets | Data Mining and Knowledge Discovery | 2006 | 415
61 | BACON: blocked adaptive computationally efficient outlier nominators | Computational Statistics and Data Analysis | 2000 | 412
62 | PARAFAC: Parallel factor analysis | Computational Statistics and Data Analysis | 1994 | 401
63 | Bayesian computing with INLA: New features | Computational Statistics and Data Analysis | 2013 | 400
64 | Fuzzy set theory | Wiley Interdisciplinary Reviews: Computational Statistics | 2010 | 387
65 | Bayesian spatial modeling of genetic population structure | Computational Statistics | 2008 | 377
66 | Maximum likelihood estimation in nonlinear mixed effects models | Computational Statistics and Data Analysis | 2005 | 374
67 | Relaxed Lasso | Computational Statistics and Data Analysis | 2007 | 372
68 | Genetic process mining: an experimental evaluation | Data Mining and Knowledge Discovery | 2007 | 372
69 | Ordinal, Continuous and Heterogeneous k-Anonymity Through Microaggregation | Data Mining and Knowledge Discovery | 2005 | 370
70 | Overdispersion: Models and estimation | Computational Statistics and Data Analysis | 1998 | 368
71 | Classification of time series by shapelet transformation | Data Mining and Knowledge Discovery | 2014 | 368
72 | An improved approximation to the precision of fixed effects from restricted maximum likelihood | Computational Statistics and Data Analysis | 2009 | 359
73 | ROCKET: exceptionally fast and accurate time series classification using random convolutional kernels | Data Mining and Knowledge Discovery | 2020 | 359
74 | Benchmark for filter methods for feature selection in high-dimensional classification data | Computational Statistics and Data Analysis | 2020 | 356
75 | maxLik: A package for maximum likelihood estimation in R | Computational Statistics | 2011 | 353
76 | FURIA: an algorithm for unordered fuzzy rule induction | Data Mining and Knowledge Discovery | 2009 | 351
77 | Time series classification with ensembles of elastic distance measures | Data Mining and Knowledge Discovery | 2015 | 349
78 | Community discovery using nonnegative matrix factorization | Data Mining and Knowledge Discovery | 2011 | 348
79 | A general class of zero-or-one inflated beta regression models | Computational Statistics and Data Analysis | 2012 | 347
80 | An extensive comparison of recent classification tools applied to microarray data | Computational Statistics and Data Analysis | 2005 | 344
81 | The Akaike information criterion: Background, derivation, properties, application, interpretation, and refinements | Wiley Interdisciplinary Reviews: Computational Statistics | 2019 | 344
82 | Possibility theory and statistical reasoning | Computational Statistics and Data Analysis | 2006 | 343
83 | The BOSS is concerned with time series classification in the presence of noise | Data Mining and Knowledge Discovery | 2015 | 340
84 | A note on the validity of cross-validation for evaluating autoregressive time series prediction | Computational Statistics and Data Analysis | 2018 | 329
85 | On-Line Unsupervised Outlier Detection Using Finite Mixtures with Discounting Learning Algorithms | Data Mining and Knowledge Discovery | 2004 | 323
86 | An anova test for functional data | Computational Statistics and Data Analysis | 2004 | 321
87 | Statistical analysis of financial networks | Computational Statistics and Data Analysis | 2005 | 321
88 | Multiple factor analysis: principal component analysis for multitable and multiblock data sets | Wiley Interdisciplinary Reviews: Computational Statistics | 2013 | 319
89 | A comparison of algorithms for fitting the PARAFAC model | Computational Statistics and Data Analysis | 2006 | 316
90 | Outlier identification in high dimensions | Computational Statistics and Data Analysis | 2008 | 313
91 | Model-based clustering of high-dimensional data: A review | Computational Statistics and Data Analysis | 2014 | 312
92 | Mining the customer credit using classification and regression tree and multivariate adaptive regression splines | Computational Statistics and Data Analysis | 2006 | 311
93 | Survey on mining subjective data on the web | Data Mining and Knowledge Discovery | 2012 | 307
94 | Open-source machine learning: R meets Weka | Computational Statistics | 2009 | 305
95 | Ridge regression | Wiley Interdisciplinary Reviews: Computational Statistics | 2009 | 305
96 | Adaptive proposal distribution for random walk Metropolis algorithm | Computational Statistics | 1999 | 304
97 | Mining Non-Redundant Association Rules | Data Mining and Knowledge Discovery | 2004 | 304
98 | Efficient Adaptive-Support Association Rule Mining for Recommender Systems | Data Mining and Knowledge Discovery | 2002 | 301
99 | (null) | Data Mining and Knowledge Discovery | 2001 | 294
100 | Analysis of Type-II progressively hybrid censored data | Computational Statistics and Data Analysis | 2006 | 292