# | Title | Journal | Year | Citations |
---|---|---|---|---|
1 | Long Short-Term Memory | Neural Computation | 1997 | 58,553 |
2 | Support-vector networks | Machine Learning | 1995 | 38,107 |
3 | Bagging predictors | Machine Learning | 1996 | 18,759 |
4 | A Fast Learning Algorithm for Deep Belief Nets | Neural Computation | 2006 | 12,682 |
5 | Induction of decision trees | Machine Learning | 1986 | 12,034 |
6 | Q-learning | Machine Learning | 1992 | 8,093 |
7 | An Information-Maximization Approach to Blind Separation and Blind Deconvolution | Neural Computation | 1995 | 7,791 |
8 | Backpropagation Applied to Handwritten Zip Code Recognition | Neural Computation | 1989 | 7,701 |
9 | Gene Selection for Cancer Classification using Support Vector Machines | Machine Learning | 2002 | 7,229 |
10 | Induction of Decision Trees | Machine Learning | 1986 | 6,920 |
11 | Support-Vector Networks | Machine Learning | 1995 | 6,650 |
12 | Nonlinear Component Analysis as a Kernel Eigenvalue Problem | Neural Computation | 1998 | 6,336 |
13 | Laplacian Eigenmaps for Dimensionality Reduction and Data Representation | Neural Computation | 2003 | 5,873 |
14 | Extremely randomized trees | Machine Learning | 2006 | 4,796 |
15 | Estimating the Support of a High-Dimensional Distribution | Neural Computation | 2001 | 4,068 |
16 | Fast Learning in Networks of Locally-Tuned Processing Units | Neural Computation | 1989 | 3,752 |
17 | Bayesian Network Classifiers | Machine Learning | 1997 | 3,662 |
18 | Bayesian Interpolation | Neural Computation | 1992 | 3,639 |
19 | Bagging Predictors | Machine Learning | 1996 | 3,456 |
20 | Universal Approximation Using Radial-Basis-Function Networks | Neural Computation | 1991 | 3,401 |
21 | Finite-time Analysis of the Multiarmed Bandit Problem | Machine Learning | 2002 | 3,350 |
22 | Simple statistical gradient-following algorithms for connectionist reinforcement learning | Machine Learning | 1992 | 3,328 |
23 | Training Products of Experts by Minimizing Contrastive Divergence | Neural Computation | 2002 | 3,295 |
24 | Learning to Forget: Continual Prediction with LSTM | Neural Computation | 2000 | 3,293 |
25 | Instance-based learning algorithms | Machine Learning | 1991 | 3,271 |
26 | Learning to predict by the methods of temporal differences | Machine Learning | 1988 | 3,207 |
27 | Adaptive Mixtures of Local Experts | Neural Computation | 1991 | 3,109 |
28 | A Learning Algorithm for Continually Running Fully Recurrent Neural Networks | Neural Computation | 1989 | 3,097 |
29 | The strength of weak learnability | Machine Learning | 1990 | 2,922 |
30 | Real-Time Computing Without Stable States: A New Framework for Neural Computation Based on Perturbations | Neural Computation | 2002 | 2,887 |
31 | Neural Networks and the Bias/Variance Dilemma | Neural Computation | 1992 | 2,832 |
32 | A Fast Fixed-Point Algorithm for Independent Component Analysis | Neural Computation | 1997 | 2,828 |
33 | Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms | Neural Computation | 1998 | 2,651 |
34 | Technical Note: Q-Learning | Machine Learning | 1992 | 2,549 |
35 | A Bayesian method for the induction of probabilistic networks from data | Machine Learning | 1992 | 2,512 |
36 | Support Vector Data Description | Machine Learning | 2004 | 2,482 |
37 | Canonical Correlation Analysis: An Overview with Application to Learning Methods | Neural Computation | 2004 | 2,353 |
38 | Genetic Algorithms and Machine Learning | Machine Learning | 1988 | 2,316 |
39 | Theoretical and Empirical Analysis of ReliefF and RReliefF | Machine Learning | 2003 | 2,316 |
40 | The NEURON Simulation Environment | Neural Computation | 1997 | 2,305 |
41 | Learning Bayesian networks: The combination of knowledge and statistical data | Machine Learning | 1995 | 2,244 |
42 | Deep Convolutional Neural Networks for Image Classification: A Comprehensive Review | Neural Computation | 2017 | 2,237 |
43 | New Support Vector Algorithms | Neural Computation | 2000 | 2,216 |
44 | The Diffusion Decision Model: Theory and Data for Two-Choice Decision Tasks | Neural Computation | 2008 | 2,126 |
45 | A Practical Bayesian Framework for Backpropagation Networks | Neural Computation | 1992 | 2,072 |
46 | Instance-Based Learning Algorithms | Machine Learning | 1991 | 2,061 |
47 | Text Classification from Labeled and Unlabeled Documents using EM | Machine Learning | 2000 | 2,050 |
48 | A Review of Recurrent Neural Networks: LSTM Cells and Network Architectures | Neural Computation | 2019 | 1,983 |
49 | Hierarchical Mixtures of Experts and the EM Algorithm | Neural Computation | 1994 | 1,982 |
50 | Dragonfly algorithm: a new meta-heuristic optimization technique for solving single-objective, discrete, and multi-objective problems | Neural Computing and Applications | 2016 | 1,937 |
51 | An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants | Machine Learning | 1999 | 1,936 |
52 | Natural Gradient Works Efficiently in Learning | Neural Computation | 1998 | 1,919 |
53 | Multi-Verse Optimizer: a nature-inspired algorithm for global optimization | Neural Computing and Applications | 2016 | 1,910 |
54 | An Introduction to Variational Methods for Graphical Models | Machine Learning | 1999 | 1,889 |
55 | Improved Boosting Algorithms Using Confidence-rated Predictions | Machine Learning | 1999 | 1,885 |
56 | Unsupervised Learning by Probabilistic Latent Semantic Analysis | Machine Learning | 2001 | 1,884 |
57 | Unsupervised Spike Detection and Sorting with Wavelets and Superparamagnetic Clustering | Neural Computation | 2004 | 1,883 |
58 | A theory of learning from different domains | Machine Learning | 2010 | 1,852 |
59 | Measures of Diversity in Classifier Ensembles and Their Relationship with the Ensemble Accuracy | Machine Learning | 2003 | 1,814 |
60 | Choosing Multiple Parameters for Support Vector Machines | Machine Learning | 2002 | 1,746 |
61 | Markov logic networks | Machine Learning | 2006 | 1,700 |
62 | BoosTexter: A Boosting-based System for Text Categorization | Machine Learning | 2000 | 1,674 |
63 | A Bayesian Method for the Induction of Probabilistic Networks from Data | Machine Learning | 1992 | 1,649 |
64 | An Introduction to MCMC for Machine Learning | Machine Learning | 2003 | 1,641 |
65 | The CN2 induction algorithm | Machine Learning | 1989 | 1,627 |
66 | Independent Component Analysis Using an Extended Infomax Algorithm for Mixed Subgaussian and Supergaussian Sources | Neural Computation | 1999 | 1,614 |
67 | (null) | Machine Learning | 2003 | 1,613 |
68 | The Strength of Weak Learnability | Machine Learning | 1990 | 1,527 |
69 | Generalized Discriminant Analysis Using a Kernel Approach | Neural Computation | 2000 | 1,489 |
70 | Mixtures of Probabilistic Principal Component Analyzers | Neural Computation | 1999 | 1,485 |
71 | Classifier chains for multi-label classification | Machine Learning | 2011 | 1,483 |
72 | Knowledge acquisition via incremental conceptual clustering | Machine Learning | 1987 | 1,423 |
73 | Very Simple Classification Rules Perform Well on Most Commonly Used Datasets | Machine Learning | 1993 | 1,417 |
74 | SPADE: An Efficient Algorithm for Mining Frequent Sequences | Machine Learning | 2001 | 1,411 |
75 | Improvements to Platt's SMO Algorithm for SVM Classifier Design | Neural Computation | 2001 | 1,410 |
76 | Projected Gradient Methods for Nonnegative Matrix Factorization | Neural Computation | 2007 | 1,399 |
77 | Learning in the presence of concept drift and hidden contexts | Machine Learning | 1996 | 1,364 |
78 | Genetic algorithms and Machine Learning | Machine Learning | 1988 | 1,357 |
79 | Asymptotic Behaviors of Support Vector Machines with Gaussian Kernel | Neural Computation | 2003 | 1,354 |
80 | What Size Net Gives Valid Generalization? | Neural Computation | 1989 | 1,317 |
81 | A Resource-Allocating Network for Function Interpolation | Neural Computation | 1991 | 1,248 |
82 | Queries and concept learning | Machine Learning | 1988 | 1,230 |
83 | A survey on semi-supervised learning | Machine Learning | 2020 | 1,224 |
84 | The Lack of A Priori Distinctions Between Learning Algorithms | Neural Computation | 1996 | 1,179 |
85 | Stacked regressions | Machine Learning | 1996 | 1,152 |
86 | Learning logical definitions from relations | Machine Learning | 1990 | 1,151 |
87 | The max-min hill-climbing Bayesian network structure learning algorithm | Machine Learning | 2006 | 1,145 |
88 | Dynamical Movement Primitives: Learning Attractor Models for Motor Behaviors | Neural Computation | 2013 | 1,128 |
89 | Regularization Theory and Neural Networks Architectures | Neural Computation | 1995 | 1,105 |
90 | Feature Linking via Synchronization among Distributed Assemblies: Simulations of Results from Cat Visual Cortex | Neural Computation | 1990 | 1,059 |
91 | GTM: The Generative Topographic Mapping | Neural Computation | 1998 | 1,043 |
92 | What Is the Goal of Sensory Coding? | Neural Computation | 1994 | 1,039 |
93 | Soft Margins for AdaBoost | Machine Learning | 2001 | 1,000 |
94 | The Helmholtz Machine | Neural Computation | 1995 | 990 |
95 | Machine Learning for the Detection of Oil Spills in Satellite Radar Images | Machine Learning | 1998 | 986 |
96 | High-Order Contrasts for Independent Component Analysis | Neural Computation | 1999 | 984 |
97 | Reduction Techniques for Instance-Based Learning Algorithms | Machine Learning | 2000 | 983 |
98 | First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method | Neural Computation | 1992 | 981 |
99 | Logistic Model Trees | Machine Learning | 2005 | 981 |
100 | Learning to Predict by the Methods of Temporal Differences | Machine Learning | 1988 | 964 |