| # | Title | Journal | Year | Citations |
|---|---|---|---|---|
1 | A high-performance, portable implementation of the MPI message passing interface standard | Parallel Computing | 1996 | 1,639 |
2 | Automated empirical optimizations of software and the ATLAS project | Parallel Computing | 2001 | 928 |
3 | The ganglia distributed monitoring system: design, implementation, and experience | Parallel Computing | 2004 | 903 |
4 | Hybrid scheduling for the parallel solution of linear systems | Parallel Computing | 2006 | 805 |
5 | Robust taboo search for the quadratic assignment problem | Parallel Computing | 1991 | 726 |
6 | Parallel reactive molecular dynamics: Numerical methods and algorithmic techniques | Parallel Computing | 2012 | 716 |
7 | Genetic algorithms and neural networks: optimizing connections and connectivity | Parallel Computing | 1990 | 622 |
8 | The parallel genetic algorithm as function optimizer | Parallel Computing | 1991 | 569 |
9 | Data management and transfer in high-performance computational grid environments | Parallel Computing | 2002 | 467 |
10 | PyCUDA and PyOpenCL: A scripting-based approach to GPU run-time code generation | Parallel Computing | 2012 | 396 |
11 | Graph partitioning models for parallel computing | Parallel Computing | 2000 | 371 |
12 | A dynamic model and parallel tabu search heuristic for real-time ambulance relocation | Parallel Computing | 2001 | 360 |
13 | Evolution algorithms in combinatorial optimization | Parallel Computing | 1988 | 352 |
14 | A class of parallel tiled linear algebra algorithms for multicore architectures | Parallel Computing | 2009 | 327 |
15 | Swift: A language for distributed parallel scripting | Parallel Computing | 2011 | 319 |
16 | Parallel algorithms for hierarchical clustering | Parallel Computing | 1995 | 314 |
17 | Bringing skeletons out of the closet: a pragmatic manifesto for skeletal parallel programming | Parallel Computing | 2004 | 311 |
18 | Particle Swarm based Data Mining Algorithms for classification tasks | Parallel Computing | 2004 | 309 |
19 | Towards dense linear algebra for hybrid GPU accelerated manycore systems | Parallel Computing | 2010 | 295 |
20 | SUPERB: A tool for semi-automatic MIMD/SIMD parallelization | Parallel Computing | 1988 | 290 |
21 | Optimization of sparse matrix–vector multiplication on emerging multicore platforms | Parallel Computing | 2009 | 276 |
22 | PT-Scotch: A tool for efficient parallel graph ordering | Parallel Computing | 2008 | 271 |
23 | Parallel recombinative simulated annealing: A genetic algorithm | Parallel Computing | 1995 | 261 |
24 | Symmetry in interconnection networks based on Cayley graphs of permutation groups: A survey | Parallel Computing | 1993 | 260 |
25 | Extensible component-based architecture for FLASH, a massively parallel, multiphysics simulation code | Parallel Computing | 2009 | 219 |
26 | BSPlib: The BSP programming library | Parallel Computing | 1998 | 218 |
27 | The PVM concurrent computing system: Evolution, experiences, and trends | Parallel Computing | 1994 | 207 |
28 | The communication challenge for MPP: Intel Paragon and Meiko CS-2 | Parallel Computing | 1994 | 201 |
29 | From CUDA to OpenCL: Towards a performance-portable solution for multi-platform GPU programming | Parallel Computing | 2012 | 198 |
30 | A hybrid MPI–OpenMP scheme for scalable parallel pseudospectral computations for fluid turbulence | Parallel Computing | 2011 | 196 |
31 | DAGuE: A generic distributed DAG engine for High Performance Computing | Parallel Computing | 2012 | 196 |
32 | A hybrid multi-objective Particle Swarm Optimization for scientific workflow scheduling | Parallel Computing | 2017 | 194 |
33 | Parallel implementation of the TRANSIMS micro-simulation | Parallel Computing | 2001 | 193 |
34 | Parallel Tabu search heuristics for the dynamic multi-vehicle dial-a-ride problem | Parallel Computing | 2004 | 192 |
35 | Parallel GRASP with path-relinking for job shop scheduling | Parallel Computing | 2003 | 191 |
36 | Matrix algorithms on a hypercube I: Matrix multiplication | Parallel Computing | 1987 | 181 |
37 | Multiprocessor FFTs | Parallel Computing | 1987 | 179 |
38 | Component averaging: An efficient iterative parallel algorithm for large and sparse unstructured problems | Parallel Computing | 2001 | 175 |
39 | A parallel tabu search algorithm for solving the container loading problem | Parallel Computing | 2003 | 172 |
40 | FFT algorithms for vector computers | Parallel Computing | 1984 | 166 |
41 | MapReduce in MPI for Large-scale graph algorithms | Parallel Computing | 2011 | 162 |
42 | Distributed processing of very large datasets with DataCutter | Parallel Computing | 2001 | 161 |
43 | PaStiX: a high-performance parallel direct solver for sparse symmetric positive definite systems | Parallel Computing | 2002 | 160 |
44 | High performance computing using MPI and OpenMP on multi-core parallel systems | Parallel Computing | 2011 | 151 |
45 | Computational aspects of a code to study rotating turbulent convection in spherical shells | Parallel Computing | 1999 | 149 |
46 | Parallel solution of partial symmetric eigenvalue problems from electronic structure calculations | Parallel Computing | 2011 | 147 |
47 | New advances in chemistry and materials science with CPMD and parallel computing | Parallel Computing | 2000 | 146 |
48 | Sparse matrix multiplication: The distributed block-compressed sparse row library | Parallel Computing | 2014 | 143 |
49 | Cost-efficient task scheduling for executing large programs in the cloud | Parallel Computing | 2013 | 141 |
50 | Monitors, messages, and clusters: The p4 parallel programming system | Parallel Computing | 1994 | 139 |