Load balancing in reducers for skewed data in MapReduce systems by using scalable simple random sampling
Distribution of the number of citations over years.