Algebraic multigrid (AMG) solvers are an essential component of many large-scale scientific applications. However, the increasing amount of communication and data movement on coarser levels presents significant performance challenges. In this talk, we discuss several methods for improving AMG performance: reducing the number of coarse levels through aggressive coarsening; improving the locality of communicating processors through repartitioning and careful mapping of subdomains to processors; and reducing data movement by gathering data and storing it redundantly. The results demonstrate substantial speedups on a variety of test problems.