With single-core speeds no longer rising, the large supercomputers of the future will feature increasing levels of parallelism. Applications will have to deal with this reality in order to scale effectively to these machines. In this talk, we focus on how to scale algebraic multigrid to future machines. The big challenge is an increasing amount of communication on coarse grids. We overview the body of past work on the subject, and present performance models we are using to guide our work to avoid difficulties that the past work faced. We then present results for two approaches, using hybrid distributed memory/shared memory programming and redundantly distributing data across otherwise inactive processors on coarse grids.