In recent years, evaluation of the quality of academic research has become an increasingly important and influential business. It determines, often to a large extent, the amount of research funding flowing from governmental agencies into universities and similar institutes, and it affects academic careers. Policy makers are growing more reliant upon, and more influenced by, the outcomes of such evaluations. In response, university managers are attracted to simple metrics as guides to how the positions of their institutions in league tables change over time. However, these league tables are invariably drawn up by inexpert bodies such as newspapers and magazines, using arbitrary measures and criteria. Terms such as "critical mass" and "h-index" are bandied about without a clear understanding of what they actually mean. Rather than accepting the rise and fall of universities, departments and individuals on a turbulent sea of arbitrary measures, we suggest it is incumbent upon the scientific community itself to clarify the nature of these measures. Here we report on recent attempts to do that by properly defining critical mass and showing how group size influences research quality. We also examine the currently predominant metrics and show that they fail as reliable indicators of group research quality.