How many abbreviations does a paper need?
The scientific literature is awash with abbreviations. Some of them are clearly necessary: if your paper is about α-N-acetylneuraminyl-2,3-β-galactosyl-1,3-N-acetyl-galactosaminide 6-α-sialyltransferase (EC 2.4.99.7) and you need to mention it fifteen times, no one will suggest that you call it by that name at every mention. In many contexts it will be sufficient if you just call it "the enzyme" or sometimes just "it", or, if the paper concerns other enzymes, "sialyltransferase". Otherwise you do need to define an abbreviation, preferably one with an intuitively obvious meaning.
How many times should an abbreviation be defined? The maximum number is one: defining it in a list at the beginning of the paper, again at first mention, and again later in the paper is two definitions too many. Of course, you can always make it worse by giving different definitions for the same abbreviation in different places, or, come to that, different abbreviations for the same definition (I have seen all of those).
Where should it be defined? Ideally it should be in a footnote at the beginning of the paper. It sounds reasonable to define it at first mention, say in the Introduction. But what about readers who go straight to the Discussion: how easily will they find the definition?
Which abbreviations should be defined? All of those that are not in the list of accepted abbreviations and symbols in the Instructions to Authors of the journal should be defined. Symbols recommended by the IUBMB, such as ATP, are accepted by all biochemistry journals and should be regarded as symbols, not as abbreviations. No biochemist will find it useful to see ATP defined in the paper as adenosine 5'-triphosphate, but if your paper is directed to a broader audience, such as philosophers or physicists, or if it is an educational paper intended to be read by students, then a definition illustrated with a structure is useful. To take an extreme case, a paper about energy consumption by tennis players should avoid "ATP" altogether, or at least make it clear that it does not refer to the Association of Tennis Professionals.
How much space do abbreviations save? Very little, in many cases: maybe a few lines in a typical biochemistry paper. I have seen papers submitted to journals in which some of the abbreviations increase the length of the paper, either because an abbreviation is defined but never used, or because it is defined more than once.
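Both faults mentioned here, an abbreviation defined but never used and one defined more than once, can be caught mechanically before submission. The following is only a rough sketch in Python: it assumes that every definition takes the form "full name (ABBR)", and the function name audit_abbreviations is invented for this illustration. A real manuscript will contain definitions that this simple pattern misses.

```python
import re
from collections import Counter


def audit_abbreviations(text):
    """Report abbreviations defined more than once or defined but never used.

    Assumption: a definition is a single capitalised token in parentheses,
    as in "lactate dehydrogenase (LDH)". Other styles are not detected.
    """
    # Count how often each candidate abbreviation appears in parentheses.
    defined = Counter(re.findall(r"\(([A-Z][A-Za-z0-9]{1,9})\)", text))
    report = {}
    for abbr, n_defs in defined.items():
        # Count every whole-word occurrence, then subtract the definitions.
        total = len(re.findall(r"\b" + re.escape(abbr) + r"\b", text))
        uses = total - n_defs
        if n_defs > 1:
            report[abbr] = f"defined {n_defs} times"
        elif uses == 0:
            report[abbr] = "defined but never used"
    return report


sample = (
    "Glycerol 3-phosphate dehydrogenase (GPDH) was purified. "
    "Lactate dehydrogenase (LDH) was added. LDH was assayed. "
    "Later, lactate dehydrogenase (LDH) was removed."
)
print(audit_abbreviations(sample))
```

On the sample text the sketch flags GPDH as defined but never used and LDH as defined twice.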
How do you choose a good abbreviation? Once you’ve decided that you really need an abbreviation, for example for α-N-acetylneuraminyl-2,3-β-galactosyl-1,3-N-acetyl-galactosaminide 6-α-sialyltransferase, you need to choose one that is reasonably evocative of the meaning. The commonly used term sialyltransferase satisfies that criterion, and if that is too long then sialase might be acceptable, but ANGAGAST suggests nothing (though it has the merit of being pronounceable). Something that should be avoided is to use an abbreviation that suggests something different: for example, the term "MoCo" is often used for the molybdenum cofactor found in some enzymes, but it is a bad term, because Mo is a standard chemical symbol used with its standard chemical meaning, whereas Co has nothing to do with cobalt.
But using abbreviations is convenient for writing a paper! Yes, but it is not the authors' convenience that needs to be considered, but the readers'. In any case, avoiding abbreviations in the paper does not mean avoiding them while writing it. It requires little work to write "GPDH" throughout, and then use the computer to find all instances of "GPDH" and convert them to "glycerol 3-phosphate dehydrogenase" (or whatever you want it to be). Alternatively, for abbreviations that are used a great deal, there are applications available for most computer systems to expand them automatically. For example, I virtually never write "dehydrogenase"; I write "dh", but readers never see that because it gets expanded immediately. TextExpander, which I use, gives an audible signal each time, which is useful for two reasons: if I expect to hear a signal but I don’t, then I have probably made an error, such as typing "dhh" instead of "dh"; alternatively, if I hear two signals when I expect none, it indicates a different sort of error, such as typing "kn ow" instead of "know", which expands to "kinetic of which".
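The find-and-convert step can also be done with a few lines of script. What follows is a minimal sketch in Python, not a description of how TextExpander or any word processor works: the table EXPANSIONS and the function expand_abbreviations are names invented for this illustration, and the two entries are taken from the examples above. Matching whole words only means that a mistyped "dhh" is left untouched, so it remains visible in the output instead of being silently half-expanded.

```python
import re

# Hypothetical drafting shorthand; the entries are examples only.
EXPANSIONS = {
    "GPDH": "glycerol 3-phosphate dehydrogenase",
    "dh": "dehydrogenase",
}


def expand_abbreviations(text, expansions=EXPANSIONS):
    """Replace each whole-word abbreviation with its full form."""
    # Try longer keys first so a long abbreviation is preferred over a
    # shorter one that could match at the same position.
    keys = sorted(expansions, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, keys)) + r")\b")
    return pattern.sub(lambda m: expansions[m.group(0)], text)


draft = "GPDH is a dh; dhh is left alone."
print(expand_abbreviations(draft))
# prints: glycerol 3-phosphate dehydrogenase is a dehydrogenase; dhh is left alone.
```

The match is case-sensitive, so "Dh" at the start of a sentence would not be expanded; whether that is desirable depends on the shorthand chosen.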
What about amino acid sequences? Here we are dealing with recommended symbols rather than ad hoc abbreviations, but similar ideas apply. The three-letter code is best avoided when writing about one amino acid at a time: in the middle of a sentence, for example, it is more readable to put "lysine" rather than "Lys". The three-letter code is fine for short sequences, however, and much easier for non-specialists than the one-letter code (who immediately thinks of lysine on seeing "K"? Sequencing specialists, certainly, but not everyone is a specialist). For sequences longer than a few residues the one-letter code may well be unavoidable. For a research paper that is all that needs to be said, but when writing for a general audience a table of equivalences is often worth including.
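A table of equivalences for the twenty standard amino acids is short enough to write out in full, and the same table can convert a short one-letter sequence into the more readable three-letter form. The sketch below is in Python; the names ONE_TO_THREE and to_three_letter are invented for this illustration, and the sketch deliberately covers only the twenty standard residues, so any other letter raises a KeyError.

```python
# One-letter to three-letter symbols for the twenty standard amino acids.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def to_three_letter(sequence):
    """Convert a one-letter sequence to hyphen-separated three-letter symbols."""
    # Unknown letters raise KeyError, which is preferable to guessing.
    return "-".join(ONE_TO_THREE[residue] for residue in sequence.upper())


print(to_three_letter("KDEL"))
# prints: Lys-Asp-Glu-Leu
```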
What about other sequences? Much the same applies to other polymer sequences, such as those of polysaccharides, and there it applies with even more force: whereas every biochemist is familiar with the amino acids, many are less familiar with some of the basic units of polysaccharides. On the other hand, gene sequences present little problem, because everyone knows what C, A, G, T and U are.
By now it will be evident that I am not keen on abbreviations, and I quote from the 4th edition of my book Fundamentals of Enzyme Kinetics: "There are no abbreviations in this book (other than in verbatim quotations and the index, which needs to include the entries readers expect to find)". No one has ever complained that the lack of abbreviations has made it too long and difficult to read. Sometimes abbreviations are unavoidable, but the guiding principle should always be to think of the convenience of readers and not of that of authors.