The Chemist's English

The following is a summary of some of the main points made in Robert W. Schoenfeld's delightful book, "The Chemist's English" (Wiley-VCH, Weinheim, Germany, 1990; ISBN 3-527-28003-0 or 0-89573-946-1). [It is presently out of print but Wiley-VCH has announced the 3rd revised edition for 2001.]

If you write a lot of scientific papers and want to improve on their presentation, this volume is worth every penny.

I have interspersed the wisdom of Prof. Schoenfeld (z"l) in some places with comments and/or examples of my own.

The numbering of the headers refers to the relevant chapters in Prof. Schoenfeld's book.

0: The Queen's or the President's English?

[Gershom:] Ignoring regional variants, there are two main standards for written English: Oxford or British English (as defined by the Oxford English Dictionary) and American English (as defined by the Merriam-Webster dictionary). Neither English is "wrong": the main thing is that, if you happen to prefer British spelling (defence vs. defense, colour vs. color, synthesise vs. synthesize), usage (pavement vs. sidewalk, solicitor vs. lawyer), and punctuation ("Smith, Jones and Brown, Solicitors" vs. "Cohen, Levy, and Friedman, Attorneys at Law"), you should consistently stick with it.

Interestingly, while Prof. Schoenfeld worked in an "Oxford English" environment, he offers persuasive arguments in Section 15.3 as to why the "American" endings "-ize" and "-or" are etymologically more correct than their "Oxford" counterparts "-ise" and "-our". [Gershom: Personally, like most people who grew up on the Continent, I was taught Oxford English in school. (My mother tongue is Dutch.) Since however most of my reading for business or pleasure has been in English since high school, and the vast majority of these books/magazines were originally printed in the US, at some point the American spellings got burned into my EPROMs forever :-).]

This aside, it has been my experience that British readers tend to be more tolerant of complex sentence structures and arcane vocabulary (if used correctly) than their American counterparts. By the same token, informal and "slangy" writing is much more accepted among the President's fellow citizens than among the Queen's subjects. The behavior of copy editors tends to reflect these "national" preferences.

2: The search for the missing ablative

English, unlike Latin, has no "ablative" case, and hence a plethora of prepositions take its place.

For Chemist's English there exists a rough rule of thumb: "with" is used for simple instruments or techniques, "by" for more complex ones, and "by means of" for yet more elaborate and/or abstruse ones. (Crystals are separated with a spatula, liquids by distillation, and vapors by means of the Finkelstein-Horowitz 969 Kromo-Graf apparatus.) "Through" is in principle acceptable, but has fallen into disuse. Phrases like "was analyzed using mass spectrometry" are not the most grammatically correct English (since "using" is a participle, it should be attached to a noun) but are fairly common in scientific English. (Note that the same sentence in the active voice, "we analyzed ... using mass spectrometry", is quite correct, since "using" is attached to "we" there.)

3: Arguing with authority

Many chemists have a tendency to misuse intransitive verbs as transitive ones. For instance, "the compound was reacted with diazomethane" is sloppy English, since "to react" is an intransitive verb.

"Due to its low solubility, the NMR spectrum of the compound in heavy water could not be measured" should, strictly speaking, read "Owing to", since "due" is an unattached participle. However, in present-day English usage, "due to" and "owing to" can be used interchangeably.

4: Defying the dictionary

In standard written English, "partly" and "partially" mean the same thing. Not so in the Chemist's English: a "partially hydrogenated product" is an intermediate compound in a hydrogenation process (e.g. butene from butadiene), while a "partly hydrogenated product" contains unchanged starting material. As a rule of thumb, the chemical antonym of "partially" is "completely", while that of "partly" is "wholly".

5: To reflux or not to reflux, that is the question

The boundary between nouns and verbs in Elizabethan English is almost as vague as that in Hebrew. "Shakespeare prided himself on his ability to take advantage thereof; in fact he gloried in it; he showered the language with new words coined with a felicity that beggars description." In modern English, overuse of this device by headline writers has given it an undertone of brusque and coarse action ("he gunned, knifed, and bludgeoned the hapless victims he had lined up"). Schoenfeld offers the following simple rules of thumb:

6: English scientists secretly practice German vice!

German (and Dutch) are notorious for their long "sausage words": "Hauptwortkombinationenzusammenstellungsbedürfnis" is best translated as "compound noun assembly mania". (The longest entry in the authoritative "Van Dale Dictionary of the Dutch language" is "wapenstilstandsonderhandelingen" [armistice negotiations].) But "proton magnetic resonance spectroscopy literature survey" or "cyclic ligand planar nitrogen array" only seem less unwieldy because of the odd space thrown in. (At least in Dutch and German, the building blocks all have to be nouns.)

Nevertheless, this type of linguistic "peptide bond" is both common and indispensable in Chemical English as a device for expressing a subtle relationship. Examples: "protein crystallography", "trial run", "hydrolysis experiment". Even in everyday English, one has come to depend on it, cf. "vice squad investigates call girl racket".

As a rule, however, lengthy "peptide chains" like "ring junction carbon environment differences" should be hydrolyzed, e.g. to "differences in the environment of the ring junction carbons" or "differences in the environment of the carbons at the ring junctions". Two exceptions: (a) names of chemical compounds can always be considered as a single "amino acid", e.g. "lithium aluminium hydride reduction" is not an obnoxious 'tetrapeptide' but a quite acceptable 'dipeptide'; (b) very long words or chemical names should not be incorporated in a chain.

7: Of nuts, muttons, and shotguns

The hyphen has some "big brothers" that most people outside the printing trade are unaware of:

What about hyphens in compound nouns? Often this is considered to be the intermediate stage in compound noun formation, i.e., from "wave function" via "wave-function" to "wavefunction".

Schoenfeld however rightly points out that "compound nouns" have never become "compound-nouns", let alone "compoundnouns". He rather argues that "in the marriage of two nouns, the hyphen is the shotgun". [Note for non-native speakers: a "shotgun marriage" is the colloquial English term for a situation in which someone makes a girl pregnant and is then forced by her father — sometimes literally at gunpoint — to marry her.] In other words: the hyphen is used only if the pair at first sight seems ill-matched. Once the link between the two words has become familiar, the hyphen can be dispensed with: whether or not the two parts coalesce (e.g. bookcase) or stay apart (e.g. filter paper) largely depends on frequency of usage and attractiveness.

In three-word groupings, if even a remote ambiguity exists (e.g. "complex ion mechanism"), add a hyphen where necessary (e.g. "near-ultraviolet spectrum").

8: Tetravalency of carbon disproved

A great many scientific terms derive from Latin or Greek. Common sense suggests that one should not mix the two up in a single term: e.g., write "multilingual" (Latin) or "polyglot" (Greek), but not "polylingual"; likewise, write "unimolecular" and not "monomolecular" reactions.

Exception 1: in chemical nomenclature (i.e. the naming of chemical compounds), Greek numerical prefixes (mono-, di-, tri-, tetra-, ..., oligo-, poly-) are always preferred over their Latin counterparts (uni-, bi-, tri-, quadri-, ..., multi-).

Exception 2: while "tetravalent" and "pentavalent" are, strictly speaking, sha'atnez [mixed wool and linen, forbidden in Jewish law], the union of Greek numerical prefixes with the Latin-derived "-valent" has become accepted usage.

9: This chapter explains...

Editors used to frown on sentences like "Table 2 lists..." or "Figure 3 displays..." because they contain anthropomorphisms, i.e., instances of ascribing human thought or behavior to inanimate objects. However, anthropomorphization is a much more common figure of speech than most people realize: how about "the car drove past" (rather than "was driven past"), "the ship sailed past" (rather than "was steered past"), or "Israel rejects UN resolution as one-sided" (rather than "the government of Israel rejects...")? In fact, the Jerusalem Post routinely carries anthropomorphism to the point where Israel is referred to as "she". [Gershom: And I thought that only in Hebrew somebody's "fatherland" could be female.]

But all good things can be carried to ridiculous excess. Gas chromatograms cannot "proclaim" the purity of a compound; nor can a curve "militate" against a particular interpretation. If (for the sake of color) there is any arguing, militating, agreeing, or "belie"-ing to be done, it should be left to the data, results, curve shapes, rates of increase and so on, rather than to the Figures or the Tables that contain them. Or perhaps the authors should come onstage and bravely say "On the basis of the data in Figure xyz we argue..."

"Detached participles" are participles (verbs turned into adjectives) which do not have a noun to which they belong. E.g. "The compound was difficult to crystallize, resulting in considerable loss of material", or "The compound turned yellow, suggesting that auto-oxidation took place". The latter example is immediately recognized as an anthropomorphism: while compounds may accept hydrogen atoms or expel azo groups, clearly they cannot make suggestions to the author. An elegant solution: "A color change to yellow occurred, suggesting that auto-oxidation took place". The color change is in the eye of the beholder.

10: The painful plight of the pendant participle

This refers to the German-style Subordinate Clause, wherein the subject of the verb by a vast stream of verbiage, which even a further sub-sub-clause include may, separated is. [Gershom: a sentence like that is perfectly acceptable in a strongly inflected language like German or Latin, but unreadable in a weakly inflected language like English.]

15: The chemist and the capercailzie

A number of chemical terms have entered the general vocabulary, e.g. catalyst, throughput, quantum leap, and interface. As such words cross the 'interface' between Chemical English and Standard English, they often undergo more or less subtle changes in meaning, and a similar process can be observed in the other direction.

For instance, it is common to speak of a 'facile synthesis' (one which is easy to carry out) or a 'facile reaction' (one with a low barrier height). In Standard English, however, 'facile' has a derogatory undertone: e.g. a 'facile victory' is a hollow victory, a 'facile argument' is one which sounds good at first sight but would not survive in-depth analysis, and a musician with a 'facile' style would be a superficial crowd-pleaser without any real substance.

Also, while 'spectral' would mean 'ghostly' in Standard English, it is commonly understood as 'pertaining to spectra' in Chemical English. But please do not use 'moiety' (a corruption of the French moitié, meaning 'one-half') for a small part of a molecule!

16: That fellow acronym he all time make trouble

Chemical nomenclature and Pidgin English have the peculiar similarity that they both have transitory vocabularies built up of a very small number of 'morphemes'. (Morphemes are to the linguist what atoms are to the chemist.) In Chemical Pidgin, a 'table' would be referred to as 'rectangle-r-1,c-2,c-3,c-4-tetrastick' (r stands for reference, c for cis), which one is greatly tempted to abbreviate to RATS.

RATS of this type (acronyms, or RAshei Teivot in Hebrew) infest scientific English. As "pest control" measures, Prof. Schoenfeld suggests that their creation be limited to situations where:

  1. the acronym really abbreviates. E.g., ALP for Australian Labor Party has no merit over "Labor", but "laser" has obvious merits over Light Amplification by Stimulated Emission of Radiation.
  2. it is used a reasonable number of times. There is no point creating an acronym for something you only use twice in a long article.
[Gershom: the 'acronym pest' is particularly rampant in my own field (computational quantum chemistry), where it is virtually impossible to write a paper without using some 20–30 acronyms for basis sets, exchange-correlation functionals, electron correlation methods... Some journals (like the Journal of Chemical Physics) have a strict policy that every nontrivial acronym must be defined upon first use in a paper: while I find it as annoying as the next guy to explain B3LYP (Becke 3-parameter hybrid exchange with Lee-Yang-Parr correlation) and CCSD(T) (coupled cluster with all single and double substitutions and quasiperturbative connected triple excitations) every time I sit down to write a paper, it is a minor annoyance that may save a reader a major headache.

In this case, however, complete avoidance of acronyms is IMHO (in my humble opinion) contraindicated, since most computational chemists will recognize acronyms like LDA and MBPT-2 a good deal faster than 'local density approximation' and 'second-order many-body perturbation theory', respectively. Also, acronyms make for nice search strings in databases of abstracts, so be sure to include them in your abstract.]
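The second criterion above (an acronym should be used a reasonable number of times) lends itself to a quick mechanical check. What follows is only a minimal sketch in Python and not something taken from Prof. Schoenfeld's book: the function name rarely_used_acronyms, the threshold of three uses, and the crude definition of an acronym as a run of three or more capital letters are all assumptions made for illustration. In particular, mixed forms such as B3LYP would slip through this pattern.

```python
import re
from collections import Counter

# Assumption for illustration: an "acronym" is any standalone run of
# three or more capital letters. Forms like B3LYP are not matched.
ACRONYM = re.compile(r"\b[A-Z]{3,}\b")


def rarely_used_acronyms(text, min_uses=3):
    """Return, in sorted order, acronyms that occur fewer than min_uses times."""
    counts = Counter(ACRONYM.findall(text))
    return sorted(a for a, n in counts.items() if n < min_uses)


sample = (
    "The local density approximation (LDA) is cheap. LDA overbinds. "
    "LDA geometries are fair. MBPT results were not computed."
)
print(rarely_used_acronyms(sample))  # LDA occurs three times, MBPT only once
```

Any acronym the check reports is a candidate for being spelled out in full; whether it "really abbreviates" (the first criterion) remains a judgment call for the author.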

17: On the divisibility of earth/worms

Prof. Schoenfeld proposes the following hyphenation rules for Chemical English:
  1. All "common-usage" words are hyphenated according to the Merriam-Webster dictionary. [Gershom: in practice this means nowadays "accept the suggestion of your word processor's auto-hyphenation feature".] The 'rules of thumb' are that
    1. Where morphemes are joined together in such a way that the pronunciation of each morpheme is not changed greatly, the morpheme boundary is re-spect-ed.
    2. In other cases, pronunciation prevails. Thus pre-vailing becomes prev-alent and re-fer becomes ref-erence.
  2. In words that are clearly Chemical English, one always hyphenates on the morpheme boundary, i.e. meaning always prevails over pronunciation.

18: Instant stylistics

Prof. Schoenfeld's definition of a "good paper" is one that is written so as to yield up its information to the reader in the shortest possible reading time.

According to this criterion, the quality of a paper as a function of its information density goes through a maximum between the twin extremes of "rocks of terseness" and "swamps of verbosity".

In the ideal world, everybody reading your paper will be eager to know what you have to say; in the real world, the reader has a low boredom threshold, and is just looking for an excuse to put down your paper unread. In the theater it is said "In the first 10 minutes, somebody must kick the cat", and a rabbi I knew used to say about sermons: "If you haven't struck gold in the first 20 minutes, stop boring." In other words: somewhere in the introduction, kick the cat as hard as you can to get the attention of the reader. (If you have no cats to kick, i.e., no important points to make, perhaps the paper should not be written at all.)

20: Driveliferous Jargonogenesis

The following four "grievous sins" should be avoided like the plague:

"Million dollar words" can truly be worth a million dollars, if used sparingly and in the right context. Used out of turn, they sink to the value of one-billion Zaire notes in the waning years of Mobutu.

Rabbi Hillel the Elder is quoted in the Mishna (part of the Talmud) as saying "Do not make an incomprehensible statement in the hope that it will eventually be understood." (Pirkei Avot 2:4) Every scientific writer should frame this bit of wisdom and hang it above his/her desk.

21: Brevity = Soul of wit?

One can however fall into the other extreme, and write prose so terse and dense that it becomes impenetrable to the average reader. In particular, avoid the following discourtesies towards the reader:

24: Is you is or is you ain't my data?

In English, plurals of foreign words generally follow the original language. Examples:

Singular        Plural           (From:)
lingua franca   lingue franche   (Italian)
kibbutz         kibbutzim        (Hebrew)
beduin          beduin           (Arabic)
spectrum        spectra          (Latin)
datum           data             (Latin)
appendix        appendices       (Latin)
vertex          vertices         (Latin)
matrix          matrices         (Latin)
phenomenon      phenomena        (Greek)
criterion       criteria         (Greek)

The fact that some people do not know that "data" has a singular is their problem. In the case of the "singularized plural" agenda however, the singular agendum never entered the English language, and the synthetic plural "agendas" (hidden or otherwise) is occasionally encountered.

Some borrowings rejoice in two plurals with distinct meanings: "indices" (mathematical) and "indexes" (all other senses) are both correct plural forms of "index". "Formulae" and "formulas" are both acceptable, although only the former is etymologically correct.

Yet other borrowings never use the original plural: e.g. one writes protons, neutrons, leptons, ... but not prota, neutra, lepta,...

Some borrowings from German have acquired English endings, notably "zwitterions" (not Zwitterionen), eigenvalues, eigenfunctions, and eigenvectors (instead of Eigenwerte, Eigenfunktionen, and Eigenvektoren, respectively).

[Gershom: Prof. Max Holthausen wrote in with a nice example. The German borrowing "ansatz" should strictly speaking be capitalized (Ansatz) --- the way nouns are in the German language --- and its proper German plural is Ansätze or [equivalently] Ansaetze. Yet many English-speaking authors use the anglicized form "ansatzes". An intermediate solution would be to use the German plural but drop the capitalization (which violates English grammar), i.e., "ansätze" or "ansaetze". For those curious about such matters, "ä" and "ae" are equivalent in German, as are "ü" and "ue", as well as "ö" and "oe".]