The Chemist's English
Page Under construction
The following is a summary of some of the main points made in Robert W.
Schoenfeld's
delightful book, "The Chemist's English" (Wiley-VCH, Weinheim, Germany,
1990; ISBN 3-527-28003-0
or 0-89573-946-1). [It is presently out of print but
Wiley-VCH has announced the 3rd revised edition for 2001.]
If you write a lot of scientific papers and want to improve on their
presentation, this volume is worth every penny.
I have interspersed the wisdom of Prof. Schoenfeld (z"l) in some
places with
comments and/or examples of my own.
The numbering of the headers refers to the relevant chapters in
Prof. Schoenfeld's book.
0: The Queen's or the President's English?
[Gershom:] Ignoring regional variants, there are two main standards for
written English: Oxford or British English (as defined by the Oxford
English Dictionary)
and American English (as defined by the Merriam-Webster dictionary).
Neither English is "wrong": the main thing is that, if you happen to
prefer British
spelling (defence vs. defense, colour vs.
color, synthesise vs. synthesize),
usage (pavement vs. sidewalk, solicitor vs. lawyer), and
punctuation ("Smith, Jones and Brown, Solicitors" vs. "Cohen,
Levy, and Friedman, Attorneys at Law"),
you should consistently stick with it.
Interestingly, while Prof. Schoenfeld worked in an "Oxford English"
environment,
he offers persuasive arguments in Section 15.3 as to why the "American"
endings
"-ize" and "-or" are etymologically more correct than their "Oxford"
counterparts "-ise" and "-our". [Gershom:
Personally, like most people who grew up on the Continent, I was taught
Oxford
English in school. (My mother tongue is Dutch.) Since however most of
my reading for business or pleasure has been in English
since high school, and the vast majority of these books/magazines were
originally
printed in the US, at some point the American spellings got burned into
my EPROMs forever :-).]
This aside, it has been my experience that British readers tend
to be more tolerant of complex sentence structures and arcane
vocabulary (if
used correctly) than their American counterparts. By the same token,
informal
and "slangy" writing is much more accepted among the President's fellow
citizens
than among the Queen's subjects. The behavior of copy editors tends to
reflect these
"national" preferences.
2: The search for the missing ablative
English, unlike Latin, has no "ablative" case, and hence a plethora of
prepositions
take its place.
For Chemist's English there exists a rough rule of thumb: "with"
is used
for simple instruments or techniques, "by" for more complex ones,
and "by means of" for yet more elaborate and/or abstruse ones.
(Crystals are separated with
a spatula, liquids by distillation, and vapors by means of the
Finkelstein-Horowitz
969 Kromo-Graf apparatus.) "Through" is in principle acceptable, but
has fallen into
disuse. Phrases like "was analyzed using mass spectrometry" are not the
most grammatically
correct English (since "using" is a participle, it should be attached
to a noun) but are
fairly common in scientific English. (Note that the same sentence in
the active
voice, "we analyzed ... using mass spectrometry", is quite correct
since "using"
is attached to "we" there.)
3: Arguing with authority
Many chemists have a tendency to misuse intransitive verbs as
transitive ones. For instance,
"the compound was reacted with diazomethane" is sloppy English, since
"to react" is an intransitive verb.
"Due to its low solubility, the NMR spectrum of the compound in
heavy water could not
be measured" should, strictly speaking, read "Owing to", since "due" is
an adjective and here has no noun to attach to.
However, in present-day English usage, "due to" and "owing to" can be
used interchangeably.
4: Defying the dictionary
In standard written English, "partly" and "partially" mean the same thing.
Not so in
the Chemist's English: a "partially hydrogenated product" is an
intermediate compound in a hydrogenation process (e.g. butene from
butadiene), while a "partly
hydrogenated product" contains unchanged starting material. As a rule
of thumb,
the chemical antonym of "partially" is "completely", while that of
"partly" is
"wholly".
5: To reflux or not to reflux, that is the question
The boundary between nouns and verbs in Elizabethan English is almost as
vague as that in Hebrew.
"Shakespeare prided himself on his ability to take advantage
thereof;
in fact he gloried in it; he showered the language with
new words coined with a felicity that beggars
description." In modern English, overuse of this device by headline
writers has given it an undertone of brusque and
coarse action ("he gunned, knifed, and bludgeoned the hapless victims
he had lined
up"). Schoenfeld offers the following simple rules of thumb:
- Do not turn a noun into a verb if all you gain is the omission of
one or two words. (E.g., not "the compound complexes with" but "the
compound forms complexes with".) Most common exception: "to reflux" is
accepted Chemical English.
- Do not turn a noun into a verb if a verb derived from the same
stem already exists (e.g. not "to destruct" but "to destroy"). However,
some exceptions have become common usage in Chemical English (e.g. "to
bond", and "to self-destruct").
- For nouns derived from Greek, Latin, or French, verbs can often
be formed legitimately by adding "-ize", "-ate", or "-ify", or by other
simple devices. E.g. not "to chromatogram" but "to chromatograph".
Exception: "to program" has become Standard English.
- Do not turn a cluster of nouns into a single verb (e.g. "to
hydrogen-bond").
- Avoid turning proper names into verbs (i.e., do not Reformatzky
the language). [Gershom: one common exception outside chemistry: the
practice of quoting nearly an entire article interspersed with
point-by-point refutation has come to be known in the online journalism
community as "fisking", after the anti-American and anti-Israeli
journalist Robert Fisk, who was
the first well-known frequent target of this practice.]
6: English scientists secretly practice German vice!
German (and Dutch) are notorious for their long "sausage words":
"Hauptwortkombinationenzusammenstellungsbedürfnis"
is best translated as "compound noun assembly mania". (The longest
entry in the authoritative "Van Dale Dictionary
of the Dutch language" is "wapenstilstandsonderhandelingen" [armistice
negotiations].) But "proton magnetic resonance spectroscopy literature
survey" or "cyclic ligand planar
nitrogen array" only seem less unwieldy because of the odd space thrown
in. (At least
in Dutch and German, the building blocks all have to be nouns.)
Nevertheless, this type of linguistic "peptide bond" is both common
and indispensable
in Chemical English as a device for expressing a subtle relationship.
Examples:
"protein crystallography", "trial run", "hydrolysis experiment". Even
in everyday
English, one has come to depend on it, cf. "vice squad investigates
call girl racket".
As a rule, however, lengthy "peptide chains" like "ring junction
carbon environment
differences" should be hydrolyzed, e.g. to "differences in the
environment of the
ring junction carbons" or "differences in the environment of the
carbons at the
ring junctions". Two exceptions: (a) names of chemical compounds can
always be considered
as a single "amino acid", e.g. "lithium aluminium hydride reduction" is
not an obnoxious
'tetrapeptide' but a quite acceptable 'dipeptide'; (b) very long words
or chemical
names should not be incorporated in a chain.
7: Of nuts, muttons, and shotguns
The hyphen has some "big brothers" that most people outside the
printing trade are
unaware of:
- the en-dash (–, known as a "nut" among British
typesetters), which has the width of a letter "n" (hence the name). It
is used, e.g., for numerical ranges (20–30 kcal/mol, pp.
184–205, and the like), for bonds between atomic symbols, and
between different proper names (e.g. Diels–Alder reaction but
Lennard-Jones potential!). [Gershom: in LaTeX, type -- which will
automatically be parsed as a "nut".]
- the em-dash (—, known as a "mutton" among British
typesetters), which has the width of a letter "m". It can be used — and
very effectively — to give emphasis to a parenthetical clause.
(The Dutch name for an em-dash literally means "attention dash".) When
the emphasis is not warranted, the muttons should be slaughtered and
give way to simple commas. [Gershom: in LaTeX, type --- which will
automatically be parsed as a "mutton".]
- the minus sign, which occupies the same space as a "nut" but
does not fill that
space entirely. [Gershom: in LaTeX, type $-$ in running text, just -
in math mode
(e.g. in the equation or eqnarray environments).]
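The LaTeX inputs mentioned in the list above can be seen side by side in a minimal document. This is only a sketch: it assumes the standard article class with no extra packages, and the sentences about the yield, the ion, and the energy difference are invented purely for illustration.

```latex
\documentclass{article}
\begin{document}

% Hyphen (single -): a double-barrelled name of ONE person
Lennard-Jones potential

% En-dash or "nut" (--): numerical ranges and pairs of DIFFERENT names
20--30 kcal/mol, pp.~184--205, Diels--Alder reaction

% Em-dash or "mutton" (---): an emphatic parenthetical clause
The yield---to our surprise---doubled.

% Minus sign in running text: switch to math mode with $...$
The ion carries a charge of $-1$.

% Minus sign inside a math environment: a plain - is enough
\begin{equation}
  \Delta E = E_1 - E_2
\end{equation}

\end{document}
```

Comparing the typeset output of the four marks in one place makes the difference in width between hyphen, nut, mutton, and minus sign easy to see.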
What about hyphens in compound nouns? Often this is considered to be
the intermediate stage in compound noun formation, i.e., from "wave
function" via "wave-function" to
"wavefunction".
Schoenfeld however rightly points out that "compound nouns" have
never become
"compound-nouns", let alone "compoundnouns". He rather argues that "in
the marriage of two nouns, the hyphen is the shotgun". [Note for
non-native speakers: a "shotgun
marriage" is the colloquial English term for a situation in which
someone makes
a girl pregnant and is then forced by her father — sometimes
literally
at gunpoint — to marry her.] In other words: the hyphen is used
only if the
pair at first sight seems ill-matched. Once the link between the two
words has
become familiar, the hyphen can be dispensed with: whether or not the
two
parts coalesce (e.g. bookcase) or stay apart (e.g. filter paper)
largely depends
on frequency of usage and attractiveness.
In three-word groupings, if even a remote ambiguity exists (e.g.
"complex ion mechanism"),
add a hyphen where necessary (e.g. "near-ultraviolet spectrum").
8: Tetravalency of carbon disproved
A great many scientific terms derive from Latin or Greek. Common sense
suggests
that one should not mix the two up in a single term: e.g., write
"multilingual" (Latin)
or "polyglot" (Greek), but not "polylingual"; likewise, write of
"unimolecular", not "monomolecular",
reactions.
Exception 1: in chemical nomenclature (i.e. the naming of chemical
compounds), Greek
numerical prefixes (mono-, di-, tri-, tetra-, ..., oligo-, poly-) are
always
preferred over their Latin counterparts (uni-, bi-, tri-, quadri-, ...,
multi-).
Exception 2: while "tetravalent" and "pentavalent" are, strictly
speaking, sha'atnez [mixed wool and linen, forbidden in Jewish law],
the union of Greek numerical
prefixes with the Latin-derived "-valent" has become accepted usage.
9: This chapter explains...
Editors used to frown on sentences like "Table 2 lists..." or "Figure 3
displays..."
because they contain anthropomorphisms, i.e., instances of ascribing
human thought
or behavior to inanimate objects. However, anthropomorphization is a
much more
common figure of style than most people realize: how about "the car
drove past"
(rather than "was driven past"), "the ship sailed past" (rather than
"was steered
past"), or "Israel rejects UN resolution as one-sided" (rather than
"the government
of Israel rejects..."). In fact, the Jerusalem Post routinely carries
anthropomorphism
to the point where Israel is referred to as "she". [Gershom: And I
thought that only in Hebrew somebody's "fatherland" could be female.]
But all good things can be carried to ridiculous excess. Gas
chromatograms cannot "proclaim" the purity of a compound; nor can a
curve "militate" against a
particular interpretation. If (for the sake of color) there is any
arguing, militating, agreeing, or "belie"-ing to be done, it should be
left to the
data, results, curve shapes, rates of increase and so on, rather than
to
the Figures or the Tables that contain them. Or perhaps the authors
should
come onstage and bravely say "On the basis of the data in Figure xyz we
argue..."
"Detached participles" are participles (verbs turned into
adjectives) which do
not have a noun to which they belong. E.g. "The compound was difficult
to
crystallize, resulting in considerable loss of material", or
"The compound
turned yellow, suggesting that auto-oxidation took place". The
latter example
is immediately recognized as an anthropomorphism: while compounds may
accept hydrogen atoms or expel azo groups, clearly they cannot make
suggestions to the author. An elegant solution: "A color change to
yellow occurred, suggesting that auto-oxidation took place".
The color change is in the eye
of the beholder.
10: The painful plight of the pendant participle
This refers to the German-style Subordinate Clause, wherein the subject
from the
verb by a vast stream of verbiage, which even a further sub-sub-clause
include
may, separated is. [Gershom: a sentence like that is perfectly
acceptable in
acceptable in
a strongly inflected language like German or Latin, but unreadable in a
weakly inflected language like English.]
15: The chemist and the capercailzie
A number of chemical terms have entered the general vocabulary, e.g. catalyst,
throughput, quantum leap, and interface. As such
words cross
the 'interface' between Chemical English and Standard English, they
often undergo
more or less subtle changes in meaning, and a similar process can be
observed in
the other direction.
For instance, it is common to speak of a 'facile synthesis' (one
which is
easy to carry out) or a 'facile reaction' (one with a low barrier
height).
In Standard English, however, 'facile' has a derogatory undertone: e.g.
a 'facile victory'
is a hollow victory, a 'facile argument' is one which sounds good at
first
sight but would not survive in-depth analysis, and a musician with a
'facile' style
would be a superficial crowd-pleaser without any real substance.
Also, while 'spectral' would mean 'ghostly' in Standard English, it
is commonly understood as 'pertaining to spectra' in Chemical English.
But please do not use 'moiety' (a corruption of the French moitié,
meaning 'one-half') for a small part of a molecule!
16: That fellow acronym he all time make trouble
Chemical nomenclature and Pidgin English have the peculiar similarity
that
they both have transitory vocabularies built up
of a very small number of 'morphemes'. (Morphemes are to the linguist
what
atoms are to the chemist.) In Chemical Pidgin, a 'table' would be
referred to
as 'rectangle-r-1,c-2,c-3,c-4-tetrastick' (r stands for reference, c
for cis), which one is greatly tempted to abbreviate to RATS.
RATS of this type (acronyms, or RAshei Teivot in Hebrew)
infest scientific English. As "pest control" measures, Prof. Schoenfeld
suggests
that their creation be limited to situations where:
- the acronym really abbreviates. E.g., ALP for Australian Labor
Party has no merit over "Labor", but "laser" has obvious merits over
Light Amplification by Stimulated Emission of Radiation.
- it is used a reasonable number of times. There is no point in
creating an acronym for something you only use twice in a long article.
[Gershom: the 'acronym pest' is particularly rampant in my own field
(computational quantum chemistry), where it is virtually impossible to
write a paper
without using some 20–30 acronyms for basis sets,
exchange-correlation functionals,
electron correlation methods... Some journals (like the Journal of
Chemical Physics)
have a strict policy that every nontrivial acronym must be defined upon
first
use in a paper: while I find it as annoying as the next guy to explain
B3LYP (Becke 3-parameter hybrid exchange with Lee-Yang-Parr
correlation) and CCSD(T)
(coupled cluster with all single and double substitutions and
quasiperturbative
connected triple excitations) every time I sit down to write a paper,
it is a minor annoyance
that may save a reader a major headache.
In this case, however, complete avoidance of acronyms is IMHO (in my
humble opinion)
contraindicated, since most computational chemists will recognize
acronyms like
LDA and MBPT-2 a good deal faster than 'local density approximation'
and 'second-order many-body perturbation theory', respectively. Also,
acronyms make for
nice search strings in databases of abstracts, so be sure to include
them in your abstract.]
17: On the divisibility of earth/worms
Prof. Schoenfeld proposes the following hyphenation rules for Chemical
English:
- All "common-usage" words are hyphenated according to the
Merriam-Webster dictionary. [Gershom: in practice this means nowadays
"accept the suggestion of your word processor's auto-hyphenation
feature".] The 'rules of thumb' are that
- Where morphemes are joined together in such a way that the
pronunciation of each morpheme is not changed greatly, the morpheme
boundary is re-spect-ed
- In other cases, pronunciation prevails. Thus pre-vailing
becomes prev-alent and re-fer becomes ref-erence.
- In words that are clearly Chemical English, one always
hyphenates on the morpheme boundary, i.e. meaning always prevails over
pronunciation.
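For LaTeX users, these rules can be applied by hand wherever the automatic hyphenation disagrees with them. The sketch below uses the standard \hyphenation command and the discretionary hyphen \-. The chemical word chosen (methylcyclohexane) and its break points are an illustrative assumption, not an example taken from Schoenfeld's book.

```latex
\documentclass{article}

% Document-wide exceptions: each hyphen marks a permitted break point.
% These are common-usage words, broken by pronunciation as in the rule above.
\hyphenation{prev-alent ref-erence}

\begin{document}

% Chemical English: break only on morpheme boundaries.
% \- marks a discretionary hyphen; once a word contains \-,
% TeX will break it only at the marked positions.
The product was dissolved in methyl\-cyclo\-hexane.

\end{document}
```

The preamble list suits words that recur throughout a paper, while \- is handier for a long compound name that appears only once or twice.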
18: Instant stylistics
Prof. Schoenfeld's definition of a "good paper" is one that is written
so
as to yield up its information to the reader in the shortest
possible
reading time.
According to this criterion, the quality of a paper as a function
of its information density goes through a maximum between the twin
extremes
of "rocks of terseness" and "swamps of verbosity".
In the ideal world, everybody reading your paper will be eager to
know
what you have to say; in the real world, the reader has a low boredom
threshold, and
is just looking for an excuse to put down your paper unread. In the
theater it
is said "In the first 10 minutes, somebody must kick the cat", and a
rabbi
I knew used to say about sermons: "If you haven't struck gold in the
first
20 minutes, stop boring." In other words: somewhere in the
introduction,
kick the cat as hard as you can to get the attention of the reader. (If
you have no cats to kick — i.e., no important points to make
— perhaps the paper should not be written at all.)
20: Driveliferous Jargonogenesis
The following four "grievous sins" should be avoided like the plague:
- Irrelevance (loudly saying nothing at all)
- Proudly proclaiming the obvious
- Tautology (saying the same thing several times over)
- Pompous polysyllabicity, i.e. the use of overblown language to
create a false sense of importance and/or profundity.
"Million dollar words" can truly be worth a million dollars, if used
sparingly
and in the right context. Used out of turn, they sink to the value of
one-billion Zaire
notes in the waning years of Mobutu.
Rabbi Hillel the Elder is quoted in the Mishna (part of the Talmud)
as saying "Do not make an
incomprehensible statement in the hope that it will eventually be
understood."
(Pirkei Avot 2:4; 2:5 in some editions) Every scientific writer should frame this bit of
wisdom and hang it
above his/her desk.
21: Brevity = Soul of wit?
One can however fall into the other extreme, and write prose so terse
and dense that
it becomes impenetrable to the average reader. In particular, avoid the
following
discourtesies towards the reader:
- Wrongly assuming ideas to be self-evident. This is most liable to
occur in interdisciplinary work, or papers in borderline fields.
- Name dropping without explanation. While no organic chemist would
need to be
told what a Diels–Alder reaction is, not every chemical physicist
would
immediately recognize "Heck and Suzuki couplings" for what they are. If
there is
even a remote chance that a typical reader of the intended journal might not
recognize a certain proper name, a quick word of explanation might be
in order.
- Acronym dropping. [Gershom: see note under point 16.]
- Obscure jargon. Of course, as with name dropping, "obscurity" of
jargon is a function of the journal readership.
For instance, while every computer
scientist knows what an "embarrassingly parallel" program is, most
chemists would not.
While no editor would flag this term in a paper submitted to the Transactions
of the
Association for Computing Machinery, an explanatory footnote in
clear but precise
language would be in order in the Journal of Computational Chemistry.
- Hieroglyphics. "9 reacts with 13 to give 17
in 69% yield"
becomes totally incomprehensible unless the scheme defining these three
structures is
printed on the same or facing page (which generally cannot be
guaranteed). Something
like "the acid 9 reacts with the amine 13 to yield the
amide 17" might be easier to digest.
24: Is you is or is you ain't my data?
In English, plurals of foreign words generally follow the original
language.
Examples:
Singular | Plural | (From:) |
lingua franca | lingue franche | (Italian) |
kibbutz | kibbutzim | (Hebrew) |
beduin | beduin | (Arabic) |
spectrum | spectra | (Latin) |
datum | data | (Latin) |
appendix | appendices | (Latin) |
vertex | vertices | (Latin) |
matrix | matrices | (Latin) |
phenomenon | phenomena | (Greek) |
criterion | criteria | (Greek) |
The fact that some people do not know that "data" has a singular is their problem. In the case of the
"singularized plural" agenda however, the singular agendum
never entered the English language, and the synthetic
plural "agendas" (hidden or otherwise) is occasionally encountered.
Some borrowings rejoice in two plurals with distinct meanings:
"indices" (mathematical) and "indexes" (all other senses) are both
correct plural forms of
"index". "Formulae" and "formulas" are both acceptable, although only
the former
is etymologically correct.
Yet other borrowings never use the original plural: e.g. one writes
protons, neutrons, leptons, ... but not prota, neutra, lepta,...
Some borrowings from German have acquired English endings, notably
"zwitterions"
(not Zwitterionen), eigenvalues, eigenfunctions, and
eigenvectors (instead of Eigenwerte, Eigenfunktionen, and Eigenvektoren,
respectively).
[Gershom: Prof. Max Holthausen wrote with a nice example.
The German borrowing "ansatz" should strictly speaking be capitalized (Ansatz)
--- the way nouns are in the German language --- and its proper German plural
is Ansätze or [equivalently] Ansaetze. Yet many English-speaking authors
use the anglicized form "ansatzes". An intermediate solution would be to
use the German plural but drop the capitalization (which violates English
grammar), i.e., "ansätze" or "ansaetze". For those curious about such
matters, "ä" and "ae" are equivalent in German, as are "ü" and "ue", as well as
"ö" and "oe".]
TO BE CONTINUED