

The first thing to say about XML is what it's not. It has nothing to do with CUP-XML, a system by which Cambridge University Press shields its indexers from all exposure to XML coding. Like the majority of publishers, they have chosen to keep their XML in-house.

There is no point in claiming to be able to index using XML if you have only CUP-XML experience; it's like saying you can fly an airliner when you've only ever flown as a passenger. It will do you personally, and the reputation of the profession, no favours. That's not to denigrate CUP-XML skills, which are highly relevant, but they have more in common with other tagging systems, as used for example by OUP and Elsevier, than with direct XML, which is a form of embedded indexing. If you don't know the difference between an element and an attribute, or between well-formedness and validity, you certainly don't know any XML. And, if you're curious, don't forget the Glossary.
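For the curious, the two distinctions mentioned above can be sketched in a few lines of Python using the standard library's `xml.etree.ElementTree`. Everything here is invented for illustration: the `entry`, `term` and `page` names, the `id` attribute and the rule in `is_valid_entry` are assumptions, not part of any publisher's system. Real validity is checked against a DTD or schema, which the standard library does not do, so the hand-written rule is only a stand-in for that idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: <entry> and <term> are elements, id="e1" is an attribute.
SAMPLE = '<entry id="e1"><term>indexing</term></entry>'

# Not well-formed: <term> is opened but the closing tag does not match it.
BROKEN = '<entry id="e1"><term>indexing</entry>'

# Well-formed, but breaks our made-up rule that every entry needs a <term>.
NO_TERM = '<entry id="e2"><page>12</page></entry>'


def is_well_formed(text):
    """Return True if the text obeys XML syntax, i.e. a parser accepts it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def is_valid_entry(text):
    """Toy stand-in for schema validation (an assumption for this sketch):
    an <entry> must carry an id attribute and contain a <term> child."""
    root = ET.fromstring(text)
    return (
        root.tag == "entry"
        and root.get("id") is not None
        and root.find("term") is not None
    )


root = ET.fromstring(SAMPLE)
element_name = root.tag              # the element's name
attribute_value = root.get("id")     # the value of its id attribute
child_text = root.find("term").text  # text held by the child element
```

The point of the third sample is that a document can pass the syntax check and still fail the rules laid down for a particular kind of document, which is the gap between well-formedness and validity.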

That's the end of a stern but necessary health warning. So what is XML? It's the basis of an incredibly powerful set of data description languages that look superficially like the HTML used to drive websites. There's actually a tiny bit of it in the banner to this page, illustrating the DocBook standard.
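To give a flavour of what embedded indexing in XML looks like, here is a minimal sketch in Python. It assumes a DocBook-style fragment using the `indexterm`, `primary` and `secondary` elements; the fragment is made up for this example (it is not the one in the banner), it omits the namespace that DocBook 5 documents carry, and `collect_index_terms` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Invented DocBook-style fragment: the index markers sit inside the text itself.
DOCBOOK_FRAGMENT = """
<para>Embedded indexing places markers in the text itself.
<indexterm><primary>indexing</primary><secondary>embedded</secondary></indexterm>
A second marker follows.
<indexterm><primary>XML</primary></indexterm>
</para>
"""


def collect_index_terms(xml_text):
    """Return (primary, secondary) pairs for each indexterm, in document order.

    secondary is None when the marker has no subheading.
    """
    root = ET.fromstring(xml_text)
    terms = []
    for marker in root.iter("indexterm"):
        primary = marker.findtext("primary")
        secondary = marker.findtext("secondary")
        terms.append((primary, secondary))
    return terms


terms = collect_index_terms(DOCBOOK_FRAGMENT)
```

Because the markers travel with the text, the index can be regenerated whenever the pagination changes, which is what distinguishes this approach from tagging systems that keep the indexer away from the XML.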

Aside from our own PTG glossary entry on XML, there are several more detailed guides:

And what about CUP-XML itself? The best start is to look at Maureen MacGlashan's paper 'No, not embedded indexing: CUP-XML, Elsevier, OUP et al', prepared for the 2011 Society of Indexers Conference, which discusses CUP-XML in context with the related Elsevier and OUP systems and which contains links to additional material by James Lamb and from CUP itself.

Society of Indexers, Woodbourn Business Centre, 10 Jessell Street, Sheffield S9 3HY
The Society of Indexers is a company limited by guarantee and incorporated in England and Wales.
Registration number 6303822.