Our Transcription System

Because spoken language is different from written language, it is difficult to write down speech and still show what was actually said. If we use the normal English writing system, we will miss many important features. Therefore, we use a special system to write down what speakers say.

Our transcription system is based on one developed at the University of Michigan for their MICASE project (the Michigan Corpus of Academic Spoken English). We changed their system a little to make it more suitable for our purpose.

It is not a difficult system once you get used to it.

The Main Points of Our Transcription System

  • We use idea units (which are rather like sentences). Idea units are usually signaled by the speaker's voice and intonation. They are usually just a few words long and last about two seconds on average. Idea units are often, but not always, followed by a short pause.
  • We do not use capital letters to start idea units; we use capital letters only for names.
  • Idea units end in periods <.> and sometimes have commas <,> to indicate pauses and short grammatical units. Longer pauses use three periods <...>.
  • We try to write down everything the person says, including mistakes, repetitions, and repairs.
  • We use an underline <_> to show that the speaker has not finished the idea or the grammatical unit. We use a hyphen <-> to show the speaker has not finished the word.
  • We use normal spelling for most words, although we have special spellings for certain common abbreviations, such as short forms of words.
  • If there is something we don’t understand, we use round brackets with x’s inside. We try to show how many words are missing. For example, (xx xx) shows that about two words are missing. If we think we know what the words are, but we are not sure, we put our best guess at the words in round brackets.
  • Any foreign words, slang, or mispronunciations are written in italics.
  • Helpful comments go in square brackets.
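
For readers who want to check transcripts automatically, the symbols above can be recognized with a short script. The following is only a minimal sketch in Python: the names TOKEN_PATTERNS, tokenize, and count_unclear_words and the sample line are hypothetical illustrations, not part of our system or of MICASE, and the regular expressions are one assumed reading of the rules. Italics cannot be seen in plain text, so foreign words and slang are not detected.

```python
import re

# Hypothetical patterns, one per symbol described in the list above.
# Order matters: more specific patterns must come before general ones.
TOKEN_PATTERNS = [
    ("comment", r"\[[^\]]*\]"),                          # [helpful comment]
    ("unclear", r"\(x+(?: x+)*\)"),                      # (xx xx): words not understood
    ("uncertain", r"\([^)]*\)"),                         # (words we are not sure about)
    ("long_pause", r"\.\.\."),                           # ... longer pause
    ("unfinished_unit", r"_"),                           # idea or grammatical unit not finished
    ("unfinished_word", r"[A-Za-z'’]+-(?![A-Za-z])"),    # wor- : word not finished
    ("word", r"[A-Za-z'’]+(?:-[A-Za-z'’]+)*"),           # ordinary word, e.g. well-known
    ("short_pause", r","),                               # comma
    ("unit_end", r"\."),                                 # period ends an idea unit
]

TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_PATTERNS)
)


def tokenize(line):
    """Return (kind, text) pairs for one transcribed line.

    Characters that match no pattern (spaces, digits, other symbols)
    are silently skipped.
    """
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(line)]


def count_unclear_words(line):
    """Count the words marked as not understood, e.g. (xx xx) counts as 2."""
    return sum(
        len(text.strip("()").split())
        for kind, text in tokenize(line)
        if kind == "unclear"
    )


# Invented sample line, written only to exercise each symbol.
sample = "so i was trans- i mean i moved to_ ... (xx xx) [laughs] (maybe) Michigan."

for kind, text in tokenize(sample):
    print(f"{kind:16} {text}")
```

The patterns are tried in order, so (xx xx) is checked before the more general round-bracket rule, and a cut-off word such as trans- is checked before an ordinary word. A word with a hyphen in the middle, such as well-known, is still treated as one ordinary word.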