Transcription: Difference between revisions

From Zack's Wiki
ecurators>Zack
This transcription guide was originally prepared by Danielle Crecca to systematize the practice of transcribing interviews and conversations. It also includes notes on how this system can be used effectively given the constraints of the software and resources that our project uses. The following table includes a selection of notations that we use particularly often.
= Creating transcripts within MaxQDA =


See [[Timestamps]] for more potentially relevant information.
== Creating time stamps ==
When transcribing using the MaxQDA built-in transcription tool, it is necessary to record time stamps.


{| class="wikitable"
Time stamps can be recorded within MaxQDA as part of the transcription process. They can also be created on their own by right-clicking anywhere within a document.
|-
! scope="col"| Notation
! scope="col"| Description
|-
| scope="row"| (.)
| A full stop inside brackets denotes a micro pause, a notable pause but of no significant length.
|-
| scope="row"| (0.2)
| A number inside brackets denotes a timed pause. This is a pause long enough to time and subsequently show in transcription.
|-
| scope="row"| CAPITALS
| Where capital letters appear, it denotes that something was said loudly or even shouted
|-
| scope="row"| [
| Square brackets denote a point where overlapping speech occurs.
|-
| scope="row"| <u>Underlined text</u>
| Underlined text marks speech over which laughter is overlaid.
|-
| scope="row"| (( ))
| Non-verbal vocal actions and events encased within two rounded brackets.
|-
| scope="row"| ((unclear))
| Unintelligible or unclear speech is denoted with “unclear” placed within rounded brackets.
|-
|}


= Things to consider =
= Importing transcripts =
<b>If you use an external transcription tool, ensure that transcripts conform to the following rules:</b>
* Transcripts should be saved as a text file (.txt) with the same file name as the media file that they are derived from.
* Leave one blank line between each paragraph.
* Each paragraph should begin with a timestamp, followed by the speaker's designation.
* There should be no space before or after the timestamp, and no hashtags (<code>#</code>) should be used.
** If your transcription software adds these, they can be removed using your text editor's find-and-replace function.


== Overlaps and simultaneous speech ==
  00:03:40.5Zack: What will digging this hole accomplish for the project?
Opening square brackets are inserted at exactly the point in speaking where the overlap starts, and closing square brackets where it ends. In both Jefferson and GAT, the respective brackets are aligned with each other within the text; however, this is fairly tedious to do in MaxQDA. Perhaps just the indication of overlap with the brackets is sufficient? Will need to discuss this further. Please refer to Selting & Auer page 13 for a fuller discussion of this.
 
  00:03:44.3Jim: It will fill a gap in time and space.


=== Example: ===
* Ensure that speakers' names are spelled consistently throughout the document.
Subject 1: Are you going too?
Subject 2: No, I have to [work.
Subject 1: How about a] drink to celebrate [the day?
Subject 2: That] would be great.


== Laughter ==
<b>To import transcripts:</b>
Kowal and O’Connell note two types of notation conventions for laughter. The first is what they term “ha-ha laughter”, where the approximate number and phonetic constitution of the laughter syllables are transcribed, e.g. HA HA HA HA. The second is overlaid laughter, which occurs as an overlay on spoken-word syllables. This is difficult to transcribe, so it is shown by underlining those parts of an utterance that were produced while laughing.
1. Under the Import ribbon, select Focus Group.
2. Select the text file containing the properly-formatted transcript from the file menu.


=== Example: ===
A new document will be created containing the imported transcript.
Subject 1: What do you do?
[image of the document window]
Subject 2: HA HA HA HA HA AHH
Subject 1: I want to know, what do you do?
Subject 2: <u>Transcribe music.</u> Read books. <u>Swim at the river. Go out at night.</u>


== Non-verbal vocal actions and events ==
Timestamps will be displayed using the clock icon, and will not be displayed in the text.
Non-verbal vocal actions and events are denoted with two rounded brackets ((  )). If the non-verbal action cannot be attributed to any one speaker, the notation is entered as a new line in the transcript with its own timestamp.
[zoomed in image of the clock icon]


=== Example: ===
Speakers' names will be styled bold, and will also form the basis of an auto-code implementation so that the
Subject 1: Hello ((coughs)) I am ready.
[zoomed in images of the speakers' auto-coding, and of the filters created in the documents and codes windows]
((recording device beeps))
Subject 2: Great.


== Intelligibility ==
== Importing timestamps ==
Unintelligible or unclear speech is denoted with “unclear” placed within rounded brackets, (unclear). GAT has suggestions for marking uncertainties and alternatives in speech; however, adding in assumptions may lead to bias.
MaxQDA 2018 does not play nicely with timestamps. Timestamps can only be imported as part of imported transcripts. They cannot be imported on their own or be automatically assigned to transcripts that already exist within MaxQDA.


=== Example: ===
When importing a transcript containing timestamps, the timestamps must be formatted in the following way: <code>HH:MM:SS.m</code>, with no spaces between the final digit and the text that the time stamp precedes. Example:
Subject 1: Are you sleeping?
Subject 2: (unclear) I was.
Subject 1: Oh never mind then.


= Jeffersonian notation =
  00:00:27.6Zack: Hi, how are you?
The transcription protocol is adapted from Jefferson notation, which is summarized in the following table.


{| class="wikitable"
= Exporting transcripts =
|-
! Notation
! Description
|-
| [ ]
| Square brackets mark the start and end of overlapping speech. They are aligned to mark the precise position of overlap as in the example below.
|-
| ↑↓
| Vertical arrows precede marked pitch movement, over and above normal rhythms of speech. They are used for notable changes in pitch beyond those represented by stops, commas and question marks.
|-
| →
| Side arrows are used to draw attention to features of talk that are relevant to the current analysis.
|-
| under<u>lin</u>ing
| Indicates emphasis; the extent of underlining within individual words locates emphasis and also indicates how heavy it is.
|-
| CAPITALS
| Mark speech that is hearably louder than surrounding speech. This is beyond the increase in volume that comes as a by-product of emphasis.
|-
| °↑<u>I</u> know it,°
| ‘Degree’ signs enclose hearably quieter speech.
|-
| that’s r*ight.
| Asterisks precede a ‘squeaky’ vocal delivery.
|-
| (0.4)
| Numbers in round brackets measure pauses in seconds (in this case, 4 tenths of a second). If they are not part of a particular speaker’s talk they should be on a new line. If in doubt use a new line.
|-
| (.)
| A micropause, hearable but too short to measure.
|-
| ((staccato))
| Additional comments from the transcriber, e.g. about features of context or delivery.
|-
| she wa::nted
| Colons show degrees of elongation of the prior sound; the more colons, the more elongation.
|-
| hhh
| Aspiration (out-breaths); proportionally as for colons.
|-
| .hhh
| Inspiration (in-breaths); proportionally as for colons.
|-
| Yeh,
| ‘Continuation’ marker, speaker has not finished; marked by fall-rise or weak rising intonation, as when delivering a list.
|-
| y’know?
| Question marks signal stronger, ‘questioning’ intonation, irrespective of grammar.
|-
| Yeh.
| Full stops mark falling, stopping intonation (‘final contour’), irrespective of grammar, and not necessarily followed by a pause.
|-
| bu-u-
| Hyphens mark a cut-off of the preceding sound.
|-
| >he said<
| ‘Greater than’ and ‘lesser than’ signs enclose speeded-up talk. Occasionally they are used the other way round for slower talk.
|-
| solid.= =We had
| ‘Equals’ signs mark the immediate ‘latching’ of successive talk, whether of one or more speakers, with no interval.
|-
| heh heh
| Voiced laughter. Can have other symbols added, such as underlinings, pitch movement, extra aspiration, etc.
|-
| sto(h)p i(h)t
| Laughter within speech is signalled by h’s in round brackets
|-
|}


Jefferson, G. (2004). Glossary of transcript symbols with an introduction. In G. H. Lerner (Ed). Conversation Analysis: Studies from the First Generation. (pp: 13-31). Amsterdam: John Benjamins.
== Exporting timestamps ==
Hepburn, A. and Bolden, G. B. (2013). Transcription. In Sidnell, J. & Stivers, T. (Eds). Blackwell Handbook of Conversation Analysis (pp 57-76). Oxford: Blackwell.
A document's time stamps can be displayed as a table and exported as an Excel file; however, this information does not indicate where they belong in the text. It may nonetheless be possible to align an Excel export of time stamps with an Excel export of a document's text (arranged by paragraph). This involves a lot of manual work and requires consistent recording practices when the time stamps are created.


=== Additional notation for crying and similar ‘emotional expression’: ===
Merging MaxQDA projects does not preserve time stamps.
{| class="wikitable"
|-
! scope="col"| Notation
! scope="col"| Description
|-
| °°help°°
| Whispering –enclosed by double degree signs.
|-
| .shih
| Wet sniff.
|-
| .skuh
| Snorty sniff.
|-
| ~grandson~
| Wobbly voice –enclosed by tildes.
|-
| ↑↑Sorry
| Very high pitch –represented by one or more upward arrows.
|-
| k(hh)ay
| Aspiration in speech–an ‘h’ represents aspiration: in parentheses indicates a sharper more plosive sound.
|-
| hhhelp
| Outside parenthesis indicates a softer more breathy sound.
|-
| Huhh .hhih
| Sobbing–combinations of ‘hhs’, some with full stops before them to indicate inhaled rather than exhaled, many have voiced vowels.
|-
| Hhuyuhh
| Some also have voiced consonants.
|-
| >hhuh<
| If sharply inhaled or exhaled, enclosed in the ‘greater than/less than’ symbols (> <).
|-
| ↑Mm:. hh (3.5)
| Silence–numbers in parentheses represent silence in seconds, measured to the nearest tenth of a second.
|-
|}


Hepburn, A. (2004). Crying: Notes on description, transcription, and interaction. Research on Language and Social Interaction, 37(3), 251-290.
At the moment, these time stamps are essentially useless since they will be lost during any merge. However, we keep the original MaxQDA file created for each transcription job with the hope that the developers incorporate more effective tools for handling time stamps in the future. We will


===Additional commonly used notations: ===
MaxQDA files created for the purpose of transcription should be named in the following way: <code>[Interviewee]_[YYMMDD]_[Version]_[TranscriberInitials]</code>. Example: BrandonOlson-2019-06-06-ZB.mx18. They should be stored in <code>admin/transcripts</code>.
{| class="wikitable"
 
|-
= See also =
! scope="col"| Notation
[[Transcript notation]]
! scope="col"| Description
|-
| $funny$
| Smile voice –laughing/chuckling between markers.
|-
| #sad#
| Talk between markers is croaky.
|-
| <strong>t, d,</strong>
| Boldface consonants represent a hardened sound.
|-
|}

Revision as of 15:09, 6 January 2020

Creating transcripts within MaxQDA

Creating time stamps

When transcribing using the MaxQDA built-in transcription tool, it is necessary to record time stamps.

Time stamps can be recorded within MaxQDA as part of the transcription process. They can also be created on their own by right-clicking anywhere within a document.

Importing transcripts

If you use an external transcription tool, ensure that transcripts conform to the following rules:

  • Transcripts should be saved as a text file (.txt) with the same file name as the media file that they are derived from.
  • Leave one blank line between each paragraph.
  • Each paragraph should begin with a timestamp, followed by the speaker's designation.
  • There should be no space before or after the timestamp, and no hashtags (#) should be used.
    • If your transcription software adds these, they can be removed using your text editor's find-and-replace function.
 00:03:40.5Zack: What will digging this hole accomplish for the project?
 
 00:03:44.3Jim: It will fill a gap in time and space.
  • Ensure that speakers' names are spelled consistently throughout the document.
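
The rules above can be checked before import with a short script. The following is a minimal sketch in Python, not part of MaxQDA: the function names (normalize, check, speakers) are hypothetical, and the timestamp shape is assumed from the example lines above (two-digit hours, minutes and seconds, a period, then one digit for tenths of a second).

```python
import re

# Assumed paragraph shape, based on the example lines in this guide:
# HH:MM:SS.m, then the speaker's name with no space in between, then a colon.
PARAGRAPH = re.compile(r"(\d{2}:\d{2}:\d{2}\.\d)(\S[^:\n]*): ?(.*)", re.DOTALL)


def split_paragraphs(text):
    """Split a transcript into paragraphs separated by one or more blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def normalize(text):
    """Remove every hashtag and leave exactly one blank line between paragraphs.

    Note: this removes ALL '#' characters, including any that occur in speech.
    """
    return "\n\n".join(split_paragraphs(text.replace("#", ""))) + "\n"


def check(text):
    """Return a list of (paragraph_number, problem) tuples; empty if none found."""
    problems = []
    for number, paragraph in enumerate(split_paragraphs(text), start=1):
        if "#" in paragraph:
            problems.append((number, "contains a hashtag"))
        if PARAGRAPH.fullmatch(paragraph) is None:
            problems.append(
                (number, "does not start with a timestamp directly followed by a speaker")
            )
    return problems


def speakers(text):
    """Return the sorted set of speaker names, to spot inconsistent spellings by eye."""
    names = set()
    for paragraph in split_paragraphs(text):
        match = PARAGRAPH.fullmatch(paragraph)
        if match is not None:
            names.add(match.group(2))
    return sorted(names)
```

Running check on a transcript before import lists the paragraphs that break a rule, and speakers lists every distinct name so that variants such as "Zack" and "Zach" stand out. Whether the file name matches the media file still has to be checked by hand.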

To import transcripts:

  1. Under the Import ribbon, select Focus Group.
  2. Select the text file containing the properly formatted transcript from the file menu.

A new document will be created containing the imported transcript. [image of the document window]

Timestamps will be displayed using the clock icon, and will not be displayed in the text. [zoomed in image of the clock icon]

Speakers' names will be styled bold, and will also form the basis of an auto-code implementation so that the [zoomed in images of the speakers' auto-coding, and of the filters created in the documents and codes windows]

Importing timestamps

MaxQDA 2018 does not play nicely with timestamps. Timestamps can only be imported as part of imported transcripts. They cannot be imported on their own or be automatically assigned to transcripts that already exist within MaxQDA.

When importing a transcript containing timestamps, the timestamps must be formatted in the following way: HH:MM:SS.m, with no spaces between the final digit and the text that the time stamp precedes. Example:

 00:00:27.6Zack: Hi, how are you?
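
For scripts that prepare transcripts outside MaxQDA, it can help to convert timestamps to and from a plain number. The sketch below is an illustration only and assumes the shape of the example line above (a period before the single tenths digit); the function names are hypothetical. Times are kept as whole tenths of a second to avoid floating-point rounding.

```python
import re

# Assumed from the example line: two digits each for hours, minutes and seconds,
# then a period and one digit for tenths of a second.
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2})\.(\d)")


def split_timestamp(paragraph):
    """Return (time in tenths of a second, remaining text) for one paragraph."""
    match = TIMESTAMP.match(paragraph)
    if match is None:
        raise ValueError("paragraph does not start with a timestamp")
    hours, minutes, seconds, tenths = (int(group) for group in match.groups())
    total_tenths = ((hours * 60 + minutes) * 60 + seconds) * 10 + tenths
    return total_tenths, paragraph[match.end():]


def format_timestamp(total_tenths):
    """Format a time given in tenths of a second back into the timestamp shape."""
    seconds, tenths = divmod(total_tenths, 10)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{tenths}"
```

For example, split_timestamp applied to the line above separates the timestamp from "Zack: Hi, how are you?", and format_timestamp turns the numeric value back into the same timestamp text.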

Exporting transcripts

Exporting timestamps

A document's time stamps can be displayed as a table and exported as an Excel file; however, this information does not indicate where they belong in the text. It may nonetheless be possible to align an Excel export of time stamps with an Excel export of a document's text (arranged by paragraph). This involves a lot of manual work and requires consistent recording practices when the time stamps are created.

Merging MaxQDA projects does not preserve time stamps.

At the moment, these time stamps are essentially useless since they will be lost during any merge. However, we keep the original MaxQDA file created for each transcription job with the hope that the developers incorporate more effective tools for handling time stamps in the future. We will

MaxQDA files created for the purpose of transcription should be named in the following way: [Interviewee]_[YYMMDD]_[Version]_[TranscriberInitials]. Example: BrandonOlson-2019-06-06-ZB.mx18. They should be stored in admin/transcripts.

See also

Transcript notation