M09b8 - The word count in Turnitin - managing inconsistencies

Overview

Automatic word counts can differ between a word-processed document, a PDF of the same document, and either of these file types as viewed in Turnitin.

When the uploaded document is in Microsoft Word format, there is a further issue: Turnitin does not include footnotes and endnotes in its count.

For students, Digital Education recommend following any guidance from their department when saving their work. We also recommend that students who have trouble submitting (as occasionally happens) save their work as PDF, which tends to work, but this may change the word count. We have guidance for students on submitting their work.

For staff, Digital Education recommend giving students specific guidance which, as far as possible, links to (rather than reproduces) the generic guidance we provide for students, and which explicitly flags any points of departure from it.

There's no UCL-wide policy for calculating word counts, but there may be faculty-wide policies. If in doubt about local practice, consult your own faculty's Board of Examiners.

About the inconsistencies

The inconsistencies are illustrated by an experiment from one academic department, which yielded the following data for the same text. In this case the discrepancy is up to 41 words.

Document type | Word | PDF | Turnitin Word* | Turnitin PDF*
Document with table (table created in Word) | 594 | 594 | 594 | 598
The above + footnote | 609 | 610 | 594** | 614
The above + extra table copied from Excel and pasted as a picture into the document | 609 | 611 | 594*** | 635

* 'Turnitin Word' or 'Turnitin PDF' means those file formats rendered in Turnitin's marking environment.

** the footnote was not counted by Turnitin when the document was uploaded as a Word document.

*** the table, when copied from Excel and pasted as a picture into the Word document, was not counted by Turnitin when the document was uploaded as a Word document.
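The "up to 41 words" figure can be checked from the table above. The following is a minimal Python sketch of that arithmetic; the dictionary keys and the `spread` helper are illustrative names chosen here, not part of the department's experiment.

```python
# Word counts transcribed from the table above, keyed by document version
# and then by where the count was taken. The key names are illustrative.
counts = {
    "table created in Word": {
        "Word": 594, "PDF": 594, "Turnitin Word": 594, "Turnitin PDF": 598,
    },
    "plus footnote": {
        "Word": 609, "PDF": 610, "Turnitin Word": 594, "Turnitin PDF": 614,
    },
    "plus Excel table pasted as picture": {
        "Word": 609, "PDF": 611, "Turnitin Word": 594, "Turnitin PDF": 635,
    },
}


def spread(row):
    """Return the gap between the highest and lowest count for one document."""
    return max(row.values()) - min(row.values())


spreads = {name: spread(row) for name, row in counts.items()}
largest = max(spreads.values())

for name, gap in spreads.items():
    print(f"{name}: counts differ by up to {gap} words")
print(f"largest discrepancy: {largest} words")
```

The largest gap comes from the last row, where Turnitin's rendering of the Word file reports the lowest value and its rendering of the PDF reports the highest.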

Turnitin's approach to the word count

Turnitin explains its approach to word count as follows:

“Turnitin uses a word counting algorithm very similar to that of Microsoft Word. For everything except HTML, PDF, and PS file types, we rely on Microsoft Word's word count system.

Note: Turnitin's word count does not count the words in textboxes, footnotes, and endnotes since these are not included by default. For example, if the whole paper (e.g., a form) is in a textbox, it may be rejected because the word count is too low.”

  • Although Microsoft Word's own word count does in fact include words in text boxes, footnotes and endnotes (unless that checkbox is unchecked in the Word Count dialog box), Turnitin doesn't include them in its word count unless the document is uploaded as a text-based PDF (with text-based PDFs, it can't tell the difference - it's all just words).
  • Image-based PDFs consist solely of images, which Turnitin cannot process.
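To illustrate why the two approaches give different totals, here is a minimal Python sketch contrasting a body-only count with an inclusive count. It is not Turnitin's or Microsoft Word's algorithm: splitting on whitespace is a simplifying assumption, and the function names and sample text are hypothetical.

```python
def count_words(text):
    """Count whitespace-separated tokens.

    This is a simplification for illustration only; it is not the
    algorithm used by Turnitin or by Microsoft Word.
    """
    return len(text.split())


def word_counts(body, notes=(), textboxes=()):
    """Return a body-only count and a count that also includes
    footnotes/endnotes and text boxes."""
    body_only = count_words(body)
    extras = sum(count_words(t) for t in notes)
    extras += sum(count_words(t) for t in textboxes)
    return {"body_only": body_only, "inclusive": body_only + extras}


# Hypothetical sample text, not taken from any real submission.
essay_body = "The results support the hypothesis in most cases."
footnotes = ["See the appendix for the full data set."]

result = word_counts(essay_body, notes=footnotes)
print(result)
```

In this sketch the body-only figure corresponds to what the page describes for a Word upload, and the inclusive figure to a text-based PDF, where footnote text is indistinguishable from body text.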

What departments can do

At least one faculty Board of Examiners has discussed this matter formally. There was agreement that it wouldn't be feasible to have a faculty-wide framework for calculating word counts.

Other institutions, e.g. the University of Sussex, urge their staff to treat the word count as indicative only, and to use their judgement. In this respect the word count is not accurate, but it may still be an advance on the judgements which used to happen with paper-based marking.

If there is a likelihood that, to circumvent the word count, students will displace text from the body of the essay into footnotes or any other form which is not counted by the algorithm, then consider asking the students to state an inclusive word count before uploading.

Students should be provided with clear, consistent, salient guidance on:

  • Whether there is a template and instructions to standardise things like margins, font style and size, etc. (standardised submissions may give markers a further visual aid on word count, since they can then estimate that x number of words is roughly y number of pages - though diagrams and images complicate the estimation).
  • The format students should save their work in (again please note that only in PDF format does Turnitin count footnotes, endnotes, etc.).
  • Whether the word count should be based on the Turnitin value (and how to check) or the original word-processed value.
  • If the word count isn't based on the Turnitin value, then whether students should include a statement of the word count.
  • Whether the word count should include words in textboxes, footnotes and endnotes (it often will - otherwise there is an incentive for students to displace text into these elements to exclude it from the word count), and some guidance about how to configure the word count to include them.
  • That it is possible for markers to download original work to check the word count.

Departments should also direct markers' attention to guidance provided to students.

Note that it is possible to bulk download Turnitin submissions in the original file format, before and after the Post Date. 

Take opportunities to explain and repeat the ethical reasons for a word count, so that it is understood as a matter of equal opportunity for students rather than a convenience for markers.

If this doesn't meet your needs

Consider using Moodle Assignment. Since student work opens in its original format, there won't be a discrepancy between the original and uploaded work.

Questions and answers

Can we ask students to break up the work into separate files, isolating the actual essay into a single file?

In theory it's possible to ask students to split work into different files, e.g. one for the coversheet, one for the body of the work, and one for the references section. However, while this would improve the accuracy of the word count (though only somewhat - see above), it may prove too onerous both for the students and for the markers, who would frequently need to cross-reference between files.