Principles

During the metadata entry process, a set of five principles was developed to define and protect the overall standard to which the metadata is documented. See Poynter and Spiegel (2015) for more details.

The principles of questionnaire documentation are as follows:

0. Documentation must make sense

Principle 0 must be maintained at all times. Ensuring that the documentation makes sense should be the basis for every decision made under the principles below. Note that this does not mean correcting mistakes in the questionnaire, but choosing the best method by which to document the questionnaire. For example, when there are two options for condition text, the text which refers to the true branch is used, and the alternative is usually added as a statement. However, in cases where the second statement contains only the direction (with or without an arrow), it has no meaning on its own. In Example 1 below, 'Go to Section D on page 30' would be ignored, because without the arrow it is not clear which answer it refers to, and it does not make sense on its own.

Example 1 ALSPAC My Son/Daughter’s Health and Behaviour

1. Maintain and do not alter the semantic meaning of the questionnaire

Principle 1 must also be maintained at all times. CLOSER intends that the documented metadata can be shared with other DDI-compliant organisations, so principle 1 ensures that CLOSER produces consistent and comparable metadata. Keeping to principle 1 requires decisions about which questionnaire elements carry meaning. For example, it was decided that bold font carries no semantic meaning, and therefore CLOSER does not document the weight of the font. The order of multiple choice options was deemed relevant, and therefore special care is taken to preserve that order within the documentation. The three principles that follow were conceived to give structure and guidance on how principle 1 should be followed at all times and in all situations. In Example 2 below, “Does the mother care for children at home ... ?” is in bold. We do not document this, but we do document that the order of the code list is Yes, No, Not known.

Example 2 BCS Birth Questionnaire

2. Do not correct the questionnaire

Principle 2 should only be broken when doing so maintains principles 0 and 1. It is fairly common to find what seem like mistakes in the questionnaire design; these can range from typos (e.g. 'martial status' instead of 'marital status') to impossible condition logic. Any mistake within the questionnaire could have altered the data being collected, and therefore it is important to avoid correcting it in the metadata. Also, what seems like a mistake may always have been intentional. In the case of a typo, the misspelt word can be aliased within the search engine to allow effective searching (e.g. a search for 'marital' would also find aliased questions containing the word 'martial').
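The aliasing idea can be sketched as follows. This is a minimal illustration only, not CLOSER's search implementation: the `ALIASES` table, the `expand_query` and `search` functions and the sample question text are all hypothetical.

```python
import re

# Hypothetical alias table: correct spelling -> misspellings that are kept
# verbatim in the documented questionnaire text.
ALIASES = {"marital": {"martial"}}


def expand_query(term):
    """Return the search term together with any aliased misspellings."""
    term = term.lower()
    return {term} | ALIASES.get(term, set())


def search(term, questions):
    """Return the questions containing the term or one of its aliases."""
    wanted = expand_query(term)
    return [
        q for q in questions
        if wanted & set(re.findall(r"[a-z]+", q.lower()))
    ]


# Invented question text; the typo is preserved, not corrected.
questions = ["What is your martial status?", "Do you own a car?"]
matches = search("marital", questions)
```

The documented text is left untouched; only the query is widened, so the typo remains visible to anyone reading the metadata.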

There are rare situations where principle 2 must be violated in order to follow principles 0 and 1. An example can be seen in Example 3, where a code is accidentally printed with the wrong value. There is no easy way to alias code values, and CLOSER's documentation would suggest that there are two distinctly different codes, while the dataset would only refer to one of them. In this case, documenting the mistake would mislead the user and violate principles 0 and 1. As can be seen in Example 3, the code values change partway down the right-hand column. This causes a problem during documentation, because it is not possible to change a Code Answer partway down a Question Grid, so the question could no longer be documented as a single Question Grid. In addition, it is highly unlikely that the change of code value was intentional, so it is also highly unlikely that the dataset reflects the mistake. In this situation it is deemed appropriate to violate principle 4 and check whether the dataset contains any values of 3, or whether they were all recorded as 1. If it is confirmed that the code values of 3 are a mistake, then correcting the code values provides a more accurate documentation of the structure of the question and of how the question maps to the data collected.
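The dataset check described above amounts to a simple tally of recorded values. The sketch below is illustrative only and uses assumed names: `recorded_values` stands in for one variable's column in the collected data, `misprint_is_unused` is a hypothetical helper, and the values are invented, not taken from ALSPAC.

```python
from collections import Counter


def misprint_is_unused(recorded_values, printed_code, expected_code):
    """True when the misprinted code never occurs but the expected one does."""
    counts = Counter(recorded_values)
    return counts[printed_code] == 0 and counts[expected_code] > 0


# Invented values for a single variable (None marks a missing response).
recorded_values = [1, 2, 1, 1, 2, None, 1]

# The questionnaire printed 3 where 1 was expected.
safe_to_correct = misprint_is_unused(
    recorded_values, printed_code=3, expected_code=1
)
```

A false result would suggest that the printed value did reach the data, in which case the case for correcting the code values is much weaker.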

Example 3 ALSPAC Questionnaire My Little Boy/Girl

3. Only record what is contained within the questionnaire

Principle 3 should only be broken when doing so maintains principles 0, 1 and 2. There are situations where the questionnaire does not provide all of the information needed to document it meaningfully or to generate valid DDI. However, it is important to refrain from adding information that is not within the questionnaire. An example of when not to add to the metadata is when a questionnaire asks:

3.2 "how many?"

Often this question and similar questions can be found within a condition, immediately following a question similar to

3.1 "Do you own a car?"

Question 3.2 is largely meaningless without sight of question 3.1, which makes it tempting to concatenate the text of question 3.1 with that of question 3.2, creating:

3.3 "Do you own a car? how many?"

Doing this violates principle 1. The context for question 3.2 can instead be maintained through accurate routing using conditional constructs. For example, question 3.2 will sit within the condition ‘If yes to 3.1’.
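One way to picture this is to keep each question's text verbatim and attach the follow-up question to a condition, without merging any text. The sketch below uses plain Python classes with hypothetical names (`Question`, `Condition`, `describe`); it is neither DDI nor CLOSER's data model.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Question:
    label: str
    text: str  # recorded verbatim from the questionnaire


@dataclass
class Condition:
    text: str  # routing text, e.g. "If yes to 3.1"
    children: List["Node"] = field(default_factory=list)


Node = Union[Question, Condition]


def describe(nodes, depth=0):
    """Return indented lines showing each question in its routing context."""
    lines = []
    for node in nodes:
        indent = "  " * depth
        if isinstance(node, Condition):
            lines.append(f"{indent}[{node.text}]")
            lines.extend(describe(node.children, depth + 1))
        else:
            lines.append(f"{indent}{node.label} {node.text}")
    return lines


sequence = [
    Question("3.1", "Do you own a car?"),
    Condition("If yes to 3.1", [Question("3.2", "how many?")]),
]
outline = describe(sequence)
```

Here "how many?" keeps its original wording, and its meaning comes from its position inside the condition.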

Example 4 shows a case where metadata must be added in order to maintain principle 1: a questionnaire that uses an arrow to denote a condition. It is impossible to document an arrow literally, and leaving it out of the documentation altogether changes the logical flow of the questionnaire and violates principle 0 of making sense. Therefore, text representing the arrow's meaning has to be added.

Example 4 MCS Questionnaire Teacher Paper Questionnaire

4. Do not allow the data recorded (i.e. the variables) to inform the metadata archiving

Principle 4 is the least significant principle. Whilst documenting the structure, flow and intent of a questionnaire, it may seem harmless to consult the collected data in order to better understand the questionnaire being documented. This practice, however, should also be avoided. The aim of the ingest programme is to record the instruments used for data collection as accurately as possible. Using information that was created after the collection event alters the perception and understanding of the instrument.

This can be seen in Example 5, where a question offers a set of multiple choice options and an "other (specify)" option. If the same response is often specified within the other option, it is relatively common for that answer to be coded in addition to the originally offered options (e.g. Consultant). It is important not to document this additional coded answer, because it was not offered to the respondents, and its absence from the questionnaire potentially had an effect on the collected data.

The most common situation in which breaking principle 4 is valid is when principle 2 or 3 must be broken. For example, where it is most appropriate to correct the questionnaire, it is vital to check whether the mistake was intentional and whether it had a distinct effect on the collected data.

Example 5 BCS Birth Questionnaire

References

Poynter, W. and Spiegel, J. (2015) Protocol Development for Large-Scale Metadata Archiving using DDI-Lifecycle. IASSIST Quarterly, 39(3), pp. 23-29.