Using Txt Files

The mapping process can be made less manual by using .txt files. Please send the files to Hayley Mills to load into Archivist. Once they are loaded, you can check the mappings. The format specifications are given below, along with examples. A list of the Archivist pages which can be used to check the mappings (and view a list of questions) is given at the end of the page.

It is recommended that you create only one topic mapping file first (preferably tv.txt), as the other topic mappings will be inherited once the questions and variables have been mapped. Any gaps in topics can then be filled afterwards. This should prevent topic conflicts.

A spreadsheet containing all the questions, variables and the topic list, usually created by CLOSER using this template, is sent to the studies to help them create the mappings. When the study returns the completed mappings, they are checked for:

  • Gaps in mappings
    • questions which have not been mapped to variables
    • questions which have not been mapped to topics
    • variables which have not been mapped to topics
  • Any obviously incorrect topic mappings 
  • Variables which have not been mapped to questions are in the DVs tab
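The gap checks listed above can be sketched in Python. This is only an illustration, not the procedure CLOSER uses: it assumes the spreadsheet has already been read into rows of (question, variable, topic), and the names `Row` and `find_gaps`, the column layout and the example values are all hypothetical.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    """One spreadsheet row; this three-column layout is assumed for illustration."""
    question: Optional[str]
    variable: Optional[str]
    topic: Optional[str]


def find_gaps(rows):
    """Collect the three kinds of mapping gap described in the checklist."""
    gaps = {
        "question_no_variable": [],
        "question_no_topic": [],
        "variable_no_topic": [],
    }
    for row in rows:
        if row.question and not row.variable:
            gaps["question_no_variable"].append(row.question)
        if row.question and not row.topic:
            gaps["question_no_topic"].append(row.question)
        if row.variable and not row.topic:
            gaps["variable_no_topic"].append(row.variable)
    return gaps


# Made-up example rows; the names and topic IDs are placeholders.
rows = [
    Row("q1", "var1", "101"),
    Row("q2", None, "101"),   # question without a variable
    Row("q3", "var3", None),  # question and variable without a topic
    Row(None, "var4", None),  # unmapped variable without a topic
]
gaps = find_gaps(rows)
```

Obviously incorrect topic mappings still need to be checked by eye; a script like this can only find missing values.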

If there are issues to resolve, the spreadsheet is then sent back to the study to complete. The finalised Excel file is then reformatted into the qv.txt, tv.txt, tq.txt and dv.txt files as described below, and any rows which don't contain mappings are removed. The files are then loaded into Archivist as described below, and checked for any conflicts or gaps. If there are any issues, the all_mappings.txt file (e.g. ncds_65_eq_all_mappings.txt) is downloaded and used to highlight these to the study. This file includes all questions and all variables, even if they are not mapped, but doesn't include any derived variable mappings. Topics will also need to be added.

Question to Variable mappings

These are loaded by navigating to Admin > Instruments > search for the instrument prefix > IMPORT MAPPINGS. Choose files and select the qv.txt file you want to import. See the format specifications below for details of the qv.txt file. You can also import question to topic mappings (tq.txt) at the same time. From the dropdown, select whether each file is Q-V Mapping or T-Q Mapping.

Note: the Heroku in-out-worker server must be turned on for the files to be imported.

The state will change from pending, to running, then to success or failure once imported. If the state is failure, select VIEW LOGS under Actions. The state will be failure if the whole file fails, or if even one record (row) in the file is invalid.

Common reasons for failures:

  • The incorrect file has been selected
  • The incorrect mapping type has been selected
  • The file format is not correct
  • The instrument and dataset are not linked in Archivist
  • The content is not valid
    • The names are not in the correct case
    • The variable is not in the dataset
    • Typos or spaces in the names
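Several of these failures can be caught before importing. The sketch below checks one qv.txt line for the column count, stray spaces and whether the variable name exists in the dataset (matching case exactly). The function name `validate_qv_line` and the example names are hypothetical, and the set of dataset variables is assumed to be supplied by you (for example, from variables.txt); Archivist's own validation may differ.

```python
def validate_qv_line(line, dataset_variables):
    """Return a list of problems found in one qv.txt line (empty if none)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 4:
        return [f"expected 4 tab-separated columns, found {len(fields)}"]

    problems = []
    for name in fields:
        if any(ch.isspace() for ch in name):
            problems.append(f"unexpected space in {name!r}")

    variable = fields[3]
    if variable not in dataset_variables:
        # Names are case-sensitive, so report a case mismatch separately.
        if variable.lower() in {v.lower() for v in dataset_variables}:
            problems.append(f"variable {variable!r} has the wrong case")
        else:
            problems.append(f"variable {variable!r} is not in the dataset")
    return problems


# Made-up variable names for illustration.
variables = {"var1", "var2"}
good = validate_qv_line("ncds_65_eq_ccs01\tq1\tncds_65_eq\tvar1", variables)
wrong_case = validate_qv_line("ncds_65_eq_ccs01\tq1\tncds_65_eq\tVAR1", variables)
missing = validate_qv_line("ncds_65_eq_ccs01\tq1\tncds_65_eq\tvar9", variables)
short = validate_qv_line("ncds_65_eq_ccs01\tq1\tncds_65_eq", variables)
spaced = validate_qv_line("ncds_65_eq_ccs01\tq 1\tncds_65_eq\tvar1", variables)
```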

If there are only a couple of invalid records, you can fix them using the Archivist interface (see Map Questions, Variables and Topics). If there are many issues, or the issues are systematic or file related, update the qv.txt file and re-import.

Note: When re-importing, the current mappings will be replaced, so you must include all the question to variable mappings you want to import in the file, not only the records (rows) you want to update.

Note: All manual mappings will be replaced by the imported qv.txt mapping file. If you do not want this to happen, export the qv.txt first, update it, and then import the updated file.

Variable to Topic mappings

Questions and variables which are mapped together must have the same topic. A question mapped to a variable which has been linked to a topic will automatically inherit that topic from the variable. After the question to variable mappings have been imported successfully, it therefore makes sense to add the variable to topic mappings, so that the questions inherit those topics.

These are loaded by navigating to Admin > Datasets > search for the dataset prefix > IMPORT MAPPINGS. Choose files and select the tv.txt file you want to import. See the format specifications below for details of the tv.txt file. You can also import derived variable mappings (dv.txt) at the same time. From the dropdown, select whether each file is T-V Mapping or D-V Mapping.

Note: the Heroku in-out-worker server must be turned on for the files to be imported.

The state will change from pending, to running, then to success or failure once imported. If the state is failure, select VIEW LOGS under Actions. The state will be failure if the whole file fails, or if even one record (row) in the file is invalid.

In addition to the common reasons for failures listed above, other reasons include:

  • Topic conflicts - questions and variables which are mapped together must have the same topic. If a different topic has already been assigned to a question which has been mapped to that variable, the record will be invalid
  • A variable has been mapped to 0 (no topic)
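A topic conflict of this kind can be looked for before importing. The sketch below is a simplified reading of the rule, not Archivist's implementation: it assumes an unassigned topic (0) never conflicts, while any other pair of differing topic IDs, including None (000), does. The function name and the example names and topic IDs are hypothetical.

```python
NO_TOPIC = "0"  # topic not assigned; distinct from the None topic "000"


def find_topic_conflicts(qv_pairs, question_topics, variable_topics):
    """List (question, variable, question_topic, variable_topic) conflicts."""
    conflicts = []
    for question, variable in qv_pairs:
        q_topic = question_topics.get(question, NO_TOPIC)
        v_topic = variable_topics.get(variable, NO_TOPIC)
        # Assumption: if either side has no topic, there is nothing to clash with.
        if NO_TOPIC in (q_topic, v_topic):
            continue
        if q_topic != v_topic:
            conflicts.append((question, variable, q_topic, v_topic))
    return conflicts


# Made-up mappings; topic IDs are kept as strings so "0" and "000" stay distinct.
qv_pairs = [("q1", "var1"), ("q2", "var2"), ("q3", "var3")]
question_topics = {"q1": "101", "q2": "000"}
variable_topics = {"var1": "102", "var2": "101", "var3": "101"}
conflicts = find_topic_conflicts(qv_pairs, question_topics, variable_topics)
```

Here q3 has no topic of its own, so it is not reported: it would simply inherit the topic of var3.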

If there are only a couple of invalid records, you can fix them using the Archivist interface (see Map Questions, Variables and Topics). If there are many issues, or the issues are systematic or file related, update the tv.txt file and re-import.

Note: When re-importing, previously imported topic mappings will not be replaced unless they have been updated in the file.

Note: All manual mappings will be replaced by the imported tv.txt mapping file. If you do not want this to happen, export the tv.txt first, update it, and then import the updated file.

Note: Only one topic can be applied to a grid as a whole. This means that all variables mapped to a grid question must have the same topic.
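The grid rule can be checked with a short script. This sketch assumes grid cells are written as a question label followed by a $X;Y suffix, as in the qv.txt specification, and groups variables by the label before the `$`. The function name and example data are hypothetical.

```python
from collections import defaultdict


def grids_with_mixed_topics(qv_pairs, variable_topics):
    """Map each question label to its variable topics where more than one is used."""
    topics_by_question = defaultdict(set)
    for question_label, variable in qv_pairs:
        # Drop the optional grid cell suffix, e.g. "grid1$1;2" -> "grid1".
        base_label = question_label.split("$", 1)[0]
        topic = variable_topics.get(variable)
        if topic is not None:
            topics_by_question[base_label].add(topic)
    return {
        label: sorted(topics)
        for label, topics in topics_by_question.items()
        if len(topics) > 1
    }


# Made-up grid mappings and topic IDs.
qv_pairs = [
    ("grid1$1;1", "var1"),
    ("grid1$1;2", "var2"),
    ("grid2$1;1", "var3"),
    ("grid2$2;1", "var4"),
]
variable_topics = {"var1": "101", "var2": "102", "var3": "101", "var4": "101"}
mixed = grids_with_mixed_topics(qv_pairs, variable_topics)
```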

Note: There is a difference between a topic not being assigned (0) and a topic mapped to None (000), which is considered a topic, and can cause conflicts. 

Question to Topic mappings

After the question to variable mappings have been imported successfully, you can import the tv.txt mapping file as above. Alternatively, if you only have question to topic mappings, or you have both tv.txt and tq.txt, you can import those next or at the same time.

These are loaded by navigating to Admin > Instruments > search for the instrument prefix > IMPORT MAPPINGS. Choose files and select the tq.txt file you want to import. See the format specifications below for details of the tq.txt file. You can also import the question to variable mappings (qv.txt) at the same time. From the dropdown, select whether each file is T-Q Mapping or Q-V Mapping.

Note: the Heroku in-out-worker server must be turned on for the files to be imported.

The state will change from pending, to running, then to success or failure once imported. If the state is failure, select VIEW LOGS under Actions. The state will be failure if the whole file fails, or if even one record (row) in the file is invalid; see the common reasons listed above.

If there are only a couple of invalid records, you can fix them using the Archivist interface (see Map Questions, Variables and Topics). If there are many issues, or the issues are systematic or file related, update the tq.txt file and re-import.

Note: Mapping a question to 0 (i.e. no topic) will give an invalid error, but the question will still be mapped to 0. You can use this to reset a question back to having no topic. This is not the same as the None topic.

Note: When re-importing, the current question to topic mappings will be replaced, so you must include all the question to topic mappings you want to import in the file, not only the records (rows) you want to update.

Note: All manual mappings will be replaced by the imported tq.txt mapping file. If you do not want this to happen, export the tq.txt first, update it, and then import the updated file.

Note: Only one topic can be applied to a grid as a whole. This means that all variables mapped to a grid question must have the same topic.

Note: There is a difference between a topic not being assigned (0) and a topic mapped to None (000), which is considered a topic, and can cause conflicts. 

Derived variable mappings

Derived variables do not map directly to a question, but are created using other variables. A derived variable will have at least two source variables mapped to it.

These are loaded by navigating to Admin > Datasets > search for the dataset prefix > IMPORT MAPPINGS. Choose files and select the dv.txt file you want to import. See the format specifications below for details of the dv.txt file. You can also import variable to topic mappings (tv.txt) at the same time. From the dropdown, select whether each file is D-V Mapping or T-V Mapping.

Note: the Heroku in-out-worker server must be turned on for the files to be imported.

The state will change from pending, to running, then to success or failure once imported. If the state is failure, select VIEW LOGS under Actions. The state will be failure if the whole file fails, or if even one record (row) in the file is invalid.

Note: Derived variables and their source variables do not have to have the same topic, although a suggested topic is inherited from the source variable. 

Format specifications

qv.txt

Mapping file which links questions and variables.

  • Tab delimited
  • 4 columns:

1. Questionnaire prefix with _ccs01 suffix

2. Question label (with optional suffix grid cell coordinates in the format $X;Y. Please refer to the Grid coordinates table on the condition page for how to reference grid cells.)

3. Dataset prefix (which usually matches the questionnaire prefix but without the suffix)

4. Variable name
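As an illustration of this layout, the sketch below builds the text of a small qv.txt file. The question labels and variable names are made up, and it assumes the file has no header row and one mapping per line; the second row shows a grid cell using the $X;Y suffix.

```python
def format_qv_rows(rows):
    """Join each 4-column row with tabs, one mapping per line."""
    lines = []
    for row in rows:
        if len(row) != 4:
            raise ValueError(f"qv.txt rows need 4 columns, got {len(row)}: {row!r}")
        lines.append("\t".join(row))
    return "\n".join(lines) + "\n"


# Hypothetical mappings: questionnaire prefix + _ccs01, question label,
# dataset prefix, variable name.
rows = [
    ("ncds_65_eq_ccs01", "q1", "ncds_65_eq", "var1"),
    ("ncds_65_eq_ccs01", "grid1$1;2", "ncds_65_eq", "var2"),
]
qv_text = format_qv_rows(rows)

# To save the file:
# with open("qv.txt", "w", encoding="utf-8", newline="") as handle:
#     handle.write(qv_text)
```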

tv.txt

Mapping file which links topics and all variables.

  • Tab delimited
  • 3 columns:

1. Dataset prefix

2. Variable name

3. Topic ID (Uses Colectica topic IDs)
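When reading a tv.txt file in a script, it is safest to keep the topic ID as text. Converting it to a number would make 0 (no topic) and 000 (the None topic) indistinguishable. The parser below is a sketch under that assumption; the function name, variable names and topic IDs are made up.

```python
def parse_tv(text):
    """Parse tv.txt content into {(dataset_prefix, variable_name): topic_id}."""
    mapping = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(
                f"line {line_number}: expected 3 columns, found {len(fields)}"
            )
        dataset, variable, topic_id = fields
        mapping[(dataset, variable)] = topic_id  # kept as a string on purpose
    return mapping


tv_text = (
    "ncds_65_eq\tvar1\t101\n"
    "ncds_65_eq\tvar2\t000\n"
    "ncds_65_eq\tvar3\t0\n"
)
tv = parse_tv(tv_text)
```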

tq.txt 

Mapping file which links topics and questions (this can be inherited using the qv and tv mappings).

  • Tab delimited
  • 3 columns:

1. Questionnaire prefix with _ccs01 suffix

2. Question label

3. Topic ID (Uses Colectica topic IDs)
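Archivist applies the inheritance itself once the qv and tv mappings are loaded, but it can be useful to preview which question topics would result. The sketch below derives tq-style rows from qv and tv rows. It is an approximation under stated assumptions: the grid cell suffix is dropped from the question label, variables with no topic (0) are skipped, and the first topic found for a question is kept without checking for conflicts. All names are hypothetical.

```python
def derive_tq(qv_rows, tv_rows):
    """Preview (instrument, question, topic) rows inherited via qv and tv rows."""
    variable_topic = {
        (dataset, variable): topic for dataset, variable, topic in tv_rows
    }
    tq = {}
    for instrument, question_label, dataset, variable in qv_rows:
        topic = variable_topic.get((dataset, variable))
        if topic is None or topic == "0":
            continue  # nothing to inherit
        question = question_label.split("$", 1)[0]  # drop any grid cell suffix
        tq.setdefault((instrument, question), topic)  # keep the first topic seen
    return [(inst, question, topic) for (inst, question), topic in tq.items()]


# Made-up rows in the qv.txt and tv.txt column order.
qv_rows = [
    ("ncds_65_eq_ccs01", "q1", "ncds_65_eq", "var1"),
    ("ncds_65_eq_ccs01", "grid1$1;1", "ncds_65_eq", "var2"),
    ("ncds_65_eq_ccs01", "grid1$1;2", "ncds_65_eq", "var3"),
    ("ncds_65_eq_ccs01", "q4", "ncds_65_eq", "var4"),
]
tv_rows = [
    ("ncds_65_eq", "var1", "101"),
    ("ncds_65_eq", "var2", "102"),
    ("ncds_65_eq", "var3", "102"),
    ("ncds_65_eq", "var4", "0"),
]
tq_rows = derive_tq(qv_rows, tv_rows)
```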

dv.txt

Mapping file which links derived variables to source variables.

  • Tab delimited
  • 4 columns:

1. Dataset prefix

2. Derived variable name

3. Dataset prefix

4. Source variable name
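Since a derived variable is expected to have at least two source variables, a dv.txt file can be checked for derived variables with too few sources. The sketch below assumes one row per derived variable and source variable pair, with columns 1 and 2 describing the derived variable and columns 3 and 4 the source variable; the function name and the variable names are made up.

```python
from collections import defaultdict


def sources_by_derived(dv_text):
    """Group source variables under each (dataset_prefix, derived_variable)."""
    sources = defaultdict(list)
    for line in dv_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        derived_dataset, derived, source_dataset, source = line.split("\t")
        sources[(derived_dataset, derived)].append((source_dataset, source))
    return dict(sources)


dv_text = (
    "ncds_65_eq\tdvar1\tncds_65_eq\tvar1\n"
    "ncds_65_eq\tdvar1\tncds_65_eq\tvar2\n"
    "ncds_65_eq\tdvar2\tncds_65_eq\tvar3\n"
)
grouped = sources_by_derived(dv_text)

# Derived variables with fewer than two source variables.
too_few = [name for (_, name), srcs in grouped.items() if len(srcs) < 2]
```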

Reviewing the mappings

Dataset and instrument views can be used to check for gaps and conflicts.

  • Dataset view allows you to check Question-Variable, Derived variable, and Variable-topic mappings. Datasets > search and select the Name of the dataset. Any mapping conflicts will appear in red.
  • Instrument Map view allows you to check Question-Variable and Question-topic mappings. On the instrument page search for the prefix then select MAP. Any mapping conflicts will appear in red.
  • Instrument view allows you to check Question-Variable mappings. Instruments > search and select the Prefix (or view) of the instrument. This gives the questionnaire view, with the variable names listed against the questions.
  • All_mappings.txt view allows you to check and download Question-Variable, Variable-topic and Question-topic mappings. On the instrument page search for the prefix then select MAP, then click Download File. This will list: Question name, Question text, Question Topic ID, Variable name, Variable Topic ID, and Variable Label. Note: it doesn't include derived variable mappings.

Viewing and downloading mappings

Most .txt files can be viewed from one Instrument Export page. Navigate to Admin > Instrument exports > search for the prefix then select VIEW e.g. https://closer-archivist.herokuapp.com/admin/instruments/ncds_65_eq/exports

  • QV Question Variables - Download qv.txt - To view the qv.txt file in the format described above. Note: this will only include questions which have variables mapped to them.
  • TQ Topic Questions - Download tq.txt - To view the tq.txt file in the format described above. Note: this will only include questions which have topics mapped to them.
  • Mapper Question and sequences - Download mapper.txt - To view the mapper.txt file which contains the sequences, question name, sequence ID and question text.
  • CC Questions Construct Questions - Download cc_questions.txt - To view cc_questions.txt which contains the question name and text for question items and question grid sub-questions.
  • variables.txt - To view the variables for the dataset linked to the questionnaire, includes variable name, label and whether it is normal or a derived variable. 
  • tv.txt - To view the tv.txt file in the format described above. Note: this will include all variables whether they are mapped to a topic or not. 
  • dv.txt - To view the dv.txt file in the format described above. 

If there are no question mappings, variable mappings can also be viewed by navigating to Admin > Datasets > search for the prefix then select VIEW e.g. https://closer-archivist.herokuapp.com/admin/datasets/130/exports