Finding and Working with Microdata in <odesi>
<odesi> is a web-based data discovery, extraction and analysis tool. It includes data that can be used for research on topics related to education, health, the environment, society and the economy.
With <odesi> you can explore data online before downloading datasets. This can be particularly useful if you are not sure what surveys to use, or if you simply want to examine a particular dataset.
<odesi> also lets you build bar charts and tables, and produce frequency distributions, cross-tabulations and correlations. These outputs, as well as entire datasets, can be downloaded to your computer.
Go to the Data Library Website: http://data.library.utoronto.ca/. In the table labelled Microdata click the <odesi> link under Canada.
This will open the <odesi> homepage.
There are two ways to find data in <odesi>. You can either search <odesi> with keywords or you can browse its content.
If you know the name of the survey you want to access, or if you remember certain keywords about it, use the Search box.
The more keywords you type in, the more precise your search will be. You can add search box rows by clicking on the blue plus sign at the end of the search box. Type each keyword into a separate search box.
Once you are done typing appropriate keywords into the search, press Search. Your search results will display below the search box. If you want to explore individual results in more detail, click Explore & Download.
If you want to explore or browse the content of <odesi>, you can do so by clicking the browse button on the top right of <odesi>’s homepage.
Clicking this link will open the <odesi> repository in a separate window. On the left you can see topical headings under which you can find surveys and datasets.
To open a survey, for example the “Canadian Community Health Survey – Healthy Aging from 2008-2009,” click on the plus sign next to Health, then next to Canada, then next to Canadian Community Health Survey, and finally next to 2008-2009. After that, click on the table symbol next to the survey name to open the survey.
a) Once you have clicked on the table symbol next to the title of a survey, you can see the survey’s abstract in the frame on the right. You will notice that you are now in the Description Tab.
b) To see the variables of a survey, click on the plus sign next to Variable Description on the left. You will notice that the variables are grouped into topics. Click on the plus sign next to a topic to see the variables. In this case, we clicked on the Care giving heading to get a list of variables under this topic. Variables can be identified by the graph symbol found to their left.
c) If you click on an individual survey variable, for instance “Provide assist. –medical needs,” you can explore the variable’s name and label, its frequency information, the literal survey question and summary statistics:
To create tables in <odesi> you need to be in the Tabulation Tab:
In the following section we will create a table based on data from the “Canadian Community Health Survey – Healthy Aging from 2008-2009.” The table will show the variable Frequency of having more than 5 drinks (alcohol use) by province.
Open the <odesi> Repository (Nesstar). To open the “Canadian Community Health Survey – Healthy Aging from 2008-2009,” click on the plus sign next to Health, then next to Canada, then next to Canadian Community Health Survey and finally next to 2008-2009. After that, click on the table symbol next to the survey name to open the survey.
Select the Tabulation Tab by clicking on Tabulation. In order to add variables to the tabulation, you need to open the variables list on the left. Do so by clicking on the plus sign next to the Variable Description heading on the left.
Under the variable heading Alcohol use you can find different variables related to alcohol use. To add a variable to a table, click on it. Select add to row to see a frequency distribution in the frame on the right.
The frequency distribution will then be displayed.
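For readers who want to see what a frequency distribution computes, here is a minimal Python sketch. It is not how <odesi> works internally; the response categories are invented for illustration and do not come from the survey.

```python
from collections import Counter

# Hypothetical responses; the category labels are invented for illustration.
responses = ["Never", "Once a month", "Never", "Once a week", "Never", "Once a month"]

def frequency_table(values):
    """Return (category, count, percent) rows, most frequent first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(cat, n, round(100 * n / total, 1)) for cat, n in counts.most_common()]

for category, count, percent in frequency_table(responses):
    print(f"{category:<15}{count:>5}{percent:>8}%")
```

Each row gives a category, the number of respondents in it, and that number as a share of all responses.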
If you add more variables to your table, you create a cross tabulation. For example, if we add a geographical unit to our frequency distribution, we create a cross tabulation showing frequency of drinks by province.
Click on the variable you want to add to your table, and select add to column.
You can see the following cross-tabulation in the right frame.
Note: If more than one variable is added to a row or column, the variables will appear nested in the table.
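A cross tabulation counts how many respondents fall into each combination of a row category and a column category. The Python sketch below illustrates the idea under stated assumptions: the records and category labels are hypothetical, not survey data.

```python
from collections import Counter

# Hypothetical (drinking frequency, province) pairs, invented for illustration.
records = [
    ("Never", "Ontario"), ("Never", "Quebec"), ("Weekly", "Ontario"),
    ("Monthly", "Quebec"), ("Never", "Ontario"), ("Weekly", "Manitoba"),
]

def crosstab(pairs):
    """Count how often each (row, column) combination occurs."""
    cells = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    return rows, cols, cells

rows, cols, cells = crosstab(records)

# Print a header of column categories, then one line per row category.
print(" " * 10 + "".join(f"{c:>10}" for c in cols))
for r in rows:
    # A Counter returns 0 for combinations that never occur.
    print(f"{r:<10}" + "".join(f"{cells[(r, c)]:>10}" for c in cols))
```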
Above a table there are a number of drop-down boxes, one for each variable and a type (or measure) box. These boxes enable you to move variables or to choose a different category for a variable.
For instance, if you only want to see the provinces of Quebec, Ontario, Manitoba and Saskatchewan in your table, click on Change Selection in the drop-down box.
A new window opens in which you select the categories you want to have displayed in your table. Click Ok, and the table in <odesi> will automatically update.
If you want to see separate tables for each province, add the “Province of residence of respondent” variable as a filter instead of as a column (either when selecting the variable from the list on the left, or later from the drop-down menu).
By changing a variable to filter, only one category of the chosen variable will be visible in the table at any one time. A ‘variable’ box will be created above the table from which other categories can be selected.
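Conceptually, a filter restricts the table to respondents in one category of the filter variable. The following Python sketch shows that idea with hypothetical records; the function name and the data are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical (drinking frequency, province) pairs, invented for illustration.
records = [
    ("Never", "Ontario"), ("Never", "Quebec"), ("Weekly", "Ontario"),
    ("Monthly", "Quebec"), ("Never", "Ontario"), ("Weekly", "Manitoba"),
]

def filtered_frequencies(pairs, province):
    """Frequency distribution for respondents in a single province only."""
    return Counter(freq for freq, prov in pairs if prov == province)

# One table per province, shown one at a time, as with a filter variable.
for province in ("Ontario", "Quebec"):
    print(province, dict(filtered_frequencies(records, province)))
```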
Adding a Measure
It is also possible to add continuous variables, e.g. age, or a person’s height measurement, to a table as a measure. For example, select the “Canada Fitness Survey 1981,” to be found under Health > Canada > Canada Fitness Survey. Once in the Tabulation Tab, select these variables from the Demographics heading: ‘Marital status’ as a row variable, ‘Sex’ as a column variable, and ‘Age’ as the measure variable. The data displayed represents the average age for each combination of ‘Marital status’ and ‘Sex.’
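The measure table described above can be thought of as a group-by average: respondents are grouped by each combination of row and column category, and the measure is averaged within each group. A minimal Python sketch, using invented respondents rather than Canada Fitness Survey data:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical respondents: (marital status, sex, age). Values are invented.
respondents = [
    ("Married", "Female", 40), ("Married", "Female", 50),
    ("Married", "Male", 45), ("Single", "Female", 22),
    ("Single", "Male", 30), ("Single", "Male", 26),
]

def mean_by_cell(rows):
    """Average the measure (age) within each (marital status, sex) cell."""
    groups = defaultdict(list)
    for status, sex, age in rows:
        groups[(status, sex)].append(age)
    return {cell: mean(ages) for cell, ages in groups.items()}

averages = mean_by_cell(respondents)
for (status, sex), avg in sorted(averages.items()):
    print(f"{status:<10}{sex:<8}{avg:>6.1f}")
```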
When a tabulation analysis is conducted, you have the option of displaying your analysis graphically. To visualize your data, click on the chart symbol on the upper right-hand side. Once clicked, you can see the list of graphs that are available to you.
You can visualize your data in three ways:
Bar charts
Are available when there are one or more variables in the table (except where there is only one measure variable).
Stacked bar charts
Are available when there are two or more variables in the table (but no measure variable).
Are available when there are two or more variables in the table (but no measure variable).
To perform a correlation, you must use the Analysis Tab.
In the example below, we will create a correlation between the lack of companionship and depression. We will use the “Canadian Community Health Survey – Healthy Aging from 2008-2009.”
1) Select the “Canadian Community Health Survey – Healthy Aging from 2008-2009” in the <odesi> Repository (Nesstar). Once you have done that, click on the Analysis Tab. The following page opens:
2) Click on the plus sign next to Variable Description in the list on the left to see the variable headings. By exploring them you will see that the variable “Lack of companionship” can be found under the Loneliness variables, and the variable “Depressed >= 2weeks” can be found under the Depression variables.
To create a correlation with the variables “Lack of companionship” and “Depressed >= 2 weeks,” click on them separately and select Add to correlation. Once you have selected your variables, your correlation is displayed in a table in the right frame.
Note: Two display options are available: ‘Significance’ and ‘Count.’ Check one or both options, then Update, to alter the displayed information.
You can also visualize your correlation by clicking on the scatterplot symbol on the upper right.
To return to the table, click the table symbol in the upper right corner.
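Note: <odesi> allows for both listwise correlation analysis and pairwise correlation analysis. Listwise analysis uses only respondents with valid answers on every selected variable, whereas pairwise analysis uses, for each pair of variables, all respondents with valid answers on that pair. The latter is particularly useful when you have more than two variables in your correlation, or when you have checked Significance and Count under Display Options.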
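The difference between listwise and pairwise handling of missing values can be sketched in Python. This is an illustration only: it assumes a Pearson correlation coefficient (the guide does not state which coefficient <odesi> reports), and the three variables and their values are hypothetical, with None marking a missing answer.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def correlations(data, mode):
    """Return {(var_a, var_b): (r, count)} using 'listwise' or 'pairwise' deletion."""
    names = list(data)
    n_rows = len(data[names[0]])
    # Listwise: keep only rows with no missing value on ANY variable.
    complete = [i for i in range(n_rows)
                if all(data[v][i] is not None for v in names)]
    result = {}
    for a, b in combinations(names, 2):
        if mode == "listwise":
            keep = complete
        else:
            # Pairwise: keep rows that are valid on this pair of variables.
            keep = [i for i in range(n_rows)
                    if data[a][i] is not None and data[b][i] is not None]
        r = pearson([data[a][i] for i in keep], [data[b][i] for i in keep])
        result[(a, b)] = (round(r, 3), len(keep))
    return result

# Hypothetical scores; None marks a missing answer.
data = {
    "lonely":    [1, 2, 3, 4, 5, None],
    "depressed": [2, 1, 4, 3, None, 6],
    "isolated":  [1, None, 2, 4, 5, 6],
}

listwise = correlations(data, "listwise")
pairwise = correlations(data, "pairwise")
for pair in listwise:
    print(pair, "listwise:", listwise[pair], "pairwise:", pairwise[pair])
```

With these invented values, every listwise correlation is based on the same three complete rows, while each pairwise correlation uses four rows. This is why the Count option matters more for pairwise analysis: the count can differ from one pair of variables to the next.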
To start a new analysis, select the corresponding symbol on the upper right.
Survey datasets may include one or more weighting variables. These typically correct for unequal selection probabilities, so that the results better represent the surveyed population.
Select the Weight option on the upper right corner, and any predefined weighting variables will be displayed. Click on the weighting variable listed in the left-hand box to select it.
Then click on the right facing arrow between boxes to move the weighting variable to the box on the right. Then click Ok to apply the weighting.
Note: You can do this any time during your use of <odesi>, that is to say, either before or after selecting variables.
It is possible to use a variable that is not pre-defined as a weight in the dataset. In that case, click on the weight option symbol, select a variable from the browsable list on the left and choose ‘Add as weight.’ The variable will then appear in the box on the right.
To indicate that a weight variable is in use, the message ‘Weight is on’ appears below the results of any analyses, graphics, or tabulations. Hovering over this message displays the label of the weighting variable.
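To see why weighting changes results, consider the following Python sketch. Instead of counting each respondent once, a weighted frequency adds up each respondent's weight. The categories and weights are hypothetical, and the sketch is not a description of how <odesi> applies weights internally.

```python
from collections import defaultdict

# Hypothetical (answer, survey weight) pairs, invented for illustration.
respondents = [("Yes", 120.0), ("No", 80.0), ("Yes", 40.5), ("No", 200.0), ("No", 59.5)]

def weighted_frequencies(rows):
    """Return {category: (sum of weights, percent of total weight)}."""
    totals = defaultdict(float)
    for category, weight in rows:
        totals[category] += weight
    grand_total = sum(totals.values())
    return {cat: (t, round(100 * t / grand_total, 1)) for cat, t in totals.items()}

weighted = weighted_frequencies(respondents)
for category, (total, percent) in weighted.items():
    print(f"{category:<5}{total:>8.1f}{percent:>7}%")
```

In this invented sample, 2 of 5 respondents (40%) answered “Yes” before weighting, but “Yes” accounts for a smaller share of the total weight, because those respondents carry smaller weights.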
Datasets can be very large and can thus take a long time to sort through. Researchers often use only a small selection of the data in a particular dataset. If you know that you only require a small part of a dataset, it is best to create a data subset.
This section will help you create and download a subset in <odesi>. You will create a subset of the “Canadian Community Health Survey – Healthy Aging from 2008-2009” that will include the following variables:
Frequency of having more than 5 drinks (alcohol use)
Feeling isolated from others (loneliness)
Immigrant (socio-demographic variable)
Province of residence of respondent (geographic variable)
The subset will be created in <odesi> and the output file will be an SPSS file.
To open a survey, for example the “Canadian Community Health Survey – Healthy Aging from 2008-2009,” click on the plus sign next to Health, then next to Canada, then next to Canadian Community Health Survey and finally next to 2008-2009.
To select the Canadian Community Health Survey, click on the table symbol next to the survey name. After that, click on the plus sign next to Variable Description to see the survey variables.
Because you want to create and download a subset from this survey, click on the download symbol on the upper right.
The download page will then open. Click on the Subset button to select the variables you want to have included in your subset.
The following page opens:
You are now ready to select all of the variables you want to include in your subset. You do this by selecting them from the variable list on the left.
The variables that can be selected have the following sign in front of them:
Click on the variable you want to have in your subset. <odesi> will give you the option to add it to your subset.
To select more variables, go back to the list of variables on the left-hand side and add them to the subset.
Continue to add the variables you want to have in your subset.
Once you have added all the variables, double check the variables now listed under the Variables Tab and click OK.
You now need to select the format of your subset. Click on Select Data Format to see the available formats. For this example, select SPSS.
Once you have selected your format, click on Download next to the format window.
Your subset will be downloaded in a zip file. Save it in your folder, then unzip it to see your subset.
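A subset is simply the full dataset reduced to the variables you chose. The Python sketch below illustrates the idea on a small in-memory CSV table, since Python's standard library cannot read SPSS files directly. The column names and values are hypothetical stand-ins for the four variables listed above, not the actual variable names in the survey.

```python
import csv
import io

# Hypothetical full dataset; column names and values are invented for illustration.
full_dataset = io.StringIO(
    "resp_id,drinks_5plus,feels_isolated,immigrant,province,height_cm\n"
    "1,Never,Rarely,No,Ontario,170\n"
    "2,Monthly,Often,Yes,Quebec,165\n"
)

# The variables to keep in the subset.
wanted = ["drinks_5plus", "feels_isolated", "immigrant", "province"]

def subset_columns(source, columns):
    """Read CSV rows and keep only the requested columns, in the given order."""
    reader = csv.DictReader(source)
    return [{name: row[name] for name in columns} for row in reader]

subset = subset_columns(full_dataset, wanted)

# Write the subset back out as CSV text.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=wanted)
writer.writeheader()
writer.writerows(subset)
print(out.getvalue())
```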
For questions or comments, please contact the Map and Data Library, located on the 5th floor of Robarts Library.