Headings: !!Data Analysis Tools and Applications !!!!*Excel* !!!!*GIS* !!!!NetSurvey !!!!*Sociological Insights* !!!!*Stata* !!!!*SPSS* !!!!*StudentCHIP* !!!!Survey Documentation and Analysis (SDA)
Excel :Spreadsheet and data analysis package. Very good for basic graphs and charts.
Geographic Information Systems (GIS) Tools :Mapping and spatial analysis.
NetSurvey :In-class web survey tool. Also useful for on-line exams.
Sociological Insights :Freeware data analysis package.
Stata :Multiple purpose statistical package.
SPSS :Another commonly used and user-friendly statistical package.
StudentCHIP :Basic statistical package and its internet version, WebCHIP.
Survey Documentation & Analysis (SDA) :Online tool for data extraction and analysis.
- Excel is the Swiss Army knife of data analysis. By combining its extensive library of formulas and the ability to "point" or link to the contents of other cells, you can make summary reports, tables and graphs - all of which can be updated with new information on-the-fly. As an introduction to its capabilities, start with the following two resources: Graphing with Excel - a collection of tutorials on spreadsheet, graphing and data analytical features, and Analyzing Data with Microsoft Excel 2002 (1 MB Microsoft Word file). Microsoft's Excel website also offers several tutorials on many different features of Excel. Below we highlight some references for several basic functions in Excel:
- Spreadsheet basics:
- Graphs: "Insert > Chart..."
- Importing data: "Data > Text to columns..."
- Cross-tabulation: "Data > PivotTable and PivotChart Report..."
- The real value of PivotTables is the ability to drag and drop row, column and cell variables on-the-fly. PivotTables are discussed on pages 11-16 of the Microsoft tutorial and in a tutorial from Duke University's Business School. One helpful hint: to get the contents of the table without the formatting, and without duplicating the PivotTable itself, select the PivotTable, then go to Edit > Copy and use Edit > Paste Special... > Values.
- Statistical tests: Tools > Data Analysis...
- Excel can do all of the following: ANOVA, correlation, covariance, descriptive statistics, exponential smoothing, F-test, Fourier analysis, moving average, random number generation, rank and percentile, regression, sampling, t-tests and z-tests. However, this functionality is off by default. To turn it on for the current and future sessions, go to Tools > Add-Ins... and check the Analysis ToolPak box.
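For readers who want to see what a PivotTable cross-tabulation and the descriptive statistics report compute, here is a minimal sketch in Python rather than Excel. The survey records, the field layout, and the function names `crosstab` and `describe` are all hypothetical and chosen only for illustration; this is not how Excel implements either feature.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical survey records: (education, voted, income in $1000s).
RESPONSES = [
    ("college", "yes", 72),
    ("college", "yes", 65),
    ("college", "no", 58),
    ("high school", "yes", 41),
    ("high school", "no", 38),
    ("high school", "no", 35),
]


def crosstab(records, row_index, col_index):
    """Count records for each (row value, column value) pair, like a PivotTable of counts."""
    counts = Counter((rec[row_index], rec[col_index]) for rec in records)
    rows = sorted({r for r, _ in counts})
    cols = sorted({c for _, c in counts})
    # Build a nested dict so that empty combinations show up as 0.
    return {r: {c: counts.get((r, c), 0) for c in cols} for r in rows}


def describe(values):
    """Return a few common descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": mean(values),
        "stdev": stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


table = crosstab(RESPONSES, row_index=0, col_index=1)
for row_label, cells in table.items():
    print(row_label, cells)

incomes = [rec[2] for rec in RESPONSES]
print(describe(incomes))
```

Swapping `row_index` and `col_index` corresponds to dragging a variable from the row area to the column area of a PivotTable.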
- Here are two innovative resources that use Excel for data analysis:
- Geographic Information Systems (GIS) software allows data visualization through the use of thematic maps that show the spatial distribution of variables such as income, race, or education within geographic areas such as state boundaries, zip codes, or census tracts. These maps allow multiple geographical features and variables to be analyzed simultaneously. GIS software also includes relational database capabilities that allow multivariate analyses such as multi-level cross-tabulations (through the use of queries). For example, one could map the migration patterns of college-educated Asian Americans who earn more than $60K annually in Los Angeles County. Moreover, the output can extend beyond maps to any picture or image.
- UCLA's Academic Technology Services (ATS) offers a good overview and introduction to various Geographic Information Systems (GIS) resources.
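The kind of multi-condition query described above can be sketched in Python, under heavy simplifying assumptions. The `Person` record, its fields, the sample data, and the tract labels are all hypothetical, and the sketch only counts matching records per tract (the numbers a thematic map would shade) rather than tracing migration or drawing a map.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    # Hypothetical record layout; real GIS attribute tables will differ.
    county: str
    tract: str
    ethnicity: str
    college_educated: bool
    income: int  # annual income in dollars


PEOPLE = [
    Person("Los Angeles", "tract A", "Asian American", True, 75000),
    Person("Los Angeles", "tract A", "Asian American", True, 52000),
    Person("Los Angeles", "tract B", "Asian American", True, 91000),
    Person("Los Angeles", "tract B", "Hispanic", True, 68000),
    Person("Los Angeles", "tract A", "Asian American", False, 80000),
    Person("Orange", "tract C", "Asian American", True, 99000),
]


def select(people, county, ethnicity, min_income):
    """Keep college-educated people in one county and group earning more than min_income."""
    return [
        p for p in people
        if p.county == county
        and p.ethnicity == ethnicity
        and p.college_educated
        and p.income > min_income
    ]


def count_by_tract(people):
    """Tally the selected people per census tract."""
    return Counter(p.tract for p in people)


matches = select(PEOPLE, "Los Angeles", "Asian American", 60000)
counts = count_by_tract(matches)
print(dict(counts))
```

Each condition in `select` plays the role of one clause in a GIS query; adding or removing a clause changes which records feed the map.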
- The ATS website also provides links to useful GIS sites on the web, including:
- The following sites also provide some introductory information for new users:
- Social Explorer is a new, interactive mapping program created by Andrew Beveridge at CUNY that allows you to create thematic data maps of demographic trends in the U.S. The program is still under development (but already contains a lot of useful features); it is free and open to anyone using the web. Currently, maps can be created mostly from 2000 Census data and some earlier censuses, but there are plans to add data from as far back as 1790! The program focuses on the New York and Los Angeles regions, and permits analyses of change over time. You can read through a basic "getting started" tutorial here.
- The 2000 Census maps can be accessed here, in the "Maps" link off of the home page. From this page, you can examine themes from the drop-down menu, either for the nation as a whole or for a particular state or Census tract. Some useful features of the program include:
- Making a report of the information contained in your map, which can be downloaded to Excel and presented as a tabular complement to the map's visual display. To make a report users can simply click on the "report" button in the mapmaker window once they have created a map.
- Creating a slideshow of several maps. This feature can be used to visually examine change in regions or neighborhoods over time by progressing from one time period to the next. It can also be used to examine a region in successively greater detail, by presenting the broad area first and then maps that zoom in on it. The slideshow option is also accessible in the mapmaker window. In the future, users will be able to save and reuse the slideshows that they create in the program.
- SEDAC (Socioeconomic Data and Applications Center) at Columbia maintains a collection of GIS mapping tools that can be freely used online. For example:
- Demographic Data Viewer is an online mapping tool that allows you to create highly detailed and customizable information maps for 220 demographic variables from 1990 Census data. Regions can be mapped from the national level down to the census tract level, making this a great tool for local and metropolitan level analyses. You can also download the data for any map that you create. There is unfortunately no save feature for the maps you create but you can use the Windows Print-Screen function to save your graphs as an image file. See the tutorial for more information. For example, this sample map of the Hispanic Population in Los Angeles - 1990 took about 15 minutes to create and save.
- US-Mexico Demographic Data Viewer allows interactive mapping of the US-Mexico border region for over 200 socioeconomic variables. The site includes both a tutorial and documentation.
- The Young Research Library at UCLA also offers GIS resources. The library has 4 GIS work stations located in the reference room on the first floor of YRL that are dedicated to UCLA students, faculty and staff. These computers run ArcGIS and the library has a comprehensive CD tutorial (and manual) to get people going on the software.
- YRL licenses GeoLytics Census data from 1970 to 2000. The reference librarians, and especially the YRL Cartographic Information Librarian David Deckelbaum (ddeckelb@library.ucla.edu), can help you gain access to these data.
- YRL offers numerous other databases that have geo-coded statistics or boundary files. Listings can be found on the UCLA Library Online Catalog and are also summarized on the "RIS User Guide G5" handout available in the YRL reference room. In addition to Census 2000 files, holdings include state, regional and county data (business, crime, prisons, roads, schools, population, satellite, health facilities, etc.). YRL also maintains a comprehensive collection of data on Los Angeles and Long Beach.
- UCLA's Social Science Computing (SSC) department has created Class Web Surveys to help UCLA faculty (and students) conduct online surveys in their classes. The surveys can include both multiple choice and open-ended questions. This tool can be used to administer online exams and quizzes as well.
- SSC also provides a demo.
- The Class Web Survey option is embedded in every class web site under the "Other/Advanced Tools" in the administrator's menu. To learn more about Net Survey, go to the SSC page on Interactive Tools, where you can also view the survey as the student would experience it. The page also offers links to classes that have used Net Survey in the past.
- Sociological Insights is a free program developed by Jim Spickard, Professor of Sociology at the University of Redlands, to help students learn data analysis more easily and to think sociologically while developing skills in quantitative reasoning and hands-on data analysis.
- The program can be downloaded from Spickard's web page. It comes with some embedded data sets: The 1994 General Social Survey, 2000 General Social Survey, and state level data. Each data file contains between 100 and 300 variables and has been cleaned and prepared for analysis. Spickard promises to provide additional data files in the future, which he will post on his web page.
- The state level data come from the 2000 Census and other sources. These contain about 280 different variables including geographic and atmospheric data (e.g. average temperature), population characteristics (size, change, % urban, median age, sex ratio, population density etc.), mortality and fertility and other demographic and public health data (obesity rate, alcoholism, smoking, and drinking behavior etc.), health insurance coverage and costs, risk behaviors (gun carrying, etc.), voting patterns, crime and different crime rates, and many others.
- The survey data (1994 and 2000 GSS, in two separate files) contain more than 100 variables each. The variables include basic demographic information (race, gender, education, employment status, prestige of occupation), background variables (parents' occupation, prestige, SEI, education), family characteristics (household, spouse and children), political variables such as voting and political attitudes, risk-taking behavior, and a measure of ability (the GSS vocabulary test). All variables have been categorized, but the categories seem reasonable for classroom use.
- The program has extensive help files and also explains some basic statistical concepts such as correlation and regression.
- Stata is one of the most widely used multiple-purpose statistical software packages. It has become a standard tool for most sociologists, including many in the UCLA Sociology Department.
- Stata is taught in sociology undergraduate classes such as SOC113 (see the version taught by David McFarland) and in other classes, such as Roger Waldinger's seminar "Children of the Immigrants."
- UCLA's Stata Portal has a comprehensive set of links to tutorials, examples, and FAQs about Stata. The site is part of the larger UCLA Statistical Computing Portal. Statistical analyses with Stata are also the subject of several web books, including one on regression analysis and one on using graphics in Stata.
- Stata is installed and available to all UCLA students in the CLICC lab and to students in the social sciences in the SSC labs (on both Macs and PCs).
- Students can purchase a limited version of Stata for 39 USD per year and the full version for 89 USD per year. Students can also purchase a perpetual license for 129 USD (those licenses also include the minimal "Getting Started" manual). To learn more about ordering and buying Stata, visit this ATS page.
- ATS used to offer introductory Stata seminars at the beginning of each quarter but no longer conducts these. Instead, they have notes from the seminars available for download and you can also see movies showing how to operate Stata. If you are considering using Stata in your classes, check with ATS about the possibility of their staff giving an introductory tutorial in your lecture or seminar, which they can tailor to your needs.
- If students become so interested in Stata that they want to learn more, they can take a class in Statistics with Stata offered by the UCLA statistics department (notes are provided through ATS).
- Examples of handouts created specifically for teaching students about Stata are available on some sociology class websites. See, for example, Elizabeth Frankenberg's handout created for Soc. 104 in 2002. Also see Margot Jackson's handouts, created for Elizabeth Frankenberg's Fall 2005 Soc. 20 class: one on basic commands in Stata and one on making tables and graphs.
- Note that Stata 8 has drop down menus that make simple data analysis accessible in a windows-menu driven format that does not require writing any programming files. This makes the program much more accessible for undergraduate classes.
- In addition to Stata, SPSS is a very commonly used statistical package in undergraduate courses. Its "point and click" options make it very user-friendly for beginners.
- Like Stata, SPSS is accessible at the UCLA computing labs: the CLICC lab and the SSC labs.
- SPSS has been used in several undergraduate courses at UCLA. See, for example, the syllabus for Rebecca Davis' Soc. 20 (Intro. to Research Methods) course. Also see the syllabus for Prof. Esfandiari's Applied Statistics Course, Stat. 110B. This syllabus has examples of homework assignments using SPSS.
- The Statistical Computing center of UCLA's Academic Technology Services (ATS) offers extensive web-based tutorials about SPSS. Perhaps the best place to start is the "starter kit" page for SPSS, which links many of the available resources from the same page. These resources include:
- ATS also allows students to borrow books about statistics with SPSS.
- The Texas A&M Statistics Department has put together a comprehensive series of tutorials for SPSS, ranging from installing SPSS to graphing to more advanced regression analysis.
- SDA is "a set of programs for the documentation and Web-based analysis of survey data." Many databases, including those housed at ICPSR such as the General Social Survey, use this set of programs as an interface to allow researchers to access, extract and analyze data using the Web. The advantage is that one can access all of the features online. The disadvantage is that this interface has some important limitations such as no graphing capabilities. The SDA site also offers a gateway to a host of data sets including the GSS and the ICPSR Library. See also SDA's documentation page.
- SDA's extensive Online Help does a good job of explaining how to use its various capabilities, including running frequencies or cross-tabulations. However, we found the help feature less informative on the creation and recoding of variables. As a supplement, we have written a detailed explanation on how to use SDA's RECODE and COMPUTE methods.
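To make the two ideas concrete, here is a minimal Python sketch of what recoding a variable and computing a new variable mean in general. It is not SDA syntax, and the category cut-points, variable names, and function names are hypothetical and chosen only for illustration.

```python
def recode_age(age):
    """Recode: collapse age in years into three ordered categories.

    None stands for a missing value and is passed through unchanged.
    """
    if age is None:
        return None
    if age < 30:
        return "18-29"
    if age < 60:
        return "30-59"
    return "60+"


def compute_income_per_person(household_income, household_size):
    """Compute: derive a new variable from two existing ones.

    Returns None when either input is missing or the household size is zero.
    """
    if household_income is None or not household_size:
        return None
    return household_income / household_size


# Hypothetical respondents: (age, household income in dollars, household size).
respondents = [(24, 30000, 1), (45, 90000, 3), (71, 40000, 2), (None, 50000, 0)]

for age, income, size in respondents:
    print(recode_age(age), compute_income_per_person(income, size))
```

Recoding changes the categories of an existing variable, while computing creates a new variable from others; handling missing values explicitly in both steps avoids silently miscounting cases.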
History
Last edited Monday, 12 December 2005 at 13:54 by mj