CH - 1 - Introduction To Econometrics Software Stata

An applied introduction to basic econometrics software, mainly Stata

Mengistu Yismaw (MSc.)
Department of Economics
Debre Markos University (Burie Campus)
Email: [email protected]

Course objectives
After completing this course, students will be able to:
- Develop skills in using advanced statistical software
- Gain practical experience in data management and analysis
- Develop skills in modern univariate and multivariate statistical analysis
- Develop skills to deal with time series and panel data
- Develop skills in summarizing information for practical decision making
- Compute and generate results, and interpret them

Chapter Outline
Introduction to Stata
- What is Stata; basic features of Stata
- Do-file: opening a do-file; running commands from the do-file
- Log file: opening a log file
Data Management
- Importing data into Stata
- Labelling variables
- Labelling values
- Getting to know your data: browse, edit, list, codebook, summarizing
- Generating and transforming a variable: natural logarithm, sum/difference
- Dealing with missing values
- Some of the Stata logical and relational operators
Brief Overview of Main SPSS Features

Introduction to Stata
What is Stata?
- The word Stata is a combination of the words statistics and data.
- Stata is a complete, integrated and powerful data analysis software package with capabilities for:
  - Data management and manipulation
  - Statistical analysis
  - Data visualization

Basic features of Stata
- Tool bar: contains buttons that provide quick access to Stata's more commonly used features.
- Working directory: shows the folder from which Stata loads files and to which it saves them. Use the command cd to change the working directory (discussed in detail later).
- Variables window: once data are loaded, the variables in the dataset are listed with their labels, in the order in which they appear in the dataset.
- Properties window: displays variable and dataset properties.
- Command window: you can enter commands directly in the Command window.
  Note: Stata is case-sensitive. Commands must be typed in lowercase, and variable names must be typed exactly as they appear in the dataset.
- Results window: contains all the results from performing analyses, e.g. syntax, tables, charts, etc.
  Note: double-clicking on a variable name will cause it to appear in the Command window.
- Review window: lists previously issued commands.
  - Successful commands appear in black.
  - Unsuccessful commands appear in red.
  - Double-click a command to run it again.
- Menu bar: shows the Stata menus used for data management, visualization and analysis. The following is a description of the Stata menus:
  File: opens and saves Stata data files; opens and closes log files; saves or prints graphs; imports and exports files; exits Stata.
  Edit: allows you to copy output to a word processor or other application.
  Data: opens the data editor and data browser; summarizes data; labels datasets and variables; replaces and generates data.
  Graphics: contains all of Stata's graphing tools.
  Statistics: provides data summaries and all statistical tests.
  User: a place to store any user-generated commands.
  Window: controls the windows opened in Stata.
  Help: a good resource if you have questions about how to use Stata.

Do-file
- Stata do-files are text files where users can store and run their commands for reuse, rather than retyping the commands in the Command window. This gives:
  - Reproducibility
  - Easier debugging and changing of commands
- We recommend always using a do-file when using Stata.
- The file extension .do is used for do-files.

Opening a do-file
- Click the pencil icon on the toolbar, or
- Click File > Do > select the appropriate folder to save the do-file > Open, or
- Type the command doedit and press Enter.
You will see the do-file editor open.
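As a rough illustration of what a do-file might contain, the sketch below uses auto, an example dataset that ships with Stata, so that it runs without any external file. The dataset choice and the file name example.do are assumptions made for illustration and are not part of the course data.

```stata
* example.do: a minimal do-file sketch (the file name is hypothetical)
* Lines starting with * are comments and are not executed.

clear all                  // start with nothing in memory
pwd                        // display the current working directory

sysuse auto, clear         // assumption: Stata's bundled example dataset
describe                   // list the variables and their labels
summarize price mpg        // summary statistics for two variables
```

Saving these lines in a .do file and running them from the do-file editor repeats the whole sequence exactly, which is the reproducibility benefit mentioned above.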
- Comments, which are not executed, are usually preceded by * or // and are colored green.
- The do-file editor colors Stata commands blue.
- Words in quotes (file names, string values) are colored red.

Running commands from the do-file
- Highlight the command and click the rightmost icon (Execute) on the do-file editor toolbar, or
- Highlight the command and press:
  - Ctrl+D on Windows
  - Shift+Cmd+D on Mac
Notes:
- Multiple commands can be selected and executed together.
- Stata normally assumes that a new line signifies the end of a command.
- You can extend a command over multiple lines by placing /// at the end of each line except the last. Make sure to put a space before ///.

Log file
- A log file keeps a copy of everything that is sent to the Results window, with the exception of graphs.
- Click File > Log > Begin, select the appropriate folder to save the log file, then click Save.
- Although it does not save them in the same way a do-file does, a log file also retains your commands.
- A log file starts when you begin a log and ends when you close the log.

Importing data into Stata
- Before new data can be loaded, memory must be cleared. Any dataset in memory must be removed, because Stata holds only one dataset in memory at a time.
Syntax: clear or clear all
- Change or adjust the directory: we have to tell Stata which folder we are working in.
Syntax: cd "paste folder location"

Import Excel data
- After you set the directory to where you stored the data, run the following command:
Syntax: import excel "Excel file name", sheet("sheet name") firstrow clear
- The firstrow option treats the first row as variable names; the clear option replaces any data currently in memory.

Save data in Stata format
Syntax: save "data file name", replace
- The replace option will overwrite an existing file with the same name.
- Data files stored in Stata's format are known as .dta files.

Use the saved data in Stata format
Syntax: use "data file name", clear
- The clear option removes any open dataset from memory.
Notes:
- Double-clicking on a .dta file in Windows will open the data in a new instance of Stata (not in the current instance). Hence, be careful about having many instances of Stata open.

Using the menu to import Excel data
- To import data using the Stata menu, follow these steps:
  Select File > Import > "Excel spreadsheet (*.xls, *.xlsx)" or "Text data (delimited, *.csv, ...)" > tick "Import first row as variable names" > OK

Preparing data for import
- To get data into Stata cleanly, make sure your data are:
  - Rectangular: each column (variable) should have the same number of rows (observations)
  - Free of graphs, sums, or averages in the file

Getting to know your data
A. Browse or edit the dataset
- Once the data are loaded, we can view the dataset as a spreadsheet using the command browse or edit.
Syntax: browse or edit
Alternatively:
- Click on the magnifying glass with spreadsheet icon for browse
- Click on the pencil with spreadsheet icon for edit
Note: browse only lets you see the data, while edit allows you to modify the data.
Using either technique you will see the data, where:
- Black columns are numeric
- Red columns are strings, and
- Blue columns are numeric with string labels

- The list command prints observations to the Stata console. Simply issuing list will list all observations and variables.
  - Not usually recommended except for small datasets
- Specify variable names to list only those variables.
Example: list Yield age fragment

Labelling variables
- label lets you give the variable a longer, more complete description.
- The variable label will sometimes be used in output and often in graphs.
Syntax: label variable varname "label of the variable"
Examples:
label variable Yield "Maize output per total land size cultivated"
label variable age "Age of respondents"
label variable fragment "Number of plots"

Describe
- Short variable names make coding more efficient but can obscure the variable's meaning. Hence, use the describe (or des) command to see the full meaning of the variable.
Notes:
- Simply issuing describe (or des) will describe all variables.
- You can specify variable names to describe only those variables.
Example: describe Yield age

C. Labelling values
- Many statistical software packages, including Stata, accept only numeric variables in analysis.
- Value labelling is used to represent nominal or ordinal variables by numeric values, each carrying a descriptive label.
- To create a new set of value labels, use label define.
Syntax: label define labelname value1 "label1" value2 "label2" ...
Examples:
label define Gender 0 "Female" 1 "Male"
label define EducLevel 0 "Illiterate" 1 "Primary" 2 "Secondary" 3 "Post-Secondary"
label define Credit_Dummy 0 "No" 1 "Yes"
A defined set of value labels takes effect only once it is attached to a variable with label values varname labelname.

Codebook
- Used to inspect a variable in more detail, including:
  - Variable label
  - Value label
  - Type of the variable, i.e. string or numeric
  - Missing values
  - Range, std. dev., percentiles
- Specify variable names to learn more about those variables.
Example: codebook Yield Gender

E. Summarizing
- The summarize (or sum) command calculates summary statistics such as the mean, standard deviation, minimum and maximum.

F. Encode
- Helps to change a string variable to a numeric one.
Note: when a variable is encoded, Stata assigns the numeric codes and value labels automatically, so they may differ from the coding you intended.
- So, you have to adjust or change the value labels to what is appropriate for you.
- Now you can see summary statistics for Gender, EducLevel and Credit.

G. Keep and drop
- keep retains the variables you need and removes all other variables.
Syntax: keep var1 var2 ...
Example: keep Yield age
  Stata removes all variables except Yield and age.
- drop removes the variables you do not need and keeps all other variables.
Syntax: drop var1 var2 ...
Example: drop Yield age
  Stata removes the variables Yield and age and keeps all other variables.

H. Generating and transforming a variable
- Variables often do not arrive in the form that we need.
- Use generate (often abbreviated gen or g) to create variables, usually from operations on existing variables, such as sums, differences, products, squares and natural logarithms.
- To generate the natural logarithm of Yield:
  gen lnYield = ln(Yield)
- To generate a sum of two variables:
  generate LandFrag = land + fragment
- To generate a difference of two variables:
  generate LandFragement = land - fragment
- To generate the square of a variable:
  gen agesqu = age*age
  gen agesqu2 = age^2
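The labelling, inspection, encode, generate and keep steps above can be tried end to end on Stata's bundled auto dataset. This is only a sketch: the dataset stands in for the course data, and every new variable or label name in it (expensive, yesno, make_code, lnprice, size_sum, size_dif, mpg_sq) is hypothetical and chosen purely for illustration.

```stata
* Sketch only: uses Stata's bundled auto data; every new name below is hypothetical
sysuse auto, clear

* Variable label: a longer description of an existing variable
label variable mpg "Fuel economy (miles per gallon)"

* Value labels: define a label set, then attach it to a numeric variable
generate byte expensive = price > 6000 if !missing(price)
label define yesno 0 "No" 1 "Yes"
label values expensive yesno

* Inspect the result
describe mpg expensive
codebook expensive
summarize price mpg

* encode: numeric copy of a string variable, value labels created automatically
encode make, generate(make_code)

* generate: transformations of existing variables (purely illustrative)
generate lnprice  = ln(price)        // natural logarithm
generate size_sum = length + turn    // sum of two variables
generate size_dif = length - turn    // difference of two variables
generate mpg_sq   = mpg^2            // square

* keep only the variables needed from here on
keep make make_code price lnprice mpg mpg_sq expensive
```

Note that label define on its own only creates the label set; it is the label values line that makes the labels show up in browse, codebook and tabulations.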
Missing values
- Values of a variable can be missing for various reasons, such as non-response or refusal.
- Missing numeric values in Stata are represented by . (a period).
- You can check for missing values by testing for equality to .
Syntax: count if varname == .
Example: count if Credit == .
- If missing values are left in the data when we make any estimation or analysis, the results may be wrong.
  - So, we have to deal with the missing values before making any estimation.

Dealing with missing values
- If a missing value cannot be corrected by calling the respondent or returning the questionnaire to the respondent, we can handle the missing values manually, for example:
  - Substitute an imputed response: the respondent's answers to other questions may be related to one another, and it may be possible to calculate or infer the answer to one question from the answer to another.
    Suppose the following questionnaire:
    1. Did you get credit during the previous cropping season? Yes / No
    2. If you said yes to question 1, how much?
    Even though the respondent did not answer Q1, an answer to Q2 gives a hint about the missing value, i.e. yes.
  - Use a global constant: such as Null, unknown, not applicable, etc.
  - Substitute a neutral value: use the mean response to the variable.
  - Ignore/delete the record with missing values, i.e.
    - Casewise (listwise) deletion: cases or respondents with any missing responses are discarded from the analysis (e.g. delete case 10).
      drop if Credit_Dummy_nt == .
    - Pairwise deletion: a case is excluded only from the analyses that use a variable on which it is missing, so no record is deleted outright. A related, cruder option is to discard the variable that has missing responses (e.g. delete the variable Credit_Dummy).
      drop Credit_Dummy_nt

Stata logical and relational operators
  <    less than
  >    greater than
  >=   greater than or equal to
  <=   less than or equal to
  ==   equal to
  !=   not equal to
  &    and
  |    or
  !    not
Examples:
keep if Gender …
drop if age == 50
keep if age != 50
replace Credit = 500 if Credit …
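The missing-value checks and operators above can be tried on Stata's bundled auto dataset, in which the variable rep78 has some missing values. This is a sketch under that assumption; the dataset stands in for the course data and the name rep78_filled is hypothetical. One point worth seeing in practice is that Stata treats a numeric missing value as larger than any number, so a condition such as rep78 > 3 also matches missing observations unless they are excluded explicitly.

```stata
* Sketch only: auto is Stata's bundled example data; rep78 has some missing values
sysuse auto, clear

* Count missing values; missing() also catches the extended codes .a to .z
count if rep78 == .
count if missing(rep78)

* Numeric missing values sort above every number, so guard comparisons
count if rep78 > 3                        // also counts missing rep78
count if rep78 > 3 & !missing(rep78)      // counts only observed values above 3

* Neutral value: copy the variable and fill the gaps with the mean
summarize rep78
local m = r(mean)
generate rep78_filled = rep78             // hypothetical name
replace rep78_filled = `m' if missing(rep78_filled)

* Relational and logical operators in if-conditions
count if foreign == 1 & mpg >= 25         // and
count if price < 4000 | price > 10000     // or
count if !(mpg == 20)                     // not; same as mpg != 20

* Listwise deletion: drop observations with a missing value
drop if missing(rep78)
```

Working on a copy (rep78_filled) leaves the original variable untouched, so the imputed and raw versions can still be compared before any observation is dropped.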
Stata documentation
- For any further help, Stata has excellent documentation.
- Select Help > PDF Documentation; Stata will automatically open the PDF documentation.

Brief Overview of Main SPSS Features
- Entering and saving data in SPSS

Exercises
- Suppose you want to analyse only respondents aged 65 and older. What is the syntax (command) used to keep the data for respondents aged 65 and older?
