Title: Working with Array in Stata
- Vivien W. Chen
- Izumi Mori
Do you
- Write repetitive programs for a set of variables or data files?
- Use longitudinal data that require the same data management steps for multiple years?
- Want to reduce the chance of coding errors due to repeated programming?
The Purpose of This Workshop
- Introduce the Stata counterparts of the SAS array: forvalues and foreach
- Cover the use of macros: local and global
- Introduce commands for temporary files and variables: tempfile and tempvar
- Reshape the data structure for longitudinal data
Example Data
Variable name   Variable label                                    Value label
country         Country code                                      ISO 3-digit
cnt             Country code                                      3-character
female          female                                            1 = female, 0 = male
fborn           Foreign-born student                              1 = foreign-born, 0 = native-born
Hisei (occ)     Highest parental occupational status (SEI)        continuous
Misced (med)    Educational level of mother (ISCED)               0 = lowest to 6 = highest
cultposs        Cultural possessions at home PISA 2006 (WLE)      continuous
pv1math         Plausible value in math                           continuous
(Source data: PISA 2000, 2003, 2006)
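The Use of forvalues and foreach
- forvalues is a loop that executes commands once for each value of a numeric local macro. The values are given as a range before the opening brace {, and the commands to repeat go between the braces { and }. A range may be 1/5, meaning 1 to 5, or 0(5)100, meaning 0 to 100 in steps of 5. forvalues accepts only an evenly spaced range, not an arbitrary list of numbers; for an arbitrary list, use foreach with numlist.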
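A minimal sketch of the two range forms, using only base Stata. The macro name i and the shorter range 0(25)100 are chosen here purely for illustration; inside the braces the loop counter is referenced as `i'.

```stata
* 1/5 expands to 1 2 3 4 5
forvalues i = 1/5 {
    display "pass `i'"
}

* 0(25)100 expands to 0 25 50 75 100 (start, step in parentheses, end)
forvalues i = 0(25)100 {
    display "cutoff `i'"
}
```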
- Example 1: replace missing values for XXX in each year file
- One solution is
- use file2000, clear
- replace XXX = . if XXX < 0
- use file2003, clear
- replace XXX = . if XXX < 0
- use file2006, clear
- replace XXX = . if XXX < 0
- Alternatively
- forvalues n = 2000(3)2006 {
-     use file`n', clear
-     replace XXX = . if XXX < 0
- }
- foreach is a more general loop. It can work through a list of variables, numbers, or names, and it can also take a list of new variables to create.
- Example 2 (the same task as Example 1)
- foreach n of numlist 2000 2003 2006 {
-     use file`n', clear
-     replace XXX = . if XXX < 0
- }
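The list types that foreach accepts can be sketched as follows. The four-row dataset is invented here so the block runs on its own, and the names flag1, flag2, math, read, and scie are hypothetical placeholders, not variables from the workshop data.

```stata
* Toy data, invented for illustration only
clear
set obs 4
generate female = mod(_n, 2)
generate fborn  = (_n > 2)

* of varlist: loop over existing variables
foreach v of varlist female fborn {
    summarize `v'
}

* of newlist: loop over names of variables that do not exist yet
foreach v of newlist flag1 flag2 {
    generate `v' = 0
}

* in: loop over any list of words, taken literally
foreach w in math read scie {
    display "subject: `w'"
}
```

With of varlist and of newlist, Stata checks the names before the loop starts, so a misspelled variable name stops the loop before any command has run.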
- Example 3: In each year file, (1) create dummy variables for mother's education, (2) create interaction terms of gender and foreign-born status with mother's education and parental occupation, (3) create a year ID, and (4) append the files together.
- foreach n of numlist 2000 2003 2006 {
-     use file`n', clear
-     tab med, gen(med)                      // (1)
-     foreach x of varlist female fborn {    // (2)
-         gen `x'_occ = `x'*occ
-         foreach n2 of numlist 1/7 {        // tab, gen() numbers the dummies from 1: med1-med7 for ISCED 0-6
-             gen `x'_med`n2' = `x'*med`n2'
-         }
-     }
-     gen yr = `n'                           // (3)
- }
- use file2000, clear                        // (4)
- append using file2003 file2006
Are these the alternatives?
- foreach n of numlist 2000 2003 2006 {
-     use file`n', clear
-     tab med, gen(med)
-     foreach x of varlist female fborn {
-         gen `x'_occ = `x'*occ
-         gen `x'_med = `x'*med              // (?)
-     }
-     gen yr = `n'
-     append using file`n'                   // (?)
- }
First (?): If you want to interact with the continuous variable of mother's education, you can do so.
Second (?): This won't work, because it is inside the loop over years. Each pass reloads file`n' with clear, so the append only stacks that year's file onto itself.
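One way to append inside the loop is to accumulate the years in a temporary file. The following is a sketch only, not the presenters' solution: it assumes file2000, file2003, and file2006 exist in the working directory, and the names stacked and first are hypothetical. Because tempfile and local names live only for the duration of a do-file, the block has to be run in one piece.

```stata
tempfile stacked          // temporary file name, erased when the do-file ends
local first = 1           // flag: nothing has been saved to `stacked' yet

foreach n of numlist 2000 2003 2006 {
    use file`n', clear
    gen yr = `n'
    if `first' == 0 {
        append using `stacked'    // add the years processed so far
    }
    save `stacked', replace
    local first = 0
}

use `stacked', clear      // all three years, each with its yr value
```

Since each year is appended from memory after it has been modified, any variables created inside the loop are carried into the stacked data without overwriting the original year files.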
Make Use of Macros: local and global
- A local macro is defined within a do-file and ceases to exist when that do-file terminates.
- A global macro persists until the user drops it or exits Stata.
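The difference in how the two kinds of macro are referenced can be sketched as follows. The macro name controls is a hypothetical choice for illustration; the variable names are those of the example data.

```stata
* Local macro: referenced with a left quote ` and a right quote '
local controls female fborn
display "`controls'"

* Global macro: referenced with a leading $
global controls female fborn
display "$controls"

* Globals stay defined until dropped or until Stata is closed
macro drop controls
```

A local and a global may share a name without conflict, because the quoting style tells Stata which one is meant.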
- Example 4: regression analysis
- local x1 female fborn
- local x2 occ med1-med6 cultposs
- /* global x1 female fborn
-    global x2 occ med1-med6 cultposs */
- reg pv1math `x1'
- reg pv1math `x1' `x2'
- /* reg pv1math $x1
-    reg pv1math $x1 $x2 */
- Can I just use by instead of foreach to run the analysis for each country?
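The two approaches can be compared in a short sketch. It assumes the example data are in memory and that country is stored as a numeric code; the macro names codes and c are hypothetical.

```stata
* Option 1: the by prefix, one regression per country
bysort country: regress pv1math female fborn

* Option 2: an explicit loop over the country codes found in the data
levelsof country, local(codes)
foreach c of local codes {
    display "country `c'"
    regress pv1math female fborn if country == `c'
}
```

The by prefix is the shorter choice when a single command is repeated for each group. The loop becomes useful when several commands must run for each country, or when results need to be collected along the way.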