HowTo Post-Process the CG2 Output
The post-processing backs out the constant (a=y-xb) and imposes the
identification rules. The person and firm effects are only
identified within connected groups. The first step involves
identifying the groups present in your data. The group
information is combined with the CG2 parameter estimates and a final SAS dataset
is produced.
The Groups Program
The grouping program identifies the set(s) of persons and firms that
are connected to each other. Connectedness is most easily defined
through an example. Pick any firm in the data and identify all
the workers ever employed at the firm. Then identify all of the
firms each employee ever worked at. For the new expanded set of
firms identify all of the workers ever employed at those firms.
Repeat until no more firms or workers can be added.
Labor market data typically have 95 percent or more of the workers and
firms in the first group, with the rest spread across many small
groups.
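The procedure just described amounts to finding connected components in the graph linking workers to firms. The following is a minimal Python sketch of that idea, not the groups binary itself; the function name, the (person, firm) pair input, and the choice to number groups from largest to smallest are assumptions made for illustration.

```python
from collections import defaultdict, deque


def connected_groups(cells):
    """Assign a group number to every person and firm.

    cells: iterable of (person_id, firm_id) pairs, one per observed match.
    Returns (person_group, firm_group) dicts mapping IDs to group numbers.
    Numbering groups from 1 in decreasing order of size is an assumption.
    """
    firms_of = defaultdict(set)    # person -> firms ever worked at
    workers_of = defaultdict(set)  # firm -> workers ever employed
    for person, firm in cells:
        firms_of[person].add(firm)
        workers_of[firm].add(person)

    seen_firms = set()
    components = []
    for start in workers_of:
        if start in seen_firms:
            continue
        # Start from one firm and keep expanding until nothing new is added.
        firms, persons = {start}, set()
        seen_firms.add(start)
        queue = deque([start])
        while queue:
            firm = queue.popleft()
            for person in workers_of[firm]:
                if person in persons:
                    continue
                persons.add(person)
                for other in firms_of[person]:
                    if other not in seen_firms:
                        seen_firms.add(other)
                        firms.add(other)
                        queue.append(other)
        components.append((persons, firms))

    # Largest group first (hypothetical numbering convention).
    components.sort(key=lambda c: len(c[0]) + len(c[1]), reverse=True)
    person_group, firm_group = {}, {}
    for number, (persons, firms) in enumerate(components, start=1):
        for p in persons:
            person_group[p] = number
        for f in firms:
            firm_group[f] = number
    return person_group, firm_group


# Toy data: p1 links f1 and f2, p2 also worked at f2, p3/f3 are isolated.
cells = [("p1", "f1"), ("p1", "f2"), ("p2", "f2"), ("p3", "f3")]
person_group, firm_group = connected_groups(cells)
```

In the toy data, p1, p2, f1, and f2 end up in one group, while p3 and f3 form a second group of their own.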
The groups program is crucial since the person and firm effects are
only identified within a group (unless you are willing to make some
assumptions). This information must be available before any
identification rule can be implemented.
- Go to the 02_runcg_out/groups directory.
- Run the firmcells.sas program, creating the firmcells file.
The firmcells file is the same as cellsout, but it is sorted by
firm ID, person ID.
- Open the rungroups.ksh file with a text editor. At the
bottom of the file, make sure that the location of the groups binary is
correct. Use an explicit path, because some versions of Unix have a
system groups program. Then run rungroups.ksh.
- Examine groups.log for any errors.
- Run groupstats.sas if you are interested in the number and size of the groups (the synthetic data should have only two groups).
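As a rough illustration of the kind of summary a group tabulation gives, the sketch below counts the number of groups and the share of persons in the largest one. It is written in Python rather than SAS, and the person-to-group mapping and all names in it are hypothetical; it does not reproduce groupstats.sas.

```python
from collections import Counter


def group_sizes(person_group):
    """Count how many persons fall in each group.

    person_group: dict mapping person ID -> group number (hypothetical input).
    """
    return Counter(person_group.values())


# Toy assignment: three persons in group 1, one person in group 2.
person_group = {"p1": 1, "p2": 1, "p3": 1, "p4": 2}
sizes = group_sizes(person_group)
n_groups = len(sizes)
share_largest = max(sizes.values()) / sum(sizes.values())
```

The same tabulation can be run on a firm-to-group mapping to see how firms are distributed.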
Identification
The final stage involves calculating the constant, bringing in the
parameter estimates, imposing the identification rule, and decomposing
earnings into various components (constant, xb, experience, person,
firm, h=person + exper).
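The constant follows from a=y-xb evaluated at the sample means. A minimal Python sketch of that arithmetic is shown here; the variable names and the numbers are invented for illustration, and the actual calculation is done by the SAS programs listed below.

```python
def constant_from_means(y_mean, x_means, betas):
    """Return a = ybar - sum_k(xbar_k * b_k).

    x_means and betas are dicts keyed by covariate name (hypothetical layout).
    """
    if set(x_means) != set(betas):
        raise ValueError("means and betas must cover the same covariates")
    return y_mean - sum(x_means[k] * betas[k] for k in betas)


# Invented example values, not taken from any CG2 run.
x_means = {"exper": 10.0, "exper2": 150.0}
betas = {"exper": 0.05, "exper2": -0.001}
a = constant_from_means(2.5, x_means, betas)  # about 2.15
```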
- Go to the 03_cgpost directory.
- Open the config_param.sas file with a text editor. Set the
depvar, persid, and firmid macro variables. Set the betadir macro
variable to the location where you ran cg2. Skip down a few lines
and set the rhs macro variable appropriately.
- Run the 00_setup.sas program. The first time you use it, run it
twice, or the program will not finish properly. The
program automatically creates the cg.coef file used by other programs
and sets up links to cg.betas, cg.in, and cg.means.
- Run the 01_rhs.ksh shell script with the argument cg.
The script generates a SAS program that creates a SAS dataset
(rhs.sas7bdat) containing the coefficient estimates (betas) for the covariates from the CG2 run.
- Run the 02_means_2v3.ksh script with the argument cg. The
script generates a SAS program that creates a SAS dataset
(means02.sas7bdat) containing the means of the dependent and
right-hand-side variables.
- Run the 03_constant.sas program. Computes the constant using the property that a fitted regression passes through the sample means.
- Run the 04_fe_read.sas program. Reads in the groups, person
effects, and the firm effects and creates SAS datasets for each of them
in the same location where CG2 was run (groups.sas7bdat,
theta.sas7bdat, psi.sas7bdat).
- Run the 05_xb.sas program. Calculates the Xb and experience
index for each observation (stored in xb.sas7bdat). Depending on
the specification of experience in your model you may need to modify
this program.
- Run the 06_join_all.sas program. Brings all of the
components (groups, person, firm, xb, exper) together into one file
(hcest1.sas7bdat).
- Run the 07_identify.sas program. The first step is
identifying the model. The person effects are set to mean zero
within each group. In contrast, the firm effects are not assumed mean
zero within each group, and the extra degree of freedom is used to
estimate an additional firm effect. The firm effects are set to
mean zero for the entire sample only. Everything is then almost ready,
except that there are usually some groups where we cannot
separately identify the person and firm effects (groups with only one
person and one firm). For these groups I randomly draw a person and firm effect
from a distribution similar to the overall distributions. Finally, our
measure of human capital (h) and the residuals are calculated.
- YOU ARE FINISHED!!!
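The normalization applied in the 07_identify.sas step can be illustrated with a small Python sketch. This is one way to impose mean-zero person effects within each group and mean-zero firm effects over the whole sample while leaving every fitted value unchanged; it is not a transcription of the SAS program. The function and variable names are hypothetical, the means are taken unweighted over persons and firms (an assumption, since the original may weight by observations), and the random draw for groups with one person and one firm is not covered.

```python
from collections import defaultdict


def normalize_effects(theta, psi, person_group, firm_group, constant):
    """Re-center person effects (theta) and firm effects (psi).

    theta: person ID -> effect, psi: firm ID -> effect.
    person_group / firm_group: ID -> group number.
    Unweighted means over persons and firms are an assumption.
    """
    # Mean person effect within each group.
    totals, counts = defaultdict(float), defaultdict(int)
    for person, value in theta.items():
        g = person_group[person]
        totals[g] += value
        counts[g] += 1
    group_mean = {g: totals[g] / counts[g] for g in totals}

    # Person effects: mean zero within each group.
    theta_n = {p: v - group_mean[person_group[p]] for p, v in theta.items()}
    # Move each group's mean onto its firms so theta + psi is unchanged.
    psi_shifted = {f: v + group_mean[firm_group[f]] for f, v in psi.items()}

    # Firm effects: mean zero over the entire sample, offset into the constant.
    overall = sum(psi_shifted.values()) / len(psi_shifted)
    psi_n = {f: v - overall for f, v in psi_shifted.items()}
    return theta_n, psi_n, constant + overall


# Invented toy values: group 1 has two persons and two firms, group 2 one of each.
theta = {"p1": 1.0, "p2": 3.0, "p3": 5.0}
psi = {"f1": 0.5, "f2": -0.5, "f3": 2.0}
person_group = {"p1": 1, "p2": 1, "p3": 2}
firm_group = {"f1": 1, "f2": 1, "f3": 2}
theta_n, psi_n, constant_n = normalize_effects(
    theta, psi, person_group, firm_group, constant=10.0
)
```

Because each shift is offset elsewhere, the sum of constant, person effect, and firm effect for any person and firm in the same group is the same before and after the normalization.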