*****************************************************************************
* CHAPTER 17 - STATISTICS FOR BUSINESS & ECONOMICS, 4th Ed., by Paul Newbold*
*****************************************************************************
*
* Index Numbers on page 684.
*
* The SAMPLE command sets the sample range to observations 1 to 10 for the
* data in Tables 17.4 and 17.6.
*
SAMPLE 1 10
*
* Table 17.4 stores the Prices per Bushel of Wheat, Corn and Soybeans on
* page 681.
*
READ YEAR PWHEAT PCORN PSOYBEAN PAVERAGE PINDEX / LIST
1 1.33 1.33 2.85 1.837 100.0
2 1.34 1.08 3.03 1.817 98.9
3 1.76 1.57 4.37 2.567 139.7
4 3.95 2.55 5.68 4.060 221.0
5 4.09 3.03 6.64 4.587 249.7
6 3.56 2.54 4.92 3.673 199.9
7 2.73 2.15 6.81 3.897 212.1
8 2.33 2.02 6.42 3.590 195.4
9 2.97 2.25 6.12 3.780 205.8
10 3.78 2.52 6.28 4.193 228.3
*
* Table 17.6 stores the Production in millions of Bushels of Wheat, Corn
* and Soybeans on page 684.
*
READ YEAR WHEAT CORN SOYBEAN / LIST
1 1352 4152 1127
2 1618 5641 1176
3 1545 5573 1271
4 1705 5647 1547
5 2122 5829 1547
6 2142 6266 1288
7 2026 6357 1716
8 1799 7082 1843
9 2134 7939 2268
10 2370 6648 1817
*
* The INDEX command computes price indexes from a set of price and
* quantity data on a number of commodities. SHAZAM automatically calculates
* the Divisia, Paasche, Laspeyres and Fisher Price and Quantity Indexes
* when the INDEX command is specified. The BASE= option specifies the
* observation number to be used as the base period for the index.
*
* The format of the command is:
*
* INDEX p1 q1 p2 q2 p3 q3 ... / options
*
* Table 17.5 - Laspeyres Price Index for Wheat, Corn and Soybean on page 682
* is replicated with the INDEX command. The LASPEYRES= option stores the
* Laspeyres Price Index in the vector specified.
*
*
INDEX PWHEAT WHEAT PCORN CORN PSOYBEAN SOYBEAN / BASE=1 LASPEYRES=PLS
GENR PINDEX=PLS*100
PRINT PINDEX
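*
* As a cross-check, the Laspeyres Price Index can also be built directly
* from its definition: the cost of the base period quantities at current
* prices, relative to their cost at base period prices. This is a sketch
* only; BASECOST and PCHECK are names introduced here for illustration.
* PCHECK should agree with PINDEX if LASPEYRES= returns an index equal to
* 1 in the base period, as the scaling by 100 above assumes.
*
GEN1 BASECOST=PWHEAT:1*WHEAT:1+PCORN:1*CORN:1+PSOYBEAN:1*SOYBEAN:1
GENR PCHECK=100*(PWHEAT*WHEAT:1+PCORN*CORN:1+PSOYBEAN*SOYBEAN:1)/BASECOST
PRINT PINDEX PCHECK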
*
* Table 17.7 - Laspeyres Quantity Index for Wheat, Corn and Soybean on page
* 684 is replicated with the INDEX command. In this case, the quantities
* are specified before the prices. The QLASPEYRES= option stores the
* Laspeyres Quantity Index in the vector specified.
*
*
INDEX WHEAT PWHEAT CORN PCORN SOYBEAN PSOYBEAN / BASE=1 QLASPEYRES=QLS
GENR QINDEX=QLS*100
PRINT QINDEX
*
* The Aggregate Laspeyres Price Index for Wheat, Corn and Soybean is
* computed with year 6 as the base period.
*
INDEX PWHEAT WHEAT PCORN CORN PSOYBEAN SOYBEAN / BASE=6
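*
* The other indexes can be saved in the same way. The sketch below assumes
* that the PAASCHE= and FISHER= options store their indexes just as
* LASPEYRES= does, and that & continues a command onto the next line.
* PL, PP, PF and PFCHECK are names introduced here for illustration. The
* Fisher index is the geometric mean of the Laspeyres and Paasche indexes,
* so PFCHECK should agree with PF.
*
INDEX PWHEAT WHEAT PCORN CORN PSOYBEAN SOYBEAN / BASE=1 LASPEYRES=PL &
      PAASCHE=PP FISHER=PF
GENR PFCHECK=SQRT(PL*PP)
PRINT PL PP PF PFCHECK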
*
*----------------------------------------------------------------------------
* Change in Base Period
*
* First read the data for the 1971 and 1976 based indexes.
*
SAMPLE 1 10
READ YEAR P71 P76 / LIST
1971 100.0 0.0
1972 92.2 0.0
1973 131.2 0.0
1974 212.0 0.0
1975 243.0 0.0
1976 198.5 100.0
1977 0.0 94.0
1978 0.0 86.7
1979 0.0 94.9
1980 0.0 107.0
SAMPLE 6 10
*
* Using the GENR command copy the last 5 years of variable P76 into the
* SPLICE index.
*
GENR SPLICE=P76
SAMPLE 1 5
*
* Compute the first 5 years of P71 using the 1976 base.
*
GENR SPLICE=P71*P76:6/P71:6
SAMPLE 1 10
*
* Now use the PRINT command to print all 10 years of the spliced index,
* which is listed in the last column of Table 17.8 on page 685.
*
PRINT YEAR SPLICE
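*
* The splicing step amounts to multiplying the old index by a single
* linking factor, the ratio of the two indexes in the overlap year 1976
* (100.0/198.5). A small sketch that prints this factor; LINK is a name
* introduced here for illustration.
*
GEN1 LINK=P76:6/P71:6
PRINT LINK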
*
DELETE / ALL
*
*----------------------------------------------------------------------------
* A Nonparametric Test for Randomness on page 688.
*
SAMPLE 1 16
READ DAY VOLUME / LIST
1 98
2 93
3 82
4 103
5 113
6 111
7 104
8 103
9 114
10 107
11 111
12 109
13 109
14 108
15 128
16 92
*
* The median observation of the volume data is calculated using the MEDIAN=
* option on the STAT command. The median value is saved in a constant
* called M.
*
STAT VOLUME / MEDIAN=M
PRINT M
*
* There are 2 ways to compute the Runs Test. In the textbook, Newbold uses
* the deviations from the median. In SHAZAM, the Runs Test is calculated
* from the residuals, which here are the deviations from the mean. Using
* the deviations from the mean is the more common approach.
*
* SHAZAM automatically computes the Runs Test when the OLS command is
* specified with the RSTAT option. With no regressors listed, OLS fits
* only a constant, so the residuals are the deviations of VOLUME from its
* mean. The LIST option lists the residuals. This is a visual check for
* the number of residuals that are above and below the mean.
*
OLS VOLUME / RSTAT LIST
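*
* The runs around the median, as used in the textbook, can be counted
* directly. This is a sketch under the assumption that logical expressions
* in GENR evaluate to 1 when true and 0 when false; ABOVE, CHANGE, NCHANGE
* and RUNS are names introduced here for illustration. ABOVE flags the
* observations above the median M, CHANGE flags each switch between above
* and below, and the number of runs is the number of switches plus 1.
*
GENR ABOVE=(VOLUME.GT.M)
SAMPLE 2 16
GENR CHANGE=(ABOVE.NE.LAG(ABOVE))
STAT CHANGE / SUMS=NCHANGE
GEN1 RUNS=NCHANGE+1
PRINT RUNS
SAMPLE 1 16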
*
* Plot 17.3 on page 689 is replicated with the PLOT command.
*
PLOT VOLUME DAY
*
DELETE / ALL
*
*----------------------------------------------------------------------------
* Example 17.1, page 691
*
* The TIME command specifies the beginning year and frequency for a time
* series. Once TIME is set, the SAMPLE command can be given in dates rather
* than observation numbers.
*
TIME 1931 1
SAMPLE 1931.0 1960.0
READ YEAR SALES / LIST
1931 1806
1932 1644
1933 1814
1934 1770
1935 1518
1936 1103
1937 1266
1938 1473
1939 1423
1940 1767
1941 2161
1942 2336
1943 2602
1944 2518
1945 2637
1946 2177
1947 1920
1948 1910
1949 1984
1950 1787
1951 1689
1952 1866
1953 1896
1954 1684
1955 1633
1956 1657
1957 1569
1958 1390
1959 1387
1960 1289
STAT SALES / MEDIAN=M
PRINT M
*
* In this example, the Runs Test results from SHAZAM match those in the
* textbook. The mean and median are close enough that no residual changes
* sign when deviations are taken from the mean instead of the median.
*
OLS SALES / RSTAT LIST
*
DELETE / ALL
*
*----------------------------------------------------------------------------
* Components of a Time Series, page 692
*
SAMPLE 1 11
READ YEAR CREDIT / LIST
1 133
2 155
3 165
4 171
5 194
6 231
7 274
8 312
9 313
10 333
11 343
PLOT CREDIT YEAR
*
DELETE / ALL
*
*----------------------------------------------------------------------------
* Components of a Time Series, page 693
*
SAMPLE 1 32
*
* The BYVAR option on the READ command tells SHAZAM to read in the data
* variable by variable rather than observation by observation.
*
READ Q / BYVAR LIST
0.300 0.460 0.345 0.910
0.330 0.545 0.440 1.040
0.495 0.680 0.545 1.285
0.550 0.870 0.660 1.580
0.590 0.990 0.830 1.730
0.610 1.050 0.920 2.040
0.700 1.230 1.060 2.320
0.820 1.410 1.250 2.730
*
* The GENR command with the TIME function is used to generate a time index
* so that the first observation is equal to 1 and the rest are consecutively
* numbered.
*
GENR YEAR=TIME(0)
PLOT Q YEAR
*
DELETE / ALL
*
*-----------------------------------------------------------------------------
* Moving Averages, page 698
*
* The TIME command specifies the beginning year and frequency for a time
* series. Once TIME is set, the SAMPLE command can be given in dates rather
* than observation numbers.
*
TIME 1931.0 1
SAMPLE 1931.0 1960.0
READ YEAR SALES
1931 1806
1932 1644
1933 1814
1934 1770
1935 1518
1936 1103
1937 1266
1938 1473
1939 1423
1940 1767
1941 2161
1942 2336
1943 2602
1944 2518
1945 2637
1946 2177
1947 1920
1948 1910
1949 1984
1950 1787
1951 1689
1952 1866
1953 1896
1954 1684
1955 1633
1956 1657
1957 1569
1958 1390
1959 1387
1960 1289
*
* The TIME(0) function is used to create a time index so that the first
* observation is equal to 1 and the rest are consecutively numbered.
*
GENR T=TIME(0)
*
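* The 5-Point Centered Moving Average for the SALES variable is calculated
* using the GENR command and LAG function. LAG(x,n) returns x lagged n
* periods, and LAG(x) is the same as LAG(x,1). A negative n gives a lead,
* so LAG(x,-1) is the value one period ahead. The first two and last two
* observations lack a full set of five values, so SMA5 is a valid average
* only for 1933 to 1958.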
*
GENR SMA5=(LAG(SALES,2)+LAG(SALES)+SALES+LAG(SALES,-1)+LAG(SALES,-2))/5
PRINT T SALES SMA5
*
* The SAMPLE command is used to specify the range for the PLOT command.
* In this example, to replicate Figure 17.7 on page 698 the sample range
* is from 1933 to 1958. The years omitted are not plotted.
*
SAMPLE 1933.0 1958.0
*
* The YMIN=, YMAX=, XMIN=, and XMAX= options are specified on the PLOT
* command to specify the desired range for the X and Y axis to replicate
* Figure 17.7 on page 698.
*
PLOT SMA5 YEAR / YMIN=1100 YMAX=2700 XMIN=1933 XMAX=1958
*
DELETE / ALL
*
*-----------------------------------------------------------------------------
* Extraction of the Seasonal Component Through Moving Averages, page 699
*
SAMPLE 1 32
*
* The BYVAR option on the READ command reads the data in by variable and not
* by observation. Therefore, SHAZAM will read the data on Row 1 of the data
* file from left to right until each observation has been read for variable
* X and then continue with Row 2 etc. until all 32 observations have been
* read.
*
READ X / BYVAR
0.300 0.460 0.345 0.910
0.330 0.545 0.440 1.040
0.495 0.680 0.545 1.285
0.550 0.870 0.660 1.580
0.590 0.990 0.830 1.730
0.610 1.050 0.920 2.040
0.700 1.230 1.060 2.320
0.820 1.410 1.250 2.730
*
* The GENR command and TIME(0) function are used to create a time index so
* that the first observation is equal to 1 and the rest are consecutively
* numbered.
*
GENR T=TIME(0)
*
* The 4-Point Moving Average for the Earnings variable, X, is calculated
* using the GENR command and LAG function. It averages the two previous
* quarters, the current quarter and the next quarter, with LAG(X,-1)
* supplying the one-quarter lead.
*
GENR FPMA=(LAG(X,2)+LAG(X,1)+X+LAG(X,-1))/4
*
* The 4-Point Centered Moving Average for the Earnings variable, X, is the
* average of two adjacent 4-point moving averages: P1 covers quarters t-2
* to t+1 and P2 covers quarters t-1 to t+2. It is calculated with 3
* separate GENR statements (P1, P2 and XSTAR) to ensure there is no
* confusion.
*
GENR P1=(LAG(X,2)+LAG(X,1)+X+LAG(X,-1))/4
GENR P2=(LAG(X,1)+X+LAG(X,-1)+LAG(X,-2))/4
*
* The SAMPLE command is used to change the range of the data from 1 32 to
* 3 30 since the centered average needs two lags and two leads.
*
SAMPLE 3 30
GENR XSTAR=(P1+P2)/2
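*
* The same centered average can be written as one weighted sum with half
* weights on the two end quarters. A sketch for comparison; XSTAR2 is a
* name introduced here for illustration, and it should reproduce XSTAR.
*
GENR XSTAR2=(LAG(X,2)/2+LAG(X,1)+X+LAG(X,-1)+LAG(X,-2)/2)/4
PRINT XSTAR XSTAR2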
*
* Table 17.13 on page 700 is replicated with the PRINT command. The SAMPLE
* command is used before each PRINT command to ensure that only the desired
* data is printed.
*
SAMPLE 1 32
PRINT T X
SAMPLE 3 31
PRINT FPMA
SAMPLE 3 30
PRINT XSTAR
*
*-----------------------------------------------------------------------------
* The SAMPLE command is used to change the range to 3 30, the observations
* for which XSTAR is defined, in calculating Column 5 of Table 17.14 on
* page 702.
*
SAMPLE 3 30
GENR COL5=(X/XSTAR)*100
PRINT COL5
*
* The GENR command with the SUM and SEAS functions is used to create an
* index called CSINDEX that numbers the years (1 to 8). SEAS(4) is a dummy
* variable equal to 1 in the first quarter of each year, and SUM
* accumulates it. A repeating index called TINDEX is then created to number
* the quarter within each year (1 to 4).
*
SAMPLE 1 32
GENR CSINDEX=SUM(SEAS(4))
GENR TINDEX=TIME(0)-4*(CSINDEX-1)
PRINT CSINDEX TINDEX COL5
*
* The sample range is changed to include only observations 3 to 30 in
* calculating the median of each quarter with the STAT command. The DO
* command creates a DO-loop that executes the 3 commands that follow it once
* for each quarter, with # replaced by 1, 2, 3 and 4 in turn. The SKIPIF
* command skips all observations where TINDEX is not equal to #, so the
* STAT command uses only the observations for that quarter. The PMEDIAN
* option prints the median, mode and quartiles for variable COL5. The
* MEDIAN= option stores the median value in the constant M#. Then the
* DELETE SKIP$ command eliminates all the SKIPIF commands in effect.
* The ENDO command indicates the end of the DO-loop.
*
SAMPLE 3 30
DO #=1,4
SKIPIF(TINDEX.NE.#)
STAT COL5 / PMEDIAN MEDIAN=M#
DELETE SKIP$
ENDO
*
* The GEN1 command is used to generate the constant for the sum of the
* median values.
*
GEN1 MEDSUM=M1+M2+M3+M4
*
* The sample range is reset to 1 32 to calculate the Seasonal Index. The
* DO-loop is used to calculate the Seasonal Index of each quarter in
* Table 17.14 on page 702.
*
SAMPLE 1 32
DO #=1,4
GEN1 SINDEX#=M#*400/MEDSUM
PRINT SINDEX#
ENDO
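*
* The factor 400/MEDSUM rescales the four medians so that the seasonal
* indexes average 100. A quick check is that they sum to 400; SUMIDX is a
* name introduced here for illustration.
*
GEN1 SUMIDX=SINDEX1+SINDEX2+SINDEX3+SINDEX4
PRINT SUMIDX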
*
* The SET NOWARNSKIP command suppresses the warning message listing the
* observations that will be skipped. The Adjusted Series values are
* generated with the GENR command within a DO-loop.
*
DO #=1,4
SET NOWARNSKIP
SKIPIF(TINDEX.NE.#)
GENR AS=X*(100/SINDEX#)
DELETE SKIP$
ENDO
*
* The Adjusted Series data in Table 17.14 is printed with the PRINT command.
* Notice that this command is specified after the DO-loop has ended. If the
* PRINT command were specified within the DO-loop, the values for AS would
* be printed on every pass through the loop.
*
PRINT AS
*
* The PLOT command is used to replicate Figure 17.9 on page 703. The YMIN=
* and YMAX= options are specified so the range of the Y-axis is the same as
* the textbook. The NOPRETTY option must be included when the YMIN= or YMAX=
* option is specified. The WIDE option increases the size of the plot on the
* terminal screen. If the WIDE option is omitted the plot will be compressed
* and it does not resemble that in Figure 17.9.
*
PLOT AS T / YMIN=0.300 YMAX=1.900 NOPRETTY WIDE
*
DELETE / ALL
*
*-----------------------------------------------------------------------------
* Simple Exponential Smoothing, page 708
* This example was done by Diana Whistler.
*
SAMPLE 1 30
READ SALES / BYVAR
1806 1644 1814 1770 1518 1103 1266 1473 1423 1767
2161 2336 2602 2518 2637 2177 1920 1910 1984 1787
1689 1866 1896 1684 1633 1657 1569 1390 1387 1289
GENR YEAR=TIME(1930)
*
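* Set the smoothing constant. In the recursion below, A is the weight on
* the previous smoothed value and (1-A) is the weight on the current
* observation.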
*
GEN1 A=0.4
*
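* Generate the smoothed time series. S is first set equal to SALES so that
* the recursion starts from S(1)=SALES(1), and is then updated recursively
* from observation 2 onward.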
*
SAMPLE 1 30
GENR S=SALES
GENR PREDICT=SALES
SAMPLE 2 30
GENR S=A*LAG(S)+(1-A)*SALES
*
* Generate 1-step ahead predictions.
*
GENR PREDICT=LAG(S)
SAMPLE 1 30
*
* Generate forecast errors.
*
GENR E=SALES-PREDICT
*
* Calculate model diagnostics.
*
GENR E2=E*E
STAT E2 / SUMS=SSE
PRINT SSE
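*
* The sum of squared errors can be turned into a root mean squared error
* for easier comparison with the scale of SALES. A sketch only; RMSE is a
* name introduced here for illustration. The divisor is 29 because the
* first observation has no forecast and its error is zero by construction.
*
GEN1 RMSE=SQRT(SSE/29)
PRINT RMSE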
*
* Print the results (see Newbold, Table 17.16, page 710)
*
FORMAT(2F14.0,2F14.1,F14.2)
PRINT YEAR SALES S PREDICT E / FORMAT
*
DELETE / ALL
*
*-----------------------------------------------------------------------------
* Holt-Winters Exponential Smoothing Forecasting Model, page 712.
* This example was done by Diana Whistler.
*
SAMPLE 1 11
READ X / BYVAR
133 155 165 171 194 231 274 312 313 333 343
GEN1 A=0.7
GEN1 B=0.6
*
* DIM allocates S (the smoothed level) with 16 elements, leaving room for
* 5 forecasts, and T (the trend) with 11. The recursion is started at
* observation 2 with the level set to X(2) and the trend to X(2)-X(1).
*
DIM S 16 T 11
SAMPLE 2 2
GENR S=X
GENR T=X-LAG(X)
*
* Do the recursive computations.
*
SET NODOECHO
DO #=3,11
SAMPLE # #
GENR S=A*X+(1-A)*(LAG(S)+LAG(T))
GENR T=B*(S-LAG(S))+(1-B)*LAG(T)
ENDO
*
* Print the results
*
SAMPLE 1 11
PRINT X S T
*
* Forecasting
*
SAMPLE 12 16
GENR OBS=TIME(0)-11
GENR S=S:11+OBS*T:11
PRINT OBS S
*
DELETE / ALL
*
*-----------------------------------------------------------------------------
* Autoregressive Models on page 725.
*
SAMPLE 1 30
READ YEAR X / LIST
1931 1806
1932 1644
1933 1814
1934 1770
1935 1518
1936 1103
1937 1266
1938 1473
1939 1423
1940 1767
1941 2161
1942 2336
1943 2602
1944 2518
1945 2637
1946 2177
1947 1920
1948 1910
1949 1984
1950 1787
1951 1689
1952 1866
1953 1896
1954 1684
1955 1633
1956 1657
1957 1569
1958 1390
1959 1387
1960 1289
*
* The GENR command is used with the LAG function to generate the sales
* series X lagged one time period (X1), two time periods (X2), three time
* periods (X3) and four time periods (X4).
*
GENR X1=LAG(X)
GENR X2=LAG(X,2)
GENR X3=LAG(X,3)
GENR X4=LAG(X,4)
*
* The sample range for the first-order model must be changed to 2 30 since
* the first observation is lost when the sales variable X is lagged one time
* period. The first-order model is estimated with the OLS command.
*
SAMPLE 2 30
OLS X X1
*
* The second-order model is estimated with the OLS command but the sample
* range is changed accordingly as the first two observations are lost in the
* lagging process. The COEF= option is used to save the regression
* estimates in the vector called COEF. These values will be used to
* forecast X at t=31.
*
SAMPLE 3 30
OLS X X1 X2 / COEF=COEF
*
* The third-order model is estimated and the sample range is changed
* accordingly as the first three observations are lost in the lagging process
* of the SALES variable.
*
SAMPLE 4 30
OLS X X1 X2 X3
*
* The fourth-order model is estimated and the sample range is changed
* accordingly as the first four observations are lost in the lagging process
* of the SALES variable.
*
SAMPLE 5 30
OLS X X1 X2 X3 X4
*
* The regression coefficients for the second-order model were saved in the
* vector COEF. The CONSTANT is saved in COEF:3, the regression estimate for
* X lagged one time period is saved in COEF:1 and the regression estimate for
* X lagged two time periods is saved in COEF:2. The sales figures for X29
* and X30 are stored in the vector X in X:29 and X:30. The GEN1 command is
* used to forecast the value of X when t=31.
*
GEN1 X31=COEF:3+(COEF:1*X:30)+(COEF:2*X:29)
PRINT X31 COEF:3 COEF:1 X:30
PRINT COEF:2 X:29
*
* Similarly, X can be forecast for t=32 and t=33 using the GEN1 command,
* with the forecasts X31 and X32 standing in for the unobserved values.
*
GEN1 X32=COEF:3+(COEF:1*X31)+(COEF:2*X:30)
GEN1 X33=COEF:3+(COEF:1*X32)+(COEF:2*X31)
PRINT X32 X33
*
DELETE / ALL
*
*-----------------------------------------------------------------------------
*
STOP