*****************************************************************************
* CHAPTER 13 - STATISTICS FOR BUSINESS & ECONOMICS, 5th Edition             *
*****************************************************************************
* Example 13.1, p. 533
*
* Each student in a random sample is identified by TASTER, the rating that
* student gave the original product is ORIG, and the rating given to the new
* product is NEW.  The FORMAT command is used to read character data in
* SHAZAM.  The format of the FORMAT command is:
*
*    FORMAT(list)
*
* where:  list  = a list of edit descriptors
*
*         nX    = advances the column position by n spaces
*         nFw.d = the field is w characters wide and contains a number
*                 such that d digits occur after the decimal point.  The
*                 field is repeated n times.
*         Aw    = the field is w characters wide and contains a SHAZAM
*                 character variable.  The maximum limit is A8.
*
* The FORMAT option must be specified on the READ command to ensure SHAZAM
* reads in the data according to the previously stated FORMAT command.
*
SAMPLE 1 8
FORMAT(A1,2X,F1.0,2X,F1.0)
READ TASTER ORIG NEW / FORMAT
A  6  8
B  4  9
C  5  4
D  8  7
E  3  9
F  6  9
G  7  7
H  5  9
GENR DIFF=ORIG-NEW
*
* Replicate Table 13.1, p. 534
*
* Some systems do not permit printing in column 1, so two different FORMAT
* commands are needed, one for the READ command and one for the PRINT
* command.
*
FORMAT(1X,A2,1X,F3.1,1X,F3.1,1X,F4.1)
PRINT TASTER ORIG NEW DIFF / FORMAT
*
* The DISTRIB command provides functions of probability distributions.  The
* format is:
*
*    DISTRIB vars / options
*
* where:  vars    = a list of variables
*         options = a list of options that are required for the
*                   specified type of distribution
*
* In this example, the distribution is Binomial with probability parameter
* 0.50 (P=0.50) and a sample size of 7 (N=7).  POS flags the tasters with
* a positive difference and X is the number of such tasters.
*
GENR POS=(DIFF.GT.0)
STAT POS / SUMS=X
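*
* The sample size of 7 used with the Binomial distribution is the number of
* nonzero differences.  A minimal sketch of that count follows; TIE, NTIE
* and NEFF are illustrative names that are not part of the original example,
* and 8 is the sample size set on the SAMPLE command above.
*
GENR TIE=(DIFF.EQ.0)
STAT TIE / SUMS=NTIE
GEN1 NEFF=8-NTIE
PRINT NEFF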
*
* The GEN1 and DISTRIB commands are used to print out probabilities in lieu
* of referring to a statistical table.  The GEN1 command is used to generate
* a constant, X.  The format of the DISTRIB command is:
*
*  DISTRIB vars / options
*
*  where:  vars      = list of variables
*          options   = list of desired options
*          TYPE=     - specifies the type of distribution
*
* Case 1 - (X .LE. 2)
*
*  Note:  .LE. = Less Than or Equal To
*
DO #=0,2
GEN1 X#=#
DISTRIB X# / TYPE=BINOMIAL P=0.50 N=7
*
* After each DISTRIB command, the PDF is stored in the temporary variable,
* $PDF.  The GEN1 command is used to save it as a scalar in the PDF# variable.
*
GEN1 PDF#=$PDF
PRINT PDF#
ENDO
*
* The GEN1 command is used to sum the PDF values, giving the probability
* that X is less than or equal to 2.
*
GEN1 PDF02=PDF0+PDF1+PDF2
PRINT PDF02
*
* Case 2 - (X .LE. 2) + (X .GE. 5)
*
*  Note:  .GE. = Greater Than or Equal To
*
DO ?=5,7
GEN1 X?=?
DISTRIB X? / TYPE=BINOMIAL P=0.50 N=7
GEN1 PDF?=$PDF
PRINT PDF?
ENDO
GEN1 PVALUE=PDF02+PDF5+PDF6+PDF7
PRINT PVALUE
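*
* As a hand check (a sketch; PCHECK is an illustrative name that is not part
* of the original example), the same two-sided p-value follows directly from
* the binomial coefficients for N=7: C(7,0)=1, C(7,1)=7 and C(7,2)=21, each
* divided by 2**7=128, with the upper tail equal to the lower tail by
* symmetry.  PCHECK should agree with PVALUE (58/128, about 0.453).
*
GEN1 PCHECK=(2*(1+7+21))/(2**7)
PRINT PCHECK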
*
DELETE / ALL
*----------------------------------------------------------------------------
* Example 13.2, p. 535
*
* The size of a random sample of children is N, the number of these children
* that preferred peanut butter ripple ice cream is PB, the number that
* preferred bubblegum surprise is BS, and the number that expressed no
* preference is NP.
*
GEN1 N=100
GEN1 PB=56
GEN1 BS=40
GEN1 NP=4
*
* The Null Hypothesis is that there is no overall preference in this
* population for one flavor over the other.  Before the analysis can be
* performed, the number of children that expressed no preference must be
* subtracted from the original sample size to yield the number of children
* that stated a preference for one of the two ice cream flavors.
*
GEN1 N=N-NP
*
* Under the Null Hypothesis, the number of children preferring bubblegum
* surprise has mean MEAN and standard deviation SIGMA.  SSTAR is the
* observed number, BS, with a continuity correction of 0.5 applied.
*
GEN1 MEAN=N*0.50
GEN1 SIGMA=0.50*SQRT(N)
GEN1 SSTAR=BS+0.5
PRINT MEAN SIGMA SSTAR
*
* The test statistic is then calculated using the GEN1 command.
*
GEN1 Z=(SSTAR-MEAN)/SIGMA
PRINT Z
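*
* A minimal sketch of the two-sided p-value implied by Z under the normal
* approximation, assuming the NCDF function (standard normal cumulative
* distribution) is available to GEN1.  PNORM is an illustrative name that
* is not part of the original example.  With Z near -1.53 the result
* should be about 0.126.
*
GEN1 PNORM=2*NCDF(-ABS(Z))
PRINT PNORM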
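*
* The DISTRIB command is then used to look up the normal and binomial
* probabilities.  PVALUE is twice the $CDF value saved by the last DISTRIB
* command.
*
GEN1 SIGMA2=(0.5*SQRT(96))**2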
GEN1 X=48
DISTRIB X / TYPE=NORMAL MEAN=40.5 VAR=SIGMA2
GEN1 X=40
DISTRIB X / TYPE=BINOMIAL P=0.50 N=96
GEN1 PVALUE=2*$CDF
PRINT PVALUE
*
*****************************************************************************
*                                                                           *
* Note:  In this chapter, the sign of the test statistic in SHAZAM may be   *
*        the opposite of that listed in the textbook.  The opposite sign    *
*        occurs when the sample mean is smaller than the hypothesized mean. *
*                                                                           *
*****************************************************************************
*
DELETE / ALL
*----------------------------------------------------------------------------
* Example 13.3, p. 536
*
SAMPLE 1 23
READ(INCOME.DIF) / DIF LIST
*
* The Null Hypothesis is that the median starting income is equal to $35,000
* and the Alternative Hypothesis is that it is not equal to $35,000.
*
GEN1 N=22
GEN1 S=17
*
* Under the Null Hypothesis, the sign-test count S has mean MEAN and
* standard deviation SIGMA.  SSTAR is S with a continuity correction of
* 0.50 applied.
*
GEN1 MEAN=0.5*N
GEN1 SIGMA=0.5*SQRT(N)
GEN1 SSTAR=S-0.50
*
* The test statistic is then calculated using the GEN1 command.
*
GEN1 Z=(SSTAR-MEAN)/SIGMA
PRINT Z
*
GEN1 SIGMA2=(0.50*SQRT(22))**2
GEN1 X=11
DISTRIB X / TYPE=NORMAL MEAN=16.5 VAR=SIGMA2
GEN1 XP=5
DISTRIB XP / TYPE=BINOMIAL P=0.50 N=22
GEN1 PVALUE=2*$CDF
PRINT PVALUE
*
DELETE / ALL
*----------------------------------------------------------------------------
* Example 13.4, p. 539
*
* Read this example carefully.  Be sure you understand the methodology.
*
*----------------------------------------------------------------------------
* Example 13.5, p. 542
*
* The number of matched pairs of firms, thirty-one, is N.  The smaller of
* the rank sums, 189, which was for those pairs where the ratio was higher
* for the firm without sophisticated postaudit procedures, is T.
*
GEN1 N=31
GEN1 T=189
*
* The Wilcoxon test is of the Null Hypothesis that the distribution of
* differences in ratios is centered on 0 against the Alternative Hypothesis
* that the ratio of market valuation to replacement cost of assets tends to
* be lower for firms without sophisticated postaudit procedures.
*
* First the Mean of T is calculated, then the Variance of T and the Standard
* Deviation of T.
*
GEN1 MEANT=(N*(N+1))/4
GEN1 VART=(N*(N+1)*(2*N+1))/24
GEN1 SIGMAT=SQRT(VART)
*
* The Wilcoxon Statistic is calculated using the previously determined Mean
* and Standard Deviation of T.
*
GEN1 WILCOXON=(T-MEANT)/SIGMAT
PRINT MEANT VART SIGMAT WILCOXON
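*
* Because the Alternative Hypothesis is one-sided and is supported by small
* values of T, the p-value is the lower-tail area of the standard normal
* distribution at WILCOXON.  A minimal sketch, assuming the NCDF function
* (standard normal cumulative distribution) is available to GEN1; PLOWER
* is an illustrative name that is not part of the original example.  With
* WILCOXON near -1.16 the result should be about 0.124.
*
GEN1 PLOWER=NCDF(WILCOXON)
PRINT PLOWER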
*
* The Wilcoxon Statistic is also easily calculated in SHAZAM using the GEN1
* and DISTRIB commands.
*
GEN1 Y=248
DISTRIB Y / TYPE=NORMAL MEAN=189 VAR=VART
*
DELETE / ALL
*----------------------------------------------------------------------------
* Example 13.6, p. 545
*
SAMPLE 1 12
READ(HOURS.DIF) FINANCE ACCOUNT / DIF
*
* The number of students sampled from the introductory Finance course is N1
* and from the Accounting course is N2; each student reports the number of
* hours per week spent studying.  The Rank Sum for Finance students is R1.
*
GEN1 N1=10
GEN1 N2=12
GEN1 R1=93.5
*
* The Mann-Whitney U Statistic for the sample of Finance students is
* calculated using the GEN1 command.
*
GEN1 U=(N1*N2)+((N1*(N1+1))/2)-R1
PRINT U
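*
* As a consistency check (a sketch; RTOT, RACC, UACC and UCHECK are
* illustrative names that are not part of the original example), the two
* Mann-Whitney statistics must sum to N1*N2.  The ranks 1 to N1+N2 sum to
* (N1+N2)*(N1+N2+1)/2, which gives the Accounting rank sum RACC and, from
* it, the Accounting statistic UACC.  UCHECK should print as zero.
*
GEN1 RTOT=((N1+N2)*(N1+N2+1))/2
GEN1 RACC=RTOT-R1
GEN1 UACC=(N1*N2)+((N2*(N2+1))/2)-RACC
GEN1 UCHECK=U+UACC-(N1*N2)
PRINT RACC UACC UCHECK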
*
* The Mean and Variance of the Mann-Whitney Statistic are:
*
GEN1 MEANU=(N1*N2)/2
GEN1 VARU=((N1*N2)*(N1+N2+1))/12
PRINT MEANU VARU
*
* The Decision Rule is based on the standardized statistic Z:
*
GEN1 Z=(U-MEANU)/SQRT(VARU)
PRINT Z
*
GEN1 YU=60
DISTRIB YU / TYPE=NORMAL MEAN=81.5 VAR=VARU
GEN1 PVALUE=2*$CDF
PRINT PVALUE
*
DELETE / ALL
*----------------------------------------------------------------------------
* Example 13.7, p. 548
*
* A random sample of firms that do not give management forecasts of earnings
* has size N1 and a random sample of firms that give management forecasts
* of earnings has size N2.  The sum of the ranks for firms not disclosing
* management earnings forecasts is defined as R1.
*
GEN1 N1=80
GEN1 N2=80
GEN1 R1=7287
*
* The Null Hypothesis is that the central locations of the population
* distributions of earnings variabilities are the same for the two types of
* firms.
*
* The Mann-Whitney statistic is calculated with the GEN1 command.
*
GEN1 U=(N1*N2)+((N1*(N1+1))/2)-R1
PRINT U
*
* The Mean and Variance of the Mann-Whitney statistic are:
*
GEN1 MEANU=(N1*N2)/2
GEN1 SIGMA2U=(N1*N2*(N1+N2+1))/12
PRINT MEANU SIGMA2U
*
* The Decision Rule is based on the standardized statistic DR:
*
GEN1 DR=(U-MEANU)/SQRT(SIGMA2U)
PRINT DR
*
* The GEN1 and DISTRIB commands can be used to calculate a statistic which
* is the same as the Mann-Whitney.
*
GEN1 Y=3200
DISTRIB Y / TYPE=NORMAL MEAN=2353 VAR=SIGMA2U
*
* The Wilcoxon Rank Sum Test statistic is calculated with the GEN1 command.
*
GEN1 T=7287
GEN1 ET=(N1*(N1+N2+1))/2
GEN1 VART=((N1*N2)*(N1+N2+1))/12
GEN1 Z=(T-ET)/SQRT(VART)
PRINT ET VART Z
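*
* Because U=(N1*N2)+((N1*(N1+1))/2)-T and the two variances are identical,
* the Mann-Whitney and Wilcoxon standardized statistics differ only in sign.
* A sketch of that check (ZSUM is an illustrative name that is not part of
* the original example); ZSUM should print as zero, up to rounding.
*
GEN1 ZSUM=DR+Z
PRINT ZSUM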
*
* Once again, the GEN1 and DISTRIB commands are used to calculate the
* statistic that is the same as the Wilcoxon Rank Sum Test.
*
GEN1 W=6440
DISTRIB W / TYPE=NORMAL MEAN=7287 VAR=VART
*
DELETE / ALL
*----------------------------------------------------------------------------
* Example 13.8, p. 552
*
SAMPLE 1 17
READ MAGAZINE RANKX RANKY
  1   14    2
  2    8    4
  3    1   16
  4   16    1
  5   17    5
  6   13    6
  7   15    8
  8    2   11
  9    7    9
 10    3   13
 11    6   12
 12    9   17
 13    5    3
 14    4    7
 15   11   14
 16   12   15
 17   10   10
*
* The Spearman Rank Correlation Coefficient between the ranks of Cost of
* Advertising and Circulation, X, and Return-On-Inquiry Cost, Y, can be
* calculated with the PRANKCOR option on the STAT command.
*
STAT RANKX RANKY / PRANKCOR
*
* The manual calculation of the Spearman Rank Correlation Coefficient is
* illustrated below.
*
GENR D=RANKX-RANKY
GENR D2=D**2
PRINT MAGAZINE RANKX RANKY D D2
STAT D2 / SUMS=SUMD
PRINT SUMD
GEN1 N=17
GEN1 R=1-((6*SUMD)/(N*(N**2-1)))
PRINT R
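*
* When there are no tied ranks, as here, the Spearman coefficient equals the
* ordinary (Pearson) correlation computed on the ranks.  A sketch of that
* check, using the PCOR option on the STAT command; the off-diagonal entry
* of the printed matrix should match R, about -0.431.
*
STAT RANKX RANKY / PCOR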
*
*----------------------------------------------------------------------------
*
STOP