Revista Mexicana de Ciencias Agrícolas volume 11 number 4 May 16 - June 29, 2020
DOI: https://doi.org/10.29312/remexca.v11i4.2249
Article
SAS code to analyze a complete diallel and heterosis. One environment
Delfina de Jesús Pérez López
Claudia Saavedra Guevara
Martín Rubí Arriaga
J. Ramón Pascual Franco Martínez
Francisco Gutiérrez Rodríguez
Andrés González Huerta§
Center for Research and Advanced Studies in Plant Breeding-Faculty of Agricultural Sciences-Autonomous University of the State of Mexico-University Campus ‘El Cerrillo’. El Cerrillo Piedras Blancas, Toluca, State of Mexico, Mexico. AP. 435. Tel. 722 2965531, ext. 148. (djperezl@uaemex.mx; csg1003@yahoo.com; m-rubi65@yahoo.com.mx; jrfrancom@uaemex.mx; fgrfca@hotmail.com).
§Corresponding author: agonzalezh@uaemex.mx.
Abstract
Developing programs for the Statistical Analysis System (SAS) and validating them with freely available software is essential when there are no financial resources to purchase the license of an appropriate statistical package. This study presents a SAS code and validates it with the program proposed by Zhang and Kang (1997), as modified by Saavedra (2019). The code generates an analysis of variance in which the treatment effects are partitioned into parents (P), direct crosses (CD), reciprocal crosses (CR), P vs crosses, and CD vs CR. In addition to comparing treatment means with the Tukey test, it estimates the genetic effects for parents and for their crosses (Gi, Sij, Rij, Mi), as well as heterosis relative to the mean of both parents or to the better parent. Since the two codes coincide only in the calculation of these genetic effects, their joint application is suggested to carry out a complete analysis of Griffing’s method 1 (1956a, b). The proposed code will be useful for plant breeders and geneticists, and especially for undergraduate and graduate students in the biological and agricultural sciences with little training in SAS programming.
Keywords: Griffing method 1, model 1, randomized complete blocks, Tukey test.
Reception date: February 2020
Acceptance date: April 2020
Introduction
Diallel crosses were designed before the 1950s and soon became a powerful tool for plant and animal breeders, who assess the merit of various parents by evaluating their progenies through the effects and variances of general combining ability (ACG, by its Spanish acronym) and specific combining ability (ACE) (Sprague and Tatum, 1942; Griffing, 1956a, b; González et al., 2007a, b). These analyses help to define new heterotic patterns or a segregating population from which outstanding plants can again be isolated, and to predict the response to selection or the performance of hybrids or synthetics formed with new lines (Hallauer and Miranda, 1988; Christie and Shattuck, 1992; González et al., 2007a, b).
Analyzing a complete diallel cross experiment without a personal computer (PC) is laborious. To save time, several statistical packages are available, such as SAS (https://www.sas.com/store/index.ep), Excel (Microsoft Office), Indostat (https://www.indostat.org), AGD-R (https://data.cimmyt.org/dataset.xhtml?persistentId=hdl:11529/10202), Agrobase Generation II (http://www.agronomix.com), PB Tools (https://pbtools.software.informer.com/2.0/), TNAUSTAT (https://sites.google.com/site/tnaustat) and GSCA (https://bioseqdata.com/gsca/gsca.htm), among others. Of these, only Agrobase Generation II and Indostat must be purchased under license, at a cost higher than $1 000.00 USD, because at least three modules are required to operate either package properly. Although SAS is widely considered one of the most complete statistical packages, breeders and geneticists commonly use various programs to analyze data from experiments in the agricultural and biological sciences (Padilla et al., 2019a; Padilla et al., 2019b; Saavedra, 2019).
Moreover, many users find it difficult to download free software because it is incompatible with their PCs, technical problems arise during the download, the necessary permission is not obtained, the developers do not respond to requests, or the program does not work in old or recent versions of Windows. In this context, it would be desirable to develop and validate SAS codes, for versions 6.01 or higher (SAS, 1989), that complement the genetic-statistical analysis of complete diallel cross experiments.
Materials and methods
Complete diallel
In methodology 1, described in Saavedra (2019), the analysis of variance (ANOVA) for a single environment contains repetitions (R), treatments (Trat) and experimental error; its statistical model corresponds to a randomized complete block design. In the ANOVA, the effects of Trat are divided into parents (P), direct crosses (CD), reciprocal crosses (CR), P vs crosses and CD vs CR, as suggested by González et al. (2007b). The two contrasts estimate, respectively, average heterosis and the maternal and non-maternal effects.
The program compares the Trat means with the Tukey test (SAS, 1989). The code can be easily modified if the user requires other mean comparison tests or various regression and correlation analyses; these analyses can also be extended to series of experiments over time and space (Saavedra, 2019).
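The minimum significant difference used by the Tukey test can be checked by hand. The following is a minimal sketch in Python (not part of the SAS code of this study), assuming equal replication and taking the critical value of the studentized range and the error mean square from the SAS listing reported in this article; the function name `tukey_hsd` is hypothetical.

```python
import math

# Values read from the SAS listing reported in this article.
q_critical = 5.76634   # studentized range, alpha = 0.01, 16 means, 75 error df
mse = 264.0243         # error mean square of the analysis with 96 data
n_reps = 6             # observations behind each treatment mean


def tukey_hsd(q, mean_square_error, n):
    """Minimum significant difference for equally replicated means."""
    return q * math.sqrt(mean_square_error / n)


hsd = tukey_hsd(q_critical, mse, n_reps)
print(round(hsd, 3))
```

Under these assumptions the result agrees with the minimum significant difference of about 38.25 g L-1 printed by SAS, so two treatment means are declared different only when they differ by more than that amount.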
In methodology 2, which corresponds to method 1 of Griffing (1956a, b), the ANOVA for a single trial has repetitions (R), general combining ability (ACG), specific combining ability (ACE), maternal effects (EM) and reciprocal effects (ER); in a series of experiments, the interactions of these effects with environments could also be estimated, but only for two environments.
In both cases, the effects gi for each parent and sij for each cross, as well as the reciprocal and maternal effects, are estimated (Zhang and Kang, 1997). The variance components, heritability, and the prediction of hybrids and synthetics could be estimated with other SAS programs (Martínez, 1983; González et al., 2007a, b; Montesinos et al., 2007).
Defining variables in code
In the database called “diallel”, the variables female, male, YH, YP, YM, X, Y, A, B, C, D and M are defined. The female and male variables identify the combination of each female with each male; YH, YP and YM are the means of the cross, the female parent and the male parent, respectively. X and Y hold the totals, summed over repetitions, of each pair of direct (CD) and reciprocal (CR) crosses. A, B, C and D hold the line totals, because each line of a CD or CR appears twice, as female and as male: A and B are the totals of the female line (Yi. and Y.i), and C and D are those of the male line (Yj. and Y.j). M is the grand arithmetic mean. GI, SIJ, RIJ and MI are the genetic effects estimated with the formulas proposed in method 1 of Griffing (1956a, b).
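The roles of YH, YP and YM can be illustrated with the two heterosis percentages computed from them. This is a small Python sketch, offered only as an independent check with free software and not as the authors' SAS code; the function name `heterosis` is hypothetical, and the three means are those of the cross 3x4 reported in this article.

```python
def heterosis(yh, yp, ym):
    """Return mid-parent value, best parent, and both heterosis percentages.

    yh: mean of the cross; yp: mean of the female parent; ym: mean of the male parent.
    """
    mp = (yp + ym) / 2            # mean of both parents
    bp = max(yp, ym)              # better parent (higher mean)
    hmp = (yh - mp) / mp * 100    # heterosis over the mid-parent, in %
    hbp = (yh - bp) / bp * 100    # heterosis over the better parent, in %
    return mp, bp, hmp, hbp


# Cross 3x4: YH = 832.8, YP = 804.8 (line 3), YM = 778.0 (line 4).
mp, bp, hmp, hbp = heterosis(832.8, 804.8, 778.0)
print(f"MP={mp:.2f} BP={bp:.1f} HMP={hmp:.5f} HBP={hbp:.5f}")
```

Choosing the higher mean as the better parent assumes that larger values of the trait are desirable, as is the case for the volumetric weight of the grain analyzed here.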
Values used in the code
In this study, 96 data were used, corresponding to four parents, their six direct crosses and their six reciprocal crosses, registered in six repetitions (Saavedra, 2019).
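Because the treatment order (P, CD, CR) determines which rows the SAS subsets and contrasts pick up, it may help to generate that order explicitly. The sketch below is an illustration in Python under the assumption that direct crosses are listed as i x j with i < j and that each reciprocal follows the order of its direct cross, as in Table 1; the function name `treatment_order` is hypothetical.

```python
from itertools import combinations


def treatment_order(p):
    """List (female, male) pairs in the order parents, direct, reciprocal crosses."""
    lines = range(1, p + 1)
    parents = [(i, i) for i in lines]
    direct = list(combinations(lines, 2))        # (1,2), (1,3), ..., (3,4)
    reciprocal = [(j, i) for i, j in direct]     # (2,1), (3,1), ..., (4,3)
    return parents + direct + reciprocal


order = treatment_order(4)
n_observations = len(order) * 6                  # 16 treatments x 6 repetitions

for trat, (female, male) in enumerate(order, start=1):
    print(trat, f"{female}x{male}")
```

With four parents this yields treatments 1 to 4 as parents, 5 to 10 as direct crosses and 11 to 16 as reciprocal crosses, for a total of 96 observations over six repetitions.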
Results and discussion
Since SAS was created in 1972, several researchers have implemented programs for the analysis of diallel cross experiments on personal computers (PCs). The main contributions for PCs are attributed to Schaffer and Usanis (1989); Burow and Coors (1994); Magari and Kang (1994); Zhang and Kang (1997); Martínez (1983, 1991), among others. More recently, Mastache and Martínez (1998a, 1998b, 1999a, 1999b) refined their algorithms to obtain the empirical best linear unbiased predictors (MPLI, by its Spanish acronym) of the effects of the parents, to help users with little training in programming when using completely randomized designs (DCA) and randomized complete blocks (BCA).
Mastache and Martínez (2003) also obtained an integrated algorithm for the simultaneous analysis of balanced experiments under fixed- or random-effects models. These and other programs could also be used to validate and complement the outputs obtained with the code proposed in the present study (Zhang et al., 2005; Montesinos et al., 2007).
Zhang et al. (2005) modified the codes of Zhang and Kang (1997); in Diallel-SAS05 they presented a more efficient program for the genetic-statistical analysis of the four methods of Griffing (1956a, b), which also includes the analyses corresponding to designs II and III of Gardner and Eberhart (1966). This program is friendlier and easier to modify than Diallel-SAS when the number of parents varies from 4 to 12, when there is no restriction on the number of environments, and when the effects and variances of ACG and ACE for parents and crosses, as well as their interactions with environments, are estimated. As with other statistical packages, there are problems in running it on personal computers with recent versions of Windows (Padilla et al., 2019a, b).
Program 1a calculates the ANOVA and the comparison of means (Tukey, p= 0.01). Since Trat and its components are considered fixed effects, the F tests use the mean square of the experimental (residual) error of the model as denominator. Its code uses DATA, SET and IF-THEN to define subsets of data, and PROC ANOVA and PROC GLM to analyze them. The user must respect the order of the treatments in the database: P, CD and CR. The signs and coefficients of the contrasts, as in other statistical packages, must be entered within the program. If there is any doubt about how to design this type of contrast, it is suggested to consult Padilla et al. (2019a).
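The contrast coefficients can be verified before they are typed into SAS. The following Python sketch, which is an independent check and not the code of this study, computes the sum of squares of each single-degree-of-freedom contrast from the treatment totals of Table 1 with the usual formula (sum of c times total) squared, divided by r times the sum of squared coefficients; the function name `contrast_ss` is hypothetical.

```python
# Treatment totals over six repetitions, in the order P, CD, CR (Table 1).
totals = [4555, 4575, 4829, 4668,
          4687, 4883, 4709, 4846, 4651, 4997,
          4645, 4851, 4840, 4819, 4555, 4923]
reps = 6

# Same coefficients as in the CONTRAST statements of program 1a.
p_vs_crosses = [12] * 4 + [-4] * 12
cd_vs_cr = [0] * 4 + [1] * 6 + [-1] * 6


def contrast_ss(coefs, treatment_totals, r):
    """Sum of squares of a contrast among equally replicated treatment totals."""
    if sum(coefs) != 0:
        raise ValueError("contrast coefficients must add to zero")
    estimate = sum(c * t for c, t in zip(coefs, treatment_totals))
    return estimate ** 2 / (r * sum(c * c for c in coefs))


ss_p_vs_crosses = contrast_ss(p_vs_crosses, totals, reps)
ss_cd_vs_cr = contrast_ss(cd_vs_cr, totals, reps)
# Two contrasts are mutually orthogonal when their cross-products add to zero.
cross_product = sum(a * b for a, b in zip(p_vs_crosses, cd_vs_cr))

print(round(ss_p_vs_crosses, 4), round(ss_cd_vs_cr, 4), cross_product)
```

Both sums of squares reproduce the contrast sums of squares in the SAS output of this article (about 8075.09 and 272.22), and the zero cross-product confirms that the two contrasts are mutually orthogonal.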
In the SAS output, the ANOVA with R= 6 and Trat= 4 corresponds to the parents; its hypothesis test is not correct because it was constructed from a subset of the data, so its residual mean square is based on only a fraction of the 96 observations. In this case, the mean square should be compared with the residual of the complete analysis and a table of F should be consulted. At this stage there are no restrictions on the number of variables to analyze. The code can be modified to include the least significant difference (DMS or LSD), Dunnett, or mutually orthogonal contrasts, among others. With two or more variables, the program can be modified to perform regression and correlation, estimate simple statistics and apply multivariate methods, among others.
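The correction described above can be sketched numerically. This Python example is only an illustration: it takes the mean squares of the three subset analyses and the error mean square of the complete analysis from the SAS output reported in this article, and recomputes each F ratio with the complete-analysis error (75 degrees of freedom) as denominator. The dictionary names are hypothetical, and the significance of each ratio would still have to be read from a table of F.

```python
# Error mean square of the complete analysis (96 data, 75 df).
full_mse = 264.02431

# (degrees of freedom, sum of squares, mean square) from the subset analyses.
subsets = {
    "parents": (3, 7805.458333, 2601.819444),
    "direct crosses": (5, 15157.25, 3031.45),
    "reciprocal crosses": (5, 16515.47222, 3303.09444),
}
contrast_ss = {"P vs crosses": 8075.086806, "CD vs CR": 272.222222}

# F ratios tested against the error of the complete model.
f_ratios = {name: ms / full_mse for name, (_, _, ms) in subsets.items()}

# The five components should add up to the treatment sum of squares (15 df).
partition_total = sum(ss for _, ss, _ in subsets.values()) + sum(contrast_ss.values())
partition_df = sum(df for df, _, _ in subsets.values()) + len(contrast_ss)

for name, f_value in f_ratios.items():
    print(f"{name}: F = {f_value:.2f} with {subsets[name][0]} and 75 df")
print(round(partition_total, 3), partition_df)
```

The components add up to the treatment sum of squares of the complete analysis (47825.49 with 15 degrees of freedom), which supports reading the five sources as a partition of Trat.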
The code corresponding to program 1a is presented below:
* Program 1a. ANOVA, Tukey test and contrasts for 16 treatments in 6 blocks;
* Treatment order: 1-4 parents (P), 5-10 direct crosses (CD), 11-16 reciprocal crosses (CR);
* Only the first two and the last two of the 96 data lines are shown after CARDS;
DATA CORN; INPUT REP TRAT PVG; CARDS;
1 01 758
1 02 761
6 15 768
6 16 758
;
DATA PARENTS; SET CORN; IF TRAT>4 THEN DELETE; *parents only;
DATA CD; SET CORN; IF TRAT<5 OR TRAT>10 THEN DELETE; *direct crosses only;
DATA CR; SET CORN; IF TRAT<11 THEN DELETE; *reciprocal crosses only;
PROC ANOVA DATA=CORN; CLASS REP TRAT; MODEL PVG=REP TRAT;
MEANS TRAT/TUKEY LINES ALPHA=0.01; *analysis with the 96 data;
RUN;
PROC GLM DATA=CORN; CLASS REP TRAT; MODEL PVG=REP TRAT;
CONTRAST "P VS CROSSES" TRAT 12 12 12 12 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4;
CONTRAST "CD VS CR" TRAT 0 0 0 0 1 1 1 1 1 1 -1 -1 -1 -1 -1 -1;
RUN;
PROC ANOVA DATA=PARENTS; CLASS REP TRAT; MODEL PVG=REP TRAT; *analysis of variance for parents;
RUN;
PROC ANOVA DATA=CD; CLASS REP TRAT; MODEL PVG=REP TRAT; *analysis of variance for direct crosses;
RUN;
PROC ANOVA DATA=CR; CLASS REP TRAT; MODEL PVG=REP TRAT; *analysis of variance for reciprocal crosses;
RUN;
With program 1b, estimates of the genetic effects (Gi, Sij, Rij, Mi) and of heterosis (%) are obtained. The variables listed before CARDS must be entered correctly using the data in Tables 1 and 2 (Saavedra, 2019), and the denominators of the GI, SIJ, RIJ and MI formulas must be corrected if R, P or both change. At this stage it is essential to resort to the devices that Martínez (1983) used to relate the formulas of Griffing (1956a, b) to the SAS programming language in a logical way (SAS Institute, 1989).
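The dependence of the denominators on P and R can be made explicit. The expressions below restate the four assignments of program 1b in general form; they are a sketch under the assumption that A, B, C and D stand for the totals over repetitions of line i as female and as male and of line j as female and as male, that X and Y are the totals of the cross and its reciprocal, and that M is the grand mean. The symbols p (number of parents) and r (number of repetitions) are introduced here for illustration.

```latex
\[
\begin{aligned}
\hat{g}_i    &= \frac{Y_{i.} + Y_{.i}}{2pr} - \bar{Y}_{..}
              &&\text{(GI: } A = Y_{i.},\ B = Y_{.i},\ M = \bar{Y}_{..}\text{)} \\
\hat{s}_{ij} &= \frac{Y_{ij} + Y_{ji}}{2r}
               - \frac{Y_{i.} + Y_{.i} + Y_{j.} + Y_{.j}}{2pr} + \bar{Y}_{..}
              &&\text{(SIJ: } X = Y_{ij},\ Y = Y_{ji},\ C = Y_{j.},\ D = Y_{.j}\text{)} \\
\hat{r}_{ij} &= \frac{Y_{ij} - Y_{ji}}{2r}
              &&\text{(RIJ)} \\
\hat{m}_i    &= \frac{Y_{i.} - Y_{.i}}{2pr}
              &&\text{(MI)}
\end{aligned}
\]
```

With p = 4 and r = 6, the divisors are 2pr = 48 and 2r = 12, which are the constants that appear in the code; with another number of parents or repetitions, these two constants are the values to update.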
Table 1. Volumetric weight of the grain (g L-1) of 16 crosses formed with four lines.
Crosses | R1 | R2 | R3 | R4 | R5 | R6 | Total | Mean |
1) 1x1 | 758 | 734 | 750 | 790 | 758 | 765 | 4 555 | 759.1 |
2) 2x2 | 761 | 762 | 737 | 779 | 763 | 773 | 4 575 | 762.5 |
3) 3x3 | 802 | 812 | 802 | 838 | 793 | 782 | 4 829 | 804.8 |
4) 4x4 | 790 | 768 | 780 | 772 | 783 | 775 | 4 668 | 778 |
5) 1x2 | 814 | 792 | 770 | 781 | 775 | 755 | 4 687 | 781.1 |
6) 1x3 | 805 | 803 | 806 | 832 | 813 | 824 | 4 883 | 813.8 |
7) 1x4 | 791 | 775 | 777 | 791 | 795 | 780 | 4 709 | 784.8 |
8) 2x3 | 819 | 816 | 793 | 814 | 818 | 786 | 4 846 | 807.6 |
9) 2x4 | 779 | 778 | 758 | 798 | 783 | 755 | 4 651 | 775.1 |
10) 3x4 | 830 | 830 | 850 | 853 | 828 | 806 | 4 997 | 832.8 |
11) 2x1 | 774 | 772 | 786 | 750 | 794 | 769 | 4 645 | 774.16 |
12) 3x1 | 789 | 808 | 816 | 808 | 824 | 806 | 4 851 | 808.5 |
13) 4x1 | 787 | 815 | 815 | 825 | 802 | 796 | 4 840 | 806.6 |
14) 3x2 | 817 | 832 | 808 | 775 | 790 | 797 | 4 819 | 803.1 |
15) 4x2 | 756 | 768 | 756 | 754 | 753 | 768 | 4 555 | 759.1 |
16) 4x3 | 850 | 820 | 840 | 850 | 805 | 758 | 4 923 | 820.5 |
Total | 12 722 | 12 685 | 12 644 | 12 810 | 12 677 | 12 495 | 76 033 | 792.01 |
Table 2. Values used to estimate genetic effects and heterosis.
Lines | 1 | 2 | 3 | 4 | Total |
1 | 4 555 | 4 687 | 4 883 | 4 709 | 18 834 |
2 | 4 645 | 4 575 | 4 846 | 4 651 | 18 717 |
3 | 4 851 | 4 819 | 4 829 | 4 997 | 19 496 |
4 | 4 840 | 4 555 | 4 923 | 4 668 | 18 986 |
Total | 18 891 | 18 636 | 19 481 | 19 025 | 76 033 |
Note: cell values are totals added over repetitions; row and column totals are the contribution of each line as female and as male, respectively.
In some columns there are duplicate values, as for the grand arithmetic mean (M), which is a constant for the 12 crosses, but it is easy to establish which parent or cross they correspond to because the data are shown in descending order (González et al., 2007a, b; Saavedra, 2019).
The code for program 1b is presented below:
DATA HETERO; INPUT FEMALE MALE YH YP YM X Y A B C D M;
*A and B are the totals of the female line, C and D are those of the male line;
*divisors: 48 = 2 x P x R and 12 = 2 x R, with P = 4 parents and R = 6 repetitions;
MP= (YP+YM)/2;*to calculate the mean of the parents (MP);
BP= MAX(YP,YM);*to choose the best parent (BP);
DMP= YH-MP;*to estimate the numerator of the heterosis formula with MP;
HMP= (DMP/MP)*100;*to estimate heterosis with the mean of the parents, in %;
DBP= YH-BP;*to calculate the numerator of the heterosis formula with BP;
HBP= (DBP/BP)*100;*to calculate heterosis with the best parent, in %;
GI= (A+B)/48 - M;*to estimate the effects gi;
SIJ= (X+Y)/12 - (A+B+C+D)/48 + M;*to calculate the effects sij;
RIJ= (X-Y)/12;*to determine the effects rij;
MI= (A-B)/48;*to calculate the effects mi;
CARDS;
1 2 781.1 759.1 762.5 4687 4645 18834 18891 18717 18636 792.01
1 3 813.8 759.1 804.8 4883 4851 18834 18891 19496 19481 792.01
1 4 784.8 759.1 778.0 4709 4840 18834 18891 18986 19025 792.01
2 3 807.6 762.5 804.8 4846 4819 18717 18636 19496 19481 792.01
2 4 775.1 762.5 778.0 4651 4555 18717 18636 18986 19025 792.01
3 4 832.8 804.8 778.0 4997 4923 19496 19481 18986 19025 792.01
2 1 774.1 762.5 759.1 4645 4687 18717 18636 18834 18891 792.01
3 1 808.5 804.8 759.1 4851 4883 19496 19481 18834 18891 792.01
4 1 806.6 778.0 759.1 4840 4709 18986 19025 18834 18891 792.01
3 2 803.1 804.8 762.5 4819 4846 19496 19481 18717 18636 792.01
4 2 759.1 778.0 762.5 4555 4651 18986 19025 18717 18636 792.01
4 3 820.5 778.0 804.8 4923 4997 18986 19025 19496 19481 792.01
;
TITLE 'Effects of gi, sij, rij, mi and heterosis for the complete diallel';
DATA DOS; SET HETERO; PROC PRINT DATA=DOS; RUN;
ANOVA procedure
Dependent variable: PVG
Source            DF   Sum of squares   Mean square   F Value   Pr > F
Model             20   51191.16667      2559.55833    9.69      <.0001
Error             75   19801.82292      264.02431
Corrected total   95   70992.98958
R-square   Coef Var   Root MSE   PVG Mean
0.721074   2.051592   16.24882   792.0104
Source   DF   ANOVA SS      Mean square   F Value   Pr > F
rep      5    3365.67708    673.13542     2.55      0.0347
trat     15   47825.48958   3188.36597    12.08     <.0001
Tukey studentized range (HSD) test for PVG
Note: This test controls the Type I experimentwise error rate, but it generally has a higher Type II error rate than REGWQ.
Alpha                                   0.01
Error degrees of freedom                75
Error mean square                       264.0243
Critical value of studentized range     5.76634
Minimum significant difference          38.251
Means with the same letter are not significantly different.
Tukey grouping   Mean      N   trat
A                832.833   6   10
A B              820.500   6   16
A B C            813.833   6   6
A B C D          808.500   6   12
A B C D          807.667   6   8
A B C D          806.667   6   13
A B C D          804.833   6   3
A B C D          803.167   6   14
B C D E          784.833   6   7
C D E            781.167   6   5
C D E            778.000   6   4
D E              775.167   6   9
D E              774.167   6   11
E                762.500   6   2
E                759.167   6   1
E                759.167   6   15
Contrast       DF   Contrast SS   Mean square   F Value   Pr > F
P VS CROSSES   1    8075.086806   8075.086806   30.58     <.0001
CD VS CR       1    272.222222    272.222222    1.03      0.3132
Dependent variable: PVG (parents, trat 1 to 4)
Source   DF   ANOVA SS      Mean square   F Value   Pr > F
rep      5    1952.875000   390.575000    2.00      0.1376
trat     3    7805.458333   2601.819444   13.30     0.0002
Dependent variable: PVG (direct crosses, trat 5 to 10)
Source   DF   ANOVA SS      Mean square   F Value   Pr > F
rep      5    2869.25000    573.85000     3.57      0.0143
trat     5    15157.25000   3031.45000    18.84     <.0001
Dependent variable: PVG (reciprocal crosses, trat 11 to 16)
Source   DF   ANOVA SS      Mean square   F Value   Pr > F
rep      5    1741.80556    348.36111     0.90      0.4948
trat     5    16515.47222   3303.09444    8.56      <.0001
Effects of Gi, Sij, Rij, Mi and heterosis for methodology 1
Obs FEMALE MALE YH YP YM X Y A B C D M MP
1 1 2 781.1 759.1 762.5 4687 4645 18834 18891 18717 18636 792.01 760.80
2 1 3 813.8 759.1 804.8 4883 4851 18834 18891 19496 19481 792.01 781.95
3 1 4 784.8 759.1 778.0 4709 4840 18834 18891 18986 19025 792.01 768.55
4 2 3 807.6 762.5 804.8 4846 4819 18717 18636 19496 19481 792.01 783.65
5 2 4 775.1 762.5 778.0 4651 4555 18717 18636 18986 19025 792.01 770.25
6 3 4 832.8 804.8 778.0 4997 4923 19496 19481 18986 19025 792.01 791.40
7 2 1 774.1 762.5 759.1 4645 4687 18717 18636 18834 18891 792.01 760.80
8 3 1 808.5 804.8 759.1 4851 4883 19496 19481 18834 18891 792.01 781.95
9 4 1 806.6 778.0 759.1 4840 4709 18986 19025 18834 18891 792.01 768.55
10 3 2 803.1 804.8 762.5 4819 4846 19496 19481 18717 18636 792.01 783.65
11 4 2 759.1 778.0 762.5 4555 4651 18986 19025 18717 18636 792.01 770.25
12 4 3 820.5 778.0 804.8 4923 4997 18986 19025 19496 19481 792.01 791.40
Obs BP DMP HMP DBP HBP GI SIJ RIJ MI
1 762.5 20.30 2.66824 18.6 2.43934 -6.0725 5.5517 3.5000 -1.1875
2 804.8 31.85 4.07315 9.0 1.11829 -6.0725 5.2183 2.6667 -1.1875
3 778.0 16.25 2.11437 6.8 0.87404 -6.0725 9.9267 -10.9167 -1.1875
4 804.8 23.95 3.05621 2.8 0.34791 -13.8225 7.2183 2.2500 1.6875
5 778.0 4.85 0.62967 -2.9 -0.37275 -13.8225 -10.9067 8.0000 1.6875
6 804.8 41.40 5.23124 28.0 3.47913 20.0108 14.7600 6.1667 0.3125
7 762.5 13.30 1.74816 11.6 1.52131 -13.8225 5.5517 -3.5000 1.6875
8 804.8 26.55 3.39536 3.7 0.45974 20.0108 5.2183 -2.6667 0.3125
9 778.0 38.05 4.95088 28.6 3.67609 -0.1142 9.9267 10.9167 -0.8125
10 804.8 19.45 2.48198 -1.7 -0.21123 20.0108 7.2183 -2.2500 0.3125
11 778.0 -11.15 -1.44758 -18.9 -2.42931 -0.1142 -10.9067 -8.0000 -0.8125
12 804.8 29.10 3.67703 15.7 1.95080 -0.1142 14.7600 -6.1667 -0.8125
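The genetic effects in the listing above can be reproduced directly from the totals of Table 2. The following Python sketch is an independent check with free software rather than the authors' SAS code; it assumes p = 4 parents and r = 6 repetitions, uses zero-based indices (line 1 is index 0), and the helper names `s` and `rec` are hypothetical.

```python
# Table 2 of this article: totals over r = 6 repetitions.
# Rows are female lines, columns are male lines.
totals = [
    [4555, 4687, 4883, 4709],
    [4645, 4575, 4846, 4651],
    [4851, 4819, 4829, 4997],
    [4840, 4555, 4923, 4668],
]
p, r = 4, 6

row = [sum(totals[i]) for i in range(p)]                        # Yi. (line i as female)
col = [sum(totals[i][j] for i in range(p)) for j in range(p)]   # Y.i (line i as male)
grand_mean = sum(row) / (p * p * r)                             # unrounded M

g = [(row[i] + col[i]) / (2 * p * r) - grand_mean for i in range(p)]
m = [(row[i] - col[i]) / (2 * p * r) for i in range(p)]


def s(i, j):
    """Specific effect of the pair of lines i and j (zero-based indices)."""
    line_totals = row[i] + col[i] + row[j] + col[j]
    return (totals[i][j] + totals[j][i]) / (2 * r) - line_totals / (2 * p * r) + grand_mean


def rec(i, j):
    """Reciprocal effect of the cross i x j (zero-based indices)."""
    return (totals[i][j] - totals[j][i]) / (2 * r)


# Same g1 with the grand mean rounded to 792.01, as typed in program 1b.
g1_rounded = (row[0] + col[0]) / (2 * p * r) - 792.01

for i in range(p):
    print(f"g{i + 1} = {g[i]:.4f}   m{i + 1} = {m[i]:.4f}")
print(f"s12 = {s(0, 1):.4f}   r14 = {rec(0, 3):.4f}   g1 (M rounded) = {g1_rounded:.4f}")
```

The comparison of `g[0]` with `g1_rounded` suggests that the small differences in the fourth decimal between the GI and SIJ values of program 1b and the estimates of the program of Zhang and Kang (1997) come from entering M rounded to two decimals.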
The previous results were validated with the program developed by Zhang and Kang (1997). In its ANOVA, the code partitions the effects of the crosses into ACG, ACE, ER and EM by means of SORT, BY, GLM, IF-THEN, ELSE, DROP, ARRAY, CONTRAST, ESTIMATE and some macros. Martínez (1991) presents these and other components used to develop the reference code.
The program of Zhang and Kang (1997) applies to the four methods of Griffing (1956a, b); for method 1, m variables are analyzed in two environments. In the present study, it was adjusted to a single environment with the restriction IF ENV>1 THEN DELETE or IF ENV<2 THEN DELETE, placed after INPUT and before DROP. Modifying it is more laborious for users with little training in programming, especially when the analysis extends to series of experiments (Singh, 1973; Mastache and Martínez, 2003; Zhang et al., 2005).
The TNAUSTAT software, in addition to calculating the genetic effects of parents and their crosses in method 1 of Griffing (1956a, b), also calculates hybrid vigor relative to the mean of both parents and to the better parent, as well as commercial heterosis. It has the additional advantage of estimating the genetic parameters corresponding to the mating design I, proposed by Hayman (1954). However, this software was designed to run on MS-DOS, so the DOSBox software must be downloaded in advance.
The Zhang and Kang program (1997), modified by Saavedra (2019), is presented below:
OPTIONS PS=56 LS=78; TITLE 'METHOD 1'; DATA METHOD1;
INPUT I J REP HYBRID YIELD ENV; IF ENV>1 THEN DELETE;DROP N NI NJ P;
P=4;*NUMBER OF PARENTAL LINES? ; ARRAY GCA (N) G1 G2 G3;DO N=1 TO (P-1);
GCA= ((I=N)-(I=P)) + ((J=N)-(J=P)); END;ARRAY SCA(N) S11 S12 S13 S22 S23 S33;
N=0; DO NI=1 TO (P-1); DO NJ=NI TO (P-1); N+1; IF NI=NJ THEN DO;
SCA=(I=NI)*((J=NJ)-(J=P))+(I=P)*((J=P)-(J=NI));END;ELSE DO;
SCA=(I=NI)*(J=NJ)-(J=P)*((I=NI)+(I=NJ)-(I=P)*2)+(I=NJ)*(J=NI)
-(I=P)*((J=NI)+(J=NJ));END;END;END;
ARRAY REC (N) R12 R13 R14 R23 R24 R34; N=0; DO NI=1 TO (P-1);
DO NJ= (NI+1) TO P; N+1; REC= (I=NI)*(J=NJ)-(j=NI)*(I=NJ); END;END;
ARRAY MAT (N) M1 M2 M3; DO N=1 TO (P-1); MAT= (I=N) + (J=P)-(J=N)-(I=P);
END;ARRAY NONM (N) N12 N13 N23;N=0;DO NI=1 TO (P-2);DO NJ=(NI+1) TO (P-1);N+1;NONM=((I=NI)*(J=NJ))-(I=NJ)*(J=NI)-((I=NI)*(J=P))+(I=NJ)*(J=P)
+ ((I=P)*((J=NI)-(J=NJ))); END;END;CARDS;
1 1 1 01 758 1
1 2 1 02 814 1
1 3 1 03 805 1
4 3 6 15 758 1
4 4 6 16 775 1
;
PROC SORT; BY REP ENV I J; PROC GLM;CLASS REP ENV HYBRID;MODEL YIELD=ENV REP(ENV) HYBRID HYBRID*ENV;TEST H=HYBRID E=HYBRID*ENV;LSMEANS HYBRID;
RUN; TITLE 'DIALLEL-SAS 1'; PROC GLM; CLASS REP ENV HYBRID;
MODEL YIELD= ENV REP (ENV) G1 G2 G3 S11 S12 S13 S22 S23 S33 R12 R13 R14 R23 R24 R34 G1*ENV G2*ENV G3*ENV S11*ENV S12*ENV S13*ENV S22*ENV S23*ENV S33*ENV R12*ENV R13*ENV R14*ENV R23*ENV R24*ENV R34*ENV;
%MACRO GCASCA; CONTRAST 'GCA' G1 1, G2 1, G3 1;
CONTRAST 'SCA' S11 1, S12 1, S13 1, S22 1, S23 1, S33 1;
ESTIMATE 'G1' G1 1; ESTIMATE 'G2' G2 1; ESTIMATE 'G3' G3 1;
Estimate 'G4' G1 -1 G2 -1 G3 -1;
ESTIMATE 'S11' S11 1; ESTIMATE 'S12' S12 1; ESTIMATE 'S13' S13 1;
ESTIMATE 'S22' S22 1; ESTIMATE 'S23' S23 1; ESTIMATE 'S33' S33 1;
Estimate 'S14' S11 -1 S12 -1 S13 -1;
Estimate 'S24' S12 -1 S22 -1 S23 -1;
Estimate 'S34' S13 -1 S23 -1 S33 -1;
Estimate 'S44' S11 1 S12 2 S13 2 S22 1 S23 2 S33 1;
%MEND GCASCA; %GCASCA %MACRO INTERACT;
CONTRAST 'GCA*ENV' G1*ENV 1 -1, G2*ENV 1 -1, G3*ENV 1 -1;
CONTRAST 'SCA*ENV' S11*ENV 1 -1, S12*ENV 1 -1, S13*ENV 1 -1, S22*ENV 1 -1, S23*ENV 1 -1, S33*ENV 1 -1; %MEND INTERACT; %INTERACT
CONTRAST 'REC' R12 1, R13 1, R14 1, R23 1, R24 1, R34 1;
ESTIMATE 'R12' R12 1; ESTIMATE 'R13' R13 1; ESTIMATE 'R14' R14 1;
ESTIMATE 'R23' R23 1; Estimate 'R24' R24 1; ESTIMATE 'R34' R34 1;
CONTRAST 'REC*ENV' R12*ENV 1 -1,R13*ENV 1 -1,R14*ENV 1 -1,R23*ENV 1 -1,R24*ENV 1 -1,R34*ENV 1 -1;
CONTRAST 'MAT SS' R12 1 R13 1 R14 1, R12 -1 R23 1 R24 1, R13 -1 R23 -1 R34 1, R14 -1 R24 -1 R34 -1; ESTIMATE 'MAT1' R12 1 R13 1 R14 1/DIVISOR=3;
ESTIMATE 'MAT2' R12 -1 R23 1 R24 1/DIVISOR=3;
ESTIMATE 'MAT3' R13 -1 R23 -1 R34 1/DIVISOR=3;
ESTIMATE 'MAT4' R14 -1 R24 -1 R34 -1/DIVISOR=3; RUN;
TITLE 'DIALLEL-SAS 2'; PROC GLM; CLASS REP ENV HYBRID;
MODEL YIELD= ENV REP (ENV) G1 G2 G3 S11 S12 S13 S22 S23 S33
M1 M2 M3 N12 N13 N23 G1*ENV G2*ENV G3*ENV
S11*ENV S12*ENV S13*ENV S22*ENV S23*ENV S33*ENV
M1*ENV M2*ENV M3*ENV N12*ENV N13*ENV N23*ENV;
%GCASCA %INTERACT
CONTRAST 'MAT SS' M1 1, M2 1, M3 1;
CONTRAST 'NONM SS' N12 1, N13 1, N23 1;
CONTRAST 'MAT*ENV' M1*ENV 1 -1, M2*ENV 1 -1, M3*ENV 1 -1;
CONTRAST 'NONM*ENV' N12*ENV 1 -1, N13*ENV 1 -1, N23*ENV 1 -1;
ESTIMATE 'M1' M1 1; ESTIMATE 'M2' M2 1; ESTIMATE 'M3' M3 1;
Estimate 'M4' M1 -1 M2 -1 M3 -1;
ESTIMATE 'N12' N12 1; ESTIMATE 'N13' N13 1; ESTIMATE 'N23' N23 1;
Estimate 'N14' N12 -1 N13 -1;
Estimate 'N24' N12 1 N23 -1;
Estimate 'N34' N13 1 N23 1; RUN;
To validate the code presented in programs 1a and 1b, some of the results generated by the program of Zhang and Kang (1997) are shown below.
GLM procedure
Dependent variable: YIELD
Source            DF   Sum of squares   Mean square   F Value   Pr > F
Model             20   51191.16667      2559.55833    9.69      <.0001
Error             75   19801.82292      264.02431
Corrected total   95   70992.98958
R-square   Coef Var   Root MSE   YIELD Mean
0.721074   2.051592   16.24882   792.0104
Source       DF   Type I SS     Mean square   F Value   Pr > F
ENV          0    0.00000       .             .         .
REP(ENV)     5    3365.67708    673.13542     2.55      0.0347
HYBRID       15   47825.48958   3188.36597    12.08     <.0001
ENV*HYBRID   0    0.00000       .             .         .
Contrast    DF    Contrast SS    Mean square   F Value   Pr > F
GCA         3     30162.39583    10054.13194   38.08     <.0001
SCA         6     14715.59375    2452.59896    9.29      <.0001
REC         6     2947.50000     491.25000     1.86      0.0988
(MAT SS)    (3)   (240.75000)    80.25000      0.30      0.8224
(NONM SS)   (3)   (2706.75000)   902.25000     3.42      0.0216
Parameter   Estimate      Standard error   t Value   Pr > |t|
G1          -6.0729167    2.03110309       -2.99     0.0038
G2          -13.8229167   2.03110309       -6.81     <.0001
G3          20.0104167    2.03110309       9.85      <.0001
G4          -0.1145833    2.03110309       -0.06     0.9552
S12         5.5520833     3.70826994       1.50      0.1385
S13         5.2187500     3.70826994       1.41      0.1635
S23         7.2187500     3.70826994       1.95      0.0553
S14         9.9270833     3.70826994       2.68      0.0091
S24         -10.9062500   3.70826994       -2.94     0.0043
S34         14.7604167    3.70826994       3.98      0.0002
M1          -1.1875000    2.03110309       -0.58     0.5605
M2          1.6875000     2.03110309       0.83      0.4087
M3          0.3125000     2.03110309       0.15      0.8781
M4          -0.8125000    2.03110309       -0.40     0.6903
R12         3.5000000     4.69063167       0.75      0.4579
R13         2.6666667     4.69063167       0.57      0.5714
R14         -10.9166667   4.69063167       -2.33     0.0226
R23         2.2500000     4.69063167       0.48      0.6329
R24         8.0000000     4.69063167       1.71      0.0922
R34         6.1666667     4.69063167       1.31      0.1926
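Two features of this output can be checked with simple arithmetic: the effect of the last parent is obtained from the sum-to-zero restriction (as in the statement ESTIMATE 'G4' G1 -1 G2 -1 G3 -1), and the contrast sums of squares add up to the sum of squares of HYBRID. The Python sketch below is an illustration under those assumptions; the function `gca_code` mirrors the assignment GCA = ((I=N)-(I=P)) + ((J=N)-(J=P)) of the SAS code, while `gca_part` is a hypothetical helper.

```python
P = 4  # number of parental lines


def gca_code(i, j, n, p=P):
    """Value of the regressor G_n for the cross i x j (lines numbered from 1)."""
    return ((i == n) - (i == p)) + ((j == n) - (j == p))


# Estimates of G1 to G3 and M1 to M3 taken from the GLM output above.
g = [-6.0729167, -13.8229167, 20.0104167]
m = [-1.1875, 1.6875, 0.3125]
g_all = g + [-sum(g)]   # the effect of the last parent closes the sum to zero
m4 = -sum(m)


def gca_part(i, j):
    """GCA contribution to the cross i x j built from the P - 1 regressors."""
    return sum(gca_code(i, j, n) * g[n - 1] for n in range(1, P))


# The coding reproduces g_i + g_j for every cell, including those with line P.
coding_ok = all(
    abs(gca_part(i, j) - (g_all[i - 1] + g_all[j - 1])) < 1e-9
    for i in range(1, P + 1)
    for j in range(1, P + 1)
)

# Contrast sums of squares from the GLM output.
ss_gca, ss_sca, ss_rec = 30162.39583, 14715.59375, 2947.5
ss_mat, ss_nonm = 240.75, 2706.75
ss_hybrid = ss_gca + ss_sca + ss_rec

print(round(g_all[3], 7), m4, coding_ok, round(ss_hybrid, 5), ss_mat + ss_nonm)
```

The recovered values agree with G4 and M4 in the output, and the sums of squares of GCA, SCA and REC add up to that of HYBRID, with MAT SS and NONM SS adding up to REC.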
Conclusions
Programs ‘1a’ and ‘1b’ are easy to use and modify to carry out an analysis of variance in a single environment, with the treatment effects subdivided into parents (P), direct crosses (CD), reciprocal crosses (CR), P versus crosses and CD versus CR. They are also useful for comparing treatment means (Tukey, p= 0.01) and for estimating heterosis relative to the mean of the parents and to the better parent when one variable is analyzed.
Zhang and Kang’s (1997) program was designed to analyze ‘m’ variables, but it is more difficult to handle when the numbers of parents and environments differ from 5 and 2, respectively. Because of this restriction, the code had to be modified for P= 4. The code proposed by Zhang and Kang estimates general and specific combining ability, reciprocal and maternal effects, but does not include the Tukey test or the estimation of heterosis.
All three SAS codes run on versions released commercially more than 10 years ago and on the most recent academic trial versions. The program of Zhang and Kang (1997) allowed a reliable validation of the genetic effects estimated with the code proposed in the present study, but the three codes must be used together to carry out a more complete diallel analysis.
Cited literature
Burow, M. D. and Coors, J. G. 1994. DIALLEL: a microcomputer program for the simulation and analysis of diallel crosses. Agron. J. 86(1):154-158.
Christie, B. R. and Shattuck, V. I. 1992. The diallel cross: design, analysis and use for plant breeders. Plant Breeding Reviews. 9(1):9-36.
Gardner, C. O. and Eberhart, S. A. 1966. Analysis and interpretation of the variety cross diallel and related populations. Biometrics. 22(3): 439-452.
González, H. A.; Sahagún, C. J. y Pérez, L. D. J. 2007a. Estudio de ocho líneas de maíz en un experimento dialélico incompleto. Ciencias Agrícolas Informa. 16(1):3-9.
González, H. A.; Pérez, L. D.; Sahagún, C. J.; Norman, M. T. H.; Balbuena, M. A. and Gutiérrez, R. F. 2007b. Análisis de una cruza dialélica completa de líneas endogámicas de maíz. Ciencias Agrícolas Informa. 16(1):10-17.
Griffing, B. 1956a. A generalized treatment of the use of diallel crosses in quantitative inheritance. Heredity. 10(1):31-50.
Griffing, B. 1956b. Concept of general and specific combining ability in relation to diallel crossing systems. Austr. J. Biol. Sci. 9(4):463-491.
Hallauer, A. R. and Miranda, F. O. J. B. 1988. Quantitative genetics in maize breeding. Second edition. Iowa State University Press, Ames, USA. 468 p.
Hayman, B. I. 1954. The theory and analysis of the diallel crosses. Genetics. 39(1):798-809.
Magari, R. and Kang, M. S. 1994. Interactive BASIC program for Griffing’s diallel analysis. Journal of Heredity. 85(4):336.
Martínez, G. A. 1983. Diseño y análisis de los experimentos de cruzas dialélicas. Centro de Estadística y Cálculo. Colegio de Postgraduados. Chapingo, Estado de México. 252 p.
Martínez, G. A. 1991. Análisis de los experimentos dialélicos a través del procedimiento IML del SAS. Comunicaciones en Estadística y Cómputo. 10(2):1-36.
Mastache, L. A. A.; Martínez, G. A.; Castillo, M. A. y González, C. F. V. 1998a. Los mejores predictores lineales e insesgados (MPLI) en experimentos dialélicos parciales sin efectos maternos. Revista Fitotecnia Mexicana. 21(1):49-60.
Mastache, L. A. A.; Martínez, G. A. y Castillo, M. A. 1998b. Los mejores predictores lineales e insesgados (MPLI) en experimentos dialélicos parciales con efectos maternos. Revista Fitotecnia Mexicana. 21(2):171-184.
Mastache, L. A. A.; Martínez, G. A. y Castillo, M. A. 1999a. Los mejores predictores lineales e insesgados (MPLI) en los diseños dos y cuatro de Griffing. Agrociencia. 33(1):81-90.
Mastache, L. A. A.; Martínez, G. A. y Castillo, M. A. 1999b. Los mejores predictores lineales e insesgados (MPLI) en los diseños uno y tres de Griffing. Agrociencia. 33(3):349-359.
Mastache, L. A. A. y Martínez, G. A. 2003. Un algoritmo para el análisis, estimación y predicción en experimentos dialélicos balanceados. Revista Fitotecnia Mexicana. 26(3):191-200.
Montesinos, L. O. A.; Mastache, L. A. A.; Luna, E. I. y Hidalgo, C. J. V. 2007. Mejor predictor lineal e insesgado combinado para aptitud combinatoria general y análisis combinado de los diseños uno y tres de Griffing. Técnica Pecuaria en México. 45(2):131-146.
Padilla, L. A.; González, H. A.; Pérez, L. D. J.; Rubí, A. M.; Gutiérrez, R. F. y Franco, M. J. R. P. 2019a. InfoStat, InfoGen y SAS para contrastes mutuamente ortogonales en experimentos en bloques completos al azar en parcelas subdivididas. Rev. Mex. Cienc. Agríc. 10(6):1417-1431.
Padilla, L. A.; González, H. A.; Pérez, L. D. J.; Rubí, A. M.; Gutiérrez, R. F.; Ramírez, D. J. F.; Franco, M. J. R. P. y Serrato, C. R. 2019b. Programas para SAS e InfoStat para analizar una serie de experimentos en parcelas subdivididas. En: temas selectos en la innovación de las ciencias agropecuarias. Alfaomega Grupo Editor SA. de CV. Primera edición (Salgado y otros, eds.). México, DF. 724 p. ISBN: 9786075384115.
SAS Institute, Inc. 1989. SAS/IML software: Usage and reference. Version 6. First edition. Cary, N. C.
Saavedra, G. C. 2019. Estimación de parámetros genéticos en maíz con dos metodologías usando datos de una cruza dialélica completa. I. Un ambiente. Tesis de Maestro en Fitomejoramiento. Facultad de Ciencias Agrícolas, Universidad Autónoma del Estado de México. Toluca, Estado de México. 96 p.
Singh, D. 1973. Diallel analysis for combining ability over several environments-II. Indian Journal of Genetics and Plant Breeding. 33(1):469-481.
Schaffer, H. E. and Usanis, R. A. 1989. General least squares analysis of diallel experiments: A computer program. Genetics Dep. Res. Rep. 1. North Carolina State University, Raleigh. 61 p.
Sprague, G. F. and Tatum, L. A. 1942. General vs specific combining ability in single crosses of corn. J. Amer. Soc. Agron. 34(1):923-932.
Zhang, Y. and Kang, M. S. 1997. DIALLEL-SAS: A SAS program for Griffing’s Diallel Analyses. Agronomy Journal. 89(2):176-182.
Zhang, Y.; Kang, M. S. and Lamkey, K. R. 2005. DIALLEL-SAS05: A comprehensive program for Griffing’s and Gardner-Eberhart analysis. Agron. J. 97(4):1097-1106.