<multiple phenotype>
plink --file mydata --pheno pheno2.txt --pheno-name bmi --assoc
will select the second phenotype, labelled "bmi", for analysis.
Finally, if the phenotype file contains more than one phenotype, the basic association tests can be run on all phenotypes sequentially, with the output for each phenotype sent to a different file. For example, if bigpheno.raw contains 10,000 phenotypes, then
plink --bfile mydata --assoc --pheno bigpheno.raw --all-pheno
will loop over all of these, testing each phenotype in turn for association with each SNP, which generates a lot of output. In this case you might want to use the --pfilter option to report only results with a p-value below a given threshold, e.g. --pfilter 1e-3.
The --merge option can also be used with binary PED files, either as the first input fileset or as the output, but not as the second fileset (the one named after --merge), i.e.
plink --bfile data1 --merge data2.ped data2.map --make-bed --out merge
For example, suppose we had 4 PED/MAP filesets (labelled fA.* through fD.*) and 4 binary filesets (labelled fE.* through fH.*). Then using the command
plink --file fA --merge-list allfiles.txt --make-bed --out mynewdata
would create the binary fileset
mynewdata.bed
mynewdata.bim
mynewdata.fam
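The contents of allfiles.txt are not shown above. As a rough sketch, assuming the usual --merge-list convention of one fileset per line (two file names for a PED/MAP fileset, three for a BED/BIM/FAM fileset), and assuming fA is left out of the list because it is already supplied through --file, the list could be generated like this. The helper name merge_list_lines is hypothetical.

```python
def merge_list_lines(text_sets, binary_sets):
    """Return one merge-list line per fileset (assumed layout, see above)."""
    # PED/MAP filesets: two file names per line.
    lines = [f"{name}.ped {name}.map" for name in text_sets]
    # Binary filesets: three file names per line.
    lines += [f"{name}.bed {name}.bim {name}.fam" for name in binary_sets]
    return lines


# fA is given on the command line with --file, so only fB..fH are listed.
lines = merge_list_lines(["fB", "fC", "fD"], ["fE", "fF", "fG", "fH"])
merge_list_text = "\n".join(lines) + "\n"

# Save this text as allfiles.txt before running the plink command.
print(merge_list_text, end="")
```

Writing the list from a script is only a convenience; a hand-edited text file with the same lines would serve equally well.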
To analyse only a specific chromosome use
plink --file data --chr 6
Based on a range of SNPs (--from and --to)
To select a specific range of markers (that must all fall on the same chromosome) use, for example:
plink --bfile mydata --from rs273744 --to rs89883
To extract only a subset of SNPs, it is possible to specify a list of required SNPs and make a new file, or perform an analysis on this subset, by using the command
plink --file data --extract mysnps.txt
<range file>
Alternatively, the --range flag modifies the behavior of --extract and --exclude. If --range is added, PLINK expects a list of chromosomal ranges, one per line, instead of a list of SNPs.
plink --file data --extract myrange.txt --range
All SNPs within those ranges will then be extracted or excluded. myrange.txt should contain one range per line, with the following whitespace-separated fields:
CHR Chromosome code (1-22, X, Y, XY, MT, 0)
BP1 Start of range, physical position in base units
BP2 End of range, as above
LABEL Name of range/gene
For example,
2 30000000 35000000 R1
2 60000000 62000000 R2
X 10000000 20000000 R3
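To make the range semantics concrete, here is a minimal sketch that parses the example range file and reports which ranges contain a given SNP position. It is an illustration only, not PLINK's implementation. The function names are hypothetical, and treating both range endpoints as inclusive is an assumption.

```python
RANGE_TEXT = """\
2 30000000 35000000 R1
2 60000000 62000000 R2
X 10000000 20000000 R3
"""


def parse_ranges(text):
    """Parse 'CHR BP1 BP2 LABEL' lines into (chrom, bp1, bp2, label) tuples."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        chrom, bp1, bp2, label = fields[:4]
        ranges.append((chrom, int(bp1), int(bp2), label))
    return ranges


def matching_labels(ranges, chrom, bp):
    """Return labels of ranges containing position bp on chromosome chrom.

    Endpoints are treated as inclusive (an assumption of this sketch).
    """
    return [label for c, bp1, bp2, label in ranges
            if c == chrom and bp1 <= bp <= bp2]


ranges = parse_ranges(RANGE_TEXT)
print(matching_labels(ranges, "2", 31000000))  # ['R1']
print(matching_labels(ranges, "X", 5000000))   # []
```

With --extract, a SNP whose position matches at least one range would be kept; with --exclude, it would be dropped.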
VIF: variance inflation factor
A VIF of 10 is often taken to indicate near-collinearity problems in standard multiple regression analyses.
A VIF of 1 would imply that the SNP is completely independent of all other SNPs. In practice, a threshold between 1.5 and 2 should probably be used.
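To put these values in perspective, the sketch below uses the standard definition VIF = 1 / (1 - R^2), where R^2 is the multiple correlation coefficient obtained by regressing one SNP on the others. The function names are hypothetical, and the code only converts between the two quantities; it does not reproduce any pruning procedure.

```python
def vif_from_r2(r2):
    """Variance inflation factor for a given multiple R^2 (0 <= r2 < 1)."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must satisfy 0 <= r2 < 1")
    return 1.0 / (1.0 - r2)


def r2_from_vif(vif):
    """Multiple R^2 implied by a given VIF (vif >= 1)."""
    if vif < 1.0:
        raise ValueError("vif must be >= 1")
    return 1.0 - 1.0 / vif


# R^2 implied by the thresholds mentioned in the text.
for threshold in (1.5, 2.0, 10.0):
    print(threshold, round(r2_from_vif(threshold), 3))
```

A VIF of 10 corresponds to an R^2 of 0.9, a VIF of 2 to an R^2 of 0.5, and a VIF of 1.5 to an R^2 of about 0.33, which shows how much stricter the suggested range of 1.5 to 2 is than the conventional regression cutoff.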