
PPH - an efficient program for deducing haplotypes from genotypes and
determining whether there are deduced haplotypes that fit a tree model
(i.e., a perfect phylogeny, a coalescent). We now have additional programs
for this problem, so we more recently refer to this program as GPPH to
distinguish it from other programs for the PPH problem.
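
As a rough illustration of what "fitting a tree model" means, the sketch
below applies the standard four-gamete test to a set of binary haplotypes:
an unrooted perfect phylogeny exists if and only if no pair of sites shows
all four combinations 00, 01, 10 and 11. This is not the algorithm GPPH
uses (GPPH works from genotypes, where the haplotypes are not yet known);
the function name and the encoding of haplotypes as strings of '0' and '1'
are assumptions made only for this example.

```python
from itertools import combinations


def fits_perfect_phylogeny(haplotypes):
    """Four-gamete test on binary haplotypes given as strings of '0'/'1'.

    Returns True if no pair of sites exhibits all four gametes
    (00, 01, 10, 11), i.e. the haplotypes fit an unrooted perfect phylogeny.
    """
    if not haplotypes:
        return True
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct state pairs observed at sites i and j.
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            return False
    return True


if __name__ == "__main__":
    print(fits_perfect_phylogeny(["000", "100", "110", "111"]))
    print(fits_perfect_phylogeny(["00", "01", "10", "11"]))
```

The first call involves haplotypes that can be placed on a tree; the second
contains all four gametes at its two sites and so cannot.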

This package was written by Ren-Hua Chung at U.C. Davis Computer Science under the
direction of Dan Gusfield.
Copyright (C) 2002 R.H. Chung and D. Gusfield

We make no warranties or guarantees, assume no liability and grant no 
rights for commercial use. 

To cite the method, please use: 

"Haplotyping as Perfect Phylogeny: Conceptual
Framework and Efficient Solutions" D. Gusfield 
In Proceedings of RECOMB, Sixth Annual Conference on Research in 
Computational Molecular Biology, April 2002

To cite the program, please use:

"PPH - A program for deducing haplotypes that fit a perfect phylogeny"
R.H. Chung and D. Gusfield,
UCD Computer Science Technical Report CSE-2002-27

and 

"Perfect phylogeny haplotyper: haplotype inferral using a tree model"
Bioinformatics Vol. 19, no. 6, 2003, p. 780-781

For the first two papers, and other papers on haplotyping, see
wwwcsif.cs.ucdavis.edu/~gusfield/paperlist.html

This package contains these files:

pph.out - the main program
gpph - another copy of the main program, using its more current name.
check.out - checks consistency
extract1.pl - extracts haplotype to genotype of format 1
extract2.pl - extracts haplotype to genotype of format 2
extract3.pl - extracts haplotype to genotype of format 3
extract4.pl - extracts haplotype to genotype of format 4
extract5.pl - extracts haplotype to genotype of format 5
extract6.pl - extracts haplotype to genotype of format 6
extract7.pl - extracts haplotype to genotype of format 7
change01.pl - converts the file from the Hudson program
FORMAT.txt - describes the formats of the input files
README.txt - this file
f1 - an example for format 1
f2 - an example for format 2
f3 - an example for format 3
f4 - an example for format 4
f5 - an example for format 5
f6 - an example for format 6
f7 - an example for format 7

The program is mostly written in C++, but it also uses several small Perl
programs, so you will need to have a Perl interpreter installed.
Perl is generally standard on Unix and Linux systems, and is easily obtained
for Mac OS X and Windows.

*************
Installation:
*************

1. Uncompress the tar file.
2. Enter the sub-directory "pph".
3. There should be an executable file "pph.out".

You will need both a C++ compiler and a Perl interpreter installed. Both
are typically already installed on Unix systems.

*********************************
Step-by-step use of this program:
*********************************

1. Type "pph.out" or "gpph" to start the program.

2. Please input the filename:
   Enter the name of the file that contains the genotype or haplotype information.
   
3. Please input the number of the file format: ( Please refer to the file "FORMAT.txt" ):
   This program can read seven different file formats. Please refer to the file
   "FORMAT.txt" for more information about the formats. The input value should be from 1 to 7.
   
4. Please input the file name that holds the ancestral vector, i.e., the binary
   vector specifying the character states at the root of the tree.

5. There are two special cases of particular importance. If the ancestral states are all-0,
   just enter "d" for the default case; if you do not know the ancestral states, enter
   "m" (for the majority vector). In that case, the program will determine whether there is an
   unrooted tree that could evolve the given genotypes.
   
6. The program also checks the consistency of the output. That is, it checks
   that each output haplotype pair actually corresponds to the correct input
   genotype vector. The program reports the result of that verification step.
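
The ideas behind the majority vector and the consistency check can be
sketched as follows. This is a minimal illustration, not the code used by
pph.out or check.out. It assumes the common encoding in which a genotype is
a string over '0', '1' and '2', with '2' marking a heterozygous site; the
actual input formats are described in FORMAT.txt and may differ. The
function names, and the tie-breaking rule in "majority_vector", are
hypothetical.

```python
def is_consistent(hap_a, hap_b, genotype):
    """Return True if the haplotype pair (hap_a, hap_b) explains the genotype.

    Assumed encoding: haplotypes use '0'/'1'; genotypes use '0'/'1' for
    homozygous sites and '2' for heterozygous sites.
    """
    if not (len(hap_a) == len(hap_b) == len(genotype)):
        return False
    for a, b, g in zip(hap_a, hap_b, genotype):
        if a not in "01" or b not in "01":
            return False          # haplotypes must be fully resolved
        if g == "2":
            if a == b:            # heterozygous site needs two different states
                return False
        elif not (a == b == g):   # homozygous site: both copies carry state g
            return False
    return True


def majority_vector(genotypes):
    """Return the per-site majority state over all chromosome copies.

    A '2' contributes one 0 and one 1, so it never changes the majority.
    Ties are broken toward '0', which is an arbitrary choice for this sketch.
    """
    n_sites = len(genotypes[0])
    ones_per_state = {"0": 0, "1": 2, "2": 1}
    vector = []
    for i in range(n_sites):
        ones = sum(ones_per_state[g[i]] for g in genotypes)
        zeros = 2 * len(genotypes) - ones
        vector.append("1" if ones > zeros else "0")
    return "".join(vector)


if __name__ == "__main__":
    print(is_consistent("010", "011", "012"))
    print(is_consistent("010", "010", "012"))
    print(majority_vector(["012", "110", "112"]))
```

In the first call the two haplotypes differ only at the heterozygous site,
so the pair is accepted; in the second call they are identical there, so it
is rejected.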

Please report any bugs or suggestions to Ren-Hua Chung at rchung@ucdavis.edu. Thanks.
