This question has also been asked on Biostars
Hi, I am very new to this area, and I am taking a bioinformatics class. For an independent project assignment, I need to do a GWAS. I am using the bash terminal. I downloaded all the FASTQ files I need, then:
- trimmed the files,
- converted them to SAM/BAM,
- called variants to produce a VCF,
- and then converted the VCF to PLINK bed/bim/fam files.
However, when I tried to run the GWAS in PLINK, I realized I don't have phenotype data. The data set is supposed to have two phenotypes.
There are two groups/phenotypes of fastq files, each containing 29 samples. Let's say they are group 1 and 2.
For each group, I converted every FASTQ file to SAM, then to BAM, and then merged the 29 BAM files into one BAM file. Then I combined the two merged BAM files (one per group) into a single vcf.gz.
Problem: There is no phenotype data in the resulting PLINK files.
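To make clear what I think is missing: as far as I understand, PLINK 1.9 reads the phenotype either from column 6 of the .fam file or from a separate tab-separated file (family ID, individual ID, phenotype) passed with `--pheno`, with 1 = control and 2 = case. Below is a minimal sketch of how such a file could be built for my two groups. The sample IDs (`g1_s01` ... `g2_s29`), the file name `pheno.txt`, and the `mydata` prefix are hypothetical, and the sketch assumes each of the 58 samples still has its own ID in the .fam file, which may not be true after merging the BAM files.

```shell
#!/usr/bin/env bash
# Sketch: build a PLINK phenotype file for two groups of 29 samples.
# Sample IDs below are hypothetical placeholders, not my real IDs.

: > pheno.txt                      # start with an empty file
for group in 1 2; do               # group 1 -> phenotype 1, group 2 -> phenotype 2
  for i in $(seq 1 29); do
    id=$(printf 'g%d_s%02d' "$group" "$i")
    # Columns: family ID, individual ID, phenotype (FID = IID here)
    printf '%s\t%s\t%d\n' "$id" "$id" "$group" >> pheno.txt
  done
done

# Assumed follow-up step (not run here; "mydata" is a hypothetical prefix):
# plink --bfile mydata --pheno pheno.txt --assoc --allow-no-sex --out gwas
```

The IDs in the first two columns would have to match the first two columns of the .fam file exactly for PLINK to pick up the phenotypes.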
Questions: Which step went wrong? Or what should I do to incorporate the phenotype data?
It doesn't have to be perfect (for example, in the QC steps), and I cannot follow very complicated code. I am looking for a simple output such as a Manhattan plot, or an alternative pipeline.