$\begingroup$

This question has also been asked on Biostars

Hi, I am very new to this area and am taking a bioinformatics class. For an independent project assignment, I need to run a GWAS. I am using the bash terminal. I downloaded all the FASTQ files I need, then:

  • trimmed the files,
  • converted them to SAM/BAM,
  • then generated a VCF,
  • and then converted that to PLINK bed/bim/fam files.

However, when I tried to run the GWAS in PLINK, I realized I don't have phenotype data. There are supposed to be two phenotypes.

The FASTQ files fall into two groups (phenotypes), each containing 29 samples. Let's call them group 1 and group 2.

For each group, I converted every FASTQ to SAM and then BAM, then merged the 29 BAMs into one BAM. Then I combined the two merged BAMs (one per group) into a single vcf.gz.
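Merging BAMs only keeps samples apart if each read carries a read group whose `SM` tag names its sample, since variant callers generally use that tag to decide which reads belong to which sample. The sketch below shows one way a per-sample read group could be attached at alignment time with `bwa mem -R`. It is an illustration under assumptions, not the commands actually run here: the aligner, the sample ID `g1s01`, the reference `ref.fa`, and the FASTQ file name are all hypothetical placeholders.

```python
def read_group(sample_id):
    """Build a read-group header line for `bwa mem -R`.

    bwa converts the literal two-character sequence backslash-t into a TAB,
    so the backslashes are kept as text here rather than real tabs.
    """
    return f"@RG\\tID:{sample_id}\\tSM:{sample_id}"


def bwa_mem_command(sample_id, reference, fastq_files):
    """Return the argument list for aligning one sample (nothing is executed)."""
    return ["bwa", "mem", "-R", read_group(sample_id), reference, *fastq_files]


# Hypothetical sample ID and file names, for illustration only.
cmd = bwa_mem_command("g1s01", "ref.fa", ["g1s01.trimmed.fastq.gz"])
print(" ".join(cmd))
```

With one distinct `SM` value per sample, the resulting VCF would be expected to have one genotype column per sample (58 here) rather than one per group.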

Problem: There is no phenotype data in the resulting PLINK files.

Question: Which step went wrong? Or what should I do to incorporate the phenotype data?

It doesn't have to be perfect (for example, the QC steps), and I cannot follow very complicated code. I am looking for a simple output such as a Manhattan plot, or an alternative pipeline.
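For context, PLINK 1.9 can read case/control phenotypes from a separate text file passed with `--pheno`, with one row per sample: family ID, individual ID, and a phenotype coded 1 or 2. Below is a minimal sketch of building such a file. It assumes each sample is its own column in the VCF and that family ID equals individual ID (as PLINK's `--double-id` produces when importing a VCF). The sample IDs are hypothetical placeholders, and which group is coded 1 or 2 is an arbitrary choice here.

```python
def pheno_rows(groups):
    """Return PLINK phenotype rows as 'FID IID PHENO' strings.

    `groups` maps a phenotype code (1 or 2, PLINK 1.9 case/control coding)
    to the list of sample IDs in that group. FID is set equal to IID,
    which is an assumption about how the VCF was imported.
    """
    rows = []
    for code in sorted(groups):
        if code not in (1, 2):
            raise ValueError(f"phenotype code must be 1 or 2, got {code!r}")
        for sample_id in groups[code]:
            rows.append(f"{sample_id} {sample_id} {code}")
    return rows


def write_pheno(rows, path):
    """Write the rows to a whitespace-delimited text file, one sample per line."""
    with open(path, "w") as handle:
        handle.write("\n".join(rows) + "\n")


# Hypothetical IDs: 29 samples per group, named without underscores.
groups = {
    1: [f"g1s{i:02d}" for i in range(1, 30)],
    2: [f"g2s{i:02d}" for i in range(1, 30)],
}
rows = pheno_rows(groups)
print(len(rows), "samples; first row:", rows[0])
```

After `write_pheno(rows, "pheno.txt")`, the file could be supplied to PLINK with something like `plink --bfile mydata --pheno pheno.txt --assoc --allow-no-sex`, where `mydata` is a placeholder prefix for the bed/bim/fam files.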

$\endgroup$
  • $\begingroup$ Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking. $\endgroup$
    – Community Bot
    Commented Nov 24, 2022 at 20:34
  • $\begingroup$ Please show the command lines you have run. You've asked a question about what step was wrong, but that's difficult to work out without knowing precise details about what you did. $\endgroup$
    – gringer
    Commented Nov 25, 2022 at 23:23
  • $\begingroup$ Do you have phenotypic data associated with the samples? As noted in the Biostars answer, you might need to not combine BAMs such that the relation of each sample with phenotype is maintained. $\endgroup$
    Commented Nov 26, 2022 at 4:19
