Newest '1000genomes' Questions - Bioinformatics Stack Exchange

4 votes

1 answer

102 views

tabix errors when accessing 1000 Genomes data: "[E::bgzf_read] bgzf_read_block error -1 after 50219 of 52392 bytes" and Could not load .tbi/.csi index

I am trying to access 1000 Genomes (1KG) data using tabix as per the 1KG tabix documentation "How do I get a genomic region sub-section of your files?" ...

N Brouwer

141

asked Jul 2 at 15:03

1 vote

1 answer

278 views

LD clump GRCh38 GWAS results

The vignette of the R package ieugwasr describes a PLINK-based wrapper function for LD clumping GWAS data using the 1000 Genomes ...

Joonatan

11

asked Jun 30, 2023 at 20:07

3 votes

2 answers

95 views

Should genotype imputation be ancestry specific?

I'm wondering if imputation, specifically Beagle, needs a reference panel that matches the sample's ancestry group. For example, Beagle documentation suggests the 1000 Genomes Project phase 3 ...

BigMistake

568

asked May 23, 2023 at 20:08

3 votes

2 answers

344 views

Where do I get a large reference VCF?

I would like to download a large .vcf file containing many (hundreds or thousands) of samples. Ideally, I would download different population-specific .vcf files, but the ability to sort/filter by ...

BigMistake

568

asked May 3, 2023 at 1:00

2 votes

3 answers

113 views

Interpreting short indel calls in 1000 Genomes Project VCFs

Consider the following short indel polymorphism rs59679400 on chr7. ...

Daniel Standage

5,090

asked Dec 29, 2022 at 19:20

1 vote

1 answer

244 views

Where can I download 30x 1000 genomes cram files?

From the preprint published by the 1000 Genomes Project (https://www.biorxiv.org/content/10.1101/2021.02.06.430068v1.full), I think the 30x data is for WGS. Can anyone confirm for me if the following file ...

Shafayet Rahat

295

asked Sep 9, 2022 at 10:53

2 votes

0 answers

104 views

SNPs with high population differentiation from 1k Genome dataset

I am trying to reproduce the results from this paper "Human genomic regions with exceptionally high levels of population differentiation identified from 911 whole-genome sequences". ...

koolaids

21

asked Aug 24, 2022 at 20:23

3 votes

2 answers

64 views

Choosing an imputation panel for SNP-Chip data?

I have SNP-Chip data for about 1,000 samples that I'd like to impute (for the purpose of having more rsIDs to match against GWAS data). However, I don't know the ancestry of each sample / the ...

Dan Bolser

470

asked Feb 14, 2022 at 11:49

1 vote

1 answer

361 views

Interpreting imputation result from GLIMPSE

I'm following this GLIMPSE tutorial to learn the tool. I was expecting some extra SNPs coming from the 1000 Genomes reference in the resulting .vcf file. Though I understand the phasing in the output ...

Shafayet Rahat

295

asked Jan 1, 2022 at 11:03

1 vote

3 answers

976 views

Masking sites in a vcf file

I need to mask all sites in a vcf file flagged by the 1000 Genomes Project as being unfit for population genetic analyses. The sites for all chromosomes are available at: 1000Genomes masked sites. From ...

John

115

asked May 28, 2021 at 13:45

3 votes

1 answer

68 views

dbSNP frequency anomalies

Sometimes dbSNP reports very different allele frequencies for different large-scale genome projects, e.g. between 1000 Genomes and GnomAD: rs11822440 1000Genomes A=0.4629 C=0.5371, GnomAD A=0.99997 ...

afaulconbridge

131

asked Oct 13, 2020 at 13:30

4 votes

1 answer

824 views

Difference between genome assembly and genome sequence alignment to a reference to find structural variants

I'm trying to determine what the difference and benefits of genome assembly and genome sequence alignments are when trying to identify structural variants or transposons in populations. I've been ...

M4r1n4

41

asked Jul 26, 2020 at 8:26

1 vote

0 answers

19 views

Difference between "trans-ethnic" and "cross-ancestry"

A quick terminology question today. I see the terms "cross-ancestry" and "trans-ethnic" used seemingly interchangeably in literature. Is there any real difference between those two ...

jesseaam

111

asked Jul 14, 2020 at 15:25

-2 votes

2 answers

334 views

Removing common variants in the 1000 genomes database from .vcf [closed]

I have 15 .vcf files. I need to remove 'common variants in the 1000 genomes database' appearing in at least 0.5% of the population. Do you know where I might start? Thank you so much

Zizogolu

2,160

asked Mar 12, 2020 at 13:30

0 votes

1 answer

90 views

Obtaining HGDP project data in fasta format

I need to obtain sample data from modern humans in fasta format. I just need some megabytes of data from every individual. I actually use a script that obtains the cram file from here (ftp.1000genomes....

juanjo75es

359

asked Dec 20, 2019 at 18:53
