Questions tagged [1000genomes]

The tag has no usage guidance.

4 votes
1 answer
102 views

tabix errors when accessing 1000 Genomes data: "[E::bgzf_read] bgzf_read_block error -1 after 50219 of 52392 bytes" and Could not load .tbi/.csi index

I am trying to access 1000 Genomes (1KG) data using tabix as per the 1KG tabix documentation "How do I get a genomic region sub-section of your files?" ...
N Brouwer
  • 141
1 vote
1 answer
278 views

LD clump GRCh38 GWAS results

The vignette of the R package ieugwasr describes a PLINK-based wrapper function for LD clumping GWAS data using the 1000 Genomes ...
Joonatan
3 votes
2 answers
95 views

Should genotype imputation be ancestry specific?

I'm wondering if imputation, specifically Beagle, needs a reference panel that matches the sample's ancestry group. For example, Beagle documentation suggests the 1000 Genomes Project phase 3 ...
BigMistake
3 votes
2 answers
344 views

Where do I get a large reference VCF?

I would like to download a large .vcf file containing many (hundreds or thousands) of samples. Ideally, I would download different population-specific .vcf files, but the ability to sort/filter by ...
BigMistake
2 votes
3 answers
113 views

Interpreting short indel calls in 1000 Genomes Project VCFs

Consider the following short indel polymorphism rs59679400 on chr7. ...
Daniel Standage
1 vote
1 answer
244 views

Where can I download 30x 1000 genomes cram files?

From the preprint published by the 1000 Genomes Project (https://www.biorxiv.org/content/10.1101/2021.02.06.430068v1.full), I think the 30x data is for WGS. Can anyone confirm for me if the following file ...
Shafayet Rahat
2 votes
0 answers
104 views

SNPs with high population differentiation from 1k Genome dataset

I am trying to reproduce the results from this paper "Human genomic regions with exceptionally high levels of population differentiation identified from 911 whole-genome sequences". ...
koolaids
3 votes
2 answers
64 views

Choosing an imputation panel for SNP-Chip data?

I have SNP-Chip data for about 1,000 samples that I'd like to impute (for the purpose of having more rsIDs to match against GWAS data). However, I don't know the ancestry of each sample / the ...
Dan Bolser
1 vote
1 answer
361 views

Interpreting imputation result from GLIMPSE

I'm following this GLIMPSE tutorial to learn the tool. I was expecting some extra SNPs from the 1000 Genomes reference in the resulting .vcf file. Though I understand the phasing in the output ...
Shafayet Rahat
1 vote
3 answers
976 views

Masking sites in a vcf file

I need to mask all sites in a vcf file flagged by the 1000 Genomes Project as being unfit for population genetic analyses. The sites for all chromosomes are available at: 1000Genomes masked sites From ...
John
  • 115
3 votes
1 answer
68 views

dbSNP frequency anomalies

Sometimes dbSNP reports very different allele frequencies for different large-scale genome projects, e.g. between 1000 Genomes and GnomAD. rs11822440: 1000Genomes A=0.4629 C=0.5371; GnomAD A=0.99997 ...
afaulconbridge
4 votes
1 answer
824 views

Difference between genome assembly and genome sequence alignment to a reference to find structural variants

I'm trying to determine what the difference and benefits of genome assembly and genome sequence alignments are when trying to identify structural variants or transposons in populations. I've been ...
M4r1n4
  • 41
1 vote
0 answers
19 views

Difference between "trans-ethnic" and "cross-ancestry"

A quick terminology question today. I see the terms "cross-ancestry" and "trans-ethnic" used seemingly interchangeably in literature. Is there any real difference between those two ...
jesseaam
  • 111
-2 votes
2 answers
334 views

Removing common variants in the 1000 genomes database from .vcf [closed]

I have 15 .vcf files. I need to remove "common variants in the 1000 Genomes database", i.e. those appearing in at least 0.5% of the population. Do you know where I might start? Thank you so much
Zizogolu
  • 2,160
0 votes
1 answer
90 views

Obtaining HGDP project data in fasta format

I need to obtain sample data from modern humans in fasta format. I just need a few megabytes of data from every individual. I currently use a script that obtains the cram file from here (ftp.1000genomes....
juanjo75es