Inserted missing spaces to make text easier to read.
Peter Erwin

OK, I realise this is a long question, and I get that it's a long shot, but I thought I'd give it a try anyway. Basically I'm trying to replicate the methodology used in the paper below. It involves taking a table of stars with various parameters and applying cuts until you're left with a sub-sample of (bulge) stars. The problem I'm having is that the authors end up with a different number of stars to the number I get after applying the cuts: I get 21389 stars whilst they get 21052. I think I've used the exact same methodology as them, but it's hard to tell just from the paper. I was wondering if anybody could see anything obviously wrong in my Python code. I'm pretty much a complete newbie at Python, to be honest. The paper is below, and the relevant sections are from Section 2 (Data and sample) up to Section 2.2.
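To show the kind of cut-based sub-sampling I mean, here is a minimal toy sketch using plain NumPy boolean masks on a structured array. The values are made up; the column names simply mirror the APOGEE-style fields used in my code below:

```python
import numpy as np

# Toy mini-catalogue with APOGEE-style columns (made-up values)
stars = np.array(
    [(0, 4500.0, 2.1, 120.0),
     (0, 6500.0, 1.9, 200.0),   # too hot -> cut
     (16, 4800.0, 2.5, 150.0),  # bad ASPCAPFLAG -> cut
     (0, 5200.0, 3.9, 90.0),    # logg too high -> cut
     (0, 3900.0, 1.2, 50.0)],   # SNR too low -> cut
    dtype=[('ASPCAPFLAG', 'i4'), ('TEFF', 'f8'),
           ('LOGG', 'f8'), ('SNR', 'f8')]
)

# Combine all cuts into one boolean mask and apply it once
mask = (
    (stars['ASPCAPFLAG'] == 0)
    & (stars['TEFF'] >= 3000) & (stars['TEFF'] <= 6000)
    & (stars['LOGG'] < 3.6)
    & (stars['SNR'] > 70)
)
sub_sample = stars[mask]
print(len(sub_sample))  # -> 1
```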

added 408 characters in body
James K


import astropy
from astropy.table import Table
from astropy.io import fits
import numpy as np
import matplotlib.pyplot as plt

# Define the HDUList object
hdulist = fits.open('DR17_McMillan_astroNN.fits')

# Access the primary data table
data_table = hdulist[1].data

# Close the FITS file to release memory
hdulist.close()

# Filter for stars with ASPCAPFLAG = 0
filtered_data = data_table[data_table['ASPCAPFLAG'] == 0]

# Filter for stars between 3000K and 6000K
teff_filtered_data = filtered_data[(filtered_data['TEFF'] >= 3000) & (filtered_data['TEFF'] <= 6000)] 

# Define the conditions for filtering
logg_condition = teff_filtered_data['LOGG'] < 3.6
snr_condition = teff_filtered_data['SNR'] > 70

# Apply the conditions to filter the data
logg_snr_filtered_data = teff_filtered_data[logg_condition & snr_condition]


# The parent sample is thus defined. Now the bulge sample must be created.

# Calculate X, Y, Z, and Rgc
xsun = 8.e3
weighted_dist = logg_snr_filtered_data['weighted_dist']  # Assuming 'weighted_dist' is the weighted distance column
GLON = logg_snr_filtered_data['GLON']  # Assuming 'GLON' is the Galactic longitude column
GLAT = logg_snr_filtered_data['GLAT']  # Assuming 'GLAT' is the Galactic latitude column

axx = (xsun - weighted_dist * np.cos(GLON * np.pi / 180.) * np.cos(GLAT * np.pi / 180.)) / 1.e3

ayy = weighted_dist * np.sin(GLON * np.pi / 180.) * np.cos(GLAT * np.pi / 180.) / 1.e3

azz = weighted_dist * np.sin(GLAT * np.pi / 180.) / 1.e3

aRgc = np.sqrt(axx**2. + ayy**2. + azz**2.)

# Define the conditions for masking using Rgc
rgc_condition = aRgc < 4  # Adjust this threshold as needed

# Apply the condition to filter the data based on Rgc
masked_data = logg_snr_filtered_data[rgc_condition]

# Calculate fractional distance uncertainty
fractional_dist_uncertainty = masked_data['weighted_dist_error'] / masked_data['weighted_dist']

# Define condition for fractional distance uncertainty
fractional_dist_uncertainty_condition = fractional_dist_uncertainty < 0.2

# Apply the condition to filter the data
final_filtered_data = masked_data[fractional_dist_uncertainty_condition]

# Define columns related to compositions
composition_columns = ['MG_FE', 'C_FE', 'N_FE', 'O_FE', 'SI_FE']  # Columns for Mg, C, N, O, and Si compositions

# Define conditions for masking out stars with undetermined compositions
undetermined_composition_condition = (
    ~np.isnan(masked_data['MG_FE']) &
    ~np.isnan(masked_data['C_FE']) &
    ~np.isnan(masked_data['N_FE']) &
    ~np.isnan(masked_data['O_FE']) &
    ~np.isnan(masked_data['SI_FE'])
)

# Combine the conditions using logical AND
final_mask = undetermined_composition_condition

# Apply the final mask to the filtered data
final_filtered_data = masked_data[final_mask]

# Load the VAC catalogue data
vac_catalogue_data = fits.open('VAC_GC_DR17_synspec_rev1-v1_1.fits.fits')

# Access the data table from the HDUList object
vac_data_table = vac_catalogue_data[1].data

# Extract the column containing IDs from the VAC catalogue
vac_ids = vac_data_table['APOGEE_ID']

# Check which IDs from the filtered data are present in the VAC catalogue
mask_vac = np.isin(final_filtered_data['APOGEE_ID'], vac_ids)

# Remove stars from the filtered data that are present in the VAC catalogue
final_filtered_data_cleaned = final_filtered_data[~mask_vac]

# Count the number of stars in the final filtered data
num_stars = len(final_filtered_data_cleaned)
print("Number of stars:", num_stars)
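One way I could narrow down where my 21389 diverges from their 21052 is to print the surviving sample size after every cut, and compare against any intermediate numbers the paper quotes. A minimal sketch of that pattern, using a made-up helper `report` and a toy stand-in table rather than the real catalogue:

```python
import numpy as np

def report(label, table):
    """Print the sample size after a cut and pass the table through."""
    print(f"{label}: {len(table)} stars")
    return table

# Toy stand-in for the real catalogue (made-up values)
data = np.array(
    [(0, 4500.0, 2.1, 120.0, 3.2),
     (0, 5500.0, 3.0, 80.0, 5.1),
     (0, 4000.0, 1.5, 75.0, 2.0)],
    dtype=[('ASPCAPFLAG', 'i4'), ('TEFF', 'f8'), ('LOGG', 'f8'),
           ('SNR', 'f8'), ('RGC', 'f8')]
)

data = report("start", data)
data = report("ASPCAPFLAG == 0", data[data['ASPCAPFLAG'] == 0])
data = report("3000 <= Teff <= 6000",
              data[(data['TEFF'] >= 3000) & (data['TEFF'] <= 6000)])
data = report("logg < 3.6 and SNR > 70",
              data[(data['LOGG'] < 3.6) & (data['SNR'] > 70)])
data = report("Rgc < 4 kpc", data[data['RGC'] < 4])
```

The idea is just that the first cut whose running total disagrees with the paper is where the methodologies differ.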
