I need to convert my pandas DataFrame to a named tuple of NumPy arrays, one array per column. My code is:

import pandas as pd

data = {'Class':[10., 11., 9., 8.],
        'Age':[27., 24., 22., 32.],
        'Mark':[76., 56., 89., 45.],
        'Fees':[1000., 1200., 590., 605.]}
df = pd.DataFrame(data)
list(df.itertuples(name='data', index=False))

Output:

[data(Class=10.0, Age=27.0, Mark=76.0, Fees=1000.0), data(Class=11.0, Age=24.0, Mark=56.0, Fees=1200.0), data(Class=9.0, Age=22.0, Mark=89.0, Fees=590.0), data(Class=8.0, Age=32.0, Mark=45.0, Fees=605.0)]

But I need the output to look like this:

data(
     Class=array([10., 11., 9., 8.], dtype=float32),
     Age=array([27., 24., 22., 32.], dtype=float32),
     Mark=array([76., 56., 89., 45.], dtype=float32),
     Fees=array([1000., 1200., 590., 605.], dtype=float32))

Please help me with this.

2 Answers

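You can transpose the DataFrame's values so that each row holds one column, then unpack those rows into a namedtuple: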

from collections import namedtuple

Data = namedtuple('data', ['Class', 'Age', 'Mark', 'Fees'])
out = Data(*df.values.T)

Output:

>>> out
data(Class=array([10., 11.,  9.,  8.]),
     Age=array([27., 24., 22., 32.]),
     Mark=array([76., 56., 89., 45.]),
     Fees=array([1000., 1200.,  590.,  605.]))
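
Note that the arrays above are float64, because that is the dtype pandas infers for Python floats, while the question asks for float32. A minimal sketch of one way to get float32, assuming a pandas version that provides `DataFrame.to_numpy` (the name `out32` is only illustrative):

```python
from collections import namedtuple

import numpy as np
import pandas as pd

df = pd.DataFrame({'Class': [10., 11., 9., 8.],
                   'Age': [27., 24., 22., 32.],
                   'Mark': [76., 56., 89., 45.],
                   'Fees': [1000., 1200., 590., 605.]})

Data = namedtuple('data', ['Class', 'Age', 'Mark', 'Fees'])

# Cast the whole frame to float32 in one step, then transpose so that
# each row of the 2-D array corresponds to one DataFrame column.
out32 = Data(*df.to_numpy(dtype=np.float32).T)

print(out32.Class.dtype)
```

Each field of `out32` is then a one-dimensional float32 array, accessible by name (`out32.Age`) or by position (`out32[1]`).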

You can achieve the desired output by converting each column of the DataFrame to a numpy array and then using a namedtuple to store those arrays. Here's how you can do it:

import pandas as pd
import numpy as np
from collections import namedtuple

data = {'Class': [10., 11., 9., 8.],
        'Age': [27., 24., 22., 32.],
        'Mark': [76., 56., 89., 45.],
        'Fees': [1000., 1200., 590., 605.]}

df = pd.DataFrame(data)

# Convert each column to a numpy array
class_arr = np.array(df['Class'], dtype=np.float32)
age_arr = np.array(df['Age'], dtype=np.float32)
mark_arr = np.array(df['Mark'], dtype=np.float32)
fees_arr = np.array(df['Fees'], dtype=np.float32)

# Create a namedtuple
DataTuple = namedtuple('Data', ['Class', 'Age', 'Mark', 'Fees'])

# Create an instance of the namedtuple with the numpy arrays
data_instance = DataTuple(Class=class_arr, Age=age_arr, Mark=mark_arr, Fees=fees_arr)

print(data_instance)

Output:

Data(Class=array([10., 11.,  9.,  8.], dtype=float32), Age=array([27., 24., 22., 32.], dtype=float32), Mark=array([76., 56., 89., 45.], dtype=float32), Fees=array([1000., 1200.,  590.,  605.], dtype=float32))
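
If the column names should not be hardcoded, the same idea can be wrapped in a small helper. This is only a sketch: `frame_to_namedtuple` is a hypothetical name, and it assumes every column label is a string that is a valid Python identifier, since `namedtuple` rejects other field names.

```python
from collections import namedtuple

import numpy as np
import pandas as pd


def frame_to_namedtuple(df, name='Data', dtype=np.float32):
    """Return a namedtuple with one NumPy array per DataFrame column.

    Hypothetical helper; assumes column labels are valid identifiers.
    """
    # Build the namedtuple type from the column labels, in column order.
    Record = namedtuple(name, list(df.columns))
    # Convert each column separately, casting to the requested dtype.
    return Record(*(df[col].to_numpy(dtype=dtype) for col in df.columns))


df = pd.DataFrame({'Class': [10., 11., 9., 8.],
                   'Age': [27., 24., 22., 32.],
                   'Mark': [76., 56., 89., 45.],
                   'Fees': [1000., 1200., 590., 605.]})

result = frame_to_namedtuple(df)
print(result._fields)
```

Converting column by column, rather than through one 2-D array, also keeps working if the columns are later given different dtypes and the cast is dropped.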
