CSV
The 'In memory data' section of the TensorFlow documentation has a clear example for a CSV that uses pandas, which I touched on in my comments:
"For any small CSV dataset the simplest way to train a TensorFlow model on it is to load it into memory as a pandas Dataframe or a NumPy array.
A relatively simple example is the abalone dataset.
The dataset is small.
All the input features are all limited-range floating point values.
Here is how to download the data into a pandas DataFrame:"
You should then be able to adapt the example to your CSV-formatted data.
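As a rough sketch of that adaptation (not the documentation's exact code), the pattern is: read the CSV into a DataFrame, pop the label column off, and convert what remains to a NumPy array. The CSV text and column names below are made up for illustration; substitute your own file path and label column.

```python
import io

import pandas as pd

# Hypothetical stand-in for your own CSV file; the column names are made up.
csv_text = """length,diameter,height,age
0.435,0.335,0.110,7
0.585,0.450,0.125,6
0.655,0.510,0.160,14
0.545,0.425,0.125,16
"""

# For a real file, pass its path (or URL) instead of the StringIO object.
df = pd.read_csv(io.StringIO(csv_text))

# Separate the label column from the input features.
features = df.copy()
labels = features.pop("age")

# Convert to NumPy arrays so they can be passed to a model as in-memory data.
features_array = features.to_numpy(dtype="float32")
labels_array = labels.to_numpy()

print(features_array.shape)  # (4, 3)
print(labels_array.shape)    # (4,)
```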
Some related resources that may help:
The key thing to look out for is how to split the data. The pandas 'sample' method is often used to shuffle the rows and set aside some of the data for later testing. Alternatively, you can use the convenient 'train_test_split' function from the sklearn library, which handles pandas DataFrames as well as NumPy arrays. See 'How to Split a Dataframe into Train and Test Set with Python', under 'Splitting and saving'.
Quick summary of that, from the Daily Python Tip post of January 12, 2018:
"split pandas dataframe into two random subsets:
from sklearn.model_selection import train_test_split
train, test = train_test_split(df, test_size=0.2)"
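To make the split concrete, here is a small sketch that runs both approaches on a ten-row DataFrame. The DataFrame and its column names are hypothetical; random_state is only there so the result is reproducible.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical ten-row DataFrame standing in for your own data.
df = pd.DataFrame({"feature": range(10), "label": [0, 1] * 5})

# Option 1: sklearn. random_state makes the split reproducible.
train, test = train_test_split(df, test_size=0.2, random_state=0)

# Option 2: pandas only. Sample a random 80% of the rows for training
# and keep whatever was not sampled for testing.
train_pd = df.sample(frac=0.8, random_state=0)
test_pd = df.drop(train_pd.index)

print(len(train), len(test))        # 8 2
print(len(train_pd), len(test_pd))  # 8 2
```

Either way, every row ends up in exactly one of the two subsets.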
NPZ
For .npz formatted data, see the TensorFlow documentation here and this top answer to 'How can I import the MNIST dataset that has been manually downloaded?', and adapt the examples to yours.
NumPy arrays can be handled by the train_test_split function of the sklearn library, as discussed above in the CSV section.
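If you are unsure what your .npz file contains, a quick way to check is to load it and list the array names. The sketch below first writes a tiny archive so it can run on its own; the file name and the keys 'x_train' and 'y_train' are assumptions for illustration, and your file may use different names.

```python
import os
import tempfile

import numpy as np

# Hypothetical arrays standing in for the contents of your own .npz file.
x = np.arange(20, dtype="float32").reshape(10, 2)
y = np.arange(10)

# Write a small .npz archive so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "example.npz")
np.savez(path, x_train=x, y_train=y)

# np.load returns a dict-like object for .npz files;
# its .files attribute lists the names of the stored arrays.
with np.load(path) as data:
    print(data.files)
    x_loaded = data["x_train"]
    y_loaded = data["y_train"]

print(x_loaded.shape, y_loaded.shape)  # (10, 2) (10,)
```

The loaded arrays are ordinary NumPy arrays, so they can be split the same way as in the CSV section.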
The csv library, see here, is a standard library for that which is built in to Python, meaning it doesn't need to be installed separately. Note the title along the very top of the link above, "3.11.3 Documentation » The Python Standard Library » File Formats » csv". Others will often choose to install and use pandas, the Python Data Analysis Library, as it reads CSV files and much more, and so it is familiar.
b = np.load('filename.npz')
Please read How do I ask a good question?, ...
%pip install numpy
or
%conda install -c anaconda numpy
(based on here, found by searching 'anaconda numpy'), depending on your main package manager. If you are using Anaconda/conda, always use conda primarily and only resort to pip in cases where there is no conda recipe. See ...
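For the standard-library csv route mentioned above, a minimal sketch with no extra installs might look like this. The CSV text and column names are invented for illustration, and the conversion assumes every column is numeric.

```python
import csv
import io

# Hypothetical CSV text standing in for a file on disk.
csv_text = "length,diameter,age\n0.435,0.335,7\n0.585,0.450,6\n"

# For a real file, use: with open("your_file.csv", newline="") as f:
with io.StringIO(csv_text, newline="") as f:
    reader = csv.DictReader(f)  # uses the first row as column names
    rows = [
        {name: float(value) for name, value in row.items()}
        for row in reader
    ]

print(rows[0])    # {'length': 0.435, 'diameter': 0.335, 'age': 7.0}
print(len(rows))  # 2
```

The csv module hands back every value as a string, which is why the explicit float conversion is needed; pandas infers the types for you.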