My task is to read millions of NumPy images and extract a region dynamically. The application dictates that the images are stored in files in batches of about 3000 to 6000; each file contains a pickled dict of NumPy arrays. On Windows, reading gets dramatically slower with each call, see the logs below. The logs were produced on two laptops with completely identical hardware, including 40 GB of RAM. Windows is much slower than Linux to begin with, but should it really get slower with each call?
import os
import pickle
import time
import platform
import sys

import numpy as np
import psutil

filePath = r'C:\images.pkl'
print(f"Versions: {platform.system()=}, {platform.release()=}, {platform.version()=}, {sys.version=}, {np.__version__=}")

# Write one batch of 4000 random 300x300 uint8 images.
imagesDict = {i: np.random.randint(0, 255, (300, 300), dtype=np.uint8) for i in range(4000)}
with open(filePath, 'wb') as file:
    pickle.dump(imagesDict, file, pickle.HIGHEST_PROTOCOL)

thumbs = []
num_image_sets = 0
durations_s_sum = 0.
for i in range(500):
    start_s = time.perf_counter()
    with open(filePath, 'rb') as file:
        imagesDict: dict[int, np.ndarray] = pickle.load(file)
    for key in imagesDict.keys():
        image = imagesDict[key]
        thumb = image[:50, :50].copy()  # copy the region so the full image can be freed
        thumbs.append(thumb)
    durations_s_sum += (time.perf_counter() - start_s)
    num_image_sets += 1
    if 50 <= num_image_sets:
        memory_info = psutil.Process(os.getpid()).memory_info()
        print(f"{durations_s_sum:4.1f}s for 50 pickle files, rss={memory_info.rss/1024/1024:6,.0f}MB, vms={memory_info.vms/1024/1024:6,.0f}MB")
        durations_s_sum = 0.
        num_image_sets = 0
Windows 11
# Versions: platform.system()='Windows', platform.release()='10', platform.version()='10.0.22631', sys.version='3.11.5 | packaged by Anaconda, Inc. | (main, Sep 11 2023, 13:26:23) [MSC v.1916 64 bit (AMD64)]', np.__version__='1.26.3'
# 11.7s for 50 pickle files, rss= 1,211MB, vms= 1,215MB
# 11.8s for 50 pickle files, rss= 1,492MB, vms= 1,499MB
# 13.8s for 50 pickle files, rss= 2,272MB, vms= 2,302MB
# 15.7s for 50 pickle files, rss= 2,802MB, vms= 2,845MB
# 18.3s for 50 pickle files, rss= 3,328MB, vms= 3,383MB
# 21.0s for 50 pickle files, rss= 3,837MB, vms= 3,905MB
# 25.6s for 50 pickle files, rss= 4,369MB, vms= 4,448MB
# 28.0s for 50 pickle files, rss= 4,898MB, vms= 4,989MB
# 32.3s for 50 pickle files, rss= 5,427MB, vms= 5,530MB
# 36.7s for 50 pickle files, rss= 5,966MB, vms= 6,081MB
It looks like it's due to having so many objects on the heap. The most common functions in the profile all involve heap allocation/free (and possibly heap security/validity checks).
If we clear thumbs each time through the loop, the time is stable and we get a far more plausible profile (the ? is going to be all of Python - I don't have symbols for this Anaconda build). It's still memory heavy.
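One way to keep the heap from filling with millions of tiny blocks, without dropping the thumbnails, is to periodically pack them into one contiguous array (a sketch of the idea only, not code from this report; the batch size and names are made up):

```python
import numpy as np

thumbs = []   # small 50x50 copies: one ~2.5 KB heap block each
packed = []   # a few large contiguous blocks instead

def pack_thumbs(min_batch=4000):
    """Consolidate accumulated thumbnails into one contiguous array."""
    if len(thumbs) >= min_batch:
        packed.append(np.stack(thumbs))  # one large allocation
        thumbs.clear()                   # frees the small blocks

for i in range(10_000):
    image = np.random.randint(0, 255, (300, 300), dtype=np.uint8)
    thumbs.append(image[:50, :50].copy())
    pack_thumbs()

pack_thumbs(min_batch=1)  # flush whatever is left
```

The heap then manages a handful of multi-megabyte blocks rather than millions of small ones, which is the pattern the profiles above suggest the Windows allocator handles poorly.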
Having a few million images in memory helps speed up machine learning for the GPU-RAM-poor. To circumvent the issue, I used h5py instead of pickle. I posted the h5py code and output here [1]. h5py is slower than pickle at the beginning, but its speed stays roughly constant. As we typically go beyond 2e6 images, that helps quite a bit.
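For reference, a minimal sketch of that kind of h5py workaround (my reconstruction under assumed file and dataset names, not the actual code behind [1]): each batch becomes one HDF5 dataset, and slicing the dataset reads only the requested region from disk instead of unpickling the whole batch into thousands of separate arrays:

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "images.h5")
images = np.random.randint(0, 255, (4000, 300, 300), dtype=np.uint8)

# Write one batch as a single 3D dataset.
with h5py.File(path, "w") as f:
    f.create_dataset("batch0", data=images)

# Read back only the 50x50 corner of every image; h5py materializes
# just the selected hyperslab, not the full 360 MB batch.
with h5py.File(path, "r") as f:
    thumbs = f["batch0"][:, :50, :50]
```

Each read then produces one contiguous array per call instead of thousands of pickled objects, which would explain the flat per-call times reported with h5py.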
CPython versions tested on:
3.11
Operating systems tested on:
Windows