All Questions
5,666
questions
-2
votes
0
answers
16
views
I am using pandasai but keep getting this error: from pandas.compat import is_numpy_dev as _is_numpy_dev
from pandas.compat import is_numpy_dev as _is_numpy_dev # pyright: ignore # noqa:F401
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I am running NumPy version 2
Python version 3.11
pandas 1....
0
votes
1
answer
45
views
Issue with Custom Rounding Function in Python Pandas
I have implemented a custom rounding function in Python using Pandas, but it's not producing the expected results in certain cases. It should always round down to the nearest step. Here's the ...
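A minimal sketch of one way to round down to the nearest step, assuming a numeric Series and a positive step. The asker's own function is not shown in the excerpt, so the name `floor_to_step`, the `tol` parameter, and the sample values are hypothetical. The small tolerance is there because a quotient such as 0.3 / 0.1 evaluates to just under 3 in floating point, which a plain floor would turn into 2.

```python
import numpy as np
import pandas as pd


def floor_to_step(values: pd.Series, step: float, tol: float = 1e-9) -> pd.Series:
    """Round each value down to the nearest multiple of `step` (hypothetical helper)."""
    # tol guards against quotients that land just below an integer,
    # e.g. 0.3 / 0.1 == 2.9999999999999996
    multiples = np.floor(values / step + tol)
    # Rounding to a fixed number of decimals trims float representation noise.
    return (multiples * step).round(10)


prices = pd.Series([0.30, 0.37, 1.25, 2.00])  # made-up sample values
result = floor_to_step(prices, 0.1)
print(result.tolist())
```

If exact decimal behaviour matters more than speed, `decimal.Decimal` arithmetic is an alternative worth considering.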
0
votes
0
answers
19
views
Synthetic Data Vault MultiTableMetadata and get_column_pair_plot
I need help using get_column_pair_plot because I have difficulty understanding how to use MultiTableMetadata. Consider the following data:
import numpy as np
import pandas as pd
from sdv....
1
vote
0
answers
74
views
Optimize loops in Numpy correlation matrices
I have a piece of code to calculate price sensitivity based on the product and its rating.
Below is the original data set with product type, reported year, customer’s rating, price per unit, and ...
2
votes
0
answers
51
views
How can I calculate Pearson Correlation in a memory-efficient way using Pandas?
I am building a simple user-based recommendation system using the 10M MovieLens dataset. While calculating the Pearson Correlation, the enormous size of the data (69878 rows, 10677 columns) overwhelms my ...
-2
votes
0
answers
56
views
Most efficient way to compare / work with filtered Series / DataFrame rows
When I create a filtered Series or DataFrame object, I get the filtered indices too:
not_na_prices: pd.DataFrame = price[price["price1"].notna() & price["price2"].notna()]
print(...
0
votes
1
answer
34
views
Drop row in dataframe if equal to previous row [duplicate]
I have a dataframe like the one below, where I have a daily count of points for each team. However, it's a tough task to earn points and on many days the points stay the same. Since I'm turning the ...
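One common way to drop rows that repeat the previous value is to compare a column with its shifted self. This is only a sketch: the excerpt does not show the frame, so the long layout and the column names `date`, `team`, and `points` are assumptions.

```python
import pandas as pd

# Hypothetical data: a daily points total per team, in long format.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05"]
    ),
    "team": ["A"] * 5,
    "points": [1, 1, 2, 2, 1],
})

# Previous day's points within the same team (NaN for each team's first row).
previous = df.groupby("team")["points"].shift()

# Keep a row only when its points differ from the previous row of that team.
deduped = df[df["points"].ne(previous)].reset_index(drop=True)
print(deduped)
```

Unlike `drop_duplicates`, this keeps a value that reappears later, as long as it differs from the row directly before it.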
0
votes
1
answer
32
views
How to align different entries with same column elements but different positions under just one column in Pandas?
I have time series data that looks like this:
time      Team A  Team B  Team C
14:00:00  2pts    0pts    0pts
time      Team B  Team A  Team C
14:01:00  3pts    2pts    0pts
time      Team B  Team A  Team C
14:02:00  3pts    2pts    ...
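A minimal sketch of how entries with differently ordered columns can be lined up, assuming each entry is available as its own one-row DataFrame (the excerpt does not say how the data is stored). `pd.concat` aligns on column names rather than positions. Only the two complete entries shown above are used.

```python
import pandas as pd

# Each entry carries its own header order, as in the question.
entries = [
    pd.DataFrame(
        [["14:00:00", "2pts", "0pts", "0pts"]],
        columns=["time", "Team A", "Team B", "Team C"],
    ),
    pd.DataFrame(
        [["14:01:00", "3pts", "2pts", "0pts"]],
        columns=["time", "Team B", "Team A", "Team C"],
    ),
]

# concat matches columns by label, so every team ends up under one column.
aligned = pd.concat(entries, ignore_index=True)
print(aligned)
```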
3
votes
2
answers
86
views
How to format the dataframe into a 2D table
I have the following issue with formatting a pandas dataframe into a 2D format.
My data is:
+----+------+-----------+---------+
| | Jobs | Measure | Value |
|----+------+-----------+---------|
| 0 ...
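A sketch of one possible reshaping, assuming the goal is one row per job and one column per measure. The column names `Jobs`, `Measure`, and `Value` come from the question; the values below are made up because the original table is cut off.

```python
import pandas as pd

# Hypothetical long-format data using the question's column names.
df = pd.DataFrame({
    "Jobs": ["job1", "job1", "job2", "job2"],
    "Measure": ["cpu", "mem", "cpu", "mem"],
    "Value": [1.0, 2.0, 3.0, 4.0],
})

# One row per job, one column per measure.
table = df.pivot(index="Jobs", columns="Measure", values="Value")
print(table)
```

`pivot` raises an error if a (Jobs, Measure) pair occurs more than once; in that case `pivot_table` with an aggregation function is the usual alternative.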
3
votes
1
answer
110
views
Sum multiple rows from multiple columns in a dataframe for a group
For each group in a groupby, I want to sum certain rows from several columns and output them in a new column, is_m_days.
Each Group (a Group has CT/RT and has a Quantity from 1 or 2 or 3 or more rows,...
0
votes
2
answers
166
views
Sum rows from a column based on a condition and output them in a new column
In my dataframe I want to sum certain rows in a column and output them in a new column 'UE_more_days'
is
ATEXT BEGUZ_UE UE_more_days
0 11.00 0.0
1 CT 23.00 ...
2
votes
2
answers
58
views
Dataframe Expansion: Generating Genomic Positions +/- 250 Nucleotides
I have a df that looks like (with 300k more rows of other genomic coordinates):
chromosome start end
chr1 11859 11879
I want to expand the df such that for each row, it will ...
0
votes
0
answers
78
views
Proportionately split dataframe with multiple target columns
I have a dataframe with 30 rows and 10 columns. 5 of the columns are input features and the other 5 are output/target columns. The target columns contain classes represented as 0, 1, 2. I want to ...
-1
votes
1
answer
49
views
Why does assigning one DataFrame to another in Pandas not create a new copy when the Copy-on-Write feature is enabled? [duplicate]
The Pandas documentation describes Copy-on-Write (CoW) behavior as: "CoW means that any DataFrame or Series derived from another in any way always behaves as a copy." Pandas CoW ...
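A short illustration of the distinction the question turns on: plain assignment is Python name binding, so no new DataFrame is derived and Copy-on-Write has nothing to act on. The variable names are hypothetical, and the sketch behaves the same with or without CoW enabled.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})

alias = df           # plain assignment: a second name for the same object
derived = df.copy()  # an explicit copy is a separate object

# Writing through the alias changes the one shared object.
alias.loc[0, "a"] = 99

print(alias is df)          # same object under two names
print(df.loc[0, "a"])       # the write is visible through `df`
print(derived.loc[0, "a"])  # the copy is unaffected
```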
0
votes
0
answers
24
views
Dask query on dates columns
I am trying to filter a huge dask.DataFrame (~800k rows and 30 columns). I want to use Dask's DataFrame.query method.
start_date = np.datetime64(start_date, 'ns')
end_date = np.datetime64(end_date, '...