Questions tagged [data]
Questions mostly concerned with managing data, without focus on pre-processing or modelling.
870
questions
0
votes
0
answers
16
views
Accessing PubMed Central full-texts via FTP?
I wish to process full-texts of many articles using europepmc::epmc_ftxt (so that I can later use tidypmc::pmc_text and tidypmc::separate_text). I find that the R code (a) below is much too slow in ...
0
votes
0
answers
22
views
Is it okay to ignore missing data if it is a very small percentage?
My dataset has seven columns and 63 million rows. There are two columns with missing data. The first one [Terminal Number] has 400k rows of missing data, but thankfully I can extract values from another ...
0
votes
0
answers
5
views
I need to figure out a calculation for a field in Tableau
I am working as a Data Analyst in a university and I am very new to this job. I have been told that if I don't get things working, there are better people than me and they will be hired instead. I have been ...
0
votes
0
answers
21
views
Comparing Systems Based on Their Variance
I am using simulations to compare two economic models and want to understand their impact on returns (i.e., the percentage change in prices). I have employed common random numbers for these ...
0
votes
0
answers
21
views
How Can a Mathematician with Programming Experience Transition to Hands-On Data Science and ML?
I have a strong foundation in mathematics and programming, but limited hands-on experience with the data science skills currently in demand in the industry. I am eager to start applying my knowledge ...
0
votes
0
answers
13
views
How can the discrete wavelet transform (DWT) help in data reduction?
Hi, I am reading Data Mining: Concepts and Techniques by Jiawei Han. On page 100 it is mentioned that the wavelet transform helps in data reduction, but I cannot find the mathematical proof of this as well ...
0
votes
0
answers
16
views
Training Data for Duplicate Detection: Allow External Information?
We have collected metadata of scientific publications (in a bilingual English-French context) from several international platforms (OpenAlex, Scopus) and French platforms (Hal, Idref, etc.). Many ...
0
votes
0
answers
18
views
Generating transaction data for a dataset to train on
My project is to predict what payment option a customer might use depending on various factors on a checkout screen.
For example, here are some of the fields I would have:
Variables: User_Location ...
0
votes
0
answers
10
views
Understanding the data leakage article on The ICML 2013 Whale Challenge - Right Whale Redux
I came across this article while trying to understand more about data leakage.
https://www.kaggle.com/competitions/the-icml-2013-whale-challenge-right-whale-redux/discussion/4865
Though really good ...
0
votes
1
answer
16
views
Identify valid points on a map
I have a binary matrix that I turned into a maze map for my project.
I have written code that goes through a lot of points.
I want to know if each of the points falls on an allowed area (...
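A minimal sketch of one way to check points against a binary matrix. The encoding here is an assumption, since the excerpt does not state it: `1` marks an allowed cell, `0` a blocked one, and each point is an integer `(row, col)` pair. The name `is_allowed` and the sample `maze` are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[int]]


def is_allowed(grid: Grid, point: Tuple[int, int]) -> bool:
    """Return True if the point lies inside the grid on an allowed cell."""
    if not grid:
        return False
    row, col = point
    # Reject points outside the matrix; negative indices would otherwise
    # silently wrap around in Python.
    in_bounds = 0 <= row < len(grid) and 0 <= col < len(grid[0])
    # Assumption: 1 = allowed area, 0 = wall.
    return in_bounds and grid[row][col] == 1


# Illustrative data only, not taken from the question.
maze = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
]
points = [(0, 0), (0, 1), (2, 2), (3, 0), (-1, 2)]
valid_points = [p for p in points if is_allowed(maze, p)]
```

If the points are real-valued map coordinates rather than matrix indices, they would first need to be converted to row and column indices, which depends on how the map was drawn.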
0
votes
0
answers
4
views
Increasing data via RBF neural networks
Is there code that can increase the amount of data in a dataset with the help of an RBF neural network? Should a special design be created for the model? Is there no pre-made model? Is there ...
1
vote
1
answer
31
views
I have some Python topics and want to know which ones actually need to be studied for data science roles specifically
The topics are as follows:
Collections
Itertools
Regex and parsing
Error and exceptions
Date and time
0
votes
0
answers
11
views
Help required in opening files of a dataset (.phys, .thermal, .pts, .ass extensions)
We have received a dataset that consists of audio, visual, thermal, and physiological modalities. Upon exploring the dataset, we encountered some challenges in opening the following file types:
.phys ...
0
votes
0
answers
12
views
How do we set daterange in Sailpoint IIQ?
I'm trying to set the daterange of a report dynamically. But it seems as if the package I'm using doesn't implement the interface Serializable as it is supposed to. Do you have any idea how I can ...
0
votes
0
answers
19
views
Weighted average for multiple confusion matrices
I have a problem and I have no idea how to resolve it.
I have 4 confusion matrices and I need to calculate, for example, the Matthews correlation for each CM.
In a second step I want to calculate a BIG KPI result ...
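A hedged sketch of one possible reading of this question: compute the Matthews correlation coefficient (MCC) for each binary confusion matrix, then combine the per-matrix values into a single figure weighted by sample count. The weighting scheme is an assumption, because the excerpt is cut off before the combined KPI is defined. The names `mcc` and `weighted_mcc`, the `(tp, fp, fn, tn)` layout, and the numbers are illustrative only.

```python
import math
from typing import List, Tuple

# Assumed layout of a binary confusion matrix: (tp, fp, fn, tn).
ConfusionMatrix = Tuple[int, int, int, int]


def mcc(cm: ConfusionMatrix) -> float:
    """Matthews correlation coefficient for one 2x2 confusion matrix."""
    tp, fp, fn, tn = cm
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when the denominator is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def weighted_mcc(cms: List[ConfusionMatrix]) -> float:
    """Average of per-matrix MCC values, weighted by each matrix's sample count."""
    total = sum(sum(cm) for cm in cms)
    if total == 0:
        return 0.0
    return sum(mcc(cm) * sum(cm) for cm in cms) / total


# Illustrative matrices, not data from the question.
matrices = [(5, 0, 0, 5), (0, 15, 15, 0)]
per_matrix = [mcc(cm) for cm in matrices]
combined = weighted_mcc(matrices)
```

An alternative worth considering is to sum the four matrices element-wise and compute a single MCC on the pooled counts. The two approaches generally give different numbers, so the choice depends on what the combined KPI is meant to express.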