Questions tagged [data]

Questions mostly concerned with managing data, without focus on pre-processing or modelling.

0 votes
0 answers
16 views

Accessing PubMed Central full-texts via FTP?

I want to process the full texts of many articles using europepmc::epmc_ftxt (so that I can later use tidypmc::pmc_text and tidypmc::separate_text). I find that the R code (a) below is much too slow in ...
Abiologist
0 votes
0 answers
22 views

Is it okay to ignore missing data despite a very small percentage?

My dataset has seven columns and 63 million rows. There are two columns with missing data. The first one [Terminal Number] has 400k rows of missing data but, thankfully, I can extract values from another ...
Ramon Araneta
0 votes
0 answers
5 views

I need to figure out calculation for a field in Tableau

I am working as a Data Analyst in a university and I am very new to this job. I have been told that if I don't get things working, there are better people than me whom they will hire. I have been ...
Darth_Vader_DataAnalyst
0 votes
0 answers
21 views

Comparing Systems Based on Their Variance

I am using simulations to compare two economic models and want to understand their impact on returns (i.e., the percentage change in prices). I have employed common random numbers for these ...
Apod
0 votes
0 answers
21 views

How Can a Mathematician with Programming Experience Transition to Hands-On Data Science and ML?

I have a strong foundation in mathematics and programming, but limited hands-on experience with the data science skills currently in demand in the industry. I am eager to start applying my knowledge ...
Khadeeja
0 votes
0 answers
13 views

How can discrete wavelet transform (DWT) help in data reduction

Hi, I am reading Data Mining: Concepts and Techniques by Jiawei Han. On page 100 it is mentioned that the wavelet transform helps in data reduction, but I cannot find the mathematical proof of this as well ...
user165164
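
The usual informal argument for the question above is that, for smooth data, most detail coefficients of the transform are close to zero, so only the few large ones need to be stored. The following is a minimal sketch of that idea, not a proof and not the book's own algorithm: it assumes a single-level Haar transform, an even-length signal, and a hypothetical threshold of 0.5. The function names and the sample signal are made up for illustration.

```python
import math

SCALE = 1 / math.sqrt(2)  # orthonormal Haar scaling factor

def haar_forward(signal):
    """One level of the Haar DWT; assumes len(signal) is even."""
    approx = [(signal[i] + signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert one level of the Haar DWT."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * SCALE, (a - d) * SCALE])
    return out

def compress(signal, threshold):
    """Zero out detail coefficients whose magnitude is below the threshold."""
    approx, detail = haar_forward(signal)
    kept = [d if abs(d) >= threshold else 0.0 for d in detail]
    return approx, kept

# Made-up signal: neighbouring values are mostly similar.
signal = [4.0, 4.1, 8.0, 7.9, 2.0, 6.0, 5.0, 5.0]
approx, detail = compress(signal, threshold=0.5)

# Only the approximation and the nonzero details need to be stored.
stored = len(approx) + sum(1 for d in detail if d != 0.0)
restored = haar_inverse(approx, detail)
max_error = max(abs(x - y) for x, y in zip(signal, restored))
print(stored, "of", len(signal), "values stored; max error", round(max_error, 3))
```

Without thresholding, the inverse transform recovers the signal exactly (up to floating-point rounding); the reduction comes entirely from dropping the small coefficients, at the cost of a bounded reconstruction error.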
0 votes
0 answers
16 views

Training Data for Duplicate Detection: Allow External Information?

We have collected metadata of scientific publications (in a bilingual English-French context) from several international platforms (OpenAlex, Scopus) and French platforms (Hal, Idref, etc.). Many ...
joadorn
0 votes
0 answers
18 views

Generating transaction data for a dataset to train on

My project is to predict what payment option a customer might use depending on various factors on a checkout screen. For example, here are some of the fields I would have. Variables: User_Location ...
Naeem Mujeeb
0 votes
0 answers
10 views

Understanding the data leakage article on The ICML 2013 Whale Challenge - Right Whale Redux

I came across this article while trying to understand more about data leakage. https://www.kaggle.com/competitions/the-icml-2013-whale-challenge-right-whale-redux/discussion/4865 Though really good ...
sarah
0 votes
1 answer
16 views

Identify valid points on a map

I have a binary matrix that I turned into a maze map for my project. I wrote code that goes through a lot of points, and I want to know whether each of the points falls on an allowed area (...
May
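
A minimal sketch of the kind of check the question above describes. The excerpt is truncated, so the details are assumptions: points are taken to be (row, column) integer indices, the value 1 is taken to mark an allowed cell, and points outside the matrix are treated as not allowed. The function name and the sample maze are hypothetical.

```python
def is_allowed(grid, point, allowed_value=1):
    """Return True if point (row, col) lies inside grid on an allowed cell."""
    row, col = point
    # Reject points outside the matrix (this also rules out negative indices,
    # which Python would otherwise wrap around).
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        return False
    return grid[row][col] == allowed_value

# Made-up 3x3 maze: 1 = allowed, 0 = wall (assumed convention).
maze = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
]
points = [(0, 0), (0, 1), (2, 2), (3, 0)]
results = [is_allowed(maze, p) for p in points]
print(results)
```

If the points are continuous coordinates rather than indices, they would first need to be mapped to a cell, for example by flooring and scaling, which depends on how the map was drawn.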
0 votes
0 answers
4 views

Increasing data via RBF neural networks

Is there code that can increase the amount of data in a dataset with the help of an RBF neural network? Does a special design need to be created for the model? Is there no pre-made model? Is there ...
Erfan Mollai
1 vote
1 answer
31 views

I have some topics from python out of which I want to know the one which actually needs to be studied for data science roles specifically

The topics are as follows: Collections, Itertools, Regex and parsing, Error and exceptions, Date and time
Kanika Dixit
0 votes
0 answers
11 views

Help required in opening files of a dataset (.phys, .thermal, .pts, .ass extensions)

We have received a dataset that consists of audio, visual, thermal, and physiological modalities. Upon exploring the dataset, we encountered some challenges in opening the following file types: .phys ...
Anup Kumar Gupta
0 votes
0 answers
12 views

How do we set daterange in Sailpoint IIQ

I'm trying to set the daterange of a report dynamically. But it seems as if the package I'm using doesn't implement the interface Serializable as it is supposed to. Do you have any idea how I can ...
srikanthbollu
0 votes
0 answers
19 views

Weighted average for multiple confusion matrix

I have a problem and I have no idea how to resolve it. I have 4 confusion matrices and I need to calculate, for example, the Matthews correlation for each CM. In the second step I want to calculate a BIG KPI result ...
Debosy
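
A rough sketch of two ways to combine the Matthews correlation coefficient (MCC) across several binary confusion matrices. The excerpt does not say how the overall result should be defined, so both options are assumptions: a sample-size-weighted average of the per-matrix MCC values, and a pooled MCC computed after summing the matrices cell by cell. The counts are made up and the variable names are hypothetical.

```python
import math

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient for one 2x2 confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when a marginal is empty and MCC is undefined.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Made-up counts (tp, fp, fn, tn) for four hypothetical confusion matrices.
matrices = [(50, 10, 5, 35), (20, 5, 10, 65), (8, 2, 1, 9), (30, 30, 20, 20)]

# Step 1: MCC for each matrix separately.
per_matrix = [mcc(*m) for m in matrices]

# Option A: average the per-matrix MCC values, weighted by sample count.
sizes = [sum(m) for m in matrices]
weighted_mcc = sum(n * s for n, s in zip(sizes, per_matrix)) / sum(sizes)

# Option B: add the matrices cell by cell, then compute a single MCC.
pooled_mcc = mcc(*(sum(cell) for cell in zip(*matrices)))

print([round(s, 3) for s in per_matrix])
print(round(weighted_mcc, 3), round(pooled_mcc, 3))
```

The two options generally give different numbers, because MCC is not linear in the cell counts; which one is appropriate depends on whether the four matrices come from the same population.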
