Questions tagged [dataset]
A dataset is a collection of data, often in tabular or matrix form. This tag is NOT intended for data requests ("where can I find a dataset about ...") --> see OpenData
1,523
questions
0
votes
0
answers
9
views
How to download all listed companies on Yahoo Finance with detailed data (e.g., sector, market cap) for research purposes?
I'm currently working on a data science research project and I need to download a comprehensive dataset of all listed companies available on Yahoo Finance. The dataset should include detailed ...
0
votes
0
answers
16
views
Accessing PubMed Central full-texts via FTP?
I wish to process the full texts of many articles using europepmc::epmc_ftxt (so that I can later use tidypmc::pmc_text and tidypmc::separate_text). I find that the R code (a) below is much too slow in ...
0
votes
0
answers
45
views
How does the weight vector behave when we initialize the weights to 0 in a perceptron?
While reading a book, I encountered this statement:
Now, the reason we don't initialize the weights to zero is that the learning rate (eta) only has an effect on the classification outcome if the ...
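The quoted statement is cut off in this excerpt, so the following is only a minimal sketch of a standard observation about the classic perceptron update rule, not the book's own code. It assumes labels of +1/-1, a tiny made-up dataset, and hypothetical helper names (`train_perceptron`, `predict`). With all-zero initial weights, changing the learning rate `eta` only rescales the learned weight vector, so the predictions stay the same.

```python
import numpy as np

def train_perceptron(X, y, eta, n_epochs=10):
    """Classic perceptron rule with zero-initialized weights and bias.

    Labels are assumed to be +1 / -1. Returns (weights, bias).
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_epochs):
        for xi, target in zip(X, y):
            # Threshold the net input to get the current prediction.
            pred = 1 if xi @ w + b >= 0.0 else -1
            # The update is zero whenever the prediction is already correct.
            update = eta * (target - pred)
            w += update * xi
            b += update
    return w, b

def predict(X, w, b):
    return np.where(X @ w + b >= 0.0, 1, -1)

# Hypothetical, linearly separable toy data (illustration only).
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])

w_big, b_big = train_perceptron(X, y, eta=1.0)
w_small, b_small = train_perceptron(X, y, eta=0.25)

print("eta=1.0 :", w_big, b_big)
print("eta=0.25:", w_small, b_small)
print("same predictions:",
      np.array_equal(predict(X, w_big, b_big), predict(X, w_small, b_small)))
```

In this sketch the two weight vectors differ only by the factor 0.25, which is why the decision boundary is unchanged. With nonzero initial weights that proportionality would generally no longer hold.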
0
votes
0
answers
8
views
Data Quality - classification, analysis and references
I'm working on a data quality project that aims to improve the processing and handling of data for my business.
I am looking for references or learning materials that will help me classify the typical ...
1
vote
1
answer
20
views
Modelling a 3-variable system, with separate relations between two pairs
I am having trouble modelling the following structured dataset, which I'm trying to analyse and then create a surface from.
So I have 3 variables: say x, y, and z (...
0
votes
0
answers
19
views
Theoretical model performance
This is a theoretical question that I would like to learn more from. Let's say we have some task and we have four datasets A, B, C and D. Using this data I want to obtain the best neural network for ...
0
votes
0
answers
24
views
Where can I find a Database of Corrected Essays in Portuguese-BR?
I'm looking for a database of essays that could be from ENEM, competitions, entrance exams, universities, etc. However, they must have been corrected in Portuguese-BR by humans. Does anyone know ...
0
votes
0
answers
18
views
Generating transaction data for a dataset to train on
My project is to predict what payment option a customer might use depending on various factors on a checkout screen.
For example, here are some of the fields I would have:
Variables: User_Location ...
0
votes
0
answers
16
views
What's wrong with my retention chart?
Recently, I was assigned a problem: draw a retention chart from given JSON data of user sessions over time. Here's the actual problem statement: "Draw a retention chart, which shows the ...
0
votes
0
answers
14
views
Impact of Adding Imbalanced Data on Model Performance for Different Groups
Suppose I initially have a dataset with 50 samples of type A and 50 samples of type B, each with several features. I built a neural network model using this data and recorded the prediction accuracy ...
2
votes
1
answer
48
views
What is dependency in sequential data?
I read in this article: "We know that whenever the points in the dataset are dependent on the other points in the dataset, the data is said to be sequential data."
A common example of this is a ...
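One rough way to make "dependence between points" concrete is lag-1 autocorrelation: the correlation between each value and the one that follows it. The sketch below is an illustration under assumptions, not the article's definition. The sine series, the seed, and the helper name `lag1_autocorr` are all made up for the example. It compares an ordered series with the same values randomly shuffled.

```python
import numpy as np

def lag1_autocorr(x):
    """Pearson correlation between each point and its successor."""
    x = np.asarray(x, dtype=float)
    return float(np.corrcoef(x[:-1], x[1:])[0, 1])

# Hypothetical ordered series: each value is close to its predecessor.
t = np.linspace(0.0, 20.0, 500)
ordered = np.sin(t)

# Same values, but the order (and so the dependence) is destroyed.
rng = np.random.default_rng(0)
shuffled = rng.permutation(ordered)

r_ordered = lag1_autocorr(ordered)
r_shuffled = lag1_autocorr(shuffled)

print("ordered :", round(r_ordered, 3))
print("shuffled:", round(r_shuffled, 3))
```

A value near 1 for the ordered series and a value near 0 for the shuffled one suggests that the ordering itself carries information, which is the sense of dependence the quoted sentence seems to describe. Autocorrelation captures only linear dependence, so it is a heuristic rather than a full test.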
1
vote
0
answers
38
views
Class imbalance for binary classification tasks
I am looking to train a binary classifier. Most of my experience so far has been with generative models, not classifiers, so I am wondering with respect to training data, what is a good ratio of 0 and ...
0
votes
0
answers
12
views
Why doesn't the graph display the missing value even though the values in the data file are not missing?
I'm estimating a GARCH model of the volatility in Bitcoin price and expected inflation. The graph is the log difference of the expected inflation proxy, "market yield on treasury securities".
Thank ...
0
votes
0
answers
8
views
Is there any way to split a train/test dataset on multiple features? And how is unseen data defined?
I'm working on a Seq2Seq problem, specifically a data-to-text problem.
Here are my concerns:
How do they define that a sample is unseen? I mean, it's a seq2seq problem, so I don't think a sample just needs a different ...
1
vote
1
answer
43
views
Does the F1 score evaluate only the model, or does it also enable us to observe and evaluate the data?
I have a dataset. This dataset consists of the data that the actual picture that needs to be drawn, that is, the 100-point graded paper, and the similarity between 100 and 0 points graded pictures ...