Questions tagged [math]
For questions about mathematics related to artificial intelligence.
272 questions
1 vote · 1 answer · 44 views
Why is this RL derivation right?
This comes from the paper, Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review.
I don't know why the following derivation is true. The paper only briefly explains ...
0 votes · 0 answers · 19 views
Is there any AI trained in algebraic simplifications of symbolic expressions?
Is there any artificial intelligence specifically trained in algebraic simplifications of symbolic expressions?
0 votes · 0 answers · 22 views
How to combine language models and theorem provers?
Suppose that a mathematician wants to find a proof of a conjecture or understand the details of some proof. Can one combine language models like ChatGPT or Copilot with theorem-proving software like Coq to ...
0 votes · 0 answers · 16 views
Trying to understand backprop with convolution mathematically
I'm trying to understand how backpropagation works mathematically with convolutions.
I have a VERY simple setup.
X input to Y output through a linear filter:
kernel = torch.ones((1, 1, 3, 3)) / 9 # ...
0 votes · 0 answers · 29 views
Backpropagation math question
Hi, I have a very simple model and I'm trying to learn the math behind it.
Basically, I have an m × n input matrix X. An m × n output matrix Y is formed from some convolution H. The figure of merit is ...
0 votes · 0 answers · 14 views
Showing Axis-Aligned Rectangles With Noise Are PAC-Learnable (FML, Problem 2.6)
I asked the following in Math Stack Exchange and was told "You may be more likely to get an answer on stats.stackexchange.com". I figured this is a more suitable place.
In what follows, an ...
1 vote · 0 answers · 64 views
Why doesn't the Kolmogorov-Arnold representation theorem imply an MLP-like structure?
Recently, Kolmogorov-Arnold Networks (KANs) generated a lot of hype, with "AI experts" throwing around terms like "ML 2.0" and "a new era of ML".
KANs are supposedly ...
0 votes · 1 answer · 89 views
Why is policy gradient theorem so important?
What is the problem that the policy gradient theorem solves? From what I understand, the problem is taking the gradient of the state distribution $d^{\pi_{\theta}}$, but what exactly is the problem here (maybe ...
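For reference, the theorem the question is asking about states that the gradient of the expected return can be written so that no gradient of the state distribution $d^{\pi_\theta}$ appears. A standard statement (in the notation used in the excerpt, with $Q^{\pi_\theta}$ the action-value function) is:

```latex
\nabla_\theta J(\theta)
  \;\propto\; \sum_{s} d^{\pi_\theta}(s) \sum_{a} Q^{\pi_\theta}(s, a)\, \nabla_\theta \pi_\theta(a \mid s)
```

The point is that the right-hand side contains $d^{\pi_\theta}$ only as a weighting, not under $\nabla_\theta$, which is exactly the difficulty the excerpt alludes to: how the environment's state visitation shifts with $\theta$ is generally unknown, and the theorem removes the need to differentiate it.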
1 vote · 0 answers · 55 views
AI capable of mathematical creation? (reference request)
I was studying philosophy of AI when a question came to my mind.
I'd like to write an essay (both informative and argumentative) about the state of the art in Artificial Intelligence advancing the ...
0 votes · 0 answers · 10 views
How to quantify the impact of tokenizer output changes on text encoder embeddings?
In natural language processing, generating text embeddings involves two key modules: the tokenizer and the text encoder. The tokenizer converts text into a token vector, which the encoder then uses to ...
0 votes · 0 answers · 26 views
Need some feedback on an idea for using reinforcement learning in the context of medical imaging reconstruction
Disclaimer -- this idea may be totally half-baked, I'm not sure. I have used deep learning models in image reconstruction before (and this is a super hot topic in the field right now), but only in the ...
0 votes · 1 answer · 34 views
Could AlphaZero be trained to prefer "beautiful" games?
Could a version of AlphaZero be trained that learned not only how to win, but how to win in a "beautiful" way?
Jürgen Schmidhuber wrote a paper in 2008 which basically models "beauty" ...
0 votes · 0 answers · 167 views
How is pass@k metric defined for automated theorem provers if we have a verifier?
The pass@k metric was proposed to measure the percentage of successful code samples (https://arxiv.org/abs/2107.03374), but it has also been used in automated theorem proving such as https://arxiv.org/...
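For reference, the unbiased pass@k estimator from the cited paper (arXiv:2107.03374) can be sketched as below; in the theorem-proving setting the question describes, `c` would simply be the number of sampled proofs the verifier accepts:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (arXiv:2107.03374).

    n: total samples drawn per problem
    c: samples that pass the verifier/tests
    k: evaluation budget (k <= n)
    """
    if n - c < k:
        # Fewer than k failing samples: every size-k subset contains a pass.
        return 1.0
    # 1 minus the probability that a random size-k subset contains no pass.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with 2 samples of which 1 passes, `pass_at_k(2, 1, 1)` gives 0.5, the chance a single drawn sample passes.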
1 vote · 0 answers · 20 views
Is it reasonable to ask for the same time-regularity of the high and low dimensional signals?
Suppose we are dealing with sequential data sampled from a continuous-time signal $x(t) \in \mathbb{R}^n$, so that the dataset will look like $\{x_0, x_1, \dots, x_n\}$, with $x_i = x(t_i)$.
Assume that we ...
2 votes · 1 answer · 132 views
Neural networks are universal approximators? - Exercise 20.1 UML
I'm working on this question, which can be found on page 282 of "Understanding Machine Learning: From Theory to Algorithms" by Shai Shalev-Shwartz and Shai Ben-David.
The statement is as ...