All Questions
12,690 questions
0 votes, 0 answers, 2 views
What are the most common methods for handling non-stationary environments in reinforcement learning?
What are the most common algorithms and methods for handling non-stationary environments in reinforcement learning?
0 votes, 0 answers, 4 views
Adding activation functions in computational graph
In some cases, I have seen that an activation function such as sigmoid is not added to the computational graph. Is this a personal choice, or could there be another reason? In general, is there any hard ...
1 vote, 0 answers, 14 views
Two-agent sequential RL
I have the following RL model that I want to train (see the diagram below). My idea is to have two agents: agent A and agent B. Agent A observes the input I1 and ...
0 votes, 0 answers, 8 views
Averaging classification results vs. direct inference
Suppose I have a tensor X_test with shape (10, 2, 512), where:
10 is the number of IDs
2 is the number of channels for each ID, let's say ...
0 votes, 0 answers, 24 views
Are there any non-transformer LLMs?
Almost all LLMs are based on the transformer architecture, but are there any examples of ones that don't use transformers?
1 vote, 2 answers, 54 views
Doubt regarding policy gradient theorem and REINFORCE algorithm
After reading Sutton and Barto, I was able to understand the derivation of this theorem. The only thing I don't get is the following part of the REINFORCE algorithm:
How are these terms equivalent and ...
0 votes, 0 answers, 17 views
Neural Network Library with C++ API and GPU support
In case I need the following functionality:
C++ API and GPU support
Basic vanilla feed-forward neural network (including deep networks)
Recurrent neural network support
Long short-term memory (LSTM) ...
1 vote, 0 answers, 34 views
How can we distinguish whether we have created an AGI or an LLM?
Given that language is a powerful tool for learning (indeed, we use language to teach much, if not most, of what humans are taught), how would we be able to decide whether the neural net we taught is ...
0 votes, 0 answers, 7 views
Generation of text describing moving objects in video
How might I generate text messages from live video describing how objects of significance are moving (left, right, away from me, in or out of a building, etc.) without using lidar or similar to assess ...
0 votes, 0 answers, 16 views
Why is the loss value large at the beginning with probabilistic layers?
I am trying to build a model with conv-flipout and dense-flipout layers instead of conv2d and dense layers,
but when training the model, the loss first becomes large (around 100) and then decreases slowly even ...
0 votes, 0 answers, 12 views
What is the best strategy to train a model with multiple (sub)goals in the same environment?
To explain my question, I thought it would be better to consider the following example: take an environment where a bridge crane needs to lift a barrel from the position "start" ...
0 votes, 0 answers, 14 views
Combining the output of two different machine learning models for accurate invoice data extraction: Is this a viable approach?
I am working (trying to work) on a project to extract relevant information from invoices. Currently I am not achieving good accuracy, so I am trying to come up with some new ideas. I am considering ...
0 votes, 0 answers, 15 views
How to add/embed categorical features in a transformer network?
I would like to give more context to my transformer by adding some metadata related to each token. This metadata is mostly categorical (3 fields, with 3 possible values for each field).
In addition ...
1 vote, 1 answer, 43 views
Why is this RL derivation right?
This comes from the paper, Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review.
I don't know why the following derivation is true. The paper only briefly explains ...
1 vote, 1 answer, 12 views
Do ideal probability flow ODE models still generate good-quality data?
I have read the question and the given answer: "Does probability flow ODE trajectory (in the context of diffusion models) represents a bijective mapping between any distribution to a gaussian?"
Now I ...