All Questions

0 votes
0 answers
2 views

What are the most common methods for handling non-stationary environments in reinforcement learning?

What are the most common algorithms and methods for handling non-stationary environments in reinforcement learning?
Mika
  • 351
0 votes
0 answers
4 views

Adding activation functions in computational graph

In some cases, I have seen that an activation function such as sigmoid is not added to the computational graph. Is this a personal choice, or could there be another reason? In general, is there any hard ...
opu 웃
  • 101
1 vote
0 answers
14 views

Two-agent sequential RL

I have the following RL model that I want to train (see the diagram below). My idea is to have two agents: agent A and agent B. Agent A observes the input I1 and ...
zdm
  • 311
0 votes
0 answers
8 views

Averaging classification results vs. direct inference

Suppose I have a tensor X_test with shape (10, 2, 512), where 10 is the number of IDs and 2 is the number of channels for every ID, let's say ...
Muhammad Ikhwan Perwira
0 votes
0 answers
24 views

Are there any non-transformer LLMs?

Almost all LLMs are based on the transformer architecture, but are there any examples of ones that don't use transformers?
user56510
1 vote
2 answers
54 views

Doubt regarding policy gradient theorem and REINFORCE algorithm

After reading Sutton and Barto, I was able to understand the derivation of this theorem. The only thing I don't get is the following part of the REINFORCE algorithm: How are these terms equivalent and ...
DeadAsDuck
0 votes
0 answers
17 views

Neural Network Library with C++ API and GPU support

In case I need the following functionality: C++ API and GPU support; basic vanilla feed-forward neural network (including deep networks); recurrent neural network support; long short-term memory (LSTM) ...
Damir Tenishev
1 vote
0 answers
34 views

How can we distinguish whether we have created an AGI or an LLM?

Given that language is a powerful tool for learning (indeed, we use language to teach much, if not most, of what humans are taught), how would we be able to decide whether the neural net we taught is &...
kutschkem
  • 111
0 votes
0 answers
7 views

Generation of text describing moving objects in video

How might I generate text messaging from live video describing how objects of significance are moving, left, right, away from me, in or out of a building etc., without using lidar or similar to assess ...
Nicholas Walton
0 votes
0 answers
16 views

Why is the loss value in probabilistic layers large at the beginning?

I am trying to build a model with conv-flipout and dense-flipout layers instead of conv2d and dense layers, but when training the model, the loss value first becomes as large as 100 and then decreases slowly even ...
a-eng
  • 1
0 votes
0 answers
12 views

What is the best strategy to train a model with multiple (sub)goals in the same environment?

To explain my question, I thought it would be better to consider the following example: Let's take an environment where a bridge crane needs to lift a barrel from the position "start"...
Dave
  • 194
0 votes
0 answers
14 views

Combining the output of two different machine learning models for accurate invoice data extraction: Is this a viable approach?

I am working (trying to work) on a project to extract relevant information from invoices. Currently I am not achieving good accuracy, so I am trying to come up with some new ideas. I am considering ...
rowor
  • 1
0 votes
0 answers
15 views

How to add/embed categorical features in a transformer network?

I would like to give more context to my transformer by adding some metadata related to each token. This metadata is mostly categorical (3 fields, with 3 possible values for each field). In addition ...
JulienG
1 vote
1 answer
43 views

Why is this RL derivation right?

This comes from the paper Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review. I don't know why the following derivation is true. The paper only briefly explains ...
yeebo xie
1 vote
1 answer
12 views

Do ideal probability flow ODE models still generate good-quality data?

I have read the question and the given answer Does probability flow ODE trajectory (in the context of diffusion models) represents a bijective mapping between any distribution to a gaussian? Now I ...
saleh
  • 158
