Newest Questions - Artificial Intelligence Stack Exchange

0 votes

0 answers

2 views

What are the most common methods for handling non-stationary environments in reinforcement learning?

What are the most common algorithms and methods for handling non-stationary environments in reinforcement learning?

Mika

351

asked 47 mins ago

0 votes

0 answers

4 views

Adding activation functions in computational graph

In some cases, I have seen that an activation function such as sigmoid is not added to the computational graph. Is this a personal choice, or could there be another reason? In general, is there any hard ...

opu 웃

101

asked 2 hours ago

1 vote

0 answers

14 views

Two-agent sequential RL

I have the following RL model that I want to train (see the diagram below). My idea is to have two agents: agent A and agent B. Agent A observes the input I1 and ...

zdm

311

asked 14 hours ago

0 votes

0 answers

8 views

Averaging classification result vs direct infer

Suppose I have a tensor X_test with shape (10, 2, 512), where 10 is the number of IDs and 2 is the number of channels for every ID, let's say ...

Muhammad Ikhwan Perwira

191

asked 22 hours ago

0 votes

0 answers

24 views

Are there any non-transformer LLMs?

Almost all LLMs are based on the transformer architecture, but are there any examples of ones that don't use transformers?

user56510

3

asked 23 hours ago

1 vote

2 answers

54 views

Doubt regarding policy gradient theorem and REINFORCE algorithm

After reading Sutton and Barto, I was able to understand the derivation of this theorem. The only thing I don't get is the following part of the REINFORCE algorithm: How are these terms equivalent and ...

DeadAsDuck

13

asked yesterday

0 votes

0 answers

17 views

Neural Network Library with C++ API and GPU support

In case I need the following functionality: C++ API and GPU support; basic vanilla feed-forward neural network (including deep networks); recurrent neural network support; long short-term memory (LSTM) ...

Damir Tenishev

188

asked yesterday

1 vote

0 answers

34 views

How can we distinguish whether we have created an AGI or an LLM?

Given that language is a powerful tool for learning (indeed, we use language to teach much, if not most, of what humans are taught), how would we be able to decide whether the neural net we taught is ...

kutschkem

111

asked 2 days ago

0 votes

0 answers

7 views

Generation of text describing moving objects in video

How might I generate text messaging from live video describing how objects of significance are moving, left, right, away from me, in or out of a building etc., without using lidar or similar to assess ...

Nicholas Walton

1

asked 2 days ago

0 votes

0 answers

16 views

Why is the loss value in probabilistic layers large at the beginning?

I am trying to build a model with conv-flipout and dense-flipout layers instead of conv2d and dense layers, but when training the model, the loss value first becomes large (around 100) and then decreases slowly, even ...

a-eng

1

asked 2 days ago

0 votes

0 answers

12 views

What is the best strategy to train a model with multi (sub)goals in the same environment?

To explain my question, it is probably better to consider the following example: Let's take an environment where a bridge crane needs to lift a barrel from the position "start" ...

Dave

194

asked 2 days ago

0 votes

0 answers

14 views

Combining the output of two different machine learning models for accurate invoice data extraction: Is this a viable approach?

I am working on a project to extract relevant information from invoices. Currently I am not achieving good accuracy, so I am trying to come up with some new ideas. I am considering ...

rowor

1

asked 2 days ago

0 votes

0 answers

15 views

How to add/embed categorical features in a transformer network?

I would like to give more context to my transformer by adding some metadata related to each token. This metadata is mostly categorical (3 fields, with 3 possible values for each field). In addition ...

JulienG

1

asked Jul 17 at 8:22

1 vote

1 answer

43 views

Why is this RL derivation right?

This comes from the paper "Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review". I don't know why the following derivation is true. The paper only briefly explains ...

yeebo xie

67

asked Jul 17 at 8:08

1 vote

1 answer

12 views

Do ideal probability flow ODE models still generate good-quality data?

I have read the question "Does probability flow ODE trajectory (in the context of diffusion models) represents a bijective mapping between any distribution to a gaussian?" and the given answer. Now I ...

saleh

158

asked Jul 17 at 7:41
