Subject: deep reinforcement learning - OPAC collections catalog

Title:: Deep reinforcement learning overview of the state of the art
Authors:: Fenjiro, Y.
Benbrahim, H.
Links:: https://bibliotekanauki.pl/articles/384788.pdf
Publication date:: 2018
Publisher:: Sieć Badawcza Łukasiewicz - Przemysłowy Instytut Automatyki i Pomiarów
Subjects:: reinforcement learning
deep learning
convolutional network
recurrent network
deep reinforcement learning
Description:: Artificial intelligence has made big steps forward with reinforcement learning (RL) in the last century, and with the advent of deep learning (DL) in the 90s, especially the breakthrough of convolutional networks in the computer vision field. The adoption of DL neural networks in RL, in the first decade of the 21st century, led to an end-to-end framework allowing a great advance in human-level agents and autonomous systems, called deep reinforcement learning (DRL). In this paper, we will go through the development timeline of RL and DL technologies, describing the main improvements made in both fields. Then, we will dive into DRL and have an overview of the state of the art of this new and promising field, by browsing a set of algorithms (Value optimization, Policy optimization and Actor-Critic), then giving an outline of current challenges and real-world applications, along with the hardware and frameworks used. In the end, we will discuss some potential research directions in the field of deep RL, for which we have great expectations that will lead to a real human level of intelligence.
Source:: Journal of Automation Mobile Robotics and Intelligent Systems; 2018, 12, 3; 20-39
1897-8649
2080-2145
Appears in:: Journal of Automation Mobile Robotics and Intelligent Systems
Content provider:: Biblioteka Nauki

Article

Title:: Multi agent deep learning with cooperative communication
Authors:: Simões, David
Lau, Nuno
Reis, Luís Paulo
Links:: https://bibliotekanauki.pl/articles/1837537.pdf
Publication date:: 2020
Publisher:: Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
Subjects:: multi-agent systems
deep reinforcement learning
centralized learning
Description:: We consider the problem of multiple agents cooperating in a partially-observable environment. Agents must learn to coordinate and share relevant information to solve the tasks successfully. This article describes Asynchronous Advantage Actor-Critic with Communication (A3C2), an end-to-end differentiable approach where agents learn policies and communication protocols simultaneously. A3C2 uses a centralized learning, distributed execution paradigm, supports independent agents, dynamic team sizes, partially-observable environments, and noisy communications. We compare and show that A3C2 outperforms other state-of-the-art proposals in multiple environments.
Source:: Journal of Artificial Intelligence and Soft Computing Research; 2020, 10, 3; 189-207
2083-2567
2449-6499
Appears in:: Journal of Artificial Intelligence and Soft Computing Research
Content provider:: Biblioteka Nauki

Article

Title:: Handling realistic noise in multi-agent systems with self-supervised learning and curiosity
Authors:: Szemenyei, Marton
Reizinger, Patrik
Links:: https://bibliotekanauki.pl/articles/2147129.pdf
Publication date:: 2022
Publisher:: Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
Subjects:: deep reinforcement learning
multi-agent environment
autonomous driving
robot soccer
self-supervised learning
Description:: Most reinforcement learning benchmarks – especially in multi-agent tasks – do not go beyond observations with simple noise; nonetheless, real scenarios induce more elaborate vision pipeline failures: false sightings, misclassifications or occlusion. In this work, we propose a lightweight, 2D environment for robot soccer and autonomous driving that can emulate the above discrepancies. Besides establishing a benchmark for accessible multi-agent reinforcement learning research, our work addresses the challenges the simulator imposes. For handling realistic noise, we use self-supervised learning to enhance scene reconstruction and extend curiosity-driven learning to model longer horizons. Our extensive experiments show that the proposed methods achieve state-of-the-art performance, compared against actor-critic methods, ICM, and PPO.
Source:: Journal of Artificial Intelligence and Soft Computing Research; 2022, 12, 2; 135-148
2083-2567
2449-6499
Appears in:: Journal of Artificial Intelligence and Soft Computing Research
Content provider:: Biblioteka Nauki

Article

Title:: An automated driving strategy generating method based on WGAIL–DDPG
Authors:: Zhang, Mingheng
Wan, Xing
Gang, Longhui
Lv, Xinfei
Wu, Zengwen
Liu, Zhaoyang
Links:: https://bibliotekanauki.pl/articles/2055167.pdf
Publication date:: 2021
Publisher:: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects:: automated driving system
deep learning
deep reinforcement learning
imitation learning
deep deterministic policy gradient
driving system
Description:: Reliability, efficiency and generalization are basic evaluation criteria for a vehicle automated driving system. This paper proposes an automated driving decision-making method based on the Wasserstein generative adversarial imitation learning–deep deterministic policy gradient (WGAIL–DDPG(λ)). Here the exact reward function is designed based on the requirements of a vehicle’s driving performance, i.e., safety, dynamic and ride comfort performance. The model’s training efficiency is improved through the proposed imitation learning strategy, and a gain regulator is designed to smooth the transition from imitation to reinforcement phases. Test results show that the proposed decision-making model can generate actions quickly and accurately according to the surrounding environment. Meanwhile, the imitation learning strategy based on expert experience and the gain regulator can effectively improve the training efficiency for the reinforcement learning model. Additionally, an extended test also proves its good adaptability for different driving conditions.
Source:: International Journal of Applied Mathematics and Computer Science; 2021, 31, 3; 461-470
1641-876X
2083-8492
Appears in:: International Journal of Applied Mathematics and Computer Science
Content provider:: Biblioteka Nauki

Article

Title:: Simplification of deep reinforcement learning in traffic control using the Bonsai Platform
Uproszczenie uczenia się przez głębokie wzmocnienie w zarządzaniu ruchem z wykorzystaniem Platformy Bonsai
Authors:: Skuba, Michal
Janota, Aleš
Links:: https://bibliotekanauki.pl/articles/2013058.pdf
Publication date:: 2020-12-31
Publisher:: Uniwersytet Technologiczno-Humanistyczny im. Kazimierza Pułaskiego w Radomiu
Subjects:: control
deep reinforcement learning
model
simulation
traffic
road traffic
Description:: The paper deals with the problem of traffic light control at a road intersection. The authors use a model of a real road junction created in the AnyLogic modelling tool. For two scenarios, three simulation experiments are performed – fixed time control, fixed time control after AnyLogic-based optimizations, and dynamic control obtained through the cooperation of the AnyLogic tool and the Bonsai platform, utilizing the benefits of deep reinforcement learning. At present, there are trends to simplify machine learning processes as much as possible to make them accessible to practitioners with no artificial intelligence background and without the need to become data scientists. Project Bonsai represents an easy-to-use connector that allows the use of AnyLogic models connected to the Bonsai platform - a novel approach to machine learning without the need to set any hyper-parameters. Due to the unavailability of real operational data, the model uses simulation data only, with the presence and movement of vehicles only (no pedestrians). The optimization problem consists in minimizing the average time that agents (vehicles) must spend in the model, passing the modelled intersection. Another observed parameter is the maximum time of individual vehicles spent in the model. The authors share their practical, mainly methodological, experiences with the simulation process and indicate the economic cost needed for training as well.
Source:: Journal of Civil Engineering and Transport; 2020, 2, 4; 191-202
2658-1698
2658-2120
Appears in:: Journal of Civil Engineering and Transport
Content provider:: Biblioteka Nauki

Article

Title:: Three-dimensional path-following control of an autonomous underwater vehicle based on deep reinforcement learning
Authors:: Liang, Zhenyu
Qu, Xingru
Zhang, Zhao
Chen, Cong
Links:: https://bibliotekanauki.pl/articles/32898215.pdf
Publication date:: 2022
Publisher:: Politechnika Gdańska. Wydział Inżynierii Mechanicznej i Okrętownictwa
Subjects:: autonomous underwater vehicle (AUV)
three-dimensional path following
deep reinforcement learning-based control
line-of-sight guidance
controller chattering
Description:: In this article, a deep reinforcement learning based three-dimensional path following control approach is proposed for an underactuated autonomous underwater vehicle (AUV). To be specific, kinematic control laws are employed by using the three-dimensional line-of-sight guidance and dynamic control laws are employed by using the twin delayed deep deterministic policy gradient algorithm (TD3), contributing to the surge velocity, pitch angle and heading angle control of an underactuated AUV. In order to solve the chattering of controllers, the action filter and the punishment function are built respectively, which can make control signals stable. Simulations are carried out to evaluate the performance of the proposed control approach. The results show that the AUV can complete the control mission successfully.
Source:: Polish Maritime Research; 2022, 4; 36-44
1233-2585
Appears in:: Polish Maritime Research
Content provider:: Biblioteka Nauki

Article

Title:: Wykorzystywanie programów uczenia w głębokim uczeniu przez wzmacnianie. O istocie rozpoczynania od rzeczy małych
Using Training Curriculum with Deep Reinforcement Learning. On the Importance of Starting Small
Authors:: Koziarski, Michał
Kwater, Krzysztof
Woźniak, Michał
Links:: https://bibliotekanauki.pl/articles/456567.pdf
Publication date:: 2018
Publisher:: Uniwersytet Rzeszowski
Subjects:: learning process
deep reinforcement learning
transfer learning
lifelong learning
curriculum learning
Description:: Reinforcement learning algorithms are being used to solve problems with an ever-increasing level of complexity. As a consequence, the training process becomes harder and more computationally demanding. Using transfer learning can partially alleviate this issue by taking advantage of previously acquired knowledge. In this paper we propose a novel test environment and experimentally evaluate the impact of using a curriculum with the deep Q-learning algorithm.
Source:: Edukacja-Technika-Informatyka; 2018, 9, 2; 220-226
2080-9069
Appears in:: Edukacja-Technika-Informatyka
Content provider:: Biblioteka Nauki

Article

Title:: A hybrid control strategy for a dynamic scheduling problem in transit networks
Authors:: Liu, Zhongshan
Yu, Bin
Zhang, Li
Wang, Wensi
Links:: https://bibliotekanauki.pl/articles/2172126.pdf
Publication date:: 2022
Publisher:: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects:: service reliability
transit network
proactive control method
deep reinforcement learning
hybrid control strategy
deep learning
hybrid control
Description:: Public transportation is often disrupted by disturbances, such as the uncertain travel time caused by road congestion. Therefore, the operators need to take real-time measures to guarantee the service reliability of transit networks. In this paper, we investigate a dynamic scheduling problem in a transit network, which takes account of the impact of disturbances on bus services. The objective is to minimize the total travel time of passengers in the transit network. A two-layer control method is developed to solve the proposed problem based on a hybrid control strategy. Specifically, relying on conventional strategies (e.g., holding, stop-skipping), the hybrid control strategy makes full use of the idle standby buses at the depot. Standby buses can be dispatched to bus fleets to provide temporary or regular services. Besides, deep reinforcement learning (DRL) is adopted to solve the problem of continuous decision-making. A long short-term memory (LSTM) method is added to the DRL framework to predict the passenger demand in the future, which enables the current decision to adapt to disturbances. The numerical results indicate that the hybrid control strategy can reduce the average headway of the bus fleet and improve the reliability of bus service.
Source:: International Journal of Applied Mathematics and Computer Science; 2022, 32, 4; 553-567
1641-876X
2083-8492
Appears in:: International Journal of Applied Mathematics and Computer Science
Content provider:: Biblioteka Nauki

Article

Information

You are searching for the phrase "deep reinforcement learning" by criterion: Subject
