Subject: reinforcement learning - OPAC catalog of holdings

Title:: Handling realistic noise in multi-agent systems with self-supervised learning and curiosity
Authors:: Szemenyei, Marton
Reizinger, Patrik
Links:: https://bibliotekanauki.pl/articles/2147129.pdf
Publication date:: 2022
Publisher:: Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
Subjects:: deep reinforcement learning
multi-agent environment
autonomous driving
robot soccer
self-supervised learning
Description:: Most reinforcement learning benchmarks, especially in multi-agent tasks, do not go beyond observations with simple noise; real scenarios, however, induce more elaborate vision pipeline failures: false sightings, misclassifications, or occlusion. In this work, we propose a lightweight 2D environment for robot soccer and autonomous driving that can emulate these discrepancies. Besides establishing a benchmark for accessible multi-agent reinforcement learning research, our work addresses the challenges the simulator imposes. To handle realistic noise, we use self-supervised learning to enhance scene reconstruction and extend curiosity-driven learning to model longer horizons. Our extensive experiments show that the proposed methods achieve state-of-the-art performance compared against actor-critic methods, ICM, and PPO.
Source:: Journal of Artificial Intelligence and Soft Computing Research; 2022, 12, 2; 135-148
2083-2567
2449-6499
Appears in:: Journal of Artificial Intelligence and Soft Computing Research
Content provider:: Biblioteka Nauki

Article

Title:: Stabilizer design of PSS3B based on the KH algorithm and Q-Learning for damping of low frequency oscillations in a single-machine power system
Authors:: Mohamadi, Farshid
Sedaghati, Alireza
Links:: https://bibliotekanauki.pl/articles/41190034.pdf
Publication date:: 2023
Publisher:: Politechnika Warszawska, Instytut Techniki Cieplnej
Subjects:: 3-band power system stabilizer
reinforcement learning
Q-learning
power system
Description:: The aim of this study is to use reinforcement learning to generate a complementary signal that enhances the performance of the power system stabilizer. Reinforcement learning is an important branch of machine learning within artificial intelligence and a general approach to solving Markov Decision Process (MDP) problems. In this paper, a reinforcement learning-based control method, Q-learning, is presented and used to improve the performance of a 3-Band Power System Stabilizer (PSS3B) in a single-machine power system. To this end, we first set the parameters of the 3-band power system stabilizer by optimizing an eigenvalue-based objective function using the new KH optimization algorithm, and then improve its efficiency in real time using the proposed reinforcement learning algorithm based on Q-learning. A fundamental feature of the proposed reinforcement learning-based stabilizer is its simplicity and its independence from the system model and from changes in the operating point. To evaluate the efficiency of the proposed reinforcement learning-based 3-band power system stabilizer, its results are compared with those of the conventional power system stabilizer and of the 3-band power system stabilizer designed using the KH algorithm at different operating points. The simulation results, based on the performance indicators, show that the power system stabilizer proposed in this study outperforms the two other methods in terms of reduced settling time and damping of low-frequency oscillations.
Source:: Journal of Power Technologies; 2023, 103, 4; 230-242
1425-1353
Appears in:: Journal of Power Technologies
Content provider:: Biblioteka Nauki

Article

Title:: Markov Decision Process based Model for Performance Analysis an Intrusion Detection System in IoT Networks
Authors:: Kalnoor, Gauri
Gowrishankar, -
Links:: https://bibliotekanauki.pl/articles/1839336.pdf
Publication date:: 2021
Publisher:: Instytut Łączności - Państwowy Instytut Badawczy
Subjects:: DDoS
intrusion detection
IoT
machine learning
Markov decision process
MDP
Q-learning
NSL-KDD
reinforcement learning
Description:: In this paper, a new reinforcement learning-based intrusion detection system (RL-IDS) is developed for IoT networks integrated with wireless sensor networks (WSNs). The proposed RL-IDS model is evaluated and its results are plotted, showing an improved detection rate. The outcome also shows a decrease in false alarm rates compared with current methodologies. A computational analysis is performed, and the results are compared with current methodologies for distributed denial of service (DDoS) attacks. The performance of the network is estimated based on security and other metrics.
Source:: Journal of Telecommunications and Information Technology; 2021, 3; 42-49
1509-4553
1899-8852
Appears in:: Journal of Telecommunications and Information Technology
Content provider:: Biblioteka Nauki

Article

Title:: A strategy learning model for autonomous agents based on classification
Authors:: Śnieżyński, B.
Links:: https://bibliotekanauki.pl/articles/330672.pdf
Publication date:: 2015
Publisher:: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects:: autonomous agents
strategy learning
supervised learning
classification
reinforcement learning
Description:: In this paper we propose a strategy learning model for autonomous agents based on classification. In the literature, the most commonly used learning method in agent-based systems is reinforcement learning. In our opinion, classification can be considered a good alternative. This type of supervised learning can be used to generate a classifier that allows the agent to choose an appropriate action for execution. Experimental results show that this model can be successfully applied to strategy generation even if rewards are delayed. We compare the efficiency of the proposed model and reinforcement learning using the farmer-pest domain and configurations of various complexity. In complex environments, supervised learning can improve the performance of agents much faster than reinforcement learning. If an appropriate knowledge representation is used, the learned knowledge can be analyzed by humans, which allows the learning process to be tracked.
Source:: International Journal of Applied Mathematics and Computer Science; 2015, 25, 3; 471-482
1641-876X
2083-8492
Appears in:: International Journal of Applied Mathematics and Computer Science
Content provider:: Biblioteka Nauki

Article

Title:: Optimal control of dynamic systems using a new adjoining cell mapping method with reinforcement learning
Authors:: Arribas-Navarro, T.
Prieto, S.
Plaza, M.
Links:: https://bibliotekanauki.pl/articles/205725.pdf
Publication date:: 2015
Publisher:: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects:: optimal control
cells mapping
state space
reinforcement learning
stability
nonlinear control
controllability
Description:: This work aims to improve and simplify the procedure used in the Control Adjoining Cell Mapping with Reinforcement Learning (CACM-RL) technique for tuning an optimal controller during the pre-learning stage (controller design), easing the transition from a simulation environment to the real world. Common problems encountered when working with CACM-RL are the adjustment of the cell size and the long-term evolution error. The main goal of the new approach proposed in this work (CACM-RL*) is to address both problems, helping engineers define the control solution in terms of accuracy and stability criteria instead of cell sizes. The new approach improves the mathematical analysis techniques and reduces the engineering effort during the design phase. To demonstrate the behaviour of CACM-RL*, three examples are described that show its application to real problems. In all the examples, CACM-RL* improves on the considered alternatives. In some cases, CACM-RL* improves the average controllability by up to 100%.
Source:: Control and Cybernetics; 2015, 44, 3; 369-387
0324-8569
Appears in:: Control and Cybernetics
Content provider:: Biblioteka Nauki

Article

Title:: Epoch-incremental reinforcement learning algorithms
Authors:: Zajdel, R.
Links:: https://bibliotekanauki.pl/articles/330530.pdf
Publication date:: 2013
Publisher:: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects:: reinforcement learning
epoch incremental algorithm
grid world
incremental algorithm
Description:: In this article, a new class of epoch-incremental reinforcement learning algorithms is proposed. In the incremental mode, the fundamental TD(0) or TD(λ) algorithm is performed and an environment model is created. In the epoch mode, the distances of past-active states to the terminal state are computed on the basis of the environment model. These distances and the terminal-state reinforcement signal are used to improve the agent policy.
Source:: International Journal of Applied Mathematics and Computer Science; 2013, 23, 3; 623-635
1641-876X
2083-8492
Appears in:: International Journal of Applied Mathematics and Computer Science
Content provider:: Biblioteka Nauki

Article

Title:: Uczenie ze wzmocnieniem regulatora Takagi-Sugeno metodą elementów ASE/ACE
Reinforcement learning with use of neuronlike elements ASE/ACE of Takagi-Sugeno controller
Authors:: Zajdel, R.
Links:: https://bibliotekanauki.pl/articles/156302.pdf
Publication date:: 2005
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: fuzzy controller
reinforcement learning
inverted pendulum
Description:: This paper proposes an adaptation of the reinforcement learning algorithm based on ASE/ACE elements for learning the rule consequents of a Takagi-Sugeno fuzzy logic controller. The solution is applied to the control of the cart-pole (inverted pendulum on a cart) system and verified by computer simulations. Comparative experiments with the classical network of neuron-like ASE/ACE elements were also carried out. The advantages and disadvantages of both approaches (fuzzy and classical) are demonstrated.
Source:: Pomiary Automatyka Kontrola; 2005, Vol. 51, No. 1, 1; 47-49
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article

Title:: Self-learning controller of active magnetic bearing based on CARLA method
Samouczący się sterownik aktywnego łożyska magnetycznego oparty na metodzie CARLA
Authors:: Brezina, T.
Turek, M.
Pulchart, J.
Links:: https://bibliotekanauki.pl/articles/152983.pdf
Publication date:: 2007
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: active magnetic bearing control
continuous action reinforcement learning automata
Description:: This contribution describes active magnetic bearing control by an analytically designed linear PD regulator with parallel nonlinear compensation represented by an automatic approximator. The compensation coefficient (parameter) values are determined by the actions of Continuous Action Reinforcement Learning Automata (CARLA). The influence of the CARLA parameters on learning is discussed and demonstrated in a simulation study. It is shown that learning can be improved by selecting appropriate learning parameters.
Source:: Pomiary Automatyka Kontrola; 2007, Vol. 53, No. 1, 1; 6-9
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article

Title:: Reinforcement Learning in Discrete and Continuous Domains Applied to Ship Trajectory Generation
Authors:: Rak, A.
Gierusz, W.
Links:: https://bibliotekanauki.pl/articles/259073.pdf
Publication date:: 2012
Publisher:: Politechnika Gdańska. Wydział Inżynierii Mechanicznej i Okrętownictwa
Subjects:: ship motion control
trajectory generation
autonomous navigation
reinforcement learning
least-squares policy iteration
Description:: This paper presents the application of reinforcement learning algorithms to the task of autonomously determining the ship trajectory during in-harbour and harbour-approach manoeuvres. The authors use the Markov decision process formalism as the background for presenting the algorithms. Two versions of RL algorithms were tested in simulations: a discrete one (Q-learning) and a continuous one (Least-Squares Policy Iteration, LSPI). The results show that the ship trajectory can be found in both cases. However, the discrete Q-learning algorithm suffered from many limitations (mainly the curse of dimensionality) and is not practically applicable to the examined task. LSPI, on the other hand, gave promising results. To be fully operational, the proposed solution should be extended to take into account ship heading and velocity and coupled with an advanced multi-variable controller.
Source:: Polish Maritime Research; 2012, S 1; 31-36
1233-2585
Appears in:: Polish Maritime Research
Content provider:: Biblioteka Nauki

Article

Title:: Sumienie maszyny? Sztuczna inteligencja i problem odpowiedzialności moralnej
The Conscience of a Machine? Artificial Intelligence and the Problem of Moral Responsibility
Authors:: Wieczorek, Krzysztof Tomasz
Jędrzejko, Paweł
Links:: https://bibliotekanauki.pl/articles/1912551.pdf
Publication date:: 2021-09-03
Publisher:: Wydawnictwo Uniwersytetu Śląskiego
Subjects:: artificial intelligence
ethics
reinforcement learning
decision-making autonomy
Description:: The ever-accelerating progress in the area of smart technologies gives rise to new ethical challenges, which humankind will sooner or later have to face. An inevitable component of this progress is the growing autonomy of the decision-making processes carried out by machines and systems functioning without direct human control. At least some of these decisions will generate conflicts and moral dilemmas. It is therefore worth reflecting today on the measures needed to endow future autonomous, self-learning and self-replicating entities, equipped with artificial intelligence and capable of independent operation in a wide variety of external conditions, with a specific kind of ethical intelligence. At the core of the problem, which both the designers and the users of entities endowed with artificial intelligence must eventually face, lies the question of how to attain the optimal balance between the goals, needs and interests of both sides of the human-non-human interaction. This is because, as the autonomy of machines expands, the anthropocentric model of ethics no longer suffices. It is therefore necessary to develop a new, extended and modified model of ethics: one that anticipates and encompasses the thus far non-existent area of equal relations between the human and the machine. The present article addresses some aspects of this issue.
Source:: ER(R)GO: Teoria – Literatura – Kultura; 2021, 42; 15-34
1508-6305
2544-3186
Appears in:: ER(R)GO: Teoria – Literatura – Kultura
Content provider:: Biblioteka Nauki

Article

Title:: An active exploration method for data efficient reinforcement learning
Authors:: Zhao, Dongfang
Liu, Jiafeng
Wu, Rui
Cheng, Dansong
Tang, Xianglong
Links:: https://bibliotekanauki.pl/articles/331205.pdf
Publication date:: 2019
Publisher:: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects:: reinforcement learning
information entropy
PILCO
data efficiency
Description:: Reinforcement learning (RL) constitutes an effective method of controlling dynamic systems without prior knowledge. One of the most important and difficult problems in RL is the improvement of data efficiency. Probabilistic inference for learning control (PILCO) is a state-of-the-art data-efficient framework that uses a Gaussian process to model dynamic systems. However, it only focuses on optimizing cumulative rewards and does not consider the accuracy of the dynamic model, which is an important factor for controller learning. To further improve the data efficiency of PILCO, we propose an active exploration version (AEPILCO) that utilizes information entropy to describe samples. In the policy evaluation stage, we incorporate an information entropy criterion into long-term sample prediction. Through the informative policy evaluation function, our algorithm obtains informative policy parameters in the policy improvement stage. Using these policy parameters in the actual execution produces an informative sample set, which helps in learning an accurate dynamic model. Thus, the AEPILCO algorithm improves data efficiency by actively selecting informative samples based on the information entropy criterion and thereby learning an accurate dynamic model. We demonstrate the validity and efficiency of the proposed algorithm on several challenging control problems involving a cart pole, a pendubot, a double pendulum, and a cart double pendulum. The AEPILCO algorithm can learn a controller using fewer trials than PILCO. This is verified through theoretical analysis and experimental results.
Source:: International Journal of Applied Mathematics and Computer Science; 2019, 29, 2; 351-362
1641-876X
2083-8492
Appears in:: International Journal of Applied Mathematics and Computer Science
Content provider:: Biblioteka Nauki

Article

Title:: Utilization of Deep Reinforcement Learning for Discrete Resource Allocation Problem in Project Management – a Simulation Experiment
Wykorzystanie uczenia ze wzmocnieniem w problemach dyskretnej alokacji zasobów w zarządzaniu projektami – eksperyment symulacyjny
Authors:: Wójcik, Filip
Links:: https://bibliotekanauki.pl/articles/2179629.pdf
Publication date:: 2022
Publisher:: Wydawnictwo Uniwersytetu Ekonomicznego we Wrocławiu
Subjects:: reinforcement learning (RL)
operations research
management
optimisation
Description:: This paper tests the applicability of deep reinforcement learning (DRL) algorithms to simulated problems of constrained discrete and online resource allocation in project management. DRL is an extensively researched method in various domains, although no similar case study was found when writing this paper. The hypothesis was that a carefully tuned RL agent could outperform an optimisation-based solution. Three RL agents (VPG, AC, and PPO) were compared against a classic constrained optimisation algorithm in three trials: "easy", "moderate", and "hard" (70%, 50%, and 30% average project success rate, respectively). Each trial consisted of 500 independent, stochastic simulations. The significance of the differences was checked using a Welch ANOVA at a significance level of alpha = 0.01, followed by post hoc comparisons with false-discovery control. The experiment revealed that the PPO agent performed significantly better in the moderate and hard simulations than the optimisation approach and the other RL methods.
Source:: Informatyka Ekonomiczna; 2022, 1; 56-74
1507-3858
Appears in:: Informatyka Ekonomiczna
Content provider:: Biblioteka Nauki

Article

Title:: O doborze reguł sterowania dla regulatora rozmytego
About collecting of control for a fuzzy logic controller
Authors:: Wiktorowicz, K.
Zajdel, R.
Links:: https://bibliotekanauki.pl/articles/156306.pdf
Publication date:: 2005
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: fuzzy control
neural networks
reinforcement learning
stability
quality
Description:: This paper characterises the problem of selecting control rules for a fuzzy logic controller. Two methods of generating rules using a neural network are described: supervised learning and reinforcement learning. The problem of analysing the stability and quality of the designed system is presented. The considerations are illustrated by example results.
Source:: Pomiary Automatyka Kontrola; 2005, Vol. 51, No. 1, 1; 44-46
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article

Title:: Motor Control: Neural Models and Systems Theory
Authors:: Doya, K.
Kimura, H.
Miyamura, A.
Links:: https://bibliotekanauki.pl/articles/908323.pdf
Publication date:: 2001
Publisher:: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects:: adaptive control system
inverse model
adaptive control
cerebellum
reinforcement learning
basal ganglia
multiple models
Description:: In this paper, we introduce several system-theoretic problems brought forward by recent studies on neural models of motor control. We focus on three topics: (i) the cerebellum and adaptive control, (ii) reinforcement learning and the basal ganglia, and (iii) modular control with multiple models. We discuss these subjects from both neuroscience and systems theory viewpoints, with the aim of promoting interplay between the two research communities.
Source:: International Journal of Applied Mathematics and Computer Science; 2001, 11, 1; 77-104
1641-876X
2083-8492
Appears in:: International Journal of Applied Mathematics and Computer Science
Content provider:: Biblioteka Nauki

Article

Title:: Some improvements in the reinforcement learning of a mobile robot
Uczenie ze wzmocnieniem robotów mobilnych - propozycje usprawnień
Authors:: Pluciński, M.
Links:: https://bibliotekanauki.pl/articles/153411.pdf
Publication date:: 2010
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: reinforcement learning
RBF neural networks
probabilistic RBF neural network
mobile robot
Description:: The paper presents the application of reinforcement learning to finding a movement strategy for an autonomous mobile robot in an unknown, stationary environment. The robot's task is to reach a given, known target point by the shortest path and without colliding with obstacles. The robot's state is its position in a fixed coordinate system tied to the environment, and the action is a commanded direction of movement. The robot's movement policy is defined indirectly by a value function, which is represented by a probabilistic RBF neural network. Networks of this type are easy to train, and their parameters allow a convenient interpretation of the mapping they implement. Because the learning process is very slow in general and practically impossible in complicated environments, several improvements are proposed. The experiments described use negative reinforcements generated by obstacles, heuristic hints that suggest appropriate behaviour to the robot in "difficult" situations, and gradual learning. The study found that combining the last two techniques gave the best learning results.
Source:: Pomiary Automatyka Kontrola; 2010, Vol. 56, No. 12, 12; 1470-1473
0032-4140
Appears in:: Pomiary Automatyka Kontrola
Content provider:: Biblioteka Nauki

Article