Subject: policy iteration - OPAC collections catalogue

Item 1.

Title:: Improving modified policy iteration for probabilistic model checking
Authors:: Mohagheghi, Mohammadsadegh
Karimpour, Jaber
Isazadeh, Ayaz
Links:: https://bibliotekanauki.pl/articles/27312850.pdf
Publication date:: 2022
Publisher:: Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
Subjects:: probabilistic model checking
Markov decision processes
modified policy iteration
probabilistic reachability
Description:: Along with their modified versions, value iteration and policy iteration are well-known algorithms for the probabilistic model checking of Markov decision processes. One challenge with these methods is that they are time-consuming in most cases. Several techniques have been proposed to improve the performance of iterative methods for probabilistic model checking; however, the running times of these techniques depend on the graphical structure of the utilized model. In some cases, their performance can be worse than the performance of standard methods. In this paper, we propose two new heuristics for accelerating the modified policy iteration method. We first define a criterion for the usefulness of the computations of each iteration of this method. The first contribution of our work is to develop and use a criterion to reduce the number of iterations in modified policy iteration. As the second contribution, we propose a new approach for identifying useless updates in each iteration. This method reduces the running time of the computations by avoiding the useless updates of states. The proposed heuristics have been implemented in the PRISM model checker and applied to several standard case studies. We compare the running time of our heuristics with the running times of previous standard and improved methods. Our experimental results show that our techniques yield a significant speed-up.
Source:: Computer Science; 2022, 23 (1); 63-80
1508-2806
2300-7036
Appears in:: Computer Science
Content provider:: Biblioteka Nauki

Article

Item 2.

Title:: Semi-Markov control models with average costs
Authors:: Luque-Vásquez, Fernando
Hernández-Lerma, Onésimo
Links:: https://bibliotekanauki.pl/articles/1338792.pdf
Publication date:: 1999
Publisher:: Polska Akademia Nauk. Instytut Matematyczny PAN
Subjects:: average cost
replacement models
semi-Markov control models
policy iteration (or Howard's algorithm)
Description:: This paper studies semi-Markov control models with Borel state and control spaces, and unbounded cost functions, under the average cost criterion. Conditions are given for (i) the existence of a solution to the average cost optimality equation, and for (ii) the existence of strong optimal control policies. These conditions are illustrated with a semi-Markov replacement model.
Source:: Applicationes Mathematicae; 1999, 26, 3; 315-331
1233-7234
Appears in:: Applicationes Mathematicae
Content provider:: Biblioteka Nauki

Article

Item 3.

Title:: Reinforcement Learning in Discrete and Continuous Domains Applied to Ship Trajectory Generation
Authors:: Rak, A.
Gierusz, W.
Links:: https://bibliotekanauki.pl/articles/259073.pdf
Publication date:: 2012
Publisher:: Politechnika Gdańska. Wydział Inżynierii Mechanicznej i Okrętownictwa
Subjects:: ship motion control
trajectory generation
autonomous navigation
reinforcement learning
least-squares policy iteration
Description:: This paper presents the application of reinforcement learning algorithms to the autonomous determination of a ship's trajectory during in-harbour and harbour-approach manoeuvres. The authors use the Markov decision process formalism as the background for presenting the algorithms. Two versions of RL algorithms were tested in simulations: a discrete one (Q-learning) and a continuous one (Least-Squares Policy Iteration, LSPI). The results show that the ship trajectory can be found in both cases. However, the discrete Q-learning algorithm suffered from many limitations (mainly the curse of dimensionality) and is not applicable in practice to the examined task. LSPI, on the other hand, gave promising results. To be fully operational, the proposed solution should be extended to take into account ship heading and velocity and coupled with an advanced multi-variable controller.
Source:: Polish Maritime Research; 2012, S 1; 31-36
1233-2585
Appears in:: Polish Maritime Research
Content provider:: Biblioteka Nauki

Article

Item 4.

Title:: Optimal maintenance of a series production system with two multi-component subsystems and an intermediate buffer
Optymalna strategia utrzymania ruchu dla seryjnego systemu produkcji złożonego z dwóch podsystemów wieloskładnikowych oraz buforu pośredniego
Authors:: Zhou, Y.
Zhang, Z.
Links:: https://bibliotekanauki.pl/articles/301663.pdf
Publication date:: 2015
Publisher:: Polska Akademia Nauk. Polskie Naukowo-Techniczne Towarzystwo Eksploatacyjne PAN
Subjects:: series-parallel systems
intermediate buffers
Markov decision process
policy iteration
generalized minimum residual method
Description:: Intermediate buffers often exist in practical production systems to reduce the influence of the breakdown and maintenance of subsystems on system production. At the same time, the effects of intermediate buffers also make the degradation process of the system more difficult to model. Some existing papers investigate the performance evaluation and maintenance optimisation of a production system with intermediate buffers under a predetermined maintenance strategy structure. However, only a few papers pay attention to the properties of the optimal maintenance strategy structure. This paper develops a method based on the Markov decision process to identify the optimal maintenance strategy for a series-parallel system with two multi-component subsystems and an intermediate buffer. The structure of the obtained optimal maintenance strategy is analysed, which shows that the optimal strategy structure cannot be modelled by a limited number of parameters. However, some useful properties of the strategy structure are obtained, which can simplify the maintenance optimisation. Another interesting finding is that a large buffer capacity does not always bring about high average revenue, even though the cost of holding an item in the buffer is much smaller than the production revenue per item.
Source:: Eksploatacja i Niezawodność; 2015, 17, 2; 314-325
1507-2711
Appears in:: Eksploatacja i Niezawodność
Content provider:: Biblioteka Nauki

Article

Information

Searching for the phrase "policy iteration" by criterion: Subject
