- Tytuł:
- Data locality in Hadoop
- Autorzy:
-
Kałużka, J.
Napieralska, M.
Romero, O.
Jovanovic, P. - Powiązania:
- https://bibliotekanauki.pl/articles/397706.pdf
- Data publikacji:
- 2017
- Wydawca:
- Politechnika Łódzka. Wydział Mikroelektroniki i Informatyki
- Tematy:
-
distributed file system
big data
Apache Hadoop
HDFS
rozproszony system plików - Opis:
- The Apache Hadoop framework is an answer to the market tendencies regarding the need for storing and processing rapidly growing amounts of data, providing a fault-tolerant distributed storage and data processing. Dealing with large volumes of data, Hadoop, and its storage system HDFS (Hadoop Distributed File System), face challenges to keep the high efficiency with computing in a reasonable time. The typical Hadoop implementation transfers computation to the data. However, in the isolated configuration, namenode (playing the role of a master in the cluster) still favours the closer nodes. Basically it means that before the whole task has run, significant delays can be caused by moving single blocks of data closer to the starting datanode. Currently, a Hadoop user does not have influence how the data is distributed across the cluster. This paper presents an innovative functionality to the Hadoop Distributed File System (HDFS) that enables moving data blocks on request within the cluster. Data can be shifted either by a user running the proper HDFS shell command or programmatically by other modules, like an appropriate scheduler.
- Źródło:
-
International Journal of Microelectronics and Computer Science; 2017, 8, 1; 16-20
2080-8755
2353-9607 - Pojawia się w:
- International Journal of Microelectronics and Computer Science
- Dostawca treści:
- Biblioteka Nauki