Algorithms and techniques for efficient data management in the Web

Authors
Νοδαράκης, Νικόλαος
Abstract
Cloud computing refers to the use of computational resources (software and hardware) that are delivered as a unified service over a network such as the Internet. It is becoming increasingly popular for data management and storage applications because it can handle extremely large amounts of data (terabytes or even petabytes). New problems arise daily that require efficient and scalable solutions for monitoring, processing, and storing big data. Among the most popular and efficient tools are key-value stores, which allow the storage of unstructured data, and large-scale distributed processing systems such as MapReduce. In this thesis, we focus on proposing techniques for computationally intensive problems. Many centralized approaches have been developed for these problems, but they stop being effective when the data size grows exponentially: they either fail to handle the problem or need an excessive amount of time to complete. There is therefore a clear need to turn to distributed, highly scalable solutions that run on a cluster of computers.
Keywords
Cloud computing, Distributed algorithms, Big data, Machine learning, Hadoop, MapReduce, Spark