Department of Computer Engineering and Informatics (Master's Theses)

Recent Submissions

  • Item
    Open Access
    Study and application of GAN algorithms: the contribution of game theory and the role of machine learning
    (2022-11-08) Σολωμού, Πολύμνια; Solomou, Polymnia
    Generative Adversarial Networks (hereafter "GANs" for brevity) are an innovative framework that combines deep machine learning and neural networks. The model finds interesting applications today, but because it was developed relatively recently, several challenges remain. The greatest of these is the training process; consequently, alternative algorithms, techniques, and even architectures that take different approaches from the original model have been developed to resolve as many of these problems as possible. This thesis first reviews earlier research up to the development of the original GAN model, which was based on ideas from game theory. It then presents the original GAN model, concluding with several problems that arise during its training and hinder the whole process. It is therefore important to set out, immediately afterwards, suitable metrics for evaluating the model's performance. Next, later versions of the traditional GAN model and their new algorithmic techniques, which promised to solve or mitigate some of the problems identified above or to cover different functional needs, are described and compared. The theoretical part of the thesis closes with some useful applications of GAN models in various sectors of modern society. Finally, in the practical part of the thesis, a CycleGAN is implemented: a GAN model specialized in mapping images from one category to another without requiring a dataset of ready-made paired examples.
    During the training of the CycleGAN and the execution of the application, not only are some of the concepts discussed at the theoretical level explained in practice, but real data are also fed in as input, processed, and obtained as output. In this way, useful conclusions about GAN technology can ultimately be drawn.
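For orientation, the game-theoretic idea mentioned in the abstract above is usually written as a two-player minimax objective, and the CycleGAN adds a cycle-consistency term for unpaired image translation. The block below gives the standard formulations from the literature as a sketch. The notation is an assumption made here and is not taken from the thesis: G and D are the generator and discriminator in the first line, while in the second line G maps domain X to Y and F maps Y back to X.

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]

\mathcal{L}_{\mathrm{cyc}}(G, F) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\lVert F(G(x)) - x \rVert_1\big]
  + \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\big[\lVert G(F(y)) - y \rVert_1\big]
```

The cycle term is what removes the need for paired examples: an image translated to the other domain and back should return close to where it started.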
  • Item
    Open Access
    Generating 3D indoor environments with consistent styles
    (2022-11) Πασχαλίδης, Γεώργιος; Paschalidis, Georgios
    In this Master's thesis, we address the scene generation task, focusing on indoor scene synthesis. Existing approaches pose scene generation as a layout creation problem: the task is to populate a scene with a set of labelled bounding boxes that correspond to furniture pieces. In particular, these methods typically seek to learn a probability distribution over the attributes that define each object, such as its shape, category, orientation, and position in the scene. During generation, the generated bounding boxes are replaced with 3D objects retrieved from a library of assets based on criteria such as size and object category. Because the object retrieval process is independent of the layout generation, there is no guarantee that the generated objects will be coherent in style and appearance. To this end, we propose a novel generative model for indoor scenes that takes the per-object style into consideration during the generation process. We believe that this is a crucial step towards generating realistic environments. In particular, we build on top of ATISS [18], which is the state-of-the-art indoor scene generation pipeline. Specifically, we extend its capabilities by incorporating a style prediction module. Furthermore, we propose a novel retrieval procedure that, instead of relying only on size to replace bounding boxes with 3D models, takes the per-object style into account. Our experimental evaluation shows that our model consistently generates stylistically meaningful scenes (e.g., the nightstands next to a bed or the chairs around a table should have a similar appearance), while performing on par with ATISS with respect to scene generation quality. Finally, we introduce various metrics that can be used to evaluate the style coherence of the generated scenes.
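The retrieval step described in the abstract above can be pictured with a minimal sketch. Everything in it is hypothetical: the `ASSETS` library, the two-dimensional style embeddings, the `retrieve` function, and the additive cost with `style_weight` are illustrative assumptions, not the procedure or data used in the thesis or in ATISS.

```python
import math

# Hypothetical asset library. Each entry has a category, a bounding-box size
# (width, height, depth) and a style embedding; all values are made up.
ASSETS = [
    {"id": "chair_a", "category": "chair", "size": (0.5, 0.9, 0.5), "style": (1.0, 0.0)},
    {"id": "chair_b", "category": "chair", "size": (0.5, 0.9, 0.5), "style": (0.0, 1.0)},
    {"id": "table_a", "category": "table", "size": (1.6, 0.8, 0.9), "style": (1.0, 0.0)},
]


def retrieve(category, size, style, assets=ASSETS, style_weight=1.0):
    """Pick the asset of the requested category with the lowest combined cost.

    The cost is the Euclidean size distance plus a weighted Euclidean style
    distance (an assumed scoring rule). With style_weight=0 this reduces to
    size-only retrieval.
    """
    candidates = [a for a in assets if a["category"] == category]
    if not candidates:
        return None

    def cost(asset):
        size_term = math.dist(asset["size"], size)
        style_term = math.dist(asset["style"], style)
        return size_term + style_weight * style_term

    return min(candidates, key=cost)


# Two chairs of identical size are told apart only by the predicted style.
chosen = retrieve("chair", (0.5, 0.9, 0.5), (0.1, 0.9))
print(chosen["id"])
```

With size-only retrieval the two chairs are indistinguishable, which is the incoherence the abstract points to; adding a style term breaks the tie in favour of the asset matching the predicted style.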
  • Item
    Open Access
    Reconstructing surfaces with appearance using neural implicit 3D representations
    (2022-11-15) Πασχαλίδης, Παναγιώτης; Paschalidis, Panagiotis
    Parsing complex 3D scenes into compact low-dimensional representations has been a long-standing goal in Computer Vision that could tremendously benefit various downstream applications such as scene understanding and reconstruction. Based on the output of the final representation, existing methods can be categorized into explicit and implicit methods. In particular, implicit approaches have recently gained popularity due to their simple yet efficient parameterization. The primary goal of these works (such as OccNet, SRN, and Neural Volumes) is to create implicit representations by associating 3D points with the pertinent scenes. Although these techniques yield promising results, they struggle to represent complex scenes. Neural Radiance Fields (NeRF) were the breakthrough in this direction. Mapping a 3D spatial location to scene geometry and appearance has turned out to be the simplest and most effective method since then. Not only did it surpass previous methods in fidelity and accuracy, but it was also able to encode complex scenes. Neural Radiance Fields, as originally proposed by Mildenhall et al., are limited by single-scene overfitting and poor time efficiency. Since the mapping is performed by training an MLP for one specific scene, many subsequent methods set out to address this problem. GRF and PixelNeRF, for example, exploit image features to learn scene priors that allow multi-scene training. Although NeRF produced state-of-the-art results in the novel view synthesis task, it lacked accuracy in 3D reconstruction. Recent works such as UniSURF show that accurate surface reconstructions can be produced by combining surface and volumetric rendering. In our work, we combine PixelNeRF with UniSURF by applying accurate surface extraction methods to multi-scene 3D implicit representations.
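The volumetric rendering mentioned in the abstract above can be illustrated with the alpha-compositing rule commonly used for radiance fields: each sample along a ray contributes its color weighted by its opacity and by the transmittance accumulated in front of it. The sketch below assumes that standard discretization; the function names and sample values are hypothetical and the code is not taken from the thesis. Occupancy-based variants can be read, under the same assumption, as supplying the per-sample alphas directly instead of deriving them from densities.

```python
import math


def alphas_from_density(sigmas, deltas):
    """Convert densities and sample spacings to opacities: 1 - exp(-sigma * delta)."""
    return [1.0 - math.exp(-s * d) for s, d in zip(sigmas, deltas)]


def compositing_weights(alphas):
    """Weight of sample i is alpha_i times the product of (1 - alpha_j) for j < i."""
    weights = []
    transmittance = 1.0  # fraction of light not yet absorbed along the ray
    for a in alphas:
        weights.append(transmittance * a)
        transmittance *= 1.0 - a
    return weights


def render_color(alphas, colors):
    """Blend per-sample RGB colors with the compositing weights."""
    w = compositing_weights(alphas)
    return tuple(sum(wi * c[k] for wi, c in zip(w, colors)) for k in range(3))


# A fully opaque second sample hides everything behind it.
print(compositing_weights([0.0, 1.0, 0.5]))
```

When one sample is fully opaque, all the weight collapses onto it, which is the limit in which volumetric rendering behaves like rendering a single surface point.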
  • Item
    Open Access
    A nonnegative least squares solver for multiple right-hand sides for approximating the nonnegative matrix factorization
    (2022-11-04) Κολώνιας, Λεωνίδας; Kolonias, Leonidas
    Nonnegative Least Squares (NNLS) problems, where the variables are restricted to nonnegative values, arise in many applications and are also at the core of most approaches to solving the nonnegative matrix factorization (NMF), a low-rank matrix approximation problem with nonnegativity constraints. NMF is a data analysis technique used in a great variety of applications such as text mining, image processing, hyperspectral data analysis, computational biology, and clustering. In more detail, the nonnegative factors can be interpreted as data, e.g., as images described by pixel intensities or texts represented by vectors of word counts. The mathematical formulation of NMF is a non-convex optimization problem, and various types of algorithms have been devised to solve it. The first goal of this thesis is to propose a new efficient, yet simple to implement, approach to solving nonnegative linear least squares problems for multiple right-hand sides. More precisely, we study and use properties of global algorithms for least squares problems, which are then combined with rules that enforce nonnegativity and lead to novel techniques for solving the aforementioned problem by a flexible Krylov subspace method. Comparisons with state-of-the-art algorithms, using datasets and examples from real-life applications as well as artificially generated ones, show that the proposed algorithm behaves satisfactorily and in some cases outperforms existing ones in computational speed and accuracy. Our second goal is to study NMF extensively, along with its properties and applications, and to examine the existing algorithms and methodologies used to approximate a solution for it. Moreover, using our new approach, tuned to solve large-scale nonnegative least squares problems for multiple right-hand sides, we present a novel algorithm for NMF based on the alternating nonnegative least squares (ANLS) framework.
Extensive experiments on document clustering, images and synthetic datasets indicate the effectiveness of our approach.
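To make the structure of the ANLS framework in the abstract above concrete, here is a small pure-Python sketch. It is explicitly not the thesis's method: the inner solver is plain projected gradient descent, swapped in for the flexible Krylov subspace solver because it fits in a few lines. The names `nnls_multi_rhs` and `nmf_anls`, the iteration counts, and the random initialization are all assumptions made for illustration.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def nnls_multi_rhs(A, B, iters=500):
    """Approximately solve min ||A X - B||_F subject to X >= 0.

    All columns of B (the right-hand sides) are updated together.
    Projected gradient is used here only as a simple stand-in solver.
    """
    At = transpose(A)
    AtA = matmul(At, A)
    AtB = matmul(At, B)
    # trace(A^T A) bounds the largest eigenvalue of A^T A, so 1/trace is a safe step.
    trace = sum(AtA[i][i] for i in range(len(AtA)))
    step = 1.0 / trace if trace > 0 else 0.0
    n, k = len(AtA), len(B[0])
    X = [[0.0] * k for _ in range(n)]
    for _ in range(iters):
        G = matmul(AtA, X)  # the gradient is (A^T A) X - A^T B
        X = [[max(0.0, X[i][j] - step * (G[i][j] - AtB[i][j])) for j in range(k)]
             for i in range(n)]
    return X


def nmf_anls(V, rank, outer_iters=20, seed=0):
    """Alternate two NNLS solves so that V is approximated by W H with W, H >= 0."""
    rng = random.Random(seed)
    W = [[rng.random() for _ in range(rank)] for _ in range(len(V))]
    H = []
    for _ in range(outer_iters):
        H = nnls_multi_rhs(W, V)  # fix W, solve for H
        # Fix H and solve the transposed problem min ||H^T W^T - V^T||_F for W.
        W = transpose(nnls_multi_rhs(transpose(H), transpose(V)))
    return W, H


V = [[1.0, 2.0], [2.0, 4.0]]  # a rank-1 nonnegative matrix
W, H = nmf_anls(V, rank=1)
print(matmul(W, H))
```

Each outer iteration is two NNLS problems with many right-hand sides (the columns of V, then the rows of V), which is why a fast multiple right-hand-side solver matters for this framework.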
  • Item
    Embargo
    Study and optimization of digital filters transition
    (2022) Παναγιωτοπούλου, Σταυρούλα; Panagiotopoulou, Stavroula
    This thesis focuses on digital filter transition, a problem encountered in audio applications such as speech coding, audio signal equalization, and music synthesis. In these cases, filter switching can cause undesirable sound effects (clicks, plops, gaps, etc.). The first part of the thesis presents an extensive survey of existing transition techniques, covering a large part of the current literature. After selecting widely known methods from this review, we implemented and tested the respective algorithms in the MATLAB environment. To assess the implementation results, objective evaluation is performed through certain commonly accepted models and metrics. Based on this evaluation, the optimal transition algorithm is identified and implemented in real time on a digital signal processor (DSP).
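As a rough illustration of the filter transition problem in the abstract above, the sketch below shows one simple and widely known remedy, output crossfading: both filters run in parallel and their outputs are blended linearly over a short window instead of switching abruptly. This is an assumption chosen for illustration; the abstract does not say which techniques the thesis surveys or selects, and the one-pole filter, function names, and parameter values are hypothetical.

```python
def one_pole_lowpass(signal, a):
    """Filter with y[n] = (1 - a) * x[n] + a * y[n-1], for 0 <= a < 1."""
    out, prev = [], 0.0
    for x in signal:
        prev = (1.0 - a) * x + a * prev
        out.append(prev)
    return out


def crossfade_transition(signal, a_old, a_new, start, length):
    """Blend from the old filter's output to the new one over `length` samples.

    Before `start` the output equals the old filter; from `start + length`
    onwards it equals the new filter. `length` must be positive.
    """
    old = one_pole_lowpass(signal, a_old)
    new = one_pole_lowpass(signal, a_new)
    out = []
    for n, (y_old, y_new) in enumerate(zip(old, new)):
        gain = min(1.0, max(0.0, (n - start) / length))  # ramps from 0 to 1
        out.append((1.0 - gain) * y_old + gain * y_new)
    return out


step_input = [1.0] * 50
smooth = crossfade_transition(step_input, a_old=0.9, a_new=0.1, start=10, length=20)
print(smooth[9], smooth[20], smooth[40])
```

A hard switch would jump from one filter's output to the other's in a single sample; the ramp spreads that difference over the whole window, at the price of running two filters at once during the transition.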