Search for dissertations about: "Apache Spark"

Showing results 1 - 5 of 7 Swedish dissertations containing the words Apache Spark.

  1. Performance Characterization and Optimization of In-Memory Data Analytics on a Scale-up Server

    Author : Ahsan Javed Awan; Eduard Ayguade; Mats Brorsson; Vladimir Vlassov; Lieven Eeckhout; KTH
    Keywords : TEKNIK OCH TEKNOLOGIER; ENGINEERING AND TECHNOLOGY; Workload Characterization; Big Data Analytics; Multicore Performance; Apache Spark; Near Data Processing; NUMA; Hyperthreading; Prefetchers; Coherently attached accelerators; Informations- och kommunikationsteknik; Information and Communication Technology;

    Abstract : The sheer increase in the volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark defines the state of the art in big data analytics platforms for (i) exploiting data-flow and in-memory computing and (ii) for exhibiting superior scale-out performance on the commodity machines, little effort has been devoted to understanding the performance of in-memory data analytics with Spark on modern scale-up servers.

  2. Scalable Analysis of Large Datasets in Life Sciences

    Author : Laeeq Ahmed; Erwin Laure; Ola Spjuth; Ake Edlund; Vincent Breton; KTH
    Keywords : NATURVETENSKAP; NATURAL SCIENCES; Big Data; Apache Spark; Virtual Screening; EEG; Cloud Computing; Life Sciences; Machine Learning; Computer Science; Datalogi;

    Abstract : We are experiencing a deluge of data in all fields of scientific and business research, particularly in the life sciences, due to the development of better instrumentation and the rapid advancements that have occurred in information technology in recent times. There are major challenges when it comes to handling such large amounts of data.

  3. Performance Characterization of In-Memory Data Analytics on a Scale-up Server

    Author : Ahsan Javed Awan; Mats Brorsson; Vladimir Vlassov; Eduard Ayguade; Boris Grot; KTH
    Keywords : TEKNIK OCH TEKNOLOGIER; ENGINEERING AND TECHNOLOGY; Informations- och kommunikationsteknik; Information and Communication Technology;

    Abstract : The sheer increase in the volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark defines the state of the art in big data analytics platforms for (i) exploiting data-flow and in-memory computing and (ii) for exhibiting superior scale-out performance on the commodity machines, little effort has been devoted to understanding the performance of in-memory data analytics with Spark on modern scale-up servers.

  4. Enabling Scalable Data Analysis on Cloud Resources with Applications in Life Science

    Author : Marco Capuccini; Ola Spjuth; Johan Tordsson; Uppsala universitet
    Keywords : NATURVETENSKAP; NATURAL SCIENCES; cloud computing; bioinformatics; Big Data; microservices; containers; MapReduce; Scientific Computing; Beräkningsvetenskap;

    Abstract : Over the past 20 years, the rise of high-throughput methods in life science has enabled research laboratories to produce massive datasets of biological interest. When dealing with this "data deluge" of modern biology, researchers encounter two major challenges: first, there is a need for substantial technical skills for dealing with Big Data; and second, infrastructure procurement becomes difficult.

  5. Visualizing Cluster Patterns at Scale : A Model and a Library

    Author : Elio Ventocilla; Maria Riveiro; Göran Falkman; Rafael M. Martins; Katerina Vrostou; Högskolan i Skövde
    Keywords : TEKNIK OCH TEKNOLOGIER; ENGINEERING AND TECHNOLOGY; Visual analytics; cluster patterns; big data; unsupervised learning; multidimensional projections; vector quantization; progressive visual analytics; Skövde Artificial Intelligence Lab SAIL;

    Abstract : Large quantities of data are being collected and analyzed by companies and institutions, with the aim of extracting knowledge and value. When little is known about the data at hand, analysts engage in exploratory data analysis to achieve a better understanding.