Search for dissertations about: "Apache Spark"
Showing results 1 - 5 of 7 Swedish dissertations containing the words Apache Spark.
-
1. Performance Characterization and Optimization of In-Memory Data Analytics on a Scale-up Server
Abstract : The sheer increase in the volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark defines the state of the art in big data analytics platforms for (i) exploiting data-flow and in-memory computing and (ii) exhibiting superior scale-out performance on commodity machines, little effort has been devoted to understanding the performance of in-memory data analytics with Spark on modern scale-up servers.
-
2. Scalable Analysis of Large Datasets in Life Sciences
Abstract : We are experiencing a deluge of data in all fields of scientific and business research, particularly in the life sciences, due to the development of better instrumentation and the rapid advancements that have occurred in information technology in recent times. There are major challenges when it comes to handling such large amounts of data.
-
3. Performance Characterization of In-Memory Data Analytics on a Scale-up Server
Abstract : The sheer increase in volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark defines the state of the art in big data analytics platforms for (i) exploiting data-flow and in-memory computing and (ii) exhibiting superior scale-out performance on commodity machines, little effort has been devoted to understanding the performance of in-memory data analytics with Spark on modern scale-up servers.
-
4. Enabling Scalable Data Analysis on Cloud Resources with Applications in Life Science
Abstract : Over the past 20 years, the rise of high-throughput methods in life science has enabled research laboratories to produce massive datasets of biological interest. When dealing with this "data deluge" of modern biology, researchers encounter two major challenges: first, there is a need for substantial technical skills for dealing with Big Data; and second, infrastructure procurement becomes difficult.
-
5. Visualizing Cluster Patterns at Scale : A Model and a Library
Abstract : Large quantities of data are being collected and analyzed by companies and institutions, with the aim of extracting knowledge and value. When little is known about the data at hand, analysts engage in exploratory data analysis to achieve a better understanding.