11 Open Source Assets found

  • OpenStack Sahara Spark Plugin

    Run Spark jobs as Analytics-as-a-Service within OpenStack. Integration into the main OpenStack platform is in progress.
  • Schedsim

    Schedsim is a simulator for evaluating how errors in job-size estimates affect size-based scheduling of big-data workloads.
  • Hadoop Log Tools

    A set of easy-to-use, extensible tools for analysing Hadoop logs. The project has two directories: Hadoop that...
  • OSMeF OpenStack Measurement Framework

    The OpenStack Measurement Framework is used to measure the impact of virtualization in a number of different scenarios, mostly centered around the traffic patterns generated by...
  • Spark-Kmeans

    This project implements k-means for multi-dimensional clustering. Specifically, it will focus on email data (from Symantec), and we will customize k-means for...
  • SWIM -- Statistical Workload Injector for MapReduce

    SWIM was developed by the AMPLab at UC Berkeley; this is our fork, with several fixes.
  • Decision Trees for Apache Spark

    Decision trees implemented for Apache Spark.
  • MumakNext

    Patches and extensions to Mumak, the Hadoop MapReduce simulator.
  • Optimized Rollup for MapReduce

    Alternative implementations of ROLLUP for Hadoop MapReduce.
  • Hadoop with Suspension

    A patched version of Hadoop that supports suspend/resume primitives. A FakeScheduler plugin allows workloads to be scheduled by hand.
  • HFSP

    The Hadoop Fair Sojourn Protocol Scheduler is a size-based scheduler for Hadoop. To address the problem of scheduling jobs characterized by a complex structure in a...