Cloud Computing · Data Structures and Algorithms · Distributed Algorithms and Communication Protocols

Apache Beam+Apache Flink/Spark for Batch&Stream Processing

When it comes to stream processing, the Open Source community provides an entire ecosystem to tackle a set of generic problems. Among the emerging Apache projects, Beam provides a clean programming model intended to run on top of a runtime such as Flink, Spark, or Google Cloud Dataflow. A really convenient declarative… Continue reading Apache Beam+Apache Flink/Spark for Batch&Stream Processing

Applied Concurrency · Architectures and Design Patterns · Cloud Computing · Core Development · Data Structures and Algorithms · Distributed Algorithms and Communication Protocols · Distributed Computing · News from the Web · Operating Systems · OS Kernel · Performance, Throughput, Real-time and Other · Real-time and Other

Advancements in Data-Intensive Distributed Systems Engineering

Call for Papers! The ADIDSE IARIA Special Track aims to tackle the problems and discuss the advancements in Data-Intensive Distributed Systems Engineering with the community of engineers and scientists. Come and join us… The advent of the IoT (Internet of Things) is forecast to bring 40 billion connected devices by 2020. Such… Continue reading Advancements in Data-Intensive Distributed Systems Engineering

Architectures and Design Patterns · Cloud Computing · Data Structures and Algorithms · Distributed Computing · linux · Operating Systems · OS Kernel · Performance · Performance, Throughput, Real-time and Other · Software Engineering · Throughput

Scaling to Thousands of Threads

Knot and Haboob are both network servers. The difference is the concurrency model: Knot is thread-based, whereas Haboob is event-based [9]. Clearly, from the benchmark results, the poll()/epoll() mechanism is a serious bottleneck as soon as the number of active concurrent clients becomes significant (in the specific case, at 16384 clients the thrashing… Continue reading Scaling to Thousands of Threads

Applied Concurrency · Architectures and Design Patterns · Core Development · Data Structures and Algorithms · Distributed Computing · News from the Web · Performance · Performance, Throughput, Real-time and Other · Throughput

Scalable I/O: Events- Vs Multithreading-based

Everything begins with a refresher reading of my fundamental papers (yes, I use a set of papers and books as reference material). This paper is titled “Why Events Are A Bad Idea (for high-concurrency servers)”, by Rob von Behren, at the time of writing a PhD fellow at Berkeley [18]. Von Behren opens with: “Event-based… Continue reading Scalable I/O: Events- Vs Multithreading-based

Cloud Computing · Core Development · Data Structures and Algorithms · Distributed Algorithms and Communication Protocols · Distributed Computing · News from the Web · NoSQL · Performance · Programming Languages · Real-time and Other

FOSDEM 2016 – Day 2 Log

Intro to FOSDEM: FOSDEM (Free and Open Source Software Developers’ European Meeting) is the European Open Source conference oriented to engineers, gathering the Open Source communities on a university campus and managed by volunteers. It is an intensive two-day conference that, simply, enlightens… Main Sponsors: Red Hat, Google, Oracle, Cisco, Mozilla, Trivago, Bloomberg, GitHub, O’Reilly… Continue reading FOSDEM 2016 – Day 2 Log

Cloud Computing · Core Development · Data Structures and Algorithms · Distributed Algorithms and Communication Protocols · Distributed Computing · News from the Web · NoSQL · Performance · Real-time and Other · Software Engineering

FOSDEM 2016 -Day 1 Log

Intro to FOSDEM: FOSDEM (Free and Open Source Software Developers’ European Meeting) is the European Open Source conference oriented to engineers, gathering the Open Source communities on a university campus and managed by volunteers. It is an intensive two-day conference that, simply, enlightens… Main Sponsors: Red Hat, Google, Oracle, Cisco, Mozilla, Trivago, Bloomberg, GitHub, O’Reilly… Continue reading FOSDEM 2016 -Day 1 Log

Data Structures and Algorithms · Programming Languages · Software Engineering

Coding: Reversing Unordered Single Linked List using 2 Pointers

Puzzle: Given an unsorted singly linked list, provide an algorithm to reverse the list using only 2 pointers. Input: a singly linked list. Example: 1 -> 4 -> 3 -> 2 -> 0. Output: the reversed singly linked list. Example: 0 -> 2 -> 3 -> 4 -> 1. Solution: Using Java as… Continue reading Coding: Reversing Unordered Single Linked List using 2 Pointers
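The post's own solution is cut off in this excerpt, so what follows is only a minimal Java sketch of one common reading of the puzzle. It assumes that "2 pointers" means two extra references (`prev` and `next`) besides the incoming head reference. The names `ReverseList`, `Node`, `of` and `render` are hypothetical helpers, not taken from the original post.

```java
// Minimal sketch under the assumptions stated above; not the post's own code.
final class ReverseList {
    // Hypothetical node type for a singly linked list of ints.
    static final class Node {
        final int value;
        Node next;
        Node(int value, Node next) { this.value = value; this.next = next; }
    }

    // Reverses the list in place and returns the new head.
    // Besides the head parameter, only two references are used: prev and next.
    static Node reverse(Node head) {
        Node prev = null;
        while (head != null) {
            Node next = head.next; // remember the rest of the list
            head.next = prev;      // point the current node backwards
            prev = head;           // the current node becomes the reversed prefix
            head = next;           // move on to the remaining nodes
        }
        return prev;
    }

    // Helper: builds a list from the given values, in order.
    static Node of(int... values) {
        Node head = null;
        for (int i = values.length - 1; i >= 0; i--) {
            head = new Node(values[i], head);
        }
        return head;
    }

    // Helper: renders a list as "a -> b -> c".
    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(n.value);
        }
        return sb.toString();
    }
}
```

With the example from the puzzle, `ReverseList.render(ReverseList.reverse(ReverseList.of(1, 4, 3, 2, 0)))` yields `0 -> 2 -> 3 -> 4 -> 1`. The loop visits each node once and allocates nothing, so it runs in linear time and constant extra space.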

Data Structures and Algorithms · Software Engineering

Coding: Flattening Nested Arrays

Puzzle: Given an array of integers with several levels of nesting, provide an algorithm to flatten the input array. Input: a nested array of arrays. Example: [[1,2,[3]],4]. Output: a flat array. Example: [1,2,3,4]. Solution: using Java as the programming language. Gist with runnable code and tests. Discussion: To get the job done, the solution adopts… Continue reading Coding: Flattening Nested Arrays
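The approach the post adopts is elided in this excerpt and the linked gist is not reproduced here. As an illustration only, a recursive depth-first flattening could be sketched in Java as below. It assumes the nested input is modelled as `Object[]` whose elements are either `Integer` or further `Object[]`. The class name `Flatten` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch under the assumptions stated above; not the post's own code.
final class Flatten {
    // Returns the integers of the nested array in left-to-right order.
    static List<Integer> flatten(Object[] nested) {
        List<Integer> out = new ArrayList<>();
        collect(nested, out);
        return out;
    }

    // Depth-first walk: recurse into sub-arrays, append integers as they appear.
    private static void collect(Object[] nested, List<Integer> out) {
        for (Object item : nested) {
            if (item instanceof Object[]) {
                collect((Object[]) item, out);
            } else if (item instanceof Integer) {
                out.add((Integer) item);
            } else {
                // Anything else (including null) is rejected in this sketch.
                throw new IllegalArgumentException("Unsupported element: " + item);
            }
        }
    }
}
```

The puzzle's example `[[1,2,[3]],4]` can be written as `new Object[]{ new Object[]{1, 2, new Object[]{3}}, 4 }`, and `Flatten.flatten` returns the list `[1, 2, 3, 4]` for it. Recursion depth equals the nesting depth, so very deeply nested inputs would call for an explicit stack instead.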

Applied Math · Data Structures and Algorithms

Hashing Explained

Demystifying Hashing: Hashing is a broadly studied topic among mathematicians; in fact, hash functions are attractive mainly for their many applications in modern Computer Science. Hashing is very often a source of confusion. People confuse hashing with base changes (e.g. from Base-10 to Base-32 or Base-8); others confuse hashing with Random Number… Continue reading Hashing Explained
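To make the distinction between hashing and base changes concrete, here is a small Java sketch, offered as an illustration rather than as the post's own material. A base change is a reversible re-encoding of the same number, whereas a hash function maps many inputs to one output and cannot be inverted in general. The class `HashVsBase` and its method names are hypothetical; `Integer.toString(int, int)`, `Integer.parseInt(String, int)` and `String.hashCode()` are standard JDK calls.

```java
// Minimal sketch under the assumptions stated above.
final class HashVsBase {
    // Base change: re-encodes n in the given radix without losing information.
    static String toBase(int n, int radix) {
        return Integer.toString(n, radix);
    }

    // The inverse of toBase: recovers the original number exactly.
    static int fromBase(String digits, int radix) {
        return Integer.parseInt(digits, radix);
    }

    // True when two different strings share the same String.hashCode() value.
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }
}
```

For example, `HashVsBase.toBase(255, 8)` gives `"377"` and `HashVsBase.fromBase("377", 8)` gives back `255`, while `HashVsBase.collide("Aa", "BB")` is `true` because both strings hash to `2112` under Java's `String.hashCode()`. The original input cannot be recovered from that value alone.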

Applied Math · Data Structures and Algorithms · Distributed Computing · News from the Web

Sketch of the Day: HyperLogLog — Cornerstone of a Big Data Infrastructure

Intro In the Zipfian world of AK, the HyperLogLog distinct value (DV) sketch reigns supreme. This DV sketch is the workhorse behind the majority of our DV counters (and we’re not alone) and enables us to have a real time, in memory data store with incredibly high throughput. HLL was conceived of by Flajolet et.…