|Monographs and Edited Volumes||2|
|Total New Publications||15|
Date: 15 January 2021
Presenter: Zoe Kotti
We introduce the region-based convolutional neural networks (R-CNN) family of machine learning models, which are widely used in computer vision for object detection. In particular, we focus on the R-FCN model, a region-based, fully convolutional network for accurate and efficient object detection. In contrast to previous region-based detectors such as Fast/Faster R-CNN, which apply a costly per-region subnetwork hundreds of times, R-FCN is fully convolutional, with almost all computation shared across the entire image. To achieve this goal, position-sensitive score maps are proposed to address a dilemma between translation-invariance in image classification and translation-variance in object detection. This method can thus naturally adopt fully convolutional image classifier backbones, such as the latest Residual Networks (ResNets), for object detection. The authors of this work show competitive results on the PASCAL VOC datasets (e.g., 83.6% mAP on the 2007 set) with the 101-layer ResNet. Meanwhile, the result is achieved at a test-time speed of 170ms per image, 2.5-20x faster than the Faster R-CNN counterpart. Code is made publicly available at: https://github.com/daijifeng001/r-fcn.
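The idea behind position-sensitive score maps can be illustrated with a minimal pure-Python sketch. It is not the R-FCN implementation: the score maps here are hand-built by a hypothetical helper (`toy_maps`) rather than produced by a convolutional layer, there is a single class and no background class, and the region of interest (RoI) is assumed to divide evenly into k x k bins. Each bin of the RoI averages values only from its own dedicated map, and the bin responses are then averaged as a vote.

```python
def mean_region(grid, x0, y0, x1, y1):
    """Average of grid[y][x] over the half-open box [x0, x1) x [y0, y1)."""
    values = [grid[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(values) / len(values)


def ps_roi_pool(score_maps, roi, k):
    """Return one score per class for an RoI given as (x0, y0, x1, y1).

    score_maps[c][by][bx] is the 2-D map (list of rows) that class c
    dedicates to bin (by, bx). Assumes the RoI size is divisible by k.
    """
    x0, y0, x1, y1 = roi
    bin_w = (x1 - x0) // k
    bin_h = (y1 - y0) // k
    scores = []
    for class_maps in score_maps:
        bin_scores = []
        for by in range(k):
            for bx in range(k):
                # Bin (by, bx) reads only from its own score map.
                bin_scores.append(mean_region(
                    class_maps[by][bx],
                    x0 + bx * bin_w, y0 + by * bin_h,
                    x0 + (bx + 1) * bin_w, y0 + (by + 1) * bin_h))
        # Voting: average the k*k bin responses.
        scores.append(sum(bin_scores) / len(bin_scores))
    return scores


def toy_maps(size, obj, k):
    """Hypothetical helper: map (by, bx) fires on part (by, bx) of the object."""
    ox0, oy0, ox1, oy1 = obj
    w, h = (ox1 - ox0) // k, (oy1 - oy0) // k
    return [[[[1.0 if (ox0 + bx * w <= x < ox0 + (bx + 1) * w
                       and oy0 + by * h <= y < oy0 + (by + 1) * h) else 0.0
               for x in range(size)]
              for y in range(size)]
             for bx in range(k)]
            for by in range(k)]


# One class, a 6x6 image, and an object occupying the box (0, 0, 4, 4).
score_maps = [toy_maps(6, (0, 0, 4, 4), k=2)]
aligned = ps_roi_pool(score_maps, (0, 0, 4, 4), k=2)   # RoI on the object
shifted = ps_roi_pool(score_maps, (1, 1, 5, 5), k=2)   # RoI off by one pixel
missed = ps_roi_pool(score_maps, (2, 2, 6, 6), k=2)    # RoI off by two pixels
```

In this toy setting the score drops from 1.0 to 0.25 to 0.0 as the RoI drifts away from the object, which is the translation-variance that detection needs, while the maps themselves are computed once for the whole image.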
Date: 22 January 2021
Presenter: Diomidis Spinellis
Redesigning the peer review of software engineering studies can improve the process's fairness, inclusiveness, transparency, and effectiveness. A group led by Paul Ralph, in which your presenter participates, is developing empirical standards under the auspices of ACM SIGSOFT. Today, when a scholar peer reviews a manuscript, they must simultaneously generate and apply a set of evaluation criteria. The problem is that generating an appropriate rubric for judging research quality is mind-numbingly difficult. So reviewers tend to generate incomplete, oversimplified and inappropriate criteria. It’s not because reviewers are stupid; it’s because no one person can generate good rubrics for all of the different kinds of studies SE researchers review. Furthermore, the reviewers, the editor, and the authors may construct totally different rubrics, and these rubrics may depart wildly from published methodological guidance, or the norms of their scientific community. Most frustration with the peer review process comes from authors and reviewers disagreeing on what makes a study good. The power imbalance between authors and reviewers makes correcting some reviewers impossible. We need to change the process so that reviewers all use the right criteria in the first place. The solution is for the software engineering community to decide together what “good” means. For each common methodology, we create a one-page checklist of specific expectations. To prevent reviewers from using the standards in an inflexible, gotcha-like manner, each criterion is paired with a simple decision tree. When a reviewer indicates that a criterion is not satisfied, they will be explicitly asked whether there is a good reason. Providing the same, specific criteria to authors and reviewers will improve research quality, simplify reviewing, reduce conflict and increase acceptance rates.
The presentation will introduce the taxonomy of provided empirical standards, overview the general standard, which applies to all reviews, and focus on one particular standard as an example.
Date: 01 February 2021
Presenter: Stefanos Georgiou
Energy efficiency for computer systems is an ever-growing matter that has caught the attention of the software engineering community. Although hardware design and utilization are undoubtedly key factors affecting energy consumption, there is solid evidence that software design can also significantly alter the energy consumption of IT products. Therefore, the goal of this dissertation is to show the impact of software design decisions on the energy consumption of a computer system.
Initially, we analyzed 92 research papers from top-tier conferences and categorized them under the Software Development Life Cycle taxonomy. From this study, we were able to find many research challenges. Among these challenges, we identified that there is limited work in the context of different programming languages’ energy and delay implications.
To this end, we performed an empirical study and pointed out which programming languages can introduce better energy and run-time performance for specific programming tasks and computer platforms (i.e., server, laptop, and embedded system). Motivated further by our survey results, we performed an additional study on different programming languages and computer platforms to demonstrate the energy and delay implications of various inter-process communication technologies (i.e., REST, RPC, gRPC).
From the above studies, we were able to introduce guidelines on reducing the energy consumption of different applications by suggesting which programming languages to utilise in specific cases. Finally, we performed experiments to examine the energy and run-time performance tax that security measures impose across 128 distinct benchmark suites. By investigating the impact of CPU-related vulnerabilities (Meltdown, Spectre, and MDS), communication-related security measures (HTTP/HTTPS), memory protection (memory zeroing), and compiler safeguards (GCC), we have found that these measures can impact the energy and run-time performance of real-world applications (Nginx, Apache, Redis) by up to 20%.
Date: 05 March 2021
Presenter: Thodoris Sotiropoulos
Over the past decade, there has been huge interest in compiler testing, which has led to the disclosure of thousands of bugs in well-established and widely used compilers. Despite this tremendous success, current research endeavors have mainly focused on detecting frustrating compiler crashes and subtle miscompilations caused by bugs in the implementation of compiler optimizations. However, in statically-typed languages, the frontend of a compiler is equally important, as it is the component that decides whether the input program is correct or not. In modern programming languages with sophisticated type system features, the implementation of the frontend is much more complex, and therefore type system-related bugs are quite frequent. Bugs in the frontend implementation can break the soundness of the type system, lead to the rejection of correct programs, or make the compiler produce misleading reports and warnings.
We present a study of bugs found in compiler frontends. Specifically, we examine frontend bugs reported in the top JVM programming languages, namely, Java, Scala, Kotlin, and Groovy. We evaluate each bug in terms of several criteria, including their symptom, root cause, characteristics of the test case that triggers the bug, and finally we propose a categorization. We believe that this work opens up a new direction in compiler testing, which is currently overlooked.
Date: 01 April 2021
Presenter: Kaiti Thoma and Konstantina Dritsa
The appearance of large text corpora, covering extensive time periods, has allowed researchers to investigate qualitatively the change in the semantics of words over time. We will present the main methods, starting with an introduction to the underlying technologies that have been developed over the last decades, and concluding with some highlights of results from the literature.
Date: 08 April 2021
Presenter: Vitalis Salis
Call graphs play an important role in different contexts, such as profiling and vulnerability propagation analysis. Generating call graphs in an efficient manner can be a challenging task when it comes to high-level languages that are modular and incorporate dynamic features and higher-order functions.
Despite the language's popularity, there have been very few tools aiming to generate call graphs for Python programs. Worse, these tools suffer from several effectiveness issues that limit their practicality in realistic programs. We propose a pragmatic, static approach for call graph generation in Python. We compute all assignment relations between program identifiers of functions, variables, classes, and modules through an inter-procedural analysis. Based on these assignment relations, we produce the resulting call graph by resolving all calls to potentially invoked functions. Notably, the underlying analysis is designed to be efficient and scalable, handling several Python features, such as modules, generators, function closures, and multiple inheritance.
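The idea of resolving calls through assignment relations can be sketched with Python's standard `ast` module. This is a toy illustration, not PyCG's analysis: the function name `toy_call_graph` is hypothetical, only plain `name = other_name` assignments and calls to bare names are handled, and scopes, classes, modules, and control flow are ignored.

```python
import ast


def toy_call_graph(source):
    """Map each function name to the set of functions it may call."""
    tree = ast.parse(source)
    functions = {node.name for node in ast.walk(tree)
                 if isinstance(node, ast.FunctionDef)}

    # Assignment relations: identifier -> identifiers it may be assigned from.
    points_to = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Name):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    points_to.setdefault(target.id, set()).add(node.value.id)

    def resolve(name, seen=()):
        """Follow assignment relations until function definitions are reached."""
        if name in functions:
            return {name}
        if name in seen:  # guard against cyclic assignments such as a = b; b = a
            return set()
        resolved = set()
        for source_name in points_to.get(name, ()):
            resolved |= resolve(source_name, seen + (name,))
        return resolved

    graph = {}
    for func in ast.walk(tree):
        if isinstance(func, ast.FunctionDef):
            callees = set()
            for node in ast.walk(func):
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                    callees |= resolve(node.func.id)
            graph[func.name] = callees
    return graph


SOURCE = '''
def greet():
    return "hi"

def shout():
    return "HI"

speak = greet
speak = shout
alias = speak

def main():
    alias()
'''

graph = toy_call_graph(SOURCE)
```

Because the sketch ignores statement order, `main` is reported as possibly calling both `greet` and `shout`, even though only `shout` is reachable through `alias` at run time. An over-approximation of this kind costs precision but avoids missing calls.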
We have evaluated our prototype implementation, which we call PyCG, using two benchmarks: a micro-benchmark suite containing small Python programs and a set of macro-benchmarks with several popular real-world Python packages. Our results indicate that PyCG can efficiently handle thousands of lines of code in less than a second (0.38 seconds for 1k LoC on average). Further, it outperforms the state-of-the-art for Python in both precision and recall: PyCG achieves a high precision of ~99.2% and an adequate recall of ~69.9%. Finally, we demonstrate how PyCG can aid dependency impact analysis by showcasing a potential enhancement to GitHub's "security advisory" notification service using a real-world example.
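For readers unfamiliar with the two metrics, the following sketch shows one common way to compute precision and recall for a call graph, by comparing reported call edges against a ground-truth set. The edge-based definition and the example edges are assumptions for illustration; the abstract does not specify how the evaluation was carried out.

```python
def precision_recall(reported, expected):
    """Edge-based precision and recall for a call graph.

    reported: edges (caller, callee) produced by a tool.
    expected: ground-truth edges.
    """
    reported, expected = set(reported), set(expected)
    true_positives = len(reported & expected)
    # Precision: share of reported edges that are correct.
    precision = true_positives / len(reported) if reported else 1.0
    # Recall: share of real edges that were reported.
    recall = true_positives / len(expected) if expected else 1.0
    return precision, recall


# Hypothetical example: two of three reported edges are real,
# and two of four real edges are found.
expected_edges = {("main", "a"), ("main", "b"), ("a", "c"), ("b", "c")}
reported_edges = {("main", "a"), ("a", "c"), ("a", "d")}
precision, recall = precision_recall(reported_edges, expected_edges)
```

High precision with lower recall, as in the figures above, means that almost every reported edge is real while some real edges go unreported.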
Date: 22 April 2021
Presenter: Thodoris Sotiropoulos
We introduce what is, to the best of our knowledge, the first approach for systematically testing Object-Relational Mapping (ORM) systems. Our approach leverages differential testing to establish a test oracle for ORM-specific bugs. Specifically, we first generate random relational database schemas, set up the respective databases, and then query these databases using the APIs of the ORM systems under test. To tackle the challenge that ORMs lack a common input language, we generate queries written in an abstract query language. These abstract queries are translated into concrete, executable ORM queries, which are ultimately used to differentially test the correctness of target implementations. The effectiveness of our method heavily relies on the data inserted into the underlying databases. Therefore, we employ a solver-based approach for producing targeted database records with respect to the constraints of the generated queries. We implement our approach as a tool, called CYNTHIA, which found 28 bugs in five popular ORM systems. The vast majority of these bugs are confirmed (25 / 28), more than half were fixed (20 / 28), and three were marked as release blockers by the corresponding developers.
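The differential-testing loop can be sketched in a few lines of Python. This is not CYNTHIA: the abstract query format (a dictionary with a filter and an ordering column), the fixed `person` table, and the three "backends" are hypothetical stand-ins for real ORMs. One backend translates the abstract query to SQL for `sqlite3`, one evaluates it in plain Python, and one contains a deliberately seeded bug.

```python
import operator
import sqlite3

ROWS = [(1, "ada", 36), (2, "bob", 17), (3, "cy", 52)]
COLUMNS = {"id": 0, "name": 1, "age": 2}
OPS = {">": operator.gt, "<=": operator.le}


def run_sql_backend(query):
    """Translate the abstract query to SQL and run it on an in-memory database."""
    column, op, value = query["filter"]
    if column not in COLUMNS or op not in OPS or query["order_by"] not in COLUMNS:
        raise ValueError("unsupported abstract query")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER, name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO person VALUES (?, ?, ?)", ROWS)
    sql = (f"SELECT name FROM person WHERE {column} {op} ? "
           f"ORDER BY {query['order_by']}")
    names = [row[0] for row in conn.execute(sql, (value,))]
    conn.close()
    return names


def run_python_backend(query, ops=OPS):
    """Evaluate the same abstract query directly over the rows."""
    column, op, value = query["filter"]
    kept = [row for row in ROWS if ops[op](row[COLUMNS[column]], value)]
    kept.sort(key=lambda row: row[COLUMNS[query["order_by"]]])
    return [row[1] for row in kept]


def run_buggy_backend(query):
    """Seeded bug: '>' is mistranslated as '>='."""
    return run_python_backend(query, ops={">": operator.ge, "<=": operator.le})


def differential_test(query, backends):
    """Run the query on every backend; agreement acts as the test oracle."""
    results = {name: run(query) for name, run in backends.items()}
    agree = len({tuple(names) for names in results.values()}) == 1
    return agree, results


backends = {"sql": run_sql_backend, "python": run_python_backend,
            "buggy": run_buggy_backend}
boundary_query = {"filter": ("age", ">", 36), "order_by": "name"}
loose_query = {"filter": ("age", ">", 30), "order_by": "name"}

agree_boundary, results_boundary = differential_test(boundary_query, backends)
agree_loose, _ = differential_test(loose_query, backends)
```

The seeded bug is exposed only by the query whose threshold coincides with a stored record (age 36); with a threshold of 30 all three backends agree. This illustrates why the stored data matters so much, and why generating records targeted at the query constraints is useful.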
Date: 20 May 2021
Presenter: Vasiliki Efstathiou
Shipping has been the driving force of global trade for centuries. Today, it remains the major means of cargo transportation with almost 90% of the world’s goods estimated to be carried by sea. At the same time, shipping generates an enormous footprint of data that can unlock new possibilities for the maritime industry.
MarineTraffic is currently the world’s leading platform offering ship tracking services and actionable maritime intelligence. Research at MarineTraffic is a paradigm of an applied research initiative, aiming to bring tangible outcomes to the market. This talk will present the lab’s efforts towards building systems for situational awareness at sea globally, demonstrating cases where the need for maritime intelligence is evident. The presentation will focus on ways of harnessing earth observation, ship tracking and behavioural data and will outline challenges and research opportunities in the journey to maritime digitalisation.
Date: 03 June 2021
Presenter: Damianos Chatziantoniou
A Data Virtual Machine (DVM) is a novel graph-based conceptual model, similar to the entity-relationship model, representing existing data (persistent, transient, derived) of an organization. A DVM can be built quickly and in an agile manner, offering schematic flexibility to data engineers. Data scientists can visually define complex dataframe queries in an intuitive and simple manner, which are evaluated within an algebraic framework. A DVM can be easily materialized in any logical data model and can be “reoriented” around any node, offering a “single view of any entity”. In this paper we demonstrate DataMingler, a tool implementing DVMs. We argue that DVMs can have a significant practical impact in analytics environments.
Date: 17 June 2021
Presenter: Rahul Gopinath, Hamed Nemati, Andreas Zeller
Grammar-based test generators are highly efficient in producing syntactically valid test inputs, and give their user precise control over which test inputs should be generated. Adapting a grammar or a test generator towards a particular testing goal can be tedious, though. We introduce the concept of a grammar transformer, specializing a grammar towards inclusion or exclusion of specific patterns: “The phone number must not start with 011 or +1”. To the best of our knowledge, ours is the first approach to allow for arbitrary Boolean combinations of patterns, giving testers unprecedented flexibility in creating targeted software tests. The resulting specialized grammars can be used with any grammar-based fuzzer for targeted test generation, but also as validators to check whether the given specialization is met or not, opening up additional usage scenarios. In our evaluation on real-world bugs, we show that specialized grammars are accurate both in producing and validating targeted inputs.
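To make the phone-number example concrete, here is a small Python sketch of a grammar-based generator combined with Boolean pattern constraints. It does not implement the grammar transformation described in the talk: instead of rewriting the grammar, it uses rejection sampling (generate, then filter) as a simpler stand-in. The toy grammar and all helper names are assumptions for illustration.

```python
import random

# Hypothetical toy grammar for phone numbers.
GRAMMAR = {
    "<phone>": [["<prefix>", "<digits>"]],
    "<prefix>": [["011"], ["+1"], ["+30"], ["0"]],
    "<digits>": [["<digit>", "<digit>", "<digit>", "<digit>"]],
    "<digit>": [[d] for d in "0123456789"],
}


def generate(grammar, symbol, rng):
    """Randomly expand a symbol; anything not in the grammar is a terminal."""
    if symbol not in grammar:
        return symbol
    expansion = rng.choice(grammar[symbol])
    return "".join(generate(grammar, part, rng) for part in expansion)


# Patterns are predicates over strings, so they compose with Boolean operators.
def starts_with(prefix):
    return lambda text: text.startswith(prefix)


def negate(pattern):
    return lambda text: not pattern(text)


def either(*patterns):
    return lambda text: any(pattern(text) for pattern in patterns)


def generate_matching(grammar, start, constraint, rng, max_tries=1000):
    """Rejection sampling: keep generating until the constraint holds."""
    for _ in range(max_tries):
        candidate = generate(grammar, start, rng)
        if constraint(candidate):
            return candidate
    raise RuntimeError("no matching input found")


# "The phone number must not start with 011 or +1".
constraint = negate(either(starts_with("011"), starts_with("+1")))

rng = random.Random(0)
samples = [generate_matching(GRAMMAR, "<phone>", constraint, rng)
           for _ in range(50)]
```

The same `constraint` predicate doubles as a validator for existing inputs. Note that simply deleting the `011` and `+1` alternatives from `<prefix>` would not be enough here, because the prefix `0` followed by digits starting with `11` also yields a number beginning with `011`. Rejection sampling sidesteps this at the cost of wasted generations, which a specialized grammar would avoid.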
Date: 24 September 2021
Presenter: Stefanos Chaliasos
Despite the substantial progress in compiler testing, research endeavors have mainly focused on detecting compiler crashes and subtle miscompilations caused by bugs in the implementation of compiler optimizations. Surprisingly, this growing body of work neglects other compiler components, most notably the front-end. In statically-typed programming languages with rich and expressive type systems and modern features, such as type inference or a mix of object-oriented with functional programming features, the process of static typing in compiler front-ends is complex and suffers from a high density of bugs. Such bugs can lead to the acceptance of incorrect programs (breaking code portability or the type system's soundness), the rejection of correct (e.g. well-typed) programs, and the reporting of misleading errors and warnings.
We conduct what is, to the best of our knowledge, the first empirical study for understanding and characterizing typing-related compiler bugs. To do so, we manually study 320 typing-related bugs (along with their fixes and test cases) that are randomly sampled from four mainstream JVM languages, namely Java, Scala, Kotlin, and Groovy. We evaluate each bug in terms of several aspects, including its symptom, root cause, the size of the bug fix, and the characteristics of the bug-revealing test cases. Some representative observations indicate that: (1) more than half of the typing-related bugs manifest as unexpected compile-time errors: the buggy compiler wrongly rejects semantically correct programs, (2) the majority of typing-related bugs lie in the implementations of the underlying type systems and in other core components related to operations on types, (3) parametric polymorphism is the most pervasive feature in the corresponding test cases, (4) one third of typing-related bugs are triggered by non-compilable programs.
We believe that our study opens up a new research direction by driving future researchers to build appropriate methods and techniques for a more holistic testing of compilers.
Date: 11 October 2021
Presenter: John Wilkes, Principal Software Engineer, Google
Imagine some product team inside Google wants 100,000 CPU cores + RAM + flash + accelerators + disk in a couple of months. We need to decide where to put them, when; whether to deploy new machines, or re-purpose/reconfigure old ones; ensure we have enough power, cooling, networking, physical racks, data centers and (over a longer time-frame) wind power; cope with variances in delivery times from supply logistics hiccups; do multi-year cost-optimal placement+decisions in the face of literally thousands of different machine configurations; keep track of parts; schedule repairs, upgrades, and installations; and generally make all this happen behind the scenes at minimum cost. And then after breakfast, we get to dynamically allocate resources (on the small-minutes timescale) to the product groups that need them most urgently, accurately reflecting the cost (opex/capex) of all the machines and infrastructure we just deployed, and monitoring and controlling the datacenter power and cooling systems to achieve minimum overheads - even as we replace all of these on the fly. This talk will highlight some of the exciting problems we're working on inside Google to ensure we can supply the needs of an organization that is experiencing (literally) exponential growth in computing capacity.
(The presentation is kindly offered in collaboration with TUDelft Prof. Lydia Chen.)
Date: 24 November 2021
Presenter: Nikiforos Botis, Solutions Architect, AWS
The term cloud computing first appeared in 1996, but it was not until 2006 that the first service became available giving access, via the internet, to remote virtual machines that could perform computations without the need to procure a physical server. Things have evolved significantly since then and many Cloud Service Providers (CSPs) have emerged, allowing organizations of all sizes and types, including academic institutions, to leverage the cloud for powering their IT workloads. In this talk, we will be exploring some of the offerings of AWS, the cloud services arm of Amazon, that could be of interest to researchers, especially around the areas of Serverless computing and Machine Learning.
Nikiforos is a Solutions Architect at the AWS Greek branch, focusing on public sector customers. He has been lucky to have had the chance to contribute to the architecture of both PLF & Vaccination platforms that were launched during the pandemic to keep the country safe and the economy going. Prior to that, he spent four years in London, the majority of which at the AWS UK branch, which he joined as a graduate after completing his MSc in Computer Science at Imperial College London. Nikiforos is a proud graduate of DMST AUEB (BSc), through which he had the chance to participate in multiple entrepreneurial and other extracurricular activities, including a semester abroad (UCL, London).