ABSTRACTS OF PRESENTATIONS
2012 Ph.D. Workshop
January 30th, 2012
Department of Computer Science, University of Pisa
Applications and visual representation of photo collections in 3D space.
The growing use of digital photographic media poses new challenges for organization and visualization technologies. Advanced computer vision methodologies allow reconstructing the spatial relationships within large photo datasets, together with 3D representations of the depicted scenes. These spatial relationships, called calibrations, offer many possibilities for efficient browsing and retrieval in huge datasets. We propose a method to expand a calibrated image set with new photographs in a completely automatic way while maintaining high calibration accuracy; we then show an efficient 3D-based image browser that exploits the spatial information in an intuitive way, and finally we give a glimpse of new visualization paradigms for these image sets.
Efficient rendering of subsurface light scattering.
Abstract - Simulating the scattering of light in semi-transparent objects in the context of rendering applications is a computationally challenging problem. Although many solutions exist, they are all bound by severely limiting assumptions. However, by exploiting new hardware capabilities and introducing novel precomputation techniques, it is possible to overcome some of these limitations. The presentation will explore three diverse rendering algorithms whose applicability ranges from physics simulation to interactive visualization.
Surface appearance reconstruction from video sequences.
Abstract - Technologies for the 3D reconstruction of real objects are gaining interest in several application fields. They involve two different aspects: the acquisition of the geometry and the modelling of the surface appearance. While the available methods for the acquisition of 3D geometry allow a precise and accurate reconstruction under most digitization conditions, the proposed solutions for the estimation of the surface appearance are more complex and poorly suited to uncontrolled lighting environments.
We propose a new method for the acquisition of the surface appearance of a real object from video sequences, and for its projection over a triangular mesh of the same object. The goal is to estimate as much data as possible by taking advantage of the two main differences between a video sequence and a set of unordered images: the frame-to-frame temporal coherence and the redundancy of the data. The method consists of two steps: the registration of the video sequences over the mesh by camera calibration, and the reconstruction of the surface appearance by statistical analysis.
In the first step we propose an algorithm for the automatic registration of each frame over the mesh based on two different approaches: feature-based registration by KLT video tracking, and statistics-based registration by maximizing the Mutual Information between the gradient of the frame and the gradient of a rendering of the 3D model with illumination-related properties, such as surface normals and ambient occlusion.
In the second step we propose a solution for the estimation of the surface appearance based on the statistical analysis of the color samples projected onto each surface point from the video frames. Starting from the estimation of the acquisition lighting conditions through the construction of an environment map, which gives us an approximation of the position and shape of the most important light sources, we are able to recover the diffuse color and the specular behavior, allowing a photorealistic rendering of the object.
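The per-point statistical analysis can be sketched as follows; this is an illustrative example only (the trimming heuristic and the function names are ours, not the method described above):

```python
import numpy as np

def estimate_diffuse(samples, trim=0.3):
    """Estimate the diffuse color of a surface point from the RGB samples
    projected onto it by many video frames.

    Specular highlights bias samples toward the light color, so a robust
    trimmed statistic is used here: the brightest `trim` fraction of samples
    (the most likely to contain specular contributions) is discarded and the
    remaining samples are averaged.
    """
    samples = np.asarray(samples, dtype=float)  # shape (n, 3)
    luminance = samples.mean(axis=1)            # brightness per sample
    order = np.argsort(luminance)               # dimmest first
    keep = order[: max(1, int(len(samples) * (1.0 - trim)))]
    return samples[keep].mean(axis=0)

# Mostly diffuse observations of a red surface, plus two specular outliers.
obs = [[0.8, 0.1, 0.1]] * 8 + [[1.0, 1.0, 1.0], [0.95, 0.9, 0.9]]
print(estimate_diffuse(obs))  # close to the true diffuse color (0.8, 0.1, 0.1)
```

The abstract's actual method additionally uses the recovered light positions to separate the specular lobe, rather than simply discarding bright samples.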
Guarding and searching polyhedra.
Abstract - We consider two visibility problems in planar computational
geometry -- both well-studied in their two-dimensional polygonal
formulations -- and we extend them to three-dimensional polyhedral
environments. The purpose of our research is twofold: On the one
hand, we obtain several generalizations of well-established theorems
and introduce new results; on the other hand, we aim at
gaining better insights into long-standing two-dimensional open
problems, by abstracting them and extending our viewpoint. First, we
study a variation of the Art Gallery Problem for orthogonal polyhedra:
Given an orthogonal polyhedron with e edges, find a set of edges that
collectively see the whole interior (where faces act like walls). We
show that 11e/72 edges are always sufficient and can be found in
linear time, improving upon the previous state of the art, which was
e/6. We also obtain a tight bound in terms of the number of reflex
edges r, by showing that floor(r/2)+1 reflex edges are always
sufficient and occasionally necessary, and can be determined in O(n
log n) time. Next, we study the Searchlight Scheduling Problem for
general polyhedra, namely the problem of searching a given
three-dimensional environment by suitably turning some half-planes
around their axes, in order to catch an evasive intruder. After
discussing several generalizations of known two-dimensional theorems,
we study the problem of efficiently placing "guards" in a given
polyhedron, in order to make it searchable, giving upper bounds on the
number of guards, both for general and for orthogonal polyhedra. Then
we prove the strong NP-hardness of deciding if a given polyhedron is
entirely searchable by a given set of guards, and we further prove
that deciding if a specific region of an orthogonal polyhedron is
searchable is strongly PSPACE-hard. Our results are especially
meaningful because no similar hardness theorems for two-dimensional
scenarios were previously known. In fact, as a by-product of our
research, we get the first proof that searching a specific region of a
two-dimensional polygon is strongly PSPACE-complete.
A model-based approach for gesture interfaces.
Gesture interfaces are becoming more and more popular in entertainment applications, and their usage is currently moving towards other domains. Interaction gestures typically evolve over time: they last longer than a button click, so users normally need feedback during gesture performance. Supporting such interaction with the event notification paradigm is difficult, because a gesture does not really fit into a single notification. On the one hand, if an event is raised only when the gesture recognition ends, the interface has no means to provide feedback during its execution. On the other hand, if the developer has to track the low-level device events in order to provide feedback, the advantage of a single notification for the gesture recognition is lost.
This work proposes a compositional, declarative model for gesture definition. The construction of a gesture starts from a set of building blocks. Each one notifies the change of a basic feature value, which depends on the current gesture recognition platform (e.g. the position of a finger for multitouch gestures, or the coordinates of a skeleton joint for body-tracking devices). A complex gesture can be obtained by composing building blocks or other sub-gestures through a set of composition operators (iterative, sequence, parallel, choice, disabling, order independence). The user interface behaviour can be associated with the recognition of the whole gesture or of any sub-component, solving the aforementioned event granularity problem.
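As an illustration of the compositional idea, here is a minimal sketch; the class names and the feed/on_step interface are hypothetical, not the actual library API:

```python
# A gesture consumes low-level feature events and reports completion;
# operators build complex gestures from sub-gestures.

class Touch:
    """Building block: recognizes a single named feature event."""
    def __init__(self, name):
        self.name = name
        self.done = False

    def feed(self, event):
        if event == self.name:
            self.done = True
        return self.done

class Sequence:
    """Composition operator: sub-gestures must complete in order.
    Each completed step can drive UI feedback before the whole gesture
    is recognized, addressing the event-granularity problem."""
    def __init__(self, *parts, on_step=None):
        self.parts = list(parts)
        self.index = 0
        self.on_step = on_step  # feedback callback per completed step

    def feed(self, event):
        if self.index < len(self.parts) and self.parts[self.index].feed(event):
            if self.on_step:
                self.on_step(self.index)
            self.index += 1
        return self.index == len(self.parts)

# A "double tap" as a sequence of two touch events, with intermediate feedback.
steps = []
double_tap = Sequence(Touch("down"), Touch("down"), on_step=steps.append)
print(double_tap.feed("down"))  # False: first tap recognized, feedback fired
print(double_tap.feed("down"))  # True: whole gesture recognized
```

The other operators (parallel, choice, disabling, order independence) would be expressed in the same style, each wrapping its sub-gestures and defining its own completion rule.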
The model can be instantiated for different gesture recognition platforms, and its definition has been validated through a proof-of-concept library. Sample applications have been developed to support multitouch gestures on the iPhone and full-body gestures with Microsoft Kinect.
Support models and cost models for structured parallel programming on multi and many cores.
The trend in modern microprocessor architectures is clear: multicore
chips are here to stay, and researchers expect multiprocessors with
128 to 1024 cores on a chip within a few years. Despite this, the
software community is only slowly taking the path towards parallel
programming, and current tools guarantee neither good performance nor
portability across architectures.
In our work we address the issue of multicore parallel
programming, working on a structured parallel programming language.
Defining methodologies to exploit CMP architectures, however, requires
some kind of prediction of the expected performance of a program. A
cost model describing these architectures is being developed, and in
this presentation we will provide the first results of applying
a detailed model based on queueing theory to an existing CMP.
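As a flavor of the kind of prediction a queueing-theoretic cost model provides, here is a textbook M/M/1 illustration (not the detailed model of the thesis; the rates are made-up):

```python
# Cores issue memory requests at aggregate rate `lam`; a shared memory
# controller serves them at rate `mu`.  M/M/1 formulas then predict
# contention effects.

def mm1_utilization(lam, mu):
    """Fraction of time the server is busy: rho = lambda / mu."""
    return lam / mu

def mm1_response_time(lam, mu):
    """Mean response time W = 1 / (mu - lambda); valid only when stable."""
    if lam >= mu:
        raise ValueError("unstable queue: arrival rate >= service rate")
    return 1.0 / (mu - lam)

# Example: eight cores together issuing 0.8 requests per cycle to a
# controller serving 1 request per cycle.
lam, mu = 0.8, 1.0
print(mm1_utilization(lam, mu))    # 0.8
print(mm1_response_time(lam, mu))  # about 5 cycles on average
```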
A control-theoretic methodology for adaptive structured parallel computations.
Abstract - Adaptivity for distributed parallel applications is an essential feature whose importance has been assessed in many research fields (e.g. large-scale real-time simulation systems and emergency management applications). This feature is of special interest for responding properly and promptly to time-varying application QoS requirements, reacting to uncontrollable environmental effects, and efficiently dealing with highly irregular parallel problems. In this scenario the Structured Parallel Programming paradigm is a cornerstone, due to its high degree of composability and QoS predictability.
Starting out from this methodology, the first part of this thesis provides a description of a single adaptive parallel module, described in terms of a Hybrid System abstraction. A model-based predictive control approach has been applied to real-world distributed parallel applications. Experimental results show the effectiveness of this approach, in terms of processing time reduction as well as of the stability of system reconfigurations.
The second part studies how to model the distributed control of the whole computation (i.e. distributed management), and how to express the cost model in order to optimize the various QoS metrics and their combinations. After a panoramic view of the existing control-theoretic approaches (e.g. those based on decentralized, distributed or hierarchical structures of controllers), this thesis introduces a methodology for distributed predictive control of parallel applications. The overall control problem consists of a set of coupled sub-problems. The decomposition issue has a twofold nature: first, we need to model the coupling relationships between control sub-problems; second, we need to introduce proper notions of negotiation and convergence for the control decisions taken by the controllers. This thesis provides a formalization through basic concepts of Noncooperative Game Theory and Cooperative Optimization. In the notable context of the distributed control of performance and resource utilization, we have exploited a formal description of the control problem, providing results on the existence of equilibrium points and a comparison with a more cooperative distributed subgradient method. Discussions and a first validation of the proposed techniques are provided through experiments performed in a simulation environment.
Risk-based methods for complex services.
Complex services are modern computer systems composed of several
components executing jobs. Usually a job is executed by a simple service
belonging to a third party. A complex service may have several
implementations, depending on which simple services are chosen for the
execution of jobs. Complex services are extremely flexible because a user
may select the implementation that best satisfies their functional and
non-functional requirements. Selecting an implementation with a required
security level needs methods for the evaluation and comparison of the
security of complex services. After the selection is done, the usage of
the user's data should be controlled, since the data is outsourced to
third parties. We propose several risk-based methods to tackle security
issues in complex services.
The evaluation and comparison of the security of complex services require
security metrics, but only a few of them have been presented in the
literature. Moreover, most of these metrics are defined informally, and
there is no formal definition of a "more secure" relation. Thus, it is
impossible even to show that a metric really measures security. We give a
formal definition of a "more secure" relation and propose basic formal
models for metrics and risk. We investigate the relations among
security metrics, and between security metrics and risk. Since risk is
usually used for the assessment of security as a whole, this lets us
understand how security is measured by metrics.
We suppose that the evaluation and comparison of the security of a complex
service should be based on the values of the security metrics evaluating
its simple services. We provide a general method for aggregating these
values and selecting the most secure implementation of a business process.
This goal is achieved using a special mathematical structure, semirings.
We have shown how similar metrics can be combined to conduct a general
analysis; this is achieved by considering relations between metrics,
expressed as mappings between semirings.
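A minimal sketch of semiring-based aggregation, with hypothetical names and a "weakest link" instantiation (the thesis's actual semirings and metrics may differ):

```python
# A semiring supplies two operations: one to combine the metric values of
# the simple services composing an implementation, and one to select among
# alternative implementations.

class Semiring:
    def __init__(self, combine, select, one):
        self.combine = combine  # aggregate values along one implementation
        self.select = select    # choose among implementations
        self.one = one          # neutral element for `combine`

def best_implementation(implementations, sr):
    """Score each implementation by combining its services' metric values,
    then select the best score according to the semiring."""
    def score(impl):
        acc = sr.one
        for value in impl:
            acc = sr.combine(acc, value)
        return acc
    return sr.select(score(impl) for impl in implementations)

# "Weakest link" security semiring: an implementation is only as secure as
# its least secure service (min), and we prefer the most secure one (max).
weakest_link = Semiring(combine=min, select=max, one=float("inf"))

impls = [[0.9, 0.4, 0.8],   # metric values of the simple services
         [0.7, 0.7, 0.6],   # in each candidate implementation
         [0.95, 0.5, 0.5]]
print(best_implementation(impls, weakest_link))  # 0.6
```

Swapping in other (combine, select) pairs, such as (times, max) for probabilities of compromise, yields different aggregation policies without changing the selection algorithm.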
Finally, we focus on the control of data usage in complex services. Since
a complex service is a distributed system, the freshness and trustworthiness
of authorization data (attributes) become especially important for
access and usage control decisions. We propose a set of risk-based policy
enforcement models which help to tolerate the uncertainties associated with
mutable attributes. In our model the reference monitor evaluates logical
predicates over attributes as usual, and additionally estimates how much
the observed attribute values differ from the real state of the world. The
final access decision takes both factors into account. We assign monetary
outcomes for granting and revoking access to legitimate and malicious
users, and compare the proposed policy enforcement models in terms of
cost-efficiency.
Detection and dynamic tracking of composite events in wireless sensor networks.
Abstract - The tracking of moving events is an important task of wireless sensor networks in many application fields. However, current approaches either focus on vertical solutions for tracking specific events, or use generic query languages that are not specifically built to manage event mobility. We propose a new query language and query processing solutions that enable the definition of queries that track and gather information from events. Furthermore, unlike TinyDB and similar languages, the proposed language provides specific clauses aimed at defining dynamic tracking tasks and the autonomous migration of queries across the network depending on event mobility. This presentation shows the model and the language, considers the query processing implementation guidelines, and presents the results of a comparison with TinyDB, which shows how the proposed approach scales better with the mobility of the tracked events.
Improving search engines using query logs and a bit of semantics.
A web search engine is in all respects an IR system on a very large scale.
In recent years, thanks to the Web 2.0, the amount of data (as well as
the number of queries) to manage has grown steadily. In this
context, the two key issues in a search engine are: (i) the quality of
the retrieved results, and (ii) the efficiency with which results are
produced. We will introduce the emerging topic of the Web of Data, a
recent idea, originating from the Semantic Web, that aims at making
interoperable semistructured data available on the web. Then, we will
discuss how to exploit the Web of Data together with query logs to
improve both quality and efficiency in traditional web search engines.
A. V. Miceli Barone.
Statistical machine translation and natural language processing based on syntactic analysis.
Statistical Natural Language processing is a set of methods aimed at
analyzing texts in order to information and perform textual
manipulation, such as translation or summarization, using models
obtained by statistic analysis of large sets of textual samples. Typical
Statistical Language Processing methods treat text as an unstructured
sequence of words, without explicitly considering its syntactic or
semantic structure. However, in recent years, there has been a growing
interest in statistical methods that incorporate syntactic and semantic
information. In this presentation, we will illustrate our contributions
in a syntactic analysis technique known as Statistical Dependency
Parsing and its application to Natural Language Processing and in
particular to Machine Translation. We will also discuss the
possibilities of extending this technique with semantic-based methods.
R. A. Ferreira.
Finding frequent combinatorial patterns in graphs.
Abstract - Graphs are a versatile data structure that models a myriad of domains: from social interactions to biological processes. Finding patterns in graphs is a problem that has been studied most deeply from a statistical point of view, including the distribution of the degree and the diameter of the network. Mining combinatorial patterns, such as induced subgraphs, induced trees of bounded size or connected sets of nodes, appears to be a computationally hard problem. Often, the problem of finding some combinatorial pattern in a graph is already difficult, and traditional data mining techniques, such as the Apriori algorithm, are not applicable.
In this talk, we are going to cover interesting problems, applications and techniques used to find frequent combinatorial patterns in graphs. We start by introducing some combinatorial patterns that are considered useful in different domains. From there, we take a panoramic view over the field: which techniques can work, and why the others do not. For each step of the pattern discovery process, I will illustrate the work done in my thesis, in particular: output-sensitive listing of combinatorial patterns; extending these techniques to sample patterns; and finally, how to approximately count (and identify) the patterns that are frequent.
Compressed data structures for massive string collections.
Modern online services have to face ever-increasing amounts of data,
with latency requirements that become more and more stringent over
time. Due to such requirements, disk storage is relegated to batch
processing, while all the data needed for real-time operations must be
kept in main memory. In these scenarios, space-efficiency can make the
difference between being able to fit the data in memory or not.
Succinct and compressed data structures trade off CPU time for space
efficiency. With CPU speed steadily increasing while memory access
time is roughly constant throughout the memory hierarchy, succinct
techniques are gaining popularity in many applications.
We will discuss a number of compressed data structures for large
collections of strings, both practical and theoretical, that address
real-world problems with space/time guarantees.
We will start with the semi-index, a new compact index for large
collections of semi-structured data. While the data is stored on disk,
the semi-index is small enough to be kept in main memory,
allowing noticeable speedups in the processing of the on-disk data.
We will then present our results on compressing dictionaries of
strings, showing how properly engineered compressed tries (with sound
theoretical foundations) can store sets of strings in less space than
existing compressed string dictionaries, while supporting operations
with running times competitive with classical data structures.
We will then introduce a new data structure, the Wavelet Trie, which
is a compressed representation for sequences of strings. Generalizing
the Wavelet Tree, it supports common access and indexing operations,
while enabling new indexing operations on the prefixes of the strings.
We then show how the Wavelet Trie can support dynamic insertion and
deletion of strings without requiring the string set to be known in
advance, hence obtaining the first dynamic sequence data structure
whose alphabet can grow or shrink.
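For intuition, here is a toy Wavelet Tree over characters supporting access and rank. This is a sketch only: a real implementation uses compressed bitvectors, and the Wavelet Trie generalizes the tree shape to the prefix structure of the strings themselves.

```python
# Each node splits its alphabet in half and stores one bit per symbol
# (0 = left half, 1 = right half); queries descend, translating positions
# through per-node rank counts.

class WaveletTree:
    def __init__(self, seq, alphabet=None):
        self.alphabet = sorted(set(seq)) if alphabet is None else alphabet
        if len(self.alphabet) == 1:
            self.leaf = True
            self.length = len(seq)
            return
        self.leaf = False
        mid = len(self.alphabet) // 2
        left_set = set(self.alphabet[:mid])
        self.bits = [0 if c in left_set else 1 for c in seq]
        self.left = WaveletTree([c for c in seq if c in left_set],
                                self.alphabet[:mid])
        self.right = WaveletTree([c for c in seq if c not in left_set],
                                 self.alphabet[mid:])

    def access(self, i):
        """Return the i-th symbol of the sequence."""
        if self.leaf:
            return self.alphabet[0]
        bit = self.bits[i]
        child = self.right if bit else self.left
        j = sum(1 for b in self.bits[:i] if b == bit)  # position in child
        return child.access(j)

    def rank(self, c, i):
        """Count occurrences of symbol c in the first i positions."""
        if self.leaf:
            return min(i, self.length) if self.alphabet[0] == c else 0
        bit = 0 if c in self.left.alphabet else 1
        child = self.right if bit else self.left
        j = sum(1 for b in self.bits[:i] if b == bit)
        return child.rank(c, j)

wt = WaveletTree("abracadabra")
print(wt.access(4))      # 'c'
print(wt.rank("a", 11))  # 5
```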
A categorical framework for resource-aware process calculi for global computing.
Abstract - Traditional formalisms for the description of distributed systems, such as the pi-calculus, abstract away details of the communication infrastructure. This is not adequate for systems where behavior depends on various kinds of resources, for instance cloud computing systems, made of infrastructural elements with different policies, bandwidth and access rights. The need to generalize the treatment of resources led us to study presheaf models, i.e. algebraic/coalgebraic categorical models that allow one to independently represent the structure of resources and their usage. Our purpose is to devise a general theoretical framework for resource-aware calculi, parametric in the type of resources. As a concrete case study for our investigation, we propose a network-aware, proper extension of the pi-calculus, called NCPi: we add connector names and primitives to handle them, and we provide both an interleaving and a concurrent semantics. The extension to connector names is natural and seamless, since they are handled in full analogy with ordinary names. In the interleaving case, observations are the routing paths through which sent and received data are transported, while in the concurrent case we allow observing multisets of paths.
On type checking local and global properties of resources.
Abstract - Modern computing paradigms for distributed applications
advocate strong control over shared resources available on demand, in order
to guarantee their correct usage. We extend the pi-calculus with
primitives for managing resources (we call it the G-Local pi-calculus). In
this extension, resources are modelled as distinguished entities endowed
with policies governing their interactions with processes. In this work,
we develop a resource-based type and effect system to obtain an abstraction
of resource behaviour. This allows us to statically check local and
global properties of resources against usage policies, and therefore to
detect bad usages of resources.
Formalisms for ecological applications.
In population ecology, biologists, zoologists and natural scientists
study how a large number of individuals (a population) over a large
territory behave over time.
The focus is often to model some characteristics of the
environment, such as resource availability, temperature or water level,
and to predict how the population evolves, both in the short term, with
individual animals moving and migrating from place to place, and over the
long term, with a possible genetic drift of the overall population.
I work on a formal language able to specify single individuals (agents)
with their genetic makeup as they live in a complex, heterogeneous
environment and react to changing conditions.
A real environment can be quite complex, so the language allows its
definition to include spatial features such as a hierarchical structure, as
well as relations between sibling compartments.
Moreover, agents are able to communicate through the environment with
signals that travel through barriers and boundaries.
Mining whole-of-government collaboration opportunities.
This research aims at automatically discovering collaboration opportunities between government agencies and between public, private and non-profit sectors based
on the textual descriptions of service offerings published online. As such descriptions are expressed in plain language and lack any controlled terminology, our approach
to meaningful analysis of such descriptions relies on semantic text mining - text mining relying on background knowledge, in our case a public service ontology.
The solution to our problem comprises three steps: 1) public service ontology development, 2) ontology-driven intermediate representation of the service descriptions,
and 3) service relationship mining based on this representation. The presentation will proceed by: motivating and describing the research problem; introducing the
principles and techniques of semantic text mining; describing our approach to addressing the problem; and presenting a real-life case study using four service
descriptions - welfare support, business licensing, location inspection and identity verification - to validate the approach. Finally, we show the results obtained so
far - public service extension of the Government Enterprise Architecture (GEA) ontology, intermediate representation model for service descriptions, and model
validation using real-life service descriptions; and conclude with challenges and remaining work.
Multidimensional network analysis.
Abstract - Complex network analysis is a scientific field that has attracted growing interest among
researchers in the last decade. This attention may be explained by the many challenges posed by
the analysis of complex systems, and by the fact that these complex systems are ubiquitous in nature. However,
in recent years the main effort has been devoted to studying the statistical properties and the behavior
of actors in a single-relationship model. It is a matter of common experience that this does not
reflect how nature works in reality: complex systems are characterized by competing forces. In this setting,
there is a need to understand a multifaceted,
or multidimensional, complex reality. In this presentation we will present a study of multidimensional network analysis
developed to grasp the complexity of real-world phenomena. We will prove the
ubiquity of this multifaceted reality in different complex networks, and we will take a look at the foundations
of the multidimensional setting at different levels, both by looking at the basic extension
of the known model and by considering novel algorithms for well-understood and
An automata-based interface model of component-based software.
Abstract - A software component is a unit of composition with contractually specified interfaces and explicit dependencies only. In this presentation, I will define an automaton-based execution model of a component. The execution model describes services provided and required by the component. Next, I will give an algorithm to produce an interface model from the execution model. The interface model is input-deterministic and supports third-party composition. Finally, we show how execution models can be composed.
Cloud computing security: an intrusion detection system for cloud computing systems.
Abstract - By impersonating legitimate users, intruders can abuse the abundant
resources of cloud computing environments. To combat intruders, Intrusion
Detection Systems (IDSs) can be used as a suitable security solution for
cloud environments. This presentation introduces a framework for a
cloud-based intrusion detection system, named "CIDS", that addresses many
deficiencies of current IDS technology and deals with threats and attacks
against cloud systems. CIDS also provides a summarizer and reporter
component to summarize the network-based IDS alerts and inform the cloud
administrator. The CIDS architecture is a scalable and elastic solution
with no central coordinator. The components, architecture, detection
models and advantages of CIDS, together with our cloud intrusion dataset,
are presented in detail in this presentation.
Graphs and graph transformations for object-oriented and service-oriented systems.
Abstract - Theories of graphs and graph transformations form an important part of the mathematical foundations of computing, and have been applied in a wide range of areas, from the design and analysis of algorithms to the formalization of various computer systems and programs. As a brief introduction to the research topics of my PhD study, this presentation discusses how graphs and graph transformations can be used to model the static structure and dynamic behavior of object-oriented and service-oriented systems. We show that the use of graphs and graph transformations provides both an intuitive visualization and a formal representation of these systems.
Advanced rank/select data structures: succinctness and lower bounds.
Rank/select data structures are the bread and butter of succinct data structures,
a class of data structures used to operate on data in space-constrained
environments while still providing fast access to the stored object.
They also have a deeply intertwined relationship with predecessor queries,
a well-known problem in theoretical computer science.
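For intuition, here is a toy rank/select structure with one level of precomputed block counts. Real succinct structures achieve constant-time queries with o(n) extra bits; this sketch only illustrates the query pattern.

```python
BLOCK = 8  # bits per block; real structures use a two-level scheme

class RankSelect:
    def __init__(self, bits):
        self.bits = bits
        # prefix popcounts at block boundaries
        self.blocks = [0]
        for i in range(0, len(bits), BLOCK):
            self.blocks.append(self.blocks[-1] + sum(bits[i:i + BLOCK]))

    def rank1(self, i):
        """Number of 1s in bits[0:i]: one table lookup plus an in-block scan."""
        b = i // BLOCK
        return self.blocks[b] + sum(self.bits[b * BLOCK:i])

    def select1(self, k):
        """Position of the k-th 1 (1-based), by binary search on rank."""
        lo, hi = 0, len(self.bits)
        while lo < hi:
            mid = (lo + hi) // 2
            if self.rank1(mid + 1) < k:
                lo = mid + 1
            else:
                hi = mid
        return lo

bv = RankSelect([1, 0, 0, 1, 1, 0, 1, 0, 0, 1] * 3)
print(bv.rank1(10))   # 5: five 1s among the first ten bits
print(bv.select1(6))  # 10: the sixth 1 sits at position 10
```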
The thesis explores several directions of improvement: from studying
cases where the generic lower bounds do not apply, to generalizations of
the lower bounds, to the effect of implementation choices on real-world
computation over large data sets.