Curriculum Vitae
Paolo Manghi
Date of birth: 22nd of December, 1970
Place of birth: Firenze
Status: Unmarried
Nationality: Italian
Contacts: Address: Via Y. Gagarin, 85 - 56023 Cascina (PI), Italia
Office Tel.: +39-050-*******
E-mail: *****.******@****.***.**
Web: http://www.isti.cnr.it/People/P.Manghi
1 Introduction
Paolo Manghi is a Research Fellow at Istituto di Scienza e Tecnologie dell'Informazione (ISTI) of Consiglio
Nazionale delle Ricerche (CNR), Pisa, Italy. He is currently a member of the Digital Library (D-Lib) research
group (led by Dr. Donatella Castelli), part of the Networked Multimedia Information Systems (NeMIS)
laboratory (led by Dr. Fausto Rabitti).
Studies: He obtained a four-year doctorate in Informatica (Università di Pisa) and two MSc degrees:
Laurea in Scienze dell'Informazione (Università di Pisa) and Laurea Specialistica in Tecnologie
Informatiche (Università di Pisa).
Positions: He currently holds a two-year temporary position (art. 23) as a researcher at ISTI, CNR. He
recently obtained the qualification (idoneità) for a permanent researcher position.
Research: His curriculum shows a research path that started from programming and typing in
"orthogonally" persistent languages, applied the same concepts to the Web/Internet, investigated type
correctness of query languages for XML databases and type-safe integration of programming languages
and XML languages, and explored peer-to-peer service architectures for distributed XML databases.
From these experiences in data management and distributed service architectures he arrived at his
current main interests: (i) Digital Library Management Systems for the realization of multimedia
and repository archives, and (ii) service infrastructures for the aggregation of possibly distributed
and heterogeneous Digital Library Systems, such as multimedia archives, institutional repositories
and libraries.
Roles: He is Research and Technical Manager for the EU projects EFG [Pr1] and OpenAIRE [Pr4],
and for the R2D2 project [Pr3] in cooperation with the Microsoft Corporation Research departments of
Cambridge and Redmond. He has been appointed Project Technical Coordinator of the EU project
HOPE [Pr5], funded and currently under negotiation. Moreover, he is an invited member of the Core
Expert Group of the Europeana EU project (www.europena.eu) and of the DL.org Expertise Workgroup
of the DL.org project [Pr2]. Details on these projects follow.
Activities: He has been extensively involved in international cooperations and initiatives, the
authoring of and hearings for several EU project proposals, project research and technical coordination,
the organization and editorship of international workshops, academic and professional teaching, and
the supervision of university and post-doc students.
2 Studies
19.07.1996 Laurea (Degree MSc) in Scienze dell'Informazione, University of Pisa, final mark 110/110.
Master Thesis: Aspetti Linguistici della Costruzione di Applicazioni Persistenti; supervisor: Prof.
Giorgio Ghelli.
26.10.2001 Four-year Doctorate in Informatica, Pisa Consortium, XIIth Series. PhD Thesis: Extracting
Typed Values from Semistructured Databases; supervisor: Prof. Giorgio Ghelli; referees: Prof.
M. Atkinson (Glasgow, UK), Prof. F. Crestani (Glasgow, UK).
12.12.2003 Laurea (Degree MSc) Specialistica in Tecnologie Informatiche, University of Pisa, final mark
110/110.
3 Positions
Paolo Manghi started his research experience in 1996, in academia, with a four-year doctorate between
the University of Pisa and the Universities of Glasgow and Strathclyde in Scotland (UK), partly funded
by a European Marie Curie TMR EC Research Training Fellowship grant. After three years in Scotland,
from 1997 to 2000, working with Prof. Richard Connor, he returned to the University of Pisa, where he
re-joined the Fibonacci research group of Prof. Giorgio Ghelli and Prof. Antonio Albano until May 2006.
In May 2006 he moved to ISTI-CNR to work with the D-Lib research group led by Dr. Donatella
Castelli. He currently holds a temporary position (two-year, art. 23) as a researcher at CNR, and he
recently obtained the qualification (so-called "idoneità") for a permanent researcher position.
3.1 Experience in Academia: 1995 - 2005
September 1996 - University of Pisa: PhD Student Grant (01.09.1996-31.10.1997) at the Computer
Science Department of the University of Pisa (reference: Prof. G. Ghelli).
October 1997 - University of Glasgow: Research Fellow position (31.10.1997-30.06.1999), as European
TMR Marie Curie Grant Holder, at the Department of Computer Science of the University
of Glasgow (UK). Funding project: Hippo [Pr9], led by Prof. R. Connor, in cooperation with the
Dipartimento di Informatica of Pisa University (reference: Prof. G. Ghelli).
July 1999 - University of Strathclyde: Research Fellow position (01.07.1999-31.10.2000), as European
TMR Marie Curie Grant Holder, at the Department of Computer and Information Sciences
of Strathclyde University (UK). Funding project: SNAQue [Pr10], led by Prof. R. Connor, in cooperation
with the Dipartimento di Informatica of Pisa University (reference: Prof. G. Ghelli).
November 2000 - University of Pisa: Research contract (29.11.2000-30.05.2001) for the research
project DataX, coordinated by Prof. G. Ghelli, Dipartimento di Informatica, University of Pisa.
June 2001 - University of Pisa: Research Fellow position (11.06.2001-11.06.2003), as assegnista di
ricerca, at Dipartimento di Informatica, University of Pisa. Funding project: Design and
implementation of a query language for semistructured data, led by Prof. G. Ghelli.
July 2003 - University of Pisa: Research contract (01.07.2003-30.09.2003) for the project FIRB:
Enabling Platforms for High-performance Computational Grids Oriented to Scalable Virtual
Organizations (grid.it), on the activity: design of a distributed XML database over the GRID. Project
coordinator: Prof. M. Vanneschi, Dipartimento di Informatica, University of Pisa.
October 2003 - Ministry of Education: Consulting for CRUI Foundation, Ministry of Education
(01.10.2003-01.05.2005). Funding project: IT4PS (Information Technology for Problem Solving).
Definition of lectures and final tests, and drafting of textbooks, for ECDL (European Computer
Driving License) advanced university courses on MS Excel and MS Access, specific to Pharmacy
and Medical Faculties.
November 2003 - University of Pisa: Research Fellow position (01.11.2003-31.10.2005), as assegnista
di ricerca, at Dipartimento di Informatica, University of Pisa, government funds under INF/01.
3.2 Experience at Consiglio Nazionale delle Ricerche: 2006 - ongoing
February 2006 - Consiglio Nazionale delle Ricerche: Research contract (22.02.2006-22.03.2006) for
the project BELIEF [Pr6] (led by Donatella Castelli) on the activity: definition of a document model
for the representation of heterogeneous documents stored in European Institutional Repositories;
design of a virtual collection service for such documents. Project reference: Dr. Donatella Castelli,
ISTI, Consiglio Nazionale delle Ricerche.
May 2006 - Consiglio Nazionale delle Ricerche: Research Fellow position (01.05.2006-30.06.2008),
as assegnista di ricerca, at ISTI, Consiglio Nazionale delle Ricerche, Pisa. Funding EC project:
BELIEF [Pr6].
July 2008 - Consiglio Nazionale delle Ricerche: Research Fellow position (01.07.2008-30.06.2009),
as ricercatore a tempo determinato (art. 23), at ISTI, Consiglio Nazionale delle Ricerche, Pisa.
Funding EC project: DRIVER-II [Pr8].
March 2009 - Consiglio Nazionale delle Ricerche: Qualification (idoneità) for a permanent
researcher position (31.03.2009). Concorso da ricercatore, Procedura Comparativa R.07.01 - INF/01
(CNR).
July 2009 - Consiglio Nazionale delle Ricerche: Research Fellow position (01.07.2009-30.06.2010),
as ricercatore a tempo determinato (art. 23), at ISTI, Consiglio Nazionale delle Ricerche, Pisa.
Funding EC project: DRIVER-II [Pr8].
4 Research
Paolo Manghi's research activities gravitate towards the areas of database systems, query and programming
languages, type systems, service architectures and infrastructures, and digital library systems, with
particular focus on:
Programming and typing in "orthogonally" persistent languages;
Type-safe integration of programming languages and XML languages;
Type correctness of query languages for XML databases;
Peer-to-peer service architectures for distributed XML databases;
Service infrastructures for the aggregation of distributed Digital Library Systems;
Digital Library Management Systems.
This section describes these research topics and lists the research projects that were directly or
indirectly related to them.
4.1 Topics
1995 - 1998 Programming and Typing in "orthogonally" persistent languages.
Orthogonally persistent (OP) languages provide run-time support for persistent storage of values,
independently of their type. The idea is to blur the separation between run-time and persistent values,
in order to avoid the extensive coding required to store and retrieve data on disk (estimated as
20%). From 1995 until 1996 he investigated the problems that arise when applying modular programming
techniques to the design of very large applications with OP languages. In OP languages, values, including
module implementations, can be type-safely linked into a computation and modified at run-time. In
traditional module-structured applications, changes to implementations usually require costly operations,
such as recompilation and relinking. The study led to the definition of specific application construction
methodologies that support the separate compilation of module systems and, at the same time, permit
run-time modifications to module implementations, so as to limit application re-construction operations.
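The idea of persistence that is independent of a value's type can be approximated in a mainstream language. The following is a minimal sketch in Python, assuming the standard `shelve` module as the persistent store; it is not one of the OP languages studied, the helper name `persist_and_reload` is hypothetical, and `shelve` still requires an explicit store object, which a true OP language would hide.

```python
import os
import shelve
import tempfile


def persist_and_reload(values):
    """Write every value to a persistent store, close it, reopen it, and read it back.

    The same two lines of storage code serve every value, whatever its type.
    """
    path = os.path.join(tempfile.mkdtemp(), "store")
    with shelve.open(path) as store:
        for key, value in values.items():
            store[key] = value
    # Reopening simulates a later program run that finds the values already there.
    with shelve.open(path) as store:
        return {key: store[key] for key in values}


# Values of unrelated types, all persisted through the same mechanism.
sample = {
    "count": 3,
    "names": ["a", "b"],
    "point": (1.5, 2.5),
    "nested": {"ok": True},
}
reloaded = persist_and_reload(sample)
```

No per-type serialization code is written: integers, lists, tuples and dictionaries all survive the round trip with their types intact.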
Along the same line, from 1996 until 1998, in cooperation with Prof. Richard Connor's
research group
of Glasgow University (UK), he studied the possibility of integrating the notion of
orthogonal persistence
within the context of the Web. Indeed, the Web offers a series of standard languages and
protocols,
such as HTML, XML, and HTTP, which combined enable a direct way to store and share
permanent
data over distributed file systems. Hippo, High-level Internet Programming with
Persistent Objects
(EPSRC project, Advanced Fellowship GR/K 79222 - High Level Internet Programming
Systems), is
a typed procedural language whose applications store persistent data, together with their
types, into
local ad-hoc HTML documents uniquely identified over a persistent namespace over HTTP.
Such data
can be accessed and retrieved in order to be visualized within a browser or type-safely
imported into a
remote Hippo computation as a persistent typed value. The project extended the concept of
orthogonal
persistence to the Web, then naturally moved towards novel Internet programming
paradigms, having
non-determinism and autonomy as main research avenues.
Publications: [Man96] [CGM96] [CMS98]
1998 - 2001 Type-safe integration of programming languages and XML languages.
XML data collections are often generated for data publication or sharing within the context of
traditional mainstream language computations or database applications. Accordingly, when such data reaches
the target computation, one would expect to operate over it by means of the same tools and frameworks.
In practice, however, XML data are mainly manipulated as labeled trees, within the declarative algebras of
XML query languages or within the procedural algebras of mainstream language APIs, such as DOM.
In both cases: (i) the sender computation generates a value with a structure (type) that fits the
information to be modeled well and enables static type checking; (ii) the receiver operates over the labeled
tree representation of such value, where trees are not always the best structure to model the information
source, and loses the quality of type checking available at the sender, since type checking at the receiver
applies only to the labeled tree structure.
From 1998 until 2001 he worked, in cooperation with Fabio Simeoni and Prof. Richard Connor of
Strathclyde University, on the SNAQue project [Pr10]. The project's aim is the theoretical design
and implementation of a system API (for the Java language, but any language can implement it) for
extracting typed values from XML data and importing them type-safely into a run-time computation.
Programmers can therefore operate over transmitted XML data while fully recovering the benefits of
complex type structures, in terms of modeling and type safety. The resulting framework inspired the
definition of novel kinds of programming language run-time environments, enabling the realization of
"hybrid" applications. In such environments, values can be manipulated as language values or as XML
values of equivalent semantics, in a type-safe manner, by means of language constructs or XQuery
constructs, according to the specific programming needs.
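The notion of extracting a typed value from XML data can be illustrated with a small sketch. It is written in Python rather than Java and does not reproduce the SNAQue API: the `Book` type, the `extract` function and the sample document are hypothetical names introduced only for illustration, and only flat records of atomic fields are handled.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class Book:
    """Target type declared by the receiving program (hypothetical example)."""
    title: str
    year: int


def extract(xml_text, target):
    """Build a value of type `target` from XML, or fail if the data does not fit the type."""
    root = ET.fromstring(xml_text)
    values = {}
    for name, field_type in get_type_hints(target).items():
        child = root.find(name)
        if child is None or child.text is None:
            raise TypeError(f"missing element <{name}> required by {target.__name__}")
        try:
            # Convert the element text to the declared field type, e.g. int("1320").
            values[name] = field_type(child.text.strip())
        except ValueError as exc:
            raise TypeError(f"<{name}> does not hold a value of type {field_type.__name__}") from exc
    return target(**values)


book = extract("<book><title>Divina Commedia</title><year>1320</year></book>", Book)
```

After extraction the program works with `book.year` as an ordinary integer instead of navigating a labeled tree, and data that does not match the declared type is rejected at the import boundary.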
Publications: [Man01] [SML+02] [SLCM03] [MSLC02] [CLM+01]
1998 - 2004 Type-Correctness of query languages for XML databases.
By definition, XML data may not be described by a schema. Accordingly, the run-time
semantics
of XML query languages is typically defined in a greedy way: a query always returns a
result, even if
its structure cannot be satisfied by the input data; in such case the query evaluates to
an empty value.
For the same reason, query correctness and type checking algorithms for XML query languages were
considered irrelevant research issues. This was a sharp departure from traditional query languages,
where a structural error generates a run-time error, query correctness is defined in terms of run-time
error avoidance, and type checking algorithms report the errors to the programmers.
Lately, XML has been increasingly used in database systems and programming languages as
the
standard language for data publication and sharing. Such data comes with a schema, thus
programmers
may expect to define queries fed with input featuring the same structure. As a
consequence, there has
been a growing interest in techniques to identify parts of the query that will never
contribute to the
construction of the result.
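The contrast between greedy evaluation and a correctness check can be seen in a few lines. The sketch below uses Python's standard `xml.etree.ElementTree` path queries rather than any of the query languages discussed here; the set `KNOWN_PATHS` is a hypothetical and deliberately simplistic stand-in for a schema, and `checked_findall` is an illustrative helper, not a real type system.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<library><book><title>Divina Commedia</title></book></library>"
)

# Greedy evaluation: the path names an element that never occurs in the data,
# yet the query succeeds and silently yields an empty result.
greedy_result = doc.findall("./book/author")

# Hypothetical stand-in for a schema: the paths the data is allowed to contain.
KNOWN_PATHS = {"./book", "./book/title"}


def checked_findall(root, path, known_paths=KNOWN_PATHS):
    """Reject a query whose path can never match, before evaluating it."""
    if path not in known_paths:
        raise ValueError(f"path {path!r} can never contribute to the result")
    return root.findall(path)


titles = [element.text for element in checked_findall(doc, "./book/title")]
```

With the check in place, a query over `./book/author` is reported as an error instead of returning an empty list that the programmer might mistake for "no matching data".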
Since 2000, he has worked on a characterization of query correctness for XML query
languages, with the aim of realizing a type system capable of identifying all possible structural
anomalies statically. A first prototype of the type system was designed for the language TeQuyLa,
targeting search within XML representations of historical books and documents (e.g. Divina Commedia).
Later on, a finer definition of correctness, namely FE-correctness, was given for the query language XQ
[Pr13], designed to be the FLWR core language of XQuery (the XML query language defined by the W3C).
XQ's type system is proved sound and (partially) complete with respect to the formal definition of
FE-correctness. In parallel, similar studies on types and expressiveness of queries were conducted
in the context of TQL [Pr12], an ambient calculus-based query language for semi-structured data.
Publications: [ADG+01] [CGMS06a] [ACG+00] [CGMS02] [CGA+02a] [CGMS04a] [CGA+02b]
2003 - 2007 Peer-to-peer service architectures for distributed XML databases.
Much of the research on the construction of real peer-to-peer (P2P) databases is currently aimed at the
P2P decentralization of data integration mediators (e.g. Piazza and CoDB). These systems, usually based
on the GLAV paradigm, enable each peer to reformulate queries according to a given set of mappings,
with no need for a centralized mediator. However, the need to set up the translation implies that some
human administrative work is required in order to join such systems. Furthermore, since this effort
pays off only for peers that stay connected for a long time, such systems are not suited to highly
volatile peers.
Since 2003 he has worked on the design and implementation of XPeer [Pr14], an XML peer-to-peer (P2P)
database system. XPeer differs from standard P2P database systems in two key ideas. First,
XPeer decouples query compilation, delegated to an overlay network of administrative nodes, from
query execution, which is directly managed by peers. Decoupling compilation and execution allows for
better handling of complex FLWR queries (e.g. nested queries), and allows the overlay network to perform
distributed optimizations on the query plan and avoid broadcast. Second, unlike other P2P systems,
XPeer needs minimal human intervention to work and no nodes with special administrative duties
have to be permanently connected; nodes in the overlay network are themselves peers (and thus can connect
and disconnect at any time) entitled by the system to perform administrative tasks. Indeed, XPeer adopts
self-administration policies, namely root cloning protocols and network protocols, to provide workload
balance and robustness to network failures. The protocols dynamically optimize the number of
administrative nodes required by the overlay network to handle the query demand of the connected
peers.
Publications: [CGMS07a] [SMGC04] [CGMS04b] [CGMS05] [CGMS07b]
2006 - ongoing Service infrastructures for the aggregation of distributed Digital Library
Systems.
In the last decade, there has been an increasing demand for integrating and sharing the content and
functionalities of independent Digital Library Systems. Typically, however, existing solutions target very
specific issues and are hard to sustain over time because of their high realization and maintenance
costs. Service-oriented infrastructures and resource sharing models (typical of the GRID approach) offer
environments that facilitate both functionality and content integration, by establishing common
participation and interaction practices and by distributing the cost among the stakeholders.
In January 2006 Paolo Manghi moved to ISTI-CNR, where he started research on Service Oriented
Infrastructures for Digital Libraries. He leads the D-NET laboratory (Marko Mikulicic, Michele Artini,
Alessia Bardi and Claudio Atzori) at the D-Lib research group, and he is responsible for the research,
design, development, and deployment of the D-NET software toolkit [Pr16]. D-NET is the outcome of
research activities inspired by "GRID resource discovery and sharing" principles and applied to service
oriented architectures and Digital Library system definition and aggregation. In a running D-NET
infrastructure, data source and service providers can dynamically integrate (register) their resources,
embedded as services, at run-time. Such services can in turn discover and interact with others registered
to the infrastructure. The environment offers tools for handling the autonomic behaviour of resources:
special services automatically orchestrate and combine given resource typologies into different
workflows, in order to achieve or guarantee an expected behaviour (e.g. replica management, availability
of service). In particular, the service kits currently supported by D-NET permit the composition of
applications whose services can collect and flexibly aggregate content from a federation of Digital
Library systems (multimedia archives and institutional repositories) and supply end-users with portals to
access and manipulate such content by means of advanced functionalities, e.g. search, collection
management, user registration, query profiling, user communities, and others.
Due to its high flexibility and low uptake cost, D-NET currently serves as the software platform for
a number of EU projects: DRIVER [Pr7] and DRIVER-II [Pr8] (which first funded its realization), the
European Film Gateway [Pr1], OpenAIRE [Pr4], HOPE [Pr5], and R2D2 [Pr3]. Furthermore, D-NET is
being evaluated, for the construction of their respective national repository infrastructures, by the
national consortia represented by the Chinese Academy of Sciences (China), Consorcio Madroño (Spain) and
the National and University Library of Ljubljana (Slovenia).
Publications: [CCMP07b] [CCMP07a] [ABMM08] [ACC+08b] [ACC+08a] [dri07] [FHM+07] [efg07]
[MPZ08a] [ACC+09] [IMP09b] [MMC+10b] [MMHP10] [ILM+10] [MIP09]
2006 - ongoing Digital Library Management Systems.
Digital Library Management Systems (DLMS) are systems providing developers with the functionalities
they require to develop Digital Library Systems (DLS), such as multimedia archives and institutional
repositories. In the last few years, mainly due to the growth of Digital Libraries and their integration
and federation over the web, DLMS naturally evolved to support the management of Compound Objects.
Compound Objects conform to rather flexible data models, according to which DLS Information Spaces
are formed by digital objects (e.g. documents, metadata) that can be combined by (typically named)
relationships to form arbitrary information graphs. Existing DLMS platforms tend to support full
flexibility (graph-stores) and renounce the benefits of static typing.
Paolo Manghi leads the DOROTY Lab (Marko Mikulicic, Michele Artini and Alessia Bardi) of the
D-Lib group and is responsible for the research, design and development activities of the DOROTY
project [Pr15]. DOROTY (Digital Object RepOsitory with TYpes) is a DLMS whose novelty is to provide
Digital Library Systems designers and developers with tools inspired by the traditional typed approach of
Database Management Systems. Using DOROTY, DLS developers first declare the types of the objects to
be manipulated and then realize applications that rely on type checking and on optimized storage and
access methodologies. DOROTY's data model offers a wide range of abstractions common to Digital Library
object modeling. Examples are object collections, named relationships, object versioning, metadata
descriptions, and others, as well as operations to manipulate them. Beyond recovering type correctness
and data integrity properties, the adoption of types in DLMS foundations allows for the design of an
efficient and optimized DLMS store, capable of exploiting type information to achieve the best performance
in object persistence and access.
As such, DOROTY can ease the design and implementation of Digital Library Systems, such as
multimedia archives and institutional repositories. The DLMS is currently used as the back-end for the
repository technologies developed in the DRIVER-II project [Pr8] (Enhanced Publication Service), the
EFG project [Pr1] (European Film Information Space Service) and R2D2 [Pr3].
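The "declare types first, then store objects" approach described above can be sketched as a tiny in-memory store. This is an illustration in Python under stated assumptions, not DOROTY's data model or API: the names `ObjectType`, `TypedStore`, `declare`, `insert` and `relate`, as well as the `Document` and `Metadata` types, are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectType:
    """A declared object type: attribute names with value types, plus named relationships."""
    name: str
    attributes: dict                               # attribute name -> Python type
    relations: dict = field(default_factory=dict)  # relation name -> target type name


class TypedStore:
    """Toy object store that checks every object and relationship against declared types."""

    def __init__(self):
        self.types = {}    # type name -> ObjectType
        self.objects = {}  # object id -> (type name, attribute values)
        self.links = []    # (source id, relation name, target id)

    def declare(self, object_type):
        self.types[object_type.name] = object_type

    def insert(self, obj_id, type_name, **values):
        declared = self.types[type_name].attributes
        if set(values) != set(declared):
            raise TypeError(f"{type_name} expects attributes {sorted(declared)}")
        for attr, value in values.items():
            if not isinstance(value, declared[attr]):
                raise TypeError(f"{type_name}.{attr} must be {declared[attr].__name__}")
        self.objects[obj_id] = (type_name, values)

    def relate(self, source_id, relation, target_id):
        source_type = self.types[self.objects[source_id][0]]
        target_type_name = self.objects[target_id][0]
        if source_type.relations.get(relation) != target_type_name:
            raise TypeError(f"{source_type.name} has no relation {relation!r} to {target_type_name}")
        self.links.append((source_id, relation, target_id))


store = TypedStore()
store.declare(ObjectType("Metadata", {"language": str}))
store.declare(ObjectType("Document", {"title": str, "year": int}, {"describedBy": "Metadata"}))
store.insert("d1", "Document", title="Divina Commedia", year=1320)
store.insert("m1", "Metadata", language="it")
store.relate("d1", "describedBy", "m1")
```

Unlike a plain graph store, this store refuses an object with a wrongly typed attribute or a relationship that the declared types do not allow, which is the kind of integrity guarantee the typed approach aims to recover.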
Publications: [CCM+09] [Man08e] [LDPP06] [LPP07] [DLP+08]
2009 - ongoing Architectures for collaborative research environments
Scholarly research involves the systematic study of information sources in order to
establish facts and
reach new conclusions. It encompasses survey, analysis, evaluation and creation as
distinct phases that
are performed iteratively or in parallel by accessing a range of local and remote
resources. Throughout
these activities scholars create collections of relevant work, ranging from publication
references to new
information acquired through experiments or correspondence with other scholars. We use the term "reading
list" to refer to such collections. Existing software packages or web services for managing publication
lists, like CiteULike, lack integration with researchers' workflows, which may require access to both
desktop and online resources. This research targets the architecture and system design of ScholarLynk,
a desktop tagging tool that enables researchers to build and maintain reading lists across distributed
data stores, in collaboration with other researchers.
Publications: [KMI+10b] [KMI+10a]
4.2 Funding Projects
Paolo Manghi has been involved in the authoring, research, coordination and technical management
activities of several funding projects, all mentioned in the previous section. In particular:
1. Author and conceiver of the accepted EC project proposals: DRIVER-II [dri07][Pr8], EFG
[efg07][Pr1], OpenAIRE [ope09][Pr4], HOPE [hop10][Pr5], R2D2 Microsoft Desktop [r2d90][Pr3];
2. Project Technical Coordinator (to be appointed after negotiation), Researcher and Technical
Manager for the project HOPE [Pr5];
3. Technical Manager and Researcher for the projects DRIVER [Pr7], DRIVER-II [Pr8], EFG [Pr1],
R2D2 [Pr3], OpenAIRE [Pr4];
4. Researcher for the projects DL.org [Pr2] and BELIEF [Pr6].
He was also delegated as project consortium representative to meet the European Union
reviewers/experts at the following EU proposal evaluation hearings:
25.06.2007 European Union - Call for Hearings (FP7, e-infrastructures): consortium scientific
representative for the project proposal DRIVER-II [Pr8]. Funded.
16.05.2009 European Union - Call for Hearings (FP7, e-infrastructures): consortium scientific
representative for the project proposal OpenAIRE [Pr4]. Funded.
10.02.2010 European Union - Call for Hearings (FP7, e-infrastructures): consortium scientific
representative for the project proposal ResearchLife, the Footprint of Research Past lead to Research
Future. Not funded.
The following sections present the projects in which he was directly involved, each accompanied by the
project details: name, aims, research topics, scientific roles he covered, affiliated institution at the
time of the project, and funding granted. Projects are categorized into ongoing and past.
4.2.1 Ongoing projects
[Pr1] EFG (European Film Gateway) [efg07]: Best Practice Networks project (grant agreement: ECP
517006-EFG, call: FP7 EU eContentplus 2007). The project aims at deploying and maintaining a
pan-European Film Gateway infrastructure, capable of gathering, aggregating and exposing film
information and content available from movie archives across European countries. The EFG online
portal will provide direct access to about 790,000 digital objects including films, photos, posters,
drawings, sound material and text documents. Users will be able to search (in multiple
languages) and to browse through the digital objects. EFG aims at supporting technical and semantic
interoperability between cinematographic archives and will support export to Europeana.
EFG will also evaluate measures to be taken to deal with IPR issues. [GED09]
Research topics: service infrastructures and data models, multi-lingual search, authority file
management;
Roles: Proposal Author, Researcher and Technical Manager;
Affiliated Institution: ISTI-CNR, role: project scientific coordination (Dr. Pasquale Savino);
Funding: 500,000 EUR (project total: 4,500,000 EUR);
Duration: September 2008 - August 2011.
[Pr2] DL.org (Digital Library Interoperability, Best Practices and Modelling Foundations): FP7 EC
coordination and support action project (grant agreement: 231551, Call: FP7-ICT-2007-3, Activity
code ICT-2007.4.3: Digital libraries and technology-enhanced learning). The DL.org Coordination
Action addresses two important and closely related Digital Library issues: (a) strengthening the
modeling foundations of the field through consolidation and enhancement of the DELOS DL Reference
Model and (b) identifying requirements, solutions, and future challenges for achieving Digital
Library Interoperability (www.dlorg.eu).
Topics: interoperability models for Digital Library Systems;
Roles: Researcher;
Affiliated Institution: ISTI-CNR, role: project coordination (Dr. Donatella Castelli);
Funding: 590,000 EUR (project total: 1,200,000 EUR);
Duration: December 2008 - November 2010.
[Pr3] R2D2 - MSN DRIVER Desktop [r2d90]: research project between Microsoft Corporation, Research
Departments of Cambridge and Redmond (ref. Scarlet Schwiderski-Grosche), the D-Lib research
group of ISTI-CNR (ref. Donatella Castelli) and the MaDgIK Laboratory (ref. Prof. Yannis
Ioannidis) of the Dept. of Informatics and Telecommunications, Kapodistrian University of Athens.
The project aims at the definition of Microsoft desktop tools enabling research collaboration, by
exploiting the support of service oriented infrastructures (D-NET software) and exploring new avenues
in collaborative tagging systems.
Topics: tagging and collaborative systems for very large digital libraries, service infrastructures for
the aggregation of Digital Library systems, Digital Library Management Systems;
Roles: Proposal Author, Researcher and Technical Manager;
Affiliated Institution: ISTI-CNR;
Funding: 100,000 EUR (project total 200,000 EUR).
Duration: October 2009 - September 2010.
[Pr4] OpenAIRE (Open Access Infrastructure for Research in Europe) [ope09] [CMB09] [CM10]: FP7
Combination of Collaborative projects and Coordination and Support Actions - Integrated
Infrastructures Initiative project (I3) proposal (grant agreement: 246686, call:
FP7-INFRASTRUCTURES-2009-1). The project is a key actor in the dissemination and uptake of the EU Open
Access mandate. It aims at deploying and maintaining the OpenAIRE System, which enables the
European Infrastructure for Open Access articles published with funding from FP7 projects.
OpenAIRE offers a portal (www.openaire.eu) from which authors can deposit their articles,
compliant Open Access repositories can register to be pro-actively harvested by the infrastructure,
and users can search and access the articles or statistics on articles per project and per research
area.
Topics: service infrastructures for the aggregation of Digital Library systems and Digital Library
Management Systems, automatic deposition in remote repositories, collaborative data curation,
authority file management;
Roles: Proposal Author, Researcher, and Technical Manager;
Affiliated Institution: ISTI-CNR, role: project technical coordination (Dr. Donatella Castelli);
Funding: 660,000 EUR (project total: 4,170,000 EUR);
Duration: December 2009 - November 2012.
[Pr5] HOPE (Heritage of the People's Europe) (2010-2013) [hop10]: Best Practice Networks project
(grant agreement: not yet assigned, under negotiation - FP7 EU eContentplus 2009). HOPE is a Best
Practice Network of archives, libraries and museums of social and labor history institutions across
Europe. It aims to improve access to the vast amount of highly significant but scattered digital
collections on social history. It proposes to achieve this by promoting the adoption of standards
and best practices for digital libraries amongst its partners, by ensuring that the metadata and the
content become available through Europeana, and by implementing a full-scale discovery-to-delivery
model.
Topics: service infrastructures for the aggregation of Digital Library systems;
Roles: Project Technical Coordinator (to be appointed after negotiation), Proposal Author,
Researcher, and Technical Manager;
Affiliated Institution: ISTI-CNR, role: project technical coordination (Dr. Paolo Manghi);
Funding: 330,000 EUR (project total: 2,700,000 EUR);
Duration: April 2010 - March 2014.
4.2.2 Past projects
Only the major projects in which he was involved are listed below. He also participated
in several others, such as GRID.it (FIRB IT), DataX (IT), EPSRC (UK), Training Movement Grant (EC),
and the NAPI project (Microsoft Research Cambridge).
[Pr6] BELIEF (Bringing Europe's eLectronic Infrastructures to Expanding Frontiers): FP6 EC Specific
Supporting Action project (grant agreement: 026500, Call: FP6-2004-IST-6 - 3.2.3 "Communication
Network Development - eInfrastructure Consolidating Initiatives"). The aim of the project is to
facilitate knowledge exchange on eInfrastructures by offering a one-stop home for public
eInfrastructure documentation, e.g. project DoWs and project deliverables. This information will be
readily accessible to BELIEF Community Members through the BELIEF Digital Library, especially
developed to provide a central repository for eInfrastructure information.
Topics: Digital Library Systems and methodologies for document collection;
Roles: Researcher;
Affiliated Institution: ISTI-CNR, role: partner (Dr. Donatella Castelli);
Duration: November 2005 - December 2007;
Funding: 220,000 EUR (project total: 950,000 EUR).
[Pr7] DRIVER (the Digital Repository Infrastructure Vision for European Research): FP6 EC Specific Targeted Research Project (grant agreement: IST-034047, Call: FP6-2005-IST-5 - 2.5.6.3 "Research Networking Testbeds"). The project's practical goal is to encourage Open Access business models among researchers and publishers by giving centralized access and visibility to the critical mass of publications already available from repositories in Europe. To this aim, the project funded the realization and deployment of the D-NET software toolkit and today operates the first European Open Access Repository Infrastructure. D-NET services are combined to form multiple distributed applications over the aggregation of content from a federation of institutional repositories (today 240, for about 2,500,000 digital object surrogates), i.e. archives containing any form of scientific output, including scientific/technical reports, working papers, pre-prints, articles and original research data. [MM06][Man06]
Topics: service infrastructures for the aggregation of Digital Library systems;
Roles: Researcher and Technical Manager;
Affiliated Institution: ISTI-CNR, role: project coordination (Dr. Donatella Castelli);
Funding: 430,000 EUR (project total: 2,700,000 EUR)
Duration: June 2006 - November 2007
[Pr8] DRIVER-II (the Digital Repository Infrastructure Vision for European Research - Phase Two): FP7 EU Combination of Collaborative projects and Coordination and support actions project (grant agreement: 212147, call: INFRA-2007-1.2.1 - Scientific Digital Repositories). The project goal is to continue the operation of the DRIVER Infrastructure [Pr7] and to extend it with a Digital Library System for the management of "enhanced publications", special compound objects grouping together articles and the research data related to them. [Man08e] [MM08c] [MMSI08] [Man08b] [Man08a] [MS08] [Man08c] [MM08a] [MM08b] [Man08d]
Topics: service infrastructures for the aggregation of Digital Library Systems, Compound Object data models and DLMS, DOROTY [Pr15].
Roles: Proposal Author, Researcher and Technical Manager.
Affiliated Institution: ISTI-CNR, role: project coordination (Dr. Donatella Castelli);
Funding: 580,000 EUR (project total: 2,700,000 EUR)
Duration: December 2007 - November 2009
4.3 Research projects
Paolo Manghi's activities resulted in the following research projects, whose context and details can be found in section 4.1 Topics. Some of them were funded with targeted grants, others through a number of EC projects where the research products involved could be fruitfully used by real user communities. This is the case for D-NET [Pr16] and DOROTY [Pr15], which were developed, enhanced and used to support the real-case communities of the DRIVER [Pr7], DRIVER-II [Pr8], EFG [Pr1], OpenAIRE [Pr4], and HOPE [Pr5] projects.
[Pr9] Hippo (High-level Internet Programming with Persistent Objects) (1997-1999). Roles: Researcher. Affiliated Institution: Department of Computer Science, Glasgow University (UK).
[Pr10] SNAQue (1998-2000). Roles: Researcher and Technical Manager. Affiliated Institution: Department of Computer Science, Strathclyde University (UK).
[Pr11] TeQuyLa (1999-2001). Roles: Researcher. Affiliated Institution: Dipartimento di Informatica, Università di Pisa.
[Pr12] TQL (2000-2001). Roles: Researcher. Affiliated Institution: Dipartimento di Informatica, Università di Pisa. [ACC+01]
[Pr13] microXQuery (2001-2003). Roles: Researcher. Affiliated Institution: Dipartimento di Informatica, Università di Pisa.
[Pr14] XPeer: P2P XML Database (2003-2006). Roles: Researcher and Technical Manager. Affiliated Institution: Dipartimento di Informatica, Università di Pisa. [CGMS06b]
[Pr15] DOROTY (2009-ongoing). Roles: Researcher and Technical Manager. Affiliated Institution: Istituto di Scienza e Tecnologie dell'Informazione, Consiglio Nazionale delle Ricerche, Pisa.
[Pr16] D-NET Software Toolkit (2007-ongoing). Roles: Researcher and Technical Manager. Affiliated Institution: Istituto di Scienza e Tecnologie dell'Informazione, Consiglio Nazionale delle Ricerche, Pisa. [MM09]
5 International cooperations
Paolo Manghi has been involved in several international cooperations with scientists from academia and companies, either through projects or because of common research interests. The most relevant are:
From 01.10.1995 to 31.03.1996 he visited the PISA Research Group of Prof. R. Connor and Prof. R. Morrison at the Department of Computer Science of St Andrews University (UK). The purpose of the visit was to study and compare programming methodologies for the orthogonally persistent languages Napier88 (PISA research group, St Andrews University) and Fibonacci (Fibonacci research group, University of Pisa).
In 1997 he joined the research group led by Prof. R. Connor at Strathclyde University
(UK) to
work on the projects Hippo and SNAQue.
In 2001/02 he joined the project Network-Aware Programming and Interoperability (NAPI),
the
result of a cooperation between Cambridge Microsoft Research (UK, references: Luca
Cardelli and
Bruno Quarta) and a selection of Italian Universities (Pisa, Bologna, Firenze, and
Milano).
In 2005 he cooperated with Redmond Microsoft Research to work on the project Big Top. The goal of the cooperation is the integration of the TQL language within HighWire's programming environment.
In 2006 he started the DRIVER and DRIVER-II project activities, where he started a
flourishing
cooperation with the MaDgIK Laboratory of Prof. Yannis Ioannidis, Kapodistrian University
of
Athens, Dept. of Informatics and Telecommunications, Wolfram Horstmann from the
University of
Bielefeld and Wojtek Sylwestrzak from ICM research center of Warsaw (Poland).
In the period from 05.05.2008 to 11.05.2008 he visited Prof. Yannis Ioannidis at Kapodistrian University of Athens to start work on data models and types for digital libraries.
In 2008 he became a working group member of DL.org Expertise, an activity of the EU project DL.org [Pr2]. The working group aims at defining best practices, models and standards on interoperability for Digital Libraries with respect to Content. Publications: [TCC+10]
In 2009 he joined the Core Expert Group of the major Europeana EU Project (http://www.europeana.eu), which aims at building a common multi-lingual access point to Europe's distributed cultural and scientific heritage, including digital content from all types of heritage institutions (archives, libraries, museums and audio-visual collections). The group works at refining and identifying the specifications of the data model and architecture of Europeana.
In 2009 he started a cooperation with Microsoft Research, departments of Cambridge (ref. Fabrizio Gagliardi, Scarlet Schwiderski-Grosche, Natasa Milic-Frayling and Gabriella Kazai) and Redmond (ref. Alex Wide and Lee Dirks), on integration between Microsoft Desktop and the D-NET-based DRIVER infrastructure [Pr7][Pr8]. The outcome of the cooperation is the project R2D2 [Pr3].
6 Conferences organization and invited talks
Paolo Manghi is the organizer and editor of the Very Large Digital Libraries international workshop, at its second edition this year and so far held in conjunction with the European Conference on Digital Libraries. The workshop is the outcome of the several activities and international cooperations developed in the area of infrastructures and content management for Digital Libraries. For similar reasons, he was invited to present his research at workshops, conferences and working groups.
Conference Organization
Production Editor 8th International Workshop on Database Programming Languages (DBPL
2001), Sept. 2001, Rome, Italy. Chair: Prof. Giorgio Ghelli.
Organizer and Editor First International Workshop on Very Large Digital Libraries (ECDL
2008), Sept. 2008, Aarhus, Denmark (SIGMOD workshop report [MPZ08b] [MPZ08a])
Organizer and Editor Second International Workshop on Very Large Digital Libraries (ECDL
2009), Sept. 2009, Corfu, Greece [IMP09a] [IMP09b] [MIP09]
Organizer and Editor Third International Workshop on Very Large Digital Libraries (ECDL 2010), Sept. 2010, Glasgow, UK. Proposal accepted.
Reviewer activities
Database and Programming Language Conference, Springer Verlag Reviewer in 2001. Organizer: Giorgio Ghelli.
Data and Knowledge Engineering Journal, Elsevier Reviewer in 2004. Editor in Chief: Peter Chen.
International Journal on Digital Libraries, Springer Reviewer in November 2008. Associate
Editor: Carol Ann Peters.
Open Repositories Reviewer in 2010. Chief: Dr. Wolfram Horstmann.
Other reviewing activities were carried out for conferences and workshops such as Italian
Symposium
on Database Systems (SEBD) and Italian Research Conference on Digital Libraries (IRCDL).
Invited talks
20-22.08.1996: invited speaker at EC-US Workshop on Persistence (Kinloch Rannoch,
Scotland).
Presentation title: Programming methodologies for Persistent Programming Languages.
17-19.05.1998: invited speaker at Pastel Workshop (Fort William, Scotland). Presentation title: On the Unification of Persistent Programming and the World Wide Web.
07-09.12.1998: invited speaker at Workshop on Persistence (Kinloch Rannoch, Scotland). Presentation title: Interning Never Externed Data.
07-09.12.2001: invited speaker at First Workshop on Network-Aware Programming and Interoperability (NAPI) at Microsoft Research (Cambridge, England).
11-12.02.2002: invited speaker at Second Workshop on Network-Aware Programming and Interoperability (NAPI) in San Miniato, Pisa.
27.05.2005: invited speaker at BigTop project Workshop (MS Research), Dipartimento di Informatica, University of Bologna. Presentation title: XPeer: a P2P XML database system.
29.05.2007 invited speaker at the DELOS Network of Excellence School on Digital
Libraries,
Settignano, Florence, Italy. Presentation title: The DRIVER Repository Infrastructure.
25.06.2007 invited speaker at SURF Foundation - Utrecht, The Netherlands. Presentation title: Compound Objects.
18.02.2008 invited speaker at the DRIVER Open Summit, Goettingen, Germany. Presentation
title: The DRIVER Infrastructure architecture.
02.10.2008 invited speaker at TrebleCLEF Workshop, Wolfsburg, Switzerland. Presentation title: The DRIVER Infrastructure architecture: multilinguality issues.
16.11.2008 invited speaker at SPARC Digital Repositories Meeting, Baltimore, USA.
Presentation
title: Building Sustainable Aggregative Digital Library System.
13-14.01.2009 invited speaker at EuropeanaLocal Knowledge Sharing Workshop, Den Haag, The
Netherlands. Presentation title: The DRIVER Infrastructure.
16-18.03.2009 invited speaker at DRIVER/JISC Digital Repositories Workshop, Amsterdam, The Netherlands. Presentation title: Typed Compound Object Models for Digital Libraries.
03.09.2009 invited speaker at Società Internazionale per lo Studio del Medioevo Latino (SISMEL), Certosa del Galluzzo, Florence. Presentation title: Aggregating content from heterogeneous metadata data sources.
18.02.2010 invited speaker at European Science Foundation meeting, ILC-CNR, Pisa. Presentation title: D-NET Software Toolkit and the DRIVER Infrastructure experience. Ref: Dr. Andrea Bozzi, director of the Istituto di Linguistica Computazionale, CNR.
7 Teaching and research supervision
Since moving to the University of Pisa, Paolo Manghi has taught regularly, and only recently had to reduce this activity due to growing research and project coordination responsibilities. His main experience is in teaching database design principles, but he has also covered other topics, such as architectures and programming languages and methodologies. Over the years, teaching has naturally evolved into the supervision of MSc and post-doc students.
Member of CNR commissions for public selection of personnel
"Commissione esaminatrice di pubblica selezione" prot. 953, 23.04.2009 Selection for 1 position as temporary research fellow ("assegnista di ricerca") to work on the project DRIVER-II. Winner of selection: Alessia Bardi.
Post-doc supervision
D-Lib workgroup on "content for digital libraries" The workgroup comprises the following research staff: Marko Mikulicic, Alessia Bardi, Cristina Tang, Claudio Atzori and Michele Artini. Its main focus is on studies, research and development in the area of Digital Library Management Systems, namely provenance information and data models for compound objects.
Student supervision
Tirocinio 2007 Emanuele Cavarretta, Dipartimento di Informatica, Università di Pisa: Regrouper (research group publisher)
Laurea specialistica 2008-2009 Alessia Bardi, Dipartimento di Informatica, Università di Pisa: DOROTY: Digital Object RepOsitory with Types (for more information refer to the research activities section of this CV);
Laurea specialistica 2009-2010 Sandro La Bruzzo, Dipartimento di Informatica, Università di Pisa: Implementazione di interfacce OAI per l'esportazione di Compound Objects nel sistema DOROTY; work on design and implementation of standard data-export interfaces for the typed data model of the DOROTY DLMS (for more information see the research activities section of this CV).
University, Master, and Post-Doc lecturer
December 2000 Course: Basi di Dati, Master Web & Wireless. Organizers: University of Pisa
and Vodafone (ex Omnitel). Duration: 1 day.
December 2001 Course: Applicazioni Web per Basi di Dati, Web & Net Master. Organizers: Pisa Province Council, University of Pisa (reference: Dr. Laura Ricci), and Qualital Consortium. Duration: 2 weeks.
January 2002 Course: Basi di Dati, IFTS course Tecnico Informatico per il supporto al commercio elettronico e ai servizi informativi territoriali. Organizers: Regione Toscana and University of Pisa (reference: Prof. Dino Pedreschi). Duration: semester.
May 2004 Course: Database, MASTER SIT. Polo Scientifico e Tecnologico dell'Area Livornese
Srl, Livorno. Duration: 2 weeks.
December 2004 Course: IT4PS - Access. Organizers: Dipartimento di patologia sperimentale biotecnologie mediche, infettivologia e epidemiologia, University of Pisa, CRUI (Conferenza dei Rettori delle Università Italiane), and AICA (Associazione Italiana per l'Informatica e il Calcolo Automatico). Duration: 2 weeks.
March 2005 Course: Master "Open Source". Seminars: Le Interrogazioni SQL e Ambienti per la definizione delle interrogazioni. Dipartimento di Informatica, University of Pisa. Duration:
October 2000 Course: Architetture degli Elaboratori I - Operating Systems, Diploma in Informatica, Dipartimento di Informatica, University of Pisa. Duration: semester (from 23.10.2000).
March 2001 Course: Basi di Dati - Laboratorio Oracle, Laurea in Informatica, Dipartimento
di
Informatica, University of Pisa. Duration: semester (from 12.03.2001).
March 2002 Course: Basi di Dati - Laboratorio Oracle, Laurea in Informatica, Dipartimento
di
Informatica, University of Pisa. Duration: semester (from 18.03.2002).
October 2002 Course: Laboratorio di Introduzione alla Programmazione - Java, Laurea in Informatica, Dipartimento di Informatica, University of Pisa. Duration: semester (from 16.10.2002).
March 2003 Course: Basi di Dati - Laboratorio Oracle, Laurea in Informatica, Dipartimento di Informatica, University of Pisa. Duration: semester (from 01.03.2003).
October 2003 Course: Laboratorio di Introduzione alla Programmazione - Java, Laurea in Informatica, Dipartimento di Informatica, University of Pisa. Duration: semester (from 20.10.2003).
March 2004 Course: Basi di Dati - Laboratorio Oracle, Laurea in Informatica, Dipartimento
di
Informatica