Seminar - Exploring Semantic Technology in Search
June 12, 12:00-13:00; June 13, 11:00-13:00; and June 14, 11:00-13:00
Sala Seminari Ovest
This seminar will explore the uses of Natural Language Processing (NLP) technologies in Information Retrieval (IR) to improve information access. There has been a great deal of work in NLP over the past several decades. Although we are still far from understanding natural language, new practical products are making use of NLP and semantic technologies (Siri, Google Translate, Powerset, Watson, ...).
PhD 2009
Seminar - Counting and recounting genes

Prof. Anna Tramontano, Wednesday 23 May -- h: 16.00 -- Sala Seminari Ovest
The study of a biological organism, just as that of any complex machine, requires the knowledge of a complete and detailed list of its parts, of their function and of their tolerance threshold, as well as the instructions for their final assembly.
The draft genomes of humans and of several other organisms are known, and very soon we will have at our disposal the genomes of many diverse individuals, of cancer cells, of other pathogens, of the components of ecosystems and more.

However, there is a long way to go before this information can be translated into knowledge. The catalogue of the basic components of the system (genes and proteins), of their function, and of their interactions, both logical and physical, is far from complete, and unravelling this complexity will most likely take a large fraction of this century and presumably of the next one.

I will describe the computational methods and strategies that are necessary to face these issues at a "systems" level, i.e. by analysing in an integrative fashion the plethora of very diverse available information.

Seminar: Large scale annotation of proteins with labelling methods

Prof. Rita Casadio, Friday 11 May -- h: 15.00 -- Sala Seminari Ovest
As a result of large sequencing projects, data banks of protein sequences and structures are growing rapidly. The number of sequences is however orders of magnitude larger than the number of structures known at atomic level and this is so in spite of the efforts in accelerating processes aiming at the resolution of protein structure.

Tools have been developed in order to bridge the gap between sequence and protein 3D structure, based on the notion that information is to be retrieved from the databases and that knowledge-based methods can help in approaching a solution of the protein folding problem. In this way, several features can be predicted starting from a protein sequence, such as structural and functional motifs and domains, including the topological organisation of a protein inside the membrane phase, and the formation of disulfide bonds in a folded protein structure (1). Our group has been contributing to the field with different computational methods, mainly based on machine learning (neural networks (NNs), hidden Markov models (HMMs), support vector machines (SVMs), hidden neural networks (HNNs) and extreme learning machines (ELMs)) and capable of computing the likelihood of a given feature starting from the protein sequence. Our methods can add to the process of large-scale proteome annotation (endowing sequences with functional and structural features).

Recently, Conditional Random Fields (CRFs) have been introduced as a promising new framework for solving sequence labelling problems, in that they offer several advantages over Hidden Markov Models (HMMs), including the ability to relax the strong independence assumptions made in HMMs. However, several problems of sequence analysis can be successfully addressed only by designing a grammar in order to provide meaningful results. We therefore introduced Grammatical-Restrained Hidden Conditional Random Fields (GRHCRFs) as an extension of Hidden Conditional Random Fields (HCRFs). GRHCRFs, while preserving the discriminative character of HCRFs, can assign labels in agreement with the production rules of a defined grammar (2). The main novelty of GRHCRFs is the possibility of including prior knowledge of the problem in HCRFs by means of a defined grammar. Our current implementation allows regular grammar rules. We tested our GRHCRF on two typical biosequence labelling problems: the prediction of the topology of prokaryotic outer-membrane proteins and the prediction of the bonding states of cysteine residues in proteins (3-5). The results show that separating state names from labels makes it possible to model a huge number of concurring paths compatible with the grammar and with the experimental labels without increasing the time and space computational complexity.
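
The idea of assigning labels only in agreement with a regular grammar, with state names kept separate from labels, can be illustrated with a minimal sketch. This is not the GRHCRF implementation: it is plain Viterbi decoding with made-up per-position label scores, and the state names, the toy grammar (a membrane strand must span at least two residues and must alternate between inner and outer loops) and all function names are hypothetical choices made for illustration only.

```python
NEG_INF = float("-inf")

# Hypothetical toy grammar. Several states share the label "t",
# which is what separating state names from labels means here.
STATE_LABEL = {"I": "i", "U1": "t", "U2": "t", "O": "o", "D1": "t", "D2": "t"}

# Regular-grammar production rules, written as permitted state transitions.
ALLOWED = {
    "I": {"I", "U1"},    # inner loop continues or a strand begins
    "U1": {"U2"},        # a strand has at least two residues
    "U2": {"U2", "O"},
    "O": {"O", "D1"},    # outer loop continues or a strand returns
    "D1": {"D2"},
    "D2": {"D2", "I"},
}
START, END = {"I"}, {"I"}


def constrained_viterbi(label_scores):
    """Return the best-scoring state path allowed by the grammar, or None.

    label_scores is a list with one dict per sequence position, mapping
    each label ("i", "t", "o") to a log-score (assumed finite).
    """
    if not label_scores:
        return None
    states = list(STATE_LABEL)
    # Position 0: only start states are reachable.
    best = {
        s: label_scores[0][STATE_LABEL[s]] if s in START else NEG_INF
        for s in states
    }
    backptrs = []
    for scores_t in label_scores[1:]:
        col, ptr = {}, {}
        for s in states:
            # Consider only predecessors p whose rule permits p -> s.
            prev_score, prev = max(
                ((best[p], p) for p in states if s in ALLOWED[p]),
                default=(NEG_INF, None),
            )
            col[s] = prev_score + scores_t[STATE_LABEL[s]]
            ptr[s] = prev
        best = col
        backptrs.append(ptr)
    final_score, state = max((best[s], s) for s in END)
    if final_score == NEG_INF:
        return None  # no grammar-compatible path of this length
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def labels_of(path):
    """Map a state path to its label string."""
    return "".join(STATE_LABEL[s] for s in path)
```

With this toy grammar, a sequence of five positions can only be labelled as all inner loop, whatever the scores say, because a full strand-loop-strand excursion needs at least seven positions. The grammar therefore overrides locally attractive but globally inconsistent labels.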

Slides of 2012 PhD Workshop
The slides of the presentations are now available.
2012 PhD Workshop
The 2012 edition of the Ph.D. workshop will be held on January 30th, 2012, in the Gerace auditorium of the Department of Computer Science at the University of Pisa. The objective of the workshop is to allow the Ph.D. students of the Department who have already completed their second year of studies to briefly present their research activity to the whole Department. The preliminary program is available here.
Application 2011
The deadline for applications is October 12, 2011. Information about Application 2011 is available here. Interviews will take place from October 24.
The regulations have changed; the application page contains the new ones.
Computer Science PhD Programs in Italy
An article published in Mondo Digitale by Silvia Bencivelli and Pierpaolo Degano, analysing the state of Computer Science PhD programs in Italy.
Course on "Biological Sequence Analysis" by Prof. Ukkonen

The course will take place on Sept. 28 -- Oct. 7 in Sala Seminari W.

On Sept. 28 the lecture will be held from 15 to 17.

(First week in the afternoon, second week in the morning -- to be set in the first lecture)


Course on Web Search (part 2)

Dr Muthukrishnan will give his lectures on December 1st -- 4th.

PhD students' seminars

The seminars will take place on September 14th in Sala Gerace, from 9am.
Preliminary list of courses and seminars 2009
A preliminary list of courses and seminars for 2009 is now available!
New Web site
Welcome to the new Web site of the PhD program. We moved all the information to this new site. The old site is still available. Please note that the URL of the web site has changed.