<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
<channel>
<title>Complex and Adaptive Systems Laboratory</title>
<link>http://hdl.handle.net/10197/2494</link>
<description/>
<pubDate>Tue, 21 May 2013 13:39:17 GMT</pubDate>
<dc:date>2013-05-21T13:39:17Z</dc:date>
<item>
<title>Towards the Improved Discovery and Design of Functional Peptides: Common Features of Diverse Classes Permit Generalized Prediction of Bioactivity</title>
<link>http://hdl.handle.net/10197/3891</link>
<description>Towards the Improved Discovery and Design of Functional Peptides: Common Features of Diverse Classes Permit Generalized Prediction of Bioactivity
Mooney, Catherine; Haslam, Niall J.; Pollastri, Gianluca; Shields, Denis C.
The conventional wisdom is that certain classes of bioactive peptides have specific structural features that endow their particular functions. Accordingly, predictions of bioactivity have focused on particular subgroups, such as antimicrobial peptides. We hypothesized that bioactive peptides may share more general features, and assessed this by contrasting the predictive power of existing antimicrobial predictors as well as a novel general predictor, PeptideRanker, across different classes of peptides. We observed that existing antimicrobial predictors had reasonable predictive power to identify peptides of certain other classes, i.e. toxin and venom peptides. We trained two general predictors of peptide bioactivity, one focused on short peptides (4-20 amino acids) and one focused on long peptides (&gt;20 amino acids). These general predictors had performance that was typically as good as, or better than, that of specific predictors. We noted some striking differences in the features of short peptide and long peptide predictions; in particular, high scoring short peptides favour phenylalanine. This is consistent with the hypothesis that short and long peptides have different functional constraints, perhaps reflecting the difficulty for typical short peptides in supporting independent tertiary structure. We conclude that there are general shared features of bioactive peptides across different functional classes, indicating that computational prediction may accelerate the discovery of novel bioactive peptides and aid in the improved design of existing peptides, across many functional classes. An implementation of the predictive method, PeptideRanker, may be used to identify among a set of peptides those that may be more likely to be bioactive.
</description>
<pubDate>Mon, 01 Oct 2012 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3891</guid>
<dc:date>2012-10-01T00:00:00Z</dc:date>
</item>
<item>
<title>Aggregating Content and Network Information to Curate Twitter User Lists</title>
<link>http://hdl.handle.net/10197/3871</link>
<description>Aggregating Content and Network Information to Curate Twitter User Lists
Greene, Derek; Sheridan, Gavin; Smyth, Barry; Cunningham, Pádraig
Twitter introduced user lists in late 2009, allowing users to be grouped according to meaningful topics or themes. Lists have since been adopted by media outlets as a means of organising content around news stories. Thus the curation of these lists is important - they should contain the key information gatekeepers and present a balanced perspective on a story. Here we address this list curation process from a recommender systems perspective. We propose a variety of criteria for generating user list recommendations, based on content analysis, network analysis, and the "crowdsourcing" of existing user lists. We demonstrate that these types of criteria are often only successful for datasets with certain characteristics. To resolve this issue, we propose the aggregation of these different "views" of a news story on Twitter to produce more accurate user recommendations to support the curation process.
ACM RecSys 2012 Workshop on Recommender Systems &amp; The Social Web, 9 September, 2012, Dublin
</description>
<pubDate>Mon, 25 Jun 2012 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3871</guid>
<dc:date>2012-06-25T00:00:00Z</dc:date>
</item>
<item>
<title>Profile-based short linear protein motif discovery.</title>
<link>http://hdl.handle.net/10197/3789</link>
<description>Profile-based short linear protein motif discovery.
Haslam, Niall J.; Shields, Denis C.
Background
Short linear protein motifs are attracting increasing attention as functionally independent sites, typically 3-10 amino acids in length, that are enriched in disordered regions of proteins. Multiple methods have recently been proposed to discover over-represented motifs within a set of proteins based on simple regular expressions. Here, we extend these approaches to profile-based methods, which provide a richer motif representation.
Results
The profile motif discovery method MEME performed relatively poorly for motifs in disordered regions of proteins. However, when we applied evolutionary weighting to account for redundancy amongst homologous proteins, and masked out poorly conserved regions of disordered proteins, the performance of MEME was equivalent to that of regular expression methods. Nevertheless, the two approaches returned different subsets within both a benchmark dataset and a more realistic discovery dataset.
Conclusions
Profile-based motif discovery methods complement regular expression based methods. Whilst profile-based methods are computationally more intensive, they are likely to discover motifs currently overlooked by regular expression methods.
</description>
<pubDate>Fri, 18 May 2012 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3789</guid>
<dc:date>2012-05-18T00:00:00Z</dc:date>
</item>
<item>
<title>Genetic Programming for the Induction of Seasonal Forecasts: A Study on Weather-derivatives</title>
<link>http://hdl.handle.net/10197/3754</link>
<description>Genetic Programming for the Induction of Seasonal Forecasts: A Study on Weather-derivatives
Agapitos, Alexandros; O'Neill, Michael; Brabazon, Anthony
The last ten years have seen the introduction and rapid growth of a market in weather derivatives, financial instruments whose payoffs are determined by the outcome of an underlying weather metric. These instruments allow organisations to protect themselves against the commercial risks posed by weather fluctuations and also provide investment opportunities for financial traders. The size of the market for weather derivatives is substantial, with a survey suggesting that the market size exceeded $45.2 billion in 2005/2006, with most contracts being written on temperature-based metrics. A key problem faced by buyers and sellers of weather derivatives is the determination of an appropriate pricing model (and resulting price) for the financial instrument. A critical input into the pricing model is an accurate forecast of the underlying weather metric. In this study we induce seasonal forecasting temperature models by means of a Machine Learning algorithm. Genetic Programming (GP) is applied to learn an accurate, localised, long-term forecast of a temperature profile as part of the broader process of determining an appropriate pricing model for weather-derivatives. Two different approaches for GP-based time-series modelling are adopted. The first is based on a simple system identification approach whereby the temporal index of the time-series is used as the sole regressor of the evolved model. The second is based on iterated single-step prediction that resembles autoregressive and moving average models in statistical time-series modelling. The major issue of effective model generalisation is tackled through the use of an ensemble learning technique that allows a family of forecasting models to be evolved using different training sets, so that predictions are formed by averaging the diverse model outputs. Empirical results suggest that GP is able to successfully induce seasonal forecasting models, and that search-based autoregressive models compose a more stable unit of evolution in terms of generalisation performance for the three datasets considered. In addition, the use of ensemble learning of 5-model predictors enhanced the generalisation ability of the system as opposed to single-model prediction systems. On a more general note, there is an increasing recognition of the utility of evolutionary methodologies for the modelling of meteorological, climatic and ecological phenomena, and this work also contributes to this literature.
</description>
<pubDate>Sun, 01 Jan 2012 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3754</guid>
<dc:date>2012-01-01T00:00:00Z</dc:date>
</item>
<item>
<title>Review of Statistical Network Analysis: Models, Algorithms, and Software</title>
<link>http://hdl.handle.net/10197/3753</link>
<description>Review of Statistical Network Analysis: Models, Algorithms, and Software
Salter-Townshend, Michael; White, Arthur; Gollini, Isabella; Murphy, Thomas Brendan
The analysis of network data is an area that is rapidly growing, both within and outside of the discipline of statistics.
This review provides a concise summary of methods and models used in the statistical analysis of network data, including the Erdos–Renyi model, the exponential family class of network models, and recently developed latent variable models. Many of the methods and models are illustrated by application to the well-known Zachary karate dataset. Software routines available for implementing methods are emphasized throughout.
The aim of this paper is to provide a review with enough detail about many common classes of network models to whet the appetite and to point the way to further reading.
</description>
<pubDate>Wed, 01 Aug 2012 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3753</guid>
<dc:date>2012-08-01T00:00:00Z</dc:date>
</item>
<item>
<title>Predictive modelling of angiotensin converting enzyme inhibitory dipeptides</title>
<link>http://hdl.handle.net/10197/3748</link>
<description>Predictive modelling of angiotensin converting enzyme inhibitory dipeptides
Norris, Roseanne; Casey, Fergal; FitzGerald, Richard; Shields, Denis C.; Mooney, Catherine
The ability of docking to predict angiotensin converting enzyme (ACE) inhibitory dipeptide sequences was assessed using AutoDock Vina. All potential dipeptides and phospho-dipeptides were docked and scored. Peptide intestinal stability was assessed using a prediction amino acid clustering model. Selected dipeptides, having AutoDock Vina scores ≤ −8.1 and predicted to be ‘stable’ intestinally, were characterised using LIGPLOT and tested for ACE-inhibitory potency. Two newly identified ACE-inhibitory dipeptides, Asp-Trp and Trp-Pro, having Vina scores of −8.3 and −8.6, gave IC50 values of 258 ± 4.23 and 217 ± 15.7 μM, respectively. LIGPLOT analysis indicated no zinc interaction for these dipeptides. Phospho-dipeptides were predicted to have a good affinity for ACE. However, the experimentally determined IC50 results did not correlate since, for example, Trp-pThr and Pro-pTyr, having Vina scores of −8.5 and −8.1, respectively, displayed IC50 values of &gt;500 μM. While docking allowed identification of new ACE inhibitory dipeptides, it may not be a fully reliable predictive tool in all cases.
</description>
<pubDate>Tue, 14 Aug 2012 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3748</guid>
<dc:date>2012-08-14T00:00:00Z</dc:date>
</item>
<item>
<title>Source separation on seismic data : application in a geophysical setting</title>
<link>http://hdl.handle.net/10197/3698</link>
<description>Source separation on seismic data : application in a geophysical setting
Moni, Aishwarya; Bean, Christopher J.; Lokmer, Ivan; Rickard, Scott
This article gives a brief description of the Degenerate Unmixing Estimation Technique (DUET) and applies it in a geophysical setting. Source separation has not been fully addressed by geophysicists and is a crucial first step to locating simultaneous sources, which in turn helps with understanding the dynamics of the sources and their source mechanisms. DUET is applied to synthetic seismic signals. The source separation method works successfully to separate two contemporaneous explosive sources, and two simultaneous oblique tensile cracks in a 3D structural model of Mt Etna. The method is also applied to field recordings on Mt Etna from 2008. The method separates Long Period events from tremor, Long Period events from Volcano Tectonic events and different sources of tremor from each other.
</description>
<pubDate>Tue, 01 May 2012 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3698</guid>
<dc:date>2012-05-01T00:00:00Z</dc:date>
</item>
<item>
<title>A preliminary investigation of overfitting in evolutionary driven model induction : implications for financial modelling</title>
<link>http://hdl.handle.net/10197/3655</link>
<description>A preliminary investigation of overfitting in evolutionary driven model induction : implications for financial modelling
Tuite, Clíodhna; Agapitos, Alexandros; O'Neill, Michael; Brabazon, Anthony
This paper investigates the effects of early stopping as a method to counteract overfitting in evolutionary data modelling using Genetic Programming. Early stopping has been proposed as a method to avoid model overtraining, which has been shown to lead to a significant degradation of out-of-sample performance. If we assume some sort of performance metric maximisation, the most widely used early training stopping criterion is the moment within the learning process at which an unbiased estimate of the performance of the model begins to decrease after a strictly monotonic increase through the earlier learning iterations. We conduct an initial investigation into the effects of early stopping on the performance of Genetic Programming in symbolic regression and financial modelling. Empirical results suggest that early stopping using the above criterion increases the extrapolation abilities of symbolic regression models, but is by no means the optimal training-stopping criterion in the case of a real-world financial dataset.
EvoFIN 2011, 5th European Event on Evolutionary and Natural Computation in Finance and Economics in EvoApplications, Torino, Italy, 27-29 April 2011
</description>
<pubDate>Wed, 27 Apr 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3655</guid>
<dc:date>2011-04-27T00:00:00Z</dc:date>
</item>
<item>
<title>A non-destructive grammar modification approach to modularity in grammatical evolution</title>
<link>http://hdl.handle.net/10197/3612</link>
<description>A non-destructive grammar modification approach to modularity in grammatical evolution
Swafford, John Mark; Hemberg, Erik; O'Neill, Michael; Nicolau, Miguel; Brabazon, Anthony
Modularity has proven to be an important aspect of evolutionary computation. This work is concerned with discovering and using modules in one form of grammar-based genetic programming, grammatical evolution (GE). Previous work has shown that simply adding modules to GE’s grammar has the potential to disrupt fit individuals developed by evolution up to that point. This paper presents a solution to prevent the disturbance in fitness that can come with modifying GE’s grammar with previously discovered modules. The results show an increase in performance from a previously examined grammar modification approach and also an increase in performance when compared to standard GE.
Presented at GECCO '11, the 13th annual conference companion on Genetic and evolutionary computation, Dublin, Ireland, 12-16 July 2011
</description>
<pubDate>Tue, 12 Jul 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3612</guid>
<dc:date>2011-07-12T00:00:00Z</dc:date>
</item>
<item>
<title>Genotype-phenotype mapping in dynamic environments with grammatical evolution</title>
<link>http://hdl.handle.net/10197/3602</link>
<description>Genotype-phenotype mapping in dynamic environments with grammatical evolution
Fagan, David
The application of a genotype-phenotype mapping in Evolutionary Computation is not a new idea; however, how this mapping process is interpreted and implemented varies widely. In the majority of cases a very simple abstraction of the biological genotype-phenotype mapping is used, but as our understanding of this process increases, the deficiencies in current approaches become more evident. In this paper, an outline of the approaches that have been taken in the investigation of the genotype-phenotype map in Grammatical Evolution is presented and an outline of proposed future work is introduced.
GECCO 2011, ACM Genetic and Evolutionary Computation Conference, Graduate Student Workshop, Dublin, Ireland, 12-16th July, 2011
</description>
<pubDate>Tue, 12 Jul 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3602</guid>
<dc:date>2011-07-12T00:00:00Z</dc:date>
</item>
<item>
<title>Sentiment analysis of online media</title>
<link>http://hdl.handle.net/10197/3574</link>
<description>Sentiment analysis of online media
Salter-Townshend, Michael; Murphy, Thomas Brendan
A joint model for annotation bias and document classification is presented in the context of media sentiment analysis. We consider an Irish online media data set comprising online news articles with user annotations of negative, positive or irrelevant impact on the Irish economy. The joint model combines a statistical model for user annotation bias and a Naive Bayes model for the document terms. An EM algorithm is used to estimate the annotation bias model, the unobserved biases in the user annotations, the classifier parameters and the sentiment of the articles. The joint modeling of both the user biases and the classifier is demonstrated to be superior to estimation of the bias followed by the estimation of the classifier parameters.
Paper presented at the DAGM-GfKl/IFCS 2011, Joint Conference of the German Classification Society (GfKl) and the German Association for Pattern Recognition (DAGM), August 31 to September 2, 2011 and at the IFCS 2011 Symposium of the International Federation of Classification Societies (IFCS), August 30, 2011, Frankfurt am Main, Germany
</description>
<pubDate>Sun, 01 Jan 2012 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3574</guid>
<dc:date>2012-01-01T00:00:00Z</dc:date>
</item>
<item>
<title>Learning environment models in car racing using stateful genetic programming</title>
<link>http://hdl.handle.net/10197/3573</link>
<description>Learning environment models in car racing using stateful genetic programming
Agapitos, Alexandros; O'Neill, Michael; Brabazon, Anthony; Theodoridis, Theodoros
For computational intelligence to be useful in creating game agent AI, we need to focus on methods that allow the creation and maintenance of models of the environment that the artificial agents inhabit. Maintaining a model allows an agent to plan its actions more effectively by combining immediate sensory information with memories that have been acquired while operating in that environment. To this end, we propose a way to build environment models for non-player characters in car racing games using stateful Genetic Programming. A method is presented in which general purpose 2-dimensional data-structures are used to build a model of the racing track. Results demonstrate that model-building behaviour can be cooperatively coevolved with car controlling behaviour in modular programs that make use of these models in order to navigate successfully around a racing track.
Paper presented at the 2011 IEEE Conference on Computational Intelligence and Games (CIG’11), Seoul, South Korea, Aug. 31-Sept. 3, 2011
</description>
<pubDate>Wed, 31 Aug 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3573</guid>
<dc:date>2011-08-31T00:00:00Z</dc:date>
</item>
<item>
<title>Dynamic environments can speed up evolution with genetic programming</title>
<link>http://hdl.handle.net/10197/3571</link>
<description>Dynamic environments can speed up evolution with genetic programming
O'Neill, Michael; Nicolau, Miguel; Brabazon, Anthony
We present a study of dynamic environments with genetic programming to ascertain whether a dynamic environment can speed up evolution when compared to an equivalent static environment. We present an analysis of the types of dynamic variation which can occur with a variable-length representation such as that adopted in genetic programming, identifying modular varying, structural varying and incremental varying goals. An empirical investigation comparing these three types of varying goals on dynamic symbolic regression benchmarks reveals an advantage for goals which vary in terms of increasing structural complexity. This provides evidence to support the added difficulty variable length representations incur due to their requirement to search structural and parametric space concurrently, and how directing search through varying structural goals with increasing complexity can speed up search with genetic programming.
</description>
<pubDate>Sat, 01 Jan 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3571</guid>
<dc:date>2011-01-01T00:00:00Z</dc:date>
</item>
<item>
<title>Dynamic ant : introducing a new benchmark for genetic programming in dynamic environments</title>
<link>http://hdl.handle.net/10197/3570</link>
<description>Dynamic ant : introducing a new benchmark for genetic programming in dynamic environments
Fagan, David; Nicolau, Miguel; Hemberg, Erik; O'Neill, Michael; Brabazon, Anthony
In this paper we present a new variant of the ant problem in the dynamic problem domain. This approach presents a functional dynamism to the problem landscape, whereby the behaviour of the ant is driven by its ability to explore the search space being constrained. This restriction is designed in such a way as to ensure that no generalised solution to the problem is possible, thus providing a functional change in behaviour.
</description>
<pubDate>Thu, 14 Apr 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3570</guid>
<dc:date>2011-04-14T00:00:00Z</dc:date>
</item>
<item>
<title>Exploring grammatical modification with modules in grammatical evolution</title>
<link>http://hdl.handle.net/10197/3554</link>
<description>Exploring grammatical modification with modules in grammatical evolution
Swafford, John Mark; O'Neill, Michael; Nicolau, Miguel; Brabazon, Anthony
There have been many approaches to modularity in the field of evolutionary computation, each tailored to function with a particular representation. This research examines one approach to modularity and grammar modification with a grammar-based approach to genetic programming, grammatical evolution (GE). Here, GE’s grammar was modified over the course of an evolutionary run with modules in order to facilitate their appearance in the population. This is the first step in what will be a series of analyses of methods of modifying GE’s grammar to enhance evolutionary performance. The results show that identifying modules and using them to modify GE’s grammar can have a negative effect on search performance when done improperly. However, if undertaken thoughtfully, there are possible benefits to dynamically enhancing the grammar with modules identified during evolution.
Presented at Genetic Programming - 14th European Conference, EuroGP 2011, Torino, Italy, April 27-29, 2011
</description>
<pubDate>Wed, 27 Apr 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3554</guid>
<dc:date>2011-04-27T00:00:00Z</dc:date>
</item>
<item>
<title>Combining structural analysis and multi-objective criteria for evolutionary architectural design</title>
<link>http://hdl.handle.net/10197/3546</link>
<description>Combining structural analysis and multi-objective criteria for evolutionary architectural design
Byrne, Jonathan; Fenton, Michael; Hemberg, Erik; McDermott, James; O'Neill, Michael; Shotton, Elizabeth; McNally, Ciaran
This study evolves and categorises a population of conceptual designs by their ability to handle physical constraints. The design process involves a trade-off between form and function. The aesthetic considerations of the designer are constrained by physical considerations and material cost. In previous work, we developed a design grammar capable of evolving aesthetically pleasing designs through the use of an interactive evolutionary algorithm. This work implements a fitness function capable of applying engineering objectives to automatically evaluate designs and, in turn, reduce the search space that is presented to the user.
EvoMUSART 2011, 9th European Event on Evolutionary and Biologically Inspired Music, Sound, Art and Design in EvoApplications, Torino, Italy, Apr 27-29, 2011
</description>
<pubDate>Sat, 01 Jan 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3546</guid>
<dc:date>2011-01-01T00:00:00Z</dc:date>
</item>
<item>
<title>Acceleration of grammatical evolution using graphics processing units</title>
<link>http://hdl.handle.net/10197/3545</link>
<description>Acceleration of grammatical evolution using graphics processing units
Pospichal, Petr; Murphy, Eoin; O'Neill, Michael; Schwarz, Josef; Jaros, Jiri
Several papers show that symbolic regression is suitable for data analysis and prediction in financial markets. Grammatical Evolution (GE), a grammar-based form of Genetic Programming (GP), has been successfully applied in solving various tasks including symbolic regression. However, often the computational effort to calculate the fitness of a solution in GP can limit the area of possible application and/or the extent of experimentation undertaken. This paper deals with utilizing mainstream graphics processing units (GPU) for acceleration of GE solving symbolic regression. GPU optimization details are discussed and the NVCC compiler is analyzed. We design an effective mapping of the algorithm to the CUDA framework, and in so doing must tackle constraints of the GPU approach, such as the PCI-express bottleneck and main memory transactions. This is the first occasion GE has been adapted for running on a GPU. We measure our implementation running on one core of CPU Core i7 and GPU GTX 480 together with a GE library written in JAVA, GEVA. Results indicate that our algorithm offers the same convergence, and it is suitable for a larger number of regression points where GPU is able to reach speedups of up to 39 times when compared to GEVA on a serial CPU code written in C. In conclusion, properly utilized, GPU can offer an interesting performance boost for GE tackling symbolic regression.
Presented at the CIGPU Workshop at GECCO '11, the 13th annual conference companion on Genetic and evolutionary computation, Dublin, Ireland, 12-16 July 2011
</description>
<pubDate>Tue, 12 Jul 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3545</guid>
<dc:date>2011-07-12T00:00:00Z</dc:date>
</item>
<item>
<title>Tracer spectrum : a visualisation method for distributed evolutionary computation</title>
<link>http://hdl.handle.net/10197/3540</link>
<description>Tracer spectrum : a visualisation method for distributed evolutionary computation
O'Neill, Michael; Brabazon, Anthony; Hemberg, Erik
We present a novel visualisation method for island-based evolutionary algorithms based on the concept of tracers as adopted in medicine and molecular biology to follow a biochemical process. For example, a radioisotope or dye can be used to replace a stable component of a biological compound, and the signal from the radioisotope can be monitored as it passes through the body to measure the compound’s distribution and elimination from the system. In a similar fashion we attach a tracer dye to individuals in each island, where each individual in any one island is marked with the same colour, and each island then has its own unique colour signal. We can then monitor how individuals undergoing migration events are distributed throughout the entire island ecosystem, thereby allowing the user to visually monitor takeover times and the resulting loss of diversity. This is achieved by visualising each island as a spectrum of the tracer dye associated with each individual. Experiments adopting different rates of migration and network connectivity confirm earlier research which predicts that island models are extremely sensitive to the size and frequency of migrations.
</description>
<pubDate>Thu, 01 Sep 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3540</guid>
<dc:date>2011-09-01T00:00:00Z</dc:date>
</item>
<item>
<title>Semantic-based subtree crossover applied to dynamic problems</title>
<link>http://hdl.handle.net/10197/3539</link>
<description>Semantic-based subtree crossover applied to dynamic problems
Nguyen, Quang Uy; Murphy, Eoin; O'Neill, Michael; Nguyen, Xuan Hoai
Although many real world problems are dynamic in nature, the study of Genetic Programming in dynamic environments is still immature. This paper investigates the application of some recently proposed semantic-based crossover operators on a series of dynamic problems. The operators studied include Semantic Similarity based Crossover and the Most Semantic Similarity based Crossover. The experimental results show the advantage of using semantic based crossovers when tackling dynamic problems.
Presented at KSE 2011, The Third International Conference on Knowledge and Systems Engineering, Hanoi, Vietnam, 14-17 October, 2011
</description>
<pubDate>Fri, 14 Oct 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3539</guid>
<dc:date>2011-10-14T00:00:00Z</dc:date>
</item>
<item>
<title>Evolving behaviour trees for the Mario AI competition using grammatical evolution</title>
<link>http://hdl.handle.net/10197/3534</link>
<description>Evolving behaviour trees for the Mario AI competition using grammatical evolution
Perez, Diego; Nicolau, Miguel; O'Neill, Michael; Brabazon, Anthony
This paper investigates the applicability of Genetic Programming type systems to dynamic game environments. Grammatical Evolution was used to evolve Behaviour Trees, in order to create controllers for the Mario AI Benchmark. The results obtained reinforce the applicability of evolutionary programming systems to the development of artificial intelligence in games, and in dynamic systems in general, illustrating their viability as an alternative to more standard AI techniques.
EvoGAMES 2011 3rd European Event on Bio-inspired Algorithms in Games in EvoApplications 2011, Torino, Italy, April, 2011
</description>
<pubDate>Wed, 27 Apr 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3534</guid>
<dc:date>2011-04-27T00:00:00Z</dc:date>
</item>
<item>
<title>Neutrality in evolutionary algorithms... what do we know?</title>
<link>http://hdl.handle.net/10197/3532</link>
<description>Neutrality in evolutionary algorithms... what do we know?
Galván-López, Edgar; Poli, Riccardo; Kattan, Ahmed; O'Neill, Michael; Brabazon, Anthony
Over recent years, the effects of neutrality have attracted the attention of many researchers in the Evolutionary Algorithms (EAs) community. A mutation from one gene to another is considered neutral if this modification does not affect the phenotype. This article provides a general overview of the work carried out on neutrality in EAs. Using as a framework the origin of neutrality and its study in different paradigms of EAs (e.g., Genetic Algorithms, Genetic Programming), we discuss the most significant works and findings on this topic. This work points towards open issues, which the community needs to address.
</description>
<pubDate>Wed, 02 Mar 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3532</guid>
<dc:date>2011-03-02T00:00:00Z</dc:date>
</item>
<item>
<title>Interactive operators for evolutionary architectural design</title>
<link>http://hdl.handle.net/10197/3529</link>
<description>Interactive operators for evolutionary architectural design
Byrne, Jonathan; Hemberg, Erik; O'Neill, Michael
In this paper we explore different techniques that allow the user to direct interactive evolutionary search. Broadening interaction beyond simple evaluation increases the amount of feedback and bias a user can apply to the search. Increased feedback will have the effect of directing the algorithm to more fruitful areas of the search space. This paper examines whether additional feedback from the user can be a benefit to the problem of evolutionary design. We find that the interface between the user and the search space plays a vital role in this process.
</description>
<pubDate>Tue, 12 Apr 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3529</guid>
<dc:date>2011-04-12T00:00:00Z</dc:date>
</item>
<item>
<title>Semantically-based crossover in genetic programming : application to real-valued symbolic regression</title>
<link>http://hdl.handle.net/10197/3528</link>
<description>Semantically-based crossover in genetic programming : application to real-valued symbolic regression
Nguyen, Quang Uy; Nguyen, Xuan Hoai; O'Neill, Michael; McKay, Bob (Bob I.); Galván-López, Edgar
We investigate the effects of semantically-based crossover operators in Genetic Programming, applied to real-valued symbolic regression problems. We propose two new relations derived from the semantic distance between subtrees, known as Semantic Equivalence and Semantic Similarity. These relations are used to guide variants of the crossover operator, resulting in two new crossover operators – Semantics Aware Crossover (SAC) and Semantic Similarity-based Crossover (SSC). SAC, which was introduced and studied previously, is included here for the purpose of comparison and analysis. SSC extends SAC by more closely controlling the semantic distance between subtrees to which crossover may be applied. The new operators were tested on some real-valued symbolic regression problems and compared with Standard Crossover (SC), Context Aware Crossover (CAC), Soft Brood Selection (SBS), and No Same Mate (NSM) selection. The experimental results show on the problems examined that, with computational effort measured by the number of function node evaluations, only SSC and SBS were significantly better than SC, and SSC was often better than SBS. Further experiments were also conducted to analyse the performance sensitivity to the parameter settings for SSC. This analysis leads to a conclusion that SSC is more constructive and has higher locality than SAC, NSM and SC; we believe these are the main reasons for the improved performance of SSC.
</description>
<pubDate>Wed, 01 Jun 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3528</guid>
<dc:date>2011-06-01T00:00:00Z</dc:date>
</item>
<item>
<title>Examining grammars and grammatical evolution in dynamic environments</title>
<link>http://hdl.handle.net/10197/3527</link>
<description>Examining grammars and grammatical evolution in dynamic environments
Murphy, Eoin
This paper is concerned with the effect of the grammar type on grammatical evolution when evolving in dynamic environments. Both representation and dynamic environments have been recognised as important open issues in the field of genetic programming. This paper outlines the need for further study on both topics in the context of grammatical evolution, suggesting further inspiration be taken from nature in an attempt to improve the representations available to grammatical evolution. The research undertaken to date is listed, along with the future work to be completed.
Presented at GECCO '11, the 13th annual conference companion on Genetic and evolutionary computation, Dublin, Ireland, 12-16 July 2011
</description>
<pubDate>Tue, 12 Jul 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3527</guid>
<dc:date>2011-07-12T00:00:00Z</dc:date>
</item>
<item>
<title>Reactiveness and navigation in computer games : different needs, different approaches</title>
<link>http://hdl.handle.net/10197/3520</link>
<description>Reactiveness and navigation in computer games : different needs, different approaches
Perez, Diego; Nicolau, Miguel; O'Neill, Michael; Brabazon, Anthony
This paper presents an approach to the Mario AI Benchmark problem, using the A* algorithm for navigation, and an evolutionary process combining routines for the reactiveness of the resulting bot. The Grammatical Evolution system was used to evolve Behaviour Trees, combining both types of routines, while the highly dynamic nature of the environment required specific approaches to deal with over-fitting issues. The results obtained highlight the need for specific algorithms for the different aspects of controlling a bot in a game environment, while Behaviour Trees provided the perfect representation to combine all those algorithms.
Paper presented at the 2011 IEEE Conference on Computational Intelligence and Games (CIG’11), Seoul, South Korea, August 31st-September 3rd 2011
</description>
<pubDate>Wed, 31 Aug 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3520</guid>
<dc:date>2011-08-31T00:00:00Z</dc:date>
</item>
<item>
<title>Using grammatical evolution to parameterise interactive 3D image generation</title>
<link>http://hdl.handle.net/10197/3519</link>
<description>Using grammatical evolution to parameterise interactive 3D image generation
Nicolau, Miguel; Costelloe, Dan
This paper describes an Interactive Evolutionary system for generating pleasing 3D images using a combination of Grammatical Evolution and Jenn3d, a freely available visualiser of Cayley graphs of finite Coxeter groups. Using interactive GE with some novel enhancements, the parameter space of the Jenn3d image-generating system is navigated by the user, permitting the creation of realistic, unique and award winning images in just a few generations. One of the evolved images has been selected to illustrate the proceedings of the EvoStar conference in 2011.
Paper presented at EvoMUSART 2011, 9th European Event on Evolutionary and Biologically Inspired Music, Sound, Art and Design, Torino, Italy, 27-29 April 2011
</description>
<pubDate>Wed, 27 Apr 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3519</guid>
<dc:date>2011-04-27T00:00:00Z</dc:date>
</item>
<item>
<title>Investigation of the performance of different mapping orders for GE on the max problem</title>
<link>http://hdl.handle.net/10197/3518</link>
<description>Investigation of the performance of different mapping orders for GE on the max problem
Fagan, David; Nicolau, Miguel; Hemberg, Erik; O'Neill, Michael; Brabazon, Anthony; McGarraghy, Sean
We present an analysis of how the genotype-phenotype map in Grammatical Evolution (GE) can affect performance on the Max Problem. Earlier studies have demonstrated a performance decrease for Position independent Grammatical Evolution (πGE) in this problem domain. In πGE the genotype-phenotype map is changed so that the evolutionary algorithm controls not only what the next expansion will be but also the choice of what position in the derivation tree is expanded next. In this study we extend previous work and investigate whether the ability to change the order of expansion is responsible for the performance decrease or if the problem is simply that a certain order of expansion in the genotype-phenotype map is responsible. We conclude that the reduction of performance in the Max problem domain by πGE is rooted in the way the genotype-phenotype map and the genetic operators used with this mapping interact.
Paper presented at the 14th European Conference, EuroGP 2011, Torino, Italy, April 27-29, 2011
</description>
<pubDate>Wed, 27 Apr 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3518</guid>
<dc:date>2011-04-27T00:00:00Z</dc:date>
</item>
<item>
<title>Maximum margin decision surfaces for increased generalisation in evolutionary decision tree learning</title>
<link>http://hdl.handle.net/10197/3517</link>
<description>Maximum margin decision surfaces for increased generalisation in evolutionary decision tree learning
Agapitos, Alexandros; O'Neill, Michael; Brabazon, Anthony; Theodoridis, Theodoros
Decision tree learning is one of the most widely used and practical methods for inductive inference. We present a novel method that increases the generalisation of genetically-induced classification trees, which employ linear discriminants as the partitioning function at each internal node. Genetic Programming is employed to search the space of oblique decision trees. At the end of the evolutionary run, a (1+1) Evolution Strategy is used to geometrically optimise the boundaries in the decision space, which are represented by the linear discriminant functions. The evolutionary optimisation concerns maximising the decision-surface margin that is defined to be the smallest distance between the decision-surface and any of the samples. Initial empirical results of the application of our method to a series of datasets from the UCI repository suggest that model generalisation benefits from the margin maximisation, and that the new method is a very competent approach to pattern classification as compared to other learning algorithms.
Paper presented at the 14th European Conference, EuroGP 2011, Torino, Italy, April 27-29, 2011.
</description>
<pubDate>Wed, 27 Apr 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3517</guid>
<dc:date>2011-04-27T00:00:00Z</dc:date>
</item>
<item>
<title>A comparison of GE and TAGE in dynamic environments</title>
<link>http://hdl.handle.net/10197/3516</link>
<description>A comparison of GE and TAGE in dynamic environments
Murphy, Eoin; O'Neill, Michael; Brabazon, Anthony
The lack of study of genetic programming in dynamic environments is recognised as a known issue in the field of genetic programming. This study compares the performance of two forms of genetic programming, grammatical evolution and a variation of grammatical evolution which uses tree-adjunct grammars, on a series of dynamic problems. Mean best fitness plots for the two representations are analysed and compared.
Paper presented at the ACM Genetic and Evolutionary Computation Conference, GECCO 2011, 12-16 July, Dublin, Ireland
</description>
<pubDate>Tue, 12 Jul 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3516</guid>
<dc:date>2011-07-12T00:00:00Z</dc:date>
</item>
<item>
<title>Examining the landscape of semantic similarity based mutation</title>
<link>http://hdl.handle.net/10197/3514</link>
<description>Examining the landscape of semantic similarity based mutation
Nguyen, Quang Uy; Nguyen, Xuan Hoai; O'Neill, Michael
This paper examines how the semantic locality of a search operator affects the fitness landscape of Genetic Programming (GP). We compare the fitness landscapes of GP search when standard subtree mutation and a recently proposed semantic-based mutation, Semantic Similarity-based Mutation (SSM), are used. The comparison is based on two well-studied fitness landscape measures, namely, the autocorrelation function and information content. The experiments were conducted on a family of symbolic regression problems with increasing degrees of difficulty. The results show that SSM helps to significantly smooth out the fitness landscape of GP compared to standard subtree mutation. This gives an explanation for the better performance of SSM over the standard subtree mutation operator.
Paper presented at the ACM Genetic and Evolutionary Computation Conference, GECCO 2011, 12-16 July, Dublin, Ireland
</description>
<pubDate>Tue, 12 Jul 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3514</guid>
<dc:date>2011-07-12T00:00:00Z</dc:date>
</item>
<item>
<title>Examining mutation landscapes in grammar based genetic programming</title>
<link>http://hdl.handle.net/10197/3513</link>
<description>Examining mutation landscapes in grammar based genetic programming
Murphy, Eoin; O'Neill, Michael; Brabazon, Anthony
Representation is a very important component of any evolutionary algorithm. Changing the representation can cause an algorithm to perform very differently. Such a change can have an effect that is difficult to understand. This paper examines what happens to the grammatical evolution algorithm when replacing the commonly used context-free grammar representation with a tree-adjunct grammar representation. We model the landscapes produced when using integer flip mutation with both representations and compare these landscapes using visualisation methods little used in the field of genetic programming.
Paper presented at the Genetic Programming, 14th European Conference, EuroGP 2011, Torino, Italy, April 27-29, 2011
</description>
<pubDate>Wed, 27 Apr 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3513</guid>
<dc:date>2011-04-27T00:00:00Z</dc:date>
</item>
<item>
<title>Defining locality as a problem difficulty measure in genetic programming</title>
<link>http://hdl.handle.net/10197/3512</link>
<description>Defining locality as a problem difficulty measure in genetic programming
Galván-López, Edgar; McDermott, James; O'Neill, Michael; Brabazon, Anthony
A mapping is local if it preserves neighbourhood. In Evolutionary Computation, locality is generally described as the property that neighbouring genotypes correspond to neighbouring phenotypes. A representation has high locality if most genotypic neighbours are mapped to phenotypic neighbours. Locality is seen as a key element in performing effective evolutionary search. It is believed that a representation that has high locality will perform better in evolutionary search and the contrary is true for a representation that has low locality. When locality was introduced, it was the genotype-phenotype mapping in bitstring-based Genetic Algorithms which was of interest; more recently, it has also been used to study the same mapping in Grammatical Evolution. To our knowledge, there are few explicit studies of locality in Genetic Programming (GP). The goal of this paper is to shed some light on locality in GP and use it as an indicator of problem difficulty. Strictly speaking, in GP the genotype and the phenotype are not distinct. We attempt to extend the standard quantitative definition of genotype-phenotype locality to the genotype-fitness mapping by considering three possible definitions. We consider the effects of these definitions in both continuous- and discrete-valued fitness functions. We compare three different GP representations (two of them induced by using different function sets and the other using a slightly different GP encoding) and six different mutation operators. Results indicate that one definition of locality is better in predicting performance.
</description>
<pubDate>Sat, 02 Apr 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3512</guid>
<dc:date>2011-04-02T00:00:00Z</dc:date>
</item>
<item>
<title>A symbolic regression approach to manage femtocell coverage using grammatical genetic programming</title>
<link>http://hdl.handle.net/10197/3511</link>
<description>A symbolic regression approach to manage femtocell coverage using grammatical genetic programming
Hemberg, Erik; Ho, Lester; O'Neill, Michael; Claussen, Holger
We present a novel application of Grammatical Evolution to the real-world problem of femtocell coverage. A symbolic regression approach is adopted in which we wish to uncover an expression to automatically manage the power settings of individual femtocells in a larger femtocell group to optimise the coverage of the network under time varying load. The generation of symbolic expressions is important as it facilitates the analysis of the evolved solutions. Given the multi-objective nature of the problem we hybridise Grammatical Evolution with NSGA-II connected to tabu search. The best evolved solutions have power consumption characteristics superior to those of a fixed coverage femtocell deployment.
Paper presented at the ACM Genetic and Evolutionary Computation Conference GECCO 2011 Symbolic Regression and Modelling Workshop, Dublin, Ireland, 12-16 July
</description>
<pubDate>Sat, 01 Jan 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3511</guid>
<dc:date>2011-01-01T00:00:00Z</dc:date>
</item>
<item>
<title>Distill : a suite of web servers for the prediction of one-, two- and three-dimensional structural features of proteins</title>
<link>http://hdl.handle.net/10197/3444</link>
<description>Distill : a suite of web servers for the prediction of one-, two- and three-dimensional structural features of proteins
Baù, Davide; Martin, Alberto J. M.; Mooney, Catherine; Vullo, Alessandro; Walsh, Ian; Pollastri, Gianluca
We describe Distill, a suite of servers for the prediction of protein structural features: secondary structure; relative solvent accessibility; contact density; backbone structural motifs; residue contact maps at 6, 8 and 12 Angstrom; coarse protein topology. The servers are based on large-scale ensembles of recursive neural networks and trained on large, up-to-date, non-redundant subsets of the Protein Data Bank. Together with structural feature predictions, Distill includes a server for prediction of Cα traces for short proteins (up to 200 amino acids). The servers are state-of-the-art, with secondary structure predicted correctly for nearly 80% of residues (currently the top performance on EVA), 2-class solvent accessibility nearly 80% correct, and contact maps exceeding 50% precision on the top non-diagonal contacts. A preliminary implementation of the predictor of protein Cα traces featured among the top 20 Novel Fold predictors at the last CASP6 experiment as group Distill (ID 0348). The majority of the servers, including the Cα trace predictor, now take into account homology information from the PDB, when available, resulting in greatly improved reliability. All predictions are freely available through a simple joint web interface and the results are returned by email. In a single submission the user can send protein sequences for a total of up to 32k residues to all or a selection of the servers. Distill is accessible at the address: http://distill.ucd.ie/distill/.
</description>
<pubDate>Tue, 05 Sep 2006 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3444</guid>
<dc:date>2006-09-05T00:00:00Z</dc:date>
</item>
<item>
<title>SCLpred : protein subcellular localization prediction by N-to-1 neural networks</title>
<link>http://hdl.handle.net/10197/3443</link>
<description>SCLpred : protein subcellular localization prediction by N-to-1 neural networks
Mooney, Catherine; Wang, Yong-Hong; Pollastri, Gianluca
Knowledge of the subcellular location of a protein provides valuable information about its function and possible interaction with other proteins. In the post-genomic era, fast and accurate predictors of subcellular location are required if this abundance of sequence data is to be fully exploited. We have developed a subcellular localization predictor (SCLpred), which predicts the location of a protein into four classes for animals and fungi and five classes for plants (secreted, cytoplasm, nucleus, mitochondrion and chloroplast) using machine learning models trained on large non-redundant sets of protein sequences. The algorithm powering SCLpred is a novel Neural Network (N-to-1 Neural Network, or N1-NN) we have developed, which is capable of mapping whole sequences into single properties (a functional class, in this work) without resorting to predefined transformations, but rather by adaptively compressing the sequence into a hidden feature vector. We benchmark SCLpred against other publicly available predictors using two benchmarks including a new subset of Swiss-Prot Release 2010_06. We show that SCLpred surpasses the state of the art. The N1-NN algorithm is fully general and may be applied to a host of problems of similar shape, that is, in which a whole sequence needs to be mapped into a fixed-size array of properties, and the adaptive compression it operates may shed light on the space of protein sequences.
The predictive systems described in this article are publicly available as a web server at http://distill.ucd.ie/distill/.
</description>
<pubDate>Sat, 27 Aug 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3443</guid>
<dc:date>2011-08-27T00:00:00Z</dc:date>
</item>
<item>
<title>Beyond the twilight zone : automated prediction of structural properties of proteins by recursive neural networks and remote homology information</title>
<link>http://hdl.handle.net/10197/3442</link>
<description>Beyond the twilight zone : automated prediction of structural properties of proteins by recursive neural networks and remote homology information
Mooney, Catherine; Pollastri, Gianluca
The prediction of 1D structural properties of proteins is an important step toward the prediction of protein structure and function, not only in the ab initio case but also when homology information to known structures is available. Despite this, the vast majority of 1D predictors do not incorporate homology information into the prediction process. We develop a novel structural alignment method, SAMD, which we use to build alignments of putative remote homologues that we compress into templates of structural frequency profiles. We use these templates as additional input to ensembles of recursive neural networks, which we specialise for the prediction of query sequences that show only remote homology to any Protein Data Bank structure. We predict four 1D structural properties – secondary structure, relative solvent accessibility, backbone structural motifs, and contact density. Secondary structure prediction accuracy, tested by five-fold cross-validation on a large set of proteins allowing less than 25% sequence identity between training and test set and query sequences and templates, exceeds 82%, outperforming its ab initio counterpart, other state-of-the-art secondary structure predictors (Jpred 3 and PSIPRED) and two other systems based on PSI-BLAST and COMPASS templates. We show that structural information from homologues improves prediction accuracy well beyond the Twilight Zone of sequence similarity, even below 5% sequence identity, for all four structural properties. Significant improvement over the extraction of structural information directly from PDB templates suggests that the combination of sequence and template information is more informative than templates alone.
</description>
<pubDate>Thu, 01 Oct 2009 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3442</guid>
<dc:date>2009-10-01T00:00:00Z</dc:date>
</item>
<item>
<title>Ab initio and template-based prediction of multi-class distance maps by two-dimensional recursive neural networks</title>
<link>http://hdl.handle.net/10197/3409</link>
<description>Ab initio and template-based prediction of multi-class distance maps by two-dimensional recursive neural networks
Walsh, Ian; Baù, Davide; Martin, Alberto J. M.; Mooney, Catherine; Vullo, Alessandro; Pollastri, Gianluca
Background: Prediction of protein structures from their sequences is still one of the open grand challenges of computational biology. Some approaches to protein structure prediction, especially ab initio ones, rely to some extent on the prediction of residue contact maps. Residue contact map predictions have been assessed at the CASP competition for several years now. Although it has been shown that exact contact maps generally yield correct three-dimensional structures, this is true only at a relatively low resolution (3–4 Å from the native structure). Another known weakness of contact maps is that they are generally predicted ab initio, that is, not exploiting information about potential homologues of known structure. Results: We introduce a new class of distance restraints for protein structures: multi-class distance maps. We show that Cα trace reconstructions based on 4-class native maps are significantly better than those from residue contact maps. We then build two predictors of 4-class maps based on recursive neural networks: one ab initio, or relying on the sequence and on evolutionary information; one template-based, or in which homology information to known structures is provided as a further input. We show that virtually any level of sequence similarity to structural templates (down to less than 10%) yields more accurate 4-class maps than the ab initio predictor. We show that template-based predictions by recursive neural networks are consistently better than the best template and than a number of combinations of the best available templates. We also extract binary residue contact maps at an 8 Å threshold (as per CASP assessment) from the 4-class predictors and show that the template-based version is also more accurate than the best template and consistently better than the ab initio one, down to very low levels of sequence identity to structural templates. Furthermore, we test both ab initio and template-based 8 Å predictions on the CASP7 targets using a pre-CASP7 PDB, and find that both predictors are state-of-the-art, with the template-based one far outperforming the best CASP7 systems if templates with sequence identity to the query of 10% or better are available. Although this is not the main focus of this paper we also report on reconstructions of Cα traces based on both ab initio and template-based 4-class map predictions, showing that the latter are generally more accurate even when homology is dubious. Conclusion: Accurate predictions of multi-class maps may provide valuable constraints for improved ab initio and template-based prediction of protein structures, naturally incorporate multiple templates, and yield state-of-the-art binary maps. Predictions of protein structures and 8 Å contact maps based on the multi-class distance map predictors described in this paper are freely available to academic users at the url http://distill.ucd.ie/.
</description>
<pubDate>Fri, 30 Jan 2009 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3409</guid>
<dc:date>2009-01-30T00:00:00Z</dc:date>
</item>
<item>
<title>Ab initio and homology based prediction of protein domains by recursive neural networks</title>
<link>http://hdl.handle.net/10197/3396</link>
<description>Ab initio and homology based prediction of protein domains by recursive neural networks
Walsh, Ian; Martin, Alberto J. M.; Mooney, Catherine; Rubagotti, Enrico; Vullo, Alessandro; Pollastri, Gianluca
Background: Proteins, especially larger ones, are often composed of individual evolutionary units, domains, which have their own function and structural fold. Predicting domains is an important intermediate step in protein analyses, including the prediction of protein structures.
Results: We describe novel systems for the prediction of protein domain boundaries powered by Recursive Neural Networks. The systems rely on a combination of primary sequence and evolutionary information, predictions of structural features such as secondary structure, solvent accessibility and residue contact maps, and structural templates, both annotated for domains (from the SCOP dataset) and unannotated (from the PDB). We gauge the contribution of contact maps, and PDB and SCOP templates independently and for different ranges of template quality. We find that accurately predicted contact maps are informative for the prediction of domain boundaries, while the same is not true for contact maps predicted ab initio. We also find that gap information from PDB templates is informative, but, not surprisingly, less than SCOP annotations. We test both systems trained on templates of all qualities, and systems trained only on templates of marginal similarity to the query (less than 25% sequence identity). While the first batch of systems produces near perfect predictions in the presence of fair to good templates, the second batch outperforms or matches ab initio predictors down to essentially any level of template quality.
We test all systems in 5-fold cross-validation on a large non-redundant set of multi-domain and single domain proteins. The final predictors are state-of-the-art, with a template-less prediction boundary recall of 50.8% (precision 38.7%) within ± 20 residues and a single domain recall of 80.3% (precision 78.1%). The SCOP-based predictors achieve a boundary recall of 74% (precision 77.1%) again within ± 20 residues, and classify single domain proteins as such in over 85% of cases, when we allow a mix of bad and good quality templates. If we only allow marginal templates (max 25% sequence identity to the query) the scores remain high, with boundary recall and precision of 59% and 66.3%, and 80% of all single domain proteins predicted correctly.
Conclusion: The systems presented here may prove useful in large-scale annotation of protein domains in proteins of unknown structure. The methods are available as public web servers at the address: http://distill.ucd.ie/shandy/ and we plan on running them on a multi-genomic scale and making the results public in the near future.
</description>
<pubDate>Fri, 26 Jun 2009 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3396</guid>
<dc:date>2009-06-26T00:00:00Z</dc:date>
</item>
<item>
<title>Prediction of short linear protein binding regions</title>
<link>http://hdl.handle.net/10197/3395</link>
<description>Prediction of short linear protein binding regions
Mooney, Catherine; Pollastri, Gianluca; Shields, Denis C.; Haslam, Niall J.
Short linear motifs in proteins (typically 3–12 residues in length) play key roles in protein–protein interactions by frequently binding specifically to peptide binding domains within interacting proteins. Their tendency to be found in disordered segments of proteins has meant that they have often been overlooked. Here we present SLiMPred (short linear motif predictor), the first general de novo method designed to computationally predict such regions in protein primary sequences independent of experimentally defined homologs and interactors. The method applies machine learning techniques to predict new motifs based on annotated instances from the Eukaryotic Linear Motif database, as well as structural, biophysical, and biochemical features derived from the protein primary sequence. We have integrated these data sources and benchmarked the predictive accuracy of the method, and found that it performs equivalently to a predictor of protein binding regions in disordered regions, in addition to having predictive power for other classes of motif sites such as polyproline II helix motifs and short linear motifs lying in ordered regions. It will be useful in predicting peptides involved in potential protein associations and will aid in the functional characterization of proteins, especially of proteins lacking experimental information on structures and interactions. We conclude that, despite the diversity of motif sequences and structures, SLiMPred is a valuable tool for prioritizing potential interaction motifs in proteins.
</description>
<pubDate>Fri, 06 Jan 2012 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3395</guid>
<dc:date>2012-01-06T00:00:00Z</dc:date>
</item>
<item>
<title>Accurate prediction of protein secondary structure and solvent accessibility by consensus combiners of sequence and structure information</title>
<link>http://hdl.handle.net/10197/3394</link>
<description>Accurate prediction of protein secondary structure and solvent accessibility by consensus combiners of sequence and structure information
Pollastri, Gianluca; Martin, Alberto J. M.; Mooney, Catherine; Vullo, Alessandro
Background:
Structural properties of proteins such as secondary structure and solvent accessibility contribute to three-dimensional structure prediction, not only in the ab initio case but also when homology information to known structures is available. Structural properties are also routinely used in protein analysis even when homology is available, largely because homology modelling is lower throughput than, say, secondary structure prediction. Nonetheless, predictors of secondary structure and solvent accessibility are virtually always ab initio.
Results:
Here we develop high-throughput machine learning systems for the prediction of protein secondary structure and solvent accessibility that exploit homology to proteins of known structure, where available, in the form of simple structural frequency profiles extracted from sets of PDB templates. We compare these systems with their state-of-the-art ab initio counterparts, and with a number of baselines in which secondary structures and solvent accessibilities are extracted directly from the templates. We show that structural information from templates greatly improves secondary structure and solvent accessibility prediction quality, and that, on average, the systems significantly enrich the information contained in the templates. For sequence similarity exceeding 30%, secondary structure prediction quality is approximately 90%, close to its theoretical maximum, and 2-class solvent accessibility roughly 85%. Gains are robust with respect to template selection noise, and significant for marginal sequence similarity and for short alignments, supporting the claim that these improved predictions may prove beneficial beyond the case in which clear homology is available.
Conclusion:
The predictive systems are publicly available at http://distill.ucd.ie
</description>
<pubDate>Thu, 14 Jun 2007 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3394</guid>
<dc:date>2007-06-14T00:00:00Z</dc:date>
</item>
<item>
<title>Protein structural motif prediction in multidimensional φ-ψ space leads to improved secondary structure prediction</title>
<link>http://hdl.handle.net/10197/3393</link>
<description>Protein structural motif prediction in multidimensional φ-ψ space leads to improved secondary structure prediction
Mooney, Catherine; Vullo, Alessandro; Pollastri, Gianluca
A significant step towards establishing the structure and function of a protein is the prediction of the local conformation of the polypeptide chain. In this article, we present systems for the prediction of three new alphabets of local structural motifs. The motifs are built by applying multidimensional scaling (MDS) and clustering to pair-wise angular distances for multiple φ-ψ angle values collected from high-resolution protein structures. The predictive systems, based on ensembles of bidirectional recurrent neural network architectures, and trained on a large non-redundant set of protein structures, achieve 72%, 66%, and 60% correct motif prediction on an independent test set for di-peptides (six classes), tri-peptides (eight classes) and tetra-peptides (14 classes), respectively, 28–30% above baseline statistical predictors. We then build a further system, based on ensembles of two-layered bidirectional recurrent neural networks, to map structural motif predictions into a traditional 3-class (helix, strand, coil) secondary structure. This system achieves 79.5% correct prediction using the “hard” CASP 3-class assignment, and 81.4% with a more lenient assignment, outperforming a sophisticated state-of-the-art predictor (Porter) trained under the same experimental conditions. The structural motif predictor is publicly available at: http://distill.ucd.ie/porter+/.
</description>
<pubDate>Tue, 24 Oct 2006 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3393</guid>
<dc:date>2006-10-24T00:00:00Z</dc:date>
</item>
<item>
<title>Deriving insights from national happiness indices</title>
<link>http://hdl.handle.net/10197/3243</link>
<description>Deriving insights from national happiness indices
Brew, Anthony; Greene, Derek; Archambault, Daniel; Cunningham, Pádraig
In online social media, individuals produce vast amounts of content which in effect "instruments" the world around us. Users on sites such as Twitter are publicly broadcasting status updates that provide an indication of their mood at a given moment in time, often accompanied by geolocation information. A number of strategies exist to aggregate such content to produce sentiment scores in order to build a "happiness index". In this paper, we describe such a system based on Twitter that maintains a happiness index for nine US cities. The main contribution of this paper is a companion system called SentireCrowds that allows us to identify the underlying causes behind shifts in sentiment. This ability to analyse the components of the sentiment signal highlights a number of problems. It shows that sentiment scoring on social media data without considering context is difficult. More importantly, it highlights cases where sentiment scoring methods are susceptible to unexpected shifts due to noise and trending memes.
Paper presented at the IEEE International Conference on Data Mining series (ICDM'11), December 11th to 14th, 2011, Vancouver, Canada
</description>
<pubDate>Sun, 11 Dec 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3243</guid>
<dc:date>2011-12-11T00:00:00Z</dc:date>
</item>
<item>
<title>An empirical analysis of dynamic multiscale hedging using wavelet decomposition</title>
<link>http://hdl.handle.net/10197/3188</link>
<description>An empirical analysis of dynamic multiscale hedging using wavelet decomposition
Conlon, Thomas; Cotter, John
This paper investigates the hedging effectiveness of a dynamic moving-window OLS hedging model formed using wavelet-decomposed time series. The wavelet transform is applied to calculate the appropriate dynamic minimum-variance hedge ratio for various hedging horizons for a number of assets. The effectiveness of the dynamic multiscale hedging strategy is then tested, both in- and out-of-sample, using standard variance reduction, and the analysis is expanded to include a downside risk metric, the time-horizon-dependent Value-at-Risk. Measured using variance reduction, the effectiveness converges to one at longer scales, while a measure of VaR reduction indicates that a portion of residual risk remains at all scales. Analysis of the hedge portfolio distributions indicates that this unhedged tail risk is related to excess portfolio kurtosis found at all scales.
</description>
<pubDate>Mon, 07 Mar 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3188</guid>
<dc:date>2011-03-07T00:00:00Z</dc:date>
</item>
<item>
<title>SLiMSearch 2.0 : biological context for short linear motifs in proteins</title>
<link>http://hdl.handle.net/10197/3180</link>
<description>SLiMSearch 2.0 : biological context for short linear motifs in proteins
Davey, Norman E.; Haslam, Niall J.; Shields, Denis C.; Edwards, Richard J.
Short, linear motifs (SLiMs) play a critical role in many biological processes. The SLiMSearch 2.0 (Short, Linear Motif Search) web server allows researchers to identify occurrences of a user-defined SLiM in a proteome, using conservation and protein disorder context statistics to rank occurrences. User-friendly output and visualizations of motif context allow the user to quickly gain insight into the validity of a putatively functional motif occurrence. For each motif occurrence, overlapping UniProt features and annotated SLiMs are displayed. Visualization also includes annotated multiple sequence alignments surrounding each occurrence, showing conservation and protein disorder statistics in addition to known and predicted SLiMs, protein domains and known post-translational modifications. In addition, enrichment of Gene Ontology terms and protein interaction partners is provided as an indicator of possible motif function. All web server results are available for download. Users can search motifs against the human proteome or a subset thereof defined by UniProt accession numbers or GO term. The SLiMSearch server is available at: http://bioware.ucd.ie/slimsearch2.html.
</description>
<pubDate>Thu, 26 May 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3180</guid>
<dc:date>2011-05-26T00:00:00Z</dc:date>
</item>
<item>
<title>A preliminary investigation of overfitting in evolutionary driven model induction : implications for financial modelling</title>
<link>http://hdl.handle.net/10197/3059</link>
<description>A preliminary investigation of overfitting in evolutionary driven model induction : implications for financial modelling
Tuite, Cliodhna; Agapitos, Alexandros; O'Neill, Michael; Brabazon, Anthony
This paper investigates the effects of early stopping as a method to counteract overfitting in evolutionary data modelling using Genetic Programming. Early stopping has been proposed as a method to avoid model overtraining, which has been shown to lead to a significant degradation of out-of-sample performance. If we assume some sort of performance metric maximisation, the most widely used early training stopping criterion is the moment within the learning process at which an unbiased estimate of the performance of the model begins to decrease after a strictly monotonic increase through the earlier learning iterations. We conduct an initial investigation of the effects of early stopping on the performance of Genetic Programming in symbolic regression and financial modelling. Empirical results suggest that early stopping using the above criterion increases the extrapolation abilities of symbolic regression models, but is by no means the optimal training-stopping criterion in the case of a real-world financial dataset.
EvoStar 2011, 27-29 April 2011, Torino, Italy
</description>
<pubDate>Fri, 01 Apr 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/3059</guid>
<dc:date>2011-04-01T00:00:00Z</dc:date>
</item>
<item>
<title>CycloPs : generating virtual libraries of cyclized and constrained peptides including nonnatural amino acids</title>
<link>http://hdl.handle.net/10197/2987</link>
<description>CycloPs : generating virtual libraries of cyclized and constrained peptides including nonnatural amino acids
Duffy, Fergal J.; Verniere, Mélanie; Devocelle, Marc; Bernard, Elise; Shields, Denis C.; Chubb, Anthony J.
We introduce CycloPs, software for the generation of virtual libraries of constrained peptides including natural and nonnatural commercially available amino acids. The software is written in the cross-platform Python programming language, and features include generating virtual libraries in one-dimensional SMILES and three-dimensional SDF formats, suitable for virtual screening. The stand-alone software is capable of filtering the virtual libraries using empirical measurements, including peptide synthesizability by standard peptide synthesis techniques, stability, and the druglike properties of the peptide. The software and accompanying Web interface are designed to enable the rapid and convenient generation of large, structurally diverse, synthesizable virtual libraries of constrained peptides for use in virtual screening experiments. The stand-alone software, and the Web interface for evaluating these empirical properties of a single peptide, are available at http://bioware.ucd.ie.
</description>
<pubDate>Thu, 24 Mar 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/2987</guid>
<dc:date>2011-03-24T00:00:00Z</dc:date>
</item>
<item>
<title>SLiMSearch : a webserver for finding novel occurrences of short linear motifs in proteins, incorporating sequence context</title>
<link>http://hdl.handle.net/10197/2942</link>
<description>SLiMSearch : a webserver for finding novel occurrences of short linear motifs in proteins, incorporating sequence context
Davey, Norman E.; Haslam, Niall J.; Shields, Denis C.; Edwards, Richard J.
Short, linear motifs (SLiMs) play a critical role in many biological processes. The SLiMSearch (Short, Linear Motif Search) webserver is a flexible tool that enables researchers to identify novel occurrences of predefined SLiMs in sets of proteins. Numerous masking options give the user great control over the contextual information to be included in the analyses, including evolutionary filtering and protein structural disorder. User-friendly output and visualizations of motif context allow the user to quickly gain insight into the validity of a putatively functional motif occurrence. Users can search motifs against the human proteome, or submit their own datasets of UniProt proteins, in which case motif support within the dataset is statistically assessed for over- and under-representation, accounting for evolutionary relationships between input proteins. SLiMSearch is freely available as open source Python modules and all webserver results are available for download. The SLiMSearch server is available at: http://bioware.ucd.ie/slimsearch.html.
Paper presented at the 5th IAPR International Conference, PRIB 2010, Nijmegen, The Netherlands, September 22-24, 2010
</description>
<pubDate>Fri, 01 Jan 2010 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/2942</guid>
<dc:date>2010-01-01T00:00:00Z</dc:date>
</item>
<item>
<title>Identifying representative textual sources in blog networks</title>
<link>http://hdl.handle.net/10197/2802</link>
<description>Identifying representative textual sources in blog networks
Wade, Karen; Greene, Derek; Lee, Conrad; Archambault, Daniel; Cunningham, Pádraig
We apply methods from social network analysis and visualization to facilitate a study of the Irish blogosphere from a cultural studies perspective. We focus on solving the practical issues that arise when the goal is to perform textual analysis of the corpus produced by a network of bloggers. Previous studies into blogging networks have noted difficulties arising when trying to identify the extent and boundaries of these networks. As a response to calls for increasingly data-led approaches in media and cultural studies, we discuss a variety of social network analysis methods that can be used to identify which blogs can be seen as members of a posited "Irish blogging network". We identify hub blogs, communities of sites corresponding to different topics, and representative bloggers within these communities. Based on this study, we propose a set of analysis guidelines for researchers who wish to map out blogging networks.
</description>
<pubDate>Tue, 01 Feb 2011 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/2802</guid>
<dc:date>2011-02-01T00:00:00Z</dc:date>
</item>
<item>
<title>Identifying online credit card fraud using artificial immune systems</title>
<link>http://hdl.handle.net/10197/2736</link>
<description>Identifying online credit card fraud using artificial immune systems
Brabazon, Anthony; Cahill, Jane; Keenan, Peter; Walsh, Daniel
Significant payment flows now take place on-line, giving rise to a requirement for efficient and effective systems for the detection of credit card fraud. A particular aspect of this problem is that it is highly dynamic, as fraudsters continually adapt their strategies in response to the increasing sophistication of detection systems. Hence, system training by exposure to previous examples of fraudulent transactions can lead to fraud detection systems which are susceptible to new patterns of fraudulent transactions. The nature of the problem suggests that Artificial Immune Systems (AIS) may have particular utility for inclusion in fraud detection systems, as AIS can be constructed which can flag ‘non-standard’ transactions without having seen examples of all possible such transactions during training of the algorithm. In this paper, we investigate the effectiveness of AIS for credit card fraud detection using a large dataset obtained from an on-line retailer. Three AIS algorithms were implemented and their performance was benchmarked against a logistic regression model. The results suggest that AIS algorithms have potential for inclusion in fraud detection systems but that further work is required to realize their full potential in this domain.
Congress on Evolutionary Computation, IEEE World Congress on Computational Intelligence, Barcelona, Spain, 18-23 July
</description>
<pubDate>Thu, 01 Jul 2010 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/2736</guid>
<dc:date>2010-07-01T00:00:00Z</dc:date>
</item>
<item>
<title>Evolutionary learning of technical trading rules without data-mining bias</title>
<link>http://hdl.handle.net/10197/2735</link>
<description>Evolutionary learning of technical trading rules without data-mining bias
Agapitos, Alexandros; O'Neill, Michael; Brabazon, Anthony
In this paper we investigate the profitability of evolved technical trading rules when controlling for data-mining bias. For the first time in the evolutionary computation literature, a comprehensive test for a rule’s statistical significance using Hansen’s Superior Predictive Ability is explicitly taken into account in the fitness function, and multi-objective evolutionary optimisation is employed to drive the search towards individual rules with better generalisation abilities. Empirical results on a spot foreign-exchange market index suggest that increased out-of-sample performance can be obtained after accounting for data-mining bias effects in a multi-objective fitness function, as compared to a single-criterion fitness measure that considers solely the average return.
11th International Conference on Parallel Problem Solving from Nature (PPSN 2010), Krakow, Poland, September 11-15, 2010
</description>
<pubDate>Wed, 01 Sep 2010 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/10197/2735</guid>
<dc:date>2010-09-01T00:00:00Z</dc:date>
</item>
</channel>
</rss>
