<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns="http://www.w3.org/2005/Atom">
<title>Clique</title>
<link href="http://hdl.handle.net/10197/2207" rel="alternate"/>
<subtitle/>
<id>http://hdl.handle.net/10197/2207</id>
<updated>2013-06-19T03:02:13Z</updated>
<dc:date>2013-06-19T03:02:13Z</dc:date>
<entry>
<title>Sentiment Analysis of Online Media</title>
<link href="http://hdl.handle.net/10197/3964" rel="alternate"/>
<author>
<name>Salter-Townshend, Michael</name>
</author>
<author>
<name>Murphy, Thomas Brendan</name>
</author>
<id>http://hdl.handle.net/10197/3964</id>
<updated>2013-05-31T10:41:49Z</updated>
<published>2012-12-18T00:00:00Z</published>
<summary type="text">Sentiment Analysis of Online Media
Salter-Townshend, Michael; Murphy, Thomas Brendan
A joint model for annotation bias and document classification is presented in the context of media sentiment analysis. We consider an Irish online media data set comprising online news articles with user annotations of negative, positive or irrelevant impact on the Irish economy. The joint model combines a statistical model for user annotation bias and a Naive Bayes model for the document terms. An EM algorithm is used to estimate the annotation bias model, the unobserved biases in the user annotations, the classifier parameters and the sentiment of the articles. The joint modeling of both the user biases and the classifier is demonstrated to be superior to estimation of the bias followed by the estimation of the classifier parameters.
GfKl 2011: Joint Conference of the German Classification Society (GfKl) and the German Association for Pattern Recognition (DAGM), August 31 to September 2, 2011 and the IFCS 2011: Symposium of the International Federation of Classification Societies (IFCS), August 30, 2011, Frankfurt am Main, Germany
</summary>
<dc:date>2012-12-18T00:00:00Z</dc:date>
</entry>
<entry>
<title>Aggregating Content and Network Information to Curate Twitter User Lists</title>
<link href="http://hdl.handle.net/10197/3871" rel="alternate"/>
<author>
<name>Greene, Derek</name>
</author>
<author>
<name>Sheridan, Gavin</name>
</author>
<author>
<name>Smyth, Barry</name>
</author>
<author>
<name>Cunningham, Pádraig</name>
</author>
<id>http://hdl.handle.net/10197/3871</id>
<updated>2012-10-16T14:03:36Z</updated>
<published>2012-06-25T00:00:00Z</published>
<summary type="text">Aggregating Content and Network Information to Curate Twitter User Lists
Greene, Derek; Sheridan, Gavin; Smyth, Barry; Cunningham, Pádraig
Twitter introduced user lists in late 2009, allowing users to be grouped according to meaningful topics or themes. Lists have since been adopted by media outlets as a means of organising content around news stories. Thus the curation of these lists is important - they should contain the key information gatekeepers and present a balanced perspective on a story. Here we address this list curation process from a recommender systems perspective. We propose a variety of criteria for generating user list recommendations, based on content analysis, network analysis, and the "crowdsourcing" of existing user lists. We demonstrate that these types of criteria are often only successful for datasets with certain characteristics. To resolve this issue, we propose the aggregation of these different "views" of a news story on Twitter to produce more accurate user recommendations to support the curation process.
ACM RecSys 2012 Workshop on Recommender Systems &amp; The Social Web, 9 September, 2012, Dublin
</summary>
<dc:date>2012-06-25T00:00:00Z</dc:date>
</entry>
<entry>
<title>Review of Statistical Network Analysis: Models, Algorithms, and Software</title>
<link href="http://hdl.handle.net/10197/3753" rel="alternate"/>
<author>
<name>Salter-Townshend, Michael</name>
</author>
<author>
<name>White, Arthur</name>
</author>
<author>
<name>Gollini, Isabella</name>
</author>
<author>
<name>Murphy, Thomas Brendan</name>
</author>
<id>http://hdl.handle.net/10197/3753</id>
<updated>2012-08-17T16:25:41Z</updated>
<published>2012-08-01T00:00:00Z</published>
<summary type="text">Review of Statistical Network Analysis: Models, Algorithms, and Software
Salter-Townshend, Michael; White, Arthur; Gollini, Isabella; Murphy, Thomas Brendan
The analysis of network data is an area that is rapidly growing, both within and outside of the discipline of statistics.
This review provides a concise summary of methods and models used in the statistical analysis of network data, including the Erdős–Rényi model, the exponential family class of network models, and recently developed latent variable models. Many of the methods and models are illustrated by application to the well-known Zachary karate dataset. Software routines available for implementing methods are emphasized throughout.
The aim of this paper is to provide a review with enough detail about many common classes of network models to whet the appetite and to point the way to further reading.
</summary>
<dc:date>2012-08-01T00:00:00Z</dc:date>
</entry>
<entry>
<title>Sentiment analysis of online media</title>
<link href="http://hdl.handle.net/10197/3574" rel="alternate"/>
<author>
<name>Salter-Townshend, Michael</name>
</author>
<author>
<name>Murphy, Thomas Brendan</name>
</author>
<id>http://hdl.handle.net/10197/3574</id>
<updated>2013-05-29T10:49:47Z</updated>
<published>2012-01-01T00:00:00Z</published>
<summary type="text">Sentiment analysis of online media
Salter-Townshend, Michael; Murphy, Thomas Brendan
A joint model for annotation bias and document classification is presented in the context of media sentiment analysis. We consider an Irish online media data set comprising online news articles with user annotations of negative, positive or irrelevant impact on the Irish economy. The joint model combines a statistical model for user annotation bias and a Naive Bayes model for the document terms. An EM algorithm is used to estimate the annotation bias model, the unobserved biases in the user annotations, the classifier parameters and the sentiment of the articles. The joint modeling of both the user biases and the classifier is demonstrated to be superior to estimation of the bias followed by the estimation of the classifier parameters.
Paper presented at the DAGM-GfKl/IFCS 2011, Joint Conference of the German Classification Society (GfKl) and the German Association for Pattern Recognition (DAGM), August 31 to September 2, 2011 and at the IFCS 2011 Symposium of the International Federation of Classification Societies (IFCS), August 30, 2011, Frankfurt am Main, Germany
</summary>
<dc:date>2012-01-01T00:00:00Z</dc:date>
</entry>
<entry>
<title>Themecrowds : multiresolution summaries of Twitter usage</title>
<link href="http://hdl.handle.net/10197/3320" rel="alternate"/>
<author>
<name>Archambault, Daniel</name>
</author>
<author>
<name>Greene, Derek</name>
</author>
<author>
<name>Cunningham, Pádraig</name>
</author>
<author>
<name>Hurley, Neil J.</name>
</author>
<id>http://hdl.handle.net/10197/3320</id>
<updated>2011-11-22T14:48:52Z</updated>
<published>2011-10-28T00:00:00Z</published>
<summary type="text">Themecrowds : multiresolution summaries of Twitter usage
Archambault, Daniel; Greene, Derek; Cunningham, Pádraig; Hurley, Neil J.
Users of social media sites, such as Twitter, rapidly generate large volumes of text content on a daily basis. Visual summaries are needed to understand what groups of people are saying collectively in this unstructured text data. Users will typically discuss a wide variety of topics, where the number of authors talking about a specific topic can quickly grow or diminish over time, and what the collective is saying about the subject can shift as a situation develops.
In this paper, we present a technique that summarises what collections of Twitter users are saying about certain topics over time. As the correct resolution for inspecting the data is unknown in advance, the users are clustered hierarchically over a fixed time interval based on the similarity of their posts. The visualisation technique takes this data structure as its input. Given a topic, it finds the correct resolution of users at each time interval and provides tags to summarise what the collective is discussing. The technique is tested on a large microblogging corpus, consisting of millions of tweets and over a million users.
Paper presented at the 3rd International Workshop on Search and Mining User-generated Contents (SMUC 2011), 24th - 28th October 2011, Glasgow
</summary>
<dc:date>2011-10-28T00:00:00Z</dc:date>
</entry>
<entry>
<title>Deriving insights from national happiness indices</title>
<link href="http://hdl.handle.net/10197/3243" rel="alternate"/>
<author>
<name>Brew, Anthony</name>
</author>
<author>
<name>Greene, Derek</name>
</author>
<author>
<name>Archambault, Daniel</name>
</author>
<author>
<name>Cunningham, Pádraig</name>
</author>
<id>http://hdl.handle.net/10197/3243</id>
<updated>2012-03-26T09:13:55Z</updated>
<published>2011-12-11T00:00:00Z</published>
<summary type="text">Deriving insights from national happiness indices
Brew, Anthony; Greene, Derek; Archambault, Daniel; Cunningham, Pádraig
In online social media, individuals produce vast amounts of content which in effect "instruments" the world around us. Users on sites such as Twitter are publicly broadcasting status updates that provide an indication of their mood at a given moment in time, often accompanied by geolocation information. A number of strategies exist to aggregate such content to produce sentiment scores in order to build a "happiness index". In this paper, we describe such a system based on Twitter that maintains a happiness index for nine US cities. The main contribution of this paper is a companion system called SentireCrowds that allows us to identify the underlying causes behind shifts in sentiment. This ability to analyse the components of the sentiment signal highlights a number of problems. It shows that sentiment scoring on social media data without considering context is difficult. More importantly, it highlights cases where sentiment scoring methods are susceptible to unexpected shifts due to noise and trending memes.
Paper presented at the IEEE International Conference on Data Mining series (ICDM'11), December 11th to 14th, 2011, Vancouver, Canada
</summary>
<dc:date>2011-12-11T00:00:00Z</dc:date>
</entry>
<entry>
<title>SLiMSearch : a webserver for finding novel occurrences of short linear motifs in proteins, incorporating sequence context</title>
<link href="http://hdl.handle.net/10197/2942" rel="alternate"/>
<author>
<name>Davey, Norman E.</name>
</author>
<author>
<name>Haslam, Niall J.</name>
</author>
<author>
<name>Shields, Denis C.</name>
</author>
<author>
<name>Edwards, Richard J.</name>
</author>
<id>http://hdl.handle.net/10197/2942</id>
<updated>2013-02-26T09:56:38Z</updated>
<published>2010-01-01T00:00:00Z</published>
<summary type="text">SLiMSearch : a webserver for finding novel occurrences of short linear motifs in proteins, incorporating sequence context
Davey, Norman E.; Haslam, Niall J.; Shields, Denis C.; Edwards, Richard J.
Short, linear motifs (SLiMs) play a critical role in many biological processes. The SLiMSearch (Short, Linear Motif Search) webserver is a flexible tool that enables researchers to identify novel occurrences of pre-defined SLiMs in sets of proteins. Numerous masking options give the user great control over the contextual information to be included in the analyses, including evolutionary filtering and protein structural disorder. User-friendly output and visualizations of motif context allow the user to quickly gain insight into the validity of a putatively functional motif occurrence. Users can search motifs against the human proteome, or submit their own datasets of UniProt proteins, in which case motif support within the dataset is statistically assessed for over- and under-representation, accounting for evolutionary relationships between input proteins. SLiMSearch is freely available as open source Python modules and all webserver results are available for download. The SLiMSearch server is available at: http://bioware.ucd.ie/slimsearch.html.
Paper presented at the 5th IAPR International Conference, PRIB 2010, Nijmegen, The Netherlands, September 22-24, 2010
</summary>
<dc:date>2010-01-01T00:00:00Z</dc:date>
</entry>
<entry>
<title>Preferences in college applications - a nonparametric Bayesian analysis of top-10 rankings</title>
<link href="http://hdl.handle.net/10197/2832" rel="alternate"/>
<author>
<name>Ali, Alnur</name>
</author>
<author>
<name>Murphy, Thomas Brendan</name>
</author>
<author>
<name>Meila, Marina</name>
</author>
<author>
<name>Chen, Harr</name>
</author>
<id>http://hdl.handle.net/10197/2832</id>
<updated>2011-03-10T10:19:16Z</updated>
<published>2010-12-10T00:00:00Z</published>
<summary type="text">Preferences in college applications - a nonparametric Bayesian analysis of top-10 rankings
Ali, Alnur; Murphy, Thomas Brendan; Meila, Marina; Chen, Harr
Applicants to degree courses in Irish colleges and universities rank up to ten degree courses from a list of over five hundred. These data provide a wealth of information concerning applicant degree choices. A Dirichlet process mixture of generalized Mallows models is used to explore data from a cohort of applicants. We find strong and diverse clusters, which in turn give us important insights into the workings of the system. No previously tried model or analysis technique is able to model the data with comparable accuracy.
NIPS Workshop on Computational Social Science and the Wisdom of Crowds, December 10th 2010, Whistler, Canada
</summary>
<dc:date>2010-12-10T00:00:00Z</dc:date>
</entry>
<entry>
<title>Identifying representative textual sources in blog networks</title>
<link href="http://hdl.handle.net/10197/2802" rel="alternate"/>
<author>
<name>Wade, Karen</name>
</author>
<author>
<name>Greene, Derek</name>
</author>
<author>
<name>Lee, Conrad</name>
</author>
<author>
<name>Archambault, Daniel</name>
</author>
<author>
<name>Cunningham, Pádraig</name>
</author>
<id>http://hdl.handle.net/10197/2802</id>
<updated>2011-02-24T10:26:21Z</updated>
<published>2011-02-01T00:00:00Z</published>
<summary type="text">Identifying representative textual sources in blog networks
Wade, Karen; Greene, Derek; Lee, Conrad; Archambault, Daniel; Cunningham, Pádraig
We apply methods from social network analysis and visualization to facilitate a study of the Irish blogosphere from a cultural studies perspective. We focus on solving the practical issues that arise when the goal is to perform textual analysis of the corpus produced by a network of bloggers. Previous studies into blogging networks have noted difficulties arising when trying to identify the extent and boundaries of these networks. As a response to calls for increasingly data-led approaches in media and cultural studies, we discuss a variety of social network analysis methods that can be used to identify which blogs can be seen as members of a posited "Irish blogging network". We identify hub blogs, communities of sites corresponding to different topics, and representative bloggers within these communities. Based on this study, we propose a set of analysis guidelines for researchers who wish to map out blogging networks.
</summary>
<dc:date>2011-02-01T00:00:00Z</dc:date>
</entry>
<entry>
<title>Optimizing conflicting objectives in NMF using Pareto simulated annealing</title>
<link href="http://hdl.handle.net/10197/2733" rel="alternate"/>
<author>
<name>Foley, Kevin</name>
</author>
<author>
<name>Greene, Derek</name>
</author>
<author>
<name>Cunningham, Pádraig</name>
</author>
<id>http://hdl.handle.net/10197/2733</id>
<updated>2012-03-26T09:25:40Z</updated>
<published>2010-08-30T00:00:00Z</published>
<summary type="text">Optimizing conflicting objectives in NMF using Pareto simulated annealing
Foley, Kevin; Greene, Derek; Cunningham, Pádraig
Non-negative matrix factorization (NMF) has emerged as an important technique for simplifying high-dimension data into interpretable factors. NMF has the attractive characteristic that the factor matrices are naturally sparse, thus allowing them to be readily interpreted. However, there is a tension between the accuracy of the factorization and the sparseness – it is the management of the trade-off between these two criteria that is the subject of this paper. We introduce a multi-criteria simulated annealing framework that produces a Pareto set of solutions, which are non-dominated on both criteria. We show that solutions at one end of the Pareto front of solutions correspond to NMF factorizations produced with conventional optimization techniques, while solutions at the other end exhibit enhanced sparseness. Clustering is no longer to be observed either in the raw-data form of the matrix, or the generated heat-map form.
Paper presented at the 21st National Conference on Artificial Intelligence and Cognitive Science (AICS 2010), Galway, Ireland, 30 August - 1 September, 2010
</summary>
<dc:date>2010-08-30T00:00:00Z</dc:date>
</entry>
<entry>
<title>Detecting highly overlapping community structure by greedy clique expansion</title>
<link href="http://hdl.handle.net/10197/2516" rel="alternate"/>
<author>
<name>Lee, Conrad</name>
</author>
<author>
<name>McDaid, Aaron</name>
</author>
<author>
<name>Reid, Fergal</name>
</author>
<author>
<name>Hurley, Neil J.</name>
</author>
<id>http://hdl.handle.net/10197/2516</id>
<updated>2011-03-14T13:00:07Z</updated>
<published>2010-07-25T00:00:00Z</published>
<summary type="text">Detecting highly overlapping community structure by greedy clique expansion
Lee, Conrad; McDaid, Aaron; Reid, Fergal; Hurley, Neil J.
In complex networks it is common for each node to belong to several communities, implying a highly overlapping community structure. Recent advances in benchmarking indicate that existing community assignment algorithms that are capable of detecting overlapping communities perform well only when the extent of community overlap is kept to modest levels. To overcome this limitation, we introduce a new community assignment algorithm called Greedy Clique Expansion (GCE). The algorithm identifies distinct cliques as seeds and expands these seeds by greedily optimizing a local fitness function. We perform extensive benchmarks on synthetic data to demonstrate that GCE's good performance is robust across diverse graph topologies. Significantly, GCE is the only algorithm to perform well on these synthetic graphs, in which every node belongs to multiple communities. Furthermore, when put to the task of identifying functional modules in protein interaction data, and college dorm assignments in Facebook friendship data, we find that GCE performs competitively.
Paper presented at the 4th SNA-KDD Workshop ’10 (SNA-KDD’10), held in conjunction with the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2010), July 25, 2010, Washington, DC, USA
</summary>
<dc:date>2010-07-25T00:00:00Z</dc:date>
</entry>
</feed>
