News content on the internet, available through both traditional news media portals and the blogosphere, constitutes valuable information for professionals and casual internet users alike. News and social media are thus emerging as a dominant source of information for numerous applications, which lie in the media domain but are not limited to it. However, the unstructured nature of this content hinders efficient extraction of information, as target users can be inundated by its sheer volume. Clearly, such information would be much more useful if presented and delivered in a well-structured way. Many attempts, in the form of either research projects or commercial solutions, have been made to provide centralised repositories of such content. To date, however, no integrated system structures content across these two broad sources of news information in parallel while meeting the requirements of a broad range of end users, such as professional journalists, communication experts, and citizen bloggers.
To address this challenge, SYNC3 delivers a user-friendly news analysis tool for searching blogs and traditional media news, allowing users to create, comment on and 'sync' their news in a virtually limitless network. The platform integrates functionality from three areas, namely news clustering, blog processing, and news event labelling and relation extraction, and can be customised to the needs of professional and citizen journalists, as well as policy makers and communication experts.
More specifically, SYNC3 fills this gap by efficiently structuring content from both domains, rendering it accessible, manageable, and re-usable. This is achieved by incorporating innovative algorithms that first model news media content statistically, based on fine-grained clustering of articles into so-called “news events”. These models are then adapted and applied to the blogosphere, so that blog content can be mapped onto the traditional news domain and annotated with sentiment. Furthermore, dedicated algorithms extract news event labels and relations between events, so that news content can be presented efficiently to the system's end users.
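The two-stage idea described here (cluster news articles into events, then map blog posts onto those events) can be illustrated with a minimal sketch. It is not the SYNC3 implementation: the text does not specify the statistical models, so bag-of-words cosine similarity with greedy single-pass clustering is used as a stand-in. The function names, the stopword list, and both thresholds are hypothetical, and sentiment attachment and event labelling are omitted.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"a", "the", "on", "by", "after", "of", "in", "and", "to", "for"}


def vectorize(text):
    """Turn a text into a bag-of-words term-count vector."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_articles(articles, threshold=0.3):
    """Greedily group articles into events (assumed stand-in for the
    statistical news-event models mentioned in the text)."""
    events = []  # each event: {"centroid": Counter, "articles": [indices]}
    for index, text in enumerate(articles):
        vec = vectorize(text)
        best, best_sim = None, threshold
        for event in events:
            sim = cosine(vec, event["centroid"])
            if sim >= best_sim:
                best, best_sim = event, sim
        if best is None:
            # No sufficiently similar event yet: start a new one.
            events.append({"centroid": Counter(vec), "articles": [index]})
        else:
            # Fold the article's term counts into the event centroid.
            best["centroid"].update(vec)
            best["articles"].append(index)
    return events


def map_blog_post(post, events, threshold=0.2):
    """Return the index of the most similar event, or None if no event
    reaches the (hypothetical) similarity threshold."""
    vec = vectorize(post)
    sims = [cosine(vec, event["centroid"]) for event in events]
    if not sims:
        return None
    best = max(range(len(sims)), key=sims.__getitem__)
    return best if sims[best] >= threshold else None
```

In this sketch, the event centroids learned from news articles are reused unchanged to place blog posts, which mirrors the idea of applying the news-domain structure to the blogosphere. A real system would likely need term weighting, time windows, and a more robust clustering method.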
SYNC3 applies the news domain structure derived from well-organised news portals to the unstructured domain of the blogosphere. To achieve this, novel research approaches are proposed that advance the state of the art and yield a number of software modules, which have been integrated into a common platform operating as a news aggregation tool. This tool organises content coming from both news portals and blogs. It also allows the creation of further user-generated content, either by authoring new material or by re-organising the links structured by SYNC3 into user-generated storylines.