
CC Proposal 2

Public view: Ontology Add-in for OpenOffice.org

Title: Ontology Add-in for OpenOffice.org
Student: Nimalaprakasan Skandhakumar
The Ontology Add-in for OpenOffice.org will enable the annotation of OOo documents based on terms that appear in ontologies. It will be similar to the plugin Microsoft created for Microsoft Word 2007, which helps authors insert links to ontologies curated by Science Commons.
Project Abstract (1 paragraph, high level overview):

The Ontology Add-in for OpenOffice.org will enable authors to easily add scientific hyperlinks as semantic annotations, drawn from ontologies, to their documents and research papers. Ontologies are shared vocabularies created and maintained by different academic domains to model their fields of study. The goal of the add-in is to assist scientists in writing a manuscript that is easily integrated with existing and pending electronic resources. The major aims of this project are to add semantic information as XML mark-up to the manuscript using ontologies and controlled vocabularies (from the National Center for Biomedical Ontology) and identifiers from major biological databases, and to integrate manuscript content with existing public data repositories. The add-in will make it easier for scientists to link their documents to the Web in a meaningful way. Deployed on a wide scale, ontology-enabled scientific publishing will provide a Web boost to scientific discovery.
Project Details (technical details if available, work plan):

This project is focused on researchers and software developers in domains utilizing ontologies, as well as publishers, archivists, and early adopters in the scientific, technical, and scholarly publishing fields. The add-in would simplify the development and validation of ontologies by making them more accessible to a wide audience of authors and by integrating semantic content into the authoring experience, capturing the author’s intent and knowledge at the source and facilitating downstream discoverability.

As part of the publishing workflow and archiving process, the terms added by the add-in, which provide the semantic information, can be extracted from Word files, as they are stored as custom XML tags as part of the content. The semantic knowledge can then be preserved as the document is converted to other formats, such as HTML or the XML format from the National Library of Medicine, which is commonly used for archiving.
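To illustrate the extraction step, here is a minimal Python sketch, not the plugin's actual code. It assumes the annotations appear in the document body as WordprocessingML `w:customXml` elements carrying `w:element` and `w:uri` attributes; the exact markup the Word 2007 plugin writes may differ. The function names, the `term` element name, and the `http://example.org/ontology` URI are hypothetical, and the sample document is built in memory instead of being read from a real manuscript.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by Word 2007 (.docx) documents.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_annotations(docx_bytes):
    """Return (element name, schema URI, annotated text) for each custom XML tag.

    Assumption: annotations are stored as w:customXml elements in
    word/document.xml; the real plugin's markup may differ.
    """
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    annotations = []
    for tag in root.iter(f"{{{W_NS}}}customXml"):
        name = tag.get(f"{{{W_NS}}}element")
        uri = tag.get(f"{{{W_NS}}}uri")
        # Concatenate the text runs wrapped by this custom XML tag.
        text = "".join(t.text or "" for t in tag.iter(f"{{{W_NS}}}t"))
        annotations.append((name, uri, text))
    return annotations


def build_sample_docx():
    """Build a tiny in-memory archive holding one hypothetical annotation."""
    document_xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p>'
        "<w:r><w:t>We studied </w:t></w:r>"
        # Hypothetical ontology URI and element name, for illustration only.
        '<w:customXml w:uri="http://example.org/ontology" w:element="term">'
        "<w:r><w:t>apoptosis</w:t></w:r>"
        "</w:customXml>"
        "</w:p></w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", document_xml)
    return buffer.getvalue()


if __name__ == "__main__":
    for name, uri, text in extract_annotations(build_sample_docx()):
        print(name, uri, text)
```

A converter to HTML or another archival format could use the extracted triples to re-emit each term with its ontology reference attached, so the annotation survives the format change.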

The full benefit of semantic-rich content will result from an end-to-end approach to the preservation of semantics and metadata through the publishing pipeline, starting with capturing knowledge from the subject experts, the authors, and enabling this knowledge to be preserved when published, as well as made available to search engines and presented to people consuming the content.

As for the work plan, I will mainly focus on porting the openly available Microsoft Word 2007 plugin, so we will not have to start from scratch. This will involve not just porting the code but also customizing it for OOo and developing with the OOo 3 SDK.

What risks to completion exist and how do you plan to mitigate them?

Porting from MS Office 2007 to OOo Writer may raise issues, but I expect to be able to handle them, as the plugin relies mostly on XML standards and its code is openly available.

Another risk is getting access to and using Ontolog. I will be able to get help with this from the staff and other researchers at our university.

Have you posted this on cc-devel (mailing list) or #cc (IRC) for feedback?

Yes, I have discussed this with developers on cc-devel and received some feedback from them.

Why am I suitable for doing this?

  • Open Source Development Experience
    • – I have been a contributing developer on the mailing lists and directly involved in the localization project for the Tamil language for the past 3 years.
    • Firefox – I have been a Firefox add-on developer and worked on many localization-related add-ons such as TamilKey, SinhalaKey and EnTaTip.
    • WordPress – I have been closely involved with WordPress development in the past year, especially providing support and help in the IRC channel and mailing lists.
  • Work/Internship Experience
    • I have 8 months of internship experience, starting in October 2007, at WaveNET International (Pvt) Ltd., Colombo, Sri Lanka. There I gained experience in the project development lifecycle and project management, and worked on a couple of projects related to mobile content platforms.
  • Academic Experience
    • Pursuing a Computer Science & Engineering Degree at University of Moratuwa, Sri Lanka
    • Completed a Java-based individual project, TalkOut RJ Auto, a radio automation program, now available under GPL2.

Contact Information

IRC nick on Freenode: talkout
