IBM SourceForges UIMA

Source code for IBM’s Unstructured Information Management Architecture (UIMA) now has a home on SourceForge, as IBM invites developers everywhere to take a shot at the concept of knowledge discovery.

IBM announced the source code release on SourceForge, the biggest open source development site, in a statement today. Developers can find the UIMA framework on SourceForge; the IBM UIMA SDK, which includes additional facilities and components, can be downloaded for free from IBM’s website.

The move demonstrated IBM’s confidence in the open source movement, and in its UIMA technology. In December 2005, IBM’s director for strategy & business development for content discovery Marc Andrews discussed UIMA within the context of search, and noted how UIMA is “information aware” rather than just data or application aware.

Unstructured data is central to IBM’s approach. Whether information exists in an email, a text file, or a piece of rich media, UIMA’s framework enables the construction of applications that retrieve the information from its container. Later in 2006, IBM plans to make the UIMA project a “full open source community development model.”

UIMA first started attracting notice in February 2005, and was more formally disclosed at SES San Jose in August of that year. At that time, IBM promised to unveil UIMA on SourceForge by the end of the year.

Several firms, including Factiva and Cognos, have created UIMA-compliant solutions, IBM noted in its statement. Ongoing UIMA development in the medical field at places like the Mayo Clinic and Memorial Sloan-Kettering holds promise, too. Mayo wants to extract information from about 20 million clinical notes, while Sloan-Kettering has an even more ambitious plan:

Memorial Sloan-Kettering Cancer Center is working with IBM to develop a Web accessible data warehouse that will conform to HIPAA requirements. This data warehouse will enable clinicians and researchers from Memorial Sloan-Kettering Cancer Center to efficiently use data facilitating research on a new cancer taxonomy. An important aspect of the data warehouse is the inclusion of searchable concepts from Memorial Sloan-Kettering Cancer Center’s text-based pathology reports. These concepts are automatically extracted by an IBM text analytics solution built on the UIMA framework.


David Utter is a staff writer for Murdok covering technology and business.
