So, an I2E user has built a great query that detects side effects and adverse events for a drug. It looks like an ideal candidate for repeated use: for example, searching MEDLINE whenever it is updated. The user has also saved this query as a Smart Query, meaning the drug can be changed when the query is run. Changing the settings of a Smart Query in the I2E client is easy: type in a few words, click to add a class or load in a list of alternatives, or use some combination of those options. So how can options like these be transferred to the I2E server as part of the query using the Web Services API?
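To make that concrete, here is a minimal sketch of how a client script might pass such a substitution when asking the server to run a saved Smart Query. The endpoint path and parameter names (`api/queries/run`, `setting.drug`) are invented for illustration only; the real interface is defined by the I2E WSAPI documentation.

```python
from urllib.parse import urlencode

def build_run_request(server, query_path, overrides):
    """Compose a URL that runs a saved Smart Query with substituted settings.

    The path and 'setting.' parameter prefix are hypothetical placeholders,
    not the actual I2E WSAPI interface.
    """
    params = {"query": query_path}
    for name, value in overrides.items():
        params[f"setting.{name}"] = value
    return f"{server}/api/queries/run?{urlencode(params)}"

url = build_run_request("http://i2e.example.com",
                        "queries/adverse_events.i2q",
                        {"drug": "aspirin"})
print(url)
```

The point is only that a substitutable setting becomes one more name/value pair in the request; swapping the drug means changing a single parameter rather than editing the saved query.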
The internet right now, as Tim Berners-Lee points out in Scientific American, is a web of documents; documents that are designed to be read, primarily, by humans. The vision behind the Semantic Web is a web of information, designed to be processed by machines. The vision is being implemented: important parts of the key enabling technologies are already in place.
RDF, the Resource Description Framework, is one such key technology: it is the language for expressing information in the Semantic Web. Every statement in RDF is a simple triple, which you can think of as subject/verb/object, and a set of statements is just a set of triples. Three example triples might be: Armstrong/visited/moon, Armstrong/isa/human and moon/isa/astronomical body. The power of RDF lies partly in the fact that a set of triples is also a graph, and graphs are perfect for machines to traverse and, increasingly, reason over. After all, when you surf the web, you’re just traversing the graph of hyperlinks. And that’s the second powerful feature of RDF: the individual parts, such as Armstrong and moon, are not just strings of letters but web-addressable Uniform Resource Identifiers (URIs). When I publish my little graph about Armstrong, it becomes part of a vast world-wide graph: the Semantic Web. So machines hunting for information about Armstrong can reach my graph and every other graph about Armstrong. This approach allows the web to become a huge distributed knowledge base.
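The triples-as-a-graph idea can be sketched in a few lines of code. This example uses plain Python tuples rather than a real RDF library, and short names rather than full URIs, purely to show how a set of triples doubles as a traversable graph:

```python
# Each statement is a (subject, predicate, object) triple. Real RDF uses
# web-addressable URIs for each part; short names are used for readability.
triples = {
    ("Armstrong", "visited", "moon"),
    ("Armstrong", "isa", "human"),
    ("moon", "isa", "astronomical body"),
}

def facts_about(graph, node):
    """Traverse the graph: every triple where the node is subject or object."""
    return {t for t in graph if node in (t[0], t[2])}

for triple in sorted(facts_about(triples, "Armstrong")):
    print(triple)
```

A machine hunting for information about Armstrong does exactly this kind of traversal, except that with URIs the edges can lead out of your local graph and into every other published graph about Armstrong.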
There’s a new component for I2E: the XML to RDF convertor. It turns standard I2E results into web-standard RDF. Each row in the table of results (each assertion) becomes one or more triples. For example, suppose you run an astronomy query against a news website and it returns the structured fact: Armstrong, visited, the moon, 1969. Let’s also suppose Armstrong and the moon were identified using an ontology derived from Wikipedia. The output RDF will include one URI to identify this structured fact and four more for the fact’s constituents (the first is Armstrong, the second is the relation of visiting, the third is the moon, and the fourth is the year 1969). Then there will be a number of triples relating these constituents: for example, that the subject of the visiting is Armstrong. In addition, all the other information available in the traditional I2E results is presented in additional triples. For example, one triple might state that Armstrong’s preferred term is “Neil Armstrong”, another might state that the source concept is en.wikipedia.org/wiki/Neil_Armstrong, and a third might state that the hit string (the text justifying the concept) is “Neil Alden Armstrong”. The set of possible output triples for the I2E convertor is fully defined by an RDF schema.
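As a sketch of the expansion described above, here is how one result row might become a URI-identified fact plus triples for its constituents. The namespace and predicate names below are invented for illustration; the actual vocabulary is fully defined by the convertor’s RDF schema.

```python
# Hypothetical sketch only: the example.org namespace and predicate names
# are assumptions, not the convertor's real schema.
EX = "http://example.org/i2e/"            # assumed namespace for fact URIs
WIKI = "http://en.wikipedia.org/wiki/"    # ontology derived from Wikipedia

def row_to_triples(row_id, subject, relation, obj, year):
    """Turn one structured fact into a set of triples about one fact URI."""
    fact = f"{EX}fact/{row_id}"           # one URI identifies the assertion
    return [
        (fact, f"{EX}subject",  f"{WIKI}{subject}"),
        (fact, f"{EX}relation", f"{EX}rel/{relation}"),
        (fact, f"{EX}object",   f"{WIKI}{obj}"),
        (fact, f"{EX}year",     str(year)),
    ]

for triple in row_to_triples(1, "Neil_Armstrong", "visited", "Moon", 1969):
    print(triple)
```

Provenance triples (preferred term, source concept, hit string) would hang off the same constituent URIs in just the same way.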
Why is this a good thing? First and foremost, I2E results can now join the Semantic Web. Even if you don’t want to publish your results, you can still exploit the growing list of Semantic Web tools for processing your own data and integrating it with other data that has been published. Second, to quote Tim Berners-Lee again: “The Semantic Web will enable machines to COMPREHEND semantic documents and data, not human speech and writings.” I2E is all about extracting structured information from human writings. So link these two things together and you have a powerful tool for traversing the worlds of structured and unstructured data together.
Find out more about Linguamatics I2E…
The most common use of the I2E Web Services API is likely to be automating query execution to generate results. Queries themselves are always constructed and refined in the I2E client interface; from there they can be saved onto the I2E server, ready for batch processing. When running a query automatically you need to provide, as a minimum, two pieces of information: the location of the index and the location of the query. In this post we won’t worry too much about the index (we’ll assume that the index the user originally used to create their query is still available) and will focus on the query.
As saved by the user, the query contains enough information to specify the search itself (keywords, classes, phrases, etc.) as well as to control the output settings, which include (among other things) the format of the results (HTML, TSV, XML, etc.), the ordering of results, the selection of columns, and highlighting.
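Putting the two posts above together, the minimum information plus a few output settings for an automated run might be assembled like this. Every field name here is an assumption made up for illustration; the real request format is defined by the I2E WSAPI documentation.

```python
import json

# Hypothetical request payload for an automated query run. All field names
# are invented; only the two required pieces of information (index location
# and query location) and the example output settings come from the post.
request = {
    "index": "indices/medline",               # assumed index location
    "query": "queries/adverse_events.i2q",    # assumed saved-query location
    "output": {
        "format": "xml",                      # HTML, TSV, XML, ...
        "order": "relevance",                 # ordering of results
        "columns": ["drug", "adverse_event", "hit"],
        "highlight": True,                    # highlighting on
    },
}
print(json.dumps(request, indent=2))
```

Because the query was saved with sensible output settings by the user, everything under "output" is optional in spirit: an automated run only has to say where the index and the query live.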
When choosing to develop an application that uses I2E, it is important to understand the capabilities of our text mining software as well as the details of the API itself. As the graphic shows, tasks performed on the I2E Server are independent of each other, which allows diverse applications to be created: one example is an app that runs large-scale queries and presents the information in visual form; another is an app that processes documents automatically and publishes the resulting index to the I2E Query GUI.
Today’s post is more about the latter: the basic details of the API itself, the languages supported, and what we provide to get you started.
The release of I2E 4.0 at the end of 2012 included, for the first time, a Web Services API (WSAPI) for our software. The availability of this interface, along with sample code and a sample GUI, made it possible for developers to integrate I2E into their own applications.
We’ve used standard technologies when building our API, but there are many software-specific features that need to be understood before you can choose which capabilities of I2E to include in your applications. For this, we are providing additional training materials and support: sessions at our Text Mining Summit, webinars, traditional phone and email support and, well, this blog.