XML & TEI
Published by Jonathan September 29th, 2005 in Librarianship, Technology
beware, techno-babble ahead
I spent two days this past week in a workshop at the University of Maryland learning about XML and TEI. To explain XML in a nutshell and do it justice… well, I’m not sure if I can. Like HTML, it is a markup language, and it is a subset of SGML. However, it is more powerful and flexible, and it was created to encode data in a way that can be shared across many systems. The successor to HTML is XHTML, which uses the stricter syntax of XML. You may be using XML and not even realise it: your blog creates an XML page to be used as an RSS feed. But I won’t bore you with technical details. There is a good basic summary for newcomers and the curious over at the W3C, XML in 10 Points.
TEI refers to the Text Encoding Initiative. It is a DTD (soon to be a Schema) in XML for encoding text documents. Right now it is mostly being used in the humanities, social sciences and linguistics as a way to encode detailed information about a document's contents directly into the digital document itself. This could be as simple as headers, similar to metadata, that provide specific information (keywords, subjects, place names, etc.), or it could mean tagging specific words throughout the text (name authority, stanza, dates, etc.). The obvious advantage of this is in searching. With a group of documents encoded in TEI, a researcher could conduct very specific searches, as opposed to the author/title/subject search of library catalogs or the simple keyword searching of Google. With the DTD as a separate file, the text only needs to be tagged, and you don’t have to worry about the DTD becoming obsolete, as it can be modified when necessary. This is only the beginning of what is possible with TEI.
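To make the searching idea a bit more concrete, here is a minimal sketch in Python of what looking up tagged words might look like. The sample text is made up for illustration, and it is only a simplified TEI-style fragment (no header, no namespace, not a complete or valid TEI document). The helper name `find_tagged` is my own invention, not part of any TEI tool. The element names `lg`, `l`, `placeName` and `persName` do exist in TEI.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a simplified TEI-style fragment, not a full TEI document.
SAMPLE = """<text>
  <body>
    <lg type="stanza">
      <l>I left <placeName>Dublin</placeName> in the early spring,</l>
      <l>and <persName>Mary Byrne</persName> stayed on in <placeName>Wicklow</placeName>.</l>
    </lg>
  </body>
</text>"""


def find_tagged(xml_source, tag):
    """Return the text of every element named `tag`, in document order."""
    root = ET.fromstring(xml_source)
    # itertext() gathers the text inside the element, including nested tags.
    return ["".join(el.itertext()).strip() for el in root.iter(tag)]


places = find_tagged(SAMPLE, "placeName")
people = find_tagged(SAMPLE, "persName")
print("Places:", places)
print("People:", people)
```

A plain keyword search could find the word "Dublin", but it could not tell you that it was tagged as a place rather than, say, a surname. That distinction is what the encoding adds.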
Dr. Susan Schreibman taught the workshop, and she provided us with a fantastic reading list beforehand that assumes no familiarity with XML or TEI. You can get that here:
Advance Reading for Introduction to XML, TEI and XSLT
In the class we used a fantastic open-source text editor called jEdit with plugins for TEI. I use Dreamweaver MX at work and TSW Webcoder at home and jEdit was very impressive. I’ll have to give it a closer look.
Over the next couple of days I’m going to try to post some of my thoughts on the Digital Libraries Symposium I attended today, following the workshop. Some very cool things came up.


Looks like I really need to do my homework now! And the struggle to keep up in the techno world continues
(of course I’m just saying that because of my “adventures in technology” today! In reality, change is good, improvement is better, and life goes on)