Computers learn a new language

This story has implications for speech synthesis and language analysis. It would be great to see this applied to Latin or Greek via a corpus like the TLG.

(From New Scientist)

• 06 August 2005

COMPUTER scientists have developed a program that can teach itself new languages. Feed it a piece of text, in any language, and the program analyses its structure and can then produce new, meaningful sentences.

Conventional translation software has the rules of grammar explicitly coded into it. But the ADIOS (automatic distillation of structure) program, developed by researchers at Cornell University in New York and Tel Aviv University in Israel, infers the building blocks of a language using statistical and algebraic processes. The software learns the grammar of a new language by searching text for patterns. The researchers think the program will be useful in cognitive science and bioinformatics, as well as in applications such as voice recognition.
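The story does not describe ADIOS's actual algorithm, which is a much more sophisticated graph-based pattern-extraction method. Purely as a hypothetical sketch of the underlying statistical idea — word sequences that recur in raw text hint at a language's building blocks — one could start by counting repeated bigrams:

```javascript
// Toy sketch of distributional structure-finding: surface word pairs
// that recur often enough to look like fixed units of the language.
// (ADIOS itself uses a far more elaborate graph-based algorithm;
// this only illustrates the statistical flavour of the approach.)
function frequentBigrams(text, minCount) {
  const words = text.toLowerCase().match(/[a-z]+/g) || [];
  const counts = new Map();
  for (let i = 0; i < words.length - 1; i++) {
    const pair = words[i] + " " + words[i + 1];
    counts.set(pair, (counts.get(pair) || 0) + 1);
  }
  // Keep only pairs seen at least minCount times, most frequent first.
  return [...counts.entries()]
    .filter(([, n]) => n >= minCount)
    .sort((a, b) => b[1] - a[1])
    .map(([pair]) => pair);
}

// "the cat" and "the mat" each recur, so they surface as candidates:
// frequentBigrams("the cat sat on the mat and the cat ran to the mat", 2)
```

Feeding in larger corpora and longer n-grams — and, as ADIOS does, generalizing over interchangeable slots — moves a toy like this in the direction of genuine grammar induction.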

From issue 2511 of New Scientist magazine, 06 August 2005, page 23

Posted in Tools | Comments Off on Computers learn a new language

Postdoc in Humanities Computing

Call for Applicants: Post-Doctoral Researcher in Humanities Computing (Victoria, BC)

The University of Victoria’s Humanities Computing and Media Centre is looking for a suitably-qualified Post-Doctoral Researcher to join its work as part of the Text Analysis Portal for Research (TAPoR) Project for the 2005/6 academic year.

Candidates interested in this position will bring established academic research questions in an area or areas of Humanities Computing to the position, will have demonstrated capability in implementing solutions to those questions using the technologies supported by TAPoR at UVic, and will be prepared to work in a cooperative, collaborative environment toward achieving goals common to the UVic TAPoR group. This position may also involve teaching and participating in curriculum development.

Examples of technologies supported by TAPoR at UVic are: XML, XSLT, and XSL encoding languages; TEI P4 and P5; XQuery; and eXist XML databases. In addition, UVic TAPoR project members frequently work with XHTML, JavaScript and CSS, and web-based SQL database projects using PostgreSQL and MySQL.

Salary for this position is competitive, and will be commensurate with experience.

Applications, including a brief cover letter, CV, and the names and contact information of three referees, may be sent electronically to
Ray Siemens
Canada Research Chair in Humanities Computing
UVic TAPoR Principal Investigator
siemens[at]uvic.ca

Applications will be received and reviewed until the position is filled.

(Seen on Humanist)

Posted in Jobs | Comments Off on Postdoc in Humanities Computing

Open Access Translation (The OAT Bible)

Thanks to Peter Suber for calling attention to Peter Kirby, Open Access Translation (The OAT Bible), Christian Origins, August 7, 2005. Excerpt:

TLAs (Three Letter Acronyms) are popular for Bible translations, so I’ve come up with one: the “Open Access Translation” (OAT) Bible. It would be the first Bible to be translated under a Creative Commons license. The question is which license. Would we want the translator to be able to add this to her CV, in which case we would have to go with an Attribution-No Derivatives license, or would we want people to be able to modify the Bible for their own purposes? For the Open Scrolls Project, J. Davila suggested that I go with the Attribution-No Derivatives license, and I agreed. This way, all the changes to be made to the Bible could be suggested on a single website, where they could be reviewed by the general editor(s) and the editor(s) for the particular biblical book. The main contributors to each book’s translation would get credit and could know that their work would not be mangled. Nonetheless, the translation could be freely copied and printed at no charge if kept intact. To make such a translation, three things are necessary, or at least desirable: volunteer translators, open access translation software, and some funding (to pay the general editor? to pay a modicum to all active translators? to promote the project and the result? to legitimate the effort?). Active volunteer translators, and even more so competent ones and excellent editors for quality control, will be the hardest to come by. Funding, therefore, could be a way to solve that problem. But who would do the funding? The easiest part would be the open access translation software, because I would be happy to write it.

Posted in General | Comments Off on Open Access Translation (The OAT Bible)

Searching unstructured data

from InformationWeek:

IBM plans to make its enterprise search middleware, designed to facilitate searches of unstructured data, available as open-source code. It’s called Unstructured Information Management Architecture, and IBM says more than 15 knowledge-management companies intend to support it as a standard framework.

The company also is wedding the latest iteration of its WebSphere Information Integrator OmniFind Edition with UIMA. Its goal is to make enterprise search results more relevant and make it easier to apply third-party analytics software.

UIMA is a software infrastructure layer that supports the search and analysis of disorganized data. Unstructured data–E-mail messages, Word documents, and the like–isn’t easily classified, sorted, or searched.

The open-source move could let software partners add value to unstructured content with analytics applications, says Dana Gardner, principal analyst with consulting firm Interarbor Solutions.
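UIMA itself is a Java framework, and its real API is not shown in the article. As a rough JavaScript sketch of the pattern it standardizes — a pipeline of independent annotators, each adding typed spans of metadata over the same unstructured text — consider this toy example (the annotator names and span format are invented for illustration, not the UIMA interfaces):

```javascript
// Each "annotator" scans raw text and emits typed spans
// ({ type, start, end }) marking structure it recognizes.
const annotators = [
  function emailAnnotator(text) {
    const spans = [];
    const re = /[\w.]+@[\w.]+\.\w+/g;
    let m;
    while ((m = re.exec(text)) !== null) {
      spans.push({ type: "Email", start: m.index, end: m.index + m[0].length });
    }
    return spans;
  },
  function dateAnnotator(text) {
    const spans = [];
    const re = /\b\d{4}-\d{2}-\d{2}\b/g;
    let m;
    while ((m = re.exec(text)) !== null) {
      spans.push({ type: "Date", start: m.index, end: m.index + m[0].length });
    }
    return spans;
  },
];

// Run every annotator over the document and merge their annotations.
function analyze(text) {
  return annotators.flatMap((a) => a(text));
}

// A downstream search engine can then index the typed spans rather
// than the raw text:
const spans = analyze("Mail alice@example.com about the 2005-08-06 release.");
// spans holds one { type: "Email", ... } and one { type: "Date", ... }
```

The point of a standard framework like UIMA is that third parties can plug in new annotators (entity extractors, classifiers, linguistic analyzers) without touching the pipeline or the search engine that consumes the annotations.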

Posted in General | Comments Off on Searching unstructured data

Confusion at Princeton

With its DRM’d textbook program, Princeton University demonstrates how not to use educational technology.

Update: Plan now being duly savaged on Slashdot.

Posted in General | Comments Off on Confusion at Princeton

Zapping and mapping

Cornell researchers developed a process called X-ray fluorescence imaging to recover faded text on stone by “zapping and mapping” the inscriptions.

The group built a machine that generates X-rays a million times more intense than what the doctor uses to image your bones. An X-ray beam is fired at a stone, scanning back and forth. Atoms on the stone’s surface emit lower-energy fluorescent X-rays, and different wavelength emissions reveal zinc, iron and other elements in the stone.

Historians know that iron chisels were commonly used to inscribe stone, and the letters were usually painted with pigments containing metal oxides and sulfides. So where letters and numbers are no longer visible to the eye, the newfound minerals trace their shapes.

Posted in General | Comments Off on Zapping and mapping

ETANA

The Scout Report on ETANA (Electronic Tools and Ancient Near Eastern Archives):

A number of interesting digital projects have recently been sponsored by the National Science Foundation, and the Electronic Tools and Ancient Near Eastern Archives (ETANA) is one such project. With the support and primary documents of a number of important institutions, such as the Society of Biblical Literature and Case Western Reserve University, the mission of ETANA is to “develop and maintain a comprehensive Internet site for the student of the ancient Near East.” While the project is still in development, the site’s creators have added numerous helpful resources so far to the archive, including the ETANA Core Texts. In this section, visitors can view digitized texts related to scholarship on the ancient Near East, such as James Breasted’s monumental work, “Ancient Records of Egypt”, along with 171 other key documents. Visitors will also want to take a look at ABZU, which is another database collection that contains items relevant to the study of the ancient Near East that are available online. [KMG]

Technical details here.

Posted in General | Comments Off on ETANA

GRBS Articles Online

Humbul have just logged the journal Greek, Roman, and Byzantine Studies online articles section: http://www.duke.edu/web/classics/grbs/online.html

This contains all sixteen GRBS articles from vol 44 (2004) plus one earlier article (David Jordan’s ‘New Curse Tablets’ from 2000). Articles are in PDF and are freely downloadable. It is not stated whether all articles will be added in this way in the future, at what intervals, or whether back issues will also appear, but it looks good so far.

Posted in Publications | Comments Off on GRBS Articles Online

Defining Ajax

From Jesse James Garrett, “Ajax: A New Approach to Web Applications” (February 18, 2005):

Ajax isn’t a technology. It’s really several technologies, each flourishing in its own right, coming together in powerful new ways. Ajax incorporates:

Continue reading

Posted in General | Comments Off on Defining Ajax

Ajax

Wired on AJAX (“The name is shorthand for Asynchronous JavaScript + XML, and it represents a fundamental shift in what’s possible on the Web.”):

Software experts say recent innovations in web design are ushering in a new era for internet-based software applications, some of the best of which already rival desktop applications in power and efficiency. That’s giving software developers a wide open platform for creating new programs that have no relation to the underlying operating system that runs a PC.

Evidence of this evolution has been popping up everywhere in recent months, with examples that include Google’s online map rendering software and its Gmail service, Amazon’s A9 search engine and NetFlix’s DVD rental platform. All highlight a dramatic rethinking of web applications, using a programming technique dubbed AJAX (for asynchronous JavaScript and XML) that significantly improves how web pages interact with data, for the first time rivaling programs that run natively on the desktop.

“For a user it is fundamentally different — it feels like a real application,” said Rael Dornfest, chief technology officer for O’Reilly Media.

AJAX overcomes a severe limitation of traditional web interfaces, which must reload an entire page anytime they try to call up new data. By contrast, AJAX lets users manipulate data without clicking through to a new page, Dornfest said. That’s putting an end to page refreshes and other interruptions that have handicapped web-based applications until now.

Web developers are creating AJAX code libraries and conventions to ease the burden of making applications that speak several computer languages… “This is going to go a long way towards eliminating the user interface insults and injuries we have suffered since we moved to the web,” O’Reilly’s Dornfest said. “Now people these days expect it to be flat so they might be a little surprised (by AJAX applications). But the rest of us see AJAX and say ‘Ahh, this is what it is supposed to be like.’”
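The mechanics behind what Dornfest describes rest on the browser's XMLHttpRequest object plus a callback that rewrites part of the page. A minimal sketch of the pattern as it looked in 2005 (the URL `/status.txt` and element id `"status"` are invented placeholders):

```javascript
// Format the fetched data for display; kept separate from the
// transport so the page-update logic is easy to reuse and test.
function formatStatus(responseText) {
  return "Latest: " + responseText.trim();
}

// Fetch new data asynchronously and patch it into the page in place:
// no form submit, no full page reload.
function refresh() {
  const xhr = new XMLHttpRequest();
  xhr.open("GET", "/status.txt", true); // true = asynchronous
  xhr.onreadystatechange = function () {
    // readyState 4 = request complete; status 200 = HTTP OK.
    if (xhr.readyState === 4 && xhr.status === 200) {
      document.getElementById("status").textContent =
        formatStatus(xhr.responseText);
    }
  };
  xhr.send(null);
}
```

Modern code would reach for fetch() and promises, but the structure — an asynchronous request plus an in-place DOM update — is exactly the "no page refresh" behaviour the article highlights.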

Posted in General | Comments Off on Ajax

Just do it

Marc Goodacre makes a useful point about funding for open-access biblical studies:

If the essential proposal is “How can we get a big project financed?”, then there is still a large part of me that just sighs. I have felt for some time that the key to the development of exciting on-line projects in our area is the voluntary efforts of people like us. The funding comes, if you like, from two places: (1) the educational institutions that employ us and which are committed to the dissemination of our scholarship not only within their walls but also outside of them, so that our salaries here are the funding, and the time we allocate is our decision about commitment to such an important goal; (2) the self-funding provided by the gifted and enthusiastic amateurs who make such a major contribution in this area by devoting their own time.

Posted in General | Comments Off on Just do it

GALE: the “Holy Grail” of human language technology

… experts describe the project as the most ambitious undertaking in the history of human language technology.

If it is developed as planned, the first-of-its-kind machine will be able to recognize speech in multiple languages, translate it into English, and then mine the resulting transcripts to sift so-called intelligence from dross, said sources close to the project.

The ultimate goal of the endeavor, dubbed GALE for Global Autonomous Language Exploitation, is to turn the staggeringly large volumes of recorded foreign language broadcasts, phone conversations, and Internet traffic into something national security analysts, spooks, and soldiers can actually use.

It is said that the National Security Agency gathers enough information every hour to fill the Library of Congress. Most of it is never translated, and never reaches the desk of an analyst.

The dearth of solid human language technology in the hands of the government is a “huge problem,” said Gilman Louie, the president and CEO of In-Q-Tel, the Central Intelligence Agency’s venture capital arm…

DARPA’s ambitions for GALE represent the “Holy Grail” of human language technology, said John Makhoul, a scientist at BBN working on the GALE project. The best speech recognition software operating in a controlled environment like a television broadcast can usually get nine out of 10 words correct.

DARPA wants 95 percent accuracy rates for both speech recognition and translation. And it demands that the engines be able to process radio, TV, talk shows, newswires, newsgroups, blogs, and phone conversations in English, Arabic, and Chinese.

The Pentagon is seeking the same high accuracy rate for the translation part of the system. Experts say that translation software now performs with around 80 percent accuracy.
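The percentages quoted here are conventionally computed as word accuracy, i.e. one minus the word error rate (WER): the word-level edit distance between system output and a human reference transcript, divided by the reference length. The article doesn't spell out the metric, so this is a sketch of the standard formula rather than GALE's own evaluation code:

```javascript
// Word accuracy = 1 - WER, where WER is the Levenshtein distance
// (substitutions + insertions + deletions) between hypothesis and
// reference word sequences, divided by the reference length.
function wordAccuracy(reference, hypothesis) {
  const ref = reference.toLowerCase().split(/\s+/).filter(Boolean);
  const hyp = hypothesis.toLowerCase().split(/\s+/).filter(Boolean);
  // d[i][j] = edit distance between first i ref words, first j hyp words.
  const d = Array.from({ length: ref.length + 1 }, (_, i) =>
    Array.from({ length: hyp.length + 1 }, (_, j) =>
      i === 0 ? j : j === 0 ? i : 0
    )
  );
  for (let i = 1; i <= ref.length; i++) {
    for (let j = 1; j <= hyp.length; j++) {
      const subst = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      d[i][j] = Math.min(
        d[i - 1][j] + 1,        // deletion
        d[i][j - 1] + 1,        // insertion
        d[i - 1][j - 1] + subst // substitution or match
      );
    }
  }
  return 1 - d[ref.length][hyp.length] / ref.length;
}

// "Nine out of 10 words correct": one substitution in ten reference words.
const exampleAcc = wordAccuracy(
  "the quick brown fox jumps over the lazy dog today",
  "the quick brown fox jumps over a lazy dog today"
); // 0.9
```

Seen this way, moving from 90 to 95 percent accuracy means halving the error rate, across noisy broadcast and conversational speech in three languages, which is what makes DARPA's target so ambitious.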

Posted in General | Comments Off on GALE: the “Holy Grail” of human language technology

ePublishing Job at UPenn

Position Available

Schoenberg Center for Electronic Text & Image
University of Pennsylvania Library

The University of Pennsylvania Library is presently seeking a bright, creative individual with a solid background in humanities computing to guide and manage its electronic publishing unit, the Schoenberg Center for Electronic Text & Image (SCETI).

SCETI (http://dewey.library.upenn.edu/sceti) was created in 1996 to produce virtual facsimiles of rare and special materials from Penn’s collections. Over the years, SCETI has evolved into a fully integrated electronic library containing digital versions of printed books, manuscripts, correspondence, images, and, most recently, recorded sound.
SCETI’s projects have ranged from Shakespeare, medieval Judaica, and the traditions of alchemy to ancient papyri, the works of Theodore Dreiser, and the spoken word. In 1998 SCETI received an NEH Challenge Grant, which was successfully met. SCETI collaborates on the Penn campus with faculty and academic units and internationally with institutions across North America and beyond. The SCETI web site is a destination stop for humanities scholars from around the world.

Potential candidates are invited to submit a letter of application which addresses the needs and qualifications of the position, along with their resume and the names, addresses and phone numbers of three references who can address the suitability of the candidate for the position described, to: Robert Eash, Human Resources Officer, University of Pennsylvania Library, 3420 Walnut Street, Philadelphia, PA 19104-6206 or email to: reash@pobox.upenn.edu

(seen on Humanist)

Posted in Jobs | Comments Off on ePublishing Job at UPenn

Free the Encyclopedia!

Interesting comments on measures of success for Wikipedia by Jimbo Wales, guest-blogging at Lawrence Lessig’s site:

So, how are we doing? What are the odds of this goal being accomplished in the next 20 years?

First, it can be argued that although much work remains to be done in many areas, if you speak English, German, French, or Japanese, and have broadband Internet access, you have your encyclopedia. Each of those four languages has more than 100,000 articles and provides a reasonably comprehensive resource. Several other languages will pass the 100,000 threshold soon enough, and in five years’ time, all of these and many more will be larger than 250,000 articles.

Second, clearly there is a lot of work to be done in finding ways to actually distribute the work we have done already into areas where people can use it. Many people would be able to make positive use of English, French, or Spanish Wikipedia (for example) if only they had access to it.

Third, while it is important to provide our work in important global or “colonial” languages, we also think it is extremely important to provide our work in languages that people speak natively, at home. (Swahili, Hindi, etc.)

I will define a reasonable degree of success as follows, while recognizing that it does leave out a handful of people around the world who only speak rare languages: this problem will be solved when Wikipedia versions with at least 250,000 articles exist in every language that has at least 1,000,000 speakers, and significant efforts exist for even very small languages. There are many local languages which are spoken by people who also speak a more common international language — both facts are relevant.

I predict this will be completed in 15 years. With a 250,000-article cutoff, English and German are both past the threshold. Japanese and French will be there in a year. Several other languages will be there in two years.

The encyclopedia will be free.

Posted in General | Leave a comment

More on games

More on the potential of games in education, from James L. Morrison, Editor-in-Chief, Innovate:

The August/September 2005 special issue of Innovate, on the role of video game technology in educational settings, is now available at
http://www.innovateonline.info

Innovate is a peer-reviewed, bimonthly e-journal published as a public service by the Fischler School of Education and Human Services at Nova Southeastern University. It features creative practices and cutting-edge research on the use of information technology to enhance education.

Jim Gee opens the issue with a key question: “What would a state of the art instructional video game look like?” Gee’s response focuses on the commercial game Full Spectrum Warrior in order to reveal the “good theory of learning” that should inform the design of video games produced specifically for instructional purposes. In turn, David Shaffer elaborates a similar theory of situated and action-based learning with the concept of an “epistemic game,” whose design integrates player interests, domain knowledge, valued professional practices, and assessment to generate motivation and deep learning. In the following article, Richard Halverson reinforces the argument that valid learning principles inform successful video games, and describes how they might be integrated in educational contexts.

Melanie Zibit and David Gibson report on work in progress on simSchool, a video game that prepares teachers for the complexities of classroom management by offering a “simulated apprenticeship” in the kind of informed decision making required for success in their profession.

Kurt Squire’s findings about the benefits of and obstacles to the implementation of video games in the classroom are based on his own attempt to use Civilization III in high school history classes. He argues that, rather than thinking about how to design good games for the existing K-12 educational system, we should focus our energies on how to design an educational system flexible enough to accommodate video games. In contrast, Michael Begg, David Dewhurst, and Hamish Macleod advocate a “game-informed learning” approach that would make conventional learning activities more game-like. The two medical simulations they describe immerse students in a professional identity and generate highly motivated constructivist learning.

In a provocative glimpse into the future learning landscape, Joel Foreman, this issue’s guest editor, interviews Clark Aldrich, described by Fortune magazine as one of the top three e-learning gurus. The interview begins with the distinction between games and simulations and concludes with Aldrich’s “20 simulations” approach to the reformation of education.

Stephen Downes wraps up the issue with his review of Apolyton, an exemplar site that provides both fodder for resourceful students and models for educators who want to cultivate new online learning communities.

We hope that you enjoy this special issue of Innovate. Please use Innovate’s one-button features to comment on articles, share material with colleagues and friends, easily obtain related articles, and participate in Innovate-Live webcasts and discussion forums. Join us in exploring the best uses of technology to improve the ways we think, learn, and live.

Please forward this announcement to appropriate mailing lists and to colleagues who want to use IT tools to advance their work.

Finally, if you wish to continue to get announcements of new issues, please subscribe to Innovate at www.innovateonline.info. Subscription is free.

Many thanks.

Jim —

Posted in General | Comments Off on More on games