In the wake of the Bible Software shootout at this past SBL, I’ve been thinking about a new idea for next year. Last week I proposed what follows to SBL’s CARG (Computer Assisted Research Group) committee to see if they’ll approve the idea. I’ll let you know if they respond, but I won’t post any email correspondence. Here’s what I sent them:

————

Dear CARG Committee,

I’m writing to both extend a note of thanks for this past SBL’s software shootout idea and propose something for next year. This year’s shootout challenged us to think critically about how we deliver search results in the midst of our 4.0 development. That turned out to be very helpful.

As I’ve been thinking about how this external input served as a catalyst for changes in our software, an idea occurred to me.  One of the biggest challenges we face is how to best implement and design our syntactical databases.  Since these databases have not existed until the last few years, it’s quite a challenge to know what scholars would want out of them.  For example, what sorts of searches and search strategies do scholars really want from a “more than morphology” above-the-word-level database?  It’s hard to know.  Periodically I get emails from scholars asking me to search for something for them. We need more of that sort of thing.  In short, it would be great to have scholars tell us what they’d do with these things.

To that end, here’s my idea. I think it would be very worthwhile at next year’s SBL meeting to have a “Syntax Software Shootout” or (better) a “Syntax Software Think Tank” session.  Let me briefly parse the suggestion.

At least part of the session could focus on queries that CARG could solicit from scholars who care about syntax or linguistically-informed language research and (this is important) who understand how to formulate queries that would be impossible or totally impractical with a morphological database. Presenters could make a list of such people, and CARG could draw from that list and ask for queries. For example, what are the top 5 syntactical features or constructions Bruce Waltke / Dan Wallace would want to search for in the text?

At present we’re the only software company that has these sorts of databases (I include SESB since we are their developer and SESB is in our platform). We know that others are working on them, though, which is good for the academic community. However, we don’t want any appearance of unfairness or an infomercial. We would be willing to restrict queries to whatever corpus portion (e.g., the Torah, the Gospels) a competitor had ready for research. Coverage really isn’t the issue, and neither is having people watch companies compete for business. The point would be helping people understand what syntax databases can and cannot do, and then hearing what people would want them to do. We need the scholarly community for that sort of discussion; we can’t solve that in the office.

Another way to get meaningful queries is to give conference attendees the opportunity to submit them. In case the people chosen to supply queries can’t come up with good or interesting ones, I’d suggest that CARG (through SBL) promote the session ahead of time and ask attendees to send queries to us and the other presenters by a set deadline.

Presenters could walk session attendees through the queries, explaining their significance and strategy, and why morphological databases are not viable tools for answering the query or getting at the pertinent information. We could also try the queries by different methods, depending on the database. This exercise would expose both the strengths and limitations of these kinds of databases and stimulate some interaction on how things could or should be done differently.

One of the primary benefits of such a session would be to show attendees how each syntax database is different; that is, how each is useful or not useful for a variety of research methods. If people could see what these databases can and cannot do, I think we’d all learn a lot about how people presently think about solving research tasks and how they might re-imagine those research tasks.

A meaningful amount of time for Q&A would be preferred, so we could hear what people in the room would like to see done differently in the future with respect to syntax databases.

That’s it. The software shootout was a good idea and drew a large audience. I think this one would draw as well if it were publicized and if attendees could have input. If CARG were interested in putting this sort of session together, not just to watch software tools but to critique their usefulness, it could really advance research methodology.

I would appreciate it if you all could discuss this idea and let me know what you think. Thanks for your time!

Sincerely,

Mike Heiser