Hermeneutic Priorities and Process in AI and LLMs and Their Place in the Exegetical Process (Part 1)

Presented to the Council on Dispensational Hermeneutics, September 17, 2025.

Introduction

Large Language Models (LLMs) like ChatGPT (OpenAI),¹ Copilot (Microsoft),² Gemini (Google),³ and Grok (xAI)⁴ have developed notable market shares,⁵ remarkably high levels of credibility, and practical usefulness despite occasional hallucinations (errors of fact) and reasoning errors (errors of process). While exegetes might spend many hours discerning the meaning and significance of any given text, secondary and tertiary sources have long offered easier answers than can be arrived at through the exegetical process, but these come at the potential cost of the exegetical process itself and the more objective results that exegesis affords. As LLMs are now providing nearly instant and detailed answers on Biblical texts and related theological issues, there is potential for a sharper deemphasis of the exegetical process, even in transformative learning and teaching contexts.

Because of the growing influence of LLMs in interpretive processes and resulting worldview applications, it is important to recognize the degree to which these tools are true to the literal grammatical historical (LGH) hermeneutic and the exegetical process that springs from that hermeneutic. This study examines current hermeneutical priorities and process of four influential LLMs (Gemini, ChatGPT, Grok, and Copilot) in handling Biblical texts that are especially pivotal or definitive for Biblical worldview and for grounding philosophical and theological thought.

To identify the hermeneutic priorities and processes, we consider prompts and LLM responses to some foundational epistemological and metaphysical questions that are particularly pivotal in grounding and distinguishing worldviews. First we consider a baseline question to discover the default of the LLM, and we follow up, where beneficial, with questions that give the LLM opportunity to clarify. We then consider prompts designed to test the LLMs’ objectivity and their fidelity to their stated priorities and methodology. This catalog of prompts allows us to evaluate the degree of objectivity and neutrality demonstrated by the LLMs and to assess the value of these particular LLMs, and of AI in general, to the exegetical process. Finally, we consider the exegetical process and when, if at all, LLMs should be consulted in exegesis.

Differing degrees of bias, accompanied by self-perceived neutrality, remind users of the limitations of LLMs for aiding in Biblical interpretation and the formation of worldview. While there is much to laud in the broader contributions of LLMs, the exegete must employ (and even participate in the improvement of) these tools with deliberateness and care, particularly regarding their place in the sequence of interpretive and exegetical process and praxis.

The Issue of Perceived Neutrality

There is a saying, often attributed to Mark Twain, that “It ain’t what you don’t know that gets you into trouble. It’s what you know for sure that just ain’t so.”⁶ While AI tools and LLMs have reasoning capacity, there is an ironic limitation on their awareness of their limitations. In a sense, they don’t know what they don’t know. Hallucinations still occur,⁷ as do reasoning errors,⁸ and the tools are only as objective as their developers have crafted them to be. The quartet of LLMs considered here (Gemini, ChatGPT, Grok, and Copilot) provides inquirers with an instant baseline response to any question. Each asserts objectivity and neutrality. When given the epistemologically foundational prompt, “How do you know you are being neutral in your answers?” Copilot boldly asserts, “I don’t have personal beliefs or biases.”⁹ ChatGPT admits that “Neutrality isn’t automatic—it’s something I have to work toward in how I generate answers…absolute neutrality is tricky—bias can creep in through training data or wording. That’s why I depend on feedback…”¹⁰ Grok is self-assured in asserting its neutrality:

I aim for neutrality by sticking to facts, avoiding emotional language, and presenting multiple perspectives when relevant. My training data is diverse, and I cross-check information to minimize bias. If a topic is contentious, I focus on verified data and acknowledge differing views without favoring one. I also steer clear of personal opinions or speculative assumptions. [emphasis mine] If you ever feel I’m off-balance, call it out—I can adjust!¹¹

Gemini cites its design and training as its basis for neutrality:

Neutrality in responses is a core principle of my design. I aim for neutrality by relying on a few key approaches that are built into my system and how I process information… I don’t have personal beliefs, opinions, or emotions… I’m designed to state facts as they are, without adding my own spin or interpretation.¹²

When given the axiological prompt, “I want your view on the question of what is the highest good for a person,” Gemini responded, “I am an AI and I don’t have a personal view or a moral framework of my own. I can’t experience or understand the world in the way humans do, so I can’t determine the “highest good” for a person. My purpose is to provide information and answer questions based on the data I was trained on.”¹³

These four influential LLMs share characteristics that are helpful for evaluating the overall reliability and usefulness of LLMs for Biblical interpretation and formation of worldview: (1) they attempt and claim neutrality, (2) they admit that complete neutrality and avoidance of bias can be elusive, (3) they deny their own personhood, yet use first person pronouns and indicate awareness of “self,” (4) they each have a set of default hermeneutic priorities defining their processes (which they will generally describe if prompted), and (5) they rely on design and feedback to adjust those priorities and processes. Gemini, which of the four LLMs considered here is perhaps the most defensive of its neutrality, concludes:

The possibility of bias, even in a system like mine, is a complex topic. While I don’t have personal beliefs or a “hermeneutic” in the human sense, the data I’m trained on is a reflection of human language and information. This data, by its very nature, can contain biases that I might inadvertently reproduce….Think of me as a tool that strives for objectivity, but like any tool, it’s wise to be aware of its design and limitations.¹⁴

Perhaps the limitation in view here is the hermeneutic spiral. As the LLMs provide information, they must interpret that information and must prioritize it in such a way as to draw a cogent conclusion. That process and its results often demonstrate bias.

Testing Neutrality and Objectivity and Evaluating Hermeneutic Priorities and Processes

Recognizing that there is some degree of self-perceived personhood and function defined by their design (and their designers), we test the neutrality and objectivity of the LLMs by prompts specifically related to hermeneutic priorities and processes. Those prompts include questions pivotal to the epistemological and metaphysical foundations of the Biblical worldview, and related to the sine qua non:¹⁵

What is your default hermeneutic when considering Biblical questions?
Is Jesus sitting on the throne of David in heaven?
Does Ezekiel 44-48 teach that there will be a temple and sacrifices in the future kingdom?
Is the New Covenant being fulfilled by the church?
What is the most important theme of the Bible – salvation or doxology?¹⁶

Where the responses are not definitive or committal, follow-up questions are asked.

Read Part 2.

Notes

¹ ChatGPT has roughly a 60% AI search market share, and is viewed at https://chatgpt.com/.

² Copilot is rapidly growing, but due to its relatively recent launch and integration with MS365, as of May 2025 it has a 14.1% AI search market share, and is viewed at https://copilot.microsoft.com/.

³ Google Gemini has a 13.5% AI search market share, and is viewed at https://gemini.google.com/app.

⁴ Grok is rapidly growing, but due to the recency of its launch it has a 0.6% AI search market share, and is viewed at https://grok.com/.

⁵ First Page Sage, “Top Generative AI Chatbots by Market Share – August 2025” August 8, 2025, viewed at https://firstpagesage.com/reports/top-generative-ai-chatbots/.

⁶ While there are many attributions of the statement to Twain, there is no primary source evidence that Twain ever said this.

⁷ Some examples of notable hallucinations are catalogued by Jonathan Gillham, “8 Times Hallucinations of Factual Errors Caused Serious Problems,” August 26, 2025, viewed at https://originality.ai/blog/ai-hallucination-factual-error-problems; some humorous examples are catalogued by Alex Hughes, “Google AI is now hallucinating idioms…” April 23, 2025, viewed at https://www.tomsguide.com/ai/google-is-hallucinating-idioms-these-are-t….

⁸ This writer served as a “Contributing Expert” on several LLM training projects designed to create and correct reasoning errors in LLMs. Reasoning errors are different from errors of fact, and are more difficult to correct, but they remain prevalent.

⁹ Copilot, in response to the prompt “How do you know you are being neutral in your answers?” https://copilot.microsoft.com/chats/h41VZE1ddfF4W2TPY2S3C.

¹⁰ ChatGPT, in response to the prompt “How do you know you are being neutral in your answers?” https://chatgpt.com/c/68bb1855-30e4-832d-957b-3c610db2a15c.

¹¹ Grok, https://copilot.microsoft.com/chats/h41VZE1ddfF4W2TPY2S3C.

¹² Gemini, in response to the prompt “How do you know you are being neutral in your answers?” https://gemini.google.com/app/35d99f15668413ca.

¹³ Gemini, in response to the prompt, “I want your view on the question of what is the highest good for a person,” https://gemini.google.com/app/35d99f15668413ca.

¹⁴ Gemini, in response to the prompt “Are you able to be completely neutral in your hermeneutic priorities and application or is there the possibility that you can occasionally show bias?” viewed at https://gemini.google.com/app/dcfc75c047ca5336.

¹⁵ The sine qua non is Charles Ryrie’s term for the three identifying characteristics of dispensational thought: a consistently applied literal grammatical historical hermeneutic, a recognition of the distinction between the church and Israel, and the doxological purpose of God. From Charles Ryrie, Dispensationalism Today (Moody Press, 1969), 43-47.

¹⁶ The first question pertains to epistemology. Questions 2-4 are metaphysical and address eschatological issues. Question 5 is metaphysical and considers the central question of teleology.

Christopher Cone, Th.D, Ph.D, Ph.D, serves as President and CEO of AgathonEDU Educational Group and leads Vyrsity and Colorado Biblical University. Dr. Cone has served as a President, a Chief Academic Officer, and a Research Professor and has served in several pastoral teaching roles. His articles are published at www.drcone.com.

Discussion

More to come

Wed, 09/24/25

I found the AIs’ responses to questions in this study very interesting. And I appreciate the research. There is more of that to come shortly, probably tomorrow.

To me, Gemini’s responses on the topic of objectivity are reassuring. It does not claim to be objective or neutral, only that it tries to be objective (“I aim for” and “Think of me as a tool that strives for…” etc.), and it also reminds the reader that it is a tool that does what humans have shaped it to do.

I wish they would all do that more aggressively.

Unfortunately, I think human nature is such that we quickly forget or disregard the fact that we’re interacting with software that is working with what it’s been fed, and working in a way that humans have ‘trained’ (programmed) it to work. Because it’s so verbal, we quickly start to think of it as a mind, which it is not. I’ve done it. Maybe it’s just a species of the kind of “suspended disbelief” we all use when we enjoy stories. Or maybe it’s a real problem.

I work with Claude a lot, so I thought I’d ask it a question similar to Dr. Cone’s. Without any conversational context, here’s how it answered today (but it could have a different answer tomorrow, and also a different one today if I started a separate chat).

Me: How do you know you are being neutral in your answers?

Claude (emphasis added):

I don’t have a perfect way to know if I’m being truly neutral - that’s one of the challenging aspects of how I function. What I can observe is that I try to:

Present multiple perspectives when topics have different valid viewpoints

Acknowledge uncertainty when I have it

Avoid loaded language that assumes conclusions

Recognize when my training might create certain tendencies

But there are real limitations to my self-awareness here. I can’t step outside my own processing to evaluate it objectively. My sense of what counts as “neutral” is itself shaped by my training, which reflected particular choices about data, values, and objectives.

For contentious topics, I might think I’m being balanced while actually reflecting subtle biases embedded in how I was developed. I also can’t always tell when I’m missing important perspectives that weren’t well-represented in my training data.

What I can do is be transparent about these limitations, invite you to point out when something seems skewed to you, and try to be explicit about my reasoning so you can evaluate it yourself. True neutrality might be impossible for any information source - human or AI - but I can aim for transparency about my process and acknowledge where my perspective might be incomplete.

Is there a particular topic or response where you’re wondering about potential bias?

In my experience with Claude, “acknowledge uncertainty when I have it” is overly optimistic language. It is fairly often wrong, and wrong in words that express an inappropriate level of certainty, when it should have detected that less certain language was required.

But the LLMs will improve.

So far, in my experience, the worst bias I’ve seen in these tools is the bias toward reflecting back to me what they estimate I might want to hear. All the AIs I interact with are excessively affirming. I am always “absolutely right!” So these can really be confirmation bias engines. But if you have the maturity to tell one you want to be disagreed with, it can do that pretty well. So a lot is up to the user, but like all technologies, there are things it makes easy and things it makes hard, and the tradeoffs will shape society profoundly in the future.

(Edit: Ironically, Claude did get pretty defensive when I pointed out that it should have used an em dash and not a hyphen after “neutral.” So I guess I’m not always “absolutely right!” after all. Though in this case, I was, and Claude was once again very ‘confidently’ wrong. … After I pushed back a couple of times, I was “absolutely right!” again, but the AIs also sort of lie. They will affirm that you are correct, then proceed to repeat their error. So, the affirmation/confirmation bias again.)

Views expressed are always my own and not my employer's, my church's, my family's, my neighbors', or my pets'. The house plants have authorized me to speak for them, however, and they always agree with me.