The Watson-Glaser Test: Format, Sections, and Score Explained
The Watson-Glaser Critical Thinking Appraisal is a 40-question, roughly 30-minute test of critical reasoning built around five sections and three reported scales. There is no universal pass mark, only a percentile against a norm group.
The Watson-Glaser test is not an intelligence test and not a general aptitude test. It is a focused measure of critical reasoning, and it is the single most common sift used by UK law firms for training contracts and vacation schemes. The current version, Watson-Glaser III, is 40 questions in about 30 minutes, scored as a percentile against a norm group rather than a raw pass or fail. Most candidates who miss the cut do so on two of the five sections, Recognition of Assumptions and Evaluation of Arguments, because those two punish the instinct to use outside knowledge instead of the passage in front of you.
Quick takeaways
- Watson-Glaser III is 40 multiple-choice questions with a working time of about 30 minutes, set by the employer. The older Watson-Glaser II long form ran 80 questions in 60 minutes.
- It is published by TalentLens, the talent assessment arm of Pearson, and is delivered as a computer-based, item-banked test that often runs unsupervised.
- The test has five sections: Assessment of Inferences, Recognition of Assumptions, Deduction, Interpretation, and Evaluation of Arguments. These map onto three reported scales: Recognize Assumptions, Evaluate Arguments, and Draw Conclusions.
- Scoring is a raw number correct out of 40, converted to a percentile against a norm group. The report also lists a T-score, a STANINE, and a STEN.
- Common percentile landmarks: roughly 33 to 34 correct sits at the 80th percentile, 36 to 38 at the 90th, and 39 to 40 at the 95th to 99th.
- Most top firms screen at the 75th to 80th percentile or higher. For specific Magic Circle, Silver Circle, and US BigLaw cutoffs, see the pass-marks breakdown linked below.
- There is no universal pass mark. The bar is whatever percentile the employer sets against their norm group, so the same raw score can clear one firm and miss another.
What the Watson-Glaser test actually measures
Watson-Glaser measures critical thinking, defined narrowly as the ability to reason from written information without letting prior belief, outside knowledge, or emotional reaction contaminate the judgment. That narrow definition is the whole game. The test deliberately uses passages on contested topics, then asks you to reason only from what the passage states, not from what you happen to know is true in the real world.
This is why strong candidates with deep subject knowledge sometimes underperform. A finance graduate reading a passage about interest rates may "know" the conclusion is correct from experience, mark it as following, and lose the point, because the passage as written did not support it. The test rewards disciplined reading over being right in general.
The instrument has a long pedigree. It was first developed by Goodwin Watson and Edward Glaser in 1925 and has been revised repeatedly, most recently into the Watson-Glaser III edition that most candidates now sit. It is published by TalentLens, part of Pearson, and licensed to employers rather than sold to candidates, which is why there is no official practice version from the publisher aimed at test-takers.
Format and timing: 40 questions in about 30 minutes
The current Watson-Glaser III is 40 multiple-choice items. The working time is commonly set at about 30 minutes, which works out to roughly 45 seconds per question, though the exact limit is configured by the employer and a minority run it untimed or with a longer window. Because WG-III is computer-based and draws from an item bank, the precise mix of question types and the order you see them can vary between candidates and employers.
The older Watson-Glaser II long form ran 80 questions in 60 minutes, and you will still see that format referenced in older prep material. If your invitation specifies 40 questions, you are sitting WG-III and the timings above apply. If it specifies 80, you are on the long form and have the same per-question pace but twice the volume.
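The per-question pace quoted above is simple arithmetic. Here is a minimal Python sketch of it; `seconds_per_item` is a hypothetical helper name, and the time limits are the commonly quoted ones rather than a guaranteed employer setting.

```python
def seconds_per_item(questions: int, minutes: float) -> float:
    """Average seconds available per question for a given time limit."""
    if questions <= 0:
        raise ValueError("questions must be positive")
    return minutes * 60 / questions

# Commonly quoted formats; the employer may configure a different limit.
wg3_pace = seconds_per_item(40, 30)   # Watson-Glaser III
wg2_pace = seconds_per_item(80, 60)   # older long form
print(wg3_pace, wg2_pace)             # 45.0 45.0
```

If your invitation states a different limit, plugging it in gives the pace to practise at.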
The chart below frames the test before we walk each section. It is the structure most candidates wish they had seen before sitting down.

Most invitations arrive after a CV sift and before an assessment centre. For law firms in particular it is an early gate, so a weak Watson-Glaser score ends the application before anyone reads your covering letter closely. That front-loading is the reason it deserves real preparation rather than a glance the night before.
The five sections, with worked examples
The five sections each test a different reasoning skill. They are not equally hard. Recognition of Assumptions and Evaluation of Arguments are where most marks are lost, because both reward suppressing outside knowledge.
1. Assessment of Inferences
You read a short passage of facts, then judge a list of proposed inferences. For each, you decide whether it is True, Probably True, Insufficient Data, Probably False, or False, based only on the passage. The trap is treating "Probably True" and "True" as interchangeable. "True" means the passage proves it; "Probably True" means the passage makes it likely but not certain.
Worked example. Passage: a company reports that revenue rose 12 percent last year while headcount stayed flat. Proposed inference: "The company became more productive per employee last year." The defensible answer is Probably True. Revenue per head rose, which is a reasonable proxy for productivity, but the passage does not define productivity or rule out price increases driving the revenue, so it is not proven and not False.
2. Recognition of Assumptions
You read a statement, then decide for each proposed assumption whether it is Assumed or Not Assumed by the statement. An assumption is something taken for granted for the statement to make sense. The trap is marking real-world-true facts as assumed when the statement does not actually rely on them.
Worked example. Statement: "We should switch to the new supplier to cut costs." Proposed assumption: "The new supplier charges less than the current one." This is Assumed; the statement makes no sense otherwise. Proposed assumption: "The new supplier is based overseas." This is Not Assumed; the statement works regardless of where the supplier sits.
3. Deduction
You read premises, then decide whether a stated conclusion necessarily follows. This is formal logic. The conclusion must follow with certainty from the premises alone, even if the premises seem false in the real world.
Worked example. Premises: "All audited firms file annual returns. This firm is audited." Conclusion: "This firm files annual returns." This conclusion follows. Change the conclusion to "All firms that file annual returns are audited" and it does not follow, because the premise runs one direction only.
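One way to see why the first conclusion follows and the reversed one does not is to search for a counterexample. The sketch below is an illustration only, assuming a tiny made-up universe of two firms; the names `follows`, `all_audited_file`, and the rest are hypothetical. A conclusion "follows" here if no arrangement of the firms makes the premises true and the conclusion false.

```python
from itertools import product

# Each firm is a pair (audited, files_returns). A "world" assigns those two
# properties to every firm in a tiny, made-up universe.
FIRM_STATES = list(product([False, True], repeat=2))

def follows(premises, conclusion, n_firms=2):
    """True if the conclusion holds in every world where all premises hold."""
    for world in product(FIRM_STATES, repeat=n_firms):
        if all(p(world) for p in premises) and not conclusion(world):
            return False  # counterexample: premises true, conclusion false
    return True

# Premises from the worked example; firm 0 plays the role of "this firm".
all_audited_file = lambda w: all(files for audited, files in w if audited)
this_firm_audited = lambda w: w[0][0]

# The two candidate conclusions.
this_firm_files = lambda w: w[0][1]
all_filers_audited = lambda w: all(audited for audited, files in w if files)

premises = [all_audited_file, this_firm_audited]
print(follows(premises, this_firm_files))     # True
print(follows(premises, all_filers_audited))  # False
```

The second check fails because a world exists in which another firm files returns without being audited, which is the "one direction only" point in plain form.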
4. Interpretation
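You read a passage, then decide whether a proposed conclusion follows beyond reasonable doubt. It sits between Deduction and Inference: stricter than "probably," looser than formal certainty. The trap is accepting conclusions that go even slightly beyond what the passage supports.
Worked example. Passage: a survey of 200 trainees at one firm found that 70 percent preferred hybrid working. Conclusion: "Most of the surveyed trainees preferred hybrid working." This conclusion follows. Change it to "Most trainees in the profession prefer hybrid working" and it does not follow, because the passage covers only the trainees surveyed at one firm.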
5. Evaluation of Arguments
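You read a question, then judge whether each proposed argument is Strong or Weak. A strong argument is both important and directly relevant to the question. A weak argument is trivial, off-topic, or relies on a slippery-slope leap. The trap, again, is letting your own view of the topic decide instead of judging relevance and weight.
Worked example. Question: "Should firms require trainees to attend the office five days a week?" Proposed argument: "Yes; in-person supervision is how trainees learn to handle client work, which is the purpose of a training contract." This is Strong; it is important and bears directly on the question. Proposed argument: "No; office coffee is usually poor." This is Weak; it is trivial, whatever your own view of office attendance.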
How the score is reported
Watson-Glaser is scored on the number of items answered correctly out of 40. That raw score is then converted into a percentile against a norm group, which is the figure employers actually read. A percentile of 80 means you scored better than 80 percent of the comparison group, not that you got 80 percent of questions right.
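The gap between a percentile and a percentage of questions correct is easier to see with numbers. The sketch below is illustrative only: the norm group is made up, `percentile_rank` is a hypothetical helper, and "share of the group scoring strictly below you" is an assumed definition, not the publisher's documented norming method.

```python
def percentile_rank(raw_score: int, norm_scores: list[int]) -> float:
    """Percentage of the norm group that scored strictly below raw_score."""
    if not norm_scores:
        raise ValueError("norm group is empty")
    below = sum(1 for s in norm_scores if s < raw_score)
    return 100 * below / len(norm_scores)

# Made-up norm group of ten raw scores out of 40, for illustration only.
norm_group = [22, 25, 27, 28, 29, 30, 31, 32, 35, 38]

raw = 34
print(percentile_rank(raw, norm_group))  # 80.0 (better than 8 of 10)
print(100 * raw / 40)                    # 85.0 (percent of items correct)
```

The same raw score of 34 would land on a different percentile against a stronger or weaker comparison group, while the percent correct would stay at 85.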
The Profile Report shows an overall percentile plus a percentile for each of the three reported scales: Recognize Assumptions, Evaluate Arguments, and Draw Conclusions. Buried in the technical section, the report also lists the number correct, a T-score, a STANINE, and a STEN, which are alternative ways of placing your raw score on a standard distribution. Employers almost always make the decision on the overall percentile.
The infographic below maps raw scores to the common percentile landmarks and lists the five sections in one place.

| Percentile | Approx. correct out of 40 | What it signals |
|---|---|---|
| 50th | 27 to 29 | Average against the norm group |
| 80th | 33 to 34 | Competitive for most professional roles |
| 90th | 36 to 38 | Strong; clears most law-firm sifts |
| 95th to 99th | 39 to 40 | Top-tier performance |
| 75th to 80th and above (law-firm target) | About 33 to 34 at the 80th | Training contracts and vacation schemes |
The raw-to-percentile mapping shifts slightly depending on which norm group the employer selects, for example a general working-population norm versus a graduate or legal norm. A graduate norm group is tougher, so the same 34 out of 40 can sit a few percentile points lower than it would against a general norm.
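For quick self-checks after a practice set, the landmarks in the table above can be treated as a simple lookup. This is a rough sketch under stated assumptions: the ranges are the approximate ones from the table, `landmark_for` is a hypothetical helper, and scores that fall between landmarks return `None` rather than a guessed percentile.

```python
# Approximate landmarks copied from the table above: (low, high, label).
LANDMARKS = [
    (27, 29, "50th"),
    (33, 34, "80th"),
    (36, 38, "90th"),
    (39, 40, "95th to 99th"),
]

def landmark_for(raw_score: int):
    """Return the percentile landmark whose range contains raw_score, else None."""
    if not 0 <= raw_score <= 40:
        raise ValueError("raw score must be between 0 and 40")
    for low, high, label in LANDMARKS:
        if low <= raw_score <= high:
            return label
    return None  # the table gives no landmark for this score

print(landmark_for(34))  # 80th
print(landmark_for(35))  # None
```

Because the real mapping depends on the norm group the employer selects, treat the result as a rough guide, not as the percentile a firm would see.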
Where candidates lose marks
Candidates lose the most points in three predictable places, and all three come from the same root cause: importing outside knowledge into a test that only wants reasoning from the passage.
The first is treating Recognition of Assumptions as a general-knowledge question. Candidates mark a statement as assumed because it is true in the real world, when the statement in front of them does not actually depend on it. The discipline is to ask only "does this statement collapse without this assumption," not "is this assumption true."
The second is over-reading inferences. On the Assessment of Inferences section, candidates collapse the five-point scale into a binary and lose the distinction between "True" and "Probably True," or between "False" and "Probably False." The middle options exist for a reason, and the safe move is to ask what the passage proves versus what it merely makes likely.
The third is pace. At about 45 seconds per item, candidates who linger on the first few Deduction questions run out of time before the later sections, where the easier marks often sit. The fix is to bank the quick wins first and flag the formal-logic items to return to.
To practise with worked samples for each of the five sections, see Watson-Glaser Practice Test 2026: Free Sample with Answer Walkthroughs.
How to prepare in a week
A week of focused practice moves Watson-Glaser scores more than most candidates expect, because the gains come from learning the question conventions rather than getting smarter. Spend the first two days drilling Recognition of Assumptions and Evaluation of Arguments, the two highest-loss sections, until suppressing outside knowledge becomes automatic. Spend the next two days on timed full-length sets so the 45-seconds-per-item pace feels normal rather than rushed. Use the final days to review every item you got wrong and write one sentence on why the correct answer follows, which is the single highest-yield revision habit for this test.
For full-length, timed practice with worked explanations for every item, our Watson-Glaser practice runs the real format and pace. For where the bar actually sits at named firms, read the pass-marks breakdown below before you set your target.
FAQ
Is the Watson-Glaser test hard?
It is hard in a specific way. The reasoning is not advanced, but the conventions are unforgiving, and the time pressure of about 45 seconds per item leaves little room to second-guess. Candidates who practise the question types find it manageable; candidates who walk in cold tend to lose marks on Recognition of Assumptions and Evaluation of Arguments.
What is a good Watson-Glaser score?
There is no universal pass mark. A good score is one that clears the percentile the employer has set against their norm group. As a guide, the 80th percentile, around 33 to 34 correct out of 40, is competitive for most professional roles, and most top law firms screen at the 75th to 80th percentile or higher.
How long is the Watson-Glaser test?
Watson-Glaser III is 40 questions with a working time of about 30 minutes, set by the employer. The older Watson-Glaser II long form was 80 questions in 60 minutes. Both run at roughly 45 seconds per item.
Who publishes the Watson-Glaser test?
It is published by TalentLens, the talent assessment business within Pearson. Employers license it, which is why there is no official candidate-facing practice version from the publisher.
What are the five sections of the Watson-Glaser test?
Assessment of Inferences, Recognition of Assumptions, Deduction, Interpretation, and Evaluation of Arguments. These five map onto the three reported scales: Recognize Assumptions, Evaluate Arguments, and Draw Conclusions.
Can you use outside knowledge on the Watson-Glaser test?
No, and doing so is the most common way candidates lose marks. Every section asks you to reason only from the passage or premises in front of you. A conclusion that is true in the real world but unsupported by the passage is marked wrong.
Is the Watson-Glaser test the same as a critical thinking test?
It is one specific critical thinking test. The terms "Watson-Glaser," "Watson-Glaser Critical Thinking Appraisal," and "critical reasoning test" are used interchangeably for this instrument, but other publishers sell different critical thinking tests that are not Watson-Glaser.
Can you retake the Watson-Glaser test?
Retake policy is set by the employer, not the publisher. Some firms allow one retake per cycle; many do not allow a retake within the same application cycle. Because the test is item-banked, a retake usually draws different questions.
Related on PrepClubs
- Pillar. Watson-Glaser test overview and preparation. The full pillar page with the latest format, employer use, and prep paths.
- Deep practice. Full Watson-Glaser practice with worked explanations. $39 one time. Pass Guarantee. Timed full-length sets in the real WG-III format.
- Cutoffs. Watson-Glaser pass marks: Magic Circle, Silver Circle, and US BigLaw. The firm-by-firm percentile bar so you know what to target.
- Guide. What is a good cognitive test score: per-test benchmarks. Cross-test benchmarks including Watson-Glaser, CCAT, Wonderlic, and PI.
Practice on PrepClubs
Full-length Watson-Glaser practice in the real WG-III format.
Watson-Glaser rewards practising the question conventions, not cramming facts. Our Watson-Glaser practice tests run the five sections at the real 45-seconds-per-item pace, with worked explanations for every item that show why the correct answer follows from the passage and the wrong ones do not. $39 one time. Pass Guarantee.