Questions about 'sight words'...

Please feel free to post any questions about the course content.
Debbie Hepplewhite

Teachers are commonly very concerned about this notion of 'sight words' and frequently raise questions along these lines:

Could you please give me several examples of the words which are "not completely phonically regular"? Do you mean so called "sight words"? Could you please give me an explanation of this notion? Are there any statistics of the approximate percentage of this kind of words in the whole English vocabulary? And the main question: what are the methods and the approach to managing those words in regards of teaching and learning of reading and spelling?

Thanks in advance!

In Module Five, Part 5, of the online course, I talk about the introduction of 'tricky and/or common words' as part of the phonics provision and include a list of example words. Many of these words may well be officially defined as 'irregular' but this is not necessarily how I would describe them to learners.

I also introduce some 'tricky and/or common words' from the earliest stages to be able to provide learners with cumulative, decodable sentences and texts to read and write.

Historically, phonics teaching was often preceded by the teaching of a 'sight word' vocabulary, meaning words which were commonly-used in early reading books and therefore thought useful for beginners.

Such words were not taught with any reference to letter/s-sound links, and instead they were taught as 'global shapes' - meaning 'by their shape' to be recognised as a 'whole'. This might be with the words printed on 'flash cards'. Children would try to remember the tall bits of letter shapes, or their 'tails' to help them recall the actual words. They would not necessarily recognise the same words in different contexts (for example, in books rather than on the flash cards) or they would not recognise the words in different fonts.

The younger the child, the more difficult it would be to recall words introduced in this way because of the child's limited 'Visual Attention Span' (VAS). This method was, and is, a disaster for many children - and yet this notion of an 'initial sight vocabulary' remained common - and is still common - in some contexts.

Common, useful words which may have unusual or rare spellings are now often referred to as 'tricky' words and in modern Systematic Synthetic Phonics programmes, these are drip-fed, or introduced one by one, after the introduction of some letter/s-sound correspondences and after the introduction of the phonics skills of blending for reading (decoding) and oral segmenting for spelling (encoding). By the time teachers begin to introduce useful words such as 'I', 'the', 'was', 'to' and so on, beginners are already well on their way to learning about the alphabetic code and applying their skills to reading and spelling cumulative, mainly easily-decodable words and sentences.

Sometimes these 'tricky' words are not even unusual or irregular spellings, but there may be a part of their spelling that has not been formally introduced in the phonics lessons. Some letter/s-sound correspondences within such words might simply be described as 'graphemes not yet taught' and the teacher can easily 'teach' that grapheme 'incidentally' ahead of the planned introduction of the alphabetic code.

Also, teachers can teach 'incidentally', as required, any letter/s-sound correspondence whether or not this is regarded as 'regular' or 'irregular' by this simple approach:

So - what is the definition of 'regular' and 'irregular' words?

This depends on one's approach to the teaching and explanation of the English alphabetic code. There is a definition in the official literature which some people, myself included, don't find particularly helpful in the context of modern Systematic Synthetic Phonics teaching. People can read about the criteria for the official definition via various sources.

An eminent specialist in language and research said this with regard to defining 'regular words':

“Regular word” is a technical term in the relevant scientific literature (psychology), in exactly the same way as “electron” is a technical term in the relevant scientific literature (physics).

If you wanted to know what an electron is, you should consult a definition of that term in the relevant literature.

In just the same way, if you wanted to know whether HAVE is a regular word, you should consult a definition of that term in the relevant literature. For example, “regular words (i.e., words that follow the GPC rules, such as SPELL) and irregular words (i.e., words that do not follow the GPC rules, such as GHOST)” (from Macarthur et al., Cognitive Neuropsychology, 2013). Or it is reasonable to go to such sites as which has a decent discussion of what this term means, and lists a number of irregular words (including HAVE and SAID as irregular).

Just as it is not a matter of opinion whether "a subatomic particle, symbol e− or β−, with a negative elementary electric charge" is an electron or a proton, it is not a matter of opinion whether HAVE is an irregular word or a regular word. By the definition of “regular word” that is standard in the relevant scientific literature, HAVE and SAID are irregular words.

“GPC” stands for “grapheme-phoneme correspondence”, so to understand the scientific definition of the technical term “regular word” one needs first to understand the scientific definition of the technical term “grapheme” and, especially, to understand what the difference is between the terms “grapheme” and “letter”. A grapheme is a letter or letter group that stands for a phoneme. So THIGH, though it has 5 letters, must have only two graphemes since it has only two phonemes. Its two graphemes are TH and IGH. So two phonic rules are enough for reading THIGH phonically: a rule specifying the pronunciation of TH and another rule specifying the pronunciation of IGH. Almost all words of English beginning with TH pronounce this grapheme as it is pronounced in THIGH. So that is the rule for the grapheme TH. Any word which begins with TH but has another pronunciation for TH is therefore, by the standard definition, irregular, no matter how common a word it is. That includes words such as THIS, THAT and THE, and also THAI.
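The THIGH example above can be sketched in a few lines of Python. This is only an illustration: the grapheme inventory is a tiny hypothetical one, and "always take the longest matching grapheme" is a simplifying assumption rather than how any programme or the scientific literature defines segmentation.

```python
# Hypothetical, deliberately tiny grapheme inventory (an assumption for
# illustration; the real English alphabetic code has many more entries).
GRAPHEMES = {"igh", "th", "t", "h", "i", "g"}

def segment(word, graphemes=GRAPHEMES):
    """Split a word into graphemes, trying the longest candidate first."""
    word = word.lower()
    longest = max(len(g) for g in graphemes)
    result = []
    pos = 0
    while pos < len(word):
        # Try 3-letter, then 2-letter, then 1-letter chunks at this position.
        for size in range(min(longest, len(word) - pos), 0, -1):
            chunk = word[pos:pos + size]
            if chunk in graphemes:
                result.append(chunk)
                pos += size
                break
        else:
            raise ValueError(f"no grapheme matches {word[pos:]!r}")
    return result

print(segment("THIGH"))  # ['th', 'igh']
```

Five letters come out as two graphemes, matching the two phonemes. Longest-match is a shortcut that will not always be right: in a word like 'pothole' the T and the H belong to separate graphemes, so a real segmentation needs knowledge of the spoken word too.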

So, according to official definitions, a word counts as 'regular' only when each of its graphemes is pronounced in the most common way for that grapheme. However, I would simply teach explicitly a comprehensive range of letter/s-sound correspondences, rationalised according to their 'sounds' (mainly phonemes, or a few combined phonemes creating a unit of sound such as /k+s/) and their 'spelling alternatives'. For reading purposes, I would also teach that many graphemes (letters and letter groups) can have pronunciation alternatives dependent upon the actual words.

In this way, the word 'GHOST' as mentioned above is not approached as an 'irregular' word, but the grapheme GH (or spelling alternative) is simply introduced 'as code for the sound /g/ in this word GHOST and these words....'.

I then go on to include further words to build up knowledge of that particular 'spelling word bank'. With more unusual letter/s-sound correspondences such as 'gh' as code for the /g/ sound, it is not hard to introduce a specific bank of words spelt that way - and to glue those words together with a story-theme and pictures - which is what I do in the Phonics International programme using the 'I can read' texts and their comprehension questions and illustrations - particularly in the second half of the programme.

As a teacher (or parent), I would indicate to the learners when a specific letter/s-sound correspondence is found in many words, or only in a few words, as required. This would all be part of the teaching.

On my Alphabetic Code Charts I include a comprehensive range of spelling alternatives for all the sounds, but these may include some letter/s-sound correspondences that are not found in that many words - but the words themselves may be commonly-used.

An example of this is the spelling alternative 'ai' as code for /e/ as in the example words 'said' and 'again'.
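For anyone who likes to see the idea as a data structure, here is a rough sketch of a code chart organised as sound, then spelling alternative, then word bank. Only the 'gh' and 'ai' rows come from the examples above; the 'g' and 'e' rows and the names `CODE_CHART`, `word_bank` and `sounds_for` are assumptions added for contrast, and this is not the layout of the actual Alphabetic Code Charts.

```python
# A tiny slice of a code chart: sound -> spelling alternative -> word bank.
# The 'g' and 'e' rows are assumed "common spelling" examples; the 'gh' and
# 'ai' rows are the rarer alternatives discussed in the text.
CODE_CHART = {
    "/g/": {"g": ["get", "got", "big"], "gh": ["ghost"]},
    "/e/": {"e": ["went", "help", "them"], "ai": ["said", "again"]},
}

def word_bank(sound, spelling):
    """Return the words listed for one spelling alternative of a sound."""
    return CODE_CHART.get(sound, {}).get(spelling, [])

def sounds_for(spelling):
    """Return every sound this grapheme is listed as code for."""
    return [sound for sound, alts in CODE_CHART.items() if spelling in alts]

print(word_bank("/e/", "ai"))  # ['said', 'again']
print(sounds_for("gh"))        # ['/g/']
```

Organised this way, a rare spelling such as 'ai' for /e/ is just another row with a short word bank, rather than a separate category of 'irregular' words.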

Here is the link for the free printable Alphabetic Code Charts where examples of very common and very unusual spelling alternatives are illustrative of the English alphabetic code's complexities:
Debbie Hepplewhite

Susan Godsland's excellent site provides information and links about 'sight words' - in particular look at this page:
Debbie Hepplewhite
Professor Diane McGuinness has this to say about ‘sight-words’ (see ‘Early Reading Instruction’, page 58):

‘…the sight word category was reserved for common words where one or more phonemes have a unique spelling that is hard to decode without direct instruction. There are almost no words where every phoneme has an unpredictable spelling. By this criterion, there are remarkably few true sight words. The following sight words and special group words did not fit a major spelling category in a large corpus of words of English/French origin. There are approximately 100 sight words.’

[These phonemes in slash marks are defined by Diane according to a North American accent. The notation in the slash marks to denote the 'phonemes' and the examples provided are not all the same as I have used, or would use, with different pronunciations and for notating the sounds - for example, I put /yoo/ not /ue/ as the notation. The point here, however, is that there are some words which one could call 'exception' words or very rare or unusual spellings with very few words, or only one word, spelt that way.]

/a/ aunt, laugh, plaid
/e/ friend, leopard
/i/ been, busy, sieve, pretty, women
/o/ abroad, broad, cough, father, gone, trough, yacht
/u/ a, because, does, blood, flood, of, once, one, the, was, what
/ae/ straight, they Group: ea break, great, steak
/ee/ people, ski
/ie/ aisle, choir, I, height, sleight
/oe/ sew
/ue/ beauty, feud, queue
[long] /oo/ move, prove, shoe, deuce, through Group: o do to who whom whose
[short] /oo/ Group: -oul could, would, should
/ar/ are, heart, hearth Group: orr borrow, tomorrow, sorrow, sorry
/er/ acre, glamour, journey, syrup, were Group: ure leisure, measure, pleasure, treasure
/or/ drawer, laurel Group: oor door, floor, poor
/air/ bury, heron, scare, their, there, they’re, very, where
final /k/ arc, tic, ache, stomach Group: -lk baulk, chalk, stalk, talk
/t/ Group: -bt debt, doubt, subtle
final /th/ smooth
final /v/ of
honest, honor (honour), hour
initial /h/ who, whom, whose, whole

These words are the only words which Diane McGuinness suggests could be taught as ‘whole words’ but she says they can also be taught by their sound category as all words can be decoded:

the, one, once, two, who, are, I, of, here

See these free resources that I provide, which include some of the words above where the 'code' is explained in the words:
Debbie Hepplewhite
Yet another question along the same lines - this time referring to 'high-frequency words':

I need to ask you about high frequency words. I know they are included as part of the Mini posters and in many other documents, however my question is if I can find a list of all of them together and guidance about how to teach them somewhere.

Teachers really don't need to worry about teaching 'high frequency' words as if it is a 'big thing'. It isn't.

If you are using a modern, systematic synthetic phonics programme, the high-frequency words - be they spelt in an apparently straightforward way or with tricky or unusual spelling alternatives for the sound or sounds within them - will be 'drip-fed' into the content of the programme.

Remember that it is not recommended that such words are introduced as an 'initial sight vocabulary' to be learnt as 'global (whole) shapes'.

Wait until you have already started to teach the phonics alphabetic code (the letter/s-sound correspondences) and the phonics skills of blending for reading and oral segmenting for spelling.

Then, if you think a word is tricky at the point you want to introduce it, just draw attention to the tricky, unusual, or irregular part of the word as a normal part of your phonics or general teaching.

This may sometimes involve teaching some aspect of the alphabetic code ahead of its planned introduction - in which case you are teaching the code 'incidentally' for the time being.

You may wish to write the tricky word on the board, or create a quick poster or word card to display or add to other word cards of tricky, useful words.

You may want to underline the tricky part of the word or write it in a different colour from the more straightforward parts of the word.

Some 'high-frequency' words are not at all tricky or unusual and you don't even need to flag them up as if they will cause a challenge.

I suggest that if and when you do flag up tricky or unusual high-frequency words, you group them sensibly. So, for example, you would introduce 'come' with 'some'; 'to', 'do' and 'who' together; 'here', 'there' and 'where' together; and 'you' with 'your' and even 'our' - that type of approach.

Sometimes the shared spelling in such grouped words will sound the same, as in 'come' and 'some', whereas at other times it may sound different, as in 'here', 'there' and 'where' grouped together.

What matters is that little bit of extra attention pointing out what is the same, what is different, what works in a straightforward way, what doesn't (at that time).

Also, what matters is that teachers take ownership of using words and introducing words that are sensible and necessary according to what is arising in wider reading and writing - and teachers will always need to draw attention to words which deserve particular attention.

No one programme can introduce all the words in the English language with some slight variation of spelling or with particular spelling alternatives. That is why all teachers need to be teachers of spelling all the time and work hard at supporting learners at all times.

For what it is worth, below is the list of the 100 high-frequency words listed in 'Letters and Sounds' (DfES, 2007) but please note that the vast majority of these words are very straightforward and will not need to be flagged up via display or attention at all if they are decodable at the point of introduction.

Consider this high-frequency word list and in doing so, think which of these words are very straightforward and really don't need to be 'flagged up' as if they are challenging or tricky or special:

1. the
2. and
3. a
4. to
5. said
6. in
7. he
8. I
9. of
10. it
11. was
12. you
13. they
14. on
15. she
16. is
17. for
18. at
19. his
20. but
21. that
22. with
23. all
24. we
25. can
26. are
27. up
28. had
29. my
30. her
31. what
32. there
33. out
34. this
35. have
36. went
37. be
38. like
39. some
40. so
41. not
42. then
43. were
44. go
45. little
46. as
47. no
48. mum
49. one
50. them
51. do
52. me
53. down
54. dad
55. big
56. when
57. it’s
58. see
59. looked
60. very
61. look
62. don’t
63. come
64. will
65. into
66. back
67. from
68. children
69. him
70. Mr
71. get
72. just
73. now
74. came
75. oh
76. about
77. got
78. their
79. people
80. your
81. put
82. could
83. house
84. old
85. too
86. by
87. day
88. made
89. time
90. I’m
91. if
92. help
93. Mrs
94. called
95. here
96. off
97. asked
98. saw
99. make
100. an
Debbie Hepplewhite
I'm cross-referencing this thread to the one featuring Professor Anne Castles guest-blog posting about 'sight words':
Debbie Hepplewhite
Extracts about 'sight words' taken from:


Letters and Sounds: Notes of Guidance for Practitioners and Teachers

© Crown copyright 2007 Primary National Strategy

As the 'Letters and Sounds' programme was mentioned with reference to research (comparing its application with EER - Early Reading Research) in Professor Anne Castles' guest-blog posting, I thought it would be interesting to provide the guidance accompanying the 'Letters and Sounds' publication.

This is what it says in the 'Notes of Guidance':

When and how should high-frequency words be taught?

High-frequency words have often been regarded in the past as needing to be taught as ‘sight words’ – words which need to be recognised as visual wholes without much attention to the grapheme–phoneme correspondences in them, even when those correspondences are straightforward. Research has shown, however, that even when words are recognised apparently at sight, this recognition is most efficient when it is underpinned by grapheme–phoneme knowledge.

What counts as ‘decodable’ depends on the grapheme–phoneme correspondences that have been taught up to any given point. Letters and Sounds recognises this and aligns the introduction of high-frequency words as far as possible with this teaching. As shown in Appendix 1 of the Six-phase Teaching Programme, a quarter of the 100 words occurring most frequently in children’s books are decodable at Phase Two. Once children know letters and can blend VC and CVC words, by repeatedly sounding and blending words such as in, on, it and and, they begin to be able to read them without overt sounding and blending, thus starting to experience what it feels like to read some words automatically. About half of the 100 words are decodable by the end of Phase Four and the majority by the end of Phase Five.

Even the core of high frequency words which are not transparently decodable using known grapheme–phoneme correspondences usually contain at least one GPC that is familiar. Rather than approach these words as though they were unique entities, it is advisable to start from what is known and register the ‘tricky bit’ in the word. Even the word yacht, often considered one of the most irregular of English words, has two of the three phonemes represented with regular graphemes.
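The point that 'decodable' depends on what has been taught so far, and that the 'tricky bit' is whatever is left over, can be sketched as below. Everything here is an assumption for illustration: the words are split into grapheme-sound pairs by hand, the taught set is hypothetical and not the Phase Two list, and the 'ach' split of yacht is simply one way of reading "two of the three phonemes represented with regular graphemes".

```python
# Each word is listed as hand-made (grapheme, sound) pairs. These splits are
# illustrative assumptions, not taken from 'Letters and Sounds'.
WORDS = {
    "in":    [("i", "/i/"), ("n", "/n/")],
    "on":    [("o", "/o/"), ("n", "/n/")],
    "it":    [("i", "/i/"), ("t", "/t/")],
    "and":   [("a", "/a/"), ("n", "/n/"), ("d", "/d/")],
    "said":  [("s", "/s/"), ("ai", "/e/"), ("d", "/d/")],
    "yacht": [("y", "/y/"), ("ach", "/o/"), ("t", "/t/")],
}

# Hypothetical set of correspondences taught so far.
TAUGHT = {
    ("s", "/s/"), ("a", "/a/"), ("t", "/t/"), ("i", "/i/"),
    ("n", "/n/"), ("o", "/o/"), ("d", "/d/"), ("y", "/y/"),
}

def tricky_bits(word, taught):
    """Return the correspondences in `word` that have not been taught yet."""
    return [gpc for gpc in WORDS[word] if gpc not in taught]

def is_decodable(word, taught):
    """A word is decodable when it has no untaught correspondences."""
    return not tricky_bits(word, taught)

print(is_decodable("and", TAUGHT))   # True
print(tricky_bits("said", TAUGHT))   # [('ai', '/e/')]
print(tricky_bits("yacht", TAUGHT))  # [('ach', '/o/')]
```

Teaching 'ai' as code for /e/ incidentally amounts to adding `("ai", "/e/")` to the taught set, after which `is_decodable("said", ...)` becomes true. The word has not changed; only what the learner has been taught has changed.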

Debbie Hepplewhite
