Michael Dalton, as coordinator of the project, gives an update on its current status and looks forward to 2010.

Issue 3 of the Dalton International DNA Project Progress Report has now been published. This includes all those who joined the project up to the end of January 2009. There were 99 participants included in Issue 2 of the report and Issue 3 has 126 sets of markers recorded and analysed. This represents an impressive expansion of the project. Additionally, many participants have extended their number of markers and this adds considerably to the value of the database as a whole to our Dalton family history researches.

The report is a landmark document and extends to 54 pages. As part of the Orange conference, I gave a presentation which previewed its contents. This presentation may be viewed here on the DGS website in the Photo/Video Gallery.

The number of separately identifiable genetic families has increased from 10 to 13. The number of singletons has increased by just three, from 18 to 21. This reflects the high success rate that we are achieving, with nearly all new project participants finding matches with existing project members.

Following this update, you will find extracts from Issue 3 of the report as follows:

We already have a number of further participants who have joined the project during the past few months and currently there are a total of 131 sets of markers in our database. DIDP is one of the largest and most respected projects of its type internationally, but we still need to expand it further, particularly with individuals who have documented ancestral lines that take them back to known English or Irish Dalton origins. The strength of the database as a family history research tool lies in its size, and its continued growth is of paramount importance to us all. So, if you are a Dalton male please do think about joining this well established and exciting project.

Looking ahead to 2010 the project will undoubtedly continue to grow. Alongside this we anticipate putting more focus on the identification of most recent common ancestors within the various genetic families. This will be assisted by the appointment of coordinators for each genetic family, and the individual websites that have now been set up for a number of genetic families at www.dalton-dna.net to enable family tree data to be shared. Working with Chris Pomery, we are looking at ways of providing more regular updates to each genetic family group, with reports to the groups being more frequent and the publication of the full report therefore correspondingly less frequent.

You are invited to contact us by email if you would like to join the project, or if you have any questions which you wish to raise. The details for the overall coordination of the project, and for the individual genetic family coordinators are as follows:

Any questions that you have about the project should be referred to Michael Dalton or Karen Preston in the first instance, or directly to the appropriate genetic family coordinator as follows:

| Genetic Family | Coordinator | Email Address |
|---|---|---|
| A | Karen Dalton Preston | karen@golden-hills.com |
| B | Wendy Fleming | wendy.fleming@optusnet.com.au |
| C | Michael Neale Dalton | michaelndalton@aol.com |
| D | Karen Dalton Preston | karen@golden-hills.com |
| E | Millicent Craig | millicenty@aol.com |
| F | to be appointed | |
| G | to be appointed | |
| H | to be appointed | |
| I | Gerry Dalton | tomngerrytravel@hotmail.com |
| J | to be appointed | |
| X | to be appointed | |
| Y | John Dalton | johndalton78@hotmail.com |
| Z | Howard John Dalton | h.dalton1@ntlworld.com |
| Singletons | Michael Neale Dalton | michaelndalton@aol.com |

There are a number of genetic family coordinators yet to be appointed. Volunteers are sought and anyone who is interested in taking on this role should contact Michael Dalton.

Our Chairman Michael Dalton gives an update on the Dalton International DNA Project (DIDP) to coincide with the publication of Issue 3 of the Project Progress Report to all participants. This summary includes extracts from the report itself.

The report includes a total of 126 participants. Of these, 104 are Y-chromosome DNA results within our own Dalton International Project hosted by Family Tree DNA. A further 22 results have been inferred by us, in part or in whole, through links with other Dalton surname projects. A number of further results have now been received, too recent for inclusion this time, but they will of course be incorporated into the next issue.

The strength of the DIDP lies in its size, and in its ability to draw together all this data in one place. We welcome all Daltons to make their results available within our project in order to further our collective research into all components of the Dalton family worldwide. Only by aggregating all our data together can we create the best opportunity for individual Daltons to identify their true ‘genetic family’ and thus uncover their Dalton ancestral origins.

Everyone who passes their DNA results to the DIDP is eligible to receive a copy of the full report, which then allows them to interpret their genetic history and identify the most potentially profitable areas for their future family history research. We do ask that those who receive the report are current members of the DGS. The subscription contributes towards the cost of retaining our consultant and, of course, brings many other benefits as well.

The goal of DIDP is to test all of the contending theories about the multiple origins of Dalton families around the world, and to reach consensus about the most likely hypotheses to explain both the DNA results and our current documentary research.

This report summarises the position of the Dalton International DNA Project at the beginning of 2009, cross-referencing the DNA results of 126 Dalton men with information about their documented origins.

Of the 104 DIDP men tested, 19 live in the emigrant-donating countries of the ‘Old World’ (the UK & Ireland) while 85 are descendants of Dalton emigrants and live outside of the UK and Ireland.

By country of residence, 76 reside in the USA, 17 in the UK, 6 in Australia, 2 each in Ireland and Canada, plus one in New Zealand.
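The residence figures can be cross-checked with a short tally. The sketch below uses only the counts quoted in the text; the variable names are illustrative.

```python
# Country-of-residence counts for the 104 DIDP men, as quoted in the text.
residence = {
    "USA": 76,
    "UK": 17,
    "Australia": 6,
    "Ireland": 2,
    "Canada": 2,
    "New Zealand": 1,
}

# The 'Old World' is taken here to mean the UK and Ireland, as in the text.
old_world = residence["UK"] + residence["Ireland"]
total = sum(residence.values())
elsewhere = total - old_world

print(total, old_world, elsewhere)  # 104 19 85
```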

Key DIDP Conclusions

As the project stands in 2009 the key conclusions are:

  1. The bearers of the Dalton surname worldwide have multiple genetic origins, i.e. the surname appears to have been acquired by individual ancestors independently of each other at different times and in different places.

  2. The project’s results are currently aggregated into thirteen distinct ‘genetic families’ each of which contains identical or highly similar DNA results.

  3. Six of these genetic families trace back to origins in mainland UK, four in Ireland, and three within the USA.

    We expect that all currently US-origin genetic families almost certainly have their ultimate documented origin in either Ireland or the UK, though they may more immediately be documented to ancestors living in the USA. It is quite possible that not only will some of the singletons turn out to be documented as part of existing genetic families, but also that some of the genetic families, when documented into a single tree, will turn out to be linked to another genetic family. Without detailed documentary research it is hard, purely from the DNA results alone, to predict quite how often this kind of linkage will occur.

  4. 21 results (“singletons”) have so far not been linked to any other individual in the project. This percentage of non-matching results is low for a major surname DNA project.

  5. The spelling variations of Dolton, Daulton and D’Alton are found within recognizable Dalton genetic families, not as separate trees with distinct surnames and DNA signatures.

  6. The modal ‘genetic family’ -- Genetic Family A -- has now grown to include 44 results, only one of which claims an origin outside of the USA. The modal result identifies the DNA signature that is so far the most frequently found among the men coming forward to take the DNA test. Even though it is the most ‘popular’ result, that does not prove that this DNA signature represents (a) the oldest or original Dalton tree, or (b) the Dalton tree with the most descendants alive today, or (c) that it is the most populous Dalton tree looking at the entire history of the surname. Only detailed documentary research can confirm whether any of the three claims above are associated with the modal result or not.

  7. Genetic Family A is a textbook example of a genetic bottleneck. This phenomenon occurs when a gene pool is restricted in some way, most usually as the result of emigration (when only a sub-set of the total gene pool of a country emigrates and successfully reproduces overseas) or catastrophe (when only a sub-set of the original gene pool survives). Broadly speaking, one would expect that the American Dalton gene pool is less genetically diverse than the original Dalton gene pool in the British Isles, and that the American Dalton gene pool will reveal a strong modal result (i.e. one family has reproduced more efficiently in the USA since its arrival than all the other Dalton families). The astonishing growth of Genetic Family A in the Americas during the past three or four centuries deserves to be cited as an example for other surname groups of the huge differences between the genetic profile of groups of DNA testees sampled in the British Isles and in emigrant-rich countries outside of it.

Characteristics of the Dalton Genetic Families

The 13 genetic families identified to date in the DIDP show the following key characteristics:

  1. Ten belong in haplogroup R1b, the most common haplogroup in Europe and identified with the earliest post-Ice Age expansion to the edge of western Europe.

  2. The two largest genetic families account for half of all DIDP participants.

  3. Based purely upon geographical similarities and the frequency of DNA results found so far, it appears that several genetic families may turn out to be sub-groups of other, older and larger, genetic families:

| Genetic Family | No of tests | Research Coordinator | Haplogroup (tested) | Geographical Origin |
|---|---|---|---|---|
| A | 44 | Karen Dalton Preston | R1b1b2a1b5 | USA (Virginia, North Carolina, South Carolina, Tennessee) |
| B | 10 | Wendy Fleming | R1b1b2 | Ireland (Meath, Westmeath, Leitrim, Limerick) |
| C | 5 | Michael Neale Dalton | R1b1b2 | Wales (Carmarthenshire) |
| D | 19 | Karen Dalton Preston | R1b1b2 | Ireland (Tipperary, Clare, Kilkenny, Limerick, Waterford) |
| E | 2 | Millicent Craig | R1b1b2 | England (Lancashire) |
| F | 4 | to be appointed | R1b1b2 | England (Kent) |
| G | 4 | to be appointed | R1b1b2 | USA (Virginia) |
| H | 2 | to be appointed | R1b1b2 | England (Berkshire, Surrey) |
| I | 2 | Gerry Dalton | R1b1b2a1b5 | Ireland (Dublin) |
| J | 3 | to be appointed | R1b1b2 | Ireland (Waterford) |
| X | 3 | to be appointed | I2b1 | USA (Virginia) |
| Y | 2 | John Dalton | I | England (Lancashire) |
| Z | 5 | Howard John Dalton | I1 | England (Yorkshire, Buckinghamshire, Surrey) |
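As a quick check on the statement that the two largest genetic families account for half of all participants, the test counts in the table can be summed. This is a sketch using only figures from the report; the 21 singletons are added to the family totals to reach the 126 results.

```python
# Number of tests per genetic family, copied from the table above.
tests = {"A": 44, "B": 10, "C": 5, "D": 19, "E": 2, "F": 4, "G": 4,
         "H": 2, "I": 2, "J": 3, "X": 3, "Y": 2, "Z": 5}
singletons = 21  # results not yet linked to any genetic family

in_families = sum(tests.values())
total = in_families + singletons

# Share of all results held by the two largest genetic families.
two_largest = sorted(tests.values(), reverse=True)[:2]
share = sum(two_largest) / total

print(in_families, total, share)  # 105 126 0.5
```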

 

Geographical Locations of Dalton Genetic Families

The Dalton DNA Project has identified the counties or states of origin of all of the men taking part. The table below identifies the genetic families linked to the different geographical areas given by project participants as the origin of their family tree, where known. This allows a Dalton man joining our DNA project, who has already traced his family tree back to a specific country and county or state of origin, to check whether the genetic family predicted by the results below turns out to be correct.

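| Country | State/County Of Tree Origin | Genetic Families | No of Singletons |
|---|---|---|---|
| UK | Carmarthenshire | C | - |
| | Yorkshire, Buckinghamshire | Z | 1 |
| | Lancashire | E, Y | - |
| | Surrey | H, Z | - |
| | Berkshire | H | - |
| | Kent | F | - |
| | London | F | 1 |
| | Suffolk | J | - |
| | Hampshire | B | - |
| | Cumberland | - | 1 |
| | Norfolk | - | 1 |
| Ireland | Limerick | B, D | - |
| | Westmeath, Meath, Leitrim | B | - |
| | Clare, Kilkenny | D | - |
| | Tipperary | D | 1 |
| | Waterford | D, J | - |
| | Dublin | D, I | - |
| | Kerry | - | 1 |
| USA | Virginia | A, G, X | 1 |
| | Tennessee, North Carolina, South Carolina | A | - |
| | New Hampshire | J | 1 |
| | Florida | - | 1 |
| | Wyoming | - | 1 |
| | New York | - | 1 |
| | Pennsylvania | - | 1 |
| | Georgia | - | 1 |
| | Maryland | - | 1 |
| Canada | Newfoundland | - | 1 |
| Australia | New South Wales | F | - |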

This foreword to the report by Michael Dalton gives a more detailed history of the project and its growth from the launch in 2003 through to the present day.

It again gives me great pleasure, as Chairman of the Dalton Genealogical Society, to write the foreword to the Dalton International DNA Project Progress Report – Issue 3. The first issue of the report was published in November 2006 and the second in December 2007. Considerable further progress has been made since then. Again the report has been prepared for the Society by the consultant to the project, Chris Pomery, and we continue to be indebted to Chris for his diligence and thorough approach.

The Dalton International DNA Project (DIDP) was officially launched in May 2003 at the DGS Gathering held in South Wales. At that time, the idea of DNA single surname projects was quite new, particularly in the UK, and ours was among the very early projects to be established. Six years on, and with around 130 sets of results in our Y chromosome database, the project is still at the forefront and it is our intention to remain on the leading edge as genetic genealogy becomes an ever more powerful and popular component in the family historian’s toolkit.

The birth of the Dalton project owes its conception to the then DGS American Secretary, Millicent Craig. It was Millicent who had started taking an interest in papers that were written on the subject as long ago as 1999, and the very early studies which commenced in 2000. The Society was made aware of Millicent’s thoughts at the DGS Gathering in Cambridge in 2001 and, after due consideration by the committee and further discussions during 2002, the first testees were identified by early 2003. At the launch date, about 15 DGS members had committed to submit their swabs for analysis and the project was under way.

Slowly, as the results have become available, a number of family groupings have emerged and regular reports have been made to DGS members in the Journal, to the wider audience of “Daltons in History” on our website and more recently on the DIDP pages of the website. By the time the Society met in Dublin for the Ireland Gathering in July 2005, about 65 DGS members were participating in the project and it was clear that we needed help with interpreting the results. A sub-committee was formed to take matters forward. Inspired by the talk given by Patrick Guinness in Dublin, it was decided that I should approach Patrick for advice on whom we should appoint as a consultant to assist us. Patrick felt that Chris Pomery, already known to me and with an established reputation in the field of genetic genealogy, would be the right person and so, towards the end of 2005, Chris was officially appointed with a brief to analyse our results to date and present them and his conclusions back to the Society.

Chris was among the early DNA pioneers, establishing a DNA project for his own Pomeroy family back in 2000, when 12-marker tests were the norm. Since then he has published two very informative books on DNA and family history; he lectures on the subject extensively and he has become a recognised authority, with a particular skill for bridging that gap between the raw results and something meaningful to family historians – he picks up from where Family Tree DNA, our testing company, leaves off. This is in no sense a criticism of FTDNA – with over 165,000 individual Y chromosome results, and well over 5,000 surname projects to service, they can hardly be expected to work to the depth that Chris is able to for us.

The DGS has worked closely with Chris since 2006 and this third report is a direct result of that collaboration. Of course, it is a work in progress and a snapshot at a point in time. By the very nature of what we are doing, tests will be expanded to more markers and new testees will join our project. The genetic technology will become more sophisticated and offer more detailed outputs at ever reducing cost. What Chris has done is to establish a framework for maintaining our expanding matrix of DNA results. Our data is held in a Microsoft Access database, which links the genetic data from the test result with basic genealogical data supplied by us about the origins of the tree from which a descendant has been tested. Combining the two enables us to extend our knowledge of Dalton family history, and it will continue to do so as our historical research progresses and more Daltons take the DNA test.

All this has taken time and we are immensely grateful to Chris for providing the Society with such a rigorous and robust approach. Chris invited me to participate in a Guild of One Name Studies DNA Seminar held in Nottingham in May 2007, and I presented an overview of DIDP to 80 family historians eager to find out more about how DNA might help their research. Chris addressed the DGS at our Gathering in Worcester in July 2007 and his talk was very well received by all the delegates. More recently, I was able to make a presentation of this latest report at the DGS Gathering in Orange, New South Wales, Australia in March 2009.

Recognising the value of Chris’s contribution, the Society has invited him to continue as our consultant. He will prepare further issues of this report, taking into account all the new data that emerges. Increasingly the focus of interest is on the individual genetic family and we are now looking at ways of providing more regular updates to these groups. It is likely that the reports to these groups will be more frequent and the publication of the full report therefore correspondingly less frequent. Many of our genetic family groups now have active co-ordinators and these are identified in the report. We have also set up individual websites for two groups (B and D), and I hope that more will follow.

In parallel with the above, we will continue to publish interim updates of a more general nature on the website and to maintain contact with all the participants in the project, ensuring that they are made aware of new developments and discoveries as they happen.

An innovation in this Issue 3 of the report is a new table for each genetic family showing the more likely relations for each participant. These tables are provided as a suggested starting point on where to focus initial research. They are not meant to be completely predictive.

It is our firm intention to maintain the Dalton International DNA Project as one that is in the forefront internationally, and we continue to look forward with optimism.

Michael Neale Dalton

Chairman and Honorary Life President of The Dalton Genealogical Society

An informative extract from DIDP Progress Report Issue 3 written by Chris Pomery, our DNA Consultant. This explains how the DNA process works as a tool to assist the family historian.

Markers & Resolutions

The Y-chromosome DNA test used in surname DNA projects measures the genetic structure of the Y-chromosome at pre-agreed places known as markers. The resolution of a DNA test result varies from low (4-12 markers), to medium (25 markers) to high (37-67 markers) depending on the type of test purchased. In general, the higher the resolution of the test the more confident any comparisons made using its result will be. The result for each marker is usually expressed as a whole number, though there are a few markers where the result is expressed as either a pair or as a sequence of numbers. Each DNA test result, whatever its resolution, is described as the ‘DNA signature’ of the individual who has been tested.

The genetic structure of the Y chromosome at each of these markers mutates occasionally when the DNA is passed from father to son. The exact rate of mutation for each marker is not yet known. Some markers are known to mutate relatively more often than others; these are referred to as ‘fast-mutating markers’.

The range of values found for each standard marker in a Y-chromosome test tends to vary in the general population according to a normal distribution around a modal result: i.e. if the modal (or most frequently found) result for marker A is 13, there will be somewhat fewer results found with the values 12 and 14, fewer again with 11 or 15, only a very few with the values 10 or 16, and so on. This pattern represents a distribution of possible results that has been created over many, many generations. Each individual’s result may have mutated up and/or down more than once during the transmission from one generation to another, and some mutations may have been double- or triple-step mutations (e.g. from 13 to 15 or 16) rather than single-step changes (e.g. from 13 to 14). Thus when two participants have the same value at the same marker, it does not ‘prove’ that all of their ancestors within their family tree had the same value, though in practice this is more likely than not.
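The way repeated father-to-son transmission spreads results around a modal value can be illustrated with a small simulation. This is only a sketch: the mutation rate, the number of generations and the number of lineages are invented for illustration, and only single-step mutations are modelled.

```python
import random
from collections import Counter

def simulate_marker(lineages=2000, generations=100, rate=0.004,
                    start=13, seed=1):
    """Simulate one marker passed from father to son down many lineages.

    Each generation the value changes by one step (up or down) with
    probability `rate`. All numbers here are illustrative assumptions.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(lineages):
        value = start
        for _ in range(generations):
            if rng.random() < rate:
                value += rng.choice((-1, 1))
        counts[value] += 1
    return counts

counts = simulate_marker()
for value in sorted(counts):
    print(value, counts[value])
```

Under these assumptions most lineages keep the starting value of 13, with progressively fewer results at each step away from it.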

Haplogroups

In addition to the set of standard markers that make up a person’s DNA signature there is another set of markers that are used to define the haplogroup associated with each test result. Haplogroups were first defined by population geneticists who were working out how humankind had populated the planet through successive migrations out of Africa. The useful feature about haplogroup markers is that each of them is thought to have mutated only once in a single human individual at a specific point in time. By plotting the present-day frequencies of all of these haplogroup markers, it is possible to infer the geographical path of the migrating population over time that produced the distribution found today. Many thousands of haplogroup markers have been identified and more are being found every year.

The results of the haplogroup markers are not given in the same format as the markers in a standard Y-chromosome test result. Instead, the results of a set of haplogroup markers are expressed as the label of the haplogroup itself. For example, the haplogroup labelled R1b1b2a1b5 is derived from the results of 15 separate haplogroup markers.

The label R1b1b2a1b5 is a sub-set of the global ‘R’ haplogroup which is in turn subdivided into 1a, 1b, 1c, etc. One confusing feature about haplogroups is that this classification system is still in a state of flux; indeed it was changed wholesale during 2008 as a new set of markers was agreed internationally, and those changes are incorporated in this third edition of the report. For this reason, very great care has to be taken when recording and using haplogroup results. This is doubly so because while DNA testing companies such as FTDNA helpfully offer their estimate of the likely haplogroup for each participant, this estimate can change if subsequent haplogroup test results revise the knowledge on which those earlier estimates were based.

Creating ‘Genetic Families’

The process of analysing Y-chromosome DNA results consists of comparing the string of numbers in each DNA signature and clustering together those participants whose results are broadly similar. Distinct clusters of participants with identical or near-identical DNA results are defined as ‘genetic families’. The comparison process is not concerned with the actual numerical value on a specific marker, i.e. it does not matter whether a participant has the value, say, of 11 or 13 on marker A; what does matter is that two men have the result 13 while another two men have the result 11.

The purpose of the clustering process is to bring together all the men whose DNA results suggest that they share, or may share, a single common male ancestor within a timeframe that makes sense to a genealogist, i.e. the last 700-1,000 years.

The process of clustering DNA results takes place in several stages.

The first stage is to group all the participants by haplogroup. This is because men belonging to haplogroup R could only share a common ancestor with someone who is haplogroup Q some tens of thousands of years ago, well outside the range of genealogical research. The same is true of men defined as haplogroup R1a as compared to, say, members of haplogroup R1b. All members of a bona fide genetic family by definition will share the same haplogroup.

With the haplogroup-based clusters established, the second stage of the analysis process is to compare the DNA signatures of all the participants in each haplogroup using the most reliable markers. The most reliable markers are those that are the slowest to mutate.

The third stage is to use the remaining markers, generally described as fast-mutating, to further sub-divide the genetic families that became visible during the second stage.

The fourth stage is to use those markers which express their result as a pair of numbers or as a sequence. There are specific issues relating to these markers which make them best suited to be used to further define existing clusters rather than to create them.

The resulting genetic families are broad clusters of identical and near identical DNA results. The main conclusion one can draw from an individual’s inclusion in a specific genetic family is that they are highly likely to share a common direct male-line ancestor with other members of that genetic family. Put another way, if they are able to research their family tree with perfect accuracy they should be able, eventually, to document the links that tie them to everyone else within the same genetic family and to end up looking at one big family tree.
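The staged clustering described above can be sketched in a few lines. This is a simplified illustration, not the consultant’s actual method: the separate passes over slow-mutating, fast-mutating and multi-value markers are collapsed into a single step-difference distance, and the `max_distance` threshold and the sample data are hypothetical.

```python
from collections import defaultdict

def distance(a, b):
    """Sum of step differences across markers (a simple genetic distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cluster(results, max_distance=2):
    """Group participants into candidate genetic families.

    results: dict of name -> (haplogroup, tuple of marker values).
    Stage 1 groups by haplogroup; within each group, anyone within
    `max_distance` of an existing cluster member joins that cluster.
    The threshold is a hypothetical value chosen for illustration.
    """
    by_haplogroup = defaultdict(list)
    for name, (haplogroup, _) in results.items():
        by_haplogroup[haplogroup].append(name)

    families = []
    for names in by_haplogroup.values():
        clusters = []
        for name in names:
            markers = results[name][1]
            near = [c for c in clusters
                    if any(distance(markers, results[m][1]) <= max_distance
                           for m in c)]
            merged = [name]
            for c in near:          # join (and merge) every nearby cluster
                merged.extend(c)
                clusters.remove(c)
            clusters.append(merged)
        families.extend(clusters)
    return families

# Invented example data: these are not real project results.
results = {
    "P1": ("R1b", (13, 24, 14, 11)),
    "P2": ("R1b", (13, 24, 14, 11)),
    "P3": ("R1b", (13, 25, 14, 11)),
    "P4": ("R1b", (14, 22, 15, 13)),
    "P5": ("I1",  (13, 24, 14, 11)),
}
for family in cluster(results):
    print(sorted(family))
```

In this invented data P5 has the same marker values as P1 but a different haplogroup, so the first stage keeps them in separate clusters.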

Most genetic families are broadly structured the same way. Each tends to have a cluster of results around the modal value for that genetic family plus a few other participants that have results that show different values on a few markers but whose DNA signatures are close enough to the DNA signature for other members of the genetic family that they can be considered to be members of it.

It can be difficult sometimes to be confident that some individuals belong to a particular genetic family or not, particularly if the results being compared are at markedly different resolutions. For example, a 12-marker result may well be identical to a 67-marker result on the twelve markers that have been tested, but it does not follow that the remaining 55 markers will also be the same if they were to be tested. Further testing to a higher resolution sometimes causes the outlying results within a genetic family to end up looking even more different from the DNA signature associated with it, and in some cases that means that a participant can no longer remain classified as part of that genetic family. The main reason why participants with low resolution results are asked to re-test at a higher resolution is to remove this uncertainty. However, in many cases, especially so where individuals trace their line back to the same geographical region as others in the genetic family, one can reasonably infer that the low resolution match will remain secure at a higher resolution without necessarily testing it again.
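The limits of comparing results at different resolutions can be made concrete with a small sketch. The marker names `M1` to `M67` and all values are placeholders invented for the example.

```python
def compare_on_shared(a, b):
    """Compare two results on the markers both have tested.

    a, b: dicts of marker name -> value. Returns (shared, mismatches,
    untested), where `untested` counts markers only one man has tested.
    """
    shared = sorted(set(a) & set(b))
    mismatches = [m for m in shared if a[m] != b[m]]
    untested = len(set(a) ^ set(b))
    return len(shared), mismatches, untested

# Invented values; the marker names M1..M67 are placeholders.
high = {f"M{i}": 13 for i in range(1, 68)}   # a 67-marker result
low = {f"M{i}": 13 for i in range(1, 13)}    # a 12-marker result

print(compare_on_shared(low, high))  # (12, [], 55)
```

The two results agree on all twelve shared markers, but the comparison says nothing about the 55 markers that only one man has tested.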

With the DNA-based clusters identified, the fifth stage is to link the historical data that has been researched for each individual with their DNA result. Each participant’s family tree has to be researched as carefully as possible in order to identify a point of origin for their personal family tree. When the geographical origin data is cross referenced with the DNA results it quickly becomes clear whether any pattern is visible when comparing the geographical origins of the members of each genetic family.

Interpreting The Data

Only when the genetic families are established and the historical data is associated with each result can the interpretation of the project’s results begin. This is a subjective process that one can try to make partially objective by using a calculation to estimate the number of generations to the common male ancestor for two given participants. This calculation sets out to define the Time to the Most Recent Common Ancestor (TMRCA). However, the TMRCA calculation is only expressed in terms of the probability (ranging from 50% to 95%) that the answer given is the average of the range of possibilities consistent with the data.

While testing labs have access to some mutation rate data, they have not published it. Numerical calculations of the relatedness of two DNA test results therefore use an average rate; there is much debate among geneticists about what that average rate should be. The key point is that numerical calculations like the TMRCA appear to give a degree of certainty that they do not possess; they should be thought of only as an approximate guide.

In particular, a small change in the average rate of mutation used in the calculation would produce major differences in the results. The danger in using it is that we might, at a future date, have to throw all previous calculations away because the premises we used have been revised. So any TMRCA calculation cannot be expected to ‘prove’ a specific connection or to pinpoint in which particular generation, or generations, the common Dalton ancestor lived. The best that it could do is to confirm that the timeframe for such a connection is broadly feasible with at least one estimated timeframe using currently accepted variables.
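The sensitivity to the assumed mutation rate can be seen with a deliberately crude estimator. This is not the calculation used in the report or by any testing company: it simply assumes that mismatches accumulate at an average rate per marker per generation on both lines of descent, and the two rates shown are illustrative values only.

```python
def estimate_generations(mismatches, markers, rate):
    """Rough generations back to the common ancestor.

    Assumes each marker mutates with probability `rate` per generation on
    both lines of descent, so expected mismatches = 2 * markers * rate * t.
    This is a deliberate simplification, not the report's calculation.
    """
    return mismatches / (2 * markers * rate)

# Two men who differ on 2 of 37 markers, under two assumed average rates.
for rate in (0.002, 0.004):
    print(rate, round(estimate_generations(2, 37, rate), 1))
```

Doubling the assumed rate halves the estimated number of generations, which is why such figures are best read as a broad timeframe.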

At this point it is sometimes possible to demonstrate a migration-type progression within a genetic family. For example, all the participants with a common DNA signature who trace themselves back to location A may share a mutational variation with several participants who trace themselves back to location B. It is then sometimes possible to infer that their combined tree’s ancestors were earlier found in location B before some members moved to location A.

This can be shown graphically in a cladogram, a chart that shows how a group of DNA results link together as a network. Each node represents a specific DNA signature; where more than one person is found with the same DNA signature the node will be proportionally bigger (i.e. the bigger the circle the more individuals tested had that result). The network helps individuals to work out whom they are most likely to be related to, i.e. who is closest to them in their shared family tree. Your closest relation is very likely to be the Dalton whose DNA result is closest to yours within the network. Conversely, those Daltons who are furthest away from you on the network diagram are unlikely to be your closest relations among all the Daltons in the network.

Cladograms are useful to target your closest relations within your genetic family. One drawback inherent in the charting process is that all the input data must use the same number of markers. A high resolution network at 37 or 67 markers is a very useful chart; however, if half of the members of the genetic family have only been tested at 12 or 25 markers then their results cannot be placed within that chart. As a consequence most cladograms are produced only at lower resolutions as it is generally useful to have a general picture that includes all results rather than a detailed picture that excludes most of them.
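The two ideas behind the network chart, node size and closeness, can be sketched without drawing anything. The signatures below are invented and all use the same four markers, reflecting the requirement that every result in a chart be at the same resolution; the step-difference distance is an assumed measure of closeness.

```python
from collections import Counter

def distance(a, b):
    """Sum of step differences between two DNA signatures."""
    return sum(abs(x - y) for x, y in zip(a, b))

def nearest(name, signatures):
    """Return the other participants closest to `name`, with the distance."""
    others = {n: distance(signatures[name], s)
              for n, s in signatures.items() if n != name}
    best = min(others.values())
    return sorted(n for n, d in others.items() if d == best), best

# Invented signatures, all at the same resolution (four markers).
signatures = {
    "P1": (13, 24, 14, 11),
    "P2": (13, 24, 14, 11),
    "P3": (13, 25, 14, 11),
    "P4": (13, 26, 14, 12),
}

# Node size: how many participants share each distinct signature.
node_sizes = Counter(signatures.values())
print(node_sizes[(13, 24, 14, 11)])   # 2
print(nearest("P4", signatures))      # (['P3'], 2)
```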

Other assumptions at work in the interpretation phase are that:

Singleton Results

In any surname project there will be a significant number of participants who find that they are not linked to any genetic families, i.e. that theirs is a singleton result, a DNA signature that is unique within the study. There are two main reasons why this might happen.

A common reason is that the project simply has not yet tested enough people to find another participant with a matching DNA result. For a surname like Dalton, which has many thousands of name-bearers alive today around the world, at a guess several hundred individuals would need to be tested before all the main genetic families were clearly identifiable. The Dalton project is well advanced compared to most surname-based DNA projects around the world, but it will need to continue expanding the number of participants who trace their origins back to Ireland or the UK in order to uncover all of the genetic families associated with the surname.

The second reason why a participant may appear to have a singleton result may lie somewhere within the history of their family tree. For example, a Dalton ancestor may have acquired the surname at a relatively recent time and for a specific reason, e.g. as a condition of an inheritance that required a man with a different surname to become a Dalton. There are, of course, other reasons outside any legal process why the DNA in a Dalton family may differ from that of the Dalton genetic families, for example where a non-Dalton man has fathered a boy by a Dalton woman and the child has grown up as a Dalton. It is often hard to pinpoint exactly when such an event, which breaks the link between the expected DNA signature and the surname, happened. Experience from other surname projects suggests that between a quarter and a half of all DNA results might turn out to be singletons. At present the figure in the Dalton project is much lower than these estimates. In any case, it should always be remembered that an individual’s history is his family tree, not his DNA result alone.

Modal Results

Results from the Dalton project also highlight the issues of genetic diversity and how to interpret the modal DNA result. One-third of all Dalton project members have been clustered into ‘Genetic Family A’, and all bar one of them can currently only trace their family tree back within the United States, i.e. not back to the UK or Ireland. It might seem natural to assume that the most commonly found DNA signature must be the one associated with the oldest family tree, i.e. the original Dalton family. This, though, does not follow; an old family tree may have a great many modern-day descendants, or it may have very few indeed. The best that one can infer from the modal result is that the tree those participants belong to has a great many modern-day descendants.

The interesting feature of the Dalton project is that the modal result has been found only once outside the USA. Given that it is so common among American participants, I would have expected to find this DNA signature more often among the British or Irish-origin participants. I would expect more genetic diversity among British-origin Daltons than among American-origin Daltons for the simple reason that only a proportion of the British Daltons ever emigrated to the USA and had male descendants, a process of selection known as a genetic bottleneck. Either all of the Daltons in the UK and Ireland with this DNA signature have died out, or not enough British and Irish-origin Daltons have yet taken a DNA test for it to be identified within the project. It could even be that the head of Genetic Family A was illegitimate, so that no trace can be found in the Old World.
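As a minimal sketch of what the modal result means in practice, the most commonly found DNA signature in a set of results can be obtained by simple counting. The labels and three-marker signatures below are invented for illustration and are far shorter than real test results; the function name is likewise hypothetical.

```python
from collections import Counter
from typing import Dict, Tuple

# Hypothetical, shortened signatures, invented for illustration only.
SIGNATURES: Dict[str, Tuple[int, ...]] = {
    "A1": (13, 24, 14),
    "A2": (13, 24, 14),
    "A3": (13, 24, 14),
    "B1": (13, 25, 14),
    "C1": (12, 24, 15),
}


def modal_signature(
    signatures: Dict[str, Tuple[int, ...]]
) -> Tuple[Tuple[int, ...], int, float]:
    """Return the most common signature, its count, and its share of results."""
    counts = Counter(signatures.values())
    signature, count = counts.most_common(1)[0]
    return signature, count, count / len(signatures)


if __name__ == "__main__":
    signature, count, share = modal_signature(SIGNATURES)
    print(signature, count, share)
```

The count says only how many tested participants carry the signature. It says nothing about the age of the tree behind it, which is the caution made above.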

The Next Stage

The interpretation of DNA results in this kind of surname-wide analysis needs to be handled with caution. Some of the data – e.g. the haplogroup inferred by the testing company – are only estimates and are subject to periodic revision. Some of the premises used in the analysis phase are surely going to be revised at some point in the future – e.g. some ‘slow’ markers may be reclassified as fast-mutating – and new data will become available, notably the precise rates of mutation for each individual marker.

Some data that would be useful has simply not yet been gathered. It would be helpful to know how frequently each DNA signature is found in a given geographical population in Britain and Ireland. Rare signatures are more useful for the purposes of making comparisons and linkages, but even common DNA signatures will be rarer in some geographical locations than others. Very little is known about these distributions at present.

However, the main data that is subject to revision is not the DNA data at all but the descriptions of our participants’ family origins. With the members of each genetic family identified, the greatest challenge now is to document their individual connections with each other and to recreate their shared family trees. During this process it is common for many people to change the assumptions they previously held about their origins and to correct their research. For all of these many reasons the set of conclusions outlined in this report must be thought of as part of a work in progress, albeit one creating stronger hypotheses year by year.

DNA testing is a tool for the family historian, a tool to help guide the documentary research process by suggesting links among people or trees not presently known to be related, and to corroborate documentary research by confirming a shared genetic heritage. A DNA project advances by making hypotheses about DNA linkages and documented origins that are progressively refined through an iterative process of further DNA testing and documentary research. At the end of this process the family history that is created is the family tree as it is documented and the stories of the men and women in it, a history corroborated by their DNA results.

The Dalton project has now reached the stage where it should be feasible to link together into a single family tree many of the individuals grouped by the DNA results into each genetic family.