The Dalton International DNA Project (DIDP) was officially launched in May 2003 and continues to develop and expand. Ours was one of the early projects to be established, and as we celebrate the tenth anniversary of Dalton DNA testing we now have 181 participants and remain at the forefront of this new genealogical science. The anniversary is a fitting time to take stock of what we have achieved in the last decade, and to think ahead about how we want to take the project forward.

With this in mind, we are taking this opportunity to re-define DIDP, and to put the focus back onto genealogical research, the means for documenting our links together. To mark this shift, we have revised the structure of our reporting, and this is reflected in the layout of this report.

DNA tests remain a crucial tool within our genealogical research activities, simply because the test results are the only effective means of telling us precisely to which Daltons we are each most closely related. Instead of relying on a potentially inaccurate geographical connection, or a family story, the DNA results pinpoint exactly which Daltons we should be working with in order to research and document the Dalton family tree that we share together.

Changes have been introduced in this edition of the project report, which has two major sections. The first details those project members who have origins in the UK or Ireland as defined by their genealogy or through genetic association; and the second shows those members who do not yet have an identified connection to the UK or Ireland. The significance of this distinction is explained in the write ups for each Family Group in the report.

DIDP is run by Michael Neale Dalton and Karen Dalton Preston, working with other Family Group coordinators, on behalf of the Dalton Genealogical Society. Michael and Karen are also the administrators for our DNA testing programme, which is hosted at Family Tree DNA.

We are assisted by Chris Pomery, who has been engaged to analyse our results and advise us on how to develop the project. Chris established a DNA project for his own Pomeroy family back in 2000, has published two books on DNA and family history, and regularly lectures on the subject.

Our project is being moved forward by the coordinators for the four major Family Groups identified through our DNA results:

Michael, Karen, Melanie and Mike are the members of the DGS DIDP sub-committee, which reports to the main DGS committee and retains Chris as its consultant.

Our project also includes several smaller Genetic Families, some with coordinators and others still in need of one. If you are a member of one of the groups without a coordinator, we would welcome your help.

We would like to thank all of those who have come forward to be tested. It is your participation that helps our project to continue to grow, and with your help we hope to identify the ancient origins of all our Dalton ancestors.

Michael Neale Dalton and Karen Dalton Preston

Michael Neale Dalton, DIDP Coordinator, reviews the project at the end of 2013, with the publication of the latest DIDP project report.

Our DNA project continues to attract considerable interest, with a regular stream of enquiries about joining being received by myself and Karen Dalton Preston as administrators of the project. In September 2012 we published a DIDP Update which included an up-to-date list of genetic family coordinators and details of Chris Pomery's new report. All DGS members registered to access our new members-only website, Daltons in History 2.0, will find the full report posted there in November 2013.

In order to strengthen the management of DIDP, Melanie Crain, Mike Dalton, myself and Karen Preston, as the coordinators of Genetic Families A, B, C and D respectively, have formed a DGS DNA Project Sub-Committee, with Chris Pomery continuing to act as the project consultant. For the successful progression of the project, we think we now need:

The sub-committee will work to put plans in place to achieve the above and will monitor the development of the project more closely, with the remit of each of us extending outside the boundaries of our individual genetic family. It will be our aim to ensure that we cover the interests of, and the enquiries from, all participants in the project, and in the list of genetic family coordinators you will see that each of us already looks after additional groups in the project. Those relatively few participants without a coordinator are invited to contact either Karen or myself in the first instance, and we will involve others to assist as appropriate.

We are indebted to Chris Pomery for all his assistance with the project over the past seven years, which includes the preparation of three issues of the very comprehensive project progress report, and more recently a series of six reports covering individual genetic families. He has also given informative presentations at our annual gatherings on several occasions. We now have some 180 participants in the project, and well over 80% are members of one of the 15 identified genetic families.

Michael Neale Dalton, the DIDP Coordinator, reviews the project as it stood in September 2012, including an up-to-date list of genetic family coordinators and details of the report published in November 2013.

Between December 2010 and December 2011 we published six reports providing updates for each individual genetic family, as follows:

- Genetic Family A - the Virginia Daltons (December 2010)

- Genetic Family B - the Eireann Daltons (September 2011)

- Genetic Family C - the Carmarthenshire Daltons (September 2011)

- Genetic Family D - the Golden Vale Daltons (September 2011)

- Genetic Families E, F, G, H, J and K and R1b singletons (December 2011)

- Genetic Families Q, W, X, Y and Z and non-R1b singletons (December 2011)

All these reports have been made available to the members of the group or groups covered in each one. DGS/project members who wish to have copies of any reports that have not been circulated to them should contact either myself (michaelndalton@aol.com) or Karen Preston, Deputy DIDP Coordinator (karen@golden-hills.com).

There have been a number of innovations in this series of six reports and one of the most important of these is the inclusion of details of the oldest documented Dalton ancestor where known. Chris Pomery, our DNA consultant, has emphasised to us the importance of sharing this information as part of our quest to reconstruct and establish all our Dalton family trees, and to identify the links between them. Where this data is incomplete, we have asked project participants to supply details of their oldest documented Dalton ancestor if known, so that it can be recorded in the next update of the report.

We also have coordinators appointed for the larger genetic families (A, B, C & D) and for some of the smaller ones (E, G, H, W, Y & Z). Increasingly these coordinators are taking on responsibility for working with their groups to share data about the various known family trees and to identify further research to assist in the reconstruction of the trees. Here is the up-to-date list of genetic family coordinators:

| Genetic Family | Co-ordinator | Email Address |
|---|---|---|
| A | Melanie Crain | hcrain@nc.rr.com |
| B | Michael Francis Dalton | mfdalton@btconnect.com |
| C | Michael Neale Dalton | michaelndalton@aol.com |
| D | Karen Dalton Preston | karen@golden-hills.com |
| E | Millicent Craig | millicenty@aol.com |
| F | to be appointed | |
| G | Melanie Crain | hcrain@nc.rr.com |
| H | Michael Neale Dalton | michaelndalton@aol.com |
| J | to be appointed | |
| K | to be appointed | |
| Q | to be appointed | |
| W | Howard Dalton | howard.dalton@hotmail.co.uk |
| X | to be appointed | |
| Y | John Dalton | johndalton78@hotmail.com |
| Z | Howard John Dalton | h.dalton1@ntlworld.com |
| Singletons | Michael Neale Dalton | michaelndalton@aol.com |

As everyone who has heard Chris speak at one of our gatherings or elsewhere will know, the DNA results are only the beginning of a sometimes long journey to piece together the complete family history, and there is always plenty of further more traditional research work to be done. He emphasised this point again when he joined us at our Yorkshire Gathering in July 2012.

Chris has advised that in order to focus more on the geographical origins of the earliest known ancestors of each DIDP participant, he will now produce two reports – one for those with known UK/Ireland origins, and the other for those with North American origins, or origins anywhere else outside UK/Ireland. This will bring much more focus on researching the origins of each tree, and the traditional research that we want to encourage all participants to do. Our plan is that these two reports will be completed during 2013.

Of course, we still need to expand our database of DNA data, both by extending those tests still at 12 or 25 markers to at least 37, and by encouraging known male Dalton descendants of UK Dalton families to join the project. This process has started with several new recruits to the project and we will be contacting others who fall into this category and inviting them to participate.

Chris summarises our current position as follows:

The DIDP DNA project is going from strength to strength year on year. Our goal this year is to recruit more Ireland and UK-based Daltons in order to build up a comprehensive and more accurate genetic picture of British and Irish Dalton trees. With this picture in place, we can then accurately assist any Dalton living outside the UK to identify the correct tree they should be researching their link with. Our focus is gradually shifting towards checking the documentation of the trees which project members have submitted, and creating trees for those that haven't. This is a long-term task that we cannot avoid working on in order to develop the project, but each and every one will greatly contribute to clarifying the overall picture and answering the big question: 'how many Dalton families are there?'

With these thoughts in mind, we invite all project members to review the details of their oldest documented Dalton ancestor as recorded in the published reports referred to above and advise their coordinator of any updates to this information.

As always, if you have any questions, need help with interpreting the reports or wish to join the project, do please contact either myself (michaelndalton@aol.com) or Karen (karen@golden-hills.com). We will do our best to assist you, and if we are unable to help, we will refer your query to Chris Pomery. We look forward to these next steps for DIDP with anticipation!

Michael Neale Dalton, DIDP Coordinator, reviews the project as at December 2011 and looks ahead to the proposed future developments.

As always, we are indebted to our DNA consultant, Chris Pomery, for all his assistance with the project over the past six years, which includes the preparation of three issues of the very comprehensive project progress report, and most recently a series of six reports covering individual genetic families. He has also given informative presentations at our annual gatherings on three occasions. We now have approaching 180 participants in the project, and well over 80% of these are members of one of the 15 identified genetic families.

Issue 3 of the full Project Progress Report was published in October 2009. Since then the emphasis has been on providing updated reports for each individual genetic family, and these have been published as follows:

- Genetic Family A - the Virginia Daltons (December 2010)

- Genetic Family B - the Eireann Daltons (September 2011)

- Genetic Family C - the Carmarthenshire Daltons (September 2011)

- Genetic Family D - the Golden Vale Daltons (September 2011)

- Genetic Families E, F, G, H, J and K and R1b singletons (December 2011)

- Genetic Families Q, W, X, Y and Z and non-R1b singletons (December 2011)

All these reports have been made available to the members of the group or groups covered in each one. It is generally a requirement that recipients of reports are paid-up DGS members; this enables us to reimburse Chris Pomery, as our DNA consultant, for the immense amount of work that he undertakes on an ongoing basis for the project.

The first report covers Genetic Family A, the Virginia Daltons, which is the largest genetic family in the project with 55 participants, and it has established a template for the further reports in the series. The structure of the report is as follows:

1. Genetic Origins of the Dalton Surname

2. The Dalton Genetic Families

Table 1: Research Coordinators of the Dalton Genetic Families

Table 2: Geographical Locations associated with the Dalton Genetic Families

3. Profile of the Virginia Daltons

Chart 1: A 37-marker Phylogenetic Chart for the Genetic Family

Table 3: Currently documented Geographical Origins & Oldest Ancestors

4. Next Steps

Appendix 1: A Quick Explanation of the DNA Matching & Analysis Process

Appendix 2: Table of Marker Differences within Genetic Family A

Michael Dalton, as coordinator of the project, explains the changes since the previous progress report and gives a summary of the Dalton Genetic Families identified by December 2010.

Introduction

The Dalton International DNA Project (DIDP) commenced in May 2003. Since then, three comprehensive DIDP Progress Reports have been published, the third of which appeared in October 2009. This took account of the results of tests for 126 participants. At that time 13 genetic families had been identified and 21 of the participants were classified as singletons, in other words they did not appear to be members of any of the identified genetic families.

General observations

The project has been expanded since October 2009 in two ways. 34 new participants have been added to the project and many tests have been upgraded to either 37 or 67 markers. We have also identified three new genetic families and merged one into another.

R1b1b2 Haplogroup Genetic Families

A substantial majority of the project participants are categorised in R1b1b2 haplogroup genetic families. A more detailed explanation of haplogroups and other aspects of genetic genealogy may be found in The DNA Process below. The R haplogroup families are as follows:

Genetic Family A – coordinator Karen Dalton Preston

This group now has 55 participants. 28 of these are at 67 markers and another 17 at 37 markers. 5 participants do not carry the Dalton surname or a variant. The group is traced back to the states of Virginia, North Carolina and Tennessee, USA, and the key issue remains to determine who the first immigrant(s) were in the 17th century and from where they came. The Niall link suggests possible Irish descent and further research continues to focus on this. Karen Preston, who coordinates this group, is working very closely with Melanie Crain from the Dalton America DNA Project on more detailed documentary research of the various lines within the group.

Genetic Family B – coordinator Michael Francis Dalton

There are now 16 participants in this group, 3 at 67 markers and 11 at 37 markers. Two participants do not carry the Dalton surname. The group originates from the counties of Meath, Westmeath, Leitrim and Limerick in Ireland and further work is being coordinated by new DGS committee member, Mike Dalton, who has recently taken this role over from Wendy Fleming.

Genetic Family C – coordinator Michael Neale Dalton

There are now 6 participants in this group, 5 at 67 and one at 37 markers. The family originates from Carmarthenshire, South Wales in the 17th century with a link back to Oxfordshire, and possible links back to Lancashire in the 13th century.

Genetic Family D – coordinator Karen Dalton Preston

With 21 participants, this is the second largest group in DIDP. There are 10 participants at 67 markers, and 8 at 37. One does not carry the Dalton surname. This is another Irish Dalton family which is distinct from Group B. Lines have been traced back to Tipperary, Clare, Kilkenny, Limerick and Waterford.

Genetic Family E – coordinator Millicent Craig

This group with origins in Lancashire, England has three participants, all at 37 markers.

Genetic Family F

There are now 4 participants, with one at 67 markers and two at 37. This group has English origins with London and Kent identified. Some singletons may be eligible for inclusion in this group in due course.

Genetic Family G

This group has 5 participants with two at 67 and two at 37 markers. Their Dalton origins are in Virginia, USA and it may be that there is a link through a non-paternal event with Group A.

Genetic Family H

This group has two participants with one at 67 markers. Lines have been traced back to Berkshire and Surrey, England. The result at 25 markers needs to be upgraded to establish a definitive DNA signature for this group.

Genetic Family J

This group has 3 participants with one at 37 markers. It is the family that emigrated from Suffolk, England and settled in Hampton, New Hampshire, USA in 1635. The two 25 marker results need to be upgraded to establish a definitive DNA signature for the group.

Genetic Family K

This group has two participants, one at 67 markers. The line is traced back to Newfoundland, Canada in the early 19th century. It is thought that the family may originate from Ireland.

R1b1b2 Haplogroup singletons – coordinator Michael Neale Dalton

There are 20 participants in this category with 4 at 67 markers and 11 at 37. A number have very well documented Dalton origins and it is hoped that, in due course, matches will be found for some with new DIDP participants, thus creating new genetic families.

Q Haplogroup Genetic Families

Genetic Family Q

This is a new genetic family with two members. Origins are traced back to England.

I Haplogroup Genetic Families

Genetic Family W

There are two participants, both tested at 37 markers. This group has been separated from Group Z and has Dalton origins in Yorkshire, England.

Genetic Family X

This group has 3 participants, two at 37 markers. The origins of the family are in Virginia, USA and this may be another group which links to Group A through a non-paternal event.

Genetic Family Y – coordinator John Dalton

This group has two participants, both at 67 markers, and a definitive DNA signature has been established for the group. The family originates from Oldham, Lancashire, England and has been well documented.

Genetic Family Z – coordinator Howard Dalton

This group has 3 participants and only one at 37 markers. Tests for the other two need to be upgraded for further conclusions to be drawn. The identified origins of the group are Yorkshire and Buckinghamshire, England.

Non-R1b Haplogroup singletons – coordinator Michael Neale Dalton

There are 11 participants in this category with 3 at 67 markers and 3 at 37. Again, a number have very well documented Dalton origins and it is hoped that, in due course, matches will be found for some with new DIDP participants, thus creating new genetic families.

Concluding note

The above information is necessarily brief. Anyone interested in further information about any particular genetic family group is invited to contact the appropriate group coordinator. For groups without a coordinator or for more general enquiries, please get in touch with one of the project administrators, Michael Dalton or Karen Preston, in the first instance.

Michael Dalton's foreword to DIDP Progress Report Issue 3 gives a detailed history of the project and its growth from the launch in 2003 through to 2009.

It again gives me great pleasure, as Chairman of the Dalton Genealogical Society, to write the foreword to the Dalton International DNA Project Progress Report – Issue 3. The first issue of the report was published in November 2006 and the second in December 2007. Considerable further progress has been made since then. Again the report has been prepared for the Society by the consultant to the project, Chris Pomery, and we continue to be indebted to Chris for his diligence and thorough approach.

The Dalton International DNA Project (DIDP) was officially launched in May 2003 at the DGS Gathering held in South Wales. At that time, the idea of DNA single surname projects was quite new, particularly in the UK, and ours was among the very early projects to be established. Six years on, and with around 130 sets of results in our Y chromosome database, the project is still at the forefront and it is our intention to remain on the leading edge as genetic genealogy becomes an ever more powerful and popular component in the family historian’s toolkit.

The Dalton project owes its conception to the then DGS American Secretary, Millicent Craig. It was Millicent who had started taking an interest in papers that were written on the subject as long ago as 1999, and in the very early studies which commenced in 2000. The Society was made aware of Millicent’s thoughts at the DGS Gathering in Cambridge in 2001 and, after due consideration by the committee and further discussions during 2002, the first testees were identified by early 2003. At the launch date, about 15 DGS members had committed to submit their swabs for analysis and the project was under way.

Slowly, as the results have become available, a number of family groupings have emerged and regular reports have been made to DGS members in the Journal, to the wider audience of “Daltons in History” on our website and more recently on the DIDP pages of the website. By the time the Society met in Dublin for the Ireland Gathering in July 2005, about 65 DGS members were participating in the project and it was clear that we needed help with interpreting the results. A sub-committee was formed to take matters forward. Inspired by the talk given by Patrick Guinness in Dublin, it was decided that I should approach Patrick for advice on who we should appoint as a consultant to assist us. Patrick felt that Chris Pomery, already known to me and with an established reputation in the field of genetic genealogy, would be the right person and so, towards the end of 2005, Chris was officially appointed with a brief to analyse our results to date and present them and his conclusions back to the Society. Chris was among the early DNA pioneers, establishing a DNA project for his own Pomeroy family back in 2000, when 12 marker tests were the norm. Since then he has published two very informative books on DNA and family history; he lectures on the subject extensively and he has become a recognised authority, with a particular skill for bridging that gap between the raw results and something meaningful to family historians – he picks up from where Family Tree DNA, our testing company, leaves off. This is in no sense a criticism of FTDNA – with over 165,000 individual Y chromosome results, and well over 5,000 surname projects to service, they can hardly be expected to work to the depth that Chris is able to for us.

The DGS has worked closely with Chris since 2006 and this third report is a direct result of that collaboration. Of course, it is a work in progress and a snapshot at a point in time. By the very nature of what we are doing, tests will be expanded to more markers and new testees will join our project. The genetic technology will become more sophisticated and offer more detailed outputs at ever reducing cost. What Chris has done is to establish a framework for maintaining our expanding matrix of DNA results. Our data is held in a Microsoft Access database, which links the genetic data from the test result with basic genealogical data supplied by us about the origins of the tree from which a descendant has been tested. Combining the two enables us to extend our knowledge of Dalton family history, and it will continue to do so as our historical research progresses and more Daltons take the DNA test.

All this has taken time and we are immensely grateful to Chris for providing the Society with such a rigorous and robust approach. Chris invited me to participate in a Guild of One Name Studies DNA Seminar held in Nottingham in May 2007, and I presented an overview of DIDP to 80 family historians eager to find out more about how DNA might help their research. Chris addressed the DGS at our Gathering in Worcester in July 2007 and his talk was very well received by all the delegates. More recently, I was able to make a presentation of this latest report at the DGS Gathering in Orange, New South Wales, Australia in March 2009.

Recognising the value of Chris’s contribution, the Society has invited him to continue as our consultant. He will prepare further issues of this report, taking into account all the new data that emerges. Increasingly the focus of interest is on the individual genetic family and we are now looking at ways of providing more regular updates to these groups. It is likely that the reports to these groups will be more frequent and the publication of the full report therefore correspondingly less frequent. Many of our genetic family groups now have active co-ordinators and these are identified in the report. We have also set up individual websites for two groups (B and D), and I hope that more will follow.

In parallel with the above, we will continue to publish interim updates of a more general nature on the website and to maintain contact with all the participants in the project, ensuring that they are made aware of new developments and discoveries as they happen.

An innovation in this Issue 3 of the report is a new table for each genetic family showing the more likely relations for each participant. These tables are provided as a suggested starting point on where to focus initial research. They are not meant to be completely predictive.

It is our firm intention to maintain the Dalton International DNA Project as one that is in the forefront internationally, and we continue to look forward with optimism.

Michael Neale Dalton

Chairman and Honorary Life President of The Dalton Genealogical Society

An informative extract from DIDP Progress Report Issue 3 written by Chris Pomery, our DNA Consultant. This explains how the DNA process works as a tool to assist the family historian.

Markers & Resolutions

The Y-chromosome DNA test used in surname DNA projects measures the genetic structure of the Y-chromosome at pre-agreed places known as markers. The resolution of a DNA test result varies from low (4-12 markers), to medium (25 markers) to high (37-67 markers) depending on the type of test purchased. In general, the higher the resolution of the test the more confident any comparisons made using its result will be. The result for each marker is usually expressed as a whole number, though there are a few markers where the result is expressed as either a pair or as a sequence of numbers. Each DNA test result, whatever its resolution, is described as the ‘DNA signature’ of the individual who has been tested.
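As a minimal sketch, the resolution bands described above can be expressed as a small lookup. The function name `resolution_tier` is hypothetical, and the treatment of marker counts that fall between the quoted bands (for example 13 to 24) is an assumption rather than something the report specifies.

```python
def resolution_tier(marker_count):
    """Classify a Y-chromosome test by the number of markers tested.

    Bands follow the text: low (4-12), medium (25), high (37-67).
    Counts between the quoted bands are assigned to the next band up,
    which is an assumption made for illustration only.
    """
    if marker_count < 1:
        raise ValueError("marker_count must be a positive integer")
    if marker_count <= 12:
        return "low"
    if marker_count <= 25:
        return "medium"
    return "high"


# The test sizes mentioned in the project reports.
for size in (12, 25, 37, 67):
    print(size, resolution_tier(size))
```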

The genetic structure of the Y chromosome at each of these markers mutates occasionally when the DNA is passed from father to son. The exact rate of mutation for each marker is not yet known. Some markers are known to mutate relatively more often than others; these are referred to as ‘fast-mutating markers’.

The range of values found for each standard marker in a Y-chromosome test tends to vary in the general population according to a normal distribution around a modal result: i.e. if the modal (most frequently found) result for marker A is 13, there will be somewhat fewer results found with the values 12 and 14, fewer again with 11 or 15, only a very few with the values 10 or 16, and so on. This pattern represents a distribution of possible results that has been created over many, many generations. Each individual’s result may have mutated up and/or down more than once during the transmission from one generation to another, and some mutations may have been double- or triple-step mutations (e.g. from 13 to 15 or 16) rather than single-step changes (e.g. from 13 to 14). Thus when two participants have the same value at the same marker, it does not ‘prove’ that all of their ancestors within their family tree had the same value, though in practice this is more likely than not.
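To make the idea of a modal result concrete, the sketch below finds the most frequent value at each marker position across a set of DNA signatures. The function name `modal_signature`, the sample values and the tie-breaking rule (the smallest value wins) are all assumptions made for illustration, not part of the project's method.

```python
from collections import Counter


def modal_signature(signatures):
    """Return the most frequently found value at each marker position.

    `signatures` is a list of equal-length lists of whole-number results.
    Ties are broken in favour of the smaller value (an arbitrary choice).
    """
    if not signatures:
        raise ValueError("at least one signature is required")
    length = len(signatures[0])
    if any(len(sig) != length for sig in signatures):
        raise ValueError("all signatures must cover the same markers")

    modal = []
    for position in range(length):
        counts = Counter(sig[position] for sig in signatures)
        # Highest count first; the smaller value wins a tie.
        best = min(counts, key=lambda value: (-counts[value], value))
        modal.append(best)
    return modal


# Made-up results for two markers across five men.
sample = [[13, 24], [13, 25], [12, 24], [14, 24], [13, 24]]
print(modal_signature(sample))
```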

Haplogroups

In addition to the set of standard markers that make up a person’s DNA signature there is another set of markers that are used to define the haplogroup associated with each test result. Haplogroups were first defined by population geneticists who were working out how humankind had populated the planet through successive migrations out of Africa. The useful feature about haplogroup markers is that each of them is thought to have mutated only once in a single human individual at a specific point in time. By plotting the present-day frequencies of all of these haplogroup markers, it is possible to infer the geographical path of the migrating population over time that produced the distribution found today. Many thousands of haplogroup markers have been identified and more are being found every year.

The results of the haplogroup markers are not given in the same format as the markers in a standard Y-chromosome test result. Instead, the results of a set of haplogroup markers are expressed as the label of the haplogroup itself. For example, the haplogroup labelled R1b1b2a1b5 is derived from the results of 15 separate haplogroup markers.

The label R1b1b2a1b5 denotes a sub-set of the global ‘R’ haplogroup, which is in turn subdivided into 1a, 1b, 1c, etc. One confusing feature about haplogroups is that this classification system is still in a state of flux; indeed it was changed wholesale during 2008 as a new set of markers was agreed internationally, and those changes are incorporated in this third edition of the report. For this reason, very great care has to be taken when recording and using haplogroup results. This is doubly so because, while DNA testing companies such as FTDNA helpfully offer their estimate of the likely haplogroup for each participant, this estimate can change if subsequent haplogroup test results revise the knowledge on which those earlier estimates were based.
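The nested structure of a label such as R1b1b2a1b5 can be illustrated by splitting it into its alternating letter and number parts. This is a sketch only: the helper names `haplogroup_lineage` and `is_subclade` are hypothetical, and the prefix logic assumes that both labels come from the same version of the classification, which, as noted above, has changed over time.

```python
import re


def haplogroup_lineage(label):
    """List every ancestor label implied by a hierarchical haplogroup label.

    The label is read as one capital letter followed by alternating
    numbers and lower-case letters, e.g. R -> R1 -> R1b -> R1b1 ...
    """
    parts = re.findall(r"[A-Z]|\d+|[a-z]", label)
    if not parts or "".join(parts) != label:
        raise ValueError("unexpected haplogroup label: %r" % label)
    return ["".join(parts[:i]) for i in range(1, len(parts) + 1)]


def is_subclade(label, ancestor):
    """True if `label` sits at or below `ancestor` in the same naming scheme."""
    return ancestor in haplogroup_lineage(label)


print(haplogroup_lineage("R1b1b2a1b5"))
print(is_subclade("R1b1b2a1b5", "R1b"))  # shares the R1b branch
print(is_subclade("R1a", "R1b"))         # different branches of R1
```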

Creating ‘Genetic Families’

The process of analysing Y-chromosome DNA results consists of comparing the string of numbers in each DNA signature and clustering together those participants whose results are broadly similar. Distinct clusters of participants with identical or near-identical DNA results are defined as ‘genetic families’. The comparison process is not concerned with the actual numerical value on a specific marker, i.e. it does not matter whether a participant has the value, say, of 11 or 13 on marker A; what does matter is that two men have the result 13 while another two men have the result 11.
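The comparison described above can be sketched as a simple count of the markers on which two signatures disagree. The function name `marker_differences` and the sample values are illustrative assumptions. Counting mismatches, rather than the size of each numerical step, mirrors the point that the actual values matter less than whether two men share them; it is a simplification and not necessarily the exact measure used in the reports.

```python
def marker_differences(signature_a, signature_b):
    """Count the marker positions on which two DNA signatures differ.

    Both signatures must list results for the same markers in the same order.
    """
    if len(signature_a) != len(signature_b):
        raise ValueError("signatures must cover the same markers")
    return sum(1 for a, b in zip(signature_a, signature_b) if a != b)


# Made-up five-marker signatures for three men.
first = [13, 24, 14, 11, 12]
second = [13, 24, 14, 11, 12]
third = [11, 24, 15, 11, 12]

print(marker_differences(first, second))  # identical signatures
print(marker_differences(first, third))   # differs on two markers
```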

The purpose of the clustering process is to bring together all the men whose DNA results suggest that they share, or may share, a single common male ancestor within a timeframe that makes sense to a genealogist, i.e. the last 700-1,000 years.

The process of clustering DNA results takes place in several stages.

The first stage is to group all the participants by haplogroup. This is because men belonging to haplogroup R could only share a common ancestor with someone who is haplogroup Q some tens of thousands of years ago, well outside the range of genealogical research. The same is true of men defined as haplogroup R1a as compared to, say, members of haplogroup R1b. All members of a bona fide genetic family by definition will share the same haplogroup.

With the haplogroup-based clusters established, the second stage of the analysis process is to compare the DNA signatures of all the participants in each haplogroup using the most reliable markers. The most reliable markers are those that are the slowest to mutate.

The third stage is to use the remaining markers, generally described as fast-mutating, to further sub-divide the genetic families that became visible during the second stage.

The fourth stage is to use those markers which express their result as a pair of numbers or as a sequence. There are specific issues relating to these markers which make them best suited to be used to further define existing clusters rather than to create them.
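The first stages of this process can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used to produce this report: the participant records, the marker names (`s1` to `s3`), the choice of which markers count as slow, and the `max_mismatches` threshold are all hypothetical.

```python
from collections import defaultdict


def mismatches(p, q, markers):
    """Count the listed markers on which two participants differ."""
    return sum(1 for m in markers if p["markers"][m] != q["markers"][m])


def group_by_haplogroup(participants):
    """Stage one: bucket participants by haplogroup."""
    groups = defaultdict(list)
    for p in participants:
        groups[p["haplogroup"]].append(p)
    return dict(groups)


def cluster(participants, markers, max_mismatches):
    """Later stages: link anyone within max_mismatches of a cluster member."""
    clusters = []
    for p in participants:
        untouched, merged = [], [p]
        for c in clusters:
            if any(mismatches(p, q, markers) <= max_mismatches for q in c):
                merged.extend(c)  # p joins (and may bridge) this cluster
            else:
                untouched.append(c)
        clusters = untouched + [merged]
    return clusters


# Hypothetical data: names, values and the list of slow markers are invented.
SLOW = ["s1", "s2", "s3"]
people = [
    {"id": "P1", "haplogroup": "R1b", "markers": {"s1": 13, "s2": 24, "s3": 14}},
    {"id": "P2", "haplogroup": "R1b", "markers": {"s1": 13, "s2": 24, "s3": 14}},
    {"id": "P3", "haplogroup": "R1b", "markers": {"s1": 11, "s2": 22, "s3": 15}},
    {"id": "P4", "haplogroup": "I", "markers": {"s1": 13, "s2": 24, "s3": 14}},
]

for haplogroup, members in group_by_haplogroup(people).items():
    for family in cluster(members, SLOW, max_mismatches=0):
        print(haplogroup, sorted(p["id"] for p in family))
```

In this invented data P4 has the same values as P1 and P2 on the slow markers but is never compared with them, because the haplogroup split comes first. The same `cluster` function could be called again on each resulting family with a list of fast-mutating markers to sub-divide it further.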

The resulting genetic families are broad clusters of identical and near identical DNA results. The main conclusion one can draw from an individual’s inclusion in a specific genetic family is that they are highly likely to share a common direct male-line ancestor with other members of that genetic family. Put another way, if they are able to research their family tree with perfect accuracy they should be able, eventually, to document the links that tie them to everyone else within the same genetic family and to end up looking at one big family tree.

Most genetic families are broadly structured the same way. Each tends to have a cluster of results around the modal value for that genetic family plus a few other participants that have results that show different values on a few markers but whose DNA signatures are close enough to the DNA signature for other members of the genetic family that they can be considered to be members of it.
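As a small illustration of this structure, the modal signature of a genetic family can be taken as the most common value on each marker, and each member can then be described by how many markers depart from it. The family below is invented, and the marker names `A`, `B` and `C` are placeholders.

```python
from collections import Counter


def modal_signature(signatures):
    """Return the most common value on each marker across a genetic family."""
    markers = signatures[0].keys()
    return {
        m: Counter(sig[m] for sig in signatures).most_common(1)[0][0]
        for m in markers
    }


def distance_from_modal(signature, modal):
    """Count the markers on which one signature departs from the modal."""
    return sum(1 for m in modal if signature[m] != modal[m])


# Hypothetical genetic family of four men tested on three invented markers.
family = [
    {"A": 13, "B": 24, "C": 14},
    {"A": 13, "B": 24, "C": 14},
    {"A": 13, "B": 24, "C": 15},
    {"A": 12, "B": 24, "C": 14},
]

modal = modal_signature(family)
print(modal)
print([distance_from_modal(sig, modal) for sig in family])
```

Two of the four men sit exactly on the modal values, while the other two each differ on a single marker, which is the typical pattern described above.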

It can sometimes be difficult to be confident whether an individual belongs to a particular genetic family, particularly if the results being compared are at markedly different resolutions. For example, a 12-marker result may well be identical to a 67-marker result on the twelve markers that both have tested, but it does not follow that the remaining 55 markers would also match if they were tested. Further testing to a higher resolution sometimes causes the outlying results within a genetic family to look even more different from the DNA signature associated with it, and in some cases this means that a participant can no longer remain classified as part of that genetic family. The main reason why participants with low resolution results are asked to re-test at a higher resolution is to remove this uncertainty. However, in many cases, especially where individuals trace their line back to the same geographical region as others in the genetic family, one can reasonably infer that the low resolution match will remain secure at a higher resolution without necessarily testing it again.
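The uncertainty in a mixed-resolution comparison can be made explicit by reporting not only how many markers differ but also how many could not be compared at all. The sketch below uses invented marker names (`m1` to `m67`) and invented values; it is only an illustration of the 12-versus-67 example given above.

```python
def compare_shared(sig_low, sig_high):
    """Compare two signatures on the markers both men have tested.

    Returns (markers compared, mismatches, markers the first man lacks).
    """
    shared = [m for m in sig_low if m in sig_high]
    differing = sum(1 for m in shared if sig_low[m] != sig_high[m])
    untested = sum(1 for m in sig_high if m not in sig_low)
    return len(shared), differing, untested


# Hypothetical results: marker names and all values are invented.
high_res = {f"m{i}": 10 + (i % 5) for i in range(1, 68)}      # 67 markers
low_res = {f"m{i}": high_res[f"m{i}"] for i in range(1, 13)}  # first 12 only

print(compare_shared(low_res, high_res))
```

A perfect match on twelve markers is reported alongside the 55 markers about which nothing is yet known.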

With the DNA-based clusters identified, the fifth stage is to link the historical data that has been researched for each individual with their DNA result. Each participant’s family tree has to be researched as carefully as possible in order to identify a point of origin for their personal family tree. When the geographical origin data is cross referenced with the DNA results it quickly becomes clear whether any pattern is visible when comparing the geographical origins of the members of each genetic family.

Interpreting The Data

Only when the genetic families are established and the historical data is associated with each result can the interpretation of the project’s results begin. This is a subjective process that one can try to make partially objective by using a calculation to estimate the number of generations to the common male ancestor for two given participants. This calculation sets out to define the Time to the Most Recent Common Ancestor (TMRCA). However, the result of a TMRCA calculation is only ever expressed as a probability (typically quoted at levels ranging from 50% to 95%) that the common ancestor lived within a given number of generations; it is an estimate drawn from the range of possibilities consistent with the data, not a single definite answer.

While testing labs have access to some mutation rate data, they have not published it. Numerical calculations of the relatedness of two DNA test results therefore use an average rate; there is much debate among geneticists about what that average rate should be. The key point is that numerical calculations like the TMRCA appear to give a degree of certainty that they do not possess; they should be thought of only as an approximate guide.

In particular, a small change in the average rate of mutation used in the calculation would produce major differences in the results. The danger in relying on it is that we might, at a future date, have to throw all previous calculations away because the premises we used have been revised. So no TMRCA calculation can be expected to ‘prove’ a specific connection or to pinpoint in which particular generation, or generations, the common Dalton ancestor lived. The best that it can do is to confirm that the timeframe for such a connection is broadly consistent with at least one estimate made using currently accepted variables.
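The sensitivity to the assumed mutation rate can be illustrated with a deliberately simplified model. It is not the calculation used by any testing company, and its assumptions are loud ones: the two men match exactly on every marker, every marker mutates at the same average rate, no mutation is ever hidden, and nothing else is known about the relationship. Under those assumptions the probability that the common ancestor lived within g generations is 1 - exp(-2 × rate × markers × g). The two rates used below are illustrative values only.

```python
import math


def generations_for_probability(prob, n_markers, rate):
    """Generations g such that P(TMRCA <= g) equals prob for an exact match.

    Simplified assumption: P(TMRCA <= g) = 1 - exp(-2 * rate * n_markers * g),
    where rate is the average number of mutations per marker per generation.
    """
    return -math.log(1.0 - prob) / (2.0 * rate * n_markers)


# Two illustrative average rates; neither is taken from published lab data.
for rate in (0.002, 0.004):
    g50 = generations_for_probability(0.50, 37, rate)
    g95 = generations_for_probability(0.95, 37, rate)
    print(f"rate={rate}: 50% within {g50:.1f} generations, 95% within {g95:.1f}")
```

In this model every estimate scales inversely with the assumed rate, so doubling the rate halves the number of generations at every probability level. The gap between the 50% and 95% figures also shows how wide the range of possibilities remains even when the rate is fixed.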

At this point it is sometimes possible to demonstrate a migration-type progression within a genetic family. For example, all the participants with a common DNA signature who trace themselves back to location A may share a mutational variation with several participants who trace themselves back to location B. From this it is sometimes possible to infer that the ancestors in their combined tree were found in location B before some members moved to location A.

This can be shown graphically in a cladogram, a chart that shows how a group of DNA results link together as a network. Each node represents a specific DNA signature; where more than one person is found with the same DNA signature the node will be proportionally bigger (i.e. the bigger the circle the more individuals tested had that result). The network helps individuals to work out whom they are most likely to be related to, i.e. who is closest to them in their shared family tree. Your closest relation is very likely to be the Dalton whose DNA result is closest to yours within the network. Conversely, those Daltons who are furthest away from you on the network diagram are unlikely to be your closest relations among all the Daltons in the network.
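The two ideas in this paragraph, node size and closeness within the network, can be sketched as follows. The participant labels and marker values are invented, closeness is measured simply as the number of differing markers, and every man is assumed to have been tested on the same markers in the same order.

```python
from collections import Counter


def build_nodes(results):
    """Collapse identical signatures into nodes; the count is the node size."""
    return Counter(tuple(sig) for sig in results.values())


def closest_participants(name, results):
    """Return the other participants with the fewest marker differences."""
    def distance(other):
        return sum(1 for a, b in zip(results[name], results[other]) if a != b)

    others = [n for n in results if n != name]
    best = min(distance(n) for n in others)
    return sorted(n for n in others if distance(n) == best), best


# Hypothetical results: every man tested on the same four invented markers.
results = {
    "P1": (13, 24, 14, 11),
    "P2": (13, 24, 14, 11),
    "P3": (13, 24, 15, 11),
    "P4": (12, 25, 15, 11),
}

print(build_nodes(results))
print(closest_participants("P4", results))
```

Here P1 and P2 share one node of size two, and P4's nearest neighbour is P3, at two markers' distance, rather than the larger node.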

Cladograms are useful for identifying your closest relations within your genetic family. One drawback inherent in the charting process is that all the input data must use the same number of markers. A high resolution network at 37 or 67 markers is a very useful chart; however, if half of the members of the genetic family have only been tested at 12 or 25 markers then their results cannot be placed within that chart. As a consequence most cladograms are produced only at lower resolutions, as it is generally more useful to have a broad picture that includes all results than a detailed picture that excludes most of them.

Other assumptions at work in the interpretation phase are that:

Singleton Results

In any surname project there will be a significant number of participants who find that they are not linked to any genetic families, i.e. that theirs is a singleton result, a DNA signature that is unique within the study. There are two main reasons why this might happen.

A common reason is that the project simply has not yet tested enough people to find another participant with a matching DNA result. For a surname like Dalton, which has many thousands of name-bearers alive today around the world, at a guess several hundred individuals would need to be tested before all the main genetic families were clearly identifiable. The Dalton project is well advanced compared to most surname-based DNA projects around the world, but it will need to continue expanding the number of participants who trace their origins back to Ireland or the UK in order to uncover all of the genetic families associated with the surname.

The second reason why a participant may appear to have a singleton result may lie somewhere within the history of their family tree. For example, a Dalton ancestor may have acquired the surname at a relatively recent time and for a specific reason, e.g. during the process of an inheritance whereby a man with a different surname was required to become a Dalton. There are, of course, other non-legal reasons why the DNA in a Dalton family may be different from that of the Dalton genetic families, for example where a non-Dalton man has fathered a boy by a Dalton woman and the child has grown up as a Dalton. It is often hard to pinpoint exactly when such an event, which has caused a rift between the expected DNA signature and the surname linked to it, has happened. Experience from other surname projects shows that one might expect between a quarter and a half of all DNA results to turn out to be singletons. At present the figure in the Dalton project is much lower than these estimates. In any case, it should always be remembered that an individual’s history is his family tree, not his DNA result alone.

Modal Results

Results from the Dalton project also highlight the issues of genetic diversity and how to interpret the modal DNA result. One-third of all Dalton project members have been clustered into ‘Genetic Family A’, and all bar one of them can currently only trace their family tree back within the United States, i.e. not back to the UK or Ireland. It might seem natural to assume that the most commonly found DNA signature must be the one associated with the oldest family tree, i.e. the original Dalton family. This though is not the case; an old family tree may have a great many modern-day descendants, or it may have very few indeed. The best that one can infer from the modal result is that the tree those participants belong to has a great many modern-day descendants.

The interesting feature of the Dalton project is that the modal result has only been found once outside the USA. Given that it is so common among American participants, I would have expected to find this DNA signature more often among the British or Irish-origin participants. I would expect there to be more genetic diversity among British-origin Daltons than among American-origin Daltons for the simple reason that only a proportion of the British Daltons ever emigrated to the USA and had male descendants, a process of selection known as a genetic bottleneck. Either all of the Daltons in the UK and Ireland with this DNA signature died out before the present day, or not enough British and Irish-origin Daltons have yet taken a DNA test for the signature to be identified within the project. It could even be that the head of genetic family A was illegitimate, so that no trace of the signature is to be found among Daltons in the Old World.

The Next Stage

The interpretation of DNA results in this kind of surname-wide analysis needs to be handled with caution. Some of the data – e.g. the haplogroup inferred by the testing company – are only estimates and are subject to periodic revision. Some of the premises used in the analysis phase are surely going to be revised at some point in the future – e.g. some ‘slow’ markers may be reclassified as fast-mutating – and new data will become available, notably the precise rates of mutation for each individual marker.

Some data that would be useful has simply not yet been gathered. It would be useful to know how frequently each DNA signature is found in a given geographical population in Britain and Ireland. Rare signatures are more useful for the purposes of making comparisons and linkages, but even common DNA signatures will be rarer in some geographical locations than others. Very little is known about these distributions at present.

However, the main data that is subject to revision is not the DNA data at all but the descriptions of our participants’ family origins. With the members of each genetic family identified, the greatest challenge now is to document their individual connections with each other and to recreate their shared family trees. During this process it is common for many people to change the assumptions they previously held about their origins and to correct their research. For all of these many reasons the set of conclusions outlined in this report must be thought of as part of a work in progress, albeit one creating stronger hypotheses year by year.

DNA testing is a tool for the family historian, a tool to help guide the documentary research process by suggesting links among presently unrelated people or trees, and to corroborate documentary research by confirming a shared genetic heritage. A DNA project advances by making hypotheses about DNA linkages and documented origins that are progressively refined through an iterative process of further DNA testing and documentary research. At the end of this process the family history that is created is the family tree as it is documented and the stories of the men and women in it, a history corroborated by their DNA results.

The Dalton project has now reached the stage where it should be feasible to link together into a single family tree many of the individuals grouped by the DNA results into each genetic family.