Assessment of Genetic Diversity for American Shad in the Santee-Cooper River Basin of South Carolina Prior to Hatchery Augmentation

Abstract American shad Alosa sapidissima once supported large commercial fisheries along the Atlantic coast, but by the early 1980s many of these fisheries had declined precipitously. In contrast to most Atlantic coastal rivers, the abundance of American shad in the Santee-Cooper River basin of South Carolina has grown substantially; still, much of their upstream spawning habitat in this watershed has been restricted or adversely affected by impoundments. In an effort to rebuild populations in upstream river reaches of the Santee-Cooper basin, state and federal agencies developed an approach relying on hatchery augmentation, the relocation of prespawning adults, and the construction of permanent fish passage structures. Genetic monitoring was adopted to ensure that the genetic integrity of the population remains intact during the rebuilding process. Our study provided an initial assessment of genetic diversity for the genetic monitoring program. There were three components to the study: an assessment of within-basin genetic stock structure, a comparison of genetic diversity (the number of alleles and observed heterozygosity) between hatchery and wild stocks, and an evaluation of molecular tags for monitoring hatchery returns. The estimation of genetic diversity and the assessment of molecular tags were based on 10 microsatellite loci. We found no apparent stock structure in the Santee-Cooper basin and no difference in genetic diversity between broodstock and their wild counterparts. Eight microsatellite markers provided enough resolution to successfully (>95% assignment success) match returning American shad with their respective hatchery parents. Our results highlight the utility and importance of integrating genetic information in supportive breeding programs in an effort to evaluate the programs' effectiveness (both in terms of increasing the census size and in terms of maintaining the long-term viability of a population).


spending summers in northwestern Atlantic waters and beginning their southward migration to their natal spawning areas as fall approaches.
Historically, American shad supported one of the most important commercial fisheries in North America as well as sport fisheries in almost every state along the Atlantic coast (Facey and Van Den Avyle 1986; ASMFC 2007). In the late 1800s the annual shad harvest was nearly 23 million kg coastwide (ASMFC 2007); shad stocks could not sustain this rate of exploitation, however, and overexploitation, coupled with habitat loss (e.g., dam construction) and habitat degradation (e.g., pollution), led to precipitous declines in the abundance of American shad along the Atlantic coast (current abundance estimates are less than 10% of historical estimates; ASMFC 2007). Dramatic declines in commercial landings prompted many states to place moratoriums on American shad harvest (Olney and Hoenig 2001) and ultimately led the Atlantic States Marine Fisheries Commission (a commission of U.S. states formed to coordinate and manage fishery resources along the Atlantic coast of the United States) to recommend closing the commercial fishery along much of the coast in 2007 (ASMFC 2007).
In contrast to most Atlantic coastal rivers, the abundance of American shad in the Santee-Cooper River basin of South Carolina has grown substantially over the last several decades, making this population one of the largest on the Atlantic coast (ASMFC 2007). The increase in abundance was presumably due to past human-induced, hydrological changes in the Santee and Cooper rivers (Hill 2009). Specifically, both the Santee and Cooper rivers were impounded in 1942, creating two reservoirs (Lake Marion and Lake Moultrie) that are connected by a diversion canal (Figure 1). During this period, shad abundance was declining (i.e., from commercial gill-net landings of 24,610 kg in 1960 to an average of 2,554 kg from 1979 to 1985; ASMFC 2007) because impoundments restricted the spawning migrations of these anadromous fish to the lower portions of these rivers. In 1986, a rediversion canal (including a hydroelectric facility) was constructed to connect Lake Moultrie to the Santee River (Figure 1), and the construction of a fish lift at the hydroelectric facility once again provided fish passage above Lake Moultrie. Pre- and postrediversion canal assessments of American shad abundance on the Santee River showed that, on average, landings were several orders of magnitude greater after the completion of this canal and fish lift (ASMFC 2007).
While American shad abundance in the Santee-Cooper basin has appeared to increase over the last several decades, access to historical spawning and foraging habitats has remained limited because of the construction of nearly 45 other impoundments that inhibit upstream fish passage (Hill 2009). In an effort to reestablish populations in upstream river reaches of the Santee-Cooper basin, state and federal agencies have developed a multifaceted restoration approach relying on hatchery augmentations, relocations of adults, and the construction of fish passage structures (Hill 2009).
A supportive breeding program, which is defined as bringing part of a natural population into captivity for reproduction and the release of their offspring into the wild (Ryman and Laikre 1991), can be an important tool for restoration initiatives; however, adverse genetic change to wild populations can occur if such a program is not correctly implemented and monitored. Potentially harmful effects of hatchery augmentation on existing aquatic gene pools are well known (Meffe 1992; Araki et al. 2007b; Laikre et al. 2010) and have been studied for at least three decades. Genetic monitoring (i.e., quantifying the temporal changes in population genetic metrics) of hatchery releases can be used as a tool to evaluate potentially harmful genetic effects (Schwartz et al. 2007) and is becoming a standard part of many restoration and conservation initiatives relying on supportive breeding (Dowling et al. 2005; Araki et al. 2007a; Thériault et al. 2011).
The goal of this study was to provide an initial assessment of genetic parameters for the genetic monitoring of American shad in the Santee-Cooper basin. We accomplished this by (1) evaluating the extent of American shad population genetic substructure within the Santee-Cooper basin, (2) establishing an estimate of genetic diversity for Santee-Cooper American shad prior to hatchery augmentation, (3) comparing preexisting levels of genetic diversity between 2009 wild prespawning adults and those used for broodstock, and (4) assessing the feasibility of genetic markers as a molecular tag to estimate hatchery return rates.

METHODS
The Santee-Cooper River basin consists of six major rivers: the Santee, Cooper, Wateree, Broad, Saluda, and Congaree rivers. The Broad and Saluda rivers converge to form the Congaree River, and the Congaree and Wateree rivers converge to form Lake Marion and then Lake Moultrie. From Lake Moultrie, water can flow to either the Santee or the Cooper River (Figure 1) and then to the Atlantic Ocean. To discern stock structure in this basin, sampling tributaries from throughout the basin would be ideal; however, sampling from throughout the basin is problematic because American shad migration is prevented by impoundments on the lower Broad, Saluda, and Wateree rivers. Specifically, the Wateree River is impounded by Wateree Hydroelectric Dam at rkm 122 (measured from its confluence with the Congaree River), the Saluda River is impounded by Saluda Dam at rkm 66, and the Broad River is impounded at rkm 3.22 by the Columbia Diversion dam (the Columbia Diversion dam has a vertical slot fishway, but it was opened to passage only recently, circa 2007). Given the limited migration capabilities of American shad in the Santee-Cooper basin and the lack of knowledge about spawning locations above Lake Moultrie, we chose to sample American shad over the course of the spawning run as they migrated through the fish lift located below St. Stephen Dam on Lake Moultrie (n = 255; 83 were sent to a hatchery to be used for broodstock; the remainder were assumed to be fish that would have spawned in the wild and thus were designated as wild). Our sampling rationale was that if genetic structure were present in the basin (i.e., above Lake Moultrie) and American shad collected over the course of the spawning run were examined genetically, the sample should deviate significantly from Hardy-Weinberg and gametic equilibriums (i.e., the Wahlund effect). 
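The sampling rationale above rests on the Wahlund effect: pooling genetically distinct subpopulations produces fewer heterozygotes than Hardy-Weinberg proportions predict from the pooled allele frequencies. The following is a minimal sketch for a single biallelic locus; the allele frequencies and function names are hypothetical and chosen only for illustration, not taken from the study data.

```python
def hwe_heterozygosity(p):
    """Expected heterozygosity at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def pooled_heterozygosity(freqs, weights):
    """Return (observed, expected) heterozygosity for a pooled sample.

    Each subpopulation is assumed to be in Hardy-Weinberg equilibrium,
    so the observed value is the weighted mean of the within-subpopulation
    heterozygosities. The expected value is what Hardy-Weinberg predicts
    from the pooled allele frequency.
    """
    total = float(sum(weights))
    w = [x / total for x in weights]
    h_obs = sum(wi * hwe_heterozygosity(pi) for wi, pi in zip(w, freqs))
    p_bar = sum(wi * pi for wi, pi in zip(w, freqs))
    return h_obs, hwe_heterozygosity(p_bar)


# Hypothetical example: two equally sampled stocks with different frequencies.
h_obs, h_exp = pooled_heterozygosity([0.2, 0.8], [1, 1])
deficit = h_exp - h_obs  # heterozygote deficit caused by hidden structure

# With identical frequencies there is no deficit.
h_obs_same, h_exp_same = pooled_heterozygosity([0.5, 0.5], [1, 1])
```

In this sketch the deficit equals twice the variance in allele frequency among subpopulations, so it vanishes when the pooled fish come from a single panmictic stock.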
Although spawning may occur in the Cooper and Santee rivers below Lake Moultrie, recruitment in these reaches is probably minimal because the high-salinity waters of the estuaries limit egg and larval development. We augmented collection efforts by sampling individuals from the Wateree (n = 23), Congaree (n = 2), Cooper (n = 88; fish passage can occur on the Cooper River through a navigation lock at Pinopolis Dam), and Pee Dee (n = 43; the next river north of the Santee-Cooper) rivers over the course of the spawning run (samples from the Congaree were excluded from the analyses due to the limited sample size). Shad samples were collected by state and federal agency biologists via boat electrofishing or taken directly from the fish lift below St. Stephen Dam (see Table 1 and Figure 1 for specific sampling information). A tissue clip from each sample was preserved in 95% nondenatured ethanol and sent to Warm Springs Conservation Genetics Laboratory to be analyzed and archived. Genomic DNA was extracted from a portion of the ethanol-preserved fin clip using the DNeasy Blood and Tissue kit (QIAGEN, Valencia, California) protocol. We used a suite of 11 microsatellite markers known to amplify in American shad (Julian and Bartron 2007). A multiplex polymerase chain reaction (PCR) was performed on the following three sets of primers (fluorescent dye labels for the forward primers are noted in parentheses): set A consisted of primers AsaD30 (6-FAM), AsaD31 (VIC), and AsaD429 (NED); set B consisted of primers AsaB20 (VIC), AsaC59 (6-FAM), AsaD312 (6-FAM), and AsaD55 (NED); and set C consisted of primers AsaC249 (6-FAM), AsaC334 (6-FAM), and AsaD42 (NED). A single-reaction PCR was performed for AsaD392 (NED) primers.
All PCR amplifications were performed in 20-μL reactions using the following reaction components: 1× Taq reaction buffer (Applied Biosystems, Foster City, California), 3.75 mM MgCl2, 0.423 mM of each deoxynucleotide triphosphate, 0.25 μM of each primer, and 0.08 U Taq polymerase (Applied Biosystems). The PCR conditions were as follows: an initial denaturation at 94°C (10 min), followed by a touchdown procedure of 33 cycles, each consisting of denaturing (94°C for 30 s), annealing (30 s), and extension (74°C for 30 s) steps, where the annealing temperature was initially 56°C and decreased by 0.2°C per cycle.
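As a quick check on the touchdown profile, the annealing temperature for each cycle can be tabulated. This is a minimal sketch that assumes the first cycle anneals at the starting temperature and the decrement is applied to each subsequent cycle; the function name is hypothetical.

```python
def touchdown_annealing_temps(start_c=56.0, step_c=0.2, cycles=33):
    """Annealing temperature (deg C) for each touchdown cycle.

    Assumption: cycle 1 anneals at start_c and every later cycle
    anneals step_c lower than the one before it.
    """
    return [round(start_c - step_c * i, 1) for i in range(cycles)]


temps = touchdown_annealing_temps()
first_temp = temps[0]   # annealing temperature in the first cycle
last_temp = temps[-1]   # annealing temperature in the final cycle
```

Under this assumption the annealing temperature falls by 6.4°C over the 33 cycles.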
Prior to electrophoresis, 2 μL of a 1:100 dilution of PCR product was mixed with an 8-μL solution containing 97% formamide and 3% Genescan LIZ 500 size standard (Applied Biosystems). Microsatellite reactions were visualized with an ABI 3130 genetic analyzer (Applied Biosystems) using fluorescently labeled forward primers and analyzed using GeneMapper software version 3.7 (Applied Biosystems).
Tests for gametic disequilibrium (all pairs of loci per sampling site) and locus conformance to Hardy-Weinberg equilibrium (HWE; for each locus in the sampling site) were implemented using GENEPOP version 4.0.10 (Raymond and Rousset 1995). Significance levels for all simultaneous tests were adjusted using a sequential Bonferroni correction (Rice 1989). In cases in which the observed genotype frequencies deviated significantly from HWE expectations, the program MICRO-CHECKER version 2.2 (van Oosterhout et al. 2004) was used to infer the most probable cause of the departures.
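The sequential Bonferroni procedure referred to above can be sketched as a step-down test: P-values are ranked from smallest to largest, the smallest is compared with α/m, the next with α/(m - 1), and so on, stopping at the first nonsignificant result. The function name and example P-values below are hypothetical.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans marking which tests remain significant.

    Step-down procedure: the i-th smallest P-value (i = 0, 1, ...) is
    compared with alpha / (m - i); testing stops at the first failure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    significant = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            significant[idx] = True
        else:
            break  # all larger P-values are also declared nonsignificant
    return significant


# Hypothetical P-values for four simultaneous tests.
flags = sequential_bonferroni([0.001, 0.04, 0.03, 0.012])

# Initial threshold for 11 simultaneous tests (one per locus).
first_threshold = 0.05 / 11
```

For 11 simultaneous tests the initial threshold is 0.05/11, which rounds to the α = 0.005 quoted for the per-locus HWE tests.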
To assess the degree of population differentiation in the Santee-Cooper basin (i.e., excluding the Pee Dee River samples), we first compared the per-locus genic frequency distributions from each sampling locality using the genic differentiation option in GENEPOP version 4.0.10 (Raymond and Rousset 1995) with the default parameter settings. We also calculated D_EST (a measure of population differentiation based on genetic polymorphism data [Jost 2008]) between sampling sites using the program DEMEtics (Gerlach et al. 2010), where confidence in the null hypothesis of no genetic differentiation among sampling sites was assessed via bootstrap resampling (500 replicates as implemented in DEMEtics). Analysis of population structure was performed using a Bayesian-based clustering algorithm implemented in the program STRUCTURE version 2.3.3 (Pritchard et al. 2000; Falush et al. 2003). The program STRUCTURE assumed no a priori sampling information; rather, individuals were probabilistically assigned to groups in such a way as to achieve Hardy-Weinberg and gametic equilibriums. The program STRUCTURE was run with three independent replicates for each value of K (i.e., the number of distinct populations or gene pools), with K set from one to eight. The burn-in period was 50,000 replicates, followed by 500,000 Monte Carlo simulations run under a model that assumed no admixture and independent allele frequencies. Estimates of ΔK (Evanno et al. 2005) and individual assignment patterns were used to determine the most likely value of K.
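The ΔK statistic of Evanno et al. (2005) can be sketched as the absolute second-order difference in the mean log likelihood across successive K-values divided by the standard deviation of the log likelihood among replicates at K. The sketch below computes it from replicate means, which is one common way of implementing the statistic; the log-likelihood values are invented for illustration and are not results from this study.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp):
    """Compute delta K for each interior K.

    lnp maps K -> list of ln P(X|K) values, one per replicate run.
    delta K = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd L(K).
    """
    delta = {}
    for k in sorted(lnp):
        if (k - 1) not in lnp or (k + 1) not in lnp:
            continue  # undefined at the smallest and largest K tested
        sd = stdev(lnp[k])
        if sd == 0:
            continue  # avoid division by zero when replicates are identical
        second_diff = abs(mean(lnp[k + 1]) - 2 * mean(lnp[k]) + mean(lnp[k - 1]))
        delta[k] = second_diff / sd
    return delta


# Hypothetical log likelihoods from three replicate runs per K.
example = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -901.0, -899.0],
    3: [-895.0, -905.0, -890.0],
    4: [-893.0, -910.0, -880.0],
}
delta_k = evanno_delta_k(example)
best_k = max(delta_k, key=delta_k.get)
```

Because ΔK is undefined at K = 1, it cannot by itself indicate a single panmictic population, which is why the individual assignment patterns were examined alongside it.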
No population substructure was observed among the samples (n = 366) collected in the Santee-Cooper basin (see below); therefore, we treated the samples collected from throughout the basin as one population and calculated basic estimators of genetic diversity for American shad in the Santee-Cooper basin (i.e., prior to supportive breeding). The fixation index, along with genetic diversity in the form of the per-locus and average number of alleles, observed heterozygosity, and expected heterozygosity, was calculated using the computer program GenAlEx version 6.4 (Peakall and Smouse 2006). We also calculated the genetic composition for a random sample of the 2009 broodstock whose progeny were used for an initial hatchery release. Specifically, we tested for homogeneity in genic distributions, average number of alleles, observed heterozygosity, and expected heterozygosity between a random sample of the 2009 broodstock (n = 83) and wild fish collected from throughout the Santee-Cooper system during the course of the study (n = 283). Tests for significance were conducted using the Wilcoxon rank-sum test (Sokal and Rohlf 1995) as implemented in S-Plus version 7.0 (Insightful Corporation) except for the genic distribution test, which was implemented as above.
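The per-locus diversity estimators named above can be illustrated with a minimal sketch. It uses the standard definitions (observed heterozygosity as the proportion of heterozygous individuals, expected heterozygosity as one minus the sum of squared allele frequencies, and the fixation index as F = 1 - Ho/He); it is not the GenAlEx implementation, and the genotypes are hypothetical.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Summarize one locus from a list of diploid genotypes.

    genotypes: list of (allele1, allele2) tuples, one per individual.
    Returns the number of alleles, observed heterozygosity (Ho),
    expected heterozygosity (He), and the fixation index F = 1 - Ho/He.
    """
    n = len(genotypes)
    counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    h_exp = 1.0 - sum((c / total) ** 2 for c in counts.values())
    f = 1.0 - h_obs / h_exp if h_exp > 0 else float("nan")
    return {"n_alleles": len(counts), "h_obs": h_obs, "h_exp": h_exp, "F": f}


# Hypothetical microsatellite genotypes (allele sizes in base pairs).
result = locus_diversity([(100, 100), (100, 104), (104, 104), (100, 104)])
```

A positive F indicates a heterozygote deficit relative to Hardy-Weinberg expectations; a value near zero, as in this toy example, indicates none.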
We also wanted to gain knowledge about the genetic diversity in the Pee Dee River (the next system to the north) and the amount of population structure and genetic differentiation between the Santee-Cooper basin and the Pee Dee River. In this regard, we estimated genetic diversity for samples collected in the Pee Dee River and compared these estimates with those obtained from the Santee-Cooper basin. The estimation of genetic diversity was as described above, except that the program HP-RARE (Kalinowski 2005) was used to estimate allelic richness (the average number of alleles was not estimated due to differences in sample sizes). Tests for significance were conducted as outlined above. We also assessed the degree of differentiation between American shad in the Santee-Cooper basin and Pee Dee River as outlined above (i.e., comparison of genic distributions, computation of D_EST, and use of STRUCTURE).
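Allelic richness corrects the number of alleles for unequal sample sizes by rarefaction. A minimal sketch of the standard rarefaction formula follows: the expected number of distinct alleles in a random subsample of g gene copies is the sum, over alleles, of the probability that each allele appears at least once. This illustrates the general approach rather than the HP-RARE code itself, and the allele counts are hypothetical.

```python
from math import comb


def rarefied_allelic_richness(allele_counts, g):
    """Expected number of alleles in a subsample of g gene copies.

    allele_counts: observed count of each allele at one locus.
    g: standardized subsample size (gene copies), at most the total count.
    """
    n = sum(allele_counts)
    if g > n:
        raise ValueError("subsample size exceeds the number of gene copies")
    # comb(n - c, g) is 0 when g > n - c, i.e. the allele is always sampled.
    return sum(1.0 - comb(n - c, g) / comb(n, g) for c in allele_counts)


# Hypothetical locus with one common and one rare allele.
richness_small = rarefied_allelic_richness([9, 1], 2)
richness_full = rarefied_allelic_richness([9, 1], 10)
```

Standardizing both river systems to the same g makes their richness estimates comparable even though 366 fish were sampled in one system and 43 in the other.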
Molecular tags rely on the premise that offspring can be confidently assigned to their respective parent via parentage analysis and that type II error is minimal. We performed a simulation-based assessment of confidence using the computer program PAPA version 2.0 (Duchesne et al. 2002). Allocations were based on maximum likelihood using a 3% nonuniform error model distributed on the next adjacent allele (Duchesne et al. 2002). A genotyping error rate of 3% was estimated by regenotyping the broodstock for each locus, counting the errors for each locus, and averaging these values for a global estimate. We used the preparental procedure to simulate pseudoparent and offspring data. The estimated number of pseudo-collected males and females (314 and 255, respectively) was based on the numbers collected as broodstock from the fish lift at the rediversion canal below St. Stephen Dam. We assumed a closed system in which all broodstock were genotyped for the loci used in this study (i.e., we set the estimated number of pseudo-uncollected parents to zero). The allele frequency file used to simulate offspring and parents was generated by using a random sample of 96 individuals from the Santee-Cooper basin genotyped for all 10 loci. Parentage analysis was then performed on these files to assess assignment success. Simulations were performed using 1,000 iterations. We also explored the statistical confidence in parentage analysis for a much larger number of hatchery broodstock. In this scenario, all simulations were as above except that 5,000 males and females were used for the estimated number of pseudo-collected males and females.
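The logic behind a molecular tag can be illustrated with a simple Mendelian compatibility check. Note that this is a swapped-in, exclusion-based sketch and not the maximum likelihood allocation used by PAPA: a parent pair is retained if, at each locus, the offspring carries one allele from each parent, with an optional allowance for mismatched loci to accommodate genotyping error. All names and genotypes are hypothetical.

```python
def locus_compatible(offspring, mother, father):
    """True if the offspring genotype can arise from this parent pair."""
    a, b = offspring
    return (a in mother and b in father) or (b in mother and a in father)


def count_mismatches(offspring, mother, father):
    """Number of loci at which the trio violates Mendelian inheritance."""
    return sum(
        0 if locus_compatible(o, m, f) else 1
        for o, m, f in zip(offspring, mother, father)
    )


def candidate_pairs(offspring, mothers, fathers, max_mismatch=0):
    """Return all (mother_id, father_id) pairs within the mismatch allowance."""
    return [
        (m_id, f_id)
        for m_id, m in mothers.items()
        for f_id, f in fathers.items()
        if count_mismatches(offspring, m, f) <= max_mismatch
    ]


# Hypothetical two-locus genotypes (allele sizes in base pairs).
mothers = {"F1": [(100, 104), (200, 200)], "F2": [(108, 108), (204, 208)]}
fathers = {"M1": [(104, 112), (200, 204)], "M2": [(100, 100), (208, 208)]}
offspring = [(100, 112), (200, 204)]

strict = candidate_pairs(offspring, mothers, fathers)
relaxed = candidate_pairs(offspring, mothers, fathers, max_mismatch=1)
```

Relaxing the mismatch allowance admits additional candidate pairs, which illustrates why more loci are needed to keep assignments unambiguous as the number of broodstock or the genotyping error rate grows.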
Molecular tags will be used to discern hatchery offspring from their wild conspecifics; therefore, we also estimated the level of type II error for parentage analysis (i.e., the probability that a wild fish would be incorrectly assigned to hatchery parents). We estimated the type II error by first generating wild pseudo-offspring (n = 1,000) from an allele frequency file of the wild population (n = 96). Next, we generated hatchery pseudomale and pseudofemale parents based on hatchery allele frequencies (n = 83). Type II error was assessed using either 569 (314 males and 255 females) or 10,000 (5,000 males and 5,000 females) hatchery parents. We also assessed the influence of mating history on type II error by performing parentage analysis with and without a mating history file. Specifically, the mating history file assumed that hatchery parents were spawned volitionally in tanks containing 25 males and 25 females (this was similar to the tank spawning conditions over the course of this study). The genotyping error model was as above.

RESULTS
A total of 409 American shad were analyzed using 11 microsatellite markers. For each sampling site, all loci conformed to per-locus HWE after sequential Bonferroni corrections (all P > 0.01 per sampling site; n = 11 comparisons per sampling site for an α = 0.005) except AsaC59 and AsaC249. Microsatellite marker AsaC249, which had a general excess of homozygotes for most allele size-classes and was suggestive of null alleles, was removed from all subsequent analyses. Marker AsaC59 deviated from HWE for only one (the rediversion canal collection comprising hatchery broodstock) of five sampling sites, indicating possible genotyping errors; however, this locus still deviated from HWE after regenotyping individuals from this sampling site (no genotyping errors were found). No sampling site showed significant evidence of gametic disequilibrium after sequential Bonferroni correction (all P > 0.01 per sampling site; n = 45 comparisons per sampling site for an α = 0.001).
There was no significant heterogeneity in genic distributions among sampling sites in the Santee-Cooper basin after sequential Bonferroni correction (all P > 0.01; 10 comparisons for an α = 0.005; Table 2); likewise, there was no significant differentiation among pairwise comparisons of multilocus estimates of D_EST, which ranged from -0.001 to +0.022, after sequential Bonferroni correction (all P > 0.006; n = 10 comparisons for an α = 0.005; Table 3). The program STRUCTURE revealed that the most probable number of groups was two using the Evanno et al. (2005) method (Figure 2); however, the proportion of sampled individuals at each sampling site was symmetrical for all K-values 2-8 (data not shown), which is an indication of no population structure (Evanno et al. 2005).
TABLE 2. Probability values for tests of genic differentiation among sampling sites for American shad in the Santee-Cooper basin based on 10 microsatellite loci. All comparisons were nonsignificant following sequential Bonferroni correction (Rice 1989).
A comparison of genetic diversity between samples representing the rediversion canal collection comprising hatchery broodstock and wild fish collected from throughout the Santee-Cooper basin (Table 4) showed no significant difference in the number of alleles or observed heterozygosity. There was also no significant difference in the average fixation index (0.030 versus 0.045; P = 0.24) or in the distribution of alleles for each locus (all P > 0.06; see Appendix).
Genic differentiation tests found no significant difference (P = 0.11) between American shad in the Santee-Cooper and Pee Dee rivers using neutral microsatellite markers. Estimates of D_EST and STRUCTURE analyses corroborated the genic tests. The estimate of D_EST averaged across loci was 0.020, and although it was significantly different from zero (P = 0.02), it was similar to that reported within the Santee-Cooper basin. Analyses using STRUCTURE indicated that the proportion of sampled individuals at each sampling site was symmetrical for all K-values 2-8, suggesting no population structure.
FIGURE 2. Values of ΔK averaged across three replicate simulations versus the simulated number of groups (K) observed in the data. The simulation results indicate that the most plausible value for K represented by American shad from the Santee-Cooper basin is 2, as evidenced by the distinct reduction in ΔK from K = 2 to K = 3. Note, however, that the proportion of sampled individuals assigned to each group was symmetrical for all K = 2-8, which is an indication of no population structure (Evanno et al. 2005).
We also assessed whether the microsatellite markers of Julian and Bartron (2007) could be used as molecular tags. Upon regenotyping 96 individuals, our estimate of an average genotyping error rate over all loci was 3%. Simulations using this error rate and 314 males and 255 females as parents suggested that more than seven loci (in any combination) would be necessary to match progeny successfully (>95%) to their respective broodstock parent pair (Figure 3). The probability of a type II error with and without assuming a mating history for the broodstock was 3% and 12%, respectively. Simulation using 5,000 males and 5,000 females as broodstock indicated that a 94% parent assignment success rate would be achieved using 10 loci and a 3% genotyping error rate (Figure 3); however, the type II error associated with these analyses was 17% and 64% with and without assuming a mating history, respectively (Figure 3).

DISCUSSION
Declines in North American ichthyofaunal diversity have led to the widespread use of hatchery propagation to boost population sizes and recover threatened or endangered populations (Minckley 1995; George et al. 2009). Hatchery augmentation is meant to have positive demographic consequences; however, genetic risks associated with the release of hatchery fish are well known (Nelson and Soule 1987; Miller and Kapuscinski 2003). Unfortunately, monitoring and evaluation of reintroduced or augmented populations, if done at all, often involves noting the presence of (or an increase in) the numbers of the target species at a site (Ostermann et al. 2001); therefore, potential adverse genetic changes to wild populations (via augmentation) can go unnoticed (Araki et al. 2007b; Thériault et al. 2010, 2011). Genetic monitoring of hatchery releases is important because it can provide (1) an understanding of the present and historical levels of genetic diversity in a population or species (i.e., prior to the release of hatchery individuals), (2) an assessment of the alteration of these characteristics after release, and (3) an evaluation of the biological consequences of hatchery releases (Schwartz et al. 2007; Laikre et al. 2010).
FIGURE 3. Percent parentage assignment success versus number of loci for simulated American shad parents and progeny (red curves), along with type II errors with and without data on mating history. The simulated numbers of broodstock males and females were (A) 314 and 255 and (B) 5,000 and 5,000. The simulations assumed a 3% genotyping error rate and that all broodstock were genotyped for all loci. Type II errors were estimated by performing parentage analysis on simulated wild progeny and hatchery broodstock. The inferred mating history allowed 25 males and 25 females to spawn together and was similar to hatchery conditions.
Our study was designed as an initial assessment of genetic diversity within the Santee-Cooper basin, with the aim of estimating the native population genetic structure and level of diversity prior to hatchery release as well as the genetic composition of the broodstock used to initiate augmentation of the wild population.
The evaluation of genetic stock structure is an essential first step in choosing an appropriate donor population to use in supportive breeding because an inappropriate choice of broodstock may reduce the overall fitness of the population by disrupting locally adapted gene complexes (Miller and Kapuscinski 2003). Fish population genetic structure typically coincides with differences among major river basins; however, anadromous fish such as Chinook salmon Oncorhynchus tshawytscha normally migrate to their natal river to spawn and often demonstrate fine-scale genetic structure in a river or river system (Neville et al. 2006; Narum et al. 2008). Like salmon, American shad migrate to their natal river to spawn (Melvin et al. 1986; Hasselman et al. 2010); therefore, unrecognized population structure could exist within the Santee-Cooper river system. Our data provided no evidence of fine-scale population structure within the Santee-Cooper basin. Microsatellite markers were generally in accordance with HWE and gametic equilibrium, tests for genetic differentiation indicated no differentiation between sampling sites, and while Bayesian cluster analysis revealed the potential for two distinct groups, the proportion of sampled individuals to each sampling site was symmetrical for all K-values, which was an indication of no population structure (Evanno et al. 2005). Our findings were consistent with those of previous genetic (Waters et al. 2000; Hasselman 2010) and otolith microchemistry studies (Hendricks et al. 2002; Walther et al. 2008) that showed limited site fidelity in American shad spawning at fine spatial scales, presumably because of a high level of straying or gene flow among tributaries.
Although there was a lack of observed fine-scale genetic structure in the Santee-Cooper basin, can the same be said for larger spatial scales? The observed patterns of genetic variation and population structure indicated very little, if any, genetic differentiation between American shad at a larger spatial scale (i.e., the drainage level), at least between the Santee-Cooper and Pee Dee rivers. Genic homogeneity was observed between American shad in the Santee-Cooper and Pee Dee rivers, and the level of genetic differentiation (D_EST), while significant between river systems, was similar to the values found among sampling sites in the Santee-Cooper. A lack of genetic differentiation was surprising given the life history of this species (philopatric); however, it is consistent with Hasselman (2010), who found nonsignificant genic heterogeneity and weak genetic differentiation among American shad from the Waccamaw, Cooper, and Edisto rivers (the Waccamaw is the next river system north of the Pee Dee, and the Edisto is the river system directly south of the Santee-Cooper). Hasselman (2010) offered several hypotheses for the observed lack of differentiation in southern spawning runs of American shad (e.g., stock transfer); however, testing competing hypotheses was beyond the scope of our study.
The observed lack of differentiation between American shad in the Santee-Cooper and Pee Dee systems should be treated with caution because a lack of differentiation at presumably neutral markers does not necessarily constitute proof of no differentiation between these spawning runs. It is important to understand that discrepancies in genetic differentiation can be found between neutral and quantitative markers, especially in recently diverged populations (Bekessy et al. 2003; Leinonen et al. 2008). Recently diverged populations that have a large effective size can show a lack of differentiation at neutral loci even if they are demographically and reproductively independent, having accumulated genetic differentiation at nonneutral loci via natural selection (Laikre et al. 2005). Genetic differences may still exist between the Pee Dee and Santee-Cooper populations of American shad but were simply not observed with the neutral markers used in our study; therefore, the use of a local broodstock (in this case from the Santee-Cooper basin) is recommended because this practice will minimize the risk of outbreeding depression (Miller and Kapuscinski 2003).
Once the choice of the donor population has been made for a supportive breeding program, the actual collection of hatchery broodstock will require careful consideration. To maximize genetic and ecological (adaptive) diversity, a primary goal for any supportive breeding program should be the maintenance of similar genetic resources and life history patterns between the hatchery broodstock and wild populations (Lynch and O'Hely 2001; Wedekind 2002). Therefore, both collection procedures (i.e., collecting broodstock from over the course of the spawning run) and the numbers of broodstock should be important considerations in minimizing the risks of domestication selection and inbreeding depression (Allendorf 1993; Miller and Kapuscinski 2003). The American shad supportive breeding program for the Santee-Cooper basin outlined two adaptive scenarios for producing juvenile hatchery-reared shad: collect spawning shad and manually strip and fertilize the eggs, or induce the shad to spawn by means of hormone injections. To reach stocking targets (Hill 2009), the plan indicated that approximately 400-800 ripe females and 200-400 males would be necessary for the first spawning scenario and that 1,000-2,000 females (and a comparable number of males) would be necessary for the second scenario. For both scenarios, the plan recommended that shad collections occur during a 6-week period when water temperatures were approximately 15-25°C (roughly March-May and representative of the spawning run). In 2009, which was the initial year of the supportive breeding program, approximately 314 males and 255 females were collected for broodstock at the rediversion canal over the course of a 2-month period corresponding to the spawning run (March-April).
While approximately 100-200 parents (equal number of males and females) have been suggested as the minimum number of broodstock needed to maintain most alleles in a population (Kincaid 1983; Allendorf and Luikart 2007), we assessed this recommendation by testing whether estimates of genetic variation differed between the sampled broodstock and the wild population. Our data indicated that the recommended number was sufficient to maintain similar levels of genetic variation (number of alleles and observed heterozygosity) between the broodstock and wild populations. What is not known, however, is whether this similarity will still be maintained in the progeny used for augmentation.
The effective population size (N_e), which refers to the size of an ideal population experiencing the same rate of random genetic change over time as the real population under consideration (Wright 1938), is a key parameter in conservation biology because it measures the rate of loss of genetic variation and increase in inbreeding (Wright 1938). General hatchery conservation goals based on genetic considerations are frequently established at N_e = 50 to minimize inbreeding depression and N_e = 500 to maintain sufficient evolutionary potential (Franklin 1980; Franklin and Frankham 1998; Miller and Kapuscinski 2003). It is often assumed that the N_e of a hatchery group can be estimated from the number of males and females used as broodstock to produce the hatchery group. However, large differences in reproductive success among broodstock can reduce N_e to a level below the one that was expected and increase the effects of artificial selection in captivity (Allendorf 1993). In 2009, American shad females were hormone-implanted and then allowed to spawn volitionally in tanks with approximately equal numbers of males. If a high proportion of the hatchery broodstock produced progeny in fairly equal numbers, we would expect that similar genetic resources would be maintained in their progeny. However, parentage analyses in a pilot study of American shad from the Roanoke River indicated that of the 39 females and 35 males allowed to tank spawn, only 4 females (mean number of offspring = 0.82; variance = 6.41) and 15 males (mean number of offspring = 0.91; variance = 2.26) produced the progeny that were randomly sampled (n = 66) for parentage analysis (G. R. Moyer, U.S. Fish and Wildlife Service, unpublished data). Thus, the observed value of N_e (8) was much less than that which was expected (73). Such a discrepancy is often due to hatchery spawning protocols.
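The expected N_e for a group of broodstock is commonly approximated from the numbers of each sex with Wright's sex-ratio formula, N_e = 4 N_m N_f / (N_m + N_f), which assumes that every parent has an equal chance of contributing offspring. The text does not state how the expected value of 73 was derived, so the sketch below is an assumption; it gives a value of about 73.8 for 35 males and 39 females, which is consistent with the figure quoted. The function name is hypothetical.

```python
def sex_ratio_ne(n_males, n_females):
    """Expected effective size from breeder numbers (Wright's formula).

    Assumes equal reproductive success among breeders within each sex,
    which is the assumption that volitional tank spawning tends to violate.
    """
    return 4.0 * n_males * n_females / (n_males + n_females)


# Roanoke River pilot study: 35 males and 39 females allowed to tank spawn.
ne_roanoke = sex_ratio_ne(35, 39)

# 2009 Santee-Cooper broodstock: approximately 314 males and 255 females.
ne_santee = sex_ratio_ne(314, 255)
```

The observed N_e of 8 cannot be reproduced from breeder counts alone because it depends on the variance in family size among the spawners, and the estimator used for it is not given here.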
As in most American shad hatchery programs, the Roanoke study relied on the volitional tank spawning of a large number of males and females because of space limitations in the hatchery. In such circumstances, not all females and males will produce progeny; thus, the observed value of N_e will often be less than that expected (Crow and Denniston 1988). Further, equalizing reproductive success among the broodstock after tank spawning is arduous, and the resulting unequal contributions will also lower the observed value of N_e (Crow and Denniston 1988). In this type of situation, one possible strategy is to estimate the number of contributing males and females and their reproductive success (via parentage analysis of a subset of offspring collected from each tank) so that these parameters are known prior to hatchery release (as was done in the Roanoke River study above). In this way, the potential effects of family-correlated survival can be closely monitored (Moyer et al. 2007).
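The monitoring strategy described above amounts to tallying parentage assignments for each candidate parent. A minimal sketch follows, assuming that each sampled offspring has already been assigned to a dam by parentage analysis; the identifiers and assignments are hypothetical, and the function name is ours.

```python
from collections import Counter
from statistics import mean, variance


def reproductive_success(parents, assigned):
    """Summarize offspring counts per candidate parent.

    parents  -- every candidate parent placed in the tank
    assigned -- the parent assigned to each sampled offspring
    Parents with no assigned offspring are counted as zeros.
    """
    counts = Counter(assigned)
    per_parent = [counts.get(p, 0) for p in parents]
    return {
        "contributing": sum(c > 0 for c in per_parent),
        "mean": mean(per_parent),
        "variance": variance(per_parent),  # sample variance (n - 1)
    }


# Hypothetical tank: four dams, six sampled offspring.
dams = ["F1", "F2", "F3", "F4"]
assigned_dams = ["F1", "F1", "F1", "F3", "F1", "F3"]

summary = reproductive_success(dams, assigned_dams)
print(summary)
```

The same summary would be produced separately for sires, and the number of contributors, mean, and variance could then be tracked tank by tank before release.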
As shown above, supportive breeding can have important genetic consequences for the broodstock. It can also accelerate the loss of genetic variation and reduce the fitness of the augmented population (Ryman and Laikre 1991; Araki et al. 2007b; Thériault et al. 2011). Unfortunately, the effects of supportive breeding on natural populations are difficult to predict (Wang and Ryman 2001), which highlights the importance of genetic monitoring and adaptive management. Assessing the effects of supportive breeding on the wild population requires knowledge of the census and effective population sizes for both the returning hatchery progeny and the wild population (Ryman and Laikre 1991; Araki et al. 2007c; Moyer et al. 2007) and an understanding of the relative reproductive success of hatchery fish and their wild counterparts (Araki et al. 2007a; Thériault et al. 2011). Molecular markers can be used to estimate such demographic parameters and to monitor whether captive individuals are being recruited to the natural population (Schwartz et al. 2007). Essentially, molecular markers can be used just like conventional tags but are not subject to failure because of battery life, tissue regeneration, or loss of an external or internal tag (Guy et al. 1996). Molecular markers also serve as a means to tag relatively small organisms and remain a part of the fish until death, a considerable advantage over many conventional tags. Molecular tags are increasingly being used in lieu of physical tags to identify individuals (Schwartz et al. 2007); yet there are technical aspects to this approach that must be assessed prior to field implementation.
The premise of a molecular tag is that offspring can be confidently matched to their respective parents via parentage analysis. Mutations, scoring errors, relatedness, and gametic disequilibrium can lead to circumstances in which all hypothesized parent-offspring pairings possess finite probabilities of being true (Jones and Ardren 2003). Therefore, a major goal prior to the implementation of a molecular tagging study should be to assess the statistical confidence in parentage assignments. Our simulation-based assessment of confidence found that eight microsatellite markers were sufficient to confidently (>95% assignment success of all progeny to their respective mating pairs) discriminate progeny from 569 (314 males and 255 females) contributing broodstock, assuming an overall genotyping error rate of ≤3% and that all broodstock had been genotyped for all loci. Further simulations indicated that 10 loci can be used to perform parentage analysis of broodstocks of at least 10,000 individuals (5,000 males and 5,000 females) with 94% success in matching progeny to their respective parents (again assuming a 3% genotyping error rate and that all broodstock have been genotyped). However, for the latter scenario, the probability that a wild fish would be incorrectly assigned to hatchery parents was high (>60%) unless the mating history was incorporated into the analysis. Knowledge of mating history greatly improved assignment success by reducing the number of possible parent pairs, a finding similar to that of Olsen et al. (2001). Therefore, when working with large numbers of broodstock, it will be imperative to keep accurate records on mating history so that type II errors can be minimized.
The degree of assignment success is a testament to the diverse allelic variation present in these markers and indicates that these markers, when used as a molecular tag, can estimate the demographic parameters essential for monitoring the effects of supportive breeding on the wild population of American shad in the Santee-Cooper basin.
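The role of mating history in narrowing the set of possible parent pairs can be illustrated with simple Mendelian exclusion. The sketch below is not the simulation procedure used in this study; it is a minimal example with hypothetical genotypes at two loci, and the `max_mismatch` parameter is a hypothetical tolerance for genotyping error.

```python
def compatible(offspring, dam, sire):
    """True if one allele could come from the dam and the other from the sire."""
    a, b = offspring
    return (a in dam and b in sire) or (b in dam and a in sire)


def mismatches(offspring, dam, sire):
    """Number of loci at which the trio violates Mendelian inheritance."""
    return sum(not compatible(o, d, s) for o, d, s in zip(offspring, dam, sire))


def candidate_pairs(offspring, dams, sires, allowed_pairs=None, max_mismatch=0):
    """Parent pairs not excluded for this offspring.

    allowed_pairs -- recorded crosses (mating history); if None, every
                     dam-by-sire combination is considered.
    """
    if allowed_pairs is None:
        allowed_pairs = [(d, s) for d in dams for s in sires]
    return [(d, s) for d, s in allowed_pairs
            if mismatches(offspring, dams[d], sires[s]) <= max_mismatch]


# Hypothetical genotypes: one (allele, allele) tuple per locus.
dams = {"F1": [(1, 2), (5, 6)], "F2": [(1, 1), (6, 6)]}
sires = {"M1": [(3, 4), (7, 8)], "M2": [(3, 5), (7, 7)]}
offspring = [(2, 3), (5, 7)]

unrestricted = candidate_pairs(offspring, dams, sires)
with_history = candidate_pairs(offspring, dams, sires,
                               allowed_pairs=[("F1", "M1"), ("F2", "M2")])
print(unrestricted)   # two pairs remain possible
print(with_history)   # a single pair remains
```

Without mating records, two parent pairs cannot be excluded in this example; restricting the search to the recorded crosses leaves only one. Exclusion alone does not quantify assignment confidence, which is why a simulation-based assessment is still needed.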
In conclusion, the importance of genetic variation as a basis for future biological evolution and the long-term viability of populations, species, and ecosystems is well established (Frankel and Soule 1981; Frankham 1995). Therefore, identifying and monitoring processes that are likely to have adverse impacts on the conservation of natural populations are becoming increasingly important. Unfortunately, most monitoring programs do not take full advantage of the potential afforded by molecular genetic markers (Schwartz et al. 2007; Laikre 2010). The genetic data collected in this study will serve as a point of reference in the ongoing effort to monitor temporal changes in the population-genetic metrics and other population data generated at regular intervals as part of the Santee-Cooper American shad restoration initiative. In this way, these data will provide an understanding of the effectiveness (both in terms of increasing the census size and in terms of maintaining the long-term viability of the population) of hatchery augmentation for American shad in the Santee-Cooper basin.