The Effect of Sample Size on Joint Test Results

Many clients will ask that a test be conducted with a specific sample size, though that quantity varies widely from company to company for the same test of the same application. Other clients will ask for a recommendation or will base the test population on the number of parts that can be budgeted or are available. As with much of engineering, this problem is influenced by factors that have no direct bearing on what is “technically” the right answer. For example, the sample size required to estimate a test result to a particular level of accuracy is not influenced by the cost of the parts that would be destroyed in the testing, but it would be unrealistic not to consider this factor.

But company-specific practices can’t be considered in a general discussion of the effect of sample size on test results, so we are left with statistical methods. However, we have never had a client state, “We would like to base our torque-angle to failure test on a sample size that will give us a 95% level of confidence that the mean yield torque you report from this test is within 1 N-m of the actual value of the entire population.” Yet these are the parameters that need to be specified to establish a sample size, as shown in Equation 1.

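Equation 1

$$ n = \left( \frac{z \, \sigma}{E} \right)^{2} $$

where n is the sample size, z is the standard normal value for the chosen level of confidence (about 1.96 for 95%), σ is the population standard deviation, and E is the acceptable error in the estimated mean.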

Before you readers with “6σ” tattooed on your scalps (we know who you are) start to warm up to this article, there are some good reasons why we have never had to perform this calculation in association with a test request. The most important is that for this equation to be valid, the following must be true:

  • The population is normally distributed
  • Sampling is done randomly
  • The population standard deviation (σ) is known.
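As a rough illustration of what the calculation involves when all three conditions hold, the sketch below assumes the standard normal-based formula n = (z·σ/E)², rounded up to the next whole part. The function name and the example value of σ = 2.0 N-m are hypothetical and are not taken from any of our tests.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, error, confidence=0.95):
    """Smallest n that puts the sample mean within +/- error of the
    population mean at the given confidence level.

    Assumes a normal population, random sampling, and a known sigma.
    """
    if sigma <= 0 or error <= 0:
        raise ValueError("sigma and error must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / error) ** 2)


# Hypothetical request: 95% confidence, mean within 1 N-m,
# with sigma assumed (not measured) to be 2.0 N-m.
print(sample_size_for_mean(sigma=2.0, error=1.0, confidence=0.95))
```

Note that σ appears squared, so doubling the assumed standard deviation quadruples the number of parts required for the same acceptable error.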

The first two items are generally not major barriers as long as the population from which the samples are chosen includes both versions of handed components or a representative population of multiple tool cavities or dies. However, the standard deviation is a bigger problem. The sample size calculation must be made before the test is performed, but for the vast majority of tests we conduct, the client will not know the population standard deviation. The test we conduct is intended to provide that measure of variation. It is generally accepted that for a sample size of 30 or more the sample standard deviation (s) can be substituted for the population standard deviation, but this doesn’t help matters because in most cases no test has been performed on the components being tested, so no statistically based estimate of standard deviation is available. Another statistical method, the Student’s t-test, allows estimation of process parameters without knowing either the population or sample standard deviation; however, this technique indirectly requires knowledge of the sample size, so we are no further along.

We thought a practical way to look at this question was to put it in terms closest to the test requester’s viewpoint. He wants to use the fewest parts possible, so effectively he is asking, “What am I gaining for each additional set of parts I scrap?” To answer this question, we reviewed our recent test reports for those with a greater-than-normal sample size. We found seven, whose results we will now examine. They are summarized in Table 1.

Table 1 – Subject Joint Test Summary

Joint      Fastener   # per   Sample size   Component   Failure mode   Application
Joint #1   3/8″       4       24            Casting     Fracture       Auto – Steering
Joint #2   3/8″       4       24            Casting     Fracture       Auto – Steering
Joint #3   M8 TF      2       20            Weld Nut    Strip          Auto – Body
Joint #4   M4.2 TF    4       92            Molded      Strip          Auto – Lighting
Joint #5   M4.2 TF    2       24            Molded      Strip          Auto – Lighting
Joint #6   M5 TF      2       24            Molded      Strip          Auto – Lighting
Joint #7   M5 TF      4       24            Molded      Strip          Auto – Doors

TF = thread-forming

Before examining the statistical implications of the test parameters measured, it might be helpful to define them and explain the test conducted to determine them. Figures 1A and 1B are typical torque-angle traces of an all-metal joint and an all-thermoplastic joint. These traces are generated by tightening the fastener in a joint of interest until the joint fails, typically by fastener fracture or thread strip. These tests are conducted to validate joint robustness and installation torque on new designs and to troubleshoot problems in existing joints. This test was the basis for the data used in this article.

The Seat torque, shown as “b1” in Figure 1A and “a” in Figure 1B, is the point where alignment of the joint components is complete and the joint is tightened in a consistent manner. The minimum installation torque should be above this point. After this point there is some divergence depending on the nature of the joint.

For metal “hard” joints such as that shown in Figure 1A, it is customary (and desirable) for the fastener to yield before the joint components, although in some cases the yield occurs in the threads rather than in the shank of the fastener. In either case this point, defined by deviation from the near-linear portion of the trace that preceded it (“b2” in Figure 1A), is also a point of interest, as in most cases it is desirable to keep the maximum installation torque below it. The maximum or Ultimate torque value of metal joints (“a” in Figure 1A) is also commonly reported as a check on the yield value and an indication of failure mode and ductility.

In thermoplastic joint tests such as that shown in Figure 1B, there is no distinct transition from a linear torque-angle trace to yield because thermoplastic does not have a distinct point where deformation becomes permanent. In most cases the threads will shear out of the boss fairly suddenly at the maximum torque, or Strip value (“b” in Figure 1B). Maximum installation torque is kept under this value.

To provide consistency in the data reported between the metal and thermoplastic joints, the Seat torque and the maximum torque (Ultimate and Strip, respectively) are used to evaluate the impact of sample size on estimating process parameters. It should be pointed out that for both types of joints the Seat torque was picked manually by the test technician, while the Ultimate/Strip torque is selected by default as the maximum value. This distinction provides the opportunity to compare data points that are influenced by judgment with those determined entirely by test conditions.


Figure 1A – Torque-Angle to Failure Trace – Hard Joint




Figure 1B – Torque-Angle to Failure Trace – Soft Joint



Pairs of data graphs for each of the seven joints summarized in Table 1 follow. In the upper graph the mean Seat torque is plotted, while in the lower the mean Strip or Ultimate torque is displayed. The data is presented in a manner that shows the effect of increasing the sample size one assembly at a time. It should be mentioned that this aspect of joint testing, that each fastener is usually one of several in a pattern, complicates the concept of sample size, as the individual tests are not completely identical to and independent of one another. A detailed explanation of the data graphs is provided in Figure 2.
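For readers who want to reproduce this kind of graph from their own data, the following is a minimal sketch of how the running mean and sample standard deviation could be recomputed each time another assembly is added. The torque readings and the function name are hypothetical and are not data from the seven joints.

```python
from statistics import mean, stdev


def running_stats(torques, per_assembly):
    """Return (n, mean, sample std dev) after each added assembly.

    torques is in test order; per_assembly is the number of fasteners
    tested on each assembly.
    """
    rows = []
    for n in range(per_assembly, len(torques) + 1, per_assembly):
        if n < 2:
            continue  # a standard deviation needs at least two readings
        subset = torques[:n]
        rows.append((n, mean(subset), stdev(subset)))
    return rows


# Hypothetical Seat torque readings (N-m), two fasteners per assembly.
seat = [1.21, 1.35, 1.18, 1.40, 1.27, 1.31, 1.12, 1.29]
for n, avg, sd in running_stats(seat, per_assembly=2):
    print(f"n={n:2d}  mean={avg:.3f}  sd={sd:.3f}")
```

Stepping by whole assemblies rather than by single fasteners matches the way parts are actually consumed, though it does not remove the dependence between fasteners in the same pattern.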


Figure 2 – Details of Data Graphs

These graphs present a lot of data, but what does it all mean? The first comment shouldn’t be a surprise: any conclusions drawn from this study can be applied only in a very general manner, because the joints on which it is based are certainly not widely representative. Looking at the trends, it appears that the variation present in these joints prevents estimation of process parameters to a high degree of certainty at the sample sizes of 5 to 20 that are common. Opportunities to quantify this are limited, but one approach is again to relate it directly to test needs.

As mentioned, the most common reason to perform this test is to ensure that the range of acceptable installation torque is above the Seat torque and below the Strip or Yield torque. More specifically, the acceptable range is often defined as the range between the mean Seat torque plus 3 standard deviations and the mean Yield/Strip torque minus 3 standard deviations. Allowing for the fact that we substituted the Ultimate torque for Yield torque to maintain consistency with the molded joints, Figures 3A and 3B show the trend toward reduced error as the sample size increases. Because all but one joint had a sample size of at least 24 (Joint 3 had 20), Figure 3A compares the error in the Seat+3SD value at sample sizes of 4, 8, 12, 16 and 20 relative to a sample size of 24, while Figure 3B does the same for the Strip/Ultimate-3SD value. The error bars on the graphs represent the minimum and maximum error at each point.
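The comparison described here can be sketched in a few lines. The code below is only an illustration under stated assumptions: the function names are hypothetical, the sign convention (positive when the smaller sample overstates the limit) is our choice for the example, and the readings are made up rather than taken from the tested joints.

```python
from statistics import mean, stdev


def seat_plus_3sd(values):
    """Lower edge of the installation window: mean Seat torque + 3 SD."""
    return mean(values) + 3 * stdev(values)


def strip_minus_3sd(values):
    """Upper edge of the installation window: mean Strip torque - 3 SD."""
    return mean(values) - 3 * stdev(values)


def percent_error(limit_fn, values, n, n_ref):
    """Percent difference between a limit computed from the first n
    readings and the same limit computed from the first n_ref readings."""
    reference = limit_fn(values[:n_ref])
    return 100.0 * (limit_fn(values[:n]) - reference) / reference


# Hypothetical Strip torque readings (N-m); the first 4 are compared with all 8.
strip = [4.61, 4.32, 4.75, 4.40, 4.18, 4.66, 4.52, 4.29]
print(f"{percent_error(strip_minus_3sd, strip, n=4, n_ref=8):+.1f}%")
```

With real data, n_ref would be 24 (or the full sample size) and n would step through 4, 8, 12, 16 and 20.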


Figure 3A – Error of Seat Torque Mean Relative to Mean at n=24




Figure 3B – Error of Ult/Strip Torque Mean Relative to Mean at n=24



The very large sample size of Joint #4 shows the risk of assuming that what are considered large sample sizes for joint testing are representative of the population. Table 2 shows the difference between calculating the error of Seat+3SD and Strip/Ult-3SD relative to a sample size of 24, as was done in Figures 3A and 3B, and calculating it relative to the full sample size of 92.

Table 2 – Joint #4 Error Comparison

      Relative to n = 24             Relative to n = 92
n     Seat +3SD   Ult/Strip -3SD     Seat +3SD   Ult/Strip -3SD
4     -15.0%      12.8%              -10.3%      -3.5%
8     -19.9%      17.5%              -15.4%      0.5%
12    -19.6%      16.8%              -15.1%      -0.1%
16    -15.5%      16.4%              -10.7%      -0.4%
20    -9.5%       16.6%              -4.5%       -0.3%

The test requester often has a fairly good idea of what Seat, Strip, Yield or Ultimate torque values might be expected for a particular application. For that reason, Table 3 lists the mean Seat and Ultimate/Strip torques alongside their standard deviations, and Figures 4A and 4B plot one against the other. While it’s not suggested that standard deviation (and therefore sample size) can be accurately estimated from torque parameters, the results were more consistent than expected.

Table 3 – Values at Maximum Sample Size

                        Seat                             Ultimate-Strip
                        Mean     Std Dev   SD/Mean %     Mean      Std Dev   SD/Mean %
Hard Joint   Joint 1    12.502   2.710     21.7%         95.607    3.469     3.6%
             Joint 2    21.183   5.189     24.5%         102.007   5.904     5.8%
             Joint 3    13.713   3.285     24.0%         52.895    2.512     4.7%
             Joint 4    1.631    0.240     14.7%         4.519     0.405     9.0%
Soft Joint   Joint 5    1.279    0.124     9.7%          4.367     0.240     5.5%
             Joint 6    0.940    0.125     13.3%         3.864     0.199     5.1%
             Joint 7    1.141    0.049     4.3%          4.606     0.286     6.2%
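The SD/Mean % columns are simply the standard deviation expressed as a percentage of the mean. As a small sketch, the Seat torque figures can be recomputed from the mean and standard deviation values listed in Table 3; the dictionary layout and function name are ours. Because the inputs are the rounded table entries, a recomputed percentage can occasionally differ from the printed one in the last digit.

```python
# (mean, standard deviation) of Seat torque at maximum sample size,
# copied from Table 3.
seat = {
    "Joint 1": (12.502, 2.710),
    "Joint 2": (21.183, 5.189),
    "Joint 3": (13.713, 3.285),
    "Joint 4": (1.631, 0.240),
    "Joint 5": (1.279, 0.124),
    "Joint 6": (0.940, 0.125),
    "Joint 7": (1.141, 0.049),
}


def sd_over_mean_pct(mean_value, std_dev):
    """Standard deviation as a percentage of the mean."""
    return 100.0 * std_dev / mean_value


for joint, (m, sd) in seat.items():
    print(f"{joint}: {sd_over_mean_pct(m, sd):.1f}%")
```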



Figure 4A – Seat Torque Mean vs. Standard Deviation




Figure 4B – Ult/Strip Torque Mean vs. Standard Deviation



This investigation into the relationship between sample size and joint testing parameters began by stating that the lack of knowledge of process variation (standard deviation) prevents establishing statistically based sampling. In fact this is true only in the extreme. Torque-angle to failure testing reflects the larger product development process that it serves. Most requests for testing are for designs that are incremental improvements on a previous design or are to validate a change in fastener or component supplier. In these cases there is usually previous test data available. The mean and standard deviation for the points of interest calculated in those tests would serve as good estimates for future tests. With that in mind, the following charts have been created for determining sample size based on standard deviation, acceptable error (E) and a 99% confidence interval. That interval was chosen because it best corresponds to the +/- 3 standard deviation range commonly used by our clients in setting process parameters. Because both standard deviation and acceptable error generally increase with the nominal value of the parameter, three charts are provided to allow sample size calculation across a wide range of standard deviation and acceptable error.
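A chart of this kind can be approximated with a short script. The sketch below assumes the normal-based formula n = (z·σ/E)² with the two-sided 99% value of z (about 2.576), rounded up to whole parts. The grid of standard deviations and acceptable errors is hypothetical and chosen only to show the layout.

```python
import math
from statistics import NormalDist

# Two-sided 99% critical value of the standard normal, about 2.576.
Z_99 = NormalDist().inv_cdf(0.995)


def sample_size(sigma, error, z=Z_99):
    """Required n for a given standard deviation and acceptable error."""
    return math.ceil((z * sigma / error) ** 2)


def sample_size_chart(sigmas, errors):
    """One row per standard deviation, one column per acceptable error."""
    return [[sample_size(s, e) for e in errors] for s in sigmas]


# Hypothetical grid for small thread-forming joints (values in N-m).
sigmas = [0.1, 0.2, 0.4]
errors = [0.1, 0.2, 0.4]
print("sigma \\ E " + "".join(f"{e:>6}" for e in errors))
for s, row in zip(sigmas, sample_size_chart(sigmas, errors)):
    print(f"{s:>9} " + "".join(f"{n:>6}" for n in row))
```

In use, the standard deviation from a previous test of a similar joint would be entered as sigma, and the row read across to the error the requester is willing to accept.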







In summary, it is not surprising that torque parameters calculated from sample sizes common to fastener testing do not always predict population behavior to a high degree of accuracy. This is a reality of nearly all aspects of product development testing. Specific to joint testing, the fact that the vast majority of OEMs do not perform any joint testing whatsoever is a far greater issue than the sample size chosen by those that do. However, one point that should be considered is that there is a significant difference in estimation accuracy between the lower and upper ends of the range of 5 to 20 samples common to joint testing. Unfortunately, the joints that are most critical to product integrity tend to carry the greatest component cost and are therefore tested at the lowest sample sizes.

