May 8, 2022

ArmCare FX—The Latest Advancement in Baseball Performance Science

Strength in Numbers #44

For 20 years, I have been reading research to advance throwing and hitting performance and to be honest…

I had been wasting my time!

Like most people communicating findings from studies, I put a lot of stock in “statistically significant” data, thinking it was “practically significant” data.

About five years ago, I changed my tune and started reporting the most critical metric in science in my publications, as I want my work to be relevant, meaningful, and applicable on a large scale.  

This metric is called the Effect Size measure (ES), which can be calculated in several ways from two samples’ means, standard deviations, and sample sizes.
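One common version of the ES is Cohen’s d: the difference between the two means divided by the pooled standard deviation. Here is a minimal sketch of that calculation. The function name `cohens_d` and the sample numbers are made up for illustration and do not come from any study.

```python
import math
from statistics import mean, stdev


def cohens_d(sample_a, sample_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(sample_a), len(sample_b)
    sd_a, sd_b = stdev(sample_a), stdev(sample_b)  # sample SDs (n - 1 denominator)
    # Pool the two variances, weighting each by its degrees of freedom
    pooled_sd = math.sqrt(
        ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    )
    return (mean(sample_a) - mean(sample_b)) / pooled_sd


# Hypothetical measurements, for illustration only
before = [50.0, 52.0, 54.0, 56.0, 58.0]
after = [52.0, 54.0, 56.0, 58.0, 60.0]
print(round(cohens_d(after, before), 2))
```

Because the result is expressed in standard deviation units, it can be compared across studies that measured different things in different units.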


In a nutshell, the ES calculation puts all studies on an even playing field because it normalizes the distance between two means. With a lot of people in an investigation, the chances of a significant finding are high, because even a tiny difference becomes detectable. When we say statistical significance, it means that hypothesis testing for a difference between two means produced a probability value (p-value) of less than 5%.

Essentially, that means there was a 5% chance or less of seeing a difference that large if the intervention truly did nothing. In plain English, a result like this would rarely show up as a random occurrence, so we accept it as real while knowing that a false positive (a Type I Error) is still possible.

I always find it interesting when studies produce p-values just above that cutoff, say a p-value of 10% (p = 0.10). That means a difference that large would turn up only about 10% of the time if the intervention did nothing. To me, that’s pretty good, but in the research world, a study reporting a p-value over 0.05 is essentially treated as showing no actual differences related to the intervention.
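To see why a big group makes significance so easy to reach, here is a rough sketch that holds a small effect (ES of 0.2) fixed and only changes the number of athletes per group. It assumes two independent, equal-size groups and uses a normal approximation instead of a full t-test. The function name `approx_p_value` and the group sizes are hypothetical.

```python
import math


def approx_p_value(effect_size, n_per_group):
    """Two-sided p-value for a standardized mean difference between two
    equal-size independent groups, using a normal (z) approximation."""
    # The standard error of the difference shrinks as the groups grow
    z = effect_size * math.sqrt(n_per_group / 2)
    return math.erfc(abs(z) / math.sqrt(2))


# Same small effect, different (hypothetical) group sizes
for n in (20, 50, 150, 500):
    print(n, round(approx_p_value(0.2, n), 3))
```

The effect never changes in this sketch, yet the p-value keeps dropping as the groups get larger and falls under 0.05 only at the largest size. That is why a p-value alone cannot tell you how big an effect is.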

So now that we have statistical significance defined, let’s get back to ES, which is expressed at times as the Cohen’s d statistic, or d statistic. Once you identify statistically significant data (a p-value of less than 5%, or 0.05), it’s time to see if the findings are meaningful in reality.

Sports medicine has always used ES calculations to determine the impacts of treatment, and sports science must also focus on this measure. In fact, if you try to publish in the Journal of Strength and Conditioning Research, your paper will not be accepted without reporting your ES values.  

This research requirement is excellent news, as we cannot rely on statistically significant data with minimal effects, and unfortunately, we are professing those findings throughout baseball.


I love ES calculations because they’re on a scale that mirrors scouting. Scouts use a 20-80 scale, with 50 being average. The effect size scale goes from 0.2 (minimal effects) to 0.8 (large effects). Some studies will show calculations above 0.80, which is great to see; those are the hall of fame studies.

That means the normalized distance between means is considerable. As a result, the difference caused by an intervention translates well to day-to-day coaching, as the outcomes could be advantageous for most athletes.
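The scouting comparison can be sketched as a simple conversion. This is only one possible mapping, assuming the grade is the ES multiplied by 100 and capped at both ends of the 20-80 range. The function name `es_to_scouting_grade` is hypothetical, and treating anything below 0.2 as a 20 is my assumption for the sketch.

```python
def es_to_scouting_grade(effect_size):
    """Map an effect size onto a 20-80 style grade (assumed mapping:
    0.2 -> 20, 0.5 -> 50, 0.8 and above -> 80)."""
    grade = round(abs(effect_size) * 100)
    # Cap the grade at both ends of the scouting scale
    return max(20, min(80, grade))


for es in (0.2, 0.5, 0.8, 1.1):
    print(es, es_to_scouting_grade(es))
```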

To give you a real-world example of how this works and what it means, I’ll take you back to a scientific exploration I had with the Angels.  

We wanted to compare the degree of post-pitching fatigue shown by grip strength measured in a 90-90 position versus with the elbow bent at 90 degrees and the arm by the side. Our goal was to determine which grip strength measure was more sensitive to strength loss.

We had 150 pitchers, so I expected statistically significant differences because the group was quite large. We confirmed a significant difference between the two grip strength tests with a mean difference of 4 lbs.  

That number seems small, but when you do the effect calculation, dividing the difference by the pooled sample standard deviation gave an ES over 1.0.

That means the effect was enormous, and therefore, we stuck with the 90-90 test as it was statistically and practically more sensitive to fatigue. You will also notice that this test is part of our ArmCare exams, though we have since improved it by using a three-finger pinch grip to isolate the FDS muscle of the forearm, given its relationship to protecting the UCL.

On the flip side, we did another training intervention to look at velocity potential with a change in squatting protocols. The group went from 92.6 mph to 93 mph on average, which was statistically significant.  

That looks to be a big change to the human eye, right?  

It wasn’t, as it was only a 0.4 mph difference. When we calculated ES, the value was just a tick over 0.20. So in scouting language, this study didn’t seem MLB worthy, and we went in a different direction with our bilateral training and saw much larger effects resulting from that pivot.
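Working backward from the two examples gives a feel for the numbers. This sketch assumes both ES values were calculated as the mean difference divided by the pooled standard deviation, and it uses the rounded figures quoted here (ES of 1.0 and 0.2) rather than the exact study values. The function name `implied_pooled_sd` is hypothetical.

```python
def implied_pooled_sd(mean_difference, effect_size):
    """Back out the pooled SD implied by a mean difference and its ES,
    assuming ES = mean difference / pooled SD."""
    return abs(mean_difference) / abs(effect_size)


# Grip test: 4 lb difference with an ES over 1.0, so the pooled SD
# would have been below this value
print(implied_pooled_sd(4.0, 1.0))

# Squat protocol: 0.4 mph difference with an ES just over 0.20, so the
# pooled SD would have been just below this value
print(implied_pooled_sd(0.4, 0.2))
```

Under those assumptions, the 4 lb gap was larger than the typical spread between pitchers on the grip test, while the 0.4 mph gain was only a fraction of the typical spread in velocity.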


We are rolling out our first ArmCare FX video in this edition of Strength in Numbers.  

FX is just a shorter way of saying “effects,” as we will communicate the strength of studies on a 20-80 scale for ES or 0.2-0.8+ to give you better insights on how applicable the findings are to your coaching practice.  

What I love about these videos is that they are 90-180 seconds in length.

With more baseball research coming out by the minute, we need to be skeptical, as in the words often attributed to Alexander Hamilton, “If you stand for nothing, you will fall for everything.”

I’m excited to help guide you in making better player development decisions based on evidence-based research. Tune in for more!