By Isabel Pantle '23 for QSS Final Research Project
1. Introduction
This project explores the relationship between offensive holding penalties and various game-level, play-level, and player-level variables to understand features that may increase the likelihood of holding penalties.
1.1 Motivation & Research Question
Holding penalties hinder offensive drives in three key ways. First, if there is an offensive holding penalty, the down is replayed, so any yards gained are nullified. Second, the offensive team is penalized 10 yards, or half the distance to the goal when the line of scrimmage is within 20 yards of the offense’s own endzone, which worsens its field position. Third, the yardage from the penalty increases the distance needed for a first down. For example, if a team is called for offensive holding on 1st and 10, the next down is 1st and 20. This research explores the variables associated with holding penalties, guided by the following research question: what features of offensive linemen and defensive pass rushers increase the likelihood of an offensive holding penalty?
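The yardage rule described above can be written out as a short sketch. The function names are hypothetical, and the sketch assumes the penalty is always enforced from the line of scrimmage, which simplifies the actual enforcement rules.

```python
def holding_penalty_yards(yards_from_own_goal: float) -> float:
    """Yards walked off for offensive holding under the rule described above."""
    # Inside the offense's own 20-yard line, the penalty is half the distance to the goal.
    if yards_from_own_goal < 20:
        return yards_from_own_goal / 2
    return 10.0


def apply_offensive_holding(yards_from_own_goal: float, down: int, distance: float):
    """Return the new field position, down, and distance after the penalty.

    Assumption: enforcement from the line of scrimmage; the down is replayed.
    """
    penalty = holding_penalty_yards(yards_from_own_goal)
    return yards_from_own_goal - penalty, down, distance + penalty


# 1st and 10 at the offense's own 40 becomes 1st and 20 at its own 30.
print(apply_offensive_holding(40, 1, 10))
```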
1.2 Football Rules
American football is played in four 15-minute quarters between two teams with 11 players on the field each. The team on offense runs plays toward their opponent’s endzone while the team on defense tries to prevent scores, stop the offensive drive, and force a punt or turnover. Teams alternate playing offense and defense, which switches after a team scores, punts the ball away on 4th down, or turns the ball over on downs or by a fumble or interception.
The offense begins their set of downs with 1st and 10 and has four plays to reach the first down marker. If they do not reach the first down marker on 4th down, the defensive team gains possession at the line of scrimmage and goes on offense. Because of the risk of turning the ball over on downs, teams usually punt the ball to the other team on their 4th down so the opponents have worse field position to begin their drive.
There are seven officials on the field who run and officiate the game. They are responsible for spotting the ball, managing the game clock, and calling penalties. Penalties are intended to prevent certain behavior from players that is unfair or unsafe. The most common penalties in the NFL are holding, pass interference, and pre-snap penalties.
2. Literature Review
There is a great deal of research seeking to understand penalties in football and other sports. The previous research in football, football penalties, penalties in other sports, and rare events modeling is discussed here.
Michael Lopez is the Senior Director of Data and Analytics at the NFL and one of the leaders of the Big Data Bowl. His work provides a foundation for this research. He explores discretionary penalties in his paper, “Persuaded under pressure: evidence from the National Football League.” The paper analyzes pass interference and holding penalties relating to the referee’s proximity to either team’s bench, to see if referees are more likely to call penalties when they are close to a team lobbying for a call. Lopez looked at data from plays which occurred near either sideline and used a logit model to analyze the relationship between sideline proximity, line of scrimmage, and a penalty call.
For offensive holding penalties, Lopez found that the effect of the line of scrimmage varies significantly by proximity to the sideline. Since each team’s personnel is limited to a 36-yard zone near midfield, the line of scrimmage was an important variable. Being closer to the offensive sideline decreases the likelihood of an offensive holding penalty being called, especially when the line of scrimmage is near midfield. This paper is valuable for two reasons. First, it provides good evidence for including the line of scrimmage and sideline proximity in the model. Second, it provides an example of using a logit model to analyze holding penalties.
Kevin Snyder and Michael Lopez model situational, game-level variables with holding penalties in their paper, “Consistency, accuracy, and fairness: a study of discretionary penalties in the NFL.” This paper analyzes situational factors of a football game and their relationships with discretionary penalties. They also analyzed false starts, a penalty with low discretion, as a control to measure differences in penalties committed over the course of the game. This paper creates a logit model for rushing plays and passing plays which accounts for situational factors that have been shown to sway referees.
Snyder and Lopez’s model includes offensive home team indicator, season factor, down, distance, the difference in score as a categorical variable, and line of scrimmage as a categorical variable for either redzone or midfield. They found significant relationships between holding penalties and season, line of scrimmage, the down and distance, an indicator variable if the offense was the home team, and game minute, minute squared, and minute cubed. They determine that officials are less likely to call penalties at the beginning and end of games when accounting for these significant situational factors. There was not a significant relationship between holding penalties and if the game was close or if the offensive team was leading. This paper is invaluable to my research because it shows which situational variables are important to include in the model and which variables did not have a significant relationship. Additionally, they were able to use a logit to measure holding penalties which is the method used in this research.
The next paper is titled, “Identifying changes in the spatial distribution of crime: Evidence from a referee experiment in the National Football League.” Author Carl Kitchens investigates the natural experiment which occurred when the NFL repositioned one of its officials from the middle of the field behind the defense to the middle of the field behind the offense between the 2009-2010 and 2010-2011 seasons. Kitchens found a significant increase in the number of offensive holding penalties called after the change was made, even though the total number of penalties called remained constant.
To see if this was a result of the movement of the official, the author created a linear probability model which includes other situational factors including yards to first down, score differential, score differential squared, if the play was a run or pass, and field position. The model also includes fixed effects for down, quarter, week, team, and officiating crew, as well as an interaction term between teams to account for rivalry games. This is a comprehensive list of variables that will be tested in this model, particularly those which were significant in this author’s research.
Another paper by Michael Lopez titled, “Bigger data, better questions, and a return to fourth down behavior: an introduction to a special issue on tracking data in the National Football League” is the introduction to an issue on NFL tracking data in The Journal of Quantitative Analysis in Sports. It provides some historical background about the evolution of NFL data collection and details about the in-game tracking data, which collects coordinate location for each player ten times a second and can be used to calculate speed, acceleration, and angle off the line of scrimmage. This is the data that is provided by the NFL for the Big Data Bowl.
Anthony Anzell et al. discuss body composition changes in college and professional football players in their paper, “Changes in height, body weight, and body composition in American football players from 1942 to 2011.” Specifically, they examine changes in offensive and defensive linemen. The authors find a significant increase in weight among both college and professional offensive and defensive linemen. This provides evidence for the importance of weight at these positions, so weight will be included as a factor in this model. This article focuses on the changes in football over the years, which suggests that season should be included in the model if the data spans multiple years. The paper also found no significant change in height, so this variable will be tested but is likely not essential. That being said, players who are much shorter than their opponents may be at a disadvantage, so a binned approach for height may be applicable.
Bert Jacobson et al. explore the technique of a power block and different strength exercises in their paper titled “Relationship between selected strength and power assessments to peak and average velocity of the drive block in offensive line play.” In this paper, the peak and average velocity of offensive linemen’s power blocks against a blocking dummy were measured and compared with each player’s personal records for squats, power cleans, and vertical jump, as well as their weight and body composition. Blocking velocity was positively correlated with squats, power cleans, and vertical jump, and negatively correlated with body fat percentage. This is useful for translating results from this project back into actionable insights for teams.
Moving into research from other sports, Michael Lopez and Kevin Snyder coauthor the paper, “Biased Impartiality Among National Hockey League Referees.” This paper investigates reasons why a referee may be more likely to call a penalty on one team over the other. The researchers are interested in power plays, when one team has more players than the other because of previous penalties. They create different models for penalties in the second and third periods. The model for each period’s penalties includes independent variables measuring penalty count and score at the end of the previous period, as well as situational factors regarding the home team, attendance level, and fixed effects for the team. They use a Poisson distribution and Pearson’s chi-square test to measure the fit of this model. Finally, they present the findings of various models with significance levels reported. This paper is useful because hockey penalties are rarer than penalties in other sports, so their methods may apply to this research on offensive holding in football.
Jason Abrevaya and Robert McCulloch explore the impact of previous penalties on future penalties in their paper titled, “Reversal of fortune: a statistical analysis of penalty calls in the National Hockey League.” This paper discusses and analyzes the reversal bias among referees and penalties in the NHL. Abrevaya and McCulloch found that even when controlling for situational factors such as the score, time left, time since the last penalty call, home team, and size of referee crew, the next penalty is less likely to be called on the team most recently penalized.
In analyzing the sequence of penalties in a game, nearly two-thirds of the first two penalties were called on opposite teams. Additionally, the number of penalties in each period of play was different with fewer penalties called in the last period. This is likely because referees are hesitant to have a game-deciding penalty near the end of the game. This paper also includes a discussion of potential modeling strategies, using a linear logit, decision tree, random forests, boosting, and Bayesian Additive Regression Trees.
This phenomenon, known as the ‘reversal effect,’ has also been found in soccer. Henning Plessner and Tilmann Betsch conduct an experiment about successive bias in penalty calling in their paper, “Sequential effects in important referee decisions: The case of penalties in soccer.” In the experiment, they showed participants multiple potential fouls in soccer matches and measured if participants were less likely to call penalties on teams they had already called penalties on, and more likely for teams that had not received penalties.
They found a negative correlation between successive penalty decisions and previously issued penalties for that team, and a positive correlation between initial penalties issued to one team and later penalties issued to the opposing team. This research suggests that people are not unbiased judges of reality and tend to call discretionary penalties with a subconscious bias towards fairness or a tendency to even out penalty counts. This was the earliest example of this phenomenon, and it is useful because the authors tested it in an experiment.
Lastly, research in rare event modeling is applicable to this research. Gary King and Langche Zeng discuss modeling issues with a logit model when studying rare events and propose several possible methods of resolving these problems in their paper titled “Logistic Regression in Rare Events Data.” The authors discuss the matter in the context of rare diseases, but this is applicable in this research because offensive holding penalties occur on fewer than 2% of plays. There are at least five players who could receive a penalty on each play, and the level of analysis in this project is each player, so holding penalties become even rarer.
They discuss different methods for sampling in which all known occurrences of the phenomenon of interest are collected as well as about five times as many randomly selected instances without the phenomenon. They also discuss adjustments made to the interpretations of model coefficients if this sampling method is used by scaling based on the overall rate of occurrence. This paper is valuable because it clearly defines an alteration of my proposed method to account for the small occurrence of holding penalties in the sample should a standard logistic regression fail.
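The coefficient adjustment mentioned above can be illustrated with the prior correction from King and Zeng, in which only the intercept is shifted and the slope coefficients are left unchanged. The sketch below is illustrative rather than part of this project's analysis: the function name is hypothetical, and the population rate is taken from the penalty counts reported in the Results section.

```python
import math


def prior_corrected_intercept(b0_sample: float, tau: float, ybar: float) -> float:
    """Shift a logit intercept estimated on a sample enriched with events.

    tau  : share of events in the full population
    ybar : share of events in the sample used for estimation
    """
    return b0_sample - math.log(((1 - tau) / tau) * (ybar / (1 - ybar)))


def logit(p: float) -> float:
    return math.log(p / (1 - p))


# Assumed example: keep all 147 penalties and five times as many non-penalties.
tau = 147 / 44443          # population rate of holding penalties per blocking pair
ybar = 147 / (147 + 5 * 147)  # sample rate after selecting on the outcome (1/6)

# An intercept-only model fit to the sample would estimate logit(ybar);
# the correction recovers the population log-odds logit(tau).
corrected = prior_corrected_intercept(logit(ybar), tau, ybar)
print(round(corrected, 3), round(logit(tau), 3))
```

In the intercept-only case the corrected value equals the population log-odds exactly, which is a convenient sanity check on the formula.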
3. Data
This project is part of the 2023 Big Data Bowl, which is an NFL-sponsored analytics contest. The NFL provides data for amateur and professional data scientists to explore a different theme each year, and this year’s contest focuses on offensive linemen on passing plays. The NFL provided multiple data sets, including game data, play data, and player data. There was also a data set from Pro Football Focus which provided scouting information for each player in each play. The NFL provided tracking data from Week One to Week Eight of the 2021 season.
The tracking data includes positional information for every player on the field and the football measured ten frames a second. The location of players is provided with x and y coordinates, as well as the direction the player is facing and what direction they are moving. Frames are tagged with any event details, such as the ball snap, pass release, or pass catch. Documentation of the data is provided by the website Kaggle, which facilitates the competition. Another data set found on Kaggle was used in this project. It has conference and division information for every NFL team which is used to determine if the game is between teams in the same conference or division.
4. Methods
This research models the relationship between offensive holding penalties and variables about the game, play, and offensive and defensive players involved in a block. The dependent variable for this research is an offensive holding penalty on an offensive lineman. Since offensive holding penalties can be called on any offensive player, the level of analysis in this research is a blocking pair. A blocking pair is defined as an offensive player defending the pass rush and the defensive pass rusher they are blocking, which is identified by the Pro Football Focus data set.
Analyzing blocking pairs means that there will be multiple observations for each play. Double-teams, which is when two or more offensive players are blocking the same defensive pass rusher, count as two observations. For double-teams, each observation has different offensive player information and the same defensive player information.
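As a rough sketch of how blocking pairs and double-teams could be derived from scouting rows, the example below uses plain Python. The field names (`play_id`, `blocker_id`, `blocked_id`) are hypothetical stand-ins, not the actual column names in the Pro Football Focus data set.

```python
from collections import Counter


def build_blocking_pairs(scouting_rows):
    """Turn per-player scouting rows into one observation per blocking pair.

    Each row is assumed to name the blocker and, if any, the defender blocked.
    """
    pairs = [
        (row["play_id"], row["blocker_id"], row["blocked_id"])
        for row in scouting_rows
        if row.get("blocked_id") is not None
    ]
    # Count how many blockers are assigned to each defender on each play.
    blockers_per_defender = Counter((play, defender) for play, _, defender in pairs)
    return [
        {
            "play_id": play,
            "blocker_id": blocker,
            "defender_id": defender,
            # A double-team yields two observations sharing the same defender.
            "double_team": int(blockers_per_defender[(play, defender)] >= 2),
        }
        for play, blocker, defender in pairs
    ]


rows = [
    {"play_id": 1, "blocker_id": "LT", "blocked_id": "EDGE1"},
    {"play_id": 1, "blocker_id": "LG", "blocked_id": "DT1"},
    {"play_id": 1, "blocker_id": "C", "blocked_id": "DT1"},
    {"play_id": 1, "blocker_id": "RB", "blocked_id": None},
]
for pair in build_blocking_pairs(rows):
    print(pair)
```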
The modeling strategy for this project is a logistic regression because it is modeling a binary dependent variable of a holding penalty called on the blocking pair or not. The guiding theory for this project is that offensive linemen are more likely to commit holding penalties when the defensive player they are blocking gains an advantage in the play, and the offensive player has to deploy an illegal hold to prevent a sack or pressure on the quarterback. Therefore, variables will quantify ways that a defensive pass rusher can gain an advantage on the offensive lineman blocking them.
4.1 Variables
Game-level variables included in the model are indicator variables if the game is between teams in the same conference or the same division. Play-level variables included are quarter, down, distance, difference between the offensive and defensive player’s acceleration at the snap, the time between snap and throw, and fixed effects for the offensive team and the defensive team. There are also indicator variables for a double-team on the defensive player, if the offensive team is trailing, and if the home team is on offense. Based on the notion that the key factor to a holding penalty is an advantage between an offensive and defensive player, I use the difference in the offensive and defensive players’ log weights. The log is taken of player weights to bring in extreme values. Therefore, player-level variables include the difference in log weights and the difference in log height, as well as the offensive and defensive players’ positions.
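The log differential can be sketched in a few lines. This assumes the difference is taken as offensive minus defensive, which the text does not state explicitly, and the weights used are made-up values for illustration.

```python
import math


def log_differential(offense_value: float, defense_value: float) -> float:
    """Difference in natural logs, assumed here to be offense minus defense.

    Equivalent to the log of the ratio, so it measures relative size
    and compresses extreme values.
    """
    return math.log(offense_value) - math.log(defense_value)


# Illustrative weights in pounds (not taken from the data).
print(log_differential(315, 250))  # positive: the blocker is heavier
print(log_differential(300, 300))  # zero: evenly matched
```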
Cumulative penalty counts, difference in penalty counts, the team most recently penalized, and time since the last penalty were calculated, but since the data only includes passing plays, they have no meaningful interpretation so are not included in the final model.
An additional variable measuring defensive leverage is calculated using the tracking data. To do this, the coordinates of the offensive and defensive players in each blocking pair are compared to determine if the offensive player is between the defensive player and the ball. If the defensive player has an unimpeded line to the ball, the defensive player is registered as being outside of their block, and the variable equals 1 for that frame. This variable evaluates to 0 if the offensive player is between the ball and the defensive player, or within a shoulder width of the line between the defensive player and the ball. Examples of the variable are shown in Figure 1. It is calculated on every frame between the snap and the pass.
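One way to express the per-frame check described above is as a point-to-segment distance test. The sketch below is an interpretation, not the project's actual code: the function names are hypothetical, and the shoulder width of 0.5 yards is an assumed value.

```python
import math


def point_to_segment_distance(p, a, b):
    """Shortest distance from point p to the segment from a to b (2D coordinates)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the line, then clamp so the closest point stays on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def defender_outside_block(blocker, defender, ball, shoulder_width=0.5):
    """Return 1 if the defender has an unimpeded line to the ball in this frame.

    shoulder_width is in yards and is an assumed threshold.
    """
    gap = point_to_segment_distance(blocker, defender, ball)
    return int(gap > shoulder_width)


# Blocker sits on the defender's path to the ball: inside the block.
print(defender_outside_block(blocker=(5, 0.2), defender=(0, 0), ball=(10, 0)))
# Blocker is three yards off that path: defender is outside the block.
print(defender_outside_block(blocker=(5, 3), defender=(0, 0), ball=(10, 0)))
```

Clamping the projection to the segment means a blocker who is behind the defender, rather than between the defender and the ball, is not counted as impeding the rush.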
5. Results
Though holding penalties seem to be very common to the casual fan, they are actually rare in the data. In the first eight weeks of the 2021 season, there are 147 holding penalties in the data set of 8,558 plays, or about 1.7% of plays. Offensive holding penalties are the second most common penalty in the data set, outnumbered only by defensive pass interference penalties with 150. Since the level of analysis is a blocking pair, with multiple observations in each play, there are 147 holding penalties in 44,443 observations, which means that offensive holding penalties occur in approximately 0.33% of observations. Because the event is this rare, baseline predicted probabilities are very small, so even significant variables shift the probability of a holding penalty by small absolute amounts.
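The rates quoted above follow directly from the reported counts, as the short calculation below shows. The baseline log-odds line is an added illustration of how rare the event is on the logit scale and is not a figure reported by the models.

```python
import math

holding_penalties = 147
plays = 8558
blocking_pairs = 44443


def percent(count: int, total: int) -> float:
    return 100 * count / total


rate_per_play = percent(holding_penalties, plays)            # about 1.72
rate_per_pair = percent(holding_penalties, blocking_pairs)   # about 0.33
pairs_per_play = blocking_pairs / plays                      # about 5.19

# Log-odds corresponding to the per-pair rate (illustrative only).
p = holding_penalties / blocking_pairs
baseline_log_odds = math.log(p / (1 - p))

print(round(rate_per_play, 2), round(rate_per_pair, 2), round(pairs_per_play, 2))
print(round(baseline_log_odds, 2))
```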
The first variable investigated is offensive and defensive player weight because I suspect that heavier offensive linemen will be more difficult for defensive players to move and therefore defensive players would be less likely to gain an advantage or apply pressure on the quarterback.
To investigate the relationship between weight differential and offensive holding, I compared the mean log weight differential of blocking pairs with holding penalties and without holding penalties. In blocking pairs that are called for an offensive holding penalty, offensive players have a significantly higher mean log weight than the defensive players they block. Differences in log weights for pairs with and without offensive holding penalties are shown in Figure 2.
In Model 1, log weight differential is the only variable in a logistic regression with offensive holding penalty as the dependent variable. A higher weight differential is associated with a higher likelihood of offensive holding penalties. Summaries of Models 1, 2, and 3 are on page 14 in Table 2.
This result is unexpected, so other variables are explored to better understand the true relationship between weight and holding penalties. Player position is analyzed next. The blocking match-ups by position are shown in Table 1, with the count of the position match-up with and without holding penalties. In the sample, the most frequent block pairing is an offensive tackle versus a defensive edge rusher. This blocking pair also has one of the highest ratios of offensive holding penalties and the highest weight differential; therefore, when players’ positions are not included in the model, weight is confounded by position. The tackle is the outermost offensive lineman and the edge rusher is usually the outermost defensive pass rusher, so their positions are more likely to be the true cause of this high holding penalty ratio. When offensive and defensive positions are included in Model 2 in addition to weight and height, weight differential is no longer significant, and instead holding penalties are more likely when the offensive player in the block is a tackle.
Model 3 includes game-level, play-level, and player-level variables. Game-level variables are indicator variables for a conference or divisional game. Play-level variables are fixed effects for the offensive and defensive teams, quarter, down and distance, time between snap and throw, if the possession team is trailing, and if the home team is on offense. Player-level variables are log weight differential, log height differential, offensive and defensive player position, difference in acceleration at the snap, if the defender is double-teamed, and if the defender is outside of the block in any frame between the snap and throw.
6. Discussion
Five variables were significant at p < 0.15, which are explained here. First, the difference in acceleration at the snap has a negative association with offensive holding penalties: when the defensive player is faster off of the snap, offensive holding penalties are more likely. Acceleration off of the snap is emphasized as an important variable by NFL announcers and coaches, and its relationship is supported by the data even when accounting for many other important variables in the model. This relationship is consistent with the guiding hypothesis that offensive holding penalties occur when defensive players have gained an advantage over the offensive player who is blocking them. If the defensive player’s acceleration is greater than that of the offensive player blocking them, the defender is more likely to beat their blocker and draw a penalty.
The time between the snap and throw was also significant and had a positive relationship with offensive holding penalties. Offensive holding penalties are more likely when the passer is holding the ball for a longer time. The interaction term between down and distance was also significant, indicating that when it is a late down with a long distance to the first down marker, holding penalties are more likely.
Since the time between the snap and throw is already accounted for in the model, any additional time that the quarterback may have to spend in the pocket for the wide receivers to get downfield on long yardage downs is not why down and distance is significant. More exploration into additional variables is needed to understand why higher yards to go on later downs have more holding penalties. Some hypothesized explanations are the quarterback scrambling to avoid a sack or the offensive line resorting to holding to prevent a sack on crucial downs.
The indicator variable representing if the defensive player is double-teamed was significant and had a negative relationship with offensive holding penalties. When a defender is double-teamed, holding penalties are less likely to be called on either offensive player who is blocking them. Presumably, it is more difficult for the defensive player to apply pressure or beat the offensive line when multiple players are blocking them.
There were also five team fixed effects that were significant. Three out of the four teams which were significantly more likely to have offensive holding penalties were in the top 10 most penalized teams. Pittsburgh was the only team significantly more likely to draw holding penalties. Pittsburgh’s defensive personnel includes T.J. Watt, the best pass rusher last season per NFL.com, and Cameron Heyward, an all-pro defensive tackle. Both players were named in the NFL’s Top 100 Players of 2022, T.J. Watt at 6 and Cameron Heyward at 42. This suggests that team fixed effects are potentially confounded by player rank, which should be included as a variable in future models.
Lastly, the variable measuring if the defensive player is outside the block was significant and had a positive relationship with holding penalties. This variable evaluates to 0 or 1 every frame. It was first tested as a percentage of frames that the defensive player is outside of the block while the quarterback has the ball. As a percentage, it was not significant. It was then tested as an indicator variable, measuring if the defender was outside of the block in any frame between the snap and throw. It had a significant relationship with offensive holding penalties, so if the defensive player is outside of the offensive player’s block for one-tenth of a second while the quarterback still has the ball, the offensive block is more likely to be called for a holding penalty.
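The two play-level summaries of the per-frame variable can be sketched as follows. The function name is hypothetical, and the input is assumed to be the list of 0/1 frame values between the snap and the throw for one blocking pair.

```python
def summarize_outside_frames(frame_flags):
    """Summarize per-frame 0/1 flags for one blocking pair on one play.

    Returns (share_of_frames_outside, outside_in_any_frame).
    """
    if not frame_flags:
        return 0.0, 0
    share = sum(frame_flags) / len(frame_flags)
    any_frame = int(any(frame_flags))
    return share, any_frame


# A defender who is outside the block for a single tenth-of-a-second frame
# has a low share of frames outside but still triggers the indicator.
print(summarize_outside_frames([0, 0, 1, 0]))
```

The example shows why the two versions can behave differently in a model: a brief loss of leverage barely moves the share of frames but flips the indicator from 0 to 1.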
6.1 Limitations
The most significant limitation of this research is that the Big Data Bowl only provides data from passing plays and includes no special teams or rushing plays. This limits the application of results because these findings can only be applied to passing plays. Additionally, though penalty variables were calculated, they have no meaningful interpretation since only penalties on passing plays are recorded. Further research on this project will involve penalty analysis by adding rushing plays to the data set.
7. Conclusion
In this research, I modeled offensive holding penalties as a function of game-level, play-level, and player-level variables. After accounting for game and play variables, holding penalties are associated with player-level variables. While weight was expected to be significant, once position was included in the model it was no longer significant. Time between snap and throw, if the defender was outside of the block, and if the defender was double-teamed had significant relationships with offensive holding penalties. Difference in acceleration and an interaction between down and distance were also significant with slightly higher p-values. This research also provides a new way to use the NFL tracking data by quantifying outside leverage of the defensive pass rusher, which is a fairly subjective metric.
These results can be applied to coaching players and managing game situations. For offensive linemen, it is important to keep a defender within the block and be quick off of the snap. For defensive linemen, getting outside of the block is important, as is speed off of the snap. This research provides a framework for better understanding what increases the likelihood of holding penalties, which can improve how players play, how coaches coach, and how referees officiate.
8. Works Cited
Abrevaya, J. & McCulloch, R. (2014). Reversal of fortune: a statistical analysis of penalty calls in the National Hockey League. Journal of Quantitative Analysis in Sports, 10(2), 207-224.
Anzell, A. R., Potteiger, J. A., Kraemer, W. J., & Otieno, S. (2013). Changes in height, body weight, and body composition in American football players from 1942 to 2011. Journal of Strength and Conditioning Research, 27(2), 277–284.
Hlavac, Marek (2022). Stargazer: Well-Formatted Regression and Summary Statistics Tables. R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
Jacobson, B. H., Conchola, E. C., Smith, D. B., Akehi, K., & Glass, R. G. (2016). Relationship between selected strength and power assessments to peak and average velocity of the drive block in offensive line play. Journal of Strength and Conditioning Research, 30(8), 2202–2205.
King, G., & Zeng, L. (2001). Logistic Regression in Rare Events Data. Political Analysis, 9(2), 137-163.
Kitchens, C. (2013). Identifying changes in the spatial distribution of crime: Evidence from a referee experiment in the National Football League. Economic Inquiry, 52(1), 259–268.
Lopez, M.J. (2016). Persuaded under pressure: evidence from the National Football League. Economic Inquiry, 54, 1763-1773.
Lopez, M.J. (2020). Bigger data, better questions, and a return to fourth down behavior: an introduction to a special issue on tracking data in the National Football League. Journal of Quantitative Analysis in Sports, 16(2), 73-79.
Lopez, M.J., & Snyder, K. (2013). Biased Impartiality Among National Hockey League Referees. International Journal of Sport Finance, 8(3), 208-223.
McGinest, Willie. “Top 10 NFL Edge Rushers in 2022: T.J. Watt, Myles Garrett Lead Deep Collection of Stars.” NFL.com, NFL, 15 Aug. 2022, https://www.nfl.com/news/top-10-nfl-edge-rushers-in-2022-t-j-watt-myles-garrett-lead-deep-collection-of-s.
National Football League. (2022). NFL Big Data Bowl 2023. Retrieved October 13, 2022 from https://www.kaggle.com/competitions/nfl-big-data-bowl-2023
Plessner, H., & Betsch, T. (2001). Sequential effects in important referee decisions: The case of penalties in soccer. Journal of Sport and Exercise Psychology, 23(3), 254–259.
Snyder, K. & Lopez, M. (2015). Consistency, accuracy, and fairness: a study of discretionary penalties in the NFL. Journal of Quantitative Analysis in Sports, 11(4), 219-230.
Varley, Teresa. “Heyward Ranked No. 42 in NFL’s Top 100.” Steelers Home, 14 Sept. 2022, https://www.steelers.com/news/heyward-ranked-no-42-in-nfl-s-top-100.