Abstract
Introduction: The open field test (OFT) is a common tool to assess anxiety and behavioural changes in rodents. It has been adapted to pigs with no systematic investigation of how environmental changes may alter the performance of pigs. Currently, the number of published studies including the OFT in domestic pig models is increasing without standardization. Methods: Our review aimed to investigate the open field (OF) set-ups in published studies and the similarities between performance and published parameters. Results: Following the PRISMA guidelines for reviews, we selected 69 studies for inclusion in this systematic review. We determined the specific set-up conditions such as dimensions, duration, and time of day for most of the included studies; we found high variability across studies with respect to these test specifics. Discussion: Our results indicate the inconsistent implementation of the set-up, including dimensions, timing, parameters, and additional combined tests (e.g., new object tests). Based on our findings, we have made recommendations for the performance of the OFT, according to the current literature.
Introduction
The open field test (OFT) was first developed by Hall et al. [1] in 1934 to assess emotionality in rodents. It was designed to use an unfamiliar enclosure in which the animal could freely roam over a specified period. Later, the OFT was additionally used to investigate behavioural changes following treatment with psychotropic drugs, with a clear focus on anxiety. In 1977, Royce included the monitoring of central versus peripheral movement, arguing that animals in the centre of the field are less fearful [2]. To allow for its use with anxiety disorder models, the dimension of the open field (OF) must be large enough to provide a feeling of openness and exposure in prey animals like mice and rats, causing them to exhibit thigmotaxis [1, 3]. Additionally, locomotor activity is strongly influenced by the animals’ environmental surroundings such as noise, illumination, or odour [4], with rearing, ambulation, and thigmotaxis decreasing with increasing brightness and noise levels. Moreover, the simple olfactory presence of a familiar caretaker can affect the animals’ location within the OF. Therefore, in 2014, Tatem argued for the standardization of the OFT in rodents to ensure the cross-study comparison of data and provided a list of requirements for its performance, in addition to blinded outcome assessment and proper statistical testing [5].
In rodents, the OFT is often used in combination with other behavioural tests to evaluate possible fear models, as it does not provide sufficient data for independent interpretation. In addition, an individual’s emotionality affects their locomotion and vice versa [6]. Recent studies (e.g., Prut and Belzung [7]) refuted the idea that anxiety and central movement are directly related and urged caution with the use of the term anxiety. In their review, Prut and Belzung [7] stated that the OF is valuable in anxiety disorder models but does not per se indicate fear. Additionally, in 1957, Broadhurst was the first to note that the movement of rats is driven primarily by exploratory curiosity [8, 9].
Beilharz and Cox [10] were among the first researchers to use the OFT in pigs. Nowadays, this tool can aid in the evaluation of ∼30 response variables, varying from locomotor activity to elimination and escape attempts, which can be used to estimate animal emotions (e.g., anxiety) or personality [11]. Nonetheless, there is no gold standard version of this test available for use with pigs; its adaptation from rodents to pigs is often criticized since the behaviour of pigs differs from that of rodents.
One of the unaddressed issues involved in adapting the OFT to pigs is that pigs do not show thigmotaxis [11]. It also remains unclear whether there is an emotional response to exposure to open areas in pigs. According to Murphy et al. [11], anxiety in pigs may be caused by social isolation or agoraphobia. Although wild boars are rarely seen roaming freely in open areas, their distance to the timberline is too great to be interpreted as thigmotaxis [12]. In addition to the field dimensions, other parameters may also have an impact on the behaviour of pigs during the OFT. Ralph et al. [13] noted that pen enrichment during the suckling and weaner phases alters the performance of piglets. Later, Tatemoto [14] additionally demonstrated that maternal behaviour influences the piglets’ behaviour via foetal reprogramming. In 2020, Haigh et al. [15] found that individual character affects exploratory behaviour. Moreover, human interaction and presence reduce the fear response in pigs [16], with different sexes exhibiting different reactions to stressors (sexual dimorphism). Furthermore, repeated exposure to the OFT reduces pigs’ activity [17], which may be due to their disinterest or their increased anxiety upon reexposure to the arena.
Since a number of factors affect the outcomes of the OFT, it is challenging to compare OFT parameters across different studies, and between individual animals or with an individual’s own baseline when evaluated multiple times. This hypothesis is in line with the findings of Murphy et al. [11], who criticized the lack of standardization in OFT testing.
The OFT is rarely used on its own to assess parameters for anxiety disorder models; tandem use of the OFT and the novel object test (NOT) is common. In 2007, Forkman et al. [18] described the NOT as involving a novel visual stimulus in the form of an object placed on the floor or hung from the ceiling. During the NOT, pigs are exposed individually or in groups, and variables such as latency to first contact, as well as the frequency and duration of contact, are recorded. Typically, animals are exposed to the NOT after a habituation period within the OF. The review by Forkman et al. [18] also investigated the repeatability of this test, demonstrating heterogeneous results ranging from no [19] to positive repeatability [20]. Several factors are also known to influence the NOT, such as enrichment in the home pen [13, 21], an individual’s character [15] or epigenetic characteristics [14].
Another testing methodology that is often used in combination with the OFT is the human approach test (HAT), in which the pig is free to contact a motionless observer within the arena. The HAT evaluates the human-animal relationship based on the pig’s possible fear reactions [22]. The latency to approach the observer [23] and the duration of contact are also recorded [11]. This test is usually conducted on individual pigs rather than groups of pigs and has the advantage of low inter-observer variation [22]. Like the OFT and NOT, the HAT is also influenced by several factors. For instance, repeated performance reduces the time until the animal approaches the observer [24], while pen dimension, group size, handling, or health can also influence the results of the HAT [22]. Since external factors strongly influence the OFT alone and in combination with the NOT or HAT, this systematic review summarizes the overlap of the set-up, performance, and parameters across studies and provides future recommendations for the standardization of the OFT for pigs.
Methods
This study was performed in accordance with the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [25] and was conducted using a registered protocol of the International Prospective Register for Systematic Reviews (PROSPERO, CRD42019156734). The PRISMA checklist can be found in the online supplementary material (for all online suppl. material, see www.karger.com/doi/10.1159/000525680).
Search Strategy
The study aimed to identify published manuscripts with defined set-ups using an OF in combination with Sus scrofa. We searched MEDLINE (PubMed), EMBASE, and Web of Science using the following search terms related to pigs: “pig,” “porcine,” “swine,” “boar,” “gilt,” “miniature swine,” “sows,” “piglet,” and “piglets.” We combined these pig-related terms with terms related to studies using the OFT to assess behaviour: “open field,” “open-field,” and “behav*.” Search terms were similar for each database. The systematic literature search was carried out in November 2019, encompassing literature ranging from 1967 to November 2019.
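The combination of the two term groups can be sketched as a Boolean query string. The following Python snippet is an illustration only: the use of OR within each group and AND between the groups, as well as the helper names `or_group` and `build_query`, are assumptions, and the exact syntax submitted to each database may have differed.

```python
# Illustrative sketch only: the Boolean operators and helper names are assumptions.
PIG_TERMS = ["pig", "porcine", "swine", "boar", "gilt",
             "miniature swine", "sows", "piglet", "piglets"]
OFT_TERMS = ["open field", "open-field", "behav*"]


def or_group(terms):
    """Join terms with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(pig_terms, oft_terms):
    """Combine both groups with AND (assumed combination rule)."""
    return f"{or_group(pig_terms)} AND {or_group(oft_terms)}"


query = build_query(PIG_TERMS, OFT_TERMS)
print(query)
```

Keeping the terms in two lists makes it straightforward to reuse the same term sets for each database while adapting only the surrounding syntax.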
Study Selection
We defined four inclusion criteria a priori: (I) the full-text could be retrieved; (II) the study was written in English; (III) the species was limited to Sus scrofa; and (IV) the study used an OF with or without a NOT or HAT. We defined the OF as an unfamiliar, confined environment, including alternative names like “pen” or “home pen.”
We excluded studies if (I) the publication type was a review or (II) the study was not yet published. We imported all references to EndNote X9 (Clarivate, Pennsylvania, USA) and removed all duplicates.
Primary Outcomes
Dimension and design of the OFT arena;
Time spent with NOT, HAT, or intruder.
Secondary Outcomes
Predefined quality indicator for study classification based on the availability of information;
Time of day during which the OFT was conducted;
Age or body weight;
Measurements used to assess animal response (i.e., mobility, exploratory behaviour, immobility, excretion/ingestion, vocalization, posture, behaviour, social interaction, others);
Categorization of the scientific field of the selected study (i.e., environment and behaviour, anxiety and stress, toxicology, diet-depending behaviour change, interventions, genetic background, others);
Sex distribution.
Data Collection
Two independent reviewers (M.S. & L.Z.) evaluated the eligibility of each article. We excluded studies when both reviewers agreed that the inclusion criteria were not met. All disagreements were resolved by discussion.
Domain of Scientific Question
As a secondary outcome of this systematic review, we assessed each publication’s experimental field to gain better insight into the use of OFTs for pigs. We first categorized all studies according to their scientific question. If a category occurred fewer than three times, it was assigned to “other,” leaving six main classes.
We defined the following classes for experimental questions: anxiety/stress, toxicology (studies of drugs and their impact on behavioural change), behaviour change due to diet, genetic background, intervention (e.g., surgery), environmental change associated with behaviour, and others.
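The rule for rare categories can be sketched in a few lines of Python. The labels below are hypothetical and are not data from this review; the function name `collapse_rare` is likewise only illustrative.

```python
from collections import Counter

# Hypothetical raw labels, one per study; not data from the review.
raw_labels = ["anxiety/stress", "toxicology", "anxiety/stress", "diet",
              "anxiety/stress", "toxicology", "toxicology", "pain", "diet"]


def collapse_rare(labels, min_count=3, fallback="others"):
    """Relabel categories seen fewer than min_count times as the fallback."""
    counts = Counter(labels)
    return [lab if counts[lab] >= min_count else fallback for lab in labels]


collapsed = collapse_rare(raw_labels)
print(Counter(collapsed))
```

In this toy example, "diet" (two studies) and "pain" (one study) fall below the threshold of three and are merged into "others".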
Risk of Bias
The risk of bias (RoB) was described and judged according to SYRCLE’s RoB tool [25], adapted from the Cochrane RoB tool [26]. Studies were categorized as having a low, high, or unclear RoB in all ten domains. Three independent authors (M.S., L.Z., & M.K.) judged the RoB for each study. In total, 250 issues were solved via discussion with full agreement. The domains for the RoB were as follows:
Sequence generation (random approach);
Baseline characteristics (similarity or adjusted for confounding);
Allocation concealment (unforeseen assignment to groups by investigator);
Random housing, performance bias (similar housing for animals with regard to conditions);
Blinding, performance bias (blinding during experiment for caretaker and investigator);
Random outcome assessment (random selection for outcome assessment);
Blinding, detection bias (influence of lack of blinding or unblinded assessors);
Incomplete outcome data (missing data, no exclusion of animals);
Selective outcome reporting (missing prespecified outcomes);
Other sources of bias (analysis errors, study design, private funding).
We made slight alterations to the judgement of random housing. If animals were housed under the same controlled climatic conditions and pen size, we judged the risk as low; this was independent of the allocation of animals based on litter, weight, or sex. Animals were usually allocated with respect to animal welfare, so that they would tolerate their companions; we considered this aspect more important than random allocation.
Data Synthesis and Statistical Analysis
We categorized the scientific question of each study into one of seven different domains (environment & behaviour, anxiety & stress, toxicology, diet-depending behaviour changes, interventions, genetic background, others). We attempted to perform a meta-analysis with random effects and standardized mean difference.
We defined a quality indicator to rank the studies. Our criteria based on outcome parameters focused on availability of information about the following:
OF set-up: dimension in width and length or diameter, total size in m² if previous was not given, and number of segments per OF, n = 3;
Day time of OFT with start time, end time, and time frame, n = 2;
Animals: breed, sex, age, or body weight, n = 3;
Test specifics: time in OF for OFT, NOT, and HAT, and number of tests, n = 2.
We awarded one point for each criterion met and summed the points to obtain the quality score. In total, ten points could be achieved. Additionally, we calculated the percentage of studies that fulfilled the criteria.
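The scoring scheme can be sketched as follows. The maximum points per group (3, 2, 3, and 2) are taken from the list above; the group keys and the function name `quality_score` are hypothetical, and how the points within a group are split across individual items is deliberately not modelled here.

```python
# Maximum points per criterion group, as listed above (3 + 2 + 3 + 2 = 10).
# The key names are hypothetical labels chosen for this sketch.
MAX_POINTS = {
    "of_setup": 3,
    "time_of_day": 2,
    "animals": 3,
    "test_specifics": 2,
}


def quality_score(points_met):
    """Sum the points a study earned, one point per criterion met."""
    total = 0
    for group, maximum in MAX_POINTS.items():
        met = points_met.get(group, 0)
        if not 0 <= met <= maximum:
            raise ValueError(f"{group}: expected 0..{maximum}, got {met}")
        total += met
    return total


# Hypothetical study that reports everything except the time of day.
example = {"of_setup": 3, "time_of_day": 0, "animals": 3, "test_specifics": 2}
print(quality_score(example))
```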
We assessed the age by converting months to days, using 30 days per month for comparison. We used GraphPad Prism 9.1.0 (San Diego, CA, USA) with Spearman’s r and a 95% confidence interval to correlate the parameters. Boxplots were built with whiskers displaying the 5–95% confidence interval (CI) and outliers. Results are provided as the mean with standard deviation or median with lower and upper confidence limits and a 95% CI. Scientific domains were displayed in a radar plot using https://www.diagrammerstellen.de. For RoB assessment, we used Review Manager 5.4 of The Cochrane Collaboration 2020 to create the graphics and calculation.
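For readers who wish to reproduce a median with confidence limits outside of commercial software, one common distribution-free approach uses order statistics and the binomial distribution. The sketch below is an illustration under that assumption; whether the software used here follows exactly this procedure is not confirmed, and the function name `median_ci` and the example values are hypothetical.

```python
from math import comb


def median_ci(values, level=0.95):
    """Distribution-free CI for the median based on order statistics.

    Returns (lower limit, upper limit, achieved confidence level).
    """
    xs = sorted(values)
    n = len(xs)

    def coverage(k):
        # P(X <= k - 1) for X ~ Binomial(n, 0.5), applied to both tails.
        tail = sum(comb(n, i) for i in range(k)) / 2 ** n
        return 1 - 2 * tail

    k = 0
    # Choose the narrowest symmetric pair of ranks that still reaches the level.
    for candidate in range(1, n // 2 + 1):
        if coverage(candidate) >= level:
            k = candidate
        else:
            break
    if k == 0:
        raise ValueError("sample too small for the requested confidence level")
    return xs[k - 1], xs[n - k], coverage(k)


# Hypothetical arena areas in square metres; not data from the review.
areas = [4.0, 6.0, 7.5, 8.0, 9.0, 9.0, 10.0, 12.0, 16.0, 25.0]
print(median_ci(areas))
```

Because only whole ranks can be selected, the achieved confidence level is usually somewhat higher than the nominal 95%, which may explain why a value above 95% is reported alongside the confidence limits.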
Results
The initial search of each database yielded 379 hits. After eliminating duplicates, title/abstract screening identified 81 studies applicable for the full-text screening phase. This screening mainly focused on the species and OF, with studies excluded if the inclusion criteria were not met. Within a second consensus meeting, we excluded an additional 12 studies due to exclusion criteria with an additional focus on language, publication type, and availability of full-text. In total, we included 69 studies in this systematic review. Figure 1 illustrates the stepwise review process.
Flow chart of the systematic review process using four stages according to the PRISMA guidelines.
A total of 12 studies scored full marks on our predefined quality score (Table 1). Together with eight other studies (11.6%) that achieved a score of nine, these studies provided information in every outcome category.
Regarding the OF set-up, 63 studies (91.3%) provided complete information on the size and segmentation of the OF. Only one study failed to specify the OF used. The least amount of information was provided on the time of day that the OFT took place. Only 20 (29.0%) of 69 studies reported both the start and end times. Four other studies (5.8%) instead reported only the start time of the OFT [53, 55, 58, 69]. All studies provided information on the experimental animals used. Of these studies, 84% provided the breed, age, or weight, as well as the sex of the animals. No study failed to specify how the OFT was performed. However, seven studies (10.1%) did not report the number of tests performed per animal [49, 61, 63, 68, 78, 85, 87]. We analysed whether quality improved over time and found a correlation (R = 0.0339, 95% CI = −0.47 to −0.013, *p < 0.05; Fig. 2) between publication year and quality score.
Correlation of publication year and quality score of the 69 included studies. Regression coefficient = 0.0339, 95% CI = −0.47 to −0.013, *p < 0.05. The quality score is derived from Table 1.
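A rank correlation of this kind can be sketched in plain Python as the Pearson correlation of average ranks, which is the standard definition of Spearman’s r in the presence of ties. The function names and the example values below are hypothetical and do not reproduce the data or the software output of this review.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's r: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical publication years and quality scores; not data from the review.
years = [1975, 1988, 1999, 2005, 2012, 2018]
scores = [6, 7, 7, 8, 10, 9]
print(spearman(years, scores))
```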
Set-Up of the Open Field
Two studies did not provide information about the size of their OF [85, 89], as shown in Figure 3. The size of the OF varied widely across the studies, with a mean of 11.28 m² and a large standard deviation of 8.55 m². The median OF area was 9 m², with a lower confidence limit of 7.82 m², an upper confidence limit of 9.75 m² and a 95% CI of 97.5% (n = 67). Whereas most studies used a rectangular arena, two studies from the same group used a circular arena as the OF [65, 67]. In addition, 42 studies provided information about the segmentation of the arenas into sections, whether physically or virtually applied, as shown in Figure 4. The median number of sections used was 12 (lower confidence limit = 9, upper confidence limit = 16, 95% CI = 97.41%, n = 46).
The OF dimensions from 67 studies are visualized by marking the length and width in metres as points. Two studies used a circular area, which is presented as a circular dotted line. The 45° line marks the square OF dimensions, which is given for 14 OFs. Two of the 69 included studies did not refer to their OF dimension.
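To compare rectangular and circular arenas on a common scale, the floor area can be derived from the reported dimensions. The following sketch shows the arithmetic; the function name `arena_area` and the example dimensions are hypothetical and do not refer to any specific included study.

```python
import math


def arena_area(width_m=None, length_m=None, diameter_m=None):
    """Return the floor area in square metres for a rectangular or circular arena."""
    if diameter_m is not None:
        return math.pi * (diameter_m / 2) ** 2
    if width_m is not None and length_m is not None:
        return width_m * length_m
    raise ValueError("provide either width and length or a diameter")


# Hypothetical examples: a 3 m x 3 m square and a circle of 3 m diameter.
print(arena_area(width_m=3.0, length_m=3.0))
print(round(arena_area(diameter_m=3.0), 2))
```

A circular arena covers roughly 79% of the area of a square whose side equals its diameter, so reporting the shape together with the dimensions matters when areas are compared across studies.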
Left: boxplot of OF dimensions with whiskers displaying the 5–95% CI and outliers (n = 69). Right: boxplot of the number of segments used in the OF with whiskers displaying the 5–95% confidence interval and outliers (n = 45).
Two studies provided information about the adaption of the OF dimensions to the age of the pigs [65, 67]. To investigate whether the dimension of the OF correlated with the size of the animals, we investigated the weight and age distributions and compared them with the OF area. Overall, six studies did not provide information about age/body weight or OF dimensions and could not be included in the comparison [40, 58, 73, 76, 82, 85]. Even though most studies reported the age of the animals, their crossbreds varied, such that we could not estimate the age, body weight, or size of the animal. As data for body weight were not uniform (provided as an absolute [20, 38, 47, 48, 54, 91], range [33, 37, 39, 66], with mean and SD [40, 55, 65, 68, 72, 74, 75], or body weight gained due to food intake over time [28, 32]), a body weight calculation was not possible; thus, 60 of the included studies provided information about the animals’ ages.
Across the studies, 63.8% (44 studies) used animals aged 2–79 days, of which 13 studies used piglets during the weaning phase (<20 days). Additionally, 36% of the studies tested animals aged 80–540 days, as shown in Figure 5. Six studies used adult pigs over 6 months of age [37, 55, 61, 63, 66, 78]. There was no positive correlation between the pigs’ age and the OF dimensions, as shown in Figure 6.
Age distribution of pigs per study. Age is given in days with a split x-axis at 100 days. Vertical lines are set for visualization at 40, 80, 200, and 400 days. The weaning of piglets usually occurs on day 23. Studies plotted to the left of this time point used pigs during their suckling period.
Correlation (Pearson R) between the age of the animals (or weight recalculated to age) and the OF dimensions for 58 studies. The dotted line indicates the 95% confidence interval. R² = 0.012, showing no correlation between age and OF dimensions.
The majority of the 24 studies that provided information about testing time conducted the OFT before noon, and 54.5% of the studies conducted the tests between 10:00 a.m. and 11:00 a.m., as shown in Figure 7 (median 10:00 a.m., lower confidence limit = 8:30 a.m., upper confidence limit = 11:00 a.m., 95% CI = 97.34%, n = 21). Four of the studies reported only a start time for the OFT, namely 10:00 a.m., 2:30 p.m., 3:30 p.m., and 4:30 p.m. [53, 55, 58, 69]. One study postponed the behavioural test to the evening (6:00–7:00 p.m.) [54]. Forty-five publications (65.2%) did not refer to the time of testing.
Spans of time in which the OF was performed, with the x-axis scaled every 2 h over 24 h. Grey lines refer to the total time and black lines (n = 4) refer to a start time with an open end. A framed box indicates the time frame in which most OFTs were conducted. In total, 21 studies provided the time of day for their OFT.
A closer investigation of the distribution of age by sex revealed that studies using litters with both sexes mainly used younger piglets with a mean age of 42.05 ± 34.1 days compared to the overall age of animals, as shown in Figure 8.
Distribution of age separated by sex, displayed in boxplots with 5–95% confidence interval whiskers. Studies using both sexes usually investigated litters of young piglets, but post-weaning. Female pigs had the highest range, with additional long-term experiments up to 18 months; these animals were outliers compared to animals of all sexes.
To estimate the average time a pig spent in the OF arena, we distinguished between the OFT, NOT, and HAT. Thirty-four studies (50.7%) only performed an OFT, whereas the other 35 studies additionally conducted a NOT and/or HAT, as shown in Figure 9. In six studies (8.7%), a start box with an initial waiting and acclimatisation period was used (mean of 2.14 ± 1.87 min [13, 20, 34, 40, 63, 85]). In most studies, the OFT lasted for a median of 5 min (lower and upper confidence limit = 5 min, 95% CI = 96.81%, n = 71). An intruder (human [20, 42, 46, 59, 74, 85, 92] or pig [43]) was added to the OFT in eight studies; this was often combined with the NOT and separated the OFT time. The median time for the NOT was set to 5 min (lower and upper confidence limit = 5 min, 95% CI = 97.63%, n = 39).
Visualization of test performance and duration for the OFT, NOT, HAT, and additional start boxes in minutes for each study (n = 69). The light grey bar indicates habituation time; black bars refer to the OFT, dark grey bars refer to the NOT, and bars with patterns indicate the HAT.
Measurements Used to Assess Animal Response to the OFT
In this review, we divided the various parameters (measurements used to assess animal responses, such as distance, grunts, standing) into eight main categories and counted the number of occurrences across all publications. Parameters that occurred fewer than two times were categorized as “others.” We sorted the categories in Table 2 according to their frequency of occurrence. Parameter categorization was performed according to Walsh and Cummins [4] with minor alterations, as that review did not cover all parameters. Although parameters like mobility and immobility are linked, some studies focused on only one of the two parameters; therefore, we decided to categorize them separately.
The categories of mobility, immobility, and exploratory behaviour (if a NOT was conducted) were used most frequently, followed by vocalization and excretion/ingestion. There was an overlap of parameters defined as posture in some studies with immobile parameters. The parameters of part body movement were ranked seventh in terms of their frequency of use, having only been used in 20 studies. If a part body movement (e.g., nudging or chewing) occurred as an exploratory interaction with a NOT, we categorized it as exploratory behaviour, although some authors classify this as exploration instead. We found the wording and interpretation of parameters to be inconsistent across the various studies. As such, immobile parameters also describing a posture were counted in both categories.
Meta-Analysis
To assess the impact of OFT set-ups, we attempted to perform a meta-analysis, in which we aimed to order the control versus treatment groups by their movement. Secondarily, we wanted to subdivide the outcomes by the OF dimensions. Unfortunately, we were unable to perform this meta-analysis because the studies were heterogeneous and too few of them had a scientific question whose scope did not interfere with the results. One of the main issues we faced was the pre-classification of animals based on specific characteristics or behaviours (e.g., backtest, dominant vs. defeated, strain differences) [20, 27, 28, 32, 64, 81, 85]. Another factor interfering with the results was heterogeneous husbandry; the animals’ performance was altered by social isolation or being housed in enriched versus standard pens [13, 31, 42, 49, 61, 62, 76, 77, 80, 92]. Therefore, we could not compile enough data to support this analysis.
Domain of Scientific Question
We found 16 studies (23.2%) related to the category anxiety & stress followed by 12 studies (17.4%) related to environment & behaviour, as shown in Figure 10. Eleven studies (15.9%) were categorized as toxicology, whereas eight studies (11.6%) referred to diet-dependent behaviour. The domains interventions and genetic background were of minor interest with five (7.2%) and three (4.3%) studies per domain, respectively. Fourteen studies (20.3%) were classified as other.
Radar plot showing the studies assigned to the domains of their scientific questions. Each line represents one number on the scale (n = 69).
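The reported shares can be checked against the study counts with a short calculation. The counts below are those stated in the text; the variable names are only illustrative.

```python
# Study counts per domain as reported in the text.
DOMAIN_COUNTS = {
    "anxiety & stress": 16,
    "environment & behaviour": 12,
    "toxicology": 11,
    "diet-dependent behaviour": 8,
    "interventions": 5,
    "genetic background": 3,
    "other": 14,
}

total = sum(DOMAIN_COUNTS.values())
percentages = {
    domain: round(100 * count / total, 1)
    for domain, count in DOMAIN_COUNTS.items()
}

for domain, share in percentages.items():
    print(f"{domain}: {share}%")
```

The counts add up to the 69 included studies, and the rounded shares agree with the percentages given above.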
Risk of Bias
A complete analysis of the authors’ judgement concerning the RoB is shown in Figure 11.
Sequence Generation: Since the assignment of animals to their groups was mostly dependent on body weight, sex, and littermates, the allocation sequence could often not be randomized [81]; this led to 30 studies with a high RoB in sequence generation, according to the RoB tool.
Baseline characteristics: Around 60% of all studies had a low RoB with regard to the baseline characteristics.
Random housing: The performance bias related to random housing indicates that no study randomly assigned animals to their home pens. For welfare reasons, animals were mostly grouped by litter to tolerate companions. However, we decided that the RoB was low if the housing conditions in pens had the same controlled climatic conditions and dimensions. In this case, there seems to be no negative influence of the allocation of the pens on the experimental outcome.
Blinding of caretakers and investigators: In various publications, behavioural tests were often accompanied by a specialized housing situation (e.g., warm areas, signature odours, or sex-dependent assignment), so caretakers or investigators could not always be blinded to the treatments. Unless studies stated otherwise, we decided on a high risk when this was the case. When the assignment of animals to interventions could not be predicted by animal caretakers or investigators, we opted for a low RoB.
Blinding of outcome assessors: Specifically, the blinding of assessors and the randomization of outcomes had the most unclear status; only five studies referred to a blinded assessor for at least one parameter.
Other sources of bias: Moreover, 29 publications had a higher RoB mainly due to commercial funding, whereas 39 studies were approved by authorities or governmental institutions, leading to a low RoB. Additionally, we focused on the study design to assess other sources of bias.
The RoB tool does not cover the question of this systematic review with the main focus on OF settings. Therefore, a potential bias does not influence the outcomes of this systematic review as information about settings was mostly available. An overview of the estimated risk of all studies per domain is shown in Figure 12.
Authors’ judgements about each RoB item presented as percentages across all included studies.
Discussion
The aim of this systematic review was to perform a meta-analysis between several parameters of the OFT, such as the distance travelled compared to OF dimensions, as there is currently no standardization for the OFT for pigs, despite the need for it according to some reviews [11, 18]. Unfortunately, the heterogeneous results and study designs made it impossible to compare all data. A major issue was the scientific question interfering directly with the results. Animals that were assigned to groups based on their individuality (e.g., backtest) [28, 34, 64, 81], behaviour changes based on social isolation [31, 36, 58, 59, 88, 90], or diet-dependent behaviour [38‒40, 52, 57, 69, 70, 75] varied in their performance in the OFT; therefore, the untreated control groups of the different studies were not comparable because their baseline characteristics differed significantly due to preselection for these traits. Additionally, the assessment of results varied from objective (e.g., automated tracking) [37, 56] to subjective measures (e.g., Noldus Observer) [31, 45, 87], and recordings were not always analysed over the same time frames (e.g., sampling of videos by 10 s every minute) [19].
Dimension and Segments in the OFT Arena
Our investigation of the OFT set-ups revealed high heterogeneity in the arena dimensions and segments (Fig. 4). Although most studies opted to use a rectangular OF, we assume the OF dimension was likely determined by the structural conditions of the buildings. Our analysis revealed a median OF area of 9 m². The original idea of the OFT was to provide a feeling of openness and exposure to rodents [1, 5]. A review by Forkman et al. [18] refers to the dimensions being adapted to the length of pigs; however, we could not find any positive correlation between age or body weight and OF dimension (Fig. 6).
Arena segmentation is used for two reasons. First, if automated tracking is not possible, sections aid in determining whether an animal has stepped into a segment or crossed a line [27, 65, 87]. Second, sections define the central and peripheral area [49, 59]. Segmentation is often used in anxiety disorder models to determine whether pigs prefer the location of walls and corners or if they cross the centre. Currently, these grids vary in number and dimensions; thus, they are not comparable between studies. A median of 12 sections was found across all studies.
These findings underline the need for standardization of the OFT set-up with regard to the dimensions. Since pigs do not exhibit thigmotaxis and are not prey animals, it is questionable whether the OF arena can cause a feeling of exposure [11, 12]. Additionally, the studies varied in how they determined ambulation: lines were counted as crossed with both forefeet [30, 76], with the head and front shoulders [13, 71], or with the midpoint of the head [17, 45]. It was not possible to distinguish between lateral movement, forward movement, and gyroscopic movement; these issues are negligible if automated tracking is used.
Duration of OFT
Only one review of the OFT in rodents has been performed, in 1976 by Walsh and Cummins [4]; to the best of our knowledge, no recent systematic review has investigated the OFT set-up for rodents. We face the same issue for pigs. Therefore, a systematic analysis of the optimal duration of the OFT is not available. Our results showed that the total duration of the OFT varies across studies, extending up to 1 h; however, our analysis revealed a median OFT duration of 5 min. An additional NOT or HAT lasted for a median of 5 min. Although an analysis of the effect of a shortened or prolonged OFT is not available, we found a clear convergence on 5 min across the included studies.
Time of Day for Performance of OFT
The OFT in rats and mice is usually conducted during the day, although they are known to be nocturnal animals [93]. In 2015, Morin noted that this alters the rodents’ performance on the test [94], as they are more active at night, with poor transferability to diurnal animals or humans [95]. Although wild pigs have an activity peak at dusk and dawn, there is little information about their preference for darkness [18, 96, 97]. Domestic pigs are instead adapted to light, and a study by Tanida et al. [98] found an aversion to darkness in piglets. The included studies revealed that the OFTs were solely conducted during the day. To the best of our knowledge, no study has investigated the influence of the time of day on pigs; however, with a median start time of 10:00 a.m. and narrow confidence limits ranging from 8:30 a.m. to 11:00 a.m., most studies conducted the OFT in the morning.
Measurements Used to Assess Animal Response
Depending on the study design, various measurements can be assessed using the OFT, varying from objectively measurable parameters (e.g., mobility, immobility) to subjective parameters (e.g., social interaction; Table 2). The interpretation of pigs’ behaviour lacks guidelines. For instance, while some authors interpret avoidance and immobility as neophobia [17], others found that this does not mirror anxiety [7, 99]. Rooting and sniffing behaviour changes with the age of pigs, decreasing over time as curiosity about novelty declines [100]; these behaviours are also affected by an increase in the number of test cycles [17, 43, 90]. Repeated test cycles also affect mobility [30, 31, 73, 80], which decreases over time [78]. Even if the focus is on mobility alone, diminishing curiosity is not always considered when using multiple test cycles [31, 56]. Overall, the included studies showed no consensus with respect to the wording and categorization of parameters. Furthermore, there was a duality in behaviour, such as standing while assessing the pigs’ posture [20, 46], but at the same time assessing immobility [55, 56, 64].
We were able to list the domains (e.g., mobility, exploration) and individual measurements for assessing animal responses; however, we are not currently able to provide recommendations for standardized wording as these terms strongly depend on the study design. Researchers must provide a clear definition for each behaviour to enable comparison across studies.
Sex Bias
A biased research focus on males is often reported, neglecting female mammals. Although the inclusion of women has gained more attention in clinical research, this is not the case in basic science and translational research [101‒103]. In contrast to this general trend, we found multiple studies in which female pigs or both sexes were used; this often occurred when studies focused on litters. In addition, due to boar taint, most male piglets are castrated; thus, group housing is possible without restriction [104, 105]. Overall, there was a low sex bias, which enhances the possible translation of results to humans (Fig. 8).
The OFT to Assess Animal Welfare
A closer look at the scientific questions of the studies revealed that anxiety disorder models garnered the most interest (Fig. 10), followed by environment-dependent behaviour and toxicological studies. When distinguishing between farm and laboratory animals, it is evident that the issue of rearing and an enriched environment was primarily relevant to farm animals [44, 61, 74, 77]. Other categories like surgical interventions or toxicological studies are more applicable to laboratory animals. The OFT has also recently been used as a valuable tool to assess the welfare of farm animals [42, 62]. With eight of the twelve studies published after 2000 being classified into the domain “environment & behaviour,” it is clear that awareness of farm animal welfare is rising [106‒108].
Quality of Information
To ensure reproducibility, as much information as possible concerning materials and methodologies, as well as data analysis, should be provided [109, 110]. In 2010, Kilkenny et al. [111] developed the ARRIVE guidelines to improve reporting in animal research. We ranked the included studies by our predefined quality indicator, focussing on the availability of specific information (Table 1). We expected the quality to improve over time owing to rising sensitivity to reproducibility issues. We measured a slightly significant increase in the quality of the studies over time (Fig. 2). However, the missing information still impeded a proper meta-analysis and comparison between studies. Therefore, we strongly recommend that future studies follow the ARRIVE guidelines and provide as much information as possible.
Recommendations
A search for OF and pigs did not reveal any systematic analysis of dimensions, duration, or effects of time of day on the OFT outcomes. We assume that studies have inherited their methodology, including the theoretical concepts underlying the duration of the OFT, from rodent behaviour models. We also assume the OF dimensions may be based on the structural conditions of the buildings. Based on our findings, we provide some recommendations for the design of the OFT set-up with respect to comparability to previous studies; these recommendations are based on the current literature:
The OF arena should be a rectangle or square.
The area should be 9 m² and not <7.9 m² or >9.7 m².
The number of segments should be 12 and not <9 or >16.
The OFT should last at least 5 min in duration.
A NOT or HAT should last for 5 min, in addition to the OFT duration.
The OFT should be conducted in the morning around 10:00 a.m. and not before 8:30 a.m. or after 11:00 a.m., if the animals have a 12 h dark/12 h light cycle.
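As an illustration, the numeric recommendations listed above can be written as a simple check of a planned set-up. The following Python sketch is only an aid for comparing a protocol with these ranges; the class and field names are hypothetical, the check assumes a 12 h dark/12 h light cycle, and the arena shape and the additional NOT or HAT are not covered.

```python
from dataclasses import dataclass
from datetime import time
from typing import List


@dataclass
class OFTSetup:
    """Hypothetical description of a planned open field set-up."""
    area_m2: float       # arena floor area in square metres
    segments: int        # number of grid segments
    oft_minutes: float   # duration of the OFT itself in minutes
    start: time          # clock time at which the test starts


def deviations(setup: OFTSetup) -> List[str]:
    """List every point where the set-up leaves the recommended ranges."""
    issues = []
    if not 7.9 <= setup.area_m2 <= 9.7:
        issues.append("area outside 7.9-9.7 m2")
    if not 9 <= setup.segments <= 16:
        issues.append("number of segments outside 9-16")
    if setup.oft_minutes < 5:
        issues.append("OFT shorter than 5 min")
    # Assumes animals kept on a 12 h dark/12 h light cycle.
    if not time(8, 30) <= setup.start <= time(11, 0):
        issues.append("start time outside 8:30-11:00 a.m.")
    return issues
```

A set-up of 9 m² with 12 segments, a 5 min OFT, and a start at 10:00 a.m. yields an empty list, whereas a 12 m² arena with 20 segments, a 3 min test, and a start at 2:00 p.m. is flagged on all four points.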
Furthermore, we found a lack of information with regard to the materials and methodologies in several studies. Therefore, we urge authors to implement the ARRIVE guidelines to ensure transparency and reproducibility. Whenever the study design allows it, we recommend using both sexes to reduce possible sex bias.
Limitations
The quality of the included studies varied both in terms of the available set-up data and comprehensible results (Table 1). The RoB analysis showed that, especially within performance and detection bias, the procedures were mostly unclear (Fig. 11). We found no evidence of any study being controlled (e.g., ARRIVE guidelines), leading to a high RoB. Though sequence generation had a high RoB in most studies, this does not reflect the fact that the pig as a target animal has a high sense of hierarchy. In some studies, this behaviour is implemented into the study question, in which case sequence generation cannot be random [38, 40, 90]. In other studies, it is of high interest for the pigs’ welfare to be grouped with equally ranked conspecifics or siblings to reduce biting or gain access to feeding [40, 109].
Our study has some inherent limitations due to the heterogeneity of the data. For comparison, we converted the pigs’ ages into days (1 month = 30 days), but this resulted in a slight deviation from the actual age. Another source of bias was the restriction to studies written in the English language, which unfortunately excluded some studies. A third factor of potential bias is the authors’ judgement of domain categories for the parameters (Table 2); we tried to match and assign the different outcome parameters to their related category. All studies lacked information in terms of performance and detection bias; since we relied on these data, we cannot eliminate the high risk of incidental bias associated with them. Despite these limitations, we followed a strict and sensitive protocol for data identification to reduce potential bias.
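The size of the deviation introduced by the age conversion can be estimated with a short Python sketch. The 30-day month is the rule stated above; the comparison with a mean calendar month of 365.25/12 days is an assumption added here purely for illustration.

```python
DAYS_PER_MONTH_USED = 30             # conversion rule used in this review
MEAN_CALENDAR_MONTH = 365.25 / 12    # assumed mean month length in days


def months_to_days(months: float) -> float:
    """Convert an age in months to days using the 30-day rule."""
    return months * DAYS_PER_MONTH_USED


def conversion_error_days(months: float) -> float:
    """Days by which the 30-day rule underestimates the age, on average."""
    return months * (MEAN_CALENDAR_MONTH - DAYS_PER_MONTH_USED)
```

Under this assumption, an age reported as 6 months is converted to 180 days, which underestimates the mean calendar age by roughly 2.6 days.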
Conclusion
The heterogeneity of data impeded a meta-analysis for the different outcomes assessed using the OFT. We conclude that standardization is necessary. The authors should implement the ARRIVE guidelines in their studies to ensure the availability of complete information. Based on the current literature, we have provided recommendations for the standardization of the OFT to ensure comparable results; these recommendations mirror the median characteristics of all studies since there has been no recent investigation into the effects of varying OFT set-ups on the performance of pigs. Using these recommendations, it should be possible to implement most existing studies into future meta-analyses.
Statement of Ethics
The paper is exempt from ethical committee approval as no additional animal experiments were performed for this systematic review.
Conflict of Interest Statement
All authors declare the research was conducted without any commercial or financial relationships that could be construed as a potential conflict of interest.
Funding Sources
This study was funded by the German Research Foundation (Deutsche Forschungsgemeinschaft – DFG) FOR-2591 to 542/5-2.
Author Contributions
Mareike Schulz and René H. Tolba designed this study. Mareike Schulz and Leonie Zieglowski searched and collected the data. Mareike Schulz wrote the manuscript. Marcin Kopaczka designed the graphics. Mareike Schulz, Marcin Kopaczka, and Leonie Zieglowski judged the RoB. All the authors reviewed and revised the paper.
Data Availability Statement
The data that support the findings of this study are openly available in “Zenodo” at https://doi.org/10.5281/zenodo.5081125.