Arm’s length range – what does it really mean?

In South Africa, it seems that the unspoken rule is that the interquartile range (IQR) is the arm’s length range. To qualify this statement, I am only referring to a margin derived from a benchmarking study that uses external comparables.

Why is this important? Let me start by explaining what an arm’s length range is. As the OECD Guidelines note, transfer pricing is not an exact science, so a given method can produce a range of outcomes or observations that are all equally reliable. For the purpose of this blog post, we are really considering the cost plus method, the resale price method and the transactional net margin method. There could be many reasons why our observations, whether in the form of a margin or a price, differ from one another. For example, there could be geographical differences that give rise to location savings, or there may be small differences in the products or services sold. But there are also commercial reasons why comparable third parties may charge different prices for a comparable product or service in the same region.

Coming back to our arm’s length range. An arm’s length range is just that: a range of data that we believe to be comparable to our tested party, which is what makes the range arm’s length. Please keep in mind that if there are observations in our range that we are not certain are comparable, we should find further evidence to accept or reject those data points. However, once we are comfortable with all data points, this could be seen as our arm’s length range from an OECD perspective. Most transfer pricing professionals go a step further: they apply an IQR to the accepted range established from the comparable data and say that the IQR is the arm’s length range. The reason given is that the IQR gets rid of outliers, which may not be comparable after all, as there must be some reason for being an outlier.
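To make the difference between the two ranges concrete, here is a minimal sketch in Python. The ten margins are made-up numbers for illustration only, and the function names are my own. I have also assumed the "inclusive" quartile convention (the one Excel's QUARTILE.INC uses); other conventions give slightly different quartiles, so treat this as one possible calculation rather than a prescribed one.

```python
from statistics import quantiles


def arms_length_ranges(margins):
    """Return the full range and the interquartile range of accepted margins."""
    if len(margins) < 2:
        raise ValueError("need at least two accepted observations")
    ordered = sorted(margins)
    # Assumption: "inclusive" quartiles, which interpolate between
    # observations and treat the data as the whole population.
    q1, _median, q3 = quantiles(ordered, n=4, method="inclusive")
    return {"full": (ordered[0], ordered[-1]), "iqr": (q1, q3)}


def outside_iqr(margins):
    """List the accepted observations that an IQR would leave out."""
    low, high = arms_length_ranges(margins)["iqr"]
    return [m for m in margins if m < low or m > high]


# Hypothetical margins (in %) of ten accepted comparables.
margins = [2.1, 3.4, 4.0, 4.8, 5.5, 6.1, 6.9, 7.7, 9.3, 12.0]

ranges = arms_length_ranges(margins)
print("Full range:", ranges["full"])
print("IQR:", tuple(round(q, 2) for q in ranges["iqr"]))
print("Left out by the IQR:", outside_iqr(margins))
```

With these hypothetical figures, six of the ten accepted comparables fall outside the IQR, even though each of them passed the same comparability review as the four that remain inside it.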

However, when diving deeper into why we apply an IQR (the outlier reason aside), there does not seem to be much support beyond comments like “it is what we do,” “tax authorities prefer it this way,” or “it tightens the range,” although some may refer to certain guidance. In South Africa, the guidance does not refer to the IQR as an arm’s length range, but it does mention that statistical tools such as the IQR may provide a more comparable outcome. This is part of the reason why our tax authorities look at the IQR. But I would like to question this status quo. Surely, if we have really comparable data and we are not able to differentiate our comparables on the basis of their functional, geographical or even industry profile, the mere fact that a comparable makes lower or higher profits should not be a reason for rejection.

I am sure a lot of you will think that if a comparable makes huge profits, there must be something else at play, for example intangible property that provides an advantage, or something similar. But what if we have already checked for these things by applying rejection criteria such as IP screens or inventory screens? I agree that the data may be imperfect, but that is what transfer pricing is all about, and we have to rely on something. With that in mind, as much as someone may want to argue that an IQR is more comparable because the data is imperfect, I would like to argue that as long as all my observations are similarly comparable, maybe we should not apply an IQR without additional thought.

There have also been other thoughts, for example that performing detailed benchmarking studies is becoming less relevant and that we should just accept all companies within a specific industry and region and then run an IQR over the full sample to derive an arm’s length range. The argument made is that the resulting IQR may not differ much from the IQR of a specific search. I don’t think this is a viable option yet, as the data is imperfect and we tend to find many companies with different industry profiles that may be treated as comparable when they are not. Furthermore, the arm’s length principle is based on what a third party would do in a comparable situation. It does not ask what any third party would do. I would also be a bit worried that the IQR may then be renamed the lazy range.

Lastly, I wanted to highlight the latest suggested approach to drafting transfer pricing legislation from ATAF (the African Tax Administration Forum), which is likely to be introduced in some African countries. The document states that statistical tools such as the IQR shall be considered an arm’s length range where the degree of comparability of the data is uncertain. I believe the above argument is still valid, as I would argue that all my data is comparable and therefore there is no need to apply a statistical tool.

Please let me know your thoughts on the above.

2 Replies to “Arm’s length range – what does it really mean?”

  1. I agree with the point that the IQR is not the arm’s length range in all cases. I practice mainly in South East Asia and Australia, where some authorities have guidance on what the arm’s length range is. Singapore states that it prefers the IQR as the arm’s length range and accepts the full range when the taxpayer can assure that all points of the range are equally comparable.
    I have a lot of experience performing searches and, in all honesty, it is very difficult to say that every company in the set is a perfect comparable. I think that is the reason why, in practice, we end up applying a statistical measure and using the IQR. I see practical difficulties in deviating from the IQR unless we can be 100% sure the comparables are very good, which in practice is hard to achieve. I agree with the overall message, which is to think about why we are using the IQR and to consider that it may not be relevant in all cases.

    1. Thank you for your comment. I am sure a lot of transfer pricing enthusiasts feel exactly the same as you, and I too look at the IQR for these reasons. What I would like to challenge, though, is the thought that because my comparable observations are imperfect, I should reject some of them purely on the basis of their profit margins.

      Let me try to put this into an example. I run a search on a third party database and go through all the well known search steps: inclusion criteria such as SIC/NACE codes, geographical inclusions and independence requirements (there may be more), followed by bulk rejections based on, for instance, turnover thresholds, exclusion key words and certain IP or inventory screens (again, there could be more). I end up with a set of potential comparables. Now I review each potential comparable manually by looking at its website, trade description, financials etc., and once I am very comfortable with a potential comparable, I accept it. Let’s say I end up with 10 comparable companies by doing the above. This is the status quo I wanted to challenge (and I really just wanted to challenge it, not say it is wrong): why would I reject the bottom and top observations if all that is different is their profit margins?

      In my mind, the answer will be somewhere along the lines that none of the observations is perfect and that, because of this, we should arguably accept none of them. That is obviously not an option, as we need something, and so, to lessen the imperfection, we apply a statistical tool such as the IQR.

      To take this one step further, I also found this blog post which may be relevant for the above discussion.

