Choosing Cost Drivers & Nonlinear Costs
Overview
- What you’ll learn: Multiple regression analysis with two or more cost drivers, nonlinear cost functions, learning curves and their impact on cost estimation, criteria for selecting cost drivers, and the limitations of quantitative cost analysis.
- Prerequisites: Lesson 7 — Cost Estimation Methods
- Estimated reading time: 20 minutes
Introduction
The Grand Historian records: In the preceding seven lessons, we have journeyed from the management accountant’s mandate through cost objects, cost behavior, CVP analysis, and cost estimation. We have wielded the high-low method and regression analysis like swords in the cost accountant’s dojo. But all our models thus far have shared a simplifying assumption: that one cost driver explains the behavior of one cost, and that the relationship is linear.
Reality, that perennial spoiler of elegant models, is rarely so accommodating. Many costs are driven by multiple factors simultaneously — maintenance costs may depend on both machine hours and machine age. Some costs follow curves rather than straight lines — the hundredth unit takes less time to produce than the first, thanks to the learning curve. And choosing the right cost driver from a multitude of candidates requires not just statistical skill but business judgment of the highest order.
This final lesson of Module 4 ventures into advanced territory. Here, cost estimation becomes both an art and a science — the art of choosing the right model, and the science of fitting it to data. Those who master this lesson will possess the complete toolkit of cost behavior analysis.
Multiple Regression Analysis
Multiple regression extends simple regression by including two or more independent variables (cost drivers) in a single equation:
Y = a + b₁X₁ + b₂X₂ + … + bₙXₙ
Where Y = total cost, a = the intercept (the estimated fixed cost within the relevant range), and each b is the estimated variable cost per unit of its corresponding driver X, holding the other drivers constant.
When to Use Multiple Regression
Use multiple regression when you suspect that more than one factor drives a cost. Common scenarios:
- Manufacturing overhead may depend on both machine hours (X₁) and number of setups (X₂)
- Shipping costs may depend on both weight shipped (X₁) and number of shipments (X₂)
- Customer service costs may depend on both number of customers (X₁) and number of complaints (X₂)
Example: Maintenance Cost with Two Drivers
Suppose maintenance costs are driven by both machine hours (X₁) and number of maintenance calls (X₂). A multiple regression might produce:
Y = $3,200 + $2.80X₁ + $45.00X₂
Interpretation: Fixed maintenance costs are $3,200 per month. Each machine hour adds $2.80 to maintenance costs. Each maintenance call adds $45.00. If in a given month X₁ = 2,000 hours and X₂ = 50 calls:
Predicted cost = $3,200 + $2.80(2,000) + $45(50) = $3,200 + $5,600 + $2,250 = $11,050
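The prediction above can be reproduced in a few lines of Python. This is only a minimal sketch: the function name `predict_cost` and its parameter names are illustrative, and the coefficients are simply those of the example equation.

```python
def predict_cost(intercept, rates, driver_levels):
    """Evaluate Y = a + b1*X1 + ... + bn*Xn for one period."""
    if len(rates) != len(driver_levels):
        raise ValueError("need one driver level per rate")
    return intercept + sum(b * x for b, x in zip(rates, driver_levels))


# Coefficients from the example: a = $3,200, b1 = $2.80 per machine hour,
# b2 = $45.00 per maintenance call.
predicted = predict_cost(3200.0, [2.80, 45.00], [2000, 50])
print(f"Predicted maintenance cost: ${predicted:,.2f}")
```

Passing the rates and driver levels as lists lets the same function handle any number of drivers.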
Evaluating Multiple Regression
Multiple regression uses adjusted R-squared rather than plain R² to evaluate fit. Adjusted R² penalizes the addition of variables that don’t genuinely improve the model:
- Plain R² always increases (or stays the same) when you add another variable — even a random one
- Adjusted R² increases only if the new variable improves the fit by more than the degree of freedom it uses up
- If adjusted R² drops when you add a variable, that variable is adding little explanatory power; remove it unless there is a strong economic reason to keep it
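The penalty can be made concrete with the standard adjustment formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of observations and k the number of drivers. The sketch below applies it to hypothetical figures (24 monthly observations, R² values invented for illustration); the function name is an assumption.

```python
def adjusted_r_squared(r_squared, n_observations, n_drivers):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    dof = n_observations - n_drivers - 1
    if dof <= 0:
        raise ValueError("need more observations than drivers plus one")
    return 1.0 - (1.0 - r_squared) * (n_observations - 1) / dof


# Hypothetical: 24 monthly observations. Adding a third driver nudges
# plain R^2 from 0.900 to 0.901.
two_drivers = adjusted_r_squared(0.900, 24, 2)
three_drivers = adjusted_r_squared(0.901, 24, 3)
print(f"two drivers: {two_drivers:.4f}, three drivers: {three_drivers:.4f}")
```

Here plain R² rises slightly, yet adjusted R² falls, which is the signal that the third driver is not earning its place.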
Additional evaluation criteria for multiple regression:
- Each coefficient should be statistically significant (p-value < 0.05)
- Signs should be economically plausible: A positive coefficient means the driver increases cost (expected for most costs)
- No multicollinearity: The independent variables should not be highly correlated with each other. If machine hours and direct labor hours have a correlation coefficient of 0.95, including both makes the individual coefficient estimates unstable
The Danger of Multicollinearity
Multicollinearity occurs when two or more independent variables are highly correlated. Symptoms include:
- Individual coefficients have unexpected signs (negative when positive is expected)
- Individual coefficients are not statistically significant, but the overall model has high R²
- Coefficients change dramatically when one variable is added or removed
Solution: Choose one driver from each group of correlated variables, or combine correlated variables into a single composite measure.
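A quick first screen for this problem is to compute the pairwise correlation between candidate drivers before fitting the regression. The sketch below uses hypothetical monthly data, and the 0.9 cutoff is an assumed rule of thumb rather than a figure from this lesson. A pairwise check will not catch collinearity that involves three or more variables jointly.

```python
import math


def pearson_correlation(xs, ys):
    """Sample Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical monthly data for two candidate drivers.
machine_hours = [100, 120, 140, 160, 180, 200]
labor_hours = [52, 59, 71, 80, 91, 99]

r = pearson_correlation(machine_hours, labor_hours)
THRESHOLD = 0.9  # assumed rule of thumb, not a standard from the lesson
if abs(r) > THRESHOLD:
    print(f"r = {r:.3f}: drivers are highly correlated; consider using only one")
else:
    print(f"r = {r:.3f}: no strong pairwise correlation detected")
```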
Nonlinear Cost Functions
Not all costs follow straight lines. Several types of nonlinearity are common in practice:
Economies and Diseconomies of Scale
At low volumes, per-unit costs may decrease as volume increases (economies of scale, from bulk purchasing and spreading fixed costs over more units). But beyond a certain point, per-unit costs may increase (diseconomies of scale, from overtime, congestion, and coordination complexity). Total cost then rises at a decreasing rate at first and at an increasing rate later, tracing an S-shaped curve rather than a straight line.
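One common way to represent such a curve is a cubic total cost function. The sketch below uses invented coefficients chosen only so that cost per unit first falls and then rises; they are not estimates from any real data set.

```python
def total_cost(q, fixed=1000.0, b1=50.0, b2=-0.5, b3=0.003):
    """Hypothetical cubic total cost: TC = fixed + b1*q + b2*q^2 + b3*q^3."""
    return fixed + b1 * q + b2 * q ** 2 + b3 * q ** 3


def average_cost(q):
    """Cost per unit at volume q (q must be positive)."""
    return total_cost(q) / q


for q in (10, 100, 300):
    print(f"q = {q:>3}: total = {total_cost(q):>9,.0f}, per unit = {average_cost(q):7.2f}")
```

With these assumed coefficients, total cost keeps rising with volume, but cost per unit is lower at 100 units than at either 10 or 300 units.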
Step Functions
As discussed in Lesson 3, some costs are constant within a range and then jump. Supervisory costs, warehouse capacity, and licensing fees often follow step patterns. These are handled by:
- Treating as fixed within the relevant range
- Using dummy variables in regression (1 = above threshold, 0 = below)
- Segmenting the data and estimating separate functions for each step
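A step cost and the matching regression dummy can each be written in a few lines. The sketch below assumes a hypothetical plant that needs one supervisor per 1,000 units at $4,000 per month, and it simplifies by treating the cost as zero at zero volume. Both function names are illustrative.

```python
def step_cost(units, capacity_per_step, cost_per_step):
    """Cost that is flat within each capacity band and jumps at the boundary."""
    if units <= 0:
        return 0  # simplifying assumption: no cost at zero volume
    steps = -(-units // capacity_per_step)  # ceiling division for integers
    return steps * cost_per_step


def above_threshold_dummy(units, threshold):
    """Dummy variable for regression: 1 above the threshold, 0 otherwise."""
    return 1 if units > threshold else 0


# Hypothetical: one supervisor per 1,000 units, $4,000 per supervisor per month.
for units in (800, 1000, 1001, 2500):
    print(units, step_cost(units, 1000, 4000), above_threshold_dummy(units, 1000))
```

In a regression, the dummy's coefficient would estimate the size of the jump at the threshold.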
Exponential and Power Functions
Some costs follow exponential growth (e.g., environmental remediation costs that accelerate as contamination worsens) or power functions (e.g., construction costs that increase at less than a proportional rate with building size). These can be estimated using:
- Logarithmic transformations to linearize the data
- Nonlinear regression software
- Polynomial regression (adding X² or X³ terms)
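As an illustration of the logarithmic approach, a power function y = a·x^b becomes linear after taking logs: ln(y) = ln(a) + b·ln(x). The sketch below fits that line by ordinary least squares. The data are generated from an assumed cost = 500 × size^0.8 purely so that the result can be checked. With noisy real data, fitting in log space weights relative rather than absolute errors.

```python
import math


def fit_power_function(xs, ys):
    """Fit y = a * x**b by ordinary least squares on ln(y) = ln(a) + b*ln(x)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_lx, mean_ly = sum(log_x) / n, sum(log_y) / n
    b = sum((lx - mean_lx) * (ly - mean_ly) for lx, ly in zip(log_x, log_y)) / sum(
        (lx - mean_lx) ** 2 for lx in log_x
    )
    a = math.exp(mean_ly - b * mean_lx)
    return a, b


# Hypothetical data generated from cost = 500 * size**0.8, so the fit should
# recover a close to 500 and b close to 0.8.
sizes = [1, 2, 4, 8, 16]
costs = [500 * s ** 0.8 for s in sizes]
a, b = fit_power_function(sizes, costs)
print(f"a = {a:.2f}, b = {b:.3f}")
```

An exponent b below 1 corresponds to cost rising less than proportionally with size, as in the construction example.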
The Learning Curve
The learning curve (also called the experience curve) describes the systematic reduction in the time (and cost) required to perform a task as cumulative experience increases. The more you do something, the faster and cheaper it becomes.
The Basic Learning Curve Model
The most common formulation is the cumulative average-time learning model:
When cumulative output doubles, the cumulative average time per unit falls to a fixed percentage of the previous average.
For example, with an 80% learning curve:
| Cumulative Units | Cumulative Avg Hours/Unit | Total Hours |
|---|---|---|
| 1 | 100 | 100 |
| 2 | 80 (100 × 80%) | 160 |
| 4 | 64 (80 × 80%) | 256 |
| 8 | 51.2 (64 × 80%) | 409.6 |
| 16 | 40.96 (51.2 × 80%) | 655.4 |
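The table follows from the usual algebraic form of the cumulative average-time model, y = a·X^b, where a is the time for the first unit, X is cumulative units, and b = ln(learning rate) / ln 2. The sketch below reproduces the table rows and also works for volumes that are not exact doublings. The function names are illustrative.

```python
import math


def cumulative_average_time(first_unit_time, learning_rate, cumulative_units):
    """Cumulative average time per unit: y = a * X**b, with b = ln(rate) / ln(2)."""
    b = math.log(learning_rate) / math.log(2)
    return first_unit_time * cumulative_units ** b


def total_time(first_unit_time, learning_rate, cumulative_units):
    """Total time for all units so far = cumulative average * cumulative units."""
    return cumulative_units * cumulative_average_time(
        first_unit_time, learning_rate, cumulative_units
    )


for units in (1, 2, 4, 8, 16):
    avg = cumulative_average_time(100, 0.80, units)
    print(f"{units:>2} units: avg {avg:6.2f} h/unit, total {total_time(100, 0.80, units):7.2f} h")
```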
The learning curve has profound implications for cost estimation, pricing, and bidding:
- Early production is expensive: The first units cost significantly more per unit than later ones. Pricing based on early production costs will overstate the long-run cost.
- Cost predictions must account for learning: If you use early production data to estimate a cost function, you will overestimate costs for future production.
- Competitive bidding: Companies with more cumulative experience have lower costs and can bid more aggressively.
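To see how large the estimation error can be, consider a hypothetical bid for 8 more units after the first 8 in the 80% table have been built. The sketch below compares a naive estimate that reuses the current average of 51.2 hours per unit with the incremental hours the model implies. The scenario and variable names are assumptions for illustration.

```python
import math


def total_hours(first_unit_hours, learning_rate, cumulative_units):
    """Total hours under the cumulative average-time model."""
    b = math.log(learning_rate) / math.log(2)
    return first_unit_hours * cumulative_units ** (1 + b)


# Values from the 80% table: 8 units already built, bidding on 8 more.
built, order = 8, 8
naive = order * total_hours(100, 0.80, built) / built  # reuse 51.2 h/unit
with_learning = total_hours(100, 0.80, built + order) - total_hours(100, 0.80, built)
print(f"Naive estimate: {naive:.1f} h, with learning: {with_learning:.1f} h")
```

The naive figure is 409.6 hours, while the model implies about 245.8 hours for units 9 through 16, so a bid built on the historical average would be priced well above the expected cost.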
Where Learning Curves Apply
- Labor-intensive manufacturing (aircraft, electronics assembly)
- Software development (subsequent projects using similar technology)
- Professional services (consultants become more efficient with repeated engagements)
- Surgical procedures (outcome quality improves with surgeon experience)
Limitations of Learning Curves
- Learning eventually plateaus — the curve flattens as workers approach maximum proficiency
- Worker turnover sets the curve back: new employees start near the beginning of their own learning
- Process changes disrupt the curve — new technology requires relearning
- The learning rate varies by industry and task complexity
Criteria for Choosing Cost Drivers
Selecting the right cost driver is perhaps the most consequential decision in cost estimation. Horngren provides five criteria, listed in order of importance:
1. Economic Plausibility
The cost driver must have a logical, cause-and-effect relationship with the cost. Machine hours should drive machine-related costs. Number of orders should drive order-processing costs. If you cannot explain why the driver causes the cost, the statistical relationship may be spurious — a coincidence that will not hold in the future.
2. Goodness of Fit (R²)
The driver should explain a high proportion of cost variation. Among economically plausible candidates, prefer the one with the highest R².
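Comparing candidates on this criterion amounts to computing R² for each one against the same cost series. The sketch below does so for two hypothetical drivers. The data and driver names are invented, and both candidates are assumed to have already passed the economic plausibility test.

```python
def r_squared(xs, ys):
    """R^2 of a simple linear regression of ys on xs (squared correlation)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


# Hypothetical monthly data.
cost = [1000, 1200, 1400, 1600, 1800]
candidates = {
    "machine_hours": [10, 21, 29, 41, 50],
    "purchase_orders": [5, 3, 8, 4, 9],
}
fits = {name: r_squared(values, cost) for name, values in candidates.items()}
best = max(fits, key=fits.get)
for name, value in fits.items():
    print(f"{name}: R^2 = {value:.3f}")
print("Best fit among plausible candidates:", best)
```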
3. Significance of the Relationship
The coefficient on the driver should be statistically significant (p-value < 0.05). A driver that appears to fit well on average but has a high p-value may be unreliable — the relationship could be due to chance.
4. Specification Analysis
Examine the residuals (the differences between predicted and actual costs):
- Residual plot: Plot residuals against the predicted values. Ideally, residuals are randomly scattered around zero. Patterns (fan shapes, curves, clusters) indicate problems with the model.
- Linearity check: If residuals show a curved pattern, the true relationship may be nonlinear.
- Heteroscedasticity: If residuals fan out as the predicted value increases, the variance is not constant — this violates a regression assumption.
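The linearity check can be illustrated by fitting a straight line to data that are actually curved and inspecting the residuals. The sketch below uses hypothetical data generated from cost = 1,000 + 10 × volume², and takes each residual as actual minus predicted cost.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return mean_y - b * mean_x, b


# Hypothetical data with a curved relationship: cost = 1000 + 10 * volume**2.
volume = [1, 2, 3, 4, 5]
cost = [1000 + 10 * v ** 2 for v in volume]

a, b = fit_line(volume, cost)
residuals = [y - (a + b * x) for x, y in zip(volume, cost)]
print([round(r, 2) for r in residuals])
```

The residuals are positive at both ends and negative in the middle. That systematic U shape, rather than random scatter around zero, is the pattern that suggests a nonlinear cost function.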
5. Data Collection Feasibility
The driver must be measurable and the data must be available. A theoretically perfect driver is useless if the company cannot collect the data needed to use it. Sometimes a slightly inferior driver with readily available data is preferable to a superior driver that requires an expensive new measurement system.
Integrating Cost Estimation with Decision-Making
The ultimate purpose of cost estimation is not to produce elegant equations — it is to support better decisions. A few principles to keep in mind:
- Use multiple methods: Triangulate by comparing results from account analysis, high-low, and regression. If all three point in the same direction, confidence is high.
- Involve operations managers: Accountants crunch numbers; operations managers understand the physical processes that generate costs. The best cost functions emerge from collaboration.
- Update regularly: Cost functions are perishable. As technology changes, as prices shift, as processes evolve, the old cost function becomes unreliable. Re-estimate at least annually.
- Communicate uncertainty: Present a range of estimates, not a single point. “We estimate maintenance costs will be between $10,000 and $12,000” is more honest and more useful than “$11,000 exactly.”
Key Takeaways
- Multiple regression uses two or more cost drivers to explain cost behavior — evaluated using adjusted R-squared and checking for multicollinearity.
- Nonlinear cost functions (economies of scale, step functions, exponential curves) require specialized estimation techniques or data transformations.
- Learning curves describe the systematic reduction in per-unit cost as cumulative production increases — critical for pricing, bidding, and cost forecasting.
- Cost driver selection follows five criteria: economic plausibility, goodness of fit, statistical significance, specification analysis (residual patterns), and data collection feasibility.
- Economic plausibility is the most important criterion — a statistically beautiful model without logical justification is a trap.
- Cost estimation is a means to better decisions, not an end in itself. Use multiple methods, involve operations experts, update regularly, and communicate uncertainty.
What’s Next
Congratulations — you have completed Module 4: Cost Terminology and Behavior. You now command the full vocabulary and analytical toolkit of cost accounting: cost objects, cost drivers, variable and fixed costs, CVP analysis, and the quantitative methods for estimating cost functions. Module 5 awaits, where we will apply these tools to product costing systems — job-order costing and process costing. The theory meets practice; the equations meet the factory floor.
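Traditional Chinese Edition (繁體中文)
Overview
- What you’ll learn: Multiple regression analysis, nonlinear cost functions, learning curves, and criteria for selecting cost drivers.
- Prerequisites: Lesson 7, Cost Estimation Methods
- Estimated reading time: 20 minutes
Introduction
The Grand Historian records: In the preceding seven lessons, we have traveled from the management accountant’s mandate through cost objects, cost behavior, and CVP analysis to cost estimation. Yet every model so far has shared one simplifying assumption: that a single cost driver explains a single cost, and that the relationship is linear. Reality is rarely so accommodating. Many costs are driven by several factors at once, some costs follow curves rather than straight lines, and choosing the right cost driver from many candidates requires not only statistical skill but business judgment of the highest order.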
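Multiple Regression Analysis
Multiple regression extends simple regression to include two or more independent variables: Y = a + b₁X₁ + b₂X₂ + … + bₙXₙ
When to use it: manufacturing overhead depends on both machine hours (X₁) and number of setups (X₂); shipping costs depend on weight (X₁) and number of shipments (X₂).
Evaluating Multiple Regression
Use adjusted R² rather than plain R². Adjusted R² penalizes variables that do not genuinely improve the model. Each coefficient should be statistically significant (p-value < 0.05), signs should be economically plausible, and there should be no multicollinearity.
The Danger of Multicollinearity
Multicollinearity occurs when two or more independent variables are highly correlated. Symptoms: coefficients with unexpected signs, individual coefficients that are not significant even though overall R² is high, and coefficients that change sharply when a variable is added or removed. Solution: choose only one variable from each group of correlated variables, or combine them into a composite measure.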
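Nonlinear Cost Functions
Economies and Diseconomies of Scale
At low volumes, per-unit cost falls as volume rises (economies of scale), but beyond a certain point per-unit cost may rise (diseconomies of scale, such as overtime and congestion). The total cost function is an S-shaped curve.
Step Functions
Handle these with dummy variables, by segmenting the data, or by treating the cost as fixed within the relevant range.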
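The Learning Curve
The learning curve describes the systematic reduction in the time (and cost) required to perform a task as cumulative experience increases.
Example of an 80% learning curve: each time cumulative output doubles, the cumulative average time per unit falls to 80% of its previous value.
| Cumulative Units | Cumulative Avg Hours/Unit | Total Hours |
|---|---|---|
| 1 | 100 | 100 |
| 2 | 80 | 160 |
| 4 | 64 | 256 |
| 8 | 51.2 | 409.6 |
Implications: early production is expensive, cost predictions must account for learning, and companies with more experience can bid more aggressively.
Where it applies: labor-intensive manufacturing, software development, professional services, and surgical procedures.
Limitations: learning eventually plateaus, worker turnover resets the curve, and process changes disrupt the curve.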
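Criteria for Choosing Cost Drivers
- Economic plausibility: The driver must have a logical cause-and-effect relationship with the cost. This is the most important criterion.
- Goodness of fit (R²): Prefer the driver that explains a higher proportion of cost variation.
- Significance of the relationship: The coefficient must be statistically significant (p-value < 0.05).
- Specification analysis: Check the residual plot. Residuals should be randomly scattered around zero.
- Data collection feasibility: The driver must be measurable and the data must be available.
Integrating Cost Estimation with Decision-Making
- Use multiple methods to triangulate
- Involve operations managers: accountants crunch the numbers, operations managers understand the processes
- Update regularly: re-estimate at least annually
- Communicate uncertainty: present a range of estimates rather than a single number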
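Key Takeaways
- Multiple regression uses two or more cost drivers. Evaluate it with adjusted R² and check for multicollinearity.
- Nonlinear cost functions (economies and diseconomies of scale, step functions, exponential curves) require specialized estimation techniques.
- Learning curves describe the systematic reduction in per-unit cost as cumulative production increases.
- Cost driver selection follows five criteria, with economic plausibility the most important.
- Cost estimation is a means of supporting better decisions, not an end in itself.
What’s Next
Congratulations: you have completed Module 4: Cost Terminology and Behavior. You now command the full vocabulary and analytical toolkit of cost accounting. Module 5 awaits, covering job-order costing and process costing. The theory meets practice; the equations meet the factory floor.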
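Japanese Edition (日本語)
Overview
- What you’ll learn: Multiple regression analysis, nonlinear cost functions, learning curves, and criteria for selecting cost drivers.
- Prerequisites: Lesson 7, Cost Estimation Methods
- Estimated reading time: 20 minutes
Introduction
The Grand Historian records: In the preceding seven lessons, we have journeyed from the management accountant’s mandate through cost objects, cost behavior, and CVP analysis to cost estimation. However, every model so far has shared one simplifying assumption: that one cost driver explains one cost, and that the relationship is linear. Reality is rarely so cooperative. Many costs are driven by several factors at once, some follow curves rather than straight lines, and choosing the right cost driver from many candidates requires not only statistical skill but business judgment of the highest level.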
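Multiple Regression Analysis
Multiple regression extends simple regression to two or more independent variables: Y = a + b₁X₁ + b₂X₂ + … + bₙXₙ
When to use it: for example, when manufacturing overhead depends on both machine hours (X₁) and number of setups (X₂).
Evaluating Multiple Regression
Use adjusted R², which penalizes the addition of variables that do not genuinely improve the model. Confirm that each coefficient is statistically significant (p-value < 0.05), that signs are economically plausible, and that there is no multicollinearity.
The Danger of Multicollinearity
Multicollinearity occurs when two or more independent variables are highly correlated. Symptoms: coefficients with unexpected signs, and individual coefficients that are not significant even though overall R² is high. Solution: choose only one variable from each correlated group.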
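Nonlinear Cost Functions
Economies and Diseconomies of Scale
At low volumes, per-unit cost decreases (economies of scale), but beyond a certain point it increases (diseconomies of scale, such as overtime and congestion). The total cost function is an S-shaped curve.
Step Functions
Handle these with dummy variables, by segmenting the data, or by treating the cost as fixed within the relevant range.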
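The Learning Curve
The learning curve describes the systematic reduction in the time (and cost) required to perform a task as cumulative experience increases.
80% learning curve: each time cumulative output doubles, the cumulative average time per unit falls to 80% of its previous value.
| Cumulative Units | Cumulative Avg Hours/Unit | Total Hours |
|---|---|---|
| 1 | 100 | 100 |
| 2 | 80 | 160 |
| 4 | 64 | 256 |
| 8 | 51.2 | 409.6 |
Implications: early production is costly, cost predictions should account for learning, and experienced companies can bid aggressively.
Where it applies: labor-intensive manufacturing, software development, professional services, and surgical procedures.
Limitations: learning eventually reaches a plateau, staff turnover resets the curve, and process changes interrupt it.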
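Criteria for Choosing Cost Drivers
- Economic plausibility: The driver and the cost must have a logical cause-and-effect relationship. This is the most important criterion.
- Goodness of fit (R²): Prefer the driver that explains more of the variation in cost.
- Significance of the relationship: The coefficient must be statistically significant (p-value < 0.05).
- Specification analysis: Check the residual plot. Residuals should be randomly scattered around zero.
- Data collection feasibility: The driver must be measurable and the data must be available.
Integrating Cost Estimation with Decision-Making
- Triangulate with multiple methods
- Involve operations managers
- Update regularly: at least once a year
- Communicate uncertainty: present a range of estimates rather than a single number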
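Key Takeaways
- Multiple regression uses two or more cost drivers. Evaluate it with adjusted R² and check for multicollinearity.
- Nonlinear cost functions require specialized estimation techniques or data transformations.
- Learning curves describe the systematic decline in per-unit cost as cumulative production increases.
- Cost driver selection follows five criteria, with economic plausibility the most important.
- Cost estimation is a means of supporting better decisions, not an end in itself.
What’s Next
Congratulations: you have completed Module 4: Cost Terminology and Behavior. You have acquired the full vocabulary and analytical toolkit of cost accounting. Module 5 covers job-order costing and process costing. The theory meets practice; the equations meet the factory floor.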