Making calculations dynamic

Chapter 3
40 min read

Making Calculations Dynamic โšก

Leveraging Dynamic Arrays and Lambda Functions for Scalable and Auditable Financial Models

Dynamic array and Lambda fundamentally change how calculations are performed in spreadsheets ๐Ÿš€. The following outlines this change:

๐Ÿ”„

Key Insight: The calculation approach moves from individual cell-by-cell calculations that are copy-pasted to dynamic array-based expressions, enabling more possibilities that were not possible earlier.

  • ๐Ÿ“Š Dynamic Arrays: Favors in-memory computations where intermediate steps can be stored in temporary variables rather than storing them in cells.
  • ๐Ÿ”ง Custom Functions: Allows creating custom user-defined functions, using spreadsheets’ own formula language within the workbook.
  • ๐Ÿ” Iterative Logic: Enables using iterative and recursive logics in functions.
  • โš ๏ธ Risk Reduction: Eliminates circular reference problems and reduces the need for enabling iterative calculations in the workbook.

However, these enhancements create an overwhelming number of choices for performing the same calculation including some that are inherently risky. This chapter offers the guidelines on the appropriate approach that satisfies the BEST principles.

๐Ÿ“– Terminologies and Definitions

In this chapter and other chapters, certain terms are used with specific meaning in the context of this guide.

๐Ÿ”น Lambda(s) vs LAMBDA:

In the context of this guide, the term “Lambdas” refers to any custom function created using the LAMBDA function in spreadsheet. Some part of the guide may also refer to them as custom functions.

The term “LAMBDA” written in all upper case refers to the built-in function in spreadsheet which is used to create the custom functions.

3.1 ๐ŸŽฏ Array Referencing Practices

๐Ÿ“‹ Prescriptions:

P8

๐Ÿ”— Use dynamic array format for referencing

When referring to a dynamic array range, always refer to it using the dynamic array format and avoid using fixed range references.

Example: If a dynamic array in cell B5 spills to B9, reference the result as =B5# instead of =B5:B9. This is critical to ensure the model dynamically updates if the formula in B5 produces larger results.

Show rationale

Rationale:

  • ๐Ÿ”„ Dynamic expansion: If the referred spilled array expands, dynamic references will automatically consider such additional values. Static references break this adaptability.
P9

๐ŸŽฏ Reference array elements, not direct cells

Specific cells or ranges within an array should be referred as an element of array and not through direct cell reference. Functions such as TAKE, DROP, FILTER, XLOOKUP, CHOOSECOL can be used for this purpose.

Example: For instance, let us say we have the previous 20 days stock prices, in ascending order of dates, as a single array spilling from cells B3:B22. Here are the right and wrong approaches:

โœ… Correct Approach
=DROP(B3#,1)/DROP(B3#,-1) - 1
โŒ Avoid This
=B4:B22/B3:B21 - 1
Show rationale

Rationale:

  • ๐Ÿ”ง Stable under change: When the size of the array expands or shrinks, array-based reference will automatically adapt, while static references will fail.
P10

๐ŸŽจ Assign cell references to variables in LET functions

Any reference to a cell or range inside a LET function must be assigned to a variable and used. A particular cell or range must not be referred more than once inside a given LET function.

Example: say cells C15, D15, and E15 have start date, end date, and current date and the let function needs to calculate two variables: (i) Total duration, which is D15 minus C15 and (ii) Lapsed time, which is MIN(E15-C15, D15-C15).

The appropriate way to use these cells is as follows:

โœ… Correct Approach
=LET(StDt, C15,
    EndDt, D15,
    CurrentDt, E15,
    TotalDur, EndDt - StDt,
    LapsedPrd, MIN(CurrentDt - StDt, EndDt - StDt),
    ...)
Show rationale

Rationale:

  • ๐ŸŽฏ Centralizes changes in cell reference: If the cell reference needs to be changed, the change will need to be done only once. This reduces the risk of inconsistent changes.
P11

๐Ÿ”’ Create “sort proof” references

References to tables and arrays must be in a way that the formula returns correct answer even if the order of the underlying data changes.

Example: Let us say we have SalesTbl with annual sales by year, with the sales for 2025 at the bottom. But sorting can move the year’s data from its original position. To reliably get the sales figure for 2025, regardless of sorting, use a formula such as this:

โœ… Sort Proof Approach
=XLOOKUP(LookupYr, SalesTbl[Year], SalesTbl[Amt])

Where LookupYr is 2025

Show rationale

Rationale:

  • ๐Ÿ›ก๏ธ Increases integrity and stability: User interactions such as sorting the underlying data are very common. Sort proofing ensures output is not affected by such interactions.
P12

๐Ÿšซ Filter out invalid or stray rows

Any reference to an input table should anticipate invalid or stray rows and filter them out.

Example: Continuing with the previous example, let us say we need to get the latest year sales from the SalesTbl, which also has an IsValid column โ€” that returns TRUE only when the row meets the validation criteria. A formula such as the one below would give correct answer even if end users add place holder rows for future periods.

โœ… Filtered Approach
=LET(LatestYr, MAXIFS(SalesTbl[Year], SalesTbl[IsValid], TRUE),
    Return, XLOOKUP(LatestYr, SalesTbl[Year], SalesTbl[Amt]),
    Return)
Show rationale

Rationale:

  • ๐Ÿ›ก๏ธ Avoids unwarranted dynamic expansion: Safeguards the models from being affected by common end user interactions such as including placeholder rows or filling up partial data sets.
P13

๐Ÿ”— Handle external references with ETL tools

External references must always be handled using ETL tools, such as Power Query or SQL, and directly linking a cell to another workbook must be avoided.

Sometimes, a model requires data from another modelโ€”for example, a Master Budget may need information from various functional budgets, or an industry analysis may need data from individual files for different companies. In MS Excel, Power Query can integrate such data efficiently.

Show rationale

Rationale:

  • ๐Ÿ”„ Avoids risk of stale values: Data brought through copy-pasting could turn stale when other models are updated.
  • ๐ŸŽฏ Prevents incorrect references: Live workbook links are vulnerable to changes in cell locations. Importing data from structured export sheets reduces both risks.

๐Ÿ’ก Recommendations:

R12

๐Ÿ“Š Use structured referencing for tables

Use structured referencing while referring to Structured tables.

Show rationale
  • ๐Ÿ“– Improves readability: Structured reference with clear table and column names are easier to read and modelers can easily refer to tables with IntelliSense.
  • ๐ŸŽฏ Reduces chances of wrong references: Structured reference ensures table columns are always referred in their entirety and wrong columns can be easily spotted.

Note: Structured references are absolute but act as relative references when dragged right. Users should be aware of this behavior.

R13

๐Ÿท๏ธ Name cells containing spill formulas

Name the cells that contain the spill formula.

Show rationale
  • ๐Ÿ“– Improves readability and accessibility: Named references are easier to understand and access than relative references, improving formula readability throughout the guide.

3.2 ๐Ÿงฎ Calculations and Calculation Blocks

๐Ÿ“‹ Prescriptions:

P14

โฑ๏ธ Use model-specific timelines

Timelines in a financial model must be model specific and not schedule or line item specific.

Say a project has a life of 25 years, while the debt taken for it should be repaid in lesser duration of 20 years. If an analyst is projecting cash flows for the full 25 years, the debt schedule must also flow for the entire 25-year period, with a formula that makes all the values in the debt schedule going to zero for years 21 to 25.


However, this guide does not discourage, modelling certain line items at a higher frequency โ€” than others โ€” in a separate schedule and aggregating them in a common schedule.


For instance, an analyst may choose to model revenue and operating expenses for every quarter while modelling interest and taxes at an annual level and present all of them on annual level. Such a practice is acceptable provided the quarterly schedules are organized in a separate calculation section.

Show rationale

Rationale:

  • ๐Ÿ”ง Prevents fragmentation and errors: Ensuring all arrays are of uniform length ensures that operations on multiple arrays work seamlessly. Different length arrays may cause DA formula errors.
P15

๐Ÿ“Š Use dynamic arrays for vector/matrix outputs

All calculations that produce a vector or matrix output must be modeled as a dynamic array and no such item shall be modeled using legacy scalar iterative approach.

Many multi-level corkscrew calculations โ€” such as a complex cash sweep โ€” are, perhaps, easier to build using legacy approach. However, if the model is built using DA approach, then such calculation blocks must also be built using DA approach, even if it appears more complicated.
Show rationale

Rationale:

  • ๐Ÿ›ก๏ธ Ensures integrity of calculations: Since the entire vector/matrix is driven by one formula, there’s no risk of inconsistent formulas within a range.
  • ๐Ÿ”ง Reduces dependency on end users: Using only dynamic arrays eliminates manual tasks for end users. Legacy methods require copy-pasting and adjusting formulas, increasing error risk.

โš ๏ธ Important: The spreadsheet risk when legacy approach is mixed with dynamic array is far greater than in a pure legacy approach. This guide recommends using pure legacy approach over mixed approach if modeling any array using dynamic approach is impossible.

P16

๐Ÿšซ Avoid enabling iterative calculations

Modelers must avoid enabling iterative calculations in DA models. They must opt for alternative closed form solutions, or iterative/recursive Lambdas.

Circular reference issues in DA models can be handled by adopting a different set of equations (derived algebraically) that avoid circularity. If that is not possible, modelers can create iterative or recursive Lambda to solve such problems.
Show rationale

Rationale:

  • โš ๏ธ Avoids unpredictable model behavior: If variables that drive spill array size become circular, they may cause incorrect spilling leading to unpredictable results.
  • ๐Ÿ” Extremely difficult to debug: When external iterative calculations affect in-memory DA computations, bugs are far more difficult to trace and debug.
P17

๐Ÿšซ Avoid TRIMRANGE functions

Modelers must not use TRIMRANGE functions or trim operators while referring to a cell as they increase the risk of being affected by stray values.

Show rationale

Rationale:

  • โš ๏ธ Inadvertent stray values should not affect calculation: End users may type stray values in what appears to be blank cells, leading to all dependent calculations automatically extending and producing erroneous output.
P18

๐Ÿ›‘ Avoid using calculated columns in structured tables

Calculated columns in structured tables must not be used for model calculations except in the following circumstances:

  • Validation columns, described in recommendation R6 and R7 can be used for filtering out stray or invalid data.
  • When the calculations need to be done iteratively across the rows in a table and it is impossible to do it as a dynamic array with acceptable level of performance.
Most iterative calculations can be computed as dynamic arrays using SCAN and REDUCE functions. However, there can be tail cases where calculated columns in structured tables is the only resort.
Show rationale

Rationale:

  • ๐Ÿฆ  Prevent model errors from polluted tables: Although structured tables are also dynamic and expand, they are not fail-proof. For instance, cell in a calculated column can be overwritten, especially when the sheet is unprotected. It makes such calculated columns vulnerable for unauthorized editing.
P19

๐Ÿ›‘ Avoid stacking multiple variables in one array

Modelers must not stack multiple variables in one array except in the following cases:

  • All the variables are inter-related and computed together iteratively, or recursively.
  • The stack is used to present the final output of the model โ€” for display or export.
  • The stacking is required as an intermediate step inside a formula, but the formula does not output such a stack.
๐Ÿ’ก Examples:
โŒ Term loan schedule: Even though you could create one function returning opening balances, interest expense, borrowings, repayments, and closing balance as a single schedule, each of these six variables should be presented as individual vectors.

โœ… Cash sweep modeling: For a revolver, cash flows, interest, taxes and borrowings are inter-related and must be modeled iteratively using REDUCE – stacking is necessary here.

โœ… Output sheets: Stacking multiple schedules together is acceptable for output or export sheets if individual vectors aren’t used elsewhere in the model.

Show rationale

Rationale:

  • ๐ŸŽฏ Improves Accessibility: When each variable is presented in an independent vector, it is easy to refer in all dependent calculations.
  • ๐Ÿ”ง Maintains model stability when new line items are added: When variables are stacked, functions like INDEX or CHOOSEROWS require specific row or column references. If the number of items or their order in the stack changes, these functions may return incorrect results.
ย 
P20

๐Ÿ”ฌ Include validation schedule to support audit of complex in-memory computation.

When major computations are processed in-memory and intermediate steps are not directly observable, modelers must include validation utilities to verify the output.

๐Ÿ“ Example: If a fund holds hundreds of debentures, a scalable approach would be to model the entire balances, redemptions, and incomes for multiple periods together in a single schedule rather than preparing separate schedules for each debenture and separately aggregating them.

Although these are scalable, auditing or validating them would be extremely difficult.

๐Ÿ’ก Solution: The model should have separate calculation section that allow reviewers to drill down the calculation at individual debt level. This calculation section will serve as a validation utility for the main model.

Show rationale

Rationale:

  • โš–๏ธ Balances auditability and scalability: In-memory computations are highly scalable and efficient. However, they significantly reduce the transparency of the model. Having a utility that can help validate the calculations, and which are easy to understand will improve the transparency, and hence, the trust on the financial model.
P21

๐Ÿšง Ensure that model layout will not create spill error

Calculations and calculation blocks must be positioned in a way that they do not potentially cause a spill error if the dynamic array expands.

๐Ÿ“ Positioning Rules:

  • โžก๏ธ If a calculation block is expected to expand into more columns, do not place other calculation sections to its right.
  • โฌ‡๏ธ If it is expected to expand across rows, avoid placing anything below it.
  • ๐Ÿ“Š Totals are best placed above or to the left of such blocks.
Show rationale

Rationale:

  • โš ๏ธ Prevents spill errors: Keeping other cells clear along the array’s direction helps avoid spill errors. While these errors are usually fixed by adding rows or columns, they may affect downstream models importing data (until the error is resolved).

๐Ÿ’ก Recommendations:

R14

๐Ÿ” Show intermediate steps transparently

Modelers should show intermediate steps transparently in the model unless it significantly affects scalability or performance.

Show rationale

Rationale:

  • โš–๏ธ Balances transparency and scalability: Breaking calculations into intermediate steps makes it easy to follow the model without significantly compromising performance or scale.
R4

โœ… Use approved Lambdas for complex computations

Complex in-memory computations should be done through approved Lambda rather than as direct formula in the cell. Writing complex logic directly in a cell should be opted only if the modeler wants to protect the IP.

Chapter 6 prescribes governance strategies and guidelines for Lambdas to be developed and approved with the objective of ensuring the Lambdas that are used in production are highly reliable.
Show rationale

Rationale:

  • ๐Ÿ›ก๏ธ Increases reliability: Reviewers and auditors can easily validate an approved Lambda rather than complex multi-line logics written directly in cells.
R16

๐Ÿ“ค Create machine-friendly export sheets

Models should have well-structured and machine friendly set of export sheets that can be used by downstream models to import data.

Show rationale

Rationale:

  • ๐Ÿ”— Allows seamless integration: Downstream models can use SQL, Power Query or other methods to easily extract data if the file has well-structured export sheets.

Note: Export sheets are different from output sheets – they are meant for machine reading, not presentation. Appendix 5 suggests best practice for export sheets

๐Ÿ“ Summary

In this chapter we had discussions on how calculation section should be set up and considerations that should be given in choosing a proper calculation approach so that models remain stable, scalable and efficient. ๐ŸŽฏ

๐ŸŽฏ Key Guidelines

๐Ÿ”— Strict Array Reference Principles:

  • Dynamic arrays must always be referred using the # operator and static range reference must be avoided
  • Specific cells or ranges within an array should be referred as array elements, not through direct cell reference
  • Cell references inside LET function must be assigned to variables and same cell must not be referred more than once
  • Structured references and named cell references are preferred over relative cell references

๐Ÿ—๏ธ Calculation Setup Practices:

  • Timeline must be model specific and not schedule specific
  • Multiple variables must not be stacked together except for certain exceptional cases

๐Ÿ›ก๏ธ Fail-proof Formula Against User Interactions:

  • Sort proof financial models
  • Prevent stray data and incomplete data updates from causing dynamic expansion

๐Ÿšซ Avoid Legacy Practices:

  • Legacy approach must not be mixed with dynamic array approach
  • Iterative calculation setting must stay disabled

โš–๏ธ Balance Memory vs Sheet Storage:

  • Intermediate steps should be stored in sheets if it doesn’t cause performance issues
  • If performance issues occur, in-memory calculations must be preferred with approved Lambda and validation sections

๐Ÿ”œ Next Steps: This chapter guided that complex logic should be handled using approved Lambda. In the next chapter, we shall discuss what process and practice should be followed so that approved Lambdas are more reliable than explicitly written formulae.