FATCA data dictionary — structuring sources→targets, transformations & versioning

A tight data dictionary keeps FATCA reporting predictable and debuggable. This guide shows how to structure source→target mappings, document transformations & validations, manage lineage & ownership, and version-lock per filing year.

Scope: Applicable to Form 8966 (Model 2) and local IGA portal schemas. Use this structure alongside your validation summary and submission receipts.

1) What a reviewer expects to see

  • One table per filing year & schema version (e.g., 2025-IGA-CH-v1.2, 2025-IRS-8966-v1.0).
  • Every target field listed with data type, null policy, validation rules, and owner.
  • Source lineage (system.table.column), extraction logic, and transformations.
  • Change log with rationale, approver, and effective date.
  • Examples & edge cases for tricky fields (country, TIN, GIIN, FX, thresholds).

2) Recommended columns (data dictionary schema)

Column | Purpose / examples
Target Field | Exact field name in 8966 or local portal (e.g., ReportingFIName, AccountNumber).
Target Type / Length | String(70), Integer, Decimal(18,2), Date(YYYY-MM-DD).
Null Policy | Required / Optional / Conditionally Required (rule noted).
Source(s) | System.Table.Column (multiple if federated); join keys, filters.
Transformation | Trim, upper, ISO-3166 mapping, TIN normalization, FX conversion with source/rate date.
Validation Rules | Regex, allowed lists (countries), cross-field checks (GIIN needed if status=X).
Owner / Steward | Role or name accountable for field quality.
Examples & Edge Cases | 2–3 realistic examples; tricky cases and expected outputs.
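
The column list above can be sketched as a small record type with a completeness check. This is a minimal illustration, not a prescribed implementation: the names DictionaryEntry and entry_problems are hypothetical, and the three checks shown (null policy, source lineage, owner) are only a subset of what a reviewer might require.

```python
from dataclasses import dataclass, field
from typing import List

# Null policies as listed in the "Null Policy" column.
ALLOWED_NULL_POLICIES = {"Required", "Optional", "Conditionally Required"}


@dataclass
class DictionaryEntry:
    """One row of the data dictionary (hypothetical structure)."""
    target_field: str
    target_type: str
    null_policy: str
    sources: List[str]            # System.Table.Column strings
    transformation: str
    validation_rules: List[str]
    owner: str
    examples: List[str] = field(default_factory=list)


def entry_problems(entry: DictionaryEntry) -> List[str]:
    """Return human-readable problems; an empty list means the row looks complete."""
    problems = []
    if entry.null_policy not in ALLOWED_NULL_POLICIES:
        problems.append(f"{entry.target_field}: unknown null policy {entry.null_policy!r}")
    if not entry.sources:
        problems.append(f"{entry.target_field}: no source lineage recorded")
    if not entry.owner.strip():
        problems.append(f"{entry.target_field}: no owner / steward assigned")
    return problems
```

Running entry_problems over every row before sign-off gives a quick list of fields that still lack lineage or an owner.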

3) Good transformation notes (copy-paste patterns)

COUNTRY_CODE := ISO3166_ALPHA2( trim(upper(Source.CountryName)) )
GIIN := upper(trim(Source.GIIN))  // keep letters and dots; validate: <6 alnum>.<5 digits>.<2 alnum>.<3 digits>
TIN_STORE := digits_only(Source.TIN) ; TIN_DISPLAY := fmt_tin(TIN_STORE, type)
ACCOUNT_BALANCE_USD := round(Source.Balance * FX_RATE("ECB", ReportDate, Source.Currency, "USD"), 2)
REPORTABLE_FLAG := IF(ACCOUNT_BALANCE_USD >= THRESHOLD(country=model), "Y", "N")  // threshold looked up per country / IGA model
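
Three of the patterns above (country mapping, digits-only TIN storage, FX conversion with rounding) could be written in Python roughly as follows. This is a sketch under stated assumptions: the lookup table is a three-entry placeholder rather than the full ISO 3166-1 list, the FX rate is passed in by the caller instead of being fetched from ECB, and half-up rounding is an assumption because the notes only say round(..., 2).

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# Placeholder lookup for illustration only; a real mapping covers all of ISO 3166-1.
COUNTRY_TO_ALPHA2 = {"SWITZERLAND": "CH", "UNITED STATES": "US", "GERMANY": "DE"}


def country_code(country_name: str) -> str:
    """Trim and upper-case the source name, then map it to an alpha-2 code.

    Raises KeyError for unmapped names so they surface instead of passing silently.
    """
    return COUNTRY_TO_ALPHA2[country_name.strip().upper()]


def digits_only(value: str) -> str:
    """Strip everything except ASCII digits (used for TIN_STORE)."""
    return re.sub(r"[^0-9]", "", value)


def to_usd(balance: Decimal, fx_rate: Decimal) -> Decimal:
    """Convert a balance with a caller-supplied rate and round to 2 decimal places.

    Assumption: ROUND_HALF_UP; confirm the rounding mode your schema expects.
    """
    return (balance * fx_rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Using Decimal rather than float keeps the rounded amount reproducible when the export is re-run, which matters once the dictionary is frozen for a filing year.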
    

4) Versioning & change control (keep it stable)

  1. Freeze per year: dictionary vY.Y for each filing year & portal schema; no silent edits post sign-off.
  2. Change log: date, field, old→new, reason, approver, impact (re-run needed?).
  3. Reference builds: store the exact export job version/hash that produced the filing.
  4. Cross-links: validation summary & submission receipts reference the same dictionary version.
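
One way to make the freeze and change-log steps above checkable is to hash the signed-off dictionary file and record each change as an immutable entry. The sketch below is an illustration only: content_hash, is_unchanged_since_signoff, and ChangeLogEntry are hypothetical names, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


def content_hash(data: bytes) -> str:
    """SHA-256 hex digest of the dictionary export (e.g. the CSV bytes)."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged_since_signoff(current: bytes, signed_off_hash: str) -> bool:
    """True if the current file still matches the hash recorded at sign-off."""
    return content_hash(current) == signed_off_hash


@dataclass(frozen=True)
class ChangeLogEntry:
    """One change-log line: date, field, old→new, reason, approver, impact."""
    changed_on: date
    field_name: str
    old_value: str
    new_value: str
    reason: str
    approver: str
    rerun_needed: bool
```

Storing the hash next to the sign-off record means a silent edit shows up as a mismatch the next time the check runs.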

5) Example rows (typical fields)

Target Field | Sources | Transformation | Validation | Owner
AccountNumber | Core.Accounts.AcctNo | Left-pad to 12; remove spaces | Unique per year; not null | IT/Data
EntityTIN | KYC.Tax.TIN, KYC.Tax.TINType | digits_only; display per type | Exactly 9 digits; type in {SSN, ITIN, EIN} | Tax Ops
EntityGIIN | GIIN.Master.GIIN | Upper; pattern check; monthly match evidence path | Format & match result logged | Tax Ops
AccountBalanceUSD | DW.Balances.Amount, Currency | ECB rate at YE; round(2); store FX source/date | Non-negative; FX source/date present | Finance
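
The AccountNumber and EntityTIN rows above translate into short normalization and validation helpers. This is a hedged sketch: the function names are hypothetical, and padding with "0" is an assumption because the row says only "Left-pad to 12" without naming the pad character.

```python
import re
from typing import List

ALLOWED_TIN_TYPES = {"SSN", "ITIN", "EIN"}


def normalize_account_number(raw: str, width: int = 12, pad_char: str = "0") -> str:
    """Remove spaces, then left-pad to the target width.

    Assumption: "0" as the pad character; values longer than width are left as-is.
    """
    return raw.replace(" ", "").rjust(width, pad_char)


def tin_problems(tin_store: str, tin_type: str) -> List[str]:
    """Check the stored TIN against the rules in the EntityTIN row."""
    problems = []
    if not re.fullmatch(r"[0-9]{9}", tin_store):
        problems.append("TIN must be exactly 9 digits")
    if tin_type not in ALLOWED_TIN_TYPES:
        problems.append(f"TIN type {tin_type!r} not in allowed list")
    return problems
```

For example, normalize_account_number("12 345 678") yields "000012345678", and tin_problems("123456789", "EIN") returns an empty list.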

6) Governance & evidence

  • Owner map: each field has a steward; issues route to owner with ETA.
  • Validation summary: counts by rule (pass/fail) with maker/checker sign-offs.
  • Dossier links: dictionary PDF/CSV, validation summary, receipts, and corrections log share the same version ID.
  • Retention: store dictionary and exports under your records schedule (e.g., 7 years).
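
The validation summary and the shared version ID described above can be checked mechanically. The following sketch assumes validation results arrive as (rule name, passed) pairs and that each dossier artifact carries a version ID string; both shapes, and the function names, are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def summarize(results: Iterable[Tuple[str, bool]]) -> Dict[str, Dict[str, int]]:
    """Count pass/fail per validation rule."""
    summary: Dict[str, Dict[str, int]] = {}
    for rule, passed in results:
        counts = summary.setdefault(rule, {"pass": 0, "fail": 0})
        counts["pass" if passed else "fail"] += 1
    return summary


def mismatched_artifacts(version_ids: Dict[str, str], expected: str) -> List[str]:
    """Names of dossier artifacts whose version ID differs from the dictionary's."""
    return sorted(name for name, version in version_ids.items() if version != expected)
```

An empty list from mismatched_artifacts is the condition the dossier bullet asks for: dictionary, validation summary, receipts, and corrections log all pointing at the same version.
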
