FATCA data dictionary — structuring sources→targets, transformations & versioning
A tight data dictionary keeps FATCA reporting predictable and debuggable. This guide shows how to structure source→target mappings, document transformations & validations, manage lineage & ownership, and version-lock per filing year.
Scope: Applies to Form 8966 filed directly with the IRS (as under a Model 2 IGA) and to local IGA portal schemas (as under a Model 1 IGA).
Use this structure alongside your validation summary and submission receipts.
1) What a reviewer expects to see
- One table per filing year & schema version (e.g., 2025-IGA-CH-v1.2, 2025-IRS-8966-v1.0).
- Every target field listed with data type, null policy, validation rules, and owner.
- Source lineage (system.table.column), extraction logic, and transformations.
- Change log with rationale, approver, and effective date.
- Examples & edge cases for tricky fields (country, TIN, GIIN, FX, thresholds).
2) Recommended columns (data dictionary schema)
| Column | Purpose / examples |
|---|---|
| Target Field | Exact field name in 8966 or local portal (e.g., ReportingFIName, AccountNumber). |
| Target Type / Length | String(70), Integer, Decimal(18,2), Date(YYYY-MM-DD). |
| Null Policy | Required / Optional / Conditionally Required (rule noted). |
| Source(s) | System.Table.Column (multiple if federated); join keys, filters. |
| Transformation | Trim, upper, ISO-3166 mapping, TIN normalization, FX conversion with source/rate date. |
| Validation Rules | Regex, allowed lists (countries), cross-field checks (GIIN needed if status=X). |
| Owner / Steward | Role or name accountable for field quality. |
| Examples & Edge Cases | 2–3 realistic examples; tricky cases and expected outputs. |
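As a minimal sketch, the columns above could be modelled as one record per target field, with a helper that flags any column left empty before review. The class name `DictionaryEntry`, the helper `missing_columns`, and the `String(9)` / `Required` values for `EntityTIN` are illustrative assumptions, not part of any official schema.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DictionaryEntry:
    """One dictionary row; attribute names mirror the recommended columns."""
    target_field: str
    target_type: str
    null_policy: str       # "Required", "Optional" or "Conditionally Required"
    sources: list[str]     # System.Table.Column lineage
    transformation: str
    validation_rules: str
    owner: str
    examples: list[str] = field(default_factory=list)


def missing_columns(entry: DictionaryEntry) -> list[str]:
    """Return the names of columns that are still empty for this entry."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name)]


# Hypothetical entry; type and null policy are assumed for illustration.
entity_tin = DictionaryEntry(
    target_field="EntityTIN",
    target_type="String(9)",
    null_policy="Required",
    sources=["KYC.Tax.TIN", "KYC.Tax.TINType"],
    transformation="digits_only; display per type",
    validation_rules="Exactly 9 digits; type in {SSN, ITIN, EIN}",
    owner="Tax Ops",
)
print(missing_columns(entity_tin))  # ['examples']
```

Running such a check over every row before sign-off is one way to confirm that no target field is missing its type, null policy, rules, or owner.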
3) Good transformation notes (copy-paste patterns)
COUNTRY_CODE := ISO3166_ALPHA2( trim(upper(Source.CountryName)) )
GIIN := trim(upper(Source.GIIN)) // keep letters and dots; validate: <6 alnum>.<5 digits>.<2 alnum>.<3 digits>
TIN_STORE := digits_only(Source.TIN) ; TIN_DISPLAY := fmt_tin(TIN_STORE, type)
ACCOUNT_BALANCE_USD := round(Source.Balance * FX_RATE("ECB", ReportDate, Source.Currency, "USD"), 2)
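REPORTABLE_FLAG := IF(ACCOUNT_BALANCE_USD > THRESHOLD(Country, IGAModel), "Y", "N") // threshold depends on jurisdiction and IGA model; reportable when the balance exceeds it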
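The patterns above might translate into code along these lines. This is a sketch under stated assumptions: the GIIN regex encodes only the segment layout given in the note above, GIIN normalization trims and upper-cases because the pattern needs the dots and letters intact, the FX rate is passed in by the caller instead of being fetched from any real rate service, and half-up rounding is an assumed convention. The sample GIIN is made up.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# Segment layout from the note above: 6 alnum . 5 digits . 2 alnum . 3 digits
GIIN_PATTERN = re.compile(r"[A-Z0-9]{6}\.[0-9]{5}\.[A-Z0-9]{2}\.[0-9]{3}")


def normalize_giin(raw: str) -> str:
    """Trim and upper-case only; dots and letters are part of the value."""
    return raw.strip().upper()


def is_valid_giin(giin: str) -> bool:
    """True when the whole string matches the segment layout."""
    return GIIN_PATTERN.fullmatch(giin) is not None


def digits_only(raw: str) -> str:
    """Drop every character that is not an ASCII digit."""
    return re.sub(r"[^0-9]", "", raw)


def fmt_tin(tin: str, tin_type: str) -> str:
    """Display form: EIN as NN-NNNNNNN, SSN and ITIN as NNN-NN-NNNN."""
    if len(tin) != 9 or not tin.isdigit():
        raise ValueError("stored TIN must be exactly 9 digits")
    if tin_type == "EIN":
        return f"{tin[:2]}-{tin[2:]}"
    return f"{tin[:3]}-{tin[3:5]}-{tin[5:]}"


def to_usd(balance: Decimal, rate: Decimal) -> Decimal:
    """Convert with a caller-supplied rate and round to 2 decimals (half-up assumed)."""
    return (balance * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(is_valid_giin(normalize_giin(" abc123.00000.le.756 ")))  # True
print(fmt_tin(digits_only("12-3456789"), "EIN"))                # 12-3456789
print(to_usd(Decimal("1000.005"), Decimal("1.1")))              # 1100.01
```

Using `Decimal` rather than `float` keeps the rounding step reproducible, which matters when a filing has to be re-run and compared against the original export.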
4) Versioning & change control (keep it stable)
- Freeze per year: assign one dictionary version (vX.Y) per filing year & portal schema; no silent edits post sign-off.
- Change log: date, field, old→new, reason, approver, impact (re-run needed?).
- Reference builds: store the exact export job version/hash that produced the filing.
- Cross-links: validation summary & submission receipts reference the same dictionary version.
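One way to make "no silent edits" checkable is to fingerprint the signed-off dictionary and record every later change explicitly. The sketch below assumes the dictionary rows are available as plain dicts; `dictionary_fingerprint` and `ChangeLogEntry` are hypothetical names, and the sample change is invented for illustration.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import date


def dictionary_fingerprint(rows: list[dict]) -> str:
    """SHA-256 over a canonical JSON rendering of the dictionary rows."""
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class ChangeLogEntry:
    """Mirrors the change-log columns: date, field, old, new, reason, approver, impact."""
    effective: date
    field_name: str
    old: str
    new: str
    reason: str
    approver: str
    rerun_needed: bool


frozen_rows = [
    {"target_field": "AccountNumber", "owner": "IT/Data"},
    {"target_field": "EntityTIN", "owner": "Tax Ops"},
]
signed_off = dictionary_fingerprint(frozen_rows)

# Any edit after sign-off changes the fingerprint, so it has to be logged.
edited_rows = [dict(row) for row in frozen_rows]
edited_rows[1]["owner"] = "Finance"
change = ChangeLogEntry(
    effective=date(2025, 3, 1),
    field_name="EntityTIN",
    old="owner=Tax Ops",
    new="owner=Finance",
    reason="Hypothetical ownership transfer",
    approver="Head of Tax (example)",
    rerun_needed=False,
)
print(dictionary_fingerprint(edited_rows) == signed_off)  # False
```

Storing the fingerprint next to the export job hash gives the validation summary and submission receipts a single value to reference.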
5) Example rows (typical fields)
| Target Field | Source(s) | Transformation | Validation Rules | Owner / Steward |
|---|---|---|---|---|
| AccountNumber | Core.Accounts.AcctNo | Remove spaces; left-pad to 12 characters | Unique per year; not null | IT/Data |
| EntityTIN | KYC.Tax.TIN, KYC.Tax.TINType | digits_only; display per type | Exactly 9 digits; type in {SSN, ITIN, EIN} | Tax Ops |
| EntityGIIN | GIIN.Master.GIIN | Upper; pattern check; match against the monthly IRS FFI list and store the evidence path | Format & match result logged | Tax Ops |
| AccountBalanceUSD | DW.Balances.Amount, DW.Balances.Currency | ECB rate at year-end; round(2); store FX source/date | Non-negative; FX source/date present | Finance |
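The AccountNumber row shows why edge cases belong in the dictionary: two differently formatted source values can collapse to the same target value and break the uniqueness rule. A minimal sketch, assuming spaces are removed before padding and that the pad character is `0` (the table does not specify one):

```python
from collections import Counter


def normalize_account_number(raw: str, width: int = 12, pad: str = "0") -> str:
    """Remove spaces first, then left-pad to the target width."""
    compact = raw.replace(" ", "")
    if len(compact) > width:
        raise ValueError(f"account number longer than {width} characters: {compact!r}")
    return compact.rjust(width, pad)


def duplicates(values: list[str]) -> list[str]:
    """Values that occur more than once, in first-seen order."""
    counts = Counter(values)
    return [value for value, n in counts.items() if n > 1]


# Made-up source values; the first two differ only in spacing and leading zeros.
raw_numbers = ["12 345 678", "0012345678", "987654321"]
normalized = [normalize_account_number(n) for n in raw_numbers]
print(normalized)              # ['000012345678', '000012345678', '000987654321']
print(duplicates(normalized))  # ['000012345678']
```

A collision like this is worth recording in the Examples & Edge Cases column together with the expected handling.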
6) Governance & evidence
- Owner map: each field has a steward; issues route to owner with ETA.
- Validation summary: counts by rule (pass/fail) with maker/checker sign-offs.
- Dossier links: dictionary PDF/CSV, validation summary, receipts, and corrections log share the same version ID.
- Retention: store dictionary and exports under your records schedule (e.g., 7 years).
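The per-rule counts in a validation summary can be produced from raw check results with very little code. The sketch below assumes each check emits a `(rule_id, passed)` pair; the rule IDs and the version ID are hypothetical.

```python
from collections import defaultdict


def summarize(results: list[tuple[str, bool]]) -> dict[str, dict[str, int]]:
    """Aggregate (rule_id, passed) pairs into pass/fail counts per rule."""
    summary: dict[str, dict[str, int]] = defaultdict(lambda: {"pass": 0, "fail": 0})
    for rule_id, passed in results:
        summary[rule_id]["pass" if passed else "fail"] += 1
    return dict(summary)


# Hypothetical results from one validation run.
results = [
    ("TIN_9_DIGITS", True),
    ("TIN_9_DIGITS", False),
    ("GIIN_FORMAT", True),
    ("TIN_9_DIGITS", True),
]
report = {
    "dictionary_version": "2025-IRS-8966-v1.0",  # same ID as the dossier documents
    "rules": summarize(results),
}
print(report["rules"])
# {'TIN_9_DIGITS': {'pass': 2, 'fail': 1}, 'GIIN_FORMAT': {'pass': 1, 'fail': 0}}
```

Carrying the dictionary version inside the summary keeps it tied to the same version ID as the receipts and corrections log.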
Data dictionary starter (XLSX)
Template with target fields, lineage, transformations, validations, ownership & change log.