Tiramisù Calculator · Experimental Validation Plan · v1
A staged experimental program to confirm the model's predictions about shelf-life-limiting compartments, hygiene leverage, and quantitative dose-response — using a VHP cup sterilizer (or γ-irradiated materials), controlled-bioburden ingredients, sterile filling, and a controlled-environment fill room.
The strategy report makes specific predictions that are not yet confirmed by published tiramisù-specific data. This document specifies an experimental path for testing those predictions with equipment that a competent dairy R&D lab typically has access to. The program is staged so that each tier can be executed independently and the most informative tests come first.
Three tiers, in execution order:
nutrient_factor) rather than reject the model. The model is invalidated if multiple Tier 1 tests show the wrong limiter or the wrong relative ordering of failure times.
Three sterility states for cup interior and lid product-facing surfaces, applied independently:
| State | Method | Expected residual CFU/cm² | Use for |
|---|---|---|---|
| Sterile | VHP cycle (35 % H₂O₂ vapour, 30 min, 30 °C dwell) or γ-irradiation (25 kGy from accredited supplier) | < 10⁻³ (effectively zero) | All tests requiring controlled surface bioburden |
| Inoculated (controlled) | VHP-sterilised cup spray-coated with calibrated suspension of P. roqueforti spores in 0.1 % Tween-80, dried under laminar flow | 10⁻¹ to 10² (target value) | Dose-response tests (T2A), nutrient-factor tests (T2B) |
| Standard industrial | Cups as received from supplier, no decontamination | 0.05–0.5 mold spores/cm² (measure by swab to confirm) | Baseline / control arm in Tier 1 tests |
Two ingredient sterility states:
| State | Method | Use for |
|---|---|---|
| Sterile | UHT mascarpone reconstituted just before fill; pasteurised egg yolk autoclaved 121 °C 15 min; coffee infusion sterile-filtered (0.22 µm); savoiardi γ-irradiated at 10 kGy; cocoa γ-irradiated at 25 kGy | All Tier 1, 2, 3 tests except where ingredient bioburden is the variable |
| Inoculated | Sterile base + calibrated suspension of Z. bailii (yeast) at 10² CFU/g and/or P. roqueforti spores (mold) at target dose | Bulk-vs-surface tests (T1A), ingredient bioburden tests |
| Element | Specification |
|---|---|
| Cleanroom class | ISO 7 (cleanroom) for Tier 1 sterile-arm tests; ISO 8 acceptable for Tier 2/3 if measured airborne < 5 CFU/m³ |
| Filling equipment | Pre-sterilised piston filler (autoclave or VHP cycle) inside laminar-flow hood |
| Operator garbing | Full sterile suit + gloves + face shield |
| Settle-plate monitoring | One Sabouraud-Dextrose Agar plate at fill point per session, exposed for 1 min during fill, incubated at 25 °C for 7 days |
| Organism | Strain reference | Cultivation | Inoculum prep |
|---|---|---|---|
| Penicillium roqueforti (worst-case dairy mold) | ATCC 10110 or CECT 2904 (national equivalents acceptable) | PDA, 25 °C, 7 days until heavy sporulation | Harvest conidia in 0.1 % Tween-80 saline, count with haemocytometer, dilute to target dose |
| Zygosaccharomyces bailii (worst-case dairy yeast) | ATCC 60483 or CECT 1131 | Sabouraud Dextrose Broth, 25 °C, 48 h to stationary phase | Centrifuge, wash, resuspend in sterile saline; plate-count to confirm target dose |
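The "count with haemocytometer, dilute to target dose" step in the table above is simple arithmetic, sketched below. It assumes an improved Neubauer chamber (0.1 mm depth, counts taken per 1 mm² large square, hence the 10⁴ factor per mL); the function names, the per-cup application volume, and the batch volume are hypothetical placeholders, not values from the protocol.

```python
def spores_per_ml(mean_count_per_large_square, predilution_factor=1.0):
    """Stock concentration from a haemocytometer count.

    Assumes a chamber where one large square holds 0.1 µL (1 mm² x 0.1 mm),
    so count x 10^4 gives spores per mL of the counted suspension.
    """
    return mean_count_per_large_square * 1e4 * predilution_factor


def dilution_plan(stock_per_ml, target_spores_per_cup, dose_volume_ml, final_volume_ml):
    """Return (stock volume, diluent volume) in mL for a working suspension.

    dose_volume_ml is the volume applied to each cup (an assumption to be
    set by the spray or pipetting method actually used).
    """
    target_per_ml = target_spores_per_cup / dose_volume_ml
    if target_per_ml > stock_per_ml:
        raise ValueError("stock is too dilute for the requested dose")
    stock_volume_ml = final_volume_ml * target_per_ml / stock_per_ml
    return stock_volume_ml, final_volume_ml - stock_volume_ml


# Illustrative numbers only: 50 spores per large square, counted after a 1:10 predilution.
stock = spores_per_ml(50, predilution_factor=10)
stock_ml, diluent_ml = dilution_plan(stock, target_spores_per_cup=100,
                                     dose_volume_ml=0.1, final_volume_ml=10.0)
print(f"stock {stock:.2e} spores/mL; mix {stock_ml:.4f} mL stock + {diluent_ml:.4f} mL diluent")
```

When the computed stock volume is too small to pipette accurately, the same arithmetic can be applied in serial steps. The plate count specified for Z. bailii remains the confirmation of the delivered dose.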
Each cup is photographed daily through the (transparent or removed-for-photo) lid under 10× magnification. Endpoint criteria:
All scoring blinded — codes on cups, decoded only at analysis.
| Test | Validates | Replicates | Duration | Tier |
|---|---|---|---|---|
| T1A — Bulk vs surface failure | Limiter is surface, not bulk | 8 × 5 cups | 60 d | Tier 1 |
| T1B — Cocoa-quality saturation | P13/P14/P15 prediction | 3 × 6 cups | 45 d | Tier 1 |
| T1C — Hygiene composite | All 3 hygiene variables matter (P03 prediction) | 4 × 5 cups | 30 d | Tier 1 |
| T2A — Airborne dose-response | t_visible(N) ≈ t_single − σ_germ × ln(N) | 5 × 6 cups | 30 d | Tier 2 |
| T2B — Nutrient-factor calibration | Lid < rim < bulk growth rates | 3 × 6 cups | 30 d | Tier 2 |
| T2C — Temperature γ_T check | Cardinal-parameter model for mold | 3 × 6 cups | 30 d | Tier 2 |
| T3A — Biscuit pore air | Vacuum-impregnation >> standard soak under N₂ flush | 2 × 6 cups | 60 d | Tier 3 |
| T3B — Marsala vapour effect | Vapour reaches surfaces, not just direct contact | 3 × 6 cups | 45 d | Tier 3 |
| T3C — Rim → cocoa drip-flux | 5 % of rim spores transfer at t = 0 | 3 × 8 cups | 30 d | Tier 3 |
Total cells consumed: ~290 cups over ~12 weeks if tests are run serially, or 4-6 weeks in parallel with adequate incubator capacity. Each Tier-1 test requires roughly 30-40 hours of analyst time including prep, fill, scoring, and final enumeration.
The model's most important claim: that for a hygienically made tiramisù the limiter is a surface compartment (cup_rim, lid, or cocoa_surface) and not the bulk. If this claim is wrong, the entire intervention strategy is wrong.
| Arm | Bulk ingredients | Surfaces (cup, lid, cocoa, rim) | Predicted limiter | Predicted t_fail (realistic) |
|---|---|---|---|---|
| A1 · Both sterile | Sterile | VHP-sterilised | None within 60 d | > 60 d |
| A2 · Surfaces dirty, bulk sterile | Sterile | Standard industrial (untreated) | cup_rim or lid | ~17 d (≈ P00b) |
| A3 · Surfaces sterile, bulk dirty | Inoculated with Z. bailii at 10² CFU/g | VHP-sterilised | top_cream or bottom_cream (yeast) | ~25-30 d |
| A4 · Both dirty | Inoculated as A3 | Standard industrial | cup_rim or lid (surface still faster) | ~17 d |
The model is CONFIRMED for limiter identification if:
The model is REJECTED if A1 fails on its own (process contamination dominates) or if A3 fails on surfaces despite sterile cups and lids (which would mean the surface bioburden model is wrong).
Even if the model passes, the exact A2 fail time provides the calibration value for the realistic-mode nutrient_factor on the rim. Reset that parameter so the model predicts A2's measured value; rerun the strategy report.
The strategy report's strongest practical claim: cocoa quality is a binary lever. Untreated → steam-treated extends shelf life; steam-treated → sterilised does not. If this is wrong, expensive sterilised-cocoa procurement would be worth it.
| Arm | Cocoa (CFU/g) | Source | Predicted limiter | Predicted t_fail (realistic) |
|---|---|---|---|---|
| B1 — untreated | ~3000 yeast / 300 mold | As supplied by typical cocoa wholesaler | cocoa_surface (mold) | ~14 d (≈ P13) |
| B2 — steam-treated | ~100 yeast / 30 mold | NPC, Granocacao, or similar steam-treated grade | cup_rim (mold) | ~21 d (≈ P14) |
| B3 — γ-irradiated (model "sterilised") | < 10 / < 1 | Sample of B2 cocoa irradiated at 25 kGy | cup_rim (mold) | ~21 d (≈ P15) |
All other variables held at "clean baseline" (P04): VHP-sterilised cups and lids, sterile bulk ingredients except cocoa, ISO 7 fill. Six cups per arm.
The model is CONFIRMED if:
Sample-size note: with ~25-30 % CV typical in challenge tests, n = 6 per arm gives ~80 % power to detect a 5-day difference at the 0.05 significance level via two-sample t-test.
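The power figure in the sample-size note depends strongly on the mean failure time and CV actually observed, so it is worth recomputing with lab-specific numbers. The sketch below uses a normal approximation to the two-sample comparison (slightly optimistic relative to an exact t-test at small n); the function name and the example inputs are illustrative assumptions, not part of the strategy report.

```python
from statistics import NormalDist


def two_sample_power(delta_days, mean_days, cv, n_per_arm, alpha=0.05):
    """Approximate two-sided power to detect a difference of delta_days.

    Normal approximation: both arms are assumed to share the same standard
    deviation, taken as cv * mean_days.
    """
    nd = NormalDist()
    sd = cv * mean_days
    se = sd * (2.0 / n_per_arm) ** 0.5        # standard error of the difference
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)    # two-sided critical value
    z_effect = delta_days / se
    return nd.cdf(z_effect - z_crit) + nd.cdf(-z_effect - z_crit)


# Illustrative inputs: 5-day difference, 20-day mean, 25 % CV.
for n in (6, 12, 24):
    print(n, round(two_sample_power(5, 20, 0.25, n), 2))
```

Running the loop over a few values of n shows how quickly power changes with replication, which helps decide whether six cups per arm is enough for the variance seen in pilot runs.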
Tests the "fix all three hygiene variables together" claim (P03 vs P02 in the strategy report). If cleaning up only the airborne deposition or only the cups and lids does not help much, the recommendation is justified; if either one alone helps substantially, the model's area-weighted distribution of airborne deposition is wrong.
| Arm | Airborne | Cups | Lids | Predicted t_fail (realistic) |
|---|---|---|---|---|
| C1 · All dirty | Open lab (~100 spores/cup) | As supplied | As supplied | ~12 d (≈ P02) |
| C2 · Air only clean | ISO 7 cleanroom | As supplied | As supplied | ~14 d (≈ P03; only modest improvement) |
| C3 · Cups + lids only clean | Open lab | VHP-sterilised | VHP-sterilised | ~14 d (similar to C2 by symmetry) |
| C4 · All three clean | ISO 7 cleanroom | VHP-sterilised | VHP-sterilised | ~21 d (≈ P04) |
The model is CONFIRMED if:
The model is REJECTED if C2 or C3 alone matches the C4 improvement: this would mean the area-weighted deposition assumption is wrong and that either the airborne or the cup pathway dominates in reality.
The model's quantitative dose-response claim (§ 09 of the strategy report): t_visible(N) ≈ t_single − σ_germ × ln(N) with σ_germ ≈ 0.13 × t_single in realistic mode. Five dose levels confirm both the shape and the slope.
| Arm | Mold spores per cup (deposited) | Predicted realistic t_fail (d) |
|---|---|---|
| D1 | 1 | ~22 |
| D2 | 10 | ~19 |
| D3 | 100 | ~14 |
| D4 | 1 000 | ~9 |
| D5 | 10 000 | ~5 |
The model is CONFIRMED if:
The fitted slope is the true σ_germ for your strain and matrix; use it to update the model's sigma_germ_frac in realistic mode.
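The slope fit described above can be sketched as an ordinary least-squares regression of observed t_visible on ln(N), following the relation t_visible(N) ≈ t_single − σ_germ × ln(N). The function name is hypothetical and the data in the example are synthetic, generated from the relation itself purely to show the mechanics.

```python
import math


def fit_sigma_germ(doses, t_visible_days):
    """Least-squares fit of t_visible = t_single - sigma_germ * ln(N).

    Returns (t_single, sigma_germ), both in days.
    """
    x = [math.log(n) for n in doses]
    count = len(x)
    x_mean = sum(x) / count
    y_mean = sum(t_visible_days) / count
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, t_visible_days))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean       # value at ln(N) = 0, i.e. N = 1
    return intercept, -slope


# Synthetic example: the five dose levels D1..D5 with made-up parameters.
doses = [1, 10, 100, 1_000, 10_000]
times = [22.0 - 2.0 * math.log(n) for n in doses]
t_single, sigma_germ = fit_sigma_germ(doses, times)
print(t_single, sigma_germ, sigma_germ / t_single)   # last value: candidate sigma_germ_frac
```

With real data, the per-arm mean (or median) time to first visible growth would be the input, and the ratio sigma_germ / t_single is the quantity to compare with the model's 0.13 default.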
The most uncertain parameter set in the model: nutrient_factor for lid (0.3), rim (0.5), and bulk (1.0). These weren't validated by tiramisù-specific data when the model was built (strategy report § 10 acknowledges this).
| Arm | Substrate (sterile, identical area = 4 cm²) | Predicted relative growth rate |
|---|---|---|
| N1 | Sterile mascarpone cream (bulk dairy proxy) | 1.0 (reference) |
| N2 | Sterile PP coupon coated with sterile condensate from headspace above N1 sample, refreshed daily (lid proxy) | ~0.3 |
| N3 | Sterile PP coupon mounted vertically above N1 sample with periodic agitation to simulate splatter (rim proxy) | ~0.5 |
This is a calibration test, not pass/fail. Output: numerical nutrient_factor for lid and rim (each as a ratio relative to the bulk cream). Compare to model defaults (0.3 and 0.5) and update accordingly.
The model is broadly correct if the measured values fall in [0.1, 0.6] for lid and [0.3, 0.8] for rim. Outside these ranges, re-examine the assumed deposition physics.
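The calibration output can be computed as below: each surface growth rate is divided by the bulk-cream rate and checked against the acceptance ranges quoted above. The function name and the choice of growth-rate measure (for example, radial colony growth in mm/day) are assumptions; only the defaults (0.3, 0.5) and ranges come from the text.

```python
MODEL_DEFAULTS = {"lid": 0.3, "rim": 0.5}
ACCEPT_RANGES = {"lid": (0.1, 0.6), "rim": (0.3, 0.8)}


def nutrient_factors(rate_bulk, rate_lid, rate_rim):
    """Express lid (N2) and rim (N3) growth rates relative to bulk cream (N1).

    All three rates must be in the same units; only their ratios are used.
    """
    if rate_bulk <= 0:
        raise ValueError("bulk growth rate must be positive")
    measured = {"lid": rate_lid / rate_bulk, "rim": rate_rim / rate_bulk}
    report = {}
    for site, factor in measured.items():
        low, high = ACCEPT_RANGES[site]
        report[site] = {
            "nutrient_factor": factor,
            "model_default": MODEL_DEFAULTS[site],
            "in_range": low <= factor <= high,
        }
    return report


# Illustrative rates only (arbitrary units).
print(nutrient_factors(rate_bulk=2.0, rate_lid=0.5, rate_rim=1.3))
```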
The model uses Rosso/Zwietering cardinal-parameter γ_T with Tmin = −2 °C and Topt = 25 °C for P. roqueforti. Check that the predicted ratios of growth rate at 4, 8, 12 °C match measurement on your specific strain.
| Arm | Temperature | Predicted γ_T | Predicted relative t_fail vs T1 (4 °C) |
|---|---|---|---|
| T1 | 4 °C ± 0.5 | 0.052 | 1.0× |
| T2 | 8 °C ± 0.5 | 0.139 | 0.37× (faster) |
| T3 | 12 °C ± 0.5 | 0.281 | 0.18× (much faster) |
Six cups per temperature, all otherwise identical (sterile packaging, sterile cream, 100-spore inoculum on cocoa surface). Three calibrated incubators with continuous logging. Score daily for 30 days.
The model is CONFIRMED if measured t_fail ratios are within ±30 % of predicted ratios.
This is a sensitive check because the cardinal model is the most-validated component of predictive microbiology; if it fails here, something specific to your strain or matrix is wrong rather than the general formalism.
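The ±30 % criterion can be checked with a few lines. This sketch takes the predicted γ_T values from the table and assumes, consistently with the table's last column, that t_fail scales as 1/γ_T. Reading "within ±30 %" as relative to the predicted ratio is an interpretation, and the function names are hypothetical.

```python
PREDICTED_GAMMA_T = {4: 0.052, 8: 0.139, 12: 0.281}   # from the T2C table


def predicted_relative_t_fail(temp_c, ref_temp_c=4):
    """Predicted t_fail at temp_c relative to the reference arm (t_fail ~ 1/gamma_T)."""
    return PREDICTED_GAMMA_T[ref_temp_c] / PREDICTED_GAMMA_T[temp_c]


def ratio_within_tolerance(measured_t_fail, measured_ref_t_fail, temp_c, tol=0.30):
    """True if the measured t_fail ratio is within tol (relative) of the prediction."""
    measured_ratio = measured_t_fail / measured_ref_t_fail
    predicted = predicted_relative_t_fail(temp_c)
    return abs(measured_ratio - predicted) <= tol * predicted


# Illustrative measurements only: 40 d at 4 °C, 16 d at 8 °C, 9 d at 12 °C.
for temp, t_fail in ((8, 16.0), (12, 9.0)):
    print(temp, ratio_within_tolerance(t_fail, 40.0, temp))
```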
The strategy report's most economically consequential mechanism claim: an Al-foil lid + pure N₂ flush gives ~15 d if the biscuit is standard-soaked but >60 d if vacuum-impregnated. The trapped pore air carries enough O₂ to support surface mold for weeks.
| Arm | Biscuit treatment | Predicted realistic t_fail |
|---|---|---|
| P1 | Standard 5 s espresso dip (residual pore-air fraction ~50 %) | ~25-30 d |
| P2 | Vacuum impregnation: biscuit + sterile coffee in vacuum chamber, 15 min at 20 mbar, vent slowly (pore air ~5 %) | > 60 d |
The model is CONFIRMED if:
If P2 also fails within 30 d, either the biscuit didn't impregnate as expected (measure residual pore air directly by mass uptake) or there is a continuous biscuit-headspace flux that the model doesn't capture (strategy report § 10 limitation).
Marsala's mechanism in the model is vapour-phase ethanol partition into the headspace (Henry's law K_LV ≈ 0.05 at 4 °C, Dantigny 2005 inhibition with MIC = 5 % v/v). At 2 % aqueous biscuit ethanol the predicted γ_EtOH ≈ 0.97 — small but measurable on a sufficiently sensitive test.
| Arm | Biscuit | Predicted realistic t_fail |
|---|---|---|
| M1 — no alcohol | Sterile biscuit + sterile coffee only | ~21 d (≈ P04) |
| M2 — standard Marsala | Sterile biscuit + sterile coffee + 2 % v/v ethanol | ~24 d (small Marsala benefit) |
| M3 — heavy Marsala | Sterile biscuit + sterile coffee + 5 % v/v ethanol | ~32 d (strong vapour effect) |
All cups VHP-sterile, all bulk ingredients sterile except for the biscuit ethanol content. Cocoa surface inoculated with 100 P. roqueforti spores. Filled in ISO 7. Stored at 4 °C, scored daily for 45 d.
The model is CONFIRMED if M1 < M2 < M3, with M3 − M1 ≥ 7 d and M2 − M1 in the range 1-5 d.
If M2 = M1 within noise (no Marsala benefit at standard concentration), the model overstates ethanol inhibition: either its Henry's-law partition coefficient is too high or its Dantigny MIC is too low and should be raised.
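The T3B acceptance rule is easy to apply mechanically once each arm has a summary failure time. The sketch below encodes the three conditions stated above; using one summary value per arm (mean or median t_fail) is an assumption, and the function name is hypothetical.

```python
def marsala_outcome(m1_days, m2_days, m3_days):
    """Evaluate the T3B criteria: M1 < M2 < M3, M3 - M1 >= 7 d, 1 <= M2 - M1 <= 5 d."""
    ordered = m1_days < m2_days < m3_days
    strong_effect = (m3_days - m1_days) >= 7
    modest_effect = 1 <= (m2_days - m1_days) <= 5
    return {
        "ordered": ordered,
        "strong_effect_at_5pct": strong_effect,
        "modest_effect_at_2pct": modest_effect,
        "confirmed": ordered and strong_effect and modest_effect,
    }


# The model's own predictions (21, 24, 32 d) satisfy all three conditions.
print(marsala_outcome(21, 24, 32))
```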
The model assumes 5 % of cup-rim spores transfer to the cocoa at t = 0 via gravity, vibration during transport, and condensation runoff. This is one of the more speculative model assumptions. If wrong by 10×, the limiter ranking between rim and cocoa shifts.
| Arm | Rim contamination | Cocoa contamination | Predicted first-visible site |
|---|---|---|---|
| R1 — rim only | 500 P. roqueforti spores on rim (sterile cocoa) | 0 | Rim first, then cocoa later from drip |
| R2 — cocoa only | 0 (sterile rim) | 25 spores on cocoa (= 5 % of rim count) | Cocoa only, at same time as R1's cocoa colony |
| R3 — control | 0 | 0 | None within 60 d |
The model is CONFIRMED if:
The drip fraction can be back-calculated from the time gap between R1's rim-colony and R1's cocoa-colony if the model's t_visible(N) curve is independently calibrated (via T2A). Update rim_drip_fraction accordingly.
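One way to carry out this back-calculation is to invert the calibrated curve t_visible(N) = t_single − σ_germ × ln(N) for the cocoa colony in arm R1 and divide the implied spore count by the known rim dose. The sketch below assumes t_single and σ_germ for the cocoa surface come from T2A; the function names are hypothetical. Under the further simplification that rim and cocoa share the same curve, the result reduces to exp(−gap / σ_germ), where gap is the delay between the rim and cocoa colonies.

```python
import math


def implied_spore_count(t_visible_days, t_single_days, sigma_germ_days):
    """Invert t_visible = t_single - sigma_germ * ln(N) to recover N."""
    return math.exp((t_single_days - t_visible_days) / sigma_germ_days)


def rim_drip_fraction(t_cocoa_visible_days, rim_spores, t_single_days, sigma_germ_days):
    """Fraction of rim spores implied to have reached the cocoa at t = 0."""
    n_cocoa = implied_spore_count(t_cocoa_visible_days, t_single_days, sigma_germ_days)
    return n_cocoa / rim_spores


# Synthetic check: 500 rim spores and a 5 % transfer give 25 spores on the cocoa.
t_single, sigma = 22.0, 2.0                      # made-up calibration values
t_cocoa = t_single - sigma * math.log(25)
print(rim_drip_fraction(t_cocoa, rim_spores=500, t_single_days=t_single, sigma_germ_days=sigma))
```

Because the estimate is exponential in the observed time, a one-day scoring error shifts the fraction by a factor of exp(1 / σ_germ), so it is best reported with the range implied by the scoring interval.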
Execute in three stages. Each stage gates the next: a failed Tier 1 test means the model needs rebuilding before further tests have meaning.
Run T1A, T1B, T1C in parallel if you have the cleanroom capacity; otherwise serial. These collectively use ~120 cups and ~120 analyst-hours. After Stage 1:
computeN0().
Run T2A, T2B, T2C. These produce numerical calibration values that should be fed back into the model's defaults. After Stage 2 the model is locked in for your specific strain + matrix.
Run T3A, T3B, T3C. These are the most product-specific tests and inform recipe / packaging decisions. If T3A confirms biscuit-pore-air dominance, the recommended capex is vacuum impregnation; if T3B confirms Marsala vapour effect, alcohol-free formulations need a stronger surface-protection compensation strategy.
Sample sizes specified per test (n = 5-8 per arm) are based on:
For survival-curve comparison (Kaplan-Meier with log-rank test, the appropriate analysis for first-visibility data), n = 6 per arm gives ~80 % power to detect a hazard ratio of 2.5. Reduce or expand based on your prior data.
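For first-visibility data, cups that show no growth by the end of the test are right-censored rather than failures. A minimal Kaplan-Meier estimator in plain Python, sketched below, shows how such data are handled. The function name and example data are illustrative, and the log-rank comparison between arms would normally come from an established statistics package rather than hand-written code.

```python
def kaplan_meier(times, failed):
    """Kaplan-Meier survival estimate.

    times  : day of first visible growth, or last scoring day if none was seen
    failed : True if growth was seen on that day, False if the cup is censored
    Returns a list of (day, survival_probability) at each day with a failure.
    """
    events = sorted(zip(times, failed))
    at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        day = events[i][0]
        failures = 0
        leaving = 0
        # Group all cups sharing the same day.
        while i < len(events) and events[i][0] == day:
            failures += 1 if events[i][1] else 0
            leaving += 1
            i += 1
        if failures:
            survival *= 1.0 - failures / at_risk
            curve.append((day, survival))
        at_risk -= leaving   # failed and censored cups both leave the risk set
    return curve


# Six cups in one arm: four show growth, two are still clean at day 30.
print(kaplan_meier([10, 12, 12, 15, 30, 30], [True, True, True, True, False, False]))
```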
Each completed test should produce a one-page summary including:
Beyond individual test outcomes, check consistency:
Inconsistencies between these points indicate either lot-to-lot variation in materials, undocumented protocol drift, or strain instability — investigate before drawing larger conclusions.
Honest scoping — these are explicitly out of scope:
A validated screening tool plus the items above plus a sensory panel collectively constitute the basis for setting a defensible commercial shelf life. This program addresses only the first.