Contextual Modulation (a.k.a. Holistic processing)
When processing faces, human observers often find it difficult to process a feature (e.g., eyes) without being influenced by the context of the other features (e.g., nose, mouth).
It is difficult to see that the faces on the left side have exactly the same eyes, because the eyes are embedded in different face contexts. The similarity of the eyes is more obvious when they are viewed in isolation (cf. composite illusion, whole-part advantage, etc.) or when the faces are inverted.
Contextual modulations (so-called holistic or interactive processing) are a core aspect of the specificity of face processing, as they are engaged more strongly when processing faces than when processing other visual categories.
Contextual modulations at primary stages of visual processing depend on the strength of local input. Contextual modulations at high-level stages of (face) processing show a similar dependence on local input strength. Namely, the discriminability of a facial feature determines the amount of influence of the face context on that feature.
How high-level contextual modulations emerge from primary mechanisms is unclear because few empirical studies have systematically addressed the functional link between the two. We tested the ability of 62 young adults to process local input independently of the context, using contrast detection tasks and morphed facial feature matching tasks (with upright and inverted faces). Our results suggest that non-face-specialized high-level contextual mechanisms (inverted faces) operate in connection with primary contextual mechanisms, but that the engagement of face-specialized mechanisms for upright faces obscures this connection.
Holistic/interactive face processing is primarily driven by the low spatial frequencies (SF) of the visual image, whereas local feature analysis is best supported by high SF.
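To make the low-SF/high-SF distinction concrete, the following minimal sketch splits a grayscale image into a low-SF and a high-SF component with a Gaussian filter applied in the Fourier domain. It is an illustration only, not the stimulus-generation procedure of the studies summarized here: the function name `split_spatial_frequencies`, the `cutoff` parameter, and its default of 8 cycles per image are assumptions, and the random array merely stands in for a face image.

```python
import numpy as np

def split_spatial_frequencies(image, cutoff=8.0):
    """Split a 2-D grayscale image into low-SF and high-SF components.

    `cutoff` is the standard deviation of a Gaussian low-pass filter,
    expressed in cycles per image (hypothetical default, chosen only
    for illustration).
    """
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    # Frequency coordinates in cycles per image along each axis.
    fy = np.fft.fftfreq(rows) * rows
    fx = np.fft.fftfreq(cols) * cols
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    # Gaussian transfer function: gain 1 at zero frequency, decaying with SF.
    lowpass = np.exp(-(radius ** 2) / (2.0 * cutoff ** 2))
    low = np.real(np.fft.ifft2(np.fft.fft2(image) * lowpass))
    # The high-SF component is whatever the low-pass filter removed.
    high = image - low
    return low, high

# Placeholder input: random noise standing in for a grayscale face image.
rng = np.random.default_rng(0)
face = rng.random((128, 128))
low_sf, high_sf = split_spatial_frequencies(face, cutoff=8.0)
```

By construction, the two components sum back to the original image, and the mean luminance stays entirely in the low-SF component. For display, the high-SF component is usually re-centred on mid-grey because it has zero mean.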
Holistic/interactive and featural representations co-exist in the face-selective region of the fusiform gyrus (fusiform face area, FFA):
- FFA response is equally large when all or only a few features differ (subadditive response), irrespective of whether feature differences are processed interactively or locally;
- Inversion decreases FFA activation most robustly when features are encoded interactively.